Updating algorithms with multi-step stochastic correction

Nested multi-step stochastic correction offers a possibility to improve updating algorithms for numerical simulations of lattice gauge theories with fermions. The corresponding generalisations of the two-step multi-boson (TSMB) algorithm as well as some applications with hybrid Monte Carlo (HMC) algorithms are considered.


Introduction
The main task in numerical Monte Carlo simulations of lattice gauge theories with fermions is to evaluate the (ratio of) fermion determinants appearing in the Boltzmann weight for the gauge fields.The idea of the stochastic ("noisy") correction [1] is to prepare a new proposal of the gauge configuration during updating by some approximation of the determinant ratio and accept or reject the change based on a stochastic estimator.This "stochastic correction step" takes care of the deviation of the approximate determinant ratio from the exact one.
In multi-boson updating algorithms [2] it is natural to introduce a stochastic correction step in order to correct for the deviations of the applied polynomial approximations.In special cases it is possible to perform the correction by an iterative inverter [3].More generally, the correction step can be based on successively better polynomial approximations, as in the two-step multi-boson (TSMB) algorithm [4].A suitable way to obtain the necessary polynomial approximations is to use a recursive scheme providing least-square optimisation [5,6].Based on this stochastic correction scheme, the TSMB updating algorithm has been successfully applied in several numerical simulation projects both in supersymmetric Yang Mills theory (see [7] and references therein) and in QCD (see, for instance, [8,9,10,11]).
In the present paper we generalise the idea of stochastic correction into a scheme of nested successive corrections based on polynomial approximations with successively increasing precision.(A similar "multi-level Metropolis" scheme has been proposed in Ref. [12,13].)In the next section we consider multi-step multi-boson algorithms.The last section is devoted to different possibilities for combining multi-step stochastic correction with variants of the hybrid Monte Carlo (HMC) updating algorithm [14].In particular, optimised HMC algorithms based on mass preconditioning [12,15] and polynomial hybrid Monte Carlo (PHMC) algorithms [16] are considered.

Multi-step multi-boson algorithms
The multi-step multi-boson (MSMB) algorithm is a generalisation of the TSMB updating algorithm.Therefore, let us briefly recapitulate the basics of TSMB.Let us assume that the determinant of the Hermitian fermion matrix Q = Q † is positive, at least on most of the gauge configurations occurring with non-negligible weight in the path integral.In this case the sign of the determinant can either be neglected or taken into account by reweighting on an ensemble of configurations obtained by updating without the sign.(If the sign of the determinant plays an important role then there is a "sign problem" which cannot be dealt with by a straightforward Monte Carlo simulation procedure.)Without the sign the determinant factor in the Boltzmann weight of the gauge configurations is where in case of N f mass-degenerate Dirac-fermion flavours we have α = 1 2 N f .(Note that for a Majorana fermion α = 1 4 .)Of course, for several fermion flavours with different masses there are several factors as in (1).Applying determinant break-up [17,18] one writes with some positive integer n B .In what follows we always consider a single determinant factor with an effective power α: If there are several such factors in the path integral then each of them can be separately taken into account in the same way.
The basic ingredient of TSMB is a polynomial approximation where the interval [ǫ, λ] covers the eigenvalue spectrum of Q 2 on gauge configurations having a non-negligible weight in the path integral.The determinant factor in the Boltzmann weight can then be taken into account with Lüscher's multi-boson representation.Assuming that the roots of the polynomial P (x) occur in complex conjugate pairs, one can introduce the equivalent forms where n is the degree of P (x) and the roots are r j ≡ (µ j + iν j ) 2 with ρ j ≡ µ j + iν j .
With the help of complex boson (pseudofermion) fields Φ jx one can write In the representation (6) the complex boson fields Φ jx , j = 1, 2, . . ., n carry the indices of the corresponding fermion fields.For instance, in QCD with Wilson-type fermions there are colour and Dirac-spinor indices.Since the multi-boson action in ( 6) is local, similarly to the gauge field action, one can apply the usual bosonic updating algorithms like Metropolis, heatbath or overrelaxation.In fact, the multi-boson action is Gaussian hence for the multi-boson fields a global heatbath update is also possible which creates, for a fixed gauge field, a statistically independent new set of boson fields.
The polynomial approximation in (4) is not exact.In order to obtain an exact updating algorithm one has to correct for its deviation from the function to be approximated.One can easily show that for small fermion masses in lattice units the (typical) smallest eigenvalue of Q 2 behaves as (am) 2 and for a fixed quality of approximation within the interval [ǫ, λ] the degree of the polynomial is growing as In general, the polynomial approximation has to be precise enough in order that the deviations in expectation values be smaller than the statistical errors.In practical applications, for instance in QCD simulations, this would require very high degree polynomials with n of the order 10 3 -10 4 .(For numerical examples showing the convergence rate of the polynomial approximations see [6].) Performing numerical simulations with such a high n is practically impossible (and would be in any case completely ineffective).The way out is to perform the corrections stochastically.
For improving the approximation in (4) a second polynomial is introduced: The first polynomial P 1 (x) gives a crude approximation as in (4) with P 1 (x) ≡ P (x).
The second polynomial P 2 (x) gives a good approximation according to During the updating process P 1 is realized by multi-boson updates whereas P 2 is taken into account stochastically by a noisy correction step.For this, after preparing a new set of gauge fields [U ′ ] from the old one [U ] by local updates, one generates a Gaussian random vector having a distribution and accepts the change of the gauge field where One can show [4] that this update procedure satisfies the detailed balance condition and hence creates the correct distribution of the gauge fields.(See the proof for the more general case of MSMB given below in ( 20)- (23).) The Gaussian noise vector η can be obtained from η ′ distributed according to the simple Gaussian distribution by setting it equal to In order to obtain the inverse square root on the right hand side of ( 13), one can proceed with a polynomial approximation Note that here the interval [ǭ, λ] can be chosen differently, usually with ǭ < ǫ, from the approximation interval [ǫ, λ] for P 2 .The polynomial approximation in (7) can only become exact in the limit when the degree n 2 of the second polynomial P 2 is infinite.Instead of investigating the dependence of expectation values on n 2 by performing several simulations, it is also possible to fix some high value of n 2 for the simulation and perform another correction in the measurement of expectation values by still finer polynomials.This is done by reweighting the configurations.(A similar reweighting procedure is applied in the PHMC algorithm of Ref. [16].)This measurement correction is based on a further polynomial approximation P ′ with polynomial degree n ′ which satisfies lim The interval [ǫ ′ , λ] can be chosen by convenience, for instance, such that ǫ ′ = 0, λ = λ max , where λ max is an absolute upper bound of the eigenvalues of Q 2 .(In case of ǫ ′ = 0 the approximation interval is strictly speaking (ǫ ′ , λ].An absolute upper bound for the eigenvalues of Q 2 exists because the commonly used fermion matrices are bounded from above.)In practice, instead of ǫ ′ = 0, it is more effective to take ǫ ′ > 0 and determine the eigenvalues below ǫ ′ and the corresponding correction factors exactly.
For the evaluation of P ′ one can use n ′ -independent recursive relations [5], which can be stopped by observing the required precision of the result.After reweighting the expectation value of a quantity A is given by where η is a simple Gaussian noise like η ′ in (12).Here . . .U,η denotes an expectation value on the gauge field sequence, which is obtained in the two-step process described before, and on a sequence of independent η's.The expectation value with respect to the η-sequence can be considered as a Monte Carlo updating process with the trivial action S η ≡ η † η.The length of the η-sequence on a fixed gauge configuration can, in principle, be arbitrarily chosen.In practice it has to be optimised for obtaining the smallest possible errors with a given amount of computer time.The polynomial approximations in (4), ( 8), ( 14) and ( 15) can be obtained in a recursive scheme providing least-square optimisation [5,6].Numerical methods to determine the polynomial coefficients can be based either on arbitrary precision arithmetics [19] or on discretisation of the approximation interval [20].The expansion in appropriately defined orthogonal polynomials is an important ingredient, both in determining the polynomial coefficients and in the application of the polynomials of the squared fermion matrix Q 2 on a vector.
Least-square optimisation corresponds to minimising the L 2 -norm of the deviation.An often used alternative is to minimise the L ∞ norm which is equivalent to the minimisation of the maximal relative deviation.In general, the goal is to obtain the smallest possible deviation of the expectation values with the smallest possible polynomial degree.The experience with the least-square optimisation in TSMB has been rather satisfactory because it gives the best overall fit of the lattice action with a given polynomial degree.(For numerical examples comparing L 2 -with L ∞ -optimisation see Ref. [6].)The often stated advantage of minimising the upper limit of the relative deviation of the lattice action is relativised by the fact that the deviation of the expectation values from the correct ones is in general a complicated function of the deviation in the lattice action.
The multi-step multi-boson (MSMB) updating algorithm is a straightforward generalisation of TSMB updating.Instead of the two-step approximation in (7) we now consider a sequence of polynomial approximations of arbitrary length: Here the subsequent polynomials define approximations with increasing precision according to The first polynomial P 1 is realized during updating by local updates as in TSMB.
The higher approximations P 2 , . . ., P k are implemented by a sequence of nested noisy correction steps as in ( 9)- (11).The necessary Gaussian distributions of noise vectors can be obtained by appropriate polynomials, similarly to (14): The proof of the detailed balance condition for MSMB goes essentially in the same way as for TSMB.The aim is to reproduce with the first i correction steps the canonical distribution of the gauge field where the short notation ) is used and S g [U ] denotes the action for the gauge field.Let us assume that detailed balance holds for the first (i − 1) steps, that is the transition probability ) satisfies The transition probability of the i'th step is a product of ) with the acceptance probability ): It is easy to show that if ) is defined according to ( 9)-( 11) with P 2 replaced by P i then the acceptance probability satisfies From this immediately follows that the transition probability of the i'th step ) satisfies the detailed balance condition ( 21) with (i − 1) replaced by (i).
An alternative way to prove that the described procedure creates the correct distribution of the gauge fields is to consider the fields η as additional pseudofermion fields in the Markov chain with the lattice action given by the exponent in (9).
The advantage of the multi-step scheme compared to the two-step one is that the lower approximations can be chosen to be less accurate and consequently have lower polynomial degrees and are faster to perform.The last approximations, which are very precise and need high polynomial degrees, can be done less frequently.The last polynomial P k can already be chosen so precise that, for some given statistical error, the measurement correction with P ′ becomes unnecessary.
An easy generalisation of the multi-step scheme described until now is to require the correct function to be approximated in (17) only in the last step and allow for functions easier to approximate in the previous steps.This means that (18) can be generalised, for instance, to with positive ρ i and ρ k = 0.This has a resemblance to the "mass preconditioning" as introduced for HMC algorithms in Ref. [12,15].The advantage of ( 24) is that for ρ i > 0 one can decrease the degree of the polynomial P i (x) and at the same time, if ρ i /ρ i−1 is not much smaller than 1, the acceptance in the i'th correction step remains high enough.
There are other multi-step approximation schemes conveivable: for instance, one can take P i (x) ≃ x −α/k , (i = 1, . . ., k) which corresponds to the determinant breakup in (2).Similarly, "mass preconditioning" can also be considered as a generalisation of determinant breakup.
We performed several tests with the MSMB algorithms in some of the simulation points of Ref. [11] with the Wilson fermion action for two flavours of quarks and the DBW2 gauge action [21] for the colour gauge field.In particular, on an 8 3 • 16 lattice at β = 0.55, κ = 0.188, µ = 0 (simulation point (c) in [11] with a bare quark mass in lattice units am q ≃ 0.015) a three-step algorithm was tuned for obtaining better performance.(Here µ denotes the "twisted mass" which is actually set equal to zero in these runs.)In another test run on a 16 3 • 32 lattice we have chosen a point where a detailed simulation has been performed recently with both the TSMB and HMC algorithm [22], namely at β = 0.74, κ = 0.158, µ = 0 with a bare quark mass in lattice units am q ≃ 0.024.In a three-step algorithm the following parameters were chosen: n B = 2, n 1 = 60, n 2 = 200, n2 = 300, n 3 = 800, n3 = 900.(The degree of the polynomials P i and Pi is denoted by n i and ni , respectively.)The second correction step was called after performing 10 update cycles involving the first correction.The integrated autocorrelation for the average plaquette in these test runs were typically around τ int plaq ≃ 10 full update cycles including the second correction.The simulation costs in these runs turned out to be, even with a moderate effort put in parameter tuning, by about a factor of 1.5 lower than in the corresponding welltuned TSMB runs.The gain comes from the lower cost of the first correction compared to the correction step in TSMB.The cost of the second correction does not contribute much to the full cost because it is done infrequently.For instance, on the 16 3 •32 lattice the TSMB run had the parameters n B = 4, n 1 = 34, n 2 = 720, n2 = 740.(Note that the cost of the correction is mainly determined by the product n B (n 2 + n2 ) which is 5840 in TSMB and only 1000 in the first correction of MSMB.)

Multi-step correction for HMC
The first (updating) step producing a new gauge field configuration can also be replaced by Hybrid Monte Carlo trajectories [14].In this step some approximation of the fermion determinant can be used and after a few trajectories one can perform a stochastic correction step.The rest within a multi-step correction scheme is the same as in MSMB updating.
A possible application of multi-step stochastic corrections is to perform a HMC update with a mass-preconditioned fermion matrix which corresponds to ρ 1 > 0 in Eq. ( 24) and correct for the exact determinant (that is, ρ 1 = 0) stochastically.The polynomials for the stochastic corrections are defined in the same way as in (24).
Another possibility is to start by an update step as in polynomial hybrid Monte Carlo (PHMC) [16].In order to generate the correct distribution of pseudofermion fields at the beginning of the trajectory one needs a polynomial as in (19) In order to avoid very high degree first polynomials P 1 (x), which would cause problems with rounding errors in the calculation of the fermionic force [23], one should use determinant break-up (see Eq. ( 2)).The ordering of the root factors in the expression of the fermionic force [16] is best done according to the procedure proposed in [5].Again, the stochastic correction steps can be performed during the update according to the procedure described in Section 2.
Besides decreasing the polynomial degrees in the PHMC update step, another advantage of applying determinant breakup is that both magnitude and variance of the quark force is decreased approximately proportional to n −1/2 B .In some test runs on 8 3 • 16 lattices the performance of the PHMC algorithm with stochastic correction turned out to be promisingly good.In particular, we performed simulations with the parameters β = 0.55, κ = 0.184, 0.186, 0.188, µ = 0 corresponding to the points (a), (b) and (c) in Ref. [11] with bare quark masses in lattice units am q ≃ 0.071, 0.039, 0.015, respectively.The PHMC trajectories were created by applying the Sexton-Weingarten-Peardon integration scheme with multiple time scales [24,25].Gains up to factors of 5 were observed in comparison with the costs of the TSMB runs.The origin of this better performance is that the integrated autocorrelations are shorter, whereas the costs for one update cycle are similar to TSMB (see Table 3 of [11]).These numbers also show that in these points PHMC with stochastic correction is better than MSMB.

Summary
In summary, multi-step stochastic correction is a useful and flexible tool which can be implemented in both multi-bosonic and hybrid Monte Carlo update algorithms.In the present paper we reported on first tests with the multi-step multi-boson (MSMB) and stochastically corrected polynomial Monte Carlo algorithm which look promising.In our test runs on relatively small lattices and with moderately small quark masses the PHMC algorithm with stochastic correction is faster than MSMB.Of course, further tests on larger lattices and at smaller quark masses are necessary before applying these updating algorithms in large scale simulations.The relation between the cost factors of MSMB versus PHMC may also be different depending on the lattice volume and quark mass.
Based on our experience with the TSMB algorithm, we expect the computational costs of our multi-step stochastic correction schemes to increase only slightly faster than linear with the number of lattice sites.This differs from the multi-level Metropolis scheme proposed in Ref. [12,13] where the volume dependence is quadratic.
An important feature of both the MSMB and of the PHMC algorithm with multistep stochastic correction is that they are applicable for odd numbers of flavours, too, provided that there is no sign problem with the fermion determinant.The same holds for the rational hybrid Monte Carlo (RHMC) algorithm [26] where multi-step stochastic correction might also be useful.
The main advantage of the stochastic correction in several steps compared to a single stochastic correction is that the costly last correction has to be done infrequently.This feature becomes increasingly more important for large lattices at small fermion masses where the cost of the last correction increases proportional to the inverse quark mass in lattice units.