Vacuum Decay and Euclidean Lattice Monte Carlo

The decay rate of a metastable vacuum is usually calculated using a semiclassical approximation to the Euclidean path integral. The extension to a complete Euclidean lattice Monte Carlo computation, however, is hampered by analytic continuations that are ill-suited to numerical treatment, and the nonequilibrium nature of a metastable state. In this paper we develop a new methodology to compute vacuum decay rates from Monte Carlo simulations of Euclidean lattice theories. To test the new method, we consider simple quantum mechanical systems systems with metastable vacua. This work can be extended to Euclidean field theories, which we discuss in the Conclusions.


Introduction
The decay of a metastable vacuum state is an old and well-studied problem in quantum mechanics (QM) and quantum field theory (QFT). It is well-known how to compute the tunneling rate in QM using semiclassical methods, and these techniques can be extended in a natural way to QFT [1,2]. In recent years the theory of tunneling has received renewed attention [3][4][5][6][7][8][9][10][11].
Since the standard semiclassical analysis is performed using the Euclidean path integral, it is natural to ask whether Euclidean lattice theory can also be used to study vacuum decay. In addition to ordinary barrier penetration problems, lattice methods could be useful for quantitative studies of vacuum decay in situations where the semiclassical methods are inadequate, such as the decay of vacua that emerge from strong dynamics (see e.g. Ref. [12]). Formulating and refining a lattice approach to these problems might also yield methods of more general interest and applicability.
However, Euclidean Monte Carlo (MC) simulations of false vacua are not without subtleties. A configuration which begins in a metastable state, or in a false vacuum (FV), will evolve in Monte Carlo time to eventually thermally fluctuate over the barrier. In the semiclassical limit, the barrier "peak" is a saddle point of the classical action, a solution known as the bounce [1], and the Monte Carlo time evolution can be thought of schematically as "false vacuum → bounce → true vacuum." If the true vacuum (TV) is deep, as a practical matter, the system will never return to the false vacuum after thermalization, so all configurations in the thermalized ensemble describe the true vacuum. They are exponentially more important than the bounce and they are only rendered innocuous after a final analytic continuation back to real time, a point emphasized in the study of Ref. [3] which sought to place the problem of vacuum decay on more rigorous footing. This analytic continuation is more or less straightforward in semiclassical analyses, but it is impractical in an MC approach.
In this paper, we develop a new framework to compute approximate but accurate decay rates from Euclidean lattice simulations. To test the approach, we consider QM tunneling problems as illustrated in Fig. 1. Our primary results are the definition of a new observable that approximates the decay rate of a quantum mechanical metastable vacuum, a prescription for its computation in Euclidean Monte Carlo simulations, and numerical simulations testing the accuracy of the method.
The remainder of this paper is organized as follows. In Sec. 2 we develop the necessary theoretical tools, define our computational approach, and describe the systematic uncertainties introduced by the associated approximations. In Sec. 3 we apply the method to a representative family of potentials. An advantage of studying QM tunneling problems is the ability to compute the decay rate by solving the time-dependent Schrödinger equation (TDSE). We perform three-way comparisons between results obtained from solving the TDSE ("exact"), from Euclidean lattice Monte Carlo computations ("lattice"), and from semiclassical analyses. We find good agreement between the results over several decades in the decay rate, thus establishing the accuracy of our lattice method. In Sec. 4 we turn our attention to very long lifetimes, where computing the rate from ensembles of practical sizes requires a different approach. We propose the "constrained ensemble reweighting" method and illustrate it with an example. Our conclusions are presented in Sec. 5, where we further outline how our framework can be extended to Euclidean quantum field theories. We consider single-particle quantum mechanics with a tunneling potential. An example potential is shown in Fig. 1. The continuum Euclidean action is In this normalization x is treated as a 0 + 1D field: the kinetic term has a dimensionless coefficient 1/2, so that the dimension of x is [x] = [E −1/2 ]. This definition of x is used throughout this paper. With the false vacuum positioned at x FV = 0, we parametrize the leading term in the expansion of the potential around x FV as V (x) = 1 2 m 2 x 2 + . . .. Since this term has the same form as the mass term in scalar field theories, we can consider the dimensionful parameter m as the mass of the particle. A more detailed description of the potential is given in Sec. 3.1.
The continuum Euclidean path integral facilitates a convenient semiclassical treatment of false vacuum decay. One first constructs the bounce, a solution x b (t) to the Euclidean equations of motion that asymptotes to the classical false vacuum at early and late times.
The leading order (LO) decay rate is governed by the bounce action, Γ ∼ e −S E [x b ] . The nextto-leading-order (NLO) correction is given by the quadratic fluctuation integrals around the bounce. In these integrals the low lying modes of the fluctuation operator must be treated separately. Zero modes associated with symmetries can be treated with a collective coordinate method. More importantly, the bounce is always associated with a single mode of negative eigenvalue. The integral over the amplitude of this mode is divergent and is generally defined by analytic continuation.
On the lattice, a simple choice for the discretized action is where a is the lattice spacing and N T = 2T /a is the total number of sites (2T is the total time). The difference between the lattice action and the continuum action is O(a 2 ) due to the discrete second-order derivative.
In order to study vacuum decay in Euclidean lattice Monte Carlo simulations, we must first identify an observable that can be related to the desired decay rate and computed with Monte Carlo methods. We show that the probability density to find the particle at the classical turning point has the desired properties and describe its computation with Euclidean path integrals and its relation to the decay rate in Sec. 2.2.
Any continuum calculation in Euclidean time must be analytically continued to real time. However, such continuations are impractical in lattice Monte Carlo computations because they require exponential sensitivity. We elaborate on the problem in Sec. 2.3 and define a procedure that avoids the need for analytic continuation, removing the exponential sensitivity requirement, at the cost of introducing a systematic error.

Probability Densities from Euclidean Path Integrals
The probability density for the system to be in the state |x⟩ at time t, given that we started from a normalized state ψ at t = 0, is When |ψ⟩ = |FV⟩, a metastable state localized near the classical false vacuum, the decay rate is defined as The result for Γ should not be sensitive to the exact definition of the FV region, as long as it reasonably contains the point x FV and does not extend beyond b. The long T limit of Eq. (4) is satisfied when T is large compared to the "escape attempt time" ∼ 1/m in the false vacuum, 1/m ≪ T . If we consider times within the long T limit that are short compared to 1/Γ, then the probability P (FV, T ) ≈ 1, and the decay rate can be estimated as in this regime. Now let us relateṖ (R, T ) to ρ. We havė Here j(b, T ) is a probability current flowing through x = b, and we have used the continuity equationρ(x, T ) = −∂ x j(x, T ). We can also define a probability flow velocity u through Semiclassically, the probability flow velocity can be estimated from the classical definition of the kinetic energy The relationship u(b, T ) ≈ √ m is easily validated for specific examples by the numerical solution of the time-dependent Schrödinger equation (TDSE). In Fig. 2, we compare u = j/ρ from the full quantum mechanics and the approximation u cl = 2(E FV − V (x)), exhibiting good agreement when x ≳ b. u cl is not expected to match j/ρ in the classically forbidden region, i.e., when x is substantially smaller than b.
Therefore, if ρ(b, T ) can be computed by other means, then the decay rate can be estimated as The advantage of this formulation is that ρ(b, T ) can be evaluated with a Euclidean path integral and is approximately independent of T in the time range of interest 1/m ≪ T ≪ 1/Γ described above. We define a Euclidean transition amplitude, where the Euclidean propagator over time T between some y and z is or in the classically allowed region is plotted.
The real-time probability density is In the third line, we make variable changes t = −iτ in the first integral and t = iτ in the second integral. At this point there is no analytic continuation and τ is imaginary. Subsequently we analytically continue to real τ in the clockwise direction in both integrals. The "| T →iT " operation denotes a counterclockwise continuation back to Minkowski time after the integrals are computed. The Euclidean quantity A(ψ * , b; −T ) is defined formally by computing A(ψ * , b; T ) and continuing T → −T , where |ψ * ⟩ is the complex conjugate of the state |ψ⟩ in the position representation. Generalizing to an unnormalized initial state ψ, we have 5 Before the replacement T → iT , both the numerator and the denominator are Euclidean path integrals and the total time extent is 2T . The numerator has a path constraint x(τ = 0) = b while the denominator does not.
In the decay of a false vacuum, there is a range of Lorentzian time over which we expect ρ is approximately time-independent. This occurs on timescales 1/m ≪ T ≪ 1/Γ. This is also true in the Euclidean picture if each amplitude in the numerator of Eq. (12) is dominated by localized events (similar to half of a single bounce solution, in semiclassical language), so that again changing the duration T does not appreciably change the amplitude. We now make this assumption and interrogate it in Sec. 3.
With both the Euclidean and Lorentzian amplitudes approximately time-independent, the continuation in Eq. (12) can be ignored. Any T -dependence in the normalization of the initial state cancels with the T -dependence in the normalization of the denominator. Another way to describe this time-independence is to say that the false vacuum is almost an energy eigenstate |ϵ⟩ of the complete Hamiltonian. Therefore on timescales short compared to 1/Γ, the state does not change appreciably and √ mρ(b, T ) ≈ Γ, a constant. The dominant Euclidean time evolution in A, cancels between the numerator and the denominator of ρ.
Equation (12) is still not in the form of an expectation value of an observable, which would be convenient for computation in Euclidean MC simulations. To relate it to such an observable, we exploit the time-independence described above and the symmetry of the Euclidean amplitudes. We consider real initial wave functions ψ(y) ∈ R and writê where In practice, when δ is chosen finite and small enough, f (b; 0) is an observable that returns 1/δ if a path is in a small region [b−δ, b] at time t = 0, and zero otherwise. In our calculation, we use δ = 0.04 β/m, since β/m is a characteristic scale for x as is shown in Eq. (38).
The definition ofρ differs from that of ρ by T → −T in the second factors of A and the absence of analytic continuation of T . However, if in the time regime of interest both ρ and ρ are approximately T -independent, then We examine the T -dependence ofρ below, where we find that with one important modification we can indeed approximate it as T -independent.
In the Monte Carlo simulation, we use periodic boundary conditions (PBCs) x(−T ) = x(T ),ẋ(−T ) =ẋ(T ) in Euclidean time with large T , so that the state ψ into which the ensemble initially thermalizes is approximately the perturbative ground state in the false vacuum. This allows us to exploit time translation symmetry and improve the ensemble statistics. With PBCs, the rare events where x(t) ≳ b can occur at a random Euclidean time t, and all random times have an equal chance for such rare events. Therefore, we can average the probability density at b over all Euclidean times t ∈ [−T, T ) to approximatê ρ(FV, 0; b, T ).
To summarize, we have related the decay rate to an observableρ(b) that can be computed in MC. There are three primary approximations which introduce uncertainties into the result. First, we assume that T can be chosen so that 1/m ≪ T ≪ 1/Γ, which allows both the approximation Γ ≈ −Ṗ (FV) and the analytic continuations described above. Second, we approximate the probability flow velocity by u ≈ √ m, which is a fairly good approximation in practice, as we verify by explicit comparison with the TDSE solution. Third, we assume that the relevant Euclidean amplitudes are dominated by trajectories that probe beyond the barrier in localized, rare events, so that they are insensitive to T . The T interval 1/m ≪ T ≪ 1/Γ is necessary but not sufficient for this to be true, as we discuss in the next subsection.
We note that our method is complementary to the "direct method" of Ref. [3], which is also expressed using the Euclidean path integral. The direct method involves taking an imaginary part after the analytic continuation. Such a procedure, when applied to a Monte Carlo calculation, may be sensitive to the details of how the analytic continuation is performed. Instead, our method avoids taking an imaginary part by constructing an observable that is approximately independent of T , so that the analytic continuation T → iT is rendered innocuous.

Cuts in Ensemble Generation and Postselection: Controlling the Negative Mode
Although we have identified a useful lattice observable, there is still an issue of the unwanted dominance of true-vacuum-like configurations in MC that must be addressed before we can apply it to real simulations. We now illustrate the problem in detail, using semiclassical language for convenience, and describe a practical resolution for lattice MC computations.
Let us briefly review the NLO semiclassical contribution to the decay rate to establish notation and ideas. We decompose paths near the bounce as with the normalization condition, The basis {x n } is chosen such that it diagonalizes the Euclidean action expanded to the quadratic order as with the ordering of n defined through λ 0 ≤ λ 1 ≤ λ 2 ≤ · · · . The NLO contribution to the path integral around the bounce is up to an overall normalization. However, the lowest eigenvalue is negative, λ 0 < 0, and the second-lowest eigenvalue is zero, λ 1 = 0. The zero-mode x 1 (t) reflects the time translation invariance of the bounce x b (t), so the integral of c 1 can be replaced by an integral of the center time of the bounce which gives a factor of 2T , still convergent for large but finite T . The negative mode x 0 (t) leads to an exponential divergence. Qualitatively, this divergence can be explained by the the fact that a path x(t) that spends the majority of its time near the true vacuum has an action about −2|V TV |T < 0, much lower than the bounce action Typically, an analytic continuation in the c 0 contour is taken to make the integral converge. However, in a Monte Carlo simulation, an analogous procedure to "analytic continuation of the c 0 contour" is not available. Once the ensemble generation passes near the bounce saddle point of the action, with high probability it will rapidly evolve toward configurations that spend most of their time near the true vacuum. The action of a typical configuration in the above situation is then even lower than the action of a typical false-vacuum-like configuration, so it is extremely unlikely to fluctuate back over the saddle point. This behavior is illustrated in Fig. 3. Starting with a false-vacuum-like configuration enables the observation of two distinct perturbative vacua, but the ensemble is not useful for quantitatively computing the decay rate.
To obtain a useful result from Monte Carlo, we impose a cut to discard configurations that go too far into the direction of the true vacuum. First, let us return to the semiclassical picture and see the effect of cutting off the c 0 integral instead of continuing it. shows the Euclidean action. The initial configuration is taken to be x(t) = x FV , no thermalization steps are taken, and the adjacent MC times are relatively correlated. At early Monte Carlo times, the configuration remains near the false vacuum, with ⟨x⟩ ≈ x FV = 0 and a relatively high Euclidean action due to the quantum fluctuations around the false vacuum. For MC times from about 500 to 1000, a transition starts which takes the configuration from the false vacuum to the true vacuum region with x > x TV = 6.82/ √ m in this example. The Euclidean action after the transition is significantly lower than before the transition. In this example, the difference in action ∼ 100 implies an enormous suppression ∼ e −100 for the ensemble to ever return to the false vacuum. In other words, if the ensemble reaches global equilibrium, then the FV-static configuration is essentially never observed, and the thermalized ensemble behaves like that of a free particle because of the flat potential at x > x TV . [The free particle has equal probability to move in both directions, which is responsible for the (random) decrease in ⟨x⟩ ET after the transition.] The constant pure imaginary term takes "+" for the integral limits deformed to arg(−c min 0 ) = arg(c max 0 ) ∈ (0, π) and "−" for arg(−c min 0 ) = arg(c max 0 ) ∈ (π, 2π). At arg(−c min 0 ) = arg(c max 0 ) = 0 or π, the asymptotic expansion at large |c min 0 | and |c max 0 | is ill-defined. For a finite T , with a convention x 0 (0) > 0, the increasing direction of c 0 drives the configuration x b (t) + c 0 x 0 (t) toward the true vacuum region R. In fact, in the full functional integral, when T is finite, there are always effective cutoffs on fluctuations in the c 0 direction, and these cutoffs are proportional to T . For example, in the positive c 0 direction, the lowest possible action configuration is the true vacuum, where the action is S TV = 2V TV T . c max 0 is a function of T with an unknown functional form, but c max 0 → ∞ when T → ∞. In the other direction there is an effective cutoff associated with the false vacuum configuration. Therefore, if the integral in Eq. (21) is analytically continued by replacing T → iT and with the limit T → ∞ taken before computing the integral, then the final result is −i 2π/|λ 0 |. Its imaginary part combined with the fluctuation integrals of other modes gives the NLO decay rate. This is why the continuation T → iT is both subtle and important: it removes exponentially large T -dependent contributions to the Euclidean amplitudes [3]. However, it is impractical to numerically evaluate the path integral at large T with such high precision that the finite constant term −i 2π/|λ 0 | can be resolved against a "background" term that exponentially grows with T . We need a more aggressive cut on configurations that fluctuate too far toward the true vacuum.
Again we begin with the semiclassical computation. When finite cuts c max 0 and c min 0 are imposed, then the c 0 integral is a finite number that is generically unequal to 2π/|λ 0 |, but may be close to it for a suitable choice of cuts. For example, ordinary Gaussian integrals are dominated by the region within a standard deviation or so of the peak. Let us therefore set As long as the cutoff c max Therefore, without continuing the contour and simply placing cutoffs on the negative mode integral, we can compute the NLO decay rate up to an O(1) relative correction.
However, beyond the semiclassical approximation, for example in Monte Carlo simulation, it is not obvious how to implement a cut on c 0 when the theory is formulated in configurations {x(t)} instead of the {c n } basis. We need a different approach with similar properties.
Instead, we consider a functional of x(t) defined as where Θ(·) is the Heaviside step function. Only times t such that x(t) > b, i.e., the configuration goes beyond the point b and into the classically allowed region R, contributes to S b V . V (x(t)) is lower than V (b) = V FV = 0 and thus negative when x(t) > b. In other words, S b V measures the contribution to the action solely from the parts that can lower it below the action of the false vacuum. Configurations can be characterized into a one-parameter family using We place a hard wall on S b V [x] during ensemble generation, and then place a more stringent cut on it during postselection. The latter is taken at the minimum location of the probability density function p(S b V ). This corresponds to not rejecting too many configurations (cutting off the Gaussian c 0 integral too close to the peak) while not moving too far in the direction of the true vacuum (where the result becomes exponentially sensitive to the cutoff). At the minimum of p(S b V ), results for observables are also minimally sensitive to the precise choice of the cut. In Appendix B we give a more detailed justification for this choice and test it on example potentials.
With this prescription for eliminating unwanted configurations, we anticipate that the ensembles indeed satisfy the conditions such thatρ is approximately T -independent and provides a good estimate of the rate Γ. We now turn to testing the method numerically on various example potentials.

The Potentials and Semiclassical Properties
We use "modified double-well potentials" of the form shown schematically in Fig. 1 as a family of useful QM examples. We parametrize the potential as where the value of V TV is defined to maintain the continuity of the potential at x = x TV .
(We remind the reader that in our normalization x is a 0 + 1D scalar field and thus has the dimension energy −1/2 rather than the dimension of a physical position, energy −1 .) We define the potential so that The large flat region to the right of x TV is useful to have a continuum or quasicontinuum of unbound states for the metastable state localized around x FV to decay into. The classical turning point is labeled by b and the classically allowed region is In the region x ≤ x TV this potential is exactly a quartic potential, so the semiclassical analysis is very similar to the case of the latter potential.
The only three parameters in this model are m, η, and λ. We then reparametrize the theory using a similar parametrization as in Ref. [13]. With the nondimensionalization intō t andx,t = mt (26) the Euclidean action of a path x(t) that does not enter the modified region x ≥ x TV can be rewritten as where there are two dimensionless parameters, We then choose m as the only dimensionful parameter. Thus m sets the energy scales of the theory, and we mostly work in units where m = 1. When needed, m can be restored from dimensional analysis. m, α, and β form the new set of parameters that are a rearrangement of m, η, and λ.
With the new parametrization, the potential in Eq. (25) takes the form We can analytically solve for the classical vacua and turning point, We further define the dimensionless potential, the dimensionless Euclidean Lagrangian, and the corresponding action,S This action is dependent only on α and independent of m and β. The complete action is proportional to β, Some useful relations between the two parametrizations are and the n-th-order derivative ofV (x), The parameter α always satisfies 0 < α < 1 and controls the shape of the potential. In the limit α → 1, the false and true vacua become degenerate as V TV → −2mβ(1 − α). In the limit α → 0, the true vacuum approaches minus infinity with leading behavior V TV → −27mβ/(8α 3 ). β is always positive and controls the overall scale of V . β → ∞ is the semiclassical limit where the quantum theory is governed by the classical bounce solution x b (t) (saddle point). Effects from quantum fluctuations δx(t) = x(t) − x b (t), except for the negative and zero modes, are exponentially suppressed by exp −β( when β is large. 13

Simulation Results
After introducing the S b V cut described in Sec. 2.3, we can perform a lattice Monte Carlo computation ofρ(FV, 0; b, T ), i.e., the probability density at x = b at Euclidean time T starting from the false vacuum state at time zero. We impose periodic boundary conditions in Euclidean time to improve the statistics; for large T , the temperature is low enough that the system initially thermalizes close to the false vacuum if the Markov chain is seeded with an initial configuration equal to the semiclassical false vacuum, x = 0.
To establish an appropriate cut on S b V , we first compute the probability density function p(S b V ). To find the minimum of this function, a finite sample may not be sufficient, since the function exhibits statistical fluctuations and we are interested in the region where p(S b V ) is approximately flat. We use kernel density estimation (KDE) [14,15] and gradient descent to compute p(S b V ) and search for the minimum. We use the Epanechnikov kernel [16] with the kernel width small enough to capture local variation of the density function but still large enough to contain sufficient configurations. The typical scale of the kernel width for our setup is O(10 −1 ). In each iteration step, KDE can compute p at the target S b V from the gradient descent with a low cost. In the gradient descent method, we start from several initial values of S b V and compare the local minima found by different initial values, due to statistical fluctuation, to find the global minimum.
As shown above, the decay rate Γ ≈ ρ b (T ) when 1/m ≪ T ≪ 1/Γ. Therefore, we report ρ b computed from MC as Γ and compare it against Γ computed from the solution of the TDSE, the NLO semiclassical Gel'fand-Yaglom (GY) method, and the LO semiclassical/dimensional analysis (DA) method Γ DA = me −S b . The TDSE and semiclassical results are only expected to agree in the far semiclassical limit, and comparing both with the MC results provides a measure of how much information the MC can access beyond the different levels of semiclassical approximation in intermediate regimes.
The parameters used in our MC ensembles are given in Table 1  Since we work in units where the mass is unity, and other scales in the problem like the spatial size of the semiclassical bounce solution are expected to be of this order, 1 we mostly work with lattice spacing am = 0.1 and volume N T = 400. These choices are expected to avoid large corrections from lattice artifacts and finite volume effects, which we validate by varying these choices in two of the analyses described below. The ensemble-level S b Vcut is mostly taken to be −2.0, which is large enough in magnitude to avoid impacting the postselection S b V -cut, while at the same time preventing the ensemble from probing 1 In quantum field theory, the bounce can easily be much larger than the scalar mass parameter since it scales as the inverse of the semiclassical energy splitting between the true and false vacua. This is an effect of a friction term in the equation of motion defining the bounce. In quantum mechanics the friction term is not present, and to obtain a bounce much larger than the input mass scale requires an exponential tuning of the energy splitting. Typically, the bounce is still larger than 1/m, so our estimate Γ DA = me −S b is actually larger than the usual LO estimate Γ LO = R −1 b e −S b common in the literature. We see from the figures that the latter would only worsen the discrepancy of the LO estimate with the NLO, TDSE, and MC results.
configurations too close to the true vacuum, where it could get stuck. With these reasonable choices for the lattice/ensemble parameters, we compute Γ for a range of potentials defined by α and β. In Fig. 4 we vary β with fixed α. From the semiclassical perspective, varying β is a probe of the LO exponential factor, Γ ∼ e −βS b . We find that the MC computation matches the exact TDSE result up to a factor < 2 over a range Γ/m ∼ 10 −4 -10 −1 . In the same range the NLO GY method achieves similar accuracy, with somewhat worse performance at higher rates. The LO estimate (DA) with dimensional analysis typically underestimates the rate by around an order of magnitude for these parameters. In Fig. 5 we vary α at fixed β. From the semiclassical perspective, this is a probe of the mild α-dependence of the LO exponential factor e −βS b (since S b only depends on α), as well as beyond-LO effects. Our MC results are in good agreement with both the TDSE and GY results. They are closer to the "exact" TDSE values than GY, which could be an indication that our MC method for computing Γ is capable of accurately capturing some information beyond the NLO semiclassical approximation. In Fig. 6(a), with all other parameters fixed, we vary the ensemble-level S b V cuts. The results from these ensembles are expected to be about the same. There is an uncertainty in finding the minimum of the probability distribution p(S b V ) measured on the ensemble, and this is the primary source of discrepancy among the values in Fig. 6(a). In principle, the minimum should be nearly independent of the S b V -cut at ensemble generation, but there is an uncertainty introduced by numerical minimization with a finite sample. As shown in Table 1 in Appendix C, the postselection S b V -cuts for the these ensembles are not the same, although they are all around −0.5. The uncertainty in the postselection S b V -cuts is not reflected in the statistical error bars in Fig. 6(a).
In Fig. 6(b), we vary N T with all other parameters fixed to test the T -dependence of our results. As discussed in Sec. 2.2, we expect the T -dependence of the measured quantity ρ b (T ) to be weak when 1/m ≪ T ≪ 1/Γ. With a = 0.1m −1 , N T ranges from 200 to 1000, so T = aN T /2 ranges from 10m −1 to 50m −1 . With α = 0.9 and β = 8.0, the value of 1/Γ obtained by solving the TDSE is about 300m −1 , so the condition 1/m ≪ T ≪ 1/Γ is satisfied. There is some mild variation in the MC results as we vary T , but within statistical uncertainties they fall between the TDSE and GY results for this model point, and the uncertainty in Γ associated with residual T -dependence is again a factor < 2.
Finally, in Fig. 6(c) we vary the lattice spacing a with other parameters fixed. There is an O(a 2 ) difference between the lattice action and the continuum action, so reducing the value of a can make the result more precise. Since m is the characteristic scale in the continuum theory, a should not be substantially greater than 1/m. However, for fixed time range T , smaller a leads to a greater number of sites N T = 2T /a, and greater computational cost. We find that values of a in the range [0.05m −1 , 0.4m −1 ] all give accurate results, justifying the use of a = 0.1m −1 for the majority of our previous computations.

Long Lifetimes
In the previous section, we saw that straightforward ensemble generation with a hard wall on the quantity S V b allows an accurate computation of the probability density ρ and thus a good estimate of the decay rate, when these quantities are not too small. However, when the lifetime becomes very long, direct generation of the ensembles becomes impractical: starting from the vicinity of the false vacuum, the saddle point is simply too difficult to find by random fluctuations.
Instead, we consider a modification of the computation which we refer to as constrained ensemble reweighting. In the ensemble generation, we fix the trajectories to the classical turning point b at the midpoint in Euclidean time. In doing so we give up time translation invariance and the associated improvement in statistics, but we gain much more by "telling" the MC that it needs to reach b. To be more precise, for each rate computation, we generate two ensembles, one with the constraint applied and one without, and attempt to compute the probability of finding configurations from the constrained ensemble in the unconstrained ensemble.
In an ensemble of N configurations with N T sites, the number of configurations ∆N [x ⋆ ] near a given configuration {x ⋆ i } 1≤i≤Nt in a vicinity of volume N T i=1 ∆x i is given by where c is a normalization factor. The ensemble generation may have some imposed constraints in the space of configurations. These constraints affect which configurations are allowed but still retain the relative probabilities of allowed configurations. The factor c may depend on the constraints but does not depend on configurations x ⋆ as long as x ⋆ is not forbidden by the constraints. c is also independent of the total number of configurations N .
For an ensemble with N 1 configurations generated by the modified double-well potential we are interested in, which we denoted as "ensemble 1," we first consider x ⋆ ≡ x ⋆ i = x FV for all i, i.e., the FV-static configuration. The number of configurations in the vicinity of the static x FV configuration is given by Now consider x ⋆ = x b , the bounce solution, in the same ensemble. Configurations in its vicinity are representative contributors to ρ. The number of such configurations is is exponentially suppressed in the semiclassical limit. In such a case, from ensemble 1, To circumvent the exponential suppression we can generate a second ensemble, denoted as "ensemble 2," with N 2 configurations constrained by x N T /2 = b, corresponding to a center time constraint x(t = 0) = b in the continuum. Due to this constraint, effectively there are now only N T − 1 sites on the lattice. The number of configurations in ensemble 2 in the vicinity of a configuration x ⋄ is where c 2 is a normalization factor different from c 1 (and even has a different dimension, [c 2 ] = [x][c 1 ]). In ensemble 2, false-vacuum-like configurations are not allowed due to the constraint, so for relevant configurations near the bounce, ] is numerically calculable without suffering from an exponential suppression.
We can use Eqs. (42) and (44) to estimate the probability density at x = b. We write Thus we extract the decay rate, where c 1 /c 2 can be computed from two ensembles as There is still an "exponentially hard" aspect of the method: for large lattices the probability of finding a configuration in a volume N T i=1 ∆x i near another configuration is exponentially small in N T . To ameliorate this we find that it is sufficient to work with somewhat larger lattice spacings and smaller volumes, without substantially sacrificing accuracy.
We test the method on a benchmark point with α = 0.9, β = 60.0, and we generate two ensembles with a = 0.3m −1 , N T = 120. As described above, in ensemble 2 we impose a constraint x(t = 0) = b and S b V = −1.20 during the ensemble generation to avoid the dominance of true-vacuum-like configurations. The number of configurations in ensemble 1 is N cf,1 = 100, 000. Ensemble 1 has no constraint at x(t = 0), and we have effectively set no S b V -cut either, because β is very large. It is highly improbable for a configuration in ensemble 1 to approach b, by a factor of order e −βS b ≈ 10 −27 , and we find that all configurations have S b V = 0. Therefore in the formulas above N 1 = N cf,1 = 100, 000. For ensemble 2 we still need to impose cuts on S b V , similar to the procedure described in Sec hold exactly. However, the general principle that the cut should be chosen prior to the onset of the exponential rise in the S b V distribution still applies, and in practice we find that the same choice of postselection cut S b V > −0.5 is adequate. In general the variation of the cut within a range that does not sample the exponential rise, or approach unnecessarily close to zero, leads to an O(1) impact on the final result for the rate. This would be a reasonable target accuracy for this method, but in our initial investigation here we find somewhat larger sources of error. After postselection, ensemble 2 contains N 2 = 48, 674 configurations.
To carry out the analysis we must define the configurations around which to count neighboring configurations in each ensemble. For ensemble 1 we could simply use x ⋆ = x FV (t) = 0, as used in the formulas above. For ensemble 2, a convenient choice for x ⋄ is to construct a smoothed configuration by taking the mean or median value of x evaluated at each t over all the configurations in the postselected ensemble. We use the median configuration, shown in Fig. 7, to reduce the effects of possible outliers, but the mean configuration is in fact extremely similar. (To keep the ensembles on the same footing, we also use the median configuration in ensemble 1 for x ⋆ rather than directly using x ⋆ = x FV = 0, but the difference is negligible and we continue to refer to the central configuration for this ensemble as x FV .) We also overlay the semiclassical bounce solution in Fig. 7, demonstrating, as a by-product, that the smoothed configurations closely approximate the bounce, as one might expect deep in the semiclassical regime.
The vicinity of the median configuration is defined by choosing the windows {∆x i }. In principle we would like all ∆x i to be infinitesimal, but this is not possible in practice, because the number of configurations in the neighborhood is exponentially small in the number of sites N T . Instead, we take ∆x i to be finite at order O( β/m), i.e., the characteristic scale of the potential in x-space. For simplicity, we choose ∆x i ≡ ∆x to be site-independent. For ensemble 2 a configuration x(t i ) is identified as lying in the vicinity of    In Fig. 8, we use these two different prescriptions for the exponential factor in (47) to compute c 1 /c 2 with finite-sized neighborhoods. The statistical uncertainties are greater at smaller ∆x because fewer configurations survive. At intermediate ∆x the results are very close to an exponential function of ∆x. Heuristically this can be understood as follows. In ensemble 1, fluctuations can only raise the action, so as ∆x increases the number of neigh-boring configurations rapidly saturates to an O(1) fraction of the total in ensemble 1. In ensemble 2 the fluctuations do not necessarily raise the action and saturation only occurs at larger ∆x. These behaviors are reflected in Fig. 8. The difference in the typical action of fluctuations then implies an exponential difference in the ∆x distribution of configurations which is measured by c 1 /c 2 . By contrast, even in the second method, the exponential prefactor is highly stable with ∆x, as shown in Fig. 8. (In the first method this factor does not change, by definition.) We now perform four estimates of the decay rate from these results, corresponding to each of the two methods of computing the difference −S [x ⋄ ] + S [x FV ], and taking the results in Fig. 8 with and without exponential extrapolation to ∆x = 0. With exponential extrapolation, we fit c 1 /c 2 as a function of ∆x to the form p 0 exp(−p 1 ∆x) where p 0 and p 1 are fit parameters. Since we are only interested in semiquantitative extrapolation, we use a naive fit that ignores the correlation among data at different ∆x and treats them as uncorrelated. In this way we obtain a conservative estimate of the uncertainties arising from practical limitations on the smallest ∆x that can be accessed directly.
The values of c 1 /c 2 at different ∆x are correlated, and we perform exponential fits only using data with relatively small statistical uncertainties. The extrapolated results at ∆x = 0 are (8a) c 1 /c 2 ≈ 9.0 × 10 −26 √ m and (8b) c 1 /c 2 ≈ 1.5 × 10 −24 √ m. Since c 1 /c 2 is an estimate with order-of-magnitude uncertainty associated with finite ∆x.
The semiclassical NLO estimate for the decay rate is Γ GY = 8.61 × 10 −26 m, while the leading order estimate is about two orders of magnitude smaller, Γ DA = 1.56 × 10 −27 m. The central value in Eq. (48) is close to the NLO result and the conservative uncertainty band is still tighter than the LO-NLO difference.
We regard the method and analysis presented in this section as a promising first exploration of simple reweighting techniques for systems with long lifetimes. To better control the uncertainties, a more rigorous argument for the exponential extrapolation is essential, and the two estimates of −S [x ⋄ ] + S [x FV ] can be compared with larger ensembles across a range of potentials. Nonuniform ∆x i might also provide a useful tool. We leave these directions to future work.

Conclusions and Outlook
In this work, we develop a new framework for studying systems with metastable vacua in Euclidean Monte Carlo simulations. Our main results are (i) In quantum mechanics with a metastable vacuum state in the potential, the decay rate can be estimated if the probability density is known at the classical turning point, as shown in Eq. (8). The probability density can be expressed in terms of a lattice observableρ, see Eqs. (16), (15), (14), and (9).
(ii) Direct lattice simulation is feasible if the lifetime is not too long and a wall is inserted to prevent the ensemble from wandering into the basin of the true vacuum. For this purpose we find that a cut on the total contribution to the potential energy from the classically allowed region, S b V , provides an effective barrier, Eq. (24). We place a loose cut during ensemble generation and a tighter cut in postselection. A good choice for the latter is the minimum of the sample distribution of S b V . This cut avoids the need for any analytic continuation, while introducing an uncertainty into the final result.
(iii) Testing the method over a family of example models, we find that we can reproduce the results of numerical exact diagonalization to similar or better accuracy than nextto-leading-order semiclassical analysis with the NLO prefactor computed numerically using the Gel'fand-Yaglom method. The differences are generally an O(1) factor, while the leading order semiclassical estimate with prefactor fixed on dimensional grounds is generally off by more than an order of magnitude. The lattice results show satisfactory stability when varying over a range of lattice simulation parameters.
(iv) For long lifetimes, a direct lattice computation is again infeasible, but we find that a simple modification of the technique is effective to compute the probability density at the classical turning point b: we generate an additional ensemble with a constraint that the trajectories reach b at a fixed time. By a reweighting procedure we can then estimate ρ(b) using Eqs. (46) and (47). In an example case this method gives results consistent with NLO semiclassics within an order of magnitude, while LO semiclassics differs by 2 orders of magnitude. The uncertainties are driven by an extrapolation and might be improved by refinements of the method.
Our work is of an exploratory nature and as such we focus here on the simplest one-particle quantum mechanical theories. In these theories there are multiple other accurate means of computation (exact diagonalization, NLO semiclassics), which we use to benchmark our method. Lattice techniques would be of limited interest if they were confined to one-particle quantum mechanics. Fortunately, there are reasons to be optimistic about the future extensions to multiparticle quantum mechanics and field theories. The main new aspects in the more complex theories are the presence of a classical turning surface, rather than a turning point, and of renormalization effects. A natural first step would be to generalize the probability density as a function of particle coordinate x to a probability density in the energy of field configurations on spatial slices; the density at the turning point b should then be replaced by the probability density at energy equal to that of the false vacuum. This energy is shifted by quantum effects, as are the model parameters in the usual way, and one could attempt to account for renormalization effects by standard lattice methods. Our analysis in Sec. 2 would need to be extended to obtain the relationship between Γ and ρ(E) appropriate for field theories. We hope to address this problem in future work.
Following the real-time evolution of metastable states is also an important problem for the 23 nascent field of quantum simulations applied to high energy physics. It would be interesting to explore hybrid classical-quantum techniques utilizing the lattice methods developed here.
The most exciting application of lattice Monte Carlo techniques to theories with metastable vacua is in cases where a precise semiclassical formulation is not well-understood. These include scalar theories where the false vacua are not present in the classical potential, but are generated by quantum effects, and gauge theories where long-lived false vacua are believed to be generated by strong dynamics (e.g. Yang-Mills at large N [17].) Our work is only a first step in this direction, and both theoretical and computational developments are needed to perform accurate computations in all of the theories of interest. It would be interesting to explore application of the multicanonical method [18][19][20][21][22], which has been developed to address critical slowing down in systems with first-order phase transitions, to the case at hand with exponentially slow quantum tunneling. In addition to the theoretical aspects mentioned above, on the computational side, smarter sampling such as creating ensembles using machine learning techniques [23,24] might improve the accuracy when the decay rates are very slow. However, for the purpose of simply verifying the existence of metastable states, straightforward lattice simulations may in fact be quite effective.
where r ≡ |t| is the distance in Euclidean time from the center of the bounce.
It is more convenient to work with dimensionless quantities. The decay rate is then The potential in the dimensionless form is In the semiclassical limit, the flat region atx ≥x TV does not affect the result, and we can instead useV ( wherer = |t| and These two differential operators are both parity-conserving, so each operator has two superselection sectors: odd functions oft and even functions oft. The zero mode of M is an odd function. We can thus break up the operators into and compute the functional determinant ratios for each sector.
All even modes have nonzero eigenvalues. From the Gel'fand-Yaglom theorem, where ψ even and ψ free even are regular solutions of M free even ψ free even = 0, and R even (r) ≡ ψ even (r) ψ free even (r) . 25 After some algebra, we obtain the equation for R even , R ′′ even (r) + 2 tanh (r) R ′ even (r) − V (r) R even (r) = 0 (63) with the initial condition R even (0) = 1 and R ′ even (0) = 0. This is an ordinary differential equation that can be solved numerically once the exact form of the potential is given. Then we take the limitr → ∞ to compute R even (∞).
We cannot use the same method to compute the determinant ratio in the odd sector because of the zero mode. Instead, we apply the collective coordinate method to systematically remove the zero mode [13]. The result is wherex ∞ is defined by the asymptotic behavior ofx b at r → ∞ Combining Eqs. (52), (62), and (64) we obtain where R even (∞) is negative.

B Specification of S b V -cut
In Sec. 2.3 we introduced the quantity S b V defined on each MC configuration and used two cuts on it (ensemble-generation-level and postselection) to prevent sampling problematic configurations that probe too close to the true vacuum. The postselection cut was placed at the minimum of the probability density of configurations as a function of S b V , In this appendix we discuss the properties of p S b V in more detail and give a physical model to explain the typical finding S b V | min(p(S b V )) ≈ −1/2. The denominator in Eq. (67) is independent of S b V and serves as a normalization factor for the total probability such that 26 for simplicity. We define the density of number of configurations per S b V as and the average value of e −S over configurations conditional on S b V as Then the probability density of configurations per S b V can be rewritten as  Further, with decreasing S b V , the total action S is approximately decreasing. The negative mode with λ 0 < 0 is the only mode that lowers the total action when going away from the bounce solution. Therefore, in this region, the change in c 0 dominates the change in the total action and also the change in S b V . Under this assumption, we have , no longer independent of the value of S b V . Then, V and the total action S calculated from kernel regression (also by using the Epanechnikov kernel with suitable choice of the kernel width). Linear fits for both S and S − S b V are shown for comparison. The linear coefficient for S − S b V in Fig. 9(d) is qualitatively close to 0. In Fig. 9(b), the linear coefficient 0.88 is greater than 0, but this discrepancy is comparable to the generic statistical uncertainty at each point.
where c 0 [x] is the negative mode coefficient of the configuration x(t), and c 0 <c 0 [x]<c 0 +∆c 0 Dx = constant×∆c 0 . Combining these observations we obtain an approximate model for the probability density, The minimum is Physically we expect (S b V ) 0 to be small, and approximating (S b V ) 0 ≈ 0 gives the minimum (S b V ) min = −1/2. Now let us compare with Monte Carlo. In Figs. 9(a) and 9(c), we examine results from two simulated potentials and we fit the measured p(S b V ) with a model similar to (but slightly generalizing) Eq. (73). The fit is not used for the computations of the decay rate, only for the illustration of the physics of the quantity S b V . There is some subjectivity in choosing the fit range of S b V , because the lower end of the MC result is affected by the cut on S b V , and the upper end of S b V ≈ 0 is not expected to satisfy the conditions for the above arguments. Statistical errors in the density of configurations from Monte Carlo are not considered in the fit for simplicity.
Our argument for the functional form of p(S b V ) is not meant to be precise. We see that the model fit is good, but there are deviations from Eq. (73). For example, the coefficient in the exponent returned by the fits is not exactly −1. The constant (S b V ) 0 , in the example of Fig.9 (b), is about −0.1. The fit in Fig. 9(a) gives (S b V ) 0 ≈ −0.085. Similar inaccuracies in the model can also be seen in Figs. 9(d) and 9(c). However, it suffices as a qualitative description, and indeed we find in our numerical studies that the stationary point of p(S b V ) is generically in the range −1.0 to −0.1. The most important conclusion is that it is reasonable to expect the probability density to have a minimum, roughly somewhere in this range.
We now use semiclassical arguments to assert that the effect of varying the cut on which differs from the result from analytic continuation only by an O(1) factor.
To summarize, we propose to place a cut the configurations at the value of S b V at the minimum of the sample distribution p(S b V ). In practical Monte Carlo simulations, a lower cut in S b V that contains the stationary point is needed in ensemble generations in order to find the appropriate cut in S b V . A relatively small ensemble may be enough for giving a conservative estimation of where to cut. Then, a postselection of configurations discards 29 configurations with S b V lower than the stationary point. Computation of observables is then performed on the ensemble after postselection.

C Details of the numerical computations
In this appendix we provide details of the MC ensembles and the methods used to numerically analyze the MC data.

C.2 Binning the postselected ensembles
After analyzing the distribution of configurations in S b V -space we apply the postselection S b V -cut. The retained configurations define the postselection ensemble on which we measure observables. Since our original ensembles have autocorrelation, the postselection ensemble is also autocorrelated. With a bin size K, the effective number of independent configurations is N cf,post /K.
Considering small and large limits of the ratio N cf,post /N cf can be used to justify binning the Monte Carlo configurations in the postselected ensemble. In the small limit, the ratio N cf,post /N cf → 0, and the postselected configurations become essentially uncorrelated. In this case, binning is unnecessary. In the large limit, the ratio N cf,post /N cf → 1, and the postselected ensemble is similar to the original ensemble, where binning is standard.
On our postselected ensembles, to determine the suitable bin sizes K, we change K and calculate the statistical error ofρ on binned configurations (by using the mean value ofρ over each bin). For ensembles in Table 1, a generic suitable choice of K turns out to be about 200, where the statistical error starts to saturate. This bin size is used to obtain the statistical errors shown in Figs. 4, 5, 6.   When applying the constrained ensemble reweighting technique introduced in Sec. 4, we find that the relevant quantities used in the calculation, such as the frequency of the event x ⋄ (t i ) − ∆x/2 < x(t i ) < x ⋄ (t i ) + ∆x/2, are not sensitive to the bin size K. In practice, we use K = 5 as a safer choice than K = 1. To prevent the smoothing artifacts due to taking the average of K configurations, for every K configurations, we only use one configuration in the calculation and skip the remaining K − 1 configurations.

C.3 Numerical details of the constrained ensemble reweighting
We show the probability distribution of original configurations in ensembles 1 and 2 in Fig. 10 with ∆x = 4.0m −1/2 . We find that p(S 1 ) and p(S 2 ) are peaked at much higher values of S 1 and S 2 than the actions of the median smoothed configurations (about 0.142 for ensemble 1 and about 62.082 for ensemble 2). As explained above this is to be expected due to high-frequency fluctuations in the original configurations. In calculation of Sec. 4, we use the statistics of S 1 and S 2 over the distributions (also subject to the change in ∆x) at variable ∆x and use them jointly to evaluate e −S[x ⋄ ]+S[x FV ] in Eq. (47).