Extraction of bare Form Factors for $\mathrm B_\mathrm s \to \mathrm K \ell \nu$ Decays in non-perturbative HQET

We discuss the extraction of the ground state $\langle \mathrm{K} ({\bf p})|V_\mu(0)|\mathrm{B} ({\bf 0})\rangle$ matrix elements from Euclidean lattice correlation functions. The emphasis is on the elimination of excited state contributions. Two typical gauge-field ensembles with lattice spacings $0.075, \; 0.05$ fm and pion masses $330,\;270$ MeV are used from the O($a$)- improved CLS 2-flavour simulations and the final state momentum is $|{\bf p}|=0.5\,{\rm GeV}$. The b-quark is treated in HQET including the $1/m_\mathrm{b}$ corrections. Fits to two-point and three-point correlation functions and suitable ratios including summed ratios are used, yielding consistent results with precision of around 2% which is $not$ limited by the $1/m_\mathrm{b}$ corrections but by the dominating static form factors. Excited state contributions are under reasonable control but are the bottleneck towards precision. We do not yet include a specific investigation of multi-hadron contaminations, a gap in the literature which ought to be filled soon.


Introduction
Decays of b-quarks in the form B → π ν and B s → K ν are very relevant in constraining the Standard Model of particle physics and hence also in the search for deviations from it.Lattice QCD is the method of choice for predicting the necessary form factors [1][2][3][4][5][6][7][8][9].While computations are progresssing significantly, there are still few groups carrying them out.Crosschecks by independent (lattice) methods are largely absent.In Ref. [10] we have started an investigation which takes significantly different avenues in almost every choice that can be made.We use (improved) Wilson fermions, treat the b-quark in nonperturbative HQET and compute three-point functions for all time separations rather than fixing its total time-span (called τ = t K + t Bs later) to one or few values.We opt for non-perturbative HQET because a complete and practical method is known to nonperturbatively renormalize the theory and match it to QCD [11][12][13][14][15].When completed, this will provide an independent crosscheck.
Once these choices are made, one needs to carry out the basic steps as written in Ref. [10]: a) obtain the ground state matrix elements K|V µ (0)|B s that mediate the transition, b) renormalize the currents (and thus matrix elements) and, if an effective theory is used, relate them to QCD ("matching"), c) take the continuum limit of the matrix elements, d) extrapolate to the quark masses realized in Nature, e) map out the q 2 dependence.
In the previous work we saw that the continuum limit is smooth in the static approximation, which is expected to yield the by far dominating part of the full result.In this paper we discuss the step a) in detail, including the 1/m terms.It is not at all an easy enterprise.As discussed at length below, the issue is that ground state matrix elements are obtained at large Euclidean time separations.At time separations of around 1  2 fm -a typical QCD length scale -the low-lying states start dominating the two-and three-point functions.
Unfortunately the statistical quality of the Monte Carlo estimates is typically deteriorating fast once 2 fm in total (or 1 fm for K and B s each) are reached.It is thus a matter of the details of methods and QCD dynamics whether there is a window to determine the desired form factors at the percent level.
In other words, control over the extraction of the bare form factors is crucial in a proper determination of the decay rates.In this publication we thus explain our methods and their limitations in detail.
2 Setup for the non-perturbative evaluation To leading order in the weak interactions, the B s → K ν decay rate depends on two form factors, h (E K ) and h ⊥ (E K ). which are related to the commonly used 2 The kinematics is explained in Ref. [10], and in the rest-frame of the B s , we have where (E K , p K ) is the four-momentum of the final state and |B s (0) is the initial state with the B s at rest, p Bs = 0.The above hadron-to-hadron matrix elements are in Minkowski space, as indicated by the subscript M. They can be obtained from the Euclidean threepoint function (fig.1), where the spatial volume is a L 3 torus, with suitable interpolating fields O qq , O qq (see (2.14), (2.16)) and Removing the overlaps B s (0)|O bs |0 , 0|O us |K(p K ) (amputation), requires also the twopoint functions C Bs (t) = a 6 L 3 x f ,x i O bs (x f )O bs (x i ) , t = (x f − x i ) 0 . (2.8) The explicit relation between the Euclidean correlation functions and the desired matrix elements is (see e.g.[16]) (2.9) A few comments are in order -We chose to define the correlation functions on a lattice.Replacing a 3 x → d 3 x yields the continuum expressions.
-The derivation of Eq. (2.9) (on a lattice) is based just on the existence of a transfer matrix with standard properties and on interpolating fields which are local in time; non-locality (smearing) in space is allowed.
-Of course, on a finite lattice the limit t → ∞ has to be replaced by a proper procedure.Controlling it is the main topic of this paper.As a first, straightforward, procedure we note that it would be sufficient to have t (as well as the spatial box length L) large compared to the typical QCD scales Overview of the subset of N f = 2 CLS ensembles used.They have T = 2 × L. Lattice spacings, a, are taken from Ref. [19] and Ref. [20].The pion mass is denoted by m π , and we have m π L ≥ 4 such that finite-volume effects are sufficiently (exponentially) suppressed at the level of accuracy we are aiming for.
-In order to obtain the physical form factors, the current V µ has to be properly renormalized and the continuum limit has to be taken.
So far we have spoken of standard relativistic (lattice) QCD.The non-perturbative HQET expressions including O (1/m) terms are obtained by the following generic expansion of correlation functions [11,17,18] where [C Bs→K ] stat is the correlation function in the static approximation and The "stat" terms refer to lowest order (static) HQET, while k ∈ {kin, spin} are the terms due to the 1/m corrections to the action, and the remaining terms in the sum over k are 1/m corrections to the currents, see Sect.6.2.In δ O , we collect all terms which arise from the renormalization and 1/m corrections to the interpolating fields.Their form is irrelevant since they cancel in the ratios of interest Eq.(2.9).Apart from δ O , all divergences strictly cancel within Eq. (2.10) iff the parameters log Z HQET

Vµ
and ω k are determined nonperturbatively.The expansion of log(C Bs ) reads the same apart from the current-terms which are absent.Finally, functions of correlation functions such as the ratio Eq. (2.9) are trivially obtained from the expansion of the correlation functions.
For completeness we remind the reader of the rule of the game: the expansion is "derived" by considering the parameters ω k as O (1/m) and expanding in 1/m, despite the fact that individual pieces may be highly divergent.The logarithm is taken in Eq. (2.10) since then all expressions are automatically linear in the parameters and higher order terms in the parameters (and thus 1/m) are directly avoided.Those would otherwise have to be dropped to preserve renormalizability.Details of the measurements.The hopping parameter of the valence strange quark, κ s , has been determined as described in the text.The number of configurations on which we computed the correlations functions is denoted by N cfg .The number of smearing iterations applied to the light quark(s) in the K and B s meson is given by I K and I Bs r , respectively.The integer vector n and the scalar twist angle ϑ determine the momentum of the K meson by setting

Lattice Setup
For the lattice computation of the correlation functions we use gauge field configurations generated by the Coordinated Lattice Simulations (CLS) effort [21] with two degenerate flavors of O(a)-improved Wilson fermions [22] and Wilson gauge action [23].In the following, we present only results from the two representative ensembles, labeled A5 and O7, see Table 1.The lattice spacings were determined from the Kaon decay constant f K at physical pion mass in Ref. [21] and updated in Ref. [24].They do not depend on the quark mass, i.e. we use a mass-independent renormalisation scheme.
The natural choice for the mass of the valence up quark is equal to the mass of the degenerate up and down sea quarks, m l , and thus fixed by the gauge field ensembles.For the mass of the spectator (valence) strange quark we are free, however, to choose any smooth function m s (m l ) passing through the physical point.We follow strategy 1 of Ref. [21] and define m s (m l ) by fixing the squared ratio between the Kaon mass and the Kaon decay constant to its physical value R phys K ≡ (m phys K /f phys K ) 2 with m phys K = 494.2MeV and f phys K = 155 MeV.We expect that this will lead to a flat extrapolation to the physical value of m l , the "physical point".At the simulated (unphysical) values of m l , however, the resulting Kaon mass is m and only approximately equal to its physical value (see e.g.Fig. 3).
In the static correlation functions on the r.h.s. of Eq. (2.10), the heavy quark field ψ b is replaced by a static quark ψ h = γ 0 ψ h with the HYP2 discretizations [25,26].At NLO also the k ∈ {kin, spin} correlation functions are needed, see Appendix A.1.They are static correlation functions with an extra insertion of O kin or O spin from the HQET action [11].Finally there are the extra 1/m terms in the current V µ , see Sec. 6.2 and Appendix A.1.We implement O(a) improvement perturbatively by writing the vector current as when we evaluate the static matrix elements, but we use c V k = c V 0 = 0 when V stat µ enters ratios at order 1/m such as Eq.(2.11).
The interpolating field for the Kaon, is constructed from two smeared quark fields, ψ sm s and ψ sm u , each with I K iterations of Gaussian smearing.
Integrating out the Grassmann-valued quark fields yields the correlation functions in terms of traces of products of quark propagators and other factors such as smearing operators.In order to make full use of translation invariance, we evaluate the traces by a stochastic estimator which we represent by a single random U (1) vector with support on time slice (x f ) 0 .All values of (x f ) 0 are averaged over ("full time dilution").More details can be found in Appendix A.
The momentum transfer q 2 is computed from the continuum relation (2.17) with p K in the rest frame of the B s and physical values of the masses (using m phys Bs = 5366.77MeV).In order to have q 2 fixed to the same value on all ensembles, we need to adjust p K accordingly.This is achieved by using flavor-twisted boundary conditions for the (valence) s and b quarks, i.e. instead of periodic boundary conditions, we impose ) for the unit vectors k in k-direction.In this way, the momentum of the K meson becomes while the B s meson is kept at rest (p Bs ∼ θ b − θ s = 0) by choosing θ b = θ s .We fix the value of |p K | = 0.535 GeV, and hence q 2 = 21.23 GeV 2 . 1 1 Note that small changes of the input parameters lead to The particular value for pK was chosen on the "N6" lattice which we do not discuss here, but which will be used in the physics analysis.On N6 we have L/a = 48 and β = 5.5 such that n = (1, 0, 0) T and θ s = 0 lead to |pK| = 0.535 GeV.

Static correlation functions
For t K , t Bs a, where any possible violations of positivity are exponentially damped2 , and assuming infinite T for the moment, we can decompose the Euclidean correlation functions as (in the following we keep the p dependence of C K and C Bs→K implicit): where the indices r, s label the smearing levels used for the B s meson, while the indices m, n label the Kaon and B s meson energy levels respectively.A rather small number of states leads to a good precision if the time separations t K , t Bs are sufficiently large.In our fits we will use N K = 1 and N Bs ≤ 3.
By comparing the Euclidean representations of the correlation functions with Eq. (2.9) we see that the bare form factors of interest are given by ϕ (0,0) µ from Eq. (3.3), e.g.
In the following we describe two methods to determine them: 1.By directly fitting3 the correlation functions in a window of time extents where one can reliably limit the influence of excited states, statistical noise and other systematic effects (such as the finite-T extent of the available periodic lattices).This will be the focus of Section 4.
2. By forming suitable ratios of correlation functions (like the one in Eq. (2.9)) where at large enough t the dependence on all or most of the additional parameters cancels out.This will be the topic of Section 5.
As we shall see, for both of these methods it is beneficial to first obtain a good determination of the two-point function parameters, as well as to determine regions in the three-point functions where one is safe from the finite-T effects.These prerequisite steps are described in the following subsections.First we describe our methods to extract κ (0) and E for ensemble A5 and two values of momentum, together with the fitted energies.In this graph as well as all following ones we use MeV and fm units just for illustration.We thus convert from lattice units to physical ones without taking into account the error in the lattice spacings.The lattice parameters for the Kaon with momentum |p| = 535 MeV are given in Table 2.For the Kaon at |p| = 757 MeV, we have used n = (1, 0, 0) and ϑ = −0.9263.

Kaon correlation functions
While Eq. (3.1) is true in the limit of infinite T , in practice we have to take into account that T is finite.With our periodic boundary conditions in time, the ground-state contribution reads Since the signal-to-noise problem in the Kaon sector is very mild, we average C K (t) and C K (T − t) and always fit until the middle of the lattice, t max K2 = T /2.For the start of the fit range, we follow Refs.[19,32] and use the (rather conservative) criterion that t K2 min is the first t for which the excited-state contribution, estimated by a two-state fit, is less than 25 % of the statistical error of the effective mass In practice, we obtain t min K2 ≈ 1.3 − 1.4 fm.In this range, we fit the correlator to Eq. (3.5) to get the ground-state energy and amplitude.Note that the gap between the ground state and excited states in the Kaon sector is large, approximately 1 GeV ≈ 5 fm −1 .
To check the behaviour of the signal-to-noise ratio and the dispersion relation, we calculated the Kaon two-point function at two values of momentum.The effective mass plots for both momenta are presented together with the corresponding fits in Fig. 2. At higher momentum the signal-to-noise problem becomes significantly more severe.The dispersion relation, also including a measurement for p = 0, is presented in Fig. 3.The continuum dispersion relation describes the data very well even at the coarsest lattice spacing, showing that the cutoff effects in E K /m K are below our statistical precision.

B s correlation functions
For the two-point heavy-light (B s ) correlator, we have three different smearings (in addition to the unsmeared correlator, which is not used due to its large contamination from highly excited states).With the diagonal and the off-diagonal terms, we have six independent correlators in a symmetric C Bs rs matrix.The determination of the parameters of the B s correlation functions is divided into several steps, which are described in the following subsections.

Energies from the GEVP
We first determine the energies by solving the generalized eigenvalue problem (GEVP) on C Bs rs [33].We use all N Bs = 3 interpolating fields.The GEVP is defined as and yields approximations to the lowest N Bs energy levels via We use t 0 = t/2 since this asymptotically accelerates the convergence (with t) to the energies E (n) Bs [34].For each n separately we then find the GEVP estimate of the energy E for the excited states) are included in the average.The start of the plateau, t min , is chosen by requiring r(t min ) = is the result of the plateau fit starting at t min and δt ≈ 2/(E Bs ), which we approximate by choosing δt = 0.3 fm.For the ground state this criterion coincides with that of Ref. [35], ensuring that the statistical error dominates over the systematic one.Examples of E (0) Bs,eff (t) are presented in Fig. 4 (together with E K,eff (t) for comparison).Note that a good estimate of the ground-state energy will be important in Sec. 5.
For the excited states one should in principle use a higher value of δt ≈ 2/(E (N ) Bs ) but this is not possible with the current precision of the data, as one ends up in regions dominated by noise.However, as opposed to the ground state, for the excited-state energies we only need reasonable rough estimates, as any residual systematic error will be eliminated in the subsequent fitting steps as described in Sec. 4.

Amplitudes
Having obtained the energies, we determine the amplitudes and estimate the squared amplitudes through a linear fit.These values are used as initial guess for a non-linear fit for the amplitudes β to all the correlators, including the offdiagonal ones.In this way also the relative signs of the amplitudes are obtained.
Then, we try the full fit for both amplitudes and energies whenever possible.On certain ensembles the full fit becomes unstable with N Bs = 3, in this case we fall back to the fit with fixed energies.
The upper end of the fit range, t max B2 , is common for all the six correlators and is determined by the relative noise criterion, i.e. as the last point at which the relative error is smaller than 2.5% for all of the C Bs rs .For our ensembles this gives 2.1 − 2.4 fm.Also the lower end of the fit range, t min B2 , is the same for all C Bs rs .It is a tunable parameter that is common for the determination of the amplitudes and for the combined fit (cf.Sec. 4).Due to the finite time extent of the lattice, the three-point function also receives contributions from particles propagating ("wrapping") around the torus in time direction and is the sum of the contributions from the two diagrams shown in Fig. 5.At large enough time separations, the contributions of the corresponding lowest states can be written as

Three-point static correlation functions
where ξ µ,r = 0|V µ |B * B * |P hl |K contains unknown matrix elements of the state B * , the lightest heavy-light state contributing to the "wrapper" diagram.At static order the energy Bs .Instead of introducing extra fit parameters ξ µ,r (and possibly others, if excited states need to be included), we restrict ourselves to time separations where the wrapper contributions are negligible.
For this purpose, at every fixed t Bs we fit the three-point function to the form with B µ,r and C µ,r being linear fit parameters (which one can express in terms of the amplitudes and matrix elements from Eq. (3.9)) and E being a non-linear fit parameter (although one could in principle set it to E K extracted from the two-point function).The fit is done in a region where ground-state dominance in the Kaon sector is expected (starting at approximately t min K3 = 0.8 fm).Fig. 6 shows an example of such a fit at fixed t Bs .For every t Bs , µ, and r separately, we then find t max,wr K3 as the last t K that fulfills the condition: with c wr = 0.25.The times t max,wr K3,µ (t Bs ) are shown by the red lines in Fig. 7 for the two representative ensembles A5 and O7 (together with other constraints for the fits as discussed in Sect.4).We observe that the wrapper criterion yields considerably different t for µ = 1).On the other hand, the different smearing levels give very similar limits, but are still all treated separately.

Matrix elements at static order from a combined fit
From the two-point function fits, we have obtained (estimates of) the energies and amplitudes, κ (m) , β (n) r .Taking them as input values, we can now fit Eq. (3.3) for each µ = 0, 1 with ϕ (m,n) µ , i.e. the form factors, as N K × N Bs linear parameters.These could in principle serve as the final result.However, for better stability, accuracy, and control of the systematic errors, we take all these results only as initial conditions for a combined simultaneous fit to the two-point and the three-point functions.In this way, we can use the information on the energies and amplitudes also from the three-point functions, which contain many data points.To estimate the errors we use [36,37] but with the derivatives with respect to the fit parameters calculated analytically.
The numerical results presented in this section are for the fit with N K = 1 and N Bs = 3, corresponding to a total of 20 fit parameters 4 .We find that this choice gives the best stability with respect to the changes of the fit ranges, and therefore we use it to extract the values of ϕ (0) µ .An alternative fit with The fits with both N K > 1 and N Bs > 1 can easily become very unstable without some form of prior assumptions about the values of the excited-state form factors.
To keep statistical and systematic errors under control, only a small and carefully chosen subset of all the available time separations 0 ≤ t K + t Bs < T of the three-point function in the t Bs -t K -plane can be used in the fit., and t min K3 for lattices A5 and O7.The different lines with same color show the values for the three smearings.The vertical dashed lines are the two choices of t min B3 for the combined fit of Sec. 4 and the dots correspond to the data points included in the fit.To guide the eye, we also show the line t K = t Bs which is relevant for the ratios in Sec. 5.
Moreover, it is important to have a criterion for good determination of the groundstate form factors by the combined fit.However, it is not easy to find a simple strict criterion, like the one from Sec. 3.1 for E K , and we monitor several criteria.Details of both issues, the choice of the fit ranges and the quality of the fit, are discussed in the following.

Fit ranges
For the three-point functions we select the data points to be included in the combined fit by the constraints t with c noise = 2.5%.The corresponding curves are shown in Fig. 7 and illustrate that A5 is an example of a wrapper-limited lattice (due to a small physical time extent).On the other hand, O7 is noise limited and by reducing statistical errors one may hope to improve the determination of the form factors. Also the function t min K3 (t Bs ) is selected by an automatic criterion6 analogous to the one described in Sec.3.1: for each t Bs , µ, and smearing, we do a two-exponential fit to t K ∈ [0.4 fm, t max K3 (t Bs )] (with the energies fixed to the ones extracted from the two-point function).We then find the minimum time t min K3 * (t Bs ) at which the excited-state contribution is smaller than 25% of the statistical error.The final value is set to to avoid values getting too small in the regions with large noise, where the two-state fit does not work well.The resulting t min K3 is shown by the blue curves in Fig. 7.The remaining tunable parameters of the fit are t min B2 and t min B3 .In Ref. [10] we used t min B2 ≈ 0.45 fm, which seems sufficient when N Bs = 3.With better smearings and improved analysis methods, we can use slightly lower values, with two choices: t min B2 ≈ 0.32 fm ("aggressive") and 0.38 fm ("conservative").
Concerning t min B3 it was observed in Ref. [10] that the fit performs better if one takes a positive t min B∆ ≡ t min B3 − t min B2 .This can be understood by noting that the amplitudes in the two-point function are proportional to |β i .Thus, the suppression of excited states is expected to be stronger in the two-point functions if i .We find that this is in fact the case, except for the lowest smearing. 7 With increasing time separation t Bs the suppression of the first excited state is proportional to e −(E (1) Bs )t , which approximately amounts to 0.67 at t = 0.2 fm, 0.4 at t = 0.4 fm, and 0.25 at t = 0.5 fm.Although it is not immediately obvious which value one should take, this justifies a non-zero value of t min B∆ .We keep t min B∆ ≈ 0.2 fm, which gives a good compromise in suppressing the higher excited-state contributions and keeping the number of data points large enough for good fit stability.

Fit quality and results
The combined fit is uncorrelated, i.e. we use a diagonal covariance matrix.For such a big matrix the full covariance matrix is very badly determined and conditioned (in fact, if the number of data points is bigger than the number of measurements, it is not possible to invert [38,39]).
As a way to monitor the quality of the uncorrelated fit, we use χ 2 exp , the expectation value of χ 2 given normally distributed data with the measured covariance matrix [40].It can be estimated efficiently including autocorrelations [41], but the uncertainty on its determination can be large and it is not easy to exactly quantify what is a good fit.In general we consider fits with χ 2 2χ 2 exp as acceptable.Another way of monitoring the fit quality is to look at the contributions to the χ 2 coming from different correlation functions.In particular, an unusually large contribution from C Bs should be monitored as it can indicate that t min B2 was chosen too small (even though the overall χ 2 looks acceptable because of the many points in the three-point functions).
Apart from monitoring the χ 2 , we analyze the stability of the extracted fit values and errors with respect to changes of t min B2 and t min B3 .The stability plots are always organized in the following way: the data is divided in groups with different t min B2 , which is also plotted on the x-axis in the middle of each group.Inside every group t min B∆ varies from 0 to approx.0.5 fm.Additionally, we highlight in different colors the values for selected t min B2/3 , which are described in the following.In choosing t min B2 and t min B3 one must find a good window between too small values, where the fit is plagued by contamination from the higher excited states (this can be seen by high χ 2 as well as by lack of stability of the extracted fit parameters with respect to small changes in t min B2/3 ) and too high values, where it can no longer resolve the three B s states.The latter is demonstrated in Fig. 8.At large t min B2/3 we see that for both ensembles at least one of the energies is no longer resolved.Therefore, we refrain from showing the largest values of t min B2 in Figs. 9 -11.In general, the ground-state energies (cf.Fig. 9) and the ground-state form factors (cf.Fig. 10) are reasonably stable with respect to changes of t min B2/3 (within quickly growing errors).We also show representative examples for the excited-state form factors in Fig. 11.Their precision, especially of ϕ (2) µ , is much worse than of their ground-state counterparts.The values for the ground-state form factors extracted with both choices of t min B2 are collected in Table 3, together with the results from ratio methods described in the next 7 For the ensembles we analysed, typical values are:    The results are rather stable with respect to changes of t min B2/3 , although errors grow quickly.for ensemble A5 is the systematic error due to the wrapper effect.The last column shows the results of the linear fit to S I µ,3 (τ ), see the shaded band in Fig. (14). section.
5 Matrix elements at static order from ratios

Ordinary ratios
An alternative to extracting the ground-state matrix elements from fitting the correlation functions directly is to construct an appropriate ratio of the correlation functions such that the dependence on all or most of the other parameters cancels in the limit of large time.
One example of such a ratio was actually used to define the form factor, cf.Eq. (2.9).In the following it will be beneficial to generalize this definition to the case where t K = t Bs in the three-point function (τ ≡ t K + t Bs ): (5.1) In the limit t K , t Bs → ∞, this ratio converges to the desired bare form factor ϕ (0,0) µ if additional parameters ẼK (τ ), Ẽ(0) Bs (τ ) satisfy Ẽ(τ ) = E +O (exp(−∆Eτ )).Possible choices will be discussed below.
In Sec.5.2 we will see the advantage of this definition, while its obvious disadvantage is that in the denominator one needs the two-point correlation functions at time τ = t K +t Bs .This means that e.g. for a simple estimate of a plateau in the three point function at t K ≈ t Bs ≈ 1 fm one needs the two-point correlation functions at a large time separation τ ≈ 2 fm, which is particularly problematic in the case of C Bs where the signal-to-noise problem is much more severe than for C K .
For lattices with short time extent, like A5, τ may come close to the middle of the lattice and we need to take into account the influence of the second term in Eq. (3.5).We do that by multiplying the ratio by an additional factor 1 + e E K (T −4t) 1/2 which cancels the unwanted contribution up to excited states which are negligible in this region.
In addition, one can consider another ratio which has a more favourable signal-to-noise behaviour of the denominator but requires energy estimates even for the choice t K = t Bs .A similar ratio is where the additional normalization factors N K , N Bs r can be expressed in terms of the correlation function parameters κ (0) and β (0) r respectively.In practice, our data for R III shows very similar or slightly inferior behaviour compared to R II , therefore we do not present the numerical results for this ratio.
Let us now discuss the choice of the ground-state energy estimates.One can set them to the effective masses of the corresponding two-point functions, avoiding any fit procedure.This, however, results in large statistical fluctuations in the large-time region.For our data, it is beneficial to instead use the time-independent estimates Bs extracted in Sections 3.1 and 3.2 respectively.The plateaux in both cases were chosen in a conservative manner, and as a crosscheck we calculated the ratios for different choices of the plateau ranges of both energies.From that exercise we find that the systematic error associated with using the fitted ground-state energies is negligible with respect to our statistical uncertainties.
The resulting ratios R I and R II are presented in Figs. 12 and 13, respectively.In addition to the data we also plot the fit results from Sec. 4, including both the fitted values of ϕ (0,0) µ and the respective ratios of correlation functions reconstructed by inserting the fit parameters (ϕ

1)-(3.3).
There is a good agreement of the actual ratios with their fitted counterparts at large values of t, where excited states neglected in the fits are irrelevant.
On ensemble A5, due to the short time extent of the lattice, the wrapper criterion Eq. (3.11) becomes relevant.For µ = 1 it restricts the available times to approx.1.1 fm, cf.Fig. 7.At larger times we observe that the ratio R II 1 starts to grow rapidly, while for R I 1 this is most likely masked by the large uncertainties.Furthermore, for µ = 0 the data is close to violating the wrapper criterion (see Fig. 7) at the maximal time separations used.
On both lattices, A5 and O7, we see that at large times, t 1 fm, R II is clearly superior to R I in terms of the signal to noise and has a comparable precision to the results of the combined fit.To use R II for the extraction of ϕ (0,0) µ , we need to select a suitable plateau: looking at the bands from the fit which give us an estimate of the excited-state contamination from the B s sector, we start the plateau at t min = 1.1 fm and fit until the loss of precision of the signal below 5% or until we hit the wrapper criterion.On A5 for µ = 1 there is no valid plateau, because the wrapper is hit before t min .In this case, we quote the first data point above 1.1 fm as the final result and add to it a systematic error associated with the wrapper contribution estimated by the fit.

Summed ratios
A way to get improved convergence to the ground state is to sum8 the ratio [42-44] R I and determine the matrix element from The asymptotic excited-state contaminations are then O(τ ∆e −τ ∆ ), where ∆ = min(E K − E Bs − E (0) Bs ), as opposed to O(e −τ ∆/2 ) in ordinary ratios [43,44]. 9The accelerated convergence of the summed ratios can be a decisive advantage in the case of ensembles with a limited extent in time such as A5: contaminations by wrappers are less relevant.Example results are shown in Fig. 14.We observe that the convergence is indeed improved.
In fact, especially for µ = 0 the plateaux seem to start very early.This early onset of the plateaux is discussed in Appendix C where we show that it is caused by an accidental cancellation of the excited-state contributions in the B s and K sector and therefore should be treated with caution.
Note, however, that the ratios reconstructed from the fits of Sec. 4 only include the influence of the excited states in the B s sector.As in the previous subsection, these fit bands can be used to estimate the start of the plateaux.We choose τ min = 1 fm and τ max by the 5% relative noise criterion. 10In this range we extract the form factor from the slope of the linear fit directly to S I (τ ).Note that on both lattices we are not affected by the wrapper criterion as opposed to ordinary ratios.
The results from the different methods are gathered in Table 3 and are in good agreement.One slight difference is seen in R II 0 on ensemble A5.However, in this case if we start the plateau one data point earlier (which, judging from Fig. 13, seems legitimate) the result goes down to 1.117 (15) which removes most of the difference.In general the results from the fits and the summed ratio have similar precision, in which case we prefer to quote the result from the summed ratio as the final results, as the latter method is simpler.

GEVP method
We have also implemented the GEVP method of Ref. [43], both the ratio one, eq.(2.16) of [43], and the summed one, eq.(3.7) of [43].In both cases we specialized to using the GEVP only in the B s channel.The results are very similar to the ones of the previous two sections (with the largest wave function), but with errors which are a little bit larger.The fact that we do not see a significant improvement might be due to our interpolating fields, which likely do not distinguish between single hadron and (excited) multi-hadron states.The GEVP is then not able to significantly reduce the multi-hadron contributions.We comment further on this issue in the conclusions.

Matrix elements at 1/m order
The HQET expansion of B-meson observables becomes a precision tool only when 1/m terms are included.We now discuss the determination of these crucial terms for the matrix elements.As in the static approximation we have the option to perform fits to the (two-point and three-point) correlation functions, or to consider ratios or summed ratios.Fits have been discussed and applied in Ref. [18] for the somewhat simpler case of the Bto-vacuum matrix element f Bs .As there, in our present case, all parameters β (n) s , E (n) get triplicated with static and 1/m pieces corresponding to kin and spin insertions.Furthermore pure exponentials turn into exponentials plus terms of the form E kin n t exp(−E stat n t).This proliferation of terms in addition to the 1/m corrections to the matrix elements themselves makes an analysis in terms of fits cumbersome and difficult.We therefore concentrate on the analysis of ratios, which we already have seen to be just as good as fits in the static approximation.We will also see that there are some simplifications in the 1/m expansion of ratios which make them rather accessible.
As a preparation for expanding the ratios, we start with the underlying two-and three-point functions.First note that we choose the arbitrary interpolating fields O bs,r , eq. (2.14), not to contain a 1/m piece.Therefore, the 1/m expansion of the two-point function reads Only the (dimensionful) HQET parameters of the action, m bare as well as ω k ∼ 1/m, appear.Here and everywhere below all O(1/m 2 ) terms are dropped without notice.Note also that m bare drops out in the expressions for matrix elements.
The energy E Bs = lim t→∞ −∂ t log(C Bs (t)) is 1/m-expanded as E Bs = m bare + E stat + ω kin E kin + ω spin E spin with E x appearing in the large time behaviour of [45] All E x refer to the ground state; the first excited state contribution leads to the term with Because one always first expands in 1/m and then takes a limit of large time, terms such as exp(−∆E kin t) do not appear.
In the numerical applications we will take ∂ t to be the forward derivative, a∂ t f (t) = f (t + a) − f (t).On integrating Eq. ( 6.3) , we get: (6.4) The integration constants A Bs,k r do depend on the kin and spin insertions as well as the smearing level used.
In complete analogy we have for the three-point functions (t = t Bs = t K ): Now we turn to the the ratios and insert the 1/m expansion of the quantities which enter their definition in (5.1) and (5.2).To understand the structure, consider the expansion , k ∈ {kin, spin} , (6.6) . (6.7) The sums over k run as indicated and for j the range is seen in Table 5.It follows that the desired 1/m corrections to the ground state matrix elements are given by the large time limits, .8) These equalities are what we alluded to before as simplification in the 1/m expansion.In fact, since the ground state matrix elements in the last expression are just given by the A terms, it should not come as a surprise that the same final formulae for ρ j µ and ρ k µ are obtained if one starts from ratios R II or R III .
The strategy to obtain the 1/m terms of the matrix elements is then to extract A Bs,k r and A Bs→K,k µ,r from fits to Eq. (6.4) and Eq.(6.5) and ρ j µ from Eq. (6.9).

Numerical results for the kin and spin insertions at 1/m order
As a first analysis we performed combined fits with parameters A Bs,k r , A Bs→K,k µ,r and E k to Eq. (6.4) and Eq.(6.5) in the time region where a linear behavior in t is observed.This was done for fixed insertion k ∈ {kin, spin} and fixed smearing level, typically the largest smearing.The resulting errors on the matrix elements were rather large because the data did not constrain the fitted energies E k so well.An improvement could be achieved by determining E k from the GEVP at order 1/m, exactly as described in Ref. [46] and then using that as a constraint in Eq. (6.4) and Eq.(6.5).The GEVP takes into account information from all smearing levels of the two point functions.In a little more detail, we expand the ground state GEVP eigenvalues in 1/m (see Ref. [46] for explicit formulae) as λ (0) (t, t 0 ) = λ stat (t, t 0 ) + ω kin λ kin (t, t 0 ) + ω spin λ spin (t, t 0 ) and then form We consider just t 0 ≥ t/2 as the asymptotic convergence is proven to be much better [46] under that condition.For the present case, it turns out that there is rather little dependence on t 0 in practice but errors of course grow as it is increased.We then form a weighted average of the first (up to) three values with t 0 at and above t/2.These averages are shown as E k eff (t) in Fig. 15 and 16 together with the effective masses of the best smearing level.The GEVP estimates behave significantly better than just the best smearing which we show for comparison.Our final numbers come from plateaux fits of the GEVP estimates starting at t = 0.5fm, and are shown as bands in Figs. 15 and  an agreement with a plateau is only seen starting at t = 0.8 fm and in fact we would like the plateaux to be more convincing in one or two cases.Nevertheless, taking weighted averages starting at t = 0.8 fm is reasonable and we collect their results combined to ρ k µ in Table 4.The bands in Figs. 17 and 18 show the chosen fit values.

Current insertions at 1/m order
The 1/m vector current contributions are j=1 ω 0,j V 0,j (x) , (6.11) Figure 19: Overview of the current insertions for ensembles A5 (left) and O7 (right).Fit bands are plateaux averages starting at 0.86 fm.For A5 the V 1,4 contribution is exactly equal to V 1,3 , while on the O7 they are not exactly identical but still indistinguishable on the plot, therefore V 1,4 was not plotted.
with the operators Γ µ,j detailed in Table 5.Note that with a momentum purely along the x-axis (A5 ensemble), only i = 1 contributes and ρ 3 1 = ρ 4 1 is exact.
Table 5: Overview of the 1/m vector current insertions, their tree-level matching coefficients, and the results extracted from the highest light-quark smearing, r = 3.We use symmetric covariant derivatives ∇ S i .
The results obtained for Eq.(6.7) are presented in Fig. 19.We see plateaux starting at roughly 0.8-0.9fm and average from 0.85 fm on.The precision is better than that of the kin and spin terms.A full quantitative error budget for the 1/m contributions to the form factors has to wait until the corresponding non-perturbative matching coefficients ω are available.

Discussion
Flavor changing transitions are important channels for learning about possible limitations of the standard model.Exclusive semi-leptonic decays of B or B s mesons are particularly clean theoretically.However, lattice computations of the relevant form factors are needed and they are non-trivial in practice.A major reason is the infamous signal-to-noise problem.At large Euclidean time separations the noise in Monte Carlo evaluation of the correlation functions is too large to determine the form factors (matrix elements) with interesting precision.
While the issue is not new, we have exposed the problem in a few graphs more clearly than often done, see Figs 4, 14.For the three-point functions and derived (summed) ratios, this was possible because we have evaluated correlation functions at all time separations.We are only considering the pseudoscalar sector, where the signal-to-noise problem is very mild for a relativistic formulation and at zero momentum.However, we need finite momentum (see Fig. 2 for the momentum dependence) and for reasons explained in the beginning of the paper we use HQET for the b-quark (see Fig. 4 for the difference of Kaon and static B s meson).In HQET, the signal-to-noise problem becomes worse as one decreases the lattice spacing.Nevertheless, we are determining the matrix elements at total time separations τ of the three-point functions of around 2 fm and for a ≈ 0.05 fm.Only for the summed ratio we use about half that time-separation (Fig. 14), in agreement with the predicted better suppression of excited states after summation.
Despite our use of HQET, these separations are larger or similar to the ones typically used.E.g. most recently Ref. [9] used a fixed τ ≈ 2.2 fm.
We have presented good evidence that the chosen plateaux or fit-windows are reasonably safe, but nevertheless it would be better to have larger times accessible.Maybe multilevel strategies [47,48] will help to reach those separations in the future.At present we derive our confidence from the good agreement of • fits with t Bs > ∼ 0.3 − 0.6 and 3 states in the B s sector (and also with 2 B s states and t Bs > ∼ 0.5 − 0.7, cf.Appendix B) • ratios Eq. (5.2) with τ = t Bs + t K > ∼ 2 fm, • and summed ratios Eq. (5.4) at total separation τ > ∼ 1 fm .
Note that also the precision of the different estimates is quite comparable.It is therefore preferable to use the technically easier ratio methods.At the lowest (static) order in 1/m the most relevant form factors for µ = 1 have an accuracy around 2%, which is good for precision physics.The first order corrections in 1/m are actually more precise than that.The relative errors induced into the form factors are given by the absolute ones of the quantities ρ in Tables 4,5 multiplied with the appropriate ω-coefficients.From the errors of ρ kin and ρ spin , using the non-perturbative ω kin , ω spin [13], we get error contributions of 1% and 0.5% to the form factors, respectively.The uncertainties of ρ j , translate into below 0.5% errors assuming coefficients ω j which do not exceed the tree-level values by more than a factor of two.Of course, the uncertainties in the coefficients ω are to be added separately.
A positive result of our detailed analysis is thus that 1/m corrections can be determined precisely, when the coefficients (HQET parameters) are known with reasonable    results, cf.Fig. 22.The case of ϕ (0) 0 for O7 is by far the worst, but most look more like the stable and consistent ϕ (0) 1 .When choosing a final form factor value from the fit, one clearly has to keep t min B2/3 slightly higher than for the N Bs = 3 fit.We use t min B2 ≈ 0.52 fm and the same t min B∆ as before.The values summarized in Table 6 are consistent within errors with other methods and give similar precision.id µ Fit value A5 0 1.099(12) A5 1 0.580(9) O7 0 1.100(13) O7 1 0.606 (9) Table 6: Results for the ground-state form factors ϕ (0) µ using the fit with N Bs = 2.

C Note on the convergence of the static summed ratio
In Section 5.2, devoted to static summed ratios, we observe that the plateaux for µ = 0 in Fig. 14 seem to start very early.Taking the results for O7 at face value, one could start the plateau fit as early as 0.4 fm.To see whether this is a genuine ground state dominance we investigate truncated sums of the form: By putting t min Bs or t min K larger than 0 we can suppress the excited states in the B s and K sector respectively. 12The results of this procedure are shown in Fig. 23.We see that the excited states in the K sector push the result upwards, while the excited states in the B s sector push the results downwards.Their cancellation results in a flat "fake" plateau which can start at very early times.A similar but much less pronounced effect is at work for µ = 1.One should therefore not rely solely on the flatness of the plots but also devise an independent criterion for the beginning of the plateaux, as was done in Section 5.2.The summed ratio for ensemble O7, µ = 0 with three versions: with symmetric sum limits and with either only t min Bs or t min K larger than 0. Note that due to different (worse) convergence properties, these plots cannot be directly compared with the ones in Fig. 14.
of Baden-Württemberg (MWK), Bayern (StMWFK) and Nordrhein-Westfalen (MIWF).We acknowledge PRACE for awarding us access to resource JUQUEEN in Germany at Jülich and to resource SuperMUC in Germany at München.We also thank the LRZ for a CPU time grant on SuperMUC, project pr85ju, and DESY for access to the PAX cluster in Zeuthen.

2. 1 Figure 1 :
Figure 1: The correlation function used to extract the B s → K decay matrix element.

Figure 3 :
Figure 3: Kaon energies at three values of momentum for ensemble A5.The lattice parameters are indicated in the caption of Fig 2. The dashed curve shows the continuum dispersion relation evaluated with the rest mass from Ref. [19] (it is not a fit).

Figure 4 :
Figure 4: Dependence of the K and B s ground-state energies on t for ensemble A5 (left) and O7 (right).E (0) Bs,eff is shifted vertically for presentation.Dashed (and full) horizontal lines show the plateau fit values (and ranges).

Figure 5 :
Figure 5: Contributions to the three-point function corresponding to the B s → K decay (left), and a wrap-around Kaon (right).Figures taken from Ref. [32].

i | 2 ,
while in the three-point functions they are only proportional to β (n)

Figure 8 :
Figure 8: Stability plots for all three B s energies on ensemble A5 (left) and O7 (right).The selected t min B2/3 values are highlighted with a black color, while the dashed lines are values from the GEVP, cf.Sec.3.2.1.The behaviour at large t min B2 and t min B3 can be traced to the fact that the fit can no longer resolve three separate states.The selected fit values are discussed in Sec.4.1.A zoom into the behaviour of E (0) Bs is shown in Fig. 9.

Figure 10 :
Figure 10: Stability plots of the ground-state form factors for ensemble A5 (left) and O7 (right).The results are rather stable with respect to changes of t min B2/3 , although errors grow quickly.

Figure 12 :Figure 13 :
Figure12: Overview of the ratios R I with t K = t Bs for ensembles A5 (left) and O7 (right).Only the highest smearing, r = 3, is presented.For comparison we also show the results from the combined fit in Sec.4: the dashed horizontal lines are the fitted values of ϕ (0,0) µ and the curves (1σ bands) are the respective ratios of correlation functions reconstructed from the fitted parameters.

Figure 14 :
Figure 14: Overview of the M I for ensemble A5 (left) and O7 (right), together with the fit curves analogous as in Figs. 12 and 13.The shaded bands represents c 2 of fits S I µ,r (τ ) = c 1 + c 2 τ to the sum in the indicated range of τ , chosen such that wrapper effects are negligible.

Figure 15 :
Figure 15: a 2 E kin for ensembles A5 (left) and O7 (right).The highest smearing and the GEVP result are shown.The band shows the GEVP plateaux average.

Figure 16 :
Figure 16: a 2 E spin for ensembles A5 (left) and O7 (right).The highest smearing and the GEVP result are shown.The band shows the GEVP plateaux average.

Figure 20 :
Figure 20: Stability plots for the B s energies on ensemble A5 (left) and O7 (right).The selected t min B2/3 values are highlighted with a black color, while the dashed lines are values from the GEVP, cf.Sec.3.2.1.A zoom into the behaviour of E (0) Bs is shown in Fig. 21.

Figure 22 :
Figure 22: Stability plots of the ground-state form factors for ensemble A5 (left) and O7 (right).

Figure 23 :
Figure 23:The summed ratio for ensemble O7, µ = 0 with three versions: with symmetric sum limits and with either only t min Bs or t min K larger than 0. Note that due to different (worse) convergence properties, these plots cannot be directly compared with the ones in Fig.14.

Table 2 :
Bs = 8a ≈ 0.6 fm for ensemble A5, together with the fitted function.The start of the finite-T fit and t max,wr (t K , t Bs )/(C Bs→K 1,3 (t K + a, t Bs )) with t K3 are also shown.µ = 1 (with lower t max,wr K3 min B3 ≤ t Bs and t min K3 (t Bs ) ≤ t K ≤ t max K3 (t Bs ).The function t max K3 (t Bs ) is chosen, for each µ and smearing level separately 5 , as Bs ) is given by the wrapper criterion of (3.11) and t max,noise Bs ) is determined from a relative noise criterion, i.e. as the largest t K which fulfills K3(t µ

Table 3 :
Results for the ground-state form factors ϕ