How perturbative are heavy sea quarks?

Effects of heavy sea quarks on the low energy physics are described by an effective theory where the expansion parameter is the inverse quark mass, 1/$M$. At leading order in 1/$M$ (and neglecting light quark masses) the dependence of any low energy quantity on $M$ is given in terms of the ratio of $\Lambda$ parameters of the effective and the fundamental theory. We define a function describing the scaling with the mass $M$. We find that its perturbative expansion is very reliable for the bottom quark and also seems to work very well at the charm quark mass. The same is then true for the ratios of $\Lambda^{(4)}/\Lambda^{(5)}$ and $\Lambda^{(3)}/\Lambda^{(4)}$, which play a major r\^ole in connecting lattice determinations of $\alpha^{(3)}_{MSbar}$ from the three-flavor theory with $\alpha^{(5)}_{MSbar}(M_Z)$. Also the charm quark content of the nucleon, relevant for dark matter searches, can be computed accurately from perturbation theory. We investigate a very closely related model, namely QCD with $N_f=2$ heavy quarks. Our non-perturbative information is derived from simulations on the lattice, with masses up to the charm quark mass and lattice spacings down to about 0.023 fm followed by a continuum extrapolation. The non-perturbative mass dependence agrees within rather small errors with the perturbative prediction at masses around the charm quark mass. Surprisingly, from studying solely the massive theory we can make a prediction for the ratio $Q^{1/\sqrt{t_0}}_{0,2}=[\Lambda \sqrt{t_0(0)}]_{N_f=2}/[\Lambda \sqrt{t_0}]_{N_f=0}$, which refers to the chiral limit in $N_f=2$. Here $t_0$ is the Gradient Flow scale of [1]. The uncertainty for $Q$ is estimated to be 2.5%. For the phenomenologically interesting $\Lambda^{(3)}/\Lambda^{(4)}$, we conclude that perturbation theory introduces errors which are at most at the 1.5% level, far smaller than other current uncertainties.


Introduction
At present most simulations of lattice Quantum Chromodynamics (QCD) include two light (up and down) quarks and a strange quark.It is important to investigate the effects of the charm quark, whose mass M is about 12 times larger than that of the strange quark.Effective field theory [2] arguments predict that the effects of a heavy quark are described by the theory without the heavy quark with leading order power corrections of size O(1/M 2 ).At lowest order in 1/M only the light quark masses and the coupling need to be adjusted to match the two theories (with and without the heavy quarks).For the coupling this issue has been discussed in perturbation theory in [3].The matching of the coupling in the case of the decoupling of one heavy quark is known to four loops in perturbation theory [4,5].Equivalently to match the couplings at a given renormalization scale one can formulate a relation between the renormalization group invariant Λ parameters of the effective and the fundamental theory.In this article we present a study of the perturbative behaviour of the ratio of Λ parameters computed up to four loops in the matching of the couplings, which requires the knowledge of the five loop β function, which had been computed in Refs.[6][7][8][9][10].
Besides studying the behaviour of perturbation theory itself it is desirable to compare to non-perturbative data.This is especially the case for the charm quark given that matching is performed at a fairly low scale ≈ 1.3 GeV in this case.It is very difficult to compare directly 2+1 flavor and 2+1+1 flavor lattice simulations, because various systematic uncertainties mask the physical effect.We proposed instead to simulate a model, namely QCD with two heavy, mass-degenerate quarks [11].The effective theory is the Yang-Mills theory up to 1/M 2 corrections.The mass dependence of ratios of hadronic scales such as t 0 (0)/ t 0 (M ), where t 0 is the Gradient flow scale [1] factorizes [11] at leading order in a non-perturbative and mass-independent factor, and a factor P , which is the ratio of the Λ parameters and depends on the heavy quark mass through the matching.Since the latter can be evaluated in perturbation theory we can compare the perturbative mass dependence of hadronic scales to the non-perturbative results from the simulations.We define a mass-scaling function which is the logarithmic derivative of P with respect to the logarithm of the mass.It can be determined directly from the simulations and compared to its perturbative expansion.
This article is organized as follows.In section 2 we describe the effective theory of decoupling.Section 3 contains a review of the matching of the effective and fundamental theory at leading order.We present a perturbative study of the ratio P of the Λ parameters, which results from the matching of the theories at leading order, and of the mass-scaling function.In section 4 we explain our non-perturbative study of decoupling in a theory with N f = 2 mass-degenerate heavy fermions with masses ranging up to (slightly above) the charm quark mass.We introduce the hadronic scales which we calculate in Monte Carlo simulations of lattice QCD and give details of the lattice simulations.The comparison of the non-perturbative mass dependence of hadronic scales computed from the simulations with perturbation theory is presented in section 5.The implications of these results for the applicability of perturbation theory at the scale of the charm quark mass are discussed in section 6.We summarize our results in section 7.In the appendix A we reproduce the explicit formulae for the matching of the couplings up to four loops and the perturbative coefficients of the mass-scaling function.The asymptotic behavior for large masses of P is derived in appendix B. Finally appendix C contains tables listing the simulations parameters.

The effective theory: decQCD
The effective theory associated with the decoupling of heavy quarks is formally obtained by integrating out the heavy quark fields.The resulting effective theory contains a tower of non-renormalizable interactions, which however are suppressed at low energies by negative powers of the heavy quark masses [2].The (infinite number of) couplings of the effective theory can be matched order by order and used to describe the effect of heavy quarks at low energies.
To be precise, let us consider QCD N f with N f quarks in total, of which N are light and N f − N are heavy.For simplicity we assume the light and the heavy quarks to be mass degenerate with the heavy mass given by M .Non-degenerate quark masses are conceptually similar, see note at the end of this section.In general the Lagrangian of the effective theory is where the leading order equals QCD N with N quarks and the corrections L k , k ≥ 1 consist of linear combinations of local operators of dimension 4 + k.These operators are composed of only the light quark and gauge fields, and include possible light mass factors.
They have to satisfy the symmetries of QCD N f , most prominently gauge, Euclidean (or Lorentz) and chiral invariance.For the cases of interest, operators of dimension five are excluded and corrections to the leading order start at O(M −2 ) Here we write L 2 explicitly as a linear combination of local dimension six operators Φ i , multiplied by dimensionless couplings ω i .The simplest situation in which (2.2) holds is N = 0, i.e., when light quarks are absent: the leading order is Yang-Mills theory and there is no gauge invariant dimension five operator made up of gauge fields alone.Thus at leading order only the gauge coupling has to be matched.We are basing our non-pertubative investigations in sections 4-5 on this setting.
In the presence of N ≥ 2 mass-less quarks the non-singlet, non-anomalous chiral symmetry in the light quark sector forbids any dimension five operator.The gauge coupling is still the only coupling to be matched at leading order.Note that the dynamical (nonperturbative) breaking of chiral symmetry plays no role here as we may consider full and effective theory in a finite (but large) volume where dynamical symmetry breaking is absent, in full analogy with the elegant derivation of automatic O(a) improvement of twisted mass QCD in [12].More explicitly consider a chirally non-invariant observable in the full theory in finite volume.It vanishes, while a priori in the effective theory at dimension five the Pauli term ω Pauli ψiσ µν F µν ψ/M contributes as the only dimension five gauge invariant operator.Matching of full and effective theory requires ω Pauli = 0.
In section 3 we consider the leading order in 1/M in perturbation theory for various values of N , N f .
For finite light quark masses there are dimension five operators, which are formed of the operators in L QCD N multiplied by the light quark masses.Their effect can be absorbed in a redefinition of the gauge coupling and light quark masses at the order m l /M .The Pauli term multiplied by the light quark masses contributes at dimension six.It is one of the Φ i in eq.(2.2).Besides the gauge coupling now also the light quark mass needs to be matched.
All in all, finite light quark masses do not change the structure of eq.(2.2).Of course couplings in the effective Lagrangian now also depend on the light quark mass.The only restriction is that when light quarks are present, we need at least a doublet, such that there is a non-anomalous chiral symmetry of the mass-less theory and we can conclude ω Pauli = 0 as sketched above.
In the following we concentrate on N ≥ 2 mass-less or N = 0 quarks.
3 Mass-dependence in the leading order effective theory At leading order, the only parameter of the effective theory, QCD N , is its running coupling and the theory predicts all observables when the coupling is prescribed at a given renormalization scale in a given renormalization scheme.It is conceptually cleaner, but completely equivalent in terms of the physical content to specify the renormalization group invariant (RGI) Λ-parameter.The scale dependence of the input is then gone and the scheme-dependence is easily computable: the one-loop relation of couplings yields the exact relation of the associated Λ-parameters.
Explicitly the Λ-parameter of QCD with N f quarks, is defined as the integration constant of the solution to the renormalization group equation (RGE) for the renormalised coupling ḡ at renormalisation scale µ with the QCD β-function We shall also make use of the RGI mass which appears as an integration constant in the solution of the RGE for the renormalised mass at scale µ.Amongst different mass definitions, the RGI mass is distinguished by scale and scheme independence and represents our choice to discuss mass-dependences.The above holds in any mass-independent renormalisation scheme.
In the following subsection we discuss how the relation of the Λ-parameters of fundamental and effective theory determine the (heavy-) mass dependence of low energy observables and then turn to the available perturbative information.This serves to prepare for our subsequent non-perturbative investigation.

Non-perturbative matching and mass-dependence
The leading order (in 1/M ) effective theory describes the fundamental one at low energy when Λ has the proper value.In other words, it has to be chosen as a function of M and Λ f .To make that precise, we specify the Λ-parameters in units of an arbitrary (but low energy) mass scale S. One may think of a hadron mass or low energy scales such as r −1 0 , t 13,14].The relation between the Λ-parameters of fundamental and effective theory may then be written as Since ratios of low energy scales are the same in the leading order effective theory and in the fundamental theory, 1 we may also omit the units and write remembering that (non-perturbatively) P ,f (M/Λ f ) has an O((Λ f /M ) 2 ) fuzziness and that the Λ's have to be measured in units of the same low energy scale in the two theories.One may also read eq.(3.9) in this way: once the intrinsic non-perturbative scale of the fundamental theory is specified the equation determines the one of the effective theory through the factor P ,f (M/Λ f ).Note that by definition the Λ-parameter of the fundamental theory does not depend on M , but the value of the dimensionful Λ-parameter in the effective theory Λ does depend on it through eq.(3.9).Multiplication of eq.(3.7) with S f (0)/Λ f yields the interesting equation defined entirely through the two mass-less theories.The ratio S f (M ) S f (0) can be computed in the fundamental theory and eq.(3.10) is a consequence of decoupling which can be tested.
We call eq.(3.10) factorisation formula because it separates the mass dependence into a "perturbative" (see section 3.2) factor P ,f and a non-perturbative factor Q S ,f respectively.In the same loose sense as usually used in factorisation formulae, the long-distance physics is in Q while the short-distance one is in P .The scale for long/short is given by 1/M .We have "perturbative" in quotes, because the meaning is not that perturbation theory gives the complete answer but that it yields an asymptotic expansion.
To simplify the notation, we will from now on omit the subscripts , f when referring to the quantities Q, P .
In phenomenology, eq.(3.10) does not seem interesting since one is usually not interested in the, e.g., proton mass at vanishing charm-or bottom-quark mass.However, in non-perturbative studies of QCD for different flavours the ratio Q is a natural quantity to determine, and is known to some degree, see below.Testing eq.(3.10) is thus a natural question.Furthermore, taking a logarithmic derivative of the nucleon mass w.r.t. the mass M yields the charm content in the nucleon, see Sect.6.3.
Indeed we will study the mass-scaling function (P (x) = d dx P (x)) which can be computed in perturbation theory when M is sufficiently large, cf.[15].We can estimate η M from the mass dependence of hadronic quantities by taking the logarithmic derivative in eq.(3.10) with respect to the mass where S f (0) and Q drop out.Their uncertainties play no role and we will therefore be able to make a more stringent comparison between perturbation theory and the full theory.Of course the Λ 2 /M 2 dependence of S in eq.(3.10) is inherited by η M .

Perturbation theory
We consider a mass-independent renormalization scheme; whenever we insert perturbative coefficients, it will be in the MS-scheme.To simplify notation we use ḡ(µ/Λ) ≡ g f (µ/Λ f ).

Matching of couplings
In general form, the relation between the couplings ḡ(µ/Λ) of the fundamental theory and g (µ/Λ ) of the leading order effective theory reads In principle the function F depends on which low energy observable is matched as discussed in the previous section for P S ,f .However, that dependence is only through powers of µ match /M , where µ match is the typical energy scale of the matched observable.In perturbation theory (µ match /M ) n terms can uniquely be separated from the logarithmic ḡ2 terms.Dropping the power corrections as appropriate for the leading order theory, the coupling relation (i.e. the function F ) is thus universal, i.e. independent of the matching condition.
Choosing the particular scale µ = m * [2,3] in eq.(3.14), the first order perturbative correction vanishes in the MS scheme and we have [4,16] The scale m * is defined such that the running MS quark mass fulfills m(m * ) = m * .The two loop coefficient is then given by c 2 = (N f − N ) 11 72 (4π 2 ) −2 .The coefficients c 3 and c 4 are known for N f − N = 1, 2 and N f − N = 1, respectively.They are listed in Appendix A. One should remember that through eq.(3.4), m * and M are in one-to-one relation.

Mass scaling function η M
In order to find the perturbative expansion of η M , eq. (3.12), we start from the related function (considering P (M/Λ) = P (M (m * , Λ)/Λ)) which appears upon taking a derivative with respect to the logarithm of m * on both sides of eq.(3.15).The left hand side yields where we used the matching condition g (m * /Λ ) = g (m * /(P Λ)).Combined with the straightforward derivative of the right hand side we can solve for η m and obtain where we used eq.(3.15) to replace g = g * C(g * ).Finally, with M m * ∂m * ∂M = (1 − τ f (g * )) −1 , (see eg. [17], section 3.3.2) we derive The first terms in the perturbative expression are given by is obtained from (3.20) and the coefficients are given by the recursion For example η M 1 = η 1 − d 0 η 0 , where d 0 is the universal coefficient of the QCD anomalous dimension (3.6).The higher order coefficients η i , up to i = 4, are collected in appendix A.
We note that for fixed N the first two coefficients are exactly proportional to At higher orders this is only true up to small corrections.The dependence on N at fixed N f − N is weak and amounts to a difference of about 20% at leading order between N = 0 and N = 3.In table 1 we list numerical values for interesting combinations of N f and N .Integrating eq.(3.12) now gives an asymptotic expression for the mass dependence of non-perturbative low energy scales S f from perturbation theory (L M = log(M/Λ)) where the constant k is fixed by the conventions for the Λ-parameter and the RGI mass M , which we specified at the beginning of the section, to: See Appendix B for the derivation of eq.(3.26).We note that the leading correction in the expansion eq.(3.26) is log(L M )/L M .It contains a term g 2 * log(g 2 * ), cf.eq.(B.5), which makes the convergence of the expansion slow.Therefore for the numerical evaluation of P we prefer to use the formula eq.(3.28) which has corrections only in powers of g 2 * (no logarithms), see the details in section 3.3.Accidentally, for the interesting cases, the asymptotic expression eq.(3.26) for P is dominated by exp(η 0 L M ) = (M/Λ) η 0 .This can be seen by the numerical smallness of η M 1 /2b 0 (N f ) and log(k) in table 1.

Accuracy of perturbation theory
A consistency check on the applicability of perturbation theory is the comparison of different orders.Indeed, figures 1-3 show that higher orders do not contribute very much, in particular when one uses the mass dependence in terms of the RGI mass, η M .This also suggests that it is an advantage to consider the perturbative prediction for P in terms of M/Λ instead of working with m * /Λ.We have worked with M/Λ in [11] and will do so below in our comparison to a non-perturbative investigation.Details for η m and η M are seen in figure 1-3.In the legends of the plots the number of loops corresponds to the highest loop order of the β function which is used.We note that in the right plot of figure 2 the 5-loop correction is larger in magnitude than the 4-loop correction for g 2 3.But the corrections are amazingly small.
The number of loops corresponds to the highest loop order of the β function which is used.The functions η m (g 2 ), η M (g 2 ) for the case N f = 4, N = 3.The number of loops corresponds to the highest loop order of the β function which is used.
Renormalization group improved perturbative predictions for the function P (M/Λ) = Λ /Λ f can be obtained from (cf. eq.(3.1))where The coupling where M is the RGI mass corresponding to m * .For this equation we have combined eqs.The predictions for different orders of perturbation theory are very close to the unsystematic one-loop "approximation", P (1) = (M/Λ) η 0 , as long as M/Λ < 30 or so and the number of flavors is small.This is accidental.In figure 5-7 we plot the one-loop "approximation" and the 4-loop result on the left and the relative correction (P − P (1) )/P (1) .
(3.32) at 2,3,4-loop on the right.When it is available we also add the 5-loop result.In this comparison, when we consider at least 2-loop precision, we always work to a consistent order in the renormalization group functions.Note that we truncate the renormalization group functions β, τ in the integrals eq.(3.30), eq.(3.31) and 2-loop accuracy means, e.g.,  The function C(g) only enters at 3-loop precision since c 1 = 0.It is only needed for the upper integration limit in eq.(3.28) and there we compute explicitely C(g * ) = C(g * ).
In the numerical results we observe in particular that for the phenomenologically relevant case of N f = 5, N = 4, the 3-loop contribution (difference 3-loop to 2-loop) is around 2% while the 4-and 5-loop ones are then nice and small, see the right plot in figure 7. Judging by perturbation theory alone, the perturbative predicition for decoupling the b-quark should be very reliable.Also for the other phenomenologically relevant case of decoupling the c-quark (N f = 4, N = 3) perturbation theory appears to work quite well.
These curves suggest that perturbative decoupling introduces only errors at the subpercent level for the ratios of Lambda parameters, once perturbation theory applies at all.In table 2 we list the values of P computed from eq. (3.28) using different orders of perturbation theory.We evaluate P at an argument M/Λ which depends on N f and N .For N f = 2, N = 0 we obtain M/Λ from the PDG value for m c [18] and Λ 2 = 310 MeV from [19].In this case there is no 5-loop result because the coefficient c 4 is not known.4 Non-perturbative investigation for We investigate a model, namely QCD with N f = 2 heavy, mass-degenerate quarks.The decoupling is then 2 → 0 and the Lagrangian of the effective theory, L dec , is the Yang-Mills one up to 1/M 2 corrections.We target the RGI quark mass values (see below)  mass M c from [20] in agreement with [18] within the present uncertainties.However, for our model study the exact value is not important.

Low energy observables
In principle any low-energy hadronic scale S(M ) can be used to study decoupling, but in practice some choices are far superior to others.Ideally we look for a quantity that is easily non-perturbatively renormalizable, well defined in both full and effective theory, has controllable lattice artifacts, is cheap to compute and can be determined with a high precision.Since in our case the effective theory has no fermionic content, we are restricted to purely gluonic observables.Glueball masses would be natural candidates.However, it is difficult to determine them precisely enough.Hadronic scales derived from the static quark potential fulfill all criteria and have been popular for many years.If F (r) denotes the force between two static quarks (defined in terms of the fundamental Wilson loop), a distance r x can be defined implicitly [13] by choosing a number c and solving The choices r 0 ⇔ c = 1.65 [13] and r 1 ⇔ c = 1.0 [21] have become standards.In a lattice calculation the latter has a better statistical precision, but larger lattice artifacts.Moreover we expect decoupling to be more precise for the longer distance, r 0 .In recent years, these scales have been largely replaced by scales based on the gradient flow [1,22].The gauge field A µ is used as an initial condition in a flow equation, that describes the relaxation of a field B µ as a function of a flow time t.
The field strength tensor G νµ and the covariant derivative D ν are defined in the usual way, but at flow time t.The crucial observation, that correlators of the B µ fields at finite flow time are renormalized quantities [23], allowed to introduce a family of scales.The definition of scales √ t c and w 0 [14] is based on the dimensionless combination together with In our simulations we compute the hadronic scales The rest of this section contains technical details about the lattice simulations.It can be omitted if one is only interested in the physical results presented in section 5.
4.2 Fixing the RGI parameters of the theory and details of the simulations

Discretization
We use Wilson's plaquette gauge action [24] and include quarks treated with two discretizations: O(a) improved Wilson fermions [25,26] and twisted mass [27] Wilson fermions at maximal twist.For both actions the clover term [25,26] has the non-perturbatively determined improvement coefficient c sw [28].Twisted mass fermions at maximal twist are automatically O(a) improved [29] also without a clover term.However, with the clover term added our two discretizations have a common chiral limit in a finite volume (see L 1 below).Furthermore the clover term reduces O(a 2 ) lattice artifacts as it was shown for example in [30].
In appendix C we list the ensembles generated with standard Wilson fermions in table 4 and with twisted mass Wilson fermions in table 5.The twisted mass simulations are the same as in [31].
We determine the lattice spacings through the scale L 1 [19,32], which is defined by ḡ2 SF (L 1 ) = 4.484 through the so-called Schrödinger Functional coupling at zero quark mass and in a finite volume of size L 4 1 .Note that in this situation the two discretizations are identical.Thus at a given gauge coupling β = 6/g 2 0 they have one and the same lattice spacing.The values of L 1 /a and the corresponding lattice spacings are listed in table 6.

O(a) improvement and finite size effects
O(a) improvement of quark mass effects requires to keep the improved bare coupling g2 0 = (1 + b g (N f ) am q ) g 2 0 fixed, where m q = 1/(2κ) − 1/(2κ c ) is the bare subtracted standard mass.Twisted mass fermions at maximal twist have m q = 0 and therefore the improved coupling is g0 = g 0 .Instead our simulations with standard Wilson fermions were done at fixed g 0 (and not g0 ).We correct for the resulting O(am) effects in the lattice spacing by decreasing the values of aS(M ) using the 1-loop result b g (N f ) = 0.01200 N f g 2 0 [33, 34] and the 1-loop β-function.For S = 1/ t 0 (M ) these effects shift the value of t 0 (M )/a according to We use am q = am/(Zr m ) and the factor Zr m is taken from [19] (at 6/g2 0 = 5.7 we get Zr m = 1.194 from a Padé fit).Here am denotes the PCAC mass.We added in quadrature 100% of the correction to the errors as an estimate of unknown O(g4 0 ) terms in b g .After the corrections the values of aS(M ) correspond to simulations performed at β = 6/g 2 0 .Our volumes are such that the lightest pseudo-scalar mass times the box size is m PS L ≥ 7.4 and L/ t 0 (M ) ≥ 12 and L/r 0 (M ) ≥ 3.8.At our largest masses the situation is comparable to the pure gauge theory, where significant finite volume effects can be excluded for a lattice size L ≈ 4r 0 = 2.0fm.Approximate decoupling of the heavy quarks means that also our finite mass simulations are practically free of finite volume effects.

Quark masses
Before taking the continuum limit, we non-perturbatively fix the value of the RGI quark mass M in units of the Λ parameter through the following steps.We take the Λ parameter to be defined in the MS scheme while the RGI mass M is independent of the scheme.
In the case of standard Wilson fermions the renormalized quark mass in lattice units am SF (L 1 ) at length scale L 1 is defined by am SF (L 1 ) = Z A /Z P (L 1 ) am, where the renormalisation factor Z P (L 1 ) is defined in the Schrödinger Functional scheme as in [19] and also the details of the definition of m are found there.The axial current renormalization factor, Z A , is fixed by a chiral Ward identity [35] 2 .For the determination of the PCAC mass am we use our publicly available program 3 .The ratio M/Λ is then obtained from where we take M/m SF (L 1 ) = 1.308( 16) from [19,37] and Λ L 1 = 0.649 (45) from [38].The values of the PCAC mass m and of M/Λ are tabulated in table 4. The accuracy of M/Λ is around 7% with an error dominated by the one of Λ L 1 .Thus, ratios of masses M 1 /M 2 or equivalently logarithmic derivatives with respect to masses are known significantly more precisely.
In the case of twisted mass fermions at maximal twist the difference is that the renormalized quark mass am SF (L 1 ) is calculated through am SF (L 1 ) = aµ/Z P (L 1 ), where aµ is the twisted mass parameter.The ratio M/Λ is again obtained from eq. (4.10).For twisted mass fermions we actually invert eq.(4.10) to determine the twisted mass parameter corresponding to given values of M/Λ which are tabulated in table 5.

Hadronic scales on the lattice
In our simulations we measure the observables discussed in section 4.1.Various details concerning their computation in the discretized theory are as follows.
The clover (symmetric) definition of the action density E is used in eq. ( 4.4) and we use the Wilson-flow equation, cf.[1].
The scale r 0 is defined with the "HYP2" action for the static quarks [39].It is determined with our publicly available program 4 following the details explained in Ref. [40].We use a variational basis with up to four levels of spatial HYP smearing [41] to construct a matrix of Wilson loops.Due to the open boundary conditions, Wilson loops are averaged only in a temporal region sufficiently far away from the boundaries to exclude contaminations from boundary effects.The static potential as a function of r is obtained by solving the generalised eigenvalue problem as discussed in Ref. [40].
Hadronic scales such as t 0 are non-linear functions of one or more Monte-Carlo averages of "primary observables" O 1 , . . ., O N ob , like for instance the action densities at different flow times.The derivative of such a function with respect to the twisted mass, as needed for the MC evaluation of η M (below in eq.(5.6)), is in general given by and the derivative of a primary observable O, For most observables ∂f ∂µ and dO dµ are absent.The derivative of the action is given by dS/dµ = ia 4 x ψ(x)γ 5 τ 3 ψ(x).In cases like ours, where the observables do not contain fermionic fields, no new Wick contractions arise in the first term, and one simply needs to determine the observable and the action-derivative on each configuration and compute their connected correlation.For the action-derivative we write (cf.[42]) where in the last step a property of the twisted mass Dirac operators D u,d (for up and down quark), D u − D d = 2iγ 5 µ, was exploited, leading to an expression that has a smaller variance, when the trace is estimated stochastically.A stochastic estimation is necessary to avoid a full matrix inversion, and amounts to solving equations D u ξ = η, with 4D noise spinors η, for ξ and a subsequent dot product ξ •ξ.We find that different noise distributions (e.g.normal or U (1)-noise) yield a similar variance, and further refinements like spin or color dilution [43] do not pay off.Not many noise-sources are needed for the final error to be close to the limiting error due to gauge field fluctuations.In our measurements we settle for 64 U (1) noise spinors per configuration.

Simulation algorithms
In the case of standard Wilson fermions, part of the simulations are performed using periodic boundary conditions (except for anti-periodic boundary conditions in temporal direction for the fermions) and the MP-HMC algorithm [44].In order to avoid the freezing of the topological charge (see also next section), for simulations with t 0 /a 2 > 5.5 [45,46] we adopt open boundary conditions in time and use the publicly available openQCD package5 [47].We set the boundary improvement coefficients to their tree-level values c G = 1 and c F = 1.In both cases the fermion determinant is Hasenbusch-factorized [48] using a splitting in two factors, thus two pseudo-fermion fields are needed and a hierarchical numerical integrator is employed (Leapfrog and Omelyan-Mryglod-Folk integrator schemes are used at the different levels).The trajectory length is always set to 2.0 and configurations and measurements are separated by at least four trajectories.Most computer resources are spent in the solution of the Dirac equation with the smallest mass.For M/Λ > 1 we use the SAP preconditioned GCR algorithm [49] while for M/Λ < 1 it is profitable to use a multigrid solver [50], which is implemented as the two-grid "locally deflated" solver in the openQCD package since version 1.2.The cost of the simulations is low compared to simulations in the chiral regime.
In the case of twisted mass fermions we use a version of openQCD, in which the SAP preconditioner can have a different value of µ than the simulated one.In the preconditioner the twisted mass term is defined only on the even sites.We achieve a significant speed up of the SAP preconditioned GCR algorithm by choosing a value of µ for the SAP preconditioner which is larger by approximately a factor 6 than the simulated one (the multi-grid inverter of [51] implements a similar strategy inspired by our findings).
Open boundary conditions are used as specified above.In this setup the Wilson-Dirac operator has two mass parameters, the standard bare quark mass m 0 and the twisted mass µ.Maximal twist means that m 0 is set to its critical value m c which corresponds to the vanishing of the current (PCAC) quark mass.We extracted the critical mass from table 13 in [19], interpolating the data to the desired β values by a Padé fit in g 2 0 = 6/β of the form where the coefficients u 1 and u 2 coincide with two-loop perturbation theory [52].The values of the hopping parameter κ = 1/(2am c + 8) are listed in table 5.

Autocorrelation times and error analysis
We measure the integrated autocorrelation time τ int for all measured quantities including the hadronic scales, the PCAC mass and additionally the topological susceptibility.We find the largest τ int for the scale t 0 and for the topological susceptibility χ corr as defined in [46], see figure 8, which we use as a rough estimate of the exponential autocorrelation time τ exp , cf. [45].At the smallest lattice spacing a = 0.036 fm that we reach with standard Wilson fermions we estimate τ exp 200 − 300 MDU (Molecular Dynamics Units).Our statistics of 4000 − 8000 MDU is therefore adequate but does require a particularly careful error analysis.With twisted mass fermions at maximal twist we reach a smallest lattice spacing of a = 0.023 fm (β = 6.0).There we estimate τ exp = 357 MDU and have a statistics of 63τ exp .For the twisted mass simulations at M/Λ = 4.87 and β = 5.88 , 6.0 the statistics is too small to determine τ int for χ corr .The autocorrelation times shown in figure 8 are reasonably well described by the dotted line where one has to take into account that determinations of τ exp including an error estimate are notoriously difficult.Thus the data in figure 8 is consistent with the expectation, that for simulations with open boundary conditions autocorrelation times scale with 1/a 2 .The error analysis is performed with the program6 of [45].It is based on [53] and adds a tail to the autocorrelation function as an estimate of the slow mode contribution [45].

Test of the factorization formula
We remind that our model is QCD with two heavy, mass-degenerate quarks and thus the effective theory, decQCD, is the Yang-Mills theory up to 1/M 2 corrections (N f = 2, N = 0).For the hadronic scale S = 1/ √ t 0 [1], the factorization formula eq.(3.10) takes the form We turn now to a comparison of eq. ( 5.1) to non-perturbative data.Preliminary results have been presented in [54], where only data for Wilson fermions were available.Now we can combine those data with the new simulations with twisted mass fermions and perform careful continuum extrapolations.In the extrapolations we only use data points which satisfy a 2 /t 0 (M ) < 0.32.
In order to compute the ratio in eq. ( 5.1) we write and separately take continuum limits for the two factors on the right hand side.There the mass independent scale L 1 enters, see section  Where both are available a combined continuum extrapolation is performed.
quadratic fit of ln(L 1 /a) as a function of β.We take data for L 1 /a from table 13 of [19] and add the newly determined values L 1 /a = 20.31(69) at β = 6.1569 and L 1 /a = 24.83(88) at β = 6.2483.
The first factor on the right hand side of eq. ( 5.2) is computed using the data t 0 (M )/a 2 obtained in the simulations listed in table 4 and table 5.For the simulations with standard Wilson fermions we include the b g effects as explained in section 4.2.2.We have data for five values of the quark masses given in eq.(4.1).Some of our data for the ratio t 0 (M )/L 1 are shown in the right plot of figure 9 together with their continuum extrapolations.We show the two extreme values of the quark mass, separated by a factor of 8.The extrapolations linear in a 2 /t 0 work very well and we observe that the size of cut-off effects is smaller for the twisted mass data.For this reason we opted for the twisted mass discretization to simulate masses at or larger than the charm quark mass.
In order to compute the second factor on the right hand side of eq. ( 5.2) we use the values of t 0 (0)/a 2 in the chiral limit which are known for β = 6/g 2 0 = 5.2, 5.3 and 5.5 from [55].The continuum extrapolation of √ t 0 (0)/L 1 linear in a 2 /t 0 (0) using the three β values works well, see the left plot of figure 9 and yields t 0 (0)/L 1 = 0.3881 (52).
Our continuum results for the ratio t 0 (M )/t 0 (0) are listed in table 3. Correlations of the two factors originating from the common data of the scale L 1 /a help to reduce the overall error.
Figure 10 shows the values of t 0 (M )/t 0 (0) of table 3   The values of t 0 (M )/t 0 (0) computed through eq. ( 5.2).The errors are obtained from error propagation which takes into account the correlation between the two factors in eq. ( 5.2).
We display a horizontal error stemming from the uncertainty of M/Λ originating from ΛL 1 in eq.(4.10).The vertical dotted lines mark the values of the quark mass M c , M c /2 and M c /4.We compare the Monte Carlo data to the factorization formula eq. ( 5.1), where the factor P 0,2 is computed to 2-(blue dashed line) and 4-loops (black line).The error on the factorization formula comes from the numerical values and is displayed by the gray shaded band only for the 4-loop curve.For completeness, in figure 10 the magenta line to the right shows the mass dependence in the chiral limit estimated from [55,57], cf.[58].
From figure 10 we see that there is agreement between the Monte Carlo data of table 3 and the factorization formula eq. ( 5.1) for quark masses at the charm quark mass value M c .Thus within our precision of 10% due to the uncertainty of the factor Q in eq. ( 5.3), the data match the upper error band of the perturbative prediction.In [11] we presented results for the ratio r 0 (M )/r 0 (0) and reached similar conclusions albeit with less precise data covering only the region below the charm quark mass.Our new results for t 0 (M )/t 0 (0) are much more precise than the value of Q extracted from the literature.This allows to turn the tables and predict obtained by taking M/Λ = 5.7781 in eq. ( 5.1).For t 0 (M )/t 0 (0) we use our result in the last line of table 3. We evaluate the factor P 0,2 (M/Λ = 5.7781) = 1.2328 and assign to it a conservative 2% error as it will be estimated in section 6.This determination avoids entirely the computation of the running of the coupling at high energy [38,56].In a nutshell it is replaced by perturbation theory for the difference of the running.The essential point is that the latter is given by the contribution of quark loops for which we non-perturbatively confirm that perturbation theory is very accurate.We will comment more on this in the conclusions.The mass-dependence of the ratio t 0 (M )/t 0 (0) in the theory with two massdegenerate quarks.Monte Carlo data after continuum extrapolation are compared with the perturbative predictions for 1/(QP ) at large M eq.(5.1).The gray shaded error band represent the error of the 4-loop curve (black line) deriving from Q.The dashed line is the 4-loop curve adjusting the value of Q to go through the point at M/Λ = 5.7781, see eq. (5.4).A fit function which describes the mass-dependence close to the chiral limit is also shown to the right.The vertical dotted lines mark the values of the quark mass M c , M c /2 and M c /4.

The mass-scaling function η M
By discretizing the derivative in eq.(3.13) we obtain from our simulations numerical estimates of the mass-scaling function We use this definition to compute η M (M ) at M = √ 1.28 × 0.59 and √ 2.50 × 1.28 using S = 1/ √ t 0 , 1/ √ t c and 1/w 0 .As emphasized before, these estimates differ by 1/M 2 effects.
We have data at three values of the lattice coupling β = 6/g 2 0 = 5.3, 5.5 and 5.7 for both standard Wilson and twisted mass discretizations.We can also compute a value of η M (M ) at M = √ 4.87 × 5.7781 but its statistical errors are large.For the case S = 1/ √ t 0 and M = √ 1.28 × 0.59, the simulation data are shown Examples of continuum limits of η M (M ) extracted from S = 1/ √ t 0 using a linear extrapolation in a 2 /t 0 (M ).In the left plot η M (M ) is computed at M /Λ = √ 1.28 × 0.59 using the definition eq.(5.5).Shown are data for standard Wilson (black circles) and twisted mass (red squares) and their combined continuum extrapolation.In the right plot η M (M ) is computed at M = M c (M/Λ = 4.87) using the definition eq.(5.6).
in the left plot of figure 11.The continuum value results from a combined continuum extrapolation linear in a 2 /t 0 (M ).In all our continuum extrapolations we apply the cut a 2 /t 0 (M ) < 0.32 to the data to be fitted.The plot shows the continuum extrapolation for both discretizations together with its error bands.
The continuum values of η M (M ) for the various choices of S are presented in figure 12 and plotted against Λ 2 /M 2 .Notice that the data points corresponding to different quantities S are slightly displaced horizontally for clarity of presentation.The spread of the data due to 1/M 2 effects decreases when M increases as expected.For comparison we plot in figure 12 also the 1-loop (the constant value η 0 ) and 4-loop (up to the η M 3 term) expressions, see eq. (3.23), eq.(3.24) and appendix A.
The mass-scaling function η M can also be computed directly from a simulation at a single quark mass.Using the twisted mass discretization we can rewrite eq.(3.13), for example taking S = 1/ √ t 0 , as The derivative dt 0 dµ is computed as explained in section 4.2.4.Using S = 1/ √ t c or 1/w 0 results in determinations of η M (M ) similar to eq. (5.6).
In the right plot of figure 11 we show the data for the quantity on the left-hand side of eq.(5.6) computed from our simulations at M = M c (M/Λ = 4.87) with twisted mass fermions at four values of the lattice coupling β = 6/g 2 0 = 5.6, 5.7, 5.88 and 6.0.Our fine lattices are needed to control the cut-off effects at this large value of the mass.We perform The mass dependence of the mass-scaling function η M in the theory with two mass-degenerate quarks.η M is obtained from the hadronic scales 1/ √ t 0 , 1/ √ t c and 1/w 0 and the data for a given mass M are slightly diplaced horizontally for clarity.The Monte Carlo data are compared to the perturbative curves.The dash-dotted lines are the fits eq. ( 6.1) and eq.( 6.2) for 1/ √ t c and 1/w 0 .The vertical dotted lines mark the values of the quark mass M c , M c /2 and M c /4. continuum extrapolations by "fits" to a constant.Taking three, two or just the last point yields results which are in agreement.We settle for the two-point average which of course has a larger error than the three-point one.The continuum values are plotted in figure 12, together with similar determinations of η M (M c ) from S = 1/ √ t c and 1/w 0 .At M = M c the different determinations agree well with each other signaling the smallness of the 1/M 2 corrections [31].
For our model with two charm quarks we see from figure 12 that η M is about 1/10, both in perturbation theory and non-perturbatively.For a single charm quark there is an additional factor 1/2. Thus a 2% shift of the charm quark mass leads only to a 1‰ change of a low energy hadronic quantity of mass-dimension one.
The precision of η M (M c ) that we can achieve is around 10%.Within this error the nonperturbative values agree with the perturbative one.This does not look very precise, but in absolute terms this is ∆η M = 0.01.We put this into the perspective of phenomenology in the following section.

How big are the effects of charm loops?
We recapitulate that the effects of charm loops at low energies come in two classes.One is when we are concerned with dimensionless low energy observables which do not refer to quantities at energies around or above the charm mass.In lattice slang: the quantity is long distance and the lattice spacing a is set through long distance physics in the theory with the heavy quark.In this case the value of the Λ-parameter drops out and the only effects of the heavy quark mass are due to the power corrections originating from L 2 studied in [11,31].These effects are very small.To be specific, when decoupling two charm quarks, the power corrections in ratios of hadronic scales eq.(4.8) were found to be approximately 0.4%.
The prototype for the second class is given by the connection of the fundamental scales of the four-flavor and the three-flavor theory.In our model it is the connection between the two-flavor theory and the zero-flavor theory.The very relevant question is what the uncertainty is when one uses the perturbatively computed P ,f (M/Λ f ).In section 3.3 we have seen that 3,4,5-loop corrections are very small.How big can non-perturbative effects be?The close agreement of our non-perturbative η M (section 5.2) with perturbation theory and the dashed curve in figure 10 with the non-perturbative points shows that they are small.We now put this into numbers, estimating the non-perturbative effects to η M and to P ,f (M/Λ f ) in our model calculation with N f = 2, N = 0.As will become clear, these estimates are rough and, depending on the assumptions made, can vary quite a bit.Still, their smallness can be quantified at a reasonable level.
6.1 Non-perturbative effects on η M and P 0,2 In figure 12 we include dash-dotted curves corresponding to the fits where η M pert is the 4-loop expression and η M, S NP the remainder, which depends on the quantity S. As a first estimate of the non-perturbative contribution we assume that η M, S NP is dominated by the terms in L 2 and neglect the logarithmic (in M/Λ) corrections.This means we assume for large masses.Note that the fit function eq. ( 6.1) has the correct asymptotics as guaranteed by asymptotic freedom in the form lim M →∞ g * = 0.In figure 12 we compare fits for S = 1/ √ t c and S = 1/w 0 .The fits include the Monte Carlo data of η M for M/Λ = 4.87 (the charm-quark mass) and M/Λ = √ 2.50 × 1.28 = 1.8.They yield the values c 1/ √ tc = −0.167(22) and c 1/w 0 = −0.048(39).In the following we will take c = −0.2,which is a conservative choice accommodating both values and their errors.Covering the end of the error bars at the charm would require values of |c| larger by a factor two to three.
We recall from eq. (3.12) that the mass scaling function is defined as η M ≡ ∂ log(P 0,2 ) ∂x Λ , with x = log(M/Λ).The effect of the Λ 2 M 2 term on P 0,2 (M/Λ), ∆ log(P 0,2 ) ≡ log [P 0,2 (M/Λ)] − log P 0,2 (M/Λ)| pert (6.4)Note that due to the asymptotics eq. ( 6.3) the contribution to eq. (6.4) from the integration limit at ∞ cancels in the difference.Inserting c = −0.2 and the approximate charm-quark mass value Λ 2 M 2 c ≈ 1/25 yields ∆ log(P 0,2 ) = 0.004.This means a 0.4% change (or better uncertainty) due to non-perturbative effects of the described form and magnitude.In other words a 0.4% precision for perturbation theory in the conversion of the Λ-parameter.We consider this a good estimate, but it clearly depends on the assumptions made.Therefore, we present a second, very conservative, estimate.
As illustrated in figure 13, we split the integral into where M pert is high enough such that h and therefore B can be neglected or replaced by the previous estimate.For the lower mass region we just bound The ratio eq. ( 6.12) of the mass-dependence function η M computed from the hadronic scales S 1 = 1/ √ t 0 and S 2 = 1/ √ t c .The lines in the red and blue bands are fits assuming leading non-perturbative effects proportional to (Λ/M ) 2 and Λ/M respectively.
where h max is the maximum of |h(x)| in the interval log(M/Λ) ≤ x ≤ log(M pert /Λ).
Numerical information is now obtained by making the reasonable assumption that beyond the masses that we have reached η M continues approaching the perturbative one.We can then replace h max by what we find for our largest mass, η M (M c /Λ) − η M pert (M c /Λ) = −0.006(13) or |h| ≤ 0.019.Further setting M pert = 3M c where 1/M 2 terms are suppressed by an order of magnitude compared to at M c , we arrive at |A| ≤ 0.021.We here took the scale S = 1/ √ t 0 but the others yield numbers which are very close.Given that no decay of |h| is used this is likely an overestimate of the integral and we neglect the small piece B. We thus cite as the conservative estimate ∆ log(P 0,2 ) = 0.02 , ( a 2% non-perturbative contribution to P 0,2 .

Power corrections
In eq. ( 6.2) we made the assumption that the non-perturbative effects are dominated by the leading (Λ/M ) 2 ones for our largest masses.It was tested in [31] for ratios of two different hadronic scales S 1 /S 2 in the same mass-range.We corroborate it for the case of η M by computing the ratio R = η M, S 1 η M, S 2 (6.12) of η M calculated as in eq.(3.13) from two different hadronic scales.Using eq. ( 6.1) and eq.( 6.5) we see that In figure 14 we show the results for the choice S 1 = 1/ √ t 0 and S 2 = 1/ √ t c .The line in the red band is a fit to the two largest mass points using the assumption in eq.(6.2) and neglecting higher order terms in R. It yields h pert with a χ 2 per degree of freedom equal to 0.003.For comparison we also show the line in the blue band which corresponds to nonperturbative effects proportional to Λ/M .It yields h with a worse χ 2 per degree of freedom equal to 2.4.
We can use the fits to the ratio R to estimate the size of non-perturbative effects in the difference of log(P 0,2 ) extracted from S 1 and S 2 : for the difference of log(P 0,2 ) eq. (6.13).This difference is a further test of the nonperturbative effects.The absolute values are significantly smaller than the conservative estimate in eq.(6.11), confirming the latter.

Heavy quark content of the nucleon
The matrix element of the scalar heavy quark density between nucleon states is a relevant contribution to the cross-section for the scalar interaction of dark matter with ordinary matter [59].It can be related, by the Hellmann-Feynman theorem, to the derivative of the nucleon mass m N with respect to the heavy quark mass.In the chiral limit for the up, down and strange quark and up to O(Λ 2 /M 2 q ) this derivative is the mass-scaling function η M , see eq. (3.13), where m q,0 is the bare heavy quark mass and (qq) 0 is the bare scalar density of quark q, and (qq) RGI is the RGI-renormalized scalar heavy quark density.Our result in figure 12 shows that perturbation theory can be safely applied to compute η M as it was done in [15,60,61] and non-perturbative effects in η M are below 0.02/2 for the case of a single charm quark as just discussed.

From the model to QCD
Note that currently the precision for the Λ-parameter is at the level of around 4% [18,62].This sets the scale for what is small and what is big.Furthermore, there is no reason why our toy-model computation should give a significantly different result for the magnitude (not the details) of non-perturbative effects except that we have decoupled two heavy quarks.Indeed, since we are dealing with small effects of quark loops, it is very plausible that the effect of more than one quark-loop effects are smaller than the ones of a single quark loop, which scales proportionally to the number of quarks.We are here just counting quark loops in arbitrary gauge backgrounds, so the argument is valid independently of whether the gauge coupling is large or small.It is non-perturbative.It means that these small effects will be about a factor two smaller for the decoupling of the charm-quark in QCD, compared to the studied model.We use this for the magnitude of all effects, also for the uncertainty of perturbation theory.
We saw in table 1 that the dependence on the number of light quarks of η M between N = 0 and N = 3 amounts to about 20% at leading order in perturbation theory.For this reason we include a safety margin of 50% in our estimate of non-perturbative effects h max , see eq. (6.10).We conclude that one can safely neglect non-perturbative effects all-together for connecting three-flavor and four-flavor Λ at a level down to ∆ log(P 3,4 ) = 0.015 , (6.15) a 1.5% non-perturbative contribution to P 3,4 .
In the same way non-perturbative effects to eq. (6.14) are estimated to be below 1.5 × h max /2 = 0.014 in QCD when q is the charm quark.

Conclusions
In this article we presented a numerical study of the decoupling of heavy quarks.In particular we study the dependence of hadronic, low energy quantities on the mass M of the decoupled heavy quark.We define and compute in perturbation theory a massdependence function η M eq.(3.13).This computation is performed in leading order in the effective theory which describes the decoupling of the heavy quarks at low energy.We study the behavior of perturbation theory for the function η M and show that perturbation theory by itself suggests that it is well within the region of asymptotic convergence even for the case of decoupling a charm quark.We remark that η M can be related to the heavy quark content of the nucleon, see eq. (6.14), which is a relevant input for dark matter searches.
To test the applicability of perturbation theory at the charm quark mass we compare the mass dependence of the ratio t 0 (M )/t 0 (0) defined in terms of the hadronic scale 1/ √ t 0 to the perturbative prediction, see figure 10.We also determine the mass-scaling function η M non-perturbatively, see figure 12.In order to be able to control the continuum extrapolations and have precise results we do this in a model consisting of two massdegenerate quarks whose mass ranges up to the charm quark mass.The non-perturbative mass dependence agrees with the perturbative prediction at a level of about 10% for the small mass-scaling function η M computed at the charm quark mass.This means that we confirm that a 2% shift of the charm quark mass leads only to a 1‰ change of a low energy hadronic quantity of mass-dimension one.We explained in section 6 that this precision is good enough to conclude that at the charm mass, the function P ,f in eq.(3.9) can be predicted by perturbation theory with 2% accuracy for N f = 2, N = 0 and 1.5% accuracy for N f = 4, N = 3.This allows to predict Λ MS √ t 0 N =0 = 1.134(28) .(7.1) Moreover we estimate that the non-perturbative effects in η M are below 0.014 for the charm quark.These numbers are for the blue curve in figure 13, while we think that the red curve ∼ 1/M 2 is more realistic; it yields non-perturbative uncertainties which are a factor five smaller for P ,f .
On the other hand, in the direct comparison of t 0 (M c )/t 0 (0) to the product Q P , eq. (3.10) we presently have only 10% accuracy because in the literature the ratio, Q is not known more precisely.
Our most important conclusion concerns phenomenology: the ratio of three-flavor and four-flavor Λ-parameters can be computed in perturbation theory with a precision of 1.5% or better.Power corrections ∼ 1/M 2 c were found to be much smaller in low energy observables [11,31].This means that the Λ-parameter of the five-flavor theory is safely predicted at the 1-2 percent level from three-flavor low energy physics once the running of the coupling is under control [63], see section 6.4 for details.Note that the present precision of ∆α MS (M Z ) = 0.0008 of [63] corresponds to 3.5% in the Λ-parameter.Thus, there is plenty of room for relevant improvement within the three-flavour theory.
Similarly we conclude that non-perturbative effects to the charm quark content of the nucleon, eq.(6.14) are below 0.014.The columns show the lattice sizes, the gauge coupling β = 6/g 2 0 , the boundary conditions (periodic (p) or open (o)), the hopping parameter κ (which is related to the bare mass m 0 through κ = 1/(2am 0 + 8)), the PCAC mass am, the ratio of the RGI mass M to the Λ parameter (computed using eq.(4.10)), the scales r 0 /a and t 0 /a 2 and the total statistics in molecular dynamics units.

C Simulation parameters
In table 6 we list the values of the hadronic scale L 1 /a [19,32].At β = 5.3, 5.5 they are taken from Table 7 of [19].At the other β values they are obtained from a quadratic fit in β of ln(L 1 /a), where data for the latter are taken from Table 13 of [19].The lattice spacing for β > 5.5 (not covered by the simulations in [19]) can be inferred from the value L 1 = 0.400 (10)fm determined in [19].

C.1 Mass corrections
The data for a hadronic scale S such as r −1 0 , t −1/2 0 obtained from the simulations with standard Wilson fermions are corrected for small mismatches of the values M/Λ compared to the target values M t /Λ given in eq.(4.1), see table 4.This is done by fitting the β = 5.7 data to the form aS(M ) = s 1 × (M/Λ) α , (C.1) with fit coefficients s 1 and α.This fit formula is motivated by eq.(3.10) taking the asymptotic expression P = (M/Λ f ) η 0 .For example for S = 1/ √ t 0 we get α = 0.123 (2) and for S = 1/r 0 we get α = 0.139 (12) which are close to η 0 = 0.121212.The corrected values S(M t ) are computed as ln(aS(M t )) = ln(aS(M )) + α ln(M t /M ) .(C.2) Note that eq.(C.2) being a small correction is applied for all lattice spacings a. Moreover the Λ parameter drops out in eq.(C.2).Since the main contribution to the error on M/Λ comes from ΛL 1 , it does not affect the mass corrections.In order to determine the final Table 5: Overview of the ensembles generated with N f = 2 twisted mass fermions at maximal twist.The columns show the lattice sizes, the gauge coupling β = 6/g 2 0 , the hopping parameter κ (for maximal twist), the twisted mass parameter aµ, the ratio of the RGI mass M to the Λ parameter (computed using eq.(4.10)), the scales r 0 /a (where it is measured) and t 0 /a 2 and the total statistics in molecular dynamics units.error of aS(M t ), we propagate the error of the exponent α and linearly add its contribution (for a conservative estimate) multiplied by a factor of two.
No corrections is needed for the hadronic scales from twisted mass simulations since their parameters are tuned for the target mass values, see section 4.2.

Figure 2 :
Figure 2:The functions η m (g 2 ), η M (g 2 ) for the case N f = 4, N = 3.The number of loops corresponds to the highest loop order of the β function which is used.

Figure 3 :
Figure 3: The functions η m (g 2 ), η M (g 2 ) for the case N f = 5, N = 4.The number of loops corresponds to the highest loop order of the β function which is used.
(3.1) and (3.4) using µ = m = m * .For reference the resulting relation is plotted in the left panel of figure 4 together with the values for M c /Λ and M b /Λ which were obtained from the PDG values [18] for m c /Λ and m b /Λ, and inverting eq.(3.1).Of course, in case of the charm quark N f = 4 and in the case of the bottom quark N f = 5 were used.

Figure 4 :
Figure 4: Left: The relation between M/Λ and g * at 5-loop.Right: The 2, 3 and 4-loop relation divided by the 5-loop one for the case of N f = 2, N = 0.

Figure 8 :
Figure 8: Autocorrelation times derived from observables which are expected to have large overlap with the slowest modes in the simulation are plotted as a function of t 0 (M )/a 2 .The dotted line represents eq.(4.15).

Figure 9 :
Figure 9: Continuum extrapolations of √ t 0 /L 1 , using a linear extrapolation in a 2 /t 0 .The shaded bands are the extrapolation errors.The left plot shows the results for t 0 (0) in the chiral limit.The right plot shows the results for t 0 (M ) at M/Λ = 0.59 (upper data set in the plot) and M/Λ = 4.87 (the charm quark mass M c , lower data set in the plot).Black circles represent the standard Wilson and red squares the twisted mass discretizations.Where both are available a combined continuum extrapolation is performed.

Figure 10 :
Figure10: The mass-dependence of the ratio t 0 (M )/t 0 (0) in the theory with two massdegenerate quarks.Monte Carlo data after continuum extrapolation are compared with the perturbative predictions for 1/(QP ) at large M eq.(5.1).The gray shaded error band represent the error of the 4-loop curve (black line) deriving from Q.The dashed line is the 4-loop curve adjusting the value of Q to go through the point at M/Λ = 5.7781, see eq.(5.4).A fit function which describes the mass-dependence close to the chiral limit is also shown to the right.The vertical dotted lines mark the values of the quark mass M c , M c /2 and M c /4.

Figure 12 :
Figure12:The mass dependence of the mass-scaling function η M in the theory with two mass-degenerate quarks.η M is obtained from the hadronic scales 1/ √ t 0 , 1/ √ t c and 1/w 0

Figure 13 :
Figure 13: The integrand of eq.(6.5) for the scale S = 1/ √ t 0 .Data points are η M − η M pert .The red line corresponds to the estimate eq.(6.6) and the blue line represents eq.(6.10) together with the exp(−2x) decay from M = 3M c on.The vertical dotted lines mark the values of the quark mass M c and 3M c .

Table 2 :
Perturbative values of P ,f defined in eq.(3.9) for various cases of interest, see main text for details.

Table 4
and table 5 summarize the parameters of our simulations of N f = 2 mass-degenerate quarks using O(a) improved standard Wilson fermions and twisted mass Wilson fermions at maximal twist respectively.

Table 4 :
Overview of the ensembles generated with N f = 2 O(a) improved Wilson fermions.

Table 6 :
3. The values of the scale L 1 /a used in our simulations and the corresponding lattice spacings.