A Hybrid Renormalization Scheme for Quasi Light-Front Correlations in Large-Momentum Effective Theory

In large-momentum effective theory (LaMET), calculating parton physics starts from calculating coordinate-space-$z$ correlation functions $\tilde h(z, a,P^z)$ in a hadron of momentum $P^z$ in lattice QCD. Such correlation functions involve both linear and logarithmic divergences in lattice spacing $a$, and thus need to be properly renormalized. We introduce a hybrid renormalization procedure to match these lattice correlations to those in the continuum $\overline{\rm MS}$ scheme, without introducing extra non-perturbative effects at large $z$. We analyze the effect of ${\cal O}(\Lambda_{\rm QCD})$ ambiguity in the Wilson line self-energy subtraction involved in this hybrid scheme. Beyond the extent of a lattice, we propose extrapolating the data to a large asymptotic distance through Regge-type behavior. We also propose to apply Bayesian constraints to avoid remnant contributions in unphysical regions in the factorization reconstruction of physical parton distributions.

In large-momentum effective theory (LaMET), calculating parton physics starts from calculating coordinate-space-z correlation functionsh(z, a, P z ) in a hadron of momentum P z in lattice QCD. Such correlation functions involve both linear and logarithmic divergences in lattice spacing a, and thus need to be properly renormalized. We introduce a hybrid renormalization procedure to match these lattice correlations to those in the continuum MS scheme, without introducing extra non-perturbative effects at large z. We analyze the effect of O(ΛQCD) ambiguity in the Wilson line self-energy subtraction involved in this hybrid scheme. Beyond the extent of a lattice, we propose extrapolating the data to a large asymptotic distance through Regge-type behavior. We also propose to apply Bayesian constraints to avoid remnant contributions in unphysical regions in the factorization reconstruction of physical parton distributions.

I. INTRODUCTION
Parton physics is important both for understanding the dynamics of high-energy collisions of hadrons and for studying their internal structure [1,2]. The most familiar examples are quark and gluon parton distribution functions (PDFs) which, on one hand, provide the beam information for high-energy productions at colliders [3], and on the other hand, describe the bound-state physics of the colliding hadrons.
The key idea of LaMET is that partons in the infinite momentum frame (IMF) can be approximated by physical properties of a hadron at large but finite momentum. Due to the existence of ultraviolet (UV) divergences, this approximation is not completely straightforward. It requires using the standard EFT technology of matching and running. Detailed investigations have shown that the standard DGLAP evolution [33][34][35] has its origin in the momentum evolution of physical properties of the hadron [31].
In LaMET applications, one begins with lattice calculations of spatial correlation functions. Since lattice breaks the continuum symmetry, power divergences appear in bare correlation functions.
They must be subtracted when matched to those in a continuum scheme such as dimensional regularization and (modified)-minimal subtraction, MS. In the past, the main approaches suggested in practical applications include the regularization-independent momentum subtraction method or RI/MOM [36][37][38][39] and the ratio method [40][41][42][43]. The latter relies on the validity of Euclidean operator product expansion (OPE) and can only be applied to correlations at short distances, and therefore cannot be used directly for LaMET applications. In contrast, the RI/MOM method appears to be applicable to large-z (z is the gauge-link length) distance at first glance. However, a detailed examination shows that this method introduces potential non-perturbative effects through infrared (IR) logarithms such as ln(z 2 µ 2 ) (µ is a renormalization scale) in the scheme matching. Since UV divergences are supposed to be perturbative in asymptotically-free theories such as QCD, it is important to find a renormalization procedure which does not introduce extra non-perturbative effects.
To achieve this, we propose in this paper a hybrid renormalization procedure for lattice correlations in LaMET applications. At short distances where OPE is valid, the standard RI/MOM or ratio method is recommended. At large distances, we suggest to use the auxiliary field formalism [44][45][46][47] which has been advocated in LaMET applications by a number of authors [48][49][50]. In this formalism, the Wilson line is replaced by two-point functions of the auxiliary field. The linear divergence in lattice correlation functions is then linked to the mass renormalization of the auxiliary field, whereas the logarithmic divergence appears in the renormalization of the "heavy"-light "currents" at the end of the Wilson line. Both divergences can be separately renormalized in a manner which is consistent with the MS scheme [49,50]. Although the mass subtraction of the Wilson line has been suggested before [51,52], it has not been put into wide practical use because, to our knowledge, a reliable approach to calculate the non-perturbative mass has not been well-established in the literature. Here we suggest several ways to do so which shall be investigated through systematic lattice simulations in the future.
In addition, we also address several other issues that are important in extracting parton physics using LaMET, e.g., how to match appropriately to the continuum scheme near z ∼ 0 and how to utilize the asymptotic behavior of relevant correlation functions at large lightfront (LF) distance to supplement the limited lattice data and avoid unphysical truncation effects. We also make a comparison between the large momentum and short distance expansions. In the Appendix, we propose to use Bayesian constraints in the inverse construction of the physical distribution to avoid non-vanishing contributions in unphysical regions.

II. PARTONS AS QUANTA OF INFINITE-MOMENTUM STATES AND LARGE-MOMENTUM EXPANSION
Let us begin with a brief overview of the parton formalism. In the textbooks, PDFs are usually defined in terms of LF correlations in QCD [53,54]. A LF is defined by t − z = constant, if a massless particle is travelling along the z-direction, with variations of other coordinates, t + z and transverse-space dimensions, defining a three-dimensional front surface. Introduce two independent LF four-vectors with dimension-one parameter Λ, then p 2 = n 2 = 0, and p·n = 1. Different LFs are defined by different coordinate distance λ along the n-direction.
Consider now the quark PDFs in a state |P with mass M and four-momentum P µ = (P 0 , 0, 0, P z ) = p µ + (M 2 /2)n µ , which can be used to solve for Λ = (P 0 + P z )/2. Use ψ to denote a full-QCD quark field, the LF correlation function in coordinate space is, where Γ is a Dirac matrix, W is a straight Wilson-line gauge link, and λ is the LF distance. All other coordinates have been taken to be zero. Due to the invariance of the LF under Lorentz boosts along the z-direction, the above correlation function is independent of the momentum P z . Quite often, P z is taken to be zero. The quark PDF is just the Fourier transform of the above LF correlation [54], In this way, partons can be studied without using the usual EFT machinery, although they are indeed effective degrees of freedom (dof's) to describe the LF collinear modes. The reason is that, the parton dof's are automatically projected out through the LF correlators applied to the full QCD state |P . On the other hand, these parton dof's can also be explicitly separated in the QCD Lagrangian, as is done in soft-collinear effective theory (SCET) where they are represented by LF collinear fields [55][56][57]. . In the traditional parton formalism, the correlations are explicitly time-dependent, or in other words, the operators are in the Heisenberg picture. As such, we say that the formalism is Minkowskian and thus difficult for Monte Carlo simulations due to the famous "sign" problem. If one chooses ξ − = (t + z)/ √ 2 as the "new time" coordinate, and integrates it out, one obtains a Hamiltonian formalism for partons, which has been called LF quantization (LFQ) in the literature [58]. LFQ is also a very difficult formalism to work with, despite the fact that much progress has been made [59].
An alternative parton formalism can be obtained by adapting Feynman's original idea about partons to the context of a field theory [5,31]. Feynman considered [60] the momentum distribution of a composite system, f (k z , P z ), where P z is the center-of-mass momentum and k z the longitudinal momentum carried by the parton whose transverse momentum k ⊥ has been integrated over. The P z -dependence of the momentum distribution is clearly a relativistic effect: According to Poincaré symmetry, the Hamiltonian of a system depends on the frame, and changes under Lorentz boosts according to, where K i (i = 1, 2, 3) are the boost operators. Therefore, the wave functions are frame-dependent, leading to frame-dependent momentum distributions. Another feature of the momentum distribution of a system is that it is a static or time-independent quantity. In QCD, it is related to the following spatial correlation, where W (z, 0) is a spacelike straight gauge link, and N is a kinematic factor depending on the Dirac matrix. Feynman then considered the infinite-momentum limit, assuming that such a limit exists, i.e., the relevant correlation function for partons is It is clear that in field theories this is a non-trivial limit. In fact, it can be shown that such a limit only exists in asymptotically-free theories, where the high-momentum modes are perturbative [31].
If one ignores the subtlety of this limit, the correlation in Eq. (7) is related to that in Eq. (2) by an infinite Lorentz transformation [4]. Our "new" form of parton formalism works with time-independent correlators and the IMF wave function. Since the operator is timeindependent, it is in the Schrödinger representation of parton physics if an analogy between time-translation and Lorentz boost is made [31].
In QCD, however, the correlationsh(λ) and h(λ) are different. The difference arises from the presence of the UV cut-off. In the physical momentum distribution, the cut-off must always be much larger than the hadron momentum. As a result, the parton momentum is allowed to be larger than the hadron momentum, or |x| can be larger than 1, without violating any laws of physics. On the other hand, the standard PDFs have support |x| ≤ 1, corresponding to a UV cut-off smaller than the hadron momentum. Thanks to the asymptotic freedom, these two different UV procedures can be connected to each other by perturbation theory in QCD. This makes it possible to extract LF parton physics defined in Eq. (2) from the Euclidean form in Eq. (7).
Eq. (7) is the starting point of the LaMET expansion, where we first compute the quasi-LF correlation functions at a finite, but large momentum P z Λ QCD . To make the expansion work, in principle one needsh(z, P z ) with −∞ < z < ∞, orh(λ, P z ) at all quasi-LF distances λ = zP z . While in reality, of course, due to the finite volume, lattice data will always stop at some large z(λ) which we call z L (λ L ). We will deal with issues of finitẽ λ L later. For the discussion in this section, we assume thath(λ, P z ) is known in [−∞, ∞], i.e., in the wholeλ range at a large P z .
With the above quasi-LF correlation, one can make a straightforward Fourier transformatioñ The physical interpretation off (y, P z ) hinges on the large momentum expansion [4,[61][62][63] f (y, where µ is a factorization scale, Λ QCD is the hadronic scale, and f (x, µ) is the standard PDF that can be extracted from the above equation. Clearly, the validity of this expansion relies on the smallness of the expansion parameters Λ 2 QCD /[y 2 (P z ) 2 ] and Λ 2 QCD /[(1 − y) 2 (P z ) 2 ]. The C factor in the above equation can be calculated perturbatively. At leading-order in α s , we can identifỹ f (y, P z ) with f (y, µ) (ignoring the power corrections for the moment), thus they have the same asymptotic behavior as y → 0 and y → 1. Beyond leading-order, this will be changed by perturbative corrections. To see this, let us take the following simple form of f (x, µ) as an example with a, b controlling the asymptotic behavior at x → 0 and x → 1, respectively. The perturbative one-loop corrections lead to the following change in the asymptotic behavior δf (y, P z ) ∼ α s y a ln y as y → 0.
When resummed to all orders in perturbation theory, this yields a power law behavior of the formf (y, P z ) ∼ y a+γ with γ being associated with the anomalous dimension of the operator definingf (y, P z ). Similar behavior also occurs as y → 1 for realistic PDFs with b > 0. This can also be seen from the coordinate space analysis to be presented below. Therefore, for a given large P z , there is a range of y where high-order corrections as well as power corrections are small, and this range can be translated into a valid range x for the PDFs. Thus, one can systematically obtain the PDFs in an interval [x min , x max ] (x min will approach 0 and x max approach 1 as P z → ∞). In other words, the LaMET expansion provides a natural way to calculate parton distributions in an interval of the parton momentum x, similar to extracting parton distributions from experimental processes.

III. A HYBRID RENORMALIZATION PROCEDURE
As explained in the previous section, the LaMET expansion starts from calculating the coordinate-space correlation functionsh(z, P z ) at large momentum P z and for the whole range of distance −∞ < z < ∞. On a discrete lattice with spacing a, the nonlocal quark bilinear operator that definesh(z, P z ) in Eq. (5) can be multiplicatively renormalized as [48,49,52] up to lattice artifacts [36,64]. Here the operator on the l.h.s. is defined in terms of bare fields and couplings, denoted by the subscript "B", while the operator on the r.h.s. is renormalized and denoted by the subscript "R". There are both z-independent logarithmic and z-dependent linear divergences. The former arises from the renormalization of quark and gluon fields as well as the vertices at the endpoints of the Wilson line, which is included in the factor Z(a), while the latter comes from the Wilson-line self-energy, which is factored into the exponential e δm|z| with δm being the "mass correction" . A number of proposals have been made in the literature [36,37,40,42,43,49,51,65,66] to renormalize the above lattice correlation functionsh(z, a, P z ), among which the RI/MOM scheme has frequently been used [36,37]. In this approach, one calculates the matrix elements (amputated Green's function) Z(z, −p 2 , a) of the bilocal operators O(z, a) in a deep Euclidean state with momentum squared −p 2 Λ 2 QCD in a fixed gauge, and then defines MS operators as, where Z MS converts the RI/MOM renormalized result to the MS scheme. The gauge and −p 2 dependence cancel between two Z-factors. The r.h.s. has a proper continuum limit a → 0 without divergences. However, while the RI/MOM approach is justified for local operators, it has potential problems when applied to nonlocal ones. For instance, when z becomes large, Z MS (z, −p 2 , µ 2 ) contains IR logarithms of z and the perturbative calculation of z-dependence is not reliable. Moreover, although the RI/MOM factor Z(z, −p 2 , a) helps to cancel the lattice UV divergences, the composite operator at large-z contains non-perturbative physics as well. Therefore, both Z-factors contain non-cancelling non-perturbative effects which alter the IR properties of O(z). Thus, the RI/MOM renormalization scheme is not reliable at large-z. Moreover, when gluon distributions are involved, it requires external off-shell gluon states which bring in potential mixing with gauge-variant operators and make things much more complicated [67].
In addition to the renormalization issues at large distances, there are also subtleties for renormalization at short distances. While the standard renormalization of a bilocal operator makes it finite at any non-vanishing z, it remains divergent in the z → 0 limit. In fact, the two limits, a → 0 and z → 0, are not interchangeable.
To resolve these issues, we propose in this section a hybrid scheme to renormalize the correlation functions.
The key point of this scheme is that we separate the correlations at short and long distances and renormalize them separately, and match both procedures at an intermediate distance z S . The matching point must lie within [0, z LT ] where the leading-twist (LT) approximation for the correlation operator is valid. Discussions on the value of z LT can be found in Sec. V.
A. Renormalization at short distance 0 ≤ |z| ≤ zS To renormalizeh(z, a, P z ) for 0 ≤ |z| ≤ z S , particular attention shall be paid to the behavior of the correlation functions in the limit z → 0.
In the continuum MS sheme, the z → 0 limit is not smooth and additional logarithmic UV divergences ∼ ln z 2 arise. However, this is not the case for the lattice matrix elementh(z, a, P z ). For finite lattice spacing and non-vanishing z,h(z, a, P z ) includes UV divergences related to the wave function renormalization of the bare fields, of the form α s (a) ln(z 2 /a 2 ). At small z, particularly when z = 0 or a,h(z, a, P z ) has discretization effects and is related to the lattice-regulated local matrix elementψΓψ. In particular, when Γ = γ µ ,ψγ µ ψ is conserved and its matrix element is finite in the a → 0 limit. A function demonstrating this interesting interplay between lattice regulator and small physical distance is The above discrepancy in the small-z regime can be removed through a perturbative conversion between lattice regularization and the contiuum MS scheme, which, however, is known to converge slowly. Instead, a more efficient strategy is to cancel the ln z 2 -dependences through lattice renormalization, which corresponds to a scheme "X" that is different from MS. As long as z is in the leading-twist region |z| ≤ z S where z S is smaller than z LT , the difference between the X-scheme and MS can be calculated in perturbation theory.
For example, the X-scheme can be implemented by forming the ratio ofh(z, a, P z ) and another matrix element of the same operator O(z, a), , where the renormalization factor Z X corresponds to different choices of the matrix element. Possible choices for Z X include • Amputated Green's function of O(z, a) in a singleparticle deep Euclidean state, fixed in a particular gauge, e.g., Landau gauge, which defines the RI/MOM-type of schemes [36][37][38][39].
• Matrix element of O(z, a) in a hadron state with P z = 0, depending on applications [40,41].
In the second and third option, the matrix elements are gauge invariant, and therefore no gauge fixing is needed.
For the third option, the quantum numbers of the operator must be the same as those of the vacuum. As discussed above, z S has to be smaller than z LT , which is estimated in Sec. V to be about 0.25 ∼ 0.33 fm. Of course, the stability of the final result with respect to small variations of z S shall be explicitly verified. Due to the multiplicative renormalizability of the operator O(z, a), all UV divergences cancel in the ratio in Eq. (14), thus allowing us to take the continuum limit, where the term after the first equal sign refers to a MS calculation of the same ratio with corresponding to dimensional regularization d = 4 − 2 in the continuum theory. In the limit |z| → 0, the ln z 2 -dependence is independent of the external state, so it cancels in the ratio, making the latter finite at z = 0. Moreover, in the leading-twist region z ≤ z S ≤ z LT , we can perturabtively match the ratio for any X-scheme to the LF correlation h(λ, µ) through the coordinate-space factorization formula [63,68] where C X is the matching coefficient, and we have suppressed its dependence on the renormalization scale in the X-scheme such as the RI/MOM. Also, higher-twist contributions have been suppressed in the above equation.
The one-loop matching coefficient for the second option for Z X has been obtained in Refs. [63,68,69], and that for the RI/MOM scheme can be extracted from Ref. [36,63]. The two-loop results for both the second and third option have been calculated as a series expansion in Ref. [43].
Eq. (16) also has an equivalent form in momentum space [63], with one-loop results for the RI/MOM scheme calculated in Refs. [13,15,37], and two-loop results for the second option extractable from Ref. [70].
B. Renormalization at large distances zS ≤ |z| ≤ zL At large z S ≤ |z| ≤ z L where z L is the size of the lattice, UV renormalization needs a careful assessment because both the RI/MOM and the ratio scheme will introduce undesired non-perturbative effects. The only renormalization approach that will not introduce such extra non-perturbative physics is the explicit and separate subtraction of linear divergences (or δm) and logarithmic divergences [48,49,51,52], which in principle can be done using the auxiliary field method [49,50].
To calculate the mass renormalization δm of the Wilson line, there exist many suggestions in the literature.
Here we provide a probably incomplete list: • One can fit the hadron matrix element at large z, where the dominant decay is δm can be obtained by fitting the ratio ln(h(z + a, a, P z )/h(z, a, P z )) to a constant in z at large z.
Of course, the result has to be independent of P z , e.g., one can choose P z = 0. This approach has not yet been studied in the literature before.
• One can use the single-quark Green's function as in the RI/MOM renormalization factor, which asymptotically goes like Z(z) ∼ exp(−δm|z|).
This matrix element needs a fixed gauge, and has been studied in Refs. [17,71].
• On can also use the vacuum matrix element of O(z, a) which is gauge invariant. S(z) again at very large z behaves like S(z) ∼ exp(−δm|z|). This has been considered in Refs. [42,43].
• One can also calculate the vacuum expectation value of the Wilson line W (z) directly in a fixed gauge, and again W (z) ∼ exp(−δm|z|) at large distance. This has been considered in Refs. [49,50] using the auxiliary field method.
The mass renormalization δm has to be gaugeindependent, just like the pole mass of a quark [77]. In the above suggestions where no gauge fixing is needed, this is obviously true. In the cases where a gauge-fixing is needed, one can demonstrate that the results in any other gauge are the same by constructing appropriate gauge-invariant operators [78]. Despite being gaugeindependent, δm will depend on the specific action used in Monte Carlo simulations and on the definition of the matrix elements above. In the cases of vacuum matrix elements, δm may be interpreted as the non-perturbative pole mass in certain gauges [78].
The δm calculated from all the matrix elements above will have the following dependence on the lattice spacing a, where m −1 is the coefficient of the power divergence, which is independent of the specific matrix element. The a-independent term m 0 has a more complicated origin. It can arise from various sources: • Renormalon effect: In principle, m −1 can be calculated perturbatively, and is proportional to α s (a) at leading order, just like the mass counterterm in the Wilson formulation of fermions. However, the perturbation series is not convergent. When truncated at order n ∼ 1/α s , the perturbation series has an uncertainty of order aΛ QCD , which generates a contribution to m 0 [79,80]. This means that in non-perturbative fitting, m −1 is determined only with an uncertainty of the order of aΛ QCD . Thus, an additional m 0 contribution of order Λ QCD is expected.
• Pole mass: For certain matrix elements, like vacuum elements of bilocal operators, the z dependence can be viewed as originating from the pole mass of a meson consisting of an infinitely-heavy quark and a light one. In this case, δm is the pole mass apart from the linear divergence.
• Finite P z effects: The correlation function at finite P z has a long-range correlation, exp(−z/ξ), where ξ is the correlation length. This contribution is included in m 0 .
• Fitting effect. Since the data is always in finite z where the exponential decay cannot always be separated from an algebraic decay, there is a data fitting uncertainty contributing to m 0 .
To summarize, m 0 depends on the lattice matrix-element used and the fitting procedure [17,22,23,71]. In Fig. 1 we show, as an example, the values of m −1 and m 0 determined from the quark RI/MOM renormalization factor calculated at the scale µ R = 1.8 GeV and p z R = 0, using the four ensembles with a ≈ {0.045, 0.06, 0.09, 0.12}fm and 310 MeV pion mass from MILC collaboration [81]. Inspired by the asymptotic behavior at large z to be studied in Sec. IV, we use the following simplified form to fit the renormalization factors at four different lattice spacings. It is worth pointing out that m −1 starts from O(α s ) in perturbation theory. We therefore also include its dependence on the coupling in the fitting. The thus determined m −1 depends logarithmically on a. In principle, one can choose to subtract the power divergent piece only, namely m −1 /a. The less-welldetermined m 0 term can be left in the lattice matrix elements, the momentum expansion will take care of the rest. Indeed, the difference between subtracting different m 0 is O(1/P z ) effect, as has been demonstrated perturbatively in [61]. More precisely, assuming there are two quasi-LF correlations that define the quasi-PDFs and differ from each other by a factor e −m|z| with m ∼ Λ QCD , then after Fourier transforming into momentum space, they are related bỹ with δ = m/P z . If thef 's are square integrable and their first order derivatives are continuous, one can show that as δ → 0, Therefore, the ambiguity of different schemes disappears in the large P z limit. However, this is still unsatisfactory because it appears that, due to non-perturbative effects from the linear divergence, the LaMET expansion will be an expansion in powers of M/P z instead of (M/P z ) 2 , which will significantly reduce the speed of convergence. Here we consider possible ways to overcome this deficiency. Recall that the LaMET expansion is in (M/P z ) 2 is made in the MS scheme where no linear divergence exists, in a general scheme this expansion might contain odd powers in 1/P z . Nevertheless, it is possible that there is a way to choose m 0 such that the condition of the MS scheme can be matched non-perturbatively. We shall denote such a value of the subtracted mass by Unfortunately, none of the m 0 in the above list can be identified with m 0c , and to determine a properly nonperturbatively-defined m 0 is challenging, although in principle one could try the following: First calculate the pole masses of a heavy-light meson and light-light meson, and subtract the former by one half of the latter; or calculate the mass of a heavy-heavy quarkonium in the heavy quark limit. The critical mass m 0c is of the order of Λ QCD and genuinely non-perturbative. Therefore, a more practical strategy could be to vary m 0 in a certain range near Λ QCD , and identify the value m 0c for which the sensitivity on P z of the renormalized quasi-distributions is the weakest. This is very much like searching for the critical value of κ c for Wilson fermions for which one encounters a similar power divergent bare quark mass [82]. Systematic lattice studies of m 0c for various fermion formulations are clearly called for.
The mass-subtracted operator O(z, a)e −δm(a)|z| has no power divergence, but still has logarithmic dependence on a. These remnant logarithmic divergences are independent of z and can be renormalized, in principle, using the auxiliary field method [49,50]. However, a more convenient option in practice is to fix the renormalization constant Z hybrid (a) by directly matching the renormalized matrix elements of O(z, a) at z = z S from the short and long distance regimes, which is essentially a continuity condition, Z hybrid e −δm|zS| P |O(z S , a)|P = P |O(z S , a)|P Z X (z S , a) , which leads to Z hybrid (z S , a) = e δm|z S | /Z X (z S , a) .
In this way, one only has to calculate δm. Of course, one needs to vary z S to check whether the final result is stable.
The matching coefficient C hybrid for the long distance regime is related to that for the ratio scheme C ratio perturbatively. For example, if one adopts the P z = 0 matrix element for renormalization [63,68,69], then However, due to the logarithms of z 2 µ 2 and z 2 /z 2 S , the above matching coefficient is only valid for |z| Λ −1 QCD , otherwise one has to resum the large logarithms for |z| ∼ Λ −1 QCD by evolving α s to a highly nonperturabtive regime. Since our ultimate goal is to Fourier transform the final result to obtain the PDF, this will introduce uncontrolled sytematics.
To have a clearer way of separating the perturbative and nonperturbative regimes, we can perform the matching in momentum space, where nothing prevents using the correlations at large z, provided that P z is sufficiently large. The limitation to small z in coordinate space is translated to the limitation to the accessible x range in momentum space. The momentum space matching coefficient is given by the double Fourier transform from where C ratio (ξ, µ 2 /p 2 z ) can be found in [63], and λ S = z S p z with p z = xP z being the parton momentum. The plus function is defined as In momentum space, the matching coefficient includes the logarithm of µ/(xP z ) which becomes nonperturbative for x ∼ Λ QCD /P z . This is consistent with the power counting parameter Λ 2 QCD /(yP z ) 2 . Therefore, the nature of the systematic uncertainties is clear, and we can keep improving precision at small x by pushing to higher P z .

IV. ASYMPTOTIC BEHAVIOR AT LARGE DISTANCE zL ≤ z ≤ ∞
Due to the finite volume, lattice calculations of quasi-LF correlations at finite momentum always end up with data at finite λ L = P max z L . However, to reconstruct the full parton distribution, we need the correlations at all quasi-LF distances. In most LaMET calculations that have been carried out so far, the parton distributions are extracted using correlations truncated at a maximum quasi-LF distance, inducing a sizable truncation/oscillation effect in the final result. This can be greatly improved if we utilize our knowledge on parton distributions at small and large x which determines the asymptotic correlations at large light-cone distance.
At small x, it is well known that parton distributions behave asymptotically like x a , as suggested by Regge theory [83]. For the non-singlet combination, the leading Regge trajectory indicates that a ∼ −1/2. For the singlet combination, its mixing with gluon distributions under evolution makes things more subtle. In the so-called soft pomeron model, one has a ∼ −1. However, scattering data at large momentum transfer indicate a more singular asymptotic behavior, reflecting the potential need for a contribution of the hard pomeron [84]. At large x, the asymptotic behavior is dictated by the quark counting rules [85]. As x → 1, the hadron momentum is carried by the struck quark and no momentum is left for other spectator partons. The asymptotic behavior is then predicted to be (1 − x) b , where b = 2n s − 1 + 2|∆S z | with n s being the minimum number of spectator partons and ∆S z the difference of the spin projections for the struck parton and the parent hadron [84,86]. For example, for a valence quark in the proton b = 3 (5) if the struck quark has helicity parallel (antiparallel) to the proton as n s = 2 and |∆S z | = 0 (1), while for the pion one has b = 2 since n s = 1 and |∆S z | = 1/2. The above features have been widely used in global fits of PDFs, where one parametrizes the PDFs such that they behave as x a for x → 0 and (1 − x) b for x → 1 and fit the powers a, b to a large variety of experimental data. The role of such a power law behavior in global fits has been examined in detail in Refs. [87,88].
Here we utilize the above features to help extrapolate the coordinate space correlations to asymptotically large quasi-LF distance that cannot be reached on a lattice. When Fourier transformed to coordinate space, the asymptotic behavior described above implies that the correlation in the longitudinal space decays algebraically as λ −α (α is a positive number related to a, b) rather than exponentially, and thus has an infinite correlation length. A similar algebraic decay behavior was also observed in a recent analysis of the LF wave functions [59] when Fourier transformed to conjugating coordinate space [89].
To see how the asymptotic behavior can help improve our determination of parton distributions from lattice data, let us begin with the following simple form of parton distributions that incorporates the above behavior, The coordinate space matrix element can be defined as from which it follows that at large λ whose real (imaginary) part is even (odd) in λ, ensuring that parton distributions are real functions in momentum space. Therefore, the conjugate light-cone correlations behave at large λ as λ −α(a,b) with α(a, b) = min(a + 1, b + 1).
In most cases we are interested in, α(a, b) = a+1. Applying the matching in Eq. (9) converts the light-cone correlations to quasi-LF correlations, and also induces logarithmic corrections to the asymptotic behavior. At leading logarithmic (LL) accuracy, such corrections can be resummed leading to terms ∼ exp(γ ln z 2 µ 2 ) = (z 2 µ 2 ) γ which modifies the asymptotic behavior tõ This provides a useful approximation to the quasi-LF correlations at large λ. Although sub-leading effects might induce some further changes to this behavior, Eq. (37) can be viewed as a reasonable assumption. Based on the above discussion which indicates the quasi-LF correlations also decay algebraically at large λ, we can take the following strategy to improve the determination of parton distributions from lattice data in a limited range: Below the truncation point, we use lattice data. Above the truncation point, we use the extrapolated form (taking λ > 0 as an example) to accommodate the two different structures in Eq. (35). The parameters c i , d i can be determined by requiring that the extrapolated form give the same values as the lattice data at points closest to the truncation point. To illustrate the above idea, we take the DA of the kaon as an example. The data are taken using the ensemble with a ≈ 0.12 fm and 310 MeV pion mass from MILC collaboration [81]. The kaon momentum is P z = 4 × 2π/L (L ≈ 2.88 fm), and the renormalization is done using the RI/MOM scheme with the scale µ R = 1.8 GeV and p z R = 0. In the upper panel of Fig. 2, we show the comparison of the lattice data with their extrapolation using Eq. (38), where inspired by approximate isospin symmetry we have taken d 1 = d 2 . Since the errors of the renormalized lattice matrix elements increase significantly at large z, we choose to truncate the lattice data at z 0 = na where the data are consistent with 0 within 3 standard deviations. The unknown extrapolation parameters in Eq. (38) are then determined by requiring that the results are continuous at the closest z ≤ z 0 within errors. We choose n = 7 in the present case. Due to the oscillating phase in Eq. (38), in general there exists more than one solutions for the parameters. A useful criterion in practice could be to reproduce as closely as possible the available lattice data beyond z 0 . The extrapolation in Fig. 2 corresponds to With the extrapolated data, we can then Fourier transform to momentum space and apply the matching in Ref. [90] to extract the DA of the kaon. The lower panel of Fig. 2 shows the one-loop matched results obtained from lattice data up to z = 12 a and extrapolated data up to z = 100 a. The unphysical oscillating behavior that plagues the former disappears in the latter. The result after extrapolation also gets closer to the physical region. The contribution in unphysical regions can be further reduced by the use of a Bayesian prior outlined in the Appendix.
We shall point out that for a finite P z , the correlationh(z, P z ) has a finite correlation length ξ ∼ 1/P z , and thus has an exponential decay exp(−z/ξ) at large z.
The contribution of this exponential decay in the Fourier space is the power-suppressed higher-twist term. Thus by extrapolating in term of algebraic decay, one eliminates in a certain sense some higher-twist terms and empirically makes the expansion look somewhat better than it is.

V. LARGE MOMENTUM VS. SHORT DISTANCE EXPANSION
The Euclidean correlator in Eq. (5) introduced in Ref. [4] has also been considered in coordinate-space factorizaton (CSF) [40], which was introduced in an early work on meson DAs with current-current correlators [91] (see also [92]). The correlator can be factorized in terms of the LF correlations with expansion parameter (zΛ QCD ) 2 . The formalism is naturally suited for calculating moments of PDFs or short-distance LF correlations. To obtain the full parton physics, however, one has to simultaneously consider the constraint on the external momentum P z ∼ 1/z Λ QCD . This is identical to the observation in Ref. [5]: One must use large momenta to capture the full dynamical range of PDFs, which requires information on long-range correlations in λ. Despite complete equivalence [63,93], some analytical matching calculations might more conveniently be done in coordinate space. Not surprisingly, the same LaMET lattice data are needed for a CSF analysis to get the PDFs. Nominally, CSF can also admit data at small P z , but the same information is contained already in large P z data at smaller z.
The CSF expansion is formulated in terms of the Euclidean distance z, which is required to be small, i.e., to ensure the validity of perturbation theory and leadingtwist dominance. Assuming the largest z for the leading-twist approximation to be z LT (say, the value of z for which the higher-twist contribution is at the level ∼ 20%), then the small expansion parameter is (z LT Λ QCD ) 2 1 when potential linear divergences are subtracted before the expansion is made. Therefore, only the matrix element of O(z) within the range [0, z LT ] has a simple interpretation in terms of leading-twist parton physics.
An interesting question is then: What is the value of z LT ? If we take Λ QCD ∼ 300 MeV, and z LT Λ QCD = 1/2 ∼ 1/3 as a small parameter, then the estimate is that z LT is around 0.25 ∼ 0.33 fm. An upper limit is probably 0.4 fm. A good estimate of z LT can be provided by comparing the matrix element P = 0|O(z)|P = 0 or Ω|O(z)|Ω , both of which have been proposed to renormalize the bare quasi-LF correlation [40,42,43], to the leading-twist contributions in their OPE.
Let us take the zero-momentum matrix element for the isovector case as an example. In the MS scheme, it has a short distance expansion of the form [63,68] h(z, µ, P z = 0) = 1 2M P = 0|ψ(z)γ 0 W (z, 0)ψ(0)|P = 0 where W (z, 0) is a spacelike straight gauge link. Here µ is the MS renormalization scale, and a 0 = 1 is the conserved lowest moment of the correpsonding twist-2 PDF. The one-loop Wilson coefficient is [63] c 0 (µ 2 z 2 ) = 1 + α s (µ)C F 2π and the two-loop results can be found in Ref. [43]. According to Eq. (12), the mass-subtracted matrix element includes logarithmic divergences that are independent of z and should not constitute significant corrections in lattice perturbation theory. Therefore, we can roughly approximate its OPE by replacing µ with 1/a, where the lattice discretization effects are expected to be of O(a 2 /z 2 ). Since the lattice matrix elements are convergent as z → 0, which is contrary to the logarithmically divergent behavior in the MS OPE, we expect the above approximation to be reliable within the range a < |z| < z LT where the discretization and higher-twist effects are both suppressed. Note that lattice OPE is usually complicated by the broken Lorentz symmetry and operator mixings. Nevertheless, since for P z = 0 the only leading-twist contribution comes from the conserved vector current, we can ignore such effects here. In Fig. 3 we plot the mass-renormalized pion lattice matrix element. The bare lattice matrix element comes from a recent calculation of the pion valence PDF on an ensemble with a = 0.06 fm and pion mass m π = 300 MeV [94]. On the same lattice ensemble, the Wilson-line mass correction δm was fitted from the quark-antiquark potential [17], and its value is given in lattice units as aδm = −0.1568. The leading-twist contribution is plotted with next-to-leading-order (NLO) corrections and NLO correction plus LL resummation for fixed α s , Here we choose α s in the MS scheme at scale 1/a as the input for OPE, which should allow for better convergence than the bare lattice coupling [82]. To estimate the uncertainty from the choice of α s , we vary the MS scale from 1/(2a) to 2/a. Though a standard procedure of improvement shall be performed to define α s on the lattice, we expect that it will not alter the following conclusion. As one can see, for z ≤ a, the lattice result is significantly different from the leading-twist approximations due to discretization effects. As z increases, the agreement becomes better. However, for z ≥ 0.3 fm, the lattice result starts to deviate dramatically from the leadingtwist approximations, showing that the higher-twist contributions become significant. Therefore, we can roughly estimate that z LT ∼ 0.3 fm. One can also look at the case of the better established heavy-quark potential. It is well-known that the static heavy-quark potential receives both perturbative and non-perturbative contributions. The perturbative static potential is known up to N 3 LL level [74] and can be expressed in terms of the QCD running coupling constant. In Refs. [95][96][97], the running coupling constant has been extracted from lattice calculation of the static energy at short distances. The N 3 LL perturbative result agrees well with lattice data up to r ∼ 0.2 fm. However, it is well-known that the perturbative series for the static potential suffers from a renormalon ambiguity [98,99] and breaks down at large distance. The non-perturbative heavy-quark potential can be simulated using lattice QCD, and is well-known to be dominated by the linear term of the form σr at large distance. Phenomenologically, the static potential can be well approximated by the linear+Coulumb QCD static potential, V (r) = − e r + σr where e ≈ 0.25 ∼ 0.5 while √ σ ≈ 477 MeV [100][101][102]. When the perturbation theory is about to break down, the perturbative contribution − e r and the confining contribution σr should be of the same order of magnitude, which determines r c ≈ √ e √ σ to be around 0.2 ∼ 0.3 fm. This is consistent with the result in Refs. [95,103,104]. The boundary z LT should be of the same order of magnitude.
With the estimated z LT above, one can also define can be used to extract parton distributions with the matching formula in Eq. (16) [40,63,93]. Thus, the coordinate-space approach is useful for extracting the LF correlation functions in a limited range, with the LF distance ranging between [0, λ LT ].
The CSF approach has also been used for products of currents made of quark bilinears [91,92,[105][106][107][108][109][110]. Renormalization of power divergences in the CSF expansion for the quark and gluon blinears with Wilson line is easier to handle. In particular, a version of the ratio method which divides by the matrix element at zero momentum, can be used to eliminate the power divergences in the lattice matrix elements [40,41]. On the other hand, the current products can also be used in LaMET expansion after Fourier transforming into momentum space [31].
However, the CSF does not allow to directly obtain the momentum space distribution, because to achieve this, one needs the LF correlation at all LF distances. Thus, to reconstruct the parton distributions, one has to use parametrizations on the functional form of the x dependence, just like in phenomenological fits of parton distributions, and then Fourier transform it to coordinate space and fit to a limited range of the LF correlations [41,94,[109][110][111][112][113][114][115]. This is an uncontrolled process, with no reliable method to estimate errors. Note that although the CSF method allows for model-independent extraction of the PDF moments, the requirement for short distance will limit them to be the lowest ones.
From a different angle, the above practice amounts to postulating (or modeling) certain correlations between short and long distance behaviors of the LF correlations. Such a postulate has no first-principles foundation and the parton distribution obtained this way cannot be claimed to be generated by first principle lattice calcu-lations. The PDFs obtained in this way naturally suffer from the same ambiguity as in global fits, and it can happen that the correlation data in the limited range be fitted pretty well by more than one parametrizations that have completely different asymptotic behavior [110].
To make a more direct comparison between the momentum space and coordinate space approaches, let us consider the following example. Assume the quasi-LF correlations defining the quasi-PDFs behave likẽ with h(λ) being the light-cone correlator. The exponential e −mz with m ∼ Λ QCD is used to model high-twist contributions. From this equation, it is clear that if one stays in position space, the validity of the perturbative matching is controlled by the exponential factor e −mz . Only when z 1 m , the quasi-LF correlator can be matched to the light-cone one. The available range of λ is therefore much smaller than P z m , which indicates that the number of moments n that one can access is at most of order P z m , n ∼ P z m . Therefore, for P z m around 3 or 4, which is the practical value currently available, the coordinate space approach can only be used to access at most the first three or four moments.
On the other hand, after Fourier transforming to momentum space, one can show that as P z → ∞, the quasi-PDF approaches the light-cone one at a linear speed: Therefore, although in coordinate space the large z information is corrupted by higher-twist contributions in a way that is independent of P z , in momentum space the higher-twist contributions are controlled by 1/P z . This implies that for P z sufficiently large, the momentum space approach allows to access the full shape of the PDF, in comparison to only the first few moments in the coordinate space approach. This improvement is due to the utilization of both small and large z information, as emphasized above. From the discussion above, it is clear that the momentum and coordinate space expansions are different expansion schemes. Even though they are equivalent in the infinite momentum limit, they are different at finite momentum P max . In the latter, the information is filtered directly in coordinate space. One gets parton correlations in a range of LF distance which correspond to the number of moments controlled by 1/P z . While the former uses all the coordinate space information, filtering higher-twist physics in momentum space through 1/(x 2 (P z ) 2 ) 1. Therefore one gets partron distributions in an interval of x, with systematic control of errors, which can be directly compared with experimental data.

VI. CONCLUSION
To conclude, we have discussed some further subtleties in renormalization and matching of the quasi-LF corre-lations on lattice. We proposed a hybrid renormalization procedure to treat the short and long distance correlations separately. The short distance correlations can be renormalized by dividing the same correlator sandwiched in different external states, whereas the long distance ones are renormalized using the Wilson line mass renormalization. In this way, we avoid introducing extra non-perturbative effects at large distance in the renormalization stage. We also proposed how to extrapolate to large quasi-LF distance beyond the reach of lattice simulations by utilizing the asymptotic long-range behavior of the correlations, thus avoiding truncations in the ensuing Fourier transform. We finally compared the large-momentum expansion with the CSF approach when applied to LaMET data, showing that the former is a systematic expansion while the latter is not. In the Appendix, we proposed to use Bayesian constraints in the inverse construction of the PDF to avoid contributions in the unphysical region. Our proposal here has the potential to greatly improve current computational strategies in lattice applications of LaMET.

ACKNOWLEDGMENTS
We thank the Extended Twisted Mass Collaboration and the Brookhaven Lattice Group for providing the lattice matrix elements of nucleon and pion. We also thank MILC collaboration for providing the configurations, and J. Hua and Y. Huo for sharing the RI/MOM renormalization factor and kaon DA data generated on these configurations. We thank Y. In current PDF extractions using LaMET, we have been mostly relying on a direct inversion of the matching formula which can be handled only up to O(α s ) and the leading-twist. Due to subleading-twist and other effects, the extracted light-cone distribution does not vanish out-side the physical region. Sometimes, this problem can be quite serious [24].
A commonly used strategy to solve such a problem is to parametrize the physical distribution with a functional form and determine the unknown parameters by a minimal χ 2 fit to the data. This has been used, e.g. in [17,20,110,113]. Such a strategy has the advantage that the thus determined distribution is automatically limited to the physical region. However, the parametrization uncertainties are hard to quantify. In certain cases, different functional forms might fit the available lattice data equally well [110].
With the hybrid renormalization and extrapolation proposed in this paper, we expect the contribution in unphysical regions to be mild. Nevertheless, we propose an option to further improve it. That is, to use a Bayesian approach supplemented with appropriate constraints such that there is no contribution in the unphysical region. A similar proposal has also been made in a slightly different context in Ref. [116] (see also [117]). In the Bayesian approach, a constraint can be imposed by choosing a prior probability distribution that assigns probability 1 to the set of parameters satisfying the constraint. In the present case, the prior probability can be chosen to be 1 if the extracted light-cone distribution lies in the physical region only. If we denote such a prior probability distribution as θ, then applying Bayes' theorem gives the following posterior distribution where f denotes the physical distribution that lies in the physical region, and P (A|B) is a conditional probability characterizing the likelihood of A given that B is true.
Assume the final distribution f is characterized by a set of parameters as f (a n ), we need to determine the parameters a n by maximizing the above probability distribution with respect to the choice of a n . As suggested in Ref. [116], the probability distribution P (f |f, θ) can be written as the quadratic distance functional e −L with wheref 0 i represent the quasi-distribution data corresponding to the choice of an initial set of parameters of a n , and is the covariant matrix of the measured lattice dataf h i . Moreover, the prior probability P (f |θ) can be chosen, for example, as the exponential of the Shannon-Jaynes entropy e S(f,f 0 ) , in the maximum entropy method [118] with