Proton structure functions at NLO in the dipole picture with massive quarks

We predict heavy quark production cross sections in Deep Inelastic Scattering at high energy by applying the Color Glass Condensate effective theory. We demonstrate that when the calculation is performed consistently at next-to-leading order accuracy with massive quarks it becomes possible, for the first time in the dipole picture with perturbatively calculated center-of-mass energy evolution, to simultaneously describe both light and heavy quark production data at small $x_\mathrm{Bj}$. We furthermore show how the heavy quark cross section data provides additional strong constraints on the extracted non-perturbative initial condition for the small-$x_\mathrm{Bj}$ evolution equations.

Introduction -Probing the properties of the nonlinearly behaving gluonic matter in protons and nuclei at high energies is a major science goal of the future Electron-Ion Collider (EIC) [1][2][3]. Measuring the total and heavy quark production cross sections in Deep Inelastic Scattering (DIS) off nuclei is especially intriguing, as non-linear saturation effects are enhanced in heavy nuclei [4]. The EIC will be able to perform very precise total cross section measurements over a relatively wide kinematical domain characterized by the gluon longitudinal momentum fraction x Bj and the photon virtuality Q 2 .
Non-linear gluon saturation effects are expected to have a modest effect on structure functions in the EIC kinematics (see e.g. [5][6][7]). To unambiguously determine the existence of non-linear QCD dynamics at collider energies and to quantify its role on the small-x Bj structure of protons and nuclei, it is likely necessary to perform a global analysis of the future proton and nuclear DIS data at small x Bj . In particular, it will be important to include both the inclusive and heavy quark production data that have different sensitivities on saturation effects in order to extract in detail the properties of the QCD matter at extremely large parton densities. Charm production is an especially powerful process as the charm mass is large enough to suppress non-perturbative effects, but simultaneously light enough to allow one to access QCD dynamics in the non-linear regime.
To describe QCD dynamics at high energies, where parton densities are very large and emergent non-linear phenomena dominate, it is convenient to use the Color Glass Condensate (CGC) [8,9] effective field theory framework. The DIS process is then naturally described in the dipole picture [10,11], where the photon splits into a quark-antiquark pair long before the interaction with the target. The interaction of the quark dipole with the target is then taken to be eikonal, i.e. the transverse coordinates of the partons do not change when they traverse through the target color field. In this picture, leadingorder (LO) calculations including a resummation of the high-energy logarithms α s ln 1/x to all orders (where α s is the strong coupling) within the CGC framework have been successful in describing the precise proton structure function data from HERA [12][13][14]. This suggests that the HERA data is compatible with the hypothesis that gluon saturation is manifest at HERA energies. In addition, calculations based on collinear factorization have also found the resummation of the high-energy logarithms to be important in order to describe the details of the HERA data [15].
The structure function data is used to constrain the non-perturbative initial condition for the small-x Bj evolution equations. Therefore, a good description of the total cross section data is crucial when applying the CGC framework to describe any other scattering process (e.g. proton-nucleus collisions at the LHC [13,[16][17][18][19][20]). Compatibility with the available cross section data is also required when developing a realistic description for the early stages of heavy-ion collisions [21], needed to extract the fundamental properties of the Quark-Gluon Plasma.
In this Letter, we present predictions for heavy quark production cross sections in DIS using the non-perturbative initial condition for the perturbative Balitsky-Kovchegov (BK) small-x Bj evolution equation [22,23], determined from the fits to total DIS cross section data in [24]. The predicted heavy quark cross sections are shown to be in excellent agreement with the HERA data [25]. This is the first time in the CGC framework that a simultaneous description of total and heavy quark production data is achieved in calculations where the energy dependence is obtained by solving the small-x Bj evolution equation. A crucial ingredient here is the next-to-leading order (NLO) accuracy in α s recently achieved for the massive impact factors from firstprinciple light-cone perturbation theory calculations [26][27][28]. We also demonstrate how the heavy quark production data can provide additional constraints for the extracted non-perturbative initial condition of the BK evolution.
The results presented here are from the first-ever nu- merical calculation of the heavy quark structure functions in the dipole picture at NLO. The successful description of the HERA data demonstrates that future global analyses are feasible and can be applied to probe in detail gluon saturation at the LHC and future EIC, where nuclear targets with larger saturation scales are available.
Structure functions at high energy -Using the optical theorem, the total virtual photon (γ * ) -proton (p) cross section can be obtained from the forward elastic γ * + p → γ * + p scattering amplitude. In the dipole picture, the γ * +p scattering is described in terms of eikonal interactions between the partonic Fock states of the photon and the target color field, and perturbatively calculable impact factors describing the photon fluctuations to the given partonic states. Eikonal interactions with the target are encoded in the Wilson lines, which are the scattering matrix elements for bare partons propagating through the target color field.
At NLO the contributing photon Fock states are the quark-antiquark |qq and quark-antiquark-gluon |qqg states. Therefore, at NLO the total virtual photon cross section can be schematically decomposed into two parts. The first contribution (illustrated in Fig. 1a) corresponds to the case where the qqg system crosses the shockwave: The second contribution (illustrated in Fig. 1b), which includes the lowest-order part (interaction with an unevolved target) and the one-gluon-loop QCD corrections to it, reads Here K qq and K qqg refer to the perturbatively computed NLO impact factors obtained with massive quarks in [26][27][28] and in the massless quark limit in [29][30][31]. In addition, the notation ⊗ refers to an integral over the parton transverse coordinates x i and longitudinal momentum fractions in the mixed space. Additionally, N 01 and N 012 are correlators of two or three Wilson lines, where the subscripts 0, 1, 2 refer to the transverse coordinates of the quark, antiquark and the gluon. In terms of the Wilson lines V (x) in the fundamental representation these correlators read: Here · · · refers to the average over the target color charge configurations, N c is the number of colors, In addition, we have used the mean-field limit (which is a precise approximation [32]) to factorize the expectation value of the product to a product of expectation values.
The Wilson lines and their correlators satisfy smallx Bj evolution equations describing their dependency on the center-of-mass energy (see [33] for a detailed discussion of the evolution variable). The dipole amplitude N 01 satisfies the BK equation [22,23] and via Eq. (4) N 012 also depends on the center-of-mass energy. The evolution rapidity depends on the lower limit of the emitted gluon longitudinal momentum fraction [24,34]. The integration over the emitted gluon phase space in Eq. (1) contributes a large logarithm of energy that modifies the scattering amplitude of the original dipole N 01 . These logarithms are resummed into the BK equation [34]. The BK equation and a numerical solution to it are known at NLO [35][36][37]. We use the initial condition fitted to the HERA data in [24] including only massless quarks, where the full (numerically heavy) NLO BK equation has been approximated by evolution equations that use different schemes to resum the most important higherorder corrections. The same evolution equations, Re-sumBK [38,39], KCBK [40] and TBK [33] referring to different resummation schemes, are used in this work as in [24].
The structure functions are written in terms of the total virtual photon-target cross sections as Here the subscripts T and L refer to the transverse and longitudinal virtual photon polarization, respectively, and σ γ * T,L correspond to a sum of qq and qqg contributions. The experimental data is reported in terms of the reduced cross section where y = Q 2 /(sx) is the inelasticity and √ s is the lepton-nucleon center-of-mass energy.
Results -We calculate the proton reduced cross section σ r and the charm and bottom contributions to it (σ r,c and σ r,b ). We use the NLO dipole-proton scattering amplitudes determined in [24], available at [41]. In particular, we use the "light quark" fits of [24] where only the massless u, d and s quarks are included and the nonperturbative initial condition is fitted to the light quark contribution of the reduced cross section data measured at HERA [42]. This contribution is determined in [24] by subtracting interpolated charm and bottom quark contributions from the total cross section data. We do not include the fits to the inclusive HERA data as they use the massless quark cross sections to fit the inclusive data containing a substantial heavy quark contribution.
In [24] multiple different fits are reported, corresponding to different choices for the initial evolution rapidity Y 0,BK and different schemes for the coordinate space running coupling and resummations of particular higherorder corrections. In total there are 12 fits reported for massless quarks. All different fits result in an approximately equally good description of the light quark contribution to the HERA structure function data.
We calculate predictions for the charm production cross section in the region x Bj < 0.01, 2.5 GeV 2 ≤ Q 2 < 50 GeV 2 using all the different fits from [24], and compare the result to the HERA data from [25] in order to find which fits (if any) are allowed by the heavy quark production data. The charm mass (in the pole mass scheme used in the calculation of [27]) is allowed to vary within 1.1 GeV < m c < 1.6 GeV. We consider a fit to be compatible with the HERA charm production data if one obtains χ 2 c /N 2.5 with the optimal charm mass. We find that predictions calculated by using three of the 12 fits are in excellent agreement with the charm production data. This is illustrated in Fig. 2, where a comparison to the HERA reduced cross section data in a few selected Q 2 bins is shown. The H1 and ZEUS collaborations have also measured inclusive b quark production [25], but due to the larger uncertainties and more limited kinematical coverage we do not use this dataset to determine which NLO fits from [24] are allowed. We however note that using each of the three fits discussed above an excellent description of the b quark production data is obtained. In each case we find χ 2 b /N 1.6 when the b quark mass is also fitted to this data.
The excellent agreement with the predicted heavy quark production cross sections and the HERA measurements shows that at NLO it is possible to simultaneously describe all small-x Bj proton structure function data. The results also demonstrate that the inclusion of the heavy quark production data to the extraction of the non-perturbative initial condition for the high-energy evolution equation provides additional tight constraints. Similar conclusions have also been made in calculations of exclusive heavy quarkonium production [43,44]. The advantage of the charm reduced cross section studied in this work is that one does not need to introduce an additional model uncertainty related to the non-perturbative vector meson structure.
The fits that are found to be compatible with the charm quark production data are summarized in  [24] that are compatible with the heavy quark production data from HERA. The corresponding charm and bottom masses are also shown. The terminology used to specify the resummation scheme and the running coupling prescription follows that of [24], and the abbreviation PD refers to parent dipole and BSD to Balitsky + smallest dipole [45] running coupling.
along with the determined optimal heavy quark masses. The fact that the heavy quark data provides additional strong constraints for the determination of the initial condition for the BK evolution is expected. The heavy quark cross section is sensitive to much smaller dipoles than the inclusive one which can not discriminate fits that differ only at small dipole sizes. We note that the heavy quark production data only allows fits where the BK evolution is started at initial rapidity Y 0,BK = 0. In the second class of fits considered in [24] the dipole is frozen in the low-energy region 0 < Y < Y 0,BK = ln 1 0.01 where Y is the evolution rapidity. This is not completely consistent as the qqg production cross section (1) in the soft gluon limit results in a (leading order) BK evolution for the dipole. Additionally, we note that (in the case of ResumBK and KCBK evolutions formulated in terms of the projectile rapidity) the parent dipole prescription for the running coupling is preferred. We interpret that these physical constraints from the heavy-quark production data are because charm and bottom production probe dipole amplitudes in the perturbative region, and contribution from large dipoles dominating in light quark production with N ∼ 1 is suppressed, see e.g. [6]. Based on the observations above we argue that the fits summarized in Table I are the ones that should be used in all NLO CGC calculations. The potential deviation between the predictions is then a measure of the model uncertainty after the nonperturbative input is constrained by all HERA structure function data.
To more clearly illustrate the compatibility of the NLO CGC calculation with the most recent precise HERA data from [46], we show in Fig. 3 the total reduced cross sections computed using the dipole amplitude fits allowed by the charm data. We emphasize that this is the first time in the CGC framework that a simultaneous description of both the total and heavy quark production cross section is obtained when a perturbative small-x Bj evolution equation is used to describe the center-of-mass energy dependence.
Previous LO analyses have found it impossible to perform such a global fit to the HERA data without intro-  ducing, for example, additional parameters that render the proton probed by a charm quark dipole different from the proton probed by a light quark dipole [12]. A similar approximative NLO evolution equation as in this work was used in [14] but coupled to the LO impact factor. In that case, it was also found impossible to simultaneously describe the inclusive and heavy quark production data. When the computation is promoted to full NLO accuracy the mass dependence is modified for two reasons. First, after including higher-order corrections to the BK equation (in projectile rapidity), the dipole amplitude does not anymore evolve towards an asymptotic shape with an anomalous dimension γ < 1 (at small dipole sizes r the amplitude behaves as N ∼ r 2γ ) [47]. Instead, the anomalous dimension (which is γ 1 in the fits reported in [24]) remains approximatively constant suppressing the dipole amplitudes at small dipoles [24,37]. Hence, the heavy quark production cross section is suppressed relative to light quark production. Second, adding the NLO corrections to the massive impact factor enhances the heavy quark production. With TBK evolution we have opposite systematics: a small γ is developed and the impact factor suppresses heavy quark production. The net effect of these two competing NLO corrections is such that the mass dependence of the cross section matches that of the HERA data when the three fits identified in this work are used.
Finally, we illustrate the remaining theory uncertainty when performing NLO CGC calculations. We calculate predictions for the the proton longitudinal structure function F L , and for the charm and bottom quark contributions to it, in the EIC kinematics. We take x Bj = 2·10 −3 , and show in Fig. 4 the structure functions as a function of Q 2 calculated using the three fits determined above. For the bottom structure function, the different fits result in almost identical predictions for the EIC, whereas for charm production the predictions begin to differ at Q 2 20 GeV 2 . On the other hand, in the total longitudinal cross section a significant difference up to 20% is seen at all Q 2 . Therefore, an inclusion of the future F L data in the global analysis will provide further constraints for the initial condition of the small-x Bj evolution. The currently available F L data from HERA [48] is not able to distinguish between the different fits. Discussion -We have calculated heavy quark production cross sections in DIS at NLO in the CGC framework.
The Bjorken-x Bj dependence is obtained by solving the BK evolution equation with an initial condition extracted by fitting the total DIS cross section data in [24]. We identify a small subset of the fits reported in [24] that result in predictions for the charm and bottom structure functions which are in excellent agreement with the HERA data [25]. These three fits, constrained by both the total and heavy quark cross section data summarized in Table I, should be used in all future phenomenological applications at NLO accuracy. This is the first time in the CGC framework with perturbative energy evolution when a simultaneous description of all small-x Bj proton structure function data is obtained. A good agreement with the HERA measurements is a crucial test for the gluon saturation physics incorporated in the CGC framework, and enables rigorous studies of non-linear QCD dynamics in DIS and other scattering processes. In particular, we demonstrate that global analyses including all small-x Bj structure function data are feasible at NLO and that the heavy quark production data can provide additional constraints in such analyses.
As an application, we have calculated predictions for the proton longitudinal structure function F L , which will be measured accurately at the future Electron-Ion Collider. We reported predictions separately for the inclusive and heavy quark production cross section, and showed that the remaining model uncertainty is moderate. Including the F L data to the global analysis will further constrain the non-perturbative initial condition for the small-x Bj evolution equations.
To fully explore the model uncertainties one should perform a global analysis to the HERA inclusive and heavy quark production data, taking into account the correlated experimental uncertainties, and extract the non-perturbative model parameters with their uncertainties directly from such an analysis. Additional constraints and more detailed probes of non-linear dynamics can be obtained by including other observables such as diffractive structure functions and exclusive cross sections. Such studies are becoming feasible thanks to the extensive progress toward NLO accuracy in the CGC framework, see e.g. Refs. [43,44,[49][50][51][52][53][54][55][56][57][58]. In the future, we plan to perform a full Bayesian analysis to determine the likelihood distribution for all the model parameters, which will enable one to also fully take into account the propagation of model uncertainties.
Acknowledgments -We thank T. Lappi for discussions, and V. Apaja for computational support. This work was supported by the Academy of Finland, the Centre of Excellence in Quark Matter and projects 338263, 346567, 321840, 347499 and 353772. This work was also supported under the European Union's Horizon 2020 research and innovation programme by the European Research Council (ERC, grant agreement No. ERC-2018-ADG-835105 YoctoLHC) and by the STRONG-2020 project (grant agreement No. 824093). J.P. is supported by the Finnish Cultural Foundation. The content of this article does not reflect the official opinion of the European Union and responsibility for the information and views expressed therein lies entirely with the authors. Computing resources from CSC -IT Center for Science in Espoo, Finland and from the Finnish Grid and Cloud Infrastructure (persistent identifier urn:nbn:fi:research-infras-2016072533) were used in this work.