Polarized Parton Distributions at an Electron-Ion Collider

We study the potential impact of inclusive deep-inelastic scattering data from a future electron-ion collider (EIC) on longitudinally polarized parton distributions (PDFs). We perform a PDF determination using the NNPDF methodology, based on sets of deep-inelastic EIC pseudodata, for different realistic choices of the electron and proton beam energies. We compare the results to our current polarized PDF set, NNPDFpol1.0, based on a fit to fixed-target inclusive DIS data. We show that the uncertainties on the first moments of the polarized quark singlet and gluon distributions are substantially reduced in comparison to NNPDFpol1.0, but also that more measurements may be needed to ultimately pin down the size of the gluon contribution to the nucleon spin.


Table 1: The three EIC pseudodata sets [19]. For each set we show the number of points $N_{\rm dat}$, the electron and proton beam energies $E_e$ and $E_p$, the center-of-mass energy $\sqrt{s}$, the kinematic coverage in the momentum fraction $x$, and the average absolute statistical uncertainty $\langle \delta g_1 \rangle$.

Figure 1: [...] [5] and the EIC pseudodata from [19]. The shaded bands show the expected kinematic reach of each of the two EIC scenarios discussed in the text.

[...] the electron and proton beam energies $E_e$, $E_p$; the corresponding center-of-mass energies $\sqrt{s}$; and the smallest and largest accessible values of the momentum fraction, $x_{\rm min}$ and $x_{\rm max}$ respectively. The kinematic coverage of the EIC pseudodata is displayed in Fig. 1 together with the fixed-target DIS data points included in our previous analysis [5]. The dashed regions show the overall kinematic reach of the EIC data with the two electron beam energies $E_e = 5$ GeV or $E_e = 20$ GeV, corresponding to each of the two stages at eRHIC. It is apparent from Fig. 1 that EIC data will extend the kinematic coverage significantly, even for the lowest center-of-mass energy. In particular, hitherto unreachable small-$x$ values, down to $10^{-4}$, will be attained, thereby leading to a significant reduction of the uncertainty in the low-$x$ extrapolation region. Furthermore, the increased lever arm in $Q^2$, for almost all values of $x$, should allow for much more stringent constraints on $\Delta g(x, Q^2)$ from scaling violations.
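As a rough cross-check of the kinematic reach discussed above, the center-of-mass energy and the smallest accessible $x$ can be estimated from the beam energies. The sketch below neglects the electron and proton masses, so that $s \approx 4 E_e E_p$, and uses $x = Q^2/(y s)$. The proton energies, the $Q^2$ value and the maximum inelasticity are illustrative assumptions, not values taken from Tab. 1.

```python
import math

def sqrt_s(e_electron, e_proton):
    """Center-of-mass energy in GeV for head-on beams, with masses neglected."""
    return math.sqrt(4.0 * e_electron * e_proton)

def x_min(q2, y_max, e_electron, e_proton):
    """Smallest Bjorken x at fixed Q^2 (GeV^2) and inelasticity: x = Q^2 / (y s)."""
    s = 4.0 * e_electron * e_proton
    return q2 / (y_max * s)

# Assumed beam settings (GeV); only the electron energies appear in the text.
for e_e, e_p in [(5.0, 100.0), (20.0, 250.0)]:
    print(e_e, e_p, round(sqrt_s(e_e, e_p), 1), x_min(1.0, 0.95, e_e, e_p))
```

Raising either beam energy increases $s$ and therefore lowers the minimum $x$ at fixed $Q^2$, which is the trend visible in Fig. 1.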
The observable provided in Ref. [19] for inclusive DIS pseudodata is the ratio $g_1(x, Q^2)/F_1(x, Q^2)$; we refer the reader to Ref. [5] for a discussion of its relation to experimentally measured asymmetries. The generation of pseudodata assumes a "true" underlying set of parton distributions. In Ref. [19] these are taken to be the DSSV+ [6] polarized and MRST [24] unpolarized PDFs. Uncertainties are then determined assuming an integrated luminosity of 10 fb$^{-1}$, which corresponds to a few months of operation at the anticipated luminosities for eRHIC [21], and a 70% beam polarization. Because the DSSV+ polarized gluon has rather more structure than that of NNPDFpol1.0, which is largely compatible with zero, assuming this input shape will allow us to test whether the EIC data are sufficiently accurate to determine the shape of the gluon distribution.
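The logic of pseudodata generation can be illustrated with a minimal sketch. It assumes that the statistical error on a double-spin asymmetry scales as $1/(P_e P_p \sqrt{N})$ and that pseudodata are obtained by Gaussian smearing around the "true" values. Both are common conventions and are not necessarily the procedure of Ref. [19]; the event counts and asymmetry values are hypothetical.

```python
import math
import random

def asymmetry_stat_error(n_events, pol_electron=0.7, pol_proton=0.7):
    """Statistical error on a double-spin asymmetry: 1 / (P_e * P_p * sqrt(N))."""
    return 1.0 / (pol_electron * pol_proton * math.sqrt(n_events))

def make_pseudodata(true_values, errors, seed=0):
    """Smear each 'true' value with Gaussian noise of the given width."""
    rng = random.Random(seed)
    return [rng.gauss(t, e) for t, e in zip(true_values, errors)]

# Hypothetical event counts per bin and 'true' asymmetries.
n_events = [1.0e6, 4.0e6]
truth = [0.010, 0.020]
errors = [asymmetry_stat_error(n) for n in n_events]
pseudo = make_pseudodata(truth, errors)
```

In this picture a larger integrated luminosity enters through the event count, and a higher polarization reduces the error linearly.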
We reconstruct the $g_1$ polarized structure function from the pseudodata following the same procedure used in Ref. [5] for the E155 experiment. We provide its average statistical uncertainty in the last column of Tab. 1. A comparison of these values with the analogous quantities for fixed-target experiments (see Tab. 2 in Ref. [5]) clearly shows that EIC data are expected to be far more precise, with uncertainties reduced by up to one order of magnitude. No information on the expected systematic uncertainties is available. We will perform two different fits, corresponding to the two stages envisaged for the eRHIC option of an EIC [21] discussed above, which will be referred to as NNPDFpolEIC-A and NNPDFpolEIC-B. The former includes the first two sets of pseudodata listed in Tab. 1, while the latter also includes the third set.

The methodology for the determination of PDFs follows the one adopted in Ref. [5], to which we refer for details. The only modifications are the following. First, we have re-tuned the genetic algorithm which is used for minimization, and the parameters which determine its stopping at the optimal fit. This is required to obtain a good fit quality with EIC pseudodata, which are very accurate in comparison to their fixed-target counterparts and cover a wider kinematic region (see Fig. 1). In particular, we have used a larger population of mutants, increased the number of weighted training generations and tuned the stopping parameters. Furthermore, we have redetermined the range in which preprocessing exponents are randomized, since the new information from EIC pseudodata may modify the large- and small-$x$ PDF behavior. In Tab. 2, we show the values we use for the present fit, compared to NNPDFpol1.0. We have checked that our choice of preprocessing exponents does not bias our fit, according to the procedure discussed in Sec. 4.1 of Ref. [5].

Table 2: Ranges for the small- and large-$x$ preprocessing exponents.
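The randomization of preprocessing exponents can be sketched as follows. The sketch assumes a preprocessing factor of the form $(1-x)^m x^{-n}$ multiplying the neural network output, with $m$ and $n$ drawn uniformly for each replica. The functional form and the ranges are assumptions made for illustration; the ranges are not those of Tab. 2.

```python
import random

# Hypothetical ranges: m is the large-x exponent, n the small-x exponent.
RANGES = {"m": (1.5, 3.5), "n": (0.5, 1.5)}

def draw_exponents(rng):
    """Draw one set of exponents uniformly within the allowed ranges."""
    return {name: rng.uniform(lo, hi) for name, (lo, hi) in RANGES.items()}

def preprocessing(x, m, n):
    """Assumed preprocessing factor (1 - x)^m * x^(-n)."""
    return (1.0 - x) ** m * x ** (-n)

rng = random.Random(1)
# One independent draw per replica, so no single exponent choice is imposed on the fit.
replica_exponents = [draw_exponents(rng) for _ in range(100)]
```

Drawing the exponents anew for every replica lets the spread in the assumed asymptotic behavior propagate into the final PDF uncertainty.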
Various general features of the NNPDFpolEIC-A and NNPDFpolEIC-B PDF determinations are summarized in Tab. 3, compared to NNPDFpol1.0. These include the $\chi^2$ per data point of the final best-fit PDF compared to data (denoted as $\chi^2_{\rm tot}$); the average and standard deviation over the replica sample of the same figure of merit for each PDF replica when compared to the corresponding data replica (denoted as $\langle E \rangle \pm \sigma_E$), computed for the total, training and validation sets; the average and standard deviation of the $\chi^2$ of each replica when compared to data (denoted as $\langle \chi^{2(k)} \rangle$); and the average number of iterations of the genetic algorithm at stopping, $\langle TL \rangle$, and its standard deviation over the replica sample. A more detailed discussion of these quantities can be found in previous NNPDF papers, in particular in Refs. [25,26], and in Ref. [5] for the polarized case.

Table 3: Statistical estimators and average training length for the NNPDFpolEIC-A and NNPDFpolEIC-B fits with $N_{\rm rep} = 100$ replicas, compared to the NNPDFpol1.0 reference fit [5].
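To make two of these estimators concrete, the following sketch computes the $\chi^2$ of each replica against the data, its average and standard deviation over replicas, and the $\chi^2$ of the central prediction. It assumes uncorrelated uncertainties and takes the central prediction to be the replica average; all numbers are toy values, not fit results.

```python
import statistics

def chi2_per_point(pred, data, sigma):
    """chi^2 per data point, assuming uncorrelated uncertainties."""
    return sum(((p - d) / s) ** 2 for p, d, s in zip(pred, data, sigma)) / len(data)

# Toy data and three toy replica predictions.
data = [0.10, 0.20, 0.30]
sigma = [0.05, 0.05, 0.05]
replicas = [[0.12, 0.18, 0.33], [0.08, 0.22, 0.27], [0.11, 0.21, 0.30]]

# chi^2 of each replica against the data, then mean and spread over replicas.
chi2_k = [chi2_per_point(r, data, sigma) for r in replicas]
mean_chi2_k = statistics.mean(chi2_k)
std_chi2_k = statistics.stdev(chi2_k)

# chi^2 of the central prediction, taken here as the replica average.
central = [statistics.mean(col) for col in zip(*replicas)]
chi2_tot = chi2_per_point(central, data, sigma)
```

Because replicas fluctuate around the central prediction, the $\chi^2$ of the average is smaller than the average of the replica $\chi^2$ values.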
The fit quality, as measured by $\chi^2_{\rm tot}$, is comparable to that of NNPDFpol1.0 ($\chi^2_{\rm tot} = 0.77$) for both the NNPDFpolEIC-A ($\chi^2_{\rm tot} = 0.79$) and the NNPDFpolEIC-B ($\chi^2_{\rm tot} = 0.86$) fits. This shows that our fitting procedure can easily accommodate EIC pseudodata. The histogram of $\chi^2$ values for each data set included in our fits is shown in Fig. 2, together with the NNPDFpol1.0 [5] result; the unweighted average $\langle \chi^2 \rangle_{\rm set} \equiv \frac{1}{N_{\rm set}} \sum_{j=1}^{N_{\rm set}} \chi^2_{{\rm set},j}$ and standard deviation over data sets are also shown. As already pointed out in Ref. [5], $\chi^2$ values significantly below one are found as a consequence of the fact that information on correlated systematics is not available for most experiments, and thus statistical and systematic errors are added in quadrature. Note that this is not the case for the EIC pseudodata, for which, as mentioned, no systematic uncertainty was included; this may explain the somewhat larger (closer to one) value of the $\chi^2$ per data point which is found when the pseudodata are included.
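The two operations mentioned above, adding errors in quadrature and taking an unweighted average over data sets, can be written in a few lines. In the sketch below the data-set names and $\chi^2$ values are hypothetical, and the use of the sample (rather than population) standard deviation is an assumption.

```python
import math
import statistics

def total_error(stat, syst):
    """Statistical and systematic errors added in quadrature."""
    return math.hypot(stat, syst)

# Hypothetical chi^2 per data point for each data set.
chi2_set = {"SET_A": 0.6, "SET_B": 0.8, "SET_C": 1.3}
values = list(chi2_set.values())

# Unweighted average and standard deviation over data sets.
mean_set = statistics.mean(values)
std_set = statistics.stdev(values)

# Data sets whose chi^2 exceeds the average by more than one standard deviation.
outliers = [name for name, v in chi2_set.items() if v > mean_set + std_set]
```

Treating a correlated systematic as uncorrelated in this way inflates the diagonal errors, which tends to push the $\chi^2$ per point below one.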
We notice that EIC pseudodata, which are expected to be rather more precise than fixed-target DIS experimental data, require more training to be properly learned by the neural network. This is apparent in the increase in $\langle TL \rangle$ in Tab. 3 when going from NNPDFpol1.0 to NNPDFpolEIC-A and then NNPDFpolEIC-B. We checked that the statistical features discussed above do not improve if we run very long fits, up to $N_{\rm gen}^{\rm max} = 50000$ generations, without dynamical stopping. In particular, we do not observe a decrease of the $\chi^2$ for those experiments whose value exceeds the average by more than one sigma. This ensures that these deviations are not due to underlearning, i.e. insufficiently long minimization.
Parton distributions from the NNPDFpolEIC-A and NNPDFpolEIC-B fits are compared to NNPDFpol1.0 [5] in Figs. 3 and 4, respectively. In these plots, PDFs are displayed at $Q_0^2 = 1$ GeV$^2$ as a function of $x$ on a logarithmic scale; all uncertainties shown here are one-$\sigma$ bands. The positivity bound, obtained from the NNPDF2.3 NLO unpolarized set [27] as discussed in Ref. [5], is also drawn.
The most visible impact of inclusive EIC pseudodata in both our fits is the reduction of PDF uncertainties in the low-$x$ region ($x \lesssim 10^{-3}$) for light flavors and the gluon. The size of the effect is different for different PDFs. As expected, the most dramatic improvement is seen for the gluon, while uncertainties on light quarks are only reduced by a significant factor in the small-$x$ region. The uncertainty on the strange distribution is essentially unaffected: unlike in Ref. [19], we find no improvement on strangeness, due to the fact that we do not include semi-inclusive kaon production data, contrary to what was done there. When moving from NNPDFpolEIC-A to NNPDFpolEIC-B the gluon uncertainty decreases further, while other PDF uncertainties are basically unchanged.
In Fig. 5 we compare the polarized gluon PDF in our EIC fits to the DSSV [1] and NNPDFpol1.0 [5] parton determinations, both at $Q_0^2 = 1$ GeV$^2$ and $Q^2 = 10$ GeV$^2$. The DSSV uncertainty is the Hessian uncertainty computed assuming $\Delta\chi^2 = 1$, which corresponds to the default uncertainty estimate in Ref. [1]. This choice may lead to somewhat underestimated uncertainties: indeed, a more conservative uncertainty estimate is also provided in Ref. [1]. Furthermore, it is known from unpolarized global PDF fits that a somewhat larger 'tolerance' value $\Delta\chi^2 = T$ [28] should be adopted in order for the distribution of $\chi^2$ values between different experiments in the global fit to be reasonable (indeed this choice was made in the polarized fit of Ref. [3], with $T = 12.65$).
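To get a feeling for the size of the tolerance effect, one can assume that the $\chi^2$ is quadratic around its minimum, in which case a Hessian uncertainty scales with the square root of the chosen $\Delta\chi^2$. The sketch below applies this scaling; it is an estimate under that assumption only, not a statement about how the uncertainties of Refs. [1] or [3] were actually computed.

```python
import math

def rescale_hessian_error(err_delta_chi2_one, tolerance):
    """Error for Delta chi^2 = T, given the Delta chi^2 = 1 error (quadratic chi^2 assumed)."""
    return err_delta_chi2_one * math.sqrt(tolerance)

# With T = 12.65 a unit error grows by a factor sqrt(12.65).
factor = rescale_hessian_error(1.0, 12.65)
print(round(factor, 2))  # 3.56
```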
It is clear that the gluon PDF from our fits including EIC pseudodata is approaching the DSSV PDF shape, especially at a lower scale where the DSSV gluon does have some structure, despite the fact that at higher scales, where much of the data is located, perturbative evolution tends to wash out this shape. Also, this is more pronounced as more EIC pseudodata are included in our fit, i.e. moving from NNPDFpolEIC-A to NNPDFpolEIC-B. This means that EIC data would be sufficiently accurate to reveal the polarized gluon structure, if any.

Figure 5: The polarized gluon PDF $\Delta g(x, Q^2)$, at $Q_0^2 = 1$ GeV$^2$ (upper panels) and at $Q^2 = 10$ GeV$^2$ (lower panels), in the NNPDFpolEIC PDF sets, compared to DSSV [1] and to NNPDFpol1.0 [5].
It is particularly interesting to examine how the EIC data affect the determination of the first moments of the polarized PDFs $\Delta f(x, Q^2)$, as they are directly related to the nucleon spin structure. We have computed the first moments, Eq. (1), of the singlet, lightest quark-antiquark combinations and gluon for the NNPDFpolEIC-A and NNPDFpolEIC-B PDF sets. The corresponding central values and one-$\sigma$ uncertainties at $Q_0^2 = 1$ GeV$^2$ are shown in Tab. 4, compared to NNPDFpol1.0 [5]. It is clear that EIC pseudodata reduce all uncertainties significantly. Note that moving from NNPDFpolEIC-A to NNPDFpolEIC-B does not significantly improve the uncertainty on quark-antiquark first moments, but it reduces the uncertainty on the gluon first moment by a factor of two. However, it is worth noticing that, despite a reduction of the uncertainty on the gluon first moment, even for the most accurate NNPDFpolEIC-B fit, the value remains compatible with zero even though the central value is sizable (and negative).

Table 4: First moments of the polarized quark distributions at $Q_0^2 = 1$ GeV$^2$ for the fits in the present analysis, compared to NNPDFpol1.0 [5].
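In order to assess the residual extrapolation uncertainty on the singlet and gluon first moments, we determine the contribution to them from the data range $x \in [10^{-3}, 1]$, i.e.

$$\langle \Delta f(Q^2) \rangle_{\rm TR} \equiv \int_{10^{-3}}^{1} dx\, \Delta f(x, Q^2). \qquad (2)$$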
The first moments Eq. (2) are given in Tab. 5 at $Q_0^2 = 1$ GeV$^2$ and $Q^2 = 10$ GeV$^2$, where results for central values, uncertainties, and correlation coefficients between the gluon and quark are collected.
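A truncated first moment and its one-$\sigma$ uncertainty can be estimated from a replica ensemble by integrating each replica over the data range and taking the mean and standard deviation of the results. The sketch below does this with a trapezoidal rule on a logarithmically spaced grid. The replica shapes are toy functions chosen for illustration, not NNPDF output, and the integration scheme is an assumption.

```python
import math
import statistics

def truncated_moment(pdf, x_low=1.0e-3, x_high=1.0, n=2000):
    """Trapezoidal estimate of the integral of pdf(x) over [x_low, x_high] on a log grid."""
    t_low, t_high = math.log(x_low), math.log(x_high)
    h = (t_high - t_low) / n
    total = 0.0
    for i in range(n + 1):
        x = math.exp(t_low + i * h)
        weight = 0.5 if i in (0, n) else 1.0
        total += weight * x * pdf(x)  # change of variable t = ln x, so dx = x dt
    return total * h

# Toy replicas: a fixed shape with a fluctuating normalization.
norms = [0.8, 1.0, 1.2, 0.9, 1.1]
replicas = [lambda x, a=a: a * (1.0 - x) ** 3 for a in norms]

moments = [truncated_moment(f) for f in replicas]
central, error = statistics.mean(moments), statistics.stdev(moments)
```

The same function with a smaller `x_low` approximates the full first moment, so comparing the two uncertainties isolates the contribution of the extrapolation region.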
Comparing the results at $Q^2 = 1$ GeV$^2$ of Tab. 4 and Tab. 5 we see that in the NNPDFpol1.0 PDF determination for the quark singlet combination the uncertainty on the full first moment is about twice as large as that from the measured region, and for the gluon it is about four times as large. The difference is due to the extra uncertainty coming from the extrapolation. In NNPDFpolEIC-B the corresponding increases are by 20% for the quark and 30% for the gluon, which shows that thanks to EIC data the extrapolation uncertainties would be largely under control. The correlation coefficient $\rho$ significantly decreases upon inclusion of the EIC data: this means that the extra information contained in these data allows for an independent determination of the quark and gluon first moments.

Table 5: The singlet and gluon truncated first moments and their one-$\sigma$ uncertainties at $Q^2 = 1$ GeV$^2$ and $Q^2 = 10$ GeV$^2$ for the NNPDFpolEIC PDF sets, compared to NNPDFpol1.0 [5]. The correlation coefficient $\rho$ at $Q^2 = 10$ GeV$^2$ is also provided.

In Fig. 6, we plot the one-$\sigma$ confidence region in the $(\langle \Delta\Sigma(Q^2) \rangle_{\rm TR}, \langle \Delta g(Q^2) \rangle_{\rm TR})$ plane at $Q^2 = 10$ GeV$^2$, for NNPDFpolEIC-A, NNPDFpolEIC-B and NNPDFpol1.0 [5]. The main result of our analysis, Fig. 6, can be directly compared to Fig. 8 of Ref. [19], which was based on the DSSV fit and is comparable to our NNPDFpolEIC-B results. In both analyses EIC pseudodata determine the singlet first moment in the measured region with an uncertainty of about $\pm 0.05$.
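The correlation coefficient between two first moments can be estimated from the replica ensemble as a standard Pearson coefficient. The sketch below assumes that one value of each moment is available per replica; the numbers are hypothetical and chosen only to produce an anticorrelated toy sample.

```python
import math
import statistics

def correlation(a, b):
    """Pearson correlation coefficient between two equally long samples."""
    mean_a, mean_b = statistics.mean(a), statistics.mean(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

# Hypothetical truncated moments, one entry per replica.
singlet = [0.20, 0.25, 0.30, 0.22, 0.28]
gluon = [0.50, 0.10, -0.40, 0.35, -0.20]
rho = correlation(singlet, gluon)
```

A value of $|\rho|$ close to one means that the data constrain mostly one combination of the two moments, while a small $|\rho|$ signals that they are determined independently.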
On the other hand, in Ref. [19] the uncertainty on the gluon was found to be about $\pm 0.02$, while we get a much larger result of $\pm 0.30$. One may wonder whether this difference may be due at least in part to the fact that the DSSV fit on which the result of Ref. [19] is based also includes jet production and pion production data from RHIC, which may reduce the gluon uncertainty. To answer this, we have computed the contribution to the gluon first moment (again at $Q^2 = 10$ GeV$^2$) from the reduced region $0.05 \le x \le 0.2$, where the RHIC data are located. We find that the uncertainty on the contribution to the gluon first moment in this restricted range is $\pm 0.083$ using NNPDFpolEIC-B, while it is $\pm 0.147$ with NNPDFpol1.0 and $^{+0.129}_{-0.164}$ with DSSV [29]. We conclude that before the EIC data are added, the uncertainties in NNPDFpol1.0 and DSSV are quite similar despite the fact that DSSV also includes RHIC data. Hence, the larger gluon uncertainty we find for the NNPDFpolEIC-B fit in comparison to Ref. [19] is likely to be due to our more flexible PDF parametrization, though some difference might also come from the fact that the SIDIS pseudodata included in Ref. [19] provide additional information on the gluon through scaling violations of the fragmentation structure function $g_1^h$ (of course this also introduces an uncertainty related to the fragmentation functions which is difficult to quantify).
In summary, the EIC data would entail a very considerable reduction in the uncertainty on the polarized gluon. They would provide first evidence for a possible nontrivial $x$ shape of the polarized gluon distribution. They would also provide evidence for or against a possible large gluon contribution to the nucleon spin, though the latter goal would still be reached with a sizable residual uncertainty. Additional measurements at an EIC, such as the charm polarized structure function, $g_1^c$, might provide more information on $\Delta g$ and its first moment.