Strange sea determination from collider data

We consider determinations of the strange sea in the nucleon based on QCD analyses of data collected at the LHC, with a focus on the recent high-statistics ATLAS measurement of $W^\pm$- and $Z$-boson production. We study the effect of different functional forms for the parameterization of the parton distribution functions and of the combination of various data sets in the analysis. We compare to earlier strange sea determinations and discuss ways to improve them in the future.


Introduction
Precise knowledge of the light-quark content of the proton is very important for phenomenological studies at the Large Hadron Collider (LHC). QCD analyses of data from colliders and fixed-target experiments make this information available through the parton distribution functions (PDFs), which nowadays are accurate to next-to-next-to-leading order (NNLO) in perturbation theory [1]. It has been shown [2] that the recent LHC data on $W^\pm$- and $Z$-boson production provide valuable constraints on the light-quark distributions for up, down and strange and help to improve the flavor separation. Currently, however, the extraction of the strange sea carries the largest uncertainty, which, for instance, plays a crucial role in the precision of the recent $W$-boson mass ($M_W$) measurement by the ATLAS experiment [3]. It is therefore of particular importance to pin down the strange sea to better accuracy.
Such an improvement can be achieved with the $W^\pm \to l^\pm \nu$ and $Z \to l^+ l^-$ cross section measurements of the ATLAS experiment [4]. However, the ATLAS analysis, which has been published as the so-called epWZ16 set of PDFs [4], has obtained a strange-quark sea of a size comparable to the non-strange-quark ones in the kinematic range of Bjorken $x \sim 0.01$. In this way, ATLAS has confirmed with better accuracy its earlier results [5] based on a smaller data sample [6]. An enhancement of the strange sea was also observed by ATLAS in an analysis of its data on the associated production of a $W^\pm$-boson and a charm quark [7], which were well described by its epWZ12 PDF set published in Ref. [5]. On the other hand, an analysis of the $W^\pm$+charm data collected by the CMS experiment [8] does not show any such strange sea enhancement.
In a wider context, this situation is problematic, because the ATLAS results also disagree with the strange sea PDFs extracted from other processes. First of all, there are data on charm-quark production in neutrino-induced deep-inelastic scattering (DIS) off nucleons. This process, initially measured with good accuracy by the CCFR and NuTeV experiments at the Tevatron [9], was later studied with even better accuracy by the NOMAD experiment at CERN's SPS [10]. All three experiments prefer a stronger suppression of the strange sea as compared to the ATLAS one [11]. Moreover, the ATLAS findings of an almost perfect flavor SU(3) symmetry among the three light sea quark distributions have not been confirmed in global fits of PDFs, as reviewed for instance in [1].
The present paper aims at clarifying these discrepancies and at consolidating the different strange sea determinations. For this purpose we use the global ABMP16 PDF fit [2] as a framework. We consider variants of the ABMP16 fit with different shapes for the functional form of the PDF parameterization at the initial scale of the fit as used by ATLAS and in the ABMP16 analyses. We also consider combinations of different sets of data from colliders and fixed-target experiments. In this way we can separate the impact of different effects on the strange sea determination and localize the origin of discrepancies.

Shape of PDF parameterizations
The ABMP16 analysis [2] is performed at NNLO accuracy in QCD and the PDF extraction is based on inclusive DIS and Drell-Yan (DY) data supplemented by data on the DIS and hadro-production of heavy quarks. In the ABMP16 fit, the PDFs are parameterized according to Eq. (1) at a starting scale $\mu_0^2 = 9$ GeV$^2$ for the QCD evolution in a scheme with $N_F = 3$ light flavors, in terms of the valence quark distributions ($q_v$ for $q = u, d$), the sea quark distributions ($q_s$ for $q = u, d, s$), assuming $q_s(x, \mu_0^2) = \bar{q}_s(x, \mu_0^2)$, and the gluon. The functional form of the PDFs is controlled by the exponents $a_p$ and $b_p$ and the functions $P_p(x)$ of the form given in Eq. (2), where $p = q_v, q_s, g$. The normalizations $N^v_q$ and $A_g$ in Eq. (1) have been determined from the fermion number and momentum conservation sum rules, and $\delta_{qq'}$ denotes the Kronecker symbol. All other 25 parameters $A_{qs}$, $a_p$, $b_p$ and $\gamma_p$ are fitted to the data. It has been checked in the ABMP16 fit that Eq. (2) allows for sufficient flexibility of the PDFs in the entire range of Bjorken-$x$ covered by the data included in the fit.
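How a sum rule fixes a normalization can be illustrated with a minimal sketch. It assumes a simplified valence shape $x q_v(x) = N x^{a}(1-x)^{b}$, without the additional factor controlled by $P_p(x)$ in the ABMP16 form, so that the number sum rule integrates to an Euler Beta function. The function names and the exponent values are hypothetical and are not taken from the fit.

```python
import math

def beta(p, q):
    """Euler Beta function B(p, q) expressed through Gamma functions."""
    return math.gamma(p) * math.gamma(q) / math.gamma(p + q)

def valence_normalization(n_quarks, a, b):
    """Normalization N of the toy shape x*q_v(x) = N * x**a * (1 - x)**b.

    The number sum rule int_0^1 q_v(x) dx = n_quarks gives
    N * B(a, b + 1) = n_quarks (simplified shape, assumed for illustration).
    """
    return n_quarks / beta(a, b + 1.0)

def number_integral(norm, a, b, steps=100_000):
    """Midpoint-rule cross-check of int_0^1 q_v(x) dx for the toy shape."""
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        total += norm * x ** (a - 1.0) * (1.0 - x) ** b
    return total * h

# Hypothetical exponents, chosen only to make the example concrete.
a_u, b_u = 0.7, 3.5
n_u = valence_normalization(2, a_u, b_u)  # two valence up quarks
n_d = valence_normalization(1, a_u, b_u)  # one valence down quark
print(n_u, number_integral(n_u, a_u, b_u))
```

With the normalization fixed in this way, only the exponents remain as free parameters of the toy valence shape; the numerical integral serves as an approximate cross-check of the closed form.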
In contrast, the analysis of the ATLAS $W^\pm \to l^\pm \nu$ and $Z \to l^+ l^-$ cross section measurements for the extraction of the epWZ16 PDFs [4] at NNLO in QCD has used a much more restricted set of data. ATLAS only includes its own data on $W^\pm$- and $Z$-production in combination with DIS data from the HERA collider. The epWZ16 PDFs are derived from the parameterizations of Eq. (3) at the starting scale $\mu_0^2 = 1.9$ GeV$^2$, for $N_F = 3$ light flavors, and assuming $s = \bar{s}$ for the strange sea. Again, the sum rules for fermion number and momentum conservation determine the normalizations $A_g$, $A_{u_v}$ and $A_{d_v}$. In addition, the parameter $C'_g = 25$ is fixed by hand to a large value, and the assumption of iso-spin symmetry at small $x$, i.e., $\bar{u} = \bar{d}$ as $x \to 0$, is invoked to set $A_{\bar{u}} = A_{\bar{d}}$ and $B_{\bar{u}} = B_{\bar{d}}$. Finally, the strange sea at small $x$ is assumed to be related to the light quark sea, $\bar{u}$ and $\bar{d}$, so that $B_{\bar{s}} = B_{\bar{d}} = B_{\bar{u}}$ is imposed by hand. For these assumptions there is no theoretical evidence, nor are they indicated by fits using more general parameterizations, as will be shown below. This leaves a total of 15 variables in Eq. (3) to be determined from data. The PDF shape of Eq. (3) is motivated by the predecessors of the epWZ16 analysis, the PDF fits of the HERAPDF family, which are based exclusively on HERA data and therefore have to impose several constraints on the PDF shapes in kinematic regions of Bjorken-$x$ that are not sufficiently covered by the HERA data.
In order to check the consistency of the assumptions underlying the epWZ16 PDFs with the data at large Bjorken-$x$, which commonly constrain the PDFs in global analyses, we perform a test variant of the ABMP16 fit using the PDF shapes of Eq. (3). In addition, the collider data from the LHC and the Tevatron on rapidity distributions for $W^\pm$- and $Z$-boson production in the electron- and muon-decay channels as well as for lepton-charge asymmetries (see Tab. 2 in [2]) are replaced by data for DIS off deuterons (see Tab. 3.2 in [12]). The latter had been omitted in the ABMP16 fit since they require taking into account corrections for the nuclear effects in the deuteron target, which bring in an additional source of uncertainty. In the meantime, though, the PDFs extracted with deuteron DIS data included and using the shape of deuteron corrections suggested by the off-shellness model of Kulagin-Petti [13] have been shown to be in agreement with the ones preferred by the $W^\pm$- and $Z$-boson collider data [14]. Therefore, the deuteron DIS data allow one to obtain a reliable constraint on the light-quark PDFs ($u$, $d$) in the range $x \gtrsim 0.01$. This approach avoids the tedious computation of predictions for the lepton rapidity distributions in $W^\pm$- and $Z$-boson production, including kinematic cuts, by means of fully differential codes like FEWZ (version 3.1) [15, 16]. Thus, the use of deuteron DIS data leads to a fast and efficient fit, since the relevant DIS cross sections are evaluated at NNLO in QCD with the code OPENQCDRAD (version 2.1) [17].
For the purpose of comparison we also consider a variant of the fit with the same data selection and the ABMP16 shape for the PDFs in Eq. (1). The notations used throughout the paper to present the results of these two variants are the following: "ABMP16 shape" denotes a fit with the $W^\pm$- and $Z$-boson collider data replaced by deuteron DIS data and the PDF shape of Eq. (1); "epWZ16 shape" denotes the same fit but with the PDF shape of Eq. (3).
In Fig. 1 we show the results of these test fits for the strangeness suppression factor $r_s(x)$ and the sea-quark iso-spin asymmetry $I(x)$.
The comparison of the nominal ABMP16 fit in Fig. 1 with the variant ABMP16 shape shows good compatibility of both quantities in the range $x \gtrsim 0.01$. This confirms the capability of deuteron data to replace the DY ones in the present study. However, at smaller values, $x \lesssim 0.01$, the uncertainty both in the iso-spin asymmetry $I(x)$ and in the strangeness suppression $r_s(x)$ increases significantly in the variant ABMP16 shape. This happens due to the flexibility of the PDF parameterization Eq. (1) at small $x$, which is constrained in the nominal ABMP16 fit by the collider DY data relevant for this kinematics and lacks such constraints in the ABMP16 shape fit. In contrast, for the epWZ16 shape variant of the fit with the PDF parameterization of Eq. (3) the strange sea is enhanced, rising to about $r_s \sim 0.8$ in the region x 0.1. It is worth stressing that no ATLAS data are used in this case. It is also interesting that the statistical quality of the description of the neutrino-induced DIS charm-production data does not deteriorate with the observed strange sea enhancement. The total values $\chi^2 = 167$ and 161 are obtained in the variants epWZ16 shape and ABMP16 shape, respectively, for the combination of CCFR/NuTeV, CHORUS, and NOMAD data sets used in the fit. This means that the strange sea enhancement is achieved at the expense of a suppressed $d$-quark sea, as was pointed out earlier in Ref. [11] and is also demonstrated in Fig. 1. In addition, the uncertainty band in the iso-spin asymmetry $I(x)$ of the epWZ16 shape variant of the fit is significantly smaller than that for the ABMP16 shape and for the nominal ABMP16 fit, despite the fact that the latter is based on a much wider set of data, even including DY collider data.
In summary, the test variants indicate that the parameterization of Eq. (3) underlying the epWZ16 PDFs is over-constrained. As a matter of fact, Eq. (3) leaves little flexibility for the iso-spin asymmetry $I(x, \mu^2)$, which can be written as in Eq. (6), where $\delta C = C_{\bar{d}} - C_{\bar{u}}$. Depending on the sign of $\delta C$, this function is either positive or negative definite in the range $0 < x < 1$. Thus, it does not allow a non-monotonic behaviour of $I(x)$ and, in particular, a delayed onset of the Regge asymptotics of a vanishing $I(x)$ at small $x$, as indicated by DY collider data, see also [18].
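The sign-definiteness argument can be checked numerically with a minimal sketch. It assumes sea-quark shapes of the form $x\bar{q}(x) = A\,x^{B}(1-x)^{C}$ with common $A$ and $B$ for $\bar{u}$ and $\bar{d}$, and takes $I(x) = x[\bar{d}(x) - \bar{u}(x)]$, so that $I(x) = A\,x^{B}(1-x)^{C_{\bar{u}}}[(1-x)^{\delta C} - 1]$. This form, the function name, and all parameter values are assumptions made for illustration, not the fitted epWZ16 values.

```python
def isospin_asymmetry(x, norm, b_exp, c_ubar, delta_c):
    """I(x) = x*(dbar - ubar) for the assumed shape A * x**B * (1 - x)**C.

    With equal A and B for ubar and dbar this reduces to
    A * x**B * (1 - x)**C_ubar * ((1 - x)**delta_c - 1).
    """
    common = norm * x ** b_exp * (1.0 - x) ** c_ubar
    return common * ((1.0 - x) ** delta_c - 1.0)

# Logarithmic grid strictly inside 0 < x < 1; parameter values are hypothetical.
grid = [10.0 ** (-5.0 + 5.0 * i / 200.0) * 0.999 for i in range(201)]

for delta_c in (-2.0, 2.0):
    values = [isospin_asymmetry(x, 0.1, -0.1, 8.0, delta_c) for x in grid]
    all_positive = all(v > 0.0 for v in values)
    all_negative = all(v < 0.0 for v in values)
    label = "positive" if all_positive else "negative" if all_negative else "mixed"
    print(delta_c, label)
```

Under these assumptions the bracket $(1-x)^{\delta C} - 1$ never changes sign on $0 < x < 1$, so the asymmetry is negative throughout for $\delta C > 0$ and positive throughout for $\delta C < 0$, and for $B > -1$ it vanishes as $x \to 0$.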
As a direct consequence of the epWZ16 shape's limited flexibility with respect to the parameterization of the iso-spin asymmetry, one observes a poor description of the fixed-target DY data collected by the E866 experiment at the beam energy $E_b = 800$ GeV and with di-muon invariant masses in the range $4.6 \le M_{\mu\mu} \le 12.9$ GeV in proton-proton and proton-deuteron collisions [19]. The value of $\chi^2/NDP = 96/39$ obtained for this sample in the epWZ16 shape fit is about twice as large as $\chi^2/NDP = 49/39$ for the ABMP16 shape fit and $\chi^2/NDP = 53/39$ for the nominal ABMP16 fit, where $NDP$ denotes the number of data points. In Fig. 2 the epWZ16 shape pulls demonstrate a clear offset at small $x$. In addition, the uncertainties in the predictions of this fit are greatly suppressed, evidently due to the constraint $I(x) \to 0$ in the limit $x \to 0$ implied by Eq. (6).
Figure 2: The pulls of the E866 data on inclusive di-muon production in proton-proton collisions [19] for the test fit ABMP16 shape versus the partonic momentum, where $M_{\mu\mu}$, $s$ and $x_L$ are the invariant mass of the di-muon system, the center-of-mass collision energy and the Bjorken longitudinal momentum of the di-muon system, respectively. The uncertainties in the predictions of the ABMP16 shape fit (right-tilted hatching) and the difference between its central value and the epWZ16 shape fit including uncertainties of the latter (left-tilted hatching) are displayed for comparison.
Figure 3: Same as Fig. 1 for the ABMP16 PDFs (gray, shaded) and two variants of the ABMP16 fit with the ATLAS data sets for the $W^\pm \to l^\pm \nu$ and $Z \to l^+ l^-$ cross sections from 2016 [4]: ATLAS(2016) (right-tilted hatching) and the same in combination with the fixed-target DY data of the E866 experiment [19], ATLAS(2016)+E866 (left-tilted hatching).
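The $\chi^2$ values quoted for the E866 sample are easier to compare when put on a per-point footing. The short sketch below only repeats the arithmetic with the numbers given in the text; the dictionary layout and the function name are illustrative choices.

```python
# chi^2 and number of data points (NDP) for the E866 sample, as quoted in the text.
e866_fits = {
    "epWZ16 shape": (96, 39),
    "ABMP16 shape": (49, 39),
    "ABMP16 nominal": (53, 39),
}

def chi2_per_point(chi2, ndp):
    """Return chi^2 divided by the number of data points."""
    return chi2 / ndp

for name, (chi2, ndp) in e866_fits.items():
    print(f"{name}: chi2/NDP = {chi2_per_point(chi2, ndp):.2f}")

# Ratio of the two test variants, which share the same data selection.
ratio = e866_fits["epWZ16 shape"][0] / e866_fits["ABMP16 shape"][0]
print(f"epWZ16 shape / ABMP16 shape = {ratio:.2f}")
```

The epWZ16 shape variant comes out at roughly 2.5 per data point, against about 1.3 for the two fits with the ABMP16 parameterization, which is the factor of about two mentioned in the text.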

Combinations of data sets in PDF fits
In order to check constraints on PDFs due to the combination of different data sets we consider two more variants of the ABMP16 fit. Specifically, we are interested in the final inclusive $e^\pm p$ scattering cross-section data from H1 and ZEUS [20] in combination with the ATLAS data [4] on $W^\pm \to l^\pm \nu$ and $Z \to l^+ l^-$ production, as these data have been employed in the fit of the epWZ16 PDFs.
To this end, we start from the ABMP16 fit and keep only the proton DIS data (see Tab. 2 in Ref. [2]), while we drop the ones on neutrino-induced charged-current charm-quark production (see Tab. 2 in Ref. [2]), which have the most essential impact on the strange sea determination in the global PDF fits, cf. e.g. Ref. [11]. Also, all other DY collider data are omitted in order to shed light on the potential of the ATLAS sample, and no deuteron DIS data are used either. Such a data selection more or less reproduces the framework of the ATLAS analysis [4] with one exception concerning the fixed-target proton data (see Tab. 2 in Ref. [2]). In our case these data are used in order to constrain the large-$x$ PDF behavior. Furthermore, we either include or omit the fixed-target DY data of the E866 experiment [19], which play an essential role in the interpretation of the ATLAS data, as we have already discussed earlier [11]. Thus, the following two variants are considered: ATLAS(2016), a fit based on the proton DIS data used in the ABMP16 analysis and the 2016 ATLAS data set [4] for $W^\pm \to l^\pm \nu$ and $Z \to l^+ l^-$ production cross sections collected at the collision energy $\sqrt{s} = 7$ TeV with a luminosity of 4.6 fb$^{-1}$ and a cut on the lepton's transverse momentum of $P_T^l > 20$ GeV, using the PDF shape of Eq. (1); and ATLAS(2016)+E866, the same in combination with the fixed-target DY data of the E866 experiment [19]. We employ for this study $Z$-boson production data in the central region, i.e. with the lepton-pair rapidity $\eta_{ll} \lesssim 2.4$, which provide the most accurate data sample of Ref. [4].
In Fig. 3 (left) we observe a somewhat enhanced strange sea from the variant ATLAS(2016) as compared to the nominal ABMP16 fit, but within uncertainties the results are well in agreement. Correspondingly, the ATLAS(2016) iso-spin asymmetry at x 0.01 is somewhat smaller than the nominal ABMP16 one, as shown in Fig. 3 (right). However, again, the discrepancy is statistically not very significant due to the quite big uncertainties in the ATLAS(2016) variant.
This discrepancy manifests itself also in the comparison with the E866 data, which overshoot the ATLAS(2016) predictions as shown in Fig. 4. Since the uncertainties in these predictions are also sizable, the E866 data can be well accommodated in the ATLAS(2016)+E866 variant of the fit, with the values of $\chi^2/NDP = 48/39$ and 40/34 obtained for the E866 and ATLAS sets, respectively. When the E866 data are included, the uncertainties both in $I(x)$ and $r_s$ are greatly reduced, as displayed already in Fig. 3. This illustrates in particular the poor potential of the combined HERA and ATLAS data in disentangling the light-flavor quark content of the proton. On the other hand, the value of the iso-spin asymmetry $I(x)$ obtained in the ATLAS(2016)+E866 variant is in good agreement with the ABMP16 fit, which also includes the E866 data. In view of the clear correlation between the strange- and down-quark distributions, the value of the strangeness suppression $r_s$ at x 0.05 in the ATLAS(2016)+E866 variant is smaller than the ATLAS(2016) one and also in agreement with the ABMP16 result, which is driven to a great extent by the data on neutrino-induced DIS charm-quark production.
In summary, these comparisons explain the seemingly high precision of the strange sea extraction reported in [4] as a direct consequence of using the over-constrained PDF shape of Eq. (3). Indeed, as we have shown above, cf. Fig. 1, this shape results in a value of $I(x)$ with underestimated uncertainties. In turn, this leads to an enhanced $r_s$ at x 0.01, which formally appears as being determined with a high accuracy.$^1$
$^1$ In the epWZ16 fit no changes are observed even if the E866 data are included in the fit [21], which can easily be understood since the uncertainties in the epWZ16 predictions for E866 data are quite small due to the over-constrained PDF shape. Therefore there is only little room for a variation of the PDFs in the direction of the E866 data, in contrast to the case of the ATLAS(2016) fit, cf. Fig. 4.
Figure 5: The pulls for the ATLAS data on $pp \to W^\pm + X \to l^\pm \nu + X$ production (left and center panels) and $pp \to Z + X \to l^+ l^- + X$ production (right panel) at $\sqrt{s} = 7$ TeV collected at luminosities of 35 pb$^{-1}$ (2011) [6] (blue squares) and 4.6 fb$^{-1}$ (2016) [4] (red circles) with cuts on the lepton's transverse momentum $P_T^l > 20$ GeV as a function of the lepton pseudo-rapidity $\eta_l$ versus NNLO predictions obtained using FEWZ (version 3.1) [15, 16] and the ABMP16 PDFs. The uncertainties in the predictions (shaded area) are superimposed for comparison.

ATLAS data vs. ABMP16 PDFs
The variant ATLAS(2016)+E866 discussed previously has revealed a strange sea suppression factor $r_s$, shown in Fig. 3, which is somewhat enhanced at $x \sim 0.03$ as compared to the ABMP16 determination. It is interesting to understand which of the ATLAS cross section measurements, i.e., the one for $W^\pm \to l^\pm \nu$ or the one for $Z \to l^+ l^-$, is driving this enhancement. From Fig. 5 it is obvious that this happens due to the $Z$-boson production sample, which overshoots the ABMP16 prediction. The ATLAS $Z$-boson data of the 2016 sample are also higher than the earlier data from 2011 [6] used in the ABMP16 fit, especially in the region of large lepton-pair rapidity $\eta_{ll} \gtrsim 1$.
Note that the discrepancy between the prediction of the ABMP16 PDFs and the $Z$-boson data in Fig. 5 is at the level of $\sim 1\sigma$. Moreover, the 2011 ATLAS data sample [6] is well described in the ABMP16 fit with $\chi^2/NDP = 31/30$ (see Tab. 6 in [2]). Therefore, the new high-statistics ATLAS data can evidently be accommodated in the ABMP16 fit with a reasonable value of $\chi^2$ and at the price of a moderate variation of the PDFs.
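The notions of a pull and of $\chi^2$ used here can be illustrated with a minimal sketch. It assumes the common definition of a pull as (data minus prediction) divided by the uncertainty and treats the points as uncorrelated; correlated systematic uncertainties, which matter in an actual PDF fit, are deliberately left out. All numbers are hypothetical toy values, not ATLAS data.

```python
def pulls(data, prediction, sigma):
    """Pull per point: (data - prediction) / uncertainty (assumed definition)."""
    return [(d - p) / s for d, p, s in zip(data, prediction, sigma)]

def chi2_uncorrelated(data, prediction, sigma):
    """chi^2 as the sum of squared pulls, ignoring correlations between points."""
    return sum(p * p for p in pulls(data, prediction, sigma))

# Hypothetical toy values: normalized cross sections with 2% uncertainties.
toy_data = [1.02, 0.99, 1.03]
toy_prediction = [1.00, 1.00, 1.00]
toy_sigma = [0.02, 0.02, 0.02]

toy_pulls = pulls(toy_data, toy_prediction, toy_sigma)
toy_chi2 = chi2_uncorrelated(toy_data, toy_prediction, toy_sigma)
print(toy_pulls, toy_chi2, toy_chi2 / len(toy_data))
```

In this toy setting, pulls of order one in magnitude give a $\chi^2$ per data point of order one, which is the sense in which a discrepancy at the $\sim 1\sigma$ level is compatible with a reasonable fit quality.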
In order to clarify the robustness of such a potential variation of the PDFs, we compare the new ATLAS results [4] with other ATLAS and CMS data on $Z$-boson production. For this comparison we select the integrated cross section for central production with rapidities $\eta_{ll} \lesssim 2.4$ measured at the center-of-mass energies $\sqrt{s} = 7$, 8 and 13 TeV [6, 22-24]. The kinematics of these measurements are similar to those of ATLAS [4], so that they allow one to understand how the latter data compare to the general trend. The ABMP16 predictions at NNLO in QCD are in agreement with the earlier ATLAS data at $\sqrt{s} = 7$ TeV [6], which were used in the fit, while the recent ones [4] somewhat overshoot the predictions and the $\sqrt{s} = 13$ TeV sample demonstrates a similar trend, cf. Fig. 6. At the same time the latest ATLAS data are on average about 2% higher than the CMS ones at $\sqrt{s} = 7$ and 8 TeV. This difference, although within the experimental uncertainties, indicates that the CMS data prefer a strange sea distribution comparable with the one in the ABMP16 PDFs.

Outlook
Having clarified the constraints on the strange sea PDFs from the recent ATLAS data on $W^\pm$- and $Z$-boson production and having scrutinized analysis details underlying the determination of the epWZ16 PDFs, we would like to close by reviewing prospects for further improvements in the precision of the strangeness distribution in the proton.
Figure 6: ATLAS and CMS data on central $Z$-boson production compared with the ABMP16 predictions [4, 6, 22-24]. The predictions for those data sets which are not included in the ABMP16 analysis are displayed with uncertainties (shaded area).
First of all, there is the associated production of a $W^\pm$-boson and a charm quark, which has been measured by both ATLAS and CMS [7, 8]. These data had already been analyzed regarding their impact on the strange PDFs in [11]. In order to improve upon the current status, measurements of the lepton rapidity distributions $d\sigma(W^\pm + c)/d\eta_l$ for the individual $W^+$+charm and $W^-$+charm channels with an $O(3\%)$ accuracy are needed. The other limiting factor is the lack of the complete NNLO QCD corrections for those differential $d\sigma(W^\pm + c)/d\eta_l$ cross sections.
In the meantime, other global fits can be improved with the help of the NOMAD data on charged-current neutrino-induced DIS charm-quark production [10]. The impact of the NOMAD data on the determination of the strange sea at medium $x$ is illustrated in Fig. 7 by comparing the experimental uncertainties in the data with the ones of the predictions based on various PDF fits. In Fig. 7 we select the sets of PDFs by CT14 [25], MMHT14 [26] and NNPDF3.1 [27], all of which have been fitted to a bulk of global data, however, not including those by the NOMAD experiment.
Figure 7: The uncertainties in the NOMAD data on the cross sections of charged-current neutrino-induced DIS charm-quark production [10] versus Bjorken $x$ in comparison to predictions of various PDFs: CT14 [25] (left-tilted hatch), MMHT14 [26] (right-tilted hatch), NNPDF3.1 [27] (vertical hatch).
As a result of this lack of constraints from data, the uncertainties in the predictions by CT14, MMHT14 or NNPDF3.1 reach $O(20\%)$, cf. Fig. 7. For the case of NNPDF3.1 at large $x$, $x \gtrsim 0.5$, the uncertainties are even larger than $O(100\%)$, which implies that some of the PDF replicas predict negative cross sections.$^2$ Such unphysical features also cast severe doubt on the averaging procedure performed on the basis of these three PDF sets as advocated by the PDF4LHC recommendation [31]. Note that the NOMAD data have much smaller uncertainties, $\sim 5\%$, and therefore can evidently help to consolidate the spread in predictions, providing a more accurate framework for precision studies like the $W$-boson mass measurement [3]. Moreover, they are presented in the form of the ratio of the di-muon to the total charged-current cross section, which is much less sensitive to the impact of nuclear corrections than the CCFR and NuTeV data.
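Why a relative uncertainty above 100% points to negative predictions can be made concrete with a minimal sketch. It assumes, purely for illustration, that the predictions of the replica ensemble are Gaussian-distributed around a positive central value normalized to one; real replica ensembles need not be Gaussian, and the function name is hypothetical.

```python
import math

def negative_fraction(relative_uncertainty):
    """Fraction of a Gaussian ensemble with mean 1 and the given standard
    deviation that falls below zero (Gaussian shape is an assumption)."""
    z = -1.0 / relative_uncertainty
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Relative uncertainties of the size discussed in the text: 5%, 20% and 100%.
for rel in (0.05, 0.20, 1.00):
    print(f"relative uncertainty {rel:.2f}: negative fraction {negative_fraction(rel):.3g}")
```

Under this assumption a 100% relative uncertainty puts roughly one sixth of the ensemble below zero, whereas at 20% or 5% the negative fraction is negligible.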
Lattice QCD simulations may, in the future, also help to determine strangeness in the proton. The fraction of the nucleon's momentum carried by the strange quark has recently been extracted from nucleon matrix elements of suitable operators at the physical pion mass [32], but the current value still carries a rather large uncertainty of O(50%).
Finally, due to the documented correlation between the strange sea quarks and the non-strange ones, it will be important to measure the $\bar{u}$- and $\bar{d}$-PDFs in the proton, specifically the ratio $\bar{d}/\bar{u}$ at large $x \gtrsim 0.1$. Here, Fermilab's E906 SeaQuest experiment can help to improve the situation by measuring muons in the DY process from proton-proton and proton-deuterium scattering. These data can be used to extract the ratio $\bar{d}/\bar{u}$ for the light-quark sea at Bjorken-$x$ values up to $x \approx 0.45$, i.e. to higher $x$ than in the previous E866 experiment [33].
$^2$ NNPDF3.1 [27] has also performed a fit variant based on the same data as the epWZ16 fit of ATLAS [4] and finds that strangeness is significantly enhanced. This is possible due to the phenomenological model for the charm-quark PDF applied in [27], which considers the entire charm PDF as a function to be fitted, rather than considering charm to be produced dynamically as predicted by QCD [28-30].