A first unbiased global determination of polarized PDFs and their uncertainties

We present a first global determination of spin-dependent parton distribution functions (PDFs) and their uncertainties using the NNPDF methodology: NNPDFpol1.1. Longitudinally polarized deep-inelastic scattering data, already used for the previous NNPDFpol1.0 PDF set, are supplemented with the most recent polarized hadron collider data for inclusive jet and $W$ boson production from the STAR and PHENIX experiments at RHIC, and with open-charm production data from the COMPASS experiment, thereby allowing for a separate determination of the polarized quark and anti-quark PDFs, and an improved determination of the medium- and large-$x$ polarized gluon PDF. We study the phenomenological implications of the NNPDFpol1.1 set, and we provide predictions for the longitudinal double-spin asymmetry for semi-inclusive pion production at RHIC.


A global polarized PDF determination
In a recent paper, we presented the NNPDFpol1.0 parton set [1], a first unbiased determination of polarized parton distribution functions (PDFs) of the proton and their associated uncertainties based on the NNPDF methodology [2][3][4][5]. This methodology differs from that used in other recent next-to-leading order (NLO) analyses [6][7][8][9][10] in that it relies on a Monte Carlo sampling and representation of PDFs, and it uses a parametrization of PDFs based on neural networks with a very large number of free parameters.
The NNPDFpol1.0 parton set was determined from all available inclusive deep-inelastic scattering (DIS) data with longitudinally polarized beams. One important limitation of using only (neutral-current) inclusive DIS is that only the quark PDF combinations ∆u⁺ = ∆u + ∆ū, ∆d⁺ = ∆d + ∆d̄, ∆s⁺ = ∆s + ∆s̄, and the gluon ∆g are accessible. Furthermore, in DIS the gluon is mostly determined by scaling violations, and is thus subject to sizable uncertainties due to the restricted lever arm in Q² of polarized DIS measurements.
In recent years, the set of experimental data which may be used for the determination of longitudinally polarized PDFs has been extended impressively. These data now include semi-inclusive DIS (SIDIS) in fixed-target experiments [11][12][13][14][15], one- or two-hadron and open-charm production in lepton-nucleon scattering [16][17][18][19][20], and semi-inclusive particle production [21][22][23][24][25][26][27], high-p_T jet production [28][29][30] and parity-violating W± boson production [31][32][33] in polarized proton-proton collisions. Many of these data probe individual quark flavors separately, or combinations of them. For instance, semi-inclusive DIS and W± production data allow one to determine the light quark-antiquark separation of polarized PDFs. In addition, inclusive jet and pion production in polarized proton-proton collisions, as well as hadron or open-charm electroproduction, provide a handle on the polarized gluon. Available measurements that provide information on polarized PDFs, the corresponding leading partonic subprocesses, the PDFs that are being probed, and the approximate range of x and Q² covered by available data are summarized in Tab. 1.
It is clear from Tab. 1 that currently available processes do not extend significantly the kinematic coverage of polarized DIS data, even though they do provide independent information, and, in some cases, first information on some PDF combinations. It follows that only moderate improvements are expected from these data on the first moments of polarized PDFs, which are limited by the extrapolation to the unconstrained small-x region. Only future accelerators, such as a high-energy polarized Electron-Ion Collider (EIC) [34][35][36] or a neutrino factory [37], could extend significantly the coverage of the small-x regime and improve knowledge of the first moments of polarized PDFs [38][39][40].
The goal of this paper is to include in the NNPDF determination of polarized PDFs the new experimental information provided by polarized hadron collider data. This will lead to our first global polarized PDF set: NNPDFpol1.1. We will not include in our global fit processes which require knowledge of fragmentation functions for light quarks, such as for instance SIDIS or pion production. Fragmentation functions are on the same footing as PDFs: they can only be determined from a fit to experimental data [41][42][43][44][45][46][47], and as such they are subject to the same potential sources of bias. Because our methodology aims at reducing bias, and a determination of fragmentation functions based on our methodology is not yet available, we prefer not to use data which require their use for the time being. We do, however, include the open-charm leptoproduction data, because the fragmentation function for heavy quarks is almost computable in perturbation theory and only introduces a very moderate uncertainty.
Our PDF determination thus includes, in addition to DIS data, open-charm production data from the COMPASS experiment at CERN and the most recent high-p_T inclusive jet and W± production data from the STAR and PHENIX experiments at RHIC. The kinematic coverage of these data is shown in Fig. 1, together with that of the fixed-target inclusive DIS data already included in [1]. Other available polarized PDF sets include some of the non-DIS data of Tab. 1: in particular, the fits from the DSSV family (DSSV08 [6] and DSSV+/DSSV++ [48]) include SIDIS data, inclusive jet and identified hadron production measurements from polarized proton-proton collisions at RHIC, while the LSS10 fit [7] includes SIDIS data.
The new data sets will be added to those already included in the NNPDFpol1.0 polarized PDF determination [1] using the Bayesian reweighting method described in Refs. [49,50]. This methodology consists of updating the representation of the probability distribution in the space of PDFs provided by an available PDF set by means of Bayes' theorem, in such a way that the information contained in the new data sets is included. The method has the dual advantage of not involving any further approximation once the starting Monte Carlo set is given (no parametrization or minimization is necessary) and of being computationally rather light, in that the predictions to be compared to the new data need to be computed only once, rather than at each iteration of a minimization algorithm.
On the other hand, the method becomes impractical if the new data bring in a large amount of new information. Indeed, reweighting has the effect of zooming in on the part of the space of PDFs which is compatible with the new data, by giving small weights to replicas which have little compatibility with them. As a consequence, after reweighting the number of replicas in the Monte Carlo set is effectively smaller than the starting one, and thus only if the starting number of replicas was sufficiently large will the final representation of the probability density remain accurate. If too much new information is brought in by the new data, a very large number of starting replicas would be required in order to obtain accurate results after reweighting.
The reweighting method is especially useful in our case for two reasons. On the one hand, fast interfaces for the computation of hadronic observables, such as the FastNLO framework [61], the general-purpose interface APPLgrid [62], and the FastKernel method [5], used for the unpolarized PDF determinations in Ref. [1], are not yet available in the polarized case. On the other hand, we wish to add to our data set only a few dozen hadronic data points, out of a total of approximately 300 data points. Nevertheless, as we shall see, the construction of the prior probability distribution will require some care, because the new data affect some combinations of PDFs which were completely undetermined before reweighting.
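As an illustration of the reweighting step, the following minimal sketch computes replica weights and the effective number of replicas from a list of per-replica χ² values. It assumes the weight formula and the Shannon-entropy definition of N_eff as given in the reweighting literature of Refs. [49,50]; the function name `reweight` and its arguments are hypothetical, and the χ² values would in practice come from comparing each replica's prediction to the new data.

```python
import math

def reweight(chi2, n_dat):
    """Return (weights, n_eff) for per-replica chi2 values of n_dat new data points.

    Assumed formulas (to be checked against Refs. [49,50]):
      w_k  proportional to (chi2_k)^((n_dat-1)/2) * exp(-chi2_k/2), normalized to sum to N
      N_eff = exp( (1/N) * sum_k w_k * ln(N / w_k) )
    """
    n_rep = len(chi2)
    # Work with logarithms to avoid overflow/underflow for large chi2.
    logw = [0.5 * (n_dat - 1) * math.log(c) - 0.5 * c for c in chi2]
    shift = max(logw)
    w = [math.exp(l - shift) for l in logw]
    norm = sum(w) / n_rep
    w = [x / norm for x in w]
    # Shannon entropy of the weights gives the effective number of replicas.
    entropy = sum(x * math.log(n_rep / x) for x in w if x > 0.0) / n_rep
    return w, math.exp(entropy)
```

If all replicas describe the new data equally well, every weight equals one and N_eff equals the number of replicas; if a single replica dominates, N_eff approaches one, signalling that a larger prior sample is needed.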
As a consequence, we will start in Sect. 2 with a discussion on how to set up the reweighting method, and in particular how to construct a suitable prior. We will then analyze separately the impact on polarized PDFs of each of the new processes which are included in our determination: first, in Sect. 3, open-charm and jet production data, which affect the gluon PDF ∆g, and then, in Sect. 4, W± production, which allows for a determination of the light antiquark PDFs ∆ū and ∆d̄. Finally, in Sect. 5, we will discuss a simultaneous reweighting with all new data sets and present our final PDF set, NNPDFpol1.1. Specifically, we will discuss the phenomenological implications of NNPDFpol1.1 compared to NNPDFpol1.0: we will reassess the spin content of the proton by computing PDF first moments, and give an illustrative example of the predictive power of our new PDF set for the case of the longitudinal double-spin asymmetry for single-inclusive particle production in proton-proton collisions recently measured at RHIC.

Reweighting: construction of the prior
We wish to include the polarized hadronic data into our polarized PDF determination by means of Bayesian reweighting. As a starting point, we construct a set of N_rep = 1000 NNPDFpol1.0 PDF replicas according to the procedure of Ref. [1]: a large prior set is needed because reweighting always entails some loss of efficiency, so the final reweighted replica set corresponds to a smaller effective number of unweighted replicas [49,50].
However, this is not sufficient, because the NNPDF polarized parton set NNPDFpol1.0 [1] does not allow for a separation between quark and antiquark parton distributions, since it was determined from a fit to inclusive DIS data only. Because the new data that we wish to include are sensitive to such a separation, we need to supplement the prior based on NNPDFpol1.0 with some assumption on the ∆q − ∆q̄ PDF combinations which are left unspecified in it.
Clearly, choosing a completely unbiased flat prior for the PDF combinations which are undetermined in NNPDFpol1.0 would be extremely inefficient, given that PDFs span a space of functions. We will thus choose a prior based on an existing PDF set in which these PDF combinations are determined, and then check that our reweighted results are independent of this choice by varying the prior. In practice, we construct the prior by supplementing the PDFs which are determined in the NNPDFpol1.0 set, namely ∆u⁺, ∆d⁺, ∆s⁺ and ∆g, with ∆ū and ∆d̄ from the DSSV08 [6] set, but with the uncertainties inflated by a given factor. Of course, if the uncertainty were infinite this would be equivalent to an (unbiased) flat prior. Hence, we will verify that our results are independent of the choice of prior by inflating the uncertainty by an increasingly large factor, until the results stabilise.
In order to do this, we sample the DSSV08 ∆ū and ∆d̄ distributions at a fixed reference scale Q₀² = 1 GeV². We then select ten points, half logarithmically and half linearly spaced in the interval of momentum fraction 10⁻³ ≲ x ≲ 0.4, which roughly corresponds to the range covered by SIDIS experimental data relevant for separating quark-antiquark contributions. These points with the corresponding PDF uncertainties are treated as sets of experimental pseudo-observables. Henceforth, they will be labeled as DSSV U and DSSV D respectively. In these pseudo-data prior fits, the experimental data are always taken as the central value from the DSSV08 best fit, while the experimental uncertainties are the corresponding nominal ∆χ² = 1 Hessian uncertainties, multiplied by a factor of one, two, three or four respectively in order to obtain the sets with inflated uncertainty as explained above.
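The construction of the pseudo-data can be sketched as follows. This is only an illustration under stated assumptions: the text specifies ten points, half logarithmically and half linearly spaced in 10⁻³ ≲ x ≲ 0.4, but not where the two spacings meet, so the transition point `x_mid = 0.1` is an assumption, as are all function names. The Gaussian fluctuation in `make_replica` stands in for the replica-generation procedure of Ref. [1] in the simplest case of uncorrelated uncertainties.

```python
import random

def pseudo_data_grid(x_min=1e-3, x_mid=0.1, x_max=0.4, n_log=5, n_lin=5):
    """Ten x points: n_log log-spaced in [x_min, x_mid), n_lin linear in [x_mid, x_max].

    x_mid is an assumed transition point, not taken from the text.
    """
    log_pts = [x_min * (x_mid / x_min) ** (i / n_log) for i in range(n_log)]
    lin_pts = [x_mid + (x_max - x_mid) * i / (n_lin - 1) for i in range(n_lin)]
    return log_pts + lin_pts

def inflate(sigma, factor):
    """Multiply the nominal uncertainties by a factor of 1, 2, 3 or 4."""
    return [factor * s for s in sigma]

def make_replica(central, sigma, rng):
    """One pseudo-data replica: Gaussian fluctuation around the central values."""
    return [rng.gauss(c, s) for c, s in zip(central, sigma)]

# Usage with placeholder numbers (not DSSV08 values):
grid = pseudo_data_grid()
central = [0.0] * len(grid)
sigma_4 = inflate([0.01] * len(grid), 4)
replica = make_replica(central, sigma_4, random.Random(0))
```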
We then generate N_rep = 1000 replicas of the original pseudo-data, following the procedure described in Sect. 2 of Ref. [1], and we fit each replica with a set of neural networks. To this purpose, we supplement the input PDF basis given in Sect. 3 of Ref. [1], namely ∆Σ, ∆T_3, ∆T_8 and ∆g, with two new linearly independent light quark combinations: the total valence, ∆V, and the valence isotriplet, ∆V_3,
$$\Delta V(x,Q_0^2) = \Delta u^-(x,Q_0^2) + \Delta d^-(x,Q_0^2), \tag{1}$$
$$\Delta V_3(x,Q_0^2) = \Delta u^-(x,Q_0^2) - \Delta d^-(x,Q_0^2), \tag{2}$$
where ∆q⁻ = ∆q − ∆q̄, q = u, d. Equation (1) holds under the assumption that ∆s = ∆s̄, i.e. ∆V_8 = ∆V. This assumption is not based on a theoretical motivation, but simply on the observation that present data are insufficient to determine ∆s⁻: hence this PDF combination should be simply viewed as undetermined in our fit.
Figure 2: The polarized sea quark densities x∆ū(x, Q₀²) (upper plots) and x∆d̄(x, Q₀²) (lower plots) at the initial energy scale Q₀² = 1 GeV² from the neural network fit (green full band) to the DSSV08 pseudo-data (points with uncertainties). Results are shown for the 1σ (left plots) and 4σ (right plots) prior ensembles (see text). The PDF positivity bounds from the corresponding unpolarized NNPDF2.3 counterpart are also shown.
Each of the PDF combinations in Eqs. (1)-(2) is parametrized as usual by means of a neural network supplemented with a preprocessing function,
$$\Delta\mathrm{pdf}(x,Q_0^2) = (1-x)^{m_{\Delta\mathrm{pdf}}}\, x^{n_{\Delta\mathrm{pdf}}}\, \mathrm{NN}_{\Delta\mathrm{pdf}}(x), \qquad \mathrm{pdf} = V, V_3,$$
where NN_∆pdf is the output of the neural network, and the preprocessing exponents m, n are linearly randomized for each Monte Carlo replica within the ranges given in Tab. 2. We have checked that our choice of preprocessing exponents does not bias the fit, according to the procedure discussed in Sect. 4.1 of Ref. [1]. The neural network architecture is the same as in NNPDFpol1.0, namely 2-5-3-1.
The DSSV08 PDFs provide us with pseudo-data for ∆ū and ∆d̄, which in terms of the PDF basis are given by
$$\Delta\bar u(x,Q_0^2) = \frac{1}{12}\left(2\Delta\Sigma + 3\Delta T_3 + \Delta T_8 - 3\Delta V - 3\Delta V_3\right)(x,Q_0^2),$$
$$\Delta\bar d(x,Q_0^2) = \frac{1}{12}\left(2\Delta\Sigma - 3\Delta T_3 + \Delta T_8 - 3\Delta V + 3\Delta V_3\right)(x,Q_0^2).$$
Each DSSV08 pseudo-data replica is then combined at random with an NNPDFpol1.0 PDF replica, and the two missing basis combinations, Eqs. (1)-(2), are determined by fitting with the standard NNPDF methodology, including the theoretical constraints which are relevant in the polarized case, as discussed in Ref. [1]. In particular, the positivity constraints Eqs. (61)-(62) of Ref. [1] have been enforced by letting f = u, ū, d, d̄ separately. Note that no additional sum rules affect ∆V and ∆V_3. As a consequence, the new PDF combinations ∆V and ∆V_3 are completely uncorrelated to the PDF combinations of the NNPDFpol1.0 set, as they are based on completely independent information and there is no further theoretically-induced cross-talk.
The quality of the pseudo-data fits is quantitatively assessed by the χ² values per data point quoted in Tab. 3, which are close to one for both the DSSV U and DSSV D data sets, and for their combination. We thus end up with four separate prior PDF ensembles, labeled as 1σ, 2σ, 3σ and 4σ, corresponding to the different factors by which the DSSV08 nominal PDF uncertainty has been enlarged. In Sects. 3-4 we will explicitly show that this is sufficient to obtain reweighted results which are independent of the choice of prior, the 3σ and 4σ sets both being effectively unbiased priors.
Table 3: The value of the χ²_tot per data point for both separate and combined ∆ū and ∆d̄ data sets after the neural network fit to ∆ū and ∆d̄ pseudo-data sampled from DSSV08 [6].
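The relation between the antiquark distributions and the fit basis can be checked numerically with a short round-trip sketch. It assumes the standard definitions ∆Σ = ∆u⁺ + ∆d⁺ + ∆s⁺, ∆T_3 = ∆u⁺ − ∆d⁺ and ∆T_8 = ∆u⁺ + ∆d⁺ − 2∆s⁺ for the basis of Ref. [1], together with ∆V = ∆u⁻ + ∆d⁻ and ∆V_3 = ∆u⁻ − ∆d⁻ (with ∆s = ∆s̄); the function names and the input numbers are purely illustrative.

```python
def to_basis(du, dubar, dd, ddbar, ds_plus):
    """Map flavor-basis values at fixed (x, Q0^2) to the fit basis (assumed definitions)."""
    u_plus, d_plus = du + dubar, dd + ddbar
    u_minus, d_minus = du - dubar, dd - ddbar
    return {
        "Sigma": u_plus + d_plus + ds_plus,
        "T3": u_plus - d_plus,
        "T8": u_plus + d_plus - 2.0 * ds_plus,
        "V": u_minus + d_minus,
        "V3": u_minus - d_minus,
    }

def antiquarks(b):
    """Recover (Delta ubar, Delta dbar) from the fit basis."""
    dubar = (2 * b["Sigma"] + 3 * b["T3"] + b["T8"] - 3 * b["V"] - 3 * b["V3"]) / 12.0
    ddbar = (2 * b["Sigma"] - 3 * b["T3"] + b["T8"] - 3 * b["V"] + 3 * b["V3"]) / 12.0
    return dubar, ddbar

# Round trip with arbitrary illustrative numbers (not fit results):
basis = to_basis(du=0.30, dubar=0.02, dd=-0.10, ddbar=-0.03, ds_plus=-0.05)
dubar, ddbar = antiquarks(basis)
```

The round trip returns the input ∆ū and ∆d̄, which confirms that the five combinations carry enough information to separate quarks from antiquarks once ∆V and ∆V_3 are known.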
In Fig. 2, we show the x∆ū(x, Q₀²) and x∆d̄(x, Q₀²) PDFs at the initial energy scale Q₀² = 1 GeV² from the 1σ and 4σ sets. The other priors, 2σ and 3σ, consistently provide intermediate results. In these plots, the positivity bound discussed in Ref. [1] and the pseudo-data points sampled from DSSV08 are also shown.
These priors will be used in the next section for the inclusion of the new data sets, listed in the upper part of Tab. 1.

The polarized gluon: open charm and jet production
In this section, we include by reweighting the information coming from the data of Tab. 1 which mostly affect the gluon distribution: open-charm production data from the COMPASS experiment at CERN, and high-p_T jet production measurements in proton-proton collisions from the STAR and PHENIX experiments at RHIC. We discuss the inclusion of each of these two data sets in turn.

Open-charm muoproduction at COMPASS
Open-charm production in polarized DIS [64] directly probes the gluon distribution because at leading order (LO) this process proceeds through photon-gluon fusion (PGF), γ*g → cc̄, followed by the fragmentation of the charm quarks into charmed mesons, typically D⁰ mesons. The corresponding measurable photon-nucleon asymmetry A_LL^{γN→D⁰X} at LO is thus given by
$$A_{LL}^{\gamma N\to D^0 X} = \frac{\Delta\sigma_{\gamma g}\otimes\Delta g\otimes D_c^{D^0}}{\sigma_{\gamma g}\otimes g\otimes D_c^{D^0}}, \tag{7}$$
where ∆σ_γg (σ_γg) is the spin-dependent (spin-averaged) partonic cross-section, ∆g (g) is the polarized (unpolarized) gluon PDF, and D_c^{D⁰} is the fragmentation function for a charm quark into a D⁰ meson, assumed to be spin independent.
The COMPASS collaboration has measured the photon-nucleon asymmetry A_LL^{γN→D⁰X}, Eq. (7), obtained by the scattering of polarized muons of energy E_µ = 160 GeV off a fixed target of longitudinally polarized protons or deuterons [20]. Three different data sets are available, depending on the D⁰ decay mode used to reconstruct the charmed hadron in the final state: D⁰ → K⁻π⁺, D⁰ → K⁻π⁺π⁰ or D⁰ → K⁻π⁺π⁺π⁻. In the following, these will be referred to as COMPASS K1π, COMPASS K2π and COMPASS K3π respectively. Assuming LO kinematics, the polarized gluon PDF is being probed at intermediate momentum fraction values, 0.06 ≲ x ≲ 0.22, and at energy scale Q² = 4(m_c² + p_T²) ∼ 13 GeV², where m_c is the charm quark mass and p_T is the transverse momentum of the produced charmed hadron, as shown in Fig. 1 and Tab. 1.
In order to include COMPASS open-charm muoproduction data in our polarized PDF determination [1] via reweighting, we need to compute the predictions for the virtual photon-nucleon asymmetry A_LL^{γN→D⁰X} given in Eq. (7). We perform this computation, separately for the numerator and the denominator of Eq. (7), using the LO expressions in Ref. [65], to which we refer the reader for more details. The NLO prediction is also available [66]; however, as we shall see shortly, these data have a negligible impact, so much so that results would be essentially unchanged by the use of NLO theory.
Notice that here we do not need the prior Monte Carlo samples constructed in Sect. 2, since the asymmetry Eq. (7) only depends on the polarized gluon. We use the Peterson parametrization of the fragmentation function D_c^{D⁰} [67] with ε = 0.06; we checked that results are unaffected by reasonable variations of the fragmentation function [68], and indeed it was pointed out in Ref. [66] that the dependence on the fragmentation function is weaker than the scale uncertainties of the NLO computation.
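For reference, the Peterson form can be evaluated with a few lines of code. The sketch below uses the standard functional form D(z) ∝ 1 / { z [1 − 1/z − ε/(1 − z)]² } with ε = 0.06 as in the text; the normalization to unit integral and the simple midpoint-rule integration are choices made here for illustration, and the function names are hypothetical.

```python
def peterson(z, eps=0.06):
    """Unnormalized Peterson fragmentation function for 0 < z < 1 (zero elsewhere)."""
    if not 0.0 < z < 1.0:
        return 0.0
    return 1.0 / (z * (1.0 - 1.0 / z - eps / (1.0 - z)) ** 2)

def peterson_norm(eps=0.06, n=20000):
    """Integral of the unnormalized function over (0, 1), midpoint rule with n bins."""
    h = 1.0 / n
    return sum(peterson((i + 0.5) * h, eps) for i in range(n)) * h

def peterson_normalized(z, eps=0.06, norm=None):
    """Peterson function normalized to unit integral (normalization is an illustrative choice)."""
    if norm is None:
        norm = peterson_norm(eps)
    return peterson(z, eps) / norm
```

With ε = 0.06 the distribution is peaked at large z, so that the charmed meson carries most of the momentum of the parent charm quark.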
To get a feeling for the potential impact of these data, in Fig. 3 we compare the LO predictions obtained using various PDF sets to the COMPASS data, separated into individual decay channels and three bins for the energy E_{D⁰} of the charmed hadron. Specifically, we show results obtained using the DSSV08, AAC08 [8], BB10 [9] and NNPDFpol1.0 polarized sets, supplemented with the following unpolarized sets: CTEQ6 [69] for DSSV08; MRST2004 [70] for AAC08 and BB10; and NNPDF2.3 [71] for NNPDFpol1.0. In all cases, the PDF uncertainties shown are obtained neglecting the uncertainties due to the unpolarized sets. It is clear from Fig. 3 that the COMPASS data have very large uncertainties on the scale of the gluon PDF uncertainty, even though the PDF uncertainty is itself very large, especially for the NNPDFpol1.0 set [1].
The COMPASS data have been included into the NNPDFpol1.0 set by Bayesian reweighting. The values of χ² per data point before and after reweighting are shown in Tab. 4, along with those obtained using other PDF sets in which the COMPASS data are not included; in each case the unpolarized set is the same as that used in Fig. 3. We also list the number of data points, and the effective number of replicas after reweighting a prior of N_rep = 1000 replicas. Note that, because information on the correlation of systematics is unavailable, statistical and systematic uncertainties are added in quadrature when computing the χ². It is clear from Tab. 4 that the inclusion of the COMPASS data has a negligible impact on the fit, as shown by the fact that the χ² values before and after reweighting are either the same or extremely close, and the effective number of replicas after reweighting is always very close to the number of replicas in the prior set. Also, despite very significant differences in the shape of the central gluon distribution for the various PDF sets considered here, the χ² values are essentially the same for all sets. This means that the χ² is mostly determined by the mutual consistency or inconsistency of the data themselves, rather than by the actual shape of the gluon.
Figure 3: Double-spin asymmetry for D⁰ meson photoproduction, A_LL^{γN→D⁰X}, Eq. (7), as measured by COMPASS [20] from the three different decay channels, compared to the corresponding LO theoretical prediction obtained using the NNPDFpol1.0, DSSV08, AAC08 and BB10 polarized parton sets, supplemented in each case by a suitable unpolarized set (see text). Results are presented for three bins of the D⁰ meson energy, E_{D⁰}, and in five bins of its transverse momentum, p_T^{D⁰}.
Table 4: Quality of the fit to combined and individual COMPASS open-charm data sets obtained using the polarized PDF sets (and unpolarized counterparts) of Fig. 3, as well as a set (denoted as reweighted) obtained starting with NNPDFpol1.0 and including the COMPASS data by reweighting. In each case, we show the number of data points, the effective number of replicas after reweighting for the reweighted set, and the χ² per data point obtained using each set.
The observable A_LL^{γN→D⁰X}, Eq. (7), and the polarized gluon PDF x∆g(x, Q₀²) at Q₀² = 1 GeV², before and after reweighting, are shown in Fig. 4 and Fig. 5 respectively: as expected, they are essentially unaffected by the inclusion of the COMPASS data. Note in particular that the uncertainty on the gluon, also shown in Fig. 5, is essentially unchanged. We conclude that the COMPASS data have little or no effect on polarized PDFs, and specifically on the polarized gluon PDF.
Figure 4: The double-spin asymmetry A_LL^{γN→D⁰X} determined before and after reweighting compared to the COMPASS data.

High-p T jet production at STAR and PHENIX
We now turn to inclusive jet production in longitudinally polarized proton-proton collisions, for which RHIC data are available (see e.g. [72,73]). These data are expected to have a significant impact on the gluon PDF because of the dominance of gg- and qg-initiated subprocesses in the accessible kinematic range (see e.g. [74,75]). Semi-inclusive hadron production in polarized collisions [21][22][23][24][25][26][76] is also sensitive to the gluon PDF, but it requires knowledge of fragmentation functions, which should be consistently determined along with parton distributions. Predictions for some semi-inclusive processes will be provided in Sect. 5.3 below with the goal of assessing their potential relevance, but we do not include them in our PDF determination in order not to have to rely on poorly known fragmentation functions.
We consider specifically data for the longitudinal double-spin asymmetry,
$$A_{LL}^{1-\mathrm{jet}} = \frac{\sigma^{++}-\sigma^{+-}}{\sigma^{++}+\sigma^{+-}}, \tag{8}$$
defined as the ratio of the difference to the sum of inclusive jet cross-sections with equal (σ⁺⁺) or opposite (σ⁺⁻) proton beam polarizations. For dijet production, the leading-order parton kinematics is
$$x_1 = \frac{p_T}{\sqrt{s}}\left(e^{\eta_3}+e^{\eta_4}\right), \qquad x_2 = \frac{p_T}{\sqrt{s}}\left(e^{-\eta_3}+e^{-\eta_4}\right),$$
where p_T is the transverse jet momentum, η_{3,4} are the rapidities of the two jets and √s is the center-of-mass energy. In single-inclusive jet production, the underlying Born kinematics is not fixed uniquely because the second jet is being integrated over (in Fig. 1 we have conventionally assumed equal longitudinal momentum of the incoming partons, so η_3 = −η_4 ≡ η and x_{1,2} = (2p_T/√s) e^{±η}).
The NLO QCD computation for inclusive high-p_T jet production in polarized hadron collisions was first presented in Ref. [77], based on the subtraction method of Refs. [78,79], along with a code for parton-level event generation. More recently, spin-dependent and spin-averaged cross-sections for single-inclusive high-p_T jet production have been determined in Ref. [80] using the so-called narrow-cone approximation, which holds in the limit of not too large jet radius R. In this approximation, analytical results for the corresponding NLO partonic cross-sections can be derived, leading to a faster and more efficient computer code, as all singularities arising in the intermediate steps of the calculation explicitly cancel. The narrow-cone approximation was shown to be close to the result of Ref. [77] in the particular case of RHIC kinematics [80]. We will therefore use the code of Ref. [80], rather than that of Ref. [77]. The approach of Ref. [80] has been recently extended in Ref. [81] to k_t-type jet algorithms, used for the latest STAR data.
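To get a feeling for the momentum fractions probed by jet data, the following sketch evaluates the standard leading-order kinematics of massless 2 → 2 scattering, x_1 = (p_T/√s)(e^{η_3} + e^{η_4}) and x_2 = (p_T/√s)(e^{−η_3} + e^{−η_4}). The function name and the numerical inputs are illustrative assumptions, not values taken from the data sets discussed here.

```python
import math

def dijet_momentum_fractions(p_t, eta3, eta4, sqrt_s):
    """LO parton momentum fractions for two massless jets of equal transverse momentum p_t."""
    x1 = p_t / sqrt_s * (math.exp(eta3) + math.exp(eta4))
    x2 = p_t / sqrt_s * (math.exp(-eta3) + math.exp(-eta4))
    return x1, x2

# Illustrative numbers: a 20 GeV jet pair at central rapidity with sqrt(s) = 200 GeV.
x1, x2 = dijet_momentum_fractions(p_t=20.0, eta3=0.0, eta4=0.0, sqrt_s=200.0)
```

When both jets have the same rapidity η, these expressions reduce to (2p_T/√s) e^{±η}, so the product x_1 x_2 = 4p_T²/s does not depend on η.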
The two general purpose experiments at RHIC, STAR and PHENIX, have presented measurements of the longitudinal double-spin asymmetry for inclusive jet production, Eq. (8). Results from STAR are available for the 2005 and 2006 runs and, since very recently, also for the 2009 run. On the other hand, a single data set is available from PHENIX, corresponding to data taken in 2005, while further jet measurements from this experiment are not foreseen due to detector limitations in angular coverage. In the following, these data sets will be referred to as STAR 1j-05, STAR 1j-06, STAR 1j-09A, STAR 1j-09B and PHENIX 1j respectively.
The features of these data sets are summarized in Tab. 5 and the corresponding Born-level kinematic coverage is shown in Fig. 1. The experimental covariance matrix is available only for the STAR 1j-09A and STAR 1j-09B data sets: we included it in our analysis through a routine provided by the STAR collaboration [82], which takes into account also additional fully correlated systematics arising from relative luminosity and jet energy scale uncertainties. For the other data sets, systematic and statistical uncertainties are added in quadrature. For the STAR 1j-05 and STAR 1j-06 data sets, we have to account for the fact that the data are taken in bins of p T , whereas the corresponding theoretical predictions are computed for the center of each bin. We estimate the corresponding uncertainty as the maximal variation of the observable within each bin and take that value as a further uncorrelated systematic uncertainty. Furthermore, although these data are provided with asymmetric systematic uncertainties, we symmetrize them, according to Eqs. (7)-(8) of Ref. [2].
For each data set in Tab. 5, we have computed the longitudinal double-spin asymmetry, Eq. (8), at NLO, using the narrow-cone approximation code of Refs. [80,81], suitably modified in order to use the NNPDF polarized parton sets via the LHAPDF interface [83,84]. In each case we use the jet algorithm and cone radius which are appropriate for the given data set, as listed in Tab. 5. Polarized and unpolarized PDFs are taken respectively from the prior ensembles constructed in Sect. 2 and from the NNPDF2.3 NLO reference parton set. As for open-charm muoproduction, the numerator in Eq. (8) is computed for each replica in the prior ensembles (N_rep = 1000), while the denominator is evaluated only once for the central unpolarized replica. This is justified because uncertainties of the polarized PDFs completely dominate over those of the unpolarized ones.
Table 5: For each data set, N_dat, the algorithm used for jet reconstruction and the value of the jet radius, R, the range over which the rapidity η is integrated, the center-of-mass energy of the collisions, √s, and the integrated luminosity for each run, L.
We have included the jet data of Tab. 5 by reweighting the prior sets discussed in Sect. 2. The χ² per data point before and after reweighting are listed in Tab. 6, for each data set and for a combined set including all these data, for each of the four prior sets. Various measures of the effectiveness of the reweighting process are listed in Tab. 7: the effective number of replicas N_eff, and the modal value of the P(α) distribution, ⟨α⟩, defined by Eq. (12) of Ref. [49]. The parameter α measures the consistency of the data which are used for reweighting with those included in the prior set: it is the factor by which the uncertainty on the new data must be rescaled in order for both sets to be consistent with each other. A value of ⟨α⟩ close to one means that the uncertainties have been correctly estimated.
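The consistency parameter can be illustrated with a short sketch that scans P(α) on a grid and returns its mode. It assumes the form P(α) ∝ (1/α) Σ_k w_k(α), where w_k(α) is the reweighting weight evaluated with χ²_k replaced by χ²_k/α²; this is our recollection of the definition in the reweighting literature and should be checked against Eq. (12) of Ref. [49]. The function names and the grid are hypothetical.

```python
import math

def p_alpha(chi2, n_dat, alphas):
    """Unnormalized P(alpha) on a grid, assuming P ~ (1/alpha) * sum_k w_k(alpha)."""
    values = []
    for a in alphas:
        total = 0.0
        for c in chi2:
            c_resc = c / a ** 2  # chi2 after rescaling the data uncertainties by alpha
            total += math.exp(0.5 * (n_dat - 1) * math.log(c_resc) - 0.5 * c_resc)
        values.append(total / a)
    return values

def modal_alpha(chi2, n_dat, alphas):
    """Grid point at which P(alpha) is largest."""
    p = p_alpha(chi2, n_dat, alphas)
    return alphas[p.index(max(p))]

# Illustrative scan: alpha from 0.50 to 3.00 in steps of 0.01.
grid = [0.5 + 0.01 * i for i in range(251)]
mode = modal_alpha([40.0], n_dat=10, alphas=grid)
```

For a single replica this distribution peaks at α² = χ²/N_dat, so a χ² per data point of four corresponds to uncertainties underestimated by a factor of about two, while a χ² per data point of one gives a mode at one.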
Table 7: The effective number of replicas after reweighting, N_eff (the starting sample has N_rep = 1000), and the modal value of the P(α) distribution, ⟨α⟩ (note that here and henceforth ⟨α⟩ denotes the mode, not the mean of the P(α) distribution). Results are quoted for separate and combined data sets and for each of the different prior PDF ensembles.
Inspection of Tabs. 6-7 allows us to draw the following conclusions:
• Results are essentially independent of the choice of prior. This was to be expected, given the very mild sensitivity of this observable to the polarized flavor-antiflavor decomposition.
• The effective number of replicas after reweighting is always significantly lower than the size of the initial sample N rep = 1000, suggesting that the RHIC data do have a significant impact on the fit. The most constraining data sets, for which N eff is smallest, are STAR 1j-09A and STAR 1j-09B: this is to be expected since these are the measurements with smallest statistical and systematic uncertainties.
• The effective number of replicas after reweighting is nevertheless always rather larger than a hundred, which is a typical size needed for the final replica sample to provide accurate results. This means that the size of the prior sample is large enough for final results to be reliable.
• The modal value of the P(α) distribution for all the STAR data as well as for the global data set is always close to one, suggesting correct uncertainty estimation (even for the earlier data sets for which no information on correlated systematics is available). However, the modal value of α is significantly below one for the PHENIX data, suggesting that for this data set uncertainties are overestimated, possibly due to the missing information on correlated systematics.
• The χ 2 after reweighting is of order one for the data sets for which information on correlated systematics is available, and it shows a significant improvement, with an especially remarkable agreement for the STAR 1j-09B which, as mentioned, has the smallest uncertainties. This suggests that these data are bringing in significant new information.
Predictions for the asymmetry A_LL^{1-jet}, Eq. (8), obtained using the NNPDFpol1.0 PDF set before and after reweighting are compared to the RHIC data in Fig. 6. The curves shown correspond to the 1σ prior PDF ensemble, reweighted with the complete data set of Tab. 5. The improvement in experimental uncertainties in STAR 1j-09A and STAR 1j-09B as compared to all other data sets is clearly visible, and it leads to an equally visible reduction of the uncertainty on the theoretical prediction, as well as a significant change of its central value. This is mostly due to a corresponding reduction in uncertainty and change of shape of the polarized gluon upon reweighting: this can be seen in Fig. 7, where the polarized gluon distributions before and after reweighting are compared. Here too the result corresponds to the 1σ prior PDF ensemble, reweighted with the complete data set of Tab. 5. We have explicitly checked that the reweighted gluon is independent of the choice of prior. In the kinematic range probed at RHIC, x ∈ [0.04, 0.2] (see Fig. 1), the polarized gluon PDF tends to become positive and its uncertainty is reduced by more than a factor two for x ∼ 0.3. This feature is qualitatively consistent with that reported by the DSSV group in Refs. [85,86], based on the analysis of the same data. The other PDFs are essentially unaffected by the inclusion of the RHIC jet data.

4 Polarized quark-antiquark separation: the W asymmetry

We now turn to polarized hadron collider data that constrain the flavor separation of polarized quarks and antiquarks. We consider in particular W production. Because of the chiral nature of the weak interactions, the polarized parton content may be accessed both through (parity violating) single-spin and (parity conserving) double-spin asymmetries. The single-spin asymmetry is defined as

$$A_L \equiv \frac{\sigma^{+} - \sigma^{-}}{\sigma^{+} + \sigma^{-}}, \qquad (10)$$

where $\sigma^{+(-)}$ denotes the cross-section for W production when colliding positive (negative) longitudinally polarized protons off unpolarized protons, and the double-spin asymmetry as

$$A_{LL} \equiv \frac{\sigma^{++} - \sigma^{+-}}{\sigma^{++} + \sigma^{+-}}, \qquad (11)$$

where $\sigma^{++}$ ($\sigma^{+-}$) is the cross-section for W production with equal (opposite) proton beam polarizations. At leading order, neglecting Cabibbo-suppressed channels, the former is given by a ratio that is linear in the polarized PDFs and the latter by a ratio that is quadratic in them. For fixed W rapidity $y_W$, the momentum fractions $x_{1,2}$ carried by the colliding partons are given by

$$x_{1,2} = \frac{M_W}{\sqrt{s}}\, e^{\pm y_W}.$$

It is thus clear that, first, each of these asymmetries is sensitive to the flavor decomposition of polarized quark and antiquark distributions and, second, that their simultaneous measurement provides an especially strong constraint, as both linear and quadratic combinations of polarized PDFs are being measured. This last fact has the interesting implication that the single- and double-spin asymmetries, Eqs. (10)-(11), must satisfy a nontrivial positivity bound, Eq. (15), derived in Ref. [87] and expressed in terms of the W boson rapidity $y_W$ (not to be confused with $\eta_l$, the pseudo-rapidity of the lepton from the W decay, which is used for the experimental measurements).

Both the STAR and PHENIX collaborations have presented measurements of the parity-violating spin asymmetry $A_L^{W^\pm}$, Eq. (10). Each of these two experiments provides a determination of the asymmetry for a single value of the rapidity and for outgoing $W^\pm$ but, as discussed in Ref. [88], these measurements are affected by very large uncertainties and do not provide any significant constraint, so we need not discuss them further.
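For orientation, the leading-order structure referred to above can be sketched as follows. This is a reconstruction under stated assumptions, not a transcription of the original equations: it assumes the conventions $A_L = (\sigma^+ - \sigma^-)/(\sigma^+ + \sigma^-)$ and $A_{LL} = (\sigma^{++} - \sigma^{+-})/(\sigma^{++} + \sigma^{+-})$, a W boson that couples only to negative-helicity quarks and positive-helicity antiquarks, and $W^+$ production (the $W^-$ case follows by exchanging $u \leftrightarrow d$). The overall signs may differ from those adopted in the original text, and the form of the bound is the one usually quoted from Ref. [87].

```latex
% LO, Cabibbo-suppressed channels neglected, W^+ production (assumed sign convention)
A_L^{W^+} \simeq
  \frac{-\Delta u(x_1)\,\bar d(x_2) + \Delta\bar d(x_1)\,u(x_2)}
       {u(x_1)\,\bar d(x_2) + \bar d(x_1)\,u(x_2)} ,
\qquad
A_{LL}^{W^+} \simeq
  -\,\frac{\Delta u(x_1)\,\Delta\bar d(x_2) + \Delta\bar d(x_1)\,\Delta u(x_2)}
          {u(x_1)\,\bar d(x_2) + \bar d(x_1)\,u(x_2)} ,

% positivity of the four helicity cross-sections then implies
1 \pm A_{LL}(y_W) \;\geq\; \bigl| A_L(y_W) \pm A_L(-y_W) \bigr| .
```

With these expressions the bound follows from requiring each product of helicity densities, such as $(u - \Delta u)(x_1)\,(\bar d + \Delta\bar d)(x_2)$, to be non-negative.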
The STAR collaboration has also recently presented [33] results for both the $A_L^{W^\pm}$ and $A_{LL}^{W^\pm}$ asymmetries, based on $\mathcal{L} = 9$ pb$^{-1}$ of data at $\sqrt{s} = 500$ GeV from the 2011 run and $\mathcal{L} = 77$ pb$^{-1}$ of data at $\sqrt{s} = 510$ GeV from the 2012 run. These data have been combined into a single data set at the nominal energy of $\sqrt{s} = 510$ GeV, and have greatly reduced uncertainties as compared to previous measurements. The STAR data are provided for both $W^+$ and $W^-$ final states, which we will refer to as STAR-$A_L^{W^\pm}$ and STAR-$A_{LL}^{W^\pm}$, and are presented in bins of $\eta_l$, the pseudo-rapidity of the lepton from the W decay. In particular, the single-spin asymmetry data are given in six bins and the double-spin asymmetry data in three bins, integrated over the lepton transverse momentum in the range $25 < p_T^l < 50$ GeV. Correlated beam polarization systematics (3.4% and 6.5% respectively for single- and double-spin asymmetries) are provided, while an additional uncorrelated systematic uncertainty, due to the relative luminosity, affects $A_L^{W^\pm}$ [33]. Using LO kinematics we see that these STAR W production data constrain light quark and antiquark PDFs with $0.05 \lesssim x \lesssim 0.4$ and $Q^2 \sim M_W^2$, see Fig. 1. We have included the STAR single- and double-spin asymmetry data by reweighting. The asymmetries have been computed using the CHE code [89], suitably modified to handle NNPDF parton sets via the LHAPDF interface. The values of $\chi^2$ per data point before and after reweighting (and the number of data points) are collected in Tab. 8, while the measures of the reweighting process (defined as for Tab. 7 of Sect. 3) are given in Tab. 9. In each case, results are given both for separate and combined STAR data sets, and for all four priors constructed in Sect. 2.
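The quoted x range can be checked with a few lines of arithmetic. The sketch below evaluates the LO relation $x_{1,2} = (M_W/\sqrt{s})\,e^{\pm y_W}$ at $\sqrt{s} = 510$ GeV. The function name, the value $M_W = 80.4$ GeV and the choice of rapidity points are illustrative assumptions; note also that the data are binned in the lepton pseudo-rapidity rather than in $y_W$, so this is only a rough guide.

```python
import math

M_W = 80.4  # GeV, approximate W boson mass (assumed value)


def lo_momentum_fractions(y_w, sqrt_s, m_w=M_W):
    """Return (x1, x2) = (M_W / sqrt(s)) * exp(+-y_W), the LO momentum fractions."""
    ratio = m_w / sqrt_s
    x1 = ratio * math.exp(y_w)
    x2 = ratio * math.exp(-y_w)
    if x1 > 1.0 or x2 > 1.0:
        raise ValueError("rapidity outside the kinematically allowed range")
    return x1, x2


if __name__ == "__main__":
    # Illustrative scan over W rapidities of order one
    for y in (-1.0, -0.5, 0.0, 0.5, 1.0):
        x1, x2 = lo_momentum_fractions(y, sqrt_s=510.0)
        print(f"y_W = {y:+.1f}:  x1 = {x1:.3f},  x2 = {x2:.3f}")
```

For $|y_W| \leq 1$ the two momentum fractions span roughly 0.06 to 0.43, in line with the range quoted above.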
Inspection of Tabs. 8-9 allows us to draw the following conclusions: • Independence of the choice of prior is achieved between the 3σ and 4σ priors, both in terms of fit quality (Tab. 8) and of reweighting parameters (Tab. 9). We will thus discuss henceforth results obtained using the 4σ prior.

Figure 8: Positron (left plots) and electron (right plots) single-spin asymmetries $A_L^{e^+}$ and $A_L^{e^-}$ (top row) and double-spin asymmetries $A_{LL}^{e^+}$ and $A_{LL}^{e^-}$ (bottom row) before and after reweighting, compared to the STAR data from the 2012 run [33]. Results obtained with the 4σ prior are shown. The uncertainties shown are statistical only.
• The effective number of replicas after reweighting is always significantly lower than the size of the prior sample for the single-spin asymmetry, but quite close to it for the double-spin asymmetry, thus suggesting that the former data have a significant impact, while the latter do not. This is consistent with the fact that statistical uncertainties are smaller for single-spin asymmetries.
• The effective number of replicas, however, even for the single-spin asymmetry data remains rather larger than that found in Sect. 3.2 after reweighting with the STAR jet data: hence, as the impact of the reweighting is more moderate, the number of replicas after reweighting remains adequate for accurate phenomenology.
• The modal values of the P(α) distribution are generally close to one, though a bit higher for $A_L^{W^-}$ (α ∼ 1.2), suggesting a mild underestimation of uncertainties, while the effective number of replicas after reweighting with these data is smaller than that found when reweighting with the $A_L^{W^+}$ data, which should have a very similar impact.
• After reweighting, the $\chi^2$ per data point is below one (in fact, a little more than one σ below one), thus showing very good agreement of the reweighted prediction with the data; interestingly, the $\chi^2$ before reweighting for the $A_L^{W^-}$ set was much greater than one, thus showing that the prior (based on DSSV, but with inflated uncertainties) does not agree well with these data.
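The quantities discussed in this list can be made concrete with a short sketch. It assumes the standard form of Bayesian reweighting used in the NNPDF approach, in which replica k receives a weight proportional to $(\chi^2_k)^{(n-1)/2}\,e^{-\chi^2_k/2}$ for n new data points, and the effective number of replicas is the exponential of the Shannon entropy of the weights. The function names and the toy $\chi^2$ values are hypothetical.

```python
import math


def reweighting_weights(chi2, n_dat):
    """Weights w_k proportional to chi2_k^((n-1)/2) * exp(-chi2_k / 2).

    chi2 holds the total (not per-point) chi2 of each replica to the new
    data, and must be strictly positive. The weights are normalized so
    that they sum to the number of replicas.
    """
    log_w = [0.5 * (n_dat - 1) * math.log(c) - 0.5 * c for c in chi2]
    shift = max(log_w)  # subtract the maximum to avoid overflow/underflow
    unnormalized = [math.exp(lw - shift) for lw in log_w]
    norm = sum(unnormalized) / len(unnormalized)
    return [u / norm for u in unnormalized]


def effective_replicas(weights):
    """N_eff = exp[(1/N) * sum_k w_k * ln(N / w_k)]."""
    n = len(weights)
    entropy = sum(w * math.log(n / w) for w in weights if w > 0.0)
    return math.exp(entropy / n)


if __name__ == "__main__":
    toy_chi2 = [5.0, 8.0, 12.0, 30.0]  # hypothetical replica chi2 values
    w = reweighting_weights(toy_chi2, n_dat=9)
    print("weights:", [round(x, 3) for x in w])
    print("N_eff  :", round(effective_replicas(w), 2))
```

When all replicas describe the new data equally well the weights are all one and N_eff equals the prior sample size; the more constraining the data, the further N_eff drops below it.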

The positron (electron) asymmetries $A_L^{e^+}$, $A_{LL}^{e^+}$ ($A_L^{e^-}$, $A_{LL}^{e^-}$) before and after reweighting are shown in Fig. 8 as a function of the lepton rapidity $\eta_l$, and compared to the STAR data [33]. As usual, the numerator in Eqs. (10)-(11) is computed for each replica in the PDF ensemble ($N_{\rm rep} = 1000$), while the denominator is evaluated only for the central unpolarized replica. The reduction in uncertainty after reweighting is clearly visible in all plots, and so is the good agreement between the data and the prediction after reweighting, the improvement being especially clear in the single-spin asymmetry plots.
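The replica-by-replica evaluation described above can be sketched as follows. The helper below is hypothetical: it takes one polarized numerator per replica and a single unpolarized denominator, and returns the (optionally weighted) mean and standard deviation of the resulting asymmetry. Normalizing the variance by the sum of the weights is an assumption made here for simplicity.

```python
import math


def asymmetry_band(polarized_numerators, unpolarized_denominator, weights=None):
    """Central value and one-sigma uncertainty of an asymmetry over replicas.

    polarized_numerators: one polarized cross-section difference per replica.
    unpolarized_denominator: unpolarized cross-section from the central replica.
    weights: optional reweighting weights; unit weights if omitted.
    """
    n = len(polarized_numerators)
    if weights is None:
        weights = [1.0] * n
    if len(weights) != n:
        raise ValueError("need exactly one weight per replica")
    values = [num / unpolarized_denominator for num in polarized_numerators]
    norm = sum(weights)
    mean = sum(w * v for w, v in zip(weights, values)) / norm
    variance = sum(w * (v - mean) ** 2 for w, v in zip(weights, values)) / norm
    return mean, math.sqrt(variance)


if __name__ == "__main__":
    # Toy numbers only: three replicas and a common denominator
    central, sigma = asymmetry_band([1.0, 2.0, 3.0], 10.0)
    print(f"A = {central:.3f} +- {sigma:.3f}")
```

Passing the reweighting weights gives the band after reweighting, while omitting them gives the band of the prior.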
The impact of the STAR data on the PDFs is seen in Fig. 9, where we compare the up and down antiquark polarized PDFs before and after reweighting; the corresponding absolute uncertainties are explicitly shown in the right plots. The change in shape in comparison to the prior (whose shape was determined by that of the DSSV best-fit PDFs) is especially remarkable for the ∆ū distribution, and indicates that the STAR W data pull in a different direction for ∆ū than the SIDIS data used in DSSV08. The reduction in uncertainty is very visible in the region of the peak, where it amounts to almost 20%.

Figure 9: Comparison between the polarized antiquark sea densities $x\Delta\bar u$ (upper plots) and $x\Delta\bar d$ (lower plots) before and after reweighting the 4σ ensemble with the complete STAR $W^\pm$ data set, at $Q_0^2 = 1$ GeV$^2$. The absolute PDF uncertainty is also shown (right plots).

The NNPDFpol1.1 polarized parton set
In this section, we combine the different pieces of information obtained in Sects. 3 and 4 and produce a global polarized PDF set based on the NNPDF methodology, NNPDFpol1.1, which is the main result of this paper. This set is constructed by simultaneous reweighting of the prior PDF samples with all the new data from the COMPASS, STAR and PHENIX experiments listed in Tab. 1 (upper part), displayed in Fig. 1, and discussed in the previous sections.
In summary, the NNPDFpol1.1 set represents the state-of-the-art in our understanding of the proton's spin content from polarized observables which do not entail parton-to-hadron fragmentation. Further constraints will be provided by a variety of semi-inclusive measurements, which in turn will require the development of a set of parton fragmentation functions using the NNPDF methodology. In the long term, the final word on the spin content of the proton will require brand new facilities such as an Electron-Ion Collider, which could bring polarized PDF determinations to a similar level of accuracy as their unpolarized counterparts.

Table 10: The value of the $\chi^2$ per data point $\chi^2/N_{\rm dat}$ ($\chi^2_{\rm rw}/N_{\rm dat}$) before (after) reweighting, the effective number of replicas after reweighting $N_{\rm eff}$, and the modal value of the P(α) distribution. All results refer to reweighting with all the $N_{\rm dat} = 110$ data points corresponding to the data sets shown in Tab. 1 (upper part), with different choices for the prior (1σ, 2σ, 3σ, 4σ).
Based on our results, we will also reassess the status of polarized quark and gluon first moments and, as an example of application, we will compute the longitudinal double-spin asymmetry for single-inclusive particle production in proton-proton collisions, and compare results to recent RHIC data.

Simultaneous reweighting of RHIC and COMPASS data
The NNPDFpol1.1 parton set is obtained by performing a global reweighting of the prior polarized PDF ensembles (constructed as described in Sect. 2) with all the relevant data from the COMPASS, STAR and PHENIX experiments. The reweighting parameters and the χ 2 per data point before and after reweighting are collected in Tab. 10, to be compared to Tabs. 6-7 of Sect. 3 and Tabs. 8-9 of Sect. 4, where the results of reweighting with individual data sets were shown.
Inspection of Tab. 10 allows us to draw the following conclusions: • Independence of the prior is achieved already with the 3σ prior, possibly even between the 2σ and the 3σ prior. The fact that independence of the prior is achieved somewhat earlier when performing the global reweighting than when reweighting with the most constraining data set (see Tabs. 8-9) is to be expected, as a consequence of the fact that the global data set carries more information.
• The effective number of replicas after reweighting is almost one order of magnitude smaller than the size of the prior sample, thereby showing that the data have a significant impact on the fit.
• The effective number of replicas after reweighting remains nevertheless larger than the conventional threshold N rep = 100 which is necessary to ensure reliability of the results obtained using the reweighted set.
• The modal value of the P(α) distribution is just above one, suggesting good compatibility of the new global data set with the previous fit, even though, as discussed in Sect. 4, Tab. 9, there is some evidence for moderate uncertainty underestimation for the $A_L^{W^-}$ STAR data, which may explain the value slightly above one.
• The $\chi^2$ per data point improves substantially upon reweighting, and is of order one after reweighting. Note however that these values should be treated with care because information on correlated systematics is not available for all experiments, and thus for all experiments for which statistical and systematic uncertainties must be added in quadrature the $\chi^2$ per data point might be expected to be less than one.
In the sequel, we will choose the PDF set obtained by reweighting the 4σ prior as our default NNPDFpol1.1 PDF set: this guarantees maximal independence of the choice of prior, without significant loss of accuracy, given that the values of $N_{\rm eff}$ for the 3σ and 4σ priors are roughly equal. As a further check of independence of the prior, in Fig. 10 we display the distance $d(x, Q^2)$, as defined in Appendix A of Ref. [5], between PDFs obtained with the 3σ and 4σ priors, at $Q^2 = 10$ GeV$^2$. This statistical estimator is expected to take the value $d \sim 1$ when two samples of $N_{\rm rep}$ replicas are extracted from the same underlying probability distribution, while it is $d \sim \sqrt{N_{\rm rep}}$ when the two samples are extracted from two distributions which differ on average by one standard deviation. As we see in Fig. 10, the two ensembles are indeed statistically equivalent.
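A minimal sketch of such a distance between the central values of two replica ensembles is given below. It assumes that the difference of the two means is normalized by the combined uncertainty of the means, which is a common choice but not necessarily identical in every detail to the definition of Ref. [5]; the function name and the toy samples are hypothetical.

```python
import math
import random


def central_value_distance(sample_a, sample_b):
    """|<a> - <b>| divided by sqrt(var_a / N_a + var_b / N_b)."""
    n_a, n_b = len(sample_a), len(sample_b)
    mean_a = sum(sample_a) / n_a
    mean_b = sum(sample_b) / n_b
    var_a = sum((v - mean_a) ** 2 for v in sample_a) / (n_a - 1)
    var_b = sum((v - mean_b) ** 2 for v in sample_b) / (n_b - 1)
    return abs(mean_a - mean_b) / math.sqrt(var_a / n_a + var_b / n_b)


if __name__ == "__main__":
    rng = random.Random(2014)
    n_rep = 100
    same_1 = [rng.gauss(0.0, 1.0) for _ in range(n_rep)]
    same_2 = [rng.gauss(0.0, 1.0) for _ in range(n_rep)]
    shifted = [rng.gauss(1.0, 1.0) for _ in range(n_rep)]  # one-sigma shift
    print("same distribution :", central_value_distance(same_1, same_2))
    print("one-sigma shift   :", central_value_distance(same_1, shifted))
```

With this normalization, two samples of 100 replicas whose parent distributions differ by one standard deviation give a distance of about $\sqrt{N_{\rm rep}/2} \approx 7$, while samples drawn from the same distribution give values of order one.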
In order to construct the NNPDFpol1.1 set, made of $N_{\rm rep} = 100$ replicas, the results from the global reweighting discussed above are unweighted following the procedure in Ref. [50]. We show in Tab. 11 the $\chi^2$ of each of the experiments included in the NNPDFpol1.1 analysis computed with the NNPDFpol1.1 PDF set; for experiments which were already included in NNPDFpol1.0 the value obtained using that set is also shown. It is clear that the description of inclusive DIS data in NNPDFpol1.1 is as good as in the original set, and that the description of each individual set used for reweighting is as good in the combined reweighted set as it was when reweighting with each individual set (as shown in Tabs. 6 and 8 of Sects. 3 and 4 respectively). The comparison of predictions obtained using NNPDFpol1.1 to the data is accordingly very similar to that shown in Figs. 4, 6, 8, and is therefore not shown.
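The unweighting step can be illustrated with a short sketch. It assumes the commonly used deterministic construction in which the cumulative probabilities of the weighted replicas are sampled at $N'_{\rm rep}$ equally spaced points, so that each replica is copied a number of times roughly proportional to its weight; whether this matches Ref. [50] in every detail (for instance in the treatment of boundary cases) is an assumption, and the function name is hypothetical.

```python
def unweight(weights, n_new):
    """Return how many copies of each replica enter an unweighted set of size n_new.

    Replica k is selected once for every j in 1..n_new such that
    P_{k-1} < j / n_new <= P_k, where P_k is the cumulative probability.
    """
    total = sum(weights)
    cumulative = []
    running = 0.0
    for w in weights:
        running += w / total
        cumulative.append(running)
    cumulative[-1] = 1.0  # guard against rounding in the last cumulant

    counts = [0] * len(weights)
    k = 0
    for j in range(1, n_new + 1):
        target = j / n_new
        while cumulative[k] < target:
            k += 1
        counts[k] += 1
    return counts


if __name__ == "__main__":
    toy_weights = [3.0, 1.0, 0.0, 0.0]  # hypothetical weights for four replicas
    print(unweight(toy_weights, n_new=4))
```

The unweighted set can then be used like any other Monte Carlo replica set, with plain (unweighted) averages.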
We now turn to the NNPDFpol1.1 PDFs, which we first compare in Fig. 11 with those of the previous set, NNPDFpol1.0. Because in NNPDFpol1.0 only the PDF combinations $\Delta u^+$, $\Delta d^+$, $\Delta s^+$, and the gluon PDF, $\Delta g$, are determined, only these can be compared directly. The comparison is shown at $Q^2 = 10$ GeV$^2$, where the positivity bound coming from the unpolarized NNPDF2.3 set is also displayed. In order to quantitatively assess the impact of the new data for these PDF combinations, in Fig. 12 we also plot the distance $d(x, Q)$ between NNPDFpol1.0 and NNPDFpol1.1 at $Q^2 = 10$ GeV$^2$. The main differences are found for $\Delta g$ in the region $0.02 \lesssim x \lesssim 0.5$ covered by the STAR inclusive jet production data. In this region, the gluon from the NNPDFpol1.1 parton set is positive at the one-sigma level, and it has a much reduced uncertainty in comparison to its NNPDFpol1.0 counterpart. At small x, outside the kinematical coverage of the STAR jet data, the NNPDFpol1.0 and NNPDFpol1.1 gluons are again statistically equivalent. On the other hand, the total quark-antiquark combinations $\Delta u^+$, $\Delta d^+$ and $\Delta s^+$ are only moderately affected by the new data, and they are still mostly constrained by the polarized inclusive DIS data, with the RHIC data only leading to a minor reduction in uncertainty, mostly for the $\Delta d^+$ distribution and in the small-x region.
We now turn to the individual NNPDFpol1.1 PDFs, which are compared in Fig. 13 to those from the global DSSV08 fit [6] at Q 2 = 10 GeV 2 ; PDF uncertainties are nominal one-sigma error bands for NNPDFpol1.1, and Hessian uncertainties (∆χ 2 = 1) for DSSV08. The main conclusions from the comparison in Fig. 13 are the following: • The ∆u and ∆d PDFs of the NNPDFpol1.1 and DSSV08 are qualitatively similar, though for NNPDFpol1.1 uncertainties are typically larger. Note however that the default ∆χ 2 = 1 adopted by DSSV08 may lead to uncertainty underestimation: it is well known that in Hessian global unpolarized fits a tolerance ∆χ 2 = T with T > 1 is needed for faithful uncertainty estimation, as also recognized in Ref. [6] (see also the discussion of first moments in Sect. 5.2 below).
• The NNPDFpol1.1 polarized gluon PDF is consistent at the one-sigma level with its DSSV08 counterpart in the large-x region $x \gtrsim 0.2$, where they have similar uncertainties. However, for x < 0.2, $\Delta g$ has a node in the DSSV08 determination, while it is clearly positive in NNPDFpol1.1. This result is mostly driven by the recent precise inclusive jet production data from STAR (STAR 1j-09A and STAR 1j-09B of Sect. 3), which were not available at the time of the original DSSV08 analysis shown in Fig. 13 (only STAR 1j-05 and STAR 1j-06 were included). Recent updates of the DSSV08 fit including also the STAR 1j-09A and STAR 1j-09B data sets [85,86] suggest a positive $\Delta g$ consistent with the NNPDFpol1.1 result.
• As already noticed in Sect. 4, inclusion of the $W^\pm$ data (not included in the DSSV08 fit) visibly affects the shape of the ∆ū distribution, especially above $x \sim 3\cdot 10^{-2}$, which thus differs from that in the DSSV08 set. This might be a signal of tension between $W^\pm$ and semi-inclusive DIS data, possibly due to limited knowledge of fragmentation functions. A similar change in the shape of ∆ū, making its peak less negative, was also found in a preliminary global fit including STAR data in the DSSV framework [90].
• Since W boson production data in the kinematic regime probed by STAR are not sensitive to strangeness, the discrepancy between NNPDF and DSSV determinations of $\Delta s$, already found in Ref. [1], is still present. As discussed in our previous work [1], in the NNPDF analysis the polarized strange PDF is obtained from inclusive DIS data through its $Q^2$ evolution and assumptions about flavor symmetry of the proton sea enforced by experimentally measured baryon octet decay constants. On the other hand, the DSSV determination of polarized PDFs also includes semi-inclusive data with identified kaons in final states, which are directly sensitive to strangeness, but subject to the uncertainty in the kaon fragmentation functions (which is difficult to quantify).
In view of the comparison with other PDF sets, it is interesting to examine positivity bounds which, as discussed in Sect. 2, our PDFs satisfy by construction for each individual flavor, as is also clear from Fig. 13. Note that positivity also implies that single- and double-spin asymmetries must satisfy the bounds of Eq. (15) [87]. Results for the combinations of asymmetries which are bounded to be non-negative are shown in Fig. 14, determined using NNPDFpol1.1 and DSSV08 PDFs (with, as in Sect. 4, NNPDF2.3 [71] and MRST02 [91] respectively as the corresponding unpolarized sets), for the RHIC center-of-mass energy $\sqrt{s} = 510$ GeV, integrated over the lepton transverse momentum in the range $25 < p_T < 50$ GeV, for $W^-$ and $W^+$ production. Interestingly, while NNPDFpol1.1 satisfies the bounds (as it must by construction), DSSV08 PDFs appear to violate both bounds for $W^-$ production when $|y_W| \gtrsim 1.6$, which roughly corresponds to momentum fractions $x \gtrsim 0.8$. This ties in neatly with the observation that while NNPDFpol1.1 and DSSV08 results are in good agreement for $W^+$ production, they differ for $W^-$, reflecting the different shape of the ∆ū distribution, also seen in Fig. 13.

The spin content of the proton revisited
The first moments of the polarized PDFs can be directly related to the fraction of the proton spin carried by individual partons. In Ref. [1] we presented a detailed analysis of the first moments of various PDF combinations from the fit to polarized inclusive DIS data only. In this section we revisit this analysis with the NNPDFpol1.1 parton set, and quantify the impact of the RHIC data on the proton spin content. We define the (truncated) first moments of the polarized PDFs $\Delta f(Q^2, x)$ in the region $[x_{\min}, x_{\max}]$ as

$$\langle \Delta f(Q^2) \rangle^{[x_{\min}, x_{\max}]} \equiv \int_{x_{\min}}^{x_{\max}} dx\, \Delta f(Q^2, x).$$

We will provide results for both full moments, $\langle \Delta f(Q^2) \rangle^{[0,1]}$, and truncated moments, restricted to the x region which roughly corresponds to the kinematic coverage of experimental data, i.e. $\langle \Delta f(Q^2) \rangle^{[10^{-3},1]}$, see Fig. 1.

We consider first polarized quark and antiquark PDFs. In contrast to the previous NNPDFpol1.0 determination, we can now determine the first moments of individual light flavors and antiflavors. In Tab. 12 we show results for the C-even combinations $\Delta u^+$ and $\Delta d^+$, for the light antiquarks $\Delta\bar u$ and $\Delta\bar d$, for the polarized strangeness $\Delta s$ (recall that we assume $\Delta s = \Delta\bar s$), and for the singlet PDF combination $\Delta\Sigma = \sum_{q=u,d,s} \Delta q^+$. The central values and one-sigma PDF uncertainties given there are obtained from the $N_{\rm rep} = 100$ replicas of the NNPDFpol1.1 parton set at $Q^2 = 10$ GeV$^2$. We compare our results to both NNPDFpol1.0 and DSSV08. In the latter case, we quote the PDF uncertainty obtained from the Lagrange multiplier method with $\Delta\chi^2/\chi^2 = 2\%$, which is recommended in Ref. [6] as more reliable. This corresponds to a value of the tolerance parameter, defined above in Sect. 5.1, of order T ∼ 8, hence to uncertainties rather larger than those shown for DSSV PDFs in Fig. 13. For DSSV08 we also show in parentheses the contribution that must be added to the truncated moment given in the table in order to obtain the first moment: this contribution comes from extrapolation in the unmeasured region, and it should be assigned 100% uncertainty [6].

The first moments obtained from NNPDFpol1.1 and NNPDFpol1.0 are perfectly consistent with each other, as expected based on the agreement of the PDFs seen in Fig. 11. The new constraints on polarized quark PDFs from the RHIC W data lead to a substantial reduction, by almost a factor two, of the PDF uncertainties on the first moments of $\Delta u^+$ and $\Delta d^+$. The contributions to the uncertainty coming from the data and extrapolation regions are of comparable size in NNPDFpol1.1, just as in NNPDFpol1.0. This means that the uncertainty due to the extrapolation has also decreased by almost a factor two in NNPDFpol1.1 in comparison to NNPDFpol1.0, despite the fact that the kinematic coverage of the data at small x is not significantly extended by the hadronic data (recall Fig. 1): this is because the lower uncertainty in the data region also limits the spread of acceptable small-x extrapolations. Interestingly, all quark truncated first moments are also compatible between NNPDFpol1.1 and DSSV08, despite the differences in shape in the ∆ū PDF.
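As an illustration of how truncated first moments and their Monte Carlo uncertainties can be evaluated, the sketch below integrates each replica over $[x_{\min}, x_{\max}]$ with a trapezoidal rule in $\ln x$ and then takes the mean and standard deviation over replicas. The function names and the toy replica shapes are hypothetical and are not the NNPDFpol1.1 PDFs; in practice each replica would be a callable returning $\Delta f(x, Q^2)$ at fixed $Q^2$.

```python
import math


def truncated_moment(pdf, x_min, x_max, n_points=2000):
    """Integral of pdf(x) over [x_min, x_max], trapezoidal rule in t = ln x."""
    if not 0.0 < x_min < x_max <= 1.0:
        raise ValueError("require 0 < x_min < x_max <= 1")
    t_min, t_max = math.log(x_min), math.log(x_max)
    step = (t_max - t_min) / n_points
    total = 0.0
    for i in range(n_points + 1):
        x = math.exp(t_min + i * step)
        edge = 0.5 if i in (0, n_points) else 1.0
        total += edge * x * pdf(x)  # dx = x dt
    return total * step


def moment_with_uncertainty(replicas, x_min, x_max):
    """Mean and standard deviation of the truncated moment over replicas."""
    values = [truncated_moment(rep, x_min, x_max) for rep in replicas]
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    return mean, math.sqrt(variance)


if __name__ == "__main__":
    # Toy replicas: a * sqrt(x) * (1 - x)^3 with hypothetical normalizations a
    toy = [lambda x, a=a: a * math.sqrt(x) * (1.0 - x) ** 3 for a in (0.8, 1.0, 1.3)]
    print(moment_with_uncertainty(toy, 1e-3, 1.0))
```

Integrating in $\ln x$ keeps the sampling dense at small x, where most of the truncated range lies on a logarithmic scale.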
We now turn to the gluon. Results for full and truncated moments at Q 2 = 10 GeV 2 are presented in Tab. 13, where, on top of NNPDFpol1.1, NNPDFpol1.0, and DSSV08 results, we also show the result found with a recent update of the DSSV family, DSSV++ [85]. The interest of this comparison lies in the fact that DSSV++ also includes jet production data from RHIC, so the data which constrain the gluon are essentially the same as in NNPDFpol1.1 (the exception being pion production data from RHIC, included in DSSV++ but not in NNPDFpol1.1). We show full and truncated moments in the measured region [10 −3 , 1], and also truncated moments in the region x ∈ [0.05, 0.2], which corresponds to the range covered by the RHIC inclusive jet data (see Fig. 1).
The significant improvement in the uncertainties on the first moment of ∆g in the data region when going from NNPDFpol1.0 to NNPDFpol1.1 is apparent; clearly, the improvement is concentrated in the region of the RHIC jet data. The uncertainty from the extrapolation, however, does not improve significantly and it dominates the result for the full first moment, which remains essentially undetermined. This strongly suggests that only with a wider kinematic coverage, such as would be obtained at a polarized Electron-Ion Collider, could a significantly more accurate determination of the polarized gluon first moment be achieved [39,40]. A moderate improvement might also be possible from RHIC jet data at higher center-of-mass energy, up to √ s = 500 GeV. Interestingly, the contribution to the first moment from the region of the RHIC jet data is clearly positive. The result found in this region is in good agreement (both in terms of central value and uncertainty) with that of DSSV++ [85]. This provides some evidence for a positive polarized gluon, though unfortunately the large uncertainty due to the extrapolation does not allow one to draw firm conclusions about the full first moment.

Single-particle inclusive production asymmetries at RHIC
As an illustration of the predictive power of the NNPDFpol1.1 set, in this section we compute the longitudinal double-spin asymmetry for single-hadron production in polarized proton-proton collisions, which has also been measured at RHIC recently. As for inclusive jets, Eq. (8), this asymmetry is defined as

$$A_{LL}^{H} \equiv \frac{\sigma^{++} - \sigma^{+-}}{\sigma^{++} + \sigma^{+-}},$$

where $\sigma^{++}$ ($\sigma^{+-}$) is the cross-section for the process with equal (opposite) proton beam polarizations. Theoretical predictions for the polarized and unpolarized differential cross-sections $d\Delta\sigma$ and $d\sigma$ may be obtained using factorized expressions with the general structure

$$d\sigma = \sum_{a,b,c=q,\bar q,g} f_a \otimes f_b \otimes d\hat\sigma_{ab}^{c} \otimes D_c^H,$$

and analogously for $d\Delta\sigma$, with the unpolarized PDFs $f_{a,b}$ replaced by their polarized counterparts $\Delta f_{a,b}$ and $d\hat\sigma_{ab}^{c}$ by $d\Delta\hat\sigma_{ab}^{c}$. Here the sum runs over all initial- and final-state partonic channels, $d\Delta\hat\sigma_{ab}^{c}$ and $d\hat\sigma_{ab}^{c}$ are respectively the polarized and unpolarized cross-sections for the partonic subprocess, and $D_c^H$ is the fragmentation function for parton c into hadron H. Predictions for these processes are subject to a significant uncertainty due to lack of knowledge of the relevant fragmentation functions: therefore, we did not use them for PDF determination in order not to introduce a theoretical uncertainty over which we have poor control. On the other hand, the relevant partonic cross-sections are known up to NLO, and may be obtained using the code of Ref. [92].
In Figs. 15-16 we show predictions for several of these double-spin asymmetries, compared to RHIC data, obtained using the NNPDFpol1.1 parton set. We use the fragmentation functions from the DSS07 set [45]. All uncertainties shown are PDF uncertainties from the N rep = 100 polarized replica set (with, as usual, the unpolarized denominator computed using the central PDF, as its uncertainty is negligible). Of course, an extra unknown uncertainty from the fragmentation function should also be included, but it cannot be reliably estimated at present.
√ s = 200 GeV [26]. Earlier PHENIX data for neutral pion production [21][22][23], with significantly larger uncertainties, are not considered. Our predictions are always in good agreement with the data within experimental uncertainties; they suggest that double-spin asymmetries for single-hadron production remain quite small in all the available $p_T$ range, typically below the 1% level. Our prediction for the negatively charged pion asymmetry is also small for all transverse momenta, see Fig. 16. In contrast, $A_{LL}^{\pi^+}$ is larger than $A_{LL}^{\pi^0}$. High-$p_T$ data (both polarized and unpolarized) are potentially sensitive to the gluon distribution, hence these data might eventually provide a further handle on the polarized gluon, if sufficiently accurate fragmentation functions become available.

Conclusions and outlook
We have presented a first global polarized PDF determination based on NNPDF methodology, which includes, on top of the deep-inelastic scattering data already used in our previous NNPDFpol1.0 polarized PDF set, COMPASS charm production data and all relevant inclusive hadronic data from polarized collisions at RHIC, i.e. essentially all available data which do not require knowledge of light-quark fragmentation functions. We have thus achieved a significant improvement in accuracy in the determination of the gluon distribution in the medium- and large-x region (from jet data), with evidence for a positive gluon polarization in this region, and a determination of individual light quark and antiquark PDFs (from $W^\pm$ production data). Together with the available NNPDF unpolarized PDF sets (currently NNPDF2.3 [71]) this provides a first global set of polarized and unpolarized PDFs determined with a consistent methodology, including mutually consistent constraints from cross-section positivity. This provides a reliable framework for phenomenological applications, also including possible searches for new physics with polarized beams [93]. Future inclusive RHIC data (specifically from the PHENIX and STAR collaborations) with improved accuracy and kinematic coverage will lead to even more precise polarized PDF determinations. In addition, a significant potential for improvement lies in the use of semi-inclusive data, whose availability and accuracy are constantly increasing both from fixed-target [11][12][13][14][15] and RHIC collider [24][25][26] experiments. However, a consistent inclusion of these data in our framework requires a simultaneous determination of fragmentation functions using NNPDF methodology.
However, present and future data from existing facilities are unlikely to substantially improve our knowledge of polarized first moments, i.e. of the proton spin structure. Indeed, the accuracy of present determinations of polarized first moments is already limited by the uncertainties due to extrapolation into the unmeasured small-x region. This is especially true for the polarized gluon, for which we have now tantalizing evidence of a positive polarization in the measured region, which is however completely swamped by an uncertainty from the extrapolation region which is larger by one order of magnitude. Improved accuracy requires a widening of the kinematic coverage: this could be achieved at a polarized Electron-Ion Collider [34][35][36], which would probe polarized PDFs down to much smaller values of x, as shown quantitatively in Refs. [39,40], and would also provide further constraints on flavor separation from polarized charged-current DIS [94].
The NNPDFpol1.1 polarized PDFs, with N rep = 100 replicas, are available from the NNPDF HepForge web site, http://nnpdf.hepforge.org/ in a format compliant with the LHAPDF interface [83,84]. In addition, stand-alone Fortran77, C++ and Mathematica driver codes are also available there.