Parton Distribution Uncertainties using Smoothness Prior

A study of the parameterization uncertainty at low Bjorken $x \le 0.1$ for the parton distribution functions of the proton is presented. The study is based on the HERA I combined data and uses a flexible parameterization form based on Chebyshev polynomials, with and without an additional regularization constraint. The accuracy of the data allows the gluon density to be determined in the kinematic range $0.0005 \le x \le 0.05$ with a small parameterization uncertainty. An additional regularization prior leads to a significantly reduced uncertainty for $x \le 0.0005$.


Introduction
Accurate knowledge of the parton distribution functions (PDFs) plays an important role in predictions of hard scattering cross sections at $pp$ and $p\bar{p}$ colliders. These cross sections are computed in the perturbative approach including higher order radiative corrections, e.g. at next-to-leading order (NLO), which results in reduced theoretical uncertainties. Particular cross sections, such as Drell-Yan production of W and Z bosons at the LHC, are even calculated to next-to-next-to-leading order (NNLO), see [1,2], and exhibit a small theoretical uncertainty of ∼ 2%. For these processes, the accuracy of the prediction is presently limited by the uncertainties of the PDFs.
The PDFs, being non-perturbative by definition, can be determined from fits to data from DIS e- and ν-scattering and from Drell-Yan experiments. These fits are performed using the well-known QCD evolution equations at NLO and NNLO [3-8]. The data are provided at discrete values of Bjorken $x$ and absolute four-momentum transfer squared $Q^2$, with their statistical and systematic uncertainties. With this input, the uncertainty of the PDFs due to experimental errors is estimated using Hessian [9,10] and Monte Carlo [11] methods. Additional theoretical uncertainties arise, e.g. from unknown higher orders in the evolution or from the treatment and scheme choice for heavy flavor contributions [12,13]. These need to be considered separately (see e.g. [14,15]).
PDF fits require an ansatz for the parameterization by a certain function of $x$ at the starting scale $Q_0^2$ of the evolution. Fitting experimental data at discrete points with an, in general, arbitrary function is an ill-posed problem which requires regularization. Typically, Regge-theory inspired parameterizations with a small number of parameters are used, which implicitly contain smoothness and regularity requirements for the PDFs. For these parameterizations, it is difficult to estimate the PDF uncertainty arising from the choice of a particular ansatz. Alternatively, flexible parameterizations based on a neural net approach have been used recently [16]. The number of parameters in this approach is determined by the data using an over-fitting protection technique, which acts as an implicit regularization.
In this note, a new study of the parameterization uncertainty at low Bjorken x < 0.1 is performed. An explicit regularization prior is introduced which disfavors resonant-like behavior of PDFs at low x and the impact of the prior on the parameterization uncertainty is evaluated with particular emphasis on the gluon density in the range 0.0005 ≤ x ≤ 0.05. We choose a flexible ansatz for the PDFs at low x using Chebyshev polynomials. The analysis is based on the combined HERA I data [17].

QCD Analysis Settings
The QCD analysis presented here uses as its sole input the combined H1 and ZEUS data on neutral and charged current $e^{\pm}p$ scattering double-differential cross sections collected during the HERA I run period of 1994-2000 [17]. The kinematic range of the data spans $0.045 < Q^2 < 30000$ GeV$^2$ and $0.000006 < x < 0.65$; however, in the QCD fit only data with $Q^2 \ge Q^2_{\min} = 3.5$ GeV$^2$ are considered in order to minimize non-perturbative higher twist effects.
The QCD fit is performed within the framework of the QCDNUM program implemented at NLO in QCD [18], using a Zero-Mass-Variable-Flavor scheme. The fit minimizes a $\chi^2$ function as specified in [17]. The PDFs are parameterized at the starting scale $Q_0^2 = 1.9$ GeV$^2$. We use a flavor decomposition similar to [19], as follows: $xd_{\rm val}$, $xu_{\rm val}$, $x\Delta = x\bar{u} - x\bar{d}$, $xS = 2(x\bar{u} + x\bar{d} + x\bar{s} + x\bar{c} + x\bar{b})$, where the c and b quark densities are zero at scales below their corresponding thresholds. The PDFs are evolved in $Q^2$ using the NLO equations in the massless $\overline{\rm MS}$ scheme, and the charm and beauty quark PDFs are generated by evolution for scales above the respective thresholds. The renormalization and factorization scales are set to $Q^2$.
Since this study focuses on the low-$x$ region, $x < 0.1$, the set-up for the QCD analysis is specialized and somewhat simplified compared to modern high-precision determinations of PDFs, see e.g. [17,20-22]. At low $x$ the PDFs are dominated by the gluon and sea-quark densities, while at high $x$ the valence-quark densities give a larger contribution. Thus, regarding the functional form for the PDFs, standard Regge-theory inspired parameterizations are used for the valence quarks:

$$xu_v(x) = A_{u_v}\, x^{B_{u_v}} (1-x)^{C_{u_v}}, \qquad xd_v(x) = A_{d_v}\, x^{B_{d_v}} (1-x)^{C_{d_v}}. \qquad (1)$$

The low-$x$ behavior of the valence densities is assumed to be the same for u and d quarks by imposing

$$B_{u_v} = B_{d_v}. \qquad (2)$$

The normalizations $A_{u_v}$ and $A_{d_v}$ are determined by the fermion number sum rules. Therefore the valence sector is described by three parameters.

For the gluon and sea densities a flexible parameterization based on Chebyshev polynomials is used. The polynomials use $\log x$ as an argument to emphasize the low-$x$ behavior. The parameterization is valid for $x > x_{\min} = 1.7 \times 10^{-5}$, which covers the $x$ range of the HERA measurements for $Q^2 \ge Q^2_{\min}$. The PDFs are multiplied by $(1 - x)$ to ensure that they vanish as $x \to 1$. The resulting parameterization form is

$$xg(x) = A_g\,(1-x) \sum_{i=0}^{N_g-1} A_{g_i}\, T_i(y), \qquad xS(x) = (1-x) \sum_{i=0}^{N_S-1} A_{S_i}\, T_i(y), \qquad y = -\frac{2\log x - \log x_{\min}}{\log x_{\min}}, \qquad (3)$$

where $T_i$ denote Chebyshev polynomials of the first kind and the sum over $i$ runs up to $N_{g,S} = 15$ for the gluon and sea-quark densities. The Chebyshev polynomials are given by the well-known recurrence relation:

$$T_0(y) = 1, \qquad T_1(y) = y, \qquad T_{i+1}(y) = 2y\,T_i(y) - T_{i-1}(y). \qquad (4)$$

The normalization $A_g$ is determined by the momentum sum rule. The advantage of the parameterization given by Eqs. (3), (4) is that the momentum sum rule can be evaluated analytically. Moreover, already for $N_{g,S} \ge 5$ the fit quality is similar to that of a standard Regge-inspired parameterization with a similar number of parameters.

The PDF uncertainties are estimated using the Monte Carlo technique [11]. The method consists of preparing replicas of data sets by allowing the central values of the cross sections to fluctuate within their systematic and statistical uncertainties, taking into account all point-to-point correlations.
The preparation of the data is repeated $N > 100$ times, and for each of these replicas a complete NLO QCD fit is performed to extract the PDF set. The PDF central values and uncertainties are estimated using the mean values and the root-mean-square (RMS) spread over the PDF sets obtained for each replica.
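The replica procedure can be illustrated with a minimal sketch. It is not the analysis code: the function names `make_replicas` and `mean_and_rms`, the array layout, the Gaussian fluctuations, and the toy straight-line fit standing in for the full NLO QCD fit are all assumptions made for illustration.

```python
import numpy as np


def make_replicas(central, stat, syst, n_rep, rng):
    """Fluctuate data points within their uncertainties (hypothetical helper).

    central: (n_pts,) measured central values
    stat:    (n_pts,) uncorrelated one-sigma uncertainties (absolute)
    syst:    (n_pts, n_src) one-sigma shifts from each correlated source
    Gaussian fluctuations are assumed throughout.
    """
    n_pts, n_src = syst.shape
    # Independent fluctuation for every point of every replica.
    uncorrelated = rng.standard_normal((n_rep, n_pts)) * stat
    # One random shift per correlated source, shared by all points of a replica.
    source_shifts = rng.standard_normal((n_rep, n_src))
    correlated = source_shifts @ syst.T
    return central + uncorrelated + correlated


def mean_and_rms(values):
    """Central value and RMS spread over replicas (axis 0)."""
    return values.mean(axis=0), values.std(axis=0)


# Toy usage: a straight-line fit stands in for the QCD fit of each replica.
rng = np.random.default_rng(1)
t = np.linspace(0.0, 1.0, 8)
central = 1.0 + 2.0 * t
stat = np.full(t.size, 0.05)
syst = 0.02 * central[:, None]  # a single, fully correlated 2% source
replicas = make_replicas(central, stat, syst, n_rep=200, rng=rng)
curves = np.array([np.polyval(np.polyfit(t, r, 1), t) for r in replicas])
mean, rms = mean_and_rms(curves)
```

Drawing one random number per correlated source, rather than one per point, is what preserves the point-to-point correlations in each replica.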

Choice of the Smoothness Constraint
Fitting an arbitrary function to a discrete number of measurements is an ill-posed problem which requires regularization. This regularization should have a physical motivation and be flexible enough to cover the space of solutions compatible with QCD. At low $x$, the sea-quark PDF closely corresponds to a measurement of the structure function $F_2$ in a DIS process. For DIS at low $x$, the invariant mass of the hadronic final state $W$ is calculated from $Q^2$ and $x$ as

$$W \approx \sqrt{Q^2\,\frac{1-x}{x}}, \qquad (5)$$

where the proton mass is neglected. Experimentally, it is well known that for low values of $W$ and $Q^2$ the structure function $F_2$ displays resonances [23]. These resonances, however, disappear for high $W > 5$ GeV. The smooth behavior of $F_2$ for high $W$ can be explained phenomenologically by the high particle multiplicity of the hadronic final state. A prior which disfavors resonant structures in $W$, for $W$ exceeding a certain value $W_{\min}$, therefore has a strong phenomenological motivation.

This prior can be introduced as an additional penalty to the likelihood function which disfavors PDFs with a larger arc length as a function of $W$. Note that a prior using the length in $W$, as opposed to the length in $x$, enhances the sensitivity to the low-$x$ region. For the $\chi^2$ function the prior corresponds to an extra penalty term of the form

$$\chi^2_{\rm prior} = \alpha \int_{W_{\min}}^{W_{\max}} \left[ \sqrt{1 + \left( x f'(W) \right)^2} - 1 \right] dW, \qquad (6)$$

where $\alpha$ is the relative weight of this PDF-length prior and the PDF $xf = xg, xS$, respectively. The prior $\chi^2_{\rm prior}$ has a minimum for the shortest PDF in $W$, which corresponds to a condition for the derivative, $x f'(W) = 0$. In this case, $\chi^2_{\rm prior} = 0$ holds irrespective of the value of $\alpha$. The total $\chi^2_{\rm tot}$ is given by the sum of the $\chi^2$ for the data versus theory comparison and the penalty term:

$$\chi^2_{\rm tot} = \chi^2 + \chi^2_{\rm prior}. \qquad (7)$$

We choose $W_{\max} = 320$ GeV, which is the maximum value achievable at HERA. To stay far away from the resonance region, $W_{\min} = 10$ GeV is used, which for $Q^2 = 1.9$ GeV$^2$ corresponds to $x \approx 0.02$. The prior is applied to both the gluon and sea-quark densities at the starting scale $Q_0^2 = 1.9$ GeV$^2$ of the evolution.
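As an illustration of how such a length penalty could be evaluated numerically for a Chebyshev-parameterized density, consider the following sketch. It rests on several assumptions that are not taken from the analysis code: the helper names (`cheb_pdf`, `x_of_w`, `length_prior`), the linear mapping of $\log x$ onto $[-1, 1]$, the neglect of the proton mass in the relation between $W$ and $x$, the subtraction of the flat-PDF length so that the penalty vanishes for $xf'(W) = 0$, and the simple finite-difference and trapezoidal numerics.

```python
import numpy as np
from numpy.polynomial import chebyshev as cheb

X_MIN = 1.7e-5  # lower validity limit of the parameterization
Q2_0 = 1.9      # starting scale in GeV^2


def cheb_pdf(x, coeffs, norm=1.0):
    """x*f(x) = norm * (1 - x) * sum_i coeffs[i] * T_i(y), with y linear in log x."""
    x = np.asarray(x, dtype=float)
    # Assumed mapping: x = X_MIN -> y = -1 and x = 1 -> y = +1.
    y = 1.0 - 2.0 * np.log(x) / np.log(X_MIN)
    return norm * (1.0 - x) * cheb.chebval(y, coeffs)


def x_of_w(w, q2=Q2_0):
    """Invert W^2 = Q^2 (1 - x) / x, i.e. with the proton mass neglected."""
    return q2 / (q2 + np.asarray(w, dtype=float) ** 2)


def length_prior(coeffs, alpha, w_min=10.0, w_max=320.0, n=20001):
    """alpha * integral over W of [sqrt(1 + (d(xf)/dW)^2) - 1]."""
    w = np.linspace(w_min, w_max, n)
    f = cheb_pdf(x_of_w(w), coeffs)
    dfdw = np.gradient(f, w)                    # finite-difference derivative in W
    integrand = np.sqrt(1.0 + dfdw ** 2) - 1.0  # excess length over a flat PDF
    # Trapezoidal rule written out explicitly.
    return alpha * np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(w))


# Toy usage: a smooth shape versus one with a large high-order coefficient.
smooth = length_prior([1.0], alpha=1000.0)
wiggly = length_prior([1.0, 0, 0, 0, 0, 0, 0, 0, 1.0], alpha=1000.0)
```

With these choices, `x_of_w(10.0)` is about 0.019, in line with the $x \approx 0.02$ quoted for $W_{\min}$, and `x_of_w(320.0)` stays just above `X_MIN`, so the whole integration range lies inside the validity region of the parameterization. The coefficient set with a large high-order term oscillates more in $W$ and therefore receives the larger penalty.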

Results
The Monte Carlo procedure of extracting PDFs is illustrated for $N_{\rm par} \equiv N_g = N_S = 9$ in Fig. 1, which shows the gluon PDF at the starting scale $Q_0^2 = 1.9$ GeV$^2$ for each replica, together with their RMS band. The distributions are compared to those obtained using the standard parameterization form.

Introducing the length prior to the fit by changing the weight $\alpha$ of the penalty term from 0 GeV$^{-1}$ to 1000 GeV$^{-1}$ increases the $\chi^2$ of the fit by 66 units, see Table 1. A further increase of the weight to $\alpha = 5000$ GeV$^{-1}$ reduces the fit quality considerably, with an additional increase of $\chi^2$ by 141 units. For low values of $\alpha \le 100$ GeV$^{-1}$, the impact of the penalty term on the shape of the central value of the gluon PDF is small, while for $\alpha \ge 1000$ GeV$^{-1}$ the distribution changes significantly, see Fig. 2. In addition, the shape of the gluon distribution obtained with a standard parameterization can be reproduced by the Chebyshev parameterization of the gluon PDF if a tight length prior is applied to the fit, as demonstrated in Fig. 1. Fig. 1 shows that the PDF uncertainty is very large for

Summary
The focus of this study has been on the parameterization uncertainty of PDFs at low $x$, especially of the gluon PDF, in a fit to the HERA I data at NLO in QCD. A flexible PDF parameterization based on Chebyshev polynomials has been chosen, and the impact of an additional smoothness prior on the quality of the fit has been investigated. We have found that the uncertainty of the fit is generally small in the range $0.0005 < x < 0.05$. The uncertainty, however, increases significantly for larger and smaller $x$ values. Regularization with a smoothness prior, which disfavors resonant structures for large values of $W$, significantly reduces the uncertainty also in the range $x < 0.0005$.