1 Introduction

To analyse data taken at the Large Hadron Collider in the best possible way, it is of great importance to have sound theoretical control over the QCD dynamics of proton-proton collisions. The mechanism of double parton scattering (DPS), in which two partons in each proton participate in a hard-scattering process, can give important contributions to particular final states and in particular kinematic regions. A prominent example is the production of two W bosons with the same charge [1,2,3,4,5,6,7], a channel that is also a background in searches for new physics (see e.g. [8,9,10]). A variety of DPS processes have been studied experimentally at the LHC [6, 11,12,13,14,15,16,17,18,19,20,21] and at lower energies [22,23,24,25,26,27,28,29,30,31,32] (see e.g. Figure 4 of [19] and Figure 15 of [33] for overviews). Recent years have seen significant progress in the QCD description of double parton scattering, see e.g. [34,35,36,37,38,39,40,41,42] and the brief overview in [43]. In particular, the formalism developed in [39, 44,45,46,47] extends the factorisation proofs for single Drell-Yan production [48,49,50] to double parton scattering with colourless final-state particles and achieves a consistent combination of single and double parton scattering contributions to a given final state. The non-perturbative quantities in DPS factorisation formulae are double parton distributions (DPDs), which specify the joint distribution of two partons in a proton. In the formalism just mentioned, these distributions depend in particular on the spatial separation \(\varvec{y}\) of the two partons in the plane transverse to the proton momentum. Alternatively, one may work with the transverse momentum \(\varvec{\varDelta }\) that is Fourier conjugate to \(\varvec{y}\).

Given the complexity of measuring and computing DPS cross sections, a largely model-independent fit of DPDs to experimental data, akin to what is done for single parton distributions (PDFs), will not be possible for a considerable time. It is hence essential to develop realistic models for DPDs. Considerable efforts have been made to compute them in quark models [51,52,53,54,55,56,57,58,59,60,61,62], and lattice calculations of Mellin moments of DPDs are underway [63, 64]. In addition, there are important theoretical constraints on DPDs. On the one hand, there is the perturbative splitting of one parton into two [34,35,36,37,38,39, 41, 42, 45, 57, 65,66,67,68,69,70,71,72,73,74,75,76], which determines the behaviour of DPDs at small \(\varvec{y}\) and likewise puts constraints on DPDs depending on \(\varvec{\varDelta }\). On the other hand, there are sum rules [77, 78], which involve DPDs integrated over \(\varvec{y}\) (or evaluated at \(\varvec{\varDelta } = \varvec{0}\)) and express the conservation of momentum and quark number. So far, only a small number of studies [2, 77, 79,80,81] have used these sum rule to constrain DPDs, and it is the goal of the present paper to continue this line of work. Whereas the DPD models in [2, 77, 79, 80] are formulated for DPDs at \(\varvec{\varDelta } = \varvec{0}\), we work with DPDs in \(\varvec{y}\) space, because these are the quantities required for computing DPS cross sections in the formalism of [45].

This paper is organised as follows. In Sect. 2, we recall the theory underlying our model construction, highlighting in particular the nontrivial relation between DPDs depending on \(\varvec{y}\) and those depending on \(\varvec{\varDelta }\). The starting point for our DPD model, taken from [45], is described in Sect. 3. In Sect. 4 we give a few technical details about our numerical calculations. In Sect. 5, we make a series of changes to our model DPDs, improving at each step their agreement with the sum rules. The scale dependence of our results is studied in Sect. 6, before we conclude in Sect. 7.

2 Theory

The model analysis in this paper is based on the theory for double parton distributions developed in [45]. Let us briefly present the most important results of that work for our context.

Consider the distribution function \(F_{a_1 a_2}(x_1, x_2, \varvec{y}; \mu )\) for finding two partons \(a_1\) and \(a_2\) in the proton. The momentum fractions of the partons are \(x_1\) and \(x_2\), and \(\varvec{y}\) denotes their spatial separation in the transverse plane. At leading order (LO) in \(\alpha _s\), the scale dependence of DPDs is given by evolution equations

$$\begin{aligned}&\frac{\mathrm {d} F_{a_1 a_2}(x_1,x_2,\varvec{y}; \mu )}{\mathrm {d} \log \mu ^2}\, \nonumber \\&\quad = \frac{\alpha _s(\mu )}{2\pi } \sum _{b_1} \int \limits _{x_1}^{1-x_2} \frac{\mathrm {d} z_1}{z_1}\; P_{a_1 b_1}\Bigl ( \frac{x_1}{z_1} \Bigr )\, F_{b_1 a_2}(z_1,x_2^{},\varvec{y}; \mu ) \nonumber \\&\qquad + \frac{\alpha _s(\mu )}{2\pi } \sum _{b_2} \int \limits _{x_2}^{1-x_1} \frac{\mathrm {d} z_2}{z_2}\; P_{a_2 b_2}\Bigl ( \frac{x_2}{z_2} \Bigr )\, F_{a_1 b_2}(x_1,z_2,\varvec{y}; \mu ) \,, \end{aligned}$$
(1)

with the same DGLAP splitting functions \(P_{a b}(v)\) that govern the evolution of ordinary PDFs at LO. For simplicity, we take a common factorisation scale \(\mu \) for both partons in the present work, but it is straightforward to use different scales \(\mu _1\) and \(\mu _2\).

The behaviour of \(F_{a_1 a_2}(x_1, x_2, \varvec{y}; \mu )\) at small \(y = |\varvec{y}|\) is dominated by the perturbative splitting of a single parton \(a_0\) into the observed partons \(a_1\) and \(a_2\). Evaluating the splitting mechanism at LO in \(\alpha _s\), one obtains

$$\begin{aligned}&F_{a_1 a_2,\,\text {spl,pt}}(x_1,x_2,\varvec{y};\mu ) \, \Big |_{D = 4 - 2\epsilon } \nonumber \\&\quad = \frac{\mu ^{2\epsilon }}{y^{2-4\epsilon }}\, \frac{\varGamma ^2(1-\epsilon )}{\pi ^{1-2\epsilon }}\, \frac{f_{a_0}(x_1+x_2;\mu )}{x_1+x_2}\, \nonumber \\&\qquad \times \frac{\alpha _s(\mu )}{2\pi }\, P_{a_1 a_2, a_0}\biggl ( \frac{x_1}{x_1+x_2}, \epsilon \biggr ) \,, \end{aligned}$$
(2)

in \(D = 4 - 2\epsilon \) dimensions, where \(f_{a_0}\) is the PDF for parton \(a_0\). The function \(P_{a_1 a_2, a_0}(v, \epsilon )\) is equal to the ordinary DGLAP splitting function \(P_{a_1\,a_0}(v)\) for \(\epsilon =0\), and its form for nonzero \(\epsilon \) may be found in [82].

Instead of the DPDs \(F(x_1, x_2, \varvec{y}; \mu )\) in transverse-position space, one may also consider distributions depending on the transverse momentum \(\varvec{\varDelta }\) that is Fourier conjugate to \(\varvec{y}\). Since according to (2) the distribution \(F(x_1, x_2, \varvec{y}; \mu )\) behaves like \(1/y^{2 - 4\epsilon }\) at short distances y, its Fourier transform w.r.t. \(\varvec{y}\) requires an additional renormalisation in the ultraviolet.

One way to achieve this is to perform the Fourier transform in \(2 - 2\epsilon \) transverse dimensions. This gives rise to a \(1/\epsilon \) ultraviolet pole that can be renormalised using standard \(\overline{\mathrm{MS}}\) subtraction, after which one can set \(\epsilon \) to zero. Owing to this additional renormalisation, the evolution equations of the resulting momentum space distributions \(F(x_1, x_2, \varvec{\varDelta }; \mu )\) differ from those of \(F(x_1, x_2, \varvec{y}; \mu )\) by an inhomogeneous term that can readily be deduced from (2). This inhomogeneous equation has long been known and discussed in the literature [65,66,67,68, 71, 77].

An alternative is to start from the distributions \(F(x_1, x_2, \varvec{y}; \mu )\) in \(D=4\) physical dimensions and to cut off their \(1/y^2\) singularity at short distances in the Fourier transform:

$$\begin{aligned}&F_{\varPhi , a_1 a_2} ( x_1, x_2, \varvec{\varDelta }; \mu , \nu ) \nonumber \\&\quad = \int \mathrm {d}^2 \varvec{y} \; e^{i \varvec{y} \varvec{\varDelta }}\; \varPhi (y \nu )\, F_{a_1 a_2} ( x_1, x_2, \varvec{y}; \mu ) \,. \end{aligned}$$
(3)

Here \(\nu \) is a scale with dimension of mass, and \(\varPhi (u)\) is a suitable function, which may be taken as a hard cutoff

$$\begin{aligned} \varPhi ( u ) = \varTheta ( u - b_0) \quad \text {with } b_0 = 2 e^{-\gamma } \approx 1.12 \,, \end{aligned}$$
(4)

where \(\gamma \) is the Euler-Mascheroni constant. This choice of \(b_0\) is such that certain analytical expressions simplify, see [45, 82].

Since the distributions in (3) differ from those defined with \(\overline{\mathrm{MS}}\) subtraction only by the treatment of the ultraviolet region, one can use the small y expression (2) to derive a perturbative matching equation between the two types of DPD:

$$\begin{aligned}&F_{a_1 a_2}(x_1, x_2, \varvec{\varDelta }; \mu ) \nonumber \\&\quad = F_{\varPhi , a_1 a_2}(x_1, x_2, \varvec{\varDelta }; \mu , \nu ) + \frac{f_{a_0}(x_1 + x_2;\mu )}{x_1 + x_2}\, \nonumber \\&\qquad \times \frac{\alpha _s(\mu )}{2\pi }\, \biggl [ \log \frac{\mu ^2}{\nu ^2}\; P_{a_1 a_2, a_0}( v, 0 ) + P'_{a_1 a_2, a_0}(v, 0) \,\biggr ] \nonumber \\&\qquad + {\mathcal {O}}\left( {\varDelta ^2} / {\nu ^2} , {\varLambda ^2} / {\nu ^2} ,\alpha _s^2 \,\right) \,, \end{aligned}$$
(5)

where we have abbreviated \(P'(v,\epsilon ) = \partial P(v,\epsilon ) / \partial \epsilon \) and \(v = x_1/(x_1 + x_2)\). Here \(\varLambda \) denotes a non-perturbative scale. It is understood that one should take \(\nu \sim \mu \) to avoid logarithmically enhanced higher-order corrections . Under this condition, the \(\nu \) dependence cancels between the first and second line of (5) within the stated accuracy. We will investigate this numerically in Sect. 6.2.

We remark in passing that the previous discussion can be extended beyond LO. The higher-order forms of (2) and (5) involve convolutions instead of ordinary products, and the NLO kernels for unpolarised partons have been computed in [82].

The distributions \(F(x_1, x_2, \varvec{\varDelta }; \mu )\) are of particular interest because at the point \(\varvec{\varDelta } = \varvec{0}\) they fulfil the sum rules formulated in [77]. Abbreviating

$$\begin{aligned} F(x_1, x_2; \mu ) = F(x_1, x_2, \varvec{\varDelta } = \varvec{0}; \mu ) \,, \end{aligned}$$

these sum rules read

$$\begin{aligned} \int \limits _0^{1 - x_1}\!\! \mathrm {d} x_2\, F_{a_1 q_{v}}(x_1, x_2; \mu )&= ( N_{q_{v}} + \delta _{a_1, \bar{q}} - \delta _{a_1, q} ) \,\nonumber \\&\quad \times f_{a_1}(x_1; \mu ) \end{aligned}$$
(6)
$$\begin{aligned} \sum _{a_2}\!\!\! \int \limits _0^{1-x_1}\!\!\! \mathrm {d} x_2\,x_2\, F_{a_1 a_2}(x_1, x_2; \mu )&= (1 - x_1)\,f_{a_1}(x_1; \mu ) \end{aligned}$$
(7)

and express the conservation of quark number and of momentum, respectively. Here \(F_{a_1 q_{v}} = F_{a_1 q} - F_{a_1 \bar{q}}\) denotes the valence combination for quark flavor q, and \(N_{q_{v}}\) is the number of valence quarks with flavour q in the target. Equivalent sum rules can be written down for DPDs integrated over \(x_1\), given the trivial symmetry relation \(F_{a_1 a_2}(x_1, x_2; \mu ) = F_{a_2 a_1}(x_2, x_1; \mu )\).

Note that naively \(F(x_1, x_2; \mu )\) just corresponds to the integral of \(F(x_1, x_2, \varvec{y}; \mu )\) over all \(\varvec{y}\), as one would expect for a sum rule. As discussed above, this simple correspondence is however invalidated by the singular short-distance behaviour of the \(\varvec{y}\) dependent distributions. As shown in [78], it is indeed the momentum space DPDs defined with \(\overline{\mathrm{MS}}\) renormalisation and taken at \(\varvec{\varDelta } = \varvec{0}\) that appear in the above sum rules (together with \(\overline{\mathrm{MS}}\)renormalised PDFs). Already in [77] it was pointed out that the inhomogeneous term in the evolution equations for momentum space DPDs is essential for ensuring that (6) and (7) are valid at all \(\mu \).

The matching relation (5) allows us to devise models for the position space distributions \(F(x_1, x_2, \varvec{y}; \mu )\), which are the primary quantities needed to compute cross sections in the formalism of [45] and at the same time to use the DPD sum rules (6) and (7) as constraints for these models. In practice, the sum rules will then only be fulfilled approximately and in a particular range of momentum fractions. This is the strategy adopted in the present work.

One might think of a different procedure and start with a model for the momentum space DPDs \(F(x_1, x_2, \varvec{\varDelta }; \mu )\), constructed such that the sum rules are satisfied exactly. Using the extension of (5) to arbitrary values of \(\varvec{\varDelta }\), given in [82], one can then compute the functions \(F_{\varPhi }(x_1, x_2, \varvec{\varDelta }; \mu ,\nu )\). The latter can be used instead of \(F(x_1, x_2, \varvec{y}, \mu )\) to compute the double parton scattering cross section, as shown in section 8 of [45]. This possibility shall not be pursued here. We note that it has proven to be difficult to devise a general ansatz for distributions \(F(x_1, x_2; \mu )\) that satisfy the sum rules exactly, with the only consistent solution so far being limited to the pure gluon sector [80]. Until further progress is made in that direction, the best one can achieve with either momentum or position space models is that the sum rules are satisfied approximately to a degree one deems satisfactory.

3 Initial model

As starting point of our work, we take the DPD model introduced in [45]. Let us briefly recall its features and motivation. We require that the DPDs have the small y behaviour given by the perturbative splitting mechanism at LO. This is achieved by using a two-component ansatz

$$\begin{aligned}&F_{a_1 a_2 }(x_1,x_2, \varvec{y}; \mu ) \nonumber \\&\quad = F_{a_1 a_2,\,\text {int}}(x_1,x_2, \varvec{y}; \mu ) + F_{a_1 a_2,\,\text {spl}}(x_1,x_2, \varvec{y}; \mu ) \,, \end{aligned}$$
(8)

where \(F_{\text {spl}}\) tends to the perturbative splitting form at small y, whilst \(F_{\text {int}}\) remains finite in that limit. The \(\mu \) dependence of both components is required to follow the evolution equations (2). The physical idea behind the separation (8) is that in \(F_{a_1 a_2,\,\text {int}}\) the partons \(a_{1}\) and \(a_{2}\) originate from the “intrinsic” part of the proton wave function, whilst in \(F_{a_1 a_2,\,\text {spl}}\) they are obtained from a parton \(a_{0}\) in the proton by perturbative splitting. It should be borne in mind that this is meant to be a heuristic picture, rather than a distinction that could be formulated in a field theoretically rigorous way.

For the intrinsic part of the DPD, we make an ansatz at the scale \(\mu _0 = 1 \,\mathrm{GeV}\). It consists of the product of two PDFs with a factor for the y dependence and a “phase space factor” \(\rho _{a_1 a_2}\) suppressing the distributions close to the kinematic boundary \(x_1 + x_2 = 1\),

$$\begin{aligned}&F_{a_1 a_2,\,\text {int}}(x_1,x_2, \varvec{y}; \mu _0) = f_{a_1}(x_1;\mu _0)\, f_{a_2}(x_2;\mu _0)\; \nonumber \\&\quad \times \frac{1}{4\pi h_{a_1 a_2}}\, \exp \biggl [ - \frac{y^2}{4h_{a_1 a_2}} \biggr ]\, \rho _{a_1 a_2}(x_1, x_2) \end{aligned}$$
(9)

with

$$\begin{aligned} \rho _{a_1 a_2}(x_1, x_2)&= \frac{(1-x_1-x_2)^2}{(1-x_1)^{2}\, (1-x_2)^{2}} \,. \end{aligned}$$
(10)

Apart from the factor \(\rho _{a_1 a_2}\), this form is obtained if one assumes that the two partons \(a_1\) and \(a_2\) in the proton are completely uncorrelated. Under that assumption, one can express a DPD as a convolution

$$\begin{aligned}&F_{a_1 a_2}(x_1,x_2,\varvec{y};\mu _0) \nonumber \\&\quad = \int d^2 \varvec{b}\, f_{a_1}(x_1, \varvec{b} + \varvec{y}; \mu _0) \, f_{a_2}(x_2, \varvec{b}; \mu _0) \end{aligned}$$
(11)

of two impact-parameter dependent PDFs \(f_a(x, \varvec{b})\), cf. [83] and section 2.1 of [39]. If one furthermore assumes that the impact-parameter dependent PDFs can be expressed in terms of ordinary PDFs and a Gaussian impact parameter profile, i.e.

$$\begin{aligned} f_{a}(x, \varvec{b}; \mu )&= f_{a}(x; \mu )\, \frac{1}{4 \pi h_a} \, \exp \left[ -\frac{\varvec{b}^2}{4 h_a}\right] , \end{aligned}$$
(12)

then the convolution integral in (11) yields a Gaussian with a width that is the sum of the single-particle widths, i.e. \(h_{a_1 a_2} = h_{a_1} + h_{a_2}\). For the single-particle widths we use the values

$$\begin{aligned} h_{g} = 2.33\,\,\mathrm{GeV}^{-2}, \quad h_{q} = h_{\bar{q}} = 3.53\,\,\mathrm{GeV}^{-2}, \end{aligned}$$
(13)

whose physical motivation is discussed in [45].

The phase space factor \(\rho _{a_1 a_2}\) ensures that the distributions go to zero when approaching the kinematical boundary \(x_1 + x_2 = 1\). The first or second power of \((1 - x_1 - x_2)\) is frequently used in the literature, but as observed in [77], this results in a strong violation of the sum rules in the region \(x_1 \ll 1\). A much better agreement is obtained with a phase space factor that does not yield any suppression in that limit. This is achieved by dividing \((1 - x_1 - x_2)^n\) by \((1 - x_1)^n \, (1-x_2)^n\).

For the “splitting part” of the DPD, we make the ansatz

$$\begin{aligned}&F_{a_1 a_2,\,\text {spl}}(x_1,x_2, \varvec{y}; \mu _y) \nonumber \\&\quad = F_{a_1 a_2,\,\text {spl,pt}}(x_1,x_2, \varvec{y}; \mu _y)\, \exp \left[ - \frac{y^2}{4h_{a_1 a_2}} \right] , \end{aligned}$$
(14)

where

$$\begin{aligned}&F_{a_1 a_2,\,\text {spl,pt}}(x_1,x_2, \varvec{y}; \mu _y) \nonumber \\&\quad = \frac{1}{\pi y^2}\; \frac{f_{a_0}(x_1+x_2;\mu _y)}{x_1+x_2}\, \frac{\alpha _s(\mu _y)}{2\pi }\, P_{a_1 a_0}\biggl ( \frac{x_1}{x_1+x_2} \biggr ) \end{aligned}$$
(15)

is the splitting form (2) in \(D=4\) dimensions. As required by theory, the ansatz (14) tends to the perturbative result for small y, with power corrections of order \(y^2 / h_{a_1 a_2}\). At large y, the \(1/y^2\) falloff of the perturbative result is dampened by the Gaussian factor in (14). For lack of better guidance, we take the same parameters \(h_{a_1 a_2}\) in this factor as in the intrinsic part (9).

The splitting form (14) is evaluated at the scale

$$\begin{aligned} \mu _y&= \frac{b_0}{y^*}, \quad y^{*} = \frac{y}{\sqrt{ 1 + y^2 /y_{\text {max}}^2 }} \end{aligned}$$
(16)

with \(y_{\text {max}} = 0.5 \,\mathrm{GeV}^{-1}\). In the perturbative regime \(y \ll y_{\text {max}}\) this corresponds to the natural choice \(\mu \sim 1/y\), which avoids logarithmically enhanced corrections from higher orders. For large y, the scale \(\mu _y\) approaches a limiting value \(b_0 / y_{\text {max}} \approx 2.25 \,\mathrm{GeV}\), which ensures that neither \(\alpha _s\) nor the PDFs on the r.h.s. of (14) are evaluated at too small scales.

For the parton densities appearing in both (9) and (15), we take the MSTW2008 LO distributions [84] with the small modification described in sect. 3.2 of [77]. The latter ensures that the \(\bar{d}\) and the \(\bar{s}\) PDFs are positive and thus admit a probability interpretation. For the strong coupling, we use the starting value \(\alpha _s(\mu _0) = 0.682\) adopted in the MSTW2008 LO analysis. Throughout this work, we fix the number of active quark flavours to \(n_f = 3\).

4 Technical implementation

With the general prescription (5) and the two-component model (8), the DPDs entering the sum rules are given by

$$\begin{aligned}&F_{a_1 a_2}(x_1, x_2; \mu ) = 2 \pi \int _{b_0 / \nu }^{\infty } \mathrm {d} y\; y\, F_{a_1 a_2,\,\text {int}}(x_1,x_2, y; \mu ) \nonumber \\&\quad + 2 \pi \int _{b_0 / \nu }^{\infty } \mathrm {d} y\; y\, F_{a_1 a_2,\,\text {spl}}(x_1,x_2, y; \mu ) \nonumber \\&\quad + F_{a_1 a_2,\,\text {match}}(x_1, x_2; \mu ) \,, \end{aligned}$$
(17)

where the matching term

$$\begin{aligned}&F_{a_1 a_2,\,\text {match}}(x_1, x_2; \mu ) = \frac{f_{a_0}(x_1 + x_2;\mu )}{x_1 + x_2}\, \frac{\alpha _s(\mu )}{2\pi }\, \nonumber \\&\quad \times \biggl [ \log \frac{\mu ^2}{\nu ^2}\; P_{a_1 a_2, a_0}( v, 0 ) + P'_{a_1 a_2, a_0}(v, 0) \,\biggr ] \end{aligned}$$
(18)

follows from (5). In (17) we have used that the position space DPDs depend on \(\varvec{y}\) only via y. Whilst evaluating \(F_{\text {match}}\) is straightforward, the numerical computation of the intrinsic and splitting terms is more demanding. In the following paragraphs, we give some details about our numerical implementation. A reader not interested in these technicalities may skip forward to Sect. 5.

DPD evolution and grids  To evolve \(F_{\text {int}}\) and \(F_{\text {spl}}\) from their respective starting scales in (9) and (14) to the scale \(\mu \) at which the sum rules are to be evaluated, we use a modified version of the code employed in the study [45], which was itself a modification of the original code described in [77]. With this code, we compute position space DPDs on grids in the momentum fractions \(x_1\) and \(x_2\), the interparton distance y, and the renormalisation scale \(\mu \). The momentum fraction grids are equidistant in the variables \(u_i = \log (x_i/(1-x_i))\). We use 89 grid points in each \(x_i\) direction, with the smallest and largest \(x_i\) values being \(x_{\text {min}} = 5 \times 10^{-5}\) and \(x_{\text {max}} = 1 - x_{\text {min}}\).

For the factorisation scale, we use 51 points on an equidistant grid in \(\log \mu ^2\), with largest scale \(\mu _{\text {max}} = 172 \,\mathrm{GeV}\). For each grid point \(\mu _i\), we define a grid point in \(y_i\) such that \(\mu _i = \mu _{y_i}\) with the function \(\mu _y\) given in (16). This is convenient for evaluating \(F_{\text {spl}}\) at its starting scale. The smallest value \(\mu _{\text {min}}\) on the \(\mu \) grid thus corresponds to the largest value on the y grid and is just slightly larger than the limiting value \(b_0 / y_{\text {max}} \approx 2.25 \,\mathrm{GeV}\) of \(\mu _y\) for infinitely large y.

It turns out that for evaluating the integrals in (17), the y grid just described is not quite dense enough at small y values, and that for \(F_{\text {spl}}\) we also need additional points at large y. Extending the y grid appropriately, we end up with 60 points for the intrinsic part and 90 points for the splitting part of the DPD.

Integration  At the starting scale \(\mu _0\), the y dependence of the intrinsic part \(F_{\text {int}}\) is given by a simple Gaussian factor. This does not remain true at other scales \(\mu \), because quark and gluon distributions mix under evolution and have different Gaussian widths in our model. However, we find that at the \(\mu \) values we consider, the y dependence of the evolved distributions \(F_{\text {int}}\) is reasonably well approximated by a linear superposition of Gaussians with widths \(h_{q q}\), \(h_{q g}\), and \(h_{g g}\). Determining the appropriate superposition by a fit for each pair \((x_1, x_2)\) on our grid, we can evaluate the first y integral in (17) analytically.

This strategy does not work for the splitting part \(F_{\text {spl}}\), for which we perform the y integral numerically, using the values of the distribution on the grid in y. Finally, the integral over \(x_2\) in the sum rules is evaluated numerically, using the values of \(F_{a_1 a_2}(x_1, x_2; \mu )\) on the \(x_2\) grid. For both the y and the \(x_2\) integrals, integration rules for equidistant grids with errors of order \(1/N^4\) are used if \(N > 4\), where N is the number of grid points in the relevant integration interval.

5 Refining the model

In this section, we describe how the initial model described in Sect. 3 is modified so as to fulfil the DPD sum rules to a good approximation over a wide range in \(x_1\). The modifications are performed in several steps, after each of which we quantify the degree to which the sum rules are satisfied. To this end, we follow [77] and consider the “sum rule ratios”

$$\begin{aligned} R_{a_1 q_{v}}(x_1; \mu )&= \frac{\int \mathrm {d}x_2\, F_{a_1 q_{v}}(x_1, x_2; \mu )}{ ( N_{q_{v}} + \delta _{a_1, \bar{q}} - \delta _{a_1, q} ) \,f_{a_1}(x_1; \mu )} \,, \end{aligned}$$
(19)
$$\begin{aligned} R_{a_1}(x_1; \mu )&= \frac{\sum _{a_2} \int \mathrm {d}x_2\,x_2\, F_{a_1 a_2}(x_1, x_2; \mu )}{ (1 - x_1)\,f_{a_1}(x_1; \mu )} \end{aligned}$$
(20)

with \(a_1\) being a quark, an antiquark, or a gluon. Note that a number sum rule ratio cannot be defined for \(F_{d d_v}\) in this way, because the denominator of \(R_{a_1 q_{v}}\) is zero in that case. The same holds for \(F_{a_1 s_v}\) unless \(a_1 = s\) or \(a_1 = \bar{s}\).

Postponing the discussion of \(F_{d d_v}\) to the end of this section, we now take a closer look at the number sum rules involving \(s_v\). We first observe that the PDFs underlying our DPD model satisfy the relation \(f_{s}(x) = f_{\bar{s}}(x)\), which is of course stable under LO evolution. As a consequence, our initial DPD model satisfies

$$\begin{aligned} F_{s s_v}&= - F_{\bar{s} s_v}, \quad F_{a_1 s_v} = 0\quad \hbox {for}\ a_1 \ne s, \bar{s} \end{aligned}$$
(21)

at all scales \(\mu \). This will remain true with the modifications made in the present section. One thus obtains \(R_{s s_v} = R_{\bar{s} s_v}\) and needs to consider only one of these ratios. Furthermore, the number sum rules for \(F_{a_1 s_v}\) with \(a_1 \ne s, \bar{s}\) are satisfied exactly. To prove the relations (21), we first note that they hold separately for \(F_{\text {int}}(x_1, x_2, \varvec{y}, \mu _0)\) and for \(F_{\text {spl}}(x_1, x_2, \varvec{y}, \mu _y)\) in the model specified in Sect. 3. It is easy to see that they are stable under LO evolution. Since they also hold for the matching term in (17), they are valid for the distributions \(F_{a_1 s_v}(x_1, x_2; \mu )\) entering the sum rules.

We will separately evaluate the contributions of the three terms in (17) to the numerators of \(R_{a_1 q_v}\) and \(R_{a_1}\), so as to see which part of the DPD model requires adjustment to improve a specific sum rule. We will show plots for selected sum rules that are representative of the general situation, or – when there are large differences between sum rules – show the best and worst cases.

Throughout this section, we evaluate the distributions (17) for \(\nu = \mu = \mu _{\text {min}}\), where \(\mu _{\text {min}} = 2.25 \,\mathrm{GeV}\) is the smallest value on the grid described in the previous section. Other scale choices will be explored in Sect. 6.

5.1 Zeroth iteration: initial ansatz

Let us start with the DPD model described in Sect. 3 and consider the momentum sum rules. They turn out to be satisfied surprisingly well, as is illustrated in Fig. 1. Notice that there is a rather large contribution from \(F_{\text {spl}}\) to \(R_g\). This is readily explained by identifying which parton combinations can be produced by perturbative splitting at LO, namely \(q \bar{q}\), qg, \(\bar{q} g\), and gg, as well as channels obtained from those by interchanging the two partons. All \(2 n_f + 1\) DPDs appearing in the g momentum sum rule thus receive a sizeable splitting contribution. By contrast, for the u momentum sum rule shown in Fig. 1a there are just two parton combinations with a large splitting contribution, namely \(u\bar{u}\) and ug. Another noteworthy feature is the relatively small size of the matching contribution, which is a consequence of our choice \(\nu = \mu \).

Fig. 1
figure 1

Momentum sum rule ratios \(R_u\) and \(R_g\) for the initial model of Sect. 3, evaluated at the scale \(\mu _{\text {min}} = 2.25 \,\mathrm{GeV}\). Shown are the individual contributions from the intrinsic and splitting parts in (17), as well as the full result. The \(\pm 10 \%\) deviations from unity are indicated by a light grey band. Not shown is the separate contribution from the matching term \(F_{\text {match}}\), which is negligible in this case. The remaining plots in this section will follow the same conventions unless explicitly stated otherwise

Fig. 2
figure 2

The momentum sum rule ratio \(R_g\) for different powers n in the phase space factor \(\rho _{a_1 a_2} = (1 - x_1 - x_2)^n\, (1 - x_1)^{-n} \, (1-x_2)^{-n}\) of the intrinsic part (9). The case \(n=2\) is shown in Fig. 1b

Let us investigate at this point the phase space factor \(\rho _{a_1 a_2}\) in (9). In some of the earlier works on DPDs, a simple factor \((1 - x_1 - x_2)\) has been suggested [85,86,87,88], whilst the more recent study [89] argued that a factor \((1 - x_1 - x_2)^2\) is more appropriate. Even higher powers n would be obtained if one were to generalise the Brodsky-Farrar quark counting rules [90, 91] from PDFs to DPDs. Each of these variants leads to a very strong suppression of DPDs in the region where \(x_1 \approx 1\) and \(x_2 \approx 0\) (or vice versa), since in that region the suppression from the phase space factor comes on top of the suppression of the corresponding PDF. As discussed in Sect. 3, it is more appropriate to divide \((1 - x_1 - x_2)^n\) by \((1 - x_1)^n \, (1-x_2)^n\) for a given n in order to remove the phase space suppression in the regions \(x_1 \approx 0\) and \(x_2 \approx 0\). Including this division, we have investigated the momentum sum rules for different values of n and find that best agreement is achieved for \(n = 2\), as is illustrated by the comparison of Fig. 2 with Fig. 1(b).

Turning to the number sum rules, we find that these are violated quite strongly in the initial model, as is illustrated in Fig. 3. The agreement does not improve with other choices of the power n just discussed.

Fig. 3
figure 3

Number sum rule ratios \(R_{a_1 q_{v}}\) for the initial model. The upper plots are for unequal flavors of the two partons, and the lower ones are for equal flavours. The ratio \(R_{d u_v}\) is completely dominated by the intrinsic part of the DPD

The adjustments discussed in the following will improve the situation considerably. Let us at this point note that the number sum rules for equal quark flavours (such as those in the lower row of Fig. 3) can receive a substantial contribution from \(g\rightarrow q \bar{q}\) splitting at the starting scale \(\mu _y\) of \(F_{\text {spl}}\). This contribution is negative for \(a_1 = q\) and positive for \(a_1 = \bar{q}\), given that \(F_{a_1 q_v} = F_{a_1 q} - F_{a_1 \bar{q}}\).

5.2 First iteration: number effects and modified phase space factor

In the first iteration of our model, we implement the same two adjustments that were already made in [77]. To describe these adjustments, it is convenient to specify the ansatz (9) for \(F_{a_1 a_2,\,\text {int}}\) with \(a_1\) and \(a_2\) taking the values \(q_v, \bar{q}, g\) instead of q, \(\bar{q}\), g (with \(q_v\) denoting the linear combination \(q - \bar{q}\)). This switch from quarks and antiquarks to “valence” and “sea” quarks is familiar from the parametrisation of ordinary PDFs.

Following the argumentation in [77], it is natural to change the ansatz for distributions with two valence quark labels so as to take into account “number effects”, i.e. the fact that we have a finite number of valence quarks in the proton, two u and one d. To do this, we set \(F_{u_v u_v, \text {int}}\) to half the value given by (9) and set \(F_{d_v d_v, \text {int}}\) to zero. The latter corresponds to the simple intuition that the probability to find two “valence d quarks” in the proton is nil. The second adjustment argued for in [77] is to modify the phase space factor from the parton independent form in (10) to

$$\begin{aligned} \rho _{a_1 a_2}(x_1, x_2) \,&= (1 - x_1 - x_2)^2 \, \nonumber \\&\times (1 - x_1)^{-2-\alpha (a_2)} \, (1 - x_2)^{-2-\alpha (a_1)} \,, \end{aligned}$$
(22)

with

$$\begin{aligned} \alpha (a) = {\left\{ \begin{array}{ll} 0.5 &{} \quad \hbox { for}\ a = q_v \\ 0 &{} \quad \hbox { for}\ a = \bar{q}, g \end{array}\right. } \end{aligned}$$
(23)

Whilst the original form in (10) satisfies \(0 \le \rho _{a_1 a_2} \le 1\), the phase space factor in (22) becomes greater than 1 when the momentum fraction of a valence parton tends to 0 and the momentum fraction of the other parton (valence or sea) tends to 1. Due to the PDFs in the ansatz (9), the intrinsic part of the DPD still goes to zero in that limit.

With these modifications, we find that the momentum sum rules are further improved, such that for most of the \(x_1\) range, relative deviations are less then \(10\%\). This is illustrated in Fig. 4, which is to be compared with Fig. 1 for the initial model.

Fig. 4
figure 4

Momentum sum rule ratios for the first iteration of our model, taking into account number effects (explained in the second paragraph of Sect. 5.2) and the modified phase space factor given by (22) and (23). The corresponding plots for the original model are shown in Fig. 1

A more significant improvement is obtained for the number sum rules, as can be seen from the comparison of Fig. 5 with Fig. 3. The modified phase space factor yields a weaker suppression for valence partons at large momentum fractions of the other parton. This largely mitigates the steep decrease of the sum rule ratios with \(x_1\) in the initial model. Taking into account number effects strongly reduces the value of \(R_{u u_v}\) at low \(x_1\), which is much too high in Fig. 3c.

Fig. 5
figure 5

The same number sum rule ratios as in Fig. 3, but for the first iteration of our model

5.3 Second iteration: parameter scan for the phase space factor

Given that there is no strong motivation to take the particular value 0.5 for \(\alpha (u_v)\) and \(\alpha (d_v)\) in (23), it is natural to explore whether tuning these parameters can improve the sum rule ratios further. We have therefore performed a parameter scan over these two powers. To quantify the degree to which the sum rules are fulfilled, we introduce

$$\begin{aligned} \delta&= \int \limits _{x_{\text {min}}}^{0.8} \text {d} x_1 \, \left| R(x_1) - 1 \right| \end{aligned}$$
(24)

as a quality measure for each sum rule ratio R, where \(x_{\text {min}} = 5 \times 10^{-5}\). A global quality measure is then the sum \(\delta _{\text {gl}}\) of these measures over all sum rules, excluding of course the cases for which \(R_{a_1 q_{v}}\) cannot be defined, as specified below (20).

Notice that in (24) we have taken an upper integration limit of \(x_1 = 0.8\). This is because for very high \(x_1\), we consider even large relative deviations from the DPD sum rules to be acceptable: DPDs in this region are expected to be very small and should hence not play any role in cross sections that are of measurable size.

The values of \(\delta _{\text {gl}}\) obtained in our parameter scan over \(\alpha (u_v)\) and \(\alpha (d_v)\) are shown in Fig. 6. A minimum is reached at

$$\begin{aligned} \alpha (a) = {\left\{ \begin{array}{ll} 0.63 &{} \quad \hbox { for}\ a = u_v \\ 0.49 &{} \quad \hbox { for}\ a = d_v \\ 0 &{} \quad \hbox { for}\ a = \bar{q}, g \\ \end{array}\right. } \end{aligned}$$
(25)

which we take as the second iteration of our model.

Fig. 6
figure 6

The quality measure \(\delta _{\text {gl}}\) defined after (24), evaluated as a function of the powers \(\alpha (u_v)\) and \(\alpha (d_v)\) in the phase space factor. The right panel gives a zoom into the parameter space shown on the left

As illustrated in Fig. 7a, the momentum sum rules are not strongly affected by this change of parameters. The same holds for number sum rules that do not involve u quarks, which is not surprising because \(\alpha (u_v)\) has significantly changed whereas \(\alpha (d_v)\) has not. Furthermore, we see in Fig. 7b that the change in \(R_{u d_v}\) is very small. By contrast, all number sum rules for \(u_v\) are significantly improved in the range \(x_1 \le 0.8\), as illustrated in the lower plots of Fig. 7.

Fig. 7
figure 7

Sum rule ratios in the first and second interactions of the model, which respectively correspond to the powers (23) and (25) in the phase space factor

One may wonder whether tuning other parameters in our model can lead to further improvements. Candidates for such an endeavour are the parameter \(y_{\text {max}}\) in the starting scale \(\mu _y\) of \(F_{\text {spl}}\), as well as the widths \(h_{a_1 a_2}\) of the Gaussian damping factor, which appears in both \(F_{\text {int}}\) and \(F_{\text {spl}}\). We find, however, that changing these parameters does not lead to a significant decrease of \(\delta _{\text {gl}}\), and that the minimum of \(\delta _{\text {gl}}\) is achieved for parameter values very close to those specified in Sect. 3. We hence leave these parameters at their initial values.

Notice that the Gaussian factor in the intrinsic part (9) of the DPD is normalised such that its integral over all \(\varvec{y}\) gives unity. Restricting this integral to \(y \ge b_0/\nu \) has little effect, which explains why a change of \(h_{a_1 a_2}\) has almost no impact on the contribution of \(F_{\text {int}}\) to the sum rules.

5.4 Third iteration: modifying the splitting part at large distances

After several modifications to the intrinsic part \(F_{\text {int}}\) of our DPD model, we now turn to the splitting part \(F_{\text {spl}}\). While the latter can be computed for perturbatively small y, its form at large distance y needs to be modelled. We now modify the initial ansatz (14) and multiply \(F_{\text {spl,pt}}\) by the superposition of two Gaussians in y, with a relative weight depending on the momentum fractions:

$$\begin{aligned}&{\tilde{F}}_{a_1 a_2,\text {spl}}(x_1,x_2, \varvec{y}; \mu _y) \nonumber \\&\quad = F_{a_1 a_2,\text {spl,pt}}(x_1,x_2, \varvec{y}; \mu _y) \exp \left[ - \frac{y^2}{4h_{a_1 a_2}} \right] \nonumber \\&\qquad \times \left\{ 1 + \left( \exp \left[ \frac{y^2}{4h_{a_1 a_2}^{*}} \right] - 1 \right) g_{a_1 a_2}(x_1 + x_2) \right\} . \end{aligned}$$
(26)

The factor multiplying \(F_{\text {spl,pt}}\) can be rewritten as the sum of two Gaussians, one multiplied with \(1 - g_{a_1 a_2}(x_1 + x_2)\) and the other multiplied with \(g_{a_1 a_2}(x_1 + x_2)\). For the new width parameters \(h_{a_1 a_2}^{*}\) we make the same ansatz as we did for \(h_{a_1 a_2}\), i.e. we set \(h_{a_1 a_2}^{*} = h_{a_1}^{*} + h_{a_2}^{*}\). We take values

$$\begin{aligned} h_{g}^{*}&= 3.015 \,\mathrm{GeV}^{-2}, \quad h_{q}^{*} = h_{\bar{q}}^{*} = 5.375 \,\mathrm{GeV}^{-2} \end{aligned}$$
(27)

such that the Gaussian factor \(\exp \bigl [ y^2/(4 h_{a_1 a_2}^{*}) - y^2 /(4 h_{a_1 a_2}) \bigr ]\) multiplying \(g_{a_1 a_2}\) is approximately the same for all parton combinations. Admittedly, the form (26) is rather special among all possible functions that have the correct limit at small y. Clearly, the requirement of fulfilling the sum rules is not nearly enough to determine the functional form of DPDs at large y, so that a particular ansatz must be made. Our choice has the feature of introducing a nontrivial interplay between the dependence on y and on the parton momentum fractions, controlled by a one-variable function \(g_{a_1 a_2}(x_1 + x_2)\) for each LO splitting process \(a_0 \rightarrow a_1 a_2\). We will find that this is an adequate degree of complexity, in the sense that the sum rule constraints are sufficient to determine this function.

Whilst strict positivity of \({\tilde{F}}_{\text {spl}}\) requires \(g_{a_1 a_2} (x_1+x_2) > 0\), the procedure described below yields negative values of this function in some cases. We checked that the resulting full DPDs \(F_{\text {int}} + {\tilde{F}}_{\text {spl}}\) are still positive in the range of \(x_1, x_2\) and y covered by our DPD grids. This holds for all scales \(\mu \) on our grid, from the starting scale \(\mu _{\text {min}} = 2.25 \,\mathrm{GeV}\) up to the highest value \(\mu = 172 \,\mathrm{GeV}\).

Let us first consider the splitting \(g\rightarrow q \bar{q}\), where q takes one of the values u, d, s. This splitting feeds into the number sum rules for equal quark flavours, which at this stage are least well satisfied. Judging the impact of the function \(g_{a_1 a_2}(x_1+x_2)\) is complicated by the fact that the ansatz (26) for \({\tilde{F}}_{\text {spl}}\) is made at the y dependent scale \(\mu _y\) and needs to be evolved to the scale \(\mu _{\min }\) where we evaluate the sum rules. For definiteness, we consider the sum rule

$$\begin{aligned} (N_{q_v} \! + 1) \, f_{\bar{q}}(x_1;\mu _{\text {min}}) \,&= \int \limits _{0}^{1 - x_1} \!\! \text {d} x_2 \; F_{\bar{q} q_v,\text {match}}(x_1,x_2; \mu _{\text {min}}) \nonumber \\&\quad + \int \limits _{0}^{1 - x_1} \!\! \text {d} x_2 \int \text {d}^2\varvec{y} \; \Bigl [ F_{\bar{q} q_v,\text {int}}(x_1,x_2,\varvec{y};\mu _{\text {min}}) \nonumber \\&\quad + {\tilde{F}}_{\bar{q} q_v,\text {spl}}(x_1,x_2,\varvec{y};\mu _{\text {min}}) \Bigr ] \,, \end{aligned}$$
(28)

where here and in the following it is understood that the integrals over \(\varvec{y}\) are restricted to \(y \ge b_0/\nu = b_0/\mu _{\text {min}}\). To simplify the determination of \(g_{a_1 a_2}(x_1 + x_2)\), we make two approximations. Firstly, we use that for small y the initial and modified splitting model do not differ significantly, i.e.

$$\begin{aligned} {\tilde{F}}_{\bar{q} q_v, \text {spl}}(x_1,x_2,\varvec{y};\mu _{\text {min}}) \approx F_{\bar{q} q_v, \text {spl}}(x_1,x_2,\varvec{y};\mu _{\text {min}}) \,. \end{aligned}$$
(29)

Secondly, we recall that for large y the scale \(\mu _y\) is close to \(\mu _{\text {min}}\), so that we have

$$\begin{aligned} {\tilde{F}}_{\bar{q} q_v, \text {spl}}(x_1,x_2,\varvec{y};\mu _{\text {min}}) \approx {\tilde{F}}_{\bar{q} q_v, \text {spl}}(x_1,x_2,\varvec{y};\mu _{y}) \,. \end{aligned}$$
(30)

Combining both approximations gives

$$\begin{aligned}&\int \text {d}^2\varvec{y}\, {\tilde{F}}_{\bar{q} q_v, \text {spl}}(x_1,x_2,\varvec{y};\mu _{\text {min}}) \nonumber \\&\quad \approx \int \text {d}^2\varvec{y}\; \varTheta (y_{\text {sep}}-y)\, F_{\bar{q} q_v, \text {spl}}(x_1,x_2,\varvec{y};\mu _{\text {min}}) \nonumber \\&\qquad + \int \text {d}^2\varvec{y}\; \varTheta (y-y_{\text {sep}})\, {\tilde{F}}_{\bar{q} q_v, \text {spl}}(x_1,x_2,\varvec{y};\mu _{y}) \,, \end{aligned}$$
(31)

where we use (29) below \(y_\text {sep}\) and (30) above. Taking \(y_{\text {sep}} = 1 \,\mathrm{GeV}^{-1}\) ensures that (30) is rather well fulfilled, as \(\mu _{\text {min}}\) and \(\mu _{y}\) differ by at most \(12 \%\). We will find that \(|g_{q \bar{q}}| < 12\), which corresponds to a relative discrepancy below \(30\%\) between the l.h.s. and the r.h.s. of (29). While this may not seem to be very precise, it will turn out to be sufficient for improving the sum rules significantly.

Using (26) and (31), the sum rule (28) can be approximated as

$$\begin{aligned} k_{\bar{q}}(x_1)&\underset{\text {def}}{=} (N_{q_v} + 1) \, f_{\bar{q}}(x_1;\mu _{\text {min}}) \nonumber \\&\quad - \int \limits _{0}^{1-x_1}\text {d}x_2 \; F_{\bar{q} q_v}(x_1, x_2;\mu _{\text {min}}) \nonumber \\&= \int \limits _{0}^{1-x_1}\text {d}x_2 \int \text {d}^2\varvec{y} \; \varTheta (y-y_{\text {sep}})\, F_{q \bar{q}, \text {spl}}(x_1,x_2,\varvec{y};\mu _{y}) \, \nonumber \\&\quad \times h_{q \bar{q}}(y)\, g_{q \bar{q}} (x_1 + x_2) \,, \end{aligned}$$
(32)

where \(F_{\bar{q} q_v}(x_1, x_2;\mu )\) denotes the full DPD (17) in the second iteration of our model and we have abbreviated

$$\begin{aligned} h_{q \bar{q}}(y)&= \exp \left[ y^2 /(4h_{q \bar{q}}^{*}) \right] - 1. \end{aligned}$$
(33)

Here we used that at the scale \(\mu _y\) one has \({F}_{\bar{q} q_v, \text {spl}} = {F}_{\bar{q} q, \text {spl}} = {F}_{q \bar{q}, \text {spl}}\) and a corresponding relation for \({\tilde{F}}_{a_1 a_2, \text {spl}}\). Shifting the integration variable on the r.h.s. of (32) from \(x_2\) to \(x = x_1 + x_2\) gives

$$\begin{aligned} k_{\bar{q}}(x_1)&= \int \limits _{1}^{x_1} \text {d}x \; K_{q \bar{q}}(x_1, x) \, g_{q \bar{q}}(x) \end{aligned}$$
(34)

with

$$\begin{aligned} K_{q \bar{q}}(x_1, x)&= - \int \text {d}^2{\varvec{y}} \; \varTheta (y-y_{\text {sep}}) \, \nonumber \\&\quad \times F_{q \bar{q}, \text {spl}}(x_1,x - x_1,{\varvec{y}};\mu _{y}) \, h_{q \bar{q}}(y) \,. \end{aligned}$$
(35)

We recognise in (34) a Volterra equation of the first kind [92]. We discretise this equation by taking both \(x_1\) and x on the grid for DPDs discussed in Sect. 4. The integral over x is turned into a sum using a simple trapezoidal rule in the variable \(u = \log (x/(1-x))\). The result is a linear system of equations

$$\begin{aligned} ( k_{\bar{q}} )_i = \sum _{j} ( K_{q \bar{q}} )_{i j} \, ( g_{q \bar{q}} )_j \end{aligned}$$
(36)

with an upper diagonal matrix \(K_{i j}\), which is readily solved using Gauss–Jordan elimination.

In order to have an analytic formulation for our model, we fit the obtained discrete values of \(g_{a_1 a_2}(x)\) to the form

$$\begin{aligned} g(x)&= A + B x^b + C \,x^{c_1} (1-x)^{c_2} \,, \end{aligned}$$
(37)

for each of the splittings \(g\rightarrow u \bar{u}, g\rightarrow d \bar{d}\), and \(g\rightarrow s \bar{s}\). This reproduces the general shape of the numerical results rather well, except for some deviations at very large x. The resulting functions are shown in Fig. 8a–c, and the fitted parameters are given in Table 1.

Fig. 8
figure 8

Modification functions \(g_{a_1 a_2}(x)\) for the \(g \rightarrow q \bar{q}\) and \(g \rightarrow g g\) splittings. For each channel we display the fit to the form (37) and the direct solution of the discretised Volterra equation (36). The direct solution is shown as a dashed curve with linear interpolation between each data point

Table 1 Parameters of the modification functions \(g_{a_1 a_2}\) defined by (26) and (37)

With these modified \(g \rightarrow q \bar{q}\) splittings, the agreement of the model with the \(\bar{q} q_v\) number sum rules improves significantly, as can be seen in Fig. 9b–d. Remarkably, the modification of the \(g \rightarrow u\bar{u}\) splitting improves not only the sum rule for \(\bar{u} u_v\) but also one for \(u u_v\), as seen in Fig. 9a.

Fig. 9
figure 9

Change of the number sum rules for equal flavours due to the modification of the \(g \rightarrow q \bar{q}\) splittings at large y

Fig. 10
figure 10

Change of the sum rule ratio (38) due to the modification of the \(g \rightarrow d \bar{d}\) splitting at large y. The sum rule is exactly satisfied if \({\tilde{R}}_{d d_{v}} = 0\)

Fig. 11
figure 11

Change of momentum sum rules due to the modification of the \(g \rightarrow q \bar{q}\) splittings at large y

At this point, we recall that the ratio \(R_{a_1 q_{v}}\) is undefined for \(F_{d d_v}\). In order to quantify how well the number sum rule for this distribution is satisfied, we introduce the modified ratio

$$\begin{aligned} {\tilde{R}}_{d d_{v}}(x_1; \mu )&= \frac{\int \mathrm {d}x_2\, F_{d d_{v}}(x_1, x_2; \mu )}{f_{d}(x_1; \mu )} \,, \end{aligned}$$
(38)

in which the zero prefactor in the denominator of (19) has been replaced with unity. The ratio \({\tilde{R}}_{d d_{v}}\) should be close to zero. We see in Fig. 10 that this is indeed the case: the modification of the \(g\rightarrow d \bar{d}\) splitting improves not only the sum rule for \(F_{\bar{d} d_v}\) but also the one for \(F_{d d_v}\). Altogether, we have reached a satisfactory agreement of our model with all number sum rules.

The modification of the \(g\rightarrow q\bar{q}\) splitting also affects the quark momentum sum rules, as illustrated in Fig. 11. In the cases shown in the figure, the agreement of the momentum sum rule becomes slightly worse, whereas the changes in the remaining cases are insignificant. One could improve \(R_{\bar{u}}\) and \(R_{\smash {\bar{d}}}\) by modifying the \(g \rightarrow g \bar{u}\) and \(g \rightarrow g \bar{d}\) splittings, but this would also affect the number sum rules ratios \(R_{g u_v}\) and \(R_{g \smash {d_v}}\). We refrain from such an exercise, considering that the agreement shown in Fig. 11 is still satisfactory.

The sum rule ratio that is farthest away from 1 after these improvements is the one for the gluon momentum sum rule. This can be adjusted by modifying the \(g\rightarrow g g\) splitting at large y in the same way as discussed for \(g\rightarrow q \bar{q}\). The parameters of the modification function \(g_{g g}(x)\) are given in Table 1, and the function itself is shown in Fig. 8d. The resulting improvement of the sum rule can be seen in Fig. 12, and we have checked that none of the other sum rule ratios is adversely affected by this final modification of our model.

Fig. 12
figure 12

Change of the gluon momentum sum rule due to the modification of the \(g \rightarrow g g\) splitting at large y

Fig. 13
figure 13

Momentum sum rule ratios \(R_u\) and \(R_g\) for the final iteration of our model. The corresponding plots for the initial model are in Fig. 1 and those for the first iteration in Fig. 4. Not shown is the separate contribution from the matching term \(F_{\text {match}}\), which is negligible in this case

Let us finally take a look at the relative importance of intrinsic and splitting contributions to the sum rules in the final iteration of our model. In Figs. 13 and 14, we show the situation for the same sum rules that were shown in Figs. 1 and 3 for our initial model. We find that for \(R_u\), \(R_{d u_v}\), and \(R_{g d_v}\) the main change between the initial and final versions is due to the intrinsic part. By contrast, for \(R_g\) and \(R_{u u_v}\), and \(R_{\smash {\bar{d}} d_v}\) there are important changes both in the intrinsic and in the splitting parts, where the latter are restricted to the small \(x_1\) region in the case of \(R_{u u_v}\). That these sum rules are strongly affected by the splitting modification at large y was already seen in Figs. 9a, c and 12.

In the final iteration of our model, the sum rules that receive positive or negative splitting contributions larger than \(20\%\) in at least part of the \(x_1\) range are the momentum sum rules for sea quarks (\(\bar{u}\), \(\bar{d}\), \(\bar{s}\), and s) and the number sum rules for equal flavours (\(q q_v\) and \(\bar{q} q_v\)). Compared with the initial model, the contribution of the \(g \rightarrow g g\) splitting to \(R_{g}\) has strongly decreased due to its modification at large y.

Fig. 14
figure 14

Number sum rule ratios \(R_{a_1 q_{v}}\) for the final iteration of our model. The corresponding plots for the initial model are in Fig. 3 and those for the first iteration in Fig. 5. The ratio \(R_{d u_v}\) is completely dominated by the intrinsic part of the DPD

6 Scale dependence

So far, we have evaluated the sum rules for DPDs and PDFs at the scale \(\mu = \mu _{\text {min}} = 2.25 \,\mathrm{GeV}\), and with the matching between position and momentum space DPDs computed for a cutoff scale \(\nu = \mu \). In this section, we investigate how the sum rules change if these scales are chosen differently.

6.1 Renormalisation scale

As shown in [77], the DPD sum rules are preserved under LO evolution. If they are approximately valid at some scale, one may expect that they are still approximately valid when the DPDs and PDFs are evolved to a different scale. We verified that this is indeed the case for the DPD model developed in the previous section. This is illustrated in Fig. 15 for momentum sum rules and in Fig. 16 for number sum rules. We evolved the distributions from \(\mu _{\text {min}}\) to \(\mu = 144.6 \,\mathrm{GeV}\), which is a point on our \(\mu \) grid. The DPD matching at the high scale is evaluated with \(\nu = \mu \).

Fig. 15
figure 15

Comparison of the gluon momentum sum rule ratio for the final iteration of our model at two different scales. The contribution of the matching term is small and not shown

Fig. 16
figure 16

Comparison of number sum rule ratios for the final model at low and high scales. The contribution of the matching term is small and not shown

In the case of the g momentum and the \(g u_v\) number sum rule, we notice that the individual contributions from \(F_{\text {int}}\) and \(F_{\text {spl}}\) to the sum rule ratios change considerably under evolution, while the sum of all contributions remains nearly the same. This highlights the relevance of the perturbative splitting mechanism for ensuring the scale independence of the DPD sum rules, which was pointed out in a number of different studies [70, 71, 78].

In the figures for the \(\bar{u} u_v\) sum rule, we observe that the oscillatory behaviour of \(R_{\bar{u} u_v}\), which is a consequence of the modified splitting term in our model, is less pronounced after evolution to \(\mu = 144.6 \,\mathrm{GeV}\). This is a typical feature of scale evolution, which tends to “wash out” details of distributions when going from low to high scales.

6.2 Cutoff scale

The matching relation given in (5) is only accurate up to higher orders in \(\alpha _s\) and up to power corrections in \(\varLambda / \nu \). The higher order analysis in [82] reveals that the term of order \(\alpha _s^n\) in the matching relation is accompanied by up to n powers of \(\log (\mu ^2 / \nu ^2)\). Varying \(\nu \) around its “natural value” \(\mu \) thus provides an estimate of higher order and power corrections in the matching relation. Following a widespread practice for scale variations, we vary \(\nu \) between \(\mu /2\) and \(2 \mu \), taking again \(\mu = \mu _{\text {min}}\). The resulting variation of the sum rule ratios for our final DPD model is illustrated in Fig. 17.

Fig. 17
figure 17

Cutoff scale dependence of sum rule ratios, evaluated at \(\mu = \mu _{\text {min}}\) for the final iteration of our model. The solid curve is for \(\nu = \mu \), and the band corresponds to a variation of \(\nu \) between \(\mu /2\) and \(2\mu \)

We find the \(\nu \) dependence to be moderate, with changes of \(10\%\) or less in the sum rule ratios in almost all cases. These variations are hence of the same order as the agreement of the sum rule ratios with 1. The theoretical uncertainties reflected by the \(\nu \) variation also suggest that it is of limited value to tune the sum rule ratios obtained for \(\nu =\mu \) much further than we have done.

The only sum rule ratio with a larger \(\nu \) dependence is \(R_{\bar{s} s_v}\), shown in Fig. 17d, which varies up to \(20\%\). To understand this, we note that the \(\nu \) dependence of the splitting and matching terms in (17) is stronger than the \(\nu \) dependence of the intrinsic term. The latter gives an important contribution to all sum rule ratios, except for \(R_{\bar{s} s_v}\), where within our model it is strictly zero.

One might wonder whether a change of the scale \(\nu \) could systematically improve the agreement of our initial model with the sum rules. The examples in Fig. 18 show that this is not the case: the \(\nu \) variation is not able to bring the ratios \(R_{\bar{u}}\) or \(R_{\bar{u} u_v}\) close to 1 for all \(x_1 \le 0.8\). We also note that the change of the sum rules with \(\nu \) is roughly of the same size in our initial and final models. This justifies our choice of \(\nu = \mu \) for the tuning of the model described in the previous section.

Fig. 18
figure 18

As Fig. 17, but for the initial DPD model described in Sect. 3

7 Conclusions

The number and momentum sum rules for DPDs put important constraints on DPD parametrisations. We have shown that one can construct physically plausible models for DPDs in position space that approximately fulfil these constraints. Our starting point was the DPD ansatz used in [45], the construction of which ensures the correct small y limit given by LO perturbation theory, but does not take into account DPD sum rule constraints at all. That ansatz was then sequentially modified: we started by adapting the modifications discussed in [77] to our case and furthermore tuned some model parameters, using parameter scans and a measure that quantifies how well the sum rules are globally satisfied. In the last step, we modified the form of the parton splitting term at large y, where perturbation theory is not applicable and this term has to be regarded as part of the non-perturbative model. Whilst the specific form of that modification was motivated more by practical considerations than by physical intuition, our exercise shows that one can adapt position space DPDs up to the point where all momentum and number sum rules are satisfied within about \(10\%\) accuracy. An exception to this statement is the region of parton momentum fractions \(x > 0.8\), where even ordinary PDFs are poorly known and where double parton scattering processes will have tiny cross sections.

We verified that the approximate validity of the sum rules remains stable under evolution from low to high scales. Furthermore, we find that the sum rules are robust under variation of the cutoff scale \(\nu \), which appears when converting DPDs from position to momentum space. The largest \(\nu \) variation is observed for the number sum rule that involves only strange quarks, where we see effects of up to \(20\%\). Since the \(\nu \) variation reflects in particular the size of uncomputed higher orders in the parton splitting, and since we vary \(\nu \) around a central value of \(2.25 \,\mathrm{GeV}\), we find a scale variation of this size not too surprising. One can expect that the inclusion of perturbative splitting terms at NLO, which have been computed in [82], will improve the situation.

For any given DPD model and PDF set, one can verify to which extent the sum rules are fulfilled. If they are violated significantly, one can unfortunately not fully deduce the region of variables \(x_1, x_2, y\) in which the DPDs are unreliable, since the sum rules are integrated over one momentum fraction and over y. If, however, one has a given functional form of DPDs and needs to choose its parameters, the sum rules can be of more direct use. Whilst imposing that they be satisfied exactly will in general be a condition that cannot be fulfilled, the type of quality measure for the sum rules we introduced in Sect. 5.3 provides a simple quantitative criterion for the theoretical consistency of the model. In a more sophisticated treatment, one should also take into account the uncertainties on the PDFs, which appear on the r.h.s. of the sum rules and typically are also an input to the DPD model.

Whilst perturbative calculations for double parton scattering have been pushed to higher orders in recent years, the construction of more reliable DPD models remains an outstanding task. The present work shows that two major theoretical constraints on DPDs, namely the small y limit and the sum rules (where y is integrated over) can be satisfied simultaneously at least in an approximate way. Of course, this theoretical input alone is not sufficient to pin down the DPDs, and ultimately, the predictions obtained with any DPD model should be compared with experiment. This will be a huge endeavour and must be left to future work.