Theoretical considerations on multiparton interactions in QCD

We investigate several ingredients for a theory of multiple hard scattering in hadron-hadron collisions. Issues discussed include the space-time structure of multiple interactions, their power behavior, spin and color correlations, interference terms, scale evolution and Sudakov logarithms. We discuss possibilities to constrain multiparton distributions by lattice calculations and by connecting them with generalized parton distributions. We show that the behavior of two-parton distributions at small interparton distances leads to problems with ultraviolet divergences and with double counting, which requires modification of the presently available theoretical framework.


Introduction
In hadron-hadron collisions at very high energies, several partons in one hadron can scatter on partons in the other hadron and produce particles with large transverse momentum or large invariant mass. The effects of such multiparton interactions average out in sufficiently inclusive observables, which can be described by conventional factorization formulae that involve a single hard scattering. However, multiple interactions do change the structure of the final state. They have been seen at the Tevatron [1,2] and are expected to be important for many analyses at LHC [3,4,5].
The phenomenology of multiparton interactions relies on models that are physically intuitive but involve significant simplifications. A brief review of the subject can be found in [6] and an overview of implementations in event generators in [7]. So far a systematic description of multiple interactions in QCD remains elusive. In this letter we report on some steps towards this goal. We will see to which extent the cross section formulae currently used to calculate multiplescattering processes can be justified in QCD and to which extent they need to be completed. At a more fundamental level, we find that there is an unsolved problem of double counting between single and multiple hard scattering.
We consider the case of two hard scatters at parton level. For definiteness we analyze the production of two electroweak gauge bosons with large invariant mass (γ * , Z or W ) and indicate which of our results can be generalized to other final states such as jets. Since the main interest in multiparton interactions is driven by the need to understand details of the final state, we keep the transverse momenta of the produced gauge bosons differential, rather than integrating over them. For the production of a single boson there is a powerful theoretical description based on transversemomentum dependent parton densities [8,9,10,11], which we aim to extend to the case of multiparton interactions. Integrating over transverse momenta gives the more familiar formulation in terms of collinear parton distributions. Detailed derivations of our results and further discussion will be given in [12].

Tree-level analysis
We begin by sketching the derivation of the cross section formula for double parton scattering at tree level. For definiteness we take two colliding protons and consider the case where the two partons coming from one of the protons are quarks. The corresponding graph is shown in Fig. 1a, which also specifies our assignment of momentum variables.
We use light-cone coordinates v ± = (v 0 ± v 3 )/ √ 2 and v = (v 1 , v 2 ) for any four-vector v and choose a reference frame where p moves fast to the right andp fast to the left, with transverse momenta p =p = 0. We consider kinematics where the invariant masses of the bosons are large and where their transverse momenta are much smaller, i.e. we require q T ≪ Q with q 2 1 ∼ q 2 2 ∼ q 2 T and q 2 1 ∼ q 2 2 ∼ Q 2 . The lower blob in Fig. 1a is described by the correlation function for two quarks in a proton, Here T denotes time ordering andT anti-time ordering, as appropriate for fields in the scattering amplitude or its conjugate. The association between momenta and field positions becomes clear if one rewrites z 1 k 1 + z 2 k 2 − yr = (y + 1 2 z 1 )(k 1 − 1 2 r) + 1 2 z 2 (k 2 + 1 2 r) + 1 2 z 2 (k 2 − 1 2 r) − (y − 1 2 z 1 )(k 1 + 1 2 r) in the Fourier exponent. For now we gloss over the flavor and color structure of Φ, which will be discussed in Sections 5 and 6. Throughout this work we consider unpolarized protons, so that an average over the proton spin in (1) is understood. The correlation functionΦ for two antiquarks in a proton is defined in analogy to (1), with interchanged roles of the q andq fields. The contribution of graph 1a to the cross section then reads where H i (k i ,k i , r,r) is the squared amplitude for each of the two hard-scattering processes. Repeated Dirac indices α 1 , β 1 , etc. are to be summed over. The sta-tistical factor S is 2 if the final states of the two hard scatters are identical and 1 otherwise.
To proceed, we make the same approximations as in processes with a single hard scattering: 1. use that the minus-momenta of right-moving partons and the plus-momenta of left-moving partons are small compared with the large scale Q. The constraint δ (4) (r+r) forces r + andr − to be small, although by general scaling arguments they could both be large, and the constraint δ (4) we thus find that the parton momentum fractions (k + i ± 1 2 r + )/p + ≈ x i and (k − i ± 1 2r − )/p − ≈x i are fixed by the final-state kinematics. Note that this does not hold for the transverse parton momenta, since q i receives contributions from both k i ± 1 2 r andk i ± 1 2r . 2. neglect small plus-and minus-components, as well as all transverse momenta in the squared partonlevel amplitudes H i , which then depend only on q + i and q − i , i.e. on quantities of order Q. After this approximation, the integrations over k − i and r − in (2) only concern the factor Φ, and those over k + i andr + only concern the factorΦ. 3. perform a Fierz transformation for the index pairs (α i , β i ) and (ᾱ i ,β i ) in (2). The correlation function Φ is then multiplied with Dirac matrices that can carry Lorentz indices. One has to retain only those terms with the maximum number of plus indices, since the momenta on which Φ depends are largest in the plus-direction. ForΦ the dominating terms have the maximum number of minus indices.
After these steps we find that the two-quark distributions required to describe the graph 1a read with bilinear operators Here a = q, ∆q, δq labels the quark polarization and with j = 1, 2. The operators in (5) are well-known from the definitions of single-parton densities for unpolarized, longitudinally polarized and transversely polarized quarks, see e.g. [13,14]. Analogous definitions hold for antiquarks and for a left-moving hadron. The cross section in (2) can finally be written as where here and in the following we write F (x i , k i , r) instead of F (x 1 , x 2 , k 1 , k 2 , r) for brevity. The partonlevel cross sections are given bŷ with quark spin projectors P q (k) = 1 2 k + γ − , P ∆q (k) = 1 2 γ 5 k + γ − , P j δq (k) = 1 2 γ 5 k + γ − γ j and corresponding antiquark spin projectors Pā. It is understood that for each a i = δq both F a1,a2 andσ i,aā carry extra indices j associated with the direction of the transverse quark polarization. Corresponding remarks hold for a i = δq.
The difference r of transverse parton momenta can be replaced by the Fourier conjugate position y, both in the distributions and in the cross section Integration of the cross section over q 1 and q 2 leads to collinear (i.e. transverse-momentum integrated) twoparton densities The corresponding cross section formula is the basis for the phenomenology of multiple interactions and has been used for a long time. It was derived in [15] for scalar partons and in [16] for quarks, in ways similar to the one we just sketched. As we will discuss in Section 13 the integral in (11) diverges logarithmically at order α s and requires appropriate regularization. Up to this caveat, which also applies to single-parton densities, F (x i , y) gives the probability for finding two partons with momentum fractions x 1 and x 2 at relative transverse distance y in a proton. By contrast, F (x i , k i , y) depends on both transverse-momentum and transverse-position arguments for the quarks and does not admit a probability interpretation due to the uncertainty principle. Instead, F (x i , k i , y) has the structure of a Wigner distribution [17] for the transverse degrees of freedom: the integral (11) gives the probability to find two partons at transverse distance y, whereas the integral gives the probability to find two partons with transverse momenta k 1 and k 2 . A similar discussion for generalized parton distributions can be found in [18]. Using (5) and (9) we can identify k 1 and k 2 as the "average" transverse momenta of the two partons and y as their "average" transverse distance, where the "average" refers to the scattering amplitude and its complex conjugate (see Fig. 1a). In this sense the cross section formula (10) describes two hard-scattering processes where a quark and an antiquark with average transverse momenta k i andk i annihilate into a gauge boson with transverse momentum q i = k i +k i . The two annihilation processes occur at an average transverse distance y, which equals the average transverse distance of the two quarks in one proton and of the two antiquarks in the other. If the cross section is integrated over q 1 and q 2 , the specification of "average" for the distance y can be dropped.
It is gratifying to find such an intuitive physical interpretation, which we will further develop in Section 4. On the other hand, we emphasize that (10) and its q i -integrated version can be derived from Feynman graphs using standard hard-scattering approximations, without any need to appeal to classical or semi-classical arguments.
One can also Fourier transform the two-parton distributions w.r.t. the transverse momenta k i , This form is most suitable for the resummation of Sudakov logarithms, as is well-known for single-parton distributions [8]. Indeed, the cross section formula is reminiscent of the one for single Drell-Yan production [19]. Up to terms of order α s related to the regularization of logarithmic divergences, the collinear distributions F (x i , y) are obtained from F (x i , z i , y) by setting z i = 0. We note that by taking the complex conjugate of (9) one easily finds that F (x i , k i , y) is real valued, whereas F (x i , k i , r) and F (x i , z i , y) can have complex phases. The graph in Fig. 1a involves a two-quark distribution F a1,a2 in proton p. The cross section receives further contributions where the two partons in proton p are both antiquarks, or where one is a quark and the other an antiquark. The definitions of quark-antiquark distributions F a1,ā2 and Fā 1 ,a2 and the associated cross section formulae are close analogs of the expressions given above. There are however further contributions, which will be discussed in Section 5.
The preceding results can be generalized to hardscattering processes initiated by gluons, such as gluongluon fusion into a Higgs boson via a top-quark loop. Two-gluon distributions are defined as in (9), but with an extra factor of 1/(x 1 p + x 2 p + ) on the r.h.s. and with operators that are bilinear in the gluon field strength G µν , where a = g, ∆g, δg are polarization labels and The operators O g and O ∆g appear in the usual densities for unpolarized and longitudinally polarized gluons. O l l ′ δg describes linear gluon polarization (or equivalently gluon helicity flip by two units) and has been discussed in [20,21].

Power behavior
A pair of electroweak gauge bosons can be produced by two hard scatters, but also by a single one. An example graph is shown in figure 1b, and the corresponding cross section formula reads where x = x 1 + x 2 ,x =x 1 +x 2 andσ is the cross section for qq annihilation into two gauge bosons. Dimensional analysis of (7) and (16) reveals that for both the single and double hard-scattering mechanisms. Here the small scale Λ 2 represents q 2 T or the scale of non-perturbative interactions, whichever is larger. To obtain (17) one uses that parton distributions do not depend on the hard scale Q 2 (except for logarithms, which are discarded when counting powers), whereas hard-scattering cross sections are independent of Λ 2 . We thus obtain an important result: multiple hard scattering is not power suppressed in cross sections that are sufficiently differential in transverse momenta.
The situation changes when one integrates over q 1 and q 2 . In the double-scattering mechanism both transverse boson momenta are restricted to order Λ since they result from the transverse momenta of the annihilating partons. For a single hard scattering one has |q 1 + q 2 | ∼ Λ, whereas the individual transverse boson momenta can be as large as Q. Due to this phase space effect one obtains In the transverse-momentum integrated cross section multiple hard scattering is thus power suppressed. This is in fact required for the validity of the usual collinear factorization formulae, which describe only the single-hard-scattering contribution. The power behavior just discussed remains the same for parton-level processes initiated by gluons instead of quarks, and for final states other than gauge bosons.

Impact parameter
The distributions F (x i , z i , y) depend on spatial transverse coordinates for the quarks but still refer to a proton with definite (zero) transverse momentum. A representation purely in impact parameter space can be obtained using the methods of [22,23,24], where impact parameter densities for a single parton are constructed from generalized parton distributions. To this end we first define non-forward distributions F (x i , z i , y; ∆) as in (12) but between proton states p + , 1 2 ∆| and |p + , − 1 2 ∆ with different transverse momenta. Introducing the wave packet which describes a proton with definite transverse position b, one can show that with The difference d in the transverse positions of the two proton states is a consequence of Lorentz invariance, see [24]. Integrating F (x i , z i , y; b) over b one recovers the distributions F (x i , z i , y), so that the cross section (13) can be cast into the form which has a simple geometric interpretation in impact parameter space. Taking the average of transverse positions in the amplitude and its conjugate as in Section 2, one identifies y as the average distance between the two scattering partons, b as the average distance between parton 2 and the right-moving proton, and b as the average distance between parton 2 and the left-moving proton. This is illustrated in Fig. 2 for the case where the cross section is integrated over q i , so that z i = 0 and the positions in the amplitude and its conjugate coincide. The q i -integrated version of (22) has previously been derived in [25]. Given the work in [15,16] and [25], we disagree with the statement in [26] that the impact parameter picture of multiparton interactions has until now been based on "semi-intuitive reasoning".

Interference contributions
The multiparton distributions discussed so far have an interpretation as probabilities or as pseudoprobabilities in the sense of Wigner distributions. However, there are contributions to the cross section which involve distributions that are interference terms rather than probabilities. Figure 3a shows an example where the parton with momentum fraction x 1 is a quark in the scattering amplitude and an antiquark in the conjugate scattering amplitude. Such interference terms in the fermion number of the partons have no Figure 3: Example graphs for interference terms in fermion number (a) and in quark flavor (b). The blobs indicating two-parton distributions are not shown for simplicity. Labels q andq indicate whether a line is represented by a quark field or a conjugate quark field in the corresponding matrix element. Momenta are to be assigned as in Fig. 1a. equivalent in single hard-scattering processes, where they are forbidden by fermion number conservation. Their contribution to the cross section has the same structure and in particular the same power behavior as the contributions discussed in Section 2. To the best of our knowledge such interference terms are not included in existing phenomenology.
Let us introduce a shorthand notation for the Fourier transformed matrix element of a product of field operators ϕ, with indices assigned as fract. x 1 in conjugate ampl. The numbering corresponds to the order of the parton lines in figures 1a and 3, from left to right. We then respectively have for a two-quark and a quark-antiquark distribution, whereas the quark-antiquark interference distribution associated with the lower part of figure 3a is given by So far we have not paid attention to the quark flavor structure of the double-scattering process. One readily finds that there are also interference terms in the flavor quantum numbers of the partons. An example is given in Fig. 3b, where the parton with momentum fraction x 1 is a u quark in the amplitude and a d quark in the conjugate amplitude. The corresponding distribution is given by the matrix element (ū 3 Γ a2 d 2 ) (d 4 Γ a1 u 1 ) .

Color structure
In contrast to single-parton densities, where two parton fields are always coupled to a color singlet, multiparton distributions have a nontrivial color structure, which we now discuss. A color decomposition of twoquark distributions is given by where j, j ′ , k, k ′ are color indices and N is the number of colors. For brevity we omit the labels a 1 , a 2 on F in this section. 1 F describes the case where the quark lines with equal momentum fractions x 1 or x 2 are coupled to a color singlet, whereas in 8 F they form color triplets. Obviously, a probability interpretation is only possible for 1 F , whereas 8 F may be regarded as an interference term in color space. The color structure of F jj ′ ,kk ′ can alternatively be parameterized by 1 F and the matrix element where quark lines with different longitudinal momenta are coupled to color singlets. We note that (27) becomes equal to 8 F in the large-N limit, except if one has | 8 F | ≪ | 1 F |. By a Fierz transform one can express (27) in terms of matrix elements where the bilinear quark operators have no uncontracted color or spinor indices.
To illustrate that the color octet combination 8 F need not be small let us consider a three-quark system, as it is done in constituent quark models. Since the color part of the three-quark wave function is ǫ jkl , the color structure of a two-quark distribution then reads where l is the color index of the spectator quark and therefore summed over. From this one readily obtains With a suitable assignment of color indices, decompositions analogous to (26) can be written down for distributions F involving one or two antiquarks instead of quarks, and for the interference distributions I in (25). Given the normalization of 8 F chosen in (26), color singlet and color octet distributions enter with equal weight in the cross section if each hard scatter produces an electroweak gauge boson or any other color-singlet system. Two-gluon distributions have a more involved color structure. We first couple each of the gluon pairs {1, 4} and {2, 3} to an irreducible representation and then couple the two pairs to an overall color singlet. This gives where we use a shorthand notation for the contraction of Lorentz indices. Each of the pairs {1, 4} and {2, 3} is coupled to a singlet, an antisymmetric and a symmetric octet in 1 F , A F and S F , respectively. The ellipsis in (31) stands for terms where the pairs are in higher representations, which are 10, 10 and 27 for SU (3). The corresponding tensors in color space can be found in [27], cf. also App. A of [28]. In hard-scattering processes that produce color singlet states, the distributions enter as Of course there are also mixed quark-gluon distributions. The quark lines can only couple to a color singlet or octet, which has to be matched by the gluon lines in order to obtain an overall singlet. A complete decomposition is thus given by We note that the color structure of two-parton distributions has already been discussed in [27], using a basis where the parton pairs {1, 2} and {3, 4} are coupled to states of definite color.

Wilson lines
Our discussion so far has been concerned with tree graphs as in Fig. 1. At this level, our results can readily be generalized to other hard-scattering processes, in particular to jet production with the well-known subprocesses qq → qq, qg → qg, etc. A proper factorization formula in QCD must of course include corrections to the tree-level cross section, and in particular take care of additional gluon exchange. Detailed studies of factorization in terms of transverse momentum dependent parton distributions have been performed for single Drell-Yan production and its crossed counterparts, e + e − annihilation into back-to-back hadrons and semi-inclusive deep inelastic scattering [8,9,10,11]. In [12] we argue that the factorization proof for Drell-Yan production can to a large part be extended to double hard-scattering processes producing uncolored states such as electroweak gauge bosons. We restrict ourselves to such processes from now on. For hard-scattering processes like jet production, serious problems for establishing transversemomentum dependent factorization have been encountered even for single hard scattering [29]. A treatment of the multiple-scattering case will probably have to wait until this situation is clarified.
Starting from tree-level diagrams, there are two types of additional gluon exchange that are not power suppressed in the large scale and hence need to be taken into account systematically. The fist type concerns gluons which emerge from the subgraph representing the partons in the right-moving proton and which attach to a hard-scattering subgraph, see Fig. 4. These gluons move fast to the right, and to avoid power suppression their polarization must be in the plus-direction. To leading-power accuracy, the effect of these gluons can be represented by Wilson lines that appear in the operators defining parton distributions and make them gauge invariant. For gauge boson pair production, each quark or antiquark field in (12) is to be multiplied by a Wilson line according to with where j and k are color indices, P denotes path ordering, and the sign convention for the coupling g is such that the covariant derivative reads D µ = ∂ µ +igA µ,a t a . An analogous discussion holds for left-moving gluons in the left-moving proton.
One naively expects the vector v to point in the minus direction for right-moving and in the plus direction for left-moving partons. This leads however to rapidity divergences in the parton distributions, due to contributions from left-moving gluons in a rightmoving hadron and vice versa. To exclude this unwanted kinematic region and to remove the divergence, one can take v with nonzero plus-and minus components [8,10]. This results in an additional parameter in the parton distributions for proton p. Their ζ dependence is connected with Sudakov logarithms and will be discussed in Section 14. An adequate choice of ζ in cross section formulae is the hard scale Q.
The second type of unsuppressed gluon exchange is between the right-and left-moving partons as shown in Fig. 4, provided the gluons are soft and thus do not take partons far off shell. In processes with small observed transverse momenta in the final state, the effect of these gluons does not cancel, but it can be described by a so-called soft factor, which is defined in terms of vacuum expectation values of Wilson lines. Proper care needs to be taken to prevent double counting, because the Wilson lines W (z, v) in the parton distributions include soft gluon momenta as well [9,10,11].
We now turn to collinear multiparton distributions. It is well known that for single-parton distributions the rapidity divergences just mentioned cancel between real and virtual graphs [30], so that in this case the direction v can be taken exactly lightlike. The same argument holds for two-parton distributions in the color singlet channel. As can be seen from (11) and (12), the transverse separation z i of quark and antiquark fields is zero in the collinear distribution F (x i , y). For color singlet distributions 1 F , the Wilson lines whose color indices are contracted therefore combine to a Wilson line of finite length, going from y − 1 2 z 1 to y + 1 2 z 1 or from − 1 2 z 2 to 1 2 z 2 along the light cone. The same is, however, not true for color octet distributions 8 F . Rapidity divergences from real and virtual graphs do not cancel in that case, because compared to 1 F some graphs change their color factors whereas others do not. One must hence keep v away from the light cone even in collinear octet distributions. Furthermore, the color indices of Wilson lines with equal transverse positions do not match, so that these Wilson lines cannot be combined to a line of finite length. As a consequence, terms with color octet distributions in collinear factorization formulae will have a very different structure from the usual one.
It should be possible to extend the previous discussion to gluon distributions. For the Wilson lines in single-gluon densities a detailed analysis can be found in [31], whereas to our knowledge the soft-gluon sector has not been elaborated.

Parton spin correlations
The cross section formulae in Section 2 have a nontrivial dependence on parton spin, even though they are for unpolarized colliding protons. F q,q and its counterparts for antiquarks describe unpolarized partons, whereas all other distributions F ∆q,∆q , F q,∆q etc. describe correlations between the polarization of the two partons, or between their polarization and the transverse vectors y and k i . Such spin correlations have been pointed out in [27], but to our knowledge they have not been implemented in phenomenology. We find that 16 scalar or pseudoscalar functions are required to parameterize the spin structure of F a1,a2 (x i , k i , y) for a given quark flavor combination. Eight scalar functions remain when one integrates over k 1 and k 2 to obtain F a1,a2 (x i , y).
To see that parton spin correlations in the proton need not be small, let us consider a SU (6) symmetric three-quark wave function. As is well known, this gives ∆u/u = 2/3 and ∆d/d = −1/3 for the average longitudinal polarization of u and d quarks, which reproduces the trend observed in polarized quark densities at x around 0.3. Similarly, one obtains F ∆u,∆u /F u,u = 1/3 and F ∆u,∆d /F u,d = −2/3 and thus a considerable correlation between the longitudinal polarization of two quarks.
Parton momentum fractions in processes at LHC are typically well below 0.3, say of order 10 −2 or smaller. In that region polarized single-parton distributions (to the extent that they are known) are small compared with their unpolarized counterparts. This means that there is little correlation between the respective polarizations of a parton and the proton when they are far apart in phase space. From this we should however not conclude that distributions like F ∆q,∆q are unimportant at small but comparable x 1 ∼ x 2 , since they describe a correlation between two partons that are relatively close in phase space.
Parton spin correlations have important consequences in multiple interaction processes. Let us again consider gauge boson pair production. For longitudinal polarization of both the quark and the antiquark that annihilate into a boson, each spin projector Γ ∆q and Γ ∆q in the hard-scattering cross section (8) contains a γ 5 . After anticommuting one γ 5 through the fermion trace, one obtains a factor γ 2 5 = 1, so that the cross section depends on the combination F q,q Fq ,q + F ∆q,∆q F ∆q,∆q of two-parton distributions. Longitudinal spin correlations between quarks and antiquarks thus directly affect the rate of gauge boson pair production by multiple scattering.
The distribution F jj ′ δq,δq gives rise to even more striking effects. Its decomposition into scalar functions contains a term with δ jj ′ , which describes a correlation proportional to the product s 1 s 2 of the two transverse quark polarization vectors. Recall that the quark field bilinearsqiσ +j γ 5 q defining the distribution F jj ′ δq,δq are chiral odd, which corresponds to quarks with opposite helicities in the amplitude and its conjugate. As a result, transverse quark polarization does not contribute to the production of W bosons. There is however a term with F jj ′ δq,δq F kk ′ δq,δq in the cross section for producing two neutral gauge bosons, γ * γ * , γ * Z or ZZ. This term gives in particular rise to an angular correlation proportional to cos 2ϕ, where ϕ is the azimuthal angle between the momenta of the negatively charged leptons or the quarks from the boson decays. We thus obtain the important result that correlations between transverse quark and antiquark polarizations give rise to a correlation between the decay planes of the two produced bosons. Moreover, this type of correlation does not arise if the two bosons are produced by a single hard scattering such as in Fig. 1b, where for q T ≪ Q one obtains an angular modulation with cos ϕ but none with cos 2ϕ.
It is natural to expect that spin correlations between partons in the proton also result in correlations between the scattering planes in other double-scattering processes, such as the production of two dijets. We note that in [32] the absence of such correlations was explicitly assumed, which in view of our discussion may not be adequate. It would also be interesting to study whether polarization induced angular correlations can contribute to the so-called "ridge effect" observed in pp collisions by CMS [33].

Mellin moments
If one takes Mellin moments dx 1 x n1 1 dx 2 x n2 2 ( 1 F ) of color singlet distributions, the light-cone operators O a (y, z) in (5) turn into local twist-two operators that are well known from the operator product expansion. (As follows from the discussion in Section 7, the corresponding operators for color octet distributions still contain Wilson lines and will not be discussed here.) Let us consider two unpolarized quarks and take the lowest Mellin moment in both x 1 and x 2 , We introduce the local operator O µ (y) =q(y)γ µ q(y) and decompose where the ellipsis represents terms with uncontracted vectors y µ , y ν or the metric tensor g µν . The reduced matrix element O O can only depend on the invariants py and y 2 . We now choose a frame where p = 0 and y + = 0, so that py = p + y − and y 2 = −y 2 . Using O q (y, 0) = 1 2 O + (y) | y + =0 we can then write (37) in a manifestly covariant form M q,q (y 2 ) = d(py) O O (py, y 2 ) It is straightforward to generalize this procedure to higher Mellin moments and to the other quark polarizations ∆q and δq. In each case one can express the Mellin moment in terms of the matrix element of a product of two local twist-two operators. The matrix element in (39) can be evaluated on a lattice in Euclidean spacetime if one takes y 0 = 0. This is rather similar to recent lattice studies of transversemomentum dependent single-quark distributions [34,35]. The restriction to y 0 = 0 entails where p and y denote the spacelike three-vectors. Results from a discrete Euclidean lattice are hence not sufficient for evaluating the integral over all py at fixed y 2 in (39). This is completely analogous to the situation discussed in [35]. Despite this limitation of principle, we hope that lattice data in a certain range of py and y 2 will one day provide genuinely nonperturbative information about the behavior of multiparton distributions.

Connection with generalized parton distributions
To develop a viable phenomenology of multiple interactions, one needs a simple ansatz for multiparton distributions that can be progressively refined. For the distributions which admit a probability interpretation, a natural starting point is to replace them by the product of single-parton densities.
To formalize and extend this ansatz, we insert a complete set of intermediate states |X X| between the operators O a2 and O a1 in the definition (20) of twoparton distributions in impact parameter space. This gives a product of single-parton operators sandwiched between a proton state and X. If we assume that the ground state dominates in the sum over all X and take the intermediate proton states in the impact parameter representation (19), then F (x i , z i , y; b) involves a product of single-proton matrix elements Integrating the phase factor e iy − (p ′ −p) + over y − sets p ′+ = p + in (20), and we obtain a representation in terms of single-quark distributions in impact parameter space. These distributions satisfy in analogy to (20). For z = 0 they are the impact parameter dependent parton densities f a (x; b) discussed in [22,23] and give the probability to find a parton with momentum fraction x and distance b from the proton center. At z i = 0 the relation (42) involves only collinear distributions. Integrating over b we get which has a straightforward physical interpretation as sketched in Fig. 5. In different guises, this relation is at the basis of most phenomenological studies and has long been used in the literature, see e.g. [36,37,38] and [39,40]. As was noticed in [26], the convolution in (45) simplifies to a product if one Fourier transforms to transverse momentum space, where one has In the previous argument we have neglected the spin of the proton. When inserting intermediate proton states between O a2 and O a1 in a two-parton distribution for an unpolarized proton, we schematically have with proton helicities λ and λ ′ . The sum on the r.h.s. includes matrix elements where the proton helicity differs in the bra and the ket state. As an example let us consider the collinear distribution for two unpolarized quarks. We then find where the first term corresponds to λ = λ ′ and the second one to λ = −λ ′ in (47). The generalized parton distributions (GPDs) H q (x, ξ, t) and E q (x, ξ, t) can be studied in hard exclusive processes such as deeply virtual Compton scattering or vector meson production.
For their definitions and further information we refer to [41,42]. We emphasize that the relations (42), (45) and (46) are obtained by restricting a sum over all intermediate states to a single proton in a selected helicity state. We do not have a justification or a strong physical motivation for this restriction, other than stating a posteriori that it is tantamount to neglecting any correlation between two partons in the proton. It seems plausible to assume that this is a reasonable first approximation, at least in a certain region of variables, but one should not expect such an approximation to be very precise. The result (48) already goes beyond the complete neglect of correlations, and the relative size of the term depending on e q may be taken as an indicator for the level of precision of the simplest factorized ansatz. Possible deviations from (45) and their consequences for multiple scattering cross sections have e.g. been discussed in [40,43,44].
x 1 x 2 Figure 5: Illustration of the approximation (45) of a two-parton distribution in terms of single-parton distributions. Implicit in the figure is the representation of these distributions as squares of light-cone wave functions in the proton.
So far we have discussed the distributions 1 F a1,a2 , which can be interpreted as probabilities or pseudoprobabilities. The formal derivation we have sketched can however be extended to distributions that have interference character. Repeating the above steps for the distributions appearing in Fig. 3b, one obtains matrix elements of operatorsdΓu orūΓd that transfer isospin. For a proton target, the ground state among the intermediate states inserted between the two operators is then a neutron. Isospin symmetry relates the resulting matrix elements as n|d Γu|p = p|ūΓd|n = p|ūΓu|p − p|d Γd|p . Similar relations for strange quarks assume flavor SU (3) symmetry and can be found in [41,42].
Inserting physical intermediate states is useful between the color singlet operators appearing in 1 F but not between the color octet operatorsq Γt a q in 8 F . One can however repeat the preceding construction for the distributions 1F defined in (28). In that case the two partons represented by a color singlet operator carry different plus-momentum fractions x 1 and x 2 . This implies p ′+ = p + in the resulting product of proton matrix elements, which are therefore associated with GPDs at nonzero skewness ξ = ± (p − p ′ ) + /(p + p ′ ) + . Likewise, the rearrangement of fields to color singlet operators in the interference distribution (25) leads to GPDs with nonzero ξ. We note that in the exclusive processes where GPDs can be measured, one always has ξ = 0 because the scattered proton must have a smaller momentum than the proton target.
GPDs give rather direct information about the distribution of partons in impact parameter b, which is Fourier conjugate to a measurable transverse momentum ∆ in physical processes. By contrast, the interparton distance y in multiple interactions is integrated over in the cross section formula (10) and not directly related to any measurable kinematic distribution. Although the connection between GPDs and multiparton distributions is not exact, it provides an opportunity to obtain some quantitative information about impact parameter dependence from an independent source.
An important example is the correlation between impact parameter and longitudinal momentum of partons, for which there are clear indications in GPDs (see Section 4.4.5 of [45] and references therein) and which has recently been studied for multiparton interactions in [46].

Ladder graphs
The factorization formulae in Section 2 are for kinematics where the transverse boson momenta are much smaller than the large scale Q. This includes but is not limited to the region where q T is comparable to a hadronic scale Λ. In this and the following section we consider the region of intermediate transverse momenta Λ ≪ q T ≪ Q. If q 1 and q 2 are large compared with Λ then at least some of the transverse momenta in the two-parton distributions must be large as well. This opens the way for a perturbative description where the transverse-momentum dependent distributions factorize into collinear distributions and a hard parton-level scattering. This increases the predictive power of the theory, given that collinear two-parton distributions depend on fewer variables and are less numerous. Note that the scale q T of the hard scattering in the two-parton distributions is still much smaller than the scale Q of the multiple hard scattering in the proton-proton collision.
The type of graphs that describe the hard scattering in F (x i , k i , r) depends on the relative size of the transverse-momentum arguments. For instance, the single-ladder graph in Fig. 6 gives large k 1 , whereas a double-ladder graph with an additional gluon exchanged between the quarks with momentum fraction x 2 leads to large k 1 and k 2 . In both cases no hard parton is exchanged between the lines with momentum fraction x 1 and those with momentum fraction x 2 , so that r is of order Λ. This corresponds to an interparton distance y of hadronic size.
The computation of the graph in Fig. 6 proceeds l 1 − 1 2 r k 1 − 1 2 r k 2 + 1 2 r k 2 − 1 2 r k 1 + 1 2 r Figure 6: Ladder graph for a two-parton distribution at large k 1 and small k 2 and r.
exactly as for single-quark distributions at large transverse momentum and can e.g. be found in [11,47,48]. One obtains where P has a logarithmic dependence on k 2 1 which we have not displayed for brevity. We have integrated the distribution over k 2 since this turns out to be the quantity that appears in the cross section at large q 1 and q 2 . The indices J and J ′ indicate that one has a matrix structure if transitions between gluon and quark distributions are permitted by the quantum numbers. Quark-gluon transitions occur for the combinations where the second parton index a can indicate an unpolarized or polarized quark or antiquark. If a indicates a gluon, one should replace 8 F q,a + 8 Fq ,a by S F q,a + S Fq ,a and 8 F q,a − 8 Fq ,a by A F q,a − A Fq ,a . Taking into account the color factors in the definitions (26), (31) and (33), we find splitting matrices for N colors and n F quark flavors, where the functions P qq (z), P qg (z), etc. are defined without color factors and differ from the usual DGLAP splitting functions at most by terms proportional to δ(1 − z). The color factors in (51) agree with those in [49] if one restores a missing factor 2/N in the expression of P 8f , eq. (54b) of that paper. This factor is also required for the normalization of P 8f as a projector in eq. (53) of [49]. We see that color factors are always smaller in the octet channels than in the singlet channel, with the biggest suppression occurring for P qq . In the large-N limit the singlet matrix 1 P has eigenvalues N P gg and N 2 P qq , whereas for both S P and A P one of the eigenvalues is N 2 P gg and the other is of order 1. This suggests a relative suppression of color octet distributions at large transverse momentum, but quantitative estimates are needed to assess how important this suppression is.
Quark-gluon transitions do not occur for the combinations q [ 1 F q,a − 1 Fq ,q ], for the difference of distributions for two quark flavors, and for matrix elements where the quark flavors differ on the two sides of the final-state cut as in Fig. 3b. In these cases the indices J and J ′ in (49) take only one value, with the splitting function P being C F P qq for color singlet and − 1 2N P qq for color octet combinations. Furthermore, there is no transition between gluons and the interference distributions in Fig. 3a. From the experience with single-parton distributions one can expect that at low x 1 and high k 1 those combinations of quark and antiquark distributions will dominate that receive a contribution from gluons on the r.h.s. of (49).
The splitting matrices for longitudinally polarized quarks and gluons contain different splitting functions ∆P qq , ∆P qg , etc. but have the same color structure as (51). No transitions occur between δq, δq and δg, since the former describe parton helicity flip by one unit and latter by two units. A relation similar to (49) can be derived for double-ladder graphs and involves the same splitting matrices as above.
We finally note that for q T ≫ Λ the contributions from ladder graphs lead to a power behavior of the double-scattering cross section scaled with s 2 .

Parton splitting
We now consider the case where |r| ∼ |k i | is large, which corresponds to perturbatively small distances |y|. Power counting analysis shows that the dominant contributions to the double-scattering cross section are k 1 − 1 2 r k 2 + 1 2 r k 2 − 1 2 r k 1 + 1 2 r k 1 + k 2 Figure 7: Lowest-order graph for a quark-antiquark distribution at large |r| ∼ |k i |. The two partons with momentum fractions x 1 and x 2 originate from the splitting of a single gluon.
from graphs where two partons in the t channel split into four partons. A lowest-order example for F a1,ā2 is shown in Fig. 7. At next order in α s graphs appear where the hard scattering if fully connected, e.g. with an additional gluon exchanged between the left-and right-hand sides of Fig. 7. The power behavior of the scaled cross section is found to be if in at least one of the colliding protons the two-parton distribution has a fully connected hard scattering. If in both distributions the hard-scattering subprocess is disconnected as in Fig. 8, the scaled cross section behaves like 1/Λ 2 instead, but the final-state phase space is then restricted to |q 1 + q 2 | ∼ Λ. We see that the contribution (52) from ladder graphs at large y is suppressed by Λ 2 /q 2 T compared with the splitting contribution (53) at small y. This suppression may however be compensated by a stronger increase of (52) when x 1 and x 2 become small. It is known that in appropriate quantum number channels the iteration of ladder graphs leads to a growth of parton densities at small momentum fractions. The two-parton distribution at the bottom of Fig. 6 contains one ladder between the lines with momentum fraction x 1 and another between the lines with momentum fraction x 2 , whereas at the bottom of Fig. 7 there is a single-parton density containing only one ladder. To establish whether the Λ 2 /q 2 T suppression or the small-x enhancement of the ladder-graph contributions is more important in given kinematics requires a detailed study.
The calculation of the graph in Fig. 7 reveals a number of important issues. We give here only the contribution from the transverse-momentum dependent unpolarized gluon distribution f g (x, k) and omit terms depending on the gluon Boer-Mulders function [50], which describes the correlation between the transverse momentum and the linear polarization of a gluon. Ink Fā 1 ,a 2 Figure 8: Graph for the cross section where both twoparton distributions F a1,ā2 and Fā 1 ,a2 (indicated by boxes) involve the splitting of one into two partons.
the color singlet channel we find with k = 1 2 (k 1 − k 2 ). For the color octet distributions 8 F a1,ā2 we obtain the same expression times a suppression factor −1/ √ N 2 − 1. We have whereas T a1,ā2 = 0 for the other polarization combinations. We see that several nontrivial and large spin correlations are generated by perturbative splitting. To which extent these correlations persist at nonperturbative values of y is an interesting question from the point of view of hadron structure. With the analog of (7) for the product F a1,ā2 Fā 1 ,a2 one finds that the contribution of Fig. 8 to the cross section for two-boson production is proportional to with q = 1 2 (q 1 − q 2 ) and k ± = k ± 1 2 r. Each integral is infrared finite but has a logarithmic divergence at large k ± . To assess the meaning of these divergences, we first recall that the derivation of the cross section formula (7) and its analogs for other channels assumes transverse parton momenta |k i ± 1 2 r| ≪ Q. As it turns out, the integrand in (56) does not decrease fast enough to suppress the region of large k ± , where the approximations leading to (7) are invalid. On the other hand, we note that Fig. 8 can also be read as a graph for producing two gauge bosons V 1 and V 2 in a single hard-scattering process. For the parton level amplitude it represents a box graph, which indeed diverges logarithmically in the ultraviolet. The full amplitude for gg → V 1 V 2 is however finite, because the divergences of all contributing box graphs cancel each other. This was noted in [51] and has long been known for the related process γγ → γγ with on-shell photons [52]. A detailed analysis of the infrared behavior of gg → V 1 V 2 is given in [51], and explicit expressions for the gg → Z Z amplitude can be found in [53].
Since the definition of two-parton distributions we have used so far includes the splitting contribution (54), this definition is not appropriate for the cross section formula (7), and one or both of them needs to be modified. It remains a task for future work to devise a consistent formulation that is also suitable for practical computations. Such a formulation must obviously avoid the divergent integrals in (56). In addition it must address the problem of double counting in Fig. 8 when the contributions from single and double hard scattering are added in the cross section. The analogous double counting problem for multijet production was pointed out in [54].

Collinear distributions
We have encountered collinear two-parton distributions in two different contexts, firstly in the cross section for multiple scattering when final-state momenta are integrated over, and secondly in the calculation of two-parton distributions at high transverse momentum. In this section we discuss the scale evolution of collinear distributions at leading order in α s . We focus on the color singlet sector, which is most similar to the case of single-parton densities. (As noted at the end of Section 7, collinear color octet distributions involve further complications related with rapidity divergences.) Integrating F (x i , k i , r) over the transverse momenta k i gives logarithmic divergences, which is evident from the factor 1/k 2 1 in (49). The definition of collinear distributions requires subtraction of these divergences, in full analogy to the case of single-parton distributions. We now discuss the contribution of the splitting graph in Fig. 7 and consider unpolarized quarks for definiteness. The expression (54) behaves like 1/k 2 for large k and thus gives a logarithmically divergent integral. This divergence needs to be subtracted as well. Notice that the four quark lines in Fig. 7 are far off-shell if one of the transverse momenta r and k is large. This implies that (54) describes not only the large r behavior of F a1,ā2 (x i , k i , r) but also its behavior for large k at small r.
Calculating the graphs in Figs. 6 and 7 in 4 − 2ǫ dimensions, performing MS subtraction of the ultraviolet divergent terms, and adding the contributions from self-energy graphs, one obtains the evolution equation which is valid for any value of r. Here P b1,b2 denotes the usual DGLAP splitting functions (including color factors). Remarkably, the y dependent distributions F (x i , y) evolve differently. Working in 4 − 2ǫ dimensions, one finds that for nonzero y has the finite value 1/y 2 at ǫ = 0 and hence requires no ultraviolet subtraction. One thus obtains a homogeneous evolution equation This result is very natural if one considers that F a1,ā2 (x i , y) is defined in terms of the product of two operators with light-like field separations z 2 1 = z 2 2 = 0, where we have omitted the Wilson lines between the fields for brevity. Each of the two operators needs to be renormalized in the same way as in a single-parton density, but as long as the operators are taken at a finite spacelike distance y 2 = −y 2 no further short-distance singularities appear. It has long been known that evolves as in (57), see [55,56,57] and the recent studies [58,59]. In position space representation, the inhomogeneous term now appears because the behavior at small y induces a logarithmic divergence in the integral over y, which needs to be regularized as indicated by the subscript "reg" in (61). We note that F (x i ) does not directly appear in double-scattering cross sections but is often used as an intermediate quantity for modeling the distributions F (x i , y).
We have already seen in (56) that the contribution of splitting graphs to the multiparton distributions leads to logarithmic divergences in the differential cross section. An even worse divergence appears in the cross section integrated over q 1 and q 2 , which according to (7) and (10) involves an integral This is linearly divergent in y 2 according to (62). It also diverges linearly in r 2 , because for large r one has F (x i , r) ∼ log(r 2 /µ 2 ). The extra ultraviolet subtraction included in the definition of F (x i , r) is insufficient to regulate this divergence in the cross section. In the recent paper [60] a finite result is obtained by imposing a cutoff r 2 < min(q 2 1 , q 2 2 ) in (63). In this context we wish to comment on an ansatz often made in phenomenological studies, where the y dependent two-parton distributions are written as with a smooth function f (y) satisfying the normalization condition d 2 y f (y) = 1. A typical choice for f (y) is for instance a Gaussian. This type of ansatz is obviously inconsistent if F (x i , y; µ) is defined from the product (60) of twist-two operators, since the µ dependence of the l.h.s. is then given by the homogeneous evolution equation (59) whereas the µ dependence of the r.h.s. is governed by an inhomogeneous evolution equation as in (57). If one instead defines the y dependent distribution as the Fourier transform of F (x i , r), then the ansatz (64) is consistent regarding evolution since F (x i , r) evolves according to (57). However, we do not think that is a satisfactory definition, since it does not cure the divergence of the integral (63) in double-scattering cross sections. An ansatz like (64) with a smooth function f (y) does not have this problem and may be regarded as modeling a y distribution in which the perturbative splitting contribution giving rise to the 1/y 2 singularity has been removed. Since the ansatz is ad hoc, one cannot say what the correct evolution equation to be used on both sides of (64) actually is. Our discussion suggests that the homogeneous form (59) may be more appropriate, at least for values y of typical hadronic size, which are of course most important when the ansatz is used in the factorization formula. A theoretically sound solution remains a task for future work.

Sudakov logarithms
As is well known, transverse momenta q i which are much smaller than the hard scale Q of a process give rise to Sudakov logarithms in the cross section. These logarithms must be resummed to all orders in perturbation theory, which for single gauge boson production can be done using the Collins-Soper-Sterman formalism [19]. We extend this formalism to gauge boson pair production in [12] and sketch the main results of our analysis in the following. The dependence of a two-quark distribution on the rapidity parameter ζ defined in (36) is governed by the differential equation where 1 F and 8 F depend on x i , z i , y and ζ. They also depend on a renormalization scale µ, but we need not discuss this dependence here. The kernels G and K in (65) already appear in the Collins-Soper equation [8] for single-quark distributions, The matrix M mixes color singlet and color octet distributions and is µ independent, whereas the µ dependence of G and K is given by a renormalization group equation and thus cancels in G + K. Both K and M are due to soft gluon exchange and can be defined as vacuum matrix elements of Wilson line operators, similar to those discussed in Section 7. They can only be calculated perturbatively if the transverse distances on which they depend are sufficiently small.
The general solution of (65) can be written as with and The scale µ 0 specifies the initial condition of the differential equation (65), with a natural choice being µ 0 ∝ 1 |z 1 ||z 2 |. The leading double logarithms of ζ/µ 0 in (68) come from the second line in (69), whereas terms involving K and M only contain single logarithms. The Sudakov exponent S also appears in the solution of (66) for single-quark distributions [8], We thus obtain the important result that to double logarithmic accuracy the Sudakov factor for a multiparton distribution is the product of the Sudakov factors for single parton densities, both for color singlet and color octet distributions. A non-trivial cross talk between all partons, and in particular a mixing between color singlet and octet distributions occurs however at the level of single logarithms, which are known to be important for phenomenology. When the parton distributions F (x i , z i , y; ζ) are inserted into the cross section formula (13), the typical size of |z i | is 1/|q i | and ζ should be taken of order Q, so that logarithms of ζ/µ 0 turn into logarithms of Q/q T . If all distances z i and y are small, one can calculate K and M in perturbation theory. We give explicit results in [12] and only mention some of their features here. The off-diagonal elements in the matrix e L M turn out to be color suppressed, but only by 1/N . In the limit where |z i | ≪ |y| we find that the off-diagonal elements are additionally suppressed by |z 1 ||z 2 |/y 2 and that the diagonal element for the color octet is smaller than the one for the color singlet. This results is a suppression of octet distributions by 8 F (x i , z i , y; ζ) 1 F (x i , z i , y; ζ) ∼ |z 1 ||z 2 | y 2 λ (72) with a power λ = min 1, N α s π log √ x 1 x 2 ζ µ 0 .
In the cross section one has |z i | ∼ 1/|q i |, so that for large q T the factor (72) disfavors color octet distributions in a wide range of y. To study quantitatively the importance of color-octet suppression, one needs to extend the perturbative result (72) to the region of non-perturbative distances y. This has not been achieved yet.

Conclusions
We have studied several aspects of multiparton interactions in hadron-hadron collisions. Our theoretical framework is hard-scattering factorization, which requires a large virtuality or momentum transfer in each partonic scattering process but is valid in the full range of parton momentum fractions. A complementary approach, based on the high-energy limit and using BFKL methods is discussed in [61,62,63]. The basic cross section formula for multiple interactions can be derived at tree level using standard hard-scattering approximations and has an intuitive geometrical interpretation in impact parameter space. We have shown that it can be formulated at the level of transverse-momentum dependent multiparton distributions, which permits a description of the transverse momenta of the particles produced in the hard scattering. This is particularly important because it is in transverse-momentum dependent cross sections that multiparton interactions are not power suppressed compared with single hard scattering.
The derivation of the tree-level formula for double hard scattering exhibits nontrivial contributions from correlation and interference effects. They have partly been discussed earlier in the literature [27] but are not included in current phenomenological models. The polarizations of two partons can be correlated even in an unpolarized proton, and we have shown that such correlations can significantly affect both the overall rate and the angular distribution of final-state particles in multiple interaction processes. Two-parton distributions have a nontrivial color structure since each parton can carry a different color in the scattering amplitude and its complex conjugate. In addition, there are interference terms in fermion number and flavor as shown in Fig. 3.
To develop a reliable phenomenology, one needs information about the size and kinematic dependences of two-parton distributions. One can relate them to generalized parton distributions for single partons, which are experimentally accessible in exclusive scattering processes, but this requires an approximation whose reliability we cannot quantify. Nevertheless this relation offers some guidance, especially about the interplay between longitudinal momentum and transverse position of partons, as well as the size of twoparton distributions that are of interference nature and hence do not have an intuitive probability interpretation. For transverse parton momenta above a few GeV one can compute transverse-momentum dependent distributions in terms of collinear ones and thus has more predictive power from theory. We find a tendency for a simplified color structure in this regime, but quantitative estimates remain to be done. Finally, we have identified matrix elements closely related to two-parton distributions that are suitable for evaluation in lattice QCD.
To go from tree level to genuine factorization formulae, one must be able to sum certain types of collinear and soft gluon exchanges into Wilson lines. We argue in [12] that for double-scattering processes producing color-singlet particles this can be achieved using the methods that have been successfully applied to single Drell-Yan production [8,9,10,11]. This also permits the resummation of Sudakov logarithms, with important results sketched in Section 14. For multiple jet production the situation is unfortunately more complicated, as serious obstacles to formulating transversemomentum dependent factorization have been identified even for single hard scattering [29]. The production of two electroweak gauge bosons thus emerges as a channel where the current perspectives for developing the theory look best, and where different aspects of multiple interactions can hopefully be explored experimentally at LHC. For phenomenological estimates of W W production we refer to [64,65].
The cross section for double hard scattering involves an integral over the transverse distance y between the two partons emerging from each hadron. This includes the region of small y, where the two partons can originate from the perturbative splitting of a single parton. We find that this mechanism, which also affects the scale evolution of multiparton distributions, leads to serious ultraviolet divergences in the cross section and to a double counting problem between single and double scattering. To modify the definition of multiparton distributions and possibly the cross section formulae is a prerequisite for putting the theory on solid ground.