Dipole Subtraction vs. Phase Space Slicing in NLO NRQCD Heavy-Quarkonium Production Calculations

We compare two approaches to evaluate cross sections of heavy-quarkonium production at next-to-leading order in nonrelativistic QCD involving $S$- and $P$-wave Fock states: the customary approach based on phase space slicing and the approach based on dipole subtraction recently elaborated by us. We find reasonable agreement between the numerical results of the two implementations, but the dipole subtraction implementation outperforms the phase space slicing one both with regard to accuracy and speed.


Introduction 2 Singular cross section contributions
The factorization theorems of QCD and NRQCD imply that the inclusive cross section to produce a heavy-quarkonium state H is given by Here, f a/A (x a ) is the parton distribution function (PDF) describing the probability to find parton a with longitudinal momentum fraction x a inside hadron A. O H [n] is the LDME of NRQCD associated with the intermediate Fock state n, which has N col color and N pol polarization degrees of freedom. p 1 and p 2 are the four-momenta of partons a and b, n col and n pol their color and spin averaging factors. dPS is the phase space and F sym the symmetry factor associated with the outgoing particles. |ab → QQ[n] + X denotes the matrix element of the partonic subprocess ab → QQ[n] + X, which is calculated by applying spin and color projectors to the usual QCD amplitudes as described in Ref. [7]. A summation of spin and color degrees of freedom of the QQ pair and all incoming and outgoing partons is always implicitly understood in the squared amplitudes, but no averaging. At this point, we deviate from the definition of the bra and ket symbols used in Refs. [6,8]. We denote the momentum of the QQ pair as p 0 and set p 2 0 = 4m 2 Q , with m Q being the heavy-quark mass. Our real-correction partonic amplitudes have two further light QCD partons, to which we assign momenta p 3 and p 4 .
In the limit where an outgoing gluon with momentum p j gets soft, the squared production amplitude becomes, for the Fock states considered in our analysis, ; p j ), ; p j ), with S 1 (n; p j ) = g 2 s 4 i,k=1 i,k =j − p i · p k p i · p j p k · p j + p 0 · p i p 0 · p j p i · p j + p 0 · p k p 0 · p j p k · p j − p 2 0 (p 0 · p j ) 2 × n, Born|T i T k |n, Born (4) S 2 (n, m; p j ) = 4g 2 where |m, Born is the Born amplitude of QQ[m] production without the soft gluon. T i acts on |m, Born by inserting at the corresponding place T c if parton i is an outgoing quark or incoming antiquark, −T c if parton i is an incoming quark or outgoing antiquark, and if abc if parton i is a gluon, where c, a, and b are the color indices of the soft, splitting, and other outgoing gluons, respectively. T Q inserts T c at the place of the outgoing heavy quark Q, T Q inserts −T c at the place of the outgoing heavy antiquark Q, with c being the color index of the outgoing gluon attached to the Q or Q lines. ǫ(m l ) is the polarization four-vector of the QQ[m] state with m l being the quantum number of the z component of its orbital angular momentum.
In the limit where an outgoing light parton with momentum p j becomes collinear to an incoming light parton with momentum p i , its main contribution stems from Feynman diagrams where parton i splits into j and a parton with momentum p (ij) = p i − p j taking away the fraction x of the incoming parton's longitudinal momentum. The squared matrix element in that limit is given by |p j ini. coll. p i 2 = n col (i) n col ((ij))n pol ((ij)) whereP i,(ij) (x, p ⊥ ) are the spin-dependent Altarelli-Parisi splitting functions as given in Eqs. (39)-(42) of Ref. [7] with p ⊥ being the residual transverse component of p (ij) . Thê P i,(ij) (x, p ⊥ ) functions depend on the spin indices s and s ′ or the polarization indices µ and ν of parton i. The squared amplitude in the limit where the outgoing partons 3 and 4 are collinear is given by those Feynman diagrams where a final-state parton with momentum p (34) = p 3 + p 4 splits into the outgoing partons 3 and 4, and reads where The phase space integrations in D = 4 − 2ǫ dimensions yield 1 ǫ and 1 ǫ 2 poles, which are canceled by similar poles in the virtual corrections, by the mass factorization counterterms, and by the operator renormalization counterterms: A part of the initial-state collinear singularities is absorbed into the PDFs according to the MS prescription, thereby leading to mass factorization counterterms, where µ r is the renormalization scale, µ f is the QCD factorization scale, and P + a,(ij) (x) are the regularized Altarelli-Parisi splitting functions as listed in Ref. [7]. The singularities of the S 3 part of the soft singularities are canceled by NLO corrections to LDMEs, where ultraviolet singularities are removed by MS renormalization. These operator renormalization contributions are, for the Fock states relevant to our analysis, given by with and with where µ Λ is the NRQCD factorization scale.

Phase space slicing implementation
Our implementation of phase space slicing follows the lines of Ref. [4]. Here, the realcorrection phase space is split into three regions by introducing two cut-off parameters, δ s and δ c : The soft region, where p 3 or p 4 is soft, the hard-collinear region, where p 3 and p 4 are hard and p 3 or p 4 is collinear to another massless parton, and the hard-noncollinear region. The condition of p j being soft is defined by δ s > 2E j / √ s with E j the energy component of p j in the center-of-mass frame of p 1 and p 2 , and the condition of p i being collinear to p j by δ c > |2p i · p j |/ √ s with s = (p 1 + p 2 ) 2 . Since the hard-noncollinear region is free of singularities, the phase space integration is done there numerically, while, in the other two regions, the phase space integrations are done analytically in D = 4 − 2ǫ dimensions. This is possible because not only the squared matrix elements factorize as described above, but also the phase space elements factorize as dPS 3 = dPS 2 dPS p j , where p j is soft, and dPS 3 = dPS 2 dPS p i p j , where p i is collinear to p j . Here, dPS 3 is the phasespace factor of the process p 1 + p 2 → p 0 + p 3 + p 4 , dPS 2 is the phase-space factor of the Born process corresponding to the respective soft or collinear limit, and The dependencies of all contributions on δ s and δ c cancel in the sum, as long as δ s and δ c are chosen small enough.

Hard-collinear part
Integrating Eq. (7) analytically over the hard-collinear phase space part and adding the corresponding contribution of the mass factorization counterterm in Eq. (9), we obtain in the limit δ s → 0 where the Born amplitude |Born is defined with an incoming momentum p (ij) = xp i instead of p i . δ j,g is 1 if particle j is a gluon, otherwise 0. F in,i→(ij),j are given by with C A = 3, C F = 4/3, n f is the number of light, active quark flavors, and C ǫ = (4πµ 2 r e −γ E /m 2 Q ) ǫ . Furthermore, P i,(ij) and P ′ i,(ij) are the O(ǫ 0 ) and O(ǫ) parts of the spin-averaged splitting functions, namely Integrating Eq. (8) analytically over the hard-collinear phase space part, we obtain in the limit δ s → 0 with

Soft part: S 1 terms
The integral of the S 1 terms in Eq. (4) over the soft phase space region can be written as with where we define T 0 = T Q + T Q and use T i = − 4 k=0 k =i,j T k , with p j being the soft momentum. Evaluating the integrals I ik following Ref. [4], we obtain with ψ i = −2p 0 · p i , ξ i = −2p (34) · p i , and p (34) = p 3 + p 4 − p j .

Soft part: S 2 terms
The integral of the S 2 terms in Eq. (5) over the soft phase space region is To evaluate the phase space integrals involving p β j , we use the tensor decomposition which leads to the expressions where the p β 0 terms vanish upon contraction with ǫ β (m l ) in Eq. (23). As for Ω i=1 or 2,j , the angular integrals needed to evaluate p j soft dPS p j p i · p j /(p 0 · p j ) 3 and p j soft dPS p j p 3 · p 4 /((p 0 · p j ) 2 (p i · p j )) are not listed in Ref. [4] or the references cited therein. We obtain these by relating the phase space integrals to cut virtual-correction loop integrals and evaluating the latter by means of the integration-by-parts technique [9]. The final results are

Soft part: S 3 terms
The integral of the S 3 terms in Eq. (6) over the soft phase space region is To evaluate the integral involving p α j p β j , we use the tensor decomposition where the p α 0 and p β 0 terms vanish upon contraction with ǫ β (m l ) in Eq. (27). Evaluating the integrals analytically and adding the corresponding operator renormalization counterterm contribution of Eqs. (10) or (12), we arrive at the finite expression

A remark on the tensor decomposition
We note that the tensor decompositions of Eqs. (24) and (28) involve momentum p (34) . A potential pitfall is that p (34) does not appear in the integrand of Eq. (28), nor does it in Eq. (24) in the cases i = 1 or 2, and one might naïvely think that dPS p j only involves momentum p j , of which p (34) is independent in the limit of p j being soft. Thus, one might be led to assume that the structures with p (34) are not necessary in the above-mentioned cases. However, the dependence on p (34) does enter in a more subtle way, via the phase space constraint of p j being soft, which implies to understand the necessity of all structures involving p (34) , one can convince oneself that the corresponding coefficients A 2 , A 4 , A 7 , and A 9 in Eqs. (24) and (28) are indeed nonzero. To this end, one applies the projectors and inserts the results of the soft phase space integrals. By the same token, using a tensor decomposition with the p (34) structures omitted then leads to incorrect results for Ω β i=1 or 2,j and Ω αβ j . Incidentally, this neither spoils the infrared finiteness nor the dependence on δ s , which would have served as a crucial check otherwise. So, this mistake is easily made and more easily overlooked. As already mentioned in Ref. [12], Ref. [11] is affected by this mistake. The numerical effects of this are discernible in Fig. 2 of Ref. [11], where the dashed curves, indicating the 3 P [8] J contributions, are subject to visible deviations. Fortunately, the effects on the physical results in Fig. 1 of Ref. [11] are insignificant, being of the order of the numerical uncertainty. We believe that this easy-to-miss mistake has also creeped into other authors' calculations. In fact, Eqs. (5) and (6) of Ref. [13] only hold if the incorrect tensor decomposition is applied. Furthermore, purposely including this mistake, we are able to approximately reproduce the results shown in Fig. 3 of Ref. [14] and Fig. 2 of Ref. [15], while our correct evaluations significantly differ from these results.

Dipole subtraction implementation 4.1 Summary of dipole subtraction formalism
Our implementation of dipole subtraction, which is based on Refs. [6,8] is explained in detail in Ref. [7]. For the reader's convenience, we recall the main formulas here. We calculate the NLO corrections to our partonic cross sections as where dPS 2 and dPS 3 are the two-and three-particle phase space factors. The latter factorize in some way as dPS 2 dPS dipole or dPS 2 dxdPS dipole , where dx matches its counterpart in Eq. (9). dσ real , dσ virtual , dσ MFC , and dσ op. ren. are the real-correction contributions, the virtual-correction contributions, the mass factorization counterterms, and operator renormalization counterterms, respectively. The subtraction term dσ subtr is given by This term is defined in terms of 2 → 3 kinematic variables in the same way as dσ real and constructed so that it matches dσ real in all singular limits: We call each term in the sum a dipole. The dipoles in the first and second lines of Eq. (34) reproduce dσ real in the initial-and final-state collinear limits as well as the S 1 part of the soft limits, while the dipoles in the last two lines reproduce the S 2 and S 3 parts of the soft limits. As for the Born amplitudes in Eq. (34), partons i and j are replaced by (ij), which is a gluon, light quark, or the QQ pair, according to the soft or collinear limits to be approximated. The contribution in Eq. (34) is set to zero if there is no collinear or soft singularity in the considered limit. In the dipoles for the S 2 and S 3 terms, m( 3 P . Since the Born amplitudes are defined in terms of 2 → 2 kinematic variables, we need for each dipole a mapping of the 2 → 3 process momenta to 2 → 2 kinematics momentap i . These momentum mappings are constructed in such a way that the Born amplitudes in Eq. (34) equal their counterparts in the factorization formulae of the respective collinear and soft limits. Furthermore, we definẽ where p A and p B are the momenta of the incoming hadrons. The idea of the subtraction formalism is that the term in the first square bracket of Eq. (32) is devoid of singularities and can, therefore, be integrated over numerically in four dimensions. On the other hand, the various V terms in Eq. (34) are defined only in terms of those kinematic variables that we use to parametrize dPS dipole , and are sufficiently simple to be integrated analytically over dPS dipole . The resulting poles in ǫ then render the second square bracket in Eq. (34) finite and ready for numerical integration over dPS 2 and dx.
In practice, we need to produce predictions involving experimental cuts on the transverse momentum p T and the rapidity y of the heavy quarkonium, see for example the low-p T cut in Eq. (32). The implementation of these phase space cuts is, however, unproblematic, too, since, in the first square bracket of Eq. (32), p T andp T coincide in the singular limits and, in the second square bracket, the θ function stands outside the analytic dPS dipole integration. Similarly, y andỹ coincide in the singular limits, too. Table 1 indicates where one can find the analytic expressions for the various V terms, their counterparts upon integration over the respective phase space dPS dipole , and their respective momentum mappings. As for the V terms of the Catani-Seymour [6] and Phaf-Weinzierl [8] papers, V ini,S 1 ij,k equals V ij k or V ij,k and V fin,S 1 ij,k equals V ij,k or V k ij in their notations. Indices s and s ′ or µ and ν within V ij,k are to be contracted with the open polarization indices of particle (ij) in the corresponding Born amplitude. Figure 1 summarizes all dipole terms according to their corresponding singular limits in a graphical form.

Organization in terms of computer codes
In this section, we briefly describe how we implement the dipole subtraction procedure in our computer codes, emphasizing those parts which differ from our implementation of phase space slicing. All necessary Born diagrams are created with FeynArts and then treated by a Mathematica script which inserts the color operators T in the various combinations needed and applies the color-singlet and color-octet projectors to evaluate all color factors in the squared amplitudes. These color factors, together with the yet unsquared amplitudes, are then passed to two FORM scripts, CalcDipoles and CalcDipolesInteg. For these two routines, we have prepared an input card which encodes the information of Figure 1.
CalcDipoles generates the FORTRAN routines encoding the dipole terms. This is done by squaring the Born amplitudes, written in terms of thep i momenta and with the respective color insertions, and multiplying them by the necessary factors, in particular V ij,k , taking into account the spin correlations in the case of splitting gluons. Then, the respective momentum mappings are implemented, the resulting expressions are simplified, and the FORTRAN routine AMP2 Dipoles is generated, which takes as arguments the number of the dipole and the partonic 2 → 3 kinematic variables.
Similarly, CalcDipolesInteg squares the Born amplitudes with the respective color insertions, and then multiplies the finite parts of the V terms integrated over the dipole phase space dPS dipole . At this point, the mass factorization and operator renormalization counterterms are included, as described in Ref. [7]. The integrated dipoles have the general form [h(x)] + f (x) + g(x), where h(x) is singular in the limit x → 1. The generated FORTRAN function AMP2 DipolesInteg takes as arguments the number of the dipole, the 2 → 2 kinematics variables, and the value of x. A second function, AMP2 Dipoles-IntegSubtr, which only contains the terms h(x)f (1), is generated as well.
Together with the FORTRAN functions for the virtual-and the real-correction squared amplitudes, we then have the ingredients for the numerical phase space integrations in the main FORTRAN code. The θ functions constraining the 2 → 3 particle phase space, such as θ(p T − p T,min ) in Eq. (32), have to be implemented for each dipole individually with the respective momentum mapping. The x integrations over the plus distributions are thereby explicitly implemented as

Numerical tests of the integrated dipole terms
In Ref. [7], we have already described one numerical test of our dipole subtraction implementation, namely we have shown that our expressions for the dipole terms agree with the real-correction contributions in the corresponding singular limits. Here, we describe a further internal test. This time, we test the expressions of the integrated dipole terms. We do this by evaluating the phase space integrals, of specific dipole terms plus the corresponding mass factorization and operator renormalization counterterms in two different ways and comparing the results. In the first mode of evaluation, we use the results of the expressions implemented in CalcDipolesInteg and integrate them numerically over dPS 2 or dPS 2 dx. In the second mode of evaluation, we separate the three-particle phase space as in the phase space slicing implementation according to the slicing parameters δ s and δ c . For the contributions from the soft and collinear regions, we take the respective analytic limits of the dipole terms, integrate them analytically over the corresponding phase space, dPS p j soft or dPS i j , add the corresponding mass factorization and operator renormalization counterterms, and then do the integrations over dPS 2 or dPS 2 dx numerically. For the contribution from the hard-noncollinear region, we integrate the expressions for the dipole terms as encoded in CalcDipoles numerically over dPS 3 . Both contributions are then combined to yield the final results of the second mode of evaluation. We perform these tests with groups of one, two, or three dipoles in order to simplify the analytic integrations of the soft limits in the second version. We have successfully tested all the dipoles in this way. Typical examples are presented in Table 2.  J ] + g with dipoles 8 and 13 and for g + d → cc[ 3 P [8] J ] + d with dipole 243 using the implementations of dipole subtraction and phase space slicing, for n f = 3, α s = 1/(4π), µ f = 0.5 GeV, m Q = 0.2 GeV, (p 1 + p 2 ) 2 = 100 GeV 2 , p T,min = 2 GeV, δ c = δ, and δ s = δ/1000 with variable value of δ. For a given value of δ (first column), the soft and collinear parts (second column), the hard-noncollinear parts (third column), and their sum (fourth column) are listed. The quoted errors are the numerical-integration uncertainties.

Comparison of phase space slicing and dipole subtraction methods
In Tables 3-6, we compare our dipole subtraction and phase space slicing implementations. We calculate at NLO total cross sections, including the Born contributions, of inclusive charmonium production in proton-antiproton collisions at typical center-of-mass-energies for selected bins of transverse momentum and rapidity. As in Refs. [10,11,12], we treat the first n f = 3 quark flavors as massless, take the heavy-quark mass, defined in the on-shell renormalization scheme, to be m Q = 1.5 GeV, use set CTEQ6M [16] of proton PDFs, evaluate α s = α 0 ) = 1 GeV 5 . In the phase space slicing implementation, we choose the cut-off parameters to be δ s = δ and δ c = δ/1000, vary δ from 10 −5 to 10 −2 , and take the evaluation with δ = 10 −3 as default to be compared with the results obtained using dipole subtraction.
From Tables 3-6, we observe that the results obtained in the selected bins using the two implementations numerically agree at the level of about 10%, in line with the uncertainty inherent in the application of the phase space slicing method. Besides being more accurate, the dipole subtraction implementation is also typically much faster than the phase space slicing implementation. The reason for that is that, in the phase space slicing implementation, there is usually a much stronger numerical cancellation between the contributions from the analytic and numerical integrations than in the dipole subtraction implementation, necessitating a higher relative accuracy in the numerical integrations. However, this advantage is to some extent compensated by the fact that the θ functions in the first term of Eq. (32) cut out very different phase space regions and so worsen the convergence of the numerical Monte-Carlo integrations in the dipole subtraction implementation. Nevertheless, we observe that our dipole implementation achieves a final accuracy of 1% typically 2 to 6 times faster than the phase space slicing implementation.

Summary
In this article, we have reviewed the singularity structure of NLO NRQCD calculations of the production of heavy-quark pairs in S and P wave states and provided details of our phase space slicing implementation thereof. Thereby, we have identified a common mistake in the literature. Furthermore, we have summarized the dipole subtraction formalism for such calculations, which we have recently developed in Ref. [7], added details about its implementation in terms of computer codes, and performed internal numeric tests. Finally, we have extensively compared our two implementations numerically and found reasonable agreement. As expected, the dipole subtraction implementation outperforms the phase space slicing implementation both with regard to accuracy and speed.