Higgs production in gluon fusion beyond NNLO

We construct an approximate expression for the cross section for Higgs production in gluon fusion at next-to-next-to-next-to-leading order (N$^3$LO) in $\alpha_s$ with finite top mass. We argue that an accurate approximation can be constructed by exploiting the analiticity of the Mellin space cross section, and the information on its singularity structure coming from large N (soft gluon, Sudakov) and small N (high energy, BFKL) all order resummations. We support our argument with an explicit comparison of the approximate and the exact expressions up to the highest (NNLO) order at which the latter are available. We find that the approximate N$^3$LO result amounts to a correction of 16% to the NNLO QCD cross section for production of a 125 GeV Higgs at the LHC (8 TeV), larger than previously estimated, and it significantly reduces the scale dependence of the NNLO result.


Introduction
The dominant Higgs production mechanism at the LHC is gluon fusion via a heavy fermion loop (mainly a top quark) [1], and indeed the recent announcement of the discovery of a Higgs-like particle [2,3] is largely based on events in this channel.In view of this, an accurate determination of the cross section in this channel is of great interest.Next-to-leading order (NLO) corrections to the inclusive cross section, originally computed in Refs.[4,5] in the large top mass (m t → ∞) approximation, and in Ref. [6] for general m t are known to be as large as the leading order, and the NNLO corrections (first computed in Refs.[7][8][9] in the m t → ∞ limit and for finite top mass in Refs.[10][11][12][13][14][15]) about half as large as the leading order.The significant scale dependence of the NNLO result suggests that corrections at yet higher orders are not negligible: in fact they currently account for half or more of the uncertainty on the theory prediction for the cross section [16] (the other half being due to parton distributions and the strong coupling).
While computations of the full N 3 LO correction to the cross section are in progress [17][18][19], it is interesting to derive approximate expressions for it.Several of us have argued (see e.g.[20][21][22]) that accurate approximations to partonic cross sections may be obtained from knowledge of their N space singularity structure, both at finite perturbative order, and at the resummed level.Because the N → ∞ singularity and the rightmost singularity at finite N are known to all orders in α s respectively from threshold (Sudakov) and high energy (BFKL) resummation, if this is indeed the case it is possible to construct reliable approximations even to very high orders in α s .The possibility of constructing approximations based on the combination of results from large and small N resummation has also been considered in [23,24].
In this paper, we will pursue this idea in the context of Higgs production in gluon fusion: we will determine the dominant small N and large N singularities up to N 3 LO from resummation arguments, and, after testing our methodology against known results up to NNLO, we will use them to construct a N 3 LO approximation.

The partonic cross section and its singularities
The factorized Higgs production cross section is where L ij (z, µ 2 ) are the parton luminosities We introduce coefficient functions C ij , defined as σij z, m 2 H , α s (µ 2 R ), where σ 0 is the leading order (LO) partonic cross section, so that the coefficient function is normalized to δ(1 − z) at leading order: and for simplicity, we have suppressed the dependence on renormalization and factorization scales µ F , µ R .In the sequel, we will concentrate on the gluon fusion subprocess, while the contribution from other subprocesses will be only briefly discussed in Section 4, so in most of the discussion below we will drop the parton indices ij, and assume that both the coefficient function and luminosity refer to the gluon channel.
Because the cross section Eq. (2.1) is a convolution, its Mellin transform factorizes in terms of the Mellin space luminosity and coefficient function, respectively defined as according to σ(N, m 2 H ) = σ 0 m 2 H , α s L (N )C(N, α s ). (2.7) While in momentum space the coefficient functions are distributions, if the Mellin transform integral has a finite convergence abscissa, the N space coefficient function is an analytic function of the complex variable N , given by the integral representation Eq. (2.6) to the right of the convergence abscissa, and by analytic continuation elsewhere.Therefore, it is fully determined by knowledge of its singularities.
The singularity structure of the perturbative expansion of C(N, α s ) is relatively simple.At any perturbative order, the rightmost singularity is a multiple pole located at N = 1 [25], with further multiple poles along the real axis at N = 0, −1, −2, . .., with residues of order one (this is also what is found in all known fixed order calculations); Re N = 1 is the convergence abscissa of the Mellin transform, and as N → ∞, C(N, α s ) grows as a power of ln N .While knowledge of the residues of all poles is required in order to fully determine the function C(N, α s ), its behavior in the physical region 1 ≤ Re N < ∞ is mostly controlled by the residues of the leading (rightmost) pole at N = 1, together with that of the singularity at infinity.Both are known from resummation: Sudakov (soft gluon) resummation determines to all orders in the strong coupling the coefficients of the ln m N terms which control the behavior as N → ∞, while BFKL (high energy) resummation determines the residues of the leading 1 (N −1) n multiple poles.This suggests that an approximation of the coefficient function Eq. (2.6) may be constructed by simply combining the large N (soft) and small N (high energy) terms, .This is clearly nontrivial: for example, a term proportional to ln m N has a cut at N = 0, while at each fixed order the expected behavior of the coefficient function is a pole, rather that a cut.So the approximate expressions for C soft and C h.e. should reproduce this behavior, with no spurious singularities.
We will show in the sequel that an approximate expression of the form of Eq. (2.8) is possible, but both C soft and C h.e. will have to be carefully constructed.Indeed we will now show that constructing C soft in such a way that the small N singularity structure is preserved, the agreement at large N is considerably improved.This result may seem surprising, but it is in fact a consequence of analyticity.

Large N
We first discuss the computation of the large N (soft) part of the coefficient function.All contributions to C(N, α s ) which do not vanish as N → ∞ may be computed from Sudakov resummation, using techniques summarized long ago in Ref. [26].The resummed coefficient function has the form with Inclusion of all g i with 1 ≤ i ≤ k + 1 and of g 0 up to order α k−1 s gives the next k -to-leading log approximation to ln C res (N, α s ); it determines the coefficient of all contributions to the coefficient function of the form α n s ln m N with 2(n−k)+1 ≤ m ≤ 2n.This can be extended to 2(n−k) ≤ m ≤ 2n by also including the order α k s contribution to g 0 .The functions g 1 , g 2 and g 3 are known exactly, while g 0 is known up to O(α 2 s ).The function g 4 is only known in part [27][28][29], but the missing information (the 4-loop cusp anomalous dimension) only enters at O(α 4 s ).We can thus determine all large N non-vanishing contributions to C(N, α s ) up to O(α 2 s ), and all logarithmically enhanced contributions (but not the constant) to O(α 3 s ).The accuracy of an approximation to the Higgs production cross section at the LHC based on the dominance of threshold terms can be studied [30] by using the saddle point method to determine which is the region in N space that gives the bulk of the contribution to the cross section.It turns out that, despite the fact that Higgs production at the LHC is far from the kinematic threshold, partly because of the underlying partonic kinematics and partly because of the shape of the cross section, at the LHC with 8 TeV center-of-mass energy, logarithmically enhanced terms are still providing most of the cross section, though the situation gradually changes as the center-of-mass energy increases.However, our goal here is to construct an approximation to the coefficient function which holds for all N in the physical region.Now, it has been observed long ago [31] that the the quality of the soft approximation to the full coefficient function significantly depends on the choice of subleading terms which are included in the resummed result: indeed, while resummation uniquely determines the coefficients of logarithmically enhanced terms, there is a certain latitude in defining how the soft approximation is constructed, by making choices which differ by terms which vanish as N → ∞.A similar situation has been observed recently in Drell-Yan production at the LHC [21], for which the threshold approximation is generally expected to be less good than for Higgs production.By comparing results which differ by terms which vanish as N → ∞, we will now show that several preferred choices for such subleading terms are favored by the requirement that some aspects of the known small N singularity structure of the exact result be reproduced.
In order to outline our strategy, let us work with the simplest example.Let us first suppose that we know the N space resummed coefficient function and that we want to extract from Eq. (2.9) an approximate expression for the O(α s ) coefficient C (1) (z), which is given by [6,32] C (1) ) The constant d and the function R gg (z) are known functions of m H /m t ; in particular R gg (z) is an ordinary function, regular in z = 1, so its Mellin transform vanishes as N → ∞ and therefore its specific form is of no relevance for the large N behavior.
Expanding Eq. (2.9) to O(α s ), and keeping NLL terms, we find C (1)  res (N ) = g 1,2 ln 2 N + g 2,1 ln N + g 0,1 , (2.16) where γ E is the Euler-Mascheroni constant.The asymptotic behavior of the O(α s ) coefficient as N → ∞ is correctly reproduced by this expression, in that lim where C (1) (N ) is the Mellin transform of Eq. (2.12); the constant g 0,1 is fixed by this condition.
On the other hand, the behavior of Eq. (2.16) at small values of N is incompatible with the known singularity structure.In particular, there is a logarithmic branch cut starting at N = 0 which is definitely unphysical, as the exact coefficient function has poles and not cuts at small N .This cut is a subleading singularity, given that the leading singularity is located at N = 1, but close enough to the leading one that the behavior of the coefficient function can be significantly affected.Even if we plan to eventually improve this expression by introducing the correct singularity at N = 1 according to Eq. (2.8), the logarithmic singularity will interfere with it and spoil the accuracy of the approximation.This problem, however, is an artifact of the large N approximation, since powers of ln N are the large N approximation of powers of the digamma function ψ 0 (N ) appearing in fixed order computations.Indeed, the inverse Mellin transform of Eq. (2.16) (using Eq. (A.6b) of Appendix A.1) is seen to be C (1)  res (z, where which is seen to differ from the soft contribution Eq. (2.13) to the exact result Eq. (2.12).
This can be understood noting that singular terms as z → 1 arise from integration of the real emission diagrams over the transverse momentum of the gluon, which has the form where Λ is a collinear cut-off and p gg (z) is the LO gluon-gluon Altarelli-Parisi splitting function for z < 1, , are of the form and they appear with a coefficient proportional to the Altarelli-Parisi splitting function.Explicitly, the latter in the z → 1 limit may be expanded as Logarithmically enhanced contributions to the coefficient function are generated by the first terms in both expansions Eqs.(2.23) and (2.24), namely ln(1 − z) and A g (1) respectively.
We will now argue that an optimal choice of the soft approximation, differing from Eq. (2.19) by subleading terms, is obtained by writing the large soft logs as powers of ln 1−z √ z , so in particular retaining the √ z in the denominator despite the fact that it is subleading, and furthermore, by retaining at least the first correction on the right-hand side of Eq. (2.24), also subleading.Therefore in this case our suggestion consists in the simple replacement in Eq. (2.19), where A g,m (z) is a finite m-th order expansion of A g (z) about z = 1, Eq. (2.24), and Note that we have chosen to apply the plus prescription only to the first term, singular in z = 1, which is the natural choice in fixed order calculations.In this way, D1 (N ) differs from D 1 (N ) only by terms vanishing at large N .Since D log 1 (N ) and D1 (N ) differ at large N by a constant, the coefficient g 0,1 must be modified accordingly, in order that the requirement Eq. (2.18) be satisfied.These technical details are discussed in Appendix A.1.

Our conclusion Eq. (2.25) relies on the following arguments:
• The replacement of D log 1 (z), whose Mellin transform is with D1 (z), whose Mellin transform is removes the logarithmic branch cut of D log 1 (N ), which is incompatible with the known analytic structure of the coefficient function.The only singularities are now isolated poles, as in the exact expression.
• The same features are shared by the Mellin transform of D 1 (z), that is However, the presence of ψ 1 (N ) exactly cancels the double poles of ψ 2 0 (N ) in N = 0, −1, −2, . .., which are there in the exact result.Therefore, the choice of D1 (N ) is preferred over D 1 (N ).
• In the replacement Eq.(2.25) the factor A g (z) is expanded up to a finite order m > 0 about z = 1.This is because the inclusion of the full A g (z) would introduce a spurious singularity in N = 1.Indeed, the Mellin transform of A g (z) D1 (z) is given by The first term, due to 1/z in A g (z), has a double and a simple pole in N = 1, while the exact singularity is a simple pole, with a (m H /m t )-dependent coefficient controlled by small z resummation.The expansion of A g (z) in powers of 1 − z to any finite order is not singular in z = 0, and therefore does not affect the singularity structure around N = 1.
We turn now to the general case.Each of the above arguments can be generalized to all orders, where N space resummed results contain powers of ln k N , whose inverse Mellin transform is a linear combination of distributions D log j (z) Eq. (2.20) with j ≤ k − 1.The fact that the NLO result in z space depends on powers of ln 1−z √ z rather than ln ln 1 z is of kinematical origin and ultimately comes from the upper bound for the transverse momentum of emitted gluons, Eq. (2.21), and therefore it persists to all orders.It follows that the exact result to all orders is expressed in terms of distributions Dk (z), defined in Eq. (A.2c) of Appendix A.1 in analogy with Eq. (2.26).The Mellin transform of such distributions, Dk (N ), first, has poles rather than cuts as small N singularities, and also, in comparison to the distributions D k (N ), lacks contributions proportional to powers of ψ k (N ) with k odd, which would change the pole structure (see Appendix A.1).
It has been shown in Refs.[31,33] that the factor A g (z), Eq. (2.14), is present to all orders, because the full leading order anomalous dimension exponentiates.However, terms beyond the first in its expansion Eq. (2.24) generate contributions α n s (1 − z) j ln 2n−1 (1 − z) with j ≥ 0 to the coefficient functions, which are generally of the same order as other terms which we do not control.However, it can be shown [34] that the inclusion of the O (1 − z) 1 term in the expansion Eq. (2.24) correctly predicts, after exponentiation, the subdominant contributions of the form α n s ln 2n−1 (1 − z) (i.e., in N space, terms behaving as α n s N −1 ln 2n−1 N at large N ) to all orders, so the inclusion of this term rests on firm ground.
Including the O (1 − z) 1 from Eq. (2.24) we get which is easily implemented to all orders by the replacement Including also the next order gives which amounts to replacing in the N space expressions.The third order term of the expansion of A g (z) is accidentally zero, so A g,2 (z) = A g,3 (z).We have checked that the inclusion of terms of order (1 − z) 4 and higher in the expansion of A g (z) does not affect our results significantly.We will consider both the expansions to first and second order, and use their difference as a means to estimate the uncertainty on the result.Specifically, we will take the mid-point between them as our best prediction, with the firstand second-order expansion result giving the edges of the uncertainty band.
In summary, our soft approximation (to be combined with small N terms determined in the next Section) is constructed in the following way.The resummed expression Eq. (2.9) can be rewritten where the coefficients b n,k are obtained from the functions g i , Eq. (2.11), and have been determined up to n = 3 [28,29].The function g 0 (α s ) is known only up to O(α 2 s ); the uncertainty associated to g 0,3 will be discussed in Sect.3.
The replacements Eq. (2.32) or (2.34) are then applied to Eq. (2.35).We obtain, respectively, where we have defined so that the condition Eq. (2.18) is satisfied to all orders after the replacement.Explicit expressions for the coefficients b n,k and d k are given in Appendix A.1.
Equations (2.36) can be cast in the form which is now expanded in powers of α s : We obtain As a test of our procedure we now compare the first two orders of our soft approximations Eq. (2.36) to the full result.Note that in the sequel when comparing to known results, and also when constructing our O(α 3 s ) approximation, we will always be retaining the exact m t dependence.As terms of comparison, at NLO we use the finite-m t result of Ref. [6] (using the numerical implementation of Ref. [32]), while at NNLO we use the approximate finite-m t result obtained by matching the double expansion in powers of 1 − z and m H /m t of Refs.[11,12] to the known small z terms computed in Ref. [10] according to Ref. [13] (see Refs. [14,15] for further approximate finite-m t results).Note that the soft limit only depends on m t through the function g 0 (α s ) of Eq. (2.9).
Results are shown, as functions of N along the real N axis, in Fig. 1.We find the comparison in N space to be most instructive, because the coefficient function is then an ordinary function, rather than a distribution as in z space.Furthermore, the saddle point which dominates the Mellin inversion is on the real axis [30].All this said, it should be kept in mind that the physical cross section is obtained by Mellin inversion of the product of the N space coefficient function and luminosity: therefore, agreement on the real axis is certainly necessary, but in general not sufficient for agreement of the physical results.In particular, spurious singularities (and in particular spurious cuts) may substantially modify the behavior of the coefficient function in the complex plane.
In order to understand the role of various subleading terms, we also show in Fig. 1 the results obtained expanding the resummed expression, Eq. (2.35), which is built up from the distributions D log k (z), Eq. (2.20), and thus it has spurious cuts starting at N = 0 (labeled N -soft), and the one obtained expanding the resummed expression Eq. (2.35) in powers of α s and then replacing (and adjusting the constant term), where D k (N ) are the Mellin transforms of the D k (z) distributions, Eq. (2.13), so it does not include the contributions coming from the 1/ √ z in the phase space integration (labeled soft-0).We finally show the results found expanding out the collinear-improved resummed results of Ref. [35], (labeled N -soft-collinear): these differ from the N -soft curves by the addition to g 0 Eq.(2.9) of an O(1/N ) contribution of collinear origin.This is akin to the collinear improvement which is effected in our result by the shift Eq.(2.32): indeed, the subdominant α n s N −1 ln 2n−1 N contributions (which as mentioned above are universal) generated by this collinear improvement coincide with those which we also include through our shift.
While our preferred options clearly provide the best approximation to the exact result in the soft region N 2, it is interesting to observe that the N -soft form, based on Eq. (2.35), despite having the wrong singularity structure (cuts rather than poles at small N ), still provides a reasonable approximation, though not quite as good as our preferred ones.This can be understood noting that [21] ln k ln 1 , this choice for the logarithms is very similar to our C soft 1 approximation Eq. (2.36a), up to a shift in N by 1/2.The N -soft-collinear result is quite close to the N -soft, to which it approaches at small N , but somewhat closer to our own, especially at large N (by construction the N -soft and the N -soft-collinear results coincide when N = 1).

Small N
The leading small N singularities for the Higgs inclusive cross section have been determined to all orders in α s in Ref. [36] in the m t → ∞ limit, and in Ref. [10] for finite m t .These results have been obtained by means of the so-called high energy or k t factorization technique of Ref. [37], which has been subsequently used to compute high energy cross section for an increasing number of processes [38][39][40][41][42] and more recently extended to rapidity distributions (and also used to determine all order results for Higgs production) in Ref. [43].
In this formalism, small N singularities are obtained to all orders by computing the leading order partonic cross section for the relevant process, but with off-shell incoming gluons.They are extracted from the off-shell coefficient function C off-shell , defined through where z is the scaling variable defined in the previous section, while of the i-th incoming gluon, and the angle between the incoming transverse momenta is integrated over.To do this, one defines the impact factor h(N, M 1 , M 2 ) according to where the pre-factor R accounts for factorization scheme dependence [44], and in MS is given by The determination of the coefficients c i 1 ,i 2 (m t , m H , µ F ) has been reduced to quadratures to all orders in Ref. [10]; they have been numerically determined up to and including second order in α s in [10] and up to and including fourth order in [45].
The leading singularities of the partonic coefficient function are obtained by identifying the Mellin variables M i with the anomalous dimension γ + s .This, in turn, is the eigenvalue of the singlet anomalous dimension matrix which contains, to all orders in α s , the contributions with the highest powers of the rightmost N space singularities.Indeed, as well known, only one of the two eigenvalues (which henceforth we will refer to as the "large" eigenvalue) has singularities at N = 1,1 while the other has singularities at N = 0.
In other words, the leading singularities are found letting M i = γ + s , with where the coefficients e n−1, −n are determined [46] using duality [47] from the leading order BFKL kernel.The first 35 coefficients e n, −n are tabulated in Ref. [48]; the first few have accidental zeros, and are given by e 0,−1 = C A /π, e 1,−2 = e 2,−3 = 0, e 3,−4 = 2ζ 3 (C A /π) 4 , e 4,−5 = 0.It follows that to k-th order in α s the coefficient function has a k-th order pole in N = 1.Note that in the heavy top limit the small N singularity structure is different, in that at each extra order in α s the order of the pole increases by two units [36].However, these double poles are unphysical, and follow from a breakdown of the large m t approximation at high energy: we will thus not discuss them further.
It has been shown in Refs.[20,49] that the nature of the small N singularity of coefficient functions at the resummed level is entirely determined by the singularity of the resummed anomalous dimension γ + .However, reproducing the correct all order small N singularity of the anomalous dimension (which is a simple pole to the right of N = 1, but close to it) requires [50] the all order inclusion of two classes of subleading terms on top of the leading (or next-to-leading) singularities Eq. (2.47): namely, running coupling corrections, without which the small N leading singularity would be a square-root cut instead of a simple pole [51], and anticollinear terms [52] without which the perturbative expansion of both the position and residue of the above simple pole would not be stable (similar conclusions can also be arrived at from a study [53,54] of the BFKL [25] equation).The inclusion of a further series of all order running coupling corrections in the coefficient function is further required [20,49] in order for this not to develop extra spurious singularities.When expanded out in perturbation theory, these running coupling corrections correspond to series of contributions of increasingly low logarithmic order (i.e.increasingly subleading): we will retain both up to the NLL order, i.e. keeping not only the leading singular contribution to each order in α s , but also the first subleading correction, i.e. to order α k s both the contributions with a k-th and a (k − 1)-th order pole in N = 1.
For anomalous dimensions this is simply done by including the full next-to-leading singular contribution to them.For coefficient functions, these running coupling corrections are found by letting, in Eq. (2.45), M k i = γ + res k , with γ + res k given recursively by [20,49] γ where and with γ + res we have denoted a form of the large eigenvalue which includes at least the leading singularities Eq. (2.47), but may include other subleading contributions.
We can now compute the small N approximation to the coefficient function.We expand the anomalous dimension to fixed perturbative order The leading and next-to-leading singularities of the anomalous dimension γ + are given by where the coefficient of the leading poles can be read off Eq. (2.47): e 0,−1 = C A π and e 1,−2 = e 2,−3 = 0.The other coefficients are: ) The N → 1 result for the partonic coefficient function in the gluon channel can be then obtained where we have omitted the dependence of the coefficients on m t , m H and µ F for simplicity.
Because we wish to combine the small N behavior which we are determining here with the large N behavior determined in Sect.2.1, we must make sure that the small N contribution to the coefficient function vanishes as N → ∞.However, the coefficient function C ABF (N ) Eq. (2.53) manifestly does not vanish in the large N limit, because of the constant contribution to γ (0) Eq. (2.51) which propagates into C ABF (N ) to all orders in α s .
Therefore, we construct an improved small N approximation to the coefficient function as a subtracted version of C ABF (N ).The subtracted coefficient function has the same leading small N singularities as C ABF (N ), but it vanishes as N → ∞.It is given by (2.54) ABF-sub (N ) and C (n) ABF have the same leading N = 1 singularities: the subtraction only introduces subleading N = 0 and N = −1 singularities.However, lim N →∞ C (n) ABF-sub (N ) = 0. Of course, many forms of the subtraction are possible: the particular one given in Eq. (2.54) has been chosen as a compromise between the contrasting goals of not changing the small N behavior and of damping strongly enough at large N .In z space, the subtraction Eq. (2.54) corresponds to damping the z → 1 behavior of the coefficient function through a multiplicative factor (1 − z) 2 .
In view of combining the small and large N approximations to the coefficient function, one may ask what is the expected transition point between the two approximations.In order to answer the question, a relevant observation is to note that momentum conservation implies that γ + (2) = 0 to any order in perturbation theory.This in particular implies that all C (n) ABF (N ) vanish at N = 2.This suggests that N = 2, which is a fixed point for the anomalous dimension, marks the transition between the small N approximation (not accurate when N 2) and the large N approximation (not accurate when N 2).In particular, because the coefficient function Eq. (2.53) is a polynomial in γ + (N ), it vanishes at N = 2 if the anomalous dimension does.
However, the small N approximation Eq. (2.50) to the anomalous dimension does not respect momentum conservation, because it only includes the contribution to γ + from the leading and nextto-leading singularities in N = 1 Eq.(2.51), and not the full fixed order expression of γ + Eq. (2.51).Momentum conservation can be enforced [49] by adding to C (n) ABF (N ) a function f mom (N ).This function must not introduce spurious singularities at N = 1 and it should also be subdominant with respect to the large N contributions that we control in C soft (N ).A natural choice appears to be f mom (N ) = c/N , with c fixed so that, after subtraction Eq. (2.54), our small N coefficient function vanishes in N = 2.With this choice, the small N approximation of the coefficient function becomes . (2.55) Note however that the exact coefficient function does not in general vanish at N = 2, only the contribution to it driven by hard radiation from external legs and expressed in terms of the anomalous dimension does.Thus, for instance, contributions from subdominant poles in N = 0, −1, −2, . . .will in general lead to a non-vanishing contribution to the coefficient function in N = 2.Because we do not control such a contribution, we estimate it by allowing the coefficient function to deviate from zero at N = 2, by modifying the value of the constant in the subtraction term of Eq. (2.55).We take this deviation from zero to reach as its maximum value 5% of the size of the soft contribution Eq. (2.40) at N = 2, C soft (2), with either sign; namely, we choose in Eq. (2.55) ( This means that the small N contribution, rather than being completely switched off at N = 2, is small at that point, and gets switched off somewhere in its vicinity. Our final result Eq. (2.55) for the small N contribution C (n) h.e.(N ) to the coefficient function, as well as several small N approximations are compared to each other and to the known full result at NLO and NNLO in Fig. 2. The full result is shown both in the pointlike approximation (labeled as large m t ), and for finite m t (labeled as exact): the different small N behavior of the pointlike result, due to spurious double poles, is apparent.The small N approximations Eq. (2.53) (labeled AFB), and its subtracted version Eq. (2.54) (labeled ABF-sub) are seen to provide an equally good approximation to the exact result in the very small N region where the latter is dominated by its small N poles, but only the subtracted version vanishes at large N .The final result Eq. (2.55) after enforcing momentum conservation of the anomalous dimension is finally shown (labeled high energy), with an uncertainty band obtained by varying the size of (2) about zero as discussed above: it coincides with the small N approximation for 1 ≤ N 1.25, but it is gradually switched off for larger N until vanishing in the vicinity of N ∼ 2.
3 Approximate cross sections up to N 3 LO

Parton level results
We can now construct an approximation to the full coefficient function.Having constructed a large N approximation C soft (N, α s ) Eq. (2.40) and a small N approximation C h.e.(N, α s ) Eq. (2.55) to the coefficient function, in such a way that the small N term does not spoil the large N singularities and conversely, we can combine them using Eq.(2.8), which we then expand out in powers of α s according 0.8 0.9 and the combined small and large N approximation Eq. (3.1) (approx).The bottom plot shows the ratio of the approximate results to the exact result.Note that at NNLO the "exact" result is in fact the approximate construction of Ref. [11,12].
to Eq. (2.4), so that at N k LO we have Before turning to the N 3 LO, which is our main result, we first compare the NLO and NNLO results found using our procedure to the corresponding exact results.We will use m H = 125 GeV and m t = 172.5 GeV throughout.The comparison is shown in Fig. 3, where our best approximate result C approx (N ) Eq. (3.1) is then obtained as the envelope of these uncertainty bands (red band).In each plot we also show the ratio of the approximate result to the exact one.
It is apparent that the approximate results reproduce the exact one within the uncertainty in the full region of real N > 1 at NLO, while at NNLO there is a small disagreement (of about 5%) very close to N = 1.Note, however, that in this region what we call "exact" result is not necessarily reliable: indeed, in the absence of a full NNLO result we are taking as exact the matching of Ref. [11,12] of a double expansion in powers of 1 − z and m H /m t with the exact leading (double) N = 1 pole computed in Ref. [10].In particular, the contribution from subleading poles (single pole at N = 1 and multiple poles for non-positive integer N ) in the "exact" result are not correctly reproduced, while in our approximate expression they are partly estimated by varying the size of C (n) h.e.(2).We now consider our new result for the N 3 LO coefficient function: C  The coefficients in the large N contribution Eqs.(2.36) are collected in Appendix A.1, except the coefficient ḡ0,3 , which is unknown: unless stated otherwise, the results are presented with ḡ0,3 = 0.This is a coefficient in the expansion of the constant function ḡ0 (α s ), related by Eq. (2.37) to the function g 0 (α s ) which appears in the resummed expression Eq. (2.9).The general relation between the coefficients g 0,n and ḡ0,n , is discussed in Appendix A.1, see in particular Eq.(A.17): it turns out (see Tab. 2 and Ref. [28]) that the known coefficients g 0,n are rather larger than ḡ0,n .This is also the case for Drell-Yan production [28].For this reason, at third order, where g 0,3 = ḡ0,3 + r 3 , with  r 3 = 114.7,we will take ḡ0,3 = 0 (rather than g 0,3 = 0) as preferred choice (see also a corresponding discussion in Ref. [28]).Coming now to the small N expression Eq. (2.53), the coefficients c i,j are collected in Appendix A.2, while explicit expressions for γ (i) are given in Eq. (2.51).The function C approx (N ) is plotted in Fig. 4, together with the soft approximation (bounded by the two curves C (3) soft 1 (N ) and C (3) soft 2 (N )) and the high energy approximation (given by C h.e.(N )).We finally turn to the behaviour of the perturbative expansion of the coefficient function.In Fig. 5 we compare the NLO, NNLO and N 3 LO truncations of We note that at moderately large N 4 (where we expect our approximation to be very accurate) the O(α 3 s ) contribution is significant, so the convergence of the series is quite slow.On the other hand, the saddle point argument of Ref. [30] implies that the dominant contribution at LHC energies comes from the region ∼ 2, where convergence is much faster, though the N 3 LO contribution is still quite large.

Hadron level results
We now discuss the corresponding hadron level quantities.To this purpose, we define the gluon channel K-factors where H ) is the contribution from the gluon channel to the cross section Eq. (2.1), which implies that and and we use everywhere the NNLO expression of α s , and NNLO parton distributions.
We then compute the K-factors using various approximations for the coefficient function, and compare them to each other and, at NLO and NNLO, to the exact result.Specifically, besides our preferred approximation Eq. (3.1) shown in Figs.3-4, we also show results obtained using the soft contribution C (n) soft to the coefficient function (also shown in Figs.3-4), as well as the N -soft approximation Eq. (2.35) shown in Fig. 1 and also determined with ḡ0,3 = 0.This N -soft approximation is essentially the same as the N 3 LO approximation previously published in Ref. [28], though here, unlike in Ref. [28], we include the full m t dependence.We use the NNLO NNPDF2.1 [55] set of parton distribution function, with α s (m 2 Z ) = 0.119, and with the scale choice µ F = µ R = m H .The scale dependence will be studied in Sect. 4 below.
Results are shown in Fig. 6, where the various contributions to the functions K (n) gg are plotted as a function of the collider energy √ s.As the energy increases, τ becomes smaller and one would expect small z effects to become more relevant.Indeed, we observe that the soft approximations deviates from the exact results, while the full approximation reproduces well the shape of the cross section for all √ s.On the other hand, at low energies the red and green curves (and corresponding uncertainty bands) tend to coincide, meaning that the small z contribution has become negligible.In that region, we also observe that the bottom edge of the uncertainty band (which is obtained using C soft 2 Eq.(2.36b)) better approximates the exact result than the top (obtained using C soft 1 Eq.(2.36a)).Finally, we note the N -soft curve always undershoots the exact result, the more so at higher perturbative orders.This agrees with the behaviour of the N -soft curve in Fig. 1.
In Fig. 6 we also show our prediction for the N 3 LO K-factor.The third order term is quite large at small energy √ s ∼ 1 TeV, where the soft contribution is dominant, but it remains sizable even for √ s ∼ 10 TeV, where it is almost half of the NNLO.This slow convergence of the perturbative series may make all order resummation mandatory.In the low energy region the uncertainty band on our soft approximation is quite narrow, and the N -soft curve is well below and outside it.At very high energies the convergence of the perturbative expansion seems to improve, but very slowly, and the uncertainty on our prediction increases.

N 3 LO Higgs production at the LHC
We now concentrate on Higgs production at the LHC at √ s = 8 TeV, with m H = 125 GeV.In Table 1 we present the K-factor K gg Eq. (3.3), computed again with the NNLO NNPDF2.1 [55] PDF set and α s (m 2 Z ) = 0.119.Results at NLO, NNLO and N 3 LO are compared to the exact results (when available) as well as the N -soft approximation which, as mentioned in Sect.3.2, is essentially the same as the approximation published in Ref. [28], from which it differs because of the inclusion of finite top mass effects.We show results for two choices of the renormalization scale, µ R = m H and µ R = m H /2, √ s, computed using the various approximations to the coefficient function shown in Figs.3-4.We also show approximation based on using the N -soft coefficient function of Fig. 1, which at N 3 LO is close to the result of Ref. [28].
both with µ F = m H : as we shall see below, the factorization scale dependence is essentially negligible, even at LO.The uncertainty on our prediction has been determined as discussed in Sect.3. Our approximate result agrees with the exact result at NLO and NNLO within its stated uncertainty.The N -soft approximation leads to a systematically smaller result, the more so at higher perturbative orders.
We now turn to our result for the full N 3 LO Higgs production cross section at central scale where the error shown is our estimate of the uncertainty in our approximation procedure, and we have separated off the contribution from the unknown coefficient ḡ0,3 , discussed in Sect.3.1 above.As discussed in Sect. 3 our default choice is ḡ0,3 = 0, on the grounds that the perturbative behaviour of the ḡ0,i coefficients (see Table 2) suggests that ḡ0,3 is possibly of order ten or so (while the coefficient g 0,3 is likely to be rather larger, perhaps of order hundred).
We have computed the LO, NLO and NNLO contributions to the cross section, with full top mass effects [6,12], and including all partonic subprocesses, while the prediction at N 3 LO only contains the approximate coefficient function for the gluon channel.We have cross-checked results in the pointlike limit against the ihixs code [57], and the top mass dependence against the numerical implementation of Ref. [32].
Our approximation is compared to the exact result (when available) and to the N -soft approximation, which up to NNLO coincides with the fixed-order truncation of the resummed result [35,56], and at N 3 LO is close to it and to the result of Ref. [28] (see text).The uncertainty shown corresponds to the band in Fig. 6.With µ R = m H , the N 3 LO amounts to a 17% correction to the NNLO prediction σ NNLO = 19.33 pb.This correction is larger than that found in Refs.[35,56] using NNLL resummation, which increases the NNLO result by about 8%.If expanded out to finite order, the resummed result of Refs.[35,56] coincides with the N -soft approximation, which at N 3 LO is by little more than 1 pb (corresponding to 6% of the NNLO) smaller (see Fig. 8 below).Also, the result of Refs.[35,56] corresponds to taking g 0,3 = 0 instead of ḡ0,3 = 0 as we do in our default result, given that in the NNLL expression g 0 (α s ), Eq. (2.9), is included up to order α 2 s .With this choice, the N 3 LO is further reduced by about 5% of the NNLO, down to a correction of about 6%.The extra 2% or so in Refs.[35,56] is accounted for by N 4 LO and higher orders.With µ R = m H /2 as sometimes [57] advocated, the impact of the N 3 LO corrections is reduced to 11.5%, but the difference between our result and the N -soft prediction (and thus also that based on NNLO resummation) increases, from about 5% to about 8.5%, see Fig. 8 below.
We now study the dependence of the cross section on variations of the renormalization and factorization scales, µ R and µ F respectively.We first show the scale dependence of the known NLO and NNLO cross sections in Fig. 7, with the the choices of renormalization scale µ R = m H and µ R = m H /2 used to compute Tab. 1 shown as vertical lines.The two choices µ F = m H and µ F = m H /2 are also shown, even though only µ F = m H was used form Tab. 1.We consider both a simultaneous scale variation in all partonic subprocesses (black curves), as well as the scale variation for the gluon-gluon subprocess only.The renormalization scale dependence of the full result is not much different from that of the gluon contribution.The factorization scale dependence instead is much stronger for the gluon channel alone than for the full result.This cancellation of the factorization scale dependence between partonic subchannels is a direct consequence of the known structure of the Altarelli-Parisi equations.The factorization scale of the full result turns out to be essentially negligible, thereby justifying the choice not to show the dependence on it in Tab. 1.
The scale dependence of our N 3 LO result is displayed in Fig. 8.We only show the renormalization scale dependence: the factorization scale dependence of the N 3 LO result will be weaker than that of the NNLO, which is already negligible.Also, our N 3 LO result only includes the (dominant) gluon contribution, so its factorization scale dependence would be misleadingly large, and canceled by a contribution from the quark channels.
The N 3 LO contribution reduces the renormalization scale dependence of the NNLO QCD result from ±10% to ±7% if the scale is varied in the range 0.5 < µ R /m H < 2. We also show the prediction obtained using the soft approximation C N -soft , with ḡ0,3 = 0, i.e. essentially the approximation of Ref. [28], as well as the prediction obtained by performing a collinear improvement of the latter [35] (labeled N -soft-collinear, see Sect.2.1, Fig. 1).The fact that he N -soft result is rather smaller than our own is clearly seen.The collinear improvement of Ref. [35] has a negligible impact, and indeed it has therefore not been included [58] in the recent phenomenological results of Ref. [28,56].As seen in Fig. 6, for central scale choices m H /2 µ R m H the the difference between our approximate result and the N -soft approximation is due almost entirely to our different way of treating subleading soft terms, and this is thus the reason why correction is more substantial than those of Refs.[28,56] (note that in Ref. [28] a smaller value of α s (m Z ) is adopted, which would lead to a yet smaller result).The scale dependence of our result is similar to that of the N -soft result and its collinear improvement (and thus to that of Refs.[28,56]) towards the high end, but it has a different shape towards lowers scales, where it is much weaker, partly due to the matching with the small N terms.

Conclusions and Outlook
We have determined an approximate expression for the N 3 LO Higgs production cross section in gluon fusion, with finite top mass.We have considered the dominant gluon channel only.Our approximation is based on combining information on the large N and small N singularities of the coefficient function, which are determined from resummation, while making sure that they do not interfere with each other, so that large N terms do not introduce spurious small N singularities, and conversely.Small N resummation in unfortunately only known to the leading logarithmic level (unlike large N resummation, which is known up to N 3 LL order), so our approximation looses accuracy at small N , but fortunately Higgs production in gluon fusion is dominated by large N terms down to fairly moderate N values [30].
We have found that at √ s = 8 TeV this correction leads to a 17% increase of the cross section for µ R = muf = mh = 125 GeV and it noticeably reduces the scale dependence of the NNLO result.Our correction is larger than that previously found in Refs.[28,56] essentially because of our different treatment of subleading soft terms, while its scale dependence, especially towards lower scales, is milder due to the matching to the small N "BFKL" terms.The difference becomes yet larger for lower scale choices.
The results presented here can be used to improve the prediction for standard model Higgs production in gluon fusion, and to the very least they provide an estimate of the impact of higher order corrections on the currently known NNLO result which is rather more reliable that the commonly used scale variation.A public code is available at http://www.ge.infn.it/∼bonvini/higgs/While we have concentrated on the dominant gluon channel the inclusion of other partonic channels along the same lines is possible.More interestingly, our approach could also be extended to the construction of approximate expressions for rapidity distributions [21,43].Both are left for future work, as well as the construction of a fully resummed result, in which the large and small N terms are included to all orders in α s , and the extension to other processes, specifically Drell-Yan production.In particular, Eq. (A.10c) shows that ψ j (N ) with j odd never appear in Υ (k) (N, 0).More details can be found in Ref. [59].
We now consider the coefficient function in the soft limit, which is completely fixed by the coefficients b n,k and by the function g 0 (α s ) appearing in Eq. (2.35), which we reproduce here: The coefficients b n,k depend only on soft gluon radiation, and therefore do not depend on m H or m t .They can be computed within the effective theory in which the top is integrated out and the top loop shrinks to a point (pointlike approximation).On the other hand, g 0 (α s ) depends on m H /m t , and therefore it also depends on whether the pointlike approximation is used or not.The dependence of g 0 (α s ) on the ratio m H /m t obviously affects logarithmic terms by interference in C res (N, α s ) but such dependence is under control.
We now list the explicit coefficients b n,k for n = 1, 2, 3. We omit the scale dependence, which can be restored by imposing scale invariance of the hadronic cross section.The order α s coefficients are We now turn to the function α n s g 0,n . (A.15) The first two terms of the expansions are known, and for m H = 125 GeV, m t = 172.5 GeV and n f = 5, are given by 2 g 0,1 = 8.7153; g 0,2 = 40.10,(A.16) but g 0,3 is still unknown.The function ḡ0 (α s ) is related to g 0 (α s ) by Eq. (2.37), which we rewrite here using the explicit values of the d k , Eq. (A. and the r n can be read off Eq. (A.17) order by order in α s .It is interesting to observe that each r n depends on g 0,j with j < n; in particular, r 3 does not depend on the unknown coefficient g 0,3 .The numerical values of r n for n = 1, 2, 3 are given in Table 2.We note that r 3 is of order 10 2 , which is the order of magnitude of a naive estimate of g 0,3 on the basis of the known values of g 0,1 , g 0,2 .

A.2 Small N contributions
The coefficients c i 1 ,i 2 of the small N singularity, Eq. (2.45), were expressed in terms of single and double integrals over the off-shell gluon virtualities in Refs [10,45].The NLO coefficient g0,1 has been computed numerically using the implementation of Ref. [32] of the exact result [6].The NNLO coefficient g0,2 was computed in Ref. [12] as an expansion in powers of (mH /mt) 2 .We have checked that truncating the expansion to order 4 the result is accurate at the per mille level.

Figure 1 .
Figure 1.The partonic coefficient function Eq. (2.40) in N space for m H = 125 GeV at NLO (left)and NNLO (right) and its soft approximation in various forms: our preferred choices Eqs.(2.36), denoted as soft 1 and soft 2 , the simpler approximations based on D log (N ) as in Eq. (2.35), denoted as N -soft, its collinear-improved version from Ref.[35], and the approximation based on D(N ), as in Eq. (2.42), denoted as soft-0.

Figure 2 .
Figure 2. Comparison of the NLO (left) and NNLO (right) exact coefficient function both for finite m t (exact) and in the m t → ∞ limit (large m t ), and several small N approximations to it: the NLO and NNLO contributions to C ABF Eq. (2.53) (ABF), to C ABF-sub Eq. (2.54) (ABF-sub) and to C h.e.Eq. (2.55) (high energy).

Figure 3 .
Figure3.Comparison of the NLO (left) and NNLO (right) exact coefficient functions to various approximations to it.The large N approximation, corresponding to the band between the soft 1 and soft 2 curves in Fig.1(soft); the small N approximation, corresponding to the high energy curve in Fig.2(high energy); and the combined small and large N approximation Eq. (3.1) (approx).The bottom plot shows the ratio of the approximate results to the exact result.Note that at NNLO the "exact" result is in fact the approximate construction of Ref.[11,12].
Eq. (3.1) (labeled as approx) is shown along with the large N C (k) soft (N ) (labeled as soft) and small N C (k) h.e.(N ) (labeled as high-energy) terms which contribute to it.As discussed in Sect.2.1 and Sect.2.2 respectively, the uncertainty on C soft (N, α s ) is obtained as the spread between the two different forms Eq. (2.36) of the large N approximation (green band), while the uncertainty on C h.e.(N, α s ) is obtained by varying the size of C (n) h.e.(2) about zero (blue band).The uncertainty on C (k) is given either by Eq. (2.36a) or Eq.(2.36b), and C (3) h.e.(N ) is given in Eqs.(2.54), (2.55) in terms of C

Figure 6 .
Figure 6.The NLO, NNLO and N 3 LO contributions to the K-factor Eq. (3.5) in the gluon channel only as a function of the collider energy √ s, computed using the various approximations to the coefficient function shown in Figs.3-4.We also show approximation based on using the N -soft coefficient function of Fig.1, which at N 3 LO is close to the result of Ref.[28].

Figure 7 .
Figure 7. Dependence of the NLO and NNLO cross sections on the renormalization scale µ R and factorization scale µ F .The curves labeled gg (N)NLO are obtained including all channels at (N)LO and the gluon-gluon contribution at (N)NLO.The two choices of renormalization scale used to compute Tab. 1 are shown as vertical bars.The two corresponding choices of factorization scale are also shown (but only µ F = m H is used in Tab. 1).

Figure 8 .
Figure 8. Dependence of the N 3 LO cross section on the renormalization scale µ R .The two choices of renormalization scale used to compute Tab. 1 are shown as vertical bars.

1 2 H µ 2 F 2
For m H = 125 GeV, m t = 172.5 GeV and n f = 5, their numerical values for µ F = m H are given in MS by c can be easily restored by the substitutions c 1,0 → c 1,0 + F F = ln m

Table 1 .
NLO, NNLO and N 3 LO contributions to the gluon fusion K-factors Eq. (3.3), computed with two different choices of the renormalization scale, µ F = m H and µ

Table 2 .
Numerical values of g 0 and ḡ0 at various perturbative orders.