NLO matching for ttbb production with massive b-quarks

Theoretical uncertainties in the simulation of ttbb production represent one of the main obstacles that still hamper the observation of Higgs-boson production in association with top-quark pairs in the H → bb channel. In this letter we present a next-to-leading order (NLO) simulation of ttbb production with massive b-quarks matched to the Sherpa parton shower. This allows one to extend NLO predictions to arbitrary ttbb kinematics, including the case where one or both b-jets arise from collinear g → bb splittings. We find that this splitting mechanism plays an important role for the ttH(bb) analysis.

The recent discovery of the Higgs boson and the first measurements of its interactions make it possible to probe the mechanism of spontaneous symmetry breaking, by which elementary particles acquire their mass [1,2]. Data collected in the first run of the LHC provide significant sensitivity to Higgs-boson interactions with force carriers (gluons, photons, Z and W bosons), while constraints on Higgs couplings to matter particles (leptons and quarks) are less stringent and stem mostly from indirect effects on Higgs-gluon and Higgs-photon couplings. The direct investigation of Higgs-boson couplings to quarks and leptons will thus represent a crucial further step towards a complete understanding of the origin of mass. In this context, the reaction pp → ttH(bb), i.e. Higgs-boson production in association with a top-quark pair with subsequent Higgs-boson decay into a bottom-quark pair, provides a unique opportunity to test the mass-generation mechanism in the heavy-quark sector. This process is notoriously challenging due to the presence of four b-quarks in the final state, which hampers a correct identification of the Higgs-boson mass peak. As a result, the ttH signal is strongly contaminated by background contributions from top-quark pair production in association with light-, charm- and bottom-jet pairs. The large uncertainty in the Monte-Carlo simulations of these multi-particle QCD backgrounds represents one of the main bottlenecks of the present ttH(bb) analyses [3,4], and the availability of state-of-the-art theory predictions for ttjj, ttcc, and ttbb production is a key prerequisite to improve the sensitivity to the ttH(bb) signal. In the case of the irreducible ttbb background, theory predictions play an especially important role, since the lack of sufficiently distinctive kinematic features and the rather small cross section do not allow for an efficient ttbb measurement in a signal-free control region.
NLO calculations for ttbb [5-8] and ttjj [9,10] production can reduce perturbative uncertainties from 70-80% down to 15-20%. However, in order to be applicable to the experimental analyses, these calculations need to be matched to parton showers. Matched NLO predictions for pp → tt + ≤1 jets, with consistent merging of 0- and 1-jet final states, have been presented in [11], and first technical results towards NLO matched ttbb production have been discussed in [12], where the NLO calculation of [7] was matched at the level of the first shower emission with the PowHeg approach [13]. In this letter, we present a fully showered NLO simulation of ttbb production. Besides matching NLO matrix elements to the parton shower with the MC@NLO method [14], for the first time we also include finite b-quark mass effects. This represents the first complete NLO-matched simulation with four (massive) coloured particles in the final state. Using massive b-quarks we can extend the simulation to the whole ttbb phase space, thereby also including tt + 1 b-jet contributions with an unresolved (soft or collinear) b-quark, which play an important role in the ttH(bb) analysis. Moreover, matching massive NLO matrix elements to the parton shower gives access to novel tt + b-jets production mechanisms, where b-jets arise from hard gluons via collinear g → bb splittings. In particular, one can describe tt + 2 b-jet events where both b-jets originate from g → bb splittings (see Fig. 1). For configurations of this kind, which turn out to be quite important, the finite b-quark mass allows one to obtain an NLO accurate description of the first g → bb splitting, while simulations with massless b-quarks must rely on ttgg matrix elements plus pure parton-shower splittings in the collinear regions.
The presented simulation has been prepared within the Sherpa+OpenLoops framework [15-17], which supports the fully automated simulation of any Standard-Model process at NLO QCD, including matching to the parton shower and multi-jet merging. The OpenLoops [16] program is a one-loop generator based on a novel numerical recursion, which is formulated in terms of loop-momentum polynomials called "open loops" and allows for a fast evaluation of scattering amplitudes with many external particles.¹ It uses the Collier library [19] for the numerically stable evaluation of tensor integrals [20,21] and scalar integrals [22]. Real-emission contributions, infrared subtractions based on the Catani-Seymour (CS) technique [23,24], and phase-space integration are handled by Sherpa [15] and Amegic++ [25]. The NLO corrections are matched to the Sherpa parton shower [26] using the Sherpa formulation [27,28] of the MC@NLO method [14].²

¹ A public implementation of OpenLoops will appear in the near future [18].
² In the following, MC@NLO always refers to the algorithm of Refs. [27,28] and its implementation within Sherpa.

The essence of the MC@NLO approach is encoded in formula (1) for the no-emission and first-emission contributions to the expectation value of a generic observable [28]. The terms B(Φ_B) and V(Φ_B) represent Born and virtual matrix-element contributions to the Born phase space Φ_B, while R(Φ_R) denotes real-emission matrix-element contributions to the corresponding phase space Φ_R. As in NLO calculations, infrared singularities are removed from the Φ_R phase space via local subtraction terms D_ijk(Φ_R) and added back to the virtual contributions in the form of the integrated term I(Φ_B) of Eq. (2), where each subtraction term is integrated over a factorised phase space Φ_R|B associated with a Φ_R → Φ_B mapping.

In fixed-order calculations, to achieve an exact cancellation of the subtraction terms, events associated with D_ijk(Φ_R) must be attributed to the Born phase space according to the appropriate Φ_R → Φ_B mapping. In contrast, in the MC@NLO approach D_ijk(Φ_R) contributions are handled as genuine real-emission events, and the resulting mismatch is compensated, to order α_s, by Φ_B → Φ_R migrations that result from parton-shower emissions. The first shower emission is described by Eq. (3), where the second line corresponds to the first-emission probability, and the Sudakov form factor ∆(t_0, μ_Q^2) represents its no-emission counterpart. The parton shower is driven by the evolution variable t. It starts at the resummation scale μ_Q^2 and stops when t reaches the infrared cutoff t_0. The key principle by which the MC@NLO approach preserves NLO accuracy up to the first emission is the correspondence between the splitting kernels of the parton shower and the terms D_ijk that are subtracted from the real emission. In Sherpa this is achieved by using CS dipoles D_ijk both as subtraction terms and as splitting kernels of the parton shower. More precisely, the kernels of the shower are given by the spin-averaged CS dipoles, taken in the large-N_c limit. In addition, to obtain a fully consistent matching, the first shower emission is supplemented by exact spin and colour correlations [27]. The MC@NLO matching can be regarded as an effective subtraction of the first shower emission and, as for the shower, the subtraction terms in (1) and (2) must also be restricted to the kinematic region t < μ_Q^2. Finally, no-emission and first-emission events generated according to (1)-(3) are used as seeds for subsequent shower emissions.
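For orientation, the structure of formulas (1)-(3) can be sketched as follows. This is a reconstruction from the definitions given in the text (B, V, R, D_ijk, the factorised phase space Φ_R|B, the Sudakov form factor ∆, the cutoff t_0 and the resummation scale μ_Q^2), not a verbatim reproduction. The symbol U for the one-emission shower functional, the symbol O for the observable, and the explicit θ-functions that implement the restriction t < μ_Q^2 are notational assumptions.

```latex
\begin{align}
\langle \mathcal{O} \rangle &=
  \int \mathrm{d}\Phi_B \,\bigl[ B(\Phi_B) + V(\Phi_B) + I(\Phi_B) \bigr]\, U(t_0,\mu_Q^2)
  \nonumber\\
 &\quad + \int \mathrm{d}\Phi_R \,\Bigl[ R(\Phi_R)
  - \sum_{ijk} D_{ijk}(\Phi_R)\,\theta(\mu_Q^2 - t) \Bigr]\, \mathcal{O}(\Phi_R),
  \tag{1}\\
I(\Phi_B) &= \sum_{ijk} \int \mathrm{d}\Phi_{R|B}\,
  D_{ijk}(\Phi_R)\,\theta(\mu_Q^2 - t),
  \tag{2}\\
U(t_0,\mu_Q^2) &= \Delta(t_0,\mu_Q^2)\,\mathcal{O}(\Phi_B)
  \nonumber\\
 &\quad + \sum_{ijk} \int_{t_0}^{\mu_Q^2} \mathrm{d}\Phi_{R|B}\,
  \frac{D_{ijk}(\Phi_R)}{B(\Phi_B)}\,\Delta(t,\mu_Q^2)\,\mathcal{O}(\Phi_R).
  \tag{3}
\end{align}
```

In this form the first line of (3) is the no-emission term weighted by the Sudakov form factor, and the second line is the first-emission probability built from the same dipoles D_ijk that are subtracted in (1).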
In the following, we present and compare LO, NLO and MC@NLO simulations of ttbb production at the 8 TeV LHC. The results are based on a Sherpa 2.0 pre-release version. Hadronisation and underlying events are not considered, and top quarks are treated as stable particles with mass m_t = 173.2 GeV. While spin-correlated t → Wb decays can be simulated in a fully automated way, omitting top decays permits us to focus on the behaviour of those b-jets that arise from QCD interactions and that involve many more subtleties from the viewpoint of the theoretical simulation and its uncertainties. Consistently with the use of a finite b-quark mass, m_b = 4.75 GeV, we employ four-flavour parton distributions. Specifically, at NLO (LO) QCD the LHApdf implementation of the MSTW2008NLO (LO) parton distributions [29] [...] consistently included in the virtual corrections via zero-momentum subtraction of the heavy-quark loops in the renormalisation of α_s.

Table 1: Cross sections with standard ttb and ttbb cuts and with an additional cut, m_bb > 100 GeV. Full MC@NLO predictions (σ_MC) are compared to results obtained with parton-shower g → bb splittings switched off (σ^2b_MC). The first and second uncertainties represent ξ_R and ξ_F variations. In the MC@NLO case, the latter is combined with ξ_Q variations in quadrature.
As renormalisation scale we employ the geometric average of the top-quark and b-quark transverse energies, which represents a natural generalisation of the dynamical scale [...]. The default scale corresponds to ξ_R = 1, and ξ_R parametrises scale variations. To NLO accuracy, this choice corresponds to α_s^4(μ_R) ≈ ∏_i α_s(E_T,i) and guarantees that the strong-coupling factors associated with the production of the various final-state objects adapt to the respective transverse energies. The factorisation and resummation scales, which define the available phase space for QCD radiation, are related to the average top-quark transverse energy. The default scale choice corresponds to ξ_F = ξ_Q = 1, and ξ_F parametrises correlated variations of μ_F and μ_Q, while ξ_Q controls additional variations of μ_Q with fixed μ_F.

QCD partons, including b-quarks and excluding only top quarks, are recombined into IR-safe jets using the anti-k_T algorithm [30] with jet-resolution parameter R = 0.4. Events are categorised according to the number N_b of reconstructed b-jets with p_T > 25 GeV and |η_b| < 2.5. In this respect, we classify as a b-jet any jet involving at least one b-quark, which also includes the case of collimated bb pairs resulting from the splitting of energetic gluons. This is, at least experimentally, the most realistic b-jet definition, and its implementation at NLO is possible only in the presence of massive b-quarks. In fact, in calculations with massless b-quarks, collimated bb pairs must be handled as gluon-jets in order to avoid collinear singularities.
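The scale choices described above can be illustrated with a minimal Python sketch. Several ingredients are assumptions made for illustration only: the transverse energy is taken as E_T = sqrt(m^2 + p_T^2), μ_F is taken to be ξ_F times the arithmetic mean of the two top-quark transverse energies with unit prefactor, and μ_Q is taken as ξ_Q μ_F. The function names are hypothetical and do not correspond to any Sherpa interface.

```python
import math


def transverse_energy(mass, pt):
    """Assumed definition: E_T = sqrt(m^2 + p_T^2), all quantities in GeV."""
    return math.sqrt(mass * mass + pt * pt)


def ttbb_scales(pt_t, pt_tbar, pt_b, pt_bbar,
                m_t=173.2, m_b=4.75, xi_R=1.0, xi_F=1.0, xi_Q=1.0):
    """Return (mu_R, mu_F, mu_Q) for one ttbb phase-space point.

    mu_R: xi_R times the geometric average of the four transverse energies.
    mu_F: xi_F times the average top-quark transverse energy (assumed prefactor 1).
    mu_Q: xi_Q times mu_F, so that xi_F moves mu_F and mu_Q together
          while xi_Q moves mu_Q alone.
    """
    et_t = transverse_energy(m_t, pt_t)
    et_tbar = transverse_energy(m_t, pt_tbar)
    et_b = transverse_energy(m_b, pt_b)
    et_bbar = transverse_energy(m_b, pt_bbar)

    mu_R = xi_R * (et_t * et_tbar * et_b * et_bbar) ** 0.25
    mu_F = xi_F * 0.5 * (et_t + et_tbar)
    mu_Q = xi_Q * mu_F
    return mu_R, mu_F, mu_Q
```

With these conventions, doubling ξ_F doubles both μ_F and μ_Q, whereas doubling ξ_Q changes μ_Q only, which mirrors the correlated and additional variations described in the text.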
To investigate NLO and MC@NLO correction effects we considered an exclusive ttbb sample, with events involving N_b ≥ 2 b-jets, and a more inclusive ttb sample with N_b ≥ 1. For the ttbb sample an additional analysis is performed with a cut on the invariant mass of the first and second b-jet, m_bb > 100 GeV, which corresponds to the ttH(bb) signal region. The respective LO, NLO and MC@NLO cross sections are reported in Table 1. In order to isolate contributions arising from b-quarks emitted by the parton shower, we also present MC@NLO predictions generated in the absence of g → bb parton-shower splittings. Scale uncertainties are assessed via independent factor-two variations of ξ_R and ξ_F. Additional scale uncertainties related to the parton shower are included via ξ_Q = 2^(±1/2) variations of the resummation scale and are combined in quadrature with ξ_F variations.
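The event categorisation and the quadrature combination of ξ_F and ξ_Q variations can be sketched as follows. This is an illustrative sketch, not the analysis code used for the results: the jet representation (a dictionary with the hypothetical keys "pt", "eta" and "n_b_quarks") and the category labels are assumptions.

```python
import math


def count_b_jets(jets, pt_min=25.0, eta_max=2.5):
    """Count jets that contain at least one b-quark and pass the acceptance cuts."""
    return sum(
        1
        for jet in jets
        if jet["n_b_quarks"] >= 1 and jet["pt"] > pt_min and abs(jet["eta"]) < eta_max
    )


def categorise(jets, m_bb=None, m_bb_cut=100.0):
    """Return the list of samples an event enters: ttb, ttbb, ttbb with m_bb cut."""
    n_b = count_b_jets(jets)
    labels = []
    if n_b >= 1:
        labels.append("ttb")
    if n_b >= 2:
        labels.append("ttbb")
        if m_bb is not None and m_bb > m_bb_cut:
            labels.append("ttbb_mbb100")
    return labels


def combine_in_quadrature(delta_xi_F, delta_xi_Q):
    """Combine the xi_F and xi_Q cross-section shifts (same units) in quadrature."""
    return math.hypot(delta_xi_F, delta_xi_Q)
```

A jet with two b-quarks inside counts as a single b-jet here, in line with the b-jet definition that includes collimated bb pairs.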
Fixed-order results in Table 1 feature NLO K-factors close to 1.2, with ±0.05 variations depending on the selection cuts. This is consistent with the O(20%) contribution of b-quarks to the running of α_s^4(μ) from m_b to μ_R, and with the fact that the corresponding K-factor in the five-flavour scheme, where b-quark contributions are included in the running of α_s, is very close to one [31]. In this respect, let us note that a fully consistent resummation of ln(μ_R/m_b) terms associated with the running of α_s would increase the ttbb NLO cross section by about 9% as compared to the standard 4F-scheme predictions presented in this letter. This estimate was obtained using a modified set of MSTW four-flavour PDFs with five active flavours in the evolution of α_s.
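The size of the b-quark contribution to the running of α_s^4 between m_b and μ_R can be gauged with a one-loop estimate. The sketch below is a rough illustration under stated assumptions: one-loop running only, matching of the four- and five-flavour couplings at μ = m_b, and an illustrative input value α_s(m_b) = 0.21 that is not taken from the text. It does not reproduce the PDF-based estimate quoted above.

```python
import math


def beta0(n_f):
    """One-loop QCD beta-function coefficient for n_f active flavours."""
    return (33.0 - 2.0 * n_f) / (12.0 * math.pi)


def alpha_s_one_loop(mu, mu0, alpha0, n_f):
    """Evolve alpha_s from mu0 to mu at one loop with n_f active flavours."""
    log_term = math.log(mu * mu / (mu0 * mu0))
    return alpha0 / (1.0 + alpha0 * beta0(n_f) * log_term)


def b_quark_running_effect(mu_R, m_b=4.75, alpha_s_mb=0.21):
    """Ratio [alpha_s^(5)(mu_R) / alpha_s^(4)(mu_R)]^4 with matching at m_b.

    alpha_s_mb is an illustrative input (assumption), not a fitted value.
    """
    a4 = alpha_s_one_loop(mu_R, m_b, alpha_s_mb, n_f=4)
    a5 = alpha_s_one_loop(mu_R, m_b, alpha_s_mb, n_f=5)
    return (a5 / a4) ** 4
```

The ratio equals one at μ_R = m_b and grows logarithmically with μ_R, since the five-flavour coupling decreases more slowly than the four-flavour one.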
Scale uncertainties in Table 1 are dominated by renormalisation-scale variations and decrease from about 60-70% at LO to 20-30% at NLO. Scale variations at NLO and MC@NLO level are rather similar. In the presence of standard ttb and ttbb cuts, matching to the parton shower shifts the NLO cross section by only 1% and 6%, respectively. However, the MC@NLO correction to ttbb final states is quite sensitive to the invariant mass of the bb pair and turns out to be enhanced by a factor of four in the region m_bb > 100 GeV, which is relevant for Higgs-boson searches. This MC@NLO effect, which clearly exceeds the magnitude of the Higgs signal in the present ttH(bb) analyses [3,4], tends to disappear if g → bb splittings are switched off in the parton shower. As discussed below, various features indicate that this effect is dominated by the double-splitting mechanism depicted in Fig. 1.b.
The differential distributions in Figs. 2 and 3 provide examples of nontrivial matching corrections. Standard ttbb cuts are applied, and the MC@NLO bands display the combination in quadrature of μ_R, μ_F and μ_Q scale variations. The corresponding uncertainties are typically around 30% and tend to increase in the tails, also due to statistical fluctuations. The transverse momentum of the first non-b jet (Fig. 2.a) shows the typical MC@NLO behaviour. At transverse momenta above the resummation scale, where the parton shower stops emitting, MC@NLO and NLO predictions agree well. The fixed-order infrared singularity at small p_T is consistently damped by the Sudakov form factor, and Sudakov effects start to be important already at p_T ∼ 50 GeV. This reflects the presence of intense QCD radiation resulting from the gluon-gluon initial state and from the high centre-of-mass energy of the ttbb system. In the intermediate p_T region we observe an MC@NLO correction of about +30% with respect to NLO. This can be attributed to g → bb parton-shower splittings and to the enhancement of the first shower emission that results from the (B + V + I) term in (1). The precise position and magnitude of the MC@NLO/NLO maximum depend on the choice of the renormalisation and resummation scales, and scale variations allow one to assess the related higher-order uncertainties inherent in the matching procedure. Figure 2.b confirms that matching corrections are quite sensitive to the invariant mass of the first two b-jets. The MC@NLO/NLO ratio grows with m_bb and reaches 25-30% in the Higgs-signal region, m_bb ∼ 125 GeV. This enhancement at high invariant mass can be attributed to tt + 2 b-jets production via double g → bb splittings, since this mechanism is kinematically favoured by the fact that the probability that two hard gluons split into collinear bb pairs does not decrease when the invariant mass of the gluon pair grows.
This interpretation is confirmed by the fact that the shape of the MC@NLO m_bb distribution becomes almost identical to the NLO one if g → bb splittings are switched off in the parton shower. Further evidence for the above picture is provided by the fact that the MC@NLO excess increases with the di-jet invariant mass at a similar rate as the ratio of the ttgg to ttbb cross sections. For instance, using LO matrix elements, we checked that both quantities increase by a factor of two in the range between 100 and 250 GeV.
The plots in Fig. 3, where an additional cut m_bb > 100 GeV is applied, reveal distinctive kinematic features of the MC@NLO enhancement in the Higgs-signal region. The unambiguous MC@NLO/NLO peaks that appear in the distributions, both in the transverse momentum of the first b-jet (Fig. 3.a) and in the ∆R separation of the first two b-jets (Fig. 3.b), show that the MC@NLO enhancement is dominated by back-to-back b-jets with the smallest possible p_T that is needed to reach m_bb = 100 GeV. This is consistent with the expected behaviour of double g → bb splitting contributions in Fig. 1.b, where emissions at small p_T are doubly enhanced by soft and collinear singularities associated with the parent gluons. This interpretation is also fully confirmed by the fact that MC@NLO-induced shape distortions in Fig. 3 disappear almost completely when g → bb shower splittings are switched off.
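The relation between the m_bb threshold, the b-jet transverse momenta and their ∆R separation can be made concrete with a short kinematic sketch. It treats both jets as massless, which is an approximation introduced here for illustration, and the function names are hypothetical.

```python
import math


def delta_phi(phi1, phi2):
    """Azimuthal difference wrapped into [-pi, pi)."""
    return (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi


def delta_r(eta1, phi1, eta2, phi2):
    """Separation in the (eta, phi) plane."""
    return math.hypot(eta1 - eta2, delta_phi(phi1, phi2))


def dijet_mass_massless(pt1, eta1, phi1, pt2, eta2, phi2):
    """Invariant mass of two massless jets:
    m^2 = 2 pT1 pT2 [cosh(eta1 - eta2) - cos(phi1 - phi2)].
    """
    m2 = 2.0 * pt1 * pt2 * (math.cosh(eta1 - eta2) - math.cos(phi1 - phi2))
    return math.sqrt(max(m2, 0.0))
```

In this approximation two central jets that are back-to-back in azimuth (∆R = π at equal pseudorapidity) with p_T = 50 GeV each give m_bb = 2 p_T = 100 GeV. A smaller opening angle at fixed m_bb requires larger transverse momenta, so the minimum-p_T configuration that passes the cut is the back-to-back one.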
To exclude the possibility that double splittings in our simulation are artificially enhanced by too high a choice of the resummation scale, we checked that the characteristic "double-splitting" enhancement in the m_bb distribution of Fig. 2 is also present in simulations based on merged LO matrix elements for tt plus multi-jet production. In this framework, ttbb events are not showered with a global resummation scale, but starting from a scale that is determined according to the most likely shower history of the event at hand. Comparing the shape of the MC@NLO distribution of Fig. 2 against MEPS@LO simulations [32] of tt + ≤3j with massive b-quarks, we found good agreement for merging scales around 15 GeV, i.e. for the case where most of the phase space associated with (the first) g → bb splittings is described in terms of matrix elements, as in the present MC@NLO simulation. A thorough understanding of the uncertainties related to the choice of the merging scale and the interplay between matrix elements and parton shower in the vicinity of the kinematic threshold for g → bb splittings requires further detailed studies that are beyond the scope of this letter.
In summary, we presented the first complete MC@NLO simulation of ttbb production at the LHC, including b-quark mass effects. This allows one to cover the full ttbb phase space at NLO accuracy and to describe contributions stemming from double collinear g → bb splittings, which can lead to a significant contamination of the ttH(bb) signal. This unexpected finding changes the standard perturbative picture of ttbb production based on hard b-quark jets. The presented simulation will allow for a thorough analysis of the related uncertainties. In this respect it will be important to assess the role of the parton-shower tune and to devise efficient strategies for the rejection of double-splitting contributions. Aspects not discussed here, such as top-quark decays, hadronisation and underlying events, can be simulated in a fully automated way using Sherpa. To gain more insight into theoretical uncertainties associated with the parton shower and the b-quark mass, it will be very instructive to compare the four-flavour scheme adopted in this letter to the five-flavour scheme. Both schemes provide reliable NLO predictions for observables involving resolved b-jets at the LHC [33]. In the five-flavour scheme, where b-quarks are massless, ttbb matrix elements cannot be used to fill the entire b-quark phase space, and the collinear regions need to be described by lower-multiplicity hard matrix elements (ttg, ttb, tt, etc.) supplemented by parton-shower emissions. Technically this requires the merging of NLO matrix elements for tt + 0, 1, 2 jets, which was presented for the first time in [34]. A consistent combination of this recent simulation and the massive ttbb predictions presented in this letter would provide an optimal description of tt plus multi-jet production.