Measurement of the cross section ratio sigma(t t-bar b b-bar) / sigma(t t-bar jj) in pp collisions at sqrt(s) = 8 TeV

The first measurement of the cross section ratio sigma(t t-bar b b-bar) / sigma(t t-bar jj) is presented using a data sample corresponding to an integrated luminosity of 19.6 inverse femtobarns collected in pp collisions at sqrt(s) = 8 TeV with the CMS detector at the LHC. Events with two leptons (e or mu) and four reconstructed jets, including two identified as b quark jets, in the final state are selected. The ratio is determined for a minimum jet transverse momentum pt of both 20 and 40 GeV. The measured ratio is 0.022 +/- 0.003 (stat) +/- 0.005 (syst) for pt>20 GeV. The absolute cross sections sigma(t t-bar b b-bar) and sigma(t t-bar jj) are also measured. The measured ratio for pt>40 GeV is compatible with a theoretical quantum chromodynamics calculation at next-to-leading order.


Introduction
With the observation of a new boson at a mass around 125 GeV/c 2 [1-3] whose properties are consistent with those of the standard model (SM) Higgs boson H [4][5][6][7][8][9], the SM appears to be complete. One of the most sensitive channels in the discovery of the Higgs boson, H → γγ, is expected to have top quark loops both in the production and decay of the Higgs boson in the SM. Hence, it is important to determine the couplings of the new boson to fermions, especially to the top quark. In the SM, one of the most promising channels for a direct measurement of the top quark Yukawa coupling is the production of the Higgs boson in association with a tt pair (ttH), where the Higgs boson decays to bb, thus leading to a ttbb final state.
The expected quantum chromodynamics (QCD) cross section for ttH production in pp collisions at √s = 8 TeV, calculated to next-to-leading order (NLO), is 0.128 +0.005/−0.012 (scale) ± 0.010 (PDF+α S ) pb [10], where the uncertainty labelled "scale" refers to the uncertainty from the factorization and renormalization scales, and the uncertainty labelled "PDF+α S " comes from the uncertainties in the parton distribution functions (PDFs) and the strong coupling constant α S . This final state, which has not yet been observed, has an irreducible nonresonant background from the production of a top quark pair in association with a b quark pair. Calculations of the inclusive production cross section for tt events with additional jets have been performed to NLO precision [11][12][13][14][15][16]. For a proton-proton centre-of-mass energy of 8 TeV, the predictions for the production of a top quark pair with two additional jets ttjj and with two additional b quark jets ttbb are σ ttjj = 21.0 ± 2.9 (scale) pb and σ ttbb = 0.23 ± 0.05 (scale) pb, respectively [16]. In this calculation, the additional jets are required to have transverse momenta p T > 40 GeV/c and absolute pseudorapidity |η| < 2.5, while for the ttH production value quoted above, no such requirements are applied to the decay products of the Higgs boson. The dominant uncertainties in these calculations are from the factorization and renormalization scales [17,18] caused by the presence of two very different scales in this process, the top quark mass and the jet p T . Therefore, experimental measurements of σ ttjj and σ ttbb can provide a good test of NLO QCD theory and important input about the main background in the search for the ttH process.
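As a quick cross-check of the numbers quoted above, the ratio of the two NLO predictions can be formed directly. The sketch below assumes, purely for illustration, that the two scale uncertainties are uncorrelated and can be added in quadrature as relative errors; the calculation in [16] may treat the correlation differently. The function name is hypothetical.

```python
import math

# NLO predictions quoted in the text (pb), scale uncertainties only.
sigma_ttbb, err_ttbb = 0.23, 0.05
sigma_ttjj, err_ttjj = 21.0, 2.9

def ratio_with_uncertainty(num, num_err, den, den_err):
    """Return num/den and its error, ASSUMING uncorrelated relative errors."""
    ratio = num / den
    rel_err = math.hypot(num_err / num, den_err / den)
    return ratio, ratio * rel_err

ratio, ratio_err = ratio_with_uncertainty(sigma_ttbb, err_ttbb,
                                          sigma_ttjj, err_ttjj)
print(f"sigma_ttbb / sigma_ttjj = {ratio:.3f} +/- {ratio_err:.3f}")
```

Under this assumption the predicted ratio is of the order of 1%, which is the scale the measurement has to resolve.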
In this Letter, the first measurements of the cross sections σ ttbb and σ ttjj and their ratio are presented. The analyzed data sample of pp collisions at a centre-of-mass energy of 8 TeV was collected with the CMS experiment at the CERN LHC and corresponds to an integrated luminosity of 19.6 ± 0.5 fb −1 [19]. The primary motivation for measuring the cross section ratio is that many kinematic distributions are expected to be similar for ttbb and ttjj, leading to reduced systematic uncertainties in the ratio.
jets") are identified by the presence of corresponding hadrons containing a b or c quark among the ancestors of the jet constituents. In the case where two jets contain the decay products of the same b hadron, the jet with the higher p T is selected as the b jet. When a b hadron is successfully matched, the c quarks are not considered.
The ttjj sample is composed of four components, distinguished by the flavour of the two jets in addition to the two b jets required from the top quark decays. The four components are the ttbb final state with two b jets, the ttbj final state with one b jet and one lighter flavour jet, the ttcc final state with two c jets, and the ttLF final state with two light-flavour jets (from a gluon or u, d, or s quark) or one light-flavour jet and one c jet. The ttbj final state is mainly from the merging of two b jets or the loss of one of the b jets caused by the acceptance requirements. Efficiency corrections to the measurement for the visible phase space are mainly from detector effects. The results for the visible phase space are compared with those from MC simulations.
The goal of the full phase-space result is to provide a comparison to theoretical calculations, which are generally performed at the parton level. To obtain a full phase-space MC sample, the jet reconstruction is performed on the partons (gluons, as well as quarks lighter than top) before hadronization, as well as τ leptons that decay hadronically. As the full hadronization and decay chain is known, only τ leptons that decay hadronically, and partons that lead to hadrons, are included. The jet reconstruction algorithm is the same as for the visible phase space. Following the jet reconstruction, b jets are identified with a ∆R = √(∆φ² + ∆η²) < 0.5 requirement between b quarks and parton-level jets, where ∆φ and ∆η are the azimuthal angle and pseudorapidity differences, respectively, between the directions of the b quark and the parton-level jet. For comparison with theoretical predictions [16], results are quoted for two different jet p T thresholds of p T > 20 and > 40 GeV/c on the jets not arising from top quark decays.
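The ∆R matching between b quarks and parton-level jets described above can be sketched as follows. This is a minimal illustration, not the analysis code: jets and quarks are represented as hypothetical (η, φ) tuples, and the function names are assumptions.

```python
import math

def delta_phi(phi1, phi2):
    """Azimuthal difference wrapped into (-pi, pi]."""
    dphi = (phi1 - phi2) % (2.0 * math.pi)
    if dphi > math.pi:
        dphi -= 2.0 * math.pi
    return dphi

def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance: sqrt(delta_phi^2 + delta_eta^2)."""
    return math.hypot(eta1 - eta2, delta_phi(phi1, phi2))

def match_b_jets(jets, b_quarks, max_dr=0.5):
    """Return indices of jets lying within max_dr of at least one b quark."""
    matched = []
    for i, (jet_eta, jet_phi) in enumerate(jets):
        if any(delta_r(jet_eta, jet_phi, q_eta, q_phi) < max_dr
               for q_eta, q_phi in b_quarks):
            matched.append(i)
    return matched

# Hypothetical (eta, phi) values, chosen to exercise the phi wrap-around.
jets = [(0.1, 3.1), (1.5, 0.0), (-2.0, 1.0)]
b_quarks = [(0.0, -3.1), (-2.0, 1.6)]
print(match_b_jets(jets, b_quarks))
```

The wrap-around in delta_phi matters for the first jet, which sits close to the first quark across the φ = ±π boundary.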

Event selection and background estimation
The events are recorded using dilepton triggers with asymmetric thresholds of 8 and 17 GeV/c on the transverse momentum of the leptons. Jets are reconstructed using the same algorithm as in the simulations. The leptons and all charged hadrons that are associated with jets are required to originate from the primary vertex, defined as the vertex with the highest ∑ p T² of its associated tracks. Muon candidates are reconstructed by combining information from the silicon tracker and the muon system [34]. Muon candidates are further required to have a minimum number of hits in the silicon tracker and to have a high-quality global fit including a minimum number of hits in the muon detector. Electron candidates are reconstructed by combining a track with energy deposits in the ECAL, taking into account bremsstrahlung photons. Requirements on electron identification variables based on shower shape and track-cluster matching are applied to the reconstructed candidates [35,36]. Muons and electrons must have p T > 20 GeV/c and |η| < 2.4.
To reduce the background contributions of muons or electrons from semileptonic heavy-flavour decays, relative isolation criteria are applied. The relative isolation parameter, I rel , is defined as the ratio of the sum of the transverse momenta of all objects in a cone of ∆R < 0.3 around the lepton direction to the lepton p T . The objects considered are the charged hadrons associated with the primary vertex as well as the neutral hadrons and photons, whose energies are corrected for the energy from pileup. Leptons are required to have I rel < 0.15. The efficiencies for the above lepton identification requirements are measured using Z boson candidates in data and are found to be consistent with the values from the simulation. The residual differences are applied as a correction to the simulation.
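The isolation requirement can be illustrated with a short sketch. The text states only that the neutral energies are corrected for pileup; the specific form used here (subtracting a pileup estimate from the neutral sum and clamping at zero) is an assumption made for illustration, as are the function and argument names.

```python
def relative_isolation(lepton_pt, charged_pt_sum, neutral_pt_sum,
                       photon_pt_sum, pileup_estimate=0.0):
    """I_rel: summed pT in the cone around the lepton divided by lepton pT.

    ASSUMPTION: the pileup correction is a simple subtraction from the
    neutral (hadron + photon) sum, floored at zero.
    """
    neutral = max(0.0, neutral_pt_sum + photon_pt_sum - pileup_estimate)
    return (charged_pt_sum + neutral) / lepton_pt

def is_isolated(i_rel, threshold=0.15):
    """Apply the I_rel < 0.15 requirement quoted in the text."""
    return i_rel < threshold

# Hypothetical inputs in GeV/c: all sums are taken inside the dR < 0.3 cone.
i_rel_prompt = relative_isolation(40.0, 2.0, 1.5, 1.0, pileup_estimate=0.5)
i_rel_in_jet = relative_isolation(25.0, 3.0, 2.0, 1.0)
print(i_rel_prompt, is_isolated(i_rel_prompt))
print(i_rel_in_jet, is_isolated(i_rel_in_jet))
```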
The event selection requires the presence of two isolated opposite-sign leptons of invariant mass M > 12 GeV/c 2 . Lepton pairs of the same flavour (e + e − , µ + µ − ) are rejected if their invariant mass is within 15 GeV/c 2 of the Z boson mass. The missing transverse energy (E miss T ) is defined as the magnitude of the vectorial sum of the transverse momenta of all reconstructed particles in the event [37]. In the same-flavour channels, remaining backgrounds from Z/γ * +jets processes are suppressed by demanding E miss T > 30 GeV. For the e ± µ ∓ channel, no E miss T requirement is applied.
Four or more reconstructed jets are required with |η| < 2.5 and p T > 30 GeV/c, of which at least two jets must be identified as b jets, using a combined secondary vertex (CSV) algorithm, which combines secondary vertex information with lifetime information of single tracks to produce a b-tagging discriminator [38]. A tight b-tagging requirement on this discriminator is applied, which has an efficiency of about 45% for b jets and a misidentification probability of 0.1% for light-flavour jets.
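The dilepton and jet requirements above can be summarized in a compact sketch. The Event container and its fields are hypothetical; lepton p T , |η| and isolation requirements are assumed to have been applied upstream, and the Z boson mass is taken as 91 GeV/c², the centre of the 76 to 106 GeV/c² window used for the Z/γ*+jets estimate.

```python
from dataclasses import dataclass

Z_MASS = 91.0  # GeV/c^2, centre of the 76-106 window (assumption, see lead-in)

@dataclass
class Event:
    channel: str          # "ee", "mumu" or "emu"
    opposite_sign: bool   # the two isolated leptons have opposite charge
    dilepton_mass: float  # GeV/c^2
    met: float            # missing transverse energy, GeV
    jets: list            # list of (pt, eta, is_b_tagged) tuples

def passes_selection(ev):
    """Return True if the event passes the dilepton + jets selection."""
    if not ev.opposite_sign or ev.dilepton_mass <= 12.0:
        return False
    if ev.channel in ("ee", "mumu"):
        if abs(ev.dilepton_mass - Z_MASS) < 15.0:  # Z veto, same flavour only
            return False
        if ev.met <= 30.0:                         # MET cut, same flavour only
            return False
    good_jets = [j for j in ev.jets if j[0] > 30.0 and abs(j[1]) < 2.5]
    n_b_tagged = sum(1 for j in good_jets if j[2])
    return len(good_jets) >= 4 and n_b_tagged >= 2

four_jets = [(50.0, 0.1, True), (45.0, -1.0, True),
             (35.0, 2.0, False), (31.0, 0.5, False)]
print(passes_selection(Event("emu", True, 80.0, 10.0, four_jets)))
print(passes_selection(Event("ee", True, 90.0, 50.0, four_jets)))
```

Note that the eµ event passes despite its low missing transverse energy, since no such requirement is applied in that channel.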
Differences in the b-tagging efficiencies between data and simulation [38] are accounted for by reweighting the shape of the CSV b-tagging discriminator distribution in the simulation to match that in the data. Data/MC scale factors for this p T - and η-dependent correction are derived separately for light- and heavy-flavour jets. The scale factor for c jets is not measured, owing to the limited amount of data, and is set to unity. Light-flavour scale factors are determined from a control sample enriched in events with a Z boson and exactly two jets. Heavy-flavour scale factors are derived from a tt-enriched sample with exactly two jets, excluding Z → ℓℓ events.
The background contribution arising from Z/γ * +jets events is estimated in data using the number of events having a dilepton invariant mass of 76 < M < 106 GeV/c 2 , scaled by the ratio of the numbers of events that fail and pass this selection in the Drell-Yan simulation [39,40]. The multijet and diboson background contributions are negligible after the full event selection.
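The scaling described above amounts to a one-line estimate, sketched below. Only the step stated in the text is shown; any subtraction of non-Z/γ* events inside the mass window, as may be done in the full method of [39,40], is omitted. The function name and the event counts are hypothetical.

```python
def estimate_dy_outside_window(n_data_in, n_mc_out, n_mc_in):
    """Estimate the Z/gamma*+jets yield outside the Z mass window.

    n_data_in : data events with 76 < M < 106 GeV/c^2
    n_mc_out  : Drell-Yan simulation events failing the window selection
    n_mc_in   : Drell-Yan simulation events passing the window selection
    """
    if n_mc_in <= 0:
        raise ValueError("need simulated events inside the window")
    r_out_in = n_mc_out / n_mc_in  # fail/pass ratio taken from simulation
    return n_data_in * r_out_in

# Hypothetical counts, for illustration only.
n_dy = estimate_dy_outside_window(n_data_in=200, n_mc_out=30.0, n_mc_in=300.0)
print(n_dy)
```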

Measurement
After the full event selection, the three dilepton categories ee, µµ, and eµ are combined, and the ratio of the number of ttbb events to ttjj events is obtained from the data by fitting the CSV b-tagging discriminator distributions. The distributions of the discriminator from simulation for the third and fourth jets in decreasing order of the b-tagging discriminator, i.e. for the two additional jets not identified as coming from the top quark decays, are shown in Fig. 1. The third and fourth jets from ttjj events tend to be light-flavour jets, while these are heavy-flavour jets for ttbb events. These two distributions are used to separate ttbb from other processes. Figure 2 shows the b-tagging discriminator distributions of the third and fourth jets in the events from data and simulation, where the simulation histograms have been scaled to the fit result. The fit is performed to both distributions simultaneously, and contains two free parameters, an overall normalization and the ratio of the number of ttbb events to ttjj events. The ttcc and ttLF contributions are combined, and the ratio of the ttbb to ttbj contributions is constrained using the predictions from the MC simulation. Additionally, the background contributions from single top production and from tt events that fail the visible phase-space requirements (labelled "tt other") are scaled by the normalization parameter. The contribution from Z/γ * +jets is fixed from data, as described above. Nuisance parameters are used to account for the uncertainties in the background contributions.
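A heavily simplified sketch of such a two-parameter template fit is given below. It is not the fit used in the analysis: it assumes only two templates (a ttbb-like shape and everything else), ignores the ttbj constraint, the backgrounds and the nuisance parameters, and replaces a proper minimizer with a grid scan over the ttbb fraction. The normalization is profiled analytically at each scan point. All names and template values are hypothetical; bins from the third-jet and fourth-jet distributions could simply be concatenated to mimic the simultaneous fit.

```python
import math

def poisson_nll(data, expected):
    """Poisson negative log-likelihood, dropping data-only constants."""
    return sum(mu - n * math.log(mu) for n, mu in zip(data, expected))

def fit_ratio(data, tmpl_bb, tmpl_other, steps=1000):
    """Scan the ttbb fraction r; return the best r and its normalization."""
    n_obs = sum(data)
    best_r, best_norm, best_nll = None, None, float("inf")
    for k in range(1, steps):
        r = k / steps
        shape = [r * b + (1.0 - r) * o for b, o in zip(tmpl_bb, tmpl_other)]
        norm = n_obs / sum(shape)  # best-fit normalization for this r
        nll = poisson_nll(data, [norm * s for s in shape])
        if nll < best_nll:
            best_r, best_norm, best_nll = r, norm, nll
    return best_r, best_norm

# Hypothetical discriminator templates (low, medium, high b-tag score).
tmpl_bb = [0.1, 0.2, 0.7]
tmpl_other = [0.7, 0.2, 0.1]
# Pseudo-data built without fluctuations from r = 0.05 and 10000 events.
data = [6700.0, 2000.0, 1300.0]
r_hat, norm_hat = fit_ratio(data, tmpl_bb, tmpl_other)
print(r_hat, norm_hat)
```

Because the pseudo-data contain no statistical fluctuations, the scan recovers the input fraction; with real data the minimum would be broadened by the Poisson uncertainties.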
The b-tagged jet multiplicity distribution in Fig. 3 shows the comparison between data and the MC simulation, scaled by the fit results to the data. The results, which include the requirement of four jets but not the b-tagging requirement, indicate that the fit is a good match to the data. Table 1 gives the predicted number of events for each physics process and for each dilepton category after fitting to the data, as well as a comparison of the total number of events expected from the simulation and observed in data. Since the full event selection requires at least two b-tagged jets, which is usually satisfied by tt events, only 3% of the events are from non-tt processes. The expected contribution from the ttH process is 12 events. This contribution is not subtracted from the data.
Table 1: The number of events for each physics process and for each dilepton category after fitting to the data, their total, and the observed total number of events. The results are after the final event selection. The Z/γ * → ℓℓ uncertainty is from data, while all other uncertainties include only the statistical uncertainties in the MC samples.
The ratio of the number of ttbb to ttjj events at the reconstruction level obtained from the fit is corrected for the ratio of efficiencies. The event selection efficiencies, defined as the number of ttbb and ttjj events after the full event selection divided by the number of events in the corresponding visible phase space, are 18.7% and 7.2%, respectively. The ttbb and ttjj cross sections in the visible phase space are measured using σ visible = N/(ε L), where L is the integrated luminosity, N is the number of observed events, and ε is the efficiency for each process. However, the NLO theoretical calculation is based on parton-level jets clustered from partons before hadronization in the full phase space.
For the purpose of comparing with the theoretical prediction, the cross sections in the full phase space are extrapolated from the cross sections in the visible phase space using σ full = σ visible /A, where A is the acceptance. The acceptances for extending ttbb and ttjj to the full phase space based on the MADGRAPH simulation are 2.6% and 2.4%, respectively, including the tt to dilepton branching fraction, calculated using the leptonic branching fraction of the W boson [41]. The acceptance is defined as the number of events in the corresponding visible phase space divided by the number of events in the full phase space.
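The chain from fitted event counts to cross sections can be written out explicitly. In the sketch below the efficiencies, acceptances and luminosity are the values quoted in the text, with ε denoting the selection efficiency; the function names are hypothetical, and the event count in the usage example is illustrative rather than a result of the analysis.

```python
LUMI = 19.6      # integrated luminosity, fb^-1
EFF_BB = 0.187   # ttbb selection efficiency
EFF_JJ = 0.072   # ttjj selection efficiency
ACC_BB = 0.026   # ttbb acceptance (includes dilepton branching fraction)
ACC_JJ = 0.024   # ttjj acceptance (includes dilepton branching fraction)

def visible_cross_section(n_events, efficiency, luminosity):
    """sigma_visible = N / (epsilon * L)."""
    return n_events / (efficiency * luminosity)

def full_cross_section(sigma_visible, acceptance):
    """sigma_full = sigma_visible / A."""
    return sigma_visible / acceptance

def visible_ratio(reco_ratio, eff_bb, eff_jj):
    """(N_bb / eff_bb) / (N_jj / eff_jj) = reco_ratio * eff_jj / eff_bb."""
    return reco_ratio * eff_jj / eff_bb

def full_ratio(vis_ratio, acc_bb, acc_jj):
    """Extrapolate the visible-phase-space ratio using the acceptances."""
    return vis_ratio * acc_jj / acc_bb

# Illustrative count only: 100 fitted ttbb events.
sigma_vis = visible_cross_section(100, EFF_BB, LUMI)  # in fb
sigma_full = full_cross_section(sigma_vis, ACC_BB)    # in fb
print(sigma_vis, sigma_full)
```

Since the luminosity is common to both processes it drops out of the ratio, which is why only the efficiency and acceptance ratios appear in visible_ratio and full_ratio.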

Estimation of systematic uncertainties
The systematic uncertainties are determined separately for the ttbb and ttjj cross sections and their ratio. In the ratio, many systematic effects cancel, specifically normalization uncertainties such as the ones related to the measurement of the integrated luminosity and the lepton identification including trigger efficiencies, since they are common to both processes. The various systematic uncertainties in the measured values are shown in Table 2 for the visible phase space and a jet p T threshold of 20 GeV/c, including the luminosity uncertainty [19] and lepton identification [42], which only affect the absolute cross section measurements. The systematic uncertainty in the lepton identification is assessed using the scale factor obtained from Z boson candidates and also taking into account the different phase space between Z boson and tt events. The uncertainty arising from constraining the ratio of the ttbj to ttbb contributions in the fit to match the MC prediction is evaluated by comparing the result with and without the constraint. The number of pileup interactions in data is estimated from the measured bunch-to-bunch instantaneous luminosity and the total inelastic cross section. The systematic uncertainty in the number of pileup events is estimated by varying this cross section by 5%. The contributions from Drell-Yan and single top quark processes are small, and the shapes of the distributions from these backgrounds are similar to those of the ttLF component. Therefore, these backgrounds do not affect the measurement significantly. For the efficiency of ttjj events, the uncertainty owing to the heavy-flavour fraction is estimated by varying the contribution by 50%. An uncertainty to account for the variation of the ttcc fraction in the fit is also assigned by varying the contribution by 50%.  
The dependence of the correction factor for the particle level on the assumptions made in the MC simulation is another source of systematic uncertainty: the generators MADGRAPH and POWHEG are compared and the difference in the efficiency ratio is taken as the systematic uncertainty. The uncertainties from the factorization/renormalization scales and the matching scale that separates jets from matrix elements (ME) and from parton showers in MADGRAPH are estimated by varying the scales by a factor of two up and down with respect to their reference values. The uncertainties in the PDFs are accounted for by following the PDF4LHC prescription [45].
The total systematic uncertainty in the cross section ratio is 22%, with the dominant contributions from the b-tagging efficiency and the misidentification of light-flavoured partons, followed by the renormalization/factorization and matching scale systematic uncertainties.
The uncertainty in σ ttjj is significantly smaller than that in σ ttbb since the measurement of the latter requires the identification of multiple b jets. The uncertainty in σ ttbb is larger than that for the cross section ratio since uncertainties that are common between ttbb and ttjj, such as the jet energy scale uncertainty, partially or completely cancel in the ratio.
The systematic uncertainties in the measurements with a p T threshold of 40 GeV/c are found to be very similar to those with a 20 GeV/c threshold. The uncertainty from the Q 2 scale for the higher-p T threshold of 40 GeV/c cannot be accurately determined due to the statistical uncertainties in the MC sample. Thus, the p T > 40 GeV/c threshold measurements use the same Q 2 systematic uncertainties as those found for the p T > 20 GeV/c threshold results.
In extrapolating the measurements from the visible phase space to the full phase space, the systematic uncertainty in the acceptance is included. The effect of the MC modelling of the acceptance is estimated by comparing the results between MADGRAPH and POWHEG. This uncertainty equals 5% for each of the cross section measurements and 2% for the cross section ratio.
Table 3: The measured cross sections σ ttbb and σ ttjj and their ratio are given for the visible phase space (PS) defined as two leptons with p T > 20 GeV/c and |η| < 2.4 plus four jets, including two b jets with p T > 20 GeV/c and |η| < 2.5, and the full phase space, corrected for acceptance and branching fractions. The full phase-space results are given for jet thresholds of p T > 20 and 40 GeV/c. The uncertainties shown are statistical and systematic, respectively. The predictions of an NLO theoretical calculation for the full phase space and p T > 40 GeV/c are also given [16].

Results
After correcting for the efficiency ratio and taking into account the systematic uncertainties, the cross section ratio σ ttbb /σ ttjj is measured in the visible phase space from a fit to the measured CSV b-tagging discriminator distributions shown in Fig. 2. The measured cross section ratio in the visible phase space for events with particle-level jets and a minimum jet p T of 20 GeV/c is σ ttbb /σ ttjj = 0.022 ± 0.003 (stat) ± 0.005 (syst). This result is for the visible phase space, defined as events having two leptons with p T > 20 GeV/c and |η| < 2.4, plus four jets, including two b jets with p T > 20 GeV/c and |η| < 2.5. The predicted value from both MADGRAPH and POWHEG is found to be 0.016 ± 0.002, where the MC uncertainty is the sum in quadrature of the statistical uncertainty and the systematic uncertainties from the factorization/renormalization and the matching scales. The measured cross sections are presented in Table 3. When the ttH contribution is subtracted from the data, the ratio is reduced by only 4%, much less than the overall uncertainty. Therefore, compared to the uncertainties, the contribution from ttH can be considered negligible. The measured full phase-space ratio with a minimum p T of 20 GeV/c for parton-level jets is consistent within the uncertainties with the result in the visible phase space.
An NLO theoretical QCD calculation is available for parton-level jets with a p T > 40 GeV/c threshold [16]. The NLO cross section values for σ ttbb , σ ttjj , and the ratio σ ttbb /σ ttjj are given in Table 3. To compare with this theoretical prediction, the analysis is repeated for a jet threshold of p T > 40 GeV/c. The measured cross section ratio in the full phase space with the p T > 40 GeV/c threshold is σ ttbb /σ ttjj = 0.022 ± 0.004 (stat) ± 0.005 (syst).
The cross sections in the full phase space for this p T threshold are summarized in Table 3. The measured cross section ratio is higher, but compatible within 1.6 standard deviations with the prediction from the NLO calculation of 0.011 ± 0.003.
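The quoted level of compatibility can be reproduced from the numbers above. The sketch assumes that the statistical, systematic and theoretical uncertainties are independent and combined in quadrature; the text does not spell out the combination, so this is an assumption, and the function name is hypothetical.

```python
import math

def n_sigma(measured, stat, syst, predicted, pred_err):
    """Difference in units of the total uncertainty (quadrature sum)."""
    total = math.sqrt(stat**2 + syst**2 + pred_err**2)
    return abs(measured - predicted) / total

# Full phase space, jet pT > 40 GeV/c: measurement and NLO prediction.
pull = n_sigma(measured=0.022, stat=0.004, syst=0.005,
               predicted=0.011, pred_err=0.003)
print(f"{pull:.1f} standard deviations")
```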

Summary
A measurement of the cross section ratio σ ttbb /σ ttjj has been presented by the CMS experiment, using a data sample of pp collisions at a centre-of-mass energy of 8 TeV, corresponding to an integrated luminosity of 19.6 fb −1 . The individual cross sections σ ttjj and σ ttbb have also been determined. The cross section ratio was measured in a visible phase-space region using the dilepton decay mode of tt events and corrected to the particle level, corresponding to the detector acceptance. The measured cross section ratio in the visible phase space is σ ttbb /σ ttjj = 0.022 ± 0.003 (stat) ± 0.005 (syst) with a minimum p T for the particle-level jets of 20 GeV/c. The cross section ratio has also been measured in the full phase space with minimum parton-jet p T thresholds of p T > 20 and > 40 GeV/c in order to compare with an NLO QCD calculation of the cross section ratio. The measurement is compatible within 1.6 standard deviations with the theoretical prediction. These are the first measurements of the cross sections σ ttbb and σ ttjj , and their ratio. The result will provide important information about the main background in the search for ttH and will serve as a figure of merit for testing the validity of NLO QCD calculations.