Measurement of forward tt̄, W+bb̄ and W+cc̄ production in pp collisions at √s = 8 TeV

The production of tt̄, W + bb̄ and W + cc̄ is studied in the forward region of proton–proton collisions collected at a centre-of-mass energy of 8 TeV by the LHCb experiment, corresponding to an integrated luminosity of 1.98 ± 0.02 fb⁻¹. The W bosons are reconstructed in the decays W → ℓν, where ℓ denotes muon or electron, while the b and c quarks are reconstructed as jets. All measured cross-sections are in agreement with next-to-leading-order Standard Model predictions.


Introduction
The production of tt̄ pairs from proton-proton (pp) collisions in the forward region is of considerable interest, as it may be sensitive to physics beyond the Standard Model (SM) [1]. Furthermore, forward tt̄ events can be used to constrain the gluon parton distribution function (PDF) at large momentum fraction [2]. The tt̄ cross-section has been measured at ATLAS and CMS using several final states and at various centre-of-mass energies [3][4][5]. LHCb has also measured top quark production in the forward region in the W + b final state [6].
Measurements of the production cross-sections of W + bb̄ and W + cc̄ in the forward region provide experimental tests of perturbative quantum chromodynamics (pQCD) [7][8][9], in a phase space region complementary to that of ATLAS and CMS. Previous studies of the W + bb̄ final state have been performed by ATLAS [10] and CMS [11,12] at centre-of-mass energies √s = 7 TeV and 8 TeV.
LHCb has previously performed measurements of the production cross-sections of a W boson with at least one observed b or c jet [13] at 7 and 8 TeV, and a Z boson with at least one b jet at 7 TeV [14]. This Letter reports a study of events containing one isolated lepton (muon or electron) and two heavy-flavour tagged jets to measure the production cross-sections of tt̄, W⁺ + bb̄, W⁻ + bb̄, W⁺ + cc̄ and W⁻ + cc̄. The study of W + cc̄ is the first of its kind.
Measurements are performed using a data sample corresponding to an integrated luminosity of 1.98 ± 0.02 fb⁻¹ of pp collisions recorded at 8 TeV during 2012 by the LHCb experiment.

The LHCb detector and samples
The LHCb detector [15,16] is a single-arm forward spectrometer fully instrumented in the pseudorapidity range 2 < η < 5, designed for the study of particles containing b or c quarks. The detector includes a high-precision tracking system consisting of a silicon-strip vertex detector surrounding the pp interaction region, a silicon-strip detector located upstream of a dipole magnet with a bending power of about 4 Tm, and three stations of silicon-strip detectors and straw drift tubes placed downstream of the magnet. The tracking system provides a measurement of momentum, p, of charged particles with a relative uncertainty that varies from 0.5% at low momentum to 1.0% at 200 GeV. The minimum distance of a track to a primary vertex (PV), the impact parameter (IP), is measured with a resolution of (15 + 29/p_T) μm, where p_T is the component of the momentum transverse to the beam, in GeV. Different types of charged hadrons are distinguished using information from two ring-imaging Cherenkov detectors. Photons, electrons and hadrons are identified by a calorimeter system consisting of scintillating-pad (SPD) and preshower (PRS) detectors, an electromagnetic calorimeter and a hadronic calorimeter. Muons are identified by a system composed of alternating layers of iron and multiwire proportional chambers. The online event selection is performed by a trigger, which consists of a hardware stage, based on information from the calorimeter and muon systems, followed by a software stage, which applies a full event reconstruction.
The W → μν candidates are required to satisfy the hardware trigger requirement for muons, of having hits in the muon system corresponding to a high transverse momentum particle, and to satisfy the software trigger requirement of p_T(μ) > 10 GeV. The W → eν candidates are required to satisfy the hardware trigger requirement for electrons of having an electromagnetic cluster of high transverse energy associated with signals in the PRS and SPD detectors, and the software trigger, which selects events with an electron with p_T(e) > 15 GeV. A global event cut (GEC) on the number of hits in the SPD is applied in order to prevent high-multiplicity events from dominating the processing time of the reconstruction code.
Simulated event samples of W + jets, Z + jets, tt̄, single-top and diboson (WZ, ZZ) production are generated using Pythia 8 [17] with a specific LHCb configuration [18]. Event samples of W + bb̄, W + cc̄, Z + bb̄ and Z + cc̄ production are generated with Alpgen [19], which includes tree-level contributions with up to four additional emissions of final-state partons with respect to the leading-order diagram. Pythia 8 is used to perform the hadronisation for these samples. The cross-sections of the simulated processes are calculated at next-to-leading order (NLO), including spin correlation effects, with MCFM [20] using the CT10 PDF set [21]. Decays of hadronic particles are described by EvtGen [22], in which final-state radiation is generated using Photos [23]. The interaction of the generated particles with the detector, and its response, are implemented using the Geant4 toolkit [24] as described in Ref. [25]. Since neither showering nor hadronisation is included in MCFM, an overall correction is calculated to compare the measurements with the predicted cross-section at particle level. This is done by generating W + bb̄, W + cc̄ and tt̄ events with Pythia 8 with the CT10 PDF set [21], where the same acceptance requirements are applied. The particle-level lepton momentum used here is the momentum after final-state radiation as implemented in Pythia 8.
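As a worked illustration of the impact-parameter resolution quoted above, the short sketch below evaluates (15 + 29/p_T) μm at a few transverse momenta. The function name and the chosen p_T values are illustrative assumptions and are not taken from the analysis.

```python
def ip_resolution_um(pt_gev: float) -> float:
    """Quoted IP resolution in micrometres, (15 + 29/pT) with pT in GeV."""
    if pt_gev <= 0.0:
        raise ValueError("pT must be positive")
    return 15.0 + 29.0 / pt_gev


if __name__ == "__main__":
    # Illustrative pT values only (assumption, not from the source).
    for pt in (1.0, 2.0, 10.0, 20.0):
        print(f"pT = {pt:5.1f} GeV -> IP resolution = {ip_resolution_um(pt):.2f} um")
```

The 1/p_T term dominates for soft tracks and becomes a small correction to the 15 μm constant term for the high-p_T leptons used in this study.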

Event selection
Events are selected by requiring the presence of either a high-p_T muon or electron and two heavy-flavour tagged jets. The same fiducial definition for lepton and jets used in previous studies [6,13,26] is applied. The lepton must have p_T(ℓ) > 20 GeV and 2.0 < η(ℓ) < η_max(ℓ), where η_max(ℓ) is 4.50 for a muon candidate, corresponding to the muon identification system acceptance, and is 4.25 for an electron candidate, corresponding to the electromagnetic calorimeter acceptance. The jets are required to have p_T(j) > 12.5 GeV and 2.2 < η(j) < 4.2. Due to the limited sample size to validate the heavy-flavour tagging algorithm for higher p_T jets [27], only jets with p_T(j) < 100 GeV are considered. The lepton is required to be isolated from both jets using ΔR(ℓ, j) > 0.5, where ΔR = √(Δη² + Δφ²) is the distance between them in η-φ space and φ is the azimuthal angle. This requirement serves to remove the background formed by leptons coming from the same parton as the jets. The jets are also required to have , are removed to reduce the contamination from events not containing a W boson. If more than one (ℓ + j_1 + j_2) candidate is found in the event, the candidate with highest p_T^miss is selected.
Jets are reconstructed using a particle flow algorithm [28] and clustered using the anti-k_T algorithm [29] with distance parameter R = 0.5 as implemented in the FastJet software package [30]. As in Ref. [28], the jet energy is corrected to the particle level, excluding neutrinos, and the same jet quality requirements are applied. Jets are heavy-flavour tagged, i.e. as originating from a b or c quark, by the presence of a secondary vertex (SV) with ΔR < 0.5 between the jet axis and the direction of flight of the heavy-flavour hadron candidate, defined by the vector from the PV to the SV position.
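The lepton and jet acceptance requirements and the lepton-jet isolation can be sketched as below. This is a minimal illustration, not the analysis code: the dictionary layout and function names are assumptions, the distance is taken to be the usual √(Δη² + Δφ²) with the azimuthal difference wrapped into (−π, π], and only the requirements whose thresholds are stated explicitly in the text are implemented.

```python
import math

# Upper pseudorapidity edge per lepton flavour, as quoted in the text.
ETA_MAX = {"mu": 4.50, "e": 4.25}


def delta_phi(phi1: float, phi2: float) -> float:
    """Azimuthal difference wrapped into (-pi, pi]."""
    dphi = (phi1 - phi2) % (2.0 * math.pi)
    if dphi > math.pi:
        dphi -= 2.0 * math.pi
    return dphi


def delta_r(eta1: float, phi1: float, eta2: float, phi2: float) -> float:
    """Distance in eta-phi space."""
    return math.hypot(eta1 - eta2, delta_phi(phi1, phi2))


def lepton_in_fiducial(flavour: str, pt: float, eta: float) -> bool:
    return pt > 20.0 and 2.0 < eta < ETA_MAX[flavour]


def jet_in_fiducial(pt: float, eta: float) -> bool:
    return 12.5 < pt < 100.0 and 2.2 < eta < 4.2


def passes_selection(lepton: dict, jets: list) -> bool:
    """lepton: {'flavour','pt','eta','phi'}; jets: list of {'pt','eta','phi'}.

    Hypothetical event layout, for illustration only.
    """
    if len(jets) != 2:
        return False
    if not lepton_in_fiducial(lepton["flavour"], lepton["pt"], lepton["eta"]):
        return False
    for jet in jets:
        if not jet_in_fiducial(jet["pt"], jet["eta"]):
            return False
        # Lepton must be isolated from each jet.
        if delta_r(lepton["eta"], lepton["phi"], jet["eta"], jet["phi"]) <= 0.5:
            return False
    return True
```

Wrapping Δφ matters for objects on either side of the φ = ±π boundary, which would otherwise appear far apart even when they are nearly collinear.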
The SV-tagger algorithm, described in detail in Ref. [27], uses two boosted decision trees (BDTs) [31,32]: one that separates heavy-flavour from light-parton jets (BDT(bc|udsg)) and one that separates b jets from c jets (BDT(b|c)). Both jets used in the analysis are required to have BDT(bc|udsg) > 0.2, which gives a heavy-flavour tagging efficiency of about 50% (20%) for b (c) jets and a misidentification probability of about 0.1% for light jets.
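To give a feel for what these per-jet numbers imply when both jets must be tagged, the sketch below multiplies the quoted approximate efficiencies. It assumes the two tags are statistically independent and that the efficiencies do not depend on jet kinematics; neither assumption is made in the text, so the output is an order-of-magnitude guide only.

```python
# Approximate per-jet tagging probabilities quoted for BDT(bc|udsg) > 0.2.
TAG_PROBABILITY = {"b": 0.50, "c": 0.20, "light": 0.001}


def double_tag_probability(flavour1: str, flavour2: str) -> float:
    """Probability that both jets are tagged, assuming independent tags."""
    return TAG_PROBABILITY[flavour1] * TAG_PROBABILITY[flavour2]


if __name__ == "__main__":
    for pair in (("b", "b"), ("c", "c"), ("light", "light")):
        print(pair, f"{double_tag_probability(*pair):.2e}")
```

Under these assumptions the double-tag requirement retains roughly a quarter of bb̄ events and a few percent of cc̄ events, while suppressing light-jet pairs by about six orders of magnitude.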
In order to suppress the Z + jets background, events with an additional oppositely charged high-p_T lepton that fulfills the lepton requirements described above are vetoed. Backgrounds from misidentified leptons or semileptonic decays of heavy-flavour hadrons are suppressed by two requirements applied to the lepton: IP(ℓ) must be less than 0.04 mm and p_T(ℓ)/p_T(j_ℓ) > 0.8, where j_ℓ is defined as a reconstructed jet with relaxed quality criteria that contains the lepton.

Backgrounds
In both the electron and muon channels, the background processes include Z + bb̄ and Z + cc̄ production with Z → μμ or Z → ee, where one of the final-state leptons is not reconstructed.
Z(→ ττ) + bb̄ production is also considered, where at least one τ decays to an electron or a muon. A small contribution of Z → ττ produced in association with one b or c jet is also included. Other processes of Z production in association with jets are negligible. Background contributions from W(→ ℓν) + jets where the event does not contain two b jets, and W(→ τν_τ) + bb̄ where the τ decays to an electron or muon, are also included. Single-top, W(→ ℓν)Z(→ bb̄) and Z(→ ℓℓ)Z(→ bb̄) production are considered as background processes. The expected yields of the background processes described above are obtained from NLO cross-sections. Weight factors are applied to compensate for residual differences between data and simulation for GEC, trigger and heavy-flavour tagging efficiencies. Further details about these factors and their uncertainties are given in Sec. 5.4.
The QCD multi-jet background, which includes lepton misidentification and semileptonic decays of a beauty or charm hadron, is estimated by using events which fail the p_T(ℓ)/p_T(j_ℓ) > 0.8 requirement. The QCD multi-jet background normalisation is adjusted in order to describe the event yield at IP(ℓ) > 0.04 mm, after subtracting the non-QCD backgrounds obtained from simulation.
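One way to read the data-driven normalisation described above is as a single scale factor fixed in the high-IP control region and then applied to the template in the signal region. The sketch below follows that reading. The function names, the clamping of negative differences to zero and all yields are hypothetical and are not taken from the analysis.

```python
def qcd_scale_factor(n_data_cr: float, n_non_qcd_cr: float, n_template_cr: float) -> float:
    """Scale applied to the QCD template so that, in the control region
    (large lepton IP), template + simulated non-QCD backgrounds match data.

    The clamp at zero is an illustrative choice, not from the source.
    """
    if n_template_cr <= 0.0:
        raise ValueError("QCD template is empty in the control region")
    return max(n_data_cr - n_non_qcd_cr, 0.0) / n_template_cr


def qcd_yield_signal_region(scale: float, n_template_sr: float) -> float:
    """Expected QCD yield in the signal region for a given template yield."""
    return scale * n_template_sr


if __name__ == "__main__":
    # Hypothetical yields for illustration only.
    scale = qcd_scale_factor(n_data_cr=120.0, n_non_qcd_cr=20.0, n_template_cr=400.0)
    print("scale factor:", scale)
    print("QCD yield in signal region:", qcd_yield_signal_region(scale, n_template_sr=80.0))
```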

Overview
The data sample is split into four subsamples, according to the flavour and charge of the lepton (μ± and e±). A simultaneous fit to the distributions of four variables is performed to determine the tt̄, W⁺ + bb̄, W⁻ + bb̄, W⁺ + cc̄ and W⁻ + cc̄ yields in each sample. The four variables used in the fit are the invariant mass of the two jets (m_jj), the response of a multivariate classifier trained to distinguish between tt̄ and W + bb̄ events, and the response of the classifier trained to discriminate between b and c jets, evaluated for each jet: j_1 BDT(b|c) and j_2 BDT(b|c). The expected background components are obtained from simulation, with the exception of the QCD multi-jet background. The fitted signal yields are converted into cross-sections using simulation and data-driven efficiencies and the measurement of the integrated luminosity [33]. The systematic uncertainties are included as nuisance parameters in the fit and propagated to the final result.

Fit variables
While W + bb̄ and W + cc̄ processes can be disentangled using the BDT(b|c) variables for both jets, the separation between tt̄ and W + bb̄ or W + cc̄ is obtained by using the m_jj variable and a multivariate discriminant, uGB, constructed such that its response is minimally correlated with m_jj [34]. The variables m_jj, j_1 BDT(b|c) and j_2 BDT(b|c) are found to be uncorrelated. The uGB response is trained in simulation using 11 kinematic variables of , ΔR(jj, j_1), ΔR(jj, j_2) and cos(θ_jj(ℓ)), where θ_jj(ℓ) is the lepton scattering angle in the dijet rest frame and jj represents the dijet system. The muon and electron decay channels are trained separately. Fig. 1 shows the correlation between the uGB and the m_jj variables. In the fit all variables are treated as uncorrelated; the effect of the observed small correlations is taken into account in the systematic uncertainties of the results.

Signal determination
A binned maximum likelihood fit is performed to determine the yields of tt̄, W⁺ + bb̄, W⁻ + bb̄, W⁺ + cc̄ and W⁻ + cc̄. The simulated background yields are normalised to NLO predictions and they are allowed to vary in the fit within their uncertainties. The QCD multi-jet background is normalised from a data-driven method as explained in Section 4. The fit is performed assuming the four variables (m_jj, uGB, j_1 BDT(b|c) and j_2 BDT(b|c)) to be uncorrelated.
The free parameters in the fit are the normalisation factors with respect to the SM predicted yields, K(i), where i = tt̄, W⁺ + bb̄, W⁻ + bb̄, W⁺ + cc̄, W⁻ + cc̄. The K(tt̄) parameter is fitted using all four samples, while the others are fitted in each corresponding sample. The projections of the fit in each of the four samples are shown in Figs. 2-5, while the fit results are given in Table 1.
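The idea of fitting a normalisation factor K relative to the predicted yield can be illustrated with a deliberately reduced toy: one signal template, one fixed background template, one variable and a Poisson likelihood per bin. The actual fit is simultaneous over four variables and four samples, with several K(i) and nuisance parameters, so the sketch below is not the analysis fit; the templates, the search range and the golden-section minimiser are all illustrative assumptions.

```python
import math


def negative_log_likelihood(k, data, signal, background):
    """Binned Poisson NLL (constant terms dropped) for expectation k*s + b."""
    total = 0.0
    for n, s, b in zip(data, signal, background):
        mu = k * s + b
        if mu <= 0.0:
            return math.inf
        total += mu - n * math.log(mu)
    return total


def fit_normalisation(data, signal, background, lo=0.0, hi=5.0, tol=1e-6):
    """Minimise the NLL in k with a golden-section search on [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if (negative_log_likelihood(c, data, signal, background)
                < negative_log_likelihood(d, data, signal, background)):
            b = d  # minimum lies in [a, d]
        else:
            a = c  # minimum lies in [c, b]
    return 0.5 * (a + b)


if __name__ == "__main__":
    # Hypothetical templates (predicted yields per bin) and pseudo-data.
    signal = [10.0, 20.0, 30.0, 20.0]
    background = [5.0, 5.0, 5.0, 5.0]
    data = [17.0, 29.0, 41.0, 29.0]
    print("fitted K:", round(fit_normalisation(data, signal, background), 4))
```

A value of K compatible with 1 means the fitted yield agrees with the prediction used to build the template, which is how the normalisation factors in Table 1 are to be read.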

Systematic uncertainties
Systematic effects can impact the results in two ways: by affecting signal and background yields, or by altering template shapes used in the fits. The efficiency of the GEC is measured in a Z + jet sample selected with a looser trigger requirement [26] and a 2% uncertainty is assigned to account for the final-state dependence of the GEC efficiency observed in simulation. The systematic uncertainty on the integrated luminosity is 1.16% [33].
The lepton reconstruction and trigger efficiencies are studied using data-driven methods in Z → ℓ⁺ℓ⁻ [35,36]. Those studies show that data and simulation agree within 1.0-5.0% depending on η(ℓ) and p_T(ℓ), which is taken as systematic uncertainty. The uncertainty of the lepton kinematic efficiency, which includes the effect of final-state radiation, is neglected. The method described in Ref. [27] is used to assess the systematic uncertainty due to the errors of the heavy-flavour tagging efficiency weight-factor described in Sec. 4, which amounts to 5-10% depending on p_T(j).
The systematic uncertainty of the jet energy calibration includes possible biases due to flavour dependence (2%), tracks not associated to a real particle (1.2%), track momentum resolution (1%) and residual differences between simulation and data due to pile-up and calorimeter response (1%) as described in Refs. [26,28]. The jet energy resolution at LHCb is modelled in simulation to an accuracy of about 10% [28,13]. The uncertainties related to the jet reconstruction and quality selection efficiencies are found to be below 2%. The jet-related systematic uncertainties affect both the template shapes and the expected yields.
The simulated background normalisations are predicted at NLO and they are affected by uncertainties on the PDF (δ_PDF), on the strong coupling constant α_s (δ_αs) and on the renormalisation and factorisation scale (δ_scale). The PDF uncertainty is evaluated following the procedure of Ref. [37]. The influence of the uncertainty on the strong coupling constant is evaluated by calculating the cross-sections with PDF sets [21] using values of α_s(M_Z): 0.117, 0.118 and 0.119. The scale uncertainty is evaluated by calculating the cross-sections varying the renormalisation and factorisation scale by a factor of two. The total uncertainty is taken as √(δ²_PDF + δ²_αs) + δ_scale, as done in Ref. [6], which translates to relative uncertainties on the signal yields in the range 3-10%. These theoretical uncertainties are also considered in the signal yields in the experimental acceptance. The systematic uncertainties in the normalisations due to the limited size of the simulated samples are between 1 and 7%. The uncertainty on the normalisation of the QCD multi-jet background, taken from data, is found to have a negligible effect.
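The combination of theoretical uncertainties can be sketched as follows, reading the expression as the PDF and α_s components added in quadrature and the scale component added linearly. That reading, the function name and the input values are assumptions made for illustration.

```python
import math


def total_theory_uncertainty(delta_pdf: float, delta_alpha_s: float, delta_scale: float) -> float:
    """sqrt(delta_pdf^2 + delta_alpha_s^2) + delta_scale (relative uncertainties)."""
    return math.hypot(delta_pdf, delta_alpha_s) + delta_scale


if __name__ == "__main__":
    # Hypothetical relative uncertainties, for illustration only.
    total = total_theory_uncertainty(delta_pdf=0.03, delta_alpha_s=0.04, delta_scale=0.05)
    print(f"total relative uncertainty: {total:.3f}")
```

Adding the scale term linearly rather than in quadrature is the more conservative choice, since it does not treat the scale variation as a Gaussian uncertainty independent of the other two.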
Possible correlation effects between the fitted variables are studied by using templates generated randomly from the analysis templates with or without correlations found in simulation. It is found that the correlation and the fit procedure can affect the final yields by up to 10%.
All significant systematic uncertainties are correlated between the four samples except for the uncertainty due to the finite size of the simulated samples, which affects each sample and process independently.

Results and conclusions
The production cross-sections for tt̄, W⁺ + bb̄, W⁻ + bb̄, W⁺ + cc̄ and W⁻ + cc̄ are measured for pp collisions at a centre-of-mass energy of 8 TeV corresponding to an integrated luminosity of 1.98 ± 0.02 fb⁻¹ of data collected in 2012 by the LHCb experiment. These production cross-sections are obtained as the product of the normalisation factors shown in Table 1 and the corresponding expected cross-sections. The measured and expected cross-sections are presented in Table 2 and Fig. 6.

Table 2: Observed and expected cross-sections in the fiducial region defined in Section 3. The first uncertainty on the expected cross-sections is related to the scale variation and the second is the total. The first uncertainty on the observed cross-sections is statistical and the second is systematic.
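Since the fitted normalisation factors are defined relative to the SM predicted yields, a measured cross-section can be illustrated as a normalisation factor multiplied by the corresponding expected cross-section, with the uncertainties on the factor scaled in the same way. The sketch below uses hypothetical numbers and an assumed unit of pb; it does not reproduce any value from Table 1 or Table 2.

```python
def measured_cross_section(k: float, k_stat: float, k_syst: float, sigma_expected: float):
    """Return (central value, statistical, systematic) for sigma = K * sigma_expected.

    The uncertainty on sigma_expected itself is deliberately not propagated here.
    """
    return (k * sigma_expected, k_stat * sigma_expected, k_syst * sigma_expected)


if __name__ == "__main__":
    # Hypothetical inputs for illustration only (cross-section in pb assumed).
    central, stat, syst = measured_cross_section(k=1.2, k_stat=0.1, k_syst=0.2, sigma_expected=0.5)
    print(f"sigma = {central:.3f} +/- {stat:.3f} (stat) +/- {syst:.3f} (syst) pb")
```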