Search for supersymmetry in pp collisions at √ s = 8 TeV in events with a single lepton, large jet multiplicity, and multiple b jets

Abstract
Results are reported from a search for supersymmetry in pp collisions at a center-of-mass energy of 8 TeV, based on events with a single isolated lepton (electron or muon) and multiple jets, at least two of which are identified as b jets. The data sample corresponds to an integrated luminosity of 19.3 fb −1 recorded by the CMS experiment at the LHC in 2012. The search is motivated by supersymmetric models that involve strong-production processes and cascade decays of new particles. The resulting final states contain multiple jets as well as missing transverse momentum from weakly interacting particles. The event yields, observed across several kinematic regions, are consistent with the expectations from standard model processes predicted from control samples in the data. The results are interpreted in the context of simplified supersymmetric scenarios with pair production of gluinos, where each gluino decays to a top quark-antiquark pair and the lightest neutralino. For the case of decays via virtual top squarks, gluinos with a mass smaller than 1.26 TeV are excluded for low neutralino masses.


Introduction
This paper presents results from a search for new physics in proton-proton collisions at a center-of-mass energy of 8 TeV in events with a single lepton (electron or muon), missing transverse momentum, and multiple jets, at least two of which are tagged as originating from bottom quarks (b-tagged jets). This signature arises in models based on supersymmetry (SUSY) [1][2][3][4][5][6], which potentially offers natural solutions to limitations of the standard model (SM). Large loop corrections to the Higgs boson mass could be cancelled by contributions from supersymmetric partners of SM particles. Achieving these cancellations requires the gluino (g̃) and top squark (t̃), which are the SUSY partners of the gluon and top quark, respectively, to have masses less than about 1.5 TeV [7][8][9][10]. Extensive searches at LEP, the Tevatron, and the Large Hadron Collider (LHC) have not produced evidence for SUSY (see  for recent results in the single-lepton topology). For scenarios with mass-degenerate scalar partners of the first- and second-generation quarks, the mass limits generally lie well above 1 TeV. However, viable scenarios remain with t̃ and g̃ masses below approximately 0.5 and 1.5 TeV, respectively.
In some of these scenarios top squarks are the lightest quark partners. In R-parity conserving models [18] this could lead to signatures with multiple W bosons, multiple b quarks, and two LSPs in the final state, where the LSP is the weakly interacting lightest SUSY particle. The search described in this paper is designed to detect these signatures. It focuses on gluino pair production, with subsequent gluino decay to two top quarks and the LSP (χ̃ 0 1 ) through either a virtual or an on-shell top squark: pp → g̃ g̃ with g̃ (→ t̃ t̄) → t t̄ χ̃ 0 1 . These decay chains result in events with high jet multiplicity, four b quarks in the final state, and large missing transverse momentum (E T / ). The probability that exactly one of the four W bosons decays leptonically is approximately 40%, motivating a search in the single-lepton channel.
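The quoted figure of roughly 40% can be cross-checked with a binomial estimate. The sketch below is illustrative only: it assumes that the four W bosons decay independently and that each decays to an electron or a muon with a branching fraction of about 2 × 0.108 (an approximate value that is not taken from this paper); leptonic τ decays are ignored.

```python
from math import comb

def prob_exactly_k_leptonic(n_w: int, k: int, br_lep: float) -> float:
    """Binomial probability that exactly k of n_w independent W bosons decay leptonically."""
    return comb(n_w, k) * br_lep**k * (1.0 - br_lep) ** (n_w - k)

# Assumption: BR(W -> e nu) + BR(W -> mu nu) is roughly 2 * 0.108.
BR_W_TO_E_OR_MU = 2 * 0.108

p_single_lepton = prob_exactly_k_leptonic(4, 1, BR_W_TO_E_OR_MU)
print(f"P(exactly one leptonic W out of four) = {p_single_lepton:.2f}")
```

Under these assumptions the estimate comes out slightly above 0.4, in line with the approximate value quoted in the text.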
Three variations of this scenario, denoted models A, B, and C, are considered in this analysis and implemented within the simplified model spectra (SMS) framework [19][20][21]. In model A (models B and C), gluinos are lighter (heavier) than top squarks and gluino decay proceeds through a virtual (real) t̃.
The relevant backgrounds for this search arise from tt, W+jets, single-top-quark, diboson, ttZ, ttW, and Drell-Yan (DY)+jets production. The non-tt backgrounds are strongly suppressed by requiring at least six jets, at least two of which are b-tagged. The remaining background is dominated by tt events with large E T / , generated either by a single highly boosted W boson that decays leptonically (single-lepton event) or by two leptonically decaying W bosons (dilepton event). Though tt decays produce two true b quarks, additional b-tagged jets can arise because of gluon splitting to a bb pair or from mistagging of charm-quark, light-quark, or gluon jets.
We search for an excess of events over SM expectations using two approaches. The first approach is based on the distribution of E T / in exclusive intervals of H T , where H T is the scalar sum of jet transverse momentum (p T ) values. In this approach, we evaluate the E T / distribution in the signal region of high H T in two different ways [12,13]: by extrapolating from lower H T and by using the charged-lepton momentum spectrum (this latter spectrum is highly correlated with the neutrino p T spectrum in events with a leptonically decaying W boson and so carries information about E T / ).
The second approach, which is new and described in more detail in this paper, is based on the azimuthal angle, ∆φ(W, ℓ), between the reconstructed W-boson direction and the lepton. The single-lepton background from tt is suppressed by rejecting events with small ∆φ(W, ℓ).

Data sample and event selection
The data used in this search were collected in proton-proton collisions at √ s = 8 TeV with the Compact Muon Solenoid (CMS) experiment in 2012 and correspond to an integrated luminosity of 19.3 fb −1 . The central feature of the CMS apparatus is a superconducting solenoid, providing a magnetic field of 3.8 T. Within the superconducting solenoid volume are a silicon pixel and strip tracker, a lead tungstate crystal electromagnetic calorimeter, and a brass-scintillator hadron calorimeter. Muons are measured in gas-ionization detectors embedded in the steel flux-return yoke outside the solenoid. Extensive forward calorimetry complements the coverage provided by the barrel and endcap detectors. The origin of the CMS coordinate system is the nominal interaction point. The polar angle θ is measured from the counterclockwise beam direction and the azimuthal angle φ (in radians) is measured in the plane transverse to the beam axis. The silicon tracker, the muon systems, and the barrel and endcap calorimeters cover the regions |η| < 2.5, |η| < 2.4, and |η| < 3.0, respectively, where η = − ln[tan(θ/2)] is the pseudorapidity. A detailed description of the CMS detector can be found elsewhere [22].
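As a small numerical illustration of the pseudorapidity definition η = − ln[tan(θ/2)] given above, the following sketch converts between polar angle and η. The function names are chosen here for illustration and do not come from any CMS software.

```python
import math

def pseudorapidity(theta: float) -> float:
    """eta = -ln[tan(theta/2)] for a polar angle theta in radians (0 < theta < pi)."""
    return -math.log(math.tan(theta / 2.0))

def polar_angle(eta: float) -> float:
    """Inverse relation: theta = 2 * atan(exp(-eta))."""
    return 2.0 * math.atan(math.exp(-eta))

# The tracker acceptance |eta| < 2.5 corresponds to polar angles down to about 9.4 degrees.
print(f"theta at eta = 2.5: {math.degrees(polar_angle(2.5)):.1f} degrees")
```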
Simulated event samples based on Monte Carlo (MC) event generators are used to validate and calibrate the background estimates from data and to evaluate the contributions for some small backgrounds. The MADGRAPH [23] 5 generator with CTEQ6L1 [24] parton distribution functions (PDFs) is used for tt, W+jets, DY+jets, ttZ, ttW, and QCD multijet processes and the POWHEG 1.0 [25] generator for single-top-quark production. The PYTHIA [26] 6.4 generator is used to generate diboson samples and to describe the showering and hadronization of all samples (the Z2 * tune [27] is used). Decays of τ leptons are handled by TAUOLA [28]. The GEANT4 [29] package is used to describe the detector response.
The SUSY signals for the three scenarios considered in this analysis are generated with MADGRAPH and CTEQ6L1 PDFs. In these scenarios, gluinos are pair-produced and decay into t t̄ χ̃ 0 1 , assuming the narrow-width approximation. For the signal samples, the detector response is described using a fast simulation [30]. The fast simulation has been validated extensively against the detailed GEANT4 simulation for the variables relevant for this search and efficiency corrections based on data are applied. All simulated events are reweighted to match the multiplicity distribution of additional proton-proton collisions ("pileup") as observed in data.
Events are selected online with either triple- or double-object triggers. The triple-object triggers require a lepton with p T > 15 GeV, together with H T > 350 GeV and E T / > 45 GeV. The double-object triggers, which are used to select control samples and extend the E T / acceptance in the approach based on ∆φ(W, ℓ), have the same H T requirement, no E T / requirement, and a lepton p T threshold of 40 GeV. The trigger object efficiencies are measured in independently triggered control samples and found to reach a plateau at approximately 95% for thresholds well below those used in the offline selection. The measured trigger efficiencies are used to correct the simulation.
The preselection of events is based on the reconstruction of an isolated lepton (e or µ) and multiple jets and follows the procedure described in Ref. [12]. Events are required to include at least one lepton with p T > 20 GeV and |η| < 2.5 (e) or |η| < 2.4 (µ). Standard identification and isolation requirements [31,32] are applied to reject backgrounds from jets mimicking the lepton signature and from non-prompt leptons produced in semileptonic decays of hadrons within jets. The isolation selection requires the sum of transverse momenta of particles in a cone of radius ∆R = √((∆η)² + (∆φ)²) = 0.3 around the electron (muon) direction, divided by the p T of the lepton itself, to be less than 0.15 (0.12). The lepton efficiencies are measured with a "tag-and-probe" technique [33] to be approximately 80% for electrons and 95% for muons. The efficiencies vary by less than 20% over the selected kinematic range and the average values agree to better than 1% between data and simulation.
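The relative isolation requirement can be sketched as follows. This is a simplified illustration, not the CMS implementation: the dictionary-based particle representation and the function names are assumptions, the lepton itself is assumed to be excluded from the particle list, and no pileup correction is applied.

```python
import math

def delta_phi(phi1: float, phi2: float) -> float:
    """Signed azimuthal difference wrapped into [-pi, pi)."""
    return (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi

def delta_r(eta1: float, phi1: float, eta2: float, phi2: float) -> float:
    """Cone distance sqrt((d_eta)^2 + (d_phi)^2)."""
    return math.hypot(eta1 - eta2, delta_phi(phi1, phi2))

def relative_isolation(lepton: dict, particles: list, cone: float = 0.3) -> float:
    """Sum of particle pT inside the cone around the lepton, divided by the lepton pT."""
    sum_pt = sum(
        p["pt"] for p in particles
        if delta_r(lepton["eta"], lepton["phi"], p["eta"], p["phi"]) < cone
    )
    return sum_pt / lepton["pt"]

def is_isolated(lepton: dict, particles: list) -> bool:
    """Thresholds from the text: 0.15 for electrons, 0.12 for muons."""
    threshold = 0.15 if lepton["flavor"] == "e" else 0.12
    return relative_isolation(lepton, particles) < threshold

# Hypothetical example: one soft particle inside the cone, one hard particle outside it.
muon = {"flavor": "mu", "pt": 50.0, "eta": 0.0, "phi": 3.1}
nearby = [{"pt": 3.0, "eta": 0.1, "phi": -3.1}, {"pt": 20.0, "eta": 1.0, "phi": 0.0}]
print(relative_isolation(muon, nearby), is_isolated(muon, nearby))
```

The wrap-around in `delta_phi` matters for leptons near φ = ±π, as in the example, where the soft particle lies inside the cone even though the raw φ values differ by more than 6 radians.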
Jets are clustered from particles reconstructed with the particle-flow (PF) algorithm [34], which combines information from all components of the detector. The clustering is performed with the anti-k T clustering algorithm [35] with a distance parameter of 0.5. Jet candidates are required to satisfy quality criteria that suppress noise and spurious non-collision-related energy deposits. Jets with p T > 40 GeV and |η| < 2.4 are considered in the analysis and are used to determine the number of selected jets N j and H T . The missing transverse momentum is determined from the vector sum of the momenta of all particles reconstructed by the PF algorithm. Jet and E T / energies are corrected to compensate for shifts in the jet energy scale and the presence of particles from pileup interactions [36].
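The jet counting and the H T definition can be illustrated with a few lines of code. The jet representation as dictionaries is an assumption made for this sketch; the thresholds are those quoted above.

```python
def select_jets(jets, pt_min=40.0, abs_eta_max=2.4):
    """Keep jets with pT > 40 GeV and |eta| < 2.4, as required in the text."""
    return [j for j in jets if j["pt"] > pt_min and abs(j["eta"]) < abs_eta_max]

def jet_multiplicity_and_ht(jets):
    """Return (N_j, H_T), where H_T is the scalar sum of the selected jet pT values."""
    selected = select_jets(jets)
    return len(selected), sum(j["pt"] for j in selected)

# Hypothetical event: two jets fail the selection (one too soft, one too forward).
example_jets = [
    {"pt": 210.0, "eta": 0.3}, {"pt": 150.0, "eta": -1.2}, {"pt": 95.0, "eta": 2.0},
    {"pt": 60.0, "eta": -0.7}, {"pt": 35.0, "eta": 0.1}, {"pt": 80.0, "eta": 2.9},
]
n_j, h_t = jet_multiplicity_and_ht(example_jets)
print(n_j, h_t)
```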
The number of b-tagged jets, N b , is determined by applying the combined secondary vertex tagger [37,38] to the selected jets. At the working point used, this tagger has a b-tagging efficiency of roughly 70% and a mistag rate for light partons (charm quarks) of approximately 3% (15-20%). Scale factors for the efficiencies and mistag rates relative to simulation are measured with control samples in data and applied in the analysis.
As the signal events are expected to exhibit a high level of hadronic activity and contain a large number of b quarks, events are required to have H T > 400 GeV. In addition, at least two b-tagged jets and a total jet multiplicity N j ≥ 6 are required. The SM background in this sample is dominated by tt production. Samples with 3 ≤ N j ≤ 5 or fewer than two b-tagged jets are used to define background-dominated control regions. Events with a second isolated lepton with p T > 15 GeV are vetoed by the nominal signal selection to suppress contributions from dilepton tt decays, but such events are used as a control sample to measure the residual background from that process.
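The nominal signal selection described above can be summarized in a short predicate. This is a minimal sketch under simplifying assumptions: the inputs are assumed to be isolated-lepton p T values and precomputed N j , N b , and H T , and the function name is hypothetical.

```python
def passes_signal_selection(lepton_pts, n_jets, n_btags, ht):
    """Single-lepton signal selection: one lepton with pT > 20 GeV, no second
    isolated lepton with pT > 15 GeV, H_T > 400 GeV, N_j >= 6, and N_b >= 2."""
    selected = [pt for pt in lepton_pts if pt > 20.0]
    veto_candidates = [pt for pt in lepton_pts if pt > 15.0]
    # Exactly one lepton above the veto threshold, and it must pass the 20 GeV cut.
    one_lepton = len(selected) >= 1 and len(veto_candidates) == 1
    return one_lepton and ht > 400.0 and n_jets >= 6 and n_btags >= 2

print(passes_signal_selection([35.0], 7, 2, 650.0))        # passes
print(passes_signal_selection([35.0, 18.0], 7, 2, 650.0))  # vetoed: second lepton
```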

Search in missing transverse momentum and H T
We now describe the background estimation method based on the evaluation of the E T / spectrum. This method utilizes two techniques, as mentioned above, both of which were employed for previous CMS studies [12,13]. The lepton spectrum (LS) method makes use of the similarity between the neutrino and charged-lepton p T spectra in W decays to predict the high-side tail of the E T / distribution [39] based on the p T distribution of charged leptons with high p T . The missing transverse momentum template (MT) method uses a parametric description of the E T / spectrum based on a fit to control regions at low H T . Through extensive use of data control samples, we avoid uncertainties related to potential deficiencies of the simulation in the description of SM processes.
We consider overlapping signal regions corresponding to lower limits for H T ranging from 400 to 1000 GeV, each of which provides sensitivity to a different SUSY-particle mass region. The E T / spectrum in these samples is divided into exclusive ranges: 150-250, 250-350, 350-450, and >450 GeV. To increase the sensitivity, the search regions are further divided into events with N b = 2 and ≥3. The two background estimation methods provide direct predictions for events with two b-tagged jets. The expected yields at higher N b are obtained by extrapolating those predictions to the ≥3 b-jet case.

Prediction of the single-lepton background for the LS method
The E T / spectrum of the single-lepton background is predicted with a method based on the similarity of the neutrino and charged lepton p T spectra in W decays. In each event, the charged and neutral lepton p T can be very different, but the distributions of the true neutrino p T and the true lepton p T are identical in the absence of W polarization. There are several effects that result in differences between the observed lepton and neutrino p T spectra and for which corrections are derived: W polarization, the effect of a lepton p T threshold, the difference between the E T / and lepton-p T resolutions, and non-single-lepton components, which are not modeled by the method. The W-boson polarization in tt decays is the dominant effect that causes a difference between the neutrino and lepton p T spectra. This polarization is well understood theoretically [40] and accounted for in the simulation. The difference between the E T / and lepton-p T resolution is modeled with templates measured in multijet data samples, which have little genuine E T / . These resolution templates are binned in H T and N j and are used to smear the lepton-p T spectrum to account for the difference with respect to the E T / resolution.
To predict the E T / spectrum from the lepton p T spectrum, scale factors κ LS are calculated in simulation and applied in the bins of E T / to the results obtained from a control sample selected using lepton p T > 50 GeV and without an E T / requirement. The scale factors are defined by κ LS (E T / bin) = N true (E T / bin)/N pred (E T / bin), where N true is the MC yield of true single-lepton events of all background types in a given E T / bin and N pred is the predicted MC yield in the same bin, after the E T / resolution templates have been applied to the p T spectrum. The scale factor removes the contribution of the τ lepton and dilepton backgrounds, which are predicted separately. The calculation of the scale factor is dominated by the contributions of tt events, but W+jets, DY+jets, single-top-quark, ttZ, and ttW events are included as well; the contribution of diboson events is negligible. The dependence of the scale factor on lepton p T primarily reflects the effect of the W-boson polarization in tt decays. The scale factor varies from around 0.9 at low lepton p T , after E T / resolution smearing, to about 1.7 at high p T .
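The use of the κ LS scale factors can be sketched as follows. All yields in this example are hypothetical placeholders chosen for illustration; only the definition κ LS = N true / N pred per E T / bin is taken from the text.

```python
def kappa_ls(n_true_mc, n_pred_mc):
    """Per-bin scale factor: MC yield of true single-lepton events divided by the
    MC yield predicted from the smeared lepton-pT spectrum in the same bin."""
    return [t / p for t, p in zip(n_true_mc, n_pred_mc)]

def corrected_prediction(n_pred_data, kappa):
    """Apply the per-bin scale factors to the raw prediction from the data control sample."""
    return [n * k for n, k in zip(n_pred_data, kappa)]

# Hypothetical yields in the four bins 150-250, 250-350, 350-450, and >450 GeV.
n_true_mc = [90.0, 30.0, 12.0, 8.5]
n_pred_mc = [100.0, 25.0, 8.0, 5.0]
kappa = kappa_ls(n_true_mc, n_pred_mc)
print(kappa)
print(corrected_prediction([110.0, 28.0, 7.0, 4.0], kappa))
```

With these placeholder inputs the scale factors rise from 0.9 in the lowest bin to 1.7 in the highest, mimicking the trend quoted in the text.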
Systematic uncertainties are evaluated by calculating the change induced in the scale factors from various effects and propagating this change to the predicted yields. The dominant uncertainties arise from the statistical uncertainties of the simulated samples used in the determination of the scale factor (9-49%), the jet and E T / scale (7-31%, depending on H T and E T / ), and the W polarization in tt decays (2-4%). Smaller uncertainties arise from the lepton efficiency, sub-dominant background cross sections, and DY+jets yield.

Predictions of τ lepton and dilepton backgrounds for the LS method
Neutrinos from τ lepton decays cause the E T / and charged-lepton p T spectra to differ. Therefore, the SM background from τ leptons is evaluated separately, following the procedure documented in Ref. [39]. While τ-lepton decays are well simulated, their p T spectra may not be. Thus we apply τ-lepton response functions derived from simulated tt events to the p T spectra of electrons and muons measured in single-lepton and dilepton control samples. In these control samples, the E T / requirement is removed and a selection to reject DY events is applied. The H T and N j requirements are loosened in the control sample used to estimate the background of events with hadronically decaying taus. For leptonic (hadronic) τ-lepton decays, hereafter labelled τ ℓ (τ h ), the response function is the distribution of the daughter lepton (jet) p T as a fraction of the parent τ lepton p T . To predict the contribution to the E T / spectrum, the observed lepton in the control sample is replaced by a lepton (or jet), with the transverse momentum sampled from the appropriate response function; the difference between the sampled and original p T is added vectorially to the E T / . This procedure is used to predict three background categories: single τ ℓ , ℓ + τ h , and ℓ + τ ℓ events; the notation ℓ includes τ ℓ components. Each of these processes could yield events that satisfy the single-lepton selection requirements; in the ℓ + τ ℓ case this can only occur when exactly one of the final-state leptons is selected.
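The lepton replacement step can be sketched as below. This is a simplified illustration under stated assumptions: the response function is represented by a hypothetical list of p T fractions rather than the simulated distributions used in the analysis, the control-sample lepton is taken to stand in for the parent τ lepton, and the momentum not carried by the visible daughter is added to the E T / along the original lepton direction.

```python
import math
import random

def apply_tau_response(lep_pt, lep_phi, met_x, met_y, fraction):
    """Replace the control-sample lepton by a visible tau daughter carrying `fraction`
    of its pT and add the remainder vectorially to the missing transverse momentum."""
    daughter_pt = fraction * lep_pt
    lost_pt = lep_pt - daughter_pt  # assumed to be carried away by neutrinos
    new_met_x = met_x + lost_pt * math.cos(lep_phi)
    new_met_y = met_y + lost_pt * math.sin(lep_phi)
    return daughter_pt, math.hypot(new_met_x, new_met_y)

# Hypothetical response function: sampled fractions of the parent tau pT.
RESPONSE_FRACTIONS = [0.15, 0.25, 0.35, 0.45, 0.60]

rng = random.Random(1)  # fixed seed so that the sketch is reproducible
fraction = rng.choice(RESPONSE_FRACTIONS)
print(apply_tau_response(100.0, 0.0, 50.0, 0.0, fraction))
```

In practice each control-sample event would be sampled many times and weighted by the relevant branching fractions and efficiencies, which this sketch does not attempt to model.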
The E T / spectrum obtained from applying the response functions to the control samples is corrected as a function of E T / and H T for branching fractions and efficiencies determined from MC simulation. These correction factors are roughly 0.2, 0.9, and 0.6 for the single τ ℓ , ℓ + τ ℓ , and ℓ + τ h backgrounds, respectively, in all H T bins. A correction is derived from simulation to account for a possible dependence on E T / of the event selection and acceptance (note that this correction is consistent with one to within the uncertainties).
SM backgrounds also arise from dilepton events. There are two categories of these events: those with both leptons reconstructed but where only one of the leptons is selected, and those with one lepton that is not reconstructed, which can occur either because of a reconstruction inefficiency or because the lepton lies outside the η acceptance of the detector. The estimate of the background from these processes is given by the simulated E T / distribution, corrected by the ratio of the number of data to MC events in a dilepton control sample. This sample is the same as that used in the ℓ + τ ℓ background prediction, but with an additional requirement of E T / > 100 GeV used to retain high trigger efficiency. Systematic uncertainties for the dilepton background estimate arise from the uncertainty in the data/MC scale factor, pileup, trigger and selection efficiencies, and the top quark p T spectrum.
The background composition is similar in each of the LS signal regions, with relative yields of single lepton, single τ, and other background components in the approximate proportion of 4:1:1. The total yields are given in Table 1 and Fig. 1 shows the E T / distributions.

The missing transverse momentum model in the MT method
For values of E T / well above the W boson mass, the SM E T / distribution primarily arises from neutrino emission (genuine E T / ) and has an approximately exponential shape. According to simulation, this distribution depends on H T and, to a lesser extent, N j and N b , with only a small variation predicted for the non-exponential tails. Empirically, we find that the genuine E T / distribution from tt events (the leading background term) can be parametrized well with the Pareto distribution [41], which is widely used in extreme value theory:

$$ f(x; x_{\min}, \alpha, \beta) = \frac{1}{\alpha}\left(1 + \frac{\beta\,(x - x_{\min})}{\alpha}\right)^{-\frac{1}{\beta} - 1} \qquad (1) $$

where x min , α, and β are the position, scale, and shape parameters, respectively. Equation 1 reduces to an exponential function in the limit β → 0. We set x min = 150 GeV, representing the lower bound of the E T / spectrum to be described, while α and β are determined from a fit to data.
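The behavior of the Pareto parametrization can be explored numerically. The sketch below assumes the standard generalized Pareto form with position x min , scale α, and shape β, which is consistent with the description in the text (exponential for vanishing β); the parameter values used in the example are illustrative and are not the fitted values.

```python
import math

def pareto_pdf(x, x_min, alpha, beta):
    """Generalized Pareto density with position x_min, scale alpha, and shape beta.
    For beta -> 0 it reduces to an exponential with slope 1/alpha."""
    z = (x - x_min) / alpha
    if z < 0.0:
        return 0.0
    if abs(beta) < 1e-12:
        return math.exp(-z) / alpha
    base = 1.0 + beta * z
    if base <= 0.0:
        return 0.0  # outside the support (only possible for negative beta)
    return base ** (-1.0 / beta - 1.0) / alpha

# Illustrative comparison at 450 GeV with x_min = 150 GeV and alpha = 60 GeV.
for beta in (0.0, 0.03, 0.15):
    print(beta, pareto_pdf(450.0, 150.0, 60.0, beta))
```

A positive shape parameter produces a tail that falls more slowly than an exponential, which is why the constraint on β matters most for the highest E T / bins.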
Both the control regions used for a fit of the E T / model to data and the signal regions have selection criteria applied to H T . Because of the correlation between the momentum of the leptonically decaying W boson and the momenta of the jets balancing it, restrictions on H T affect the E T / spectrum. We describe the ratio between the E T / spectrum after imposing a lower bound on H T and the inclusive E T / spectrum by a generalized error function (corresponding to a skewed Gaussian distribution), similar to the approach described in Ref. [13]. The evolutions with H T of the location and variance parameters of this function are determined from simulation and found to be linear. The results in simulation are found to be consistent with the dependence measured from a tt-dominated control sample in data, defined by N j ≥ 4, N b ≥ 2, and H T > 400 GeV. Systematic uncertainties related to the error functions are determined from this comparison and from the difference between linear and quadratic models of the function parameters. The E T / spectrum in exclusive bins of N b is also affected by an acceptance effect due to the p T requirement on the b-tagged jets: in tt events at low H T , high values of E T / correspond to low values of the p T of the b quark associated with the leptonically decaying W boson and tend to move events to lower b-jet multiplicities. We therefore apply an acceptance correction when applying the E T / model to events with one or two b-tagged jets. For N b = 2 and 150 < E T / < 1000 GeV, the size of the correction is 12% for H T = 750 GeV and is smaller for larger H T .

Table 1: Observed yields in data and SM background predictions with their statistical and systematic uncertainties from the LS and MT methods. For the MT method the low E T / (150-250 GeV) and low H T (400-750 GeV) regions in the N b = 2 sample are used as control regions and are not shown in the table.
The b-jet multiplicity distribution is used to estimate the ratio of the W+jets background to the tt background as a function of H T . The H T distribution of tt events is extracted from the N b = 2 sample as described in Ref. [13]. The contribution of W+jets events for E T / > 150 GeV is approximately 1%. Based on the measured ratio of W+jets to tt background events, the Pareto distribution describing the leading background term is combined with the shape of the W+jets

The fit to the missing transverse momentum spectrum in the MT method
The model for genuine E T / in SM events is convolved with the E T / resolution templates described in Section 3.1 and used in a simultaneous fit to the E T / shapes in control regions in the N b = 1 and N b = 2 bins. The control regions are chosen in order to ensure reasonably small statistical uncertainties and to limit potential contributions from signal events: for events with two b-tagged jets the control region is defined by 400 < H T < 750 GeV and 150 < E T / < 400 GeV, while for one b-tagged jet it is extended to 400 < H T < 2500 GeV and 150 < E T / < 1500 GeV.
Because of limited statistical precision in the control regions, we are unable to obtain a reliable estimate of β from data. We use a constraint from simulation together with an uncertainty derived from a comparison between data and simulation in control regions with lower jet multiplicity. The constraint is implemented as a Gaussian term corresponding to the value and its statistical uncertainty obtained from simulation, β = 0.03 ± 0.01. The prediction from simulation for N j = 3-5 is β = 0.15-0.05, consistent with the data. The maximum difference between data and simulation in any of these three N j bins of 0.05 is used to define a systematic uncertainty in the prediction. The parameters of the error function (Section 3.3) are constrained by Gaussian terms reflecting the respective values and covariance from simulation.
The predictions for the N b = 2 signal regions are obtained by integrating the function representing the E T / model over the relevant E T / range and summing over the H T bins. In each H T bin, the predicted distribution is scaled to match the observed number of events in the normalization region defined by 150 < E T / < 250 GeV. The statistical uncertainties of the predictions are evaluated by repeating the procedure using parameter values randomly generated according to the results of the fit, including the covariance matrix. The predictions are stable to within 1% if the E T / model described in Ref. [13] is used in place of the model described here.
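The normalization and integration step can be illustrated with a stripped-down sketch. It assumes a generalized Pareto shape for the E T / model and deliberately omits the resolution convolution, the H T -dependent error function, and the acceptance corrections used in the analysis; the parameter values and the observed count are placeholders.

```python
import math

def pareto_sf(x, x_min, alpha, beta):
    """Survival function P(X > x) of a generalized Pareto distribution."""
    z = max((x - x_min) / alpha, 0.0)
    if abs(beta) < 1e-12:
        return math.exp(-z)
    base = 1.0 + beta * z
    return base ** (-1.0 / beta) if base > 0.0 else 0.0

def predict_yields(n_norm_obs, bins, x_min, alpha, beta, norm_range=(150.0, 250.0)):
    """Scale the model to the observed count in the normalization region and
    integrate it over each (low, high) bin; high may be math.inf."""
    def integral(lo, hi):
        return pareto_sf(lo, x_min, alpha, beta) - pareto_sf(hi, x_min, alpha, beta)
    scale = n_norm_obs / integral(*norm_range)
    return [scale * integral(lo, hi) for lo, hi in bins]

signal_bins = [(250.0, 350.0), (350.0, 450.0), (450.0, math.inf)]
# Placeholder inputs: 100 events observed in 150-250 GeV, alpha = 100 GeV, beta = 0.
print(predict_yields(100.0, signal_bins, x_min=150.0, alpha=100.0, beta=0.0))
```

Repeating `predict_yields` for many parameter sets drawn from the fit covariance would give the spread used as the statistical uncertainty of the prediction, in the spirit of the procedure described above.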
The results of the MT method can be affected by several systematic uncertainties that are related to detector effects, assumptions made on the shape of the distribution, as well as theoretical uncertainties and the contamination due to non-leading backgrounds. Systematic uncertainties related to the jet and E T / scale, lepton reconstruction efficiencies, W-boson polarization in tt events, and cross sections of non-leading backgrounds are evaluated in the same way as for the LS method (Section 3.1). Effects due to b-jet identification efficiencies and pileup are also taken into account. In addition, the following uncertainties specific to the MT method are considered. The β parameter and parameters of the error function are varied as described above. The differences with respect to the standard result define the systematic uncertainty for each signal region. The effects of a possible residual non-linearity in the error function parameters versus H T are also taken into account. To test the validity of the method, the procedure is applied to simulated events. The resulting background predictions are found to be statistically consistent with the true numbers from simulation. Conservatively, the maximum of the relative difference and its uncertainty are assigned as a further systematic uncertainty ("closure"). The dominant contributions to the systematic uncertainty are related to the E T / model (1-35%, depending on the H T and E T / bin) and the closure (8-43%).

Background estimation in the N b ≥ 3 bin
The numbers of data events in the N b ≥ 3 control samples are too low for an application of the LS or MT technique. Therefore we estimate the background for high b-jet multiplicities with transfer factors (R 32 ) describing the ratio of the number of events with ≥ 3 and = 2 b-tagged jets for each of the signal regions. The central values for the R 32 factors are determined from simulation. The scale factors R 32 increase with jet multiplicity from approximately 0.05 for events with three jets to approximately 0.2 in events with ≥ 6 jets because of the higher probability of misidentifying one or more jets. For constant jet multiplicity they do not demonstrate a strong dependence on H T .
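The extrapolation to the N b ≥ 3 bin amounts to multiplying the N b = 2 prediction by R 32 . The following sketch shows the arithmetic with hypothetical numbers; adding the relative uncertainties in quadrature assumes that the two inputs are uncorrelated, which is a simplification made here and not a statement about the treatment in the analysis.

```python
import math

def extrapolate_to_three_b(n_pred_2b, err_2b, r32, err_r32):
    """Predicted N_b >= 3 yield and its uncertainty from the N_b = 2 prediction and
    the transfer factor R_32, with relative uncertainties added in quadrature."""
    n_pred_3b = r32 * n_pred_2b
    rel_err = math.hypot(err_2b / n_pred_2b, err_r32 / r32)
    return n_pred_3b, n_pred_3b * rel_err

# Hypothetical inputs: 20 +- 4 events predicted for N_b = 2 and R_32 = 0.20 +- 0.03.
print(extrapolate_to_three_b(20.0, 4.0, 0.20, 0.03))
```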
The ratios between N b ≥ 3 and = 2 events in data and simulation could differ because of incorrect modeling of the heavy-flavor content, the jet kinematics, and uncertainties in the b-tagged jet misidentification rates. To probe the impact of the first source of uncertainties, the fraction of events with at least one c quark is varied by 50%. The same variation is applied to events with additional b- or c-quark pairs. The effect of possible differences between data and simulation in the kinematics of the system of non-b jets on R 32 is tested in a control sample with exactly two b-tagged jets. The remaining jets in the event are randomly assigned a parton flavor: one jet is marked as a c-quark jet, while the others are marked as light-quark jets. Based on this assignment the ratio of probabilities to tag at least one additional jet is calculated. This procedure is applied to both data and simulation. Good agreement is found and the residual difference is interpreted as a systematic uncertainty. The uncertainty related to b-tagged jet misidentification is evaluated from the uncertainties of the misidentification scale factors relative to simulation. The total systematic uncertainties for R 32 are approximately 9-19% depending on the signal region.
In the LS method, the transfer factors are applied to the signal regions for H T > 500, 750, and 1000 GeV. In the MT method, signal regions for H T > 400 GeV and 150 < E T / < 250 GeV are added for the N b ≥ 3 bin, corresponding to the limits of the control and normalization regions in the N b = 2 bin, respectively.

Results for signal regions in missing transverse momentum and H T bins
The predictions of both methods are compared with the observed number of events in Table 1. For the LS method the predictions consist of the single-lepton and τ-lepton backgrounds with a small contribution from dilepton events. Drell-Yan events are heavily suppressed by the N j , N b , and kinematic requirements. The yield of this small component of the background is taken from simulation. For the MT method the predictions consist of the inclusive estimation of the leading backgrounds. Additional contributions to the signal regions from QCD multijet events are heavily suppressed, but their cross section is large and not precisely known. Therefore, they are predicted from data based on scaling the sideband of the relative lepton isolation distribution. These contributions are neglected as they are found to constitute 1% or less of the total background in all cases.
The corresponding observed and predicted E T / spectra are shown in Fig. 1 for the two b-jet multiplicity bins and different H T requirements. The two methods differ in their leading systematic terms and in the correlations they exhibit between the background predictions in different signal regions. The predictions are consistent, an indication of the robustness of the methods. No excess is observed in the tails of the E T / distributions with respect to the expectations from SM processes. The results are interpreted in terms of upper limits on the production cross section for different benchmark models in Section 5.

Search in S lep T and ∆φ(W, ℓ)
After applying the selection criteria in Section 2, the sample is dominated by single-lepton tt events. In the delta phi (∆φ) analysis method, this background is further reduced by applying a requirement on the azimuthal angle between the W-boson candidate and the charged lepton. The W-boson candidate transverse momentum is obtained as the vector sum of the lepton p T and the E T / vectors. For single-lepton tt events, the angle between the W-boson direction and the charged lepton has a maximum value, which is fixed by the mass of the W boson and its momentum. Furthermore, the requirement (direct or indirect) of large E T / selects events in which the W boson yielding the lepton and the neutrino is boosted, thus resulting in a fairly narrow distribution in ∆φ(W, ℓ). On the other hand, in SUSY decays, the "effective W boson" that is formed from the vector sum of the transverse momenta of the charged lepton and the E T / vector will have no such maximum. Since the E T / results mostly from two neutralinos, the directions of which are largely independent of the lepton flight direction, the ∆φ(W, ℓ) distribution is expected to be flat.
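The construction of the W-boson candidate and of the azimuthal angle between it and the lepton can be sketched as follows. The function name and the example kinematics are hypothetical; the vector sum of the lepton p T and the E T / follows the description above.

```python
import math

def delta_phi_w_lepton(lep_pt, lep_phi, met, met_phi):
    """Azimuthal angle between the lepton and the W candidate, where the W candidate
    transverse momentum is the vector sum of the lepton pT and the missing momentum."""
    w_x = lep_pt * math.cos(lep_phi) + met * math.cos(met_phi)
    w_y = lep_pt * math.sin(lep_phi) + met * math.sin(met_phi)
    w_phi = math.atan2(w_y, w_x)
    # Wrap the difference into [-pi, pi) and take its absolute value.
    return abs((w_phi - lep_phi + math.pi) % (2.0 * math.pi) - math.pi)

# Equal lepton pT and missing momentum at right angles: the W candidate lies halfway.
print(delta_phi_w_lepton(100.0, 0.0, 100.0, math.pi / 2.0))
# Soft lepton with large missing momentum pointing elsewhere: large angle.
print(delta_phi_w_lepton(30.0, 0.0, 300.0, 2.0))
```

The second configuration is the kind of topology in which the E T / is not tied to the lepton direction, so the angle is not bounded in the way it is for a boosted leptonic W decay.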
Distributions of ∆φ(W, ℓ) in different S lep T bins are shown for the N b ≥ 3 and N j ≥ 6 samples in Fig. 2. We select ∆φ(W, ℓ) > 1 as the signal region. The complementary sample, events with ∆φ(W, ℓ) < 1, constitutes the control region. It can be seen that this selection is effective in reducing the background from single-lepton tt̄ decays; the dominant background in the signal regions comes from dilepton tt̄ events. Table 2 shows the event yields from simulation for the signal and control regions in different S lep T bins for N b ≥ 3.
Table 2: Event yields for the combined e and µ channels, as predicted by simulation, for N j ≥ 6 and N b ≥ 3. The R CS column lists the ratio of yields in the signal and control regions. The yields for signal benchmark points are shown for comparison, with the (g̃, χ̃ 0 1) masses (in GeV) listed in brackets. The uncertainties are statistical only.

Prediction of standard model background
The estimate of the number of SM background events in the signal region is given by the number of events in the control region multiplied by a transfer factor R CS . The R CS factor is defined as the ratio of the number of events with ∆φ(W, ℓ) > 1 to the number with ∆φ(W, ℓ) < 1. The typical values of R CS estimated from simulation are much less than 10%, leading to SM background expectations in the different S lep T signal regions that range from a few events to less than about one event.
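As a minimal sketch of this estimate, the prediction is the control-region count scaled by R CS . The function names and the numbers below are hypothetical and are not taken from the tables. The uncertainty shown is only the Poisson fluctuation of the control-region count, with R CS treated as exactly known, which is a simplifying assumption.

```python
import math

def transfer_factor(n_signal_region, n_control_region):
    """R_CS: yield with delta-phi > 1 divided by yield with delta-phi < 1."""
    return n_signal_region / n_control_region

def predict_signal_yield(n_control, r_cs):
    """Background prediction N_SR = R_CS * N_CR.

    Returns the prediction and the statistical uncertainty that comes only
    from the Poisson fluctuation of the control-region count.
    """
    prediction = r_cs * n_control
    stat_unc = r_cs * math.sqrt(n_control)
    return prediction, stat_unc

# Hypothetical inputs: 100 control-region events and R_CS = 5%
prediction, stat_unc = predict_signal_yield(100, 0.05)
print(f"predicted background: {prediction:.2f} +/- {stat_unc:.2f}")
```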
The predicted factors R pred CS for N b = 2 and N b ≥ 3 are calculated using a data control sample selected with the same criteria as the standard sample except with N b = 1. Very few signal events are expected to appear in this control sample. Specifically, the R pred CS factors for N b = 2 and N b ≥ 3 are calculated for each bin in S lep T as
$$R_{CS}^{pred}(N_b = 2) = \kappa_{CS}(N_b = 2)\, R_{CS}(N_b = 1), \qquad R_{CS}^{pred}(N_b \geq 3) = \kappa_{CS}(N_b \geq 3)\, R_{CS}(N_b = 1),$$
where R CS (N b = 1) is measured in data and the possible dependence of the transfer factors on N b is taken into account by correction factors κ CS obtained from simulation. The behavior of R CS as a function of N b is examined in simulation for the SM alone and also with the addition of signal from a SUSY benchmark scenario. In the absence of a SUSY signal, the value of R CS is roughly independent of the b-jet multiplicity. In the presence of a signal containing four top quarks, the N b ≥ 2 bins change significantly, but the N b = 1 bin does not. This illustrates the primary motivation for the analysis strategy: we measure the transfer factor needed, R pred CS (N b ≥ 3), using the background-dominated data with N b = 1 (Table 3) and account for possible differences in the transfer factor between the N b = 1 and N b ≥ 3 samples by applying the κ CS correction factor (Table 4) obtained from simulation. The same approach is used for the N b = 2 bin.
The calculation of the κ CS factor in simulation is shown in Table 4, which lists the yield without a κ CS factor correction, and the observed event yields, as well as the corresponding κ CS correction factors for N b ≥ 3. The κ CS factor ranges from 0.93 to 1.45 with statistical uncertainties up to ±0.6. The large statistical uncertainty reflects the very small event yields expected in the signal region from SM processes. We observe only a weak dependence of the transfer factor R CS on N j and, as stated above, on N b . Two sources of this dependence have been identified: the relative composition of SM samples (W+jets, tt̄ (1ℓ), tt̄ (ℓℓ), single top quark), and the residual dependence of R CS within each SM sample. Both effects are individually found to be smaller than approximately 50% and are effectively captured by the κ CS factor, which also absorbs the corresponding uncertainties.
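The role of the κ CS factor can be sketched as follows. All yields are hypothetical placeholders, not values from Tables 3 or 4, and the function names are illustrative. The sketch assumes that κ CS is the ratio of the true simulated signal-region yield to the yield predicted with the N b = 1 transfer factor, and that the corrected transfer factor is then applied to the data control region.

```python
def kappa_cs(sim_sr_true, sim_cr, sim_r_cs_nb1):
    """kappa_CS from simulation: 'true' signal-region yield divided by the
    yield predicted with the transfer factor taken from the N_b = 1 sample."""
    sim_sr_predicted = sim_r_cs_nb1 * sim_cr
    return sim_sr_true / sim_sr_predicted

def corrected_prediction(data_cr, data_r_cs_nb1, kappa):
    """Data-driven prediction using R_pred = kappa * R_CS(N_b = 1, data)."""
    r_pred = kappa * data_r_cs_nb1
    return r_pred * data_cr

# Hypothetical simulated yields
sim_r_cs_nb1 = 20.0 / 400.0                # N_b = 1: signal / control region
kappa = kappa_cs(sim_sr_true=3.0, sim_cr=50.0, sim_r_cs_nb1=sim_r_cs_nb1)

# Hypothetical data yields
data_r_cs_nb1 = 18.0 / 380.0               # N_b = 1 transfer factor in data
prediction = corrected_prediction(data_cr=45.0,
                                  data_r_cs_nb1=data_r_cs_nb1,
                                  kappa=kappa)
print(f"kappa_CS = {kappa:.2f}, predicted yield = {prediction:.2f}")
```

With this arrangement only κ CS depends on simulation, while the transfer factor itself and the control-region count come from data.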
A potential signal would result in much larger values of R CS (e.g., of up to a factor of five larger for the benchmark points) than the variations above, as can be seen from Fig. 3.
Table 4: Comparison of the simulated yields, combined for the e and µ channels, in the signal region and the estimate using R CS from the N b = 1 sample. The κ CS factor is calculated as the ratio of the "true" and "predicted" yields.
The only elements of the background estimate that depend on simulation are the κ CS factors. Most potential sources of systematic uncertainties leave κ CS unaffected, since the correction factor reflects only residual changes in the value of R CS from N b = 1 to N b ≥ 3 (N b = 2) as a result of each systematic uncertainty. Systematic uncertainties are estimated for κ CS with the method described in Section 3. The jet/E T / energy scale and the b-tagging efficiencies are varied within their uncertainties. For each independent source (energy scale, heavy- and light-parton tagging efficiencies) the effects of the upwards and downwards variations are averaged. The W+jets cross section is varied by 30% as in Ref. [12]. The cross section for W+bb̄ is varied by 100% [42,43] and that for single-top-quark production by 50% [44]. We assign an uncertainty of 5 and 10%, respectively, to the W boson and tt̄ polarizations [40,45]. These effects are negligible.
Since the estimate of the background in the signal region is based on ratios of events in the data and the κ CS factor that only depends on the number of b-tagged jets, the systematic uncertainties of the background prediction are expected to be the same for the electron and muon samples. This is confirmed with an explicit calculation of these uncertainties, and thus the final result uses the combination of the uncertainties from the two lepton flavors. The overall systematic uncertainty found for κ CS , which is dominated by the limited statistics in the simulated samples, is 23%, 45% and 70%, respectively, in the three S lep T ranges. The total systematic uncertainty of the background prediction is dominated by the statistical uncertainty that arises due to the limited number of events in the data control samples.

QCD multijet background estimate
Contributions of QCD multijet events to the control and signal regions could affect the correction factors. Therefore we estimate these contributions from data. For the muon channel, the MC prediction for the QCD multijet background is smaller than all other backgrounds by two to three orders of magnitude. This was confirmed by an estimate from data in the previous single-lepton SUSY search [12].
In the electron channel, the QCD background is larger than in the muon channel, but it remains significantly smaller than the other backgrounds. We make use of the method described in Ref. [45], employing a control sample in data that is enriched in electrons from QCD multijet events, obtained by inverting some of the electron identification requirements ("antiselected" sample). While the method works well at low N b and N j , it yields statistically limited results in the samples with higher N b and higher N j . To obtain more precise predictions for the QCD background in these regions, the estimate from the N b = 1 sample is extrapolated with two methods that rely on the relative insensitivity of the QCD multijet background to N b . The results of these methods are found to be consistent, and the fraction of QCD multijet events is determined to be less than 5-7% of the total number of data events observed in the control region. Based on the antiselected sample, the corresponding transfer factor for QCD multijets is estimated to be smaller than approximately 2%. The QCD contamination in the signal region (∆φ(W, ℓ) > 1) is therefore determined to be negligible and so the QCD multijet background is included only in the control region.

S lep T and N b
The background prediction method is validated with the 3 ≤ N j ≤ 5 control sample, which is dominated by background from dilepton tt̄ events and has a larger relative contribution from W+jets than the signal region. The compatibility between the predicted and observed yields in this sample is demonstrated by the results shown in the left portion of Table 5.
The predicted and observed data yields in the signal regions are also presented in Table 5. Combining all signal bins we predict 19.2 ± 4.0 events and observe 26. In the N b ≥ 3 bins, which are the most relevant regions for the signal, we predict 5.3 ± 1.5 events and observe 4. For S lep T > 350 GeV we predict 5.6 ± 2.5 events and observe 4.
Table 5: Event yields in data for the 3 ≤ N j ≤ 5 (validation) and N j ≥ 6 (signal) samples. The numbers of events in the control regions used for the predictions are also shown. For the lower jet multiplicity validation test, only the statistical uncertainties stemming from the event counts in the control regions are given, while statistical and systematic uncertainties are listed for the signal region prediction.

Interpretation
The compatibility between the observed and predicted event counts in the searches described above is used to exclude regions in the parameter space of the three models of gluino-mediated production of final states with four top quarks and two LSPs introduced in Section 2. The expected signal yield obtained from simulation is corrected for small differences in the efficiencies between data and simulation and for an overestimation of events with high-p T radiated jets in MADGRAPH, as described in Ref. [11]. Systematic uncertainties in the signal yield due to uncertainty in the jet/E T / scale [36], initial-state radiation, PDFs [46], pileup, b-tagging scale factors [37], lepton efficiency, and trigger efficiency are calculated for each of the models and for every mass combination. The uncertainty due to the measurement of the integrated luminosity is 2.6% [47]. For model A, the total uncertainty in the signal yields ranges from 20% to 60%. The largest uncertainties are related to the PDFs and occur in regions with small mass differences m g̃ − m χ̃ 0 1 and high m g̃ .
The modified-frequentist CL S method [48-50] with a one-sided profile likelihood ratio test statistic is used to define 95% confidence level (CL) upper limits on the production cross section for each model and mass combination. Statistical uncertainties related to the observed number of events in control regions are modeled as Poisson distributions. All other uncertainties are assumed to be multiplicative and are modeled with lognormal distributions. Upper limits on the cross section at 95% CL are set in the parameter plane of the three models. Corresponding mass limits are derived with the next-to-leading order (NLO) + next-to-leading logarithm (NLL) gluino production cross section [52-56] as a reference.
These limits are summarized in Fig. 4, which shows a comparison of the mass limits obtained for signal regions in H T and E T / , cross section and mass limits for the signal regions in S lep T and ∆φ(W, ℓ), and a comparison of the observed mass limits obtained by the three methods. For each of the considered models the LS and MT methods show a similar reach; the most stringent limits are set by the ∆φ method.
For model A, with off-shell top squarks, the limits extend to a gluino mass of 1.26 TeV for the lowest LSP masses and to an LSP mass of 580 GeV for m g̃ = 1.1 TeV. At low gluino masses the sensitivity extends to the region m χ̃ 0 1 > m g̃ − 2m t . For model B, where the top squarks are on-shell, the limits for m χ̃ 0 1 reach 560 GeV for a top-squark mass m t̃ = 800 GeV. For model C the gluino mass limits for low LSP mass are similar to model A for m t̃ > 500 GeV but decrease to m g̃ = 1.0 TeV for lower top-squark masses because the signal populates the lower E T / region, which has higher background. For m g̃ = 1.0 TeV, the limits cover the full range of top-squark masses if the LSP mass lies below approximately 530 GeV. Conservatively, these limits are derived from the reference cross section minus one standard deviation [51].
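The logic of a CL S upper limit can be illustrated with a deliberately simplified single-bin counting experiment. This sketch is not the procedure used in the analysis: it uses the observed count itself as the test statistic instead of a profile likelihood ratio, and it ignores all systematic uncertainties, including the lognormal and Poisson nuisance terms described above. The function names are hypothetical. The example inputs are the combined N b ≥ 3 yields quoted in the text (5.3 predicted, 4 observed), used without their uncertainty.

```python
import math

def poisson_cdf(n_obs, mean):
    """P(N <= n_obs) for a Poisson distribution with the given mean."""
    term = math.exp(-mean)
    total = term
    for k in range(1, n_obs + 1):
        term *= mean / k
        total += term
    return total

def cls(n_obs, background, signal):
    """CL_s = CL_{s+b} / CL_b for a single counting experiment."""
    cl_sb = poisson_cdf(n_obs, signal + background)
    cl_b = poisson_cdf(n_obs, background)
    return cl_sb / cl_b

def upper_limit(n_obs, background, cl=0.95, step=0.01):
    """Smallest signal yield on a grid for which CL_s drops to 1 - cl or below."""
    signal = 0.0
    while cls(n_obs, background, signal) > 1.0 - cl:
        signal += step
    return signal

# Background-only expectation and observed count for the N_b >= 3 bins
limit = upper_limit(n_obs=4, background=5.3)
print(f"95% CL upper limit on the signal yield: {limit:.2f} events")
```

Dividing by CL b prevents a downward fluctuation of the background from excluding signal yields to which the search has no sensitivity. A limit on the signal yield is converted to a cross section limit by dividing by the integrated luminosity and the signal efficiency.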

Summary
A sample of proton-proton collisions recorded with the CMS detector at a center-of-mass energy of 8 TeV and corresponding to an integrated luminosity of 19.3 fb −1 has been used for a search for new physics in events with a single isolated electron or muon, multiple high-p T jets, including identified b jets, and missing transverse momentum. This event topology is a possible signature for the production of supersymmetric particles in R-parity conserving models, in particular the production of gluinos with subsequent decays into top squarks. The dominant standard model background in a search region defined by the presence of at least six jets, including at least two jets identified as originating from the fragmentation of b quarks, is due to tt̄ production.
The search is performed with two sets of signal regions, and uses three different methods, each based on data, to estimate the leading background contributions. The lepton spectrum and the missing transverse momentum template methods are designed as searches in the high H T , high E T / region. They estimate the SM backgrounds (dominated by single-lepton tt̄ decays) for events with two identified b jets and extrapolate these predictions to additional signal regions requiring ≥3 b-tagged jets. The first of these methods uses the lepton p T distribution to estimate the E T / spectrum while the second obtains the predictions in a parametrized form by fitting an E T / model to control regions in data. The delta phi method uses the azimuthal angle between the lepton and W boson directions as a discriminating variable, leading to a strong suppression of the single-lepton backgrounds and leaving dilepton tt̄ events as the leading SM contribution. The signal regions are defined by the use of the same two b-jet multiplicity requirements and by bins in S lep T , which probes the total leptonic (ℓ and ν) scalar transverse momentum in the event.
While the delta phi approach shows the highest sensitivity, the use of different methods, which probe complementary kinematic aspects and both hadronic and leptonic event characteristics, increases the robustness of this search. Together these methods examine the event sample in both high- and low-yield regions to provide sensitivity to signal topologies with high hadronic activity, missing transverse momentum, and at least two b jets.
No significant excess is observed in any of the signal regions. Upper limits are set at 95% CL on the product of production cross section and branching fraction for three benchmark models of gluino pair production with subsequent decay into virtual or on-shell top squarks, where each of the two top squarks decays in turn into a top quark and the lightest supersymmetric particle. In the case of decays via virtual top squarks and for light LSPs, gluino masses below 1.26 TeV are excluded.