Search for top squarks in the four-body decay mode with single lepton final states in proton-proton collisions at $\sqrt{s}$ = 13 TeV

A search for the pair production of the lightest supersymmetric partner of the top quark, the top squark ($\tilde{\mathrm{t}}_1$), is presented. The search targets the four-body decay of the $\tilde{\mathrm{t}}_1$, which is preferred when the mass difference between the top squark and the lightest supersymmetric particle is smaller than the mass of the W boson. This decay mode consists of a bottom quark, two other fermions, and the lightest neutralino ($\tilde{\chi}^0_1$), which is assumed to be the lightest supersymmetric particle. The data correspond to an integrated luminosity of 138 fb$^{-1}$ of proton-proton collisions at a center-of-mass energy of 13 TeV collected by the CMS experiment at the CERN LHC. Events are selected by requiring a high-momentum jet, an electron or muon with low transverse momentum, and significant missing transverse momentum. Signal candidates are selected with a multivariate approach optimized for the difference between $m(\tilde{\mathrm{t}}_1)$ and $m(\tilde{\chi}^0_1)$. The contribution from the leading background processes is estimated from data. No significant excess is observed above the expectation from standard model processes. The results of this search exclude top squarks at 95% confidence level for masses up to 480 and 700 GeV for $m(\tilde{\mathrm{t}}_1) - m(\tilde{\chi}^0_1)$ = 10 and 80 GeV, respectively.


Introduction
Supersymmetry (SUSY) [1][2][3][4][5][6] predicts the existence of a new symmetry that requires that, for each fermion (boson) in the standard model (SM), there is also a bosonic (fermionic) superpartner. Searches for SUSY are among the important focal points of the physics program at the CERN LHC, since SUSY naturally solves the problem of quadratically divergent loop corrections to the mass of the Higgs boson [7][8][9]. If R parity [10] is conserved, supersymmetric particles would be produced in pairs, and their decay chains would end with the lightest supersymmetric particle (LSP), often considered to be the lightest neutralino $\tilde{\chi}^0_1$. Such an LSP, being neutral, weakly interacting, and massive, would have the required characteristics for a dark matter particle and would thus offer a solution to another shortcoming of the SM. When the symmetry is broken, the scalar partners of an SM fermion acquire a mass different from the mass of the SM partner, with the mass splitting between scalar mass eigenstates being proportional to the mass of the SM fermion. Since the top quark is the heaviest fermion of the SM, the splitting between its chiral supersymmetric partners can be the largest among all supersymmetric quarks (squarks). Furthermore, the top Yukawa coupling can be the greatest among all fermions, which affects the masses of the squarks through the renormalization group equations. The lighter supersymmetric scalar partner of the top quark, the top squark ($\tilde{\mathrm{t}}_1$), could therefore be the lightest squark. If SUSY is realized in nature, cosmological observations imply that for many models the lightest top squark should be almost degenerate with the LSP [11]. In this scenario, because the mass difference between the $\tilde{\mathrm{t}}_1$ and the $\tilde{\chi}^0_1$ is smaller than the mass of the W boson, the two- and three-body decays of the $\tilde{\mathrm{t}}_1$ are kinematically forbidden, while the two-body decay to $\mathrm{c}\tilde{\chi}^0_1$ can be suppressed depending on the parameters of the model.
This motivates the search for the four-body decay $\tilde{\mathrm{t}}_1 \to \mathrm{b} \mathrm{f} \bar{\mathrm{f}}' \tilde{\chi}^0_1$, where b stands for the bottom quark, and the fermions f and $\bar{\mathrm{f}}'$ can be either quarks or leptons. Throughout this paper, charge conjugation is assumed. Figure 1 represents a simplified model [12][13][14][15][16][17] of the production of $\tilde{\mathrm{t}}_1 \tilde{\mathrm{t}}_1^*$ in proton-proton (pp) collisions, where each $\tilde{\mathrm{t}}_1$ and $\tilde{\mathrm{t}}_1^*$ undergoes a four-body decay.
Jets are reconstructed by applying the anti-$k_{\mathrm{T}}$ clustering algorithm [55,56] to PF candidates, with a distance parameter of 0.4. The pileup contribution to the jet momentum is partially taken into account by excluding from the jet clustering the charged hadrons originating from vertices other than the PV. To account for pileup contributions from neutral particles and for any inhomogeneity in the detector response, the jet $p_{\mathrm{T}}$ is further calibrated as described in Ref. [57]. Jets are required to satisfy $p_{\mathrm{T}} > 30$ GeV and $|\eta| < 2.4$. The tagging of b jets (b tagging) is performed with the DeepCSV algorithm [58], which uses information from the secondary vertex and is based on a deep neural network. The b tagging discriminant is used to tag jets as b jets based on a set of working points (loose, medium, tight), and to define further event variables based on the discriminant value or on the jet with the highest discriminant value. The working points are defined as the selection values on the discriminant distribution at which the probability of misidentifying a light-flavor jet as a b jet is 10%, 1%, and 0.1% for the loose, medium, and tight working points, respectively [59].
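As a schematic illustration (toy values, not the DeepCSV calibration), a working point of the kind described above can be derived from the discriminant distribution of light-flavor jets by picking the threshold whose misidentification rate matches the target:

```python
# Illustrative sketch: a b tagging working point is the discriminant
# threshold at which the light-flavor misidentification rate equals
# the target (10%, 1%, or 0.1% for loose, medium, or tight).

def working_point(light_jet_scores, misid_rate):
    s = sorted(light_jet_scores)
    # Keep the highest `misid_rate` fraction of light jets above threshold.
    idx = int(round((1.0 - misid_rate) * len(s)))
    return s[min(idx, len(s) - 1)]

scores = [i / 1000.0 for i in range(1000)]  # uniform toy distribution
print(working_point(scores, 0.10))  # loose working point: 0.9
```

With a uniform toy distribution the loose (10%) threshold sits at the 90th percentile; with the real discriminant shape the thresholds would of course differ.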
The missing transverse momentum vector, $\vec{p}_{\mathrm{T}}^{\,\text{miss}}$, is computed as the negative vector $p_{\mathrm{T}}$ sum of all PF candidates in the event, and its magnitude is $p_{\mathrm{T}}^{\text{miss}}$. The calibrations associated with the jet energy estimations are propagated to the $p_{\mathrm{T}}^{\text{miss}}$ [60].
Electron candidates are reconstructed from energy deposits in the ECAL and matched charged-particle tracks in the inner tracker, obtained using the Gaussian sum filter algorithm [25]. To reduce the number of misidentified electrons, additional constraints are applied on the shape of the electromagnetic shower in the ECAL, the quality of the match between the track trajectory and the ECAL energy deposit, the ECAL energy deposited around the electron, and the relative HCAL deposit in the electron direction. Electrons are required to have $p_{\mathrm{T}}$ above 5 GeV and $|\eta| < 2.5$, with a veto on electron candidates in the ECAL gap region ($1.4442 < |\eta| < 1.5660$). They are identified with requirements on the observables that describe the matching of the measurements in the tracker and the ECAL, the shape of energy clusters in the ECAL, and the amount of bremsstrahlung emitted during the propagation through the detector. Electrons must satisfy a loose working point of this identification, which has an average efficiency of 90%.
Muon candidates are reconstructed by combining the information from the silicon tracking systems and the muon spectrometer in a global fit [26], which assigns a quality to the matching between the tracker and muon systems and imposes minimal requirements on the track to reduce the misidentification of muons. Muons must satisfy the medium working point of this algorithm, which ensures an efficiency above 98%, and are required to have $p_{\mathrm{T}} > 3.5$ GeV and $|\eta| < 2.4$.
To select electrons or muons originating from the PV, the point of closest approach of the associated track with respect to the PV is required to have a transverse distance $|d_{xy}| < 0.02$ cm and a longitudinal distance $|d_z| < 0.1$ cm. A lepton is defined as nonprompt either when it does not originate from the PV or when a jet is misidentified as a lepton. Background processes with nonprompt leptons are one of the main contributions to the SM background in the signal regions. In this analysis, nonprompt leptons mostly arise from heavy-quark decays in jets produced in association with a $\mathrm{Z}\to\nu\bar{\nu}$ decay, from multijet production, or from W+jets and tt events where the prompt lepton was not reconstructed and a different one was accepted. In order to suppress these types of processes, a requirement on the lepton isolation is applied, which uses a combination of an absolute and a relative isolation variable. The absolute isolation variable $I_{\text{abs}}$ of the lepton is defined as the scalar sum of the $p_{\mathrm{T}}$ of the PF candidates within a cone of size $\Delta R \equiv \sqrt{(\Delta\phi)^2 + (\Delta\eta)^2} = 0.3$ around the lepton candidate, where $\phi$ is the azimuthal angle. The lepton candidate itself is excluded from the sum, as are charged PF candidates not associated with the PV. The contributions from neutral particles originating from pileup are estimated according to the method described in Refs. [61,62] and are subtracted from $I_{\text{abs}}$. The ratio of the lepton $I_{\text{abs}}$ to the lepton $p_{\mathrm{T}}$ defines the lepton relative isolation $I_{\text{rel}}$. A uniform lepton selection efficiency as a function of $p_{\mathrm{T}}$ is achieved by requiring leptons to have $I_{\text{abs}} < 5$ GeV for $p_{\mathrm{T}}(\ell) < 25$ GeV and $I_{\text{rel}} < 0.2$ for $p_{\mathrm{T}}(\ell) \geq 25$ GeV.
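The combined isolation requirement above can be sketched as follows. This is illustrative pseudo-analysis code with assumed field names (`pt`, `eta`, `phi`), not CMS software; pileup filtering of the candidate list is assumed to have happened upstream:

```python
import math

# Illustrative sketch (not CMS code): apply the combined absolute and
# relative isolation requirement. The lepton and the surrounding PF
# candidates are represented by simple dicts with pt, eta, phi.

CONE = 0.3  # isolation cone size Delta R

def delta_r(a, b):
    dphi = abs(a["phi"] - b["phi"])
    if dphi > math.pi:
        dphi = 2 * math.pi - dphi
    deta = a["eta"] - b["eta"]
    return math.hypot(dphi, deta)

def passes_isolation(lepton, pf_candidates):
    # Scalar pT sum of PF candidates inside the cone, excluding the
    # lepton candidate itself; charged candidates from pileup vertices
    # are assumed to be filtered out before this function is called.
    i_abs = sum(c["pt"] for c in pf_candidates
                if c is not lepton and delta_r(lepton, c) < CONE)
    if lepton["pt"] < 25.0:
        return i_abs < 5.0              # absolute isolation, GeV
    return i_abs / lepton["pt"] < 0.2   # relative isolation

lep = {"pt": 10.0, "eta": 0.1, "phi": 0.0}
cands = [lep, {"pt": 3.0, "eta": 0.2, "phi": 0.1},
         {"pt": 4.0, "eta": 2.0, "phi": 2.0}]  # last one outside the cone
print(passes_isolation(lep, cands))  # True: I_abs = 3 GeV < 5 GeV
```

The switch from absolute to relative isolation at 25 GeV reflects the design goal stated above: a selection efficiency that is roughly uniform in lepton $p_{\mathrm{T}}$.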

Event selection
The data events collected by the trigger system are required to have both $p_{\mathrm{T}}^{\text{miss}}$ and $H_{\mathrm{T}}^{\text{miss}}$ above 120 GeV, where $H_{\mathrm{T}}^{\text{miss}}$ is the magnitude of the missing transverse momentum calculated only from jets. In order to maintain the performance of the online selection with the increased luminosity from the late runs of 2017 onward, the condition $H_{\mathrm{T}} > 60$ GeV is also required, where $H_{\mathrm{T}}$ is defined as the scalar $p_{\mathrm{T}}$ sum of all jets in the event. The efficiency of the combined $p_{\mathrm{T}}^{\text{miss}}$ and $H_{\mathrm{T}}^{\text{miss}}$ trigger is measured using an independent event sample collected with single-electron triggers with $p_{\mathrm{T}}$ thresholds of 35 and 32 GeV for the 2017 and 2018 data-taking periods, respectively.
The offline event selection is a two-step process. First, a preselection is applied to reduce the contribution of the main background processes (Section 5.1) by selecting a single charged lepton, large $p_{\mathrm{T}}^{\text{miss}}$, and jets. Then, boosted decision trees (BDTs) [63,64] are trained and used to define the signal selection (Section 5.2). The preselection is constructed to be as inclusive as possible in order to maintain a high signal efficiency for all values of $\Delta m \equiv m(\tilde{\mathrm{t}}_1) - m(\tilde{\chi}^0_1)$, leaving the main selection to the BDT.

Preselection
The value of the preselection $p_{\mathrm{T}}^{\text{miss}}$ threshold is set close to the beginning of the maximum-efficiency plateau of the combined $p_{\mathrm{T}}^{\text{miss}}$ and $H_{\mathrm{T}}^{\text{miss}}$ trigger, while optimizing the separation between signal and background performed by the BDTs. Events with $p_{\mathrm{T}}^{\text{miss}} > 280$ GeV are selected, favoring the signal, where two $\tilde{\chi}^0_1$'s escape detection and where the $p_{\mathrm{T}}^{\text{miss}}$ is therefore larger than for SM processes. For these events, the trigger efficiency is above 98% for both years. To account for the small inefficiency, simulated samples are reweighted as a function of $p_{\mathrm{T}}^{\text{miss}}$ to match the efficiency of the triggers in data.
To suppress the contribution of SM processes, additional requirements are imposed on the selected events. In particular, to reduce the W+jets background, $H_{\mathrm{T}} > 200$ GeV is required. To select the single-lepton topology, exactly one identified electron or muon is required in the event, along with at least one jet. This selection reduces the contribution from the dilepton topology of tt events. To further improve the selection of signal over SM background events, at least one jet must have $p_{\mathrm{T}} > 110$ GeV. These requirements are geared towards signal events in which the $\tilde{\mathrm{t}}_1 \tilde{\mathrm{t}}_1^*$ system recoils against a high-momentum ISR jet, Lorentz boosting the $\tilde{\chi}^0_1$ and increasing $p_{\mathrm{T}}^{\text{miss}}$. The ISR jet will often be the highest momentum (leading) jet in these events, and the leading-jet $p_{\mathrm{T}}$ threshold value is optimized in the same manner as for $p_{\mathrm{T}}^{\text{miss}}$. Lastly, in events with at least two jets, the azimuthal angle between the directions of the leading and second-highest-$p_{\mathrm{T}}$ (subleading) jets must be smaller than 2.5 radians, suppressing the SM multijet background.
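Collecting the thresholds above, the preselection logic can be sketched as follows. The event record and its field names (`met`, `dphi_j1_j2`, etc.) are assumptions of this illustration, not the analysis framework:

```python
# Illustrative sketch of the preselection described above, applied to
# a simplified event record (field names are assumptions).

def passes_preselection(ev):
    jets = [j for j in ev["jets"] if j["pt"] > 30.0 and abs(j["eta"]) < 2.4]
    if ev["met"] <= 280.0:                      # p_T^miss > 280 GeV
        return False
    if sum(j["pt"] for j in jets) <= 200.0:     # H_T > 200 GeV
        return False
    if len(ev["leptons"]) != 1 or not jets:     # exactly one lepton, >= 1 jet
        return False
    jets.sort(key=lambda j: j["pt"], reverse=True)
    if jets[0]["pt"] <= 110.0:                  # leading (ISR-like) jet
        return False
    if len(jets) >= 2 and ev["dphi_j1_j2"] >= 2.5:  # multijet suppression
        return False
    return True

event = {"met": 300.0, "leptons": ["mu"], "dphi_j1_j2": 1.0,
         "jets": [{"pt": 150.0, "eta": 0.5}, {"pt": 60.0, "eta": 1.0}]}
print(passes_preselection(event))  # True
```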
After the preselection, the W+jets and tt processes are the main SM backgrounds, making up about 70% and 20%, respectively, of the total expected background. The $\mathrm{Z}\to\nu\bar{\nu}$+jets process contributes to the SM background by having jets, genuine $p_{\mathrm{T}}^{\text{miss}}$, and a jet misidentified as a lepton. The remaining background processes are diboson, single top quark, Drell-Yan (DY), multijet, and ttX production, where X is a vector boson. These processes are a less important part of the expected background because of a smaller cross section, a lower acceptance, or both. The $p_{\mathrm{T}}(\ell)$, $p_{\mathrm{T}}^{\text{miss}}$, and $N_{\text{jet}}$ distributions after the preselection from the 2017 and 2018 data and the simulations are shown in Fig. 2, where $N_{\text{jet}}$ is the number of jets in the event satisfying the jet criteria. The simulated background distributions for each year are normalized to the corresponding integrated luminosity. The level of agreement with data gives us confidence in training the BDTs with the simulated distributions for the second step in the event selection.

Classification and final selection
The selection of the signal events is based on a BDT [64] to take advantage of the different correlations among the discriminating variables for the signal and background processes. For each event passing the preselection, the BDT discriminator value, henceforth referred to as the BDT output, is evaluated. If the discriminator value exceeds the determined threshold, the event is retained. The choice of the discriminating variables used as input to the BDT is made by maximizing a figure of merit (FOM) [65] that takes into account the statistical and systematic uncertainties in a selection. Various BDTs are trained with different sets of discriminating variables, and a variable is included in the final set only if it significantly increases the FOM obtained for any selection using the BDT output. The list of the twelve retained input variables and a short description of their signal and background distributions is as follows:

• Variables related to $p_{\mathrm{T}}^{\text{miss}}$: $p_{\mathrm{T}}^{\text{miss}}$ and $m_{\mathrm{T}}$, where $m_{\mathrm{T}}$ is the transverse mass of the lepton + $p_{\mathrm{T}}^{\text{miss}}$ system, defined as $m_{\mathrm{T}} = \sqrt{2\, p_{\mathrm{T}}(\ell)\, p_{\mathrm{T}}^{\text{miss}}\, (1 - \cos\Delta\phi)}$, where $\Delta\phi$ is the azimuthal angular difference between the lepton $\vec{p}_{\mathrm{T}}$ and $\vec{p}_{\mathrm{T}}^{\,\text{miss}}$. The $p_{\mathrm{T}}^{\text{miss}}$ distribution extends to higher values for the signal than for the backgrounds, due to the two undetected LSPs in the signal decays. The $m_{\mathrm{T}}$ spectrum peaks around 80 GeV for the SM background and is a broad distribution for the signal.

• Lepton-related variables: $p_{\mathrm{T}}(\ell)$, $\eta(\ell)$, and $Q(\ell)$. The correlations between $p_{\mathrm{T}}^{\text{miss}}$ and $p_{\mathrm{T}}(\ell)$ are different for the signal, where $p_{\mathrm{T}}^{\text{miss}}$ comes from three undetected particles (two $\tilde{\chi}^0_1$ and a $\nu$), than for the W+jets and tt backgrounds, where $p_{\mathrm{T}}^{\text{miss}}$ is the result of a single undetected particle ($\nu$). Because the decay products of the signal are more centrally produced than those of the W+jets process, the lepton pseudorapidity $\eta(\ell)$ distribution is populated at more central values for the signal than for this background. The lepton charge $Q(\ell)$ is a discriminating variable because $\mathrm{W}^+$ and $\mathrm{W}^-$ bosons are not produced equally at the LHC, while the signal events contain equal numbers of positively and negatively charged leptons.

• Jet-related variables: $p_{\mathrm{T}}(\text{ISR})$, $p_{\mathrm{T}}(\mathrm{b})$, $N_{\text{jet}}$, and $H_{\mathrm{T}}$. The variable $p_{\mathrm{T}}(\text{ISR})$ is defined as the $p_{\mathrm{T}}$ of the leading jet, and selects the high-momentum ISR jet in signal events. The $p_{\mathrm{T}}(\mathrm{b})$ variable is the transverse momentum of the jet with the highest b tagging discriminant value. Both the $p_{\mathrm{T}}(\text{ISR})$ and $p_{\mathrm{T}}(\mathrm{b})$ variables are sensitive to the available phase space, which depends on $m(\tilde{\mathrm{t}}_1) - m(\tilde{\chi}^0_1)$ for the signal, and on $m(\mathrm{t}) - m(\mathrm{W})$ for the tt background. The $N_{\text{jet}}$ variable is sensitive to the mass difference $\Delta m$, while the $H_{\mathrm{T}}$ variable provides discrimination between the signal and both the W+jets and tt backgrounds. The remaining input variables are included in the BDT to help discriminate between the signal and, mainly, the W+jets background.
The five most discriminating variables, in decreasing order of power, are $p_{\mathrm{T}}(\ell)$, $p_{\mathrm{T}}^{\text{miss}}$, $p_{\mathrm{T}}(\text{ISR})$, $H_{\mathrm{T}}$, and $m_{\mathrm{T}}$.
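For reference, the transverse mass of the lepton + $p_{\mathrm{T}}^{\text{miss}}$ system, $m_{\mathrm{T}} = \sqrt{2\,p_{\mathrm{T}}(\ell)\,p_{\mathrm{T}}^{\text{miss}}(1-\cos\Delta\phi)}$, can be computed as in this short sketch:

```python
import math

# Transverse mass of the lepton + missing-momentum system,
# m_T = sqrt(2 * pT(l) * pT_miss * (1 - cos(dphi))); for an on-shell
# W -> l nu decay this distribution has an endpoint near m(W).

def m_t(pt_lep, pt_miss, dphi):
    return math.sqrt(2.0 * pt_lep * pt_miss * (1.0 - math.cos(dphi)))

# A back-to-back lepton and pT_miss of 40 GeV each give m_T = 80 GeV,
# near the W boson mass, where the SM background spectrum peaks.
print(round(m_t(40.0, 40.0, math.pi), 1))  # 80.0
```

Signal events, with two extra invisible $\tilde{\chi}^0_1$'s contributing to $p_{\mathrm{T}}^{\text{miss}}$, populate a much broader $m_{\mathrm{T}}$ range, which is what makes the variable discriminating.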
The discrimination power of the input variables varies as a function of $\Delta m$, as illustrated in Fig. 3 (left). An important feature of this search is the adaptation of the selection tool to the evolving kinematic properties of the signal over the $(m(\tilde{\mathrm{t}}_1), m(\tilde{\chi}^0_1))$ plane. Therefore, this plane is divided into eight $\Delta m$ regions (from 10 to 80 GeV, in steps of 10 GeV), and a separate BDT is trained for each $\Delta m$ region. The BDTs are trained to discriminate signal from background using the binomial log-likelihood loss function. Only the W+jets and tt processes, which constitute a large fraction of the total background after preselection, are included in the training. They are normalized in proportion to their theoretical cross sections. As seen in Fig. 3 (right), different signal points with the same $\Delta m$ have similar input variable distributions. This is expected, since with the same $\Delta m$ they have the same available phase space. Because of this, all the signal points with the same $\Delta m$ are grouped together when training the BDT, thus increasing the number of signal events for each training. Because of the large variation of the $p_{\mathrm{T}}(\ell)$ spectrum across the $(m(\tilde{\mathrm{t}}_1), m(\tilde{\chi}^0_1))$ plane, $p_{\mathrm{T}}(\ell) < 30$ GeV is required for the $\Delta m < 70$ GeV signal regions before training the BDTs, while no restriction on $p_{\mathrm{T}}(\ell)$ is imposed for signal regions with higher $\Delta m$. This improves the ability of the BDT to separate the signal from the tt background.
The BDT output distributions for data and simulated SM background are shown in Figs. 4 and 5 for the 2017 and 2018 data, respectively. In each case, a $(m(\tilde{\mathrm{t}}_1), m(\tilde{\chi}^0_1))$ signal point belonging to the $\Delta m$ value for which the training has been done is also reported. The BDT output differs among the $\Delta m$ values, as expected from the changing mix of signal and background and the varying correlations across the $(m(\tilde{\mathrm{t}}_1), m(\tilde{\chi}^0_1))$ plane. Good agreement between the data and simulation is observed for the BDT output distributions over the entire range, for all trainings; the region at small BDT output values (e.g., <0.3) is dominated by background events.
To check the validity of the BDT output in regions depleted in signal, a set of validation regions (VRs) is defined. These regions are chosen to be kinematically close to, but nonoverlapping with, the region selected by the preselection, while using the same online selection. The first VR uses the preselection requirements discussed in Section 5.1, but with $200 < p_{\mathrm{T}}^{\text{miss}} < 280$ GeV required instead. This VR is used to validate the BDT output for all the trained BDTs. The second VR also uses the preselection requirements, but with $p_{\mathrm{T}}(\ell) > 30$ GeV required. It is used for the validation of BDTs trained for signals with $\Delta m < 70$ GeV. This region is not used for BDTs trained for signals with $\Delta m = 70$ or 80 GeV, because the entire range of the $p_{\mathrm{T}}(\ell)$ distribution is considered at preselection. The BDT output distributions for these VRs in data are consistent with those from the simulation. Fig. 6 illustrates for both years the $p_{\mathrm{T}}(\ell)$ distribution for the VR where $200 < p_{\mathrm{T}}^{\text{miss}} < 280$ GeV, as well as the output of the BDT for $\Delta m = 10$ GeV. Fig. 7 reports for both years the $p_{\mathrm{T}}^{\text{miss}}$ distribution for the VR where $p_{\mathrm{T}}(\ell) > 30$ GeV, and the output of the BDT for $\Delta m = 60$ GeV. As observed in these figures, the BDT output in data is well described by the simulation. Any mismodeling of an input variable, which could propagate to the BDT output, is covered by a systematic uncertainty. As described in Section 6, the VRs are used to evaluate the uncertainty in the background determination.
A signal region (SR) is defined by requiring a lower limit on each BDT output. This limit is determined by minimizing the expected upper limit on the signal cross section of a benchmark $(m(\tilde{\mathrm{t}}_1), m(\tilde{\chi}^0_1))$ signal point at the exclusion limit of the 2016 search. This choice implies that the benchmark signal points for the search from 2017 and 2018 data are at higher $\tilde{\mathrm{t}}_1$ and $\tilde{\chi}^0_1$ masses than for the 2016 search. The exact values of the BDT selection requirements are reported in Table 2. As an illustration of the selection power of the BDT, in the case of $\Delta m = 80$ GeV the SM background is suppressed by a factor of $\approx 3.7 \times 10^3$ compared to the preselection, while the signal is only reduced by a factor of $\approx 13$.

Background estimation
The main background processes in this search are W+jets and tt, both with a prompt lepton, and events where the lepton arises from the decay of heavy-flavor quarks or from misidentified hadrons that pass the lepton criteria. The latter category is labeled as nonprompt background. The processes contributing to the nonprompt background are mainly $\mathrm{Z}\to\nu\bar{\nu}$+jets and, to a lesser extent, W+jets and tt, where a jet is misidentified as a lepton, as well as the multijet background. Furthermore, there can also be events in which a genuine lepton (mainly from W+jets or tt) escapes detection, while a nonprompt lepton is selected. These three main sources of background are estimated using data, as described in Sections 6.1 and 6.2. The background from other SM processes, such as single top quark, diboson, DY, and ttX production, is estimated from simulation. In the following, background yields estimated using data are denoted by $Y$, while background yields estimated only from simulated samples are denoted by $N$.

Nonprompt background
The nonprompt background is estimated from data using the "tight-to-loose" method [66]. The tight criteria correspond to the selection of the lepton as described in Section 4. The loose selection is defined by relaxing the requirements on the isolation variables to $I_{\text{abs}} < 20$ GeV for $p_{\mathrm{T}}(\ell) < 25$ GeV and $I_{\text{rel}} < 0.8$ for $p_{\mathrm{T}}(\ell) \geq 25$ GeV, and on the impact parameters to $|d_{xy}| < 0.1$ cm and $|d_z| < 0.5$ cm. A lepton passing these requirements is called a loose lepton. The probability $\varepsilon_{\mathrm{TL}}$ for a loose lepton to pass the tight criteria is measured as a function of its $p_{\mathrm{T}}$ and $\eta$ in a data control region (CR) that is largely dominated by multijet events and enriched in nonprompt leptons. For each SR, a sideband region is defined with the same requirements, but where the lepton must pass the loose criteria while failing the tight ones ("L!T"). The number of such events in data is denoted as $N^{\mathrm{L!T}}(\text{Data})$. The number of events $N^{\mathrm{L!T}}_{\mathrm{p}}(\text{MC})$ from simulation, in which a vector boson or a top quark produces a prompt lepton, is subtracted from the data sample with a loose-not-tight lepton. The predicted nonprompt yield $Y^{\mathrm{SR}}_{\mathrm{np}}$ in each SR is obtained by weighting the resulting number of events by $\varepsilon_{\mathrm{TL}}/(1-\varepsilon_{\mathrm{TL}})$:

$$Y^{\mathrm{SR}}_{\mathrm{np}} = \frac{\varepsilon_{\mathrm{TL}}}{1-\varepsilon_{\mathrm{TL}}}\left[N^{\mathrm{L!T}}(\text{Data}) - N^{\mathrm{L!T}}_{\mathrm{p}}(\text{MC})\right]. \quad (1)$$
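A minimal numerical sketch of this tight-to-loose extrapolation, Eq. (1), with made-up yields:

```python
# Sketch of Eq. (1): events with a loose-not-tight lepton are weighted
# by eps_TL / (1 - eps_TL) after subtracting the prompt contamination
# predicted by simulation. All numbers below are illustrative.

def nonprompt_yield(n_lnt_data, n_lnt_prompt_mc, eps_tl):
    weight = eps_tl / (1.0 - eps_tl)
    return weight * (n_lnt_data - n_lnt_prompt_mc)

print(nonprompt_yield(n_lnt_data=120.0, n_lnt_prompt_mc=20.0, eps_tl=0.2))
# 0.25 * 100 = 25.0
```

In the analysis, $\varepsilon_{\mathrm{TL}}$ is of course not a single number but is measured in bins of lepton $p_{\mathrm{T}}$ and $\eta$, so the weight is applied event by event.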

Dominant prompt backgrounds
To estimate the prompt contributions from the W+jets and tt processes, a method based on the number of these background events observed in data CRs is used. The method uses the output of the BDT, and a transfer factor between the CR and the SR, obtained from simulation. This factor, of order $10^{-3}$ for both backgrounds and both years, is the ratio of the number of predicted events in the SR, $N^{\mathrm{SR}}_{\mathrm{p}}$, to the one in the CR, $N^{\mathrm{CR}}_{\mathrm{p}}$. The estimated yield $Y^{\mathrm{SR}}_{\mathrm{p}}$ of the dominant prompt background in the SR, estimated independently per process and per year, is then determined using

$$Y^{\mathrm{SR}}_{\mathrm{p}}(X) = \frac{N^{\mathrm{SR}}_{\mathrm{p}}(X)}{N^{\mathrm{CR}}_{\mathrm{p}}(X)}\left[N^{\mathrm{CR}}(\text{Data}) - N^{\mathrm{CR}}_{\mathrm{p}}(\text{non-}X) - Y^{\mathrm{CR}}_{\mathrm{np}}\right], \quad (2)$$

where $X$ refers to the background process being estimated, either W+jets or tt, and the terms prompt and nonprompt refer to their definition as given at the beginning of Section 6. To obtain a data sample enriched in the backgrounds being estimated, a CR is defined by applying the preselection criteria, with the additional requirement BDT < 0. The number of such events is denoted as $N^{\mathrm{CR}}(\text{Data})$. To enrich the CR in W+jets or tt events, the number of loosely b tagged jets is required to be zero, or the number of tightly b tagged jets to be at least one, respectively, where the loose and tight working points were discussed in Section 4 [59]. The purity of W+jets and tt processes in the corresponding CRs is approximately 93% and 78%, respectively. The level of signal contamination in the CR is well below 5%. The number $N^{\mathrm{CR}}_{\mathrm{p}}(\text{non-}X)$ is the number of prompt background events in the CR other than the process being estimated; it is obtained from simulation and subtracted from the number of data events (e.g., if $X$ = W+jets, this term includes tt, and vice versa). The yield $Y^{\mathrm{CR}}_{\mathrm{np}}$, which is the predicted number of nonprompt background events in the CR, is also subtracted.
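The transfer-factor estimate of Eq. (2) can be sketched numerically as follows; all inputs are illustrative, not analysis values:

```python
# Sketch of Eq. (2): the SR yield of a prompt background X is the data
# count in its CR, corrected for other prompt processes and for the
# nonprompt contribution, scaled by the simulation-based transfer
# factor N_SR_p(X) / N_CR_p(X).

def prompt_yield_sr(n_cr_data, n_cr_nonx_mc, y_cr_np, n_sr_mc, n_cr_mc):
    transfer = n_sr_mc / n_cr_mc  # quoted above as of order 1e-3
    return transfer * (n_cr_data - n_cr_nonx_mc - y_cr_np)

print(prompt_yield_sr(n_cr_data=50000.0, n_cr_nonx_mc=3000.0,
                      y_cr_np=2000.0, n_sr_mc=9.0, n_cr_mc=4500.0))
# 0.002 * 45000 = 90.0
```

The benefit of this construction is that the absolute normalization comes from data, so only the CR-to-SR shape extrapolation relies on simulation.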

Summary of systematic uncertainties
Processes for which the absolute yield is predicted by simulation are subject to systematic uncertainties in the determination of the integrated luminosity, which is estimated year by year with uncertainties in the 1.2-2.5% range [67,68]. All simulated samples are subject to experimental uncertainties in the jet energy scale (JES) and jet energy resolution (JER). The uncertainties arising from miscalibration of the JES are estimated by varying the jet energy corrections up and down by one standard deviation of their uncertainties and propagating the effect to the calculation of $p_{\mathrm{T}}^{\text{miss}}$. Differences in the JER between data and simulation are accounted for by smearing the momenta of jets in simulation. The uncertainties corresponding to the b tagging efficiencies and to the misidentification rates for tagging light-flavor quark or gluon jets as b jets are evaluated for all simulated samples. The systematic uncertainties in the scale factors applied to the simulated samples for trigger and lepton efficiencies are taken into account. The uncertainty due to the simulation of pileup for simulated background processes is estimated by varying the inelastic pp cross section by 4.6% [69]. An uncertainty of 50% is assigned to the cross sections of all backgrounds whose yields are predicted from simulation.
The estimation of the nonprompt background, as described in Section 6.1, depends on the tight-to-loose fraction $\varepsilon_{\mathrm{TL}}$, which is sensitive to the flavor content of jets. The systematic uncertainty arising from this source in the measurement region is estimated by changing the b tagging requirement in the b veto to demand at least one b tagged jet using the medium working point. The resulting uncertainty ranges from 3 to 90%, from low to high lepton $p_{\mathrm{T}}$. The method is also tested by repeating this procedure on the simulated event samples, where any variations in the background determination are considered as systematic uncertainties and added in quadrature to the aforementioned uncertainty.
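The quadrature combination used above for independent uncertainty components is simply:

```python
import math

# Sketch: independent relative uncertainties are combined in
# quadrature, as for the nonprompt-background uncertainty above.

def combine_in_quadrature(rel_uncertainties):
    return math.sqrt(sum(u * u for u in rel_uncertainties))

# e.g. a 30% and a 40% component combine to 50% (values illustrative)
print(combine_in_quadrature([0.30, 0.40]))  # 0.5
```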
The systematic uncertainties associated with the predictions of the W+jets and tt processes are based on the comparison of two methods: one to assess the closure of the background prediction method, and the other to evaluate the effect of the modeling of the BDT output distribution. The closure method measures differences between the predicted number of events (obtained from Eq. (2)) and the observed number of data events in both VRs, as defined in Section 5.2, where the statistical uncertainty in the number of CR events is taken into account. Uncertainties in the modeling of the BDT output distribution, which can affect the background prediction, are assessed by comparing the ratio of the BDT output distributions for data to the background prediction in the CR with the ratio in the SR, for the two VRs. To be conservative, the uncertainties are evaluated in the two VRs for both methods, and the largest value is used. These uncertainties range from 10 to 20% for the prediction of W+jets events and from 8 to 80% for the estimation of tt processes over the various SRs. The estimations of the W+jets and tt backgrounds rely partially on the simulation and are therefore sensitive to theoretical uncertainties in the modeling of ISR. For the tt process, half of the ISR correction is assigned as the systematic uncertainty, which also applies to the simulated signal samples. For the W+jets process, the difference between the ISR-corrected and uncorrected simulation is taken as the systematic uncertainty.

Table 1: The relative systematic uncertainties in percent from the different sources in the signal and the total relative uncertainty in the W+jets, tt, and nonprompt background predictions, shown separately for the 2017 and 2018 data analysis. The ranges given are across the eight SRs. The "-" symbol means that a given source of uncertainty is not applicable.
Uncertainties from unknown higher-order theoretical effects are estimated through uncorrelated variations of the renormalization and factorization scales by factors of 0.5, 1, and 2 [70]. Finally, differences between the fast and the full GEANT4-based simulations of $p_{\mathrm{T}}^{\text{miss}}$ are used as the corresponding systematic uncertainty and assigned to the signal yields. The statistical uncertainty in the simulated signal samples, of 3 to 20% over the various SRs, is included as a systematic uncertainty. The relative systematic uncertainties in the signal from the various sources, and the total relative systematic uncertainties in the W+jets, tt, and nonprompt backgrounds, are given in Table 1 as ranges over the eight SRs.
To combine the results from the different data-taking years, systematic uncertainties whose sources are exactly the same for the different years are taken as fully correlated. This includes the uncertainty in the theoretical cross sections, pileup, JES, the reweighting of the W+jets sample, the renormalization and factorization scales, and the prediction of the W+jets, tt, and nonprompt backgrounds. The systematic uncertainty in the integrated luminosity has multiple components and is thus considered as partially correlated between the years [19, 67, 68], as is the systematic uncertainty in the b tagging procedure.

Results and interpretation
The observed and expected numbers of signal and background events from the 2017-2018 data analysis for the eight values of $\Delta m$ are given in Table 2 and shown in Figs. 8 and 9. The predictions and the associated uncertainties in these figures are given before a profiled likelihood fit [71][72][73] is performed. The post-fit uncertainties are not reduced, because a single bin does not provide additional constraints. It should be noted that the background composition varies for the same $\Delta m$ region for different years. This is because an independent BDT is trained per $\Delta m$ and per year, with a different selection on its output. There is good agreement between the observed and predicted numbers of events for all SRs. The largest difference is for $\Delta m = 10$ GeV, where there are excesses of data events over the predicted background of 1.1 and 2.9 standard deviations (local significance) for the 2017 and 2018 data, respectively. The 2016 analysis had a similar excess for the same $\Delta m$ value, corresponding to 0.7 standard deviations. None of these excesses is statistically significant, so it is concluded that there is no evidence for direct top squark production.

Table 2: The predicted number of W+jets, tt, nonprompt, and other ($N^{\mathrm{SR}}(\text{Other})$) background events and their sum ($N^{\mathrm{SR}}(\text{Total})$) in the eight SRs for the 2017 and 2018 data analysis. The first three predicted yields are derived from data, while the yields of the other background processes come from simulation. The uncertainties shown are the quadratic sum of the statistical and systematic uncertainties given in Table 1 for all the background processes. The corresponding $\Delta m$ and BDT output threshold values for each SR are displayed in the first and second columns, respectively, and the observed number of events in data is shown in the last column.
The observed and expected numbers of events for each signal mass point and their corresponding uncertainties are converted into 95% confidence level (CL) upper limits on the $\tilde{\mathrm{t}}_1 \tilde{\mathrm{t}}_1^*$ production cross section in the $(m(\tilde{\mathrm{t}}_1), m(\tilde{\chi}^0_1))$ plane. These are shown by the colored regions in Fig. 10 as a function of $m(\tilde{\mathrm{t}}_1)$ and $\Delta m$, where the color scale to the right of the figure gives the corresponding upper limit values. The limits are calculated according to the modified frequentist CL$_{\mathrm{s}}$ criterion [71][72][73]. A test statistic is defined as the likelihood ratio between the background-only and signal-plus-background hypotheses, and is used to set exclusion limits on top squark pair production. The distributions of the test statistic are built using simulated experiments, where statistical uncertainties are modeled with Poisson distributions, and all systematic uncertainties are modeled with log-normal distributions. When interpreting the results, a branching fraction of 100% for the four-body decay is assumed. For the combined results of the three years, the largest excess in the data corresponds to 2.5 standard deviations (local significance) for the $\Delta m = 10$ GeV SR.
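As a schematic illustration of the CL$_{\mathrm{s}}$ criterion, here reduced to a single counting bin with no systematic uncertainties (unlike the full analysis, which uses toy experiments with Poisson and log-normal constraints):

```python
import math

# Minimal single-bin CLs sketch: a signal yield s is excluded at
# 95% CL when CLs = P(n <= n_obs | s+b) / P(n <= n_obs | b) < 0.05.

def poisson_cdf(n_obs, mean):
    return sum(math.exp(-mean) * mean**k / math.factorial(k)
               for k in range(int(n_obs) + 1))

def cls(n_obs, b, s):
    return poisson_cdf(n_obs, s + b) / poisson_cdf(n_obs, b)

def upper_limit(n_obs, b, step=0.01):
    # Scan upward for the smallest signal yield excluded at 95% CL.
    s = step
    while cls(n_obs, b, s) >= 0.05:
        s += step
    return s

# With 5 events observed on an expected background of 4.0, signal
# yields above roughly 7 events are excluded in this toy setup.
print(round(upper_limit(5, 4.0), 2))
```

Dividing by CL$_{\mathrm{b}}$ is what distinguishes CL$_{\mathrm{s}}$ from a plain CL$_{\mathrm{s+b}}$ limit: it prevents the exclusion of signals to which the experiment has little sensitivity when a downward background fluctuation occurs.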
Using the measured upper limits on the top squark pair production cross section and the theoretical predictions for the cross section, the 95% CL lower limits on m( t 1 ) versus ∆m are determined. The solid black line and thick dashed red line in Fig. 10 give the resulting 95% CL observed and expected exclusion contours, respectively, on m( t 1 ) as a function of ∆m, obtained from combining the 2016, 2017, and 2018 data. The corresponding thin black lines in Fig. 10 represent the ±1 standard deviation (σ theory ) variations in the observed limits due to the theoretical uncertainties. The thin dashed red lines give the ±1 and ±2 standard deviation (σ experiment ) variations in the expected limits, coming from the experimental uncertainties. The maximum sensitivity is reached for the highest ∆m (∆m ≈ m(W)), where top squark masses up to 700 GeV are excluded. At the lowest ∆m value of 10 GeV covered by the search, the corresponding value is 480 GeV. The reduced sensitivity at lower ∆m is explained by the softer transverse momentum spectrum of the decay products, as shown in Fig. 2, which results in a loss of acceptance.
These results improve on the limits of the previous analysis. At low ∆m, the top squark mass limit is 60 GeV higher, improving the sensitivity at low mass splittings beyond simple luminosity scaling, while at high ∆m the top squark mass limit is extended by 140 GeV. Compared to the results of a similar analysis by the ATLAS Collaboration for the same decay mode and final state [20], the search presented here has comparable limits at intermediate and high ∆m values. At low ∆m, however, the excluded top squark mass is 120 GeV higher than the ATLAS limit. This is attributed to more inclusive preselection criteria, in which b tagging is not used, and in which the discrimination between the signal and the dominant W+jets background is performed by a multivariate analysis tool, whose performance is further enhanced by a BDT specifically trained for each ∆m.

Summary
The results of a search for the direct pair production of top squarks in single-lepton final states are presented within a compressed scenario where R parity is conserved and the mass difference ∆m = m( t 1 ) − m( χ 0 1 ) between the lightest top squark ( t 1 ) and the lightest supersymmetric particle, taken to be the lightest neutralino χ 0 1 , does not exceed the W boson mass. The considered decay mode of the top squark is the prompt four-body decay to bff′ χ 0 1 , where the fermions f and f′ in the final state represent a charged lepton and its neutrino for the decay products of one t 1 , and two quarks for the other top squark. The search is based on data collected from proton-proton collisions at √ s = 13 TeV, recorded with the CMS detector during the years 2016, 2017, and 2018, corresponding to an integrated luminosity of 138 fb −1 . Selected events contain a single lepton (electron or muon), at least one high-momentum jet, and significant missing transverse momentum. The analysis is based on a multivariate tool specifically trained for different ∆m regions, thus adapting the signal selection to the evolution of the kinematic variables as a function of (m( t 1 ), m( χ 0 1 )). The dominant background processes are W+jets, tt, and events with nonprompt leptons, which are estimated using control regions in the data.
The observed number of events is consistent with the predicted standard model backgrounds in all signal regions. Upper limits are set at the 95% confidence level on the t 1 t 1 production cross section as a function of the t 1 and χ 0 1 masses, within the context of a simplified model. Assuming a 100% branching fraction in the four-body decay mode, the search excludes top squark masses up to 480 and 700 GeV at ∆m = 10 and 80 GeV, respectively. The results summarized in this paper are among the best limits to date on the top squark pair production cross section for top squark decays via the four-body mode, and currently correspond to the most stringent limits for ∆m < 30 GeV.

In Fig. 10, the color shading represents the observed upper limit for a given point in the plane, using the color scale to the right of the figure. The solid black and dashed red lines show the observed and expected 95% CL lower limits, respectively, on m( t 1 ) as a function of ∆m. The thick lines give the central values of the limits. The corresponding thin lines represent the ±1 standard deviation (σ theory ) variations in the limits due to the theoretical uncertainties in the case of the observed limits, and the ±1 and ±2 standard deviation (σ experiment ) variations due to the experimental uncertainties in the case of the expected limits.

Acknowledgments
We congratulate our colleagues in the CERN accelerator departments for the excellent performance of the LHC and thank the technical and administrative staffs at CERN and at other CMS institutes for their contributions to the success of the CMS effort. In addition, we gratefully acknowledge the computing centers and personnel of the Worldwide LHC Computing Grid and other centers for delivering so effectively the computing infrastructure essential to our analyses. Finally, we acknowledge the enduring support for the construction and operation of the LHC, the CMS detector, and the supporting computing infrastructure provided by the following funding agencies: