Search for single production of vector-like quarks decaying into a b quark and a W boson in proton-proton collisions at sqrt(s) = 13 TeV

A search is presented for a heavy vector-like quark, decaying into a b quark and a W boson, which is produced singly in association with a light flavor quark and a b quark. The analysis is performed using a data sample of proton-proton collisions at a center-of-mass energy of sqrt(s) = 13 TeV collected at the LHC in 2015. The data set used in the analysis corresponds to an integrated luminosity of 2.3 inverse femtobarns. The search is carried out using events containing one electron or muon, at least one b-tagged jet with large transverse momentum, at least one jet in the forward region of the detector, and missing transverse momentum. No excess over the standard model prediction is observed. Upper limits are placed on the production cross section of heavy exotic quarks: a T quark with a charge of 2/3, and a Y quark with a charge of -4/3. For Y quarks with coupling of 0.5 and B(Y to bW) = 100%, the observed (expected) lower mass limits are 1.40 (1.0) TeV. This is the most stringent limit to date on the single production of the Y vector-like quark.


Introduction
The standard model (SM) of particle physics has been exceptionally successful in describing phenomena at the subatomic scale. The observation of a Higgs boson with a mass of 125 GeV and with properties consistent with the SM expectations [1][2][3] completed the SM. However, in the absence of enormous order-dependent cancellations, also known as fine-tuning, large SM quantum corrections would shift the bare Higgs boson mass to values far beyond the electroweak scale. New physics is required to stabilize the Higgs boson mass naturally at the electroweak scale, i.e. without invoking fine-tuning.
Many natural extensions of the SM have been proposed in recent decades. Some of these models postulate the existence of vector-like quarks (VLQs) [4][5][6], which are colored fermions with left- and right-handed chiral states both transforming in the same way under the gauge group $SU(3)_C \times SU(2)_L \times U(1)_Y$. The VLQs do not acquire masses through the Yukawa coupling to the Higgs field, and could cancel loop corrections from the SM top quark to the Higgs boson mass.
Searches for VLQs have already been performed in various decay modes using proton-proton collisions at √s = 8 TeV. These searches were primarily focused on the pair production mechanism and they ruled out VLQs with masses up to approximately 0.90 TeV [7-10]. The VLQ single production mechanism is coupling-dependent, and it could become the dominant contribution to the cross section at high VLQ masses. The strength of the VLQ-b-W coupling can be approximately characterized by a single dimensionless parameter that varies from 0 to √2 [11], where the latter would correspond to a coupling of full electroweak strength.
In this paper, we present a search for the single production of a heavy vector-like quark that decays into a b quark and a W boson using the 2015 LHC data set. This signature can arise from either a Y or a T quark with a charge of −4/3 or 2/3, respectively, produced in association with a light flavor quark and a b quark. The leading order Feynman diagram for Y and T quark production is shown in Fig. 1. The outgoing light flavor quark q in the upper part of the diagram produces a jet in the forward region of the detector, which is a distinct signature of single production.
The Y quark is expected to decay with a branching fraction (B) of 100% into a b quark and a W boson [12], while the T quark can also decay into tH and tZ via a flavor changing neutral current. Searches with the 2015 LHC data set for single production of a vector-like T quark decaying to tH and tZ have been performed by the CMS Collaboration [13][14][15]. If the T quark is a singlet, then it is expected to decay into bW 50% of the time.
The ATLAS Collaboration published a search for single production of Y and T quarks decaying into bW using 8 TeV proton-proton collisions [16]. The analysis presented here is the first such search using 13 TeV proton-proton data, and sets the most stringent limits to date on the production cross section for a single Y or T quark. The search is carried out based on events containing one electron or muon, at least one b-tagged jet with large transverse momentum (p T ), at least one jet in the forward region of the detector, and missing transverse momentum.

CMS detector and event samples
The essential feature of the CMS detector is the superconducting solenoid, 6 m in diameter and 13 m in length, which provides an axial magnetic field of 3.8 T. Within the solenoid volume a multi-layered silicon pixel and strip tracker is used to measure the trajectories of charged particles with pseudorapidity |η| < 2.5. Outside of the tracker system, an electromagnetic calorimeter (ECAL) made of lead tungstate crystals and a hadron calorimeter (HCAL) made of brass and scintillators cover the region |η| < 3.0. The region 3.0 < |η| < 5.0 is covered by the forward hadronic calorimeter, which is made primarily of steel and quartz fibers. Muons are measured in gas-ionization detectors embedded in the steel flux-return yoke of the solenoid, and covering the region |η| < 2.4. A more detailed description of the CMS detector, together with a definition of the coordinate system used and the relevant kinematic variables, can be found in Ref. [17].
The data used for this analysis were recorded during the 2015 data taking period in proton-proton collisions at a center-of-mass energy of 13 TeV, corresponding to an integrated luminosity of 2.3 fb⁻¹. The electron data sample was collected using a trigger that required at least one isolated electron with |η| < 2.5 and p T > 27 GeV. The muon data sample was collected using a trigger that required at least one isolated muon with |η| < 2.1 and p T > 20 GeV.
The VLQ signal efficiencies and background contributions are estimated using Monte Carlo (MC) samples. They are validated using background-enriched data samples. The tt+jets, t- and tW-channel single top-quark production and the WW processes are simulated using POWHEG v2 [18][19][20]. Single top quark production via s-channel and the WZ process are simulated with MADGRAPH5_aMC@NLO v2 [21]. Inclusive boson production (W+jets and Z+jets) is simulated with MADGRAPH v5 [22]. PYTHIA 8.212 [23, 24] is used for parton shower development and hadronization and to simulate QCD multijet events.
The VLQ processes considered in this paper are generated using the tree-level MC event generator MADGRAPH v5 for VLQ masses in the range from 0.70 to 1.80 TeV, in steps of 100 GeV. The VLQ width is set to 10 GeV for all masses. The NNPDF3.0 [25] parton distribution functions (PDFs) are used for both signal and SM MC processes to model the momentum distribution of the colliding partons inside the protons.
The cross sections used to normalize the SM processes are calculated to next-to-leading order (NLO) or to next-to-next-to-leading order (NNLO), where the latter is available [26][27][28]. For the signal, the NLO cross sections are taken from Refs. [29,30]. For the tt+jets, tW-channel single top-quark, and WW SM processes, NNLO cross sections are used, while NLO cross sections are applied to the remaining processes.
All generated events are processed through the CMS detector simulation based on GEANT4 [31]. Additional minimum bias events, generated with PYTHIA 8.212, are superimposed on the hard-scattering events to simulate multiple proton-proton interactions (pileup) within the neighboring bunch crossings. The simulated events are weighted to reproduce the distribution of the number of pileup interactions, 20 on average, observed in data.
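The pileup reweighting described above amounts to a per-event weight equal to the ratio of the normalized pileup distributions in data and in simulation. The following is a minimal sketch in Python under that assumption; the function name and the toy histograms are illustrative and are not taken from the CMS software.

```python
def pileup_weights(data_counts, mc_counts):
    """Return per-bin weights w[n] = P_data(n) / P_MC(n).

    Both inputs are histograms of the number of pileup interactions,
    indexed by n. Bins with no simulated events get weight 0.
    """
    data_total = float(sum(data_counts))
    mc_total = float(sum(mc_counts))
    weights = []
    for d, m in zip(data_counts, mc_counts):
        if m == 0:
            weights.append(0.0)
        else:
            weights.append((d / data_total) / (m / mc_total))
    return weights


# Toy example with made-up numbers: the simulation contains
# too few high-pileup events compared with the data.
data_hist = [10, 30, 40, 20]
mc_hist = [20, 40, 30, 10]
weights = pileup_weights(data_hist, mc_hist)
```

A simulated event with n pileup interactions would then be weighted by `weights[n]`, so that the weighted simulated distribution follows the shape observed in data.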

Event reconstruction
All physics objects in the event are reconstructed using a particle-flow (PF) algorithm [32, 33], which uses information from all subsystems to reconstruct photons, electrons, muons, and charged and neutral hadrons. Charged particle tracks are used to reconstruct the interaction vertices. The vertex with the highest sum of squared p T of all associated tracks is taken as the primary vertex of the hard collision. Filters are applied to reject events where electronic noise or proton-beam backgrounds mimic energy deposits in the detector.
Electron candidates are reconstructed by combining the tracking information with energy deposits in the ECAL in the range |η| < 2.5 (excluding the range 1.4442 < |η| < 1.566, which is a transition region between endcap and barrel calorimeters). Tight identification criteria are applied to select well-reconstructed electron candidates. Candidates are identified [34] using information on the shower-shape, the track quality and the spatial match between the track and the electromagnetic cluster, the fraction of total cluster energy in the HCAL, and the resulting level of activity in the surrounding tracker and calorimeter regions. The energy resolution for electrons with p T > 40 GeV, measured using Z → ee decays, is on average 1.7% in the ECAL central region of the detector [34].
Muon candidates are identified using track segments reconstructed separately from hits in the silicon tracking system and in the muon system. To identify muon candidates, the track segments must be consistent with muons originating from the primary vertex and satisfying tight identification requirements. The matching of the muon and silicon track segments results in a relative p T resolution of 1.3−2.0% in the central region of the detector for muons with 20 < p T < 100 GeV, and for muons with p T up to 1 TeV the resolution is 10% or better [35].
Lepton (electron or muon) reconstruction and trigger efficiencies are evaluated as a function of p T and |η| in both data and simulation, using a "tag-and-probe" method [36] with recorded and simulated samples of dileptonic Z events.
An isolation variable is employed to suppress leptons originating from QCD processes. We define a relative isolation as the sum of the p T of particle tracks found in the tracker and energy deposits found in the calorimeters within a cone of $\Delta R = \sqrt{(\Delta\eta)^2 + (\Delta\phi)^2} = 0.3$ (0.4) around the trajectory of the electron (muon), divided by the lepton p T. Relative isolation is corrected for the effects of pileup, and is required to be less than 0.15 for muons, and less than 0.4 (0.6) for electrons in the barrel (endcap) region.
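As an illustration of the isolation definition above, the sketch below computes the cone distance and a relative isolation from a list of nearby particles. It is a simplified example: the dictionary layout and function names are hypothetical, the lepton itself is assumed to be absent from the particle list, and the pileup correction mentioned in the text is omitted.

```python
import math


def delta_phi(phi1, phi2):
    """Azimuthal separation wrapped into the interval (-pi, pi]."""
    dphi = (phi1 - phi2) % (2.0 * math.pi)
    if dphi > math.pi:
        dphi -= 2.0 * math.pi
    return dphi


def delta_r(eta1, phi1, eta2, phi2):
    """Cone distance sqrt((delta eta)^2 + (delta phi)^2)."""
    return math.hypot(eta1 - eta2, delta_phi(phi1, phi2))


def relative_isolation(lepton, particles, cone):
    """Sum of particle pT inside the cone, divided by the lepton pT.

    `lepton` and each entry of `particles` are dicts with keys
    'pt', 'eta', 'phi' (an assumed layout, for illustration only).
    """
    pt_sum = 0.0
    for p in particles:
        if delta_r(lepton["eta"], lepton["phi"], p["eta"], p["phi"]) < cone:
            pt_sum += p["pt"]
    return pt_sum / lepton["pt"]


# Toy muon with three nearby particles (made-up values).
muon = {"pt": 50.0, "eta": 0.0, "phi": 0.0}
nearby = [
    {"pt": 3.0, "eta": 0.1, "phi": 0.1},
    {"pt": 4.0, "eta": 0.3, "phi": -0.2},
    {"pt": 2.0, "eta": 0.0, "phi": 3.1},
]
muon_iso = relative_isolation(muon, nearby, cone=0.4)
muon_is_isolated = muon_iso < 0.15  # muon threshold quoted in the text
```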
Particles reconstructed by the PF algorithm are clustered into jets by using the direction of each particle at the interaction vertex. Charged hadrons found by the PF algorithm that are associated with pileup vertices are not considered. Particles that are identified as isolated leptons are removed from the jet clustering procedure. Jets are reconstructed with the anti-k T algorithm [37, 38] with a distance parameter of 0.4. An event-by-event jet-area-based correction [39,40] is applied to remove, on a statistical basis, neutral pileup contribution that is not already removed by the charged-hadron subtraction procedure described above. Jet energy corrections are applied to each jet, as a function of p T and η, to correct for the calorimeter response [41].
The missing transverse momentum is defined as the negative vector sum of the transverse momenta of all the particles found by the PF algorithm, and its magnitude is referred to as E miss T . The decay of a heavy quark into a leptonically decaying W boson and a b quark is expected to exhibit genuine missing transverse momentum because of the undetected neutrino from the W decay. A missing transverse momentum threshold is applied to the selected events, and the missing transverse momentum vector is used in the mass reconstruction.
To identify jets originating from a b quark (b-tagged jets), the combined secondary vertex (CSV) algorithm is used [42,43]. This tagging algorithm combines variables that can distinguish b quark jets from those originating from light flavors, such as information on track impact parameter significance and secondary vertex properties. The variables are combined using a likelihood ratio technique to compute a b tagging discriminator. We use the CSV medium operating point [42], which achieves a b tagging efficiency of approximately 70% and a mistag rate of 1%. Data-to-simulation scale factors for the efficiency and mistag rate account for the small differences observed between data and simulation. We apply these scale factors as a function of jet p T and η [42] to correct simulated events.

Event selection and search strategy
The signal event selection requires exactly one lepton with p T > 40 GeV and |η| < 2.1. Events with additional leptons having p T > 10 GeV and |η| < 2.5 and passing relatively loose isolation and identification requirements are rejected to suppress dileptonic events.
Events are required to have at least two jets, one in the central and one in the forward region of the detector. The central jet is required to have p T > 200 GeV and |η| < 2.4 and be b-tagged. When there is more than one central jet satisfying the above criteria, the leading central jet is used to reconstruct the mass of the VLQ. The forward jet (2.4 < |η| < 5.0) must have p T > 30 GeV.
In the decay of a singly produced VLQ, the b quark and the W boson tend to be produced with the transverse momenta pointing in opposite directions. Hence, the azimuthal angle between the central b jet and the lepton is required to satisfy ∆φ(ℓ, b) > 2. In addition, the lepton is required to be separated from any jets with p T > 40 GeV produced in the event. When a hadronic jet is found within ∆R(ℓ, jet) < 1.5, the event is rejected. Since a W boson originating from a heavy VLQ decay has significant p T , events are required to have substantial E miss T (> 50 GeV) due to the undetected neutrino from the W boson decay. The transverse mass, M T , formed by the lepton and E miss T system is required to satisfy M T < 130 GeV to suppress tt dilepton events, which can mimic the signal when one of the leptons escapes detection.
Finally, events are required to have S T > 500 GeV, where S T is defined as the scalar sum of the transverse momenta of the lepton, the leading central jet, and the missing transverse momentum. This requirement reduces the signal efficiency by less than 10% for the VLQ mass range considered in this paper.
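The kinematic requirements of the two preceding paragraphs can be sketched as follows. The text does not spell out the transverse mass formula, so the conventional definition M T = sqrt(2 p T E miss T (1 − cos ∆φ)) is assumed here; the function names are illustrative, and the lepton, jet p T, forward jet, and ∆R(ℓ, jet) requirements are left out for brevity.

```python
import math


def delta_phi(phi1, phi2):
    """Azimuthal separation wrapped into the interval (-pi, pi]."""
    dphi = (phi1 - phi2) % (2.0 * math.pi)
    if dphi > math.pi:
        dphi -= 2.0 * math.pi
    return dphi


def transverse_mass(lep_pt, lep_phi, met, met_phi):
    """Conventional transverse mass of the lepton + missing momentum system."""
    return math.sqrt(2.0 * lep_pt * met * (1.0 - math.cos(lep_phi - met_phi)))


def scalar_sum_st(lep_pt, bjet_pt, met):
    """S_T: scalar sum of lepton pT, leading central jet pT, and missing pT."""
    return lep_pt + bjet_pt + met


def passes_kinematic_selection(lep_pt, lep_phi, bjet_pt, bjet_phi, met, met_phi):
    """Apply the delta-phi, missing momentum, M_T and S_T thresholds (GeV)."""
    return (
        abs(delta_phi(lep_phi, bjet_phi)) > 2.0
        and met > 50.0
        and transverse_mass(lep_pt, lep_phi, met, met_phi) < 130.0
        and scalar_sum_st(lep_pt, bjet_pt, met) > 500.0
    )


# Toy event (made-up values): lepton and b jet roughly back to back.
selected = passes_kinematic_selection(
    lep_pt=100.0, lep_phi=0.0, bjet_pt=400.0, bjet_phi=3.0, met=80.0, met_phi=0.5
)
```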
The invariant mass of the heavy quark candidate, M inv , is reconstructed from its decay products: the lepton, the leading central jet, and the neutrino, where the x,y-components of the neutrino momentum are given by the missing transverse momentum, while the z-component is determined by constraining the invariant mass of the lepton and neutrino to the W boson mass value. The solution with the smallest value is considered as the z-component. This method is used only when the solution of the relevant quadratic equation is real, otherwise the z-component is set to zero.
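The W mass constraint described above leads to a quadratic equation for the neutrino z-momentum. The sketch below solves it for a massless lepton and neutrino. Two points are assumptions of this example rather than statements from the text: the W boson mass is set to 80.4 GeV, and "the smallest value" is interpreted as the solution with the smallest absolute value. The function names are hypothetical.

```python
import math

W_MASS = 80.4  # GeV; approximate W boson mass, an assumed input


def neutrino_pz(lep, met_x, met_y, w_mass=W_MASS):
    """Solve the W mass constraint for the neutrino pz.

    `lep` is the (px, py, pz) of the lepton, treated as massless and
    assumed to have non-zero pT. Returns 0.0 when no real solution exists.
    """
    lpx, lpy, lpz = lep
    lep_pt2 = lpx * lpx + lpy * lpy
    lep_e2 = lep_pt2 + lpz * lpz
    met2 = met_x * met_x + met_y * met_y
    mu = 0.5 * w_mass * w_mass + lpx * met_x + lpy * met_y
    a = mu * lpz / lep_pt2
    disc = a * a - (lep_e2 * met2 - mu * mu) / lep_pt2
    if disc < 0.0:
        return 0.0  # complex solutions: z-component set to zero
    root = math.sqrt(disc)
    sol1, sol2 = a + root, a - root
    return sol1 if abs(sol1) < abs(sol2) else sol2


def invariant_mass(vectors):
    """Invariant mass of a list of four-vectors (E, px, py, pz)."""
    e = sum(v[0] for v in vectors)
    px = sum(v[1] for v in vectors)
    py = sum(v[2] for v in vectors)
    pz = sum(v[3] for v in vectors)
    return math.sqrt(max(e * e - px * px - py * py - pz * pz, 0.0))


def reconstruct_minv(lep, bjet4, met_x, met_y):
    """M_inv of lepton + neutrino + leading central jet (four-vector bjet4)."""
    nu_pz = neutrino_pz(lep, met_x, met_y)
    lep4 = (math.sqrt(sum(c * c for c in lep)),) + tuple(lep)
    nu4 = (math.sqrt(met_x ** 2 + met_y ** 2 + nu_pz ** 2), met_x, met_y, nu_pz)
    return invariant_mass([lep4, nu4, bjet4])
```

When a real solution exists, the lepton and reconstructed neutrino have an invariant mass equal to the assumed W mass by construction, which is a convenient closure check.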
The single VLQ production Y/T → bW would result in a peak in the M inv distribution at the mass of the VLQ. The experimental mass resolution is 12−15% and is independent of the VLQ mass.

Background modeling
The dominant background processes in this search are the production of tt and W+jets events. The modeling of these processes is validated by studying background-enriched samples.
To verify the modeling of the tt process, we select events with the lepton and E miss T fulfilling the signal selection criteria, and at least 2 b-tagged jets with the leading (sub-leading) jet satisfying the requirement of p T > 70 (30) GeV. We also remove the ∆R(ℓ, jet), ∆φ(ℓ, b) and forward jet requirements to enrich the sample with tt events.
The top quark p T spectrum from the tt simulation is known to be mismodeled and is reweighted using the empirical function described in Ref. [44]. After this correction, the data at large values of all relevant kinematic distributions are consistent with the simulation within systematic uncertainties. Distributions of S T and the invariant mass of the bW system in the tt sample are shown in Fig. 2.
The W+jets-enriched control sample selection is identical to the signal event selection except that events with b-tagged jets are vetoed. We observe that the W+jets simulation overestimates the number of events at large jet p T compared with the data. We derive a correction for the W+jets simulation as a function of the H T variable, defined as the scalar sum of the transverse momenta of all jets with p T > 30 GeV. The data to simulation ratio of the H T distribution is well described by a 2-parameter linear fit with a negative slope. A correction to the modeling of the W+jets H T spectrum is made using the results of the fit. After the correction is applied, good agreement in the modeling of all kinematic variables is observed. Distributions of S T and the invariant mass of the bW system in the W+jets sample are shown in Fig. 3.
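The H T correction described above can be illustrated with a straight-line fit to the data to simulation ratio, whose result is then used as a per-event weight. The sketch below uses an unweighted least-squares fit and made-up ratio values; the actual fit would presumably weight each bin by its uncertainty, and the fitted parameters of the analysis are not given in the text.

```python
def fit_line(x, y):
    """Unweighted least-squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def ht_weight(ht, intercept, slope):
    """Event weight applied to simulated W+jets events at a given H_T."""
    return intercept + slope * ht


# Toy data/simulation ratios in four H_T bins (bin centers in GeV).
ht_centers = [200.0, 400.0, 600.0, 800.0]
ratios = [1.05, 0.95, 0.85, 0.75]
intercept, slope = fit_line(ht_centers, ratios)
weight_at_1tev = ht_weight(1000.0, intercept, slope)
```

With these toy inputs the fitted slope is negative, so simulated events at large H T are weighted down, which is the qualitative behavior the text describes.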

Systematic uncertainties
We divide the systematic uncertainties into two categories: uncertainties that impact only the rate of background and signal predictions, and uncertainties that affect both the rate and the shape of the fitted M inv spectra. The shape uncertainties affecting the M inv distribution are modeled by varying the nuisance parameters that characterize the associated systematic effects up and down by one standard deviation. To account for the MC mismodeling correction in the W+jets sample, we derive a two-sided uncertainty band using the H T correction procedure. To account for the MC mismodeling correction in the tt sample, we derive a two-sided uncertainty band using the top p T reweighting procedure. One side of the band is obtained by removing the correction, and the other side is obtained by applying the procedure twice. The uncertainties due to these corrections increase with the rise of the top quark p T and H T , which leads to the widening of the uncertainty band at large S T and M inv , as can be seen in Figs. 2 and 3.
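The two-sided band for the mismodeling corrections can be illustrated as follows: for a per-event correction weight w, removing the correction corresponds to a weight of 1, and applying the procedure twice corresponds to a weight of w squared. The sketch below fills the three resulting histograms; the function name, bin edges, and toy inputs are illustrative assumptions.

```python
def band_histograms(values, weights, edges):
    """Fill histograms for the removed, nominal, and doubled correction.

    `values` are per-event observable values (e.g. M_inv in GeV),
    `weights` are the nominal correction weights, and `edges` are
    the bin boundaries. Events outside the range are ignored.
    """
    nbins = len(edges) - 1
    hists = {
        "removed": [0.0] * nbins,  # correction not applied (weight 1)
        "nominal": [0.0] * nbins,  # correction applied once (weight w)
        "twice": [0.0] * nbins,    # correction applied twice (weight w^2)
    }
    for v, w in zip(values, weights):
        for i in range(nbins):
            if edges[i] <= v < edges[i + 1]:
                hists["removed"][i] += 1.0
                hists["nominal"][i] += w
                hists["twice"][i] += w * w
                break
    return hists


# Toy events (made-up values): the correction is larger at high mass.
band = band_histograms(
    values=[500.0, 900.0, 950.0],
    weights=[1.0, 0.8, 0.5],
    edges=[0.0, 800.0, 1600.0],
)
```

Because the toy weights depart further from unity at high mass, the spread between the three histograms grows there, mirroring the widening of the band at large S T and M inv noted in the text.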
In addition, the reconstruction efficiency of forward jets has been observed to be larger in the simulation than in the data. The efficiency as a function of η is corrected to match the data using the W+jets-enriched sample with 0 b-tagged jets, and validated using the tt-enriched sample with two b-tagged jets. A uniform rate uncertainty of ±15% is assigned to cover the forward jet mismodeling in simulation.
Trigger and lepton identification efficiencies in simulation are corrected as functions of lepton p T and η using decays of Z bosons to leptons in data. The associated uncertainty of about 2% is the statistical uncertainty in the data.
The shape uncertainties include uncertainties in the jet energy scale, jet energy resolution, b tagging efficiency, pileup, PDFs, as well as factorization and renormalization scales. These uncertainties are treated as uncorrelated.
The uncertainty related to the modeling of pileup is evaluated by varying the inelastic cross section by ±5% relative to the nominal value of 69 mb [51]. Uncertainties in renormalization and factorization scales are taken into account by varying both scales simultaneously up and down by a factor of two. Uncertainties arising from the choice of PDFs are taken into account according to the PDF4LHC procedure [52].
The systematic uncertainties are summarized in Table 1.

Limit calculation and results
Good agreement between the event yields in the data and in the SM prediction is observed within uncertainties, as shown in Table 2. The sum of the SM backgrounds and a hypothesized signal for the combined electron and muon channels is fitted to the observed spectrum of M inv . The fit uses a binned likelihood method, where the binning of the distributions is chosen in such a way that the statistical uncertainty in the MC estimation of total background per bin is always less than 20%. Contributions from the SM processes are allowed to float independently within their systematic uncertainties, using log-normal priors [53, 54]. The nuisance parameters describing the shape uncertainties are constrained using Gaussian priors. The shapes of the M inv distributions for backgrounds and signal are parametrized and varied according to the nuisance parameters. The post-fit M inv distribution, with the shape and background normalizations corresponding to the maximum likelihood values, is presented in Fig. 4.
Upper limits at 95% confidence level (CL) on the production cross section of the Y/T → bW process are computed using a Bayesian approach [55], where the likelihood is marginalized with respect to the nuisance parameters representing systematic uncertainties. The expected limit is calculated by resampling the data from the background distribution. The 95% CL expected and observed upper limits are listed in Table 3 and shown in Fig. 5. The observed limits at high VLQ mass reflect a 2σ deficit of events above 1.0 TeV in the M inv distribution. The limits are derived assuming a narrow width for the VLQ. The VLQ width is proportional to the square of the coupling, and is negligible compared to the experimental resolution for couplings below 0.5, for the range of VLQ masses considered in this paper. In the framework of the model considered, Y quarks with a coupling of 0.5 and B(Y → bW) = 100% are excluded in the mass range from 0.85 to 1.40 TeV.
This result may be compared with the expected region of excluded masses, which extends up to 1.0 TeV. In the case of T quarks with a coupling of 0.5, the theoretical cross section, the selection efficiency and the M inv distribution are the same as those for the production and decay of Y quarks, but the expected decay branching fraction B(T → bW) is 50%, only half that expected for B(Y → bW). Thus mass exclusion limits similar to those achieved for the Y quark would only be obtained for B(T → bW) = 100%.

Summary
The mass of the heavy quark candidate is reconstructed by forming the invariant mass of the leading b-tagged jet, electron or muon, and missing transverse momentum in the event, and a fit to the invariant mass spectrum is performed. No evidence of an excess due to new physics is observed. Upper limits at 95% CL are set on the cross sections for single production of vector-like Y and T quarks in the mass range from 0.70 to 1.80 TeV. In the framework of the model considered, Y quarks with a coupling of 0.5 and B(Y → bW) = 100% are excluded in the mass range from 0.85 to 1.40 TeV. This result may be compared with the expected region of excluded masses, which extends up to 1.0 TeV. These results represent the most stringent limits to date on the single production of a vector-like Y quark. In the case of T quarks with a coupling of 0.5, the theoretical cross section, the selection efficiency and the M inv distribution are the same as those for the production and decay of Y quarks, but the expected decay branching fraction B(T → bW) is 50%, only half that expected for B(Y → bW). Thus mass exclusion limits similar to those achieved for the Y quark would only be obtained for B(T → bW) = 100%.

Acknowledgments
We congratulate our colleagues in the CERN accelerator departments for the excellent performance of the LHC and thank the technical and administrative staffs at CERN and at other CMS institutes for their contributions to the success of the CMS effort. In addition, we gratefully acknowledge the computing centers and personnel of the Worldwide LHC Computing Grid for delivering so effectively the computing infrastructure essential to our analyses. Finally, we acknowledge the enduring support for the construction and operation of the LHC and the CMS detector provided by the following funding agencies: BMWFW and FWF (Aus

References
[13] CMS Collaboration, "Search for single production of a heavy vector-like T quark decaying to a Higgs boson and a top quark with a lepton and jets in the final state", Phys.
[32] CMS Collaboration, "Particle-flow event reconstruction in CMS and performance for jets, taus, and E miss T ", CMS Physics Analysis Summary CMS-PAS-PFT-09-001, 2009.
[33] CMS Collaboration, "Commissioning of the particle-flow event reconstruction with the first LHC collisions recorded in the CMS detector", CMS Physics Analysis Summary CMS-PAS-PFT-10-001, 2010.
[47] CMS Collaboration, "Measurement of the inclusive cross section of single top-quark production in the t-channel at 13 TeV", Phys. Lett. B, in proofs, arXiv:1610.00678.