Search for pair-produced vector-like leptons in final states with third-generation leptons and at least three b quark jets in proton-proton collisions at $\sqrt{s}$ = 13 TeV

The first search is presented for vector-like leptons (VLLs) in the context of the"4321 model", an ultraviolet-complete model with the potential to explain existing B physics measurements that are in tension with standard model predictions. The analyzed data, corresponding to an integrated luminosity of 96.5 fb$^{-1}$, were recorded in 2017 and 2018 with the CMS detector at the LHC in proton-proton collisions at $\sqrt{s}$ = 13 TeV. Final states with ${\geq}$ 3 b-tagged jets and two third-generation leptons ($\tau\tau$, $\tau\nu_\tau$, or $\nu_\tau\nu_\tau$) are considered. Upper limits are derived on the VLL production cross section in the VLL mass range 500-1050 GeV. The maximum likelihood fit prefers the presence of signal at the level of 2.8 standard deviations, for a representative VLL mass point of 600 GeV. As a consequence, the observed upper limits are approximately double the expected limits.

The 4321 model is an ultraviolet-complete model that extends the standard model (SM) gauge groups to a larger SU(4) × SU(3) ′ × SU(2) L × U(1) ′ group.It is motivated by recent measurements of b hadron decays that are in tension with the SM.This tension particularly concerns R(D) and R(D * ) measurements [8][9][10][11][12][13][14][15][16] that provide evidence for lepton flavour nonuniversality.The results of the search described here are the first search results for VLLs in the context of the 4321 model and are also expected to hold for other models with a Pati-Salam leptoquark explaining the B anomalies [17,18].Final states are probed that are not explored by existing searches for VLLs in models without leptoquarks [19].
The 4321 model gives a possible explanation for the aforementioned flavour-nonuniversal results, while simultaneously respecting many other measurements that are in agreement with SM expectations and lepton flavour universality [20][21][22][23].Here, we search for pair production of the lightest new particles in this model, the vector-like leptons.These leptons are referred to as vector-like because they are non-chiral, i.e. their left-and right-handed components have the same charges.
The VLLs come in electroweak doublets with one charged VLL, E, and one neutral VLL, N, whose masses are taken to be equal.A mass splitting between these two particles can arise from their couplings to the Higgs boson, but in the 4321 model the splitting is expected to be absent or extremely suppressed.A sizeable mass splitting between the charged and neutral VLL would affect the VLL decay modes; therefore a different search strategy would be required.Also, for the model to explain the B physics anomalies while remaining consistent with other measurements, the mass of the VLLs cannot be too large.In particular, requiring compatibility with the measured R(D) and R(D * ) anomalies and with measurements of B 0 s -B 0 s mixing suggests that the VLL mass cannot be more than a few TeV [2,24].
The VLLs may be produced via electroweak production through their couplings to the SM W and Z/γ bosons or via interactions with a new Z ′ boson introduced by the 4321 model.In this paper, we consider only electroweak production, and ignore potential contributions from the Z ′ boson; the expected size of the Z ′ contribution is dependent on the unknown Z ′ mass and coupling strength.Despite not explicitly considering this contribution, the results are also expected to hold when including Z ′ production.Examples of Feynman diagrams showing the electroweak pair production of VLLs, as well as diagrams of the VLL decays, are shown in Fig. 1.
Figure 1: Left side: example Feynman diagrams showing electroweak production of VLL pairs through s-channel bosons, as expected at the LHC.In these diagrams, L represents either the neutral VLL, N, or the charged VLL, E. Right side: vector-like lepton decays are mediated by a vector leptoquark, U.These decays are primarily to third-generation leptons and quarks.
The VLLs decay, via a virtual leptoquark, U, to two quarks and one lepton.In order to explain the B anomalies, the leptoquark is expected to couple most strongly to the third generation and therefore its decays are expected to be almost entirely to third-generation fermions.For each second-generation fermion, a one to two order of magnitude suppression in the branching fraction is expected, and even larger suppressions are expected for any first-generation fermion.
For this analysis, we have considered only the dominant decays, i.e. decays to third-generation fermions.
The analysis selection is driven by the flavour content of the VLL decay products.Given the expectation of two third-generation quarks in every VLL decay, we search for pairs of VLLs by selecting events with a high b jet multiplicity.These events are further categorized by the number of τ leptons.For each τ multiplicity, dedicated selections are made to divide the category into a signal-enriched "signal region" (SR) and one or multiple background-enriched "control regions" (CRs).Table 1 shows the τ multiplicity categories and the decay modes of the different VLL pairs that contribute to each category.While topologies with electrons or muons in the final state (coming from top quark or tau decays) are possible, we focus only on all-hadronic final states in this analysis.
A maximum likelihood fit is performed simultaneously across all τ multiplicity categories, including both SRs and CRs, to determine the signal strength for each VLL mass hypothesis.Within each region, differential distributions are used as input to the fit.The distributions are chosen to provide additional separation between the signal and backgrounds.In the fit, the data are also separated by the year in which it was collected (either 2017 or 2018), to ensure a good description of the data-taking and detector conditions in each case.
Machine learning techniques are used to build two deep neural network (DNN) classifiers [25][26][27] to separate signal events from background events.One of the classifiers, DNN QCD , is trained to distinguish signal from quantum chromodynamic (QCD) multijet events and is used to define the signal region in the 0-τ h category.The other DNN, DNN tt , is trained to discriminate between signal and tt events and its output distribution is used in the fit for the 1-τ h and 2-τ h categories.
Tabulated results for this analysis are provided in HEPData [28].

The CMS detector and object reconstruction
The central feature of the CMS apparatus is a superconducting solenoid of 6 m internal diameter, providing a magnetic field of 3.8 T. Within the solenoid volume are a silicon pixel and strip tracker, a lead tungstate crystal electromagnetic calorimeter (ECAL), and a brass and scintillator hadron calorimeter (HCAL), each composed of a barrel and two endcap sections.Forward calorimeters extend the pseudorapidity (η) coverage provided by the barrel and endcap detectors.Muons are measured in gas-ionization detectors embedded in the steel flux-return yoke outside the solenoid.A more detailed description of the CMS detector, together with a definition of the coordinate system used and the relevant kinematic variables, can be found in Ref. [29].
Events of interest are selected using a two-tiered trigger system.The first level, composed of custom hardware processors, uses information from the calorimeters and muon detectors to select events at a rate of around 100 kHz within a fixed latency of about 4 µs [30].The second level, known as the high-level trigger, consists of a farm of processors running a version of the full event reconstruction software optimized for fast processing, and reduces the event rate to around 1 kHz before data storage [31].The primary vertex is taken to be the vertex corresponding to the hardest scattering in the event, evaluated using tracking information alone, as described in Section 9.4.1 of Ref. [32].
Reconstruction is performed using the particle-flow algorithm [33], which aims to reconstruct and identify each individual particle in an event with an optimized combination of information from the various elements of the CMS detector.The energy of photons is obtained from the ECAL measurement.The energy of electrons is determined from a combination of the electron momentum at the primary interaction vertex as determined by the tracker, the energy of the corresponding ECAL cluster, and the energy sum of all bremsstrahlung photons spatially compatible with originating from the electron track.The energy of muons is obtained from the curvature of the corresponding track.The energy of charged hadrons is determined from a combination of their momentum measured in the tracker and the matching ECAL and HCAL energy deposits, corrected for the response function of the calorimeters to hadronic showers.Finally, the energy of neutral hadrons is obtained from the corresponding corrected ECAL and HCAL energies.
For each event, hadronic jets are clustered from the reconstructed particles using the infrared and collinear safe anti-k T algorithm [34,35] with a distance parameter of 0.4.Jet momentum is determined as the vectorial sum of all particle momenta in the jet, and is found from simulation to be, on average, within 5 to 10% of the true momentum over the whole transverse momentum (p T ) spectrum and detector acceptance.Additional pp interactions within the same or nearby bunch crossings can contribute additional tracks and calorimetric energy depositions, increasing the apparent jet momentum.To mitigate this effect, tracks identified to be originating from pileup vertices are discarded and an offset correction is applied to correct for remaining contributions [36].Jet energy corrections are derived from simulation studies so that the average measured energy of jets becomes identical to that of particle level jets.In situ measurements of the momentum balance in dijet, photon+jet, Z+jet, and multijet events are used to determine any residual differences between the jet energy scale in data and in simulation, and appropriate corrections are made [37].Additional selection criteria are applied to each jet to remove jets potentially dominated by instrumental effects or reconstruction failures.All jets considered in the analysis are required to have p T > 40 GeV and |η| < 2.5.
To identify jets originating from b quarks, b-tagging algorithms are applied to all reconstructed jets in an event.To identify b jets at the level of the trigger selections, the CSV v2 algorithm is used in the 2017 data sample and the DEEPCSV algorithm is used in the 2018 sample [38].In the offline selection and analysis, the DEEPJET algorithm is used to identify b jets [39,40] for both data-taking years.
Each of the tagging algorithms outputs a continuous discriminator value.By requiring this value to be above a given threshold ("working point"), a jet can be identified as a b-tagged jet.
The tagging efficiency for b jets is 79 (82)% for a 1% misidentification probability of light flavour quark and gluon jets in 2017 (2018); this is referred to as the medium working point (WP).These values, taken from simulation, are further corrected towards the data using dedicated measurements that correct the identification and misidentification probabilities as functions of the kinematical properties of the jets.In this analysis, the b tagging discriminant score is used as input to the analysis-specific machine learning discriminators, and therefore corrections are applied to the full simulated discriminant score distribution, rather than just to the efficiencies of the individual b-tagging WPs.
Hadronically decaying τ leptons (τ h ) are reconstructed from jets, using the "hadrons-plusstrips" algorithm described in Ref. [41], which combines 1-3 tracks with energy deposits in the calorimeters, to identify the τ decay modes.Modes with two reconstructed tracks target cases in which one of the tracks from a τ decay with three charged particles in the final state is not reconstructed.To take into account the effects of pair conversion and bremsstrahlung in the material in front of the ECAL, neutral pions are reconstructed by combining electron and photon candidates within a defined region in η-ϕ space, where ϕ is the azimuthal angle.The dimensions of this region are in the range 0.05-0.15 in η and 0.05-0.30 in ϕ , depending on the p T of the electron and photon candidates.All τ h candidates in the analysis are required to have p T > 20 GeV and |η| < 2.1.
To distinguish genuine τ h decays from jets originating from the hadronization of quarks or gluons, and from electrons or muons, the DEEPTAU algorithm is used [42].Information from all particles reconstructed near the τ h axis is combined with properties of the τ h candidate and the event.
Several different WPs of the DEEPTAU algorithm are used: the very loose WP (in the rest of the paper referred to simply as the "loose WP") , medium WP, and tight WP.The rate at which jets are misidentified as a τ h is given in Table 2, as estimated from simulated samples of W boson production in association with jets.The missing transverse momentum vector ⃗ p miss T is computed as the negative vector sum of the transverse momenta of all the particle-flow candidates in an event, and its magnitude is denoted as p miss T [43].The ⃗ p miss T is modified to account for corrections to the energy scale of the reconstructed jets in the event.

Data and simulated samples
The analysis is performed on a sample of pp collision data with √ s = 13 TeV, collected with the CMS detector in 2017-2018 and corresponding to a total integrated luminosity of 96.5 fb −1 [5][6][7].The 2017 and 2018 data sets benefit from the upgrade of the pixel tracking detector installed in the winter of 2016-2017 [44].The upgraded tracker significantly improves the b jet identification performance [45], which is crucial to the analysis and online event selection.The 2016 data set, collected prior to this upgrade, is therefore not considered.
The signal is modelled at leading order (LO) with MADGRAPH5 aMC@NLO [46] v2.6.5, using the universal FEYNRULES output interface [47] to implement the 4321 model.The matrix element calculations are interfaced to PYTHIA 8.2 [48], with the description of the underlying event is provided by the CP5 tune [49].The NNPDF 3.1 next-to-next-to-leading order (NNLO) global fit parton distribution function (PDF) set [50] is used.
Signal production is restricted to use only SM electroweak couplings.We studied the impact of contributions from Z' production and found that they do not alter our conclusions.In the simulated samples, the couplings of the leptoquark are set to 0 for all first-and second-generation SM fermions.The mass of the leptoquark is always taken to be 3.5 TeV and the masses of the two VLLs are always taken to be equal to each other.The simulated model parameters, including the leptoquark and Z' masses, are taken from benchmark scenarios proposed to explain the B anomalies [24].However, due to the much lighter mass scale of the VLLs, the results are mostly insensitive to the values of the boson masses.
We generate signal samples for 12 different VLL mass hypotheses, ranging from 500 to 1050 GeV, in 50 GeV increments.The widths of the VLLs are evaluated for each mass.The decay of the VLLs is performed within MADGRAPH5 aMC@NLO, however, the decays of any top quarks produced in the VLL decays are simulated with PYTHIA.
Simulated samples are also used for background estimation.Samples of tt events are simulated at next-to-leading order using POWHEG BOX [51][52][53] interfaced to the NNPDF3.1 NNLO PDF set [50].These samples are generated assuming a 5-flavour scheme, where b quarks are considered massless.The top quark mass is set to 172.5 GeV, and the hdamp parameter, which regulates emissions in POWHEG, is set to 237.8775 GeV.The inclusive tt production cross section is taken to be 831.76pb, as calculated at NNLO in QCD, including resummation of nextto-next-to-leading logarithmic soft gluon terms with TOP++ 2.0 [54][55][56][57][58][59][60].
Because the signal events are expected to produce four b quarks, the ttbb component of the tt simulation plays a particularly important role in the analysis.This component is taken from the tt simulation described above.Events are taken as part of the ttbb component if there is at least one jet containing a b hadron not coming from a top quark decay.Otherwise the event is categorized as ttjj.Recent measurements from both the ATLAS and CMS Collaborations indicate that the cross section of the ttbb component is underpredicted by the simulation [61][62][63][64].In order to account for this discrepancy, an overall scale factor of 1.2 is applied to the ttbb component.
Smaller backgrounds coming from tt production in association with a boson, denoted ttX, from vector boson production with additional associated jets, denoted V+jets, and from multiboson production, denoted VV(V), are also considered.The contributions from these backgrounds are taken from simulation.These processes are all simulated using MADGRAPH5 aMC@NLO for the matrix element calculation, which is interfaced with PYTHIA8 for showering and hadronization.
Samples of QCD multijet events are generated at LO using MADGRAPH5 aMC@NLO, merging two-, three-, and four-jet matrix elements, and interfaced to PYTHIA8 for showering and hadronization.These samples are only used for training the DNN QCD classifier; in the analysis, background contributions from QCD multijet processes are estimated using control samples in data.
For all simulated data, additional minimum bias pp collisions are simulated and superimposed on the primary simulated interaction.Simulation of the passage of particles through the detector material is performed by GEANT4 [65].

Event selection and categorization
All events considered in the analysis must satisfy a trigger that requires three b-tagged jets each with p T > 30 GeV and places additional requirements on the p T of the jets in the event.A further selection, after the trigger, is used in the analysis to avoid the region near the trigger thresholds, which may be sensitive to mismodelling.In order to pass this basic selection, an event is required to contain at least three jets passing the medium b tagging WP and at least four jets with p T greater than 80, 65, 50, and 50 GeV.In addition, the scalar sum of the p T of all jets in the event (H T ) must be greater than 400 GeV, and no isolated electrons (p T > 15 GeV, |η| < 2. 5) or muons (p T > 30 GeV, |η| < 2.5) must be present.
These events are then further categorized based on the τ h candidate multiplicity.Three categories are considered, with either 0, 1, or ≥ 2τ h candidates.Within each of these categories, SRs and CRs are defined.The 1-τ h and 2-τ h categories also make use of several determination regions, which are not themselves included in the signal extraction fit, but are used for the background estimation described in Section 6.A diagram summarizing all these regions is shown in Fig. 2.
The 0-τ h region is defined by requiring that there are no τ h candidates passing the loose, medium, or tight WP.No additional event selections beyond the basic ones described above are required.The primary background in this region is from QCD multijet events.This background in the SR is estimated using a technique that makes use of three background-enriched regions in data.The SR is defined by requiring p miss T > 160 GeV, as expected from the two neutrinos, as well as by making a selection on the DNN QCD (described in Section 5).The jet multiplicity distribution, N jets , is used for the final fit.
The 1-τ h region is defined by requiring that there is exactly one τ h candidate.In the SR, the τ h candidate must pass the tight τ h WP and there must be no other τ h candidates passing the medium τ h WP.In the CR, the τ h candidate must pass the loose τ h WP and there must be no τ h candidates passing the medium τ h WP, in order to avoid signal contamination.Additionally, in both the SR and the CR, the τ h candidates must have p T > 60 GeV, and the event must have p miss T > 60 GeV.In the 1-τ h region, significant backgrounds arise from both tt and QCD multijet events.Multijet events can pass the selection criteria when a jet has been misidentified as a τ h .The tt background has components both with and without genuine τ h s.The DNN tt classifier distribution, described in Section 5, is used for the final fit.
The 2-τ h region is defined by requiring that there are at least two τ h candidates.In our data set, no events with more than two τ h candidates are found.The SR requires that there are at least two τ h candidates passing the medium WP; the CR requires that there are at least two τ h candidates passing the loose WP and no τ h candidates passing the medium WP, in order to avoid signal contamination.Additionally, for all events in 2-τ h region, the τ h candidates must

Machine learning classifiers
Two DNN classifiers are trained to separate signal and background events.One of them, DNN QCD , is specifically trained to discriminate between signal and QCD multijet events.The other, DNN tt , is trained to distinguish signal from tt events.Both DNNs use the same underlying architecture, a graph neural network called ABCNET [26].In this implementation, individual final state objects are passed as inputs to the DNN, and are represented as nodes in the graph.The graph connects sets of k-nearest neighbour objects with edges, so that relationships between the objects are taken into account.In all cases, we use k = 8 and a Euclidean metric in the η-ϕ coordinates.
Both τ h candidates and jets are passed to ABCNET as inputs, with up to ten objects per event used as input.If more than ten objects are present in the event, all of the τ h candidates are passed to the network first, followed by the jets, ordered by p T , until the total object count has reached ten.
For each object, several kinematic features are passed to ABCNET.They are shown in Table 3.
For jets, the charge entry is always set to zero.For τ h candidates, the DEEPJET score is always set to zero, which corresponds to an extremely low probability of being a b jet.With this

Background estimation
The primary SM processes that pass the event selection and form backgrounds in the search are tt and QCD multijet production.Additional contributions from ttX, V+jets, and VV(V) production are also considered, but play smaller roles.The estimated contributions of these minor background processes are taken directly from simulation, and the V+jets and VV(V) contributions are found to be negligible.On the other hand, the QCD multijet contribution is estimated using control samples in data.
The expected contributions from tt events, including the ttbb component, are estimated mostly from simulation, as described in Section 3. The number of tt events with misidentified τ h candidates, i.e.where at least one identified τ h candidate does not come from a true hadronic τ lepton decay, is estimated from a combination of simulation and control samples in data.The estimation is performed by means of a "fake factor" method.The method uses a CR with a looser τ h WP to estimate the number of events in the SR with a tighter τ h WP.The ratio of the number of events in the two regions, called the fake factor (FF), is first measured in a dedicated determination region (DR) which is orthogonal to the primary analysis region.Then in the primary analysis region, the estimated number of events in the control region is multiplied by the fake factor to estimate the number of events in the SR: where N misid (X) is the number of events in region X with at least one misidentified τ h candidate.The FF and N misid are estimated in bins of the DNN tt score.
To estimate N misid in the DRs, the number of events in data with correctly identified τ h objects is estimated from simulation and subtracted from the total; the remaining component is taken to be N misid .In the primary analysis region, N misid (CR) is taken directly from simulation.An additional uncertainty is applied to this component to account for possible mismodelling.
The DRs are designed to be enriched in tt events.After passing a single-muon trigger, these events are required to pass the same selection as the main analysis, except that they must have at least two (not three) b-tagged jets.Events are categorized based on the number of τ h candidates as in the main analysis region.However, unlike in the main analysis regions, no additional selection criteria (on di-τ charge, τ h p T , or p miss T ) are applied when performing the τ h categorization in the DRs.These differences in selection are all designed to increase the number of events in these regions, as the primary limitation of our FF estimation is the statistical uncertainty.An additional correction and uncertainty, to account for these differences in the phase space between the determination and primary analysis region, is estimated from simulation.
The fake factor method is applied independently for the 1-τ h and 2-τ h regions, using the same methodology in both cases.
The QCD multijet background in the 1-τ h and 2-τ h regions is also determined using a fake factor method.Unlike the approach adopted for tt, where N misid (CR) is taken from simulation, this component is also taken from data for the QCD multijet background.The number of expected events for all other background processes is subtracted from the data to estimate N misid (CR).The signal contribution is ignored in the subtraction, since it is never expected to contribute more than O(1%) of the QCD multijet yield in the control region.The FF DR is constructed by requiring p miss T < 40 GeV.As for the tt fake factor, a correction is applied to account for the difference between the FFs in the DR and the primary analysis region.However, in the 2-τ h region, no simulated QCD multijet events pass the signal region selection criteria and therefore no correction is applied.
In the 0-τ h region, the QCD multijet contribution is estimated using an "ABCD" method.This method makes use of two observables, over which the background distribution is uncorrelated.Each observable is divided into two subregions, to form four regions in total.Then, from the fact that the two observables are uncorrelated, the background contribution in a signalenriched region is estimated from the background yields in three signal-depleted regions: where N QCD (X) represents the number of QCD multijet events in region X and is determined independently for each bin of the N jets distribution used in the fit.We use the p miss T and the DNN QCD classifier score as the two observables for the ABCD method.The regions are divided at DNN QCD = 0.6 and p miss T = 60 GeV, with the region DNN QCD > 0.6, p miss T > 60 GeV being the SR.The training of the DNN is designed to ensure that its output and the p miss T are uncorrelated [27], and this assumption was further tested both in simulated samples and in a dedicated validation region.

Uncertainties
The dominant uncertainty comes from the limited number of events in the 1-τ h and 2-τ h SRs.The statistical uncertainties are approximately 70% of the total uncertainty in the signal strength estimate.A number of other experimental and theoretical uncertainties are considered and included in the fit as nuisance parameters.In each case, the nuisance parameters alter one or more of the processes included in the fit.Except where otherwise noted, the nuisance parameters alter both the shape and normalization of the affected processes.
Uncertainties in the measurement of the energies and resolutions of jets and τ h candidates are estimated from the calibrations described in Section 2.
Uncertainties in the b-tagging efficiencies and misidentification rates are taken from dedicated measurements [38].The tagging rates and their uncertainties are determined by looking separately at events enriched in genuine b jets, using a tt selection, and events enriched in light flavour quark and gluon jets, using Z → e + e − and Z → µ + µ − events with additional jets.An iterative fit to these two regions is performed and the uncertainties are characterized by nine separate components, which account for different uncertainty sources such as the limited sample size and uncertainties in the flavour compositions of the regions.
Uncertainties in the identification and misidentification rates of τ h candidates are also considered.The identification rate uncertainties are taken from measurements using a set of Z → ττ enriched events, where one of the τ leptons decays into neutrinos and a muon [41].The muonically decaying τ lepton is used as a probe to measure the identification rate for the hadronically decaying τ lepton.For each τ h identification WP, there is one nuisance parameter that alters the τ h identification efficiency.
Three additional sets of uncertainties related to τ h misidentification are included.A 20% normalization uncertainty is applied to the contributions arising from misidentified τ h candidates in the CRs, i.e.where the loose τ h WP is used.To account for the misidentification uncertainties at tighter WPs, the uncertainties from the FF derivation are propagated to background components with misidentified τ h candidates in the signal regions, i.e.where the medium or tight τ h WP is used.Systematic uncertainties in the simulated samples used in the derivation of the FFs are small compared with the associated statistical uncertainties, and are neglected.Each bin in which FFs are derived is therefore assigned an independent uncertainty corresponding to the statistical uncertainty in the FF derivation, which is typically between 10% and 50%.There are 20 such uncertainties for the QCD multijet FFs and nine for the tt FFs.Finally, a set of uncertainties accounts for differences between the FFs in the determination regions and the main fit region.The uncertainty is taken to be the relative difference between the two FFs, as found in simulation, and is comparable to the statistical uncertainty in the FFs.For the 2-τ h QCD multijet FFs, where no simulated events pass the selection criteria, an 80% uncertainty is applied.These uncertainties are fully correlated within each region and across years.The total FF uncertainties are the second largest source of uncertainty in the measurement and are approximately 40% of the uncertainty in the signal strength estimate.
A simplified Barlow-Beeston approach [66], with a single nuisance parameter per bin of the distribution, is used to treat the statistical uncertainties due to the limited size of the simulated samples.
Other experimental uncertainties that have a smaller impact on the analysis include: uncertainties in the total delivered luminosity (normalization only), the uncertainty in the number of inelastic pp collisions in a single bunch crossing, uncertainties related to detector conditions, and uncertainties in the determination of the trigger efficiencies.
Theoretical uncertainties are assessed separately for the tt, ttX, and signal processes.Each of these uncertainty sources is treated as uncorrelated between these processes.For each process, a single PDF uncertainty is constructed by adding in quadrature the contributions from all of the Hessian eigenvectors of the process PDF set and its variations in the strong coupling (α S ) [67].Independent variations of the renormalization and factorization scales by factors of two are used to assess the uncertainty from the approximations used in the matrix element calculations; an envelope that describes the largest impacts of these variations is formed and encoded as a single nuisance parameter per process.Variations where the two scales differ by a factor of four are not included.Uncertainties in the parton shower simulation are assessed by changing the scale at which α S is evaluated in the shower by factors of two up and down.This is done independently for the initial state and final state radiation, creating two independent nuisance parameters for each process.An additional uncertainty related to the mismodelling of the top quark p T spectrum in tt events by next-to-leading order Monte Carlo generators is also applied to the tt sample.
An additional 20% normalization uncertainty is applied to the ttbb component.This accounts for the fact that the ttbb component is not as well determined as the overall tt cross section.

Fit setup and validation
The results of the analysis are extracted by performing a simultaneous binned maximum likelihood fit over both years and all signal and control regions.The SRs defined in Fig. 2 provide the main sensitivity to any potential signal, while the CRs are used to provide estimates of some of the important background processes from data.For the 0-τ h region, the distribution of the number of jets in the event (N jets ) is used in the fit, whereas in the 1-τ h and 2-τ h regions the distributions of the DNN tt scores are fitted.
The fit is performed separately for each mass hypothesis to extract any dependence of the result on the VLL mass.In each case, a single signal strength parameter that multiplies the VLL production cross section is used.Systematic uncertainties are treated as nuisance parameters in the fit.Nuisance parameters that affect the overall rate of a process are described by lognormal probability density functions, whereas nuisance parameters that affect the shapes of distributions are described by Gaussian probability density functions.Upper limits at the 95% confidence level (CL) on the signal strength are determined using the CL s criterion [68,69] with the profile likelihood ratio as the test statistic.In this procedure, we make use of the asymptotic approximation [70].The signal significance is calculated from the likelihoods of the background-only and signal-plus-background models using Wilks' theorem [71].
The analysis procedure and optimization is performed "blinded" (without looking at the data) and a number of validation tests are also performed on the fit setup without using the observed data in the main fit region.Fits are performed on pseudo-data generated based on the expected background model, and then applying Poisson-distributed smearing to the values in each bin.This test is performed in four configurations: without signal, and also including signal simulation with three different input cross sections.For each configuration, 500 pseudo-experiments are performed.The signal strength extracted from the fits is consistent with it being an unbiased estimator of the true signal strength, and the uncertainties quoted on the signal strengths are consistent with the expected frequentist coverage.
Fits are also performed on collision data in validation regions.These regions are chosen to be orthogonal to the primary analysis regions, and with almost no signal contamination.Separate validation regions are chosen for each of the 0-τ h , 1-τ h , and 2-τ h categories.In every case, the validation region is chosen to have exactly two b-tagged jets (as compared to ≥ 3 in the main analysis region).In the 1-τ h and 2-τ h categories, the validation regions also require that the event score from the DNN classifier separating signal from tt is less than 0.8, in order to ensure that there is no signal contamination.The results of these fits are consistent with the background-only hypothesis.

Results
Postfit distributions for the background-only and signal-plus-background fits are shown in Fig. 3.For the fits including signal, the 600 GeV VLL mass point is shown as an example.The signal fit shows consistency across the sensitive channels.
The observed data show excesses in the highest DNN tt bins for both the 1-τ h and 2-τ h channels, for both 2017 and 2018.The 0-τ h channel does not show any excess, which is expected given its lower sensitivity.The expected and observed limits are shown in Fig. 4. The DNN tt output, used in the 1-τ h and 2-τ h signal regions, is not very sensitive to the signal mass, and therefore the results are highly correlated across all mass points.The higher than expected limits are consistent with the existence of a signal.As a result, none of the mass range tested here (500-1050 GeV) is excluded at the 95% CL, and limits are set between 10 and 30 fb, depending on the VLL mass hypothesis.At the representative 600 GeV mass point, the signal-plus-background model is preferred over the background-only model at the level of 2.8 standard deviations (σ).The highest DNN tt bins of each region fit the data better when the signal hypothesis is included, especially for the 2018 2-τ h region, where the background-dominated part of the fit is also substantially improved.Nuisance parameters related to the fake factor estimates and the statistical uncertainties of the simulated events in high DNN tt bins are also pulled to lower likelihood values in the background-only fit, contributing to the significance.
A number of post-unblinding checks were performed to test the robustness of the result, reliability of the model, and consistency of the data with the signal hypothesis.As indicated by the significance in the combined fit, the signal interpretation is consistent across the channels.No excess is observed in the significantly less sensitive 0-τ h regions, whereas the excesses observed in the 1-τ h and 2-τ h regions are consistent with each other.The observed limits for each of the 1-τ h and 2-τ h channels, when fit individually, are above the expected limits.Taking the 600 GeV mass point as an example, the significances for these channels when fit individually range from 0.8 to 2.0σ and are all within 1σ of their expected significances for a signal strength corresponding to the best combined fit signal strength.
Several alternative analysis procedures were also tested and found to give similar results to those presented here.These included rejecting τ h candidates with only two reconstructed tracks, applying a reweighting to account for possible mismodelling of the tt H T spectrum, increasing the size of certain experimental uncertainties, and including an additional uncertainty related to the difference between data and simulation in the τ h FFs.All of these were found to give very similar results to the main result presented here.
These results have been obtained considering only electroweak production of VLL pairs.However, tests showed that the results hold when including Z ′ production of VLL pairs as well.

Summary
The first search for vector-like leptons (VLLs) in the context of the 4321 model has been presented, using proton-proton collision data collected with the CMS detector at √ s = 13 TeV, corresponding to an integrated luminosity of 96.5 fb −1 .The probed model consists of an extension of the standard model with an SU(4) × SU(3) ′ × SU(2) L × U(1) ′ gauge sector that can provide a combined explanation for multiple anomalies observed in b hadron decays, which point to lepton flavour nonuniversality.In the model, a leptoquark is predicted as the primary source of lepton flavour nonuniversality, while the ultraviolet-completion predicts additional vector-like fermion families.In particular, VLLs are investigated via their couplings to standard model fermions through leptoquark interactions, resulting in third-generation fermion signatures.Final states containing at least three b-tagged jets and varying τ lepton multiplicities are considered.To improve the search sensitivity, graph neural networks are trained to discriminate between classes of events using the kinematic relationships between particles in an environment with high jet multiplicity.
An excess of 2.8 standard deviations over the standard model background-only hypothesis was observed in the data at the representative VLL mass point of 600 GeV.As a result, no VLL masses are excluded at the 95% confidence level and limits on the product of the VLL pair production cross section and their branching ratio to third generation fermions are set between 10 and 30 fb, depending on the VLL mass hypothesis.

Figure 2 :
Figure2: Diagram of the event categorization and the various signal and control regions used in the analysis.The regions within the solid box are all used in the signal extraction fit.The regions in the dashed box are used to determine transfer factors from the control regions to the signal region for events with jets misidentified as τ h s.The selections are mutually exclusive between all regions, so that each event only enters a single region.For brevity, not all selection criteria are shown; the detailed selection criteria are described in the text.have opposite charge, and be separated by ∆R = √ (∆η) 2 + (∆ϕ) 2 of at least 0.5.The event is required to have p miss T > 40 GeV.In this region, tt is the dominant background and the QCD multijet contribution is highly suppressed.The same DNN tt used in the 1-τ h region is also used in the final fit of the 2-τ h region.

Figure 3 :
Figure 3: Postfit plots for 2017 (top two rows) and 2018 (bottom two rows)showing the fitted distributions in the signal region for the different τ h multiplicity channels (from left to right: 0-τ h , 1-τ h , 2-τ h ).The jet multiplicity is fit for the 0-τ h region, whereas the DNN tt score is fit for the 1-τ h and 2-τ h regions.The first and third row show the background-only fits and the second and fourth rows show the fit including the signal.

Figure 4 :
Figure 4: Expected and observed 95% CL upper limits on the product of the VLL pair production cross section and the branching fraction to third generation quarks and leptons, combining the 2017 and 2018 data and all τ h multiplicity channels.The theoretical prediction in the 4321 model for electroweak production of VLLs is also shown.
gie (IWT-Belgium); the F.R.S.-FNRS and FWO (Belgium) under the "Excellence of Science -EOS" -be.h project n. 30820817; the Beijing Municipal Science & Technology Commission, No. Z191100007219010; the Ministry of Education, Youth and Sports (MEYS) of the Czech Republic; the Hellenic Foundation for Research and Innovation (HFRI), Project Number 2288 (Greece); the Deutsche Forschungsgemeinschaft (DFG), under Germany's Excellence Strategy -EXC 2121 "Quantum Universe" -390833306, and under project number 400140256 -GRK2497; the Hungarian Academy of Sciences, the New National Excellence Program -ÚNKP, the NK-FIH research grants K 124845, K 124850, K 128713, K 128786, K 129058, K 131991, K 133046, K 138136, K 143460, K 143477, 2020-2.2.1-ED-2021-00181, and TKP2021-NKTA-64 (Hungary); the Council of Science and Industrial Research, India; the Latvian Council of Science; the Ministry of Education and Science, project no.2022/WK/14, and the National Science Center, contracts Opus 2021/41/B/ST2/01369 and 2021/43/B/ST2/01552 (Poland); the Fundac ¸ão para a Ciência e a Tecnologia, grant CEECIND/01334/2018 (Portugal); the National Priorities Research Program by Qatar National Research Fund; MCIN/AEI/10.13039/501100011033,ERDF "a way of making Europe", and the Programa Estatal de Fomento de la Investigaci ón Científica y Técnica de Excelencia María de Maeztu, grant MDM-2017-0765 and Programa Severo Ochoa del Principado de Asturias (Spain); the Chulalongkorn Academic into Its 2nd Century Project Advancement Project, and the National Science, Research and Innovation Fund via the Program Management Unit for Human Resources & Institutional Development, Research and Innovation, grant B05F650021 (Thailand); the Kavli Foundation; the Nvidia Corporation; the Su-perMicro Corporation; the Welch Foundation, contract C-1845; and the Weston Havens Foundation (USA).

Table 1 :
Illustrative contributions from different VLL production and decay modes to the 0-τ, 1-τ, and 2-τ signal regions.The decay products in parentheses represent the objects coming from the intermediate vector leptoquark, U, in the decay.In the table, no distinction is made between particles and antiparticles, the multiplicities of each decay mode are not shown, and the impacts of object misidentification are not considered.The charged and neutral VLLs are represented by E and N; j represents any quark other than t or b.

Table 2 :
Estimated τ h identification efficiencies and jet misidentification rates for the DEEPTAU algorithm used in the analysis.

Table 3 :
List of input features used during training of the ABCNET model for VLL classification.model is able to distinguish τ leptons from jets during training.Using the same model, two different trainings are performed to distinguish VLL events from QCD and tt events.An equal mixture of signal samples with different mass hypotheses is used in the DNN training.The simulated events used for training the DNN QCD classifier are required to pass the 0-τ h selection criteria, whereas those used for training the DNN tt classifier must have at least one τ h candidate passing the medium WP.