Search for long-lived heavy neutral leptons decaying in the CMS muon detectors in proton-proton collisions at √ s = 13 TeV

A search for heavy neutral leptons (HNLs) decaying in the CMS muon system is presented. A data sample is used corresponding to an integrated luminosity of 138 fb − 1 of proton-proton collisions at √ s = 13 TeV, recorded at the CERN LHC in 2016–2018. Decay products of long-lived HNLs could interact with the shielding materials in the CMS muon system and create hadronic and electromagnetic showers detected in the muon chambers. This distinctive signature provides a unique handle to search for HNLs with masses below 4 GeV and proper decay lengths of the order of meters. The signature is sensitive to HNL couplings to all three generations of leptons. Candidate events are required to contain a prompt electron or muon originating from a vertex on the beam axis and a displaced shower in the muon chambers. No significant deviations from the standard model background expectation are observed. In the electron (muon) channel, the most stringent limits to date are set for HNLs in the mass range of 2.1–3.0 (1.9–3.3) GeV, reaching mixing matrix element squared values as low as 8.6 ( 4.6 ) × 10 − 6 .


Introduction
The observation of neutrino oscillations [1][2][3] provides experimental evidence for nonzero neutrino masses [4].Cosmological considerations [5,6] and direct measurements [7] imply that the neutrino masses are much smaller than both the masses of other standard model (SM) fermions and the vacuum expectation value of the Higgs field, hinting at a different mechanism for generating masses of neutrinos compared with that for the masses of charged fermions.One possible explanation of the nonzero neutrino masses is the existence of heavy neutral leptons (HNLs) with right-handed chirality, giving rise to the gauge-invariant mass terms for the SM neutrinos through the "seesaw" mechanism [8][9][10][11][12][13][14][15].The HNLs are singlets with respect to the SM gauge groups and therefore do not interact with SM particles through the electroweak or strong interactions, but can be produced through mixing with the SM electron, muon, and τ neutrinos [16][17][18][19][20].There can be either distinct HNL particles and antiparticles [(pseudo) Dirac type], or the HNL can be its own antiparticle [Majorana type].An HNL is characterized by its mass, m N , and its mixing matrix elements V Nℓ , with ℓ = e, µ, τ, respectively.The HNL lifetime is inversely proportional to m 5 N |V Nℓ | 2 [21], and HNLs can be macroscopically long-lived for sufficiently low values of mixing matrix elements.Models with HNLs are well motivated because they can explain the baryon asymmetry of the universe through CP violation in the HNL system [22,23], provide a dark matter candidate [24], and explain the observed anomalous magnetic moment of the muon [25,26].
In this paper, a search for Dirac and Majorana HNLs with mean proper decay lengths (cτ 0 ) in the range of 0.1-10 m is performed.A data sample collected by the CMS experiment in 2016-2018 and corresponding to an integrated luminosity of 138 fb −1 is used.An HNL that decays via a virtual W boson can result in a final state with two charged leptons and a neutrino, or one lepton and two quarks.Figure 1 shows the Feynman diagram for the production of an HNL via W boson decay, where the prompt lepton from the W boson decay originates from a vertex on the beam axis and serves as a clean signature for triggering.We search for HNL decays occurring within the muon detector system.The hadrons and electrons from the HNL decays, including those resulting from an intermediate τ lepton, could produce a particle shower as they interact with the steel shielding material of the muon detectors.This process results in the striking signature of a localized high-multiplicity cluster of muon detector hits, referred to as a muon detector shower (MDS) object.Because of the macroscopic distance from the primary interaction point to the muon detectors and its large volume of geometric acceptance, this signature is sensitive to a large class of models involving long-lived particles (LLPs).It is uniquely sensitive to HNLs with cτ 0 in the range 0.1-10 m, extending the search sensitivity to unprecedentedly low mixing parameter values in the m N range of 1-3 GeV.We follow an analysis strategy for MDS objects that is similar to one used by CMS in Ref. [42].
Figure 1: Feynman diagrams for the production of a Majorana HNL N M (left) and a Dirac HNL N D (right) via a W − boson decay and through its mixing with an SM neutrino of the same flavor.The prompt lepton from the W − boson serves as a clean signature for triggering, whereas the decay products of the HNL are reconstructed as a cluster of muon detector hits.
The outline of the paper is as follows: a brief description of the CMS detector is given in Section 2. The data set and the simulated samples of events are detailed in Section 3, followed by a description of the event selection and categorization in Section 4. A description of the backgrounds and the method used to estimate them is given in Section 5. Sources of systematic uncertainty affecting the expected numbers of signal and background events are discussed in Section 6.The results and their interpretations are presented in Section 7, followed by a summary of the paper in Section 8.The numerical results of this paper can be found in its HEPData record [43].

The CMS detector
The central feature of the CMS detector is a superconducting solenoid of 6 m internal diameter, providing a magnetic field of 3.8 T. Within the solenoid volume are a silicon pixel and strip tracker, a lead tungstate crystal electromagnetic calorimeter (ECAL), and a brass and scintillator hadron calorimeter (HCAL), each composed of a barrel and two endcap sections.The ECAL consists of 75,848 lead tungstate crystals, which cover |η| < 1.48 in the barrel region and 1.48 < |η| < 3.00 in the two endcap regions.The HCAL is composed of cells of width 0.087 in η and azimuth (ϕ, in radians) for the region |η| < 1.74, progressively increasing to a maximum of 0.174 for larger values of |η|, along with the forward calorimeters extending the η coverage provided by the barrel and endcap detectors.
Muons are measured in the pseudorapidity range of |η| < 2.4, with detection planes embedded in the steel flux-return yoke outside the solenoid and made using three technologies: drift tubes (DTs), cathode strip chambers (CSCs), and resistive-plate chambers (RPCs).
The barrel DT detectors cover |η| < 1.2 and are organized into four layers ("stations"), labeled MB1 to MB4.The stations are located approximately 4, 5, 6, and 7 m in radial distance (r) from the interaction point.Each station is a concentric ring of chambers, interleaved with the layers of the steel flux-return yoke of the solenoid and is divided along the beamline axis (z) into five wheels.In the first three stations, every DT chamber consists of three "superlayers" (SLs), each comprising four staggered layers of parallel DT cells, for a total of 12 layers.The innermost and outermost SLs measure the hit coordinate in the rϕ plane, and the central SL measures in the z direction, along the beamline.The fourth station (MB4) contains only two SLs measuring the hit position in the rϕ plane.The DT cells are designed to provide a uniform electric field, such that the position of a traversing charged particle can be inferred from the measured arrival time and the constant drift velocity.Individual hits have a resolution of about 600 µm which can be improved to about 260 µm by incorporating information from hits in the same SL [44].
The CSC detectors cover 0.9 < |η| < 2.4 and comprise four stations in each endcap, labeled ME1 to ME4.Each station is a ring of chambers interleaved between two layers of steel-flux return yoke at approximately the same value of z.The stations are located approximately 7, 8, 9.5, and 10.5 m from the interaction point along the beamline axis on both ends of the detector.Each chamber consists of six layers containing cathode strips along the radial direction and anode wires perpendicular to the central strip.Position and timing measurements of traversing charged particles are extracted from the electrical signals on the anode wires and the cathode strips in each chamber, for which the typical resolutions are 400-500 µm and 5 ns.
The RPC detectors are placed alongside the DT and CSC detectors and are primarily designed to provide timing information for the muon trigger.
Events of interest are selected using a two-tiered trigger system.The first level, composed of custom hardware processors, uses information from the calorimeters and muon detectors to select events at a rate of around 100 kHz within a fixed latency of about 4 µs [45].The second level, known as the high-level trigger, consists of a farm of processors running a version of the full event reconstruction software optimized for fast processing, and reduces the event rate to around 1 kHz before data storage [46].A more detailed description of the CMS detector, together with a definition of the coordinate system used and the relevant kinematic variables, is reported in Ref. [47].

Simulated samples
Simulated samples of Drell-Yan produced Z+jets are used to validate the MDS object reconstruction efficiency.They are produced using POWHEG 2.0 [48][49][50] and normalized to next-tonext-to-leading-order (NNLO) precision [51,52].Signal samples for HNL production are generated at leading order (LO) with the Monte Carlo (MC) generator MADGRAPH5 aMC@NLO 2.3.3 [53][54][55][56].This matrix element level calculation includes only the W-channel production of HNLs with up to two additional partons.The simulated events are interfaced with PYTHIA 8.226 [57] to simulate the parton shower and hadronization of partons, and the underlying event description.
The branching fraction of W boson decays into an HNL plus a lepton is proportional to the mixing parameter |V Nℓ | 2 .The signatures considered in this search are sensitive to mixing with ν e , ν µ , or ν τ since the decay products of the HNL in the muon detector system are reconstructed as particle showers.The final states consist of one prompt lepton (e or µ) from the W boson decay and HNL decay products.To account for higher order effects, the cross sections of the signal samples are scaled to the NNLO W boson production cross section, which is 20 510 ± 770 pb [58] including the branching fraction to one lepton flavor, multiplied by the square of the HNL mixing matrix elements.The branching ratios to all three lepton generations are assumed to be the same.To obtain a more accurate description of the boost of the HNL signal, W-boson-p T -dependent corrections based on an MC sample generated with DYTURBO1.3.2 at next-to-next-to-next-to-leading-logarithm and next-to-NNLO accuracy [59,60] are computed and applied as an event-by-event weight.
Signals corresponding to both Dirac and Majorana HNLs are considered in this search, for cτ 0 values in the range of 0.1-10 m.Lepton number violating (LNV) decays are only possible for Majorana HNLs, whereas lepton number conserving (LNC) decay channels are available for both Majorana and Dirac HNLs.This distinction implies that the widths of the Dirac HNL are exactly half of that of the Majorana HNLs for the same m N and mixing matrix; therefore, the lifetime of the Dirac HNLs will be twice the Majorana ones [56].In this search, the charged lepton from the HNL decay is either not detected or its charge is not measured, resulting in identical detector response between LNV and LNC events.Therefore, we can predict the signal yields of a simulated Majorana sample as those from a Dirac HNL sample with twice the lifetime.Generated events are processed through a simulation of the detector geometry and response using GEANT4 [61].The same reconstruction software is applied to both data and simulated events.Simulated events include an expected distribution of the number of additional pp interactions within the same or nearby bunch crossings (pileup).All generated events are weighted such that the distribution of the number of collisions per bunch crossing matches the one observed in data, with an average of approximately 23 (32) interactions per bunch crossing [62][63][64] in 2016 (2017-2018).The underlying Pythia event tune CUETP8M1 [65] (CP5 [66]) and NNPDF3.0[67] (3.1 [68]) parton distribution functions are used to simulate all samples for the 2016 (2017-2018) data taking period.
The HNL signal samples are generated for m N in the range from 1 to 4 GeV and cτ 0 in the range from 0.1 to 10 m.Since only a discrete set of lifetimes was generated, the signal predictions for intermediate lifetimes were estimated by performing the reweighting procedure described below.The simulated HNLs are assumed to couple exclusively to one of the three SM neutrino families.The signal production cross sections and lifetimes are determined by the values of m N and |V Nℓ | 2 , which in turn determine the signal acceptance and reconstruction efficiency.Thus, for a fixed value of m N , a simple cross section rescaling is not sufficient to correctly reproduce the behavior of other HNLs with the same m N but different |V Nℓ | 2 .To emulate an HNL signal sample with a specific value of |V Nℓ | 2 (and thus cτ 0 ), we thus apply per-event weights to the events, such that the HNL lifetime distribution (taken before the parton shower and detector simulation) matches the predicted distribution for the chosen |V Nℓ | 2 value.For signal scenarios with m N between the masses of the simulated samples, we estimate the signal yield by interpolating between the predictions from the simulated samples at nearby m N using the fact that the acceptance of the muon system differs only through the Lorentz boost factor.

Event reconstruction and selection
The objects in each event are reconstructed using the particle-flow (PF) algorithm [69], which aims to identify each individual particle in an event as an electron, photon, muon, charged or neutral hadron, with an optimized combination of information from the various elements of the CMS detector.The resulting particles are referred to as PF candidates.Each electron is identified as a charged-particle track that extrapolates to an ECAL energy cluster and any nearby ECAL clusters that are consistent with possible bremsstrahlung photons [70].The electron momentum is estimated by combining the energy measurement in the ECAL with the momentum measurement in the tracker.Muons are identified as tracks in the central tracker consistent with either tracks or several hits in the muon system, and associated with calorimeter deposits compatible with the muon hypothesis [71].Photons are identified as ECAL energy clusters not linked to the extrapolation of any charged-particle trajectory to the ECAL [72].The energies of photons are obtained from the ECAL measurement.Charged hadrons are identified as charged-particle tracks neither identified as electrons, nor as muons.The energy of each charged hadron is determined from a combination of the track momentum and the corresponding ECAL and HCAL energies, corrected for the response function of the calorimeters to hadronic showers.Finally, a neutral hadron is identified as HCAL energy clusters not linked to any charged-hadron trajectory, or as a combined ECAL and HCAL energy excess with respect to the expected charged-hadron energy deposit.The energy of each neutral hadron is obtained from the corresponding corrected ECAL and HCAL energies.
To identify the electrons and muons resulting from the decays of the W bosons, we impose additional selection requirements to enhance the signal purities of these objects.For each electron candidate, requirements are imposed on the shape of the electromagnetic shower in the ECAL, the quality of the matching between the track trajectory and the ECAL shower, and isolation from additional particles near the candidate.A tight working point [70] for this electron identification is used, which has an average efficiency of 70% and a misidentification rate of about 2%.We consider electron candidates with transverse momenta p T > 30 GeV and |η| < 2.5; a more stringent requirement of p T > 35 GeV is imposed for the data collected in 2017-2018 to match the increase in the p T threshold of the single electron trigger from 27 to 32 GeV.For muon candidates, requirements are imposed on the quality of the track in the silicon tracker, the global compatibility of the hits in the silicon tracker and muon detector comprising the muon candidate, and isolation from additional particles near the candidate.The tight working point [71] is used, which has an efficiency above 95% and a misidentification rate of about 0.1% for hadrons.We consider muon candidates with p T > 25 GeV and |η| < 2.4; a more stringent requirement of p T > 28 GeV is imposed for the data collected in 2017 to match the increase in the p T threshold of the single-muon trigger from 24 to 27 GeV.
In this search, jets are used to veto muon detector shower cluster objects to suppress the punchthrough jet background.Jets are reconstructed by clustering PF candidates using the anti-k T algorithm [73,74] with a distance parameter ∆R = √ (∆η) 2 + (∆ϕ) 2 = 0.4.Jets are required to have p T > 10 GeV and |η| < 3.0.Pileup can contribute additional tracks and calorimetric energy depositions to the jet momentum.To mitigate this effect, charged particles identified to be originating from pileup vertices are discarded and an offset correction is applied to account for remaining contributions [75].Jet energy corrections are derived from simulation studies so that the average measured energy of jets becomes identical to that of particle-level jets [76].
The missing transverse momentum vector ⃗ p miss T is computed as the negative vector p T sum of all the PF candidates in an event, and its magnitude is denoted as p miss T [77].The jet energy corrections are propagated into the computation of p miss T .The primary vertex (PV) is taken to be the vertex corresponding to the hardest scattering in the event, evaluated using tracking information alone, as described in Section 9.4.1 of Ref. [78].
We select data events triggered by the single-electron or single-muon triggers and require the events to have exactly one prompt electron or prompt muon candidate, respectively, satisfying the identification and isolation criteria described above.To further suppress the background from SM events composed uniquely of jets produced through the strong interaction, referred to as quantum chromodynamics (QCD) multijet events, we require p miss T > 30 GeV.The above requirements are designed to select events with an HNL candidate produced in a W boson decay.Because the constituents of MDS clusters are not included in the p miss T calculation, typical HNL signal events would satisfy this p miss T selection.

Muon detector shower clusters
For LLPs that decay within or just prior to the DT and CSC muon detectors, the material in the steel-flux return yoke structure can induce a particle shower, creating a geometrically localized and isolated cluster of detector hits.In CSC chambers, the detector hits are reconstructed by combining the signal pulses from the anode wires and the cathode strips, forming a point on a two-dimensional plane in each chamber layer.In DT chambers, the detector hits are reconstructed from the signal pulses from the anode wires at the center of the DT cells.Since the DT hits only provide measurement in either the ϕ or z dimension, the DT hit position is assumed to be at the center of each DT chamber in the orthogonal direction.Together with the position of the chambers, each CSC and DT hit is assigned a three-dimensional coordinate in space.We cluster the CSC and DT hits based on their η and ϕ coordinates using the Density-Based Spatial Clustering of Applications with Noise (DBSCAN) algorithm [79], which is a commonly used density-based clustering algorithm that is robust against outliers.A distance parameter of 0.2 and a minimum number of hits per cluster (N hits ) of 50 is used in the DBSCAN algorithm.We choose a minimum of 50 hits to avoid misidentifying a minimum ionizing muon as a cluster object.A muon is expected to produce a maximum of 24 and 44 hits in the CSC and DT detectors, respectively.The centroid of the cluster is taken to be the geometric center of the hits in the cluster.These clusters, which we refer to as MDS objects, provide a powerful signature to distinguish LLP signal events from background events.
The efficiency for MDS objects to be reconstructed depends on whether the LLP decays primarily to hadrons, including the hadronic decays of τ leptons, or to electrons and photons.Typically, more cluster hits are produced by hadrons, resulting in a higher MDS reconstruction efficiency for LLPs that decay to hadrons as compared with LLPs that decay to electrons and photons.The reconstruction efficiency is about 80 (30-45)% for hadronic (electron and photon) decays, depending on whether the MDS is reconstructed in the CSCs or DTs.Muons that are produced from HNL decays do not produce a particle shower and are not reconstructed.Further details of the efficiency dependence on LLP decay location can be found in reference [42].
Energetic particles produced in jet hadronization or from secondary material interactions can produce MDS clusters by traversing the shielding material between the tracking volume and the muon detectors without being stopped.To suppress this process, known as the punchthrough jet background, we veto any MDS clusters in the DT (CSC) detectors whose centroid lies within ∆R = 0.4 of a jet with p T > 20 (10) GeV and |η| < 3.0.To suppress clusters produced by muon bremsstrahlung, we impose a tighter veto on any MDS clusters in the DT (CSC) detectors whose centroid lies within ∆R = 0.8 of a muon candidate with p T > 10 (20) GeV.The p T thresholds for the jet and muon vetoes were chosen to optimize the signal-to-background discrimination and differ between the DT and CSC detectors because of the different amounts of shielding present.Because MDS cluster hits do not enter the ⃗ p miss T calculation, the MDS cluster tends to align with the ⃗ p miss T and the momentum of the LLP in signal events.We leverage this feature to validate the background estimation method discussed in Section 5. Muons with trajectories pointing toward two regions of the detector near the cavern chimneys have highly reduced reconstruction efficiencies because these regions are occupied by cables and other support structures resulting in a smaller detector coverage.As a result, the rate of background MDS clusters produced by muon bremsstrahlung and passing the muon veto is significantly increased.To suppress these backgrounds, we veto any clusters whose centroid is within ∆R = 0.3 of the following two locations in η and ϕ coordinates: (η = 0.3, ϕ = 1.70) or (η = −0.3,ϕ = 1.15), corresponding to the chimney locations.
The rate of MDS clusters caused by punch-through jets is significantly larger for clusters with hits in the muon detector station closest to the interaction point because of the reduced amount of shielding material.Furthermore, any hits observed in the stations between the interaction point and the station containing the majority of the cluster hits are indicative of a punchthrough jet because signal LLPs are neutral and therefore do not produce any hits before they decay.To further suppress such punch-through jets, we veto any MDS clusters in the CSCs whose centroid lies within ∆R = 0.4 of any hits observed in the two innermost rings of the ME1 station (ME1/1 and ME1/2), or in the RPCs located immediately next to ME1/2, or any segments in the MB1 station.Segments are required in the MB1 station instead of single hits because of the larger rate of noise in the DT detectors, which would produce a larger signal inefficiency for such a veto.For MDS clusters in the DTs, we veto any cluster whose centroid lies within ∆R = 0.5 of two or more hits in the MB1 station.We also require no more than 8 MB1 hits in the adjacent wheels within ∆ϕ < π/4.As a result of the above vetoes, only clusters located beyond the station closest to the interaction point are accepted.
After the jet and muon vetoes, the dominant background consists of clusters from pileup, including interactions not in the same bunch crossing as the PV, known as out-of-time (OOT) pileup.To eliminate the background clusters from OOT pileup, we reconstruct a characteristic time for the clusters in both the CSC and DT detectors.For the CSC clusters, the cluster time is reconstructed from the mean time of the hits comprising the cluster, where the time of each hit is calibrated to result in a distribution centered at zero for the triggering bunch crossing.Note that the adjacent bunch crossings occur at ±25 ns with this definition.We require that the CSC cluster time (t CSC cluster ) is between -5.0 and 12.5 ns and that the root-mean-square spread of the timestamps of hits comprising the cluster (t spread ) is less than 20 ns.Because hits in the DT detectors do not have a time measurement that is independent of their position measurement, we require that DT clusters coincide with at least one hit in an RPC detector in the same wheel and within ∆ϕ < 0.5 of the DT cluster centroid.The time measurements of each matching RPC hit are matched to discrete bunch crossing times.The most frequently occurring bunch crossing time among the matching RPC hits is defined as the DT cluster time (t DT cluster ).The DT cluster time is required to coincide with the bunch crossing corresponding to the PV.
Finally, additional identification requirements, developed in Ref. [42], are imposed on CSC clusters.They are required to satisfy successively more central |η| requirements as the number of CSC stations containing hits (N stations ) and the distance between the station and the primary interaction point decrease.The |η| requirements are • |η| < 1.8 if N stations = 1 and the cluster is in station 4, • |η| < 1.6 if N stations = 1 and the cluster is in station 2 or station 3, and This CSC cluster identification algorithm has ∼60% efficiency and suppresses the background by a factor of 8.
For a certain period of data taking, anomalous detector noise in specific components of the DT detectors produced a high rate of background MDS clusters.To suppress this noise-induced background, we veto MDS clusters comprising hits detected in those particular DT chambers taken during the affected time period.The amount of data rejected by this veto corresponds to less than 0.1% of the total integrated luminosity.
After applying all the cluster vetoes, we require the events to contain at least one CSC or DT cluster with N hits > 50.The minimum number of hits is chosen to exceed the number of hits that a muon is expected to create in either CSC or DT detectors.The presence of an MDS cluster passing the associated vetoes and identification criteria suppresses SM background by a factor exceeding 10 7 , whereas typical signal efficiencies are 25-35%.

Background estimation
After the event selections described in Section 4, two types of background events remain, in which the MDS cluster can be muon-induced or non-muon-induced.The non-muon-induced background involves a hard scattering process that produces a prompt lepton, and low momentum hadrons from pileup or an underlying event generate an MDS cluster.The dominant component of this background comes from W production, and subdominant contributions arise from QCD, tt, or diboson production.The muon-induced background comes from Z → µµ events, in which one of the two muons from the Z boson undergoes bremsstrahlung in the muon detector and produces an MDS cluster back to back with the other prompt muon, mimicking the configuration of a signal event.For the non-muon-induced background, which is present in both the prompt-muon and prompt-electron categories, we use an "ABCD" (matrix) method, which requires two variables that discriminate between signal and background and are independent of one another for the background.For the Z → µµ background, which is present only in the prompt-muon categories, the background estimation is derived from dedicated control regions in data.
In the ABCD method, we select the two independent variables to be (i) the azimuthal angle between the prompt lepton and the cluster centroid (∆ϕ lep ) and (ii) N hits .For non-muon-induced events, ∆ϕ lep is uniformly distributed and is independent of N hits , because the cluster and lepton are produced from two independent processes.Figure 2 shows the shapes of the N hits and ∆ϕ lep distribution for signal and background.
Two separate requirements, one on each variable, partition the two-dimensional space into four bins, A, B, C, and D, as illustrated in Fig. 3.The bin boundaries are optimized for the best expected search sensitivity, separately for each of the search categories.Since the HNL is primarily produced back-to-back with the prompt lepton, the bin with the best signal-tobackground ratio is bin D defined as ∆ϕ lep > 2.8 and N hits > 150 or 200.Similarly, bin C is defined as ∆ϕ lep > 2.8 and N hits ≤ 150 or 200, bin A is defined as ∆ϕ lep < 2.8 and N hits > 150 or 200, and bin B containing the least signal is defined as ∆ϕ lep < 2.8 and N hits ≤ 150 or 200.The responses of the DT and CSC detectors to shower particles are generally different with CSC signal clusters having a larger hit multiplicity compared with DT signal clusters.The optimal bin boundary is found to be higher (200) for the CSC signal regions (SRs) compared with the DT SRs (150), with a similar signal efficiency of about 0.01% for an HNL with a cτ 0 of 1 m.Because of the independence of the two variables, the expected background event rate in the signal-enriched bin D can be related to the other three bins by λ D = (λ A λ C )/λ B , where λ X is the expected background event rate (i.e. the Poisson mean) in bin X.To account for a potential signal contribution to bins A, B, and C, a binned maximum likelihood fit is performed simultaneously in the four bins, with a common signal strength parameter scaling the signal yields in each bin.The background component of the fit is constrained to obey the ABCD relationship.
Because of the different background composition, we separate events with a selected prompt electron and a selected prompt muon into two disjoint categories.We also separate events with clusters in the CSC and DT detectors.Finally, for events in the category with a prompt muon and an MDS cluster in the DT detectors, we separate events into subcategories, which we call the DT-MB2 and DT-MB3/MB4, depending on whether the majority of the cluster's hits fall in the MB2 station or the MB3 or MB4 station of the DT detector.This categorization is motivated by the fact that the background differs for clusters in the MB2 station because of the thinner shielding in front of it.The additional subcategories for DT clusters are advantageous for the prompt muon channel because of its ability to suppress the additional Z → µµ background as  detailed below.This results in a total of five SRs.
The background estimation procedure is validated using events in the early OOT validation region (VR), defined as events passing all analysis selections except those related to the cluster time; instead, a negative cluster time is required.Additionally, the background estimation procedure is also checked for the in-time VR, defined as events with the azimuthal angle between the MDS cluster centroid position and the ⃗ p miss T : ∆ϕ(cluster, ⃗ p miss T ) > 0.7.This VR is signal depleted because the MDS cluster and ⃗ p miss T tend to align, as described in Section 4.1.The observed and predicted yields for these VRs are summarized in Table 1 and show that the ABCD method results are consistent with observed background yields.Similar consistencies are observed when the validations are performed in both in-time and OOT VRs with looser N hits requirements than in the SRs.The ABCD background predictions for bin D of the SRs are listed in Table 2 for each of the event categories considered in this search.The event yields for the bins A, B, and C are shown, as well as the prediction for the background in the signal-enriched bin D. In the prompt-muon event categories, Z → µµ events constitute a significant background source that is not predicted by the ABCD method.The muon veto normally suppresses such backgrounds, but in rare cases, the muon may not be reconstructed in the muon system due to instrumental effects such as gaps between chambers.These nonreconstructed muons will lead to p miss T in the same direction as the cluster, and thus these events are not present in the VRs listed in Table 1.
We define a Z → µµ control region (CR) by inverting the MB1, or ME1/1 or ME1/2 hit veto requirements for MDS clusters in the DT or CSC detectors, respectively.After that, bin D will be dominated by the Z → µµ background.The Z → µµ expected background yield in bin D of the Z → µµ CR is calculated as λ CR Z→µµ,D = N CR D − λ CR ABCD bkg,D , where N CR D is the observed data yield in bin D and λ CR ABCD bkg,D is the ABCD method prediction.We extrapolate λ CR Z→µµ,D to the SR by applying a transfer factor ζ, which estimates the pass-to-fail efficiency ratio for the MB1, ME1/1, and ME1/2 veto requirements for MDS clusters produced by muon bremsstrahlung.
The factor ζ is measured in a sample enhanced in MDS clusters produced by muon bremsstrahlung obtained by selecting dileptonic decays of tt pairs with an electron and muon in the final state.We select events with: (i) one prompt electron matched to an electron trigger object; (ii) no reconstructed muon; (iii) one MDS cluster passing all the veto selections except the MB1, ME1/1, or ME1/2 vetoes; and (iv) exactly two additional jets with p T > 20 GeV, |η| < 2.4 that pass the medium working point of the combined secondary vertex (CSV) btagging algorithm [80,81], which has an efficiency of 60% and a mistag rate of 1% for lightflavor or gluon jets.We also require that the MDS cluster is geometrically separated by ∆R > 0.8 from the two jets that satisfy the CSV algorithm requirement to ensure that the MDS cluster is not produced by a punch-through jet.These requirements result in an event sample that is pure in dileptonic tt events, in which both the electron and jets cannot produce the MDS cluster, thus ensuring that the MDS cluster is produced by the unreconstructed muon.We measure the ζ factors separately for MDS clusters in the DT and CSC detectors.We observed a linear dependence of ζ on N hits in the MB2 category.To account for this dependence, we perform a linear fit to the data in the CR and evaluate the fitted function at N hits = 150 as the ζ for the MB2 category.Finally, the expected Z → µµ background contribution to bin D of the SR is calculated as λ SR Z→µµ,D = ζλ CR Z→µµ,D .Details of the Z → µµ background prediction are summarized in Table 3.
To validate the Z → µµ background estimation method, we define another VR as a subset of the MB2 SR with N hits ≤ 120 which is expected to have a negligible signal contribution.In this VR, we define the bin boundaries for the ABCD method to be at 2.8 in ∆ϕ lep and at 110 in N hits .We perform the prediction of the Z → µµ and non-muon-induced background events using the same method applied to the full SR and obtain predictions of 2.7 ± 1.6 for the Z → µµ background and 9.9 ± 1.3 for the other background, for a total background prediction of 12.6 ± 2.1 events.We observe an event yield of 12, which is consistent with our background predictions.
Table 3: Summary of the Z → µµ background estimate in different categories.The first three columns show the estimates in the Z → µµ enriched control region of the total background and its Z → µµ and non-muon-induced components.The fourth column shows the transfer factors ζ used to predict the Z → µµ background in the signal region, shown in the fifth column.

Systematic uncertainties
The dominant systematic uncertainties in this search are those in the background predictions.
The main source of uncertainty for the ABCD method arises from the statistical uncertainties of the background-enriched bins A, B, and C.This uncertainty accounts for 27% and 22% (40% and 18%) of the size of the total background in the electron and muon categories, respectively, for MDS clusters in the DT (CSC) detector.
For the prompt-muon categories, an uncertainty in the estimate of the Z → µµ background also contributes significantly to the total uncertainty.The uncertainty for that background prediction arises from the statistical uncertainty of the ζ measurement, accounting for about 72%, 2%, and 15% of the size of the total background in the muon channel for DT-MB2, DT-MB3/MB4, and CSC clusters, respectively.
Systematic uncertainties in the signal yield include both theoretical and instrumentation effects.The theoretical uncertainty in the inclusive cross section of W boson production is 3.8%, which is dominated by the PDF uncertainty.The uncertainty in the W boson p T distribution is estimated by varying the renormalization and factorization scales by a factor of two separately and coherently, and evaluating the size of the envelope in the resulting signal yield, which is found to be 1.6%.The uncertainty of parton shower modeling is estimated similarly by varying the renormalization scales for parton showers and is found to be 4%.
The accuracy of the CMS simulation and the GEANT4 software in describing the details of the evolution of electromagnetic and hadronic showers has already been validated extensively in past measurements.However, the accuracy of the CMS simulation implementation of the response of the muon detectors in an environment with a large multiplicity of secondary particles, which affects the simulated reconstruction efficiencies in the SR, has not been explicitly verified.This aspect is validated by comparing clusters produced in Z → µµ data and simulated events, in which one of the muons undergoes bremsstrahlung in the muon detectors and the associated photon produces an electromagnetic shower.With this comparison, we derive the uncertainties in the cluster reconstruction efficiency for both CSC and DT clusters in simulation; they account for 16% and 13% of the signal yield for the DT and CSC categories, respectively.These uncertainties apply to both electromagnetic and hadronic showers.Figure 4 illustrates this validation of the cluster simulation.Data-to-simulation corrections on other cluster properties are also derived with the same method, and the uncertainties in the corrections are propagated as systematic uncertainties.For DT clusters, a correction of 6.8% is applied for the MB1 veto efficiency, with an uncertainty of 7.4%.For CSC clusters, corrections are applied to account for the hit and segment vetoes (2.8%), muon veto (6.8%), and jet veto efficiencies (2.1%), with uncertainties of 0.1%, 4.5%, and 0.06%, respectively.Additionally, the uncertainties on efficiencies of the CSC cluster identification, time, and time spread requirements are estimated to be 5%, 0.9%, and 2.8%, respectively.In data, only the reconstructed hits that have at least two cathode hits in different CSC layers and match a given predefined pattern are read out.In contrast, this readout condition is assumed to be satisfied in the signal simulation, which could lead to an overestimation of the N hits in signal clusters.We estimated this effect by excluding those chambers with less than 6 hits in signal clusters, and assigning the change in signal efficiency (1%) as an uncertainty.
We also propagate additional systematic uncertainties that have a minor impact on the signal yield prediction.They include uncertainties due to pileup, integrated luminosity, jet energy scale, and prompt electron or muon trigger and selection efficiencies.The integrated luminosities for the 2016, 2017, and 2018 data-taking years have 1.2-2.5% individual uncertainties [62][63][64], while the overall uncertainty for the 2016-2018 period is 1.6%.Finally, the uncertainties on the signal yields from the limited numbers of simulated events are in the ranges 5-10% depending on m N and lifetime.Theoretical and experimental uncertainties that are not related to the clusters are treated as fully correlated across different event categories.Experimental uncertainties related to the clusters are treated as fully uncorrelated.
A full list of systematic uncertainties affecting the predicted signal yield is shown in Table 4.

Results and interpretation
Figure 5 shows the expected and observed number of events in the SRs of the different event categories and the corresponding background predictions.The observed yields agree with the predicted background in all channels.
No excess of data events over the background prediction is observed, and upper limits on the HNL production cross sections are evaluated using the CL s criterion [82,83], with the binned profile likelihood ratio [84] as the test statistic.The likelihood is constructed as the product of Poisson distributions with mean values based on the predicted event rates across all five event categories.The predicted event rates in each bin include the background yields predicted with the ABCD method plus the Z → µµ background contribution for bin D and the signal yields obtained from simulated events.Systematic uncertainties in the predicted event rates are incorporated in the likelihood as nuisance parameters with log-normal constraints.The asymptotic formulas [85] are used to evaluate the exclusion limits for each HNL signal scenario.The exclusion limits derived from the asymptotic formulas are consistent within 10% with those based on pseudo-experiments.Due to the lower p T thresholds for the prompt muons, the signal yields in the muon channel are larger than those of the electron channels for the same m N and lifetime.However, the presence of the Z → µµ background reduces the overall sensitivity of the muon channel to a level comparable to the electron channel.For both the muon and electron channels, the CSC category contributes the majority of the overall sensitivity.This is because the CSC category has smaller background rates compared with the DT category, hence retaining more signal events at the optimal thresholds for background rejection. .The lower acceptance in the short lifetime branch is compensated by an increased cross section, and thus achieves a signal yield comparable with the long lifetime branch.Below m N of 2.0 (1.5) GeV for electron or muon (τ) type HNLs, we do not calculate a limit in the short lifetime branch because the signal acceptance in the muon system approaches zero and renders the calculation inaccurate.Furthermore, in that region other experimental results [27,28,30] place more stringent limits.HNL masses in the range 2.1-3.0 (1.9-3.3)GeV for electron (muon) neutrino mixing parameters [43].Above 3.0 (3.3) GeV, previous CMS results [38] place tighter limits on electron (muon) mixing parameters.HNL.The τ neutrino mixing limit is obtained by combining the results from the electron and muon channels.For these limit calculations, the HNL is assumed to mix with a single lepton flavor state only.The differences between the expected and observed limits on |V Nµ | 2 are not visible in this figure .reconstruction efficiencies of the prompt lepton.The limits at f e : f µ : f τ = 0: 1 2 : 1 2 and 1 3 : 1 3 : 1 3 probe parameter space consistent with the constraints from neutrino oscillation data in the minimal seesaw scenarios [86].

Summary
A search for long-lived Dirac or Majorana heavy neutral leptons (HNLs) has been performed using proton-proton collision data at √ s = 13 TeV, corresponding to an integrated luminosity of 138 fb −1 .The search targets events with one prompt electron or muon and a muon detector shower (MDS) that would result from HNL decays occurring in the CMS muon detector.The presence of the MDS signature along with the associated vetoes and identification criteria suppresses the standard model background by a factor exceeding 10 7 , while maintaining typical signal efficiencies of 25-35%.No significant excess over the standard model background is observed.The results are interpreted as 95% confidence level limits on the HNL mixing matrix elements squared |V Ne | 2 , |V Nµ | 2 , and |V Nτ | 2 .We also present limits on the HNL mass and mean proper decay length as a function of the mixing matrix element squared fractions to the three lepton generations.The most stringent limits to date for HNLs in the mass range of 2.1-3.0 (1.9-3.

Figure 2 :
Figure 2: Distribution of N hits (upper) and ∆ϕ lep (lower) for DT clusters (left) and CSC clusters (right).Signal distributions of a Majorana HNL with m N = 2 GeV and cτ 0 = 1 m are compared with the OOT background distributions selected with t DT cluster matched to bunch crossings earlier than the PV for DT clusters and t CSC cluster < −12.5 ns for CSC clusters.The centroids of the clusters in the signal events are required to be within ∆R = 0.4 of the HNL's direction.The distributions are normalized to unit area.The shapes of the distributions shown are similar for the electron and muon channels.

Figure 3 :
Figure 3: Definition of the ABCD plane.The area of the blue squares illustrates the relative amount of expected events in each of the bins, with bins B and C having the majority of the event yields.Bin D is the signal region.

Figure 4 :
Figure4: Comparison of N hits distributions for events with muons from Z → µµ between data and simulation, for CSC clusters (left) and DT clusters (right), using data collected in 2017 and the simulation of the corresponding data-taking conditions.The data sample is selected by requiring a two-muon invariant mass consistent with a Z boson and one of the muons is matched to an MDS cluster.Data-to-simulation correction factors are applied to the Z → µµ simulation.Only statistical uncertainties are included in the figure.Distributions made using data collected in 2016 and 2018 are found to have similar shape as the 2017 data sample.

Figure 6
Figure 6 shows the expected and observed upper limits at 95% confidence level (CL) on the HNL mixing parameters |V Ne | 2 , |V Nµ | 2 , and |V Nτ | 2 , as functions of m N for the Majorana and Dirac HNL interpretations.For |V Ne | 2 (|V Nµ | 2), only events passing the electron (muon) selections are used to evaluate the limits.For |V Nτ | 2 , both the events passing the electron and muon selections are used, because the prompt τ lepton can decay into an electron or muon.The limits of |V Nℓ | 2 versus m N feature an upper branch (short lifetime) and a lower branch (long lifetime).The lower acceptance in the short lifetime branch is compensated by an increased cross section, and thus achieves a signal yield comparable with the long lifetime branch.Below m N of 2.0 (1.5) GeV for electron or muon (τ) type HNLs, we do not calculate a limit in the short lifetime branch because the signal acceptance in the muon system approaches zero and renders the calculation inaccurate.Furthermore, in that region other experimental results[27,28,30] place more stringent limits.Table 5 summarizes the observed limits on |V Nℓ | 2 for Majorana and Dirac type HNL of this search.This result sets the most stringent limits to date in |V Nℓ | 2 for

Figure 5 :
Figure 5: The expected and observed number of events in the signal region (bin D) of different event categories.Signal yields of a 2 GeV Majorana HNL with the mean proper decay length of 1 m are added to the expected background.

1 × 10 − 4 - 5 . 9 × 10 − 3
Signal yields for mixing among several different lepton flavors are obtained by reweighting the single flavor signal yields with the same m N and cτ 0 .In this case, the signal yields for multiple neutrino species mixing are proportional to |V Nℓ | 2 .The ratio of |V Nℓ | 2 to the sum of the mixing matrix elements squared is defined asf ℓ = |V Nℓ | 2 /(|V Ne | 2 + |V Nµ | 2 + |V Nτ | 2 ), and sums to unity by construction.For a fixed value of m N (cτ 0 ), we scan the three f ℓ parameters and exclude a range in the cτ 0 (m N ) parameter for each set of f ℓ values.On the left (right) of Fig.7, we show the largest values of m N (cτ 0 ) that are excluded as a function of the f ℓ parameters.The sensitivity across the f ℓ plane varies mainly because of the different trigger and 138 fb 1(13 TeV)

Figure 6 :
Figure 6: Expected and observed upper 95% CL limits on |V Ne | 2 (upper), |V Nµ | 2 (middle) and|V Nτ | 2 (lower) as functions of the HNL mass (m N ) for a Majorana (left) and Dirac (right) type HNL.The τ neutrino mixing limit is obtained by combining the results from the electron and muon channels.For these limit calculations, the HNL is assumed to mix with a single lepton flavor state only.The differences between the expected and observed limits on |V Nµ | 2 are not visible in this figure.

Figure 7 :
Figure 7: The largest values of the Majorana (upper) and Dirac (lower) HNL mass (left) and mean proper decay length (right) parameters that are excluded at 95% CL are shown as a function of the mixing matrix elements squared ratios f ℓ with the three lepton generations, considering a mean proper decay length of 1 m and a fixed mass of 1.5 GeV, respectively.

3 )
GeV are set, reaching squared mixing matrix element values as low as 8.6 (4.6) × 10 −6 in the electron (muon) channel.NextGenerationEU program (Italy); the Latvian Council of Science; the Ministry of Education and Science, project no.2022/WK/14, and the National Science Center, contracts Opus 2021/41/B/ST2/01369 and 2021/43/B/ST2/01552 (Poland); the Fundac ¸ão para a Ciência e a Tecnologia, grant CEECIND/01334/2018 (Portugal); the National Priorities Research Program by Qatar National Research Fund; MCIN/AEI/10.13039/501100011033,ERDF "a way of making Europe", and the Programa Estatal de Fomento de la Investigaci ón Científica y Técnica de Excelencia María de Maeztu, grant MDM-2017-0765 and Programa Severo Ochoa del Principado de Asturias (Spain); the Chulalongkorn Academic into Its 2nd Century Project Advancement Project, and the National Science, Research and Innovation Fund via the Program Management Unit for Human Resources & Institutional Development, Research and Innovation, grant B37G660013 (Thailand); the Kavli Foundation; the Nvidia Corporation; the SuperMicro Corporation; the Welch Foundation, contract C-1845; and the Weston Havens Foundation (USA).

Table 1 :
Validation of the ABCD method in the OOT and in-time validation regions.The predictions of the method for the signal bin (last column) are consistent with the observed number of events, shown in the second-to-last column.

Table 2 :
The event yields in the bins A, B, and C are shown in each of the event categories considered in the search, as well as the prefit prediction for the ABCD background in the signalenriched bin D.

Table 4 :
Summary of systematic uncertainties affecting the signal yield prediction.For DT clusters, the systematic uncertainties due to jet and muon vetoes are found to be negligible and are omitted.The uncertainties are reported relative to their impact on the predicted signal yield.
Table 5 summarizes the observed limits on |V Nℓ | 2 for Majorana and Dirac type HNL of this search.This result sets the most stringent limits to date in |V Nℓ | 2 for

Table 5 :
Excluded ranges of |V Nℓ | 2 for Majorana and Dirac type HNLs at select HNL masses.The chosen HNL masses are those at which the excluded values of |V Nℓ | 2 have the smallest magnitude.