Search for a heavy composite Majorana neutrino in events with dilepton signatures from proton-proton collisions at $\sqrt{s}$ = 13 TeV

Results are presented of a search for a heavy Majorana neutrino N$_\ell$ decaying into two same-flavor leptons $\ell$ (electrons or muons) and a quark-pair jet. A model is considered in which the N$_\ell$ is an excited neutrino in a compositeness scenario. The analysis is performed using a sample of proton-proton collisions at $\sqrt{s}$ = 13 TeV recorded by the CMS experiment at the CERN LHC, corresponding to an integrated luminosity of 138 fb$^{-1}$. The data are found to be in agreement with the standard model prediction. For the process in which the N$_\ell$ is produced in association with a lepton, followed by the decay of the N$_\ell$ to a same-flavor lepton and a quark pair, an upper limit at 95% confidence level on the product of the cross section and branching fraction is obtained as a function of the N$_\ell$ mass $m_{\text{N}_\ell}$ and the compositeness scale $\Lambda$. For this model the data exclude the existence of N$_\text{e}$ (N$_\mu$) for $m_{\text{N}_\ell}$ below 6.0 (6.1) TeV, at the limit where $m_{\text{N}_\ell}$ is equal to $\Lambda$. For $m_{\text{N}_\ell}$ $\approx$ 1 TeV, values of $\Lambda$ less than 20 (23) TeV are excluded. These results represent a considerable improvement in sensitivity, covering a larger parameter space than previous searches in pp collisions at 13 TeV.


Introduction
The standard model (SM) of particle physics is an extremely successful theory that has been extensively verified against experimental results. Nevertheless, there are several fundamental aspects of particle phenomenology that are not explained within the SM. One of these is the appearance of three generations of leptons and quarks, regarded as fundamental fermions in the SM, and the related question of the mass hierarchy across the generations. A possible solution to these issues is offered by composite-fermion models [1][2][3][4][5][6][7][8][9][10], in which the quarks and leptons have substructure.
In the composite-fermion scenario, quarks and leptons are assumed to have an internal substructure that would manifest itself at some sufficiently high energy scale Λ, the compositeness scale. This scale plays the role of an expansion parameter with which a series of higher-dimensional operators are constructed in an effective field theory (EFT) framework. The fermions of the SM are considered as bound states of some not-yet-observed fundamental constituents, generically referred to as preons [2]. Two model-independent features [8,9,11,12] are experimentally relevant: excited states of quarks and leptons with masses lower than or equal to Λ, and gauge or contact effective interactions (GI or CI) between the ordinary fermions and these excited states. The gauge interaction involves both fermion and gauge boson fields, and, at the lowest order in the EFT expansion, is described by dimension-five operators. Conversely, the contact interaction involves only fermion fields, with corresponding operators of dimension six.
A particular case of such excited states is a heavy composite Majorana neutrino (N$_\ell$, $\ell$ = e, µ, τ) [13][14][15][16], a neutral lepton having a mass above the electroweak energy scale. The introduction of an N$_\ell$ is well motivated as an explanation of the baryon asymmetry in the universe. Indeed, in the framework of baryogenesis via leptogenesis [17,18], heavy Majorana fermions are the source of the matter-antimatter asymmetry in CP-violating decays in the early universe, and it has been proposed [19,20] that N$_\ell$'s could quantitatively account for the observed asymmetry. Such composite Majorana neutrinos would also lead to observable effects in neutrinoless double beta decay experiments [14,16].
As a general phenomenological framework we consider the composite neutrino model given in Ref. [21], in which the GI and CI enter into both the production and decay of N$_\ell$'s and are governed, respectively, by the effective Lagrangians

$$\mathcal{L}_{\text{GI}} = \frac{g f}{\sqrt{2}\Lambda}\, \overline{N} \sigma^{\mu\nu} (\partial_\mu W_\nu) P_{\text{L}}\, \ell + \text{h.c.}, \qquad (1)$$

$$\mathcal{L}_{\text{CI}} = \frac{g_*^2\, \eta}{\Lambda^2}\, \overline{q}' \gamma^\mu P_{\text{L}}\, q\; \overline{N} \gamma_\mu P_{\text{L}}\, \ell + \text{h.c.} \qquad (2)$$

Here $N$, $\ell$, $W$, and $q$ are the N$_\ell$, charged lepton, W boson, and quark fields, respectively, $P_{\text{L}}$ is the left-handed chirality projection operator, and $g$ is the SU(2)$_{\text{L}}$ gauge coupling. The effective coupling for contact interactions, $g_*^2$, takes the value 4π [21]. The factors $f$ and $\eta$ are additional couplings in the composite model; they are taken here to be unity, a choice that is commonly adopted in phenomenological studies and experimental analyses of composite-fermion models. The total amplitude for the production process is given by the coherent sum of the gauge and contact contributions, as shown in Fig. 1; the same holds for the decay modes shown in Fig. 2. The production cross section via contact interaction is dominant for a wide range of Λ values, including the ones to which this search is sensitive.
In this work, we consider a composite neutrino, produced in association with a charged lepton, that subsequently decays to a charged lepton and a pair of quarks, leading to the experimental signature $\ell\ell$qq. Because the N$_\ell$ is a Majorana lepton at the TeV scale, the expected signal is characterized by two leptons with high transverse momentum ($p_{\text{T}}$) that may be of the same or opposite charge sign, but are of the same flavor. We focus on the cases in which these leptons are both electrons or both muons, and the quark pair is detected as a wide jet. A shape-based analysis is performed, searching for evidence of a signal in the distribution of the invariant mass of the system comprising the two leptons and the quark-pair jet.
The data sample of proton-proton collisions at $\sqrt{s}$ = 13 TeV was recorded in 2016-2018 with the CMS detector at the CERN LHC, and corresponds to an integrated luminosity of 138 fb$^{-1}$. A previous search for N$_\ell$ was performed by CMS with a data sample corresponding to 2.3 fb$^{-1}$ at $\sqrt{s}$ = 13 TeV [22], and found agreement between the data and SM expectations. At 95% confidence level (CL), Majorana neutrino masses $m_{\text{N}}$ up to about 4.6 TeV were excluded for both the electron and muon channels. With the larger statistical power of the current data sample, the present search explores a wider range of the parameter space ($m_{\text{N}}$, Λ). We further expand the composite model with recent considerations on the scope of validity of the effective operators in Eqs. (1) and (2) as derived in Ref. [23]. The unitarity bounds on these operators are used as guidance to optimize the search and extend the analysis sensitivity to lower $m_{\text{N}}$ and higher Λ compared with the previous search. Tabulated results are provided in the HEPData record for this analysis [24].
More generally, excited states interacting with the SM sector have been extensively searched for at high-energy collider facilities. The current most stringent bounds come from the recent LHC experiments. Excited charged leptons (e$^*$, µ$^*$) have been searched for in the channel pp → $\ell\ell^*$ → $\ell\ell\gamma$ [25][26][27][28][29][30], where they would be produced via CI and then decay via GI, and in the channel pp → $\ell\ell^*$ → $\ell\ell$qq [30], where both production and decay proceed through CI.

The CMS detector
The central feature of the CMS apparatus is a superconducting solenoid, of 6 m internal diameter, providing a field of 3.8 T. Within the field volume, there are the inner tracker, the crystal electromagnetic calorimeter (ECAL), and the brass and scintillator hadron calorimeter (HCAL). The inner tracker is composed of a pixel detector and a silicon strip tracker, and measures charged-particle trajectories in the pseudorapidity range |η| < 2.5. The finely segmented ECAL consists of nearly 76 000 lead-tungstate crystals that provide coverage up to |η| = 3.0. The HCAL consists of a sampling calorimeter, which utilizes alternating layers of brass as an absorber and plastic scintillator as an active material, covering the range |η| < 3, and is extended to |η| < 5 by the forward hadron calorimeters. The muon system covers the region |η| < 2.4 and consists of up to four planes of gas ionization muon detectors installed outside the solenoid and sandwiched between the layers of the steel flux-return yoke. Events of interest are selected using a two-tiered trigger system. The first level, composed of custom hardware processors, uses information from the calorimeters and muon detectors to select events at a rate of around 100 kHz within a fixed latency of about 4 µs [31]. The second level, known as the high-level trigger, consists of a farm of processors running a version of the full event reconstruction software optimized for fast processing, and reduces the event rate to around 1 kHz before data storage [32]. A detailed description of the CMS detector can be found in Ref. [33].

Monte Carlo simulation
The signal and the SM backgrounds are simulated using the Monte Carlo (MC) method. The simulated samples for the signal are generated at leading order (LO) with CALCHEP v3.6 [34], using the NNPDF 3.0 LO parton distribution functions (PDFs) with the four-flavor scheme [35]. Samples are generated for Λ values from 4 to 20 TeV, and with $m_{\text{N}}$ values from 0.5 TeV to Λ, the maximum value consistent with the model.
The background processes simulated are top quark pair production (t$\bar{\text{t}}$), single top quark production (tW), the Drell-Yan (DY) process, W+jets, diboson production (WW, WZ, ZZ), t$\bar{\text{t}}$ with vector boson production (t$\bar{\text{t}}$V), and SM production of jets through the strong interaction described by quantum chromodynamics (QCD). The t$\bar{\text{t}}$ events are generated at next-to-leading order (NLO) with POWHEG v2.0 [36][37][38][39][40]. The POWHEG generator is also used to describe tW production at NLO. The DY, QCD, W+jets, and t$\bar{\text{t}}$V samples are generated at LO with MADGRAPH5_aMC@NLO v2.2.2 (v2.4.2) [41] for the 2016 (2017-2018) samples. The DY events are weighted by a $p_{\text{T}}$-dependent K factor, a function of the generator-level Z boson momentum $p_{\text{T}}$(Z). The K factor serves both to correct a mismodeling of the $p_{\text{T}}$(Z) distribution [42,43] and to account for higher-order effects in the QCD and EW perturbative expansions. It is a product of two terms: one obtained as described in Ref. [44] from Drell-Yan NLO samples, produced with MADGRAPH5_aMC@NLO with the FXFX matching scheme [45], while the other is extracted from theoretical calculations [42]. The diboson processes are generated with PYTHIA [46] at LO.

Event and object selection
Single-lepton triggers that require either an electron with $p_{\text{T}}$ > 115 GeV within |η| < 2.5 or a muon with $p_{\text{T}}$ > 50 GeV within |η| < 2.4 are used to select events in the eeqq and µµqq channels, respectively. The separate $p_{\text{T}}$ requirements reflect different trigger thresholds; these do not affect the relative signal sensitivity, as the signal is characterized by high-momentum leptons in the final state. The primary vertex (PV) of the event is taken to be the vertex corresponding to the hardest scattering in the event, evaluated using tracking information alone, as described in Section 9.4.1 of Ref. [59]. Electrons are reconstructed as superclusters in the ECAL associated with tracks in the tracking detector [60, 61]. Requirements on energy deposits in the calorimeter and the number of track measurements are imposed to distinguish electrons from charged pions, and electrons associated with the PV from those produced by photon conversions. Muons are reconstructed using the tracker and muon detectors. Quality requirements, based on the minimum number of measurements in the silicon tracker, pixel detector, and muon detectors, are applied to suppress backgrounds from hadron decays displaced from the PV and from hadron shower remnants that reach the muon system [62]. We require exactly two electrons, or two muons, that originate from the PV.
The $p_{\text{T}}$ of the leading (subleading) lepton is required to be higher than 150 (100) GeV. Isolation requirements are imposed to suppress backgrounds from jets that are misidentified as leptons or that contain leptons from heavy-flavor hadron decays. The isolation is defined as the $p_{\text{T}}$ sum of tracks within a cone around the candidate direction of size $\Delta R \equiv \sqrt{(\Delta\eta)^2 + (\Delta\phi)^2} = 0.3$, where φ is the azimuthal angle in radians. The momentum of the candidate is excluded from the sum. The isolation is required to be less than 3 (10)% of the candidate electron (muon) $p_{\text{T}}$.
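The cone-based isolation described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not the CMS reconstruction code: the dictionary layout, the `track_id`/`id` fields used to exclude the candidate's own track, and the function names are all hypothetical.

```python
import math

# Illustrative relative-isolation thresholds taken from the text:
# 3% of pT for electrons, 10% of pT for muons.
ISO_THRESHOLD = {"electron": 0.03, "muon": 0.10}

def delta_phi(phi1, phi2):
    """Azimuthal difference wrapped into (-pi, pi]."""
    dphi = (phi1 - phi2) % (2.0 * math.pi)
    if dphi > math.pi:
        dphi -= 2.0 * math.pi
    return dphi

def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance sqrt(deta^2 + dphi^2)."""
    return math.hypot(eta1 - eta2, delta_phi(phi1, phi2))

def relative_track_isolation(cand, tracks, cone=0.3):
    """Sum of track pT inside the cone, excluding the candidate's own
    track, divided by the candidate pT (hypothetical data layout)."""
    iso = sum(
        t["pt"]
        for t in tracks
        if t["id"] != cand["track_id"]
        and delta_r(cand["eta"], cand["phi"], t["eta"], t["phi"]) < cone
    )
    return iso / cand["pt"]

def passes_isolation(cand, tracks, flavor):
    """Apply the flavor-dependent relative isolation requirement."""
    return relative_track_isolation(cand, tracks) < ISO_THRESHOLD[flavor]
```

The wrap-around in `delta_phi` matters for candidates near φ = ±π, where a naive difference would place nearby tracks outside the cone.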
Jets are reconstructed using the anti-$k_{\text{T}}$ clustering scheme [63, 64] applied to the objects reconstructed with a particle-flow algorithm [65]. The latter combines information from all CMS subdetectors and reconstructs individual particles in the event (electrons, muons, photons, and neutral and charged hadrons). Jets are reconstructed with a distance parameter R = 0.8 and are referred to here as "large-radius jets", labeled by the symbol "J". This value of R is chosen to capture both final-state quarks as a single jet. The large-radius jets are required to have $p_{\text{T}}$ > 190 GeV, |η| < 2.4, and to be separated from leptons by ∆R > 0.8. The pileup per particle identification algorithm (PUPPI) [66, 67] is used to mitigate the effect of pileup at the reconstructed particle level, making use of event pileup properties, tracking information, and a local shape variable that distinguishes between collinear and soft diffuse distributions of other particles surrounding the particle under consideration. The collinear component is attributed to particles originating from the hard scatter, and the soft diffuse one to particles originating from pileup interactions.

Analysis strategy and background estimation
To define the signal region (SR) for the search we require that the event contain two same-flavor leptons with invariant mass $m(\ell\ell)$ > 300 GeV, together with at least one large-radius jet. The requirement on $m(\ell\ell)$ is introduced to reduce the DY background and part of the t$\bar{\text{t}}$ background, with minimal effect on the signal acceptance. No requirement is placed on the charge of the leptons, to retain efficiency for both same-sign and opposite-sign signal events and to avoid the systematic uncertainty associated with the efficiency of the charge sign determination for such energetic particles. While a veto of opposite-sign lepton pairs would reduce the SM background, optimization studies have shown that it is better to impose kinematical requirements that retain the signal efficiency at high momenta.
For gauge-mediated decays of the N$_\ell$, the fragmentation products of the two quarks from the W boson decay typically form at least one large-radius jet. In the case of contact-mediated decays, the two quarks are well separated, but at least one of them will be contained within a large-radius jet. The signal simulation shows that the efficiency for capturing one or both quarks in the jet is 98% for the CI-dominated case $m_{\text{N}}$ = 5 TeV, and 95% overall for the gauge or contact interaction with $m_{\text{N}}$ > 1 TeV.
The key variable for the analysis is the invariant mass $m(\ell\ell\text{J})$ of the system comprising the two leptons and the leading large-radius jet. This variable provides good discrimination between the signal and SM background contributions and is also correlated with $m_{\text{N}}$, which would become relevant for the signal characterization if an excess were observed. The statistical analysis is implemented with a maximum likelihood (ML) fit to extract the signal strength µ, the ratio of the signal yield observed to that predicted by the model. The inputs to the fit are the distributions in $m(\ell\ell\text{J})$ of the estimated backgrounds, the expected signal, and the data.
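As an illustration of how an invariant mass such as $m(\ell\ell\text{J})$ is built from reconstructed objects, the following Python sketch sums four-vectors given in ($p_{\text{T}}$, η, φ, mass) coordinates. The function names and the tuple layout are assumptions made for this example and are not taken from the analysis software.

```python
import math

def four_vector(pt, eta, phi, mass):
    """Return (E, px, py, pz) from collider coordinates (GeV, radians)."""
    px = pt * math.cos(phi)
    py = pt * math.sin(phi)
    pz = pt * math.sinh(eta)
    energy = math.sqrt(px * px + py * py + pz * pz + mass * mass)
    return (energy, px, py, pz)

def invariant_mass(*vectors):
    """Invariant mass of the sum of the given (E, px, py, pz) vectors."""
    e = sum(v[0] for v in vectors)
    px = sum(v[1] for v in vectors)
    py = sum(v[2] for v in vectors)
    pz = sum(v[3] for v in vectors)
    m2 = e * e - px * px - py * py - pz * pz
    # Guard against tiny negative values from floating-point rounding.
    return math.sqrt(max(m2, 0.0))

# Hypothetical event: two back-to-back leptons and one large-radius jet.
lep1 = four_vector(500.0, 0.0, 0.0, 0.0)
lep2 = four_vector(500.0, 0.0, math.pi, 0.0)
jet = four_vector(200.0, 0.0, math.pi / 2.0, 80.0)

m_ll = invariant_mass(lep1, lep2)
m_llj = invariant_mass(lep1, lep2, jet)
```

The same helper gives both $m(\ell\ell)$, used to define the analysis regions, and $m(\ell\ell\text{J})$, used as the fit variable.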
The leading background to this search is DY production of a lepton pair accompanied by a jet from initial-state radiation. The second major background comes from processes that produce top quarks, t$\bar{\text{t}}$ and tW.
The DY contribution is estimated from simulation, corrected by scale factors that serve to adjust the simulated $m(\ell\ell\text{J})$ shape for differences with respect to the data. A scale factor for each $m(\ell\ell\text{J})$ bin and each data-taking year is taken from the DY-dominated $m(\ell\ell)$ region around the Z boson mass peak, 60 < $m(\ell\ell)$ < 120 GeV. The scale factors have values in the range 0.87-1.57. Their statistical uncertainties are combined with those of the simulation to estimate the total systematic uncertainty in the DY background prediction to be used in the fit.
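A per-bin data-to-simulation scale factor of this kind can be sketched as follows. The sketch assumes that the Z-peak region is pure DY (no subtraction of other backgrounds) and that the data uncertainty is Poisson; both are simplifications introduced here, and the function name and inputs are hypothetical.

```python
import math

def dy_scale_factors(data_counts, mc_yields, mc_errors):
    """Per-bin scale factor data/MC with its statistical uncertainty.

    data_counts: observed events per m(llJ) bin in the Z-peak region
    mc_yields:   simulated DY yield per bin
    mc_errors:   statistical uncertainty of the simulated yield per bin
    Returns a list of (scale_factor, uncertainty) tuples.
    """
    results = []
    for n, m, dm in zip(data_counts, mc_yields, mc_errors):
        if n <= 0 or m <= 0:
            raise ValueError("bins must be populated in data and simulation")
        sf = n / m
        # Relative uncertainties of data (Poisson) and simulation,
        # added in quadrature.
        rel = math.sqrt(1.0 / n + (dm / m) ** 2)
        results.append((sf, sf * rel))
    return results
```

For example, a bin with 100 data events and a simulated yield of 80 ± 4 gives a scale factor of 1.25 with an uncertainty of about 0.14.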
In addition, we include in the fit an SM-dominated control region (CR) selected with the same criteria as the SR except in an $m(\ell\ell)$ band, 150 < $m(\ell\ell)$ < 300 GeV, that lies adjacent to the SR. This CR provides validation of the corrected simulation and improves the precision of the background prediction.
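The dilepton-mass regions used in the analysis can be summarized with a small classifier. The thresholds are those quoted in the text; the function name, the region labels, and the assumption that the Z-peak region also requires a large-radius jet and the same lepton $p_{\text{T}}$ thresholds are choices made for this illustration only.

```python
def classify_region(m_ll, n_large_radius_jets, lead_pt, sublead_pt):
    """Assign a same-flavor dilepton event to an analysis region.

    Returns "SR", "DY_CR", "Z_PEAK", or None (all masses and pT in GeV).
    """
    # Lepton pT thresholds and the jet requirement shared by all regions
    # (assumed here to apply to the Z-peak region as well).
    if lead_pt <= 150.0 or sublead_pt <= 100.0:
        return None
    if n_large_radius_jets < 1:
        return None
    if m_ll > 300.0:
        return "SR"        # signal region
    if 150.0 < m_ll < 300.0:
        return "DY_CR"     # control region adjacent to the SR
    if 60.0 < m_ll < 120.0:
        return "Z_PEAK"    # region used to derive the DY scale factors
    return None            # gaps between regions are not used
```

Keeping the three regions disjoint in $m(\ell\ell)$ is what allows them to enter a simultaneous fit without double counting events.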
The $m(\ell\ell\text{J})$ distributions of the electron and muon DY CRs are shown in Figure 3, upper left and right, respectively. The distributions for both flavors are included in the ML fit to constrain the DY background contribution. In the figure the data are compared with the background estimated before (pre-fit) and after (post-fit) the simultaneous fit of the signal and control regions. The pulls shown in the lower panels are defined as the difference between data and the post-fit background prediction divided by the quadratic difference of the uncertainties in the data and the post-fit yields. The quadratic difference of the uncertainties is taken to account for the correlation between the data and the post-fit prediction.
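The pull definition above translates directly into code. In this minimal sketch the uncertainties are passed in explicitly; how they are obtained (for example, a Poisson uncertainty for the data) is left to the caller and is not specified here.

```python
import math

def pull(n_data, n_postfit, sigma_data, sigma_postfit):
    """(data - post-fit) / sqrt(sigma_data^2 - sigma_postfit^2).

    The quadratic *difference* in the denominator reflects that the
    post-fit prediction is correlated with the data it was fitted to.
    """
    variance = sigma_data ** 2 - sigma_postfit ** 2
    if variance <= 0.0:
        raise ValueError("post-fit uncertainty must be smaller than the data uncertainty")
    return (n_data - n_postfit) / math.sqrt(variance)
```

With 108 observed events, a post-fit prediction of 100, and uncertainties of 5 and 3, the denominator is 4 and the pull is 2.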
The second most important background arises from the leptonic decays of top quarks from t$\bar{\text{t}}$ and single top quark production. The $m(\ell\ell\text{J})$ shape of this background is taken from the MC simulation, with a free normalization parameter in the ML fit for each data-taking year. To constrain these parameters, we include a top quark enriched CR in the fit. The definition of the CR exploits the fact that the decays of the top quarks from t$\bar{\text{t}}$ production give rise to events with ee, µµ, and eµ configurations, with the eµ final state having a branching fraction twice that of either of the same-flavor pairs. We thus select events having one muon and one electron, the leading (subleading) lepton having $p_{\text{T}}$ > 150 (100) GeV. In addition, we reject events containing an electron and a muon, each with $p_{\text{T}}$ > 5 GeV, that have angular separation ∆R < 0.1. Figure 3 (lower) shows the m(eµJ) distribution of the CR data along with pre- and post-fit background estimates.
The remaining SM backgrounds, arising from QCD, W+jets, and diboson production, are small (∼5% of the total). Their contribution is taken directly from MC simulation, normalized to the theoretical cross sections cited in Section 3. These processes are designated "other" in the legends in Figs. 3 and 4. In the three highest $m(\ell\ell\text{J})$ bins, which are the most sensitive to a signal, the fractions of backgrounds are approximately 60, 34, and 6% for DY, t$\bar{\text{t}}$, and other, respectively. The total event yield information for Fig. 4 is available in the HEPData record for this analysis [24].

Systematic uncertainties
Sources of systematic uncertainties affecting the $m(\ell\ell\text{J})$ distribution include statistical uncertainties in the CR data and in the simulation, together with systematic uncertainties in quantities affecting the modeling in the simulation. The latter are accounted for with log-normal-distributed nuisance parameters in the fitting procedure described in Section 5. Uncertainties from a given source are treated as uncorrelated across the three data-taking years, with the exception of electron energy scale and resolution, small-background theoretical cross sections, and signal shape, which are fully correlated.
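One common convention for a log-normal nuisance parameter, shown here as an assumption rather than as the exact parametrization used in this analysis, scales a yield by $\kappa^{\theta}$, where κ = 1 + (relative uncertainty) and θ is a unit-Gaussian parameter. The function name and the numerical values are hypothetical.

```python
def lognormal_yield(nominal, kappa, theta):
    """Yield after shifting a log-normal nuisance parameter.

    nominal: nominal expected yield
    kappa:   1 + relative uncertainty (e.g. 1.10 for a 10% effect)
    theta:   nuisance parameter value in units of standard deviations
    """
    return nominal * kappa ** theta

# Illustrative values only: a 10% uncertainty on a yield of 100 events.
up = lognormal_yield(100.0, 1.10, +1.0)    # one-sigma upward shift
down = lognormal_yield(100.0, 1.10, -1.0)  # one-sigma downward shift
```

Unlike a Gaussian constraint on the yield itself, this form keeps the yield positive for any value of θ, which is the main reason it is preferred for multiplicative uncertainties.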
The integrated luminosities for the 2016, 2017, and 2018 data-taking years have individual uncertainties of 1.2-2.5% [68-70], while the overall uncertainty for the 2016-2018 period is 1.6%. These uncertainties affect the normalization of signal yields and those background yields that are taken from simulation. The imperfect modeling of pileup interactions is estimated by varying the total cross section for inelastic pp scattering used in the simulation by ±5% [71], and results in an uncertainty of 0.004 (0.006) in the fitted value of µ in the electron (muon) channel.
The lepton trigger, reconstruction, identification, and isolation efficiencies are measured in both data and simulation using Z → $\ell\ell$ events. Data-to-simulation scale factors are applied to all simulation samples to account for the differences observed between the two. The uncertainties in the lepton scale factors are propagated to the estimation of µ, and their effect is found to be 0.004 (0.006) for the electron (muon) signal.
Similarly, the momenta of leptons are varied in the simulation within their uncertainties from the nominal values to ascertain the effect of these uncertainties on the mass distributions. To evaluate the effect of the uncertainty in the momentum resolution of very high energy muons, a Gaussian smearing is applied to the muon momentum and propagated to the $m(\ell\ell\text{J})$ distribution; the resulting effect on the signal yield is less than 0.3%, with a negligible impact on µ. The uncertainties in the jet energy scale and resolutions [72] affect the uncertainty in µ by 0.006-0.010.
For the DY background simulation we account for uncertainties in the higher-order QCD and EW corrections and in the data-to-simulation scale factors. The uncertainty in the $p_{\text{T}}$(Z) reweighting described in Section 3 accounts for theoretical uncertainties, implemented as described in Ref. [44], and a component due to the MC samples, applied by varying the K factor by its upward and downward statistical uncertainty. Similarly, the scale factors are varied within their uncertainties to estimate their impact on the invariant mass distributions. The resulting uncertainties in the signal strength are up to 0.034-0.045 in the two leptonic final states considered.
Leaving the normalization of the top quark processes floating in the fit results in an uncertainty in the signal strength of up to 0.009 (0.008) in the electron (muon) channel. For the smaller SM backgrounds, the uncertainty in the cross section is used. The theoretical uncertainties in the signal simulation originating from the PDFs have been computed using the recommendations of Ref. [73], extracting weights from the MC replicas that vary from a few percent to about 8%, depending on $m_{\text{N}}$. These weights affect the selection efficiency, but their uncertainties make a negligible contribution to the final systematic uncertainty. Finally, uncertainties related to the limited number of simulated events are taken into account with the Barlow-Beeston lite approach [74]. They are considered for all bins of the distributions that are used to extract the results, and kept uncorrelated across the different samples and across the bins of an individual distribution [75]. The limited sizes of the simulated event samples and of the data samples are two major sources of uncertainty, accounting for up to 0.021 and 0.083 in µ, respectively, in both lepton channels.
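The per-bin statistical uncertainty of a weighted simulated sample, which is the input to Barlow-Beeston-type nuisance parameters, can be computed from the sums of weights and squared weights. The sketch below shows only this ingredient under generic assumptions; it is not the fit implementation, and the function name is hypothetical.

```python
import math

def bin_stat_uncertainty(weights):
    """Yield, absolute statistical uncertainty, and effective number of
    unweighted events for one histogram bin of a weighted MC sample."""
    sumw = sum(weights)
    sumw2 = sum(w * w for w in weights)
    if sumw <= 0.0:
        raise ValueError("bin has no positive simulated yield")
    error = math.sqrt(sumw2)
    n_eff = sumw * sumw / sumw2
    return sumw, error, n_eff
```

Four simulated events with weight 0.5 each give a yield of 2.0 with an uncertainty of 1.0, equivalent to four unweighted events; a sparsely populated bin at high $m(\ell\ell\text{J})$ therefore carries a large relative uncertainty even if its weighted yield is small.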
The impacts of the systematic uncertainties on µ as extracted from the ML fit are summarized in Table 1. We have checked the sensitivity of the results to variations of the individual nuisance parameters; no significant overconstraining or underestimating of the systematic uncertainties was found.

Results
From the ML fit we extract values of µ for a range of the parameters $m_{\text{N}}$ and Λ of the signal model. The input data are the $m(\ell\ell\text{J})$ distribution in the SR and in the DY CR with 150 < $m(\ell\ell)$ < 300 GeV, and the m(eµJ) distribution for the top quark-enriched CR (Fig. 3). The result of the fit under the background-only hypothesis is shown in Fig. 4 for the $m(\ell\ell\text{J})$ distribution of the eeqq and the µµqq channels.
The observed data and the estimated SM background contributions are in agreement, and no significant excess is observed. We derive upper limits at 95% CL on the product of cross section and branching fraction $\sigma(\text{pp} \to \ell\text{N}_\ell)\,\mathcal{B}(\text{N}_\ell \to \ell\text{qq})$, using a CL$_{\text{s}}$ method [76,77], in the asymptotic approximation [78]. The adequacy of the asymptotic approximation has been verified with pseudo-experiments. The expected and observed upper limits for the eeqq and the µµqq channels are displayed in Fig. 5, for a benchmark value of Λ = 13 TeV. The limits are of order $10^{-4}$ pb for a range of N$_\ell$ signal hypotheses.
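To illustrate the idea behind the CL$_{\text{s}}$ criterion, the following Python sketch computes a 95% CL upper limit on the signal yield for a single-bin counting experiment with a known background. This is a deliberately simplified stand-in: the analysis itself uses a binned shape fit with nuisance parameters and the asymptotic approximation, none of which is reproduced here, and all function names are hypothetical.

```python
import math

def poisson_cdf(n, mean):
    """P(N <= n) for a Poisson distribution with the given mean."""
    term = math.exp(-mean)
    total = term
    for k in range(1, n + 1):
        term *= mean / k
        total += term
    return total

def cls(n_obs, s, b):
    """CLs = CL(s+b) / CL(b) with the observed count as test statistic."""
    p_b = poisson_cdf(n_obs, b)
    if p_b == 0.0:
        raise ValueError("background-only probability underflowed")
    return poisson_cdf(n_obs, s + b) / p_b

def upper_limit(n_obs, b, cl=0.95, s_max=100.0, tol=1e-6):
    """Signal yield at which CLs drops to 1 - cl, found by bisection."""
    alpha = 1.0 - cl
    lo, hi = 0.0, s_max
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if cls(n_obs, mid, b) > alpha:
            lo = mid   # signal still allowed, move the lower edge up
        else:
            hi = mid   # signal excluded, move the upper edge down
    return 0.5 * (lo + hi)
```

For zero observed events the limit is −ln(0.05) ≈ 3.0 signal events regardless of the expected background, which shows how the CL$_{\text{s}}$ ratio avoids excluding signals to which the experiment has no sensitivity. A limit on the signal yield converts to a limit on cross section times branching fraction after dividing by the integrated luminosity and the signal efficiency.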
The results are recast in terms of the EFT of Ref. [21] in Fig. 6, which shows the region in the ($m_{\text{N}}$, Λ) plane that is excluded by the data. The region of validity is constrained by the model's assumption that $m_{\text{N}}$ < Λ. A further consideration, discussed in Ref. [23], is that the unitarity of the scattering amplitude, as approximated in the perturbation expansion, can be violated for some values of the subenergies $\hat{s} \equiv s x_1 x_2$. These subenergies appear in the integral over $x_{1,2}$ weighted by the product of proton PDFs $P(x_1)P(x_2)$. Contours giving the fraction of this ($x_1$, $x_2$) phase space that is consistent with unitarity are shown as the solid magenta curves in Fig. 6. For the case of Λ = $m_{\text{N}}$, the existence of N$_\text{e}$ (N$_\mu$) is excluded by the data for masses up to 6.0 (6.1) TeV at 95% CL, improving by more than 1 TeV the current most stringent limit on this class of resonances [22]; these results are safe from potential unitarity violation in the underlying EFT. Moreover, the accessible range of Λ is almost twice that reached in the previous search, extending the sensitivity to ≈20 TeV at lower $m_{\text{N}}$.

Figure 6 (caption, beginning lost in extraction): […] µµqq (right) channels. The region below the curve is excluded. The gray shading indicates the region where $m_{\text{N}}$ would exceed Λ, the EFT scale parameter, and the three solid magenta lines in the lower part of the plots represent the fraction of the signal-model phase space that satisfies the unitarity condition in the EFT approximation.

Summary
A search is reported for a heavy composite Majorana neutrino N$_\ell$, where the flavor $\ell$ corresponds to an electron or muon, that appears in composite fermion models. In the specific model considered, the N$_\ell$ is produced in association with a lepton and subsequently decays into a same-flavor lepton plus two quarks, leading to a signature with two same-flavor leptons and at least one large-radius jet. The analysis is performed using a sample of proton-proton collisions at $\sqrt{s}$ = 13 TeV recorded by the CMS experiment at the CERN LHC, corresponding to an integrated luminosity of 138 fb$^{-1}$. The data are found to be in agreement with the standard model expectations. In the context of an effective field theory with compositeness scale parameter Λ, an upper limit at 95% CL is established on $\sigma(\text{pp} \to \ell\text{N}_\ell)\,\mathcal{B}(\text{N}_\ell \to \ell\text{qq})$ as a function of Λ and the N$_\ell$ mass $m_{\text{N}}$. Masses less than 6.0 (6.1) TeV are excluded for $\ell$ = e (µ), at the limit $m_{\text{N}}$ = Λ. For $m_{\text{N}}$ ≈ 1 TeV, values of Λ less than 20 (23) TeV are excluded. The present search covers a parameter space larger by about 1.6 TeV in $m_{\text{N}}$ compared to previous searches in proton-proton collisions at 13 TeV.

Acknowledgments
We congratulate our colleagues in the CERN accelerator departments for the excellent performance of the LHC and thank the technical and administrative staffs at CERN and at other CMS institutes for their contributions to the success of the CMS effort. In addition, we gratefully acknowledge the computing centers and personnel of the Worldwide LHC Computing Grid and other centers for delivering so effectively the computing infrastructure essential to our analyses. Finally, we acknowledge the enduring support for the construction and operation of the LHC, the CMS detector, and the supporting computing infrastructure provided by the following funding agencies:

References
[30] CMS Collaboration, "Search for an excited lepton that decays via a contact interaction to a lepton and two jets in proton-proton collisions at $\sqrt{s}$ = 13 TeV", JHEP 05 (2020) 052, doi:10.1007/JHEP05(2020)052, arXiv:2001.04521.