Search for Heavy Sterile Neutrinos in Trileptons at the LHC

We present a search strategy for both Dirac and Majorana sterile neutrinos from the purely leptonic decays of $W^\pm \to e^\pm e^\pm \mu^\mp \nu$ and $\mu^\pm \mu^\pm e^\mp \nu$ at the 14 TeV LHC. The discovery and exclusion limits for sterile neutrinos are shown using both the Cut-and-Count (CC) and Multi-Variate Analysis (MVA) methods. We also discriminate between Dirac and Majorana sterile neutrinos by exploiting a set of kinematic observables which differ between the Dirac and Majorana cases. We find that the MVA method, compared to the more common CC method, can greatly enhance the discovery and discrimination limits. Two benchmark points with sterile neutrino mass $m_N = 20$ GeV and 50 GeV are tested. For an integrated luminosity of 3000 ${\rm fb}^{-1}$, sterile neutrinos can be found with $5 \sigma$ significance if heavy-to-light neutrino mixings $|U_{Ne}|^2 \sim |U_{N\mu}|^2\sim 10^{-6}$, while Majorana vs. Dirac discrimination can be reached if at least one of the mixings is of order $10^{-5}$.


Introduction
The evidence of small but non-zero neutrino masses [1] is currently an outstanding hint of physics beyond the Standard Model (SM) of particle physics. Most explanations are based on the existence of extra heavy particles. In particular, seesaw models involve extra heavy neutrinos that are sterile under electroweak interactions, but which mix with the Standard Model leptons [2]. Moreover, in most scenarios they are Majorana fermions [3]. The existence of heavy neutrinos and the discrimination between Dirac and Majorana is thus a crucial piece of information that experiments must reveal. The Majorana nature of neutrinos is being searched for in neutrinoless double beta decays [4], but so far no experimental evidence has been found [5]. The Large Hadron Collider (LHC) and future colliders also offer the opportunity to search for heavy neutrinos [6,7]. At such colliders, same-sign dilepton plus dijet events, $l^\pm l^\pm jj$, can be produced if there are heavy Majorana neutrinos (henceforth called $N$) in the intermediate state with masses above $M_W$ [8]. Instead, for masses below $M_W$, the jets are lost in the background and thus trilepton events $l^\pm l^\pm l'^\mp \nu$ provide clearer signals for a heavy $N$ [9], where $l$ and $l'$ denote leptons with different flavors. The choice of having no Opposite-Sign Same-Flavor (no-OSSF) lepton pairs helps eliminate a serious SM background, $\gamma^*/Z \to l^+ l^-$ [10]. Now, if $N$ is Majorana, the trilepton final state will contain a Lepton Number Conserving (LNC) channel $W^+ \to e^+ e^+ \mu^- \nu_e$ as well as a Lepton Number Violating (LNV) channel $W^+ \to e^+ e^+ \mu^- \bar{\nu}_\mu$, while if it is of Dirac type, only the LNC channel will appear. An in-between case, called pseudo-Dirac, occurs if $N$ corresponds to pairs of almost degenerate Majorana neutrinos, so that the LNV mode becomes relatively suppressed by two interfering amplitudes [11]. Here we will not consider such a case.
Since the final neutrino escapes detection, the observed final state is just $e^\pm e^\pm \mu^\mp$ or $\mu^\pm \mu^\pm e^\mp$ plus missing energy. Hence it is not a simple task to distinguish a Majorana from a Dirac $N$. In our previous work [12], we studied these trilepton events to discover heavy neutrinos and discriminate between Dirac and Majorana using differences in [...]. In Ref. [13], we presented a simpler method for this discrimination by comparing the full rates of $e^\pm e^\pm \mu^\mp$ and $\mu^\pm \mu^\pm e^\mp$. However, this discrimination based on full rates only works if the mixing parameters $U_{Ne}$ and $U_{N\mu}$ are considerably different from each other (see Table 1).
Table 1 (only one row is recoverable): $\mu^\pm \mu^\pm e^\mp$: scale factor $s$ (Dirac), $s(1+1/r)$ (Majorana).

Discovery limit
In this letter, we present a strategy to discover heavy sterile neutrinos $N$ with $m_N < M_W$, and to discriminate between their Dirac vs. Majorana character, using trilepton events at the 14 TeV LHC, applying both Cut-and-Count (CC) and Multi-Variate Analysis (MVA) methods. Our strategy is the most complete in the sense that it uses all details of each event, including spectra and angular distributions.
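We consider the process $W^\pm \to l^\pm_W\, l^\pm_N\, l'^{\mp}_N\, \nu$ (Fig. 1), where $l$ and $l'$ are different leptons, either $e$ or $\mu$ (i.e. $e^\pm e^\pm \mu^\mp \nu$ and $\mu^\pm \mu^\pm e^\mp \nu$), and $\nu$ is a SM neutrino or antineutrino. For convenience, we introduce two parameters: a normalization factor $s$ and a disparity factor $r$:
$$ r \equiv \frac{|U_{Ne}|^2}{|U_{N\mu}|^2}, \qquad s \equiv 2\times 10^{6}\, \frac{|U_{Ne}|^2\, |U_{N\mu}|^2}{|U_{Ne}|^2 + |U_{N\mu}|^2}. \qquad (1) $$
Conversely, the heavy-to-light mixing elements $|U_{Ne}|^2$ and $|U_{N\mu}|^2$ can be expressed in terms of $r$ and $s$ as:
$$ |U_{Ne}|^2 = \frac{s\,(1+r)}{2}\times 10^{-6}, \qquad |U_{N\mu}|^2 = \frac{s\,(1+1/r)}{2}\times 10^{-6}. \qquad (2) $$
For our study we choose two benchmark points: $m_N = 20$ and 50 GeV, with $r = s = 1$ (i.e., $|U_{Ne}|^2 = |U_{N\mu}|^2 = 10^{-6}$). The production rates of the different trilepton modes are proportional to the scale factors shown in Table 1.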
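As a small illustration of the $(r, s)$ parametrization, the sketch below converts between the mixings and the two parameters. It assumes the definitions $r = |U_{Ne}|^2/|U_{N\mu}|^2$ and $s = 2\times 10^{6}\,|U_{Ne}|^2|U_{N\mu}|^2/(|U_{Ne}|^2+|U_{N\mu}|^2)$, which reproduce the benchmark $r = s = 1 \Leftrightarrow |U_{Ne}|^2 = |U_{N\mu}|^2 = 10^{-6}$; the function names are hypothetical.

```python
import math

def rs_from_mixings(u_e2, u_mu2):
    """Return (r, s) from |U_Ne|^2 and |U_Nmu|^2 (assumed definitions)."""
    r = u_e2 / u_mu2
    s = 2.0e6 * u_e2 * u_mu2 / (u_e2 + u_mu2)
    return r, s

def mixings_from_rs(r, s):
    """Inverse map: return (|U_Ne|^2, |U_Nmu|^2) for given r and s."""
    u_e2 = 0.5 * s * (1.0 + r) * 1.0e-6
    u_mu2 = 0.5 * s * (1.0 + 1.0 / r) * 1.0e-6
    return u_e2, u_mu2

# Benchmark used in the text: r = s = 1 gives both mixings equal to 1e-6.
benchmark = mixings_from_rs(1.0, 1.0)

# Round trip for an asymmetric point (r = 10, s = 0.25).
round_trip = rs_from_mixings(*mixings_from_rs(10.0, 0.25))
```

The inverse map makes explicit that, at fixed $s$, a large $r$ raises $|U_{Ne}|^2$ while $|U_{N\mu}|^2$ tends to $s/2 \times 10^{-6}$.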
Let us first describe our strategy to discover or set exclusion limits for Dirac and Majorana sterile neutrinos using trileptons at the LHC. We first select trilepton events $l^\pm l^\pm l'^\mp$ with no-OSSF lepton pairs. Then we apply basic cuts for leptons and jets: $p_{T,l} \geq 10$ GeV and $|\eta_l| \leq 2.5$; $p_{T,j} \geq 20$ GeV and $|\eta_j| \leq 5.0$, and veto the b-jets in order to suppress the $t\bar{t}$ background. Now, in order to select within the pair $l^\pm l^\pm$ the lepton that comes from the $N$ decay, we construct the $\chi^2$ function
$$ \chi^2 = \frac{(M_W - m_W)^2}{\sigma_W^2} + \frac{(M_N - m_N)^2}{\sigma_N^2}, $$
where $m_W = 80.5$ GeV and $m_N$ is the assumed mass for $N$ (20 or 50 GeV in our benchmarks), while $M_W$ and $M_N$ are the reconstructed invariant masses of $l^\pm l^\pm l'^\mp \nu$ and $l^\pm l'^\mp \nu$, respectively; $\sigma_W$ and $\sigma_N$ are the widths of the reconstructed mass distributions, which we take to be 5% of their respective $m_W$ and $m_N$, for simplicity. When calculating the reconstructed masses $M_W$ and $M_N$, the final neutrino transverse momentum $p_{T,\nu}$ is assumed to be the missing transverse momentum, while the neutrino longitudinal momentum $p_{z,\nu}$ and the correct lepton $l^\pm$ from the $N$ decay are determined by minimizing the $\chi^2$.
A better identification of the correct lepton can be achieved if the production and decay vertices of $N$ are spatially displaced in the detector [14,15]. However, this would be perceptible only if $m_N \lesssim 15$ GeV at the LHC. For $m_N \sim 15$ GeV, by exploiting the displaced lepton jet search and requiring the vertex displacement to be between 1 mm and 1.2 m, Ref. [9] derived limits of $|U_{N\mu}|^2 < 10^{-5}$ at the 8 TeV LHC with 20 ${\rm fb}^{-1}$, and $|U_{N\mu}|^2 < 10^{-7}$ at the 13 TeV LHC with 300 ${\rm fb}^{-1}$, at the $2\sigma$ level. A future $e^- e^+$ collider with better detector resolution of the vertex displacement will allow probing for heavier sterile neutrinos. By requiring the vertex displacement to be between 10 µm and 249 cm at the FCC-ee, Ref. [16] obtains a sensitivity of $|U_{Nl}|^2 \sim 10^{-11}$ for the Z-pole running mode with 110 ${\rm fb}^{-1}$, and a sensitivity of $|U_{Ne}|^2 \sim 10^{-8}$ for a 240 GeV run with 5 ${\rm ab}^{-1}$, at the $2\sigma$ level.
Due to the much more challenging experimental environment, the sensitivity at the FCC-hh might not be as good as that at the FCC-ee. For this study, the displaced vertex observable is not considered.
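The $\chi^2$ selection of the lepton from the $N$ decay, described above, can be sketched as follows. This is a minimal illustration and not the analysis code: it assumes massless leptons, four-vectors stored as $(E, p_x, p_y, p_z)$ tuples, and a simple grid scan over $p_{z,\nu}$ (the scan range and step are arbitrary choices); all function names are hypothetical.

```python
import math

M_W_POLE = 80.5  # GeV, value of m_W quoted in the text

def add(*vecs):
    """Component-wise sum of (E, px, py, pz) four-vectors."""
    return tuple(sum(c) for c in zip(*vecs))

def inv_mass(p):
    e, px, py, pz = p
    return math.sqrt(max(e * e - px * px - py * py - pz * pz, 0.0))

def chi2(rec_m_w, rec_m_n, m_n, frac=0.05):
    """chi^2 with widths taken as 5% of m_W and of the assumed m_N."""
    sigma_w, sigma_n = frac * M_W_POLE, frac * m_n
    return ((rec_m_w - M_W_POLE) / sigma_w) ** 2 + ((rec_m_n - m_n) / sigma_n) ** 2

def pick_lepton_from_n(l_a, l_b, l_opp, met_x, met_y, m_n, pz_max=500.0, step=0.5):
    """Try both same-sign leptons as the N daughter and scan p_z of the neutrino.

    Returns (best chi^2, index of the lepton assigned to N: 0 for l_a, 1 for l_b,
    best neutrino p_z).
    """
    best = (math.inf, None, None)
    n_steps = int(round(2.0 * pz_max / step))
    for idx, l_n, l_w in ((0, l_a, l_b), (1, l_b, l_a)):
        for k in range(n_steps + 1):
            pz = -pz_max + k * step
            nu = (math.sqrt(met_x ** 2 + met_y ** 2 + pz ** 2), met_x, met_y, pz)
            rec_m_n = inv_mass(add(l_n, l_opp, nu))
            rec_m_w = inv_mass(add(l_n, l_w, l_opp, nu))
            c = chi2(rec_m_w, rec_m_n, m_n)
            if c < best[0]:
                best = (c, idx, pz)
    return best

# Toy event: N (m_N = 20 GeV) at rest decaying to three massless particles
# 120 degrees apart in the transverse plane; l_W along z, tuned so that M_W = 80.5 GeV.
E = 20.0 / 3.0
l_from_n = (E, E, 0.0, 0.0)
l_prime = (E, E * math.cos(2 * math.pi / 3), E * math.sin(2 * math.pi / 3), 0.0)
met = (E * math.cos(4 * math.pi / 3), E * math.sin(4 * math.pi / 3))
e_w = (M_W_POLE ** 2 - 400.0) / 40.0
l_from_w = (e_w, 0.0, 0.0, e_w)
result = pick_lepton_from_n(l_from_n, l_from_w, l_prime, met[0], met[1], m_n=20.0)
```

A grid scan is used here only for transparency; an analytic or numerical minimizer over $p_{z,\nu}$ would serve the same purpose.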
An MVA is then performed to exploit the useful observables and maximally reduce the SM background. We use the Boosted Decision Trees (BDT) method in the TMVA package [17] and input the following kinematical observables for the training and test processes: (i) the missing transverse energy $\not\!\!E_T$; (ii) the scalar sum of $p_T$ of all jets, $H_T$; (iii) the transverse mass of the missing energy plus lepton(s); (iv) the azimuthal angle difference $\Delta\phi$ between the missing transverse momentum and lepton(s), $\Delta\phi(\not\!\!E_T, l_N l'_N)$, $\Delta\phi(\not\!\!E_T, l_W l_N)$, $\Delta\phi(\not\!\!E_T, l_W)$, $\Delta\phi(\not\!\!E_T, l_N)$, $\Delta\phi(\not\!\!E_T, l'_N)$; (v) the invariant mass of the system of leptons, $M(l_W l_N l'_N)$, $M(l_W l_N)$, $M(l_W l'_N)$, $M(l_N l'_N)$; and (vi) the azimuthal angle difference $\Delta\phi$ between two leptons, $\Delta\phi(l_W, l_N)$, $\Delta\phi(l_N, l'_N)$. For a Dirac (Majorana) $N$, the simulation data of the LNC (LNC + LNV) processes are input as the signal sample, while the total SM background data ($\gamma^*/Z$, $WZ$, and $t\bar{t}$ inclusively) are input as the background sample for the TMVA training and test processes. The details of our data simulation procedures are described in Ref. [13]. Figure 2 shows the BDT response distributions for a Dirac $N$ signal and the total SM background, for our two benchmarks. The signal vs. background separation is better for $m_N = 20$ GeV than for $m_N = 50$ GeV, as the two curves have less overlap in Fig. 2 (left).
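Two of the observable types listed above, the azimuthal angle difference and the transverse mass, can be computed as in the following sketch. The text does not spell out the transverse-mass definition, so the standard one, $M_T^2 = (E_{T,{\rm vis}} + \not\!\!E_T)^2 - |\vec{p}_{T,{\rm vis}} + \vec{p}_T^{\,\rm miss}|^2$ with $E_{T,{\rm vis}}^2 = m_{\rm vis}^2 + p_{T,{\rm vis}}^2$, is assumed here; leptons are taken as massless and the function names are hypothetical.

```python
import math

def delta_phi(phi1, phi2):
    """Azimuthal angle difference folded into [0, pi]."""
    d = abs(phi1 - phi2) % (2.0 * math.pi)
    return 2.0 * math.pi - d if d > math.pi else d

def transverse_mass(met, met_phi, leptons):
    """M_T of the missing momentum plus a system of massless leptons.

    leptons: list of (pt, eta, phi) tuples forming the visible system.
    """
    px = sum(pt * math.cos(phi) for pt, _, phi in leptons)
    py = sum(pt * math.sin(phi) for pt, _, phi in leptons)
    pz = sum(pt * math.sinh(eta) for pt, eta, _ in leptons)
    e = sum(pt * math.cosh(eta) for pt, eta, _ in leptons)
    m_vis2 = max(e * e - px * px - py * py - pz * pz, 0.0)
    et_vis = math.sqrt(m_vis2 + px * px + py * py)
    sum_x = px + met * math.cos(met_phi)
    sum_y = py + met * math.sin(met_phi)
    mt2 = (et_vis + met) ** 2 - sum_x ** 2 - sum_y ** 2
    return math.sqrt(max(mt2, 0.0))

# Single lepton back-to-back with the missing momentum:
# M_T^2 = 2 * pt * met * (1 - cos(dphi)) = 2 * 40 * 40 * 2.
mt_example = transverse_mass(40.0, math.pi, [(40.0, 0.3, 0.0)])
dphi_example = delta_phi(3.0, -3.0)
```

For a single massless lepton the general expression reduces to $M_T^2 = 2\, p_{T,l} \not\!\!E_T\, (1 - \cos\Delta\phi)$, which is what the usage example checks.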
In Table 2, we show the numbers of events for both Dirac and Majorana signals with $m_N = 20$ GeV and for the SM backgrounds at the 14 TeV LHC. The first two rows show the numbers of events after the basic cuts and the b-jet veto. The numbers of events using the CC method from Ref. [13] are shown in the third row. The numbers of events for Dirac (Majorana) sterile neutrinos using the BDT method are shown in the fourth (fifth) row. For a Dirac (Majorana) $N$, we get a statistical significance $SS$ near 2.6 (5.8) for the CC method and near 6.6 (10.7) for the BDT method, where $SS$ is computed from $N_s$ and $N_b$, the numbers of signal events (either Dirac or Majorana) and SM background events, respectively. Similarly, Table 3 shows the numbers for $m_N = 50$ GeV. From Fig. 2, lower significances are expected for $m_N = 50$ GeV. Indeed, Table 3 shows $SS$ near 2.3 (4.8) for the CC method and near 5.1 (9.0) for the BDT method.
[Figure caption, partially recovered: the blue curves marked with squares correspond to the $3\sigma$ limit, while the red curves correspond to the $5\sigma$ limit; solid lines are used for the BDT method and dashed lines for the CC method.]
Figure 3 shows the discovery and exclusion curves for a Dirac $N$, for both the BDT and CC methods. By exploiting more useful kinematical observables and better optimization compared with the CC method, the BDT method can greatly enhance the discovery and exclusion limits. Due to the small number of signal events, the performance of the BDT method becomes close to that of the CC method for small $s$ values (see Table 1). Using the BDT method, one can get significances of at least $5.0\sigma$ ($3.0\sigma$) for $s \gtrsim 0.55$ (0.25) at $m_N = 20$ GeV, or $s \gtrsim 1.02$ (0.55) at $m_N = 50$ GeV. Figure 4 shows the discovery and exclusion curves for a Majorana $N$, using both the BDT and CC methods. Here the rates depend on both $s$ and $r$ (see Table 1), and so the observables at the LHC can be used to constrain both $s$ and $r$. When $r = 1$, one can get a significance above $5.0\sigma$ ($3.0\sigma$) for $s \gtrsim 0.24$ (0.11) at $m_N = 20$ GeV, or $s \gtrsim 0.46$ (0.25) at $m_N = 50$ GeV.
For a given $s$, the significance becomes larger when $r \neq 1$, due to the larger number of signal events. Using the BDT method, when $r \approx 10$, one can get significances of at least $5.0\sigma$ ($3.0\sigma$) for $s \gtrsim 0.08$ (0.03) at $m_N = 20$ GeV, or $s \gtrsim 0.16$ (0.09) at $m_N = 50$ GeV.
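The statement that a Majorana signal grows when $r \neq 1$ at fixed $s$ can be made explicit with the rate scale factors. The sketch below uses $s$ for each Dirac mode and $s(1+1/r)$ for the Majorana $\mu^\pm\mu^\pm e^\mp$ mode, as in Table 1; the Majorana $e^\pm e^\pm \mu^\mp$ factor $s(1+r)$ is an assumption, obtained from the $e \leftrightarrow \mu$ exchange ($r \to 1/r$). Function names are hypothetical.

```python
import math

def scale_factors(s, r):
    """Rate scale factors per trilepton mode (eemu entry for Majorana is assumed)."""
    return {
        "Dirac": {"eemu": s, "mumue": s},
        "Majorana": {"eemu": s * (1.0 + r), "mumue": s * (1.0 + 1.0 / r)},
    }

def majorana_total(s, r):
    """Sum over both modes: s * (2 + r + 1/r), minimal at r = 1."""
    factors = scale_factors(s, r)["Majorana"]
    return factors["eemu"] + factors["mumue"]

total_r1 = majorana_total(1.0, 1.0)     # 4 s
total_r10 = majorana_total(1.0, 10.0)   # 12.1 s
total_r01 = majorana_total(1.0, 0.1)    # same as r = 10 by symmetry
```

Since $r + 1/r \geq 2$, the summed Majorana factor is smallest at $r = 1$, while the Dirac factor does not depend on $r$ at all.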

Discrimination limit
We now show that one can distinguish between a Dirac and a Majorana $N$ in the trilepton events, using the following distributions, which differ between the LNC and LNV processes: (i) the transverse mass of the system formed by the missing energy plus lepton(s), $M_T(\not\!\!E_T, l_N)$, $M_T(\not\!\!E_T, l'_N)$, and $M_T(\not\!\!E_T, l_N l_W)$; and (ii) the azimuthal angle difference $\Delta\phi$ between the missing transverse momentum and lepton(s), $\Delta\phi(\not\!\!E_T, l_N)$, $\Delta\phi(\not\!\!E_T, l'_N)$, and $\Delta\phi(\not\!\!E_T, l_N l_W)$.
In order to exploit these differences, we must first reduce the SM background as much as possible. After applying the basic cuts and vetoes, we perform a first BDT analysis to suppress the SM backgrounds, using as input all observables except those listed in the previous paragraph. Simulated Majorana data are input as the signal sample, while the total SM background data are input as the background sample for the TMVA training and testing processes. After the first BDT cut, the total numbers of events for $m_N = 20$ GeV, including all four final states ($e^\pm e^\pm \mu^\mp$ and $\mu^\pm \mu^\pm e^\mp$), for the Dirac signals (the LNC rate only), Majorana signals (LNC + LNV rates) and SM backgrounds ($\gamma^*/Z$, $W^\pm Z$, and $t\bar{t}$ inclusively) are 48.5, 120.4 and 7.3, respectively.
Since $s$ is a global scale, a priori unknown, as a second step we adjust $s$ for the Dirac hypothesis to match the number of events of the Majorana hypothesis, so that our simulation does not artificially distinguish the two scenarios simply by the rates. Just as in Ref. [13], the best matched value, $s_D$, is found by minimizing a function built from the Poisson probabilities of the individual final states, where $i$ indicates a particular trilepton final state, and ${\rm Poiss}(N_{\rm expc}, N_{\rm obs})$ denotes the probability of observing $N_{\rm obs}$ events in Poisson statistics when the number of expected events is $N_{\rm expc}$. Here $N_{\rm expc}$ is the expected number of events for the Majorana hypothesis (LNC + LNV + SM background), while $N_{\rm obs}$ is the observed number of events for the Dirac hypothesis (LNC + SM background). The best matched $s_D$ found in this way for the Dirac hypothesis gives the closest number of events to the Majorana case. For $m_N = 20$ GeV, we find $s_D \sim 2.44$. After matching, the Dirac and Majorana hypotheses have 125.6 and 127.6 events, respectively. As a third step, we perform a second BDT analysis to distinguish the Majorana from the Dirac hypothesis by exploiting the differences in the distributions mentioned above. Figure 5 shows the distributions of two of these observables after the basic cuts, the b-jet veto and the first BDT cut. With an optimized second BDT cut of about 0.020, the Majorana case ends up with 46.1 events, while the Dirac hypothesis has 34.1 events. Defining the excess in the Majorana case over the Dirac hypothesis as the "signal" events $N_s$, and the number of events of the Dirac hypothesis as the "background" events $N_b$, we then evaluate the significance for distinguishing Majorana from Dirac. When $r \neq 1$, the numbers of events for the different trilepton states will be quite different between Dirac and Majorana (see Table 1), which helps in this discrimination and gives a higher significance. Figure 6 shows the confidence levels for distinguishing Majorana from Dirac after the above three-step method.
When $r \approx 1$, one can have significances of at least $5.0\sigma$ ($3.0\sigma$) for $s \gtrsim 7.93$ (3.10) at $m_N = 20$ GeV, or $s \gtrsim 11.44$ (5.47) at $m_N = 50$ GeV. For $r \approx 10$, the same significances are reached with lower $s \sim 0.25$ (0.10) at $m_N = 20$ GeV, or 0.72 (0.38) at $m_N = 50$ GeV.
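The rate-matching step of the three-step method can be illustrated with the event totals quoted above for $m_N = 20$ GeV (48.5 Dirac, 120.4 Majorana and 7.3 background events after the first BDT cut). The sketch below is not the analysis code and rests on several assumptions: the minimized quantity is taken to be $-2\sum_i \ln {\rm Poiss}(N^i_{\rm expc}, N^i_{\rm obs})$, the totals are split equally among the four final states (per-channel numbers are not given in the text), the Poisson probability is continued to non-integer counts with the Gamma function, and the minimum is located by a simple grid scan.

```python
import math

# Totals after the first BDT cut at m_N = 20 GeV, as quoted in the text.
N_DIRAC_S1 = 48.5    # Dirac signal (LNC only) at s = 1
N_MAJORANA = 120.4   # Majorana signal (LNC + LNV) at s = 1
N_BKG = 7.3          # total SM background
N_CHANNELS = 4       # assumption: totals shared equally by the four final states

def neg2_log_poisson(n_exp, n_obs):
    """-2 ln Poiss(n_exp, n_obs), with n_obs! replaced by Gamma(n_obs + 1)."""
    return -2.0 * (n_obs * math.log(n_exp) - n_exp - math.lgamma(n_obs + 1.0))

def objective(s_d):
    """Assumed matching function: sum over channels of -2 ln Poiss."""
    n_exp = (N_MAJORANA + N_BKG) / N_CHANNELS       # Majorana hypothesis
    n_obs = (s_d * N_DIRAC_S1 + N_BKG) / N_CHANNELS  # Dirac hypothesis, rescaled
    return N_CHANNELS * neg2_log_poisson(n_exp, n_obs)

def best_s_d(lo=1.0, hi=4.0, step=0.001):
    """Grid scan for the Dirac scale factor that best matches the Majorana rate."""
    n = int(round((hi - lo) / step))
    grid = [lo + k * step for k in range(n + 1)]
    return min(grid, key=objective)

s_d = best_s_d()
n_dirac_matched = s_d * N_DIRAC_S1 + N_BKG  # Dirac events after matching
```

Under these assumptions the scan lands close to the $s_D \sim 2.44$ and the roughly 125.6 matched Dirac events quoted in the text; the agreement supports the assumed form but does not establish it as the function actually minimized.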

Summary
We present a complete method to discover or set exclusion limits for heavy sterile neutrinos with $m_N < M_W$, and to discriminate their Dirac vs. Majorana character. Here we would like to recall that, according to Eq. (2), when $r = 1$, the mixings are $|U_{Ne}|^2 = |U_{N\mu}|^2 = s \times 10^{-6}$.
Moreover, Majorana vs. Dirac can be distinguished with $5\sigma$ ($3\sigma$) significance when $r \approx 1$ and $s \gtrsim 7.9$ (3.1) for $m_N = 20$ GeV, or $s \gtrsim 11$ (5.8) for $m_N = 50$ GeV. For $r \approx 10$, the same significances are reached for $s \gtrsim 0.25$ (0.10) for $m_N = 20$ GeV, or $s \gtrsim 0.72$ (0.38) for $m_N = 50$ GeV.
Therefore, for an integrated luminosity of 3000 ${\rm fb}^{-1}$ at the 14 TeV LHC, both Dirac and Majorana sterile neutrinos can be found with $5\sigma$ significance if the heavy-to-light neutrino mixings are $|U_{Ne}|^2 \sim |U_{N\mu}|^2 \sim 10^{-6}$, while Majorana vs. Dirac discrimination can be reached if at least one of the mixings is of order $10^{-5}$.
We thank Jue Zhang for his valuable help.