Impact of Rare Decays $t \to \ell' \nu b \ell\ell$ and $t \to q q' b \ell\ell$ on Searches for Top-Associated Physics

Searches for top quark-associated physics such as $t\bar t W$ or $t\bar t H$ in final states with multiple leptons require a careful accounting of expected backgrounds due to the lack of reconstructible resonances. We demonstrate that the rare top quark decays $t \to \ell' \nu b \ell\ell$ and $t \to q q' b \ell\ell$, when a soft lepton is not detected, can contribute a non-negligible background to such searches. Simulations used by the LHC experiments typically do not account for these decays, so the backgrounds to these searches may be underestimated.


Introduction
The achievement of 13 TeV center-of-mass energy and design collision luminosity at the Large Hadron Collider (LHC) will enable a rich program of studies of electroweakly interacting particles produced in association with the top quark. The processes $pp \to t\bar t Z$ and $t\bar t W$ are established [1][2][3], while the production of $t\bar t H$ remains to be clearly demonstrated [4][5][6][7]. These processes will provide critical information on the coupling of the top quark to electroweak gauge bosons and the mechanism of top quark mass generation.
One of the most powerful ways of accessing $t\bar t W$ and $t\bar t H$ production is through final states with multiple leptons, especially three leptons or two leptons of the same sign. These lepton requirements are sufficient to suppress contamination from top pair production, where at most two opposite-sign leptons are expected; $t\bar t$ will typically contribute a background only through lepton charge misreconstruction, additional leptons from heavy hadron decay, or $t\bar t\gamma$ production with subsequent $\gamma \to e^+e^-$ conversion in detector material. However, since these final states have multiple neutrinos, separation of different processes is complex, and simulation of rare processes that mimic the signature can become very important.
The potential relevance of backgrounds with "lost" leptons to multilepton searches was pointed out in the context of the search for $H \to WW^* \to \ell\nu\ell\nu$ with the Higgs boson produced in gluon-gluon fusion [8]. The production of $W\gamma^* \to \ell\nu(\ell\ell)$, where one of the daughters of the virtual photon fails to be reconstructed, can mimic the dilepton signature of Higgs boson production. Such a situation is especially common when $m(\gamma^*)$ is low and the flight directions of the daughter leptons are aligned with the momentum of the virtual photon in the lab frame: one lepton is then often red-shifted to very low momentum in the lab frame, so that it may fail the analysis momentum cuts while contributing too little isolation energy to make its pair partner appear non-isolated. This phenomenon of "asymmetric internal conversion" was found to contribute a non-trivial background to the $H \to WW^*$ search.
In this paper, we demonstrate that a generally neglected process, $pp \to t\bar t$ where one top quark undergoes the rare decay $t \to \ell'\nu b\gamma^*$ or $t \to qq'b\gamma^*$ with subsequent decay $\gamma^* \to \ell\ell$, can induce a significant background to multilepton searches for top quark-associated phenomena via asymmetric internal conversions. In light of persistent excesses over the Standard Model prediction in $t\bar t W$ and $t\bar t H$ multilepton searches, potential additional backgrounds are important to identify.
Since generally used QED showering algorithms do not include virtual photon splitting to leptons, this contribution needs to be explicitly included in event generators as a matrix element calculation. Recently, QED parton showers supporting $\gamma \to \ell\ell$ splitting kernels have been made public (such as in newer revisions of Pythia 8 [9] and the C++ version of PHOTOS [10]). Prescriptions for the proper matching of matrix element and parton shower event generation for this process remain to be developed and will be critical for future precision understanding of these rare decays, but are outside the scope of this paper.
The paper is organized as follows. In Section 2 the generic behavior expected of asymmetric internal conversions (AIC) as a function of the momentum and mass of the virtual photon is discussed. In Section 3 the decay width for the internal conversion decay of a top quark is determined and compared with previous predictions. In Section 4 a realistic detector parametrization is used to study how AIC events will be reconstructed, and the resulting yields in a trilepton analysis are compared to those of the $t\bar t W$ and $t\bar t H$ production processes.

Asymmetric Internal Conversion Kinematics
Asymmetric internal conversion (AIC) events $\gamma^* \to \ell\ell$ will leave different observable signatures depending on $m(\gamma^*)$, $p_T(\gamma^*)$, and the angle $\theta$ between the $\ell^-$ flight direction and the $\gamma^*$ flight direction as measured in the virtual photon rest frame.
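As a rough illustration of why aligned decays tend to lose a lepton, the following sketch neglects the lepton mass (a simplification assumed here, not taken from the analysis). With $E$, $p$, and $m$ the lab-frame energy, momentum, and mass of the virtual photon, boosting a rest-frame lepton of energy $m/2$ gives

```latex
\begin{align}
E_{\ell^\mp} &= \tfrac{1}{2}\left(E \pm p\cos\theta\right), \qquad E = \sqrt{p^2 + m^2}, \\
E_{\ell^+}\big|_{\cos\theta = 1} &= \tfrac{1}{2}\left(E - p\right) = \frac{m^2}{2\,(E + p)} \approx \frac{m^2}{4p} \quad (p \gg m).
\end{align}
```

For example, $m = 1$ GeV and $p = 20$ GeV with $\cos\theta = 1$ leave the backward-emitted lepton with only about 12 MeV, while its partner carries essentially the full 20 GeV.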
To gain some insight into the kinematics of asymmetric internal conversion events, we run a toy Monte Carlo simulation of $\gamma^* \to \mu^+\mu^-$ decays for various values of $m(\gamma^*)$ and $p(\gamma^*)$, mapping the fraction of events that pass two types of selections:
1. A two-lepton selection, where both leptons are required to have momentum above 10 GeV and to be separated by more than 0.3 radians (isolation);
2. A one-lepton selection, where exactly one lepton must have momentum above 10 GeV and where, if the two leptons lie within 0.3 radians of each other, the second is required to satisfy p(sublead)/p(lead) < 0.1 (isolation).
In the toy MC, total momentum is used instead of transverse momentum $p_T$ to simplify the problem; the two are equivalent for $\eta \approx 0$. For each value of $m(\gamma^*)$ and $p(\gamma^*)$ tested, many potential decay angles $\cos\theta$ are chosen in the rest frame of the virtual photon, and then the system is boosted into the lab frame. The virtual photons are assumed to be completely transversely polarized and therefore are generated according to the probability density $P(\cos\theta) \propto 1 + \cos^2\theta + (1 - \beta^2)\sin^2\theta$, where $\beta$ is the lepton velocity in the virtual photon rest frame; in the limit of ultrarelativistic lepton daughters this recovers the familiar form $P \propto 1 + \cos^2\theta$. These results depend only on the parameters of the virtual photon system and so are universal for AIC events with transversely polarized photons.
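The toy procedure described above can be sketched as follows. This is a minimal illustration rather than the code used for the figures: the function names, the accept-reject sampler, the muon mass value, and the planar treatment of the decay are all assumptions made here, and the angular density is taken to be the standard transverse-photon form $1 + \cos^2\theta + (1 - \beta^2)\sin^2\theta$, which reduces to $1 + \cos^2\theta$ as $\beta \to 1$.

```python
import math
import random

M_MU = 0.10566  # muon mass in GeV (assumed value)


def sample_cos_theta(beta, rng):
    """Draw cos(theta) from 1 + c^2 + (1 - beta^2)(1 - c^2) by accept-reject."""
    while True:
        c = rng.uniform(-1.0, 1.0)
        weight = 1.0 + c * c + (1.0 - beta * beta) * (1.0 - c * c)
        if rng.uniform(0.0, 2.0) < weight:  # the density never exceeds 2
            return c


def decay_lab(m, p, cos_theta, m_lep=M_MU):
    """Lab-frame momenta of both leptons and their opening angle (radians).

    The virtual photon (mass m, momentum p, in GeV) moves along +z. The
    azimuth does not affect these quantities, so the decay is kept in a plane.
    """
    e_star = 0.5 * m
    p_star = math.sqrt(max(e_star ** 2 - m_lep ** 2, 0.0))
    gamma = math.sqrt(p * p + m * m) / m
    beta_gamma = p / m
    sin_theta = math.sqrt(max(1.0 - cos_theta ** 2, 0.0))
    vecs = []
    for sign in (+1.0, -1.0):  # back-to-back leptons in the rest frame
        px = sign * p_star * sin_theta
        pz = gamma * sign * p_star * cos_theta + beta_gamma * e_star
        vecs.append((px, pz))
    (ax, az), (bx, bz) = vecs
    p_a, p_b = math.hypot(ax, az), math.hypot(bx, bz)
    if p_a == 0.0 or p_b == 0.0:
        return p_a, p_b, 0.0
    cos_open = max(-1.0, min(1.0, (ax * bx + az * bz) / (p_a * p_b)))
    return p_a, p_b, math.acos(cos_open)


def classify(p_a, p_b, angle, p_min=10.0, cone=0.3, iso_frac=0.1):
    """Return (passes two-lepton selection, passes one-lepton selection)."""
    lead, sub = max(p_a, p_b), min(p_a, p_b)
    two_lepton = sub > p_min and angle > cone
    one_lepton = (lead > p_min and sub <= p_min
                  and (angle >= cone or sub / lead < iso_frac))
    return two_lepton, one_lepton


def acceptance(m, p, n=20000, seed=1):
    """Fractions of toy decays passing the two-lepton and one-lepton selections."""
    rng = random.Random(seed)
    beta = math.sqrt(max(1.0 - (2.0 * M_MU / m) ** 2, 0.0))
    n_two = n_one = 0
    for _ in range(n):
        two, one = classify(*decay_lab(m, p, sample_cos_theta(beta, rng)))
        n_two += two
        n_one += one
    return n_two / n, n_one / n
```

Scanning `acceptance(m, p)` over a grid of masses and momenta gives acceptance maps of the kind discussed in this section; the exact values depend on the assumptions listed above.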
The probability for an event to be chosen under the two selections, as a function of virtual photon mass and momentum, is shown in Figure 1 for the dimuon case. Several features are evident. At high $m(\gamma^*)$ and $p(\gamma^*)$, the two plots are complementary, as at least one lepton will have $p > 10$ GeV, and the two selections are disjoint. At low mass and momentum, no leptons are found at all; they are both too soft. At low mass and high momentum, the lepton separation/isolation requirements suppress the acceptance. There is a crescent-shaped region of near-zero dilepton acceptance with over 90% acceptance for the single-lepton selection, extending to $m(\gamma^*) \sim 2.5$ GeV; the acceptance is still significant until it is truncated by the isolation cut. We conclude that an analysis that requires one lepton and vetoes a second will have high sensitivity to these "lost" leptons.
We next ask whether this background can be mitigated or constrained from data. From Figure 1 it is clear that the acceptance for finding two leptons with $p > 10$ GeV in the region of interest is essentially zero, so other means are needed. First, we ask whether the subthreshold lepton can still be seen, with the leading lepton threshold still at 10 GeV. Figure 2a shows the median momentum of the subleading lepton when exactly one lepton is found with $p > 10$ GeV; if a subleading lepton can be reconstructed down to 4 GeV momentum, then events down to $m(\gamma^*) = 5$ GeV can be reconstructed in most of the dangerous crescent region, although the very lowest masses cannot be reached.
We also consider the possibility of lowering the leading lepton threshold to 5 GeV and eliminating the separation cut between two reconstructed leptons in order to improve acceptance at low $m(\gamma^*)$. The results are shown in Figure 2b. With this selection, an acceptance of 30% can be achieved down to dilepton threshold in the dangerous crescent region. The better matching of the shape of the acceptance curve to the crescent in the latter case, and the ability to reach lower $m(\gamma^*)$, suggest it as a preferred solution to map out the cross section for AIC processes in data.

The Rare Decays $t \to \ell'\nu b\ell\ell$ and $t \to qq'b\ell\ell$
At 13 TeV, a $\sigma(pp \to t\bar t)$ of 832 pb has been calculated at NNLO+NNLL using the Top++ v2.0 program [11] by the LHC Top Working Group [12]. By comparison, the $pp \to t\bar t W \to 3\ell\,3\nu\,b\bar b$ ($\ell = e, \mu$) cross section at NLO is 6.2 fb [13,14], so a $10^{-5}$ top quark branching fraction to a similar final state could produce comparable yields.
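The order-of-magnitude comparison above can be checked with a few lines of arithmetic. The sketch below uses the cross sections quoted in the text; the function name and the factor of two (either top quark in the event may undergo the rare decay) are assumptions made here for illustration, and no acceptance or further branching fractions are applied.

```python
SIGMA_TTBAR_PB = 832.0   # inclusive ttbar cross section at 13 TeV, from the text
SIGMA_TTW_3L_FB = 6.2    # pp -> ttW -> 3 leptons (e, mu) cross section, from the text


def rare_decay_xsec_fb(branching, sigma_ttbar_pb=SIGMA_TTBAR_PB, tops_per_event=2):
    """Cross section in fb for ttbar events where one top decays via a rare mode.

    tops_per_event=2 is an assumption: for small branching fractions the chance
    that at least one of the two tops takes the rare mode is about 2 * branching.
    """
    return sigma_ttbar_pb * 1000.0 * tops_per_event * branching


if __name__ == "__main__":
    sigma = rare_decay_xsec_fb(1e-5)
    print(f"rare-decay ttbar: {sigma:.1f} fb, ttW (3 leptons): {SIGMA_TTW_3L_FB} fb")
```

With a branching fraction of $10^{-5}$ this gives roughly 17 fb before any selection, the same order of magnitude as the quoted 6.2 fb.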
Top quark decays are dominated by the CKM-favored two-body decay $t \to Wb$. Other potential two-body decays are suppressed either by small CKM elements (for $Wq$) or by GIM-suppressed loops ($Zq$ or $Hq$). The CKM-allowed decay $t \to WZb$ is very near threshold, and the decay width is dominated by finite-width effects. Ref. [15] derives a branching fraction $\mathcal{B}(t \to WZb) \sim 2 \times 10^{-6}$, which means this decay will not compete with $pp \to t\bar t Z$ or $tZ$ production. The decays $t \to Wbg$ and $t \to Wb\gamma$ are significant but generally treated as FSR corrections to $t \to Wb$ using QCD and QED Monte Carlo showering algorithms.
The four-body decay $t \to Wb\ell^+\ell^-$ was considered in Ref. [16] and found to give a ratio $R = \Gamma(t \to Wbe^+e^-)/\Gamma(t \to Wb) = 6.3 \times 10^{-6}$ for $m(\gamma^*)^2 > 20$ GeV$^2$, and the same for the dimuon decay (all at leading order). This is clearly in the regime discussed above where this decay might compete with processes such as $t\bar t W$, considering in particular that one expects significant cross section below this $m(\gamma^*)$ cutoff of $\approx 4.5$ GeV. A leading order calculation suffers from the pole of the photon propagator, especially for the dielectron mode, where the ratio $R$ exceeds unity for an $m(\gamma^*)^2$ cut below $10^{-3}$ GeV$^2$, which is still above dielectron threshold. One expects this divergence to be compensated by virtual corrections in a higher order calculation and, indeed, the best strategy for dealing with these decays is most likely a resummed calculation or parton shower. Nevertheless, it is instructive to study the fixed leading order calculation.
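For orientation, the two mass scales mentioned above can be made explicit. The electron mass value used here ($m_e \approx 0.511$ MeV) is a standard number supplied for this illustration rather than one quoted in the text:

```latex
\begin{align}
\sqrt{20~\mathrm{GeV}^2} &\approx 4.47~\mathrm{GeV}, \\
(2m_e)^2 &\approx \left(1.022 \times 10^{-3}~\mathrm{GeV}\right)^2 \approx 1.04 \times 10^{-6}~\mathrm{GeV}^2 \ll 10^{-3}~\mathrm{GeV}^2 .
\end{align}
```

The dielectron threshold therefore lies about three orders of magnitude below the $m(\gamma^*)^2$ value at which the leading order ratio already exceeds unity.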
We use MadGraph5_aMC@NLO [14] (hereafter MG5_aMC) version 2.5.4, running in leading order mode, to study the inclusive processes $t \to \ell'\nu b\ell\ell$ and $t \to qq'b\ell\ell$. We use $m_t = 173$ GeV and non-zero lepton masses in the calculation. First, we compute the ratio $R = \Gamma(t \to Wbe^+e^-)/\Gamma(t \to Wb)$ for $m(\gamma^*)^2 > 20$ GeV$^2$, to compare with Ref. [16]. Our result is $5.8 \times 10^{-6}$, in reasonable agreement with the earlier calculation. We find that increasing the top mass by 0.5 GeV results in an increase in the ratio of $0.08 \times 10^{-6}$, indicating good stability with respect to the top quark mass choice.
In the calculation of Ref. [16] the possibility of the virtual photon being radiated from the daughters of the $W$ boson is not considered. Considering the full set of possible diagrams for $t \to (\ell\nu, qq')be^+e^-$, we compute a somewhat larger $R$ of $1.1 \times 10^{-5}$ for the same cutoff $m(\ell^+\ell^-)^2 > 20$ GeV$^2$. (In the case of the three leptons having the same flavor, the $m(\ell\ell)$ threshold is applied on both opposite-sign pairs.) We find that $\Gamma(qq'be^+e^-)/\Gamma(\ell\nu be^+e^-) = 1.2$ summing over all lepton flavors, which is much smaller than the equivalent ratio $\Gamma(qq'b)/\Gamma(\ell\nu b) \approx 2$. This feature seems to be robust to various generation options and may arise from a) the smaller charges of the quarks compared to the charged leptons and b) interference between diagrams involving $W$-daughter radiation and other diagrams. In Section 4 we focus on the $\ell'\nu b\ell^+\ell^-$ case and so are not sensitive to potential complications in $W$ hadronic decays.
For the simulations of Section 4 we choose a lower $m(\ell^+\ell^-)$ threshold, 1 GeV. This allows us to fully populate the high-acceptance region for AIC events. The corresponding $R$ is $4.9 \times 10^{-5}$. The simulation of $t\bar t$ events with a rare decay is performed completely at LO in MG5_aMC, with the rare decays simulated as a 5-body decay. Although this simulation will miss features such as recoil of the $t\bar t$ system against hard additional jets, it will illustrate the core physics. We strongly emphasize that higher-order effects are expected to induce large corrections and that in the leading order calculation there is a large additional cross section at $m(\gamma^*) < 1$ GeV. This $R$ value should therefore be treated as purely indicative.

Simulation of Rare Top Decay Impact on $3\ell + b$ Searches
To give a practical picture of the potential impact of these rare top decays on an actual search, we consider a simplified $3\ell + b$ search with a $Z$-veto. This final state is sensitive to $t\bar t W$ and $t\bar t H$.
The sensitivity of such an analysis to asymmetric conversions depends strongly on detector response and analysis choices. We pass events generated in MG5_aMC at 13 TeV and showered by Pythia 8 [9,17] to the generic detector fast simulation program Delphes [18], configured to approximate the characteristics of the CMS detector. We generate three different processes. The $t\bar t$ rare decay sample has a final state in which all leptons are either $e$ or $\mu$, and the $\ell_3^+\ell_3^-$ pair from the internal conversion can come from the decay of either $t$ or $\bar t$. The $pp \to t\bar t W$ process is generated with subsequent decay of all $W$ bosons to a lepton and a neutrino, where here leptons include $e$, $\mu$, and $\tau$. The $pp \to t\bar t H$ process is generated with subsequent $H \to WW^*$ decay; three of the four $W$ bosons are required to decay to a lepton and neutrino (including $e$, $\mu$, and $\tau$) and the fourth is required to decay to $qq'$. For the $t\bar t$ rare decay process we assign a cross section of 10.9 fb using a branching fraction $\mathcal{B}(t \to \ell'\nu b\ell\ell)$ computed using MG5_aMC and a total inclusive $t\bar t$ cross section of 832 pb. For $t\bar t W$ and $t\bar t H$ we assign cross sections of 20.8 and 10.3 fb, respectively, derived from production cross sections and decay branching fractions in the LHC Higgs Cross Section Working Group Yellow Report 4 [13] and the Particle Data Group averages [19]. All processes are generated at leading order with no additional partons produced in the matrix element.
The default Delphes CMS simulation card was altered to use anti-$k_T$ $R = 0.4$ jets with a $p_T$ threshold of 25 GeV, and to require that all reconstructed leptons satisfy $p_T > 10$ GeV. Leptons are required to be isolated; a particle flow algorithm is used to classify additional activity in a cone of $\Delta R \equiv \sqrt{\Delta\eta^2 + \Delta\phi^2} = 0.3$ around the lepton, and the total transverse energy of particles with $p_T > 0.5$ GeV in this cone is required to be less than 10% of the lepton $p_T$. To remove quarkonia decaying to muon pairs, typical analyses will require the invariant mass of any opposite-sign same-flavor lepton pair to exceed 12 GeV, which also has the advantage of removing low mass internal conversion events where both leptons are reconstructed.
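The isolation requirement described above can be illustrated with a short sketch. It is not the Delphes implementation: the dictionary-based particle records and the function names are hypothetical, and only the cone size, the 0.5 GeV particle threshold, and the 10% relative cut are taken from the text.

```python
import math


def delta_phi(phi1, phi2):
    """Azimuthal difference wrapped into [-pi, pi)."""
    return (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi


def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance sqrt(d_eta^2 + d_phi^2)."""
    return math.hypot(eta1 - eta2, delta_phi(phi1, phi2))


def is_isolated(lepton, particles, cone=0.3, min_pt=0.5, max_rel_iso=0.10):
    """True if the summed pT of nearby particles is below max_rel_iso * lepton pT.

    `lepton` and each entry of `particles` are dicts with keys "pt", "eta",
    "phi" (a hypothetical event format chosen for this example).
    """
    iso_sum = sum(
        part["pt"]
        for part in particles
        if part is not lepton
        and part["pt"] > min_pt
        and delta_r(lepton["eta"], lepton["phi"], part["eta"], part["phi"]) < cone
    )
    return iso_sum < max_rel_iso * lepton["pt"]
```

In the AIC case the lost partner lepton is itself one of the nearby particles, so a soft partner carrying less than 10% of the leading lepton $p_T$ does not spoil the isolation.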
We consider three possible analysis selection scenarios:
1. exactly three reconstructed leptons with $p_T > 10$ GeV, no further selection on jets;
2. exactly three reconstructed leptons with $p_T > 20$ GeV, no further selection on jets;
3. exactly three reconstructed leptons with $p_T > 20$ GeV, with $\geq 4$ jets required.
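The three scenarios can be written down compactly as a configuration plus one predicate. This is a sketch under stated assumptions: the names `SCENARIOS` and `passes_scenario` are hypothetical, leptons are counted above the scenario threshold only (the text does not say how an additional lepton between 10 and 20 GeV is treated in Scenarios 2 and 3), and the $Z$-veto and low-mass pair requirements are not included.

```python
# Lepton pT threshold (GeV) and minimum jet multiplicity for each scenario.
SCENARIOS = {
    1: {"lep_pt": 10.0, "min_jets": 0},
    2: {"lep_pt": 20.0, "min_jets": 0},
    3: {"lep_pt": 20.0, "min_jets": 4},
}


def passes_scenario(scenario, lepton_pts, n_jets):
    """True if exactly three leptons exceed the threshold and enough jets are found.

    lepton_pts: pT values (GeV) of the reconstructed, isolated leptons.
    n_jets: number of reconstructed jets with pT > 25 GeV.
    """
    cfg = SCENARIOS[scenario]
    n_leptons = sum(pt > cfg["lep_pt"] for pt in lepton_pts)
    return n_leptons == 3 and n_jets >= cfg["min_jets"]
```

For instance, an event with lepton $p_T$ values of 35, 22, and 12 GeV and two jets passes Scenario 1 but fails Scenarios 2 and 3, which is the typical fate of an AIC event with a soft surviving conversion lepton.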
These are motivated by different potential analysis targets. Scenario 1 illustrates generic features of the different processes and might be used as a preselection for a multivariate discriminant analysis that seeks as much acceptance as possible. Scenario 2 attempts to reduce the impact of the $t\bar t$ rare decay background by raising lepton momentum thresholds (this would also be motivated by reducing the contamination from leptons from $b$- and $c$-hadron decays), and would be a starting point for a $t\bar t W$ selection. Scenario 3 is closest to a selection targeting $t\bar t H$, requiring high jet multiplicity to suppress the $t\bar t W$ contribution. In all cases triggering should be very efficient and is not considered. We make no specific selection on the number of $b$-tagged jets. All three samples ($t\bar t$ rare decay, $t\bar t W$, $t\bar t H$) have virtually identical distributions of the number of reconstructed $b$-tagged jets, consistent with the tagging efficiency implemented in the Delphes card. To improve statistical yields for comparisons we do not cut on this variable.
The results for Scenario 1 are shown in Figures 3 and 4. Figure 3a shows the transverse momentum spectrum for the two leptons with the same charge. In the case of the $t\bar t$ rare decay with one AIC lepton lost, the AIC lepton that is found will necessarily be one of these same-charge leptons. The large contribution of lower-momentum leptons in the rare decay is clear. Figure 3b shows the same for the one lepton of opposite charge to the other two; this is expected to never come from a conversion, and indeed no low-$p_T$ peak is seen. (In fact the $t\bar t H$ process has a larger fractional contribution from the lowest $p_T$ leptons, due to the decay of the low mass off-shell $W^*$.) Figures 3c and 3d show the distributions for the smaller and larger of the two possible invariant masses formed between opposite-charge leptons in each event. In both cases the $t\bar t$ rare decay distribution looks similar to that of $t\bar t H$ and softer than the spectrum for $t\bar t W$. The number of jets is shown in Figure 3e; this peaks at 2 for the $t\bar t$ rare decays. The spectrum for $t\bar t W$ is similar, and $t\bar t H$ peaks towards higher multiplicity. Figure 3f shows the ratio of the isolation $p_T$ to the lepton $p_T$. Although this shows some slope difference between the rare decay and $t\bar t W$/$t\bar t H$, this appears to arise primarily from the different spectrum of the lepton $p_T$, which appears in the denominator, rather than from any difference in the isolation energy itself. Finally, Figures 4a and 4b show the missing transverse energy and the scalar sum of transverse momenta $H_T$; the three processes are not dramatically different in these distributions, although $t\bar t H$ has a larger $H_T$, consistent with having more jets in the final state.
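The smaller and larger opposite-charge invariant masses referred to above can be computed as in the following sketch. The lepton record format and function names are hypothetical, and leptons are treated as massless, so that $m^2 = 2\,p_{T,1}\,p_{T,2}\,(\cosh\Delta\eta - \cos\Delta\phi)$.

```python
import math


def pair_mass(l1, l2):
    """Invariant mass (GeV) of two leptons treated as massless."""
    m2 = 2.0 * l1["pt"] * l2["pt"] * (
        math.cosh(l1["eta"] - l2["eta"]) - math.cos(l1["phi"] - l2["phi"])
    )
    return math.sqrt(max(m2, 0.0))


def os_pair_masses(leptons):
    """Return (smaller, larger) invariant mass of the two opposite-charge pairs.

    `leptons` is a list of three dicts with keys "pt", "eta", "phi", "charge";
    the total charge must be +1 or -1 so that exactly two such pairs exist.
    """
    masses = sorted(
        pair_mass(a, b)
        for i, a in enumerate(leptons)
        for b in leptons[i + 1:]
        if a["charge"] * b["charge"] < 0
    )
    if len(masses) != 2:
        raise ValueError("expected three leptons with total charge +1 or -1")
    return masses[0], masses[1]
```

In an AIC event the conversion partner is lost, so neither of these pairs is expected to reconstruct the low virtual photon mass.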
Typically, to suppress non-prompt lepton contributions, the $p_T$ threshold on the leptons will be higher than 10 GeV. In Scenario 2 we apply a uniform 20 GeV cut on all leptons. (In principle one can do this for the same-sign pair only, but the conclusions are the same.) The results are shown in Figure 5 for certain selected variables. The $pp \to t\bar t W$ process is now strongly enhanced over both the rare $t\bar t$ decay and $t\bar t H$ production. Figure 5a shows the $p_T$ of the same-charge leptons. The distribution shows a kink at $\approx 40$ GeV for the rare decay, above which it has a very similar momentum spectrum to that of $t\bar t H$. The invariant mass distributions in Figures 5c and 5d look almost identical between the rare decay process and $t\bar t H$; the two processes are distinguished by the number of associated jets and the $H_T$. Now that the low-$p_T$ spike in the rare decay has been removed, the isolation energy ratio (Figure 5f) looks very similar for all processes.
Finally, we make a requirement on the total number of jets in order to isolate $t\bar t H$ and separate it from $t\bar t W$. Some variables are shown in Figure 6. Other than in the shape of the same-sign lepton $p_T$, where the same kink at 40 GeV is seen as in Scenario 2, it is hard to convincingly discriminate the rare decay and $t\bar t H$.
The yields of the three considered processes under each scenario are shown in Table 1. In Scenarios 2 and 3 ($t\bar t W$- and $t\bar t H$-search-like regions) the $t\bar t$ AIC background contributes 15-20% of the relevant signal yield, and can have a non-trivial impact on extraction of $t\bar t W$ and $t\bar t H$ cross sections. Due to the large uncertainty in the branching fraction for the rare decay, this impact may in fact be larger than seen here.
We raise one additional concern. The rare decay process will also appear in same-sign dilepton analyses, as the AIC lepton is not charge-correlated with the leptons from $W$ decay in top events. The AIC leptons will appear as a relatively low $p_T$ contribution, and may be hard to distinguish from non-prompt electrons and muons from hadronic decays that happen to be isolated. Extractions of fake factors and validation of data-driven non-prompt estimates may be affected if this additional prompt lepton contribution is not considered. As non-prompt lepton rates are very process- and event-selection-specific, we do not discuss them further here.

Conclusion
The rare top decays $t \to (\ell'\nu, qq')b\ell\ell$ have not generally been considered as potential backgrounds for searches. However, at low invariant mass, lepton kinematics and analysis selections can result in a high likelihood of only one of the two leptons being reconstructed; the relatively high top decay branching fraction in this region may yield a significant number of events in signal regions for $t\bar t W$ or $t\bar t H$ production. Considering the fact that there are persistent excesses in $t\bar t W$ and $t\bar t H$ results over the Standard Model expectations in multilepton searches, this background source should be carefully evaluated for its impact in real LHC analyses. A reliable calculation of the decay branching fraction and kinematics, with a proper treatment of the IR divergence as $m(\ell\ell) \to 0$, will be extremely valuable. Prescriptions for matching QED parton shower calculations including $\gamma \to \ell\ell$ splitting and the corresponding perturbative matrix element calculations will enable consistent simulations of these processes by the LHC experiments. A degree of data-driven background estimation is possible, requiring the use of soft leptons with momentum down to $\approx 5$ GeV to fully reconstruct asymmetric internal conversion events.