Search for pair production of excited top quarks in the lepton+jets final state

A search is performed for the pair production of spin-3/2 excited top quarks, each decaying to a top quark and a gluon. The search uses the data collected with the CMS detector from proton-proton collisions at a center-of-mass energy of 13 TeV, corresponding to an integrated luminosity of 35.9 inverse femtobarns. Events are selected by requiring an isolated muon or electron, an imbalance in the transverse momentum, and at least six jets of which exactly two must be compatible with originating from the fragmentation of a bottom quark. No significant excess over the standard model predictions is found. A lower limit of 1.2 TeV is set at 95% confidence level on the mass of the spin-3/2 excited top quark in an extension of the Randall-Sundrum model, assuming a 100% branching fraction of its decay into a top quark and a gluon. These are the best limits to date in a search for excited top quarks and the first at 13 TeV.


Introduction
The standard model (SM) of particle physics provides a successful description of the properties of the elementary particles and their interactions. Despite its success, the SM is assumed to be an effective model of a more complete theory. Many extensions of the SM predict that the top quark is a composite particle and not a fundamental object [1][2][3][4]. A direct confirmation of this hypothesis could be achieved by the discovery of an excited top quark (t * ).
In models that describe the proposed excited top quark [5,6], weak isodoublets are used to represent both left- and right-handed components of the t * quark, allowing for a description of finite masses prior to the onset of electroweak symmetry breaking. Thus, in contrast to the heavy top quark from a sequential fourth-generation model, in these models the existence of t * quarks is not strongly constrained by the discovery of a SM-like Higgs boson [7][8][9]. In string realizations of the Randall-Sundrum (RS) model [10,11], the right-handed t * quark is expected to be the lightest spin-3/2 excited state [12].
A spin-3/2 t * quark is described by the Rarita-Schwinger [13] vector spinor Lagrangian. At LHC energies, the production cross section of spin-3/2 quarks is proportional to ŝ^3, where ŝ is the square of the energy in the parton-parton collision rest frame, rather than to ŝ^−1, as it is for spin-1/2 quarks [14]. Therefore, when integrating over the parton momentum fractions (x) in proton-proton collisions, spin-3/2 quarks receive a contribution at large x values that is greater than that from spin-1/2 quarks. In the RS model, the spin-3/2 t * quark is expected to have a pair production cross section of the order of a few picobarns at √s = 13 TeV, for a t * of mass m t * = 1 TeV [1,14,15], which dominates over single t * production for most of the parameter space in the model [12]. The t * quark decays predominantly to a top quark through the emission of a gluon [1,12,15,16].
In this Letter, we present a search for pair-produced t * quarks, where each t * quark decays exclusively to a top quark (t) and a gluon (g). We use data recorded in 2016 with the CMS detector in proton-proton (pp) collisions at √s = 13 TeV at the LHC, corresponding to an integrated luminosity of 35.9 fb −1 . We consider the case where one top quark decays via a hadronically decaying W boson, and the W boson originating from the second top quark decays to an electron or muon (ℓ) and a neutrino: t * t * → (tg)(tg) → (Wbg)(Wbg) → (qqbg)(ℓνbg). We refer to the resulting final state (one reconstructed muon or electron, missing transverse momentum, and multiple jets) as the lepton+jets decay topology.
A search for pair-produced t * quarks was previously performed by CMS using pp collisions at √ s = 8 TeV [17]. This Letter presents a more sensitive search because of the higher collision energy and therefore larger signal cross sections, and the larger data sample, which is nearly twice the size. In addition, the simulation has been improved by explicitly including the Rarita-Schwinger Lagrangian in the generator, resulting in the correct spin correlations for the signal.

The CMS detector and simulated samples
The central feature of the CMS apparatus is a superconducting solenoid of 6 m internal diameter, providing a magnetic field of 3.8 T. Within the solenoid volume are a silicon pixel and strip tracker, a lead tungstate crystal electromagnetic calorimeter (ECAL), and a brass and scintillator hadron calorimeter (HCAL), each composed of a barrel and two endcap sections. Forward calorimeters extend the pseudorapidity (η) coverage provided by the barrel and endcap detectors. Muons are measured in gas-ionization detectors embedded in the steel flux-return yoke outside the solenoid. A more detailed description of the CMS detector, together with a def-

Event reconstruction
Event reconstruction is based on the CMS particle-flow (PF) algorithm [30], which takes into account information from all subdetectors, including measurements from the tracking system, energy deposits in the ECAL and HCAL, and tracks reconstructed in the muon detectors. Given this information, all particles in the event are reconstructed as electrons, muons, photons, and charged or neutral hadrons. Photons are identified as ECAL energy clusters not linked to the extrapolation of any charged-particle trajectory to the ECAL. Muons are identified as a track in the central tracker consistent with either a track or several hits in the muon system, and not associated with energy clusters in the calorimeters. Electrons are identified as a primary charged-particle track that extrapolates to at least one ECAL energy cluster. The track may be associated with bremsstrahlung photons emitted along the way through the tracker material. Charged hadrons are identified as charged-particle tracks identified neither as electrons nor as muons. Finally, neutral hadrons are identified as HCAL energy clusters not linked to any charged-hadron trajectory, or as ECAL and HCAL energy excesses with respect to the expected charged-hadron energy deposits.
For each event, jets from these reconstructed particles are clustered with the infrared and collinear safe anti-k T algorithm [31], using a distance parameter R = 0.4. Charged hadrons associated with pileup vertices are excluded from jet reconstruction. The jet momentum is the vectorial sum of the momenta of all particles contained in the jet. The reconstructed jet momentum is found in simulation to be within 5 to 10% of the true momentum over the whole p T spectrum and detector acceptance. Jet energy corrections are derived from the simulation and measurements in collision data [32]. The jet energy resolution amounts typically to 15% at 10 GeV, 8% at 100 GeV, and 4% at 1 TeV [32]. The jet energy resolution in simulation is degraded to match that observed in data.
Jets are identified as originating from a bottom quark through a combined secondary vertex algorithm CSVv2 [33,34]. The algorithm uses a multivariate discriminator to combine information on the significance of the impact parameter, the jet kinematics, and the location of the secondary vertex. A working point of the discriminator with ≈70% b quark identification efficiency and ≈1% mistag efficiency for light quarks and gluons is used in this analysis. Small differences in b tagging efficiencies and mistag rates between data and simulated events are accounted for by applying additional corrections to simulation.
The missing transverse momentum vector is defined as the negative vector sum of the momenta of all reconstructed PF candidates in an event projected onto the plane perpendicular to the beams. Its magnitude is referred to as p miss T .
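As a minimal sketch of this definition, the following Python function computes the negative vector sum in the transverse plane. The (pt, phi) input format and the function name are assumptions made for illustration and are not taken from the CMS reconstruction software.

```python
import math

def missing_transverse_momentum(candidates):
    """Return (pt_miss, phi_miss) from an iterable of (pt, phi) pairs.

    Each pair holds the transverse momentum (GeV) and azimuthal angle (rad)
    of one reconstructed candidate; this input format is hypothetical.
    """
    # Negative vector sum of the transverse momentum components.
    px = -sum(pt * math.cos(phi) for pt, phi in candidates)
    py = -sum(pt * math.sin(phi) for pt, phi in candidates)
    return math.hypot(px, py), math.atan2(py, px)

# Toy event: two back-to-back candidates that do not balance.
toy_candidates = [(50.0, 0.0), (30.0, math.pi)]
pt_miss, phi_miss = missing_transverse_momentum(toy_candidates)
```

In the toy event the imbalance points opposite to the harder candidate, as expected for a momentum carried away by an undetected particle.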

Event selection
This analysis searches for t * t * production, with each t * decaying to t+g and the tt pair in the event reconstructed in the lepton+jets final state. Events are required to contain exactly one isolated lepton, p miss T , and at least six jets, exactly two of which must be b tagged. Events containing a muon are selected with a single-muon trigger that requires the presence of an isolated muon with transverse momentum p T > 27 GeV. Events containing an electron are selected with a single-electron trigger that requires the presence of an isolated electron with p T > 32 GeV. The background rate for the single electron trigger was much higher than for the single muon trigger, requiring more stringent selection criteria for the electron channel. A deterministic annealing algorithm is used to reconstruct the candidate primary vertices [35]; the vertex with the highest track multiplicity is selected as the primary event vertex. Selected events are required to have this primary vertex within 2 cm of the center of the detector in the x-y plane, and within 24 cm along the z-direction.
Offline, muons are required to have p T > 30 GeV and |η| < 2.1. The track associated with a muon is required to have hits in the pixel and muon detectors, a good quality fit, and transverse and longitudinal impact parameters with respect to the primary vertex smaller than 2 and 5 mm, respectively. An isolation factor I is defined as the scalar sum, divided by the muon p T , of the p T of all photons, charged hadrons, and neutral hadrons within an angular cone of ∆R ≡ √((∆η)^2 + (∆φ)^2) < 0.4 (where φ is the azimuthal angle) around the track, corrected for the effects of pileup [36]. An isolation selection I < 0.15, corresponding to an efficiency of ≈95%, is used.
Electrons are required to have p T > 35 GeV and to be within the region |η| < 2.1. Electrons within 1.44 < |η| < 1.56, corresponding to the ECAL barrel-endcap transition region, are rejected to avoid poor reconstruction performance. Electrons are selected using a cutoff-based selection method [37] based on the shower shape, the track quality, the spatial match between the track and the electromagnetic cluster, the fraction of total cluster energy in the HCAL, and the resulting level of activity in the surrounding tracker and calorimeter regions. The criteria imposed in these electron selection algorithms have a combined efficiency of ≈70%.
In addition to the selections above, the leptons are required to have an angular separation ∆R < 0.1 with respect to the lepton reconstructed by the trigger system. The lepton selection efficiencies for data and simulation are measured using the tag-and-probe method [37]. Additional corrections are applied to simulation to account for observed differences in the efficiencies between data and simulation.
Table 1: Expected numbers of selected events for the simulated signal process as a function of m t * . Also shown are the expected numbers of events predicted by the SM, together with the systematic uncertainties discussed in Section 7 and the uncertainties in the cross sections of the various processes, as well as the numbers of selected events observed in data.
The p miss T is required to be greater than 20 GeV, while the jets are required to have p T > 30 GeV, |η| < 2.4, and angular separation ∆R > 0.4 with respect to well-identified electrons or muons. In order to reject misreconstructed, poorly reconstructed, and noisy jets, the fractional energy contribution from both ECAL and HCAL must be non-zero and non-unity. Exactly two jets are required to pass the b tagging criteria.
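The offline kinematic requirements listed in this section can be collected into a single predicate, sketched below. The event layout (dictionaries with `flavor`, `pt`, `eta`, `isolated`, and `btag` fields) is hypothetical, and the sketch assumes that lepton identification, isolation, trigger matching, jet quality, and jet-lepton separation have already been evaluated upstream.

```python
def select_leptons(leptons):
    """Keep leptons passing the offline pT and eta requirements."""
    selected = []
    for lep in leptons:
        abs_eta = abs(lep["eta"])
        if not lep["isolated"] or abs_eta >= 2.1:
            continue
        if lep["flavor"] == "mu" and lep["pt"] > 30.0:
            selected.append(lep)
        elif (lep["flavor"] == "e" and lep["pt"] > 35.0
              and not 1.44 < abs_eta < 1.56):  # barrel-endcap transition veto
            selected.append(lep)
    return selected

def passes_selection(event):
    """Exactly one lepton, pT-miss > 20 GeV, >= 6 jets, exactly 2 b tags."""
    leptons = select_leptons(event["leptons"])
    jets = [j for j in event["jets"]
            if j["pt"] > 30.0 and abs(j["eta"]) < 2.4]
    n_btag = sum(1 for j in jets if j["btag"])
    return (len(leptons) == 1 and event["pt_miss"] > 20.0
            and len(jets) >= 6 and n_btag == 2)

# Toy event with one muon and six jets, the first two of which are b tagged.
toy_event = {
    "leptons": [{"flavor": "mu", "pt": 45.0, "eta": 0.3, "isolated": True}],
    "jets": [{"pt": 40.0 + 10 * i, "eta": 0.1 * i, "btag": i < 2}
             for i in range(6)],
    "pt_miss": 35.0,
}
toy_event_selected = passes_selection(toy_event)
```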
The expected yields after event selection are summarized in Table 1. Simulated signal events pass the selection criteria with acceptance times efficiency of 1.4-2.2%, depending on the channel and on the signal mass. After the application of all selections, 44 573 events are observed in the µ+jets channel and 28 942 events in the e+jets channel. The yields predicted from the simulated SM background processes are 46 600 events in the µ+jets channel and 30 700 events in the e+jets channel.
Small differences between data and the SM predictions are within the estimated uncertainties of the simulation, with the dominant uncertainty being the choice of the renormalization and factorization scales used in the generator of the tt events. Details of the uncertainties are given in Section 7. Furthermore, the differential distributions of kinematic variables of simulated SM processes are also in agreement with data, as shown in Fig. 1. In particular, the distribution of the invariant mass of a t+jet system (m t+jet , see Section 5 for details) in data is in agreement with the background estimation.

Mass reconstruction
Since the dominant background is SM tt production with extra jets, the reconstructed invariant mass spectrum of the t+jet systems is used to distinguish between t * t * signal and tt background. The p miss T is assumed to be carried away entirely by the neutrino from the leptonically decaying W boson (W lep ). We assume that the parent W boson is on shell and the neutrino is massless in order to determine the longitudinal momentum of the neutrino.
Given the high jet multiplicity of the event selection, a measure was designed for evaluating different associations of the reconstructed jets with the parton objects in the final state. The six jets with the highest p T values are taken into consideration. Each b tagged jet is assigned to one of the b quark partons, and the other jets are associated with the decay daughters of the hadronically decaying W (W had ) or with the gluons from t * decay. The quality of the jet-parton assignment for a single event is evaluated with an S value based on how well the intermediate physical objects are reconstructed:
\[ S = \left(\frac{m_{qq} - m_{W}}{\sigma_{W}}\right)^{2} + \left(\frac{m_{qqb} - m_{t}}{\sigma_{t,\mathrm{had}}}\right)^{2} + \left(\frac{m_{\ell\nu b} - m_{t}}{\sigma_{t,\mathrm{lep}}}\right)^{2} + \left(\frac{m_{qqbg} - m_{\ell\nu bg}}{\sigma_{t^{*}}}\right)^{2}, \]
where m qq is the invariant mass of the jets assigned to W had daughters. Invariant masses of the physical objects assigned to hadronically and leptonically decaying t (t * ) quarks are denoted by m qqb (m qqbg ) and m ℓνb (m ℓνbg ), respectively. The masses of the W boson and top quark, m W and m t , are taken from the Particle Data Group [38] as 80.4 and 173.34 GeV, respectively. The expected detector resolutions of the intermediate particles σ W , σ t,had , σ t,lep and σ t * are estimated to be 24, 34, 30, and 230 GeV, respectively. These estimates are obtained by reconstructing the t * t * , tt and W had in the decay topology using the truth information from simulated signal samples. Additional studies have shown that the mass reconstruction is insensitive to changes in the detector resolution values.
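The on-shell W boson constraint used to fix the neutrino longitudinal momentum leads to a quadratic equation in the unknown component. A minimal sketch is given below, assuming a massless lepton. The text does not state how the two solutions are treated or what is done when the discriminant is negative, so returning both solutions and dropping the imaginary part are assumptions of this illustration.

```python
import math

M_W = 80.4  # GeV, W boson mass quoted in the text

def neutrino_pz(lep_px, lep_py, lep_pz, met_px, met_py, m_w=M_W):
    """Solve m_W^2 = (p_lepton + p_neutrino)^2 for the neutrino pz.

    The lepton and neutrino are treated as massless and the neutrino
    transverse momentum is set to the missing transverse momentum.
    Returns the two solutions (smaller, larger); they coincide when the
    discriminant is negative and only the real part is kept (assumption).
    """
    lep_pt2 = lep_px ** 2 + lep_py ** 2
    lep_e = math.sqrt(lep_pt2 + lep_pz ** 2)
    met2 = met_px ** 2 + met_py ** 2
    mu = 0.5 * m_w ** 2 + lep_px * met_px + lep_py * met_py
    discriminant = mu ** 2 - lep_pt2 * met2
    root = lep_e * math.sqrt(discriminant) if discriminant > 0.0 else 0.0
    return ((mu * lep_pz - root) / lep_pt2, (mu * lep_pz + root) / lep_pt2)

# Toy closure check: build a lepton-neutrino pair, then recover the true pz.
toy_lepton = (30.0, 10.0, 20.0)      # px, py, pz in GeV
toy_neutrino = (-15.0, 25.0, 40.0)   # px, py, pz in GeV
e_lep = math.sqrt(sum(c * c for c in toy_lepton))
e_nu = math.sqrt(sum(c * c for c in toy_neutrino))
dot = sum(a * b for a, b in zip(toy_lepton, toy_neutrino))
toy_m_w = math.sqrt(2.0 * (e_lep * e_nu - dot))
toy_solutions = neutrino_pz(*toy_lepton, toy_neutrino[0], toy_neutrino[1], m_w=toy_m_w)
```

In the toy check one of the two solutions reproduces the generated longitudinal momentum; in data the ambiguity has to be resolved by an additional prescription.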
The jet-parton assignment with the smallest S value is taken to represent the decay topology of a single event, under the t * hypothesis. The average of the m qqbg and m ℓνbg values computed for this assignment is taken to represent the reconstructed t * mass of an event, denoted as m t+jet .
The rate at which all six jets are correctly assigned is around 11%, with the main difficulty being the correct assignment of the jets from the hadronically decaying W.
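The assignment procedure can be illustrated with the sketch below. It assumes that S is the sum of squared mass residuals, each divided by the quoted resolution, with the last term comparing the two reconstructed t * masses; this form is inferred from the definitions in this section and is an assumption of the example. Four-vectors are plain (E, px, py, pz) tuples, and all function names are hypothetical.

```python
import itertools
import math

M_W, M_T = 80.4, 173.34  # GeV, values quoted in the text
SIGMA_W, SIGMA_T_HAD, SIGMA_T_LEP, SIGMA_TSTAR = 24.0, 34.0, 30.0, 230.0

def four_vector(pt, eta, phi, mass=0.0):
    """Build an (E, px, py, pz) tuple from pT, eta, phi and mass."""
    px, py, pz = pt * math.cos(phi), pt * math.sin(phi), pt * math.sinh(eta)
    return (math.sqrt(px * px + py * py + pz * pz + mass * mass), px, py, pz)

def invariant_mass(*vectors):
    """Invariant mass of the sum of the given four-vectors."""
    e, px, py, pz = (sum(c) for c in zip(*vectors))
    return math.sqrt(max(e * e - px * px - py * py - pz * pz, 0.0))

def assignments(b_jets, other_jets):
    """Yield (q1, q2, b_had, g_had, b_lep, g_lep) for 2 b jets + 4 other jets."""
    indices = range(len(other_jets))
    for b_had, b_lep in itertools.permutations(b_jets, 2):
        for i, j in itertools.combinations(indices, 2):
            rest = [k for k in indices if k not in (i, j)]
            for g, h in itertools.permutations(rest, 2):
                yield (other_jets[i], other_jets[j], b_had,
                       other_jets[g], b_lep, other_jets[h])

def s_value(q1, q2, b_had, g_had, b_lep, g_lep, lepton, neutrino):
    """Return (S, average t* candidate mass) for one assignment."""
    m_qq = invariant_mass(q1, q2)
    m_qqb = invariant_mass(q1, q2, b_had)
    m_lnub = invariant_mass(lepton, neutrino, b_lep)
    m_qqbg = invariant_mass(q1, q2, b_had, g_had)
    m_lnubg = invariant_mass(lepton, neutrino, b_lep, g_lep)
    s = (((m_qq - M_W) / SIGMA_W) ** 2
         + ((m_qqb - M_T) / SIGMA_T_HAD) ** 2
         + ((m_lnub - M_T) / SIGMA_T_LEP) ** 2
         + ((m_qqbg - m_lnubg) / SIGMA_TSTAR) ** 2)
    return s, 0.5 * (m_qqbg + m_lnubg)

def reconstruct_mass(b_jets, other_jets, lepton, neutrino):
    """Return (S, m_t+jet) for the assignment with the smallest S."""
    return min((s_value(*a, lepton, neutrino)
                for a in assignments(b_jets, other_jets)),
               key=lambda result: result[0])
```

With two b tagged jets and four other jets there are 24 distinct assignments per event, since exchanging the two W had daughters does not change any of the masses.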

Background modeling
To determine the presence of signal events in data, an unbinned extended maximum likelihood fit of a signal-plus-background model is performed on the m t+jet > 400 GeV spectrum.
The mass template of the t * t * signal is constructed by smoothing the mass distribution from simulations, using an adaptive kernel estimation [39] with a Gaussian kernel and with no restriction on the boundary. The smoothness parameter ρ introduced in Ref. [39] is determined by the square root of the standard deviation of the signal distribution over the subset with ≥4 correctly assigned partons.
The background distribution is modeled using a log-normal function (up to a normalization factor): where m is the mass, and a 2 and m 0 are the parameters that determine the shape of the background. During the fit to the observed data, the number of background events, as well as the shape parameters of the background function, are free parameters.
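To illustrate how such a background-only term enters an unbinned extended likelihood, the sketch below uses the common log-normal parametrization f(m) ∝ (1/m) exp[−ln²(m/m0)/(2a²)]. This parametrization, the symbol `a`, and the function names are assumptions of the example and may differ from the exact form used in the analysis; the 400 GeV threshold is taken from the fit range quoted above.

```python
import math

def lognormal_pdf(m, m0, a, m_min=400.0):
    """Log-normal density in m, normalised to unity on m > m_min."""
    if m <= m_min:
        return 0.0
    shape = math.exp(-math.log(m / m0) ** 2 / (2.0 * a * a)) / m
    # Analytic integral of the shape from m_min to infinity.
    norm = (a * math.sqrt(0.5 * math.pi)
            * math.erfc(math.log(m_min / m0) / (a * math.sqrt(2.0))))
    return shape / norm

def extended_nll(masses, n_bkg, m0, a, m_min=400.0):
    """Extended negative log-likelihood for a background-only model.

    The Poisson term for the total yield (n_bkg) is combined with the
    per-event shape term; constant terms are dropped.
    """
    return n_bkg - sum(
        math.log(n_bkg * lognormal_pdf(m, m0, a, m_min)) for m in masses
    )

# Hypothetical reconstructed masses in GeV, for illustration only.
toy_masses = [450.0, 520.0, 610.0, 700.0, 850.0, 1000.0, 1300.0]
toy_nll = extended_nll(toy_masses, n_bkg=len(toy_masses), m0=500.0, a=0.4)
```

For fixed shape parameters the extended likelihood is maximal when the fitted yield equals the number of observed events, which is why the yield can be left free in the fit alongside the shape parameters.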
To verify whether the fit is sensitive to the presence of t * t * signal, a pseudo-data set is generated with the m t+jet spectrum of the simulated backgrounds and then injected with the expected m t+jet signal spectrum for various hypotheses of the signal cross section. Performing the same fit over multiple sets of pseudo-data with varying signal cross sections showed no evidence of bias.
Figure 2: The m t+jet spectrum for data (points), the signal+background fit (green), the background component of the signal+background fit (blue), and the expected spectrum for a simulated 800 GeV signal process (red dashed) normalized to the integrated luminosity of data. Since there is no significant excess of signal found in data, the signal+background curve overlaps the background-only component. The distributions for the µ+jets data are shown on the left while those for e+jets data are shown on the right. The probabilities of the Kolmogorov-Smirnov test between the data versus the signal+background model and between the data versus the background component are denoted by K all and K bkg , respectively.
To ensure that the log-normal function is sufficient to model the background, a likelihood ratio test is conducted by comparing the results of fitting the spectrum of the simulated SM background to extended log-normal functions of the form: Increasing the number of parameters does not improve the description of the background.
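A likelihood ratio test between nested models is commonly quantified with Wilks' theorem. The sketch below assumes that the extended function adds exactly one parameter, so that twice the difference in negative log-likelihood follows a chi-square distribution with one degree of freedom; the text does not specify how the test statistic was evaluated, so this is an illustration rather than the procedure of the analysis.

```python
import math

def likelihood_ratio_p_value(nll_nominal, nll_extended):
    """p-value for one extra parameter, assuming Wilks' theorem applies.

    nll_nominal  : minimised negative log-likelihood of the simpler model
    nll_extended : minimised negative log-likelihood of the extended model
    """
    q = max(2.0 * (nll_nominal - nll_extended), 0.0)
    # Survival function of a chi-square distribution with 1 degree of freedom.
    return math.erfc(math.sqrt(0.5 * q))
```

A large p-value indicates that the extra parameter does not significantly improve the fit, which is the outcome reported above.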
The results of the fit performed on data with the 800 GeV signal spectrum are shown in Fig. 2. The distribution of events in data is in agreement with a null hypothesis. Based on the results of the Kolmogorov-Smirnov tests, the signal+background model and the background-only model both yield good fits to the data.

Systematic uncertainties
The impact of experimental and theoretical sources of uncertainties is considered and summarized in Table 2. For each source of uncertainty, alternative templates for the distribution of m t+jet are generated by adjusting the relevant parameters in the simulation.
The uncertainties in the jet energy scale and jet resolutions depend on the p T and η of the jets. Alternative mass templates are generated by rescaling the nominal jet four-momentum in the simulation by ±1 standard deviation (s.d.) of the associated uncertainties in energy scale and resolution. Such uncertainties are also coherently propagated to all observables, including p miss T . Varying the jet energy used for reconstruction has <0.1% impact on the signal acceptance. The b tagging and lepton selection scale factors for residual differences between data and simulation have their respective systematic and statistical uncertainties. Alternative templates are generated by shifting the correction scale factors by ±1 s.d. for their respective uncertainties. On average, the b tagging scale factor and lepton scale factors affect the signal acceptance by 2.8 and 2.5%, respectively. Because of uncertainties in the total inelastic pp cross section, when calculating the data pileup scenario alternative pileup corrections are made with the inelastic cross section scaled by ±1 s.d. Variations in the pileup corrections have an average impact on the signal acceptance of 0.7%. The number of signal events is also affected by the uncertainty on the integrated luminosity, which is known to a precision of 2.5% [40].
The theoretical uncertainties considered are those associated with the choice of the PDF, and the renormalization and factorization scales used by the event generator. The effects of the theoretical uncertainties are obtained by changing the various generator parameters within their estimated uncertainties and generating new m t+jet fit templates that are used to calculate new sensitivities.
In addition to the statistical uncertainty originating from the signal+background fit, systematic uncertainties are introduced to cover the choice of modeling. Alternative signal templates are generated with different choices of ρ by changing the subset to require ≥3 and ≥5 correctly assigned partons. The background shape is determined from data. Simulated events with different configurations, as well as several alternative models, have been tested. The chosen model, with the parameters floated in the limit computation, has proven to describe the data and cover the associated systematic uncertainties sufficiently well.

Statistical analysis and extraction of limits
No excess above SM background is observed. We set an upper bound on the t * t * production cross section using the asymptotic modified frequentist CL s criterion [42][43][44][45]. The null hypothesis likelihood function is taken from the background component of the signal+background fit described in Section 6. For the uncertainties described in Section 7, a joint template is used, where the nominal template is linearly interpolated to the templates generated with the relevant parameters shifted by ±1 standard deviation. Each of the interpolation variables is taken as a nuisance parameter with a standard Gaussian prior.
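The linear interpolation between the nominal and shifted templates can be sketched as below. The analysis uses unbinned templates; binned lists are used here purely for illustration, and the function names are hypothetical. The nuisance parameter θ is expressed in units of standard deviations, so θ = ±1 reproduces the shifted templates and the Gaussian prior contributes θ²/2 to the negative log-likelihood.

```python
def morph_template(nominal, up, down, theta):
    """Piecewise-linear interpolation between nominal and +-1 s.d. templates."""
    target = up if theta >= 0.0 else down
    weight = abs(theta)
    return [n + weight * (t - n) for n, t in zip(nominal, target)]

def gaussian_penalty(theta):
    """Negative log of a standard Gaussian prior, up to a constant."""
    return 0.5 * theta * theta

# Hypothetical bin contents, for illustration only.
toy_nominal = [10.0, 20.0, 30.0]
toy_up = [12.0, 21.0, 27.0]
toy_down = [9.0, 18.0, 32.0]
toy_half_up = morph_template(toy_nominal, toy_up, toy_down, 0.5)
```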
The fit is performed separately in the muon and electron channels, and the results of both are used to obtain combined limits. Figure 3 shows the observed and expected upper limits at 95% confidence level for the product of the t * t * production cross section and the square of the branching fraction, as a function of the t * mass. The lower limit for m t * is given by the value at which the upper limit intersects with the theoretical cross section from Ref. [14]. Both the observed and expected lower limits of m t * for the combined muon and electron data are 1. [...] within uncertainties.
Figure 3: The expected and observed 95% confidence level upper limits for the product of the production cross section of t * t * and the square of the branching fraction, as a function of the t * mass, for the combined lepton+jets analysis. The theoretical production cross section assuming a 100% t * → tg branching fraction is shown along with its uncertainties, described in Section 7.
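The extraction of a mass limit from the crossing of the two curves can be sketched as follows. The interpolation scheme (linear in the logarithm of the ratio of the upper limit to the theoretical cross section) is an assumption of this example, as the text does not specify one, and the numbers in the toy input are hypothetical.

```python
import math

def mass_limit(masses, upper_limits, theory):
    """Mass at which the upper limit first rises above the theory curve.

    masses       : increasing mass points (GeV)
    upper_limits : cross section upper limits at those points
    theory       : predicted cross sections at those points
    Returns None if the curves do not cross within the scanned range.
    """
    log_ratio = [math.log(u / t) for u, t in zip(upper_limits, theory)]
    for i in range(len(masses) - 1):
        low, high = log_ratio[i], log_ratio[i + 1]
        if low <= 0.0 < high:
            fraction = -low / (high - low)
            return masses[i] + fraction * (masses[i + 1] - masses[i])
    return None

# Hypothetical two-point scan: excluded at the first point, not at the second.
toy_limit = mass_limit([1000.0, 1100.0], [0.2, 0.2], [0.4, 0.1])
```

Masses below the returned value are excluded, since there the predicted cross section exceeds the upper limit.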

Summary
A search has been conducted for pair production of spin-3/2 excited top quarks t * in proton-proton interactions, with each t * decaying exclusively to a standard model top quark and a gluon. Events that have a single muon or electron and at least six jets, exactly two of which must be identified as originating from a bottom quark, are selected for the analysis. Assuming t * t * production, the final-state objects are associated with the t * candidates in each event. No significant deviations from standard model predictions are observed in the t+jet system, and an upper limit is set at 95% confidence level on the pair production cross section of t * t * , as a function of the t * mass. Interpreting the results in the framework of a spin-3/2 t * model, assuming a 100% branching fraction of its decay into a top quark and a gluon, t * masses below 1.2 TeV are excluded. These are the best limits to date on the mass of spin-3/2 excited top quarks and the first at 13 TeV.