Searches for light- and heavy-flavour three-jet resonances in pp collisions at

A search for three-jet hadronic resonance production in pp collisions at a centre-of-mass energy of 8 TeV has been conducted by the CMS Collaboration at the LHC with a data sample corresponding to an integrated luminosity of 19.4 fb⁻¹. The search method is model independent, and events are selected that have high jet multiplicity and large values of jet transverse momenta. The signal models explored assume R-parity-violating supersymmetric gluino pair production and have final states with either only light-flavour jets or both light- and heavy-flavour jets. No significant deviation is found between the selected events and the expected standard model multijet and tt background. For a gluino decaying into light-flavour jets, a lower limit of 650 GeV on the gluino mass is set at a 95% confidence level, and for a gluino decaying into one heavy- and two light-flavour jets, gluino masses between 200 and 835 GeV are, for the first time, likewise excluded.


Introduction
Hadronic multijet final states at hadron colliders offer a unique window on many possible extensions of the standard model (SM), although with the view partly obscured by large backgrounds due to SM processes. Many of these extensions predict resonances, such as heavy coloured fermions transforming as octets under SU(3) c [1][2][3][4] or supersymmetric gluinos that undergo R-parity-violating (RPV) decays to three quarks [5][6][7]. Recent studies from the Fermilab Tevatron Collider and the CERN Large Hadron Collider (LHC) employed the jet-ensemble technique. For this technique, jets are associated into unique combinations of three jets (triplets). Additional selection requirements are imposed to suppress the large backgrounds due to SM processes and to enhance sensitivity to strongly decaying resonances. These analyses set lower mass limits based upon resonance fits for gluinos undergoing RPV decays. The CDF Collaboration at the Tevatron excluded gluino masses below 144 GeV [8] using data from proton-antiproton collisions at 1.96 TeV, while the CMS Collaboration at the LHC excluded masses below 460 GeV [9,10] with data from pp collisions at 7 TeV. An additional search at the LHC by the ATLAS Collaboration, also based on data collected with pp collisions at 7 TeV, has extended these limits to 666 GeV [11].
Presented here are the results of dedicated searches for pair-produced three-jet resonances in multijet events from pp collisions, with one search being inclusive with respect to parton flavours and the second requiring at least one jet from the resonance decay to be identified as a bottom-quark jet (b jet). This latter, heavy-flavour search is the first of its kind and probes additional RPV couplings. The results are based on a data sample of pp collisions at √s = 8 TeV, corresponding to an integrated luminosity of 19.4 ± 0.5 fb⁻¹ [12] collected with the CMS detector [13] at the LHC in 2012. Events with at least six jets, each with high transverse momentum (p T ) with respect to the beam direction, are selected and investigated for evidence of three-jet resonances consistent with strongly coupled supersymmetric particle decays. The event selection criteria are optimised in the context of the gluino signal mentioned above [5][6][7], using a simplified model where the gluinos decay with a branching fraction of 100% to quark jets. However, the generic features of the selection criteria provide a model-independent basis that can be used when examining extensions of the SM, since any exotic three-jet resonance with a narrow width, sufficient cross section, and high-p T jets would be expected to produce a significant bump on the smoothly falling SM background of our search. Additionally, low trigger thresholds and the application of b-jet identification make it possible to use SM top quark-antiquark (tt) events to validate the analysis techniques.

The CMS experiment
The central feature of the CMS apparatus [13] is a superconducting solenoid of 6 m internal diameter, providing a magnetic field of 3.8 T. Within the superconducting solenoid volume are a silicon pixel and strip tracker, a lead tungstate electromagnetic calorimeter (ECAL), and a hadron calorimeter (HCAL) that consists of brass layers and scintillator sampling calorimeters. Muons are measured in gas ionisation detectors embedded in the steel return yoke outside the solenoid. Extensive forward calorimetry complements the coverage provided by the barrel and endcap detectors. CMS uses a right-handed coordinate system, with the origin at the nominal interaction point, the x axis pointing to the centre of the LHC, the y axis pointing up (perpendicular to the LHC plane), and the z axis along the anticlockwise-beam direction. The polar angle θ is measured with respect to the positive z axis, the azimuthal angle φ is measured in the x-y plane, and the pseudorapidity η is defined as η = − ln[tan(θ/2)]. Energy deposits from hadronic jets are measured using the ECAL and HCAL. The energy resolution for photons with E T ≈ 60 GeV varies between 1.1% and 2.6% over the solid angle of the ECAL barrel, and from 2.2% to 5% in the endcaps. The HCAL, when combined with the ECAL, mea- . The ECAL provides coverage in pseudorapidity |η| < 1.479 in a barrel region and 1.479 < |η| < 3.0 in two endcap regions. In the region |η| < 1.74, the HCAL cells have widths of 0.087 in η and 0.087 in φ. In the η-φ plane, and for |η| < 1.48, the HCAL cells map onto 5 × 5 arrays of ECAL crystals to form calorimeter towers projecting radially outwards from close to the nominal interaction point.
At larger values of |η|, the size of the towers increases, and the matching ECAL arrays contain fewer crystals. Within each tower, the energy deposits in ECAL and HCAL cells are summed to define the calorimeter tower energies, subsequently used to provide the energies and directions of hadronic jets.
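The pseudorapidity definition above can be made concrete with a short numerical sketch. The function names below are illustrative and not part of any CMS software; the |η| values are the ECAL barrel boundary, the jet acceptance edge used later in the event selection, and the outer ECAL endcap edge quoted in the text.

```python
import math


def pseudorapidity(theta):
    """Return eta = -ln[tan(theta/2)] for a polar angle theta in radians (0 < theta < pi)."""
    return -math.log(math.tan(theta / 2.0))


def polar_angle(eta):
    """Invert the definition: theta = 2 * atan(exp(-eta)), in radians."""
    return 2.0 * math.atan(math.exp(-eta))


# Convert the |eta| boundaries quoted in the text into polar angles.
for eta in (1.479, 2.5, 3.0):
    theta_deg = math.degrees(polar_angle(eta))
    print(f"|eta| = {eta}: theta = {theta_deg:.1f} degrees from the beam axis")
```

A particle emitted perpendicular to the beam (θ = π/2) has η = 0, and larger |η| corresponds to directions closer to the beam line.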
The CMS detector uses a two-tier trigger system to collect data. Events satisfying the requirements at the first level are passed to the high-level trigger (HLT), whose output is recorded and limited to a total rate of ∼350 Hz. An HLT requirement based on at least six jets, reconstructed with only calorimeter information, is used to select events. With the jets ordered in descending p T values, the p T threshold at the HLT for the fourth jet is 60 GeV and, for the sixth jet, 20 GeV. For events passing all offline requirements described in Section 4, the total trigger efficiency is at least 99%.
The CMS particle-flow algorithm [15] combines calorimeter information with reconstructed tracks to identify individual particles such as photons, leptons, and neutral and charged hadrons. The photon energy is obtained directly from calibrated measurements in the ECAL. The energy of electrons is determined from a combination of the track momentum at the primary interaction vertex [16], the corresponding ECAL cluster energy, and the energy sum of all bremsstrahlung photons associated with the track in the offline reconstruction. The muon energy is obtained from the corresponding track momentum. The energy for a charged hadron is determined from a combination of the track momentum and the corresponding ECAL and HCAL energies, corrected for zero-suppression effects and calibrated for the nonlinear response of the calorimeters. Finally, the energy of a neutral hadron is obtained from the corresponding calibrated ECAL and HCAL energies. The particle-flow objects serve as input for jet reconstruction, performed using the anti-k T algorithm [17][18][19] with a distance parameter of 0.5. The jet transverse momentum resolution is typically 15% at p T = 10 GeV, 8% at 100 GeV, and 4% at 1 TeV; when jet clustering is based only upon the calorimeter energies, the corresponding resolutions are about 40%, 12%, and 5%.
Jet energy scale corrections [20] derived from data and Monte Carlo (MC) simulation are applied to account for the nonlinear and nonuniform response of the calorimeters. In data, a small residual correction factor is included to correct for differences in jet response between data and simulation. The combined corrections are approximately 5-10%, and their corresponding uncertainties range from 1 to 5%, depending on the pseudorapidity and energy of the jet. Jet quality criteria [21] are applied to remove misidentified jets, which arise primarily from calorimeter noise. In both data and simulated signal events, more than 99.8% of all selected jets satisfy these criteria.

Signal event simulation
Pair-produced gluinos are used to model the signal. Gluino production and decay are simulated using the Pythia [22] event generator (v6.424), with each gluino decaying to three quarks through the λ udd quark RPV coupling [23], where u and d refer to any up- or down-type quark, respectively. Two different scenarios are considered for this coupling, resulting in both an inclusive search similar to previous analyses [8][9][10][11] and a new heavy-flavour search.
For the first case, the coupling of λ 112 , where the three numerical subscripts of λ refer to the quark generations of the corresponding u-d-d quarks, is set to a non-zero value, giving a branching fraction of 100% for the gluino decay to three light-flavour quarks. The second case, represented by λ 113 or λ 223 , covers gluino decays to one b quark and two light-flavour quarks. The mass of the generated gluino signal ranges from 200 to 500 GeV in 50 GeV steps, with additional mass points at 750, 1000, 1250, and 1500 GeV. For the generation of this signal, all superpartners except the gluino are taken to be decoupled and heavy (i.e. beyond the reach of the LHC), the natural width of the gluino resonance is taken to be much smaller than the mass resolution of the detector of approximately 4-8% in the mass range investigated, and no intermediate particles are produced in the gluino decay. Simulation of the CMS detector response is performed using the Geant4 [24] package.

Event selection
Events recorded with the six-jet trigger described above are required to contain at least one reconstructed primary vertex [16]. Since this analysis targets pair-produced three-jet resonances that naturally yield high jet multiplicity, we require events to contain at least six jets with |η| < 2.5. To ensure that the trigger is fully efficient, we impose minimal requirements that the p T thresholds of the fourth and sixth jets are at least 80 and 60 GeV, respectively, though we impose higher thresholds for two of our three selections, as described below.
We use the jet-ensemble technique [8,9] in this analysis to combine the six highest-p T jets in each event into all possible unique triplets. Each event that satisfies all selection requirements will yield 20 combinations of jet triplets. For signal events, no more than two of these triplets can be correct reconstructions of the pair-produced gluinos, with the remaining 18 triplets being incorrect combinations of jets. Thus, background triplets arising from SM multijet events are supplemented by "incorrect" jet-triplet combinations from the signal events themselves. To obtain sensitivity to the presence of a three-jet resonance, an additional requirement is placed on each jet triplet to preferentially remove SM background and incorrectly combined signal triplets. This selection criterion exploits the constant invariant mass of correctly reconstructed signal triplets and the observed linear correlation between the invariant mass and scalar sum of jet p T for background triplets and incorrectly combined signal triplets:
$$M_{jjj} < \left(\sum_{i=1}^{3} p_{T}^{i}\right) - \Delta, \qquad (1)$$
where M jjj is the triplet invariant mass, the p T sum is over the three jets in the triplet (triplet scalar p T ), and Δ is an empirically determined parameter. Fig. 1 shows a plot of the triplet invariant mass versus triplet scalar p T for simulated events with 400 GeV gluinos decaying to light-flavour jets.
The value of Δ is chosen so that the analysis is sensitive to as broad a range of gluino masses as possible given the restrictions imposed by the trigger. We find that the peak position of the M jjj distribution in data depends on the value of Δ. From a study of this peak position versus Δ, we find Δ = 110 GeV to be the optimal choice, yielding the lowest value of the peak of M jjj . This simple Δ requirement, rather than model-specific invariant mass requirements, maintains the model-independent sensitivity of our analysis to any three-jet resonance, not just that of our signal model.
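The jet-ensemble combinatorics and the Δ requirement can be illustrated with a minimal sketch. It assumes that the requirement takes the form M jjj < Σ p T − Δ with Δ = 110 GeV and that jets are treated as massless; the helper names and the toy event are invented for illustration and do not come from the analysis code.

```python
import math
from itertools import combinations

DELTA = 110.0  # GeV, the value quoted in the text


def massless_jet(pt, eta, phi):
    """Build a massless jet four-vector (E, px, py, pz) in GeV (hypothetical helper)."""
    px, py, pz = pt * math.cos(phi), pt * math.sin(phi), pt * math.sinh(eta)
    return (pt * math.cosh(eta), px, py, pz)


def jet_pt(jet):
    """Transverse momentum of a four-vector (E, px, py, pz)."""
    return math.hypot(jet[1], jet[2])


def invariant_mass(jets):
    """Invariant mass of the summed four-vectors of the given jets."""
    e, px, py, pz = (sum(j[k] for j in jets) for k in range(4))
    return math.sqrt(max(e * e - px * px - py * py - pz * pz, 0.0))


def selected_triplets(jets, delta=DELTA):
    """Form all triplets from the six highest-pT jets; keep those with M_jjj < sum(pT) - delta."""
    leading = sorted(jets, key=jet_pt, reverse=True)[:6]
    kept = []
    for triplet in combinations(leading, 3):
        mass = invariant_mass(triplet)
        sum_pt = sum(jet_pt(j) for j in triplet)
        if mass < sum_pt - delta:
            kept.append((mass, sum_pt))
    return kept


# Toy six-jet event: the (pT [GeV], eta, phi) values are invented for illustration.
toy_event = [massless_jet(*j) for j in [
    (320.0, 0.3, 0.1), (250.0, -0.8, 2.4), (180.0, 1.1, -2.0),
    (140.0, -0.2, 1.2), (125.0, 0.6, -0.9), (115.0, -1.4, 3.0),
]]

n_all = len(list(combinations(toy_event, 3)))
passing = selected_triplets(toy_event)
print(f"{len(passing)} of {n_all} triplets pass M_jjj < sum(pT) - {DELTA:.0f} GeV")
```

Choosing three jets out of six gives 6!/(3! 3!) = 20 triplets per event, which is the count quoted in the text; the Δ requirement then acts on each triplet individually rather than on the event as a whole.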
Tightening the selection requirement on the p T value of the sixth jet can reduce background stemming from SM multijet production. The optimisation of this requirement to maximise signal significance is performed as follows.
As illustrated in Fig. 2 for a gluino mass of 400 GeV, the triplet invariant mass distribution for signal events has the shape of a Gaussian peak on top of a broad base of incorrect three-jet combinations. We define the Gaussian peak to be the signal. Following Ref. [25], we use a four-parameter function (Eq. (2)) that is representative of the estimated background in the data (see Section 5) and characterised by a steeply and monotonically decreasing shape:
$$\frac{dN}{dx} = \frac{P_{0} \left(1 - x/\sqrt{s}\right)^{P_{1}}}{\left(x/\sqrt{s}\right)^{P_{2} + P_{3} \ln\left(x/\sqrt{s}\right)}}, \qquad (2)$$
where N is the number of triplets and x is the triplet invariant mass. The parametrised signal and background estimates used in the optimisation procedure can be seen in the inset of Fig. 2.
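As a rough illustration of such a steeply falling background shape, the sketch below evaluates a four-parameter function of the form commonly used in dijet resonance searches, dN/dx = P0 (1 − x/√s)^P1 / (x/√s)^(P2 + P3 ln(x/√s)). Both this functional form and the parameter values are assumptions made for the example; the fitted values from the analysis are not given in the text.

```python
import math

SQRT_S = 8000.0  # GeV, centre-of-mass energy of the data set


def background_shape(x, p0, p1, p2, p3, sqrt_s=SQRT_S):
    """Assumed four-parameter form: P0 (1 - z)^P1 / z^(P2 + P3 ln z), with z = x / sqrt(s)."""
    z = x / sqrt_s
    return p0 * (1.0 - z) ** p1 / z ** (p2 + p3 * math.log(z))


# Hypothetical parameter values, chosen only to give a steeply falling curve.
toy_params = dict(p0=1.0, p1=10.0, p2=5.0, p3=0.1)

masses = range(300, 1600, 100)  # triplet invariant mass in GeV
values = [background_shape(m, **toy_params) for m in masses]
for m, v in zip(masses, values):
    print(f"M_jjj = {m:4d} GeV  ->  dN/dx = {v:.3e} (arbitrary units)")
```

In this form P0 only scales the curve, which is why it can act as a free normalisation, while P1, P2 and P3 control how quickly the distribution falls with mass.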
Using these two components, signal triplets from the Gaussian peak and background triplets from the background estimate, we define the signal significance as the ratio of the number of signal triplets to the square root of the number of signal triplets plus the number of background triplets obtained from data. The number of signal and background triplets is calculated within a window around the mass peak with a width corresponding to twice the expected gluino-mass resolution. This procedure is repeated for different thresholds on the sixth-jet p T in steps of 10 GeV, for a given gluino mass. For the inclusive search, the focus is on masses that are higher than those previously excluded by the jet-ensemble technique [10], so the mass range of the search starts around 400 GeV. We find that a requirement of p T ≥ 110 GeV on the sixth jet maximises the signal significance in this mass range.
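The threshold scan described above amounts to maximising S/√(S + B) over a set of candidate sixth-jet p T thresholds. The following sketch shows the bookkeeping only; the signal and background yields are invented for illustration and are not taken from the analysis.

```python
import math


def significance(n_signal, n_background):
    """Signal significance S / sqrt(S + B) within the mass window."""
    total = n_signal + n_background
    return n_signal / math.sqrt(total) if total > 0 else 0.0


def best_threshold(yields):
    """Return the threshold whose (S, B) pair gives the largest significance."""
    return max(yields, key=lambda thr: significance(*yields[thr]))


# Hypothetical (signal, background) triplet counts in the mass window,
# keyed by the sixth-jet pT threshold in GeV.
toy_yields = {
    60: (400.0, 90000.0),
    80: (330.0, 40000.0),
    100: (260.0, 16000.0),
    110: (230.0, 11000.0),
    120: (190.0, 8500.0),
    130: (150.0, 6500.0),
}

for threshold, (s, b) in sorted(toy_yields.items()):
    print(f"pT6 >= {threshold:3d} GeV: S/sqrt(S+B) = {significance(s, b):.3f}")
print("best threshold:", best_threshold(toy_yields), "GeV")
```

Raising the threshold removes background faster than signal up to a point, after which the loss of signal dominates; the optimum is where these two effects balance.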
The use of b-jet identification enables us to perform a heavy-flavour search in addition to our inclusive search for three-jet resonances. The combined secondary vertex (CSV) algorithm [26] uses variables from reconstructed secondary vertices along with track-based lifetime information to identify b jets. The tagging efficiency for b jets changes with the p T of the jet, ranging from 70% for jets with 100 < p T < 200 GeV to 55% for jets with p T > 500 GeV. We study different b-tagging requirements for signal events with simulated gluinos that have heavy-flavour decays and use the same definition of the signal significance as for the sixth-jet p T optimisation to determine the best choice. The CSV medium operating point, with a mistagging rate of about 1% for light-flavour jets, is found to be the optimal choice for detecting a potential signal in this analysis. The requirement that each event contain at least one b-tagged jet (b tag) increases the signal significance, and the additional requirement that all selected triplets have a b tag removes a large portion of the incorrectly combined signal triplets.
For the heavy-flavour analysis, we distinguish between a low-mass region covering gluino masses between 200 and 600 GeV and a high-mass region covering larger gluino masses. For the low-mass region, we maximise signal acceptance by using jet-p T requirements of 80 GeV for the fourth jet and 60 GeV for the sixth jet. For the high-mass region, the sixth jet is required to have p T ≥ 110 GeV. For both the low- and high-mass regions, the value Δ = 110 GeV is used. All-hadronic tt event production is a significant background in the low-mass region. We use tt events that produce triplets with masses in this region to help validate our analysis technique, as described below.
[Figure caption, partial] ... (Eq. (1)). All distributions are normalised to unit area. The simulated SM multijet events are generated by MadGraph [27] with showering performed by Pythia.
High-mass signal events, for both the light- and heavy-flavour signal models, have a more spherical shape than background events, which typically contain back-to-back jets and thus have a more linear shape. To significantly reduce the background in the high-mass searches, we use a sphericity variable, S = (3/2)(λ 2 + λ 3 ), where the λ i variables are the eigenvalues, ordered such that λ 1 ≥ λ 2 ≥ λ 3 , of the following tensor [22]:
$$S^{\alpha\beta} = \frac{\sum_{i} p_{i}^{\alpha} p_{i}^{\beta}}{\sum_{i} |\mathbf{p}_{i}|^{2}},$$
where the sums run over the jets i, α and β label the spatial components (x, y, z) of the jet momenta, and the sphericity S is calculated using all jets in each event. A comparison of the sphericity variable for data, simulated SM multijet events, and three different simulated gluino masses can be seen in Fig. 3. For the inclusive search and the high-mass, heavy-flavour search, selected events are required to have S ≥ 0.4, which is based on the optimisation of the number of expected signal events divided by the square root of the number of signal-plus-background events. No sphericity requirement is used for the low-mass, heavy-flavour selection because low-mass signal events do not have a significant shape difference from background events.
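To make the sphericity variable concrete, the sketch below computes S = (3/2)(λ 2 + λ 3 ) from a list of jet momenta. It assumes the standard definition of the normalised momentum tensor, with indices running over the x, y, z components and eigenvalues ordered λ 1 ≥ λ 2 ≥ λ 3 , and it uses a closed-form eigenvalue solution for symmetric 3 × 3 matrices so that no external library is needed. The function names and test configurations are illustrative only.

```python
import math


def sphericity_tensor(momenta):
    """Normalised momentum tensor S^{ab} = sum_i p_i^a p_i^b / sum_i |p_i|^2."""
    norm = sum(px * px + py * py + pz * pz for px, py, pz in momenta)
    return [[sum(p[a] * p[b] for p in momenta) / norm for b in range(3)]
            for a in range(3)]


def det3(m):
    """Determinant of a 3x3 matrix."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def symmetric_eigenvalues(a):
    """Eigenvalues of a real symmetric 3x3 matrix in descending order (trigonometric solution)."""
    off = a[0][1] ** 2 + a[0][2] ** 2 + a[1][2] ** 2
    q = (a[0][0] + a[1][1] + a[2][2]) / 3.0
    p2 = sum((a[i][i] - q) ** 2 for i in range(3)) + 2.0 * off
    if p2 == 0.0:
        return [q, q, q]  # matrix is a multiple of the identity
    p = math.sqrt(p2 / 6.0)
    b = [[(a[i][j] - (q if i == j else 0.0)) / p for j in range(3)] for i in range(3)]
    r = max(-1.0, min(1.0, det3(b) / 2.0))  # clamp against rounding errors
    phi = math.acos(r) / 3.0
    lam1 = q + 2.0 * p * math.cos(phi)
    lam3 = q + 2.0 * p * math.cos(phi + 2.0 * math.pi / 3.0)
    lam2 = 3.0 * q - lam1 - lam3
    return [lam1, lam2, lam3]


def sphericity(momenta):
    """S = 3/2 (lambda2 + lambda3); 0 for back-to-back jets, 1 for an isotropic event."""
    _, lam2, lam3 = symmetric_eigenvalues(sphericity_tensor(momenta))
    return 1.5 * (lam2 + lam3)


# Reference configurations (momenta in arbitrary units).
back_to_back = [(1.0, 0.0, 0.0), (-1.0, 0.0, 0.0)]
isotropic = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
for name, event in (("back-to-back", back_to_back), ("isotropic", isotropic)):
    s = sphericity(event)
    print(f"{name}: S = {s:.3f}, passes S >= 0.4: {s >= 0.4}")
```

The two reference configurations bracket the range of the variable: a pencil-like dijet topology sits at the low end and a fully isotropic one at the high end, which is the separation the S requirement exploits.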
To conclude, we define three different search regions for this analysis with specific selection criteria applied as previously discussed and summarised in Table 1.

Background estimation and signal extraction
The dominant background for this search comes from SM multijet events, which arise from perturbative QCD processes of order O(α_s^3) and higher. The invariant mass shape of incorrectly combined signal triplets is found to be similar to that of the background from SM multijet processes, such that the combined distribution is consistent with that of SM multijets alone. Moreover, because the normalisation of the background component (P 0 in Eq. (2)) is unconstrained, any incorrectly combined signal triplets, if present, would be absorbed into the background estimate. The triplet invariant mass distribution for the background decreases smoothly with increasing mass, and we model this background using a four-parameter function (Eq. (2)) fit directly to the data, except in the case of the low-mass, heavy-flavour search. For the low-mass, heavy-flavour search, there is an additional background contribution from all-hadronic tt events. These events are modelled using the MadGraph [27] generator, and the expected number of tt events is determined from the next-to-next-to-leading-order (NNLO) cross section of 245.8 (+8.7, −10.5) pb [28]. The shape of the contribution from SM multijet processes is modelled with a statistically independent data sample, constructed by imposing a veto on b-tagged jets while retaining all other selection requirements. This sample is referred to as the b-jet control region, and the combination of simulated tt events and the background from SM multijet processes, modelled by this control region, gives the total SM background estimate for the low-mass, heavy-flavour analysis.
A comparison of the background estimate to the data is performed, in which the data are fit using a binned maximum likelihood method with either the four-parameter function of Eq. (2) for the inclusive analysis and the high-mass, heavy-flavour analysis, or the background shape described above for the low-mass, heavy-flavour analysis. Fig. 4 shows a comparison between the three-jet invariant mass distribution in data and the background estimate for the inclusive analysis. Fig. 5 shows the comparisons between data and background estimates for the low- and high-mass heavy-flavour analyses. In all three cases, no statistically significant deviations from the data are observed.
As a validation of the analysis technique, we consider the tt triplets as a signal with the background solely composed of triplets from SM multijet processes, whose shape is modelled by the b-jet control region, with the small amount of simulated tt events without b tags subtracted. The tt cross section is extracted based on the contribution of its signal triplets and is compared with the theoretical prediction for the cross section of 245.8 pb. The measurement yields a result of 205 ± 28 pb (combined statistical and systematic uncertainties), which lies within two standard deviations of the theoretical value, thereby showing that our technique can successfully reconstruct hadronically decaying tt events.
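The level of agreement quoted for the tt validation can be checked with simple arithmetic. The sketch below computes the pull of the measured cross section with respect to the prediction, first using the measurement uncertainty alone and then, as an assumption made only for this example, adding the downward theory uncertainty in quadrature (the relevant side, since the measurement lies below the prediction).

```python
import math

measured, measured_unc = 205.0, 28.0  # pb, from the text
predicted = 245.8                      # pb, NNLO prediction quoted in the text
predicted_unc_down = 10.5              # pb, downward uncertainty on the prediction


def pull(value, value_unc, reference, reference_unc=0.0):
    """(value - reference) in units of the combined uncertainty (quadrature sum)."""
    return (value - reference) / math.hypot(value_unc, reference_unc)


pull_measurement_only = pull(measured, measured_unc, predicted)
pull_with_theory = pull(measured, measured_unc, predicted, predicted_unc_down)

print(f"pull (measurement uncertainty only): {pull_measurement_only:.2f}")
print(f"pull (with theory uncertainty):      {pull_with_theory:.2f}")
```

Either way the measured value sits roughly one and a half standard deviations below the prediction, consistent with the statement that the two agree within two standard deviations.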
To obtain an estimate of the number of signal triplets expected after all selection criteria are applied, the sum of a Gaussian function that represents the signal and a four-parameter function (Eq. (2)) that models the incorrectly combined signal triplets is fit to the simulated M jjj distribution for each gluino mass. The Gaussian component of the fit provides the estimate for the expected number of signal triplets. The factors in this overall triplet signal efficiency are the event acceptance, governed by the kinematic and b-tagging selections, and the triplet rate, which represents the number of selected triplets per selected event. This triplet rate is the product of the average number of triplets per event and the proportion of triplets contained in the Gaussian signal peak compared with the full distribution. Width and acceptance-times-efficiency (A × ε) are both parametrised as functions of gluino mass, as shown in Fig. 6. The width of the Gaussian function modelling the signal varies according to the detector resolution, ranging from 17 to 70 GeV for gluino masses from 200 to 1500 GeV. The A × ε ranges from about 0.003 to 0.033 for the inclusive search for gluino masses from 400 to 1500 GeV, and, for the heavy-flavour search, from 0.005 to 0.04 for masses from 200 to 600 GeV, and from 0.008 to 0.015 for masses from 600 to 1500 GeV. For high-mass gluinos, the A × ε flattens slightly because of the decreased efficiency to reconstruct triplets in the Gaussian signal peak.

Systematic uncertainties
Systematic uncertainties in the signal acceptance are assigned in the following manner. For uncertainties related to the jet energy scale (JES) [20], the jet energy corrections are varied within their uncertainties for each signal mass, and then the entire selection procedure is repeated to determine the parametrised values of the A × ε. The largest difference from the nominal values is taken as a systematic uncertainty. To evaluate the systematic uncertainty associated with the level of simulated initial- and final-state radiation (ISR and FSR) for signal events, i.e. the spontaneous emission of gluons from incoming or outgoing participants of the hard interaction, dedicated signal samples are generated where the relative amounts of ISR and FSR are coherently increased or decreased with respect to the nominal setting of the Pythia event generator [29]. The parameter controlling the amount of ISR (PARP(67)) is varied around its central value of 2.5 by ±0.5 and that for the FSR (PARP(71)) is varied from 2.5 to 8, with a nominal value of 4.0. For each sample, the rederived A × ε is compared to the nominal value, and the difference is taken as the systematic uncertainty. Analogously, an uncertainty is assigned to account for the effects of multiple pp collisions in an event (pileup) by reweighting all MC signal samples such that the distribution of the number of interactions per bunch crossing is shifted, high and low, by one standard deviation compared with that found in data [30]. For the analyses using b tagging, an uncertainty is assigned based on the scale factor that accounts for the differences in b-tagging efficiencies in data compared with simulation [26]. The same procedure as outlined above is repeated, where the b-tagging scale factors are varied within their uncertainties, and the effect on A × ε is evaluated. Uncertainties in the fit parameters of the Gaussian signal are used as an additional systematic uncertainty for each mass point.
Finally, an overall systematic uncertainty of 2.6% is assigned to the integrated luminosity measurement [12]. The ranges in the values of these uncertainties are summarised in Table 2. Systematic uncertainties related to the signal and background shapes are discussed in Section 7.

Results and limits
The three-jet invariant mass distributions are examined for a Gaussian signal peak on top of the smoothly falling background distribution. As has been described, this analysis uses different selection criteria to search for resonances coupling to light-flavour and to heavy-flavour quarks, with the latter search done separately in low-mass and high-mass regions. In the analysis of each of the three selections, the background normalisation parameter is unconstrained and is therefore determined by the SM multijet component of the combined fit. For the function describing the background, the initial values of its parameters are taken from the background-only hypothesis fit to the data, while they are allowed to float in the background-plus-signal hypothesis fits for the limit calculation. The signal is modelled with Gaussians defined by the width and A × ε curves shown in Fig. 6. The uncertainties in the expected number of signal triplets are included as log-normal constraints, where the uncertainty for the width of the Gaussian includes a 10% systematic uncertainty to account for jet resolution effects [20]. For the tt background estimate, uncertainties in both the shape and normalisation are included. In addition to those already discussed in the previous section, uncertainties due to ambiguities in the parton shower matching procedure between the MadGraph and Pythia event generators, as well as those due to the dependence on the renormalisation and factorisation scale, are taken into account.
Upper limits are placed on the cross section times branching fraction for the production of three-jet resonances. A modified-frequentist approach, using the CL s [31,32] technique and a profile likelihood as the test statistic, is employed. Limits are calculated with the frequentist asymptotic calculator implemented in the RooStats [33,34] package. The full CL s calculations give similar limits within a few percent, and closure tests where a fixed signal is injected yield consistent coverage. The observed and expected 95% confidence level (CL) upper limits on the gluino pair-production cross section times branching fraction as a function of gluino mass are presented in Fig. 7. The solid red lines in the figure show the next-to-leading-order (NLO) plus next-to-leading-logarithm (NLL) cross sections for gluino pair production [35][36][37][38][39], and the dashed red lines indicate the corresponding one-standard-deviation (σ) uncertainties, which range between 15% and 43%. To quote final results, we use the points where the −1σ uncertainty curve for the NLO + NLL cross section crosses the expected- and observed-limit curves. We additionally quote the result where the central theoretical curve intersects the limit curves.
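Reading a mass limit off the crossing of a theory curve and a limit curve can be sketched as follows. The interpolation scheme (linear in mass for the logarithm of the ratio of the two curves) is an assumption chosen for this example, not necessarily the procedure used in the analysis, and the cross section values are invented; passing the −1σ theory curve instead of the central one would give the more conservative limit.

```python
import math


def crossing_mass(masses, limit, theory):
    """Lowest mass at which the theory curve falls to the limit curve.

    Interpolates log(theory / limit) linearly in mass between grid points.
    Returns None if the theory curve never drops below the limit.
    """
    diffs = [math.log(t / l) for t, l in zip(theory, limit)]
    for i in range(len(masses) - 1):
        d0, d1 = diffs[i], diffs[i + 1]
        if d0 > 0.0 >= d1:
            frac = d0 / (d0 - d1)
            return masses[i] + frac * (masses[i + 1] - masses[i])
    return None


# Hypothetical inputs: masses in GeV, cross sections in pb.
toy_masses = [400.0, 500.0, 600.0, 700.0]
toy_theory = [8.0, 2.0, 0.5, 0.125]   # steeply falling predicted cross section
toy_limit = [1.0, 1.0, 1.0, 1.0]      # flat upper limit, for simplicity

mass_limit = crossing_mass(toy_masses, toy_limit, toy_theory)
print(f"masses below about {mass_limit:.0f} GeV are excluded in this toy example")
```

Masses for which the predicted cross section exceeds the upper limit are excluded, so the crossing point marks the upper edge of the excluded range.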
The production of gluinos undergoing RPV decays into light-flavour jets is excluded at 95% CL for gluino masses below 650 GeV, with a less conservative exclusion of 670 GeV based upon the theory value at the central scale. The respective expected limits are 755 and 795 GeV. These results extend the limit of 460 GeV [10] obtained with the 7 TeV CMS dataset. Gluinos whose decay includes a heavy-flavour jet are excluded for masses between 200 and 835 GeV, which is the most stringent mass limit to date for this model of RPV gluino decay, with the less conservative exclusion up to 855 GeV from the central theoretical value. The respective expected limits are 825 and 860 GeV. While a smaller phase space is probed in the heavy-flavour search, the limits extend to higher masses because of the reduction of the background.
Fig. 7. Observed and expected 95% CL cross section limits as a function of mass for the inclusive (top) and heavy-flavour searches (bottom). The limits for the heavy-flavour search cover two mass ranges, one for low-mass gluinos ranging from 200 to 600 GeV, and one for high-mass gluinos covering the remainder of the mass range up to 1500 GeV. The solid red lines show the NLO + NLL predictions [35][36][37][38][39], and the dashed red lines give the corresponding one-standard-deviation uncertainty bands [40].

Summary
A search for hadronic resonance production in pp collisions at a centre-of-mass energy of 8 TeV has been conducted by the CMS experiment at the LHC with a data sample corresponding to an integrated luminosity of 19.4 fb⁻¹. The approach is model independent, with event selection criteria optimised using the RPV supersymmetric model for gluino pair production in a six-jet final state. Two different scenarios for this RPV decay have been considered: gluinos decaying exclusively to light-flavour jets, and gluinos decaying to one b-quark jet and two light-flavour jets, with the assumption in both cases of a 100% branching fraction for gluinos decaying to quark jets. Methods based on data have been used to derive estimates of background from SM multijet processes. Events with high jet multiplicity and a large scalar sum of jet p T have been analysed for the presence of signal events, and no deviation has been found between the standard model background expectations and the measured mass distributions. The production of gluinos undergoing RPV decay into light-flavour jets has been excluded at the 95% CL for masses below 650 GeV. Gluinos that include a heavy-flavour jet in their decay have been excluded at 95% CL for masses between 200 and 835 GeV, which is the most stringent limit to date for this model of gluino decay.