Search for pair-produced resonances decaying to jet pairs in proton-proton collisions at sqrt(s) = 8 TeV

Results are reported of a general search for pair production of heavy resonances decaying to pairs of jets in events with at least four jets. The study is based on up to 19.4 inverse femtobarns of integrated luminosity from proton-proton collisions at a center-of-mass energy of 8 TeV, recorded with the CMS detector at the LHC. Limits are determined on the production of scalar top quarks (top squarks) in the framework of R-parity violating supersymmetry and on the production of color-octet vector bosons (colorons). First limits at the LHC are placed on top squark production for two scenarios. The first assumes decay to a bottom quark and a light-flavor quark and is excluded for masses between 200 and 385 GeV, and the second assumes decay to a pair of light-flavor quarks and is excluded for masses between 200 and 350 GeV at 95% confidence level. Previous limits on colorons decaying to light-flavor quarks are extended to exclude masses from 200 to 835 GeV.


Introduction
We present the results of a search for pair-produced dijet resonances decaying into light- and heavy-flavor quarks in multijet events. The analysis is based on data samples corresponding to as much as 19.4 ± 0.5 fb$^{-1}$ [1] of integrated luminosity from proton-proton collisions at $\sqrt{s} = 8$ TeV, collected with the CMS detector [2] at the CERN LHC in 2012. Events that have at least four jets with high transverse momentum ($p_T$) are selected and investigated for evidence of pair-produced dijet resonances.
Many models of particle physics beyond the standard model (SM) incorporate particles that decay into fully hadronic final states. Supersymmetric (SUSY) models are SM extensions that simultaneously solve the hierarchy problem and unify particle interactions [3,4]. In natural SUSY models, where there is minimal fine-tuning, the top quark superpartner (top squark) and the superpartners of the Higgs boson (higgsinos) are required to be light [5-9]. Natural SUSY is underconstrained in certain R-parity violating (RPV) scenarios [10]. R-parity is a quantum number defined as $R = (-1)^{3B+L+2S}$, where $B$ and $L$ are the baryon and lepton numbers, respectively, and $S$ is the spin. The RPV superpotential, $W$, is defined as:
$$W = \frac{1}{2}\lambda_{ijk} L_i L_j E^c_k + \lambda'_{ijk} L_i Q_j D^c_k + \frac{1}{2}\lambda''_{ijk} U^c_i D^c_j D^c_k,$$
where $\lambda$, $\lambda'$, and $\lambda''$ are the couplings, $i,j,k$ are the generation indices, $L$ and $Q$ are the doublet superfields of the lepton and quark, respectively, and $E^c$, $D^c$, and $U^c$ are the singlet superfields of the lepton, down-type and up-type quarks, respectively. Models that incorporate RPV may allow baryon number violation through a non-zero $\lambda''_{ijk}$ coupling, referred to here as $\lambda_{UDD}$, and one such unconstrained scenario [11] is that of the hadronically decaying top squark, $\tilde{t} \to qq$. If the top squarks are pair-produced in hadronic collisions and then decay via such an RPV process, the final state would consist of four jets with no momentum imbalance in the transverse plane.
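As a small illustration of the R-parity definition above, the following sketch evaluates $R = (-1)^{3B+L+2S}$ for a few particles. The function name `r_parity` and the use of exact fractions are choices made here for illustration only; the quantum numbers are the standard ones (a quark has B = 1/3 and spin 1/2, and its scalar superpartner has the same B and L but spin 0).

```python
from fractions import Fraction


def r_parity(baryon_number, lepton_number, spin):
    """Return R = (-1)^(3B + L + 2S) as +1 or -1."""
    exponent = (3 * Fraction(baryon_number)
                + Fraction(lepton_number)
                + 2 * Fraction(spin))
    if exponent.denominator != 1:
        raise ValueError("3B + L + 2S must be an integer")
    # Only the parity of the exponent matters.
    return 1 if exponent.numerator % 2 == 0 else -1


# (B, L, S) for a few illustrative states
top_quark = r_parity(Fraction(1, 3), 0, Fraction(1, 2))   # SM particle
top_squark = r_parity(Fraction(1, 3), 0, 0)               # scalar superpartner
electron = r_parity(0, 1, Fraction(1, 2))                 # SM particle

print(top_quark, top_squark, electron)
```

SM particles come out with R = +1 and their superpartners with R = -1, which is why a decay of a single top squark into two quarks changes R-parity.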
In addition to top squark production, hadron collider searches for pair production of resonances decaying into jet pairs are sensitive to a number of models that predict new particles carrying color quantum numbers. Some models predict pair production through gg interactions of color-octet vectors, also called colorons (C) [12], which then decay to quark pairs. The associated final state of the signal is characterized by the presence of four high-$p_T$ jets.
The CDF collaboration has placed 95% confidence level (CL) exclusion limits [13] on top squark production followed by RPV decays in the mass range 50-90 GeV and on coloron production in the mass range 50-125 GeV. At the LHC, both ATLAS [14] and CMS [15] have performed searches for paired dijet resonances. While ATLAS has placed limits on scalar gluon masses between 100 and 287 GeV, and CMS on coloron masses between 250 and 740 GeV, so far neither search has been sensitive enough to set limits on hadronic RPV decays of directly produced top squarks.
In this paper, we concentrate on searches for top squarks and colorons. The benchmark signals are those where the top squark is the lightest supersymmetric particle and decays either into two light quarks (first scenario) or into a b quark and a light quark (second scenario) [16-20]. We separately consider the possibility of decays within the coloron model (gg → CC → qqqq).

CMS experiment
The central feature of the CMS apparatus [2] is a superconducting solenoid of 6 m internal diameter, providing a magnetic field of 3.8 T. Within the superconducting solenoid volume are a silicon pixel and strip tracker, a lead tungstate electromagnetic calorimeter (ECAL), and a hadron calorimeter (HCAL), which is made of interleaved layers of scintillator and brass absorber. Muons are measured in gas ionization detectors embedded in the steel return yoke outside the solenoid. Extended forward calorimetry complements the coverage provided by the barrel and endcap detectors. Energy deposits from hadronic jets are measured using the ECAL and HCAL. A more detailed description of the CMS detector, together with a definition of the coordinate system used and the relevant kinematic variables, can be found in Ref. [2].

Triggering and object reconstruction
One data set was recorded over the entire 2012 data-taking period with a multilevel trigger system, which selected events with at least four jets with $p_T > 80$ GeV, reconstructed using calorimeter information only. In addition, a second data set was recorded using a trigger with a lower jet $p_T$ threshold, which was decreased progressively from 50 to 45 GeV. The latter data represent only a subset of the entire 2012 data set, corresponding to an integrated luminosity of 12.4 fb$^{-1}$. The analysis is separated into two parts: a dedicated "low-mass" search with a focus on the mass region from 200 to 300 GeV, which takes advantage of this lower jet $p_T$ threshold, and a "high-mass" search focusing on top squark masses above 300 GeV, which uses the entire 19.4 fb$^{-1}$ data set.
The analysis is based upon objects reconstructed using the CMS Particle Flow algorithm [21]. This method combines calorimeter information with reconstructed charged particle tracks to identify individual particles such as photons, leptons, and neutral and charged hadrons. The energy of photons is directly obtained from the calibrated ECAL measurement. The energy of an electron is determined from a combination of its track momentum at the main interaction vertex, the corresponding ECAL cluster energy, and the energy sum of all bremsstrahlung photons associated with the track. The energy of a muon is obtained from its associated track momentum. The charged hadron energy is calculated from a combination of the track momentum and the corresponding ECAL and HCAL energies, corrected for zero-suppression effects, and calibrated for the combined response function of the calorimeters. Finally, the energy of neutral hadrons is obtained from the corresponding corrected ECAL and HCAL energies. Jets are reconstructed from the particle flow "objects" using the anti-$k_T$ algorithm [22] with a distance parameter of 0.5.
Jet energy scale corrections [23] are applied to account for the combined response function of the calorimeters to hadrons. The corrections are derived from Monte Carlo (MC) simulation and are confirmed with in situ measurements of the energy balance of dijet and photon+jet events. In data, a small residual correction factor is included to account for differences in jet response between data and simulation. The total size of the applied corrections is approximately 5-10%, and the corresponding uncertainties vary from 3 to 5%, depending on the measured jet pseudorapidity $\eta$ and $p_T$. To remove misidentified jets, which arise primarily from calorimeter noise, jet quality criteria [24] are applied. More than 99.8% of all selected jets, in both data and signal event samples, satisfy these criteria.
To identify jets produced by b quark hadronization, the analysis uses the medium selection of the combined secondary vertex b-tagging algorithm [25]. The algorithm employs a multivariate technique, which takes as input information from the transverse impact parameter with respect to the primary vertex of the associated tracks and from characteristics of the reconstructed secondary vertices. The output of the algorithm is used to discriminate b quark jets from light-flavor and gluon jets, with typical values of the b-tagging efficiency and misidentification probability of 72% and 1.1%, respectively.

Generation of simulated events
Both top squark production and coloron production are simulated using the MADGRAPH 5.1.5.12 [26] event generator with the CTEQ6L1 parton distribution functions [27], and their decays are simulated using the PYTHIA 6.426 [28] MC program. Top squark signal events are generated with up to two additional initial-state partons, and each top squark decays into two jets through the $\lambda_{UDD}$ quark RPV coupling. Two scenarios are considered for this coupling. First, the coupling $\lambda_{312}$, where the three numerical subscripts refer to the quark generations of the corresponding quarks, is set to a non-zero value such that the decay of the top squark to two light-flavor jets is allowed. The second case instead sets a non-zero value for $\lambda_{323}$, resulting in top squark decay into one b jet and one light-flavor jet. In both of the above cases, the branching ratio of the top squark decay to two jets is set to 100%. For the generation of this signal, all superpartners except the top squarks are taken to be decoupled [16-20] and no intermediate particles are produced in the top squark decay. Top squarks are generated with masses from 200 GeV to 1 TeV in 50 GeV steps for both coupling scenarios. The cross section estimates are made at next-to-leading order (NLO) with next-to-leading-logarithm (NLL) corrections [29-33], and assigned appropriate theoretical uncertainties [34]. For the coloron signal scenario, each coloron decays into two light-flavor jets with a branching ratio of 100%, and masses are generated from 200 GeV to 2 TeV. Backgrounds from SM multijet processes are simulated using MADGRAPH, which generates events with two to four partons via matrix element calculations, and these events are showered through PYTHIA. In all samples, the MLM matching procedure [35] is used to avoid double counting of jets, and simulation of the CMS detector is performed with GEANT4 [36].

Event selection
Events recorded with the four-jet triggers are required to have a well-reconstructed primary event vertex [37]. Events must also contain at least four jets, each with $|\eta| < 2.5$ and reconstructed $p_T$ greater than 80 GeV for the low-$p_T$ trigger and 120 GeV for the higher-$p_T$ trigger. With the above requirements, the offline efficiency is above 99% for all selected events.
The leading four jets, ordered in $p_T$, are used to create three unique combinations of dijet pairs per event. A distance variable is implemented to select the jet pairing that best corresponds to the two resonance decays, $\Delta R = \sqrt{(\Delta\eta)^2 + (\Delta\phi)^2}$, where $\Delta\eta$ and $\Delta\phi$ are the differences in $\eta$ and $\phi$ between the two jets, respectively. This variable exploits the smaller relative distance between daughter jets from the same top squark parent decay compared to that between uncorrelated jets. For each dijet pair configuration the value of $\Delta R_{dijet}$ is calculated:
$$\Delta R_{dijet} = \sum_{i=1,2} |\Delta R_i - 1|,$$
where $\Delta R_i$ represents the separation between two jets in dijet pair $i$, and an offset of 1 is used to maximize the signal efficiency and to minimize dijet pair splitting effects. The configuration that minimizes the value of $\Delta R_{dijet}$ is selected, with $\Delta R_{min}$ representing the minimum $\Delta R_{dijet}$ for the event. Figure 1 shows the distributions of the fourth highest jet $p_T$ and the $\Delta R_{min}$ variable for data events, those of a simulated SM multijet sample, and those of a 400 GeV top squark signal sample.
Figure 1: Distributions of the fourth highest jet $p_T$ (left) and $\Delta R_{min}$ (right) for events from data, the simulated SM multijet sample, and a 400 GeV top squark signal. Events contain at least four jets, each with $p_T > 120$ GeV and $|\eta| < 2.5$, and all distributions have an area normalized to unity.
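The pairing step can be sketched as follows, assuming that the per-configuration quantity is the sum over the two dijets of |ΔR − 1|, as implied by the offset of 1 described above. Jets are represented by hypothetical `(pt, eta, phi)` tuples, and the function names are illustrative rather than taken from any CMS software.

```python
import math

# The three ways to split four jets (indices into the pT-ordered list) into two pairs.
PAIRINGS = (((0, 1), (2, 3)), ((0, 2), (1, 3)), ((0, 3), (1, 2)))


def delta_r(jet_a, jet_b):
    """Angular separation between two jets given as (pt, eta, phi)."""
    d_eta = jet_a[1] - jet_b[1]
    # Wrap the azimuthal difference into [-pi, pi].
    d_phi = math.remainder(jet_a[2] - jet_b[2], 2.0 * math.pi)
    return math.hypot(d_eta, d_phi)


def best_pairing(jets, offset=1.0):
    """Return (pairing, delta_r_min, leading_jets) for the four leading jets."""
    leading = sorted(jets, key=lambda jet: jet[0], reverse=True)[:4]

    def delta_r_dijet(pairing):
        return sum(abs(delta_r(leading[i], leading[j]) - offset)
                   for i, j in pairing)

    pairing = min(PAIRINGS, key=delta_r_dijet)
    return pairing, delta_r_dijet(pairing), leading


# Toy event: two well-separated pairs of nearby jets.
toy_jets = [(200.0, 0.0, 0.0), (180.0, 0.6, 0.8),
            (150.0, -0.5, 3.0), (130.0, -0.5, -2.5)]
pairing, delta_r_min, _ = best_pairing(toy_jets)
print(pairing, round(delta_r_min, 3))
```

In the toy event the third and fourth jets sit on either side of the φ = ±π boundary, so the wrap-around in `delta_r` is what allows them to be recognised as close to each other.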
Once a dijet pair configuration is chosen, two additional quantities are used to reject the backgrounds from SM multijet events and incorrect signal pairings: the pseudorapidity difference between the two dijet systems, $\Delta\eta_{dijet}$, and the absolute value of the fractional mass difference, $\Delta m/m_{av}$, where $\Delta m$ is the difference between the two dijet masses and $m_{av}$ is their average value. The $\Delta m/m_{av}$ quantity is small with a peak at zero in signal events where the correct pairing is chosen, while for SM multijet background or incorrectly paired signal events, this distribution is much broader. Thus, the sensitivity of the search benefits from imposing a maximum value on $\Delta m/m_{av}$. Similarly, it is advantageous to require that $\Delta\eta_{dijet}$ be small. This rejects background events with energy more evenly shared among the jets, since these typically yield higher values of $\Delta\eta_{dijet}$. Figure 2 shows the distributions of the $\Delta m/m_{av}$ and $\Delta\eta_{dijet}$ variables for data events, those of a simulated SM multijet sample, and those of a 400 GeV top squark signal sample. An additional kinematic variable $\Delta$ is calculated for each dijet pair:
$$\Delta = \left(\sum_{i=1,2} p_T^i\right) - m_{av},$$
where the $p_T$ sum is over the two jets in the dijet configuration. This type of variable has been used extensively in hadronic resonance searches at both the Tevatron and the LHC [15, 38-41].
Requiring a minimum value of $\Delta$ results in a lowering of the peak position value of the $m_{av}$ distribution from background SM multijet events. With this selection the fit to the background can be extended to lower values of $m_{av}$, making a wider range of top squark and coloron masses accessible to the search.
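To make the three kinematic quantities concrete, the sketch below computes the fractional mass difference, the dijet pseudorapidity difference, and Δ for one chosen pairing. It assumes massless jets given as hypothetical `(pt, eta, phi)` tuples and takes Δ to be the scalar $p_T$ sum of the two jets in a dijet minus the average dijet mass; both are assumptions of this illustration, and the helper names are not from the analysis code.

```python
import math


def four_vector(pt, eta, phi, mass=0.0):
    """Return (E, px, py, pz) for a jet; massless by default."""
    px = pt * math.cos(phi)
    py = pt * math.sin(phi)
    pz = pt * math.sinh(eta)
    energy = math.sqrt(px * px + py * py + pz * pz + mass * mass)
    return energy, px, py, pz


def dijet(jet_a, jet_b):
    """Return (mass, pseudorapidity, scalar pT sum) of a two-jet system.

    Assumes the system has non-zero transverse momentum.
    """
    e, px, py, pz = (a + b for a, b in
                     zip(four_vector(*jet_a), four_vector(*jet_b)))
    mass = math.sqrt(max(e * e - px * px - py * py - pz * pz, 0.0))
    eta = math.asinh(pz / math.hypot(px, py))
    return mass, eta, jet_a[0] + jet_b[0]


def pair_variables(dijet_1, dijet_2):
    """Kinematic variables for one pairing of the four leading jets."""
    m1, eta1, sum_pt1 = dijet_1
    m2, eta2, sum_pt2 = dijet_2
    m_av = 0.5 * (m1 + m2)
    return {
        "m_av": m_av,
        "dm_over_mav": abs(m1 - m2) / m_av,
        "deta_dijet": abs(eta1 - eta2),
        "delta": (sum_pt1 - m_av, sum_pt2 - m_av),  # one value per dijet
    }


# Toy pairing: each dijet is built from two 100 GeV jets separated by pi/2 in phi.
first = dijet((100.0, 0.0, 0.0), (100.0, 0.0, 0.5 * math.pi))
second = dijet((100.0, 0.0, math.pi), (100.0, 0.0, 1.5 * math.pi))
print(pair_variables(first, second))
```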
Finally, as the presence of heavy-flavor final state jets is a natural extension of the RPV top squark scenarios, the use of b tagging is exploited to further increase signal sensitivity by increasing background rejection. We consider two scenarios: the heavy-flavor search, which uses b tagging to increase the sensitivity for top squark decays into heavy-flavor jets, and the inclusive search, which focuses instead on decays into light-flavor jets.
The optimization for the signal selection is performed as a function of the three kinematic variables described above, $\Delta m/m_{av}$, $\Delta\eta_{dijet}$, and $\Delta$, as well as the fourth jet $p_T$. Because the number of expected background events is large, we use $S/\sqrt{B}$ as the metric for signal optimization, where $S$ and $B$ are the numbers of signal and background events, respectively, within a window of width ±10% centered at the generated top squark mass; the value of 10% is roughly twice the expected resolution for signal masses. We study this metric by evaluating $S$ and $B$ based on events passing a number of thresholds of each kinematic variable and obtain several maps, in which a value of $S/\sqrt{B}$ is found for every combination of the four variables. These maps are produced in the low- and high-mass search regions, and for the inclusive and heavy-flavor analyses separately. An example of this is given in Fig. 3, where the distribution for a 500 GeV top squark and for a fit to the SM multijet distribution are shown for one operating point. The signal shape is bimodal owing to a small fraction of events with incorrect signal pairings, and the Gaussian peak centered at the generated mass is the part of the distribution used in the optimization. The threshold values of the four kinematic variables, corresponding to maximum values of $S/\sqrt{B}$ in these maps, are taken as a working point. Because of similar results in this optimization, the inclusive and heavy-flavor searches use common working points, with the exception of the heavy-flavor analysis requirement of b tagging. A summary of the requirements is listed in Table 1 for both the low- and high-mass searches. An example of the selection is shown in Fig. 4. The correlation between the pseudorapidity values for the two dijets is plotted, for both 400 GeV top squark and simulated SM samples, with the optimized $\Delta\eta_{dijet}$ threshold overlaid.
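The optimization scan can be illustrated with a short sketch that evaluates $S/\sqrt{B}$ inside a ±10% mass window for every combination of thresholds on a grid. Events are represented by hypothetical dictionaries with keys `m_av`, `dm_over_mav`, `deta_dijet`, `delta`, `pt4`, and `weight`; for simplicity a single Δ value per event is used. The event values and grid below are invented toy numbers, not those of the analysis.

```python
import itertools
import math


def significance(signal, background, cuts, mass, window=0.10):
    """S / sqrt(B) for events passing the cuts inside the mass window."""
    def passes(event):
        return (event["dm_over_mav"] < cuts["dm_over_mav"]
                and event["deta_dijet"] < cuts["deta_dijet"]
                and event["delta"] > cuts["delta"]
                and event["pt4"] > cuts["pt4"]
                and abs(event["m_av"] - mass) <= window * mass)

    s = sum(event["weight"] for event in signal if passes(event))
    b = sum(event["weight"] for event in background if passes(event))
    return s / math.sqrt(b) if b > 0 else 0.0


def scan(signal, background, grid, mass):
    """Return (best S/sqrt(B), cuts) over all threshold combinations."""
    names = list(grid)
    best = None
    for values in itertools.product(*(grid[name] for name in names)):
        cuts = dict(zip(names, values))
        z = significance(signal, background, cuts, mass)
        if best is None or z > best[0]:
            best = (z, cuts)
    return best


# Toy inputs, for illustration only.
toy_signal = [{"m_av": 400.0, "dm_over_mav": 0.05, "deta_dijet": 0.5,
               "delta": 50.0, "pt4": 130.0, "weight": 4.0}]
toy_background = [
    {"m_av": 410.0, "dm_over_mav": 0.05, "deta_dijet": 0.5,
     "delta": 50.0, "pt4": 130.0, "weight": 16.0},
    {"m_av": 390.0, "dm_over_mav": 0.20, "deta_dijet": 0.5,
     "delta": 50.0, "pt4": 130.0, "weight": 84.0},
]
toy_grid = {"dm_over_mav": [0.1, 0.3], "deta_dijet": [1.0],
            "delta": [25.0], "pt4": [120.0]}
print(scan(toy_signal, toy_background, toy_grid, mass=400.0))
```

In this toy case the tighter requirement on the fractional mass difference removes most of the background while keeping the signal, so it is the combination returned by the scan.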
For the heavy-flavor search, we repeat the optimization procedure by varying the selections based on the number of b-tagged jets in the event. We find that the optimal selection is the requirement that events contain at least two b-tagged jets among the four highest-$p_T$ jets.
After all selection requirements are applied, the fraction of signal events remaining in the heavy-flavor search ranges from 0.4% to 1.2% for the low-mass search and from 0.4% to 1.6% for the high-mass search. For the inclusive search, the fraction of signal events remaining ranges from 1.4% to 7.4% for the low-mass search and from 1.4% to 6.5% for the high-mass search. In all scenarios, the leading efficiency loss is due to the required jet $p_T$ thresholds. In the data, approximately 20% of the selected events passing the high-mass search criteria are in common with the low-mass search.

Background estimation and systematic uncertainties
The dominant background for this search comes from SM multijet events. Following a method used previously for similar resonance searches [39-42], the steeply falling SM background shape is modeled with the use of a four-parameter function:
$$\frac{dN}{dm_{av}} = \frac{p_0 \left(1 - m_{av}/\sqrt{s}\right)^{p_1}}{\left(m_{av}/\sqrt{s}\right)^{p_2 + p_3 \ln(m_{av}/\sqrt{s})}},$$
where $N$ is the number of events and $p_0$ through $p_3$ are parameters of the function. Localized deviations of the data from the background hypothesis would indicate a signal, and the fitted data distributions for the four search scenarios are shown in Fig. 5. The agreement of each background fit to its respective mass distribution is quantified by computing in each bin the difference of the data and the fit, divided by the statistical uncertainty associated with the data. These distributions indicate that no significant deviation is found in any of the four search scenarios.
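A minimal sketch of the background model and of the per-bin comparison is given below. It assumes the four-parameter form commonly used in dijet resonance searches, $dN/dm_{av} = p_0 (1 - x)^{p_1} / x^{p_2 + p_3 \ln x}$ with $x = m_{av}/\sqrt{s}$, and takes the statistical uncertainty of a bin with $n$ observed events to be $\sqrt{n}$. Both are assumptions of this illustration, as are the function names and the parameter values in the usage lines.

```python
import math

SQRT_S = 8000.0  # center-of-mass energy in GeV


def background_shape(m_av, p0, p1, p2, p3):
    """Four-parameter falling spectrum evaluated at average dijet mass m_av (GeV)."""
    x = m_av / SQRT_S
    return p0 * (1.0 - x) ** p1 / x ** (p2 + p3 * math.log(x))


def pulls(observed, expected):
    """(data - fit) / sqrt(data) per bin; empty bins are assigned a pull of 0."""
    return [(n - mu) / math.sqrt(n) if n > 0 else 0.0
            for n, mu in zip(observed, expected)]


# Illustrative parameter values only: the shape falls steeply with mass.
params = (1.0, 10.0, 5.0, 0.0)
print(background_shape(300.0, *params) > background_shape(600.0, *params))
print(pulls([100, 64], [90.0, 72.0]))
```

A localized run of large positive pulls in neighbouring bins is the kind of feature that would point to a resonance on top of the smooth background.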
The dominant systematic uncertainties that affect the yield originate from six sources: the imperfect knowledge of the integrated luminosity (2.6%) [1]; the simulation of initial-state radiation (5%) [26]; the precision of the jet energy corrections (1-6.2%) [23]; the jet energy resolution (10%) [23]; the efficiency of b tagging (2%) [25]; and the modeling of the effect of multiple pp interactions (<1.5%) [43]. We use log-normal priors to model systematic uncertainties on the signal, which are treated as nuisance parameters. For the uncertainty associated with the background, specifically the choice of function used to model the background shape, we consider several families of functions as a basis of comparison: exponentials, power-law functions, and Laurent series. Using a method previously employed by CMS [44], we study how the expected yield in the presence of a signal changes when each of these functions is used in place of the default one, with the simulated SM multijet shape taken as the input background for the pseudo-experiments.
For each pseudo-experiment, each of the parameterizations is fit to the fluctuated background shape, and the largest value of the fractional difference between the alternate fit result and the default one is calculated for every $m_{av}$ bin. The mean of the resulting distribution is taken as the bin-by-bin uncertainty for each alternate parameterization, and the average of the alternate parameterization uncertainties determines the overall assigned uncertainty. This uncertainty increases with $m_{av}$ from 0.3% to 0.6% in the low-mass search range, and from 0.5% to 30% in the high-mass search range.

Results
We set upper limits on the production cross section using a Bayesian formalism with a uniform prior for the cross section. The binned likelihood $L$ can be written as:
$$L = \prod_i \frac{\mu_i^{n_i} e^{-\mu_i}}{n_i!},$$
where $\mu_i$ is defined as $\mu_i = \alpha N_i(S) + N_i(B)$ and $n_i$ is the measured number of events in the $i$th bin of $m_{av}$. Here, $N_i(S)$ is the number of expected events from the signal in the $i$th $m_{av}$ bin, $\alpha$ is a constant to scale the signal amplitude, and $N_i(B)$ is the number of expected events from background in the $i$th $m_{av}$ bin. The likelihood is combined with the prior and nuisance parameters, and then marginalized to give the posterior density for the signal cross section. Integrating the posterior density to 0.95 of the total gives the 95% CL limit for the signal cross section. The expected limits on the cross section are estimated with pseudo-experiments generated using background shapes obtained by signal-plus-background fits to the data. Figure 6 shows the observed and expected 95% CL upper limits on the cross section $\sigma$, together with a dotted red line indicating the NLO+NLL predictions for top squark production [29-33], where the top squark mass is equal to $m_{av}$. The vertical dashed blue line at a top squark mass of 300 GeV indicates the transition from the low- to the high-mass limits, and at this mass point the limits are shown for both analyses. The production of top squarks undergoing RPV decays into light-flavor jets is excluded at 95% CL for top squark masses from 200 to 350 GeV. Top squarks whose decay includes a heavy-flavor jet are excluded for masses between 200 and 385 GeV. We exclude the production of colorons decaying into four jets at 95% CL for masses between 200 and 835 GeV, as seen in Fig. 7.
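The limit-setting step can be sketched in simplified form as follows. The example assumes a product of Poisson terms for the binned likelihood, a uniform prior on the signal scale factor α between 0 and a hypothetical cutoff `alpha_max`, and no nuisance parameters, so it omits the marginalization over systematic uncertainties described above. The posterior is integrated numerically on a grid until 95% of its area is reached.

```python
import math


def log_likelihood(alpha, n_obs, n_sig, n_bkg):
    """Log of the product of Poisson terms with mu_i = alpha * S_i + B_i."""
    total = 0.0
    for n, s, b in zip(n_obs, n_sig, n_bkg):
        mu = alpha * s + b  # must be positive in every bin
        total += n * math.log(mu) - mu - math.lgamma(n + 1)
    return total


def upper_limit(n_obs, n_sig, n_bkg, alpha_max, steps=20000, cl=0.95):
    """Bayesian upper limit on alpha with a uniform prior on [0, alpha_max]."""
    step = alpha_max / steps
    alphas = [(k + 0.5) * step for k in range(steps)]  # midpoint rule
    logs = [log_likelihood(a, n_obs, n_sig, n_bkg) for a in alphas]
    peak = max(logs)
    weights = [math.exp(value - peak) for value in logs]  # unnormalized posterior
    target = cl * sum(weights)
    running = 0.0
    for a, w in zip(alphas, weights):
        running += w
        if running >= target:
            return a + 0.5 * step  # upper edge of the bin that crosses the target
    return alpha_max


# Toy three-bin example with invented yields.
limit = upper_limit(n_obs=[12, 9, 4], n_sig=[1.0, 3.0, 1.0],
                    n_bkg=[11.0, 8.5, 5.0], alpha_max=20.0)
print(round(limit, 2))
```

Multiplying the limit on α by the signal cross section used to normalize the expected signal yields gives the corresponding cross section limit. In a full treatment the likelihood would first be averaged over the nuisance parameters with their log-normal priors.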

Summary
A search has been performed for pair production of heavy resonances decaying to pairs of jets in four-jet events from proton-proton collisions at $\sqrt{s} = 8$ TeV with the CMS detector. The distribution in the average mass of selected dijet pairs has been investigated for localized disagreements between the data and the background estimate. This method takes advantage of a number of additional optimized kinematic requirements imposed on the dijet pair. No significant deviation is found between the selected events and the expected standard model multijet background. Limits are placed on the production of colorons decaying into four jets with a 100% branching fraction, excluding masses between 200 and 835 GeV at 95% confidence level. For this model, these results include first limits in the mass ranges of 200-250 GeV and 740-835 GeV, extending previous limits [15] to lower masses by 50 GeV, and to higher masses by 95 GeV. Limits are set on top squark pair production through the $\lambda_{UDD}$ coupling to final states with either only light-flavor jets or both light- and heavy-flavor jets with a 100% branching fraction. We exclude at 95% confidence level top squark production followed by R-parity violating decays to light-flavor jets for top squark masses from 200 to 350 GeV and decays to heavy-flavor jets for masses between 200 and 385 GeV. Both sets of limits are the most stringent such limits to date, and the first from the LHC for this model of R-parity violating top squark decay.