Search for heavy resonances decaying to two Higgs bosons in final states containing four b quarks

A search is presented for narrow heavy resonances X decaying into pairs of Higgs bosons (H) in proton-proton collisions collected by the CMS experiment at the LHC at sqrt(s) = 8 TeV. The data correspond to an integrated luminosity of 19.7 inverse femtobarns. The search considers HH resonances with masses between 1 and 3 TeV, having final states of two b quark pairs. Each Higgs boson is produced with large momentum, and the hadronization products of the pair of b quarks can usually be reconstructed as single large jets. The background from multijet and t t-bar events is significantly reduced by applying requirements related to the flavor of the jet, its mass, and its substructure. The signal would be identified as a peak on top of the dijet invariant mass spectrum of the remaining background events. No evidence is observed for such a signal. Upper limits obtained at 95% confidence level for the product of the production cross section and branching fraction sigma(gg to X) B(X to HH to b b-bar b b-bar) range from 10 to 1.5 fb for the mass of X from 1.15 to 2.0 TeV, significantly extending previous searches. For a warped extra dimension theory with a mass scale Lambda[R] = 1 TeV, the data exclude radion scalar masses between 1.15 and 1.55 TeV.


Introduction
The production of pairs of Higgs bosons (H) in the standard model (SM) has a predicted cross section in gluongluon fusion at √ s = 8 TeV [1,2] for the Higgs boson mass m H ≈ 125 GeV [3] of only 10.0±1.4 fb. Many BSM theories suggest the existence of narrow heavy particles X that can decay to a pair of Higgs bosons [4][5][6][7][8][9][10][11][12]. The natural width for such a resonance is expected to be a few percent of its pole mass m X , which corresponds to a typical detector resolution. In contrast, the SM production of Higgs boson pairs results in a broad distribution of effective mass, falling mainly in the range from 300 to 600 GeV. Thus the presence of a narrow e-mail: cms-publication-committee-chair@cern.ch state would be readily detected, even if produced with a cross section as small as that for the SM process.
Searches for narrow particles decaying to two Higgs bosons have already been performed by the ATLAS [13][14][15] and CMS [16][17][18][19] collaborations in pp collisions at the CERN LHC. Until now their reach was limited to m X ≤ 1.5 TeV. Because longitudinal W and Z states are provided by the Higgs field in the SM, any HH resonance potentially also decays into WW and ZZ final states. Searches for X → WW, ZZ, and WZ states were performed by ATLAS and CMS [20][21][22][23][24]. The combinations of these results [24][25][26][27] indicate that the region around m X ≈ 2 TeV is particularly interesting to explore.
This paper reports on a search for X → HH covering the mass range 1.15 < m X < 3.0 TeV, significantly extending the reach of the present results beyond 1.5 TeV. The final state that provides the best sensitivity in this mass range is HH → bbbb, which benefits from the expected large branching fraction (B) of 57.7 % for H → bb [28] and a relatively low background from SM processes.
Many BSM proposals explicitly considered in this paper postulate the existence of a warped extra dimension (WED) [6] and predict the existence of a scalar radion [7-9]. The radion is a spin-0 resonance associated with the fluctuations in the length of the extra dimension. The production cross section as a function of m X is proportional to 1/ 2 R , where R is the scale parameter of the theory. In this paper we consider two cases: R = 1 and 3 TeV. In the first case, the WED theory predicts a cross section that can be detected at the LHC [17], but is challenged by the constraints derived from the electroweak precision measurements [29]. This specific model is excluded up to m X = 1.1 TeV by the previous X → HH searches [14,17]. In contrast, the predicted cross section for R = 3 TeV is a factor of 9 times smaller, but the theory is less constrained by these searches. We consider that the radion is produced exclusively via gluon-gluon fusion processes, with B(radion → HH) ≈ 25 % above 1 TeV.
In the mass range of this search, the topology of the bbbb final state is constrained by the size of the Lorentz boost of the Higgs bosons that is typically γ H ≈ m X /2m H 1 and defines the so-called boosted regime [30][31][32]. In this regime each Higgs boson is produced with a large momentum and its decay products are collimated along its direction of motion. The hadronization of a pair of narrowly separated b quarks will result in a single reconstructed jet of mass compatible with m H . The H candidates are selected by employing jet substructure techniques to identify jets containing constituents with kinematics consistent with the decay of a highly boosted Higgs boson. These candidates are then required to be consistent with decays of B hadrons, based on our b tagging algorithms. The signal is identified in the dijet mass (m jj ) spectrum as a peak above a falling background which originates mainly from multijet events and tt production.

The CMS detector
The central feature of the CMS apparatus is a superconducting solenoid of 6 m internal diameter, providing a magnetic field of 3.8 T. A silicon pixel and strip tracker, a lead tungstate crystal electromagnetic calorimeter, and a brass and scintillator hadron calorimeter, each composed of a barrel and two endcap sections, reside within the solenoid volume. Extensive forward calorimetry complements the coverage provided by the barrel and endcap detectors. Muons are measured in gas-ionization detectors embedded in the steel flux-return yoke outside the solenoid. A detailed description of the CMS detector, together with a definition of the coordinate system and the basic kinematic variables, can be found in Ref. [33].

Simulated events
Monte Carlo (MC) simulations are used to provide: predictions of background processes, optimization of the event selection, and cross-checks of data-based background estimations.
Signal, multijet and tt background events are generated using the leading-order matrix element generator Mad-Graph 5v1.3.30 [34,34]. Parton shower and hadronization are included using pythia 6.4.26 [35], and the matrix element is matched to the parton shower using the MLM scheme [36]. The Z2* tune is used to describe the underlying event. This tune is identical to the Z1 tune [37], but uses the CTEQ6L parton distribution functions (PDF) [38]. The signal events are simulated with an intrinsic width of the radion fixed to 1 GeV, m H = 125 GeV. Different samples are generated for m X ranging from 1.15 to 3 TeV. All generated events are processed through a simulation of the CMS apparatus based on Geant4 [39]. Additional pp interactions within a bunch crossing (pileup) are added to the simulation, with a frequency distribution chosen to match that observed in data.
During this data-taking period the mean number of interactions per bunch crossing is 21.

Event reconstruction and selections
The analysis is based on data from pp interactions observed with the CMS detector at √ s = 8 TeV. The data correspond to an integrated luminosity of 19.7 fb −1 . Events are collected using at least one of the two specific trigger conditions based on jets reconstructed online: the first trigger requires a large m jj calculated for the two jets of highest transverse momentum (referred to as leading jets); the second trigger requires a large value of H T = i p i T , where the sum runs over the reconstructed jets in the event with transverse momenta p T > 40 GeV. The lower thresholds applied to m jj and the H T triggers were changed during the data-taking period to maintain a constant trigger rate while the LHC peak luminosity steadily increased. More than half of the data were collected with m jj > 750 GeV and H T > 650 GeV. The remaining data were collected with the requirement H T > 750 GeV.
Events are required to have at least one reconstructed pp collision vertex within |z| < 24 cm of the center of the detector along the longitudinal beam directions. Many additional vertices, corresponding to pileup interactions, are usually reconstructed in an event using charged particle tracks. We assume that the primary interaction vertex corresponds to the one that maximizes the sum in p 2 T of these associated tracks. Individual particles are reconstructed using a particle-flow (PF) algorithm [40,41] that combines the information from all the CMS detector components. Each such reconstructed particle is referred to as a PF candidate. The five classes of PF candidates correspond to muons, electrons, photons, and charged and neutral hadrons. Charged hadron candidates not originating from the primary vertex of the event are discarded to reduce contamination from pileup [42].
The Cambridge-Aachen (CA) algorithm [43], implemented in FastJet [44], clusters PF candidates into jets using a distance parameter R = 0.8. An event-by-event jet area-based correction [42,45,46] is applied to each reconstructed jet to remove the remaining energy originating from pileup vertices primarily consisting of neutral particles. The jet four-momenta are also corrected to account for the difference between the measured and the expected momentum at the particle level, using the standard CMS correction procedure described in Refs. [47,48].
Events are required to have at least two jets, and the two leading jets each to have p T > 40 GeV and pseudorapidity |η| < 2.5. In addition, identification criteria are applied to remove spurious jets associated with calorimeter noise [40]. To reduce the contribution from multijet events, the two leading jets must be relatively close in η, | η jj | < 1.3, a selection discussed in Refs. [23,49]. Events with m jj < 1 TeV Simulated m P j spectrum for spin-0 radion signals, multijet and tt events, and the spectrum measured in data. Each event contributes twice to the distribution, once per jet. The multijet contribution is rescaled to match the event yield in data, while the signal and tt spectra are rescaled by the large factors indicated, to be visible in the figure are rejected. Above this mass threshold, the efficiency of the trigger requirement for the chosen selections exceeds 99.5 %.
The mass and b flavour properties of the leading jets are used to suppress the multijet and tt backgrounds. Soft gluon radiation and a fraction of the remaining neutral pileup particles are first removed from each jet through the implementation of a jet-grooming algorithm called jet pruning [50,51]. This technique reduces significantly the mass of jets originating from quarks and gluons [52], while improving the resolution of the jets resulting from the hadronic decays of a heavy SM boson [53]. The invariant mass m P j is calculated for the two leading pruned jets. In Fig. 1, the m P j distribution of the two leading jets is shown for data, signal, and background events. For jets initiated by a quark or a gluon, m P j peaks around 15 GeV, while jets from high-momentum Higgs boson decay usually have a pruned mass around 120 GeV. The difference of ≈5 GeV relative to the nominal m H value is related to the presence of neutrinos produced by the semileptonic decays of B mesons, and the inherent nature of the pruning procedure. A small peak near 15 GeV is also observed for signal events, and corresponds mainly to asymmetric decays in which the jet pruning algorithm removes the decay products of one of the two B mesons. Each of the leading jets has to satisfy 110 < m P j < 135 GeV, a requirement that is chosen to maximize the sensitivity of the analysis to the presence of a narrow resonances. Some differences are observed between the data and background estimated from simulation. These discrepancies do not affect the results of this analysis since the background is estimated using techniques based on data only.
The identification of jets likely to have originated from the hadronization of a pair of b quarks exploits the combined secondary vertex (CSV) b jet tagger [54]. This algorithm combines the information from track impact parameters and secondary vertices within a given jet into a continuous output discriminant [54,55]. The working point used in this paper corresponds to an efficiency of 80 % for identifying b jets and a rate of 10 % for mistagging jets from light quarks or gluons as originating from b quarks. This working point was chosen to maximize the sensitivity of the analysis, while retaining a sufficient number of events to allow a reliable estimation of the background.
In the first step of the procedure used to select H jet candidates, the pruned jets are split into two subjets by reversing the final iteration in the jet clustering algorithm. The angular separation between the subjets is R ≡ where η is the pseudorapidity and φ the azimuthal angle. Two cases are considered, with the transition between them occurring at m X ≈ 1.6 TeV: 1. R > 0.3: in this group the jet is considered to be b tagged if at least one subjet satisfies the requirements of the CSV working point. Moreover, the jet is considered as "double b tagged" if both subjets satisfy the CSV requirement. 2. R < 0.3: here the subjet b tagging selection is inefficient [55]. The b tagging algorithm is therefore applied directly to the jet. In this case it is not possible to distinguish between b-tagged and double b-tagged jets, and therefore either of these two possibilities are accepted.
In summary, a jet is considered an H jet candidate if it satisfies the mass and b tagging requirements. Events are selected when both leading jets are H jets, and at least one of them is double b tagged. The simulated results are corrected to match the H and b tagging efficiencies observed in data [55].
A final selection is based on the kinematic properties of the constituents of H jets. The quantity N-subjettiness [56-58] τ N is used to quantify the degree to which constituents of a jet can be arranged into N subjets. The ratio τ 21 = τ 2 /τ 1 is calculated for each of the two H jet candidates. High-(HP) and low-purity (LP) Higgs boson candidates are defined as having τ 21 < 0.5 and 0.5 ≤ τ 21 < 0.75, respectively. Events are required to have at least one HP H jet and another H jet that passes either the HP or LP requirements.
The sample of events satisfying the previously defined criteria is subsequently divided into three categories. Events with two high-purity H jets form the HPHP category. Among the remaining events, those for which the high-purity H jet is the leading jet constitute the HPLP category. The rest of the sample constitutes the LPHP category.
The selection criteria applied to reduce the background are summarized in Table 1. The region of phase space defined by all these criteria is referred to as the signal region. The fraction of the simulated signal and tt samples, satisfying these criteria, as well as the number of data events passing the selections is also provided.
The fiducial selection is defined by the two leading jets having |η| < 2.5, p T > 40 GeV, and a separation | η jj | < 1.3. The fraction of the signal within this fiducial region depends on its spin, and is ≈60 % for a spin-0 resonance. The efficiency of the combined H mass and b tagging criteria for events within the fiducial region, for signal and data, is shown in Fig. 2. The number of data events is reduced by four orders of magnitude while the signal efficiencies range from 10 to 20 % with a weak dependence on m X , and are observed to be independent of the spin of the resonance. Finally, the total acceptance times efficiency is provided in Table 1, and varies between 4.0 and 8.8 %, with the largest fraction of events populating the HPHP category. Figure 2 shows that the probability of incorrectly identifying multijet or tt events as events with two Higgs bosons is less than 0.1 %, and appears to be independent of m jj within statistical uncertainties. A more precise quantification is provided in Table 1 for tt events. In particular, we observe that the dijet mass, the pruned jet mass, and b tagging criteria are each sufficient for reducing the tt background by an order of magnitude. In contrast, the N-subjettiness criterion is inefficient in reducing it.

Signal extraction
The signal is identified in the binned m jj spectrum in bin widths chosen to match the resolution of the dijet mass, as  . This resolution is ≈50 GeV at m X = 1.15 TeV, increasing slowly to ≈100 GeV for m X = 3 TeV. The analysis defines a likelihood, for each m X hypothesis, based on the total number of events in data, signal, and background counted in a mass window in each category. These mass windows have a typical size of three or four bins centered approximatively around m X (see Table 2) and contains more than 95 % of signal events. The amount of signal is estimated in the mass window using MC simulation, while the amount of background is estimated as the integral of a parameterized model. The total likelihood combines the information from the three event categories.

Parameterization of background
After event selection, ≈75, 90, and 95 % of the total background is expected to originate from multijet events in HPHP, HPLP, and LPHP categories, respectively. The remaining contribution is from tt production, which is modelled in simulation, and rescaled to the total next-to-next-to-leading order cross section [60]. All other backgrounds containing Higgs bosons or W/Z bosons decaying into jets represent less than 1 % of the total background.
The total background is estimated from data, without separating the multijet or tt fractions. The expected m jj background spectrum is approximated by a falling exponential for 1 < m jj < 3 TeV, where the parameterization has been chosen to minimize the correlation between the normalization N B and slope a. We obtain a from a fit to the m jj distribution in a control region, defined as the portion of phase space where one of the jets satisfies 110 < m P j < 135 GeV and the other jet is required to have 60 < m P j < 100 GeV. This choice of the window for m P j results from a compromise between limited signal contamination, sufficiently large statistics, and similarity in substructure properties between the sideband jet and the H jet. To use this control region we assume that there is no resonant signal in the ZH final state.
The control region contains between 1.1-2 times the number of events in the signal region depending on the category. The result of the fit and the uncertainty band associated with the uncertainty in the parameter a are shown in Fig. 3. The effect of a residual contamination of the control region by the signal is explicitly checked by adding an HH signal to the control region at different masses, with a typical σ (gg → X → HH) B(X → HH → bbbb), corresponding to the sensitivity of the analysis at a given m X . The change in the slope parameter a is observed to be negligible.
We extract N B for each signal hypothesis from the fit to the data that excludes events in the counting window described in Sect. 5. This background extraction procedure motivates the choice of the lower value of the m X window for which the search is performed. In order to improve the constraint on N B , there must be at least one bin on the left side of the mass window to be retained.
This background estimation procedure assumes, on the one hand, that the m jj spectrum is similar in the signal and the control regions, and on the other hand, that it is similar for multijet and tt event samples. The following cross-checks are performed to validate these hypotheses: -The similarity of distributions for the signal and control regions are confirmed in the simulated multijet sample. -The parameters a and N B are extracted from the signal region (using an approach similar to that of Ref.
[23]), and found to be compatible within statistical uncertainties with the parameters obtained through the normal method of background estimation. -The bin-by-bin normalization between the signal and control regions is calculated using a sideband obtained by inverting the b tagging criterion on one of the jets (using a technique similar to that in Ref. [61]), and the normalization factor found to be independent of m jj , within the statistical uncertainties. -Thett contribution in the signal region obtained from simulation is fitted by the function in Eq. (1) and the resulting fit is found to be consistent with the distribution of the overall background within the statistical uncertainties.
Closure checks of the background-estimation procedure are performed using simulated multijet events. These are also performed directly in data in the control region. For this purpose, the control region is split in two, a low mass control region with 60 < m P j < 90 GeV, and a pseudo-signal region with 90 < m P j < 100 GeV. In both cases, the predicted background is found to be compatible with that observed, within the statistical uncertainties.

Systematic uncertainties
The largest contributions to the systematic uncertainty in the signal yields are the uncertainties associated with the classification of the events into the purity categories, the estimation of the efficiency to identify a H jet, and the calculation of the total integrated luminosity (2.6 %) [62], as well as with the determination of the jet energy scale (JES) and resolution (JER). The major systematic uncertainties are summarized in Table 3.
The uncertainty in the b tagging efficiency originates from the uncertainty in the data-to-simulation scale factors that  The uncertainty in the mass selection efficiency is 2.6 % for each jet and 5.2 % for the event. This uncertainty is esti-mated by studying high p T W bosons in a tt data control sample [53] and comparing to MC predictions. It includes the effect of the difference in fragmentation between light and b quarks. This uncertainty is fully correlated for all H jets. In addition, the impact of the pileup modelling uncertainty in the Higgs boson mass-tagging efficiency is assumed to be 1.5 % per jet, i.e., 3 % for the event [23]. 19.7 fb CMS HPHP Observed 95% upper limit Expected 95% upper limit 1 std. deviation ± Expected limit 2 std. deviation ± Expected limit = 1 TeV)

Fig. 5
Observed and expected 95 % CL upper limits on the product of cross section of a narrow resonance and the branching fraction σ (gg → X) B(X → HH → bbbb). Theory curves corresponding to WED models with radion are superimposed An uncertainty accounting for possible migration of signal events from the HPHP to the HPLP and LPHP categories results in uncertainties of +25 and −19 %, and of +59 and −37 % in the normalization of the HPHP category, and of both the HPLP and LPHP categories, respectively. These uncertainties are estimated by comparing the τ 21 distribution in measured and simulated tt events [23,53]. It also includes a quantification of the difference between the fragmentation of W and Higgs bosons decaying hadronically. The fraction of signal events that do not enter any of the three categories changes from 2 % at 1.1 TeV to 20 % at 3.0 TeV. The uncertainty associated with migration out of the three categories is estimated to be much smaller than that associated with migration within them.
The uncertainties in the JES (1-2 %) [48] and JER (10 %) [47] impact the signal acceptance in the m jj counting window. Each of these systematic contributions provide less than 1 % uncertainty in the normalization of the expected signal events.
In summary, the uncertainty in the signal normalization associated with the migration of signal events between categories is larger than the total contribution of all other uncertainties, which varies from 7 % at m X = 1.1 TeV to 15 % at m X = 3 TeV.
The statistical uncertainty in the total background ranges from 15 % at 1.3 TeV up to 100 % at 3 TeV. It is calculated by generating pseudo-experiments in the signal and control regions, assuming Poisson fluctuations in the num-ber of events in each bin about its central value. For low m jj , the statistical precision is limited by the uncertainty in the parameter N B , and for high masses, by the uncertainty in the slope parameter a. The impact of the choice of the functional form used in the parameterization of the background distribution is evaluated by comparing the results from the exponential fit to those from an alternative power-law function, and is found to be negligible compared to the statistical uncertainty.
The uncertainty related to the efficiency of the τ 21 tagger is assumed to be fully correlated between the HPLP and LPHP categories and anticorrelated with the HPHP category. The uncertainties in the background estimate are uncorrelated between categories, while all other uncertainties are expected to be fully correlated among all three categories.

Results
The observed data are shown separately for the three event categories in Fig. 4. For comparison, we also show the predictions obtained for the background-only hypothesis. The N B normalization parameter is extracted for all events in the signal region with 1 < m jj < 3 TeV. The bottom panel of each plot shows the difference between the observed data and the predicted background, divided by the statistical uncertainty estimated in the data. The background model describes the data within their statistical uncertainties. The events with the largest masses in the HPHP, HPLP, and LPHP categories are at m jj = 1780, 1560, and 1800 GeV, respectively.
Upper limits on the cross section for the production of resonances are extracted using the asymptotic approximation of the CL s method [63,64]. Figure 5 shows the observed and expected 95 % confidence level (CL) upper limits on the product of the cross section and the branching fraction σ (gg → X) B(X → HH → bbbb) obtained for each event category. The HPHP category is always the most sensitive, nevertheless above 2 TeV the HPLP and LPHP categories are also important because of inefficiencies in N-subjettiness at high p T . Figure 6 and Table 4 provide the combined limits. The excluded cross sections at 95 % CL vary from 10 fb at 1.15 TeV to 1.5 fb at 2 TeV. Above 2 TeV the excluded cross sections increase to 2.8 fb at 3 TeV, since the sensitivity is limited by the increasing inefficiency of H jet identification, as described in Sect. 4. Figure 7 extends the X → HH → bbbb search down to m X = 260 GeV by including limits from Ref. [17]. This search, referred to as the resolved analysis, considers a case where the decay products from two Higgs bosons are reconstructed as four jets. It is interesting to observe that the sensitivity of the resolved analysis starts to degrade at m X ≈ 1 TeV. At this point the typical angular distance between two jets from one Higgs boson reaches R = 4m H /m X ≈ 0.5

CMS
Observed 95% upper limit Expected 95% upper limit 1 std. deviation ± Expected limit 2 std. deviation ± Expected limit Observed and expected 95 % CL upper limits on the product of cross section of a narrow resonance and the branching fraction σ (gg → X) B(X → HH → bbbb). Theory curves corresponding to WED models with radion are also shown Table 4 Observed and expected 95 % CL upper limits on the product of cross section and the branching fraction σ (gg → X) B(X → HH → bbbb) for HPHP, HPLP and LPHP categories combined. The one standard deviation on the 95 % CL upper limit is also provided m X Observed limit Expected limit ±1σ (GeV) ( To quantify the sensitivity of this analysis to new physics, the limits are compared to predictions of radion production for R = 1 and 3 TeV, as shown in Fig. 6. We find that a radion corresponding to R = 1 TeV is excluded by the boosted analysis alone, for masses between 1.15 and

Summary
A search is presented for narrow heavy resonances decaying into a pair of Higgs bosons in proton-proton collisions collected by the CMS experiment at √ s = 8 TeV. The full data sample of 19.7 fb −1 is explored. The background from multijet and tt events is significantly reduced by applying requirements related to the flavor of the jet, its mass, and its substructure. No significant excess of events is observed above the background expected from the SM processes. The results are interpreted as exclusion limits at 95 % confidence on the production cross section for m X between 1.15 and 3.0 TeV, extending significantly beyond 1.5 TeV the reach of previous searches. A radion with scale parameter R = 1 TeV decaying into HH is excluded for 1.15 < m X < 1.55 TeV for the first time in direct searches.
Acknowledgments We congratulate our colleagues in the CERN accelerator departments for the excellent performance of the LHC and thank the technical and administrative staffs at CERN and at other CMS institutes for their contributions to the success of the CMS effort. In addition, we gratefully acknowledge the computing centers and personnel of the Worldwide LHC Computing Grid for deliver-ing so effectively the computing infrastructure essential to our analyses. Finally, we acknowledge the enduring support for the construction and operation of the LHC and the CMS detector provided by the following funding agencies: