Search for a massive resonance decaying to a pair of Higgs bosons in the four b quark final state in proton-proton collisions at $\sqrt{s}=$ 13 TeV

A search for a massive resonance decaying into a pair of standard model Higgs bosons, in a final state consisting of two b quark-antiquark pairs, is performed. A data sample of proton-proton collisions at a centre-of-mass energy of 13 TeV is used, collected by the CMS experiment at the CERN LHC in 2016, and corresponding to an integrated luminosity of 35.9 fb$^{-1}$. The Higgs bosons are highly Lorentz-boosted and are each reconstructed as a single large-area jet. The signal is characterized by a peak in the dijet invariant mass distribution, above a background from the standard model multijet production. The observations are consistent with the background expectations, and are interpreted as upper limits on the products of the $s$-channel production cross sections and branching fractions of narrow bulk gravitons and radions in warped extra-dimensional models. The limits range from 126 to 1.4 fb at 95% confidence level for resonances with masses between 750 and 3000 GeV, and are the most stringent to date, over the explored mass range.


Introduction
In the standard model (SM), the pair production of Higgs bosons (H) [1][2][3] in proton-proton (pp) collisions at √ s = 13 TeV is a rare process [4]. However, the existence of massive resonances decaying to Higgs boson pairs (HH) in many new physics models may enhance this rate to a level observable at the CERN LHC using the current data. For instance, models with warped extra dimensions (WED) [5] contain new particles such as the spin-0 radion [6][7][8] and the spin-2 first Kaluza-Klein (KK) excitation of the graviton [9][10][11], which have sizeable branching fractions to HH.
The WED models have an extra spatial dimension compactified between two branes, with the region between (called the bulk) warped via an exponential metric κl, κ being the warp factor and l the coordinate of the extra spatial dimension [12]. The reduced Planck scale (M Pl ≡ M Pl /8π, M Pl being the Planck scale) is considered a fundamental scale. The free parameters of the model are κ/M Pl and the ultraviolet cutoff of the theory Λ R ≡ √ 6e −κl M Pl [6]. In pp collisions at the LHC, the graviton and the radion are produced primarily through gluon-gluon fusion and are predicted to decay to HH [13].
Other scenarios, such as the two-Higgs doublet models [14] (in particular, the minimal supersymmetric model [15]) and the Georgi-Machacek model [16] predict spin-0 resonances that are produced primarily through gluon-gluon fusion, and decay to an HH pair. These particles have the same Lorentz structure and effective couplings to the gluons and, for narrow widths, result in the same kinematic distributions as those for the bulk radion. Hence, the results of this paper are also applicable to this class of models. Searches for a new particle X in the HH decay channel have been performed by the ATLAS [17][18][19] and CMS [20][21][22][23][24] Collaborations in pp collisions at √ s = 7 and 8 TeV. More recently, the ATLAS Collaboration has published limits on the production of a KK bulk graviton, decaying to HH, in the bbbb final state, using pp collision data at √ s = 13 TeV, corresponding to an integrated luminosity of 3.2 fb −1 [25]. Because the longitudinal components of the W and Z bosons couple to the Higgs field in the SM, a resonance decaying to HH potentially also decays into WW and ZZ, with a comparable branching fraction for X → ZZ, and with a branching fraction for X → WW that is twice as large. Searches for X → WW and ZZ have been performed by ATLAS and CMS [26][27][28][29][30][31][32][33][34][35].
This letter reports on the search for a massive resonance decaying to an HH pair, in the bbbb final state (with a branching fraction ≈33% [36]), performed using a data set corresponding to 35.9 fb −1 of pp collisions at √ s = 13 TeV. The search significantly improves upon the CMS analysis performed using the LHC data collected at √ s = 8 TeV [24], and extends the searched mass range to 750-3000 GeV. This search is conducted for both the radion and the graviton, whereas the earlier search only considered the former.
In this search, the X → HH decay would result in highly Lorentz-boosted and collimated decay products of H → bb, which are referred to as H jets. These are reconstructed using jet substructure and jet flavour-tagging techniques [37][38][39]. The background consists mostly of SM multijet events, and is estimated using several control regions defined in the phase space of the masses and flavour-tagging discriminators of the two H jets, and the HH dijet invariant mass, allowing the background to be predicted over the entire range of m X explored. The signal would appear as a peak in the HH dijet invariant mass spectrum above a smooth background distribution.
AK4 jets with invariant mass above 900 GeV and a pseudorapidity separation |∆η| < 1.5. A third set of triggers selected events with the scalar p T sum of all AK8 jets greater than 650 or 700 GeV and the presence of an AK8 jet with a "trimmed mass" above 50 GeV, i.e. the jet mass after removing remnants of soft radiation using jet trimming technique [61]. The fourth triggering condition was based on the presence of an AK8 jet with p T > 360 GeV and trimmed mass greater than 30 GeV. The last trigger selection accepted events containing two AK8 jets having p T > 280 and 200 GeV with at least one having trimmed mass greater than 30 GeV, together with an AK4 jet passing a loose b-tagging criterion.
The pp interaction vertex with the highest ∑ p 2 T of the associated clusters of physics objects is considered to be the one associated with the hard scattering interaction, the primary vertex. The physics objects are the jets, clustered using the jet finding algorithm [43,44] with the tracks assigned to the vertex as inputs, and the associated missing transverse momentum, taken as the negative vector sum of the p T of those jets. The other interaction vertices are designated as pileup vertices.
To mitigate the effect of pileup, particles are assigned weights using the pileup per particle identification (PUPPI) algorithm [62], with the weight corresponding to its estimated probability to originate from a pileup interaction. Charged particles from pileup vertices receive a weight of zero while those from the primary vertex receive a weight of one. Neutral particles are assigned a weight between zero and one, with higher values for those likely to originate from the primary vertex. Particles are then clustered into AK8 jets. The vector sum of the weighted momenta of all particles clustered in the jet is taken to be the jet momentum. To account for detector response nonlinearity, jet energy corrections are applied as a function of jet η and p T [63,64]. In each event, the leading and the subleading p T AK8 jets, j 1 and j 2 , respectively, are required to have p T > 300 GeV and |η| < 2.4.
The removal of events containing isolated leptons (electrons or muons) with p T > 20 GeV and |η| < 2.4 helps suppress tt +jets and diboson backgrounds. The isolation variable is defined as the scalar p T sum of the charged and neutral hadrons, and photons in a cone of ∆R = 0.3 for an electron or ∆R = 0.4 for a muon, where ∆R ≡ √ (∆η) 2 + (∆φ) 2 , φ being the azimuthal angle in radians. The energy from pileup deposited in the isolation cone, and the p T of the lepton itself, is subtracted [65,66]. The isolation requirement removes jets misidentified as leptons. Additional quality criteria are applied to improve the purity of the isolated lepton samples. Electrons passing combined isolation and quality criteria corresponding to a selection efficiency of 90% (70%) are designated "loose" ("medium") electrons. For the "loose" ("medium") muons, the total associated efficiency is 100% (95%). The probability of a jet to be misidentified as an electron or a muon is in the range 0.5-2%, depending on p T , η, and the choice of medium or loose selection criteria. Events containing one medium lepton, or two loose leptons of the same flavour, but of opposite charge, are rejected.
The H → bb system is reconstructed as a single high-p T AK8 jet, where the decay products have merged within the jet, and the two highest p T jets in the event are assumed to be the Higgs boson candidates. The jet is groomed [67] to remove soft and wide-angle radiation using the soft-drop algorithm [68,69], with the soft radiation fraction parameter z set to 0.1 and the angular exponent parameter β set to 0. The groomed jet is used to compute the soft-drop jet mass, which peaks at the Higgs boson mass for signal events and reduces the mass of background quark-and gluon-initiated jets. Dedicated mass corrections [70], derived from simulation and data in a region enriched with tt events with merged W → qq decays, are applied to the jet mass in order to remove residual dependence on the jet p T , and to match the jet mass scale and resolution observed in data. Multijets: c Multijets: light Multijets: c Multijets: light Multijets: c Multijets: light  Multijets: c Multijets: light Multijets: c Multijets: light The soft-drop masses of j 1 and j 2 are required to be within the range 105-135 GeV, with an efficiency of about 60-70%, for jets arising from a signal of mass m X in the range 750-3000 GeV. The "N-subjettiness" algorithm is used to determine the consistency of the jet with two subjets from a two-pronged H → bb decay, by computing the inclusive jet shape variables τ 1 and τ 2 [71]. The ratio τ 21 ≡ τ 2 /τ 1 with a value much less than one indicates a jet with two subjets. The selection τ 21 < 0.55 is used, having a jet p T -dependent efficiency of 50-70%, before applying the soft-drop mass requirement.
For background events, j 1 and j 2 are often well separated in η, especially at high invariant mass (m jj ) of j 1 and j 2 . In contrast, signal events that contain a heavy resonance decaying to two energetic H jets are characterized by a small separation of the two jets in η. Events are therefore required to have a pseudorapidity separation |∆η(j 1 , j 2 )| < 1.3.
The efficiency of the trigger combination is measured in a sample of multijet events, collected with a control trigger requiring a single AK4 jet with p T > 260 GeV, and with the leading and the subleading p T AK8 jets, j 1 and j 2 , respectively, passing the above selections on p T , η, and the soft-drop mass. The efficiency is greater than 99% for m jj ≥ 1100 GeV, and in the range 40-99% for 750 < m jj < 1100 GeV. The trigger efficiency of the simulated samples is corrected using a scale factor to match the observed efficiency in the data. This scale factor is applied as a function of |∆η(j 1 , j 2 )| because it has a mild dependence on this variable.
The main method to suppress the multijet background is b tagging: since a true H → bb jet contains two b hadrons, the H jet candidates are identified using the dedicated "double-b tagger" algorithm [72]. The double-b tagger exploits the presence of two hadronized b quarks inside the H jet, and uses variables related to b hadron lifetime and mass to distinguish between H jets and the background from multijet production; it also exploits the fact that the b hadron flight directions are strongly correlated with the axes used to calculate the N-subjettiness ob-servables. The double-b tagger is a multivariate discriminator with output between −1 and 1, with a higher value indicating a greater probability for the jet to contain a bb pair. The double-b tagger discriminator thresholds of 0.3 and 0.8 correspond to H jet tagging efficiencies of 80 and 30% and are referred to as "loose" (L) and "tight" (T) requirements, respectively. Events must have the two leading p T AK8 jets satisfying the loose double-b tagger requirement. The datato-simulation scale factor for the double-b tagger efficiency is measured in an event sample enriched in bb pairs from gluon splitting [72], and applied to the signals to obtain the correct signal yields.
The main variable used in the search for a HH resonance is the "reduced dijet invariant mass" where m j 1 and m j 2 are the soft-drop masses of the leading and subleading H-tagged jets in the event, and m H = 125.09 GeV [73,74] is the Higgs boson mass. The quantity m jj,red is used rather than m jj since by subtracting the soft-drop masses of the two H-tagged jets and adding back the exact Higgs boson mass m H , fluctuations in m j 1 and m j 2 due to the jet mass resolution are corrected, leading to 8-10% improvement in the dijet mass resolution. A requirement of m jj,red > 750 GeV is applied for selecting signal-like events.
The soft-drop mass, τ 21 , and double-b tagger discriminator distributions of the two leading p T jets are shown in Fig. 1 for simulated events after passing the online selection, lepton rejection, kinematic selection, and the requirement m jj,red > 750 GeV. Also, the N-subjettiness requirement of τ 21 < 0.55 is applied for the soft-drop mass and the double-b tagger distributions, while the soft-drop mass requirement is applied to the τ 21 , and double-b tagger discriminator distributions. Since some of the triggers impose a trimmed jet mass requirement, this affects the shape of the offline soft-drop jet mass, resulting in a steep rise above ∼20 GeV. The distributions of the |∆η(j 1 , j 2 )| and the m jj,red variables are shown in Fig. 2. In these figures, the multijet background is shown for different jet flavour categories: jets having two B hadrons (bb) or a single one (b), jets having a charm hadron (c), and all other jets (light).
The double-b tagger discriminator of the two leading AK8 jets must exceed the loose threshold. In addition, if both discriminator values also exceed the tight threshold, events are classified in the "TT" category. Otherwise, they are classified in the "LL" category, which contains events with both j 1 and j 2 failing the tight threshold as well as events with either j 1 or j 2 passing the tight threshold while the other passes the loose threshold only. The backgrounds are estimated separately for each category, and the combination of the likelihoods for the TT and LL categories gives the optimal signal sensitivity over a wide range of resonance masses, according to studies performed using simulated signal and multijet samples. The TT category has a good background rejection for m X up to 2000 GeV. At higher resonance masses, where the background is small, the LL category provides better signal sensitivity. The full event selection efficiencies for bulk gravitons and radions of different assumed masses are shown in Fig. 3. The radion has a smaller efficiency than the bulk graviton because its |∆η(j 1 , j 2 )| distribution is considerably wider than that of a bulk graviton of the same mass, as shown in Fig. 2 (left).

Signal and background modelling
The method chosen for the background modelling depends on whether the resonance mass m X is below or above 1200 GeV, since at low masses the background does not fall smoothly as a function of m jj,red , because of the trigger requirements, while above 1200 GeV it does. The background estimation relies on a set of control regions to predict the total background shape and normalization in the signal regions. The entire range of the m jj,red distribution above 750 GeV is used for the prediction.
For signals with m X ≥ 1200 GeV, the underlying background distribution falls monotonically with m jj,red , thus allowing the background shape to be modelled by a smooth function, above which a localized signal is searched for. This smooth background modelling helps to reduce uncertainties in the background estimation from local statistical fluctuations in m jj,red , thereby improving the signal search sensitivity. The parameters of the function and its total normalization are constrained by a simultaneous fit of the signal and background models to the data in the control and the signal regions. For m X ≥ 1200 GeV, the m jj,red distributions for the signal are modelled using the sum of a Gaussian and a Crystal Ball function [75], as shown in Fig. 4 for one signal category. The same modelling is used for the other signal categories, with different parameters for the Gaussian and the Crystal Ball functions.
The signal and control regions are defined by two variables related to the leading p T jet j 1 : (i) its soft-drop mass m j 1 and (ii) the value of the discriminator of the double-b tagger. The background is estimated in bins of the m jj,red distribution. Considering these two variables, several regions are defined.
The pre-tag region includes events fulfilling the selection requirements in Sections 2-3 apart from those on m j 1 and on the j 1 double-b tagger discriminator. The signal region is the subset of pre-tag events where m j 1 is inside the H jet mass window of 105-135 GeV, and with the j 1 double-b tagger discriminator greater than 0.3 or 0.8, for the LL and TT regions, respectively. The antitag regions require the j 1 double-b tagger discriminator to be less than 0.3, with the requirement on j 2 being the same as that for the corresponding LL or TT signal regions. The m j 1 sideband region consists of events in the pre-tag region, where m j 1 lies outside the H jet mass window. Based on whether j 1 passes or fails the double-b tagger discriminator threshold, the sideband region is divided into either "passing" or "failing", respectively. The antitag regions are dominated by the multijet background, and have identical kinematic distributions to the multijet background events in the signal region, according to studies using simulations. The definitions of the signal, the antitag, and the sideband regions are given in Table 1.
In the absence of a correlation between m j 1 and the double-b tagger discriminator values, one could measure in the m j 1 sideband the ratio of the number of events passing and failing the double-b tagger selection, R p/f ≡ N pass /N fail , i.e. the "pass-fail ratio". The yield in the antitag region (in each m jj,red bin) could then be scaled by R p/f to obtain an estimate of the background normalization in the signal region. However, there is a small correlation between the double-b tagger discriminator and m j 1 , which is taken into account by measuring the pass-fail ratio R p/f as a function of m j 1 . The signal fraction was found to be less than 10 −3 in the sideband regions used to evaluate R p/f , assuming a signal cross section σ(pp → X → HH → bbbb) of 10 fb.
The R p/f for the LL signal region is measured using ratio of the number of events in the "LL, passing" and "LL, failing" sideband regions, as defined in Table 1. Likewise, the R p/f for the TT signal region uses the ratio of the number of events in the "TT, passing" to the "TT, failing" sideband regions. The variation of R p/f as a function of m j 1 in each m j 1 sideband is fitted with GeV. An alternative fit using a third order polynomial was found to give the same interpolated value of R p/f in the Higgs jet mass window. Every event in the antitag region is scaled by the pass-fail ratio evaluated for the m j 1 of that event, to obtain the background prediction in the signal region. Figure 5 shows the quadratic fit in the m j 1 sidebands of the pass-fail ratio R p/f as a function of m j 1 , as obtained in the data. The background prediction using this method, along with the number of observed events in the signal region is shown in Fig. 6.
For resonance masses of 1200 GeV and above, the background estimation is improved by simultaneously fitting a parametric model for the background and signal to the data in the signal and the antitag regions for m jj,red ≥ 1100 GeV. In the fit, the ratio R p/f obtained from the sidebands is used to constrain the relative number of background events in the two regions. To account for possible R p/f dependence on m jj,red at high m jj,red values, the R p/f obtained from the fits shown in Fig. 5 is also parametrized as a linear function of m jj,red . The signal normalization is unconstrained in the fit, while the uncertainties in the parameters of the functions used to model the background and R p/f are treated as nuisance parameters. For the background modelling, a choice among an exponential function Ne −a m jj,red , a "levelled exponential" function Ne −a m jj,red /(1+a b m jj,red ) , and a "quadratic levelled exponential function"

Ne
[−a m jj,red /(1+a b m jj,red )]−[−c m 2 jj,red /(1+b c m 2 jj,red )] was made, using a Fisher F-test [77]. At a confidence level of 95%, the levelled exponential function was found to be optimal. Since the background shapes in the signal regions, as predicted using the antitag regions, were found to be similar (Fig. 6), the parametric background modelling was tested using the antitag region in the data before applying it to the signal region.
The simultaneous fits to the antitag and the signal regions are shown in Figs. 7 and 8, respectively, using the background model only. These are labelled as "post-fit" curves with the signal region background yields constrained to be R p/f times the background yields from the antitag regions. The "pre-fit" curves, obtained by fitting the antitag and the signal regions separately to the background-only model, with the background event yields unconstrained, are also shown  Figure 6: The reduced mass distributions m jj,red for the LL (left) and TT (right) signal region categories. The points with bars show the data, the histogram with shaded band shows the estimated background and associated uncertainty. The m jj,red spectrum for the background is obtained by weighting the m jj,red spectrum in the antitag region by the ratio R p/f of Fig. 5. The signal predictions for a bulk graviton of mass 1000 GeV, are overlaid for comparison, assuming a cross section σ(pp → X → HH → bbbb) of 10 fb. The last bins of the distributions contain all events with m jj,red > 3000 GeV. The differences between the data and the predicted background, divided by the data statistical uncertainty (data unc.) as given by the Garwood interval [76], are shown in the lower panels.  Figure 7: The reduced mass m jj,red distributions in the antitag region for the LL (left) and TT (right) categories. The black markers are the data while the curves show the pre-fit and post-fit background shapes. The differences between the data and the pre-fit background distribution, divided by the statistical uncertainty in the data (data unc.) as given by the Garwood interval [76], are shown in the lower panels. → X → HH → bbbb) of 10 fb. The differences between the data and the pre-fit background distribution, divided by the statistical uncertainty in the data (data unc.) as given by the Garwood interval [76], are shown in the lower panels.
for comparison. In the post-fit results, the R p/f dependence on m jj,red was found to be negligible.
Among the four fitted regions, corresponding to the antitag and the signal regions in the LL and TT categories, the events with the highest value of m jj,red occur in the antitag region of the LL category, at around m jj,red = 2850 GeV. As the parametric background model is only reliable within the range of observed events, the likelihood is only evaluated up to m jj,red = 3000 GeV. This results in a truncation of the signal distribution for resonances having m X of 2800 GeV and above, with signal efficiency losses increasing to 30% for m X = 3000 GeV, as shown in Fig. 4.
Closure tests of the background estimation methods were performed using simulated multijet samples with signals of various cross sections. The tests indicated a good consistency between the expected and the assumed signal strengths.

Systematic uncertainties
The following sources of systematic uncertainty affect the expected signal yields. None of these lead to a significant change in the signal shape.
Trigger response modelling uncertainties are particularly important for m jj,red < 1200 GeV, where the trigger efficiency drops below 99%. A scale factor is applied to correct for the difference in efficiency observed between the data and simulation. The control trigger used to measure this scale factor requires a single AK4 jet with p T > 260 GeV, and it too is subject to some inefficiency when m jj,red is close to 750 GeV, because of a difference between the jet energy scale used in the trigger and that used in the offline reconstruction. This inefficiency is measured using simulations, and has an associated total uncertainty of between 1% and 15%.
The jet energy scale and resolution uncertainty is about 1% [63,64]. The jet mass scale and resolution, and τ 21 selection efficiency data-to-simulation scale factor are measured using a sample of merged W jets in semileptonic tt events. The corresponding uncertainties are extrapolated to a higher p T range than that associated with tt events, using simulations. A correction factor is applied to account for the difference in the jet shower profile of W → qq and H → bb decays, by comparing the ratio of the efficiency of H and W jets using the PYTHIA 8 and HER-WIG++ shower generators. The jet mass scale and resolution has a 2% effect on the signal yields because of a change in the mean of the H jet mass distribution. The τ 21 selection efficiency uncertainty amounts to a +30/−26% change in the signal yields. The uncertainty in the H tagging correction factor is in the range 7-20% depending on the resonance mass m X . The double-b tagger efficiency scale factor uncertainty is about 2-5%, depending on the double-b tagger requirement threshold and jet p T , and is propagated to the total uncertainty in the signal yield.
The impact of the PDFs and the theoretical scale uncertainties are estimated to be 0.1-2%, using the PDF4LHC procedure [50], and affect the product of the signal acceptance and the efficiency. The PDF and scale uncertainties have negligible impact on the signal m jj,red distributions. Additional systematic uncertainties associated with the pileup modelling (2%) and the integrated luminosity determination (2.5%) [78], are applied to the signal yield.
The main source of uncertainty for the multijet background in the region m jj,red < 1200 GeV is due to the statistical uncertainty in the fit to the R p/f ratio performed in the H jet mass sidebands. This uncertainty, amounting to 2.6% for the LL, and 6.8% for the TT signal categories, is fully correlated between all bins of a particular estimate. Furthermore, the statistical uncertainty in the antitag region is propagated to the signal region when the estimate is made. This is uncorrelated from bin to bin, and the Barlow-Beeston Lite [79,80] method is used to treat the bin-by-bin statistical uncertainty in the data. These uncertainties affect both the shape of the background in the m jj,red distribution and the total background yield.
For m jj,red ≥ 1200 GeV, the overall background uncertainty is obtained from the uncertainty in the four simultaneous fits performed for the antitag and the signal regions in the LL and the TT categories. The dependence of R p/f on m jj,red is accounted for, although this was found to be negligible.
A complete list of systematic uncertainties is given in Table 2.
[GeV] X m 1000 1500 2000 2500 3000   Figure 9: The limits for the spin-0 radion (left) and the spin-2 bulk graviton (right) models. The result for m X < 1200 GeV uses the background predicted using the control regions, while for m X ≥ 1200 GeV the background is derived from a combined signal and background fit to the data in the control and the signal regions. The predicted theoretical cross sections for a narrow radion or a bulk graviton are also shown.

Results
As shown in Figs. 6 and 8, for the signal regions, the observed m jj,red distribution is consistent with the estimated background. The results are interpreted in terms of upper limits on the product of the production cross sections and the branching fractions σ(pp → X)B(X → HH → bbbb) for radion and bulk graviton of various mass hypotheses. The asymptotic approximation of the modified frequentist approach for confidence levels, taking the profile likelihood as a test statistic [81][82][83], is used. The limits are shown in Fig. 9 for a narrow width radion or a bulk graviton. These are compared with the theoretical values of the product of the cross sections and branching fractions for the benchmarks κ/M Pl = 0.5 and Λ R = 3 TeV, where the narrow width approximation for a signal is valid, and where the corresponding HH decay branching fractions in the mass range of interest are 10 and 23%, for the graviton and the radion, respectively [13]. The expected limits on the bulk graviton are more stringent than those on the radion because of the higher efficiency of the |∆η(j 1 , j 2 )| separation requirement for the former signal.
The upper limits on the production of the cross sections and branching fraction lies in the range 126-1.4 fb for a narrow resonance X of mass 750 < m X < 3000 GeV. Assuming Λ R = 3 TeV, a bulk radion with a mass between 970 and 1400 GeV is excluded at 95% confidence level, except in a small region close to 1200 GeV, where the observed limit is 11.4 pb, the theoretical prediction being 11.2 pb.

Summary
A search for a narrow massive resonance decaying to two standard model Higgs bosons is performed using the LHC proton-proton collision data collected at a centre-of-mass energy of 13 TeV by the CMS detector, and corresponding to an integrated luminosity of 35.9 fb −1 . The final state consists of events with both Higgs bosons decaying to b quark-antiquark pairs, which were identified using jet substructure and b-tagging techniques applied to large-area jets. The data are found to be consistent with the standard model expectations, dominated by multijet events. Upper limits are set on the products of the resonant production cross sections of a Kaluza-Klein bulk graviton and a Randall-Sundrum radion, and their branching fraction to HH → bbbb. The limits range from 126 to 1.4 fb at 95% confidence level for bulk gravitons and radions in the mass range 750-3000 GeV. For the mass scale Λ R = 3 TeV, a radion of mass between 970 and 1400 GeV (except in a small region close to 1200 GeV) is excluded. These limits on the bulk graviton and the radion decaying to a pair of standard model Higgs bosons are the most stringent to date, over the mass range explored.

Acknowledgments
We congratulate our colleagues in the CERN accelerator departments for the excellent performance of the LHC and thank the technical and administrative staffs at CERN and at other CMS institutes for their contributions to the success of the CMS effort. In addition, we gratefully acknowledge the computing centres and personnel of the Worldwide LHC Computing Grid for delivering so effectively the computing infrastructure essential to our analyses. Finally, we acknowledge the enduring support for the construction and operation of the LHC and the CMS detector provided by the following funding agencies: BMWFW and FWF (Aus     [32] ATLAS Collaboration, "Search for WW/WZ resonance production in νqq final states in pp collisions at √ s =13 TeV with the ATLAS detector", (2017). arXiv:1710.07235. Submitted to JHEP.