Search for narrow high-mass resonances in proton–proton collisions at √s = 8 TeV decaying to a Z and a Higgs boson

A search for a narrow, high-mass resonance decaying into Z and Higgs (H) bosons is presented. The final state studied consists of a merged jet pair and a τ pair resulting from the decays of Z and H bosons, respectively. The analysis is based on a data sample of proton–proton collisions at a center-of-mass energy of 8 TeV, collected with the CMS experiment in 2012, and corresponding to an integrated luminosity of 19.7 fb −1 . In the resonance mass range of interest, which extends from 0.8 to 2.5 TeV, the Z and H bosons are produced with large momenta, which implies that the final products of the two quarks or the two τ leptons must be detected within a small angular interval. From a combination of all possible decay modes of the τ leptons, production cross sections in a range between 0.9 and 27.8 fb are excluded at 95% confidence level, depending on the resonance mass. © 2015 CERN for the benefit of the CMS Collaboration. Published open access.


Introduction
Very recently, the validity of the standard model (SM) of particle physics has been confirmed by the discovery of a Higgs boson with mass near 125 GeV by the ATLAS and CMS experiments [1,2]. Though the SM successfully describes a broad range of high energy phenomena, the solution to remaining problems with the structure of the SM, particularly the hierarchy problem, leads naturally to the introduction of physics beyond the standard model (BSM), possibly at the TeV scale [3][4][5][6][7][8]. Many of the BSM models predict the existence of heavy resonances with masses of the order of a TeV, which may have sizable couplings to the gauge and Higgs boson fields of the SM [9][10][11][12]. We consider here one important family among these models, which incorporate composite Higgs bosons [11,12]. In these models, the Higgs boson is a pseudo-Nambu-Goldstone boson of a broken global symmetry. Other composite bound states beyond the Higgs boson are expected to exist and could be experimentally observed.
Several searches for massive resonances decaying into pairs of vector bosons or Higgs bosons have been performed by the ATLAS and CMS experiments [13][14][15][16][17][18][19][20][21][22][23][24]. In this analysis, we search for a resonance with a mass in the range 0.8-2.5 TeV decaying to ZH, where the Z boson decays to qq̄ and the Higgs boson decays to τ+τ−. It is assumed that the natural width of the resonance is negligible in comparison to the experimental mass resolution, which is between 6% and 10% of the mass of the resonance, depending on the mass. There is also a small variation with the type of decay channel because of the dependence of the resolution on the number of neutrinos in the final state. In the model considered, the spin of the resonance is assumed to be one. However, it has been verified that the analysis is insensitive to the angular distributions of the decay products and therefore applies to other spin hypotheses.
E-mail address: cms-publication-committee-chair@cern.ch.
The theoretical model used as benchmark in this work is described in Ref. [25]. In this model a heavy SU(2) L vector triplet (HVT) containing neutral (Z′) and charged (W′±) spin-1 states is introduced. This scenario is well-motivated in cases where the new physics sector is either weakly coupled [26], or strongly coupled, e.g., in the minimal composite model [27]. The cross sections and branching fractions (B) for the heavy triplet model depend on the new physics scenario under study and can be characterized by three parameters in the phenomenological Lagrangian: the strength of the couplings to fermions c F , to the Higgs c H , and the self-coupling g V . In the case of a strongly coupled sector, the new heavy resonance has larger couplings to the W, Z, and H bosons, resulting in larger branching fractions for the diboson final states. Our benchmark model represents this scenario with the parameters g V = 3 and c F = −c H = 1. In the high-mass case under study, the directions of the particles stemming from Z and H boson decays are separated by a small angle. This feature is referred to as the "boosted" regime. For the case of Z → qq̄, this results in the presence of one single reconstructed jet after hadronization, called a "Z-jet". The novel feature of this analysis is the reconstruction and selection of a τ pair in the boosted regime. The presence of missing energy in τ decays does not allow a direct determination of the invariant mass.
The experimental strategy is to reconstruct and identify the two bosons and to combine their information into a variable that can discriminate between signal and background and on which a statistical study can be performed. This variable is the estimated mass of the Z′ after applying dedicated reconstruction techniques to the boosted qq̄ and τ τ pairs (m ZH ). The m ZH distribution would show an excess of events at the assumed Z′ mass if a signal were present.

CMS detector
A detailed description of the CMS detector, together with a definition of the coordinate system used and the relevant kinematic variables, can be found in Ref. [28]. The central feature of the CMS detector is a 3.8 T superconducting solenoid of 6 m internal diameter. Within the field volume are the silicon tracker, the crystal electromagnetic calorimeter (ECAL), and the brass and scintillator hadron calorimeter (HCAL). The muon detectors are located outside the solenoid and are installed between the layers of the steel flux-return yoke of the solenoid. In addition, CMS has extensive forward calorimetry, in particular two steel and quartz-fiber hadron forward calorimeters.

Data sample and simulation
The analysis is based on a data sample collected by the CMS experiment in proton-proton collisions at a center-of-mass energy of 8 TeV in 2012, corresponding to an integrated luminosity of 19.7 fb −1 . Events are selected online by a trigger that requires the presence of at least one of the following: either a hadronic jet reconstructed by the anti-k T algorithm [29] with a distance parameter of 0.5, transverse momentum p T larger than 320 GeV, and |η| < 5.0; or a total hadronic transverse energy, H T , defined as the scalar sum of the transverse energy of all the jets of the event, larger than 650 GeV. The transverse energy of a jet is defined as the reconstructed energy multiplied by the sine of the polar angle of the jet axis. Using events selected by less restrictive, pre-scaled triggers, it has been verified that the efficiency of this trigger after applying the offline event selection is above 99%. The difference from 100% is considered as a systematic uncertainty.
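The online selection described above is a logical OR of a single-jet condition and an H T condition. A minimal sketch of that logic follows; the dictionary keys and function names are illustrative assumptions, not part of the CMS trigger software.

```python
import math

def transverse_energy(energy, theta):
    """E_T = E * sin(theta), with theta the polar angle of the jet axis."""
    return energy * math.sin(theta)

def passes_trigger(jets, jet_pt_min=320.0, eta_max=5.0, ht_min=650.0):
    """Return True if the event satisfies either trigger condition.

    jets: list of dicts with hypothetical keys 'pt', 'eta', 'energy' (GeV)
    and 'theta' (rad). Thresholds are the ones quoted in the text.
    """
    # Condition 1: at least one jet with pT > 320 GeV and |eta| < 5.0
    single_jet = any(j["pt"] > jet_pt_min and abs(j["eta"]) < eta_max for j in jets)
    # Condition 2: scalar sum of jet transverse energies above 650 GeV
    ht = sum(transverse_energy(j["energy"], j["theta"]) for j in jets)
    return single_jet or ht > ht_min
```

An event with four central 200 GeV jets fails the single-jet condition but passes through the H T condition, which is why the two paths are combined.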
The process qq̄ → Z′ → ZH → qq̄τ+τ− is simulated at parton level using a MadGraph5 1.5.11 [30] implementation of the model described in Ref. [31]. Seven signal samples are generated with masses between 0.8 and 2.5 TeV. For this mass interval, the Z′ production cross section times branching fraction to ZH ranges from 179.9 fb (m Z′ = 0.8 TeV) to 0.339 fb (m Z′ = 2.5 TeV). Although the main sources of background are estimated using observed events, Monte Carlo (MC) simulations are used to develop and validate the methods used in the analysis. Background samples are generated using MadGraph5 1.3.30 (Z/γ* + jets and W + jets with leptonic decays), POWHEG 1.0 r1380 (tt̄ and single top quark production) [32][33][34][35], and PYTHIA 6.426 [36] (SM diboson production and QCD multijet events with large H T ). Showering and hadronization are performed with PYTHIA and τ decays are simulated using TAUOLA 1.1.5 [37] for all simulated samples. Geant4 [38] is used for the simulation of the CMS detector.

Event reconstruction
A particle-flow (PF) algorithm [39,40] is used to identify and to reconstruct candidate charged hadrons, neutral hadrons, photons, muons, and electrons produced in proton-proton collisions.
Jets and τ h candidates are then reconstructed using the PF candidates. The jet energy scale is calibrated through correction factors that depend on the p T and η of the jet. These factors were computed using a data set of proton-proton collisions at √s = 8 TeV corresponding to an integrated luminosity of 19.7 fb −1 , following the method described in [41]. All particles reconstructed with the PF algorithm are used to determine the missing transverse momentum, p miss T . To a first approximation, p miss T is defined as the negative vector sum of the transverse momenta of all reconstructed particles [42].
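The first-approximation definition of p miss T can be illustrated with a short sketch. It assumes each reconstructed particle is given as a (p T, φ) pair; the function name is hypothetical and no corrections of the kind described in [42] are applied.

```python
import math

def missing_pt(particles):
    """Negative vector sum of the particles' transverse momenta.

    particles: iterable of (pt, phi) pairs, pt in GeV and phi in radians.
    Returns (magnitude, phi) of the missing transverse momentum.
    """
    px = -sum(pt * math.cos(phi) for pt, phi in particles)
    py = -sum(pt * math.sin(phi) for pt, phi in particles)
    return math.hypot(px, py), math.atan2(py, px)
```

For two back-to-back particles with p T of 30 and 10 GeV, the imbalance is 20 GeV and points along the softer particle.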
Jets are reconstructed using the Cambridge-Aachen (CA) algorithm [43], with a distance parameter of 0.8, chosen so that the jet contains the hadronization products of the two quarks from the Z boson. Jet pruning and subjet-searching algorithms are applied to these jets as in Ref. [17]. In these algorithms the original jets are re-clustered by removing pileup and underlying-event particles at low-p T and large angle. The term pileup refers to additional interactions occurring in the same LHC bunch crossing. We define m P jet as the invariant mass of the jet constituents after the pruning procedure. This invariant mass provides good discrimination between Z-jets and quark/gluon-jets since it tends to be shifted towards the energy scale at which the jet was produced. We also define a quantity called "N-subjettiness", τ N , that is sensitive to the different jet substructure characteristics of quark/gluon-jets and Z-jets, as [44]:
$$\tau_N = \frac{1}{d_0} \sum_k p_{T,k} \, \min\left(\Delta R_{1,k}, \Delta R_{2,k}, \ldots, \Delta R_{N,k}\right), \qquad (1)$$
where N is the number of subjets in which the original jet can be reclustered with the k T algorithm [45,46]; the index k runs over the PF constituents of the jet; $p_{T,k}$ is the transverse momentum of the kth constituent; $\Delta R_{n,k}$ is a distance defined as $\sqrt{(\Delta\eta_{n,k})^2 + (\Delta\phi_{n,k})^2}$, where $\Delta\eta_{n,k}$ and $\Delta\phi_{n,k}$ are the differences in pseudorapidity and azimuthal angle between the kth constituent and the nth subjet axis; and $d_0 = \sum_k p_{T,k} R_0$ is a normalization factor with $R_0$ equal to the original jet distance parameter. The variable τ N quantifies the tendency of a jet to be composed of N subjets, having smaller values for jets with an N-subjet-like configuration. We define τ 21 as the ratio between the 2-subjettiness and the 1-subjettiness, τ 21 = τ 2 /τ 1 . The variables m P jet and τ 21 have been shown to have good discrimination power between signal and background [47]; therefore, in the following they are used to define signal- and background-enriched regions of the analysis.
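To make the N-subjettiness definition concrete, the sketch below evaluates τ N as the p T-weighted distance of each constituent to its nearest subjet axis, normalised by d 0. It assumes the subjet axes are already known (in the analysis they come from k T reclustering, which is not reproduced here); all function names are illustrative.

```python
import math

def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance sqrt(d_eta^2 + d_phi^2), with d_phi wrapped to [-pi, pi)."""
    dphi = (phi1 - phi2 + math.pi) % (2 * math.pi) - math.pi
    return math.hypot(eta1 - eta2, dphi)

def n_subjettiness(constituents, axes, r0=0.8):
    """tau_N for a jet.

    constituents: list of (pt, eta, phi) for the jet constituents.
    axes: list of N (eta, phi) subjet axes, assumed to be supplied externally.
    r0: original jet distance parameter (0.8 for the CA jets in the text).
    """
    d0 = sum(pt for pt, _, _ in constituents) * r0
    weighted = sum(
        pt * min(delta_r(eta, phi, a_eta, a_phi) for a_eta, a_phi in axes)
        for pt, eta, phi in constituents
    )
    return weighted / d0

def tau21(constituents, one_axis, two_axes, r0=0.8):
    """Ratio tau_2 / tau_1; small values indicate a two-prong jet."""
    return n_subjettiness(constituents, two_axes, r0) / n_subjettiness(constituents, one_axis, r0)
```

A jet made of two well-separated hard constituents has τ 2 close to zero once each constituent has its own axis, so τ 21 is small, which is the behaviour exploited to tag Z-jets.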
In order to match trigger requirements and avoid inefficiencies close to the threshold, at least one jet in the event is required to have p T > 400 GeV and |η| < 2.4. In addition, this jet is required to pass minimal consistency requirements on the fraction of charged and neutral particles contributing to it, to avoid fake jets from isolated noise patterns in the calorimeters or the tracker systems. While the CA jet selection is common to all the channels considered, the reconstruction of the τ τ system is performed differently depending on the τ decay channel.
The all-leptonic channels are identified by combinations of electrons, muons, and p miss T , which are products of the decay of a pair of τ leptons from the Higgs boson. Electrons are reconstructed by combining the information from an ECAL energy cluster with that of a matching track in the silicon tracker [48]. Electrons are selected if they have p T > 10 GeV, |η| < 2.5, and satisfy requirements on the ECAL shower shape, the ratio of energies measured in HCAL and ECAL around the electron candidate, the compatibility with the primary vertex of the event [49], and the track-cluster matching parameters. Muon candidates [50] are reconstructed by performing a global track fit in which the silicon tracker and the muon system information is combined. For the τ μ τ μ channel, to avoid identification inefficiencies caused by the small angular separation of the two muon trajectories, the second muon candidate is reconstructed with a different algorithm in which tracks in the silicon tracker are matched in space to signals in the muon detectors [17]. Muons are required to have p T > 10 GeV, |η| < 2.4 and to pass additional requirements on the quality of the track reconstruction, on the impact parameter of the track, and on the number of measurements in the tracker and the muon systems. Electron and muon candidates are required to satisfy particle-flow based isolation criteria that require low activity in a cone around the lepton, the isolation cone, after the removal of particles due to additional interactions. Because the lepton from the other signal τ decay in the boosted pair can fall in the isolation cone, other electrons and muons are not considered in the computation of the isolation criteria.
In the semileptonic channels, a lepton selected with all the criteria above is combined with a τ h candidate. The reconstruction of τ h starts from the clustering of jets using the anti-k T algorithm with a distance parameter of 0.5. Electrons and muons, identified by looser criteria than the nominal ones used in the analysis, are removed from the list of particles used in the clustering if they fall within the jet distance parameter. The τ h is reconstructed and identified using the "hadron-plus-strips" technique [51], which searches for the most common decay modes of the τ h starting from charged hadrons and photons forming π 0 candidates. We select τ h candidates with p T > 20 GeV and |η| < 2.3. Electrons and muons misidentified as τ h are suppressed using dedicated criteria based on the consistency between the measurements in the tracker, the calorimeters, and the muon detectors. Finally, loose PF-based isolation criteria are applied to the τ h candidates, not counting electrons and muons in the cone.
In the all-hadronic τ τ channel, a subjet-searching technique [52] is applied to all CA-jets (distance parameter R = 0.8) in each event to identify the τ h candidates. At the next-to-last step of the clustering algorithm, there are two subjets, which are ordered by mass. If both have p T > 10 GeV and the mass of the leading subjet is smaller than 2/3 of the mass of the original merged jet, the two objects are used as seeding jets for τ lepton reconstruction via the "hadron-plus-strips" technique. If any of the criteria above fail, the procedure for one of the subjets is performed again for a maximum of four iterations. The efficiency for finding subjets with this method in signal events is 92%, independent of p T , for τ h with p T > 40 GeV. In the lowest bin investigated (p T between 20 and 40 GeV) the efficiency is around 80%.
The visible mass, m vis , of the τ τ system is defined as the invariant mass of all detectable products of the two decays. Because the unobserved neutrinos can carry a significant fraction of the τ τ energy and momentum, this variable is not suited for reconstructing resonances that include the τ τ system among their decay products. Instead, the Secondary Vertex fit (SVFIT) algorithm described in [53], which combines the p miss T with the visible momenta to calculate a more precise estimator of the kinematics of the parent boson, is used to reconstruct the τ τ system in all search channels.
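As an illustration of the visible mass only, the sketch below sums the four-momenta of the visible decay products and takes the invariant mass. It does not attempt to reproduce SVFIT; the (p T, η, φ, mass) input convention and the function names are assumptions made for the example.

```python
import math

def four_vector(pt, eta, phi, mass=0.0):
    """Build (E, px, py, pz) from collider coordinates, in GeV."""
    px = pt * math.cos(phi)
    py = pt * math.sin(phi)
    pz = pt * math.sinh(eta)
    energy = math.sqrt(px * px + py * py + pz * pz + mass * mass)
    return energy, px, py, pz

def invariant_mass(vectors):
    """Invariant mass of the summed four-vectors."""
    e, px, py, pz = (sum(v[i] for v in vectors) for i in range(4))
    m2 = e * e - px * px - py * py - pz * pz
    # Guard against tiny negative values from floating-point rounding
    return math.sqrt(max(m2, 0.0))
```

Because the neutrino momenta are absent from the sum, the value obtained this way lies below the true τ τ mass, which is the limitation the SVFIT estimate is meant to address.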

Background composition
The composition of the background remaining after reconstruction is different for each of the search channels.
In the τ e τ e , τ e τ μ , and τ μ τ μ channels, the background is almost entirely composed of Z/γ* + jets events with genuine τ or other lepton decays. In the τ e τ h and τ μ τ h channels, additional significant contributions to the total background come from W + jets and tt̄ events with leptonic W-boson decays and a hadronic jet misidentified as τ h . Among tt̄ events, those with one W boson decaying leptonically and one decaying to quarks can potentially produce a signal-like structure in m P jet and τ 21 . We refer to this as the "tt̄ peaking contribution" in the following.
The background in the τ h τ h channel is dominated by QCD multijet production. There is a small but non-negligible contribution from Z + jets, W + jets, and tt̄ production. For all these processes, an event can pass the selection either because it contains genuine τ h or because at least one extra jet or lepton is misidentified as τ h .
In all channels there is a very small, irreducible component of genuine SM dibosons, which are not distinguishable from signal, except for the non-peaking structure in m ZH .

Event selection
In all channels, the boosted Z boson decaying to qq is identified by requiring the selection: 70 < m P jet < 110 GeV and τ 21 < 0.75.
This region is referred to as the "signal region".
In the all-leptonic and semileptonic channels, the τ τ four-momentum estimated from SVFIT is combined with that of the CA-jet to obtain the resonance mass m ZH . Several preselection requirements are applied to remove backgrounds from low-mass resonances and from overlaps of lepton and τ lepton reconstruction in the detector: m vis > 10 GeV, […] (where ℓ denotes electrons, muons, or hadronically decaying taus), | p miss T | > 20 GeV, and p T,τ τ > 100 GeV, as estimated from the SVFIT procedure.
Since the background in the all-hadronic channel is initially dominated by QCD multijet events, a different preselection is applied for the all-hadronic channel. Only events that have not been included in the all-leptonic or semileptonic categories are considered in this category. The event is then separated into two hemispheres containing the decay products of the two bosons by requiring the following preselection: | p miss T | […]. Further criteria investigated for signal selection in all channels include tighter requirements on variables like the p T of the highest-p T (leading) lepton or τ h and m τ τ as estimated from the SVFIT procedure. An upper limit is placed on ΔR ℓℓ in order to reject W + jets events, where a jet misidentified as a τ lepton is usually well-separated in space from the isolated lepton. The number of b jets in the event also provides a useful criterion to reduce the tt̄ contribution. Jets are identified as b jets using the combined secondary vertex algorithm [54], which exploits observables related to the long lifetime of b hadrons, and are considered only if they do not overlap with τ candidates and CA-jets. Those b jets are clustered with the anti-k T jet algorithm, with a distance parameter R = 0.5. Optimization of the selection on these variables is based on the Punzi figure of merit (P) [55], defined as:
$$P = \frac{\varepsilon_{\mathrm{sig}}}{1 + \sqrt{B}},$$
where ε sig is the signal efficiency and B is the background yield after applying the selection. The results of the optimization are listed in Table 1. It has been verified that these results are not sensitive to the choice of m ZH window used to evaluate ε sig and B. In Table 2 we show the efficiency of the selection in signal events for all search channels.
Table 1 Summary of the optimized event selection for the six τ τ channels. The selection variables are explained in the text. The label ℓ refers to electrons, muons, and τ leptons decaying hadronically.
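The optimization based on the Punzi figure of merit can be sketched as a scan over candidate selections. The example uses Punzi's general form, ε sig / (a/2 + √B), where a is the desired significance in standard deviations; the default a = 2 and the candidate numbers used below are assumptions for illustration, not values from the analysis.

```python
import math

def punzi(eff_sig, bkg, a=2.0):
    """Punzi figure of merit eff_sig / (a/2 + sqrt(B)).

    eff_sig: signal efficiency of the selection.
    bkg: expected background yield after the selection.
    a: number of standard deviations (assumed value, see lead-in).
    """
    return eff_sig / (a / 2.0 + math.sqrt(bkg))

def best_selection(candidates, a=2.0):
    """Pick the candidate with the largest figure of merit.

    candidates: list of (label, eff_sig, bkg) tuples.
    """
    return max(candidates, key=lambda c: punzi(c[1], c[2], a))[0]
```

An advantage of this figure of merit is that it depends on the signal only through its efficiency, so the optimum does not depend on the assumed signal cross section.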

Background estimation
Because of the non-uniformity of the background composition, different estimation techniques are used in each channel.
In the τ e τ e , τ e τ μ , and τ μ τ μ channels the main background source lacks events with a genuine massive boson decaying to quarks; therefore, a technique based on sidebands of the m P jet and τ 21 variables is used for background estimation. In an enlarged search region defined by m P jet > 20 GeV, we define the "sideband region" by inverting the selections on m P jet and τ 21 , therefore including both m P jet regions outside the signal range and regions with τ 21 > 0.75. The total background is estimated in intervals of m ZH , using the formula: […] where "erf" is the error function and the parameters a, b and c are estimated from the MC simulation. A fit to the observed distribution, excluding the signal region, is then used to determine N. Fig. 1 shows the observed distributions of m ZH in all-leptonic channels, along with the corresponding MC expectations for signal and background, as well as the background estimation derived with the above procedure.
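The split between signal and sideband regions can be written as a small classifier. This sketch assumes that the sideband is everything in the enlarged region (m P jet > 20 GeV) that fails the signal-region requirement (70 < m P jet < 110 GeV and τ 21 < 0.75); the function name and the returned labels are illustrative.

```python
def classify_region(m_pruned, tau21, m_min=20.0, m_lo=70.0, m_hi=110.0, tau21_max=0.75):
    """Assign an event to 'signal', 'sideband' or 'outside'.

    m_pruned: pruned jet mass in GeV.
    tau21: N-subjettiness ratio tau_2 / tau_1 of the jet.
    """
    if m_pruned <= m_min:
        return "outside"  # below the enlarged search region
    if m_lo < m_pruned < m_hi and tau21 < tau21_max:
        return "signal"
    # Inverted mass window and/or inverted tau21 requirement
    return "sideband"
```

With this definition an event inside the Z mass window but with τ 21 above 0.75 still contributes to the sideband sample.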
In the semileptonic channels, a control sample defined by the preselection described before, but requiring at least one b-tagged jet, is selected. It has been established with simulation that more than 95% of this sample is composed of tt̄ events. Two scale factors (SFs) relating the observed to the simulated event rates, one for the tt̄ peaking contribution and the other for the tt̄ combinatorial background, are estimated from this control sample. The pruned jet mass distribution is fit with the sum of two functions: […] where A, B, and C define the shape of the non-peaking component, analogous to Eq. (3), and G(D, E) is a Gaussian function of mean D and standard deviation E. The values of these two parameters are fixed to those found in the analysis searching for vector boson pair resonances [17] because we are using the same Z-jet reconstruction. From this fit, the two scale factors between data and MC are found, one for each contribution: […]. The same procedure as for the all-leptonic channels is then applied, fitting the observed sideband distribution but using a modified function, given by the sum of the tt̄ contribution and the function of Eq. (3), where the tt̄ normalization is fixed at the MC expectation, scaled by the two SFs. Fig. 2 shows the distributions of m ZH in semileptonic channels, along with the corresponding MC expectations and the background estimation derived with the above procedure.
For each of the methods used, consistency checks comparing data and background predictions are performed using samples of events at the preselection level, which are expected to have small contributions from potential signal resonances. In the case of the semileptonic channels, we show in Fig. 3 the distribution of m P jet for data and MC at the preselection level. The black line, representing the fit to data, is obtained from the sum of Eqs. (3) and (4), with the tt̄ shape obtained from the control sample, the tt̄ normalization fixed to the MC expectation scaled by the two SFs, and the other components free in the sideband fit. Overall agreement between data and prediction is observed. The background prediction in the signal region is 156 ± 26 events, with an observation of 151 events, for the τ e τ h channel and 204 ± 31 events, with an observation of 203 events, for the τ μ τ h channel.
Table 2 Summary of the signal efficiencies, number of expected background events, and number of observed events for the six τ τ channels. Only statistical uncertainties are included. For the all-leptonic and semileptonic channels, numbers of expected background events and observed events are evaluated for each mass point in m ZH intervals corresponding to ±2.5 times the expected resolution. For the all-hadronic channel we consider the number of expected background, signal, and observed events for m ZH > 800 GeV. When the expected background is zero, the 68% confidence level upper limit is listed.
In the all-hadronic channel, for events where the leading jet satisfies the requirement τ 21 < 0.75, a plane is defined using the […] to the signal region show that the variables m P jet and m τ τ are essentially uncorrelated. In this case, the total number of background events in the region A can be estimated as: […] The method described by Eq. (5), called the "ABCD method", gives a background prediction in the signal region that has been checked to be insensitive to possible signal contamination in the regions B, C, and D. The number of events in regions B, C, and D is too low to derive the shape of the distribution in the signal region using the ABCD method. We use the results from this method to compute the cross section upper limits, which are obtained without assumptions about the shape of the distributions. The ABCD method is checked using an alternative background estimation technique, where the tt̄, W + jets, and Z + jets background contributions are given by Eq. (2), while the QCD multijet background is estimated from a control sample of events where at least one τ candidate fails the isolation requirement. The same control sample is used to obtain the shape of the QCD distribution in the signal region presented in Fig. 5.
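The idea behind an ABCD estimate can be shown with a short sketch. It assumes the conventional layout in which regions B and C each invert one of the two uncorrelated variables and region D inverts both, so that the yield in A is N B × N C / N D; the actual region assignment used in the analysis is not recoverable from the text, and the function name is hypothetical.

```python
import math

def abcd_estimate(n_b, n_c, n_d):
    """Background estimate in region A from control regions B, C, D.

    Assumes B and C each invert one variable and D inverts both
    (an assumption, see lead-in). Returns (estimate, statistical
    uncertainty from Poisson errors on the three counts).
    """
    if n_d <= 0:
        raise ValueError("region D must contain events")
    estimate = n_b * n_c / n_d
    if n_b > 0 and n_c > 0:
        rel_unc = math.sqrt(1.0 / n_b + 1.0 / n_c + 1.0 / n_d)
    else:
        rel_unc = float("nan")  # simple error propagation is not defined here
    return estimate, estimate * rel_unc
```

The factorisation only holds if the two variables are uncorrelated for the background, which is why that property is verified before the method is applied.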

Systematic uncertainties
The sources of systematic uncertainty in this analysis, which affect either the background estimation or the signal efficiencies, are described below.
For the signal efficiency, the main uncertainties come from the limited number of signal MC events (3-10%), the integrated luminosity (2.5%) [56], and the uncertainty on the modeling of pileup (0.2-2.2%). Hereafter, the ranges indicate the different channels and mass regions used in the evaluation of the upper limits. The scale factors for lepton identification are derived from dedicated analyses of observed and simulated Z → ℓ+ℓ− events, using the "tag-and-probe" method [50,51,57]. The uncertainties in these factors are taken as systematic uncertainties and amount to 1-4% for electrons, 1-6% for muons and 9-26% for τ leptons decaying hadronically. The jet and lepton four-momenta are varied over a range given by the energy scale and resolution uncertainties [41].
In this process, variations in the lepton and jet four-momenta are propagated consistently to p miss T . For the all-leptonic and semileptonic channels, additional uncertainties come from the procedure of removing nearby tracks and leptons used in the hadronic τ reconstruction, and from the isolation variable computation in the case of boosted topologies. The inefficiency resulting from these procedures, as measured in signal simulation, is assigned as a systematic uncertainty, corresponding to 1-16% for τ reconstruction and 1-21% for isolation. In the all-hadronic analysis, a constant uncertainty of 10% is assigned for the application of the τ reconstruction procedure to collimated subjets, comparing the performance for isolated and non-isolated τ leptons in simulation. The jet trigger efficiency has an uncertainty of <1%, as determined from a less selective trigger. Following the method derived for vector boson identification in merged jets [58], a scale factor of 0.94 ± 0.06 is used for the efficiency of the pruning and subjet searching techniques applied on the CA jet, where the uncertainty is included in the estimation of the overall systematic uncertainty. For the b tagging, data-to-MC corrections derived from several control samples are applied and the uncertainties on these corrections are propagated as systematic uncertainties in the analysis (2-6%). The procedure used to derive the b-tagging systematic uncertainties is described in Ref. [54].
The uncertainties in the background estimate are dominated by the limited numbers of MC events and sideband data events (4-16 events in all-leptonic channels, 34-37 events in semileptonic channels and 29 in the all-hadronic channels). In the analysis of the all-leptonic and semileptonic channels, additional uncertainties in the background yields of 10-96% originate from the limited number of events of the background MC samples used in the computation of the α(x) quantity, and 18-47% from the normalization fit. Table 2 shows the signal efficiencies (computed using a sample generated with corresponding τ decays), the background expectation and the number of observed events for the six analysis channels.

Results
Having observed no significant deviations in the observed number of events from the expected background, we set upper limits on the production cross section of a new resonance in the ZH final state. We use the CL s criterion [59,60] to extract upper bounds on the cross section, combining all six event categories. The test statistic is a profile likelihood ratio [61] and the systematic uncertainties are treated as nuisance parameters with the frequentist approach. The nuisance parameters are described with log-normal prior probability distribution functions, except for those related to the extrapolation from sideband events, which are expected to follow a Γ distribution [61]. In the all-leptonic and semileptonic channels, the numbers of signal and background events are calculated for a region corresponding to ±2.5 times the expected resolution around each mass point in m ZH , while in the all-hadronic channel we consider the number of expected background, signal and observed events in m ZH > 800 GeV for each mass point. The resulting limits are shown in Fig. 6. Production cross sections times branching fraction in a range between 0.9 and 27.8 fb, depending on the resonance mass (0.8-2.5 TeV), are excluded at a 95% confidence level.
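To give a feeling for the CL s criterion, the sketch below computes an upper limit on the signal yield for a single counting experiment with a known background and no systematic uncertainties. This is a deliberately simplified stand-in, not the profile-likelihood procedure with nuisance parameters used in the analysis; the function names and the bisection settings are assumptions.

```python
import math

def poisson_cdf(n, mu):
    """P(N <= n) for a Poisson distribution with mean mu."""
    term = math.exp(-mu)
    total = term
    for k in range(1, n + 1):
        term *= mu / k
        total += term
    return total

def cls(n_obs, s, b):
    """CL_s = CL_{s+b} / CL_b for a single-bin counting experiment."""
    return poisson_cdf(n_obs, s + b) / poisson_cdf(n_obs, b)

def upper_limit(n_obs, b, cl=0.95, s_max=100.0, tol=1e-6):
    """Signal yield s at which CL_s falls to 1 - cl, found by bisection.

    CL_s decreases monotonically with s, so bisection on [0, s_max] suffices
    as long as the limit lies below s_max.
    """
    lo, hi = 0.0, s_max
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if cls(n_obs, mid, b) > 1.0 - cl:
            lo = mid  # not yet excluded: the limit is higher
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

With zero observed events the limit on the yield is about three events regardless of the expected background, a known property of CL s that protects against excluding signals to which the search has no sensitivity. Dividing the yield limit by the product of signal efficiency and integrated luminosity would convert it into a cross section limit.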
In Fig. 6, the results from this analysis are also compared to the cross section of the theoretical model, used as benchmark in this paper and studied in Ref. [25]. In this model, the parameters are chosen to be g V = 3 and c F = −c H = 1, corresponding to a strongly coupled sector. In Fig. 7, a scan of the coupling parameters and the corresponding regions of exclusion in the HVT model are shown. The parameters are defined as g V c H and g 2 c F /g V , related to the coupling strength of the new resonance to the Higgs boson and to fermions. Regions of the plane excluded by this search are indicated by hatched areas. Ranges of the scan are limited by the assumption that the new resonance is narrow.

Summary
A search for a highly massive (≥0.8 TeV) and narrow resonance decaying to Z and H bosons that decay in turn to merged dijet and τ+τ− final states has been conducted with data samples collected in 8 TeV proton-proton collisions by the CMS experiment in 2012. For a high-mass resonance decaying to much lighter Z and H bosons, the final state particles must be detected and reconstructed in small angular regions. This is the first search performed by adopting novel and advanced reconstruction techniques to accomplish that end. From a combination of all possible decay modes of the τ leptons, production cross sections in a range between 0.9 and 27.8 fb, depending on the resonance mass (0.8-2.5 TeV), are excluded at a 95% confidence level.

References
[4] A.H. Chamseddine, R. Arnowitt, P. Nath, Locally supersymmetric grand unification, Phys. Rev. Lett. 49 (1982)