Search for a Higgs boson in the diphoton final state using the full CDF data set from proton-antiproton collisions at √s = 1.96 TeV

A search for a narrow Higgs boson resonance in the diphoton mass spectrum is presented based on data corresponding to 10 fb⁻¹ of integrated luminosity from proton-antiproton collisions at √s = 1.96 TeV collected by the CDF experiment. In addition to searching for a resonance in the diphoton mass spectrum, we employ a multivariate discriminant technique for the first time in this channel at CDF. No evidence of signal is observed, and upper limits are set on the cross section times branching ratio of the resonant state as a function of Higgs boson mass. The limits are interpreted in the context of the standard model with an expected (observed) limit on the cross section times branching ratio of 9.9 (17.0) times the standard model prediction at the 95% credibility level for a Higgs boson mass of 125 GeV/c². Moreover, a Higgs boson with suppressed couplings to fermions is excluded for masses below 114 GeV/c² at the 95% credibility level.

P.F. Shepard, 43 M. Shimojima v , 51 M. Shochet, 11 I. Shreyber-Tecker, 34 A. Simonenko, 13 P. Sinervo, 31 K. Sliwa, 52 J.R. Smith, 7 F.D. Snider, 15 A. Soha, 15

I. INTRODUCTION
The standard model (SM) of particle physics has proven to be a robust theory that accurately describes the properties of elementary particles and the forces of interaction between them. However, the origin of mass has remained an unsolved mystery for decades. The SM suggests that particles acquire mass due to interactions with the Higgs field via spontaneous symmetry breaking [1]. Direct searches at the Large Electron-Positron Collider (LEP) [2], combined with recent search results from the Tevatron [3] and the Large Hadron Collider (LHC) [4,5], exclude all potential SM Higgs boson masses outside the ranges 116.6-119.4 GeV/c² and 122.1-127 GeV/c².
In the SM, the branching ratio for a Higgs boson decaying into a photon pair, B(H → γγ), is maximal for Higgs boson masses between about 110 and 140 GeV/c². This is a mass range that is most useful for Higgs boson searches at the Fermilab Tevatron [3] and is favored by indirect constraints from electroweak observables [6]. The SM H → γγ branching ratio peaks at a value of about 0.23% for a Higgs boson mass m_H = 125 GeV/c² [7]. This is a very small branching ratio; however, the distinctive signal that photons produce in the detector makes H → γγ an appealing search mode. Compared to the dominant decay modes involving b quarks, a larger fraction of H → γγ events can be identified and the diphoton invariant mass of these events would cluster in a narrower range, thus providing a better discriminator against the smoothly distributed background. There are also theories beyond the standard model that predict a suppressed coupling of a Higgs boson to fermions. In these "fermiophobic" Higgs boson models, the diphoton decay can be greatly enhanced [8].
The Collider Detector at Fermilab (CDF) and D0 experiments at the Tevatron have searched for both a SM Higgs boson, H, and a fermiophobic Higgs boson, h_f, decaying to two photons [9-12]. The CDF and D0 experiments recently set 95% credibility level (C.L.) upper limits on the cross section times branching ratio σ × B(H → γγ) relative to the SM prediction and on B(h_f → γγ) using data corresponding to an integrated luminosity L of 7.0 fb⁻¹ [13] and 8.2 fb⁻¹ [14], respectively. The h_f results set lower limits on m_{h_f} of 114 GeV/c² and 112.9 GeV/c², respectively. These results surpassed for the first time the 109.7 GeV/c² mass limit obtained from combined searches at the LEP collider at CERN [8].
Recently, the ATLAS and CMS experiments at the LHC at CERN have searched for a SM Higgs boson decaying to two photons using L = 4.9 fb⁻¹ [15] and 4.8 fb⁻¹ [16], respectively. In the low mass range, rates corresponding to less than twice the SM cross section are excluded at 95% C.L. An excess of nearly 2σ is present in both the CMS and ATLAS results, which could be consistent with a SM Higgs boson with a mass near 125 GeV/c².
In this Letter, we present a search for a Higgs boson decaying to two photons using the final CDF diphoton data set, corresponding to an integrated luminosity of 10 fb⁻¹. This analysis searches the diphoton mass distribution for a narrow resonance that could reveal the presence of a SM or fermiophobic Higgs boson, updating the previous CDF result [13] with more than 40% additional integrated luminosity. We furthermore implement a new multivariate technique for events that contain two central photons, using both diphoton and jet kinematic variables to improve the sensitivity for identifying a Higgs boson signal from the diphoton backgrounds.

II. HIGGS BOSON SIGNAL MODEL
For the SM search, we consider the three most likely production mechanisms at the Tevatron: gluon fusion (GF); associated production (VH), where a Higgs boson is produced in association with a W or Z boson; and vector boson fusion (VBF), where a Higgs boson is produced alongside two quark jets. As an example, the SM cross sections for m_H = 125 GeV/c² are 949.3 fb [17], 208.0 fb [18], and 65.3 fb [19], respectively. In the fermiophobic search, we consider a benchmark model in which a Higgs boson does not couple to fermions, yet retains its SM couplings to bosons [8]. In this model, the GF process is suppressed and fermiophobic Higgs boson production is dominated by VH and VBF. With L = 10 fb⁻¹, about 28 (43) H → γγ (h_f → γγ) events are predicted to be produced for m_H = 125 GeV/c².
Only about 25% of these events would produce photons that are absorbed in well-instrumented regions of the CDF detector and pass the full diphoton selection discussed in Section III [13]. This fraction, along with the predicted distributions of kinematic variables, is obtained from a simulation of Higgs boson decays into diphotons. For each Higgs boson mass hypothesis tested in the range 100-150 GeV/c², in 5 GeV/c² steps, signal samples are generated with the pythia 6.2 [20] Monte Carlo (MC) event generator and a parametrized response of the CDF II detector [21,22]. All pythia samples were made with CTEQ5L [23] parton distribution functions, where the pythia underlying event model is tuned to CDF jet data [24]. Each signal sample is corrected for multiple interactions and for differences between the identification of photons in the simulation and the data [13]. The GF signal is furthermore corrected based on a higher-order theoretical prediction of the transverse momentum distribution [25].
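The produced-event yield quoted above can be cross-checked with N = σ × B × L. The following is a minimal sketch, assuming the approximate SM branching ratio of 0.23% quoted in the Introduction and the approximate 25% selection fraction; the variable and function names are illustrative and are not part of the analysis software.

```python
# SM production cross sections at m_H = 125 GeV/c^2 quoted in the text, in fb.
SIGMA_FB = {"GF": 949.3, "VH": 208.0, "VBF": 65.3}

BR_GAMMA_GAMMA = 0.0023    # approximate SM B(H -> gamma gamma), from the text
LUMINOSITY_FB_INV = 10.0   # integrated luminosity in fb^-1
SELECTED_FRACTION = 0.25   # approximate fraction passing the full selection


def produced_events(sigma_fb, branching_ratio, luminosity):
    """Return N = (sum of cross sections) x B x L."""
    return sum(sigma_fb.values()) * branching_ratio * luminosity


n_produced = produced_events(SIGMA_FB, BR_GAMMA_GAMMA, LUMINOSITY_FB_INV)
n_selected = n_produced * SELECTED_FRACTION

print(f"produced: {n_produced:.1f}, selected: {n_selected:.1f}")
```

With these inputs the produced yield rounds to the 28 events quoted for the SM case. The fermiophobic yield is not reproduced here because the benchmark branching ratio is not quoted in the text.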

III. DETECTOR AND EVENT SELECTION
We use the CDF II detector [26] to identify photon candidate events produced in pp̄ collisions at √s = 1.96 TeV. The silicon vertex tracker [27] and the central outer tracker [28], contained within a 1.4 T axial magnetic field, measure the trajectories of charged particles and determine their momenta. Particles that pass through the outer tracker reach the electromagnetic (EM) and hadronic calorimeters [29-31], which are divided into two regions: central (|η| < 1.1) and forward or "plug" (1.1 < |η| < 3.6). The EM calorimeters contain fine-grained shower maximum detectors [32], which measure the shower shape and centroid position in the plane transverse to the direction of the shower development.
The event selection is the same as in the previous H → γγ search [13]. Events with two photon candidates are selected and the data are divided into four independent categories according to the position and type of the photons. In central-central (CC) events, both photon candidates are detected within the fiducial region of the central EM calorimeter (|η| < 1.05); in central-plug (CP) events, one photon candidate is detected in this region and the other is in the fiducial region of the plug calorimeter (1.2 < |η| < 2.8); in central-central events with a conversion (C′C), both photon candidates are in the central region, but one photon converts and is reconstructed from its e⁺e⁻ decay products; in central-plug events with a conversion (C′P), there is one central conversion candidate together with a plug photon candidate.
In order to improve sensitivity for the fermiophobic Higgs boson search, the event selection is extended by taking advantage of the final-state features present in the VH and VBF processes. Because the Higgs boson from these processes will be produced in association with a W or Z boson, or with two jets, the transverse momentum of the diphoton system, p_T^γγ, is generally higher relative to the diphoton backgrounds. A requirement of p_T^γγ > 75 GeV/c isolates a region of high h_f sensitivity, retaining roughly 30% of the signal while removing 99.5% of the background [12]. Two lower-p_T^γγ regions, p_T^γγ < 35 GeV/c and 35 GeV/c < p_T^γγ < 75 GeV/c, are additionally included and provide about 15% more sensitivity to the h_f signal. With four diphoton categories (CC, CP, C′C, and C′P) and three p_T^γγ regions, twelve independent channels are included for the fermiophobic Higgs boson search.
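The channel bookkeeping described above (four diphoton categories times three diphoton transverse-momentum regions) can be sketched as follows. The labels, the function name, and the assignment of events lying exactly on a boundary are assumptions made for illustration; the text specifies only the 35 and 75 GeV/c thresholds.

```python
from itertools import product

# Illustrative labels: the "_conv" suffix marks the two conversion categories.
CATEGORIES = ("CC", "CP", "CC_conv", "CP_conv")
PT_REGIONS = ("low", "mid", "high")


def pt_region(pt_diphoton):
    """Map the diphoton transverse momentum (GeV/c) to a region label.

    Boundary values are assigned to the higher region, which is an
    assumption; the text does not say where they belong.
    """
    if pt_diphoton < 35.0:
        return "low"
    if pt_diphoton < 75.0:
        return "mid"
    return "high"


# One channel per (category, region) pair.
CHANNELS = [f"{cat}_{reg}" for cat, reg in product(CATEGORIES, PT_REGIONS)]

print(len(CHANNELS), pt_region(90.0))
```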

IV. DIPHOTON RESONANCE SEARCH
The decay of a Higgs boson into a diphoton pair would appear as a very narrow peak in the distribution of the invariant mass m_γγ of the two photons. The diphoton mass resolution as determined from simulation is better than 3% for the Higgs boson mass region studied here and is limited by the energy resolution of the electromagnetic calorimeters [33] and the ability to identify the primary interaction vertex [13]. The diphoton invariant mass distribution for the most sensitive search category in the SM and fermiophobic scenarios is provided in Fig. 1, with an inset showing the signal shape expected from simulation. In each diphoton category, we perform a search of the m_γγ spectrum for signs of a resonance.
For this search, the total diphoton background is modeled from a fit to the binned diphoton mass spectrum of the data using a log-likelihood (log L) method, as described in [13]. The fit is performed independently for each diphoton category and includes only the sideband region for each m_H hypothesis, which is the control region excluding a mass window centered on the Higgs boson mass being tested. The full width of the mass window is chosen to be approximately ±2 standard deviations of the expected Higgs boson mass resolution, which amounts to 12 GeV/c², 16 GeV/c², and 20 GeV/c² for mass hypotheses of 100-115 GeV/c², 120-135 GeV/c², and 140-150 GeV/c², respectively. The fit for the CC category for m_H = 125 GeV/c² is shown in Fig. 1.
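The signal-window and sideband definition above can be illustrated with a short sketch. It assumes NumPy, hypothetical function names, and that an event exactly on the window edge counts as inside the window; the functional form of the sideband fit itself is described in [13] and is not reproduced here.

```python
import numpy as np


def window_full_width(m_h):
    """Full width (GeV/c^2) of the signal mass window for hypothesis m_h."""
    if 100 <= m_h <= 115:
        return 12.0
    if 120 <= m_h <= 135:
        return 16.0
    if 140 <= m_h <= 150:
        return 20.0
    raise ValueError(f"no window defined for m_H = {m_h} GeV/c^2")


def sideband_mask(m_gg, m_h):
    """Boolean mask selecting events outside the window centered on m_h."""
    m_gg = np.asarray(m_gg, dtype=float)
    half_width = window_full_width(m_h) / 2.0
    return np.abs(m_gg - m_h) > half_width


masses = [110.0, 120.0, 125.0, 132.0, 134.0]
print(sideband_mask(masses, 125).tolist())
```

For m_H = 125 GeV/c² the window spans 117-133 GeV/c², so only the first and last example masses fall in the sideband.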

V. MULTIVARIATE DISCRIMINATOR
The diphoton mass distribution is the most powerful variable for separating a Higgs boson signal from the diphoton backgrounds. However, other information is available that can be used to further distinguish this signal. We improve the most sensitive search category (CC) by using a "Multi-Layer Perceptron" neural network (NN) [34], which combines the information of several well-modeled kinematic variables into a single discriminator, optimized to separate signal and background events. Four diphoton kinematic variables are included: m_γγ, p_T^γγ, the difference between the azimuthal angles of the two photons, and the cosine of the photon emission angle relative to the colliding hadrons in the diphoton rest frame (the Collins-Soper angle) [35]. For events with jets, we also include four variables related to the jet activity, which are particularly useful for identifying VBF and VH signal events. These variables are the number of jets in the event, the sum of the jet transverse energies, and the event sphericity and aplanarity [36]. Jets are reconstructed from tower clusters in the hadronic calorimeter within a cone of radius 0.4 in the η-φ plane [37]. Each jet is required to have |η| < 2 and a transverse energy E_T > 20 GeV, where the energy is corrected for calorimeter response, multiple interactions, and absolute energy scale.
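The four diphoton inputs listed above can be computed from the two photon four-momenta. The sketch below is illustrative only: the function name and the (E, px, py, pz) input convention are assumptions, and the Collins-Soper cosine uses a commonly quoted closed form with the sign fixed by taking the +z axis as the reference beam direction, which may differ from the convention used in the analysis.

```python
import math


def diphoton_variables(p1, p2):
    """Return (m_gg, pT_gg, delta_phi, cos_theta_CS) for two photons.

    Each photon is a tuple (E, px, py, pz) in GeV. The pair must have a
    nonzero invariant mass, otherwise the Collins-Soper cosine is undefined.
    """
    e1, px1, py1, pz1 = p1
    e2, px2, py2, pz2 = p2

    # Four-momentum of the diphoton system.
    e, px, py, pz = e1 + e2, px1 + px2, py1 + py2, pz1 + pz2
    pt = math.hypot(px, py)
    mass = math.sqrt(max(e * e - px * px - py * py - pz * pz, 0.0))

    # Azimuthal separation folded into [0, pi].
    dphi = abs(math.atan2(py1, px1) - math.atan2(py2, px2))
    if dphi > math.pi:
        dphi = 2.0 * math.pi - dphi

    # Collins-Soper cosine: 2 (pz1 E2 - E1 pz2) / (m sqrt(m^2 + pT^2)).
    cos_cs = 2.0 * (pz1 * e2 - e1 * pz2) / (mass * math.sqrt(mass**2 + pt**2))
    return mass, pt, dphi, cos_cs


# Two back-to-back 50 GeV photons in the transverse plane.
print(diphoton_variables((50.0, 50.0, 0.0, 0.0), (50.0, -50.0, 0.0, 0.0)))
```

For the back-to-back transverse example the mass is 100 GeV, the pair has no transverse momentum, the azimuthal separation is π, and the Collins-Soper cosine vanishes.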
In order to optimize the performance of the method, we divide the CC category into two independent subsamples of events: the CC0 category for events with no jets and the CCJ category for events with at least one jet. The CC0 category uses a network trained with only the four diphoton variables; the CCJ category uses a network trained with the four diphoton and four jet variables.
The sideband fit used in the diphoton resonance search provides an estimate of the total background prediction in each signal mass window; however, the multivariate analysis requires a more detailed background model. Specifically, we divide the background into its distinct components in order to best model all input variables used by the discriminant, which is also sensitive to correlations. There are two main background components in the CC data sample: a prompt diphoton (γγ) background produced from the hard parton scattering or from hard photon bremsstrahlung from energetic quarks, and a background comprised of γ-jet and jet-jet events (γj + jj ) in which the jets are misidentified as photons [38]. To model the shape of kinematic variables in the γγ background, we use a pythia MC sample developed and studied in a measurement of the diphoton cross section [35].
To model the variable shapes in the γj + jj background, we obtain a data sample enriched in misidentified photons by selecting events for which one or both photon candidates fail the NN photon ID requirement [13]. In the diphoton cross section analysis [35] it was found that a p_T^γγ-dependent correction was needed for the pythia modeling. We adopt the correction for this analysis, reweighting the p_T^γγ distribution from pythia to match the p_T^γγ distribution from control regions in prompt diphoton data. For each category, CC0 and CCJ, and for each Higgs boson mass hypothesis, event weights are derived based on the sideband regions, excluding the signal mass window. The weights are derived by fitting a smooth function to the ratio of the p_T^γγ distribution from the data to that from the pythia prediction. The best fit in the CC0 category is obtained from a polynomial (constant) function for p_T^γγ < 50 GeV/c (p_T^γγ > 50 GeV/c). A different polynomial (constant) function provides the best fit in the CCJ category for p_T^γγ < 60 GeV/c (p_T^γγ > 60 GeV/c). Figure 2 shows the reweighting function for a Higgs boson mass hypothesis of 125 GeV/c². The solid curve shows the best fit to the data and the other two curves show the variations induced by propagating the 68% C.L. fit uncertainties to the fitting function. The rise of the reweighting function from p_T^γγ ∼ 20 GeV/c to p_T^γγ ∼ 50 GeV/c in both the CC0 and CCJ categories is interpreted in Ref. [35] as an effect of parton fragmentation not modeled in pythia, which contributes to the prompt diphoton production cross section in that range.
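The piecewise reweighting described above (a polynomial below a break point and a constant above it) can be sketched as follows. This is an illustration under stated assumptions: the polynomial degree, the unweighted least-squares fit, the use of the plain mean for the plateau, and the function name are all hypothetical, and the input below is synthetic rather than taken from the analysis.

```python
import numpy as np


def fit_reweighting(pt_centers, ratio, pt_break, degree=2):
    """Fit data/MC ratio vs pT: polynomial below pt_break, constant above.

    Returns a function mapping pT (GeV/c) to an event weight.
    The degree is an assumption; the text does not quote it.
    """
    pt_centers = np.asarray(pt_centers, dtype=float)
    ratio = np.asarray(ratio, dtype=float)

    low = pt_centers < pt_break
    coeffs = np.polyfit(pt_centers[low], ratio[low], degree)
    plateau = float(np.mean(ratio[~low]))

    def weight(pt):
        pt = np.asarray(pt, dtype=float)
        return np.where(pt < pt_break, np.polyval(coeffs, pt), plateau)

    return weight


# Synthetic ratio: quadratic rise below 50 GeV/c, flat above.
pt_bins = np.arange(2.5, 100.0, 5.0)
synthetic_ratio = np.where(pt_bins < 50.0, 1.0 + 0.0004 * pt_bins**2, 2.0)
weight_cc0 = fit_reweighting(pt_bins, synthetic_ratio, pt_break=50.0)

print(float(weight_cc0(20.0)), float(weight_cc0(80.0)))
```

A real fit would weight each bin by its statistical uncertainty and propagate the fit uncertainties, as done for the variation curves described in the text.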
The relative contributions of the two background components are obtained from a fit to the diphoton data. Three histograms for each NN input variable are constructed: one from the γγ background sample after reweighting, one from the γj + jj background sample, and one from the diphoton data. Events used for the fit are required to have diphoton mass values greater than 70 GeV/c² and to be outside of the signal mass window. The histograms are then used to build the χ² function of Eq. (1), where g_ij, f_ij, and d_ij refer to the number of events in the ith bin of the jth input variable for the prompt γγ background, γj + jj background, and diphoton data samples, respectively. The sums are over all bins of each input variable for which there are at least 5 events in the data, and the global α and β coefficients are determined by minimizing the χ² function. This function is defined and minimized separately for each Higgs boson mass hypothesis and for each category (CC0 and CCJ).
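A fit of this kind can be sketched as a weighted linear least-squares problem. The sketch below assumes a χ² of the form Σ (α g + β f − d)² / d, that is, with the Poisson variance of the data in the denominator; this denominator, the assignment of α to the γγ template and β to the γj + jj template, and the function name are assumptions, since the exact form of Eq. (1) is not reproduced here.

```python
import numpy as np


def fit_background_fractions(g, f, d, min_events=5):
    """Minimize sum_ij (alpha*g_ij + beta*f_ij - d_ij)^2 / d_ij.

    g, f, d are the bin contents of the prompt diphoton template, the
    misidentified-jet template, and the data, concatenated over all input
    variables. Bins with fewer than min_events data events are dropped.
    """
    g, f, d = (np.asarray(x, dtype=float).ravel() for x in (g, f, d))
    keep = d >= min_events
    g, f, d = g[keep], f[keep], d[keep]

    # Assumed per-bin uncertainty sqrt(d); dividing by it turns the
    # chi-square into an ordinary least-squares problem.
    w = 1.0 / np.sqrt(d)
    design = np.column_stack((g * w, f * w))
    (alpha, beta), *_ = np.linalg.lstsq(design, d * w, rcond=None)
    return float(alpha), float(beta)


# Synthetic check: data built exactly as 2*g + 3*f.
g_bins = np.array([10.0, 20.0, 30.0, 40.0])
f_bins = np.array([5.0, 3.0, 8.0, 1.0])
alpha_fit, beta_fit = fit_background_fractions(g_bins, f_bins, 2 * g_bins + 3 * f_bins)
print(alpha_fit, beta_fit)
```

Because the assumed model is linear in α and β, no iterative minimizer is needed; a different denominator in Eq. (1) would generally require a numerical minimization instead.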
A neural network discriminant is trained separately for each mass hypothesis using signal and background events. The signal events used in the training are optimized for the SM scenario and are composed of GF, VH, and VBF pythia samples so that the corresponding total numbers are proportional to their SM cross section predictions. The background sample is made by taking a portion of the γj + jj sample available for each mass hypothesis and adding γγ events from pythia weighted by the ratio α/β from the χ 2 fit for the given mass hypothesis.
After training, the NN is applied to the diphoton data sample. Figure 3 shows input variables such as the p_T^γγ distribution for events with no reconstructed jets and the sum of the jet E_T for events with ≥1 reconstructed jet. The signal shapes are scaled to 20 times the expected number of reconstructed events in the SM scenario. The background prediction is also provided. While the χ² fit described by Eq. (1) is used to fix the relative composition of the γγ and γj + jj background components, the total expected number of background events is more accurately determined from sideband mass fits, which is the technique described in Section IV. The resulting NN shapes for m_H = 125 GeV/c² are provided in Fig. 4.

VI. SYSTEMATIC UNCERTAINTIES
The sources of systematic uncertainties on the expected number of signal events are the same as in the previous CDF H → γγ search [13]. They arise from the conversion ID efficiency (7%), the integrated luminosity measurement (6%), varying the parton distribution functions used in pythia (up to 5%) [39,40], varying the parameters that control the amount of initial- and final-state radiation from the parton shower model of pythia (about 4%), and the pythia modeling of the shape of the p_T^γγ distribution for the h_f signal (up to 4%) [41]. Finally, we include uncertainties from the photon ID efficiency (up to 4%), the trigger efficiency (less than 3%), and the EM energy scale (less than 1%).
The statistical uncertainties on the total background in the signal region are determined by the fit. They are 4% or less for the channels associated with the SM diphoton resonance search and are less than 7% for the CC0 and CCJ categories used in the multivariate technique. For the channels associated with the fermiophobic Higgs boson diphoton resonance search, the background rate uncertainty is 12% or less, except for the high-p_T^γγ bins with conversion photons, where it is 20%.
For the search using the multivariate technique, in addition to the rate uncertainties summarized above, we consider shape uncertainties and bin-by-bin statistical uncertainties of the NN discriminant. The signal shape uncertainties are associated with initial- and final-state radiation and the jet energy scale [37], and the background shape uncertainties are associated with the pythia p_T^γγ correction and the jet energy scale. The pythia shape uncertainties due to the p_T^γγ fits are taken as uncorrelated between the CC0 and the CCJ categories because the fits determining the corrections for each category are done independently. The jet energy scale shape uncertainties are correlated between the two categories in order to take into account event migration between categories. The dominant uncertainty in the multivariate analysis is the bin-by-bin statistical uncertainty of the γj + jj background histograms.

VII. RESULTS
No evidence of a narrow peak or any other structure is visible in the diphoton mass spectrum or the NN output distribution. We calculate a Bayesian C.L. limit for each Higgs boson mass hypothesis based on a combination of likelihoods from the discriminant distributions for all channels in the corresponding mass signal region. The combined limits for the SM search use the NN discriminants of the CC0 and CCJ categories and the mass discriminants from the CP, C′C, and C′P categories. The fermiophobic limits use the NN discriminants of the CC0 and CCJ categories and the mass discriminants from the CP, C′C, and C′P categories divided into p_T^γγ regions. For the limit calculation, we assume a flat prior (truncated at zero) for the signal rate and a truncated Gaussian prior for each of the systematic uncertainties. A 95% C.L. limit is determined such that 95% of the posterior density for σ × B(H → γγ) falls below the limit [42]. The expected 95% C.L. limits are calculated assuming no signal, based on expected backgrounds only, as the median of 2000 simulated experiments. The observed 95% C.L. limits on σ × B(H → γγ) are calculated from the data.
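The 95% posterior construction can be illustrated with a deliberately simplified case: a single counting channel with a known background and a flat signal prior truncated at zero. This sketch omits the binned discriminants, the combination of channels, and the Gaussian priors on systematic uncertainties used in the analysis; the function name, grid range, and grid size are assumptions.

```python
import numpy as np


def bayesian_upper_limit(n_obs, background, cl=0.95, s_max=50.0, n_grid=200001):
    """Upper limit on the signal yield s for a single Poisson count.

    Posterior density is proportional to Poisson(n_obs | s + background)
    times a flat prior on s >= 0, evaluated on a uniform grid.
    """
    s = np.linspace(0.0, s_max, n_grid)
    mu = s + background

    # Log-likelihood up to a constant; the floor avoids log(0) at mu = 0.
    log_like = n_obs * np.log(np.maximum(mu, 1e-300)) - mu
    posterior = np.exp(log_like - log_like.max())

    # Normalized cumulative posterior; the limit is where it reaches cl.
    cdf = np.cumsum(posterior)
    cdf /= cdf[-1]
    return float(s[np.searchsorted(cdf, cl)])


limit_zero = bayesian_upper_limit(n_obs=0, background=0.0)
limit_five = bayesian_upper_limit(n_obs=5, background=0.0)
print(limit_zero, limit_five)
```

For zero observed events and no background the posterior is a pure exponential, so the limit approaches −ln(0.05) ≈ 3.0 signal events. An expected limit would repeat this calculation on background-only pseudo-experiments and take the median, as described in the text.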
For the SM Higgs boson search, the results are given relative to the theory prediction, where theoretical cross section uncertainties of 14% on the GF process, 7% on the VH process, and 5% on the VBF process are included in the limit calculation [43]. For the h_f model, SM cross sections and uncertainties are assumed (GF excluded) and used to convert limits on σ × B(h_f → γγ) into limits on B(h_f → γγ). The SM and fermiophobic limit results for the CC category alone are provided in Table I, showing the gain obtained by incorporating a multivariate technique for this category. The combined limit results for both searches are displayed in Table II and graphically in Fig. 5. Limits are also provided on σ × B(H → γγ) for the SM search without including theoretical cross section uncertainties. For the SM limit at m_H = 120 GeV/c², we observe a deviation of greater than 2.5σ from the expectation. After accounting for the trials factor associated with performing the search at 11 mass points, the significance of this discrepancy decreases to less than 2σ. When the analysis is optimized for the fermiophobic benchmark model, no excess is observed. For the h_f model, we obtain a lower limit of m_{h_f} > 114 GeV/c² by linear interpolation between the sampled values of m_{h_f} based on the intersection of the observed limit and the model prediction.
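The effect of the trials factor can be illustrated with a simple estimate. The sketch below treats the 11 mass points as statistically independent and uses one-sided Gaussian tail probabilities; both are assumptions, and the text does not state how the trials factor was evaluated in the analysis. Neighboring mass windows overlap, so the independence assumption tends to overstate the correction.

```python
from statistics import NormalDist


def global_significance(z_local, n_trials):
    """Convert a local significance to a global one for n independent trials."""
    norm = NormalDist()
    p_local = 1.0 - norm.cdf(z_local)              # one-sided local p-value
    p_global = 1.0 - (1.0 - p_local) ** n_trials   # chance of any such excess
    return norm.inv_cdf(1.0 - p_global)


z_global = global_significance(2.5, 11)
print(f"{z_global:.2f}")
```

Under these assumptions a local 2.5σ deviation corresponds to roughly 1.5σ globally, in line with the statement that the significance falls below 2σ.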

VIII. SUMMARY AND CONCLUSIONS
This Letter presents the results of a search for a narrow resonance in the diphoton mass spectrum using data taken by the CDF II detector at the Tevatron. We have improved upon the previous CDF analysis by implementing a neural network discriminant to increase sensitivity in the most sensitive diphoton category by as much as 13% (17%) for the SM (fermiophobic) scenario. In addition, we have included the full CDF diphoton data set, which adds more than 40% additional integrated luminosity relative to the previous diphoton Higgs boson search. There is no significant evidence of a resonance in the data. Limits are placed on the production cross section times branching ratio for Higgs boson decay into a photon pair and compared to the predictions of the standard model and a benchmark fermiophobic model. The latter results in a lower limit on the fermiophobic Higgs boson mass of m_{h_f} > 114 GeV/c² at the 95% C.L.

TABLE II. Expected and observed 95% C.L. upper limits on the production cross section times branching ratio relative to the SM prediction, the production cross section times branching ratio with theoretical cross section uncertainties removed, and the h_f branching ratio. The fermiophobic benchmark model prediction for B(h_f → γγ) is also shown for comparison.