Search for neutral Higgs bosons in the multi-b-jet topology in 5.2 fb⁻¹ of pp̄ collisions at √s = 1.96 TeV

Data recorded by the D0 experiment at the Fermilab Tevatron Collider are analyzed to search for neutral Higgs bosons produced in association with b quarks. The search is performed in the three-b-quark channel using multijet-triggered events corresponding to an integrated luminosity of 5.2 fb⁻¹. In the absence of any significant excess above background, limits are set on the cross section multiplied by the branching ratio in the Higgs boson mass range 90 to 300 GeV, extending the excluded regions in the parameter space of the minimal supersymmetric standard model.

A.A. Shchukin,38 R.K. Shivpuri,28 V. Simak,10 V. Sirotenko,47 P. Skubic,72 P. Slattery,68 D. Smirnov,53 K.J. Smith,66 G.R. Snow,63 J. Snow,71 S. Snyder,70 S. Söldner-Rembold,43 L. Sonnenschein,21

Although tan β is a free parameter in the MSSM, there are indications which suggest that it should be large. A value of tan β ≈ 35 naturally explains the top to bottom quark mass ratio [2], and high tan β values also provide a good explanation for the observed density of dark matter [3].
The couplings of the Higgs bosons to fermions in the MSSM are proportional to the corresponding couplings in the standard model (SM). The proportionality factor depends on the type of the quark (up- or down-type) and on the type of the Higgs boson. At large tan β, two of the Higgs bosons, A and either h or H, have approximately the same mass and a down-type quark coupling enhanced by tan β compared to the SM coupling, while the coupling to up-type quarks is suppressed. Here, the three neutral Higgs boson couplings to b quarks follow the sum rule

$$g_{hbb}^2 + g_{Hbb}^2 + g_{Abb}^2 \approx 2 \times \tan^2\beta \times g_{H_{SM}}^2,$$

where $g_{H_{SM}}$ is the SM coupling. Therefore, in these cases the production of Higgs bosons associated with bottom quarks (down-type quarks with the highest mass) is enhanced by a factor of 2 × tan²β compared to SM production. Due to the tan β enhancement, the main decay for the three neutral Higgs bosons is φ → bb̄ with branching ratios near 90% (the remainder being mostly φ → ττ). Since a direct search for φ → bb̄ is difficult due to large multijet backgrounds, searches rely on the case where φ is produced in association with one b quark. The final state with three b quarks represents a powerful search channel, with the third b-jet providing additional suppression of the large multijet background at a hadron collider.

MSSM Higgs boson production has been studied at the CERN LEP e⁺e⁻ collider, which excluded M_{h,A} < 93 GeV for all tan β values [4]. The CDF [5] and D0 [6][7][8] collaborations have extended MSSM Higgs boson searches to higher masses for high tan β values.

This Letter uses data collected during Run II at the Fermilab Tevatron collider by the D0 collaboration corresponding to an integrated luminosity of 5.2 fb⁻¹ [9]. The dataset is broken into two periods, corresponding to the period before (1.0 fb⁻¹) and after (4.2 fb⁻¹) the upgrade of the D0 silicon vertex detector and trigger system. The dataset is five times larger than that used in the previous publication [7].
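As a rough numerical illustration of the enhancement factor 2 × tan²β quoted above, taking tan β = 35 (the value mentioned in connection with the top to bottom quark mass ratio) purely as an example:

```latex
2 \times \tan^2\beta \;=\; 2 \times 35^2 \;=\; 2450
```

In this simple estimate, which ignores any further model-dependent corrections, the associated production rate would exceed the SM rate by roughly three orders of magnitude.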
The full dataset has been reanalyzed to incorporate recent improvements to analysis procedures, algorithms, and calibrations. Improved modeling of the background has resulted in reduced systematic uncertainties. In addition, the Higgs boson mass range under consideration has been extended to 300 GeV. The limits are calculated using a program [10], which is an implementation of the modified frequentist limit setting procedure [24], and are based only on the shape, and not the normalisation, of the distribution of the final discriminating variable. The D0 detector is described in Ref. [11]. Dedicated triggers for the three trigger levels (L1, L2, L3) designed to select events with at least three jets are used in this analysis. The majority of the data were recorded with b-tagging requirements at the trigger level, either at L3 or at both L2 and L3. The trigger has an efficiency of approximately 60% for φb → bbb events with a Higgs boson mass of 150 GeV when measured relative to events with three or four reconstructed jets.
The midpoint cone algorithm [12] with a radius R = √((∆y)² + (∆ϕ)²) = 0.5, where y is the rapidity and ϕ the azimuthal angle, is used to reconstruct jets from energy deposits in the calorimeters. Details of the jet reconstruction and energy scale determination are described in Ref. [13]. In addition to passing a set of quality criteria, all jets are required to be matched to at least two tracks reconstructed in the central detector, pointing to the pp̄ vertex and with hits in the silicon detector. The matching criterion is ∆R(track, jet axis) = √((∆η)² + (∆ϕ)²) < 0.5, where η is the pseudorapidity. Signal events are selected by requiring three or four jets with transverse momenta p_T > 20 GeV and |η| < 2.5. The position of the pp̄ vertex along the beam axis (z_PV) is required to be within 35 cm of the center of the detector. This is well within the geometric acceptance of the silicon detector, as needed for efficient b-tagging.

A neural network (NN) b-tagging algorithm [14], which considers lifetime-based information involving the track impact parameters and secondary vertices, is used to identify b-quark jets. Each event must have at least three jets satisfying a b-tag NN requirement. The single-jet b-tagging efficiency is approximately 50% for a light-quark jet mistag rate of 0.8%. Data events with two tagged jets are used together with simulated events with two and three tagged jets to model the background. Finally, the transverse momenta of the two b-tagged jets with the highest p_T are required to be above 25 GeV. The data are split into four independent channels based on jet multiplicity (three or four jets) and running period.
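The ∆R matching criterion and the kinematic requirements described above can be sketched as follows. This is a minimal illustration, not the D0 reconstruction code: the function names and the dictionary layout of a jet are hypothetical, and it is assumed here that the "three or four jets" requirement counts jets passing the p_T and |η| cuts.

```python
import math

def delta_phi(phi1, phi2):
    """Azimuthal difference wrapped into the interval (-pi, pi]."""
    dphi = (phi1 - phi2) % (2.0 * math.pi)
    if dphi > math.pi:
        dphi -= 2.0 * math.pi
    return dphi

def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance sqrt(d_eta^2 + d_phi^2) between two directions."""
    return math.hypot(eta1 - eta2, delta_phi(phi1, phi2))

def passes_kinematic_selection(jets, z_pv_cm):
    """Hypothetical event filter: |z_PV| < 35 cm and three or four jets
    with pT > 20 GeV and |eta| < 2.5 (jets given as dicts with 'pt', 'eta')."""
    if abs(z_pv_cm) >= 35.0:
        return False
    good = [j for j in jets if j["pt"] > 20.0 and abs(j["eta"]) < 2.5]
    return len(good) in (3, 4)
```

The wrap-around in `delta_phi` matters for a track and a jet axis on opposite sides of the ϕ = ±π boundary, which would otherwise appear to be far apart.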
The leading order event generator pythia [15] is used to generate samples of associated production of φ and a b quark in the 5-flavor scheme [16], gb → φb. The cross section [17], experimental acceptance, and the kinematic distributions of the b-quark jet produced in association with the Higgs boson are corrected to next-to-leading order (NLO) using mcfm [16]. Multijet background events from the bbj, bbjj, ccj, ccjj, bbcc, and bbbb processes, where j denotes a light parton (u, d, s quark or gluon), are generated with the alpgen [18] event generator. A matching algorithm is used to avoid double counting of final states [19]. The small contribution from tt̄ production to the background is also simulated with alpgen. Other processes, such as Zbb̄ and single top quark production, are negligible. The alpgen samples are processed through pythia for showering and hadronization.

All samples are then processed through a geant-based [20] simulation of the D0 detector. The same reconstruction algorithms are used for the simulated samples as for the data. A parameterized trigger simulation, based on efficiencies measured in data, is used to model the effects of the trigger requirements. The b-tagging is modeled by weighting simulated events based on their tagging probability measured using data [14]. The efficiency of the requirements on the triggers, z_PV, and the number of jets ranges from 1.9% to 26.4% for Higgs boson masses between 90 and 300 GeV. After the three b-tag requirement, the efficiency with respect to the total number of signal events ranges from 0.2% to 1.4% in the three-jet channel (0.1% to 0.9% in the four-jet channel). Table I shows the number of events in data and the signal efficiency at different stages of the event selection.
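The idea of weighting simulated events by their tagging probability can be illustrated with a small sketch. It assumes, as a simplification not stated in the text, that the per-jet tag probabilities are independent, and the function name is hypothetical; the actual D0 parameterization is described in Ref. [14].

```python
def prob_at_least_n_tags(tag_probs, n_min=3):
    """Event weight: probability that at least n_min jets are b-tagged,
    given independent per-jet tag probabilities (an assumption)."""
    # dist[k] holds the probability of exactly k tagged jets so far.
    dist = [1.0]
    for p in tag_probs:
        new = [0.0] * (len(dist) + 1)
        for k, q in enumerate(dist):
            new[k] += q * (1.0 - p)   # this jet is not tagged
            new[k + 1] += q * p       # this jet is tagged
        dist = new
    return sum(dist[n_min:])
```

For example, three jets that each have the quoted 50% single-jet efficiency give a triple-tag weight of 0.5³ = 0.125 under this independence assumption.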
Multijet processes contribute to the background and the theoretical uncertainty on their cross sections is very large. The background composition is therefore determined by fitting distributions of the transverse momenta of jets of simulated events to data. The fractional contribution, α_i, of the ith multijet background process is calculated from equations linking the b-tag efficiency for the ith background, ε_i^k, with the number of observed events, N_k, where k indicates the number of b-tagged jets (0-3) in an event [21], and the total number of events, N_tot:

$$\sum_i \alpha_i \, \epsilon_i^k = \frac{N_k}{N_{\mathrm{tot}}}. \qquad (1)$$

The double b-tagged sample is dominated by bbj, while the triple b-tagged sample consists of a mixture of approximately 50% bbb, 30% bbj, 15% bbc + bcc and a remaining fraction consisting of ccj, bjj, cjj, and jjj.

For every event, the two jet pairs with the largest scalar summed transverse momenta are considered as possible Higgs boson candidates. To remove discrepancies between data and simulation originating from gluon splitting (g → bb̄), jet pairs which do not fulfill ∆R > 1.0 are rejected.
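A toy version of the background composition calculation can make the role of α_i and ε_i^k concrete. The sketch below assumes the linear relation Σ_i α_i ε_i^k = N_k / N_tot between the fractions and the tag efficiencies, and solves it in closed form for only two processes; all names and numbers are illustrative, and the real analysis also uses fits to jet p_T distributions.

```python
def predicted_tag_fractions(alphas, eff):
    """Expected fraction of events with k tagged jets, k = 0..K,
    where eff[i][k] is the k-tag efficiency of process i."""
    n_k = len(eff[0])
    return [sum(a * e[k] for a, e in zip(alphas, eff)) for k in range(n_k)]

def solve_two_process_fraction(frac_k, eff1_k, eff2_k):
    """Fraction alpha of process 1 in a two-process toy, from
    frac_k = alpha * eff1_k + (1 - alpha) * eff2_k."""
    if eff1_k == eff2_k:
        raise ValueError("efficiencies must differ to separate the processes")
    return (frac_k - eff2_k) / (eff1_k - eff2_k)

# Illustrative (made-up) efficiencies for k = 0, 1, 2, 3 tagged jets.
eff = [[0.10, 0.30, 0.40, 0.20],   # heavy-flavor-like process
       [0.70, 0.25, 0.04, 0.01]]   # light-flavor-like process
fractions = predicted_tag_fractions([0.3, 0.7], eff)
alpha = solve_two_process_fraction(fractions[2], eff[0][2], eff[1][2])
```

With more processes than tag categories the system is underdetermined, which is presumably one reason additional kinematic information is needed in practice.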
Six variables for which the data distributions are well modeled by the simulation are used to separate the jet pair from a Higgs boson from the background: ∆η between the two jets in the pair, ∆ϕ between the two jets in the pair, the angle between the leading jet in the pair and the total momentum of the pair, the momentum balance in the pair [22], the combined rapidity of the pair, and the event sphericity. Based on these kinematic variables a likelihood discriminant, D, is calculated according to:

$$D(x_1, \ldots, x_6) = \frac{\prod_{i=1}^{6} P^{\mathrm{sig}}_i(x_i)}{\prod_{i=1}^{6} P^{\mathrm{sig}}_i(x_i) + \prod_{i=1}^{6} P^{\mathrm{bkg}}_i(x_i)}, \qquad (2)$$

where P^sig_i (P^bkg_i) refers to the signal (background) probability density function (pdf) for variable x_i, and (x_1, ..., x_6) is the set of measured kinematic variables. The pdfs are obtained from triple b-tagged signal and simulated background samples. Two likelihood discriminants, providing discrimination in the Higgs boson mass ranges 90 to 130 GeV (low-mass) and 130 to 300 GeV (high-mass), respectively, are built by combining simulated signal samples from the appropriate Higgs boson mass ranges. Signal samples of equal size, spaced by 10 GeV in M_A, are hence added together within the low-mass and high-mass range, respectively.

After evaluating the likelihood, only the jet pairing with the larger D is kept for each event in each mass range. To further remove background from the final analysis sample, events are only selected if D > 0.65. The likelihood requirements are optimized considering the variation of the predicted limit in tan β. The final discriminant used in the limit calculation is the distribution of the jet pair invariant mass, M_bb, after the selection requirement of the likelihood appropriate to the mass of the hypothesized Higgs boson.
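The likelihood discriminant and the choice of jet pairing can be sketched as below, assuming the usual product-of-pdfs ratio form for D. The function names are hypothetical, and each pairing is represented simply by the pdf values already evaluated at its six measured variables.

```python
import math

def likelihood_discriminant(p_sig, p_bkg):
    """D = prod(P_sig) / (prod(P_sig) + prod(P_bkg)) for one jet pairing."""
    s = math.prod(p_sig)
    b = math.prod(p_bkg)
    if s + b == 0.0:
        raise ValueError("both likelihoods vanish; D is undefined")
    return s / (s + b)

def select_pairing(pairings, d_cut=0.65):
    """Keep the pairing with the larger D; return None if it fails D > d_cut.
    Each pairing is a (p_sig, p_bkg) tuple of per-variable pdf values."""
    best = max(likelihood_discriminant(ps, pb) for ps, pb in pairings)
    return best if best > d_cut else None
```

By construction D lies between 0 and 1, with signal-like pairings pushed toward 1, which is why a single threshold such as 0.65 can serve as the selection requirement.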
The bbb background is indistinguishable from the signal events where the wrong b-jet pair is chosen by the likelihood and consequently cannot be normalised from the data. The M_bb background shape is modeled using a combination of data and simulated samples. The distribution, S^pred_3Tag(D, M_bb), of the predicted triple b-tagged (3Tag) background sample in the two-dimensional D and M_bb plane is obtained from the inclusive double b-tagged (2Tag) data shape multiplied by the ratio of the simulated shapes of the SM triple (S^MC_3Tag) and double tagged events (S^MC_2Tag):

$$S^{\mathrm{pred}}_{\mathrm{3Tag}}(D, M_{bb}) = \frac{S^{\mathrm{MC}}_{\mathrm{3Tag}}(D, M_{bb})}{S^{\mathrm{MC}}_{\mathrm{2Tag}}(D, M_{bb})} \times S^{\mathrm{data}}_{\mathrm{2Tag}}(D, M_{bb}). \qquad (3)$$

Fig. 1 shows D for data and background for the low- and high-mass likelihoods in the three-jet channel. The shape of a signal distribution, normalised to the same number of events as data, is also shown. Fig. 2 shows the M_bb distribution in the three-jet channel after the low- and high-mass likelihood selection requirements, respectively. The invariant mass of a Higgs boson signal in the three-jet channel is shown in Fig. 3 for three different values of M_A.
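The background prediction described above amounts to a bin-by-bin reweighting of the double-tagged data shape. A minimal sketch, with the two-dimensional (D, M_bb) histogram flattened into a plain list of bin contents and with hypothetical function names:

```python
def predict_3tag_shape(data_2tag, mc_3tag, mc_2tag):
    """Bin-by-bin prediction: data_2tag * (mc_3tag / mc_2tag).
    Bins with no simulated 2Tag events are set to zero (a choice made here)."""
    pred = []
    for d, m3, m2 in zip(data_2tag, mc_3tag, mc_2tag):
        pred.append(d * m3 / m2 if m2 > 0.0 else 0.0)
    return pred

def normalise(hist):
    """Scale a histogram to unit area, since only the shape is used."""
    total = sum(hist)
    return [x / total for x in hist]

shape = normalise(predict_3tag_shape([100.0, 200.0], [1.0, 3.0], [10.0, 10.0]))
```

Because the simulation enters only through the 3Tag/2Tag ratio, effects that shift both simulated samples in the same way largely drop out of the prediction.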
To verify the background model, a signal-depleted region is studied: any deviation observed there is unlikely to result from signal and would therefore indicate a possible problem in the background modeling. A sample is hence chosen using the lower-likelihood jet pairing and applying a selection of D < 0.12. Fig. 4 shows the invariant mass distributions for background and data in this sample. Agreement (χ²/n.d.f. = 0.86) between the background model and the data is observed. A wide variety of additional cross-checks were carried out, examining aspects of the event selection, b-tagging, and background modeling; no significant changes in the results were observed.
Sources of systematic uncertainty on both the signal normalisation and shape are considered. The sources of systematic uncertainty on the signal are: b-quark jet identification efficiency, b- and light-quark jet energy resolution, trigger modeling, jet energy calibration, jet identification, integrated luminosity, and theoretical models. The theoretical uncertainty on the signal cross section is estimated from mcfm [16] and consists of a contribution of 10% from the choice of factorisation scale as well as an uncertainty of 5% to 13% from the parton distribution functions, depending on Higgs boson mass. Both the theoretical uncertainty and the luminosity uncertainty of 6.1% [9] are treated as normalisation uncertainties for each mass hypothesis. The remaining sources of systematic uncertainty are assessed individually by varying parameters within their uncertainties and taking into account the resulting difference in normalisation and shape of the M_bb distribution at each mass point.
For the dominant background, only systematic uncertainties affecting the shape of M_bb matter, since only the shape and not the normalisation is used to distinguish signal from background in this analysis. Additionally, many uncertainties affecting the simulation, like the jet energy scale and resolution uncertainties, cancel in Eq. 3. The estimated variations in the remaining systematic sources are propagated to D, M_bb and the predicted shape S^pred_3Tag and used in assessing the limits presented below. The uncertainty from the b-tagging of jets is evaluated by varying the b-tag efficiencies within their uncertainties. The uncertainty in the difference of the energy resolution between heavy and light flavor jets is evaluated by smearing the energy of the b- and c-quark jets by an additional 7%, corresponding to half the light-quark jet energy resolution. The shape difference between triple and double b-tagged data in the trigger turn-on curves resulting from the b-tagging criteria in the trigger is accounted for as a systematic uncertainty. Small variations in the shape, arising from possible signal contamination when determining the background composition, are included as a systematic uncertainty. Finally, the uncertainty on the tt̄ normalisation is taken as 10% [23].
No significant excess over the background is observed in the data. Limits on the Higgs boson production cross section multiplied by the branching ratio to bb̄ are therefore calculated with the modified frequentist method [10,24]. The confidence level of the signal, CL_s, which is used to calculate the exclusion, is defined in Ref. [24]. The overall normalisation of the background expectation is allowed to float independently in the null (background-only) and test (background-plus-signal) hypotheses. The systematic uncertainties on the signal and background are included in the limit setting procedure. Each component of systematic uncertainty is adjusted by introducing multiplicative scale factors and maximizing the likelihood for the agreement between prediction and data with respect to these scale factors, constrained by prior Gaussian uncertainties. Limits on the product of cross section and branching ratio are obtained by scaling the signal cross section until 1 − CL_s = 0.95 is reached. These limits are effectively independent of the signal model but assume the width of the φ to be negligible relative to the experimental resolution (≈20% at M_A = 150 GeV). The four independent analysis channels are combined in the limit setting procedure. Signal hypotheses are considered for discrete Higgs boson mass points from 90 to 300 GeV in steps of 10 GeV. The treatment of the systematic uncertainties and the limit setting procedure were extensively cross-checked; no unexpected effects were observed.

[Figure caption fragment: Black crosses refer to data, the solid line shows the total background estimate, and the shaded region represents the heavy flavor component (bbb, bbc, and bcc). The lower panels show the difference between the data and the predicted background.]
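The logic of scaling the signal until 1 − CL_s = 0.95 can be illustrated with a deliberately simplified toy. The sketch below is a single-bin Poisson counting experiment with no systematic uncertainties, which is not what the analysis does (it uses the M_bb shape, floats the background normalisation, and fits nuisance parameters); the function names are hypothetical.

```python
import math

def poisson_cdf(n, mu):
    """P(N <= n) for a Poisson distribution with mean mu."""
    term = math.exp(-mu)
    total = term
    for k in range(1, n + 1):
        term *= mu / k
        total += term
    return total

def cl_s(n_obs, s, b):
    """CL_s = CL_{s+b} / CL_b for a counting experiment."""
    return poisson_cdf(n_obs, s + b) / poisson_cdf(n_obs, b)

def upper_limit(n_obs, b, target=0.05, s_max=100.0):
    """Signal yield at which CL_s falls to `target` (95% C.L. by default).
    Bisection; assumes the solution lies below s_max."""
    lo, hi = 0.0, s_max
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if cl_s(n_obs, mid, b) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

In this toy, observing zero events gives CL_s = exp(−s) regardless of the expected background, so the excluded yield is ln 20 ≈ 3 events; dividing by CL_b is what prevents a downward background fluctuation from producing an artificially strong limit.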
The combined result is summarized numerically in Table II and the model independent limit is shown in Fig. 5. The deviation from expectation around 120 GeV corresponds to 2.5 standard deviations. Note that it is more likely to find a deviation (in the background-only hypothesis) when several mass bins are probed than if only one bin is probed. A standard convention to account for this "trial factor" [25] gives a significance of the deviation at 120 GeV of 2.0 standard deviations.
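The effect of a trial factor on the quoted significances can be sketched numerically. The convention of Ref. [25] is not reproduced here; the sketch assumes one-sided Gaussian p-values and N independent trials, p_global = 1 − (1 − p_local)^N, and the function names are hypothetical.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()

def p_value(z):
    """One-sided Gaussian tail probability for a significance of z sigma."""
    return 1.0 - _STD_NORMAL.cdf(z)

def global_significance(z_local, n_trials):
    """Significance after correcting for n_trials independent trials."""
    p_global = 1.0 - (1.0 - p_value(z_local)) ** n_trials
    return _STD_NORMAL.inv_cdf(1.0 - p_global)

def implied_trials(z_local, z_global):
    """Effective number of independent trials linking the two significances."""
    return math.log(1.0 - p_value(z_global)) / math.log(1.0 - p_value(z_local))

n_eff = implied_trials(2.5, 2.0)
```

Under these assumptions, going from 2.5 to 2.0 standard deviations corresponds to an effective trial factor between 3 and 4, much smaller than the number of mass points, as expected when neighbouring mass hypotheses are strongly correlated by the mass resolution.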
As a consequence of the enhanced couplings to b quarks at large tan β, the total width of the Higgs boson also increases with tan β. This can have an impact on the search if the width is comparable to or larger than the experimental resolution. To take this effect into account, the width of the Higgs boson is calculated with feynhiggs [26] and included in the simulation as a function of the mass and tan β by convoluting a relativistic Breit-Wigner function with the NLO cross section. The masses and couplings of the Higgs bosons in the MSSM depend, in addition to tan β and M_A, on the SUSY parameters through radiative corrections. Limits on tan β as a function of M_A are derived for two particular scenarios assuming a CP-conserving Higgs sector [27]: the m_h^max [28] and no-mixing [29] scenarios with a negative or positive value of the Higgs sector bilinear coupling, µ. Figure 6 shows the result interpreted for these two scenarios in the case of µ = −200 GeV. Weaker limits are obtained for the µ > 0 scenarios, due to the decrease in the product of cross section and branching ratio for positive values of µ [27].
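One plausible reading of the width treatment is a Breit-Wigner-weighted average of a mass-dependent cross section. The sketch below uses the standard relativistic Breit-Wigner shape and a simple trapezoidal integration; the integration range, the function names, and the interpretation itself are assumptions, not the procedure actually implemented in the analysis.

```python
def relativistic_breit_wigner(m, mass, width):
    """Unnormalised relativistic Breit-Wigner line shape in the mass m."""
    return 1.0 / ((m * m - mass * mass) ** 2 + (mass * width) ** 2)

def bw_weighted_cross_section(sigma, mass, width,
                              half_range_in_widths=10.0, n_points=2001):
    """Average of sigma(m) weighted by the Breit-Wigner around `mass`.
    Requires width > 0; the range is truncated at +-10 widths by default."""
    lo = max(mass - half_range_in_widths * width, 0.0)
    hi = mass + half_range_in_widths * width
    step = (hi - lo) / (n_points - 1)
    num = 0.0
    den = 0.0
    for i in range(n_points):
        m = lo + i * step
        w = relativistic_breit_wigner(m, mass, width)
        if i in (0, n_points - 1):
            w *= 0.5  # trapezoidal rule end-point weights
        num += w * sigma(m)
        den += w
    return num / den
```

A call such as `bw_weighted_cross_section(lambda m: 1.0, 150.0, 10.0)` returns 1.0, confirming that the weighting leaves a mass-independent cross section unchanged.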
The results exclude substantial areas in the MSSM parameter space up to Higgs boson masses of 300 GeV, under the assumption that a perturbative treatment is valid over the entire region. These are the most stringent limits to date for this topology over this mass range at a hadron collider.
We thank the staffs at Fermilab and collaborating institutions, and acknowledge support from the DOE and NSF (USA); CEA and CNRS/IN2P3 (France); FASI, Rosatom and RFBR (Russia); CNPq, FAPERJ,

[Figure caption fragment: , µ = −200 GeV scenario, b) the lower limit for the no-mixing, µ = −200 GeV scenario. The one and two standard deviation bands around the expected limit and the exclusion limit obtained from the LEP experiments are also shown [4].]