Search for the standard model Higgs boson in tau lepton final states

We present a search for the standard model Higgs boson in final states with an electron or muon and a hadronically decaying tau lepton in association with zero, one, or two or more jets using data corresponding to an integrated luminosity of up to 7.3 fb^{-1} collected with the D0 detector at the Fermilab Tevatron collider. The analysis is sensitive to Higgs boson production via gluon gluon fusion, associated vector boson production, and vector boson fusion, and to Higgs boson decays to tau lepton pairs or W boson pairs. The observed (expected) 95% C.L. upper limits on the cross section times branching ratio, expressed as ratios to the standard model predictions, are 14 (22) at a Higgs boson mass of 115 GeV and 7.7 (6.8) at 165 GeV.


I. INTRODUCTION
The standard model (SM) of particle physics postulates a complex Higgs doublet field as the source of electroweak symmetry breaking, giving rise to non-zero masses of the vector bosons and fundamental fermions. The mass of the SM spin-zero Higgs boson, H, that survives after the symmetry breaking is not predicted, but is constrained by direct searches at the LEP [1], Tevatron [2] and LHC [3] colliders to be in the range 115–127 GeV at the 95% C.L. Precision measurements of W and Z boson and top quark properties [4] indicate a SM Higgs boson mass of m_H = 96^{+31}_{-24} GeV [5]. Collaborations have mainly focused on the decay modes H → bb in the low mass region and H → WW with both W bosons decaying to an electron or muon in the high mass region.
A previous D0 publication [6] reported a Higgs boson search in the tau lepton pair plus two jets final state, with one tau decaying to a muon and the other to hadrons, using 1.0 fb^{-1} of data. The CDF collaboration has recently reported a similar search in the tau lepton pair plus at least one jet final state [7]. In this Letter, we report the results of three searches involving the production of tau leptons that extend the previous results by adding more data, increasing the trigger efficiency, adding new search channels, and considering additional signal contributions. The final states used are: (i) µτ plus zero or one jet (denoted µτ 0), (ii) µτ plus two or more jets (µτ 2), and (iii) eτ plus two or more jets (eτ 2). The µτ 0, µτ 2, and eτ 2 analyses use data collected with the D0 detector [8] corresponding to integrated luminosities of 7.3, 6.2 and 4.3 fb^{-1}, respectively. The eτ 0 final state is not considered here as it suffers from large background and brings little increase in sensitivity.
The Higgs boson production processes considered are (i) gluon gluon fusion (GGF), gg → H (+ jets); (ii) vector boson fusion (VBF), qq → qqH; (iii) associated vector boson and Higgs boson production (VH), qq → V H, where V is a W or Z boson, and V → qq (or Z → νν in the case of µτ 0); and (iv) associated Higgs boson and Z boson production (HZ), qq → HZ, with H → bb and Z → τ τ . The GGF, VBF, and VH processes are further subdivided according to the Higgs boson decay, H → τ τ , H → W W , or (for the µτ 0 analysis) H → ZZ, and these subchannels are denoted as GGF τ τ , GGF W W or GGF ZZ , etc. The fractional decompositions of signal contributions expected from Monte Carlo (MC) simulations are shown in Fig. 1 for the Higgs boson production cross section and branching ratios, and the event selection requirements, discussed below. Tau leptons can occur either through direct decays of the Higgs boson (at low mass) or indirectly from H → V V with V decays to τ s (at high mass). The leptons may arise from τ decay or (at high mass) directly from V decay. The ℓτ channel is more uniformly sensitive to Higgs boson production over the full allowed mass range than are the dedicated H → bb or H → W W → ℓℓνν analyses, thus improving the sensitivity of a combination of searches, particularly in the intermediate mass region around 135 GeV. In the following, "τ " represents a hadronically decaying tau and "lepton (ℓ)" denotes e or µ.

II. TRIGGER
The µτ 0 and µτ 2 data were collected from the full suite of D0 triggers. The main contributors were the inclusive high transverse momentum muon, µ + jets, and µ + τ triggers. The trigger efficiency is determined in a two-step procedure starting from the measurement of the efficiency for inclusive muon triggers. This is measured using Z → µµ candidates and parameterized as a function of muon transverse momentum (p T ), pseudorapidity (η), azimuthal angle (φ), and instantaneous luminosity. We then determine the ratio of the yields of the full trigger suite relative to those for the inclusive muon triggers. For the µτ 0 analysis the ratio is parametrized as a function of p τ T while for the µτ 2 analysis it is a constant. The efficiency for Z → τ τ events for the full suite of triggers varies between 80% and 95%, and is about 40% larger than for the inclusive muon trigger.
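The two-step procedure above amounts to multiplying the parameterized inclusive-muon efficiency by the measured ratio of full-suite to inclusive-muon yields. The following Python sketch illustrates only that arithmetic; the function name and the numerical inputs are hypothetical and are not the D0 parameterizations.

```python
def full_trigger_efficiency(inclusive_eff, suite_ratio):
    """Efficiency of the full trigger suite from the two-step procedure.

    inclusive_eff : efficiency of the inclusive muon triggers for this event
                    (in the analysis a function of muon p_T, eta, phi and
                    instantaneous luminosity; here just a number).
    suite_ratio   : ratio of full-suite to inclusive-muon yields (a function
                    of tau p_T for mu-tau-0, a constant for mu-tau-2).
    """
    if not 0.0 <= inclusive_eff <= 1.0:
        raise ValueError("inclusive efficiency must lie in [0, 1]")
    if suite_ratio < 0.0:
        raise ValueError("yield ratio must be non-negative")
    # An efficiency cannot exceed unity, whatever the fitted ratio is.
    return min(1.0, inclusive_eff * suite_ratio)


# Hypothetical inputs: a 62% inclusive efficiency and a ratio of 1.4.
print(full_trigger_efficiency(0.62, 1.4))  # roughly 0.87
```

Clipping at unity is a design choice of this sketch, not something stated in the text.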
For the eτ 2 analysis, a set of calorimeter-based inclusive electromagnetic object triggers was used. The efficiency of these triggers, obtained from an analysis of Z → ee events selected with just one identified electron, is found to be about 85%.

III. BACKGROUND AND SIGNAL SAMPLES
The major backgrounds for the Higgs boson search are Z + jets, W + jets, tt, and multijet production (MJ) with misidentification of leptons or taus. Smaller backgrounds arise from boson (W, Z or γ) pair production and single top quark production. All but the MJ background are simulated using MC event generator programs and normalized to the highest available next-to-leading order (NLO) or next-to-NLO (NNLO) theoretical calculations. The MC simulations use the CTEQ6L1 parton distribution functions (PDF) [9].
The Z + jets and W + jets event samples are generated by ALPGEN [10], interfaced to PYTHIA [11] which provides initial and final state radiation and hadronization of the produced partons. The p_T^Z distribution is reweighted to agree with the D0 measurement [12]. The p_T^W distribution is also reweighted for the ℓτ 2 analyses using the reweighting factors derived for the p_T^Z distribution, multiplied by the ratio of the p_T^W to the p_T^Z distributions as predicted in NNLO QCD [13]. For the ℓτ 2 analyses, the absolute normalizations of the Z + jets and W + jets cross sections are taken from Ref. [14] using the MRST2004 NNLO PDFs [15]. The same Z + jets normalization is used for the µτ 0 analysis but the W + jets normalization is derived from data as discussed below.
We simulate tt and single top quark events using the ALPGEN and COMPHEP [16] generators respectively, with PYTHIA used to simulate hadronization effects. The normalizations are based on the approximate NNLO calculations [17]. The diboson events are generated by PYTHIA.
Higgs boson production is simulated using PYTHIA, with normalizations taken from Ref. [18]. We use HDECAY [19] and TAUOLA [20] to obtain the branching fractions of the Higgs boson and tau lepton, respectively.
All MC signal and background events are input to a GEANT3-based [21] simulation of the detector response and processed with the same reconstruction programs as used for data. Data events collected from random beam crossings are superimposed on the MC events to account for detector noise and pileup from additional pp collisions in the same or previous bunch crossings. Correction factors are applied to the simulated events to account for the trigger efficiencies and for the differences between MC and data for the lepton, tau, and jet identifications, and for the energy scale and resolution of jets.

IV. EVENT SELECTION CRITERIA
Muons selected for this analysis are required to have hits in the muon chambers before and after the toroidal magnets and to be matched to a track in the tracking system with p_T > 15 GeV and |η| < 1.6. Muon candidates are required to be isolated in both the calorimeter and the tracking system using the calorimeter transverse energy, E_T^{iso}, in the annular cone 0.1 < R < 0.4 around the muon, where R = \sqrt{(∆η)^2 + (∆φ)^2}, and the track transverse momentum sum, p_T^{iso} = Σ p_T^{track}, within a cone R < 0.5, excluding the p_T of the candidate muon. For the µτ 0 analysis, E_T^{iso} and p_T^{iso} must each be less than 15% of p_T^µ. For the µτ 2 analysis, E_T^{iso} and p_T^{iso} must each be less than 2.5 GeV. Muon candidates due to cosmic rays are rejected if the scintillation counters surrounding the detector indicate a time of arrival different by more than 10 ns from that expected for collision products.
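The isolation requirements can be illustrated with a short Python sketch. The data layout (tuples of transverse energy or momentum, η, φ), the function names, and the channel labels are assumptions made for illustration; only the cone sizes and thresholds come from the text.

```python
import math


def delta_r(eta1, phi1, eta2, phi2):
    """R = sqrt(d_eta^2 + d_phi^2), with d_phi wrapped into [-pi, pi]."""
    dphi = math.remainder(phi1 - phi2, 2.0 * math.pi)
    return math.hypot(eta1 - eta2, dphi)


def isolation_sums(muon, cal_towers, tracks):
    """Return (E_T^iso, p_T^iso) for a muon given as (pt, eta, phi).

    cal_towers : iterable of (et, eta, phi) calorimeter deposits.
    tracks     : iterable of (pt, eta, phi) tracks, with the muon's own
                 track already removed by the caller.
    """
    _, mu_eta, mu_phi = muon
    e_iso = sum(et for et, eta, phi in cal_towers
                if 0.1 < delta_r(mu_eta, mu_phi, eta, phi) < 0.4)
    p_iso = sum(pt for pt, eta, phi in tracks
                if delta_r(mu_eta, mu_phi, eta, phi) < 0.5)
    return e_iso, p_iso


def is_isolated(muon, cal_towers, tracks, channel):
    """Apply the channel-dependent isolation thresholds (in GeV)."""
    e_iso, p_iso = isolation_sums(muon, cal_towers, tracks)
    if channel == "mutau0":
        limit = 0.15 * muon[0]   # 15% of the muon p_T
    elif channel == "mutau2":
        limit = 2.5              # fixed 2.5 GeV threshold
    else:
        raise ValueError("unknown channel: " + channel)
    return e_iso < limit and p_iso < limit
```

Wrapping ∆φ matters for muons near φ = ±π, where a naive difference would place nearby deposits almost 2π away.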
Electrons are identified using a likelihood variable, L e , that uses as inputs the quality of the matching of the electromagnetic (EM) shower centroid to a track, the fraction of energy deposited in the EM section of the calorimeter (EMF), a measure of the probability that the energy deposit pattern in the calorimeter conforms to that expected for an electron, E iso T , and the separation along the beam axis of the electron track and the primary vertex (PV) [22]. The signal sample electrons are required to have L e > 0.85. Electron candidate tracks are required to have p T > 15 GeV and |η| < 1.1 or 1.5 < |η| < 2.5, and to impinge upon a module of the central EM calorimeter within the central 80% of its azimuthal range.
The selection of hadronically decaying tau leptons is done separately for three types based on the number of tracks within a cone R < 0.3 and the number of EM subclusters found in the calorimeter using a nearest neighbor algorithm. Type-1, patterned on the decay τ → π ν_τ, requires one track and no EM subclusters. Type-2, based on τ → ρ(π^± π^0) ν_τ, requires one track and at least one EM subcluster. Type-3, motivated by the τ → π^± π^± π^∓ (π^0) ν_τ decay, requires at least two tracks with or without EM subclusters. We reject type-3 candidates with exactly two tracks of opposite signs since the charge sign of the tau is then ambiguous. The τ transverse energy, E_T^τ, is defined as the visible transverse momentum of the τ decay products as measured by the calorimeter with appropriate energy scale corrections. The ratio of E_T^τ to the sum of the transverse momenta of the tracks associated with the tau, p_T^{trk}, is used to verify that the MC and data tau energy scales are the same. We require E_T^τ > (12.5, 12.5, 15) GeV, p_T^{trk} > (7, 5, 10) GeV, and p_T^{trk}/E_T^τ > (0.65, 0.5, 0.5) for τ types (1, 2, 3). The leading (highest p_T) track for type-3 τ candidates must have p_T > 7 GeV. A neural network, NN_τ [23], based on energy deposition patterns and isolation criteria in the calorimeter and tracking systems is constructed for each tau type to discriminate a τ from a misidentified jet. Lower bounds placed on NN_τ at 0.9, 0.9 and 0.95 for tau types 1, 2, and 3 select hadronically decaying taus with good purity. For type-2 τ leptons we discriminate taus from electrons using a second neural network, NN_{τ/e}, constructed using variables that characterize the longitudinal and transverse energy profiles in the calorimeter, the energy and position correlations between τ tracks and calorimeter energy deposits, and isolation of the calorimeter energy.
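The tau type assignment and the type-dependent thresholds can be summarized in a few lines of Python. This is a sketch under stated assumptions: the function names and input layout are hypothetical, the thresholds are treated as strict inequalities, and the NN_τ value is taken as a precomputed input.

```python
# Thresholds indexed by tau type (1, 2, 3), as quoted in the text.
MIN_ET = {1: 12.5, 2: 12.5, 3: 15.0}     # GeV, tau transverse energy
MIN_PTRK = {1: 7.0, 2: 5.0, 3: 10.0}     # GeV, summed track p_T
MIN_RATIO = {1: 0.65, 2: 0.5, 3: 0.5}    # p_T^trk / E_T^tau
MIN_NN = {1: 0.9, 2: 0.9, 3: 0.95}       # lower bound on NN_tau
MIN_LEADING_TRACK_PT_TYPE3 = 7.0         # GeV


def tau_type(track_charges, n_em_subclusters):
    """Return the tau type (1, 2 or 3), or None if the candidate is rejected."""
    n_tracks = len(track_charges)
    if n_tracks == 1:
        return 1 if n_em_subclusters == 0 else 2
    if n_tracks == 2 and sum(track_charges) == 0:
        return None  # two opposite-sign tracks: ambiguous tau charge
    if n_tracks >= 2:
        return 3
    return None      # no associated track


def passes_tau_selection(et, track_pts, track_charges, n_em_subclusters, nn_tau):
    """Apply the kinematic and NN_tau requirements for the candidate's type."""
    kind = tau_type(track_charges, n_em_subclusters)
    if kind is None or et <= 0.0:
        return False
    p_trk = sum(track_pts)
    if kind == 3 and max(track_pts) <= MIN_LEADING_TRACK_PT_TYPE3:
        return False
    return (et > MIN_ET[kind]
            and p_trk > MIN_PTRK[kind]
            and p_trk / et > MIN_RATIO[kind]
            and nn_tau > MIN_NN[kind])
```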
Jets are selected using an iterative midpoint cone algorithm [24] with a cone size R = 0.5. We require at least two tracks associated with the jet that point to the PV. Jet energies are corrected to the particle level for out-of-cone showering, underlying event energy deposits and pileup, and the estimated missing energy in jets with identified semileptonic decays of a hadron. The energy scale, resolution, and jet identification efficiency for MC jets are corrected to give agreement with data. For the quark-dominated MC samples (tt and diboson), there is an additional correction of the jet energy that accounts for the differences in the responses of quark jets and the dominantly gluon jets for which the jet energy scale correction was obtained. The µτ 2 and eτ 2 analyses require at least two jets with |η jet | < 3.4 and p jet T > 20 (15) GeV for the leading (other) jet. The µτ 0 analysis imposes these jet p T requirements as a veto to ensure that the selected samples have no events in common.
The missing transverse energy, / E T , is computed from the observed transverse energy deposits in the calorimeter and is adjusted for the appropriate energy scale corrections for all objects and for isolated muons observed in the event.
For the final event selection, all three analyses require exactly one isolated lepton and a hadronic tau with opposite charges. The separations between all pairs of lepton, tau, and jet are required to be R > 0.5. For the µτ 0 analysis, events are required to have only one τ, and the smaller of the two transverse masses, m_T = \sqrt{2 E_T^{lepton} \slashed{E}_T (1 − \cos ∆φ)} (where "lepton" = τ or µ, \slashed{E}_T is the missing transverse energy, and ∆φ is the azimuthal angle between the lepton and \slashed{E}_T), must exceed 25 GeV to suppress the Z + jets and MJ backgrounds, while retaining about 80% of the signal. For the eτ 2 analysis, substantial backgrounds arise from Z + jets production with Z → ee where an electron is misidentified as a type-2 τ. To reduce these, we remove τ candidates in the region 1.1 < |η| < 1.5 where the calorimetry has impaired electron identification. Further Z + jets rejection is obtained by requiring type-2 τ candidates to have NN_{τ/e} > 0.95 to suppress electrons that resemble the track + EM cluster signature. This cut retains more than 80% of type-2 τ candidates while rejecting about 90% of the electrons [25]. We reject type-2 τ candidates which point near the edge in φ of an EM module in the central calorimeter where the EM response is impaired. In addition, type-3 τ candidates with EMF > 0.95 are excluded. The MJ background in the eτ 2 analysis is suppressed by requiring S > 1, where S is a measure of the significance for \slashed{E}_T to differ from zero [26].
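The transverse mass requirement of the µτ 0 analysis is simple to evaluate. The sketch below assumes each object is given as a hypothetical (E_T, φ) pair; only the formula and the 25 GeV threshold are taken from the text.

```python
import math


def transverse_mass(et, met, dphi):
    """m_T = sqrt(2 * E_T * MET * (1 - cos(dphi))), all energies in GeV."""
    return math.sqrt(2.0 * et * met * (1.0 - math.cos(dphi)))


def passes_mt_cut(muon, tau, met, met_phi, threshold=25.0):
    """Require the smaller of m_T(mu, MET) and m_T(tau, MET) to exceed threshold.

    muon, tau : (E_T, phi) pairs; met, met_phi : missing E_T and its azimuth.
    """
    masses = [transverse_mass(et, met, phi - met_phi) for et, phi in (muon, tau)]
    return min(masses) > threshold
```

Because the cosine is periodic, the azimuthal difference does not need to be wrapped into [−π, π] before use. A lepton back-to-back with the missing transverse energy gives the largest m_T, while a lepton aligned with it gives m_T near zero.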

V. BACKGROUNDS DERIVED FROM DATA
The MJ background arising from misidentification of leptons or taus by the detector reconstruction algorithms is difficult to simulate, so for each analysis, the MJ background is taken from data. The general method for all analyses is similar: we define a sample of MJ-enriched events, M, from which residual backgrounds simulated by MC are subtracted, to provide the shapes of the MJ kinematic distributions. The number of MJ events in the signal sample is obtained by multiplying the MJ yield in a signal-like sample N by a scale factor ρ i , obtained from the M sample for each of the tau types, i. The ρ i factors provide the estimate for the differences in the MJ background normalization between the N and signal samples, based on the M sample, and are in all cases within 10% of unity.
For the µτ 0 channel, the sample M is obtained by requiring m T (µ, / E T ) < 30 GeV and NN τ < 0.2, and the ρ i are the ratios of isolated to non-isolated lepton events in M, and are parameterized as a function of p τ T , N jets , / E T , and p µ T . These factors scale the MJ fraction of the sample N , selected as for the signal sample except that the muon is required to be non-isolated, to obtain the MJ normalization in the isolated lepton signal sample. An alternate MJ-enriched sample is defined by NN τ < 0.2 and m T (µ, / E T ) < 30 GeV, in which the τ and µ have the same charge sign, for estimating the MJ background uncertainty.
For the µτ 2 analysis, the MJ sample M is obtained by reversing at least one of the muon isolation requirements and requiring 0.3 < NN τ < 0.8. The MJ fraction of this sample is 94% before the MC-simulated background subtraction. The ρ i factors are the ratio of opposite charge sign (OS) and same charge sign (SS) µ − τ pairs in M and are used to scale the MJ component of the sample N selected as for the signal sample except that we require SS µ and τ . The ρ i show no significant dependence on the kinematic variables.
For the eτ 2 analysis, M is obtained by requiring the electron to satisfy an orthogonal loose electron selection, 0 < L e < 0.85, and 0.3 < NN τ < 0.9. The MJ fraction of this sample is 96% before the MC-simulated background subtraction. The ρ i are obtained from the OS and SS M sample and are applied to the MJ component of the N sample as in the µτ 2 analysis. The ρ i show no significant dependence on kinematic variables. Alternate MJ-enriched samples, in which either the τ or lepton selections (but not both) are reversed, are defined for estimating the MJ background uncertainties in both ℓτ 2+ analyses.
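For the ℓτ 2 analyses, the MJ normalization described above reduces to an OS/SS ratio measured in the M sample applied to the MC-subtracted yield of the same-sign sample N. A minimal sketch follows, with hypothetical function names and made-up event counts; it would be evaluated once per tau type.

```python
def os_ss_ratio(n_os_enriched, n_ss_enriched):
    """rho_i: ratio of opposite-sign to same-sign yields in the MJ-enriched
    sample M (after subtracting MC-simulated backgrounds)."""
    if n_ss_enriched <= 0:
        raise ValueError("same-sign yield must be positive")
    return n_os_enriched / n_ss_enriched


def mj_yield_in_signal_sample(n_data_ss, n_mc_ss, rho):
    """Scale the MJ component of the same-sign sample N by rho_i.

    The MJ component is the data yield minus the MC-simulated backgrounds;
    a negative difference is treated as zero MJ events in this sketch.
    """
    mj_ss = max(0.0, n_data_ss - n_mc_ss)
    return rho * mj_ss


# Made-up numbers for one tau type, for illustration only.
rho = os_ss_ratio(1050.0, 1000.0)                    # 1.05, within 10% of unity
print(mj_yield_in_signal_sample(120.0, 35.0, rho))   # about 89 events
```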
For the µτ 0 analysis, the dominant background is from W + jets with the muon from W decay and a jet misidentified as a tau. Both the normalization of the W + jets sample and the misidentification probability are difficult to model adequately, so the simulation is corrected using a data-driven method [27]. The jet produced in association with a W boson has a charge that is correlated differently with the W boson charge for quarks and gluons. Furthermore, the probability for a jet misidentified as a tau to have the same charge sign as its progenitor parton varies with NN τ . We determine a weight for W + jets MC events that depends on the charge correlation between the muon and recoil parton and on the value of NN τ .

VI. EVENT YIELDS
The numbers of data and expected background events are given in Table I for the µτ 0, µτ 2, and eτ 2 analyses.

VII. MULTIVARIATE ANALYSIS
The expected number of events for Higgs boson signal processes is small in comparison to the backgrounds shown in Table I. For example, the expected signal yields at m_H = 165 GeV are 5.2, 1.7 and 0.3 events for the µτ 0, µτ 2 and eτ 2 analyses, respectively. The corresponding yields at m_H = 115 GeV are 0.9, 1.6 and 0.4 events. We thus employ multivariate techniques that utilize both the magnitudes of the variables and the correlations among them to separate the signal from the backgrounds. We choose well-modeled variables that have the capability to distinguish between at least one signal and one background as shown in Table II. Figure 2 shows distributions for representative variables that offer significant discrimination of signal and background for each of the channels. For calculating the ττ invariant mass shown in Fig. 2(c), the / E T is apportioned to the neutrinos from the two postulated tau leptons by decomposing the / E T vector into components associated with the observed lepton and hadronic tau [28].

Table II. Variables used in the multivariate analyses. / HT is the missing transverse energy computed from the jets in the event.

    Variable
    ET invariant mass
    ττ invariant mass [28]
    Dijet invariant mass
    ∆R between leading jets
    ∆η between leading jets
    Asymmetry between / ET and / HT
    ∆φ between ℓ and τ
    ∆θ between ℓ and τ
    ∆φ between ℓ and / ET
    ∆φ between τ and / ET
    ∆φ between / ET from calorimeter and tracks
    Cosine of angle between ℓ and beam direction
    Minimum ∆φ between / ET and a jet
    Missing ET significance, S
    NNτ

The µτ 0 analysis uses neural networks [29] (NN_H) trained to discriminate between all backgrounds and all signals for 115 ≤ m_H ≤ 200 GeV in 5 GeV increments. Type-2 τ samples are trained separately, while the τ types 1 and 3 are combined for training to increase statistics. The NN_H distributions are binned in 21 equal sized bins for 0 < NN_H < 1.05.
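For the ττ invariant mass described above, one common way to apportion the missing transverse energy between the two tau decays is the collinear approximation, in which each neutrino system is assumed to travel along its visible decay product. Whether Ref. [28] uses exactly this form is an assumption; the Python sketch below, with hypothetical names and massless visible products, is meant only to illustrate the decomposition.

```python
import math


def collinear_mass(lepton, tau, met_x, met_y):
    """Collinear-approximation di-tau mass (GeV), or None if unphysical.

    lepton, tau : (pt, eta, phi) of the visible decay products, treated as
                  massless; met_x, met_y : missing transverse energy components.
    """
    pt1, eta1, phi1 = lepton
    pt2, eta2, phi2 = tau
    # Solve MET = a * u1 + b * u2, with u1, u2 the transverse unit vectors
    # of the visible lepton and hadronic tau (Cramer's rule).
    det = math.cos(phi1) * math.sin(phi2) - math.sin(phi1) * math.cos(phi2)
    if abs(det) < 1e-6:
        return None  # visible products (anti)parallel in azimuth: no solution
    a = (met_x * math.sin(phi2) - met_y * math.cos(phi2)) / det
    b = (met_y * math.cos(phi1) - met_x * math.sin(phi1)) / det
    if a < 0.0 or b < 0.0:
        return None  # neutrino momenta must point along the visible products
    # Fractions of each tau momentum carried by its visible products.
    x1 = pt1 / (pt1 + a)
    x2 = pt2 / (pt2 + b)
    m_vis_sq = 2.0 * pt1 * pt2 * (math.cosh(eta1 - eta2) - math.cos(phi1 - phi2))
    return math.sqrt(m_vis_sq / (x1 * x2))
```

Events for which the decomposition has no physical solution return None in this sketch; how such events are treated in the analysis is not stated in the text.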
The µτ 2 and eτ 2 analyses use boosted decision trees (BDT) [29] trained for all signals against the sum of all backgrounds, with all τ types combined for Higgs boson masses 105 ≤ m H ≤ 200 GeV in 5 GeV steps. The BDT output is binned in 15 bins spanning −1 < BDT < 1 with a non-uniform binning to assure sufficiently small statistical uncertainty in the predicted backgrounds within any bin. We smooth the effects of signal MC statistics by averaging BDT distributions for m H with the neighboring distributions at (m H − 5) GeV and (m H + 5) GeV with weights of 50%, 25%, and 25% respectively. Figure 3 shows the NN H distribution for the µτ 0 analysis at m H = 165 GeV and the averaged BDT distributions for the ℓτ 2 analyses at m H = 150 GeV, where the sensitivities are maximal.
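The mass-point averaging of the BDT distributions can be sketched as follows. The dictionary layout and function name are hypothetical. The treatment of the first and last mass points, where only one neighbor exists, is not specified in the text; this sketch renormalizes the remaining weights, which is an assumption.

```python
def smooth_bdt_templates(templates, step=5):
    """Average each signal BDT histogram with its neighbors in m_H.

    templates : dict mapping m_H (GeV) to a list of bin contents, with the
                same binning at every mass point.
    Weights are 50% for the central mass and 25% for each neighbor at
    m_H - step and m_H + step.
    """
    smoothed = {}
    for mass, hist in templates.items():
        weighted = [(0.5, hist)]
        for neighbor in (mass - step, mass + step):
            if neighbor in templates:
                weighted.append((0.25, templates[neighbor]))
        # Renormalize when a neighbor is missing (assumed edge treatment).
        norm = sum(weight for weight, _ in weighted)
        smoothed[mass] = [
            sum(weight * h[i] for weight, h in weighted) / norm
            for i in range(len(hist))
        ]
    return smoothed


# Toy two-bin histograms at three neighboring mass points.
toy = {105: [4.0, 0.0], 110: [0.0, 4.0], 115: [4.0, 4.0]}
print(smooth_bdt_templates(toy)[110])  # [2.0, 3.0]
```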

VIII. SYSTEMATIC UNCERTAINTIES
A large number of systematic uncertainties have been considered, typically broken down separately for each analysis channel, tau type, background or signal process, or Higgs boson mass. The luminosity and trigger uncertainties are obtained from separate analyses of D0 data. The lepton, tau, and jet energy scale, resolution, and identification uncertainties are obtained from special control samples. Uncertainties in the MC-simulated background cross section normalizations and shapes are obtained using theoretical uncertainties, and the extent to which special data samples enriched in each background process agree with MC predictions. The MJ background uncertainties are determined by comparing the alternate MJ-enriched samples with the results obtained with the nominal choice. Signal cross section uncertainties are obtained from theoretical estimates and include the effect of PDF uncertainties. For each source, the impact on the final variable (NN H or BDT) distribution is assessed by changing the nominal values of a parameter by ±1 s.d. Some of the uncertainties affect only the normalization of the final variable distribution and some also modify its shape. Table III summarizes the systematic uncertainties. Many entries comprise several subcategories. For example, the jet reconstruction uncertainty includes the effects of jet identification, confirmation that the tracks within the jet arise from the PV, jet resolution and jet energy scale. Moreover, these elements of the jet reconstruction uncertainty are computed separately for different background processes and hypothesized Higgs mass values in each analysis channel. The dominant systematic uncertainties are due to the V +jets and MJ backgrounds, with significant contributions from jet reconstruction and modelling for the ℓτ 2 analyses.

IX. CROSS SECTION LIMITS
We observe no excess of events over that expected from backgrounds in Fig. 3. We therefore obtain upper limits on the Higgs boson cross section for each analysis from the final multivariate outputs using the modified frequentist method [30], using a negative log likelihood ratio (LLR) for the background only and signal+background hypotheses as the test statistic. For the µτ 0 analysis, each tau type is input separately to the limit setting calculation for Higgs boson masses from 115 to 200 GeV in 5 GeV steps. The ℓτ 2 calculation uses the BDTs summed over tau type for m H values from 105 to 200 GeV in 5 GeV steps, averaged over neighboring mass bins as described above.
The impact of systematic uncertainties on the limits is minimized by maximizing a likelihood function [31] in which these uncertainties are constrained to Gaussian priors. The value of the Higgs boson cross section is adjusted in each limit calculation until the value of CL_s reaches 0.05, corresponding to the 95% C.L., where CL_s = CL_{s+b}/CL_b and CL_{s+b} (CL_b) are the probabilities for the negative LLR value observed in simulated signal+background (background) pseudo-experiments to be less than that observed in our data. The limits obtained are summarized in Table IV. We combine the information from the three channels by recomputing the LLR and limits for the three analyses together, now also including the limits from the previous independent µτ 2 analysis using 1 fb^{-1} [6]. In this calculation, the systematic uncertainties across the different analyses are appropriately correlated (e.g. the Z + jets normalization for all channels is the same). The fully combined LLR distributions and the 95% C.L. limits as a function of m_H are shown in Figs. 4 and 5. The combined limits are also shown in Table IV.
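The CL_s construction can be illustrated with a toy Python calculation. It is a sketch under several assumptions: independent Poisson bins, no systematic uncertainties (so no profiling), the convention LLR = −2 ln(L_{s+b}/L_b) with CL_{s+b} and CL_b taken as the fractions of pseudo-experiments at least as background-like as the data, and a simple upward scan of the signal scale. All function names are hypothetical.

```python
import math
import random


def llr(n, s, b):
    """LLR = -2 ln(L_{s+b} / L_b) for independent Poisson-distributed bins."""
    return 2.0 * sum(si - ni * math.log1p(si / bi) for ni, si, bi in zip(n, s, b))


def poisson(rng, mean):
    """Draw a Poisson variate (Knuth's method; adequate for small means)."""
    limit = math.exp(-mean)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def cls(n_obs, s, b, n_toys=2000, seed=1):
    """CL_s = CL_{s+b} / CL_b estimated from pseudo-experiments."""
    rng = random.Random(seed)
    observed = llr(n_obs, s, b)

    def fraction_at_least_as_background_like(means):
        hits = 0
        for _ in range(n_toys):
            toy = [poisson(rng, m) for m in means]
            if llr(toy, s, b) >= observed:
                hits += 1
        return hits / n_toys

    cl_sb = fraction_at_least_as_background_like([si + bi for si, bi in zip(s, b)])
    cl_b = fraction_at_least_as_background_like(b)
    return cl_sb / cl_b if cl_b > 0.0 else float("nan")


def limit_scale(n_obs, s, b, step=0.1, max_scale=50.0):
    """Smallest scanned multiple of the signal for which CL_s < 0.05."""
    scale = step
    while scale <= max_scale:
        if cls(n_obs, [scale * si for si in s], b) < 0.05:
            return scale
        scale += step
    return None
```

For a single bin with 10 observed events, 10 expected background events and 5 expected signal events, `limit_scale([10], [5.0], [10.0])` returns the signal multiple excluded at the 95% C.L. in this toy setup, which is analogous to the ratio-to-SM limits quoted in the text.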
In summary, we have searched for the SM Higgs boson in final states involving an electron or muon and a hadronically decaying tau. We set 95% C.L. limits on the Higgs boson production cross section which are 21.8 and 6.8 times those expected in the SM for Higgs boson masses of 115 and 165 GeV, respectively.
We thank the staffs at Fermilab and collaborating institutions, and acknowledge support from the DOE and NSF (USA); CEA and CNRS/IN2P3 (France); MON, Rosatom and RFBR (Russia); CNPq, FAPERJ,