Confirming the LHC Higgs Discovery with WW

We investigate the prospects of observing a neutral Higgs boson decaying into a pair of $W$ bosons (one real and the other virtual), followed by the $W$ decays into $qq' \ell\nu$ or $jj\ell\nu$ at the CERN Large Hadron Collider (LHC). Assuming that the missing transverse energy comes solely from the neutrino in $W$ decay, we can reconstruct the $W$ masses and then the Higgs mass. At the LHC with a center of mass energy ($\sqrt{s}$) of 8 TeV and an integrated luminosity ($L$) of 25 fb$^{-1}$, we can potentially establish a $6\sigma$ signal. A $5\sigma$ discovery of $H \to WW^* \to jj\ell\nu$ for $\sqrt{s} = 14$ TeV can be achieved with $L = $ 6 fb$^{-1}$. The discovery of $H \to WW$ implies that the recently discovered new boson is a CP-even scalar if its spin is zero. In addition, this channel will provide a good opportunity to study the $HWW$ coupling.


I. INTRODUCTION
Recently, searches for a Standard Model (SM) Higgs boson at both the ATLAS and CMS experiments have furnished compelling evidence for a new particle consistent with a Higgs boson, having a mass near 125 GeV [1,2]. Both collaborations report an apparent excess of events in the $\gamma\gamma$ channel and the $ZZ \to 4\ell$ channel. At the same time, searches in the channel $WW \to \ell\nu\ell\nu$ have excluded a SM Higgs boson at the 95% confidence level for masses above 129 GeV and combined searches exclude from 110 to 122.6 GeV [3,4]. Results from LEP II preclude the mass region below 114 GeV [5]. The LHC is now taking data for its 2012 run at a center of mass (CM) energy of 8 TeV, after which it is planned to be offline for a year before an upgraded run in 2014.
With the detection of this new particle, the Higgs program at the LHC moves into a new phase of testing to determine the properties of the particle in as much detail as possible. For this mass range, the $\gamma\gamma$ and $ZZ \to 4\ell$ channels should continue to provide the best mass resolution. However, it is worthwhile to consider all potential channels which can make any significant contribution, both to refine our results with additional data and to test the consistency of any discovery with the Standard Model or its variations.
Towards this end we consider the potential for detection of the Higgs decaying to $W^+W^-$, where one $W$ decays hadronically and one leptonically, $H \to WW^* \to qq'\ell\nu$ or $jj\ell\nu$. This channel has previously been considered for the Tevatron and the LHC, but generally not for such a low Higgs mass [6,7]. The ATLAS collaboration has released the results of a search in this channel with 4.7 fb$^{-1}$ of data at 7 TeV, focusing on the 300 to 600 GeV mass range, for which they find no significant excess, although this channel by itself does not yet exclude the expected Standard Model cross-section [8]. CMS finds no evidence for a Higgs in the range 170 to 600 GeV and excludes a SM Higgs for 230-480 GeV [9]. In this letter we will consider the lower mass range consistent with the announced discovery.
The $jj\ell\nu$ signal presents some difficulties in this relatively low mass range. First, it is clearly well below the nominal $WW$ production threshold, so that a resonantly produced Higgs boson will decay with at least one $W$ far below its mass shell. This means that $WW$ is not the leading branching fraction and our signal is smaller than it would be at higher masses. On the other hand, as we shall see, this far-off-shell case presents some kinematic characteristics which can help distinguish it from the backgrounds.
The second problem is that introducing jets in our signal inevitably involves dealing with large QCD backgrounds. As mentioned above, $WW \to \ell\nu\ell\nu$ is currently being searched and has already yielded strong upper limits on a SM Higgs. This channel has the advantage of having primarily backgrounds from weak interactions. On the other hand, the presence of two neutrinos limits our ability to reconstruct the event kinematics. Allowing one of the $W$s to decay hadronically means we must contend with the large $Wjj$ background, but has the advantage of including only one neutrino in the signal.
Although the single neutrino still presents us with an unmeasured momentum, we can determine its components as described in our analysis and identify a characteristic mass peak near the physical Higgs mass nonetheless. We estimate the rates for the signal and the background with appropriate cuts and show that this channel ($H \to WW^* \to jj\ell\nu$) can contribute a $6\sigma$ statistical significance by itself with $\sqrt{s} = 8$ TeV and an integrated luminosity ($L$) of 25 fb$^{-1}$ for the 2012 running in each experiment (ATLAS or CMS) at the LHC. We also find that an independent $5\sigma$ discovery of a 125 GeV Higgs boson in this channel can be achieved for the design CM energy of 14 TeV with $L = 6$ fb$^{-1}$.
Additionally, we consider a proposal by Sullivan and Menon to augment this channel with the development of c-tagging algorithms [10]. We show that with ideally perfect c-tagging, one could potentially increase the significance of the signal in 2012 data to 9.7σ. With modest assumptions for c-tagging performance we find only marginal improvements to the statistical significance, although the ratio of signal to background would be improved.
In Section II, we describe the characteristics of the signal and the background for H → W W * → jjℓν. Section III presents our strategy to reconstruct the Higgs signal for the final state with one neutrino. Sections IV and V describe details of our simulations and acceptance cuts. Promising results are shown in Section VI, and prospects with c-tagging are discussed in Section VII. Optimistic conclusions are drawn in Section VIII.

II. SIGNAL AND BACKGROUND CHARACTERISTICS
For a typical signal event of $H \to WW^* \to jj\ell\nu$, a Higgs particle near its mass shell (in the 125 GeV region) will decay into two $W$ bosons. One of these will be essentially on-shell while the other will be highly virtual, with an invariant mass roughly equal to 40 GeV. Either the hadronically or leptonically decaying $W$ may be the on-shell particle and the events are approximately evenly distributed between these two cases.
The dominant physics background for our signal is $Wjj$ where the jets are produced by QCD processes. Thus, before any selection cuts, the background is typically an on-shell leptonically decaying $W$ and a pair of jets which can fake a second $W$ whether real or virtual. The dijet invariant mass distribution is, to first order, a smoothly falling function for the QCD background. Hence, to minimize background we select events with a higher dijet invariant mass and a leptonic invariant mass which is far from the on-shell $W$ mass. Therefore we will concentrate on the half of the signal with an on-shell $W$ decaying into two jets and a virtual leptonically decaying $W$.
The neutrino momentum is not directly measured, so we must make some assumptions to reconstruct the leptonic $W$ or the Higgs invariant mass. Previous analyses have often used the assumption that the neutrino comes from an on-shell $W$, which is not suitable for our case. We will assume that the transverse neutrino momentum can be approximated by the missing transverse energy ($\not{E}_T$), computed from the sum of all detected particles. The Higgs invariant mass can be approximately located by using the cluster transverse mass [6], defined as
$$M_C^2 = \left( \sqrt{p_T^2(jj\ell) + M_{jj\ell}^2} + \not{E}_T \right)^2 - \left( \vec{p}_T(jj\ell) + \vec{\not{E}}_T \right)^2 ,$$
where $M_{jj\ell}$ is the invariant mass of the 2-jet plus charged lepton system and $\vec{p}_T(jj\ell)$ is its transverse momentum. This quantity can be understood as the invariant mass constructed from the known momenta (assuming $\not{E}_T$ for the transverse neutrino momentum) with the longitudinal neutrino momentum chosen so as to minimize it. Equivalently, it corresponds to the invariant mass at an endpoint in the physically allowed parameter space with real momenta. We will use this principle to reconstruct the neutrino's longitudinal momentum as detailed in Section III.
The cluster transverse mass is particularly useful in this scenario because of the low Higgs mass in comparison to the real $WW$ diboson mass. For the signal, the actual invariant mass of the $jj\ell\nu$ system is typically near the minimum value allowed by the visible particles plus $\not{E}_T$. As we raise cuts on the energy of the jets or the leptons, so long as we do not move beyond the range of energies the signal can produce, the $jj\ell\nu$ invariant mass will still be at the relatively low mass typical of the Higgs resonance, and $M_C$ will be a good approximation to the actual mass. For the background, higher cuts on the produced particles will favor a higher $M_C$ since there is no resonance which keeps a low minimum invariant mass as a physical solution.
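As a rough illustration of the cluster transverse mass, the following Python sketch evaluates $M_C$ from the summed four-momentum of the visible $jj\ell$ system and the missing transverse momentum vector. It assumes the standard definition $M_C^2 = (\sqrt{p_T^2 + M^2} + \not{E}_T)^2 - |\vec{p}_T + \vec{\not{E}}_T|^2$ with a massless neutrino. The function name and tuple layout are choices made for this example only.

```python
import math


def cluster_transverse_mass(vis, met):
    """Cluster transverse mass of a visible system plus missing E_T.

    vis: (E, px, py, pz) of the summed jj+lepton system, in GeV.
    met: (mex, mey) missing transverse momentum vector, in GeV.
    """
    e, px, py, pz = vis
    mex, mey = met
    # Invariant mass squared of the visible system (clipped at zero
    # to guard against rounding for nearly massless inputs).
    m_vis_sq = max(e * e - px * px - py * py - pz * pz, 0.0)
    # Transverse energy of the visible system.
    et_vis = math.sqrt(px * px + py * py + m_vis_sq)
    met_mag = math.hypot(mex, mey)
    mc_sq = (et_vis + met_mag) ** 2 - ((px + mex) ** 2 + (py + mey) ** 2)
    return math.sqrt(max(mc_sq, 0.0))
```

With vanishing missing transverse energy the function returns the visible invariant mass, which is a quick sanity check on the definition.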
The signal also has a characteristic spin structure which we consider as a potential discriminant against background [11]. In our Higgs signal, the Higgs boson is a scalar decaying into vector bosons with opposite spins and those W bosons couple only to left-handed particles in their decays. As a result, the up-type quark coming from the decay of one W will tend to be aligned with the charged lepton coming from the other, while the down-type quark will tend to be aligned with the neutrino. In general we do not know which jet originates from the up-type quark, but we can still try to select for events where one jet is aligned and the other anti-aligned with respect to the charged lepton.
This phenomenon can be characterized by the angles $\phi$, $\theta_j$ and $\theta_\ell$. $\phi$ is defined as the angle between the $\ell\nu$ and the $jj$ decay planes in the rest frame of the Higgs. $\theta_j$ is the angle in the rest frame of the hadronically decaying $W$ between the leading jet (in energy) and the direction of boost from the Higgs rest frame. $\theta_\ell$ is similarly defined, with the charged lepton in place of the leading jet. The signal is maximized for $\phi \simeq 0, \pi$ and for $\theta_j, \theta_\ell \simeq \pi/2$.

III. SIGNAL RECONSTRUCTION
As discussed above, we can use the cluster transverse mass $M_C(H)$ to approximate the resonance peak of the Higgs. Assuming the neutrino transverse momentum $k_T$ can be identified with the missing transverse energy for the event, this is equivalent to choosing the longitudinal momentum of the neutrino as
$$k_z = k_T \, \frac{p_z^{\rm vis}}{\sqrt{E_{\rm vis}^2 - (p_z^{\rm vis})^2}} ,$$
where $E_{\rm vis}$ and $\vec{p}_{\rm vis}$ are the energy and 3-momentum of the sum over the three visible particles $j$, $j'$ and $\ell$. The same concept can be applied to the transverse mass $M_T(W) = M_T(\ell, \not{E}_T)$ often used with leptonically decaying $W$'s:
$$M_T^2(W) = \left( p_T(\ell) + \not{E}_T \right)^2 - \left( \vec{p}_T(\ell) + \vec{\not{E}}_T \right)^2 ,$$
where the corresponding longitudinal neutrino momentum is
$$k_z = k_T \, \frac{p_z(\ell)}{\sqrt{E_\ell^2 - p_z^2(\ell)}} .$$
This method of assigning neutrino momentum is essentially the same as in the modified MAOS (MT2 Assisted On-Shell) method detailed in Ref. [12] for use with two invisible particles.
Since we expect a low invariant mass ($M_{\ell\nu}$) for the virtual $W$ from Higgs decay, $M_T(W)$ and its associated $k_z$ value can also work as a signal discriminant. Let us call the longitudinal momentum of the neutrino in the first scheme above $k_z(H)$ and in the second $k_z(W)$. In general $k_z(H)$ will perform slightly better for reconstructing the Higgs mass near its true peak and $k_z(W)$ is slightly better for $W$ reconstruction, particularly at higher values of $M_{jj}$.
We have also considered an intermediate, weighted case, $K_z$, using a prescription which approximately minimizes the product $M_H \times M_W$ when used in reconstructions. In practice, after cuts to select the mass peaks, there are only small differences in the distributions resulting from using $k_z(H)$, $k_z(W)$ or $K_z$. In our analysis, we assign the longitudinal momentum of the neutrino according to $k_z(H)$, which appears to give us slightly better signal discrimination than the other options. $k_z(H)$ gives the sharpest edge to the Higgs mass peak and also performs well for the $W$ reconstruction in the far-below-shell region we are selecting.
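The two assignments $k_z(H)$ and $k_z(W)$ can be sketched with a single helper, since both choose the neutrino longitudinal momentum that minimizes the invariant mass of a visible system plus a massless neutrino. The code below is a minimal illustration under that assumption: the closed form $k_z = k_T\, p_z / \sqrt{E^2 - p_z^2}$ is derived from the minimization condition, and all function names are hypothetical. The weighted $K_z$ prescription is not reproduced here.

```python
import math


def kz_min_mass(p, met):
    """Neutrino k_z minimizing the invariant mass of p + neutrino.

    p:   (E, px, py, pz) of the visible system, in GeV.
    met: (kx, ky), taken as the neutrino transverse momentum.
    """
    e, _, _, pz = p
    k_t = math.hypot(*met)
    return k_t * pz / math.sqrt(e * e - pz * pz)


def invariant_mass_with_neutrino(p, met, kz):
    """Invariant mass of p plus a massless neutrino (kx, ky, kz)."""
    e, px, py, pz = p
    kx, ky = met
    e_nu = math.sqrt(kx * kx + ky * ky + kz * kz)
    m_sq = (e + e_nu) ** 2 - (px + kx) ** 2 - (py + ky) ** 2 - (pz + kz) ** 2
    return math.sqrt(max(m_sq, 0.0))


def kz_higgs(vis_jjl, met):
    """k_z(H): minimize the jj+lepton+neutrino invariant mass."""
    return kz_min_mass(vis_jjl, met)


def kz_w(lepton, met):
    """k_z(W): minimize the lepton+neutrino invariant mass."""
    return kz_min_mass(lepton, met)
```

Evaluating `invariant_mass_with_neutrino` at `kz_higgs(...)` should reproduce the cluster transverse mass of the same inputs, and any other value of `kz` should give a larger mass.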

IV. EVENT SIMULATIONS
We perform Monte Carlo simulations for the signal and the background events using the MadGraph5 package [13]. Our typical signal jets do not have particularly high momentum so we are sensitive to contamination from initial-state radiation. To control this, and to have a better estimate of signal and background shapes, we use the built-in MLM-style matching scheme. This option combines matrix element and showering routines in a consistent way to avoid over-counting. We include up to one additional jet at the matrix element level in both signal and background. Showering and hadronization are performed by the event generator PYTHIA [14], after which our events are passed to the Delphes fast detector simulation for reconstruction [15]. At the Delphes level, we define our jets according to the Cambridge-Aachen (C-A) algorithm with a size parameter of $\Delta R \equiv \sqrt{\Delta\phi^2 + \Delta\eta^2} = 0.5$.
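The angular distance $\Delta R = \sqrt{\Delta\phi^2 + \Delta\eta^2}$ used for the jet size parameter can be computed as in the short Python sketch below. The only subtlety is that the azimuthal difference must be wrapped into $(-\pi, \pi]$. The function names are illustrative and not taken from Delphes or any other package.

```python
import math


def delta_phi(phi1, phi2):
    """Azimuthal difference wrapped into the interval (-pi, pi]."""
    dphi = (phi1 - phi2) % (2.0 * math.pi)  # now in [0, 2*pi)
    if dphi > math.pi:
        dphi -= 2.0 * math.pi
    return dphi


def delta_r(eta1, phi1, eta2, phi2):
    """Distance in the (eta, phi) plane between two objects."""
    return math.hypot(eta1 - eta2, delta_phi(phi1, phi2))
```

Two objects on opposite sides of the $\phi = \pm\pi$ boundary are then correctly treated as close together rather than nearly $2\pi$ apart.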
We require at least one isolated lepton ($\ell = e$ or $\mu$) in each event and take the leading lepton in transverse momentum ($p_T$) as our candidate from the leptonic $W$ decay. Since Delphes includes electrons in its listing of jets, we subtract the lepton momentum from any jet within a 0.5 cone in $\Delta R$ and recombine any remaining momentum according to the C-A prescription. The transverse momenta of our jets are typically $\sim 40$ GeV or less. Energy loss from hadronization, reconstruction and detector effects can be significant for jets in this momentum range. To ameliorate this we apply a jet-energy correction factor according to the pseudo-rapidity and magnitude of the momentum of each candidate jet. This correction factor is based on comparison between jets at reconstruction level and quarks/gluons at parton level when they can be well matched, averaged over a large number of background and signal simulated events. For jets with momentum $|\vec{p}| \lesssim 20$ GeV this can be an order one correction. We apply a similar correction procedure to the charged lepton, although that is only a small adjustment.
As noted above, the background is dominated by $Wjj$ production. We separate this into two pieces, a leading QCD piece with only two electroweak vertices, and a sub-leading piece with four electroweak vertices which includes non-Higgs-generated $W^+W^-$ events. We also consider $t\bar{t}$ events.
The Higgs signal is produced primarily through gluon fusion, which is implemented in MadGraph via an effective theory derived from one loop calculations with the top quark. However, the total production is significantly enhanced at higher order, suggesting a K-factor of $\sim 2$ compared to our leading order (LO) simulations. To take this into account we scale our signal results for $pp \to H + X$ up to match the higher order (NNLO) results, which find a production cross-section of 19.5 pb at 8 TeV and 49.8 pb at 14 TeV [16]. We provide results for a 123, 125 and 127 GeV Higgs in Table I. For the 123 GeV and 127 GeV cases we assume the same scaling as for $M_H = 125$ GeV.
For the backgrounds, we have made use of the MCFM program suite for computing $Wjj$ at next-to-leading order (NLO) [17]. We impose a $p_T$ cut of 5 or 10 GeV and require an invariant mass cut 55 GeV $< M_{jj} <$ 105 GeV for the NLO results. We impose the same mass cut for our MadGraph LO plus matching simulation. The matching algorithm has an implicit cutoff $p_T \sim 10$ GeV which defines the boundary between matrix element and showering effects. At 7 TeV the NLO and LO+matching cross-sections agree quite well and are stable when varying the $p_T$ cut between 5 and 10 GeV. At 14 TeV the NLO estimates are approximately 15% higher than LO+matching, although with estimated errors of the same order. For the results presented below we do not apply a K-factor beyond our LO+matching calculations for our $Wjj$ backgrounds. For the $t\bar{t}$ background we include a K-factor of 2.
Figure 1 shows invariant mass distributions ($d\sigma/dM_{jj\ell\nu}$) with basic cuts: $p_T(j) \geq 5$ GeV, $p_T(\ell) \geq 20$ GeV, $|\eta(j)| \leq 5$, and 55 GeV $< M_{jj} <$ 105 GeV. In each event, we assume that there are two jets and one isolated lepton as well as missing transverse energy from a neutrino. In this figure, we present the reconstructed masses for the signal with $M_H = 125$ GeV and for the background from $Wjj$.

V. ACCEPTANCE CUTS
We apply a series of acceptance cuts to improve the statistical significance. We first require that all events have at least two jets and one isolated charged lepton. After jet-energy corrections, the first and second leading jets by transverse momentum are required to have $p_T(j_1) > 30$ GeV and $p_T(j_2) > 20$ GeV. The invariant mass of this jet pair ($M_{jj}$) must be between 65 and 95 GeV. Conversely, the charged lepton must have a transverse momentum $p_T(\ell) < 30$ GeV, and the missing transverse energy $\not{E}_T$ is capped at 40 GeV. We consider jets with a pseudo-rapidity $|\eta_j| < 5$ and require the charged lepton to have $|\eta_\ell| < 2.5$.
With these inputs we reconstruct the longitudinal neutrino momentum as described above and equate the neutrino transverse momentum to $\not{E}_T$. Using this assumption we can calculate the momentum of $W_{\ell\nu}$, the leptonically decaying weak boson, and $H$, the candidate Higgs boson.
In addition, we impose the following cuts:
• $M_{\ell\nu} < 45$ GeV,
• $M_H \simeq M_{jj\ell\nu} < 130$ GeV,
• $\Delta R_{j\ell} > 0.2$, and
• a requirement on $E_{\ell\nu}$, where $E_{\ell\nu}$ is the energy of the leptonically decaying $W$ in its rest frame.
With these cuts applied, the remaining background is kinematically similar to the signal, although the signal's characteristic peaks are somewhat sharper. Further tightening the cuts can improve the ratio of signal to background but generally reduces the statistical significance due to loss of signal. We do not find that angular correlations in the variables $\phi$, $\theta_j$ and $\theta_\ell$ are sufficiently distinct from the background to improve our results.
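The selection described in this section can be summarized as a single predicate over reconstructed quantities. The Python sketch below is only an illustration: the `Candidate` container and its field names are hypothetical, the inputs are assumed to be already corrected and reconstructed, and the energy requirement on the leptonic $W$ is omitted because its threshold is not given in the text. The second-jet threshold is a parameter so that the 25 GeV value used at 14 TeV can be passed in.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """Reconstructed quantities for one event (momenta and masses in GeV)."""
    n_jets: int
    n_isolated_leptons: int
    pt_j1: float
    pt_j2: float
    eta_j1: float
    eta_j2: float
    m_jj: float
    pt_lep: float
    eta_lep: float
    met: float
    m_lnu: float
    m_jjlnu: float
    min_dr_jl: float  # smallest Delta R between the lepton and either jet


def passes_acceptance(c: Candidate, pt_j2_min: float = 20.0) -> bool:
    """Apply the untagged acceptance cuts quoted in the text."""
    checks = [
        c.n_jets >= 2 and c.n_isolated_leptons >= 1,
        c.pt_j1 > 30.0 and c.pt_j2 > pt_j2_min,
        abs(c.eta_j1) < 5.0 and abs(c.eta_j2) < 5.0,
        65.0 < c.m_jj < 95.0,          # on-shell hadronic W window
        c.pt_lep < 30.0,               # soft lepton from the virtual W
        abs(c.eta_lep) < 2.5,
        c.met < 40.0,
        c.m_lnu < 45.0,                # far-off-shell leptonic W
        c.m_jjlnu < 130.0,             # reconstructed Higgs candidate
        c.min_dr_jl > 0.2,
    ]
    return all(checks)
```

Keeping the cuts in one list makes it straightforward to switch individual requirements off when studying cut-flow efficiencies.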

VI. DISCOVERY POTENTIAL AT THE LHC
With the procedures and cuts discussed above, we present our estimates of signal and background rates in Table I. We consider two cases: the LHC 2012 running at 8 TeV CM energy, and the planned LHC running at a target CM energy of 14 TeV. For 14 TeV we raise the $p_T$ cut on the second jet to 25 GeV. The results include a signal calculated for input Higgs masses of 123, 125, and 127 GeV. We use the same cuts and one can see in Table I that the difference in expected signal events is small, although slightly increasing for higher masses. This is owing to the increasing $WW$ branching fraction as the Higgs mass increases, an effect which is mitigated by the decreasing efficiency of the $M_{jj\ell\nu} < 130$ GeV cut as the signal peak moves up in mass. Obviously, this cut would drastically reduce our signal for masses much larger than those considered.
We assume that the 8 TeV running will accumulate an integrated luminosity ($L$) of 25 fb$^{-1}$ for each detector of ATLAS or CMS. The statistical significance is defined as $N_{SS} \equiv S/\sqrt{B}$, where $S = L \times \sigma_S$ is the number of signal events, $B = L \times \sigma_B$ is the number of background events, and $\sigma_{S,B}$ is the cross section of the signal or the background. Based on our numbers above this would give a statistical significance of $6.6\sigma$ for the 2012 run in this channel. A combined analysis of the data from both CMS and ATLAS could therefore potentially approach $9\sigma$. At 14 TeV a $5\sigma$ discovery could be made with $L = 6$ fb$^{-1}$ for a single detector, not including any data from 2012 running.
We should stress at this point that our signal to background ratio is small, on the order of 1-2%. Thus systematic uncertainties on the expected size of the background become very important and may wash out the purely statistical significance quoted above. Nonetheless, the signal features a distinct kinematic feature in the reconstructed Higgs peak near 125 GeV, so that measurements of the background outside the peaked area can help constrain the true background.
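Because $S$ and $B$ both scale linearly with $L$, the significance $S/\sqrt{B}$ grows as $\sqrt{L}$. The Python sketch below illustrates this scaling for the quoted $6.6\sigma$ at 25 fb$^{-1}$. It also includes one common approximation for folding in a fractional background uncertainty, $S/\sqrt{B + (\delta B)^2}$. That formula is an assumption of this example and not part of the analysis above; it is shown only to indicate why a percent-level signal to background ratio is sensitive to systematics.

```python
import math


def significance(n_signal, n_background):
    """Purely statistical significance S / sqrt(B)."""
    return n_signal / math.sqrt(n_background)


def rescale_significance(z_ref, lumi_ref, lumi_new):
    """Scale a significance from lumi_ref to lumi_new (same cross sections)."""
    return z_ref * math.sqrt(lumi_new / lumi_ref)


def significance_with_systematic(n_signal, n_background, frac_syst):
    """Assumed approximation: add a fractional background error in quadrature."""
    return n_signal / math.sqrt(n_background + (frac_syst * n_background) ** 2)


# Two experiments with equal datasets double the effective luminosity.
z_combined = rescale_significance(6.6, 25.0, 50.0)
print(f"{z_combined:.1f}")  # prints 9.3
```

The doubling of the dataset gives $6.6 \times \sqrt{2} \approx 9.3$, consistent with the statement that a combined analysis could approach $9\sigma$.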

VII. PROSPECTS WITH C-TAGGING
In this section we consider a proposal advanced by Sullivan and Menon to study this channel with dedicated c-tagging algorithms. Many searches make use of b-tagging algorithms to better discriminate signal from background, and top-tagging programs have also been proposed [18,19]. At present, there are no procedures specifically designed to distinguish c-quark jets from light quarks and gluons. In practice, b-tagging, sometimes referred to as heavy-flavor tagging, already has some utility for this purpose. Jets arising from c-quarks are mis-tagged as b-quark jets at a higher rate than those arising from lighter quarks and gluons. Let us denote by $\epsilon_b$ the b-tagging efficiency, by $\epsilon_c$ the effective rate at which a c-jet is mis-tagged as a b-jet, and by $\epsilon_j$ the mis-tagging rate for $u, d, s, g$-jets. The ratio $\epsilon_b/\epsilon_j$ is an acceptance parameter that characterizes the 'tightness' of the b-tagging algorithm. At ATLAS or CMS [20,21], for a b-tagging efficiency of approximately $\epsilon_b \sim 50-60\%$, the c-mistag rate is $\epsilon_c \sim 10-15\%$, while the light-jet mistag rate is $\epsilon_j \lesssim 1\%$. Thus, the principle of a dedicated c-tagger is plausible although it remains to be developed.
For the discovery channel explored in this letter, c-tagging provides two advantages. The first is that half the events in our signal should involve a W decaying to a charm quark (c) and a strange quark (s), and are thus amenable to c-tagging. In contrast, our backgrounds are dominated by light jets with only ∼ 1/6 of the events involving a final state c-jet. The second is that tagging the c-jet in our signal allows us to better use the angular correlations discussed above. Without tagging we do not know which jet arises from the u-quark or the c-quark, and can only say that one jet or the other should be correlated/anticorrelated with the charged lepton direction. C-tagging would resolve this ambiguity and increase the usefulness of angular correlations as an experimental discriminant.
The requirement of c-tagging necessarily suppresses our overall signal rate, and this reduction in statistics might hurt our significance. Therefore any c-tagging scheme would need to be highly efficient to preserve our signal acceptance, while still rejecting most light jets. Using current b-tagging algorithms as a model, a high c-tagging acceptance can be achieved simply by raising the b-tagging acceptance. This is not a problem for our signal since, even with 100% acceptance of b-jets, they would constitute only a small fraction of our backgrounds. However, for current b-tagging algorithms, a high acceptance reduces the ratio of c-(mis)tag to udsg-mistag rates. As will be seen below, a successful application of c-tagging to this signal would require high c-jet acceptance with better light jet rejection than appears possible with the existing algorithms.
For the analysis with possible c-tagging, we include the same backgrounds as before. In addition, we divide the total $Wjj$ backgrounds into those including at least one $c$ or $\bar{c}$ at the parton level ($Wcj$) and those including only light partons (labeled $Wjj$). We perform the same reconstruction and cuts as described above, with the following modifications: We require at least one c-tagged jet and we consider the leading c-tagged jet in $p_T$ as our candidate $c$ quark from $W$ decay. For the second jet we use the leading non-tagged jet or the second-leading c-tagged jet if one is present, whichever is higher in $p_T$. Reconstruction of the $W$s and the Higgs is performed as above. We apply $p_T$ cuts on the two chosen jets of $p_T > 30, 25$ GeV on the first and second jet ordered by $p_T$. All other cuts from the untagged analysis are the same. Additionally, we apply the following cuts on the angular variables:
• $\phi > 1.2$ radians,
• $(0.9 \cos\theta_\ell - 1.2) < \cos\theta_c < (1.1 \cos\theta_\ell + 1)$.
Here $\theta_c$ is the angle between the c-jet and the boost direction of the hadronically decaying $W$ in the rest frame of the $W$, rather than the angle for the leading $p_T$ jet as used in the untagged case.
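The two angular requirements of the c-tagged analysis select a diagonal band in the $(\cos\theta_\ell, \cos\theta_c)$ plane together with a lower bound on $\phi$. A minimal Python sketch of that window is given below. The function name is hypothetical, and the angles are assumed to have been computed already in the appropriate rest frames.

```python
def passes_angular_cuts(phi, cos_theta_l, cos_theta_c):
    """Angular window used with c-tagging.

    phi:          angle between the two decay planes, in radians.
    cos_theta_l:  cosine of the charged-lepton decay angle.
    cos_theta_c:  cosine of the c-jet decay angle.
    """
    lower = 0.9 * cos_theta_l - 1.2
    upper = 1.1 * cos_theta_l + 1.0
    return phi > 1.2 and lower < cos_theta_c < upper
```

Since both cosines lie in $[-1, 1]$, the lower edge only removes events when $\cos\theta_\ell$ is above about 0.22, and the upper edge only when $\cos\theta_\ell$ is negative.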
The cross sections of the signal and the background with c-tagging are given in Table II. Each sub-channel must be multiplied by an effective tagging efficiency for a hypothetical or existing tagging algorithm as indicated. Note that while $\epsilon_c$ is essentially the single-jet c-tagging efficiency, with a small enhancement coming from mis-tagged light jets in $Wcj$ channels, $\epsilon_j$ should include the probability of mis-tagging any light jet in a $Wjj$ channel. For the backgrounds, which will sometimes include additional jets after showering and reconstruction, we will use $\epsilon_j^{\rm eff} = 2.5\,\epsilon_j^0$ where $\epsilon_j^0$ is the single light jet mistag rate. For the $t\bar{t}$ background, b quarks from top decay are likely to be tagged as c-jets. In our estimates we will assume that for a high acceptance c-tagger every $t\bar{t}$ event will have at least one tagged c-jet.
In Table II one can see that c-tagging does potentially improve our statistical significance, as well as improving the ratio of signal to background. However, realizing this potential would require excellent c-tagging acceptance while keeping the ratio $\epsilon_j/\epsilon_c$ low. In the ideal case, where $\epsilon_c \simeq 1$ and $\epsilon_j \simeq 0.01$, we would have $9.7\sigma$ at $\sqrt{s} = 8$ TeV based on statistical uncertainty. At $\sqrt{s} = 14$ TeV a $5\sigma$ detection could be made with $L = 4.5$ fb$^{-1}$. On the other hand, let us consider the more modest but still optimistic case where c-tagging has a similar performance to current b-tagging. If $\epsilon_c = 0.5$ with $\epsilon_j^0 = 0.01$, our nominal significance with 2012 data would be $6.5\sigma$, virtually the same as the untagged case. However, the signal to background ratio would be improved to $\sim 10\%$. Thus efficient c-tagging can reduce our sensitivity to background systematics. At $\sqrt{s} = 14$ TeV the statistical significance would be somewhat worse than the untagged case, requiring $L = 9.8$ fb$^{-1}$ for a $5\sigma$ result. This is because the background after c-tagged cuts, especially the $Wcj$ component, grows more quickly with increasing beam energy than the overall background with untagged cuts.
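The way the tagging efficiencies enter the significance can be sketched as follows. The Python function below weights the signal and the $Wcj$ background by $\epsilon_c$, the light $Wjj$ background by $\epsilon_j^{\rm eff} = 2.5\,\epsilon_j^0$, and the $t\bar{t}$ background by unity, following the assumptions stated above. The function name and argument layout are hypothetical, any sub-leading electroweak pieces would need to be split and weighted in the same way, and the cross sections must be supplied by the user since the Table II values are not reproduced here.

```python
import math


def ctag_significance(lumi, sigma_s, sigma_wcj, sigma_wjj, sigma_tt,
                      eps_c, eps_j0, multi_jet_factor=2.5):
    """Return (S/sqrt(B), S/B) after applying tagging efficiencies.

    lumi:      integrated luminosity in fb^-1.
    sigma_*:   post-cut cross sections in fb, before tagging weights.
    eps_c:     c-tagging efficiency.
    eps_j0:    single light-jet mistag rate.
    """
    eps_j_eff = multi_jet_factor * eps_j0
    n_signal = lumi * eps_c * sigma_s
    n_background = lumi * (eps_c * sigma_wcj
                           + eps_j_eff * sigma_wjj
                           + 1.0 * sigma_tt)  # every ttbar event assumed tagged
    return n_signal / math.sqrt(n_background), n_signal / n_background
```

Scanning `eps_c` and `eps_j0` with fixed cross sections shows the trade-off described in the text: lowering `eps_c` costs signal statistics, while a small `eps_j0` is what drives the improvement in the signal to background ratio.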

VIII. CONCLUSIONS
In this letter we have investigated the discovery channel $H \to WW^* \to jj\ell\nu$ for a Standard Model Higgs with a mass of 125 GeV, consistent with recent LHC results. We have demonstrated that by selecting for an on-shell hadronically decaying $W$ paired with a far-off-shell leptonically decaying $W$, combined with transverse-mass based reconstruction techniques, one can reduce the large $Wjj$ backgrounds to a workable level. Based on Monte Carlo simulations we estimate that the 2012 run of the LHC could provide evidence for this channel at the $6\sigma$ level with an integrated luminosity of 25 fb$^{-1}$ based on statistical uncertainty. At the design energy of 14 TeV, $5\sigma$ significance could be achieved with $L = 6$ fb$^{-1}$ of data. However, this analysis does not include a full estimation of systematic uncertainties, which will play an important role given the small ratio of signal to background. Careful study of the $Wjj$ background will be required to make this channel feasible. Nonetheless, our results are promising.
We also considered the prospects for c-tagging to improve our results. We find that exceptional c-tagging capabilities, with high acceptance and good rejection of light jets, could yield somewhat improved statistical significance. However, with more realistic assumptions for c-tagging efficiencies, we would have at best marginal improvement in terms of significance. On the other hand, the increased signal to background ratio is a distinct advantage of this scenario.
We note that our study is based on a simulation of traditional calorimeter-based jets. Due to the relatively low energy of our typical jets, we are quite sensitive to loss of resolution from energy loss and uncertainty in jet-energy corrections. This limits our ability to pick out the pronounced hadronically decaying $W$ mass peak and the Higgs transverse mass peak. A study with particle-flow based jet reconstruction may well be able to improve on our findings.