Finding Higgs bosons heavier than 2 m_W in dileptonic W-boson decays

We reconsider observables for discovering a heavy Higgs boson (with m_h>2m_W) via its di-leptonic decays h ->WW ->l nu l nu. We show that observables generalizing the transverse mass that take into account the fact that both of the intermediate W bosons are likely to be on-shell give a significant improvement over the variables used in existing searches. We also comment on the application of these observables to other decays which proceed via narrow-width intermediates.

The LHC collaborations have now excluded a heavy Higgs boson, with mass between 127 and 600 GeV [1,2], at the 95% confidence level, assuming that its production and decay proceed according to the predictions of the Standard Model. Nevertheless, there is a clear motivation to continue the search in this mass region. Indeed, theories beyond the Standard Model, with additional degrees of freedom, typically predict a lower cross section times branching ratio for Higgs production and decay in any given channel, for three reasons. Firstly, the branching ratio will necessarily be reduced whenever there are extra states into which a given Higgs boson may decay. Secondly, in theories with an extended scalar sector, mixing reduces the cross section to produce a given mass eigenstate. Thirdly, since the dominant production mechanism for Higgs bosons at the LHC is via loop-level gluon fusion, the production cross section will be changed by the presence of additional coloured states in the theory. If one of these states plays the rôle of cancelling the quadratic divergence in the Higgs mass, coming from a loop of top quarks, then it necessarily interferes destructively with the dominant contribution to gluon fusion in the Standard Model, which also comes from a top quark loop [3]. Furthermore, it is observed (though sadly no theorem to the effect has been proven) that the cross section is reduced in all models in which the Higgs boson is composite [3][4][5][6][7].
One way to search for a heavy Higgs boson is to look for evidence of it decaying to the fully leptonic final state lνlν (where l is an electron or muon) which results from an intermediate pair of (possibly virtual) W-bosons [8][9][10].
A key step in many such analyses comprises a selection, or cut, based on a variable that should have a strong correlation with the mass of the Higgs boson in 'signal' events. Typically this variable will be a transverse mass of some kind, of which many have been described in the literature [11,12]. The transverse mass variable m_T of Ref. [13] (see also [8,14]) is used in current LHC h → WW → lνlν searches. [footnote 1] It has previously been observed [15] that a kinematic variable m^bound_T (which, for reasons that will become clear, we refer to henceforth as m^lower_T), which demands intermediate mass-shell constraints for the tau leptons, might prove advantageous in h → ττ decays. As was indicated in that paper, the same variable (after the trivial replacement m_τ → m_W) might also prove useful in the context of h → WW when a heavy Higgs boson has mass greater than 2m_W, since the intermediate W-bosons should be produced close to their mass shells. Substantiation of this suggestion was left open to further study and one of our intentions here is to perform that follow-up. We also intend to go somewhat further, in providing a complete discussion of the variables that characterize the kinematics of h → WW → lνlν decays, in a sense that will be made explicit below. We will find that the pre-existing variables m_T or m⋆_T [16] provide a complete description in cases where neither or either, respectively, of the intermediate W-bosons is produced on-shell. [footnote 2] A complete description of the cases when both of the W-bosons are produced on-shell requires not only the extant variables m^lower_T and m_T2 [18,19], but also a new variable, which we denote m^upper_T. Our simulations suggest, moreover, that m^upper_T gives a significant improvement in discovery potential for h → WW → lνlν.
[footnote 1] The variable m_T was referred to as m^true_T in [13] to distinguish it from other, approximate transverse mass variables in use at the time; we now refer to it simply as m_T, in line with both the ATLAS and CMS collaborations.
[footnote 2] In the mass range m_W < m_h < 2m_W, one might reasonably assume that one or other of the W bosons is produced on shell [17]. However, simulations reported in [16] found no significant improvement in discovery potential when m⋆_T was substituted for m_T in existing ATLAS collaboration searches.
Having finished our preamble, we now describe more explicitly how variables of this kind may provide a complete characterization of the kinematics of events. Since the decay of the parent Higgs boson involves invisible neutrinos, one cannot find a variable that measures its mass directly; rather the best one can do is to find variables that bound the mass of the parent in some way. Myriad variables of this type may exist and what one would like to do, presumably, is to find the variable or variables that give the optimal bounds on the mass in an event.
This search for the optimal variables can be done in a more-or-less definitive way if one is willing to restrict one's attention to kinematics alone. That is to say, suppose one assumes that a signal event corresponds to some decay topology (here the decay of a parent Higgs into two intermediate W resonances, followed by decays of each intermediate into a combination of visible and invisible daughter particles). One may then write down the various kinematic constraints (corresponding to energy-momentum conservation and the mass-shell conditions) and ask which values of the a priori unknown masses are allowed, in the sense that the kinematic constraints then admit solutions in which the unmeasured momenta and energies in the event are real and real-positive, respectively. The resulting allowed mass region is an event observable and encodes all of the information which can be gleaned about the masses in an event using kinematics alone. The boundary of the allowed region can be described in terms of one or more relations between the masses, which are themselves event observables and which give the optimal bounds on the mass in an event. Given multiple events, the allowed mass region is obtained as the intersection of the allowed regions for each event and is itself an observable. [footnote 3] This abstract recipe leads, in many cases, to simple, pre-existing observables. In the canonical example of a decay of a parent to a single invisible daughter (together with possibly multiple visible particles), the observable that results [23] is the familiar transverse mass, m_T [24], invoked in the discovery of the W-boson [25] in its decay to a charged lepton and a neutrino; for single parent decays into multiple invisibles, the resulting variable [13] is m_T; for identical decays of a pair of parents into invisible particles, the variable [23,26,27] is m_T2.
Thus, these variables encode all of the information that is available from kinematic considerations alone (subject to an assumed decay topology) and there is no point in trying to devise further variables to glean further information from kinematics. [footnote 4] In the next Section, we firstly try to prove the equivalence between the kinematic constraints with one intermediate on-shell and the variable m⋆_T and secondly the equivalence between the constraints with both intermediates on-shell and the variables m^lower_T and m_T2. Our first attempt at a proof succeeds: m⋆_T is equivalent to all of the information contained in the kinematic constraints when one intermediate is on-shell. But for the topology with both intermediates on shell, we fail. Indeed, we show that m^lower_T and m_T2 do not capture all of the information in the kinematic constraints. Rather there is a third, distinct variable, which we call m^upper_T, and which gives an upper bound on the mass of the Higgs (or whatever parent particle is being considered).
[footnote 3] Whether this observable is the optimal observable for discovering the parent is unclear. To settle this, one would first have to define what one meant by an optimal discovery observable. For discussion, see e.g. [20][21][22].
[footnote 4] The equivalence between variables and kinematics is also important for the purposes of determining which particle masses can be measured using kinematics alone: as an example, for a (possibly pair-produced) parent particle that decays off-shell to multiple invisible particles, it is a theorem that one can, at least in principle, measure the parent mass and the sum of the masses of the invisible daughter particles using kinematic information alone [28][29][30][31].
In what follows we show that while m^lower_T only marginally enhances the discovery potential for a Higgs boson using current ATLAS collaboration search strategies, m^upper_T gives a significant improvement. In an Appendix, we discuss issues related to the existence of m^lower_T and m^upper_T.
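As a concrete illustration of the canonical single-parent case recalled above, the following minimal sketch evaluates the standard transverse mass of one visible system recoiling against one invisible system. The function name and argument layout are our own choices for illustration. Applying it to the dilepton system with the invisible mass set to zero is, as far as we understand, how the variable of Ref. [13] is constructed, but that reading should be treated as an assumption.

```python
import math

def transverse_mass(vis_mass, vis_pt, ptmiss, inv_mass=0.0):
    """Transverse mass (GeV) of a visible system plus an invisible system.

    vis_pt and ptmiss are (px, py) pairs in GeV; inv_mass is the assumed
    mass of the invisible system (zero for a single neutrino).
    """
    # Transverse energies of the visible and invisible systems.
    et_vis = math.sqrt(vis_mass**2 + vis_pt[0]**2 + vis_pt[1]**2)
    et_inv = math.sqrt(inv_mass**2 + ptmiss[0]**2 + ptmiss[1]**2)
    dot = vis_pt[0] * ptmiss[0] + vis_pt[1] * ptmiss[1]
    mt2 = vis_mass**2 + inv_mass**2 + 2.0 * (et_vis * et_inv - dot)
    # Guard against tiny negative values from rounding.
    return math.sqrt(max(mt2, 0.0))

# A massless lepton back-to-back with the missing momentum:
print(transverse_mass(0.0, (40.0, 0.0), (-40.0, 0.0)))  # 80.0
```

For a massless lepton and neutrino this reduces to m_T^2 = 2 p_T /p_T (1 − cos Δφ), which vanishes when the two transverse momenta are collinear.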

KINEMATIC EQUIVALENCE
Since we are concerned with the three transverse masses m_T, m⋆_T and m^lower_T, it is helpful to recall and compare their definitions. All three make the assumption that the observed lνlν final state resulted from the decay of a single progenitor particle or resonance (i.e. the Higgs in the case of the signal) whose mass is, a priori, unknown. All three assert that the sum of the transverse momenta of the neutrinos should equal the observed missing transverse momentum. All three accept that (i) the way the transverse momentum is shared between the two neutrinos is unknown, and (ii) the longitudinal momentum component of each neutrino is unknown. Finally, all three transverse masses are defined to be the greatest possible lower bound for the unknown mass of that progenitor particle, subject to consistency and any remaining constraints. It is only in these remaining constraints that the transverse masses differ: m_T applies none, m⋆_T permits only neutrino momenta that place at least one W-boson on mass-shell, while m^lower_T permits only neutrino momenta that place both W-bosons on mass-shell (see Table I).
We now attempt to prove the equivalence between the variables m⋆_T and m^lower_T and the corresponding kinematic constraints.
TABLE I. [...] and p′^µ_l respectively. The hypothesized four-momenta of the two neutrinos are given by the two p^µ_ν. The missing transverse momentum is /p_T. In all three cases, the variable named in (a) is defined to be the maximal lower bound on the invariant H^µ H_µ, where H^µ = p^µ_ν + p^µ_l + p′^µ_ν + p′^µ_l, subject to the indicated subset of the constraints shown in (b).

One intermediate on-shell
The theorem to be proved is a formalization of the colloquial: m⋆_T is equivalent to all the information contained in the kinematic constraints. To wit, in a notation in which p^µ = (E, p_T, q):
Theorem 1 The kinematic constraints (1) to (3) and either (4) or (5) admit a solution with momenta in R and energies in R^+ iff m_h ≥ m⋆_T.
Here, m⋆_T is defined as the minimum value of the invariant mass √(H^µ H_µ) obtained by varying the unobserved momenta in R and energies in R^+. This definition makes the necessity of m_h ≥ m⋆_T obvious; it is the sufficiency of the condition that requires deliberation.
So, does m_h ≥ m⋆_T imply that the kinematic constraints have a solution? One way to show this would be to consider every m_h ≥ m⋆_T and explicitly construct a solution therefor. An easier way is to show that one can find solutions corresponding to arbitrarily large m_h and then to invoke continuity and the intermediate value theorem to show that one can find solutions for any m_h between m⋆_T and ∞. Let us then exhibit a solution for arbitrarily large m_h that satisfies constraints (1) to (4) (if the other W is on-shell, the required solution can be obtained by interchanging primed and unprimed quantities). It is given by [...] where q_ν is given by either of [...]. This yields a value of m_h given by [...] where in the last line we assumed q′_ν > 0 without loss of generality (this amounts to a choice of which proton beam constitutes the +z direction). The terms in parentheses in the coefficient are each positive semi-definite and so no cancellation of the coefficient of q′_ν can occur. Thus m_h grows without bound unless all terms vanish. For all terms to vanish, both leptons must have vanishing transverse momenta, in which case they would fall outside the detector acceptance and the event would not be identified as a di-lepton plus missing momentum event.
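The unbounded growth of m_h with one W on-shell can be checked numerically. The sketch below uses a parametrization that is our own assumption and is not claimed to be the solution intended in the text: the neutrino ν carries no transverse momentum, ν′ carries all of /p_T, and the longitudinal momentum q′_ν of ν′ is left free. The W mass value, the function name and the example momenta are likewise illustrative.

```python
import numpy as np

M_W = 80.4  # GeV; assumed W mass for this illustration

def mass_with_one_w_on_shell(lep1, lep2, ptmiss, q2, m_w=M_W):
    """Invariant mass of l + nu + l' + nu' with (l, nu) on the W mass shell.

    Assumed parametrization: nu has zero transverse momentum, nu' carries all
    of ptmiss and has free longitudinal momentum q2. The leptons are massless
    four-vectors (E, px, py, pz) with non-zero transverse momentum.
    """
    e, px, py, pz = lep1
    # With zero neutrino pT, the condition 2 (E_l |q| - pz_l q) = m_W^2 has
    # q = m_W^2 (E_l + pz_l) / (2 pT_l^2) > 0 as one of its two roots.
    q1 = 0.5 * m_w**2 * (e + pz) / (px**2 + py**2)
    nu1 = np.array([q1, 0.0, 0.0, q1])
    e2 = np.sqrt(ptmiss[0]**2 + ptmiss[1]**2 + q2**2)
    nu2 = np.array([e2, ptmiss[0], ptmiss[1], q2])
    tot = np.asarray(lep1, float) + np.asarray(lep2, float) + nu1 + nu2
    return np.sqrt(tot[0]**2 - tot[1]**2 - tot[2]**2 - tot[3]**2)

lep1 = (50.0, 30.0, 0.0, 40.0)
lep2 = (np.sqrt(725.0), -20.0, 15.0, -10.0)
for q2 in (1e2, 1e3, 1e4):
    print(q2, mass_with_one_w_on_shell(lep1, lep2, (5.0, 5.0), q2))
```

In this example the mass keeps increasing as q′_ν is raised, since the remaining three particles form a system with E > p_z, so the term linear in q′_ν has a strictly positive coefficient.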

Both intermediates on-shell
It is now easy to show that an analogous theorem cannot be proven for m^lower_T, which applies to the topology in which both constraints (4) and (5) are satisfied. The reason is that one cannot find solutions to the kinematic constraints corresponding to arbitrarily large m_h. Indeed, one can easily convince oneself that to get arbitrarily large m_h, at least one of the neutrino momenta must become arbitrarily large. But each neutrino is now produced in the decay of a W-boson, which simultaneously results in a charged lepton of finite (and measured) momentum. Now, there is only one way in which a two-body decay can produce one daughter of finite momentum and another with infinite momentum in the lab frame: the rest frame of the W must be infinitely boosted exactly in the direction of motion of the neutrino (as measured in the W rest frame). This doesn't work, because the lepton momenta are then not only finite but arbitrarily small, in contrast with their measured values in a generic event.
Thus, there is more information in the kinematic constraints than is captured by m^lower_T alone. In particular, since m_h cannot reach arbitrarily large values in events, there is a distinct variable, obtained by maximizing the invariant mass of the sum of lepton and neutrino momenta in events, subject to the above constraints. This variable will be bounded below by m_h and we call it m^upper_T. By the intermediate value theorem, this variable, together with m^lower_T, does contain all of the information in kinematics.
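To make the two doubly-on-shell bounds concrete, the following is a crude numerical sketch, not the procedure used for the results in this paper. It scans a grid of trial splittings of /p_T between the two neutrinos, solves each W mass-shell condition for the neutrino longitudinal momentum (assuming massless leptons and neutrinos), and records the smallest and largest invariant mass found. The grid extent, grid spacing, W mass value and all function names are assumptions made for illustration; a finite grid can only overestimate the lower bound and underestimate the upper bound.

```python
import numpy as np

M_W = 80.4  # GeV; assumed W mass for this illustration

def neutrino_pz_solutions(lep, kt, m_w=M_W):
    """Neutrino pz values that put (lepton + neutrino) on the W mass shell.

    lep = (E, px, py, pz) of a massless lepton with non-zero pT;
    kt = (kx, ky) is the trial neutrino transverse momentum.
    Returns a list with zero or two real roots.
    """
    e, px, py, pz = lep
    pt2 = px * px + py * py
    a = 0.5 * m_w**2 + px * kt[0] + py * kt[1]
    disc = a * a - pt2 * (kt[0]**2 + kt[1]**2)
    if a < 0.0 or disc < 0.0:
        return []  # transverse mass of (l, nu) would exceed m_W
    root = e * np.sqrt(disc)
    return [(a * pz - root) / pt2, (a * pz + root) / pt2]

def mass_bounds(lep1, lep2, ptmiss, m_w=M_W, half_width=300.0, n=121):
    """Grid-scan estimate of (lower, upper) bounds on the parent mass.

    Returns None if no grid point satisfies both mass-shell constraints.
    """
    lep1, lep2 = np.asarray(lep1, float), np.asarray(lep2, float)
    vis = lep1 + lep2
    lo, hi = np.inf, -np.inf
    grid = np.linspace(-half_width, half_width, n)
    for kx in grid:
        for ky in grid:
            roots1 = neutrino_pz_solutions(lep1, (kx, ky), m_w)
            if not roots1:
                continue
            k2 = (ptmiss[0] - kx, ptmiss[1] - ky)  # pT balance
            for q2 in neutrino_pz_solutions(lep2, k2, m_w):
                nu2 = np.array([np.sqrt(k2[0]**2 + k2[1]**2 + q2**2), k2[0], k2[1], q2])
                for q1 in roots1:
                    nu1 = np.array([np.sqrt(kx**2 + ky**2 + q1**2), kx, ky, q1])
                    tot = vis + nu1 + nu2
                    m = np.sqrt(max(tot[0]**2 - tot[1]**2 - tot[2]**2 - tot[3]**2, 0.0))
                    lo, hi = min(lo, m), max(hi, m)
    return (lo, hi) if np.isfinite(lo) else None

lep1 = (50.0, 30.0, 0.0, 40.0)
lep2 = (np.sqrt(725.0), -20.0, 15.0, -10.0)
print(mass_bounds(lep1, lep2, (5.0, 5.0)))
```

The scan returns None when no trial splitting is compatible with both mass-shell constraints, which mirrors the possibility that the bounds do not exist for a mismeasured event. When the two leptons are nearly back-to-back in the transverse plane the allowed region can extend well beyond the default grid, so the grid extent should be enlarged in that case.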

MONTE CARLO SIMULATIONS
Simulations, similar to those performed in Ref. [13], are used to test the extent to which using m^lower_T and m^upper_T might be expected to enhance the statistical significance of an h → WW signal above the dominant WW continuum background and the tt̄ background. We use the HERWIG 6.505 [32,33] Monte Carlo generator, with LHC beam conditions (√s = 7 TeV). Our version of the generator includes the fix to the h → WW(*) spin correlations described in [34]. We generate unweighted events for Standard Model Higgs boson production (gg → h) and for the dominant backgrounds, qq̄ → WW and tt̄ production. [footnote 6] We use the leading order Standard Model cross section for all values of m_h.
The missing transverse momentum is calculated from the negative sum of the p_T of visible particles within the fiducial region p_T > 0.5 GeV and |η| < 5. The detector resolution is simulated by smearing the magnitude of the missing momentum vector with a Gaussian resolution function of width σ_{/p_T} / /p_T = 0.4 GeV^{1/2} / √Σ, where Σ is the sum of the |p_T| of all visible fiducial particles.
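The smearing step described above can be sketched as follows. This is a minimal illustration, assuming the formula is read as a fractional width σ//p_T = 0.4/√Σ with Σ in GeV, and that only the magnitude (not the direction) of the missing momentum is smeared. The function name and the use of NumPy's random generator are our own choices.

```python
import numpy as np

def smear_missing_pt(ptmiss, sum_pt, rng, coeff=0.4):
    """Smear the magnitude of ptmiss = (px, py), keeping its direction.

    The Gaussian width is sigma = |ptmiss| * coeff / sqrt(sum_pt), where
    sum_pt is the scalar sum of visible |pT| in GeV.
    """
    ptmiss = np.asarray(ptmiss, dtype=float)
    mag = np.hypot(ptmiss[0], ptmiss[1])
    if mag == 0.0:
        return ptmiss  # nothing to smear
    sigma = mag * coeff / np.sqrt(sum_pt)
    new_mag = rng.normal(mag, sigma)
    return ptmiss * (new_mag / mag)

rng = np.random.default_rng(seed=1)
print(smear_missing_pt((30.0, 40.0), 400.0, rng))
```

With /p_T = 50 GeV and Σ = 400 GeV, as in the usage line, the assumed width is 50 × 0.4 / 20 = 1 GeV.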
For each value of m_h, fifty pseudo-experiments are performed, each corresponding to 10 fb^−1. Selection cuts are applied, requiring:
• Exactly two leptons ℓ ∈ {e, µ} with p_T > 15 GeV and |η| < 2.5
• Missing transverse momentum, /p_T > 30 GeV
• 12 GeV < m_ℓℓ < 300 GeV
• No jet with p_T > 20 GeV
• Z → ττ rejection: the event was rejected if |m_ττ − m_Z| < 25 GeV and 0 < x_i < 1 for both i ∈ {1, 2} [footnote 7]
[footnote 6] Other backgrounds, such as Z → ττ, are rendered sub-dominant by the cuts discussed below.
The results are shown in Figure 1. The first result is that when m_h ≈ 2m_W the doubly-on-shell lower bound variable, m^lower_T, has significantly better sensitivity than the simple transverse mass m_T (or indeed the singly-on-shell variable m⋆_T, which is almost indistinguishable from m_T [16]). Secondly, we observe that the distribution of the doubly-constrained upper bound m^upper_T shows a markedly greater discovery potential again, relative to either m_T or m^lower_T. Since any event satisfying the constraints of Table I will generate both a lower and an upper bound, and since those bounds are not closely correlated (Figure 2), each adds information relative to the other.
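The first four selection cuts listed above can be sketched as a simple event filter. This is an illustrative sketch only: the event representation (leptons as (p_T, η, φ) tuples, jets as a list of p_T values) and the function names are assumptions, leptons are treated as massless, "exactly two leptons" is read as exactly two leptons passing the p_T and η requirements, and the Z → ττ rejection is omitted because it needs the collinear-approximation quantities m_ττ and x_i.

```python
import math

def dilepton_mass(l1, l2):
    """Invariant mass (GeV) of two massless leptons given as (pt, eta, phi)."""
    (pt1, eta1, phi1), (pt2, eta2, phi2) = l1, l2
    m2 = 2.0 * pt1 * pt2 * (math.cosh(eta1 - eta2) - math.cos(phi1 - phi2))
    return math.sqrt(max(m2, 0.0))

def passes_selection(leptons, ptmiss_mag, jet_pts):
    """Apply the lepton, missing-momentum, m_ll and jet-veto cuts (GeV units)."""
    good = [lep for lep in leptons if lep[0] > 15.0 and abs(lep[1]) < 2.5]
    if len(good) != 2:
        return False          # exactly two selected leptons
    if ptmiss_mag <= 30.0:
        return False          # missing transverse momentum cut
    if not 12.0 < dilepton_mass(good[0], good[1]) < 300.0:
        return False          # dilepton mass window
    return all(pt <= 20.0 for pt in jet_pts)  # jet veto

leptons = [(40.0, 0.5, 0.0), (30.0, -0.3, 2.0)]
print(passes_selection(leptons, 50.0, []))      # True
print(passes_selection(leptons, 50.0, [25.0]))  # False
```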

OTHER FINAL STATES
We note that it is also possible to construct equivalent bounds for any other decays of the form A → B + C where B and C are produced near their mass shell and each decays to an invisible particle and one or more visible particles. A case relevant to LHC Higgs physics is the decay mode h → ττ, followed by semi-invisible decay of each of the τ leptons. Previous simulations [15] indicated that adding the tau mass constraints led to improved discrimination of the signal from the dominant Z → ττ background, using the lower bound m^lower_T [15]. However, further simulations indicate that there seems to be little additional discrimination available from the upper bound in this case, perhaps because of the large hierarchy in masses between m_h and m_τ.
[footnote 7] The variable x_i is the momentum fraction of the ith tau carried by its daughter lepton and m_ττ is the di-tau invariant mass. They are calculated using the approximation that each τ was collinear with its daughter lepton.

CONCLUSION
When a heavy Higgs boson with mass > 2m_W decays via h → WW → ℓνℓν, we have shown that one can use knowledge about the narrow width of the intermediate W bosons to construct both upper and lower bounds on m_h. We demonstrate that these two bounds together make maximal use of the kinematic information about m_h in each event. Since the two bounds are reasonably uncorrelated, the ATLAS and CMS experimental collaborations should be able to obtain improved kinematic sensitivity in an analysis which simultaneously makes use of both bounds.
We note that the same suite of variables can find use for discovery or mass determination in other decays of the form A → B + C followed by semi-invisible decay of each narrow-width intermediate particle.
This work was supported by the Science and Technology Facilities Council of the United Kingdom, by the Royal Society, by the Institute for Particle Physics Phenomenology, by Merton College, Oxford, and by Peterhouse, Cambridge. We gratefully acknowledge T.J. Khoo for helpful comments.
[...] property of the m_T2 variable under which large mismeasurements of /p_T can lead, in a non-negligible number of events, to small changes (relative to m_τ) in the value of m_T2, provided that the intermediate particles in the production process are sufficiently light. Indeed, in the limit that the intermediate pair of particles are massless, their decay products are collinear and so the true /p_T vector lies in the smaller of the two sectors bounded by the two visible transverse momenta. It follows that for such configurations m_T2 must vanish, as there exists a splitting of the /p_T (in fact the 'true' splitting) for which the transverse mass of each pair of decays is zero. That m_T2 is 'small' for well-measured events in this limit is not surprising; after all, m_T2 is supposed to be bounded above by the mass of either member of the pair, and we are taking the limit in which those masses go to zero. What is surprising, if anything, is that m_T2 can be forced exactly to zero in a large number of events, even for finite yet small intermediate particle masses. In this sense we can say that m_T2 goes to zero faster than m_τ. It follows that for such signal events, any mis-measurement of /p_T (even a large one) that keeps it 'between' the visible particles' transverse momenta leads to no observable change in m_T2. Moving away from the above limit, such that the intermediate particles now have non-zero but still small masses, we see that changes in m_T2 can still be small, even for large mis-measurements of /p_T, provided that the mis-measured /p_T remains in the appropriate region of the transverse plane.
This goes some way towards explaining why it was found in [15] that m^lower_T could remain well-defined for a large number of h → ττ events, even in the presence of momentum mis-measurements. For h → WW events this issue is much less of a concern; we find that m^lower_T exists for all but a few per cent of the h → WW signal events we simulate. We attribute the greater frequency with which m^lower_T exists to the greater ease with which the momentum mis-measurement errors (which have a natural scale far below m_W) may be incorporated into the intermediate particle mass constraints (whose natural scale is m_W).