Mass Determination of New States at Hadron Colliders

We propose an improved method for hadron-collider mass determination of new states that decay to a massive, long-lived state like the LSP in the MSSM. We focus on pair produced new states which undergo three-body decay to a pair of visible particles and the new invisible long-lived state. Our approach is to construct a kinematic quantity which enforces all known physical constraints on the system. The distribution of this quantity calculated for the observed events has an endpoint that determines the mass of the new states. However we find it much more efficient to determine the masses by fitting to the entire distribution and not just the end point. We consider the application of the method at the LHC for various models and demonstrate that the method can determine the masses within about 6 GeV using only 250 events. This implies the method is viable even for relatively rare processes at the LHC such as neutralino pair production.


Introduction
At hadron colliders the determination of the masses of new particles associated with missing momentum signals is very challenging due to the fact that the kinematics of the event cannot be completely reconstructed. Hadron colliders collide partons within each hadron, and each parton involved in the collision carries an unknown fraction of the hadron's momentum. Therefore, the center-of-mass (COM) energy and the frame of reference of the parton collision are unknown for each event. The problem is further aggravated because one does not expect any of the new short-lived particle states to travel far enough to create tracks in the detector. In extensions of the Standard Model such as supersymmetry (SUSY) or Universal Extra Dimensions (UED) there is often a massive, stable, neutral particle that will leave the detector unnoticed, leading to missing momentum associated with the production and decay of the new particles required in such extensions.
For these reasons there has been much work developing techniques to determine the mass of the new particles at hadron colliders such as the LHC. Significant information comes from the endpoints of kinematic invariant distributions. This is illustrated in the simple case that a short-lived state Y undergoes a three-body decay to a lepton pair plus the escaping neutral particle N (the LSP in supersymmetry), Y → l + + l − + N. For this decay the invariant mass m 2 ll ≡ (k l − + k l + ) 2 has a maximum value equal to the mass difference New states appear as bumps in the m ll distribution where one can read off the mass difference from the upper edge of the bump 1 . If Y undergoes a two-body decay to an on-shell intermediate state Y → X + l + → N + l + + l − , then the shape of the m ll distribution will be more like a right triangle with a vertical drop, and the maximum m ll is given by These techniques have been extensively used to study SUSY in the context of the LHC (see Ref [3] for a study of several models). Using events with four leptons in the final states and missing energy, Ref [4] shows how such edges can form a Dalitz-like plot to determine information about the mass spectra of new states. In short, such edges can accurately determine relations between the masses of the unknown particles but not the mass M N . The task of determining the complete mass spectra is therefore dependent on determining M N .
Much of the work in determining the M N in a hadron collider focuses on a cascade of decays. The idea is to use events that contain many final states so that one can find enough edges of invariant mass distributions to invert the relationships and solve for the masses or SUSY model parameters. For example Bachacou, Hinchliffe, Paige [5] use a sequenceq → q +χ o 2 →l − + l + + q → l − + l + + q +χ o 1 involving four new states in the event. One can form four invariant mass distributions from these final states, and one has four unknown masses. This set of constraints sometimes has multiple solutions. Fitting the shapes of the distributions can lift this degeneracy as was shown in Miller, Osland, Raklev (MOR) [6] and Lester [7].
Is there a way to find M N if there are only three new states involved in the event? Cheng, Gunion, Han, Marandella, McElrath (CGHMM) [8] study pair-produced states, Y, that decay via an on-shell intermediate state X. An example scenario would be pair-producedχ o 2 s where each branch decays viaχ o 2 →l + + l − → l − + l + +χ o 1 or its conjugate. Their events consist of four leptons and missing energy. They analyze each event's kinematics for compatibility with on-shell condition for the assumed topology. To make their approach robust against background and finite resolution error, they form distributions and use the shape to determine the unknown masses.
Finally, what can one determine from events which involve only two new states? Cho Choi Kim and Park (CCKP) [9,10] show how to use the Cambridge transverse mass variable M T 2 of Lester and Summers [11] to find M Y (which in their case was the gluino mass) assuming a three-body decay toχ o 1 and q,q. Their example uses about 40000 events where gluinos are pair produced and decay to four jets and missing energy. The M T 2 variable is a function χ which is an assumed mass of M N . One plots the maximum M T 2 (χ) over the 40000 events as a function of χ. A kink appears in the function at the correct M N and M Y 2 . Using this approach, CCKP find M Y and M N to about ±2 GeV for the case where M + /M − ≈ 1.3 where In this paper we will concentrate on the latter possibility involving the production of only two new states. Our particular concern is to use the available information as effectively as possible to reduce the number of events needed to make an accurate determination of M Y and M N . The main new ingredient of the method proposed is that it does not rely solely on the events close to the kinematic boundary but makes use of all the events. Our method constrains the unobserved energy and momentum such that all the kinematical constraints of the process are satisfied including the mass difference, eq(1), which can be accurately measured from the ll spectrum. This increases the information that events far from the kinematic boundary can provide about M Y and significantly reduces the number of events needed to obtain a good measurement of the overall mass scale. Although we develop the method for the case that Y decays via a three-body decay to an on-shell final states Y → N + l + + l − , its generalization to other processes is straightforward 3 .
In Section 2, we introduce the M 2C distribution whose endpoint gives M Y , and whose distribution can be fitted away from the endpoint to determine M Y and M N before one has enough events to saturate the endpoint. Section 3 estimates the performance for a few SUSY models where we include approximate detector resolution effects and where we expect backgrounds to be minimal. Finally we conclude and discuss directions for further research. Appendix A discusses the relationship between our distribution and the kink in M T 2 (χ) of CCKP and how this relationship can be used to find M 2C in a computationally efficient manner. Appendix B provides details of our simulations.
2 For a recent study on situations which lead to kinks using the transverse mass see ref [12] 3 We note that the on-shell intermediate case studied by CGHMM is also improved by including the relationship measured by the edge in the ll distribution on each event's analysis. The Y decay channel with an on-shell intermediate state X has an edge in the ll invariant mass distribution which provides a good determination of the relationship max m 2   We adapt the concept from M T 2 of minimizing the transverse mass over the unknown momenta to allow for the incorporation of all the available information about the masses. To do this we form a new variable M 2C which we define as the minimum mass of the second to lightest new state in the event M Y constrained to be compatible with the observed 4-momenta of Y 's visible decay products with the observed missing transverse energy, with the four-momenta of Y and N being on shell, and with the constraint that M − = M Y − M N is given by the value determined by the end point of the m 12 distribution. The minimization is performed over the ten relevant unknown parameters which may be taken as the 4-momenta p and q of the states N , and the lab-frame collision energy P o and longitudinal momenta P z . We neglect any contributions from unobserved initial state radiation (ISR). Thus we have subject to the 7 constraints Although one can implement the minimization numerically or by using Lagrange multipliers, we find the most computationally efficient approach is to modify the M T 2 analytic solution from Lester and Barr [13]. Details regarding implementing M 2C and the relation of M 2C to M T 2 and the approach of CCKP are in Appendix A. Errors in the determined masses propagated from the error in the mass difference in the limit of k = 0 are given by where δM − is the error in the determination of the mass difference M − . To isolate this source of error from those introduced by low statistics, we assume we know the correct M − , and one should consider the error described in eq(8) as a separate uncertainty from that reported in our initial performance estimates in the next section.
Because the true p, q, P o , P z are in the domain over which we are minimizing, M 2C will always satisfy M 2C ≤ M Y . The equality is reached for events with either m 12 or m 34 smaller than M − , with p z /p o = α z /α o , and q z /q o = β z /β o , and with the transverse components of α parallel to the transverse components of β.
The events that approximately saturate the bound have the added benefit that they are approximately reconstructed (p and q are known). If Y is produced near the end of a longer cascade decay, then this reconstruction allows one to determine the masses of all the parent states in the event. The reconstruction of several such events may also aid in spin correlation studies.
In order to determine the distribution of M 2C for the process shown in fig 1, we computed it for a set of events generated using the theoretical cross section and assuming perfect detector resolution and no background. Details of the simulation are in Appendix B. Figure  One can also see that as M + /M − becomes large, the M Y determination will be hindered by the small statistics available near the endpoint or backgrounds. To alleviate this, one should instead fit to the entire distribution. However it is clear that events away from the endpoint also contain information about the masses. For this reason we propose to fit the entire distribution of M 2C and compare it to the 'ideal' distribution that corresponds to a given value of the masses. As we shall discuss this allows the determination of M Y with a significant reduction in the number of events needed. This is the most important new aspect of the method proposed here.

Application of the method : SUSY model examples
To illustrate the power of the fit to the full M 2C distribution, we now turn to an initial estimate of one's ability to measure M Y in a few specific supersymmetry scenarios. Our purpose here is to show that fitting the M 2C distribution can determine M Y and M N with very few events. We include detector resolution effects but neglect backgrounds but assume k = 0 in the simulation. We calculate M 2C for the case where the the analytic M T 2 solution of Barr and Lester can be used to speed up the calculations as described in Appendix A. Details on our calculations and simplifying assumptions can be found in Appendix B. A more complete detailed study will follow in a subsequent publication.
Although fitting the M 2C distribution could equally well be applied to the gluino mass studied in CCKP, we explore its applications to pair-producedχ o 2 . We select SUSY models whereχ o 2 decays via a three-body decay to l + + l − +χ o 1 . The four momenta α = p l + + p l − for the leptons in the top branch, and the four momenta β = p l + + p l − for the leptons in the bottom branch.
The production and decay cross section estimates in this section are calculated using MadGraph/MadEvent [14] and using SUSY mass spectra inputs from SuSpect [15]. The distributions in this section still neglect background, but scale the α and β four vectors by a scalar normally distributed about 1 with the width of to simulate the typical LHC detector lepton energy resolution [16,17]. The missing transverse momentum is assumed to be whatever is missing to conserve the transverse momentum after the smearing of the leptons momenta. We do not account for the greater uncertainty in missing momentum from hadrons or from muons which do not deposit all their energy in the calorimeter and whose energy resolution is therefore correlated to the missing momentum. Including such effects requires a more detailed detector simulation and is beyond the scope of this Letter. These finite resolution effects are simulated in the determination of the ideal distribution and in the small sample of events that is fit to the ideal distribution to determine M Y and M N . We do not expect expanded energy resolutions to greatly affect the results because the resolution effects are included in both the simulated events and in the creation of the ideal curves which are then fit to the low statistics events to estimate the mass. We consider models where the three-body decay channel forχ o 2 will dominate. These models must satisfy mχo 2 − mχo 1 < M Z and must have all slepton masses greater than the mχo 2 . The models considered are shown in Table 1. The Min-Content model assumes that there are no other SUSY particles accessible at the LHC other thanχ o 2 andχ o 1 and we place mχo 1 and mχo 2 at the boundary of the PDG Live exclusion limit [18]. SPS 6, P1, and γ are models taken from references [19], [20], and [21] respectively. Each has the χ 2 decay channel to leptons via a three-body decay kinematically accessible. We will only show simulation results for the masses in model P1 and SPS 6 because they have the extreme values of M + /M − with which the performance scales. The Min-Content model and the γ model are included to demonstrate the range of the masses and production cross sections that one might expect.
Bisset, Kersting, Li, Moortgat, Moretti, and Xie (BKLMMX) [4] have studied the 4 lepton + missing energy standard model background for the LHC. They included contributions from jets misidentified as leptons and estimated about 190 background events at a L = 300 fb −1 which is equivalent to 0.6 fb. Their background study made no reference to the invariant mass squared of the four leptons, so one only expects a fraction of these to have both lepton pairs to have invariant masses less than M − . Their analysis shows the largest source of backgrounds will most likely be other supersymmetric states decaying to four leptons. Again, one expects only a fraction of these to have both lepton pairs with invariant masses within the range of interest. The background study of BKLMMX is consistent with a study geared towards a 500 GeV e + e − linear collider in ref [22] which predicts 0.4 fb for the standard model contribution to 4 leptons and missing energy. The neutralino decay to τ leptons also provide a background because the τ decay to a light leptons l = e, µ (Γ τ →lν l /Γ ≈ 0.34) cannot be distinguished from prompt leptons. The neutrinos associated with these light leptons will be new sources of missing energy and will therefore be a background to our analysis. The di-τ events will only form a background when both opposite sign same flavor τ s decay to the same flavor of light lepton which one expects about 6% of the time. Table 2 breaks down the LHC production cross section for pair producing twoχ o 2 in each of these models. In the branching ratio to leptons, we only consider e and µ states as the τ will decay into a jet and a Model Min Content (ref [18]) SPS 6 (ref [19]) P1 (ref [20]) γ ( ref [21])      neutrino introducing more missing energy. Direct pair production ofχ o 2 has a rather modest cross section, but production via a gluino or squark has a considerably larger cross section but will be accompanied by additional QCD jets. One does expect to be able to distinguish QCD jets from τ jets [23].
We now estimate how well one may be able to measure mχo 1 and mχo 2 in these models. Figures 3 and  4 show a χ 2 fit 4 of the M 2C distribution from the observed small set of events to 'ideal' theoretical M 2C distributions parameterized by mχo 2 . The 'ideal' theoretical distributions are calculated for the observed value of M − using different choices for mχo 2 . A second-order interpolation is then fit to these points to estimate the value for mχo 2 . The 1 σ uncertainty for mχo 2 is taken to be the points where the χ 2 increases from its minimum by one.
The difficulty of the mass determination from the distribution grows with the ratio M + /M − . Figures  3 and 4 show the two extremes among the cases we consider. For the model P1 M + /M − = 3.2, and for model γ M + /M − = 3.3. Therefore these two models can have the mass of mχo 2 and mχo 1 determined with approximately equal accuracy with equal number of signal events. Figure 3 shows that one may be able to achieve ±6 GeV resolution after about 30 fb −1 . Model SPS 6 shown in fig 4 represents a much harder case because M + /M − = 13.6. In this scenario one can only achieve ±20 GeV resolution with 3000 events corresponding to approximately 150 fb −1 . In addition to these uncertainties, one needs to also consider the error propagated from δM − in eq(8).

Summary and Conclusions
We have proposed a method to extract the masses of new pair-produced states based on a kinematic variable, M 2C , which incorporates all the known kinematic constraints on the observed process and whose endpoint determines the new particle masses. However the method does not rely solely on the endpoint but uses the full data set, comparing the observed distribution for M 2C with the ideal distribution that corresponds to a given mass. As a result the number of events needed to determine the masses is very significantly reduced so that the method may be employed at the LHC event for processes with electroweak production cross sections.
We have performed an initial feasibility study of the method for several supersymmetric models. This includes the effect of detector resolution but not backgrounds, cuts and combinatoric complications but was modeled with an assumption that k = 0. We demonstrated that for some of the models studied we are able to determine the masses to within 6 GeV from only 250 events. This efficiency is encouraging although a study including more of the real-world complications is needed to augment this initial study.
The method we advocate here can be readily extended to other processes. By incorporating all the known kinematical constraints, the information away from kinematical end-points can, with some mild process dependent information, be used to reduce the number of events needed to get mass measurements. We shall illustrate this for other cases elsewhere[in preparation]. Appendix A : Using M T 2 to Find M 2C The variable M T 2 , which was introduced in by Lester and Summers [11], is equivalent to subject to the 7 constraints (P o , 0, 0, P z ) = p + q + α + β + k (13) As is suggested in the simplified example of [24], the minimization over P o and P z is equivalent to assuming p and α have equal rapidity and q and β have equal rapidity. Implementing this eq(10) reduces to the traditional definition of the Cambridge transverse mass. By comparing M T 2 (χ) as defined above to M 2C defined in eq(3), one can see that they are very similar with the exception that the constraint eq(7) is replaced by the constraint eq (14). M 2C can be found by scanning M T 2 (χ) for the χ value that such that the constraint in eq (7) is also satisfied. One At k = 0 the maximum χ of such an intersection occurs for χ = M N which is why the endpoint of M 2C occurs at the correct M Y and why this corresponds to the kink of CCKP. Because Barr and Lester have an analytic solution to M T 2 in ref [13] in the case k = 0, this is computationally very efficient as a definition.

Appendix B: Numerical Simulation Details Numerical simulation of "ideal" events
In order to determine the distribution of M 2C for the processes shown in fig 1, it is necessary to generate a large sample of "ideal" events corresponding to the physical process shown in the figure. For simplicity in the numerical simulations included in this note we always assume k = 0 and decay via an off-shell Z-boson as this is what could be calculated quickly and captures the essential elements to provide an initial estimate of our approach's utility. Even under these assumptions, one might expect that the shape of distribution depends sensitively on the parton distribution and many aspects of the differential cross section and differential decay rates. Surprisingly this is not the case; the shape of the distribution depends sensitively only on two properties: (i) the shape of the m 12 (or equivalently m 34 ) distributions.
In the examples studied here for illustration we calculate the m 12 distribution assuming it is generated by a particular supersymmetric extension of the Standard Model, but in practice one should use the measured distribution which is accessible to accurate determination. The particular shape of m 12 does not greatly affect the ability to determine the mass of N and Y so long as one can still find the endpoint to determine M Y −M N and use the observed m ll distribution to model the shape of the M 2C distribution.
(ii) the angular dependence of the N 's momenta in the rest frame of Y . In the preliminary analysis presented here we assume that in the rest frame ofχ o 2 ,χ o 1 's momentum is distributed uniformly over the 4π steradian directions. While this assumption is not universally true it applies in many cases and hence is a good starting point for analyzing the efficacy of the method.
Under what conditions is the uniform distribution true? Note that theχ o 2 's spin is the only property ofχ o 2 that can break the rotational symmetry of the decay products. Forχ o 2 's spin to affect the angular distribution there must be a correlation of the spin with the momentum which requires a parity violating coupling. Consider first the Z contribution. Since one is integrating over the lepton momenta, the parity violating term in the cross section coming from the lepton-Z vertex vanishes and a non-zero correlation requires that the parity violating coupling be associated with the neutralino vertex. The Z-boson neutralino vertex vanishes as the Z interaction is proportional toχ o 2 γ 5 γ µχo 1 Z µ orχ o 2 γ µχo 1 Z µ depending on the relative sign of mχo 2 and mχo 1 eigenvalues. However if the decay has a significant contribution from an intermediate slepton there are parity violating couplings and there will be spin correlations. In this case there will be angular correlations but it is straightforward to modify the method to take account of correlations. We hope to study this in another publication 5 .
Even in the case that the slepton contribution is significant the correlations may still be absent. Because we are worried about a distribution, the spin correlation is only of concern to our assumption if a mechanism aligns the spin's of theχ o 2 s in the two branches. Table 2 shows that most of theχ o 2 one expects follow from decay chains involving a squark, which being a scalar should make uncorrelated the spin of theχ o 2 in the two branches. One would then average over the spin states ofχ o 2 and recover the uniform angular distribution ofχ o 1 's momentum inχ o 2 's rest frame. Once one has fixed the dependencies (i) and (ii) above, the shape of the distribution is essentially independent of the remaining parameters. To illustrate this result we show in fig 6 two cases: (1) The case that the collision energy and frame of reference and angle of the produced Y with respect to the beam axis are distributed according to the calculated cross section for the process considered in Section 3 in whichχ o 2 decays via Z exchange to the three-body state l + + l − +χ o 1 , convoluted with realistic parton distribution functions.
(2)The case that the angle of the produced Y with respect to the beam axis is arbitrarily fixed at θ = 0.2 radians, the azimuthal angle φ fixed at 0 radians, and the total 4-momentum of the colliding particles arbitrarily set to P = (500, 0, 0, 0) GeV.
The left plot of fig 6 shows the two distributions intentionally shifted by 0.001 to allow one to barely distinguish the two curves. On the right side of fig 6 we show the difference of the two distributions with the 2 σ error bars within which one expects 95% of the bins to overlap 0 if the distributions are identical. In addition to tests with k = 0, we also tested that k 20 GeV does not change the shape of the distribution to within our numerical uncertainties for any of our results. In a test case where we constructed events with M Y = 150 GeV and M N = 100 GeV, with √ k 2 uniformly distributed with between 2 of 20 GeV, with | k/k 0 | = 0.98, and with uniform angular distribution, we found the M 2C distribution agreed with the distribution shown in figure 6 within the expected error bars after 10000 events. Scaling this down to the masses studied in P 1 we trust these results remain unaffected for k 20 GeV. Introduction of cuts on jets and missing traverse energy will probably introduce some dependence on the COM energy of the collision that is absent in this ideal case.
Given this structure detailed in (i) and (ii) above we calculate the "ideal" distributions for M 2C assuming that k = 0 and that in the rest frame of Y there is an equal likelihood of N going in any of the 4π steradian directions. The observable invariant α 2 is determined according to the differential decay probability of χ o 2 to e + e − and χ o 1 through a Z-boson mediated three-body decay. Analytic expressions for cross sections were obtained from the Mathematica output options in CompHEP [28] Inclusion of backgrounds will change the shape. Backgrounds that one can anticipate or measure, like di-τ s or leptons from other neutralino decays observed with different edges can be modeled and included in the ideal shapes used to perform the mass parameter estimation. A more complete study is beyond the scope of this letter and will follow in a subsequent publication.

Least squares fit
In order to determine M Y it is necessary to quantify the comparison between the N observed events and the "ideal" events. To do this we define a χ 2 distribution by computing the number of events, C j , in a given range, j, (bin j) of M 2C . Assuming a Poisson distribution, we assign an uncertainty, σ j , to each bin j given by Here the normalized distribution of ideal events is f (M 2C , M Y ), and the second term has been added to ensure that the contribution of bins with very few events, where Poisson statistics does not apply 6 , have a reasonable weighting. Then χ 2 is given by The minimum χ 2 (M Y ) is our estimate of M Y . The amount M Y changes for an increase of χ 2 by one gives our 1 σ uncertainty, δM Y , for M Y [29]. As justification for this we calculate ten different seed random numbers to generate ten distinct groups of 250 events. We check that the M Y estimates for the ten sets are distributed with about 2/3 within δM Y of the true M Y as one would expect for 1 σ error bars. One might worry that with our definition of χ 2 , the value of χ 2 per degree of freedom is less than one. However this is an artifact of the fact that the bins with very few or zero events are not adequately described by Poisson statistics and if we remove them we do get a reasonable χ 2 per degree of freedom. The determination of M Y using this reduced set gives similar results.