Energy spectra of massive two-body decay products and mass measurement

We have recently established a new method for measuring the mass of unstable particles produced at hadron colliders based on the analysis of the energy distribution of a massless product from their two-body decays. The central ingredient of our proposal is the remarkable result that, for an unpolarized decaying particle, the location of the peak in the energy distribution of the observed decay product is identical to the (fixed) value of the energy that this particle would have in the rest-frame of the decaying particle, which, in turn, is a simple function of the involved masses. In addition, we utilized the property that this energy distribution is symmetric around the location of peak when energy is plotted on a logarithmic scale. The general strategy was demonstrated in several specific cases, including both beyond the SM particles, as well as for the top quark. In the present work, we generalize this method to the case of a massive decay product from a two-body decay; this procedure is far from trivial because (in general) both the above- mentioned properties are no longer valid. Nonetheless, we propose a suitably modified parametrization of the energy distribution that was used successfully for the massless case, which can deal with the massive case as well. We establish the accuracy of this parametrization using concrete examples of energy spectra of Z bosons from the decay of a heavier stop into a Z boson and a lighter stop. We then study a realistic application for the same process, but now including dominant backgrounds and using foreseeable statistics at LHC14, in order to determine the performance of this method for an actual mass measurement. The upshot of our present and previous work is that, in spite of energy being a Lorentz-variant quantity, its distribution emerges as a powerful tool for mass measurement at hadron colliders.


Introduction
Extensions of the Standard Model (SM) at the TeV scale are very well-motivated for several reasons, including solving the Planck-weak hierarchy problem and the attractiveness of weakly-interacting massive particle (WIMP) as Dark Matter (DM) of the Universe. In this respect, it is expected that new physics signatures will be discovered at the second phase of the Large Hadron Collider (LHC) and at future colliders. Once we establish a signal for new particles, it is of course crucial to carry ouy measurements in order to identify the underlying dynamics governing the new particles. Of various properties, we particularly focus on determining the mass of such new, heavy, (un)stable particles using the observed energy/momentum of its decay products at hadron colliders.
Some desirable features of a mass measurement methods are worth spelling out, as we do in the following.
• In view of the little a priori knowledge of the dynamics of the new particles (at least to begin with), methods for mass measurement of a new particles should ideally be based simply on the kinematics of its decay and not rely heavily on assuming particular dynamics of the states to be measured, i.e., it is advantageous if the strategies are independent of details of the production mechanism (e.g., matrix elements, proton PDFs or the actual partons initiating the production).
Of course, many such kinematics-based techniques have long been proposed, starting with the simplest case where the decay products are all visible and the complete and unambiguous reconstruction of the decaying particle four-momentum is possible on an event-by-event basis. In this case, the resonant peak in the invariant mass of the decay products -which is described by the standard Breit-Wigner (BW) shape -can provide a robust measurement of its mass, e.g., the case of the Higgs boson (→ γγ) or Z boson (→ l + l − ) in the past or for a Z boson in the future.
However, in other cases, even if the decay is fully visible, the mother particle is often produced in pairs so that full reconstruction faces a combinatorial ambiguity in associating the right set of decay products to each mother, and thus it might not be possible to determine the mother mass on an event-by-event basis. 1 In other words, even in the narrow decay width approximation, we might still get a "broad" distribution of invariant mass, that too possibly with a shifted peak, due to the inclusion of "wrong" combinations of the invariant mass. Of course, one can resolve this ambiguity statistically with a prediction of the resulting "modified" BW shape, but this prescription typically requires knowledge on the underlying physics (e.g., production mechanism), which invalidates the strategy to measure new physics masses without prior knowledge of their dynamics.
In addition to the fully visible case, there are cases of a semi-invisible decay of a heavy particle, 2 in which the decay produces both visible particles and invisible ones. Clearly, even with single production of the mother and a single invisible particle in decay chain, it is not possible in such a case to reconstruct the resonance mass on an event-by-event basis at hadron colliders. The reason is that, although the transverse momentum of the invisible particle is known via the "missing" transverse momentum (henceforth called MET), its longitudinal component and mass are a priori unknown. 3 Actual measurements can be even more challenging since often such particles are pair-produced so that the missing transverse momentum is shared between (at least) two invisible particles; furthermore, one might face combinatorics even for the visible part in the case where each parent particle decay consists of 2 or more particles.
Although developed with mass measurement of new particles in mind, these methods can of course be "tested" via mass measurement of SM particles, for example, the top quark. In fact, this has already been done for top quark mass measurement using the kinematic endpoints of invariant mass and M T 2 variables [69]. These applications are particularly worth noticing as they have been used also to provide a model-independent measurement of the top quark mass, to be compared with previous methods determining the top quark mass more precisely but with many more assumptions on the knowledge of SM matrix elements in production and decay of top quarks at hadron colliders. In this sense the merit of being based on kinematics has already granted these ideas a certain recognition. Furthermore, the same "transverse" methods can be used as discriminators in the search for such semi-invisibly decaying new particles, i.e., even before mass measurement: for a review of such search strategies, including others such as α T variable [70], see, for example, Ref. [67]).
In order to frame the work that will be carried out in this paper, it is worth discussing potential limitations of the above-listed methods, despite their model-independent nature and even a successful application to real experimental data. As mentioned before, combinatorial ambiguity is often challenging in constructing the relevant observables. Additionally, the distributions are sometimes characterized by a long tail so that it may be very difficult to identify the true location of kinematic endpoints. Finally, the variables involving MET are typically affected by detector effects: the point being that even if the decay of interest does not result in quarks/gluons, jets are ubiquitous at hadron colliders and their measurement becomes a part of accurately determining MET.
From this series of considerations, it is clear that there is no single best method for mass measurement. Thus, in order to compensate for possible shortcoming of these methods, new observables for mass measurement are needed and should be devised keeping in mind the following points. For example, the new methods can be useful if they have different, possible little, sensitivity to systematics affecting previous methods, e.g., by avoiding the use of MET, be less sensitive to combinatorics or assumptions about boosts, or work even for a single visible particle in the relevant decay chain. With the above goal in mind, over the past few years, an exciting idea has emerged: • The mass of a decaying particle can be measured at hadron colliders using the energy spectrum of a (till the present work) massless daughter from the decay, with essentially no a priori knowledge of the dynamics governing the measured particle.
We emphasize that this idea sets a new paradigm in the sense that opens the way to use energy, which is a frame-dependent quantity, to obtain robust model-independent information on masses, which are instead frame-invariant. In more detail, we consider the two-body decay of a heavy particle (mother) into one massless, visible particle (daughter) along with another particle. The specification of the latter decay product is irrelevant to the subsequent argument except that its mass parameter enters the relevant formulae. We further assume that the mother particle is produced without any preference for its state of polarization and with a generic boost distribution, which are typical conditions at hadron colliders. Under these circumstances, we have made a remarkable observation [71]: • The location of the peak in the energy distribution of the massless daughter is exactly at the value of energy of the daughter in the rest frame of the mother. 5 Moreover, this rest-frame energy value is simply given by a function of the masses of the mother particle and the other decay product, enabling us to determine the associated combination of mass parameters from the measurement of the energy-peak. Certainly, if the mass of the other decay product is obtained from another independent measurement, the mass of the mother particle is straightforwardly determined, and vice versa. A few comments on this "energy-peak" result and the associated technique for mass measurement are in order. First of all, the energy-peak is invariant under variations in the boost distribution of the mother, which, in turn, depends on details of the production mechanism including matrix element, collider energy, parton distributions, the possibility of initial state radiation, and so on. This fact is striking, because, as one would naturally guess, the overall distribution changes upon variation of these physical quantities, given that energy itself is not Lorentz-invariant. Despite the change of shape of the distribution, under the simple and generic assumptions listed above, it is a rigorous and robust result that the peak position does not change. Hence, modulo the assumption of unpolarized mother particles, this energy-peak method for measuring masses is indeed kinematics-based, i.e., without involving the details of underlying models. Moreover, the method does neither involve any combinatorics, as it is not necessary to associate each particle to their parent particle, nor use MET, as the only quantity used is the energy of visible particles. Thus the energy-peak method is clearly complementary to the existing methods.
In addition to the robust statement on the location of the peak, one can also show that the energy distribution for the massless daughter is symmetric with respect to the peak with the energy being plotted on logarithmic scale. Predicated upon these, and a couple of other properties that can be proven from first principles, a fitting function was developed in [71] as a model for the actual theory curve. The underlying goal was to aid the extraction of the peak position from the relevant data, given that the peak tends to be rather broad. The fitting function contains only two fit parameters responsible for the peak position and the width of the energy distribution, hence the analysis of energy spectra becomes similar in spirit to a Breit-Wigner shape analysis for the invariant mass distribution of a resonance. 6 It is worth emphasizing that one could of course obtain the true energy distribution as a (numerical) "function" of the relevant mass parameters, convolving model-dependent information, and thus use it to determine those masses. However, it would obviously be considered as a fully dynamics-based approach which is contrary to our basic philosophy here. 5 See also Refs. [72][73][74] for related recent work and Ref. [75] for an earlier discussion on this property in the context of cosmic-ray physics. 6 Of course, the latter is truly derived from 1st principles, while the former is only "inspired" that way.
Remarkably, the above fitting function was actually shown to be able to reproduce sufficiently well the theory prediction for energy spectra for numerous cases with massless daughters, for example: i) the bottom quark energy from the decay of top quarks in the SM produced at the LHC7 [71]; and ii) new physics examples as the spectrum of both bottom quarks in gluino cascade decay at LHC14 [76]. This function has also been studied and found to work for bottom quark energy spectra arising from the decay of fermionic quarks with mass ∼ 1 TeV and scalar top quarks decaying in chargino and bottom quarks in the MSSM [77]. Building on the accuracy in reproducing the theory predictions that has been demonstrated using this fitting function, the location of the energy-peak extracted by this fit has been applied for measuring masses: e.g., of the top quark at LHC7 [71] and gluino and sbottom at LHC14 [76]. In particular, as part of the application for measuring gluino mass (at least for some choice of spectra), one could also determine the mass of the invisible neutralino [76], remarkably without measuring MET at all. Furthermore it has been found that this fitting function can describe accurately b-jet energy spectra from top quark production at the LHC including effects from next-toleading order corrections to production and decay mechanism [78]. An adapted version of the energy-peak idea was used in Ref. [82] for determining the Kaluza-Klein graviton mass arising from a warped extra dimensional framework. Above analyses were of course using simulated data. In fact, the CMS collaboration has recently published a measurement of the top quark mass that follows our proposal [79], resulting in a measurement of top quark mass: 172.29±1.17(statistical)±2.66(systematic) GeV, which is consistent with the current world average (using other methods). The results of this analysis of the 8 TeV LHC dataset, together with preliminary results of the calculation of the missing higher-order contribution mentioned earlier [78], indicate very promising prospects for the extraction of the top quark mass with sub-GeV accuracy once more data from the 13 TeV run will be available.
For the sake of completeness, we would also like to mention other (i.e., beyond mass measurement) applications of the energy peak method. For example, Ref. [80] used energypeaks for distinguishing Z 2 DM stabilization symmetry from Z 3 in conjunction with the M T 2 variable. Its potential use for distinguishing bottom quarks from SM top quark decay from those from decays of its supersymmetric partner (stop) was mentioned briefly in Ref. [81]. Finally, the energy-peak observation was applied to interpret the Galactic Center GeV gamma-ray excess in Ref. [83].
All the above witnesses how the idea of energy-peak has become a developed and articulated research program extending to various sub-disciplines of particle physics. Inspired by the general fruition of the above program, • in this paper we study the generalization of the energy-peak method to the case of a massive daughter particle. 7 As in previous applications, we focus on the two-body decay of an unpolarized mother, produced with a generic boost distribution. The motivation for such a step is clear: it is not only that many daughters of a two-body decay are massive, but also that the phase-space of a three-body decay (say, into two massless daughters) can be sliced into several "effective" two-body ones, i.e., consisting of a massive (single) body made of these two daughters with a fixed invariant mass and the third daughter. Such an idea allows an extension of the energy-peak method to the case of multi-body decays, hence makes the extension to massive daughters from two-body decay highly desirable and motivated. Results specific to three-body decay, based on the finding of this work, are presented in a related paper of ours [84]. As a disclaimer, we would like to mention that (as is explained below) for the case of a massive daughter, the energy-peak method will be less robust (i.e., more empirical than theory-based) than for massless case, but still we will show that it is quite useful. First of all, the symmetry property of the energy distribution on logarithmic scale can be shown to be violated "as soon as" the daughter has non-zero mass. Of course, this violation may be negligible if the daughter is very light, for example, bottom quark from top quark decay, as was studied in [71], but it would not be so if one studied W boson energy spectra in the same context. In addition, one finds that the energy-peak shifts from its rest-frame value, provided that the mother particle can be produced with sufficiently large boost at the collider under study. Of course, the significance of these effects depends on the boost distribution of the mother particle, hence on the details of its production environment. Therefore, at least to some extent, • for a massive daughter energy spectrum, the shift in the peak position and the asymmetry of the energy distribution on logarithmic scale become model-dependent, although they are very often small.
Thus, a priori, it seems rather difficult to repeat the success of the massless case here. Fortunately, the successful implementation of the fitting template for the massless case suggests a path to treat the massive daughter case, which arises from the following observation.
• The general form of the energy distribution for massless and massive daughters (i.e., as an integral over distribution of boosts of the parent particle) look "similar". Hence by exploiting suitable matching conditions in limiting cases we can leverage the success of the fitting function for the massless case and suggest a suitably modified model for massive daughter energy spectra that can accommodate all the relevant features of the spectrum.
As with the massless case, we must validate the fitting function, as this function is largely motivated by prime principles, but not entirely. In fact, for the massive case, it is all the more crucial to do so, since (as mentioned above) the theory behind energy-peak is on a less firm ground than for massless case. Therefore, we thoroughly test the new fitting function on the theoretical energy distribution of Z boson coming from decay of the heavier supersymmetric top quark partner to the lighter one. In particular, varying the mass gap between the two supersymmetric top quarks provides us the flexibility (as desired for a systematic evaluation of the fitting function) in terms of the amount of "massiveness" of the Z boson (i.e., its boost in the rest frame of the mother particle). Having developed confidence in the new fitting function, we then apply the massive energy-peak method for measuring masses in the same process including background, cuts, and realistic statistics at the LHC14.
Here is the outline for the rest of this paper. We begin with a discussion of the above derivation of the massive fitting function in Sec. 2. The detailed testing is performed in Sec. 3. In Sec. 4, we discuss the application for measuring masses of the supersymmetric top quark partners. Sec. 5 is reserved for our conclusions and outlook.
2 Developing a template for massive decay products We first revisit (in section 2.1) the derivation of the fitting function that we proposed for massless visible particles in Ref. [71]. Utilizing a similar formalism, in section 2.2, we can find out the general structure of the energy spectrum of a massive particle from a two-body decay and motivate a new fitting function that can deal with the massive case.
To begin with, we outline some notation and basic formulae which are valid for both cases. The process under consideration is a two-body decay: where M denotes "mother" particle, i.e., a heavier particle decaying into a lighter and visible daughter d together with another daughter D. We focus on the energy spectrum of particle d, whereas only mass information of particle D is relevant to the subsequent discussion.
With this simple set-up, it is well-known that the energy and momentum of the visible daughter particle d in the rest frame of the mother particle are expressed in terms of the three masses m M , m D , and m d as: where the usual kinematic triangular function is defined as λ(x, y, z) = x 2 +y 2 +z 2 −2(xy + yz + zx). Here the "starred" quantities denote what would be measured in the rest frame of particle M , while others are understood to be in the laboratory frame. We henceforth call E * d and p * d "rest-frame" energy and momentum of particle d, respectively. In general, the laboratory frame is not the rest frame of the mother particle, therefore, the observed energy of the visible daughter d in the laboratory frame is given by a Lorentz transformation: where γ M describes the boost of particle M in the laboratory frame and θ * M is the emission angle of particle d in the rest frame of the mother particle, which is measured from the boost direction, β M . Throughout this paper, we assume that the mother particles are either scalar or produced in an unpolarized manner so that cos θ * d has a flat distribution.

The energy spectrum of a massless decay product
We now briefly review the case where the visible daughter d is massless. Since p * d = E * d , the Lorentz transformation in eq. (2.4) can be further simplified to (2.5)

Properties
Obviously, the distribution in E d for any fixed (but arbitrary) boost factor γ M is rectangular due to the fact that cos θ * d is a flat variable spanning the range One can easily find that the above range covers x d = 1 (or equivalently E d = E * d ) for any boost factor γ M and it is the only value of x d to enjoy such a property [71]. To get the overall energy distribution, one should "stack up" all resulting rectangles, certainly developing a unique peak at x d = 1. One interesting feature is that E * d appears as the geometric mean of the two endpoints for each rectangle, which implies it becomes the "midpoint" when the energy spectrum is plotted on a logarithmic scale. In other words, the entire energy spectrum is symmetric with respect to E d = E * d in logarithmic scale. One can understand the above heuristic argument more formally by the following integral representation: where g(γ M ) denotes the boost distribution of the mother particle, which encodes all modeldependent information such as the matrix element of production, parton distribution functions, and so on. The upper end in the integral range defines the maximum γ M contributing to x d of interest. Strictly speaking, it is determined by the center of mass energy of the collider under consideration. However, its specific value is irrelevant for the case of a massless visible particle, and thus we simply understand the "infinity" as an arbitrary sufficiently large value. On the other hand, the lower end can be derived from the solution for γ M to the equation, x d = γ M ± γ 2 M − 1, for a given x d , where the positive (negative) signature is relevant to the region of x d ≥ 1 (x d < 1). In order to understand the shape of the energy spectrum we take the first derivative of eq. (2.7), that is, , one should first check whether or not this first derivative vanishes at x d = 1. It can be proven that this point is indeed a maximum both for g(1) = 0 and g(1) = 0 under a well justified assumption of non-vanishing g(γ M ) for any finite non-zero value of γ M . More details on the proof can be consulted in Ref. [71].
We remark that in principle, the integral in eq. (2.7) cannot be performed analytically due to the existence of a model-dependent piece g(γ M ). Nevertheless, we can still exploit some functional properties that the generic f (x d ) should satisfy. We simply enumerate them below without any detailed verification, for which we refer to our work Ref. [71]. The function f : • is a function with an argument of x d + 1 x d , i.e., it is even under the operation of • vanishes as x d approaches 0 or ∞, • tends to a δ-function in some limiting situation.
Here the last property can be interpreted as a boundary condition reflecting the fact that f should return the fixed value as in eq. (2.2) with m d = 0 if g(γ M ) is non-zero valued only at γ M = 1.

The massless ansatz
The challenge in having a closed form of f (x d ) motivates us to come up with a modelindependent ansatz to approximate the true energy distribution. Predicated upon the above-listed properties, the following "simple" function was originally proposed in Ref. [71]: where the normalization factor K 1 (w) is a modified Bessel function of the second kind of order 1, and w is a parameter describing the width of the peak. All model-dependent information is encapsulated in the "width" parameter w, which in general is an indicator of the typical boost of the mother particle; a larger (smaller) value corresponds to a narrower (wider) peak and so fits the case of the mother particle which is typically less (highly) boosted. It is straightforward to see if this ansatz (henceforth called massless fitting template) respects all four properties enumerated before (see Ref. [71] for more details). This function has been tested in the context of i) bottom quark energy spectrum from the decay of SM top quark at LHC-7 TeV [71] and b-jets (including higher order QCD corrections) at LHC-14 TeV [78]; ii) a gluino cascade decay [76]; iii) the mass determination of KK graviton [82], and iv) a DM interpretation for the Galactic Center GeV gamma-ray excess [83]. For all these cases, the massless fitting template was shown to very successfully reproduce the spectrum, and in particular the peak region.

The energy spectrum of a massive decay product
Having reviewed the energy spectrum of a massless daughter from a two-body decay, we now move to the case of a massive daughter.

Properties
We restart from the discussion around eq.(2.6), which is reported here for convenience. The above considerations tell us that for any fixed boost factor γ M the laboratory frame energy distribution is given by a rectangular distribution spanning the range We observe a couple of crucial differences with respect to the case of massless visible particles. First of all, not every single rectangle contains E * d , because when the boost factor for a mother particle is equal to the "critical" boost value given by where γ * d denotes the boost factor of particle d in the rest frame of particle M (that is , then the lower endpoint of x d becomes exactly 1, and therefore, for any γ M greater than γ cr M , the rectangle does not cover x d = 1. A trivial example is the case in which p * d is zero and x d = 1 is populated only when γ M = 1, with all the other boost values populating the region x d > 1. From these considerations we see that for a massive daughter, as p * d = E * d , a priori we cannot conclude that the peak of the energy distribution arises at . We will provide more elaborated analysis on this point shortly. The other difference worth noticing is that for any m d = 0 and any non-zero boost of the mother particle (i.e., γ M > 1), E * d is no longer the geometric mean of the upper and the lower endpoints of each rectangle, that is: This relation implies that, unlike for massless daughter particles, the full energy spectrum, that results from simply stacking up those rectangles, is not symmetric in logarithmic scale. Coming back to the first comment, we remark that γ cr M may not be accessible at a given collider, so that x d = 1 may still be the peak of the distribution. To take this possibility into account we define γ kin as the kinematic limit of γ M given by center-of-mass energy √ s of the given collider. For example, γ kin = √ s/(2m M ) for pair-produced mother particles. Obviously, if γ kin exceeds the critical boost given in (2.11), the proof for the invariance property of the energy-peak in the massless case does not hold any longer and we have to deal with the possibility that the peak of the energy distribution appears at a value different from E * , although in practice in some cases the peak may be (very) close to E * , for instance because of a small probability to produce the mother particle with boost exceeding the critical value. We stress again that, if the kinematic limit of γ M is smaller than the critical boost, the invariant nature of the location of the peak stays intact, and still the peak has its own physical implication as in the massless case. However, the symmetry property is always violated even in the case where the peak position is preserved.
More formally, one can write the integral representation of the energy spectrum in terms of the boost distribution of the mother particle, g(γ M ), with explicit dependence on γ kin : where θ(x) is the usual Heaviside step function. Here γ − M (x d ) and γ + M (x d ) denote the minimal and the maximal boost values to contribute to the laboratory frame energy value They can be readily evaluated as the solutions of and we find the two solutions as The massless limit of these expressions is given by the limit γ * d → ∞ and reads as necessary to reproduce the massless result of (2.7). We emphasize that the addition of θ(γ kin − γ M ) enables us to easily keep track of the consequence of γ cr M being larger or smaller than θ(γ kin − γ M ) on the shape of the energy distribution. It is straightforward to derive f (x d ) from eq. (2.13): where we dropped explicit x d dependence of γ ± M to avoid notational clutter. Based on these formulae we carefully investigate the functional behavior of the energy spectrum in the different regions as follows.
(I) The region x d < 1: In this region the sign function in eq.(2.17) becomes +1, hence For the subsequent discussion we provide Figure 1 showing unless both g(γ + M ) and g(γ − M ) simultaneously vanish by accident. This implies that f (x d ) is a monotonically increasing function below x d = 1. On the contrary, in the case of γ kin < γ cr M , the situation is slightly more complicated.
, and it is still increasing beyond x kink but with reduced slope proportional to g(γ − M ) because the first step function simply vanishes. On the other hand, if γ kin < γ * d , both step functions vanish, hence f (x d ) is flat until the point where γ − M (x) = γ kin , and beyond the point it increases with slope proportional to g(γ − M ). Overall, we conclude that for any generic value of γ kin (either larger or smaller than γ cr M ), f (x d ) is an increasing function (possibly with a kink or a plateau region) and no peak can exist below x d = 1.
(II) The region x d ∼ 1: In order to investigate the structure of the f (x d ) in the vicinity of where we have used the two limiting behaviors of γ + from which one can consider three possibilities enumerated below.
(i) g(1) = 0 and g(γ cr M ) > g (1): f is an increasing function near and beyond x d = 1 with a kink at x d = 1. The relevant slope is proportional to g(γ cr Given the fact that f (x d ) eventually vanishes as x d → ∞, a turnover in the slope should arise at x d > 1, i.e., the energy-peak is shifted to be greater than the associated rest-frame energy value.
(ii) g(1) = 0 and g(γ cr M ) < g (1): In this case, f is peaked at x d = 1 because the sign of f (x d ) is flipped at x d = 1. However, the relevant energy distribution is expected not to be smooth at the peak position, as is evident from the fact that f (x d ) is discontinuous at x d = 1.
is a smoothly increasing function at x d = 1 and again, the energypeak should be shifted to On the other hand, if γ kin < γ cr and clearly we see that there exists a peak at x d = 1. The peak appears smooth for g(1) = 0 while it appears as a cusp for g(1) = 0.
(III) The region x d > 1: As the sign function becomes −1, f (x d > 1) is given by As clear from Figure 1, if γ kin > γ cr M , the horizontal line of γ = γ kin intersects with γ + M at Hence, it is conceivable that there is a point where f (x d ) = 0, that is, there is a peak in this interval. However, since the function g(γ) is highly model-dependent, it is rather challenging to make a robust connection between the (shifted) peak position and the underlying physics parameters. For the case of vanishes because the relevant region is kinematically not allowed. On the other hand, if γ kin < γ cr M , we see from Figure 1 is simply a monotonically decreasing function in the range of 1 < x d < x 0 , and becomes vanishing beyond x 0 again because the relevant region is not kinematically accessible.
In summary, for γ kin > γ cr M . f (x d ) increases below x d = 1 and, typically, there is no peak at x d = 1. More specifically, it increases at x d = 1 (possibly with a kink at x d = 1), develops a peak (i.e., shifted) appearing at some point within 1 < x d < x l , and then decreases monotonically until being flattened out to zero. For γ kin < γ cr M , the energy distribution f (x d ) increases below x d = 1 (possibly with kink), develops a peak at x d = 1, and then decreases monotonically.

The massive ansatz
In order to gain an intuition and bootstrap the construction of a suitable fitting function for massive decay products, we first notice that each term in the second line of eq. (2.13) resembles that in eq. (2.7) for the case of massless case. In particular the first term becomes identical to eq. (2.7) when the massless limit eq.(2.16) is taken. As we already found that in this limit the integral form of the spectrum is well modeled by a function that in such limit is just exp (w/2(x + 1/x)), we can infer that, whatever the integrand and the function g are, the integral can be well approximated by an exponential. Therefore the proposed fitting template for massive daughter particles is given by where N is the overall normalization constant. We immediately see that this modified ansatz reproduces the massless fitting function in (2.9) in the massless limit γ * d → ∞. Additionally, for m d = 0, it becomes a δ-function-like distribution as we take w → ∞. This is simply the boundary condition that, for a certain parameter choice, the ansatz becomes a single-valued distribution, as to accommodate the case where mother particles are produced at rest in the laboratory frame. Finally, since γ ± M does not respect the symmetry under x d ↔ 1 x d , it is obvious that the massive ansatz does not satisfy this symmetry property, as it needs to be, given the discussion in the previous sections.
We remark that, unless w → ∞, the peak position of the proposed template is, strictly speaking, always greater than E * d . This seems an unwanted feature of the proposed fitting template, as the original integral representation in eq. (2.13) may be peaked at E * d in the case in which the collider does have enough energy to boost the mother particle to the critical boost of eq.(2.11). This functional feature can be easily seen from eq. (2.25) by taking its derivative and noticing that f M (x d ) = 0 cannot have a solution at x d = 1. To estimate the size of this effect, we solve the equation f M (1 + ) = 0 for a small expansion parameter up to the leading order, and find that for γ * d 2 1 (i.e., the massless limit where the peak is invariant) (2.26) From the above expression we observe that, as is always positive, the peak is always shifted toward larger energies, and coincides with x d = 1 only in the limit w → ∞. Furthermore we observe that the expected amount of shift is exponentially small. Therefore, for a fairly large γ * d , we expect that the massive template can accommodate the energy distribution with the peak being at x d = 1 although there exists a slight mismatch between the actual peak and the peak of the ansatz. In the following it will be clear that in practical applications this shift is never reason of worry and in practice the proposed massive template can model massive daughter energy spectrum, covering both the case where the peak appears at x d = 1 and the case where the peak arises at x d > 1.
For the case in which the peak appears at x d > 1 it is worth remarking that the peak of the spectrum, not appearing at E = E * d , is no longer univocally linked to the masses that we set out to measure. The link in this case is provided by the fitting function we propose, in which E * d is a fitting parameter which may or may not coincide with the peak position. In this sense our present generalization of the energy peak method to massive daughters is closer to a shape analysis than the analogue for massless particles. For this reason a careful test of the accuracy of the proposed fitting function is needed and will be discussed in the following. Specifically, in the next section we thoroughly test the accuracy of the massive fitting template in modelling an actual energy spectrum theory prediction from the decay process of the heavier supersymmetric partner of top quark. We also discuss the advantages of using the massive fitting template by comparing its accuracy with that of the massless template used to fit the same spectra. To highlight the difference between the results obtained with the fitting function proposed in this paper and the one for massless decay products we have introduced in previous work, we extensively examine the accuracy of the massive template as a function of the "massiveness" of the visible particles, ending up determining the range of applicability of this function.

Accuracy of the template for massive particles energy spectra
In order to test the accuracy of the proposed fitting function we study energy spectra of Z bosons from the production and decay process of a supersymmetric top quark partner: pp →t 2t2 , followed byt 2 →t 1 + Z (3.1) wheret 1(2) denotes the lighter (heavier) top squark. For this study we fix the mass oft 2 to be 1 TeV, while varying the mass oft 1 from 400 GeV to 900 GeV we can vary the boost of the Z boson in the rest frame of thet 2 , hence vary the importance of the Z mass in the kinematics. Numerical theory predictions for the Z boson energy spectrum are obtained, at the leading order in perturbation theory, using MadGraph5 aMC@NLO [87] with the parton distribution functions NNPDF23 [88]. For this calculation we also obtain a total cross-section σ(pp →t 2t2 ) = 27.1 fb at the 14 TeV LHC. For each mass spectrum we obtain the theory prediction from 200K unweighted events, which suffice to obtain a prediction with negligible statistical uncertainties for realistic LHC luminosity. As a matter of fact we perform our study normalizing the spectra at integrated luminosity 300 fb −1 .
In order to evaluate the accuracy of our model eq.(2.25), we perform a least-χ 2 fit to the theory prediction obtained from MadGraph5 aMC@NLO. To compare the performance of the two models we also perform a fit using the massless fitting function eq.(2.9).
Actual results for mt 1 = 800 GeV are shown in Figure 2. In this case, since the center of mass energy is 14 TeV, the kinematic limit for the boost factor of a 1 TeVt 2 is 7 (γ kiñ implies that for this particular case the peak of the spectrum is guaranteed to be at E * Z , but as soon as mt 1 > 800 GeV, the collider kinematic limit will exceeds the critical boost, resulting in a shift of the peak position. In Figure 2 we see two different fits to the theory prediction (black dots). The upper left (right) panel shows the results for the fit of signal events with the massive (red solid curve) and the massless (blue dashed curve) templates using data over a small (large) range of energy between 100 GeV and 300 GeV (1200 GeV). The table in the lower panel shows the best-fit E * Z with 1σ error estimate and the associated reduced χ 2 from the massive and the massless templates over the larger energy range. These values are obtained from standard χ 2 variations procedure, however, they do not have any particular statistical meaning, as the data point we used in the fit is a theory prediction (with negligible statistical error). We present these values only as a measure to quantify the accuracy of the fitting template, which is the most accurate as the χ 2 gets smaller. In this sense we are looking at the obtained χ 2 as a (loosely defined) "norm" in the space of functions, that helps us quantify how far from the theory curve are the best models from the family of functions eq.(2.9) and eq.(2.25). Of course the choice of this "norm" is inspired by how our function would be used in an actual measurement, and in particular for the fact that the peak region, which is most important to our method, will drive the χ 2 minimization. Based on the reduced χ 2 we conclude that the massive template provides a better description than the massless one. Results for these two and other intermediate choices of energy range in the fit are summarized in Table 1. The same trends hold even with different fitting ranges as for all four fitting ranges, the massive template yields better χ 2 . Looking at Table 1 we also observe that the massless fitting function tends to overestimate the actual value of the peak, while the massive fitting function only very mildly does so, consistently with the Fitting range (GeV) Massive Template (GeV) Massless Template (GeV) To demonstrate the general validity of our fitting function to model massive particles energy spectra, we carry out a similar analysis for different values of the mass oft 1 , which gives a sample of how large an effect the Z boson mass can give when the momentum released in the decay changes. The accuracy of the fitting functions is shown by the reduced χ 2 values (left panel) and the fractional difference between the theory E * Z and the extracted E * Z (right panel) is given in Figure 3. The results from the massive template are reported by solid lines while those from the massless template are reported by dashed lines. Four different fitting ranges are distinguished by four different colors, and also labelled by index numbers 1 through 4 as in Table 1. In order to manifest the (relative) "massiveness" of the Z boson, we plot both quantities as function to the boost β * of the Z boson in the rest frame of the heavier supersymmetric top.
As discussed before, for β * > 0.87 (i.e., mt 1 800 GeV), the critical boost factor γ cr Z is not exceeded by the kinematic limit for the boost factor of the mother particle, so that we expect that the actual distribution has a peak exactly at E * Z . On the other hand, the massive nature of the visible particle breaks the symmetry property under the x d ↔ 1/x d operation for any choice of mass spectra. Such a breakdown becomes negligible as the Z boson effectively becomes massless. We also observed that our massive template always has a peak at x d > 1 for any mass spectra, but we estimated such a mismatch to be negligible in eq.(2.26). Based on this series of considerations, we anticipate that for β * > 0.87 both massive and massless templates will reproduce the relevant energy spectrum well enough. In fact, our results in Figure 3 supports our expectation. Nevertheless, based on χ 2 values, we observe that the results from the massive template are systematically better than those from the massless template, as expected, because the former accommodates the broken symmetry property while the latter does not. Therefore, we conclude that, for the mass spectra not causing the peak shift, the two templates produce comparable results, although the massive template provides a better description of the theory numerical prediction.
On the contrary, once the massiveness of the Z boson becomes more manifest, that is for low β * < 0.87, several characteristic features start being noticeable. First of all, the χ 2 for the massless template increases rapidly, hence the results become less and less reliable, up to a point in which the massless template would be untenable as a model to the theory prediction. On the contrary the massive template keeps having a low χ 2 , indicating a closer description of the the theory prediction. This is especially true for fits performed in a region closer to the peak, where we think most of the information on the masses is encoded. Clearly, the fact that the massive template embraces the two functional features of the general form of the spectrum eq.(2.13), i.e., the shifted peak and the broken symmetry property, enables a significantly better modelling of the theory prediction.
Despite this successful description of the theory prediction achieved by the massive fitting function, it is worth noting that even such "prime principles savvy" model of the massive daughter energy spectrum, inaccuracies emerge once the massiveness of Z gauge boson becomes more manifest. For example, in Figure 3 we see that for β * 0.5, that corresponds to mt 1 900 GeV, the fractional difference between the theory value and the value of E * identified by the fit becomes larger and it also becomes sensitive to the choice of the fitting range. We remark, however, that this phenomenon is somehow expected. In fact, in the extreme case β * = 0 we can see from eq.(2.13) that the energy spectrum f (x) is just proportional to the boost distribution of the mother particle: f (x) ∝ g(x). In this situation, since the mother particle boost distribution g(x) is completely unaware of the mass of the particles in which the mother particle can decay into, we are certain that the energy spectrum contain virtually no information on the mass of the daughter. For this reason we expect a loss of correlation between the peak of the daughter energy spectrum and the mass of the daughter, hence inaccuracies are expected as β * get smaller.

Application
In this section we apply the main idea elaborated thus far for the mass measurement of new physics particles under a more realistic environment including backgrounds and cuts to isolate the signal. For this purpose the same SUSY example as in the previous section is taken. We emphasize that our application to a SUSY example is purely for demonstrating the use of the massive fitting template to measure masses from energy spectra; in fact the main strategy is not restricted to the case of SUSY, as it is straightforwardly applicable to other new physics models. Furthermore, potential future exclusion by the LHC experiments of the mass spectra under consideration does not alter our results on the usefulness of the techniques to extract the mass of new physics particles using the energy spectrum of massive visible particles.
In the following subsections we first define the signal collider signature (4.1). Then, we identify the relevant SM background processes (4.2) and devise some cuts to suppress them, while keeping a usable signal rate for our mass measurement. The fitting procedure is described in section 4.3, with the final results being presented in section 4.4.

Signal collider signature
We take the same SUSY process employed in Sec. 3, i.e., the pair-produced heavier supersymmetric top quarks,t 2 . We assume the heavy stops to decay into a lighter supersymmetric top quark,t 1 , and a Z boson, thent 1 in turn is assumed to decay into a top quark and the lightest neutralino:t Since the top quark and Z gauge boson themselves have several decay modes, several different final states are available in combination with two decay sides. We focus on the situation where the two top quarks decay semi-leptonically, and one of Z gauge bosons decays leptonically while the other does invisibly. Therefore, the final partonic reaction we study is where = e, µ and the particles in the second piece collectively emerge as missing transverse momentum. We denote as 1 the leptons originating from a Z decay while 2 is from the leptonic top quark, and similarly we label the neutrinos. In principle our strategy could be applied to other possibilities for the Z boson and top quark decay: for example, ZZ → + − jj along with a semi-leptonic top quark pair. Obviously, this channel would enjoy a larger signal cross section than the signal channel defined in (4.2). However, it comes with a smaller ¡ p T due to fewer invisible particles, so that it may be hard to impose a sizable ¡ p T cut to suppress relevant SM backgrounds. Moreover, large jet multiplicity would render QCD background processes more important. Therefore for this study we stay away from such choice and we study the final state eq.(4.2), which appears a good balance between maximizing the rate, that is not huge, but still observable at the LHC14, and minimizing backgrounds.
The signal cross section for our signal eq.(4.2), denoted by σ sig , is given by the production cross section of a stop pair times branching fractions in the sequential decays: The last two branching fractions are fixed by the SM, whereas the first two branching fractions are model-dependent. Since our goal is to demonstrate the performance of the massive template in a realistic application, we do not discuss the value of these branching fractions in specific models, and we simply pick the following reference values: Br(t 2 → t 1 Z) = 60% and Br(t 1 →χ 0 1 t) = 100%. To demonstrate the several aspects of performing a mass measurement with energy spectra we choose two study points (SP) mass spectra given by: SP1 : mt 2 = 600 GeV, mt 1 = 300 GeV, mχ0 1 = 115 GeV, (4.4) SP2 : mt 2 = 800 GeV, mt 1 = 600 GeV, mχ0 1 = 300 GeV.
For the first spectrum mt 1 is quite close to mχ0 1 , and indeed mt 1 mχ0 1 + m t , which makes such choice of spectrum not excluded [85,86]. 8 This spectrum features a sizable mass hierarchy between mt 2 and mt 1 so that we expect that the "massiveness" of the Z gauge boson will be less manifest. For the second spectrum the mass of both supersymmetric top quarks is sufficiently large that this mass scale has not been probed yet at the LHC and so in this case as well there is presently no bound on this spectrum. Unlike for the first example spectrum, the small mass gap between the two supersymmetric top quarks is such as that the Z bosons mass is important for its energy spectrum.
For these mass spectra the theory prediction for thet 2 rest-frame energy of the Z boson, denoted by E * th , is: The leading order cross-section at the LHC14 for the production of a pair of squarks t 2 computed by MadGraph5 aMC@NLO [87] using parton distribution functions NNPDF23 [88] is: σ(pp →t 2t2 ) = 125.7 fb for SP1 20.5 fb for SP2 . (4.6) The signal cross sections for the final state defined in eq.(4.2) are tabulated in Table 2, together with the rates after the selection that we will elaborate in the following.

Backgrounds and event selection
Because of our choice of a signature with several charged leptons, we anticipate a relatively low amount of background. Nevertheless we need to evaluate the sources of background and devise a strategy to suppress them. We identify two groups of SM backgrounds. The first group comprises pp → bbjjV 1 V 2 in which bbjj are stemming from QCD and the vectors are radiated. Since three leptons are needed in the final state, the case with V 1 = V 2 = W is very unlikely to appear as a 8 We stress that the idea of using energy spectra of massive particles to measure new particle masses remains valid beyond the status of the concrete example we study in this section, which, merely serves the purpose of showing the peculiarities of the analysis, the current exclusion limit.
background. For the same reason the process pp → bbjjZZ, both Z bosons should decay leptonically to make a background to our signal, which would be the case only if one of the leptons is somehow missed by the detector. For pp → bbjjW ± Z, it is sufficient for both W ± and Z to decay leptonically to become a background with very high efficiency. To suppress this background (and others) in the following eq.(4.12) we make a requirement for a semi-leptonic top pair. To suppress the backgrounds we also require a large ¡ p T , which is expected to reject both bbjjW ± Z and bbjjZZ, as in these processes only one neutrino or missed lepton is the source of ¡ p T . The second group of backgrounds is is identified as an irreducible background, and it turns out that it plays a role of the major background to the signal process. The other two possibilities are pp → ttW ± Z in which both Z and W ± decay leptonically and pp → ttW + W − in which both W gauge bosons decay leptonically. This last process can be suppressed by requiring opposite-signed same flavor leptons whose invariant mass falls into the Z mass window |m − 91 GeV| < 5 GeV . (4.7) In the cases in which two di-lepton invariant masses are available, and both satisfy the above Z mass window simultaneously, we take the combination for which m is closer to the nominal value of the Z mass, and regard the remaining lepton as a decay product of the leptonic top quark. Finally we have the process pp → ttW ± Z which is likely to be a background when a charged lepton from the W boson is lost.
As several processes listed above can lead to background if some lepton is not identified, we need to specify a definition for the leptons that we consider as properly identified. In the following we consider as not identified leptons those that does not pass the acceptance due to low p T or large η or both. We also reject leptons that are too close to partons, as they will likely be inside the jet of hadrons resulting for the parton and so will not pass isolation cuts. To take this type of effects into consideration, we define as a "missed" any object having any of the following attributes: p T,j < 30 GeV or |η j | > 5 for jets, (4.8) p T, < 10 GeV or |η | > 2.5 for leptons. (4.9) To estimate the backgrounds coming from non-isolated objects we employ the following criteria for considering two object as a single detector-level object. We merge together two partons j 1 j 2 and consider them as a single jet, or b-jets if any of the two partons is a b quark, when ∆R j 1 j 2 < 0.4, ; (4.10) we merge together a lepton and a parton and consider them a single jet, or b-jets if the parton is a b quark, when One should note that another complication arises in this estimate, considering the fact that the lepton from the leptonic top quark can be missed as well. 9 This possibility is limited by applying some requirements on the presence of semi-leptonic tt pair made of final state particles other than those forming a Z gauge boson. To partition the final states into a top and an anti-top we seek two jets with invariant mass in the W mass window, and that, further paired with a b-jet, give an invariant mass close to the top quark mass. We also require that the remaining b quark and lepton have an invariant mass, denoted by m b , below a cut-off value m max b . All in all, our semi-leptonic tt identification criteria are: |m jj − 80 GeV| < 16 GeV, |m bjj − 173 GeV| < 35 GeV, (4.12) for the hadronic partition, and for the leptonic partition. Events where at least one partition of the jet leads to satisfy these requirements are accepted in our analysis. A typical feature of background processes, in which undetected momentum is carried only by neutrinos, is small ¡ p T . Also in the case of backgrounds where a lepton is missed, e.g., due to its small p T , ¡ p T is not large precisely because the lost object has necessarily low transverse momentum 10 . Therefore, in order to suppress the background we require a hard ¡ p T cut: (4.14) As the name suggests, the energy peak mass measurement method is based on the data from the peak region of the spectrum. Therefore, the cuts to isolate the signal from the background should be imposed keeping in mind that one has to avoid distortions of the spectrum. As we did in previous works as well [76,84] we avoid pushing too hard the requirements on hard single objects in the final state and we rather prefer to cut the background by requirements on the global hardness of the event. According to this spirit we impose rather mild requirements on single objects used in our analysis: p T,j > 30 GeV, |η j | < 5, ∆R j 1 ,j 2 > 0.4 for any jets including b-jets, (4.15) p T, > 10 GeV, |η | < 2.5, ∆R 1 , 2 > 0.1, ∆R j > 0.3 for leptons, (4.16) in every single event that we use for our data analysis. Furthermore we remark that our signal tends to have multiple hard particles that are collectively giving a large recoil to the system of invisible particles. Therefore, we expect only a mild bias in the energy distribution by the ¡ p T requirement, and find that the ¡ p T cut in (4.14) achieves a rather strong reduction of backgrounds with the signal energy spectrum least distorted. 9 Of course, the leptons from the Z boson can be missed, too. However, in such a case, events are very unlikely to meet the requirement of the Z mass window in (4.7). So, we consider it negligible. 10 Events in which W ± decays into the τ ± , which in turn decays into soft jet(s) and a neutrino might have a somewhat large missing transverse momentum. However, this type of background can be made subdominant with the aid of the set of cuts listed above and other requirements as in Ref. [82]. For simplicity we do not include these backgrounds in the following.  Table 2. Cross sections, in fb, for signal and background processes under the selection of eqs. (4.7)-(4.16). The b-tagging efficiency is not taken into account here and is expected to be roughly the same on signal and background.
With the set of cuts, acceptance and isolation criteria, that we have defined in eqs. (4.7) through (4.16), we compute the cross sections for various SM backgrounds and for the signal process. The resulting cut-flow for the expected cross sections for the two study points and various backgrounds is shown in Table 2. 11 We clearly observe that ttZZ is the major background and is largely sub-dominant with respect to expected signal rates of both our study points. Therefore in the following, for sake of simplicity, we proceed to a simplified analysis in which we take into account only ttZZ. In principle, other background sources should be included, but they would results only in minor modifications on our analysis, without any major change in the mass measurement strategy that we are demonstrating here.

Fitting strategy and mass extraction
Our strategy to measure thet 2 rest-frame energy of the Z boson consists in fitting the data with the massive fitting template in eq. (2.25). In both study points, the fit is performed to data that takes into account both signal and the major background, ttZZ. The background energy spectrum is modelled by the function where b and p are fit parameters describing the shape of the function, and N BG is the normalization parameter related to the total number of events described by the function at fixed b and p. This background template has been tested with pure background energy spectra obtained after imposing the selection criteria in eqs.(4.7)-(4.16), and we find that the background data is well-reproduced by eq.(4.17).
If experimental data can be used, this type of background model can be tested on the data itself, so to prove that it is a good model to describe the background in the relevant region of phase space. As a matter of fact, similar types of background models have been often employed in fits of background to data [89,90] and a best fit model for the signal region can be inferred using data-driven techniques. In our study we do not attempt an estimate of the accuracy with which data driven methods can help to predict the background in the signal region, as this is a rather delicate task, which is best carried out by the experimental collaborations. To demonstrate our technique we identify a best-fit model for the background obtained by shape analysis of Monte Carlo simulation and we then proceed to subtract this expected background shape from the pseudo-experiment data that we use in the following. We denote the quantities fixed by background simulation by a "bar" on each symbol, so that the fixed background function to be used in our analysis is denoted byf (4.18) We emphasize that our determination of the background fit parameters from the Monte Carlo simulation is merely for estimating the effect of background consideration on the extraction of the rest-frame energy value for the signal. In more realistic situations, the background shapes and normalization should be ideally determined from the real data.
For the signal component of the data we use the massive fitting function introduced and motivated above: 19) where N SIG is a normalization parameter and γ ± (E) is nothing but eq. (2.15) re-expressed in terms of E: We denote the measured energy spectrum by f D (E), to which we subtract the expected backgroundf BG (E), so that we minimize the χ 2 between our signal massive fitting function and the subtracted data (4.21) The precise fitting range is not crucial to the result, however we find that best results are obtained on ranges that are about the full width at half maximum of the energy distribution. For the second study point, due to its low rate, it is hard to apply this background subtraction scheme. Therefore, instead of binning the data, we perform an unbinned likelihood fit to extract the underlying model parameters of f SIG (E). Denoting the probability distribution functions for the signal and the background asf SIG andf BG , respectively, we define the relevant likelihood as: where r is the signal fraction in the data. In this case as well, despite the low number of events, we have used the expectedb andp for the background expectation. While this is not fully rigorous, the small number of expected background events suppresses possible effects from the mismodelling due to our fixed background shape. In this low rate context we search for the maximum likelihood varying E * , w, and r so to obtain a measurement of the Z boson energy in thet 2 rest-frame.

Simulation study and results
To quantify the mass measurement performance that can be attained analyzing energy spectra as we propose, we carry out 100 pseudo-experiments, each equivalent to L = 3ab −1 at the 14 TeV LHC. For this purpose, we prepare 100 event samples for both the signal and background process, each corresponding to data from an integrated luminosity of 3 ab −1 , and select the events according to the cuts outlined in Secs. 4.1 and 4.2. We take into account b-tagging applying an efficiency equal to 70% over all the selected phase-space. On each of the 100 event samples we apply the procedure described in the following to obtain a measurement of the Z boson energy in the rest-frame of thet 2 .
For each pseudo-experiment we obtain the Z energy spectrum after imposing the selection criteria listed in eqs. (4.7)-(4.16). The Z boson energy for each event is reconstructed by summing the energies of the two opposite-signed same flavor leptons whose invariant mass falls closest tot he Z mass in the mass window defined in eq.(4.7). On each obtained energy spectrum we fit the massive fitting template according to the strategy explained in Sec. 4.3. The result of this fit is a measurement of the rest-frame energy of Z boson accompanied by its uncertainty from the variation of the χ 2 . Our final result will be the average of the 100 best-fit values and the average of the fit errors over the pseudo experiments.
The left panel of Figure 4 demonstrates our fit result for a representative pseudoexperiment of SP1. For all data points (denoted by black dots), the background is subtracted according to eq. (4.21). Then a χ 2 fit has been performed with respect to the data points between 150 and 500 GeV, which yield the best-fit given by the red curve. For this particular pseudo-experiment the best-fit E * Z is 236 +22 −35 GeV. Considering the corresponding theory value, which is 232 GeV from eq.(4.5), we find good agreement. Furthermore, for this pseudo-experiment the reduced χ 2 is 0.74, which suggests that our template describes the data well enough.
The average central value and average fit-error on 100 pseudo-experiments give the expected measurement with statistic uncertainty: with and average reduced χ 2 equal to 0.84, which shows good agreement between measured value and true value and also supports the use of the massive template function as a good parametrization of data. The right panel of Figure 4 displays the distribution of the values of E * Z obtained in the 100 pseudo-experiments that we performed. In order to check the normality and bias from our fit procedure, we fit the histogram with a Gaussian, and report the result as a red curve in the same figure. The central value and the variance from the Gaussian fit are comparable with the average of the measurements of E * Z and its error estimate in eq.(4.23), hence we conclude that no significant bias in E * Z determination is introduced by the our fit procedure.
So far we have discussed the measurement of a feature of the energy spectrum, E * Z , which per se is a physical quantity interesting on its own. This feature is connected, and can in principle coincide, with the peak of the spectrum, but in general is defined as a function of masses involved in the decay, which we have used to parametrize the energy spectrum. The relation of E * Z with the masses mt 2 , mt 1 , and m Z , does not allow to use just E * Z to know any single mass. In order to do that another independent measurement of a different function of the masses is needed, so that mt 2 or mt 1 (or both) can be determined. In this paper we do not offer a specific strategy to obtain this extra piece of information. However, solely for illustration purposes, we assume that a measurement of the mass oft 1 has been performed elsewhere 12 and we plug it in the relation between E * Z with the masses to study how the error on E * Z propagates to the masses. The expressions for mt 2 and its propagated error is: where, to highlight the relation between the error on E * Z from the fit and the extracted mass, we have neglected possible uncertainties on mt 1 . Assuming mt 1 = 300 GeV we obtain the average measurement of mt 2 : mt 2 = 608 +41 −57 GeV, (4.26) 12 In principle, studying energy spectra from the decayt1 → tχ is possible to determine at least part of the necessary information. This would require to study the energy spectrum of the top quark, a task that we leave for future work. which is in quite a good agreement with the true value for mt 2 in SP1, 600 GeV from eq.(4.4).
As discussed above, for SP2 we are presented with the issue of having a small number of signal events. Therefore we employ an unbinned likelihood fit, which allows to deal with such issue and carry out a mass measurement even with such small statistics. Since the background energy spectrum can be reasonably described by the background model in eq. (4.17) for E Z > 120 GeV, we take into consideration signal and background events only if the energy of the Z boson is greater than 120 GeV.
The average measurement of E * Z from the 100 pseudo-experiments for SP2 is: We see that the extracted E * Z is in a good agreement with the corresponding theory, 180.2 GeV from eq.(4.5). Figure 5 exhibits the distribution of E * Z over 100 pseudo-experiments that we performed. To check the normality and bias of our energy spectrum fit procedure, we fit the distribution of E * Z with a Gaussian, the resulting best-fit is denoted by a red curve. As for SP1, for SP2 as well we observe that the central value and the variance from the Gaussian fit are comparable with the average measurement of E * Z and its error estimate in eq.(4.27), hence no significant bias in E * Z is introduced by our energy spectrum fit procedure. The mass oft 2 can be determined also in this case by assuming a value for mt 1 . As we did for SP1, we pick m1 = 300 GeV and we find the average mass measurement: which is in reasonable agreement with the true value, 800 GeV from eq.(4.4).

Conclusions
Needless to say, measurement of masses of heavy particles is a routine part of the experimental program in high-energy physics. As is well-known, the measurement might not be straightforward, especially when new physics particles are under study (where the underlying dynamics might be initially unknown). For this reason, especially in view of the great variety of signatures that might originate from new physics, high-energy experimentalists (and phenomenologists alike) have developed a plethora of techniques for this purpose, so as to be ready to tackle the measurement regardless of the channel in which we will discover new physics at the LHC/future colliders. These mass measurement methods are often tailored for specific processes, which implies that, in spite of tremendous efforts, there is no method that can work in all cases. Following this observation, it is clear that it is always useful to come up with new methods for measuring heavy particle masses, especially if they are complementary to the existing ones, for example, being subject to different systematic uncertainties; in this manner, combinations of various methods (old and new) can reduce the error on the mass measurement and more thoroughly test the gained understanding of new physics. Anticipating in this way that the LHC/future colliders will discover new physics (after which the focus will shift to its mass measurement), a test of such methods can be carried out on existing data, for instance attempting the mass measurement of heavy particles of the SM itself, such as the top quark or the Higgs boson. In some cases, the new methods, although originally formulated for new physics measurements, might be "serious" alternatives to existing techniques even for SM particles, being useful for improving the associated precision or as a cross-check of the previous measurements (see, for example, the need for such methods for the case of the top quark discussed in Ref. [91,92] and the recent application of the kinematic end-point methods to this measurement [69].) In particular, mass measurement methods based primarily on the kinematics of decay and production of new states are especially attractive, because they are minimally sensitive to details of the dynamics of the production mechanisms of the heavy particle. Therefore, such "(production) model-independent" methods have been the the focus of this paper. Traditionally, such a goal has been attained using the (Lorentz-)invariant mass of the decay products, which is calculated as the Minkowski norm of the sum of all 4-momenta from the several prongs of the decay. If viable, this approach is clearly an excellent way to go for mass measurement. However we stress that for the invariant mass to work straightforwardly we need to observe all decay products, only then we will have direct access to the mass of the parent particle. On the contrary, if only a subset of decay products is available, e.g., because some of the decay products are invisible, the invariant mass can at best provide usable information on the difference of mass of parent and invisible states. The presence of invisible states is clearly a challenge to application of such a method and we need to develop strategies to work around these limitations.
In order to deal with such cases, "transverse" mass (i.e., not fully invariant) variables were invented, but they have to resort to using missing transverse momentum arising from the invisible particle(s). Hence, such variables usually bring in sensitivity to the entire event, i.e., they require global information, as opposed to the ideal case in which one would measure a particle mass concentrating on just its decay. Such a feature is clearly not desired since it renders methods based on this quantity sensitive to factors that are not fully under control (e.g., not well understood sources of missing transverse momentum).
On top of these drawbacks of established methods for mass measurement (based either on invariant or transverse mass) they are often afflicted by combinatoric issues because several candidates for the resonance reconstruction may exist in the event and only one will be the correct one. Such combinatoric issues are ubiquitous in all cases in which the parent particle is pair-produced and undergoes the same decay on both sides of the event.
Overall, these considerations provide huge motivation to develop alternative mass measurement techniques to address the above issues. Focusing on a two-body decay into one visible and one (in general massive) invisible particle, if we insist on use of only the visible particle (so that we have a chance to not end up using global information on the event), then we only have at our disposal the energy and three-momentum of the single visible particle. Such quantities, being not Lorentz-invariant, were not expected to robustly provide information about any mass, hence have not really been considered thus far in high energy collider experiments. Of course, if we assume the production and decay matrix elements, then we can compute energy distributions as a function of mass of parent and then, fitting this prediction to the data, extract the parent mass. This assumption, however, is what we would like to avoid; the purpose of this work has been precisely to propose an alternative way to mass measurement which does not rely on such knowledge.
Remarkably, we showed in Ref. [71] (see also [72,75] for related work) that nonetheless certain features in the above energy distributions contain information on Lorentz-invariant quantities. Namely, if one assumes that the heavy parent particle is produced unpolarized and the visible decay product is massless, then one can show that the location of the peak of the energy distribution in the laboratory frame is precisely at the (fixed) value of the energy of the visible particle in the rest-frame of parent particle, irrespective of the distribution of the boost of the parent particle. Of course, the overall shape of the energy distribution is dependent on the boost distribution of the parent, but the crucial point is that the location of peak is invariant. In turn, the location of this peak (i.e., rest-frame value of energy) is a simple, well-known, function of masses, thus allowing us to determine (in general a combination of) masses by measuring this peak. Clearly, this method is larger free of combinatoric ambiguities.
In this paper, we have generalized the above two-body result on the energy-peak of a massless decay product to the case of a massive child particle. Obviously, the resulting modification of the value of the energy of the child particle in the rest-frame of the parent is trivial; so the real question is: where is location of the peak in the laboratory frame in this case? We showed that the location of the peak is (in general) shifted compared to the value in the rest-frame, which means that, not only the δ-function is smeared in passing from the parent rest-frame to the laboratory, but also the peak does not stay put (c.f., massless child particle case).
To deal with this feature of energy spectra of massive particles, we introduced a more general measurement strategy that builds on the one devised for the case of massless child particles (for which, once again, the peak coincides with the rest-frame energy). Namely, we had developed a parametrization of the energy spectrum of the massless child particle in terms of the location of the peak and its width. This function was largely, but not fully, constructed from first principles properties of the energy spectrum. In fact, the specific functional form within a larger class of functions, was fixed empirically and validated on the theoretical numerical prediction for the energy distribution of several relevant examples such as the top quark pair production and decay at the LHC [71] and for other BSM processes, e.g., gluino decays in SUSY [76]. 13 Based on the above success of the massless case, we assumed here that the fitting function for massless child particle is indeed accurate. We then showed that a suitable generalization of that function can be obtained to cover the case of a massive child particle. Clearly, this application to massive child particle case relies on the original parametrization for massless particles being the "truth", hence it relies on one extra assumption compared to the massless case. For this reason it is incumbent on us to test our parametrization of the massive child energy spectrum even more thoroughly than was done for the massless case. In this paper we have carried out such validation studying the (numerical) theory prediction for energy spectra in a SUSY process, namely, the decay of heavier stop to lighter stop and Z boson, the latter being a visible (resonant pair of leptons), massive child particle. In this theoretical study, we paid particular attention to the mass difference between the two stop mass eigenstates, which was varied as an handle to control the importance of the Z boson mass in the stop quark rest frame. As expected, we found that for larger mass gap, i.e., Z boson being more boosted and its mass less relevant, the massless fitting function still works reasonably well, but its performance degrades as we make mass gap smaller, whereas the massive template provides a much more accurate parametrization of the energy spectrum over a large range of stop rest frame Z boson boost. Having gained confidence in the theoretical validity of the massive template for the energy spectrum, we then considered its phenomenological applications. To this end we studied a mass measurement based on 3000 fb −1 at LHC14 for the same SUSY process, including SM backgrounds and selection cuts to isolate the signal. Our study has shown that the rest frame energy of the Z boson can be reconstructed from its laboratory frame energy spectrum performing a χ 2 fit of the data with the model function that we proposed. Performing pseudo-experiments we have checked for two representative spectra that a precision around 10% can be achieved on the rest-frame energy measurement even for cases that yield limited number of signal events, e.g., for the spectrum featuring mt 2 = 800 GeV that, after selections, yields just O(10) events per ab −1 of luminosity.
As future directions to pursue, we would like to mention the fairly straightforward application to other two-body decays with massive child particle (whether in the SM or beyond), for example, t → bW , but this time using energy of W boson (c.f., that of b in the original application of massless energy-peak). Furthermore, armed with the massive template developed in this work, we can contemplate the extension of the energy-peak 13 The CMS collaboration recently applied the energy-peak method for measuring the top quark mass [79], using a related functional form, obtaining a result consistent with other measurements and with a reasonable precision. method beyond two-body decays by slicing the many-body phase space into effective twobody ones; in fact results for a three-body decay into two visible particles and one invisible are presented in Ref. [84]. We remark that the "effective one-body" (formed out of two visible particles, with a fixed invariant mass) that is necessary to deal with in such a phasespace slicing is in general massive, 14 hence we must use the massive energy-peak method developed in this work; in such application, the fitting function would be applied to the sum of energies of the two visible particles.
To conclude, having covered the cases of massless and massive child particles, we can envisage several more diverse applications of our novel idea of exploiting energy spectra for mass measurements and beyond. The crucial step was overcoming the naive expectation of little utility of this method which was based on the superficial disadvantage of energy being a Lorentz-variant. We believe that energy spectra analysis can then become a part of the standard tool-kit to study particle physics at colliders.