Performance of the CMS missing transverse momentum reconstruction in pp data at √s = 8 TeV

The performance of missing transverse energy reconstruction algorithms is presented using √s=8 TeV proton-proton (pp) data collected with the CMS detector. Events with anomalous missing transverse energy are studied, and the performance of algorithms used to identify and remove these events is presented. The scale and resolution for missing transverse energy, including the effects of multiple pp interactions (pileup), are measured using events with an identified Z boson or isolated photon, and are found to be well described by the simulation. Novel missing transverse energy reconstruction algorithms developed specifically to mitigate the effects of large numbers of pileup interactions on the missing transverse energy resolution are presented. These algorithms significantly reduce the dependence of the missing transverse energy resolution on pileup interactions. Finally, an algorithm that provides an estimate of the significance of the missing transverse energy is presented, which is used to estimate the compatibility of the reconstructed missing transverse energy with a zero nominal value.


Introduction
The CMS detector [1] can detect almost all stable or long-lived particles produced in the proton-proton (pp) collisions provided by the LHC at CERN. Notable exceptions are neutrinos and hypothetical neutral weakly interacting particles. Although these particles do not leave a signal in the detector, their presence can be inferred from the momentum imbalance in the plane perpendicular to the beam direction, a quantity known as missing transverse momentum and denoted by the vector E / T . Its magnitude, E / T , will be referred to as missing transverse energy. The E / T plays a critical role in many physics analyses at the LHC. It is a key variable in many searches for physics beyond the standard model, such as supersymmetry and extra dimensions, as well as for collider-based dark matter searches. It also played an important role in studies contributing to the discovery of the Higgs boson, in particular in channels with the WW, ZZ → ℓℓνν, where ℓ is e or µ, and H → ττ final states [2]. In addition, the precise measurement of E / T is critical for measurements of standard model physics involving W bosons and top quarks.
The E / T reconstruction is sensitive to detector malfunctions and to various reconstruction effects that result in the mismeasurement of particles or their misidentification. Precise calibration of all reconstructed physics objects (e, µ, τ, γ, jets, etc) is crucial for the E / T performance. The E / T is particularly sensitive to additional pp interactions in the same, earlier, and later bunch crossings (pileup interactions). It is therefore essential to study E / T reconstruction in detail with data. This paper describes the E / T reconstruction algorithms and associated corrections, together with performance studies conducted in 8 TeV pp data. The average number of interactions per bunch crossing in this dataset is approximately 21. Previous studies of the missing transverse energy reconstruction in 7 TeV data were presented in ref. [3].
This paper is organized as follows. A brief description of the CMS detector is presented in section 2. In section 3, the data and Monte Carlo (MC) simulation samples used for the present study, together with the event selection criteria, are described. In section 4, the different algorithms for reconstructing E / T are presented. In section 5, sources of anomalous E / T measurements from known detector artifacts and methods for identifying them are described. In section 6, the E / T scale and resolution are reported based on the measurements made with event samples containing isolated photon or Z boson candidates. Studies presented in section 6 include a detailed evaluation of E / T resolution degradation caused by pileup interactions. Section 7 reports the performance of novel E / T reconstruction algorithms developed to cope with large numbers of pileup interactions. The algorithm that provides an estimate of the E / T significance is described and its performance presented in section 8. Conclusions are given in section 9.

Data samples, particle reconstruction, and event selection
Data samples used for the studies presented in this paper were collected from February through December 2012 in pp collisions at a centre-of-mass energy √s = 8 TeV, and correspond to an integrated luminosity of 19.7 ± 0.5 fb⁻¹ [4]. For all studies, we require at least one well-identified event vertex whose z position is less than 24 cm away from the nominal centre of the detector, whose transverse distance from the z-axis is less than 2 cm, and which is reconstructed with at least four tracks. The vertex with the largest value of ∑ p T² taken over all associated tracks is considered to be the primary vertex that corresponds to the origin of the hard-scattering process.
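The vertex requirements above can be illustrated with a minimal Python sketch. The `Vertex` container and the function names are hypothetical and are not part of the CMS software; the nominal detector centre is assumed to sit at the origin, with positions in cm and track p T in GeV.

```python
from dataclasses import dataclass
from math import hypot
from typing import List, Optional, Sequence


@dataclass
class Vertex:
    """Hypothetical vertex record: position in cm, track pT values in GeV."""
    x: float
    y: float
    z: float
    track_pts: Sequence[float]


def is_good_vertex(v: Vertex) -> bool:
    # |z| < 24 cm, transverse distance from the z-axis < 2 cm, at least four tracks.
    return abs(v.z) < 24.0 and hypot(v.x, v.y) < 2.0 and len(v.track_pts) >= 4


def select_primary_vertex(vertices: List[Vertex]) -> Optional[Vertex]:
    """Return the good vertex with the largest sum of squared track pT, if any."""
    good = [v for v in vertices if is_good_vertex(v)]
    if not good:
        return None
    return max(good, key=lambda v: sum(pt * pt for pt in v.track_pts))
```

An event for which `select_primary_vertex` returns `None` would fail the vertex requirement and be dropped.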
The CMS experiment uses global event reconstruction, also called particle-flow (PF) event reconstruction [5,6], which consists of reconstructing and identifying each particle with an optimized combination of all subdetector information. In this process, the identification of the particle type (photon, electron, muon, charged hadron, or neutral hadron) plays an important role in the determination of the particle direction and energy. Photons, such as those from π 0 decays or from electron bremsstrahlung, are identified as ECAL energy clusters not matched to the extrapolation of any charged-particle trajectory to the ECAL. Electrons are identified as primary charged-particle tracks reconstructed by a Gaussian-sum filter (GSF) algorithm [7] and matched to ECAL energy clusters; the matching allows for associated bremsstrahlung photons.
Muons, such as those from b-hadron semileptonic decays, are identified as tracks in the central tracker consistent with either a track or several hits in the muon system, associated with minimum ionizing particle depositions in the calorimeters. Muon reconstruction and identification are described in detail in ref. [8]. Charged hadrons are defined to be charged-particle tracks identified neither as electrons nor muons. Finally, neutral hadrons are identified as HCAL energy clusters not matched to any charged-hadron trajectory, or as ECAL and HCAL energy excesses with respect to the expected charged-hadron energy deposit.
The energy of photons is directly obtained from the ECAL measurement and corrected for zero-suppression effects [9]. The energy of electrons is determined from a combination of the track momentum at the main interaction vertex, the corresponding ECAL cluster energy, and the energy sum of all associated bremsstrahlung photons. The energy of muons is obtained from the corresponding track momentum. The energy of charged hadrons is determined from a combination of the track momentum and the corresponding ECAL and HCAL energies, corrected for zero-suppression effects, and calibrated for the nonlinear response of the calorimeters. Finally, the energy of neutral hadrons is obtained from the associated calibrated ECAL and HCAL energy deposits.
For each event, hadronic jets are clustered from these reconstructed particles with the infrared and collinear-safe anti-k T algorithm [10,11], with a distance parameter R = 0.5. The jet momentum is determined as the vectorial sum of all particle momenta in the jet, and is found in simulated samples to be below 2 to 5% of the true momentum over the entire p T range of interest and over the detector acceptance. The jet energy corrections are derived from simulation and are confirmed by in-situ measurements exploiting the energy balance of dijet and photon+jet events [12]. Jet energy resolution (JER) after PF reconstruction is typically about 25%, 10%, and 5% at E = 10, 100, and 1000 GeV, respectively; this may be compared to approximately 40%, 12%, and 5% obtained when the calorimeters alone are used for jet clustering without PF reconstruction.
The data are compared to simulated events generated either with PYTHIA v6.4.24 Monte Carlo [13] for the QCD and γγ processes, or with MADGRAPH v5.1.3.30 [14,15] interfaced with PYTHIA v6.4.24 for top (tt and single-top), Z + jets, W + jets, γ + jets, and diboson (VV) processes. The PYTHIA v6.4.24 program has been set up with a parameter set description for the underlying event referred to as tune Z2* [16,17]. The generated events are passed through the CMS detector simulation, which is based on GEANT4 [18]. The detector geometry description includes realistic subsystem conditions, such as the simulation of non-functioning channels.
The simulated events are weighted such that the distribution of the simulated pileup interaction multiplicity matches the expected distribution, as based on measurements of the instantaneous luminosities in data. This is demonstrated in figure 1, which shows agreement in the reconstructed vertex multiplicity (N vtx ) distribution between data and simulated samples. The total uncertainty in the N vtx distribution is dominated by the uncertainty in the total inelastic pp scattering cross section measurement [19,20], which affects the pileup profile in the simulated sample. The other uncertainty source is in the luminosity measurement, which constitutes ∼30% of the total uncertainty.
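The pileup reweighting described above can be sketched as a per-bin ratio of normalized distributions. This is only an illustration under simple assumptions: both inputs are hypothetical histograms of the number of pileup interactions with identical binning, and bins that are empty in the simulation receive a weight of zero.

```python
from typing import List, Sequence


def pileup_weights(target: Sequence[float], simulated: Sequence[float]) -> List[float]:
    """Per-bin weights that map the simulated pileup profile onto the target one.

    Both histograms are normalized to unit area before the ratio is taken.
    """
    if len(target) != len(simulated):
        raise ValueError("histograms must share the same binning")
    t_total = float(sum(target))
    s_total = float(sum(simulated))
    weights = []
    for t, s in zip(target, simulated):
        # Empty simulated bins cannot be reweighted; give them zero weight.
        weights.append((t / t_total) / (s / s_total) if s > 0 else 0.0)
    return weights
```

Each simulated event would then carry the weight of the bin matching its generated number of pileup interactions.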

The dijet event selection
The dijet data sample is used in the studies of anomalous high-E / T events presented in section 5 and in the E / T significance studies in section 8. It was collected with a single-jet trigger that requires at least one jet in the event with p T > 320 GeV. Dijet events are selected offline by requiring a leading jet with p T > 400 GeV and at least one other jet with p T > 200 GeV.

Figure 1. Multiplicity of reconstructed vertices for Z → e + e − candidate events. The grey error band displays the systematic uncertainty of the simulation, and is dominated by the uncertainty in the total inelastic pp scattering cross section measurement [4,19].

The Z → ℓ + ℓ − event selection
The Z → ℓ + ℓ − events, where ℓ is either a muon or an electron, are used in the E / T scale, resolution, and significance studies presented in sections 6, 7, and 8. In order to discriminate between prompt leptons and leptons that are produced inside a jet through the decay of a hadron, we define an isolation variable R Iso as the ratio of the p T of particles near the lepton to the p T of the lepton itself,
\[ R_{\mathrm{Iso}}(p_{T}^{\ell}) \equiv \frac{1}{p_{T}^{\ell}} \left[ \sum_{\mathrm{HS}\pm} p_{T} + \max\left( 0,\ \sum_{\mathrm{neu}} p_{T} + \sum_{\mathrm{pho}} p_{T} - 0.5 \sum_{\mathrm{PU}\pm} p_{T} \right) \right]. \]
The scalar p T sums ∑ HS± p T , ∑ neu p T , and ∑ pho p T are taken over charged particles from the primary hard-scatter (HS) vertex, neutral hadrons, and photons, respectively; all particles entering the sums must lie within a distance ∆R ≡ √((∆φ)² + (∆η)²) < 0.3 of the lepton candidate. Well-isolated leptons, unlikely to have originated from semi-leptonic decay within a jet, are characterized by low values of R Iso . The final negative sum over charged hadrons from pileup (PU) vertices compensates the additional energy produced by photons and neutral hadrons stemming from pileup interactions. The relative balance between charged particles and neutral particles produced by pileup interactions is taken into account using a factor 0.5 in the final sum.
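As an illustration of the isolation ratio described above, the following Python sketch combines the four scalar p T sums. The function and argument names are hypothetical, and flooring the pileup-corrected neutral term at zero is an assumption of this sketch (it prevents an over-subtraction from producing a negative neutral contribution).

```python
def relative_isolation(lepton_pt: float,
                       sum_hs_charged: float,
                       sum_neutral: float,
                       sum_photon: float,
                       sum_pu_charged: float,
                       pu_factor: float = 0.5) -> float:
    """Pileup-corrected relative isolation; all sums in GeV within dR < 0.3."""
    # Neutral activity, reduced by the estimated pileup contribution.
    # Assumption of this sketch: the corrected neutral term is floored at zero.
    neutral = max(0.0, sum_neutral + sum_photon - pu_factor * sum_pu_charged)
    return (sum_hs_charged + neutral) / lepton_pt


def is_isolated(r_iso: float, threshold: float = 0.1) -> bool:
    # Threshold of 0.1 as quoted in the text for both muons and electrons.
    return r_iso < threshold
```

For example, a 40 GeV lepton with 1 GeV of hard-scatter charged activity, 3 GeV of neutral plus photon activity, and 4 GeV of pileup charged activity has R Iso = (1 + 1)/40 = 0.05 and passes the requirement.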
The Z → µ + µ − events were collected using a trigger that requires the presence of two muons passing p T thresholds of 17 and 8 GeV, respectively. The muon candidates must be reconstructed in the tracker and in the muon chambers, must satisfy p T > 20 GeV and lie in the pseudorapidity range |η| < 2.1. In order to veto candidates from non-prompt processes, muons must further satisfy R Iso (p µ T ) < 0.1. The Z → e + e − candidate events were collected using a double-electron trigger with p T thresholds of 17 and 8 GeV. The events are required to have two electron candidates within the ECAL fiducial volume defined by |η| < 1.44 and 1.56 < |η| < 2.5. To reject jets or photons misidentified as electrons, requirements are applied on the shower shape and the matching of the energy cluster with the associated GSF track, in both φ and η. In addition, electrons must satisfy R Iso (p e T ) < 0.1 and p T > 20 GeV.

JINST 10 P02006
Figure 2. Dilepton invariant mass distributions in events passing the Z → µ + µ − (left) and Z → e + e − (right) selections. The VV contribution corresponds to processes with two electroweak bosons produced in the final state. The top contribution corresponds to the top pair and single top production processes. The grey error band displays the systematic uncertainty of the simulation, due to the muon (left), or electron (right) energy scale. As the invariant mass selection is performed before the computation of the systematic uncertainty on the energy scale, a large event migration is observed for Z → e + e − events.
Events with an invariant mass of the dimuon or dielectron system outside of the Z-boson mass window 60 GeV < M < 120 GeV are rejected. The tt and single-top (top) processes as well as diboson (VV) processes are the dominant backgrounds in both the Z → e + e − and Z → µ + µ − samples. The spectra for the invariant mass and for the transverse momentum vector q T , of magnitude q T , of the Z → ℓ + ℓ − candidate are presented in figures 2 and 3, respectively. The data distributions are well modeled by the simulation.

W → eν and tt event selection
The W → eν and semi-leptonic tt events are used in the E / T significance studies presented in section 8. The W → eν candidate events are collected with a single-electron trigger that requires the presence of an electron object with p T > 27 GeV. Offline, we require the presence of an electron candidate passing the medium working point of a set of quality requirements and also satisfying p T > 30 GeV and |η| < 2.5. This working point is identical to the one used for the selection of Z → e + e − events. We reject events with two or more electrons if at least one of the additional electrons satisfies p T > 20 GeV, |η| < 2.5, and passes the loose working point of a set of quality requirements (the same set just mentioned above). The medium and loose working points for the electron quality requirements have been defined so that they select electrons with an efficiency of 80% and 95%, respectively [21].
In the semi-leptonic tt channel, we select single-muon and single-electron events. Each event is required to pass either an e+jet or a µ+jet trigger. Offline we require at least 2 b-tagged jets with p T > 45 GeV, at least 3 jets with p T > 45 GeV, and at least 4 jets with p T > 20 GeV. Jet energies are fully corrected and required to satisfy the jet identification criteria [22] described in section 5. For b-tagging, we use the combined secondary vertex tagger with the tight working point [23]. Exactly one identified and isolated lepton is required.

Figure 3. Distributions of Z/γ transverse momentum q T in Z → µ + µ − (left), Z → e + e − (right), and direct-photon events (bottom). The points in the lower panel of each plot show the data/MC ratio, including the statistical uncertainties of both data and simulation; the grey error band displays the systematic uncertainty of the simulation. The last bin contains overflow content. The VV contribution corresponds to processes with two electroweak bosons produced in the final state. The top contribution corresponds to the top pair and single top production processes. The EWK contribution corresponds to the Zγ and Wγ production processes as well as W → eν events.

The direct-photon event selection
A direct-photon sample corresponding to final states containing at least one photon and at least one jet is used for the measurements of E / T scale and resolution presented in sections 6 and 7. Photon events were collected with a set of triggers based on the measured p T of the hardest reconstructed photon candidate in the event. The p T thresholds of the triggers were 30, 50, 75, 90, 135, and 150 GeV. The rates of the first five triggers were randomly reduced (prescaled) because of the limited data acquisition bandwidth. The approximate effective values of the prescaling factors were 5000, 900, 150, 71, and 1.33, respectively. Events are selected offline by requiring the highest p T reconstructed photon candidate to pass the selection criteria described below.
Photon candidates are selected from clusters of energy in the ECAL within the pseudorapidity coverage |η| < 1.44. Various identification criteria, such as the consistency between the cluster width and a typical photon electromagnetic shower shape, are applied in order to correctly identify photons with high efficiency and to suppress the misidentification of electrons, jets, or spurious ECAL signals as photons [24,25]. An isolation requirement ensures that hadronic jets misidentified as photons are rejected efficiently: activity from charged hadrons, neutral hadrons, and other photons in the event is determined by calculating the scalar sum of their transverse momenta in a cone of ∆R < 0.3 around the photon trajectory. Separate requirements on these isolation sums suppress photon candidates inside jets and jets misidentified as photons: ∑ p T < 2.6 GeV, ∑ p T < 3.5 + 0.04q T GeV and ∑ p T < 1.3 + 0.005q T GeV for charged hadrons, neutral hadrons and photons, respectively. Finally, to prevent the misidentification of electrons as photons, the photon candidate must not match any track with hits in the pixel detector that is associated with the primary vertex and reconstructed in the pixel detector. Events satisfying these criteria form our signal sample.
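The acceptance, isolation, and track-veto requirements above can be sketched as a small classifier. The names are hypothetical, and the shower-shape identification criteria are not modelled here. The "sideband" label, for candidates failing only the charged-hadron isolation, is included because such candidates are used in the QCD background estimate of this section.

```python
def classify_photon(q_t: float, eta: float,
                    iso_charged: float, iso_neutral: float, iso_photon: float,
                    has_pixel_track_match: bool) -> str:
    """Return 'signal', 'sideband', or 'rejected' for a photon candidate.

    q_t is the photon pT in GeV; the isolation sums are scalar pT sums in GeV
    in a cone of dR < 0.3. Shower-shape requirements are omitted in this sketch.
    """
    if abs(eta) >= 1.44 or has_pixel_track_match:
        return "rejected"
    passes_neutral = iso_neutral < 3.5 + 0.04 * q_t
    passes_photon = iso_photon < 1.3 + 0.005 * q_t
    if not (passes_neutral and passes_photon):
        return "rejected"
    # Only the charged-hadron isolation distinguishes signal from sideband.
    return "signal" if iso_charged < 2.6 else "sideband"
```

At q T = 100 GeV, the neutral-hadron and photon isolation thresholds evaluate to 7.5 GeV and 1.8 GeV.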
The background processes that are considered for the direct-photon sample are QCD multijet events, diphoton production, production of single W bosons, and single photons produced in association with the W or Z boson, referred to as the electroweak (EWK) contribution. Although the majority of QCD multijet events fail the photon selection, they constitute a dominant background due to the large production cross section and occasional misidentification of jets with large electromagnetic fraction as photons. Jets that pass the photon selection are typically enriched in π⁰ → γγ and contain little hadronic activity; therefore, the detector response to these jets is similar to that of single photons. To have a robust description of the QCD background, its expected contribution is estimated from data.
We utilize the following procedure to estimate the expected contribution of QCD multijet background processes for a given kinematic variable. We begin with a sample of data events where the highest p T photon candidate failed the charged-hadron isolation requirement but passed all other requirements; we denote this sample of events as the charged hadron isolation sideband. For each kinematic variable studied, we take the distribution of this variable from data in the charged hadron isolation sideband and remove non-QCD background processes by subtracting their simulated distributions. The remaining distribution forms our initial estimate for the shape of the kinematic variable's distribution in the QCD background in the signal sample. We set the normalization of this expected QCD multijet background by scaling the number of events in data from the charged hadron isolation sideband to match the number of events in data from the main signal sample, after subtracting the respective expected contributions of other backgrounds.
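The sideband procedure above can be sketched with binned histograms represented as plain lists. This is one possible reading of the text, not the analysis code: `signal_other` is taken to be the sum of all expected non-QCD contributions in the signal sample, and clipping negative bins to zero after the subtraction is an assumption of this sketch.

```python
from typing import List, Sequence


def qcd_template(sideband_data: Sequence[float],
                 sideband_non_qcd: Sequence[float],
                 signal_data: Sequence[float],
                 signal_other: Sequence[float]) -> List[float]:
    """Data-driven QCD multijet histogram for the signal sample.

    Shape: sideband data minus simulated non-QCD processes in the sideband.
    Normalization: signal-sample data minus the other expected contributions.
    """
    # Assumption: bins driven negative by the subtraction are clipped to zero.
    shape = [max(0.0, d - b) for d, b in zip(sideband_data, sideband_non_qcd)]
    shape_total = sum(shape)
    target_total = sum(signal_data) - sum(signal_other)
    if shape_total <= 0.0 or target_total <= 0.0:
        return [0.0] * len(shape)
    scale = target_total / shape_total
    return [scale * s for s in shape]
```

The simulation-based correction for response differences between the sideband and the signal sample, described in the next paragraph, would be applied on top of this template.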
In order to account for the differences in detector response to photon candidates between the signal sample and the charged hadron isolation sideband, we correct these distributions with information from simulated QCD multijet events. The magnitude of these corrections depends upon the algorithm used for E / T reconstruction; for PF E / T (defined in section 4), the magnitude of the correction falls within 6-8%. For No-PU PF E / T and MVA PF E / T (both defined in section 7), the magnitudes of the corrections fall within 2-4%.
Figure 3 shows a comparison between the photon transverse momentum q T distribution in data and the expected signal and background contributions. Note that the signal and background contributions for the prediction have been reweighted in q T to match the distribution observed in data.

Reconstruction of E / T
We define E / T ≡ − ∑ p T , where the sum is over all observed final-state particles; by momentum conservation, E / T is also equal to the total transverse momentum of all unobserved particles, such as neutrinos or other weakly interacting objects. CMS has developed several distinct and complementary algorithms to reconstruct E / T , already presented in ref. [3]. The E / T reconstructed using a particle-flow technique (PF E / T ) is used in the majority of CMS analyses. It is defined as the negative vectorial sum over the transverse momenta of all PF particles. The PF ∑ E T is the associated scalar sum of the transverse momenta of the PF particles. The less commonly used Calo E / T is calculated using the energies contained in calorimeter towers and their directions relative to the centre of the detector. The sum excludes energy deposits below noise thresholds but is corrected for the calorimeter deposits of muons, when they are present, by adding their momentum to the sum [26].
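The definition of PF E / T as a negative vector sum can be written down directly. In this minimal sketch each particle is a hypothetical `(pt, phi)` pair, with p T in GeV and φ in radians.

```python
from math import atan2, cos, hypot, sin
from typing import Iterable, Tuple


def missing_et(particles: Iterable[Tuple[float, float]]) -> Tuple[float, float]:
    """Return (magnitude, phi) of the negative vector sum of particle pT."""
    px = 0.0
    py = 0.0
    for pt, phi in particles:
        px -= pt * cos(phi)
        py -= pt * sin(phi)
    return hypot(px, py), atan2(py, px)


def scalar_sum_et(particles: Iterable[Tuple[float, float]]) -> float:
    """Scalar sum of the particle transverse momenta (the PF sum-ET)."""
    return sum(pt for pt, _ in particles)
```

Two back-to-back particles with p T of 50 and 30 GeV give a missing transverse energy of 20 GeV pointing along the softer particle, and a scalar sum of 80 GeV.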
In the following sections, we present the performance of PF E / T and Calo E / T , giving primary attention to PF E / T . In addition, two advanced E / T reconstruction algorithms specifically developed to mitigate effects from large numbers of pileup interactions are discussed in section 7.
The magnitude of the E / T can be underestimated or overestimated for a variety of reasons, including minimum energy thresholds in the calorimeters, p T thresholds and inefficiencies in the tracker, and the nonlinearity of the response of the calorimeter for hadronic particles due to its non-compensating nature. This bias is significantly reduced by correcting the p T of jets to the particle-level p T using jet energy corrections [27]:
\[ \vec{E\!\!\!/}_{T}^{\,\mathrm{corr}} = \vec{E\!\!\!/}_{T} - \vec{\Delta}_{\mathrm{jets}} = \vec{E\!\!\!/}_{T} - \sum_{\mathrm{jets}} \left( \vec{p}_{T,\mathrm{jet}}^{\,\mathrm{corr}} - \vec{p}_{T,\mathrm{jet}} \right), \tag{4.1} \]
where the superscript "corr" refers to the corrected values. The sum extends over all jets with an electromagnetic energy fraction below 0.9 and a corrected p T > 10 GeV (20 GeV) for PF E / T (Calo E / T ).
Further corrections improve the performance of the E / T reconstruction in events with large numbers of pileup interactions. The contribution to the genuine E / T from such interactions is close to zero, as the probability to produce neutrinos is small in inelastic pp scattering (minimum bias) interactions. The vectorial p T sum of charged particles is therefore expected to be well balanced by that of neutral particles. However, the nonlinearity and minimum energy thresholds in the calorimeters cause E / T to point on average in the direction of the vectorial p T sum of neutral particles.
We correct for this effect by using the vectorial p T sum of charged particles associated with pileup vertices as an estimator of the induced E / T . The correction is parametrized by
\[ f(\vec{v}) = c_{1} \left( 1.0 + \mathrm{erf}\left( -c_{2} |\vec{v}|^{c_{3}} \right) \right), \]
where v = ∑ charged p T is the vectorial p T sum of charged particles associated with a given pileup vertex. The coefficients c 1 = −0.71, c 2 = 0.09, and c 3 = 0.62 are obtained by fitting the E / T component parallel to the v direction as a function of |v| in simulated minimum bias events with exactly one generated pp interaction. When this correction is applied to the data and simulation samples with pileup interactions, the factor f(v) v, which gives the expected total E / T for each pileup interaction, is summed over all pileup vertices and is subtracted from the reconstructed E / T :
\[ \vec{E\!\!\!/}_{T}^{\,\mathrm{corr}} = \vec{E\!\!\!/}_{T} - \vec{\Delta}_{\mathrm{PU}} = \vec{E\!\!\!/}_{T} - \sum_{\mathrm{PU}} f(\vec{v})\, \vec{v}. \tag{4.2} \]
Although particles are on average produced uniformly in φ, some φ asymmetry is observed in the p T sums of calorimeter energy deposits, tracks, and particles reconstructed by the particle-flow algorithm, leading to a φ asymmetry in E / T . The φ asymmetry is present not only in the data but also in simulated events. The sources of the asymmetry have been identified as imperfect detector alignment, inefficiencies, a residual φ dependence of the calibration, and a shift between the centre of the detector and the beamline [28].

Table 1. The parameters for the E / T φ-asymmetry corrections for PF E / T for data and simulation. As the detector alignment and φ-intercalibrations are different between data and simulation, the values of the respective parameters are expected to be different.
The observed E / T φ asymmetry is due to a shift in the E / T components along the x and y detector axes (denoted by E / x and E / y respectively), which increases approximately linearly with the number of reconstructed vertices. This correlation is utilized for a correction procedure. The φ-asymmetry corrections are determined separately for data and simulated events. Linear functions are fitted to the correlation of E / x and E / y to N vtx , the number of reconstructed vertices:
\[ \langle E\!\!\!/_{x} \rangle = c_{x_0} + c_{x_s} N_{\mathrm{vtx}}, \qquad \langle E\!\!\!/_{y} \rangle = c_{y_0} + c_{y_s} N_{\mathrm{vtx}}. \]
The linear dependence of E / x and E / y on N vtx is used to correct E / T on an event-by-event basis as:
\[ E\!\!\!/_{x}^{\,\mathrm{corr}} = E\!\!\!/_{x} - \left( c_{x_0} + c_{x_s} N_{\mathrm{vtx}} \right), \qquad E\!\!\!/_{y}^{\,\mathrm{corr}} = E\!\!\!/_{y} - \left( c_{y_0} + c_{y_s} N_{\mathrm{vtx}} \right). \]
The coefficients c x 0 , c x s , c y 0 , and c y s are determined separately from Z → µ + µ − candidate events in data and simulation samples. These coefficients for the PF E / T are shown in table 1.
In this paper, the correction ∆ jets defined in eq. (4.1) is applied to both PF and Calo E / T , while the pileup correction ∆ PU defined in eq. (4.2) is applied only to PF E / T , as the information from tracking needed for determination of ∆ PU is not used in the Calo E / T calculation. All the E / T distributions are further corrected for the φ asymmetry. In simulated events, jet momenta are smeared in order to account for the jet resolution differences between data and simulation [27], and the E / T is recomputed based on the smeared jet momenta.
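The three corrections summarized above (jet energy corrections, pileup, and φ asymmetry) can be sketched as successive shifts of the E / T x and y components. This is an illustrative sketch only: the `Jet` container, the treatment of the jet energy correction as a single multiplicative factor, and all function names are assumptions. The pileup coefficients are those quoted in the text; the φ-asymmetry coefficients must be supplied by the caller, since their values are not reproduced here.

```python
from dataclasses import dataclass
from math import erf, hypot
from typing import Iterable, Tuple

Vec2 = Tuple[float, float]


@dataclass
class Jet:
    """Hypothetical jet record with uncorrected momentum components in GeV."""
    raw_px: float
    raw_py: float
    jec: float          # assumed multiplicative jet energy correction factor
    em_fraction: float  # electromagnetic energy fraction


def jet_corrected_met(met: Vec2, jets: Iterable[Jet],
                      min_corr_pt: float = 10.0,
                      max_em_fraction: float = 0.9) -> Vec2:
    """Subtract sum(p_corr - p_raw) over jets passing the quoted thresholds."""
    mex, mey = met
    for j in jets:
        corr_pt = j.jec * hypot(j.raw_px, j.raw_py)
        if j.em_fraction < max_em_fraction and corr_pt > min_corr_pt:
            mex -= (j.jec - 1.0) * j.raw_px
            mey -= (j.jec - 1.0) * j.raw_py
    return mex, mey


def pileup_response(v_mag: float, c1: float = -0.71,
                    c2: float = 0.09, c3: float = 0.62) -> float:
    """f(v) = c1 * (1 + erf(-c2 * |v|**c3)), with the coefficients from the text."""
    return c1 * (1.0 + erf(-c2 * v_mag ** c3))


def pileup_corrected_met(met: Vec2, pileup_charged_sums: Iterable[Vec2]) -> Vec2:
    """Subtract f(v) * v for each pileup vertex, v being its charged pT sum."""
    mex, mey = met
    for vx, vy in pileup_charged_sums:
        f = pileup_response(hypot(vx, vy))
        mex -= f * vx
        mey -= f * vy
    return mex, mey


def phi_corrected_met(met: Vec2, n_vtx: int,
                      cx0: float, cxs: float, cy0: float, cys: float) -> Vec2:
    """Remove the linear N_vtx-dependent shift from each component."""
    mex, mey = met
    return mex - (cx0 + cxs * n_vtx), mey - (cy0 + cys * n_vtx)
```

Because f(v) lies between −0.71 and 0 for these coefficients, subtracting f(v) v moves the E / T towards the charged-particle sum of each pileup vertex, compensating the bias towards the neutral sum described in the text.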

Large E / T due to misreconstruction
Spurious detector signals can cause fake E / T signatures that must be identified and suppressed. In ref. [3] we showed the results of studies of anomalous high-E / T events in the data collected during 2010 LHC running, associated with particles striking sensors in the ECAL barrel detector, as well as those caused by beam-halo particles and dead cells in the ECAL. Studies of anomalous E / T events caused by (1) HCAL hybrid photodiode and readout box electronics noise and (2) direct particle interactions with the light guides and photomultiplier tubes of the forward calorimeter are discussed in ref. [29].
In the 2012 data, we have identified several new types of anomalous events populating the high E / T tail. There are a few channels in the ECAL endcaps that occasionally produce high-amplitude anomalous pulses. The affected events are identified by the total energy and the number of low-quality hits within the same super-cluster, and are removed. A misfire of the HCAL laser calibration system in the HCAL barrel (HB), endcap (HE), or forward (HF) regions can produce false signals in almost all channels in a subdetector. If this misfire overlaps with a bunch crossing resulting in a trigger, the event can be contaminated, inducing a large, fake E / T . The affected events are identified by the hit occupancies in the channels used for signal and calibration readout and are removed from the sample.
Another source of fake E / T comes from the track reconstruction. The silicon strip tracker can be affected by coherent noise, which can generate ∼10⁴ clusters widely distributed in the silicon detectors. A significant fraction of these events are vetoed at early stages of the online trigger selection; however, the veto is not fully efficient and some of these events are read out and reconstructed. In such events the transverse momentum of misreconstructed spurious tracks can exceed 100 GeV. These tracks can mimic charged particles, which are then clustered into jets with high p T , creating large spurious E / T . The affected events can be identified by the number of clusters in the silicon strip and pixel detectors.
Although the rejection of anomalous high-E / T events due to noise in HB and HE was studied in ref. [3], further developments have proven necessary to cope with the evolving LHC running conditions, including high luminosities and the shortening of the bunch crossing interval from 100 ns to 50 ns. A noise-rejection algorithm was developed to exploit the differences between noise and signal pulse shapes. The CMS hadron calorimeter signals are digitized in time intervals of 25 ns, and signals in neighboring time intervals are used to define the pulse shape; measured and expected signal pulse shapes are compared and several compatibility tests to a signal hypothesis are performed. The energy reconstructed in channels having anomalous signals is removed during event processing, so that the affected channels do not contribute to the reconstructed physics objects. Figure 4 shows a comparison of the PF E / T distribution before and after the application of the algorithms to remove anomalous events in the dijet sample described in section 3.1. The anomalous events with PF E / T around 600 GeV are mainly due to misfires of the HCAL laser calibration system, and the anomalous events with PF E / T above 1.5 TeV are mainly caused by the electronics noise in HB and HE. Even after applying all the anomaly-removal algorithms developed for the 2012 data, we still find a small residue of anomalous E / T events in the tail of the PF E / T distribution. Imposing jet identification criteria that limit the maximum neutral hadron energy fraction to 0.9 and the maximum photon energy fraction to 0.95 guarantees efficient removal of such events. These requirements are presented in ref. [22] and are frequently used in CMS data analyses. The event is rejected if any jet fails the jet identification criteria.
The PF E / T distribution for events passing all cleaning algorithms and jet identification requirements shows a substantial reduction of the high PF E / T tail, and agrees well with the simulated distributions for PF E / T above 500 GeV (figure 4).

Missing transverse energy scale and resolution
In this section, we present studies of the performance of E / T reconstruction algorithms using events where an identified Z boson or isolated photon is present. The bulk of such events contain no genuine E / T , and thus a balance exists between the well-measured vector boson momentum and the hadronic system, which dominates the E / T measurement. Using the vector boson momentum as a reference, we are able to measure the scale and resolution of E / T in an event sample with a hadronic system that is kinematically similar to standard model processes such as tt +jets and W+jets, which are typically important backgrounds in searches where E / T is an essential signature.
Even if no genuine E / T is expected in physical processes, many physics and detector effects can significantly affect the E / T measurement, inducing nonzero E / T in these events. The detector noise, particle misreconstruction, detector energy resolution, and jet energy corrections are part of the detector sources of E / T , while the pileup, underlying event activity, and fluctuations in jet composition are physical sources of E / T .
The PF E / T distributions in Z → µ + µ − , Z → e + e − , and direct-photon events are presented in figure 5. Note that for the direct-photon distribution we require q T > 100 GeV in order to avoid biases from the prescales of the lower p T photon triggers. Good agreement between data and simulation is observed in all distributions. Momenta of leptons from Z-boson decays (direct photons) are reconstructed with resolutions of σ p T /p T ∼ 1-4 (1-3)% [8,24], while jet energies are reconstructed with resolutions of σ E /E ∼ 10-15% [30]. Thus the E / T resolution in Z or γ + jets events is dominated by the resolution with which the hadronic activity in the event is reconstructed. Uncertainty bands for the distributions of Z → µ + µ − , Z → e + e − , and direct-photon events include uncertainties in the lepton and photon energy scales (0.2% for muons, 0.6% for barrel electrons and photons, and 1.5% for endcap electrons), jet energy scale (2-10%), jet energy resolution (6-15%), and the energy scale of low-energy particles, defined as the unclustered energy (an arbitrary 10%, covering all differences observed between the data and the simulation). In addition, for the direct-photon events only, we account for the systematic uncertainty in the E / T response correction applied to events used to estimate the QCD multijet contribution to the direct-photon sample (2-10%).
The increase in the uncertainty band in figure 5 around 70 GeV stems from the large impact of jet energy resolution uncertainties in events with no genuine PF E / T : as this region of PF E / T is mostly filled with direct-photon or Z events with at least one jet, the impact of a modification of the jet energy on the E / T reconstruction will be maximized in this area. For higher values of PF E / T , where processes with genuine E / T dominate such as the tt process, the relative uncertainty is much smaller.
We denote the vector boson momentum in the transverse plane by q T , and the hadronic recoil, defined as the vectorial sum of the transverse momenta of all particles except the vector boson (or its decay products, in the case of Z bosons), by u T . Momentum conservation in the transverse plane requires q T + u T + E / T = 0. By definition, the recoil is therefore the negative sum of the induced E / T and q T . Figure 6 summarizes these kinematic definitions.
The presence of a well-measured Z boson or direct photon provides both a momentum scale, q T ≡ | q T |, and a unique event axis, along the unit vector q̂ T . The hadronic recoil can be projected onto this axis, yielding two signed components, parallel (u ∥ ) and perpendicular (u ⊥ ) to the event axis. The direction of u ⊥ is defined by considering the coordinate frame based on the q T axis. Since u ∥ ≡ u T · q̂ T , and because the observed hadronic system is usually in the hemisphere opposite the boson, u ∥ is typically negative. The scalar quantity − u ∥ /q T is referred to as the E / T response, and the dependence of − u ∥ /q T versus q T as the response curve.
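The recoil decomposition can be illustrated with a minimal sketch. The function name, the toy event, and the sign convention chosen for the perpendicular axis are assumptions made for illustration; only the relations q T + u T + E / T = 0 and u ∥ ≡ u T · q̂ T are taken from the text.

```python
import math


def recoil_components(q_x, q_y, met_x, met_y):
    """Return (u_par, u_perp, response) for one boson + recoil event.

    q_x, q_y     : transverse momentum of the Z boson or photon (GeV)
    met_x, met_y : reconstructed missing transverse momentum (GeV)
    """
    q_t = math.hypot(q_x, q_y)
    if q_t == 0.0:
        raise ValueError("q_T must be non-zero to define the event axis")
    # Momentum conservation q_T + u_T + MET = 0 gives u_T = -(q_T + MET).
    u_x = -(q_x + met_x)
    u_y = -(q_y + met_y)
    # Unit vector along the boson and the axis rotated by +90 degrees
    # (the sign convention for the perpendicular axis is an assumption).
    ax, ay = q_x / q_t, q_y / q_t
    u_par = u_x * ax + u_y * ay
    u_perp = -u_x * ay + u_y * ax
    return u_par, u_perp, -u_par / q_t


# Hypothetical event: 100 GeV boson along +x, recoil measured as 90 GeV
# along -x, so the reconstructed MET is -(q_T + u_T) = (-10, 0) GeV.
u_par, u_perp, response = recoil_components(100.0, 0.0, -10.0, 0.0)
```

In this toy event the recoil is under-measured by 10%, which shows up as a negative u ∥ and a response of 0.9.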
The E / T energy resolution is assessed with a parametrization of the u ∥ + q T and u ⊥ distributions by a Voigtian function, defined by the convolution of a Breit-Wigner distribution and a Gaussian distribution, as it is found to describe the observed u ∥ + q T and u ⊥ distributions very well. The resolutions of u ∥ and u ⊥ , denoted by σ (u ∥ ) and σ (u ⊥ ), are given by the full width at half maximum of the Voigtian profile, divided by 2 √(2 ln 2) ≈ 2.35.
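As a rough illustration of the last step, the sketch below converts Voigtian shape parameters into a resolution. The text does not state how the full width at half maximum is extracted from the fit; here a widely used closed-form approximation for the Voigt width (Olivero and Longbothum) is swapped in, and the function names are hypothetical.

```python
import math

# Conversion between FWHM and standard deviation for a Gaussian, ~2.355.
FWHM_TO_SIGMA = 2.0 * math.sqrt(2.0 * math.log(2.0))


def voigt_fwhm(sigma_gauss, gamma_bw):
    """Approximate FWHM of a Voigtian profile.

    sigma_gauss : standard deviation of the Gaussian component
    gamma_bw    : full width at half maximum of the Breit-Wigner component
    """
    f_g = FWHM_TO_SIGMA * sigma_gauss
    f_l = gamma_bw
    # Closed-form approximation, not an exact evaluation of the convolution.
    return 0.5346 * f_l + math.sqrt(0.2166 * f_l ** 2 + f_g ** 2)


def recoil_resolution(sigma_gauss, gamma_bw):
    """Resolution defined as FWHM / (2 sqrt(2 ln 2))."""
    return voigt_fwhm(sigma_gauss, gamma_bw) / FWHM_TO_SIGMA
```

With a vanishing Breit-Wigner width the result reduces to the Gaussian standard deviation, which is the motivation for the 2.35 normalization.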

Measurement of PF E / T scale and resolution
The decomposition of the hadronic recoil momentum into u ⊥ and u ∥ components provides a natural basis in which to evaluate PF E / T characteristics. Distributions of u ⊥ are shown in figure 7 for Z → µ + µ − , Z → e + e − , and direct-photon events. The component u ⊥ is expected to be centred at zero by construction, and to be symmetric as it arises primarily from random detector noise and the underlying event. Distributions of u ∥ + q T are also shown in figure 7. Again by construction, u ∥ is balanced with q T , thus making u ∥ + q T centred around zero and approximately symmetric. The increased uncertainty in the u ∥ + q T and u ⊥ distributions around ±70 GeV is due to the jet energy resolution uncertainty. The response curves extracted from data, − u ∥ /q T versus q T , are shown in figure 8 for Z → µ + µ − , Z → e + e − , and direct-photon events. Deviations from unity indicate a biased hadronic recoil energy scale. The agreement between data and simulation is reasonable for each channel. The curves fitted to Z data indicate that the PF E / T is able to fully recover the hadronic recoil activity corresponding to a Lorentz-boosted Z boson with q T ∼ 40 GeV. Below 40 GeV, the uncorrected unclustered energy contribution (energy not contained within jets or leptons) starts to be significant compared to the corrected energy of the recoiling jets, leading to an underestimation of the response. The curves fitted to γ + jets data are 2-3% lower than those fitted to Z data at q T < 100 GeV. This effect primarily stems from the large contribution of QCD multijet events to the q T < 100 GeV region of the selected photon sample. In these QCD multijet events, the hadronic recoil of the photon candidate tends to have a higher contamination of gluon jets. As the calorimeter response to gluon jets is characteristically lower than for quark jets due to differences in jet composition and collimation, the overall average response is reduced for the photon sample in this region.
The resolution curves, σ (u ∥ ) and σ (u ⊥ ) versus q T , are shown in figure 9. The resolution increases with increasing q T , and the data and simulation curves are in reasonable agreement for each channel. As the hadronic recoil is produced opposite to the direction of the Z boson or direct photon, σ (u ∥ ) scales linearly with q T while σ (u ⊥ ) is less impacted by the value of q T .
[Figure caption, beginning not recovered] … and direct-photon events (right); the points in the lower panel of each plot show the data/MC ratio, including the statistical uncertainties of both data and simulation; the grey error band displays the systematic uncertainty of the simulation. The first (last) bin contains the underflow (overflow) content.
The Z-boson and γ + jets q T spectra differ from one another, and comparison of resolution curves between the Z and γ + jets channels may be affected by their dependence on the q T spectrum. Thus, for the remaining resolution curves where direct comparisons between the Z-boson and γ + jets channels are shown, both Z-boson and γ + jets events are required to satisfy q T > 100 GeV, and event-by-event reweighting of both Z data and simulation is applied to make their q T spectra similar to that of γ + jets data.

Figure 10 shows the resolution of the PF E / T projections along the x and y axes as a function of PF ∑ E T . The PF ∑ E T is the scalar sum of E T of all the particles reconstructed by the particle-flow reconstruction, except for the selected direct photon or the selected dileptons from the decay of the Z-boson candidate. Resolution curves are found to be in agreement when comparing different channels and are well described by the simulation. The resolution curves for the components of PF E / T can be parametrized by a linear relationship,

$$\sigma({E\!\!\!/}_{x}, {E\!\!\!/}_{y}) = \sigma_{0} + \sigma_{s}\sqrt{\sum E_{T}},$$

where σ 0 is the intrinsic detector noise resolution and σ s is the E / T resolution stochastic term. Since the fit only contains data with PF ∑ E T above 300 GeV, the σ 0 parameter is not well constrained in the fits, and has sizable uncertainties. The uncertainties of the σ 0 parameter are smaller in γ + jets data than in Z data due to a larger data sample in the former case. The stochastic term is σ s ∼ 0.6 and is compatible for different channels, as shown in table 2.

Figure 8. Response curves for PF E / T in events with a Z-boson or direct photon. Results are shown for Z → µ + µ − events (full blue circles), Z → e + e − events (open red circles), and direct-photon events (full green squares). The upper frame shows the response in data; the lower frame shows the ratio of data to simulation with the grey error band displaying the systematic uncertainty of the simulation, estimated as the maximum of each channel systematic uncertainty. The q T value for each point is determined based on the average q T value in data contributing to each point.

Figure 11 shows the resolution curves σ (u ∥ ) and σ (u ⊥ ) versus the number of primary vertices N vtx , for both Z-boson channels and the γ + jets channel. The offset of the curve is related to the resolution in Z or γ + jets events without pileup and the dependence on N vtx indicates how much the pileup degrades the E / T resolution. Since the hard-scatter interaction and each additional collision are uncorrelated, these resolution curves can be parametrized by the function,

$$\sigma_{\parallel,\perp}(N_{\mathrm{vtx}}) = \sqrt{\sigma_{c}^{2} + \frac{N_{\mathrm{vtx}}}{0.7}\,\sigma_{\mathrm{PU}}^{2}},$$

where σ c is the resolution term induced by the hard-scatter interaction and σ PU is the resolution term induced on average by one additional pileup collision. The factor 0.7 accounts for the fact that only approximately 70% of pp interactions produce a reconstructed vertex isolated from other vertices. Results of the parameterizations are given in table 3. The table shows that the different channels are compatible with each other, and that the simulation offers a good description of the performance obtained in data. For each additional pileup interaction, the PF E / T resolution is degraded by around 3.3-3.6 GeV in quadrature. As a pileup interaction is isotropic, the PF E / T response is not impacted by the number of additional pileup interactions in the event.
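The pileup dependence of the resolution can be sketched numerically. The snippet assumes the quadrature form σ² = σ c ² + (N vtx /0.7) σ PU ², which follows the description of the two resolution terms and the 0.7 vertex factor given in the text; the function name and the value σ c = 14 GeV are placeholders chosen for illustration, not measured results.

```python
import math


def resolution_vs_nvtx(n_vtx, sigma_c, sigma_pu, vertex_fraction=0.7):
    """Recoil resolution (GeV) for a given number of reconstructed vertices.

    n_vtx / vertex_fraction estimates the number of pp interactions, since
    only about 70% of them yield an isolated reconstructed vertex.
    """
    n_interactions = n_vtx / vertex_fraction
    return math.sqrt(sigma_c ** 2 + n_interactions * sigma_pu ** 2)


# Hypothetical inputs: sigma_c = 14 GeV is a placeholder, sigma_PU = 3.5 GeV
# sits inside the 3.3-3.6 GeV range quoted in the text.
curve = [(n, resolution_vs_nvtx(n, 14.0, 3.5)) for n in (0, 7, 14, 21)]
```

Because the terms add in quadrature, the curve grows like the square root of N vtx at large pileup rather than linearly.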
Figure 9. Resolution curves of the parallel recoil component (left) and perpendicular recoil component (right) versus Z/γ q T for PF E / T in events with a Z-boson or γ. Results are shown for Z → µ + µ − events (full blue circles), Z → e + e − events (open red circles), and direct-photon events (full green squares). The upper frame of each figure shows the resolution in data; the lower frame shows the ratio of data to simulation with the grey error band displaying the systematic uncertainty of the simulation, estimated as the maximum of each channel systematic uncertainty. The q T value for each point is determined based on the average q T value in data contributing to each point.

Table 2. Parametrization results of the resolution curves for the components of PF E / T , as functions of PF ∑ E T . The parameter values σ 0 and σ s are obtained from data. For each parameter, we also present R r , the ratio of values obtained in data and simulation. For the ratios, the first uncertainty is from the fit, and the second uncertainty corresponds to the propagation of the following into the parameterization: systematic uncertainties in the jet energy scale, jet energy resolution, lepton/photon energy scale, and unclustered energy scale, as well as, for direct-photon events only, the systematic uncertainty assigned to the QCD multijet estimation response correction described in section 3.

Table 3. Parametrization results of the resolution curves for the u ∥ and u ⊥ components calculated with the PF E / T as functions of N vtx . The parameter values σ c and σ PU are obtained from data. For each parameter, we also present R r , the ratio of values obtained in data and simulation. For the ratios, the first uncertainty is from the fit, and the second uncertainty corresponds to the propagation of the following into the parameterization: systematic uncertainties in the jet energy scale, jet energy resolution, lepton/photon energy scale, and unclustered energy scale, as well as, for photon events only, the systematic uncertainty assigned to the QCD multijet estimation response correction described in section 3.

… how the PF reconstruction of E / T has stronger performance in terms of E / T resolution dependence on pileup relative to the E / T reconstruction based solely on the calorimeters.

Pileup-mitigated E / T
Since the vast majority of pileup interactions do not have significant E / T and the average value of E / T projected on any axis is zero, the effect of pileup interactions on the E / T response is small. However, as shown in section 6, pileup interactions have a considerable effect on the E / T resolution. Table 3 shows that each pileup interaction adds 3.3-3.6 GeV of smearing in quadrature to the resolution of both u ⊥ and u ∥ in Z → µ + µ − , Z → e + e − , and direct-photon events. In events where the recoil p T is small and the number of pileup interactions is around the mean value of the sample collected during the 2012 run, which corresponds to approximately 21 pileup interactions, the contribution to the E / T resolution from pileup interactions is larger than the contribution from the hadronic recoil. In this section we discuss two algorithms that reduce the effect of pileup interactions on the E / T reconstruction, hereafter referred to as the No-PU PF E / T and MVA PF E / T algorithms. These algorithms divide each event into two components: particles that are likely to originate from the primary hard-scattering pp interaction (HS particles) and particles that are likely to originate from pileup interactions (PU particles).
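To give a sense of scale, the numbers quoted above can be combined in a short worked estimate. It assumes only that the per-interaction smearing adds in quadrature over approximately 21 uncorrelated pileup interactions; the symbol N PU is introduced here for illustration.

```latex
\sigma_{\text{pileup}} \approx \sigma_{\mathrm{PU}} \sqrt{N_{\mathrm{PU}}}
  = (3.3\text{--}3.6~\mathrm{GeV}) \times \sqrt{21}
  \approx 15.1\text{--}16.5~\mathrm{GeV}
```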

Identification of PU-jets
Separation of charged PF particles originating from the primary hard-scattering pp interaction and those from pileup interactions is best performed by matching them to either the primary vertex or to pileup vertices. This information is also used to identify jets originating primarily from pileup interactions (pileup jets). Pileup jets often appear as an agglomeration of lower-p T subjets. To identify pileup jets we use a multivariate boosted decision tree (BDT) algorithm that uses jet shape variables and vertex information and is referred to as the "MVA pileup jet identification discriminator" (MVA pileup jet ID) [31]. Both No-PU and MVA PF E / T algorithms utilize the MVA pileup jet ID.
Details of the No-PU and MVA PF E / T algorithms and their performance in Z → µ + µ − , Z → e + e − , and γ + jets events are presented in the following sections. These algorithms provide a crucial improvement to physics analyses sensitive to low or moderate E / T values, such as Higgs boson searches in the τ-lepton final states [32].

The No-PU PF E / T algorithm
The No-PU PF E / T algorithm computes the transverse momentum imbalance by separately weighting contributions from the HS and PU particles. In contrast to the global pileup correction included in eq. (4.2), this algorithm therefore treats individual particles.
The particles that are classified as HS particles are:
• "leptons" (electrons/photons, muons, and hadronic tau decays),
• particles within jets of p T > 30 GeV that pass the MVA pileup jet ID (HS-jets),
• charged hadrons associated to the hard-scatter vertex (unclustered HS-charged hadrons), by matching the associated tracks to the reconstructed vertex of the event.
Particles that are considered to be PU particles are:
• charged hadrons that are neither within jets of p T > 30 GeV nor associated to the hard-scatter vertex (unclustered PU-charged hadrons),
• neutral particles not within jets of p T > 30 GeV (unclustered neutrals),
• particles within jets of p T > 30 GeV that fail the MVA pileup jet ID (PU-jets).
HS particles enter the transverse momentum balance in the usual way (see section 4). The transverse momenta of PU particles are scaled down in order to reduce the impact of pileup on the E / T resolution. The scale factor is based on the ratio of the scalar sum of the transverse momenta of charged particles that originate from the hard-scattering pp collision and are neither associated to leptons nor to jets of p T > 30 GeV (unclustered HS-charged hadrons) to the scalar sum of the transverse momenta of all unclustered charged hadrons in the event,

$$S_{F} = \frac{\sum_{\text{HS-charged}} p_{T}}{\sum_{\text{HS-charged}} p_{T} + \sum_{\text{PU-charged}} p_{T}}.$$

Based on this scale factor, the No-PU PF E / T is then computed as,

$$\vec{E}\!\!\!/_{T} = -\Big[\sum_{\text{leptons}} \vec{p}_{T} + \sum_{\text{HS-jets}} \vec{p}_{T} + \sum_{\text{HS-charged}} \vec{p}_{T} + S_{F}\Big(\alpha \sum_{\text{PU-charged}} \vec{p}_{T} + \beta \sum_{\text{neutrals}} \vec{p}_{T} + \gamma \sum_{\text{PU-jets}} \vec{p}_{T} + \delta\, \vec{\Delta}_{\mathrm{PU}}\Big)\Big]. \qquad (7.2)$$

The ∆ PU term is added in a similar way as was done for the pileup correction applied to the PF E / T (cf. eq. (4.2)), which improves the No-PU PF E / T resolution. The parameters α, β , γ, and δ have been determined by numerical optimization of the E / T resolution using a sample of simulated Z → µ + µ − events. The optimal values found by this procedure are α = 1.0, β = 0.6, γ = 1.0, and δ = 1.0.
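The weighting scheme can be sketched as follows. This is an illustration under stated assumptions rather than the CMS implementation: the assignment of α, β, γ, and δ to the PU-charged, unclustered-neutral, PU-jet, and ∆ PU terms is assumed (only β is tied to neutral particles in the text), the fallback for events without unclustered charged hadrons is a choice made here, and all names and the toy event are hypothetical.

```python
import math


def vector_sum(momenta):
    """Vector sum of a list of (px, py) pairs."""
    return sum(p[0] for p in momenta), sum(p[1] for p in momenta)


def scalar_sum(momenta):
    """Scalar sum of |pT| over a list of (px, py) pairs."""
    return sum(math.hypot(px, py) for px, py in momenta)


def no_pu_met(leptons, hs_jets, hs_charged, pu_charged, neutrals, pu_jets,
              delta_pu=(0.0, 0.0), alpha=1.0, beta=0.6, gamma=1.0, delta=1.0):
    """Return (met_x, met_y, scale_factor) under the assumed No-PU form."""
    hs_pt = scalar_sum(hs_charged)
    total_pt = hs_pt + scalar_sum(pu_charged)
    # With no unclustered charged hadrons the ratio is undefined; using 0 is
    # a choice made for this sketch only.
    s_f = hs_pt / total_pt if total_pt > 0.0 else 0.0

    # HS particles enter with unit weight.
    hs_x, hs_y = vector_sum(leptons + hs_jets + hs_charged)
    # PU components are weighted individually, then scaled by s_f.
    pu_terms = ((alpha, vector_sum(pu_charged)),
                (beta, vector_sum(neutrals)),
                (gamma, vector_sum(pu_jets)),
                (delta, delta_pu))
    pu_x = sum(w * v[0] for w, v in pu_terms)
    pu_y = sum(w * v[1] for w, v in pu_terms)
    return -(hs_x + s_f * pu_x), -(hs_y + s_f * pu_y), s_f


# Toy event (all momenta in GeV, values hypothetical).
met_x, met_y, s_f = no_pu_met(
    leptons=[(40.0, 0.0)], hs_jets=[(-30.0, 0.0)], hs_charged=[(-6.0, 0.0)],
    pu_charged=[(0.0, 2.0)], neutrals=[(-5.0, 0.0)], pu_jets=[])
```

In the toy event three quarters of the unclustered charged p T comes from the hard-scatter vertex, so the PU-like components enter with a weight of 0.75 before the individual α, β, γ, δ factors are applied.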

MVA PF E / T algorithm
The MVA PF E / T algorithm is based on a set of multivariate regressions that provide an improved measurement of the E / T in the presence of a high number of pileup interactions. The MVA PF E / T is computed as a correction to the hadronic recoil u T reconstructed from PF particles. The correction is obtained in two steps. First, we compute a correction to the direction of u T by training a BDT to match the true hadronic recoil direction in simulated events. In the second step, another BDT is trained to predict the magnitude of the true u T on a dataset where we have already corrected the direction of the u T using the regression function from the first step. The corrected u T is then added to q T to obtain the negative MVA PF E / T . The regression for the correction to the recoil angle is trained on a simulated Z → µ + µ − data sample. The training for the recoil magnitude correction uses a mixture of simulated Z → µ + µ − and γ+jets events. The simulated γ+jets sample is added to the training to ensure a sufficiently large training sample over the whole q T region.
To construct the MVA PF E / T , we compute five E / T variables calculated from PF particles:
1. E / T (1) ≡ − ∑ X 1 p T , where X 1 is the set of all PF particles (= PF E / T without correction);
2. E / T (2) ≡ − ∑ X 2 p T , where X 2 is the set of all charged PF particles that have been associated to the selected hard-scatter vertex;
3. E / T (3) ≡ − ∑ X 3 p T , where X 3 is the set of all charged PF particles that have been associated to the selected hard-scatter vertex and all neutral PF particles within jets that have passed the MVA pileup jet ID;
4. E / T (4) ≡ − ∑ X 4 p T , where X 4 is the set of all charged PF particles that have not been associated to the selected hard-scatter vertex and all neutral PF particles within jets that have failed the MVA pileup jet ID;
5. E / T (5) ≡ − ∑ X 5 p T + ∑ Y 5 p T , where X 5 is the set of all charged PF particles that have been associated to the selected hard-scatter vertex and all neutral PF particles (also those that have not been clustered into jets), while Y 5 is the set of all neutral PF particles within jets that have failed the MVA pileup jet ID.
The choice of these variables is intended to address five different sub-components of an event, which can be decorrelated from each other by considering various linear combinations of the E / T (i) variables:
• the charged PF particles from the hard scatter (in E / T (1), E / T (2), E / T (3) and E / T (5));
• the charged PF particles not from the hard scatter (in E / T (1) and E / T (4));
• the neutral PF particles in jets passing the MVA pileup jet ID (in E / T (1), E / T (3) and E / T (5));
• the neutral PF particles in jets failing the MVA pileup jet ID (in E / T (1), E / T (4) and E / T (5));
• the unclustered neutral PF particles (in E / T (1) and E / T (5)).
For each of the E / T (i) variables, the vector u T (i) is computed using the definition from section 6. The BDT regression then takes as inputs the magnitude and azimuthal angle φ of all five types of u T ; the scalar p T sum of all PF particles for each respective E / T variable; the momentum vectors of the two highest p T jets in the event; and the number of primary vertices.
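The five input variables can be made concrete with a small sketch that follows the set definitions X 1 to X 5 and Y 5 listed above. The particle representation (dictionaries with the keys `px`, `py`, `charged`, `hs_vertex`, and `jet_id`) and the toy event are hypothetical; charged particles are classified by vertex association only.

```python
def neg_sum(particles):
    """Negative vector sum of transverse momenta, returned as (x, y)."""
    return (-sum(p["px"] for p in particles), -sum(p["py"] for p in particles))


def mva_input_mets(particles):
    """Return [MET(1), ..., MET(5)] as (x, y) pairs for one event."""
    charged_hs = [p for p in particles if p["charged"] and p["hs_vertex"]]
    charged_pu = [p for p in particles if p["charged"] and not p["hs_vertex"]]
    neutral = [p for p in particles if not p["charged"]]
    neutral_pass = [p for p in neutral if p["jet_id"] == "pass"]
    neutral_fail = [p for p in neutral if p["jet_id"] == "fail"]

    met1 = neg_sum(particles)                   # X1: all PF particles
    met2 = neg_sum(charged_hs)                  # X2
    met3 = neg_sum(charged_hs + neutral_pass)   # X3
    met4 = neg_sum(charged_pu + neutral_fail)   # X4
    x5 = neg_sum(charged_hs + neutral)          # -sum over X5
    y5 = neg_sum(neutral_fail)                  # -sum over Y5
    met5 = (x5[0] - y5[0], x5[1] - y5[1])       # -sum(X5) + sum(Y5)
    return [met1, met2, met3, met4, met5]


def particle(px, py, charged, hs_vertex=False, jet_id=None):
    """Hypothetical helper; jet_id is 'pass', 'fail', or None (unclustered)."""
    return {"px": px, "py": py, "charged": charged,
            "hs_vertex": hs_vertex, "jet_id": jet_id}


event = [
    particle(10.0, 0.0, charged=True, hs_vertex=True),
    particle(0.0, 3.0, charged=True, hs_vertex=False),
    particle(5.0, 0.0, charged=False, jet_id="pass"),
    particle(0.0, 4.0, charged=False, jet_id="fail"),
    particle(1.0, 1.0, charged=False, jet_id=None),
]
mets = mva_input_mets(event)
```

In this toy event E / T (3) + E / T (4) differs from E / T (1) only by the unclustered neutral particle, which illustrates how linear combinations of the five variables isolate individual sub-components.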
Two versions of MVA PF E / T are used in the following studies. The first one is trained to optimize the E / T resolution, and the second one is trained to reach unity E / T response. The latter one is denoted as the unity training and the related MVA PF E / T is called MVA Unity PF E / T . The unity response training is performed in the same sample used for the non-unity response training. To ensure the uniformity of the MVA Unity PF E / T training as function of q T , the events have an additional weight in the training such that the reweighted q T distribution is flat over the full range.
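A flat q T spectrum for the training can be obtained, for example, by weighting each event with the inverse of the population of its q T bin. The sketch below shows this idea; the binning, the function name, and the toy values are assumptions, and the actual weighting scheme used for the unity training is not specified in the text.

```python
import bisect


def flat_qt_weights(qt_values, bin_edges):
    """Per-event weights that equalize the weighted content of occupied bins.

    Bins are half-open, [edge_i, edge_{i+1}); events outside the binned
    range receive weight 0.
    """
    n_bins = len(bin_edges) - 1
    counts = [0] * n_bins
    indices = []
    for qt in qt_values:
        i = bisect.bisect_right(bin_edges, qt) - 1
        if 0 <= i < n_bins:
            counts[i] += 1
            indices.append(i)
        else:
            indices.append(None)
    return [0.0 if i is None else 1.0 / counts[i] for i in indices]


# Toy q_T values in GeV: three events in the first bin, one in the third,
# and one beyond the last edge.
weights = flat_qt_weights([5.0, 6.0, 7.0, 25.0, 35.0], [0.0, 10.0, 20.0, 30.0])
```

Empty bins remain empty after reweighting, so the binning has to be chosen such that every bin of interest is populated.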
The No-PU, MVA PF E / T , and MVA Unity PF E / T distributions for Z → µ + µ − , Z → e + e − , and γ + jets events are shown in figures 14, 15, and 16, respectively. Simulation and data are in agreement within the uncertainties.
Some difference between data and simulation can be seen in the region E / T ≤ 70 GeV. The systematic uncertainty in this region is sizeable, and is dominated by the uncertainty in the jet energy resolution (JER) [27]. It is found that the JER in simulated events is overestimated by 5% (up to 20%) for jets reconstructed within (outside) the geometric acceptance of the tracking detectors. The effect is accounted for by smearing the momenta of jets in simulated events by the measured difference in the JER. The uncertainty in the correction is of a similar size to the correction itself. The difference between data and simulation in the E / T distribution is covered by the present JER uncertainty within one standard deviation.
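One common way to implement such a smearing is to stretch the difference between the reconstructed and the generator-level jet p T by the data-to-simulation resolution ratio. The sketch below illustrates that approach only; whether this exact procedure was used here is an assumption, and the function name and the numerical inputs are hypothetical.

```python
def smear_jet_pt(pt_reco, pt_gen, resolution_scale):
    """Scale the reco-gen difference of a matched simulated jet.

    resolution_scale : ratio of the resolution in data to that in simulation
                       (e.g. 1.05 for a 5% difference); assumed input.
    """
    # Clamp at zero so that large downward fluctuations stay physical.
    return max(0.0, pt_gen + resolution_scale * (pt_reco - pt_gen))


# Toy jet: generated at 90 GeV, reconstructed at 100 GeV, 5% wider resolution.
smeared = smear_jet_pt(pt_reco=100.0, pt_gen=90.0, resolution_scale=1.05)
```

The smeared momentum would then be propagated to the E / T by replacing the original jet momentum in the vector sum.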

Measurement of No-PU and MVA PF E / T scale and resolution
The response curves of the No-PU PF E / T , MVA PF E / T , and MVA Unity PF E / T algorithms for Z → µ + µ − , Z → e + e − , and γ + jets events are shown in figure 17. Data and simulated distributions show good agreement, except at the lowest q T where the recoil direction is not well defined and becomes sensitive to small discrepancies in the simulation of low p T particles. The No-PU PF E / T response approaches unity more slowly than the standard PF E / T (figure 8) for Z → µ + µ − and Z → e + e − events. This is due to events in which a sizeable fraction of particles originating from the hard scatter interaction do not carry an electric charge. The response stays below unity for γ + jets events. The parameter β = 0.6 in eq. (7.2) has been optimized to yield the best E / T resolution. Its effect is that the contribution of neutral particles to the PF E / T computation, which are difficult to separate into distinct contributions from the hard scatter interaction and pileup, is underestimated by 40% on average. The MVA PF E / T response is around 0.9 even at high q T , since the BDT is trained to achieve the best E / T resolution, even if at the expense of worse response. In contrast, the MVA Unity PF E / T reaches a unity response, due to the dedicated training to achieve the best resolution given the condition of having unity response.

2015 JINST 10 P02006
One conclusion of our studies is that there is a general conflict of objectives between achieving the best PF E / T resolution and reaching a response close to unity. In order to make the resolution insensitive to pileup, one needs to scale down the contribution to the PF E / T computation of "unclustered" particles and low-p T jets, both of which are abundantly produced in minimum bias interactions. This procedure inevitably reduces the response at low q T .
The resolutions of the u ⊥ and u ∥ components versus boson q T are shown in figures 18-20 for the No-PU, MVA, and MVA Unity PF E / T . Good agreement is observed between data and simulation for the various algorithms, and between the various channels. The resolution distributions as a function of N vtx are shown in figure 21 and also include, as a reference, the standard PF E / T algorithm shown in figure 9, fully corrected as described in section 4. The No-PU PF E / T and particularly MVA and MVA Unity PF E / T show a significantly reduced dependence of the resolution on pileup interactions in both data and simulation. This reduced pileup dependence can significantly increase the sensitivity of searches for new physics. As an example, use of the MVA PF E / T improved the sensitivity of the search for the Higgs boson decaying into tau-lepton pairs by ∼20% with respect to the PF E / T [32].

The E / T significance
The ability to distinguish between events with spurious E / T and those with genuine E / T is important for analyses using missing transverse energy variables. Spurious E / T may arise from object misreconstruction, finite detector resolution, or detector noise. To help identify such events, we have developed a missing transverse energy significance variable, which we will denote by "E / T significance", or simply S. On an event-by-event basis, S evaluates the p-value that the observed E / T is inconsistent with a null hypothesis, E / T = 0, given the full event composition and resolution functions for each object in the event. A high value of S is an indication that the E / T observed in the event is not well explained by resolution smearing alone, suggesting that the event may contain unseen objects such as neutrinos or more exotic weakly interacting particles. A first version of the E / T significance algorithm has been described in ref. [3].

Figure 19. Resolution of the parallel (left) and perpendicular (right) recoil component as a function of q T for the MVA PF E / T in Z → µ + µ − events (full blue circles), Z → e + e − events (open red circles), and direct-photon events (full green squares). The upper frame of each figure shows the resolution in data; the lower frame shows the ratio of data to simulation with the grey error band displaying the systematic uncertainty of the simulation, estimated as the maximum of each channel systematic uncertainty.

Definition of S
The significance is defined as the log-likelihood ratio,

$$S \equiv 2 \ln\left(\frac{\mathcal{L}(\vec{\varepsilon} = \sum \vec{\varepsilon}_{i})}{\mathcal{L}(\vec{\varepsilon} = 0)}\right). \qquad (8.1)$$

The numerator expresses the likelihood of the hypothesis under test that the true value (ε) of the missing transverse energy is equal to the observed value (∑ ε i ), while the denominator expresses the likelihood of the null hypothesis, that the true missing transverse energy is actually zero. Under the null hypothesis, observation of any non-zero missing transverse energy is attributed to resolution smearing. The formulation in eq. (8.1) is completely general and accommodates any probability distribution functions for the object resolutions; throughout the bulk of this discussion, however, we assume Gaussian resolutions for measured quantities. This assumption accurately describes the dominant behavior of energy and momentum measurements in CMS and greatly simplifies the computation of S as the convolution integrals underlying the likelihood functions can be done analytically. In the Gaussian model, we obtain a simple closed-form solution,

$$S = \Big(\sum \vec{\varepsilon}_{i}\Big)^{T}\, V^{-1}\, \Big(\sum \vec{\varepsilon}_{i}\Big), \qquad (8.2)$$

in which V is the 2×2 covariance matrix of the total missing transverse energy computed by propagating the uncertainties of all objects in the event or in a defined subset of the event; more details are given in ref. [3]. A particularly useful feature of the Gaussian approximation is that S, as defined by eq. (8.2), is a χ 2 variable with two degrees of freedom (one degree of freedom for each component of E / T ). For clarity, we note that the term "significance" is often used to denote a linear quantity of the form x/σ x while here it is defined as the quadratic form x 2 /σ 2 x . Despite the convenience of eq. (8.2), a full treatment of E / T significance must also include non-Gaussian resolutions as these are known to occur at the percent level in jet measurements. In section 8.5 of this paper we therefore extend the treatment of S to handle such cases.
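In the Gaussian model the significance reduces to a quadratic form in the observed E / T components, which can be evaluated in closed form for a 2×2 covariance matrix. The following is a minimal sketch; the function name and the numerical inputs are hypothetical.

```python
def met_significance(met_x, met_y, v_xx, v_xy, v_yy):
    """Quadratic form met^T V^{-1} met for a symmetric 2x2 covariance V."""
    det = v_xx * v_yy - v_xy ** 2
    if det <= 0.0:
        raise ValueError("covariance matrix must be positive definite")
    # Explicit inverse of the 2x2 matrix applied to the MET vector.
    return (v_yy * met_x ** 2
            - 2.0 * v_xy * met_x * met_y
            + v_xx * met_y ** 2) / det


# Toy event: MET of 30 GeV along x with an isotropic 10 GeV resolution,
# i.e. a "3 sigma" fluctuation in the linear sense, gives S = 9.
s_value = met_significance(30.0, 0.0, 100.0, 0.0, 100.0)
```

The toy value illustrates the remark above: S corresponds to the square of the linear quantity x/σ x .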

Jet resolutions
The E / T resolution captured in the covariance matrix V of eq. (8.2) is determined mainly by the momentum resolution of the hadronic components of the event. For the purpose of E / T significance we separate the hadronic activity into jets with p T ≥ 20 GeV, which are reconstructed with the PF algorithm, and unclustered energy with p T < 20 GeV. The jets are treated as individual objects, each with a unique resolution function depending on the p T and η of the jet, while the objects in the unclustered energy are summed vectorially to produce a single object with p T = ∑ i p i T , whose resolution is determined separately. This division separates those components of the event that carry strong azimuthal information and contribute distinctively to the topology of the event from those that are relatively featureless and contribute only to a general broadening of the E / T resolution. Subsequent results are not sensitive to the choice of the 20 GeV threshold.
The resolution functions of hadronic jets are parametrized with a Crystal Ball function, which has a core Gaussian function with additional power-law terms that describe small non-Gaussian tails [33]. The parameter values are determined initially with samples of QCD multijet events generated by PYTHIA v6.4.24 [13], with jets propagated through the full simulation of the CMS detector; the reconstructed and generated values of p T , η, and φ are compared to extract resolution shapes. A full description of a single jet's Gaussian core resolution is given by the covariance matrix,

$$U = \begin{pmatrix} \sigma_{p_{T}}^{2} & 0 \\ 0 & p_{T}^{2}\,\sigma_{\phi}^{2} \end{pmatrix},$$

in which we assume no correlation between p T and φ terms. Both σ p T and σ φ are functions of both p T and η. As written, the covariance matrix U is in the coordinate system aligned with the jet; in use, all such matrices are rotated by the jet azimuthal angle φ into the common CMS xy basis:

$$V = R(\phi)\, U\, R^{-1}(\phi).$$

The widths of the core Gaussian functions obtained from simulation as described above are retuned with data using the Z → µ + µ − control sample defined in section 3.2. This is effectively a zero-E / T sample and the observed E / T is therefore expected to derive primarily from jet resolution smearing rather than from genuine E / T . In this sample, jet activity is modest and the E / T characteristics are dominated by the largely isotropic features of the unclustered energy. The E / T significance therefore conforms well to the null hypothesis, and we use this fact to optimize the Gaussian widths. Each Gaussian width, σ MC , obtained from simulation is rescaled by an η-dependent correction factor: σ (η) = a(η) × σ MC ; the correction factors (in five bins of |η|) are determined by a likelihood fit over the Z → µ + µ − data sample in which we seek to maximize the likelihood of the null hypothesis, L(ε = 0). To reduce possible biases stemming from events with sources of genuine E / T , the fit is performed iteratively with a restriction to exclude high-significance events.
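The rotation of a jet covariance matrix into the xy basis can be written out explicitly. The sketch assumes a diagonal jet-frame matrix with entries σ p T ² and p T ² σ φ ², consistent with the stated absence of correlation between the p T and φ terms; the function name and the example resolutions are hypothetical.

```python
import math


def jet_covariance_xy(pt, phi, sigma_pt, sigma_phi):
    """Return (V_xx, V_xy, V_yy) of one jet in the detector xy basis.

    sigma_pt  : absolute pT resolution of the jet (GeV)
    sigma_phi : azimuthal angle resolution (rad)
    """
    a = sigma_pt ** 2            # variance along the jet direction
    b = (pt * sigma_phi) ** 2    # variance transverse to the jet direction
    c, s = math.cos(phi), math.sin(phi)
    # V = R(phi) diag(a, b) R(phi)^T, written out element by element.
    v_xx = c * c * a + s * s * b
    v_yy = s * s * a + c * c * b
    v_xy = c * s * (a - b)
    return v_xx, v_xy, v_yy


# Toy jet: 50 GeV, 10% pT resolution, 0.02 rad angular resolution, at 0.7 rad.
cov = jet_covariance_xy(50.0, 0.7, 5.0, 0.02)
```

Summing such matrices over all jets, and adding the unclustered-energy term, would give the total covariance that enters the significance.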
The unclustered energy resolution, σ uc , is parametrized by,

$$\sigma_{uc}^{2} = \sigma_{0}^{2} + \sigma_{s}^{2} \sum_{i=1}^{n} |\vec{p}_{T_{i}}|, \qquad (8.4)$$

where the summation is over the n low-p T objects included in the unclustered energy and σ 0 and σ s are free parameters obtained from the same likelihood fit as described above. Because the best fit normally returns σ 0 = 0 (as one would expect), we see that the resolution of the unclustered energy exhibits the general form σ uc ≈ √n σ X where the quantity σ 2 X measures the average contribution of low-p T objects to the E / T covariance. Its contribution to the E / T covariance matrix is taken to be isotropic,

$$\begin{pmatrix} \sigma_{uc}^{2} & 0 \\ 0 & \sigma_{uc}^{2} \end{pmatrix} = \sigma_{uc}^{2}\, I, \qquad (8.5)$$

as it is constructed from a large number of (mostly) uncorrelated, low-p T objects. The matrix I in eq. (8.5) is the identity matrix. In practice, a slight ellipticity due to fluctuations of the unclustered energy is found in some events but can be neglected without degrading the E / T significance performance. Systematic uncertainties associated with hadronic activity are evaluated using uncertainties on the jet energy scale (2-10%) and the energy scale of low energy particles entering into the unclustered energy (10%), and are displayed as gray bands in figures 22-25. The systematic uncertainty due to jet energy resolution and unclustered energy resolution is captured here as well.
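The unclustered-energy term can be sketched in a few lines. The form σ uc ² = σ 0 ² + σ s ² ∑ p T is assumed here, chosen because it reproduces the √n scaling described in the text when σ 0 = 0; the function name and the parameter values are placeholders, not fitted results.

```python
def unclustered_covariance(pt_values, sigma_0, sigma_s):
    """Return (V_xx, V_xy, V_yy) of the isotropic unclustered-energy term.

    pt_values : scalar pT of each low-pT object in the unclustered energy
    """
    variance = sigma_0 ** 2 + sigma_s ** 2 * sum(pt_values)
    # Isotropic contribution: variance times the 2x2 identity matrix.
    return variance, 0.0, variance


# Toy input: two objects of 4 and 5 GeV with placeholder parameters.
uc_cov = unclustered_covariance([4.0, 5.0], sigma_0=0.0, sigma_s=1.0)
```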
Electron and muon resolutions are assumed to be negligible when compared to those for the hadronic activity in each event, and thus do not enter into the E / T covariance.

Characteristics of
As S is χ 2 -distributed, an event sample that nominally has no genuine E / T should be flat in the χ 2 probability function for two degrees of freedom, P 2 (S). Here, P 2 (S) is defined such that 1 − P 2 (S) is the standard cumulative distribution function of the χ 2 statistic for two degrees of freedom. Both Z → µ + µ − and dijet samples from pp collisions are dominated by such events. The dijet sample is defined in section 3.1; though heavily populated by events with two high-p T jets, it is not restricted by any limit on the maximum number of jets.
We compare the distributions of S as well as P 2 (S) in data and simulation for both Z → µ + µ − and dijet samples in figures 22 and 23. The observed spectrum conforms to a χ 2 distribution in the core region, but begins to deviate slightly from a perfect χ 2 at high values of significance (S ≳ 9). Physics backgrounds containing nonzero true E / T (defined here to be E / T > 3 GeV) are present, but are negligible in comparison to the dominant zero-E / T population. Z → µ + µ − events with true E / T from heavy-quark decays and decays in flight are also found to contribute to the high-S region, although such events constitute only about 1% of the signal sample in simulation. The general agreement with a χ 2 distribution is also apparent in the P 2 (S) spectra, which are flat over the bulk of events and show an excess at low values of P 2 (S) (high values of S). It is helpful to keep in mind that P 2 (S) < 0.01 corresponds to S > 9.2, P 2 (S) < 0.02 corresponds to S > 7.8, and P 2 (S) < 0.05 corresponds to S > 6.0.
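The correspondence between P 2 (S) and S quoted above follows from the closed form of the χ 2 tail probability for two degrees of freedom, P 2 (S) = exp(−S/2). A short check, with hypothetical function names:

```python
import math


def chi2_prob_2dof(s):
    """Tail probability of a chi-square variable with two degrees of freedom."""
    if s < 0.0:
        raise ValueError("S must be non-negative")
    return math.exp(-0.5 * s)


def s_threshold(p):
    """Value of S above which the tail probability falls below p."""
    return -2.0 * math.log(p)


# Thresholds for the three probabilities mentioned in the text.
thresholds = {p: round(s_threshold(p), 1) for p in (0.01, 0.02, 0.05)}
```

A sample with no genuine E / T and correctly modelled Gaussian resolutions would populate P 2 (S) uniformly between 0 and 1.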

Events with
The presence of genuine E / T pushes events to higher values of S and lower values of P 2 (S), and thus can be used to separate events with genuine E / T from those with only resolution-induced E / T . To study the discrimination power of the significance variable, we use samples of events containing W-boson or tt production. The W → eν channel offers a probe of E / T significance in a scenario dominated by genuine E / T , accompanied by significant zero-E / T backgrounds; the semileptonic tt channel similarly provides a genuine E / T signal, but with background events predominantly from higher-E / T dileptonic tt decays.
The distributions in data and simulation of the E / T significance and the corresponding P 2 (S) are shown in figures 24 and 25 for both the W → eν and semileptonic tt events. Some interesting features are apparent in the composition of simulated events in the significance spectra. In the W → eν channel, events arising from zero true E / T physics channels, such as QCD and Drell-Yan events, are mostly found at low values of significance compared to the broad distribution of non-zero-E / T events. Some QCD events show large values of S, corresponding to the tail of the distribution observed in figure 22. The semileptonic tt channel has a significant non-zero-E / T background stemming from dileptonic tt decays. The dileptonic tt spectrum falls more slowly than the semileptonic tt signal in the tail region of S.

Performance in W → eν and semileptonic tt events
Here we examine the potential gain of introducing the significance variable into the selection criteria for W → eν and semileptonic tt events. Figure 26 shows signal versus background efficiencies for W → eν events in simulation, where increasing thresholds are placed on the value of S, PF E / T / √ ∑ E T , and PF E / T . (The green curve is discussed in section 8.5.) In the W → eν channel, there is a performance benefit in using E / T significance when compared to simpler background discrimination variables such as E / T alone or the approximate significance variable E / T / √ ∑ E T [34]. For example, choosing a working point with 50% signal efficiency yields a background efficiency of 8.2% using E / T , 5.1% using E / T / √ ∑ E T , and 4.0% using the significance as a discriminating variable. For reference, a 50% signal efficiency working point corresponds to an E / T > 40 GeV requirement. In the semileptonic tt channel, S provides discrimination that is comparable to E / T and E / T / √ ∑ E T . This reflects the fact that S is optimized for discriminating events that satisfy the null hypothesis (E / T = 0) from those that do not. In the case of semileptonic tt, the dominant background contribution comes from dileptonic tt decays with large, genuine E / T .
We have also evaluated the performance benefit of modeling individual jet resolutions down to 3 GeV, as in ref. [3], as an alternative to the current threshold of 20 GeV. Using a lower threshold for individual jets can potentially provide more detailed information about the low-p T hadronic activity, but we find that the performance in the W → eν channel is essentially indistinguishable when implemented with these two different thresholds, and therefore use the simpler 20 GeV threshold.
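The working-point comparison quoted above for the W → eν channel amounts to fixing the signal efficiency and reading off the background efficiency at the corresponding threshold. The following Python sketch illustrates that bookkeeping for an arbitrary discriminating variable. The function name and the toy input values are hypothetical and are not taken from the analysis.

```python
import numpy as np

def background_efficiency_at(signal, background, signal_eff=0.5):
    """Return (threshold, background efficiency) at a fixed signal efficiency.

    `signal` and `background` hold the per-event values of the discriminating
    variable (for example S or the missing transverse energy); events pass
    the selection when their value exceeds the threshold.
    """
    signal = np.asarray(signal, dtype=float)
    background = np.asarray(background, dtype=float)
    # Threshold chosen so that a fraction `signal_eff` of signal lies above it.
    threshold = float(np.quantile(signal, 1.0 - signal_eff))
    return threshold, float(np.mean(background > threshold))

# Toy values, for illustration only.
toy_signal = np.arange(1, 11)      # 1, 2, ..., 10
toy_background = np.arange(0, 8)   # 0, 1, ..., 7
threshold, bkg_eff = background_efficiency_at(toy_signal, toy_background)
print(threshold, bkg_eff)
```

Applying the same function to each candidate variable at the same signal efficiency gives directly comparable background efficiencies, which is the form of comparison used in the text.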

Pileup
The E / T significance variable exhibits simple behavior as a function of the number of pileup interactions. For event samples such as the Z → µ + µ − and dijet selections, in which in most events there is no source of true E / T , the mean value of S remains essentially constant as the number of primary vertices increases. In samples such as W → eν and tt, where the average value of E / T is non-zero, the mean value of S decreases with increasing pileup. This behavior can be derived formally from the expression for S given in eq. (8.2) with the isotropic model of unclustered energy given in eq. (8.5) if the additional covariance due to n pileup vertices is incorporated via the replacement V → V 0 + nσ 2 I. In this transformation V 0 represents the covariance matrix in the absence of pileup. It is also confirmed empirically in figure 27. As a side point, we note that the mean value of S is approximately 2 for the zero-E / T events, as one expects for a χ 2 variable with two degrees of freedom.
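The pileup behavior described above can be illustrated with a short calculation. The sketch below assumes that S takes the quadratic form x^T V^-1 x for a measured two-dimensional vector x with Gaussian fluctuations of covariance V around its true value. This is our assumption for illustration, since eq. (8.2) is not reproduced in this section. Under that assumption the expectation value of S is 2 + µ^T V^-1 µ, where µ is the true E / T vector. All numerical inputs are hypothetical.

```python
import numpy as np

def mean_significance(true_met, v0, sigma_pu, n_pileup):
    """Expected S for covariance V = V0 + n * sigma_pu**2 * I.

    Assumes S = x^T V^-1 x with x Gaussian-distributed around `true_met`
    with covariance V, so that E[S] = tr(V^-1 V) + mu^T V^-1 mu
                                    = 2 + mu^T V^-1 mu.
    """
    mu = np.asarray(true_met, dtype=float)
    v = np.asarray(v0, dtype=float) + n_pileup * sigma_pu**2 * np.eye(2)
    return 2.0 + float(mu @ np.linalg.solve(v, mu))

# Hypothetical inputs: 10 GeV resolution per axis without pileup,
# 3.5 GeV added in quadrature per pileup vertex.
v0 = np.diag([100.0, 100.0])
for n in (0, 10, 20, 30):
    zero_met = mean_significance([0.0, 0.0], v0, 3.5, n)
    genuine_met = mean_significance([40.0, 0.0], v0, 3.5, n)
    print(n, zero_met, round(genuine_met, 2))
```

In this toy model the zero-E / T expectation stays at 2 for any number of pileup vertices, while the expectation for events with genuine E / T falls as the pileup term inflates V.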
As a result of the pileup dependence observed for genuine E / T events, the background rejection performance of the E / T significance can also exhibit a dependence on pileup. This is demonstrated for the W → eν channel in figure 28. Here we see a decreasing signal efficiency as the pileup increases. It is also apparent that while the efficiencies of non-zero-E / T signal events depend on pileup, the efficiencies for the zero-E / T background events are relatively stable. It should be mentioned that a significance algorithm based on No-PU input objects would reduce the dependence of S on the number of additional pileup interactions.

Treatment of non-Gaussian resolutions
As noted earlier, the jet p T resolution functions exhibit non-Gaussian tails. The challenge presented by such tails lies in the convolution integrals needed to compute the E / T likelihood function. This can be done analytically for Gaussian resolutions, but not when non-Gaussian elements are introduced, and direct numerical convolution is prohibitively slow. The convolution process, however, can be reduced under Fourier transformation to a simple multiplication of the transformed functions. With this approach, each jet resolution function R i (p x , p y ) is transformed to R̃ i (k x , k y ), and then the product ∏ n i=1 R̃ i (k x , k y ) is computed and back-transformed to yield the fully convolved result. When computed with fast Fourier transform (FFT) techniques, this method enables the required convolutions to be done at a speed that, while slower than the evaluation of analytic functions, is still well within reason for late stages of analysis. Both R and R̃ are discretized on 2-dimensional grids in their respective spaces, and the resulting discretized likelihood function is smoothed by cubic spline interpolation before computing the significance. Care is taken in defining the grids to avoid artifacts that can result from aliasing. To verify the validity of this FFT method and its implementation, we have compared the results of the FFT and analytic methods for cases where only Gaussian resolutions are used and find the two methods yield identical results. When introduced into the selection criteria for W → eν events, the two methods give comparable results, as seen in figure 26.
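The FFT-based convolution described above can be sketched in a few lines of Python with NumPy. This is a minimal illustration and not the CMS implementation: the grid size, grid spacing and Gaussian widths are arbitrary choices of ours, and the cubic spline smoothing step is omitted. As in the validation mentioned in the text, the example checks the FFT result against the analytic convolution of two Gaussians, whose variances add along each axis.

```python
import numpy as np

def make_grid(n_points=256, step=0.5):
    """Square (px, py) grid centred on zero; must be wide enough to avoid aliasing."""
    axis = (np.arange(n_points) - n_points // 2) * step
    return np.meshgrid(axis, axis, indexing="ij")

def gaussian_2d(px, py, sx, sy):
    """Normalized, axis-aligned 2D Gaussian resolution function."""
    return np.exp(-0.5 * ((px / sx) ** 2 + (py / sy) ** 2)) / (2 * np.pi * sx * sy)

def convolve_resolutions(pdfs, step):
    """Convolve resolution functions sampled on a common zero-centred grid.

    Each function is Fourier transformed, the transforms are multiplied,
    and the product is transformed back. The factors of step**2 convert
    discrete sums into approximations of the continuous integrals.
    """
    product = np.ones(pdfs[0].shape, dtype=complex)
    for pdf in pdfs:
        # ifftshift moves the zero-momentum grid point to index [0, 0].
        product *= np.fft.fft2(np.fft.ifftshift(pdf)) * step**2
    result = np.fft.ifft2(product).real / step**2
    return np.fft.fftshift(result)

step = 0.5
px, py = make_grid(step=step)
fft_result = convolve_resolutions(
    [gaussian_2d(px, py, 3.0, 4.0), gaussian_2d(px, py, 4.0, 3.0)], step
)
analytic = gaussian_2d(px, py, 5.0, 5.0)  # variances add: 3^2 + 4^2 = 5^2
print(np.max(np.abs(fft_result - analytic)))
```

Any resolution function sampled on the same grid, including one with non-Gaussian tails, can be passed to convolve_resolutions in place of the Gaussians. Because the FFT computes a circular convolution, the grid has to extend well beyond the support of the convolved result.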
To demonstrate the potential utility of the non-Gaussian treatment, we compare E / T significance computed with the FFT and with the analytic method. For the comparison, we use the dijet event sample, as there is sufficient high-p T hadronic activity to exhibit clearly the effects of non-Gaussian contributions to the resolution. Figure 29 shows the results of the comparison. The significance distribution is plotted in the left panel, with the black histogram computed by the analytic method (i.e. assuming only Gaussian resolutions), and red data points computed with the FFT algorithm (using full resolution functions). The steeper fall of the red points demonstrates that the FFT algorithm helps to reduce the excess of high-significance values that arise in the analytic method where the jet measurement uncertainty is underestimated by the Gaussian approximation. Events showing non-Gaussian significance values of S ≳ 80 are suppressed due to the finite number of significant digits available to double precision variables used in the FFT algorithm.
The right-hand panel shows the corresponding reduction of events in the lowest bin of the P 2 (S) distribution. The remaining excess in that bin is partly due to events with genuine E / T that arise from semileptonic decays of hadrons. After taking into account these genuine E / T components and other extraneous backgrounds from tt and vector boson production, the net impact of the FFT algorithm is to reduce the excess of zero-E / T events in the high-significance, low-P 2 (S) bin (P 2 (S) < 0.02) by a factor of two. Removal of the remaining zero-E / T events in this bin will require deeper understanding of the jet-by-jet resolution variations that are not captured by the average parametrizations currently available.

Summary
The performance of E / T reconstruction algorithms has been studied using data collected in 8 TeV pp collisions with the CMS detector at the LHC. The data used in this paper were collected from February through December 2012 and correspond to an integrated luminosity of up to 19.7 ± 0.5 fb −1 . The E / T reconstruction algorithms and corrections are described with an emphasis on changes compared to those used with the 7 TeV pp data collected in 2010 [3]. Events with artificially high E / T in a dijet event sample are examined, and we find that a majority of such events can be identified and either modified or removed.
We have measured the scale and resolution of PF E / T , as well as the degradation of the PF E / T performance due to pileup interactions in Z → µ + µ − , Z → e + e − , and direct-photon events. The measured PF E / T scale and resolution in data agree with the expectations from the simulation after correcting for the jet energy scale and resolution differences between data and simulation. We find that pileup interactions contribute to the degradation of the PF E / T resolution by 3.3-3.6 GeV (in quadrature) per additional pileup interaction, similar to the results obtained with the 7 TeV pp data.
We have studied the performance of two novel E / T reconstruction algorithms specifically developed to cope with large numbers of pileup interactions. They show significantly reduced dependence of the E / T resolution on pileup interactions, consistently in both data and simulation, although the E / T response is slightly deteriorated. With a dedicated configuration of the algorithms, however, the E / T response can be preserved.
We have also studied the performance of the E / T significance algorithm, developed to distinguish between events with spurious E / T and events with genuine E / T . As an example of its utility, the E / T significance shows better discrimination between W → eν events and QCD or Drell-Yan events compared to a standard E / T reconstruction algorithm.
The studies presented in this paper provide a solid foundation for all the CMS measurements with E / T in the final state, including measurements involving W bosons and top quarks, searches for new weakly interacting neutral particles, and studies of the properties of the Higgs boson.
Individuals have received support from the Marie-Curie programme and the European Research Council and EPLANET (European Union); the Leventis Foundation; the A. P. Sloan Foundation; the Alexander von Humboldt Foundation; the Belgian Federal Science Policy Office; the Fonds pour la Formation à la Recherche dans l'Industrie et dans l'Agriculture (FRIA-Belgium); the Agentschap voor Innovatie door Wetenschap en Technologie (IWT-Belgium); the Ministry of Education, Youth and Sports (MEYS) of the Czech Republic; the Council of Science and Industrial Research, India; the HOMING PLUS programme of Foundation for Polish Science, cofinanced from European Union, Regional Development Fund; the Compagnia di San Paolo (Torino); the Consorzio per la Fisica (Trieste); MIUR project 20108T4XTM (Italy); the Thalis and Aristeia programmes cofinanced by EU-ESF and the Greek NSRF; and the National Priorities Research Program by Qatar National Research Fund.