Electron and photon reconstruction and identification with the CMS experiment at the CERN LHC

The performance is presented of the reconstruction and identification algorithms for electrons and photons with the CMS experiment at the LHC. The reported results are based on proton-proton collision data collected at a center-of-mass energy of 13 TeV and recorded in 2016-2018, corresponding to an integrated luminosity of 136 fb$^{-1}$. Results obtained from lead-lead collision data collected at $\sqrt{s_\mathrm{NN}} =$ 5.02 TeV are also presented. Innovative techniques are used to reconstruct the electron and photon signals in the detector and to optimize the energy resolution. Events with electrons and photons in the final state are used to measure the energy resolution and energy scale uncertainty in the recorded events. The measured energy resolution for electrons produced in Z boson decays in proton-proton collision data ranges from 2 to 5%, depending on electron pseudorapidity and energy loss through bremsstrahlung in the detector material. The energy scale in the same range of energies is measured with an uncertainty smaller than 0.1 (0.3)% in the barrel (endcap) region in proton-proton collisions and better than 1 (3)% in the barrel (endcap) region in heavy ion collisions. The timing resolution for electrons from Z boson decays with the full 2016-2018 proton-proton collision data set is measured to be 200 ps.

Keywords: Large detector systems for particle and astroparticle physics; Particle identification methods

ArXiv ePrint: 2012.06888

Introduction
Electrons and photons are reconstructed with high purity and efficiency in the CMS experiment, one of the two general-purpose detectors operating at the CERN LHC [1]. These electromagnetically interacting particles leave a distinctive signal in the electromagnetic calorimeter (ECAL) as an isolated energy deposit that, in the case of electrons, is also associated with a track in the silicon tracker. These properties, together with the excellent energy resolution of the ECAL, make electrons and photons ideal for use both in precision measurements and in searches for physics beyond the standard model with the CMS detector. After a very successful Run 1 at 7 and 8 TeV during the years 2009-2012, which culminated in the discovery of the Higgs boson in July 2012 [2,3], and a two-year maintenance period, the LHC resumed its operations in 2015 with Run 2, providing proton-proton (pp) collisions at an increased center-of-mass energy of 13 TeV. In this paper, the performance of the reconstruction and identification of electrons and photons with the CMS detector in Run 2 is presented. The Run 1 results are reported in refs. [4,5]. The new results are based on pp collision data collected during 2016-2018, corresponding to a total integrated luminosity of 136 fb$^{-1}$ [6-8]. The pp collisions were delivered with a 25 ns bunch spacing and an average number of interactions per bunch crossing (pileup, or PU) increasing through the years from 22 to 32. In addition, the reconstruction of electrons and photons in lead-lead (PbPb) ion collisions is presented, which requires specific updates because of the significantly higher particle multiplicity compared with pp collisions. The PbPb collisions were recorded in 2018 at a nucleon-nucleon (NN) center-of-mass energy of $\sqrt{s_\mathrm{NN}} = 5.02$ TeV, corresponding to an integrated luminosity of 1.7 nb$^{-1}$.
Table 1 lists the main objectives of the paper concerning electrons and photons, a summary of the methods used to achieve them, and references to the sections of the paper where they are described.
The first level, composed of custom hardware processors, uses information from the calorimeters and muon detectors to select events within a fixed latency of 4 µs of the collision and with a total average rate of about 100 kHz [17]. The second level, known as the high-level trigger (HLT), consists of a farm of processors running a version of the full event reconstruction software optimized for fast processing, and reduces the event rate to around 1 kHz before data storage. Dedicated techniques [18] are used in all detector subsystems to reject signals from electronic noise, from pileup, or from particles that do not originate from pp collisions in the bunch crossing of interest, such as particles arriving from pp collisions that occur in adjacent bunch crossings (so-called out-of-time pileup).

Data and simulated event samples
The data used in this paper were collected from pp collisions at 13 TeV, satisfying a trigger requirement of an isolated single electron with $p_\mathrm{T}$ thresholds of 27, 32, and 32 GeV in 2016, 2017, and 2018, corresponding to integrated luminosities of 35.9, 41.5, and 58.7 fb$^{-1}$, respectively.
The best detector alignment, energy calibrations, and corrections are derived for the full Run 2 data for each year separately, using the procedures described in refs. [19,20]. For this paper, only the 2017 data use these most up-to-date conditions and the best calibrations, since they were the only ones available at the time of writing. This paper documents the performance and results that are used in more than 90% of CMS physics analyses based on Run 2 data. In the later sections, the recalibrated 2017 data set is referred to as the "Legacy" data set, whereas the 2016 and 2018 data samples are referred to as "EOY" (end-of-year). The improvements brought by the recently recalibrated 2017 data are discussed in section 8.
Samples of Monte Carlo (MC) simulated events are used to compare the measured and expected performance. Drell-Yan (DY) Z/$\gamma^* \to \ell\ell$ + jets events are simulated at next-to-leading order (NLO) with the MadGraph5_aMC@NLO (v2.2.2, 2.6.1, and 2.4.2 for 2016, 2017, and 2018 conditions, respectively) [21] event generator, interfaced with pythia v8.212 [22] for parton showering and hadronization. The CUETP8M1 underlying-event tune [23] is used for the 2016 MC samples and the CP5 tune [24] for the 2017 and 2018 MC samples. The matrix elements are computed at NLO for the three processes pp → Z + $N_\text{jets}$, where $N_\text{jets}$ = 0, 1, 2, and merged with the parton showers using the FxFx scheme [25] with a merging scale of 30 GeV. The NNPDF 3.0 (2016) and 3.1 (2017-2018) parton distribution functions (PDFs) [26] are used, at leading order (LO) in 2016 and at next-to-next-to-leading order (NNLO) in 2017-2018. Simulated event samples for $\gamma$ + jet final states from direct photon production are generated at LO. The NNPDF2.3 LO PDFs [27] are used for these samples.
A detailed detector simulation based on the Geant4 (v9.4.3) package [28] is applied to all generated events. The presence of multiple pp interactions in the same and nearby bunch crossings is incorporated by simulating additional interactions (including out-of-time interactions from neighbouring bunch crossings) with a multiplicity that matches that observed in data.
1. The energy reconstruction algorithm starts with the formation of clusters [13] by grouping together crystals with energies exceeding a predefined threshold (typically ∼80 MeV in EB and ∼300 MeV in EE), which is generally 2 or 3 times larger than the electronic noise expected for these crystals. A seed cluster is then defined as the one containing most of the energy deposited in any specific region, with a minimum transverse energy ($E_\mathrm{T}^\mathrm{seed}$) above 1 GeV. We define $E_\mathrm{T}$ as $E_\mathrm{T} = \sqrt{m^2 + p_\mathrm{T}^2}$ for an object of mass $m$ and transverse momentum $p_\mathrm{T}$.
2. ECAL clusters within a certain geometric area ("window") around the seed cluster are combined into superclusters (SC) to include photon conversions and bremsstrahlung losses. This procedure is referred to as "superclustering".
3. Trajectory seeds in the pixel detector that are compatible with the SC position and the trajectory of an electron are used to seed the GSF tracking step.

4. In parallel to the above steps, all tracks reconstructed in the event are tested for compatibility with an electron trajectory hypothesis; if successful, they are also used to seed the GSF tracking step. The "generic tracks" are a collection of tracks (not specific to electrons) selected with loose quality requirements.

7. The blocks of linked tracks and clusters produced by the PF algorithm are resolved into electron and photon (e and γ) objects, starting from either a GSF track or a SC, respectively. At this point, there is no differentiation between electron and photon candidates. The final list of linked ECAL clusters for each candidate is promoted to a refined supercluster.

8. Electron or photon objects are built from the refined SCs based on loose selection requirements. All objects passing the selection with an associated GSF track are labeled as electrons; without a GSF track, they are labeled as photons. This collection is known as the unbiased e/γ collection and is used as a starting point by the vast majority of analyses involving electrons and photons.

9. To separate electrons and photons from hadrons in the PF framework, a tighter selection is applied to these e/γ objects to decide if they are accepted as an electron or an isolated photon.
If the e/γ object passes both the electron and the photon selection criteria, its object type is determined by whether it has a GSF track with a hit in the first layer of the pixel detector.
If it fails both the electron and the photon selection criteria, its basic elements (ECAL clusters and generic tracks) are further considered to form neutral hadrons, charged hadrons, or nonisolated photons in the PF framework. This is discussed further in section 4.5.
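The final type assignment described above (steps 8-9 together with the ambiguity resolution) can be sketched as follows. The class names and fields (`EGammaCandidate`, `has_first_pixel_layer_hit`) are illustrative placeholders, not the actual CMS data model:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class GsfTrack:
    has_first_pixel_layer_hit: bool

@dataclass
class EGammaCandidate:
    supercluster_et: float
    gsf_track: Optional[GsfTrack] = None

def classify(cand: EGammaCandidate, passes_electron_sel: bool,
             passes_photon_sel: bool) -> str:
    """Toy sketch of the e/gamma type assignment; not the CMS implementation."""
    if passes_electron_sel and passes_photon_sel:
        # Ambiguous objects are typed by the presence of a GSF track with a
        # hit in the first layer of the pixel detector.
        if cand.gsf_track is not None and cand.gsf_track.has_first_pixel_layer_hit:
            return "electron"
        return "photon"
    if passes_electron_sel:
        return "electron"
    if passes_photon_sel:
        return "photon"
    # Failing both: the basic elements are returned to the PF framework to
    # form charged/neutral hadrons or nonisolated photons.
    return "pf_fallback"
```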

Superclustering in the ECAL
Energy deposits in several ECAL channels are clustered under the assumption that each local maximum above a certain energy threshold (1 GeV) corresponds to a single particle incident on the detector. An ECAL energy deposit may be shared between overlapping clusters, and a Gaussian shower profile is used to determine the fraction of the energy deposit to be assigned to each of the clusters. Because electrons and photons have a significant probability of showering when traversing the CMS tracker, by the time the particle reaches the ECAL, the original object may consist of several electrons and/or photons produced from bremsstrahlung and/or pair production. The multiple ECAL clusters need to be combined into a single SC that captures the energy of the original electron/photon. This step is known as superclustering, and the combining process uses two algorithms. The first is the "mustache" algorithm, which is particularly useful to properly measure low-energy deposits. It uses information only from the ECAL and the preshower detector. The algorithm starts from a cluster above a given threshold, called the seed cluster. Additional clusters are added if they fall into a zone whose shape resembles a mustache in the transverse plane. The name arises because the distribution of $\Delta\eta = \eta_\text{seed-cluster} - \eta_\text{cluster}$ versus $\Delta\phi = \phi_\text{seed-cluster} - \phi_\text{cluster}$ has a slight bend, a consequence of the solenoidal structure of the CMS magnetic field, which tends to spread the radiated energy along $\phi$ rather than along $\eta$. An example of the mustache SC distribution can be seen in figure 1, for simulated electrons with $1 < E_\mathrm{T}^\mathrm{seed} < 10$ GeV. A similar shape is observed in the case of a photon.
The size of the mustache region depends on $E_\mathrm{T}$, since the tracks of particles with larger transverse momenta are bent less by the magnetic field. The mustache SCs are used to seed the electron, photon, and conversion-finding algorithms.
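A toy version of the mustache acceptance test is sketched below. The parabolic band and all numerical coefficients are invented for illustration only; the actual CMS parametrization is tuned on simulation:

```python
def in_mustache_window(d_eta: float, d_phi: float, pt_seed: float) -> bool:
    """Toy acceptance test for the mustache superclustering window.

    Illustrative only: the eta band is wider and shifted at large |d_phi|
    (giving the characteristic bent, mustache-like shape), and the phi
    extent shrinks for higher-pT seeds, whose radiated energy is less
    spread by the magnetic field. Coefficients are invented placeholders.
    """
    phi_half_width = 0.6 / max(pt_seed, 1.0)   # hypothetical pT-dependent phi window
    if abs(d_phi) > phi_half_width:
        return False
    # parabolic eta band, curving with d_phi
    eta_lo = -0.02 + 0.15 * d_phi * d_phi
    eta_hi = 0.02 + 0.30 * d_phi * d_phi
    return eta_lo <= d_eta <= eta_hi
```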
The second superclustering algorithm is known as the "refined" algorithm and is described in more detail in section 4.4. It utilizes tracking information to extrapolate bremsstrahlung tangents and conversion tracks to decide whether a cluster should belong to a SC. It uses mustache SCs as a starting point, but is also capable of creating its own SCs. The refined SCs are used for the determination of all ECAL-based quantities of electron and photon objects.

Figure 1. Distribution of $\Delta\eta = \eta_\text{seed-cluster} - \eta_\text{cluster}$ versus $\Delta\phi = \phi_\text{seed-cluster} - \phi_\text{cluster}$ for simulated electrons with $1 < E_\mathrm{T}^\mathrm{seed} < 10$ GeV and $1.48 < \eta_\text{seed} < 1.75$. The color axis represents the occupancy of PF clusters matched with the simulation (required to share at least 1% of the simulated electron energy) around the seed. The red line approximately bounds the set of clusters selected by the mustache algorithm. The white region at the centre of the plot represents the $\eta$-$\phi$ footprint of the seed cluster.

Electron track reconstruction and association
Electrons use the GSF tracking algorithm to include radiative losses from bremsstrahlung. There have been no significant changes to the tracking algorithm from Run 1 [4] and any differences arise primarily from a different ECAL superclustering algorithm. Therefore, the algorithms involved in electron tracking are only briefly summarized here, with additional details available in ref. [4].

Electron seeding
The GSF track fitting algorithm is CPU intensive and cannot be run on all reconstructed hits in the tracker. The reconstruction of electron tracks therefore begins with the identification of a hit pattern that might lie on an electron trajectory ("seeding"). The electron trajectory seed can be either "ECAL-driven" or "tracker-driven". The fraction of electrons seeded only by the tracker-driven approach is ∼50% for electrons from Z boson decays with $p_\mathrm{T} \sim 3$ GeV and drops to less than 5% for $p_\mathrm{T} > 10$ GeV [13].
The ECAL-driven seeding first selects mustache SCs with transverse energy $E_\mathrm{T,SC} > 4$ GeV and $H/E_\mathrm{SC} < 0.15$, where $E_\mathrm{SC}$ and $H$ are the SC energy and the sum of the energy deposits in the HCAL towers within a cone of $\Delta R = \sqrt{(\Delta\eta)^2 + (\Delta\phi)^2} = 0.15$ centered on the SC position. Each mustache SC is then compared in $z$ and $\phi$ (or in transverse distance in the forward regions, where hits occur only in the disks) with a collection of track seeds that are formed by combining multiple hits in the inner tracker detector: triplets or doublets. The hits of these track seeds must be located in the barrel pixel detector layers, the forward pixel layers, or the endcap tracker. For a given SC, the trajectory of its corresponding electron is assumed to be helical and is calculated from the SC position, its $E_\mathrm{T}$, and the magnetic field strength. This extrapolation towards the collision vertex neglects the effect of any photon emission. If the first two hits of a tracker seed are matched (within a certain charge-dependent $\Delta z \times \Delta\phi$ window for the barrel pixel detectors, and a $\Delta r \times \Delta\phi$ window for the forward pixel disks and endcap tracker) to the predicted trajectory for a SC under either charge hypothesis, it is selected for seeding a GSF track [4].
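The helical back-propagation used in this matching can be illustrated with a small-angle approximation, in which a track of transverse momentum $p_\mathrm{T}$ (in GeV) in a field $B$ (in T) bends in azimuth by roughly $0.3\,B R/(2 p_\mathrm{T})$ over a transverse distance $R$ (in m). The function below is a sketch under that approximation; the sign convention and the use of the SC $E_\mathrm{T}$ in place of $p_\mathrm{T}$ (neglecting photon emission, as in the text) are assumptions:

```python
def predicted_phi_at_radius(phi_sc: float, et_sc: float, charge: int,
                            r_m: float, b_tesla: float = 3.8) -> float:
    """Back-propagate the azimuth of a helical trajectory from the ECAL SC.

    Small-angle chord approximation: the azimuthal deflection over a
    transverse distance R is dphi ~ 0.3 * B[T] * R[m] / (2 * pT[GeV]),
    with pT approximated by the SC transverse energy. Both charge
    hypotheses are tried when matching pixel hits.
    """
    dphi = 0.3 * b_tesla * r_m / (2.0 * et_sc)
    return phi_sc - charge * dphi  # hypothetical sign convention

# Try both charge hypotheses for a 40 GeV SC at a radius of ~1.29 m
phi_plus = predicted_phi_at_radius(0.0, 40.0, +1, 1.29)
phi_minus = predicted_phi_at_radius(0.0, 40.0, -1, 1.29)
```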
The tracker-driven approach iterates over all generic tracks. If any of these Kalman filter (KF) tracks is compatible with an ECAL cluster, its track seed is used to seed a GSF track [4]. The compatibility criterion is the logical OR of a cut-based selection and a multivariate selection based on a boosted decision tree (BDT) [31,32], using track quality and track-cluster matching variables as inputs.
Since it is computationally expensive to reconstruct all tracks in an event, tracker-driven seeding is performed only in the offline reconstruction and not in HLT.
The ECAL-driven approach performs better for high-$p_\mathrm{T}$ isolated electrons, with a seeding efficiency larger than 95% for $p_\mathrm{T} > 10$ GeV for electrons from Z boson decays. The tracker-driven approach is designed to recover efficiency for low-$p_\mathrm{T}$ or nonisolated electrons, with a seeding efficiency higher than ∼50% for electrons with $p_\mathrm{T} > 3$ GeV [4]. It also helps to recover efficiency in the ECAL regions with less precise energy measurements, such as the barrel-endcap transition region and the gaps between supermodules.
The GSF tracking algorithm is run on all ECAL- and tracker-driven seeds. If an ECAL-driven seed shares all but one of its hits with a tracker-driven seed, the resulting track candidate is considered as both ECAL- and tracker-seeded. The same holds for ECAL-driven seeds that share all hits with a tracker-driven seed, but in this case the tracker-driven seed is discarded before the track-finding step. The majority of electrons fall into one of these two cases.
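The hit-overlap bookkeeping between ECAL-driven and tracker-driven seeds can be sketched as follows; the hit identifiers and the returned labels are illustrative, not CMS data-model names:

```python
def merge_seed_provenance(ecal_seed_hits, tracker_seed_hits) -> str:
    """Toy version of the seed-overlap logic for one pair of seeds.

    Returns a provenance label for the resulting track candidate:
    - shares all hits, or all but one -> "both" (ECAL- and tracker-seeded;
      in the all-hits case the tracker-driven seed is discarded upstream)
    - otherwise the ECAL-driven seed stands alone -> "ecal_only"
    """
    shared = set(ecal_seed_hits) & set(tracker_seed_hits)
    if len(shared) >= len(ecal_seed_hits) - 1:  # all hits, or all but one
        return "both"
    return "ecal_only"
```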

Tracking
The final collection of selected electron seeds (obtained by combining the ECAL-driven and tracker-driven seeds) is used to initiate the reconstruction of electron tracks. For a given seed, the track parameters evaluated at each successive tracker layer are used by the KF algorithm to iteratively build the electron trajectory, with the electron energy loss modeled using a Bethe-Heitler distribution [33]. If the algorithm finds multiple hits compatible with the predicted position in the next layer, it creates multiple candidate trajectories by performing a $\chi^2$ fit, up to a maximum of five for each tracker layer and for a given initial trajectory. The candidate trajectories are restricted to those with at most one missing hit, and a penalty is applied to the trajectories with one missing hit by increasing the track $\chi^2$. This penalty helps to minimize the inclusion of hits from converted bremsstrahlung photons in the primary-electron trajectory. Any ambiguities that arise when a given tracker hit is assigned to multiple track candidates are resolved by dropping the track with fewer hits, or the track with the larger $\chi^2$ value if the number of hits is the same [11].
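The branching-with-penalty logic can be illustrated with a toy combinatorial builder. The $\chi^2$ penalty value and the per-layer hit model are invented for illustration; the real algorithm propagates full track states rather than scalar increments:

```python
MAX_BRANCHES_PER_LAYER = 5
MISSING_HIT_PENALTY = 5.0   # hypothetical chi2 penalty, not the CMS value

def build_trajectories(layers):
    """Toy trajectory builder in the spirit of the KF step described above.

    `layers` is a list of lists of (hit_id, chi2_increment) pairs, one list
    per tracker layer. Candidates branch over compatible hits (at most five
    per layer), may skip at most one layer, and a skipped layer adds a chi2
    penalty, which discourages picking up hits from converted
    bremsstrahlung photons.
    """
    candidates = [([], 0.0, 0)]   # (hits, chi2, n_missing)
    for hits in layers:
        nxt = []
        for path, chi2, miss in candidates:
            for hit_id, dchi2 in sorted(hits, key=lambda h: h[1])[:MAX_BRANCHES_PER_LAYER]:
                nxt.append((path + [hit_id], chi2 + dchi2, miss))
            if miss == 0:   # at most one missing hit, with a chi2 penalty
                nxt.append((path, chi2 + MISSING_HIT_PENALTY, 1))
        candidates = nxt
    # ambiguity resolution: prefer more hits, then lower chi2
    return sorted(candidates, key=lambda c: (-len(c[0]), c[1]))
```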
Once the track candidates are reconstructed by the KF algorithm, their parameters are estimated at each layer with a GSF fit in which the energy loss is approximated by an admixture of Gaussian distributions [4]. The GSF tracks obtained from this procedure are extrapolated toward the ECAL under the assumption of a homogeneous magnetic field to perform track-cluster associations.

Track-cluster association
The electron candidates are constructed by associating the GSF tracks with the SCs, where the position of the SC is defined as the energy-weighted average of the constituent ECAL cluster positions. A BDT is used to decide whether to associate a GSF track to an ECAL cluster. The BDT combines track information, supercluster observables, and track-cluster matching variables. The track information covers both kinematic and quality-related features. The SC information includes the spread in $\eta$ and $\phi$ of the full SC, as well as transverse shape variables inferred from a 5×5 crystal matrix around the cluster seed.
For tracker-driven electrons, only the BDT is used to decide whether to associate a GSF track to an ECAL cluster. Electron candidates reconstructed from ECAL-driven seeds are required to pass either the same BDT requirements as for tracker-driven electrons or a set of geometric track-cluster matching criteria.
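The OR-logic for the two seed types can be sketched as follows; the BDT threshold is a placeholder, not the tuned CMS working point:

```python
def accept_track_cluster_match(seed_type: str, bdt_score: float,
                               passes_geom_match: bool,
                               bdt_threshold: float = 0.5) -> bool:
    """Toy acceptance logic for GSF-track / SC association.

    Tracker-driven candidates are accepted only via the BDT; ECAL-driven
    candidates may pass either the BDT or the geometric track-cluster
    matching criteria. The 0.5 threshold is a hypothetical placeholder.
    """
    passes_bdt = bdt_score > bdt_threshold
    if seed_type == "tracker":
        return passes_bdt
    if seed_type == "ecal":
        return passes_bdt or passes_geom_match
    raise ValueError(f"unknown seed type: {seed_type}")
```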

Supercluster refinement in the ECAL
The mustache SCs can be refined using information from detector subsystems beyond the ECAL crystals and preshower detector. Additional conversion and bremsstrahlung clusters are recovered using information from the tracker, with minimal risk of including spurious clusters. A conversion-finding algorithm [5] is employed to identify pairs of tracks consistent with a photon conversion. A BDT is employed to identify tracks from photon conversions where only one leg has been reconstructed. The input variables to this BDT include the number of missing hits on the track (for prompt electrons no missing hits are expected), the radius of the first track hit, and the signed impact parameter or distance of closest approach ($d_0$). The identified conversion tracks can then be linked to the compatible ECAL clusters. Additionally, at each tracker layer, the trajectory of the GSF track is extrapolated to form a "bremsstrahlung tangent", which can be linked to a compatible ECAL cluster.
Mustache SCs, ECAL clusters, primary generic tracks, GSF tracks, and conversion-flagged tracks are all inputs to the PF algorithm, which builds the e/γ objects, as described in ref. [13]. An e/γ object must start from either a mustache SC or a GSF track. To reduce the CPU time, a mustache SC must either be associated with a GSF track or satisfy $E_\mathrm{T,SC} > 10$ GeV and ($H/E_\mathrm{SC} < 0.5$ or $E_\mathrm{T,SC} > 100$ GeV). The ECAL clusters must not be linked to any track from the primary vertex unless that track is associated with the object's GSF track. ECAL clusters already added by the mustache algorithm are exempted from this requirement. ECAL clusters linked to secondary conversion tracks and bremsstrahlung tangents are then provisionally added to the so-called refined supercluster. However, in the final step, they can be withdrawn from the refined SC if this makes the total energy more compatible with the GSF track momentum at the inner layer. ECAL clusters that are within $|\Delta\eta| < 0.05$ of the GSF track outermost position extrapolated to the ECAL, or within $|\Delta\eta| < 0.015$ of a bremsstrahlung tangent, are exempted from this removal. Finally, a given ECAL cluster can belong to only one refined SC.

Integration in the global event description
Electrons and photons present a unique challenge in the PF framework because they can be composite objects consisting of several clusters and tracks. This can lead to incorrect results when an object that is not an e/γ object is reconstructed under the e/γ hypothesis. For example, the photons, charged hadrons, and neutral hadrons in a jet can be reconstructed as e/γ objects instead of being reconstructed individually, potentially causing a large mismeasurement of the reconstructed jet energy. Therefore, a minimal selection, as reported in ref. [13], is applied to correctly identify hadrons and e/γ objects and to improve the measurement of jets and of the missing transverse momentum.
Because of computing constraints, it is not currently feasible to rerun the PF algorithm using multiple e/γ identification requirements, and hence a common "loose" identification selection is used for electrons and photons. A loose requirement on the BDT classifier is applied for electrons, with a different BDT used for isolated and nonisolated electrons. Both BDTs use various shower-shape, detector-based isolation, and tracker-related variables as input. The BDT selection for nonisolated electrons is the one used for the selection of electron candidates, as explained in section 4.3.3. Additionally, selection requirements on $E/p$ (the ratio between the electron energy and its track momentum), on $H/E$, and on quantities based on the associated generic tracks are applied to reject candidates that are problematic for jet algorithms. Occasionally, an electron can be selected by the PF algorithm, but with its additional tracks released for charged-hadron reconstruction in PF. Photon candidates are required to be isolated, and their shower-shape variables must be compatible with those of genuine photons.

Bremsstrahlung and photon conversion recovery
To collect the energy of photons emitted by bremsstrahlung, tangents to the GSF tracks are extrapolated to the ECAL surface from the track positions. A cluster that is linked to the track is considered as a potential bremsstrahlung photon if the extrapolated tangent position is within the boundaries of the cluster, as defined above, provided that the distance between the cluster and the GSF track extrapolation in $\eta$ is smaller than 0.05. The fraction of the momentum lost by bremsstrahlung, as measured by the tracker, is defined as
$$f_\mathrm{brem} = \frac{p_\mathrm{trk\text{-}in} - p_\mathrm{trk\text{-}out}}{p_\mathrm{trk\text{-}in}},$$
where $p_\mathrm{trk\text{-}in}$ is the momentum at the point of closest approach to the primary vertex, and $p_\mathrm{trk\text{-}out}$ is the momentum extrapolated to the surface of the ECAL from the outermost tracker layer. Its distribution is shown in figure 2 for the barrel and the endcaps. Bremsstrahlung photons, as well as prompt photons, have a significant probability to further convert into an e$^+$e$^-$ pair in the tracker material. Because of the higher tracker material budget in the endcaps, the $f_\mathrm{brem}$ distribution there peaks at large values, close to 1, compared with the distribution in the barrel. The disagreement observed between data and simulation in the endcap region is attributed to an imperfect modelling of the material in simulation. According to simulation, the fraction of photon conversions occurring before the last tracker layer is as high as 60% in the regions with the largest amount of tracker material in front of the ECAL. A conversion finder was therefore developed to create links between any two tracks compatible with a photon conversion [5]. To recover converted bremsstrahlung photons, the vector sum of the track momenta of any possible conversion candidate pair is checked for compatibility with the aforementioned electron track tangents. The photon conversion-finding algorithm is validated by reconstructing the µµγ invariant mass in events in which a conversion track pair is matched to the photon, as discussed in section 6.3.
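For reference, $f_\mathrm{brem}$ as defined from the inner and outer track momenta can be computed directly:

```python
def f_brem(p_in: float, p_out: float) -> float:
    """Fraction of momentum lost to bremsstrahlung,
    f_brem = (p_in - p_out) / p_in, with p_in the momentum at the point of
    closest approach to the primary vertex and p_out the momentum
    extrapolated from the outermost tracker layer to the ECAL surface."""
    return (p_in - p_out) / p_in

# An electron entering with 45 GeV and leaving the tracker with 30 GeV
# has radiated one third of its momentum.
loss = f_brem(45.0, 30.0)
```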

Reconstruction performance
Photons are reconstructed as SCs in the ECAL after applying a very loose selection requirement of $H/E_\mathrm{SC} < 0.5$, for which 100% SC reconstruction efficiency is assumed. Since electrons are additionally required to have a track matched to the SC, the reconstruction efficiency for a SC having a matching track is computed, as described below. The electron reconstruction efficiency is defined as the ratio between the number of reconstructed SCs matched to reconstructed electrons and the number of all reconstructed SCs. It is computed with a tag-and-probe method using Z → ee events [34], as a function of the electron $\eta$ and $E_\mathrm{T}$, and covers all reconstruction effects. This reconstruction efficiency is higher than 95% for $E_\mathrm{T} > 20$ GeV, and is compatible between data and simulation within 2%.
The tag-and-probe technique is a generic tool to measure efficiency that exploits dileptons from the decays of resonances, such as a Z boson or a J/ψ or Υ(1S) meson. In this technique, one electron of the resonance decay, the tag, is required to pass a tight identification criterion (whose requirements are listed in detail in section 7.3), and the other electron, the probe, is used to measure the efficiency under study. The estimated efficiencies are almost insensitive to variations in the definition of the tag. For the results in this paper, tag electrons are required to satisfy $E_\mathrm{T} > 30$ (35) GeV for the 2016 (2017-2018) data-taking years. The probe is then required to pass the selection criteria (either reconstruction or identification) whose efficiency is under test. A requirement of oppositely charged leptons is also applied. When there are two or more probe candidates corresponding to a given tag within the invariant mass range considered, only the probe with the highest $E_\mathrm{T}$ is kept. In data, the events used in the tag-and-probe procedure are required to satisfy HLT paths that do not bias the efficiency under study.
Backgrounds are estimated by fitting. The invariant mass distributions of the (tag, passing probe) and (tag, failing probe) pairs are fitted separately with a signal-plus-background model around the Z boson mass, in the range 60-120 GeV. This range extends sufficiently far from the peak region to enable the background component to be extracted from the fit. The efficiency under study is computed from the ratio of the signal yields extracted from the two fits. This procedure is usually performed in bins of $E_\mathrm{T}$ and $\eta$ of the probe electron, to measure efficiencies as a function of those variables.
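Once the signal yields of the passing- and failing-probe fits are extracted, the efficiency follows as their ratio. The simple binomial uncertainty shown below is illustrative and ignores the fit-model systematic uncertainty, which dominates in practice:

```python
import math

def tnp_efficiency(n_pass: float, n_fail: float):
    """Tag-and-probe efficiency from the signal yields of the passing-probe
    and failing-probe fits, with a naive binomial statistical uncertainty."""
    n_tot = n_pass + n_fail
    eff = n_pass / n_tot
    err = math.sqrt(eff * (1.0 - eff) / n_tot)
    return eff, err

# Example: 9600 passing and 400 failing signal events in one (ET, eta) bin
eff, err = tnp_efficiency(9600.0, 400.0)
```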
Different models can be used in the fit to disentangle the signal and background components. In the absence of any kinematic selection on the tag-and-probe candidates, the background component in the mass spectrum is well described by a falling exponential. However, the kinematic restrictions on the Z candidates in each $E_\mathrm{T}$ and $\eta$ range of the probe candidate distort the mass spectrum in a way that is well described by an error function. Consequently, the background component of the mass spectrum is described by a falling exponential multiplied by an error function. If a template from simulation is used, the signal part of the distribution is modeled through a sample of simulated electrons from Z boson decays, convolved with a resolution function to account for any remaining differences in resolution between data and simulation. An example fit is shown in figure 3. The tag-and-probe technique is applied to data and simulated events to compare efficiencies and to evaluate data-to-simulation ratios ("scale factors"). In many analyses, these scale factors are applied as corrections to the simulation, or are used to assess systematic uncertainties. The efficiency in simulation is estimated from a Z → ee sample that contains no background, since a spatial match with the generator-level electrons is required. Several sources of systematic uncertainty are considered. The main uncertainty is related to the model used in the fit, and is estimated by comparing alternative distributions for signal and background, in addition to comparing analytic functions with templates from simulation. Only a small dependence is found on the number of bins used in the fits and on the definition of the tag.
The electron reconstruction efficiencies measured in 2017 data and in simulated DY samples are shown in figure 4, together with the scale factors for different $E_\mathrm{T}$ bins as a function of $\eta$. They are compatible between data and simulation, giving scale factors close to unity in almost the entire $\eta$ range. The region $1.44 < |\eta| < 1.57$ corresponds to the transition between the barrel and endcap regions of the ECAL and is not considered in a large number of physics analyses. The uncertainties shown in the plots correspond to the quadratic sum of the statistical and systematic contributions, dominated by the latter. The main uncertainty is related to the modeling of the signal. Other objects, such as hadronic jets, may also produce electron-like signals, leading to such objects being misidentified as electron candidates.
The better the reconstruction algorithm, the lower the misidentification rate per event; the larger the number of interactions in an event, the larger the misidentification rate. Figure 5 shows the number of misidentified electron candidates per event in different $E_\mathrm{T}$ ranges (for DY + jets MC events simulated with the detector conditions corresponding to the three years of the Run 2 data-taking period), as a function of the number of pileup vertices. The significant suppression of the misidentification rate in 2017 and 2018 is due to the new pixel detector. The slightly better results in 2017 with respect to 2018 are due to the better conditions and calibrations used in the Legacy data set.

Electron charge sign measurement
The measurement of the electron charge sign is affected by potential bremsstrahlung followed by photon conversions. In particular, when the bremsstrahlung photons convert upstream in the detector, the initiated showers lead to complex hit patterns, and the contributions from conversion electrons can be wrongly included in the electron track fit. A direct charge sign estimate is the sign of the GSF track curvature, which can be altered by the presence of conversions, especially for $|\eta| > 2$, where the misidentification probability can reach 10% for reconstructed electrons from Z boson decays without any further selection. This is improved by combining this measurement with the estimates from two other methods. A second method is based on the associated KF track that is matched to a GSF track when there is at least one shared hit in the innermost region. A third method evaluates the charge sign using the sign of the $\phi$ angle difference between the vector joining the nominal interaction point to the SC position and the vector connecting the nominal interaction point to the innermost hit of the electron GSF track. A detailed description of the three methods can be found in ref. [4]. When two or three out of the three measurements agree on the sign of the charge (majority method), that sign is assigned as the default electron charge. A very high probability of correct charge sign assignment can be obtained by requiring all three measurements to agree (selective method). While the former method is 100% efficient by construction, the latter has some efficiency loss. The fraction of electrons passing the loose identification requirements (as described in section 7.3) with all three charge sign estimations in agreement is shown in figure 6, as a function of $E_\mathrm{T}$. The probability of correct charge assignment is measured in data from the rate of same-sign electron pairs in Z → ee events. The expected number of same-sign pairs is
$$N_\mathrm{SS}^\mathrm{expected} = \sum_{(i,j),(k,l)} N_{(i,j),(k,l)} \left[ P_{i,j}\,(1 - P_{k,l}) + (1 - P_{i,j})\,P_{k,l} \right],$$
where $P_{i,j}$ is the probability of correctly determining the electron charge in the $(i,j)$-th $E_\mathrm{T}$-$\eta$ bin and $N_{(i,j),(k,l)}$ is the number of selected electron pairs with one electron in bin $(i,j)$ and the other in bin $(k,l)$.
By performing a global fit (in all the bins simultaneously) of the expected number of SS pairs to the observed number, the probability for each bin can be obtained in both data and simulation. The electrons are required to pass the loose identification requirements, as described in section 7.3.1. Tighter identification requirements, specifically those requiring no "missing hits" for the track, have different efficiencies and correct charge sign identification probabilities. In this procedure no background subtraction is applied. Figure 7 shows the probability of correct charge assignment of the majority (left) and selective (right) methods, as a function of the electron $|\eta|$. The charge identification rate using the 2016 data set is compared with the correct charge assignment probability obtained in Z → ee simulated events.
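Under the simplifying assumption that both electrons of a pair fall in the same bin with correct-charge probability $p$, the same-sign fraction is $f_\mathrm{SS} = 2p(1-p)$ (a pair is SS when exactly one charge is misassigned), which can be inverted in closed form. This toy, single-bin version of the global fit can be sketched as:

```python
import math

def correct_charge_probability(n_ss, n_total):
    """Invert f_SS = 2p(1-p) for the correct-charge probability p,
    assuming both electrons of every pair fall in the same (eta, pT) bin.
    This is a simplification of the simultaneous fit over all bins
    described in the text."""
    f_ss = n_ss / n_total
    # Take the root close to 1, since p is expected to be near unity.
    return 0.5 * (1.0 + math.sqrt(1.0 - 2.0 * f_ss))
```

For example, a 1.98% same-sign fraction corresponds to a 99% correct-charge probability.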

JINST 16 P05014
From the data-to-simulation comparison, the systematic uncertainty in the charge sign assignment probability for electrons is less than 0.1% in the barrel and 0.3% in the endcap regions.

Online electron and photon reconstruction
Electron and photon candidates at L1 are based on ECAL trigger towers defined by arrays of 5×5 crystals in the barrel and by a more complicated pattern in the endcaps, because of the different layout of the crystals [12]. The central trigger tower with the largest transverse energy above a fixed threshold ($E_\mathrm{T} > 2$ GeV) is designated as the seed tower. To recover energy losses because of bremsstrahlung, clusters are built from surrounding towers with $E_\mathrm{T}$ above 1 GeV to form the L1 candidates. The sum of the $E_\mathrm{T}$ of all towers in the cluster is the raw cluster L1 $E_\mathrm{T}$. To obtain better identification of L1 e/$\gamma$ candidates, requirements are set on: (i) the energy distribution between the central and neighboring towers; (ii) the amount of energy deposited in the HCAL downstream of the central tower, relative to the L1 $E_\mathrm{T}$ of the candidate; and (iii) variables sensitive to the spatial extent of the electromagnetic shower [38]. No tracker information is available at L1, so electrons and photons are indistinguishable at this stage.
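The tower clustering logic described above (a seed above 2 GeV plus neighboring towers above 1 GeV) can be sketched as follows; the tower indexing scheme is an assumption made for illustration:

```python
SEED_THRESHOLD = 2.0      # GeV, threshold for the central seed tower
NEIGHBOR_THRESHOLD = 1.0  # GeV, threshold for surrounding towers

def l1_cluster_et(towers, seed_index, neighbor_indices):
    """Return the raw L1 cluster ET, or None if the seed fails the threshold.

    towers: dict mapping a tower index to its transverse energy in GeV.
    """
    seed_et = towers.get(seed_index, 0.0)
    if seed_et <= SEED_THRESHOLD:
        return None
    et = seed_et
    for idx in neighbor_indices:
        # Towers above 1 GeV recover energy radiated as bremsstrahlung.
        if towers.get(idx, 0.0) > NEIGHBOR_THRESHOLD:
            et += towers[idx]
    return et
```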
The HLT electron and photon candidates are reconstructed from energy deposits in ECAL crystals grouped into clusters around the corresponding L1 candidate (called the L1 seed). For a given L1 seed, the ECAL clustering algorithm is performed by the HLT from the readout channels overlapping a matrix of crystals centered on the L1 candidate. The HLT processing time is kept short by clustering only around the L1 seed. Based on this L1 seed, superclusters are built using offline reconstruction algorithms, and the HLT requirements are applied as follows. For electron candidates, the ECAL SC is associated with a reconstructed track whose direction is compatible with its location. Electron and photon selection at the HLT relies on the identification and isolation criteria, together with minimal thresholds on the SC HLT $E_\mathrm{T}$ (i.e., the energy measured by the HLT using only the ECAL information, without any final calibration applied). The identification criteria are based on the transverse profile of the cluster energy deposited in the ECAL, the amount of energy in the HCAL downstream from the ECAL SC, and (for electrons) the degree of association between the track and the ECAL SC. The isolation criteria make use of the energy deposits that surround the HLT electron candidate in the ECAL, HCAL, and tracker detectors.

Differences between online and offline reconstruction
The HLT must ensure a large acceptance for physics signals, while keeping the CPU time and output rate under control. This is achieved by exploiting the same software used for the offline analysis that ensures high reconstruction efficiency and reduces the trigger rate by applying stringent identification criteria and quality selections. The differences between the HLT and offline reconstruction code are minimal and are mainly driven by: (i) the limited CPU time available at the HLT, fixed at about 260 ms by the number of processing CPUs and the L1 input rate; (ii) the lack of final calibrations, which are not yet computed during the data-taking period; and (iii) more conservative selection criteria to avoid rejecting potentially interesting events.
To keep the processing time short, all trigger paths have a modular structure and are characterized by a sequence of reconstruction and filtering blocks of increasing complexity. Thus faster algorithms are run first, and their products are immediately filtered, allowing the remaining algorithms to be skipped when the event fails any given filter. Another important time-saving optimization is to restrict the detector readout and reconstruction to regions of interest around the L1 candidates. Moreover, HLT SCs, which are more robust against possible background contamination, have a simpler energy correction than the offline reconstruction.
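The early-exit structure of a trigger path can be sketched as follows; the step contents are invented for illustration:

```python
def run_path(event, steps):
    """Run a modular trigger path.

    steps: list of (reconstruct, filter_) callables applied in order of
    increasing complexity. Returns True if the event passes every filter,
    False at the first failure (skipping all remaining, slower steps).
    """
    for reconstruct, filter_ in steps:
        products = reconstruct(event)
        if not filter_(products):
            return False  # early exit: later algorithms never run
    return True
```

The design choice mirrors the text: cheap calorimeter-based filters run first, so expensive steps such as tracking are executed only for events that have already passed them.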
The main difference between the online and offline reconstruction occurs in the tracking algorithms. Every electron candidate reconstructed at the HLT is ECAL-driven; the algorithm starts by finding a supercluster and then looks for a matching track reconstructed in the pixel detector. The association is performed geometrically, matching the SC trajectory to pixel detector hits. Since 2017, the online pixel matching algorithm requires three pixel hits (rather than the two required offline) to maximize early background rejection. Two pixel detector hits are accepted only if the trajectory passes through a maximum of three active modules. Once the SC is associated with the pixel detector seeds, the electron track is reconstructed using the same GSF algorithm as employed offline. Since this algorithm is used only when the pixel matching succeeds, the processing time is considerably reduced. Moreover, not all electron paths lead to reconstructed tracks; some of them can achieve significant rate reduction from pixel detector matching alone. For isolated electrons, all the nearby tracks must be reconstructed to build the track isolation variables. This is accomplished at the end of the path by using an iterative tracking algorithm similar to that applied offline, but specifically customized for the HLT and with fewer iterations of the tracking procedure.
Offline tracker-driven electron reconstruction is advantageous only for low energy or nonisolated electrons, neither of which is easy to trigger on. The use of only ECAL-driven electrons at the HLT is thus a reasonable simplification with respect to the offline reconstruction.
Other differences that exist with respect to the offline reconstruction concern calorimetry. At HLT the timing selection requirement applied offline to reject out-of-time hits (e.g., pileup, anomalous signals in ECAL from the interaction of particles in photodetectors, cosmic and beam halo events) is removed, since it does not significantly reduce the rate and risks losing rare signatures, such as the detection of long-lived particles. Moreover, the ECAL online calibration is also different; the response corrections for the crystal transparency loss that are applied at HLT during the data-taking period and updated twice per week are not as accurate as the ones used by the offline reconstruction.
Finally, some online variables are defined differently with respect to offline. The $E_\mathrm{T}$ is computed with respect to the origin of the CMS reference system, instead of the actual position of the collision primary vertex, and it is measured using only calorimeter information, without any track-based corrections or final calibrations. The online particle isolation is defined by exploiting energy clusters built in the ECAL and HCAL and tracks reconstructed in the tracker, instead of using the more complete PF information, which is available offline. Some other variables, such as $H/E$, are defined in the same way both offline and online, although with slightly different parameters, that ensure the online selection is always looser than offline.

Electron trigger requirements and performance
The electron triggers correspond to the first selection step of most offline analyses using electrons, which requires the presence of at least one, two, or three HLT electron candidates. Because of bandwidth limitations, both L1 seeds and HLT paths may be prescaled, i.e., they may record only a fraction of the events, to reduce the trigger rate. Tables 2 and 3 show the lowest unprescaled L1 and HLT $E_\mathrm{T}$ thresholds and the corresponding L1 seed and HLT path names of Run 2 [38]. The single- and double-electron trigger performance is reported, using the full Run 2 data sample corresponding to an integrated luminosity of 136 fb$^{-1}$. Efficiencies are obtained using a data-driven method based on the tag-and-probe technique (described in detail in section 4.7), which exploits Z → ee events and requires one electron candidate, called the tag, to satisfy tight selection requirements, while leaving the other electron of the pair, called the probe, unbiased to measure the efficiency.
For the results presented in this section, tag electrons are required to pass the criteria described in section 4.7. Moreover, they have to satisfy an unprescaled single-electron trigger, with HLT $E_\mathrm{T} > 27$ and 32 GeV for the 2016 and 2017-2018 data-taking periods, respectively. Probe electrons must have $|\eta| < 2.5$ and $E_\mathrm{T}^\mathrm{ECAL} > 5$ GeV, with $E_\mathrm{T}^\mathrm{ECAL} = E_\mathrm{ECAL} \sin\theta_\mathrm{SC}$, where $E_\mathrm{ECAL}$ is the best estimate of the electron energy measured by ECAL and $\theta_\mathrm{SC}$ is the angle with respect to the beam axis of the electron SC. No additional identification criteria are applied to the probes. To measure the trigger efficiency, probes are then required to pass the HLT path under study. The electron triggers analyzed in this paper are the single-electron trigger HLT_Ele32_WPTight_Gsf (with a 27 GeV threshold in 2016) and the double-electron trigger HLT_Ele23_Ele12. Photon triggers are not included in this paper, since they are very similar to electron triggers, except for the absence of the requirement on the presence of matching tracks. Moreover, photon triggers are usually designed for specific analyses and are not used as extensively as the electron triggers described above.
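A toy version of the tag-and-probe efficiency extraction might look like the following; the probe records and the simple binomial uncertainty are illustrative simplifications of the full procedure:

```python
def trigger_efficiency(probes, passes_hlt):
    """Tag-and-probe style efficiency: the fraction of unbiased probes
    that pass the HLT path under study.

    probes: list of probe objects; passes_hlt: predicate on a probe.
    """
    n_total = len(probes)
    n_pass = sum(1 for p in probes if passes_hlt(p))
    eff = n_pass / n_total
    # Simple binomial uncertainty; a real measurement would typically use
    # a Clopper-Pearson or similar interval near eff = 0 or 1.
    err = (eff * (1 - eff) / n_total) ** 0.5
    return eff, err
```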
The efficiency of the two analyzed electron triggers in different $\eta_\mathrm{SC}$ regions is shown, with respect to an offline reconstructed electron, in figures 8 and 9, as a function of the electron $E_\mathrm{T}$. The region $1.44 < |\eta| < 1.57$ is not included since it corresponds to the transition between the barrel and endcap regions of the ECAL, where the quality of reconstruction, calibration, and identification is not as good as in the rest of the ECAL (see figure 4). The DY + jets simulated event samples produced with MadGraph5 [21] are used for comparison. The measurement combines both the L1 and HLT efficiencies. At the HLT level, both objects required by the double-electron path must correspond to an L1 seed, which can require either a single electron with a higher momentum threshold (L1_SingleEG) or two electrons (L1_DoubleEG) with lower momentum thresholds (as shown in table 2). This requirement also needs to be applied offline when performing the tag-and-probe measurement. Since the tag needs to pass a single-electron HLT path, it must pass an L1_SingleEG seed. As a consequence, it will also satisfy the requirements of the $E_\mathrm{T}$-leading object of the lowest unprescaled L1_DoubleEG, lowering the L1 requirement on the probe to be only above the subleading threshold of the lowest unprescaled L1_DoubleEG. When the $E_\mathrm{T}$-leading object (Ele23) of the double-electron path is tested, the probe is thus specifically requested to pass the leading threshold of the path's L1_DoubleEG seed. As reported in table 2, the L1 $E_\mathrm{T}$ thresholds of the lowest unprescaled L1_DoubleEG seed increased across the years, leading to a larger efficiency at low $E_\mathrm{T}$ for the double-electron trigger in 2016 than in 2017 and 2018.
The single-electron trigger analyzed in this paper is characterized by a sequence of strict identification and isolation selections, known as "tight working point" (WPTight). This selection was retuned in 2017 to ensure better performance. As a consequence, the single-electron trigger efficiency is higher in 2017-2018 than in 2016.
As previously described, electron candidates at the HLT are built by associating a track reconstructed in the pixel detector with an ECAL SC. In 2017, the CMS pixel detector was upgraded by introducing extra layers in the barrel and forward regions. At the beginning of that year, a commissioning period of the pixel detector led to a slightly reduced efficiency, which mostly affected barrel electrons. Moreover, as a consequence of the new detector, the algorithm used to reconstruct electrons, by matching ECAL superclusters to pixel tracks, was revised. Since the beginning of 2017 data taking, the algorithm requires two hits in the pixel detector when the particle trajectory passes through three or fewer active modules and three hits otherwise, whereas in 2016 only two hits were demanded in all cases. This change produced a significant rate reduction with minimal efficiency losses. To operate with the new pixel detector, DC-DC converters were installed. After a few months of smooth operation, some converters started to fail once the luminosity of the accelerator was increased, at the beginning of October 2017, leading to a decreasing efficiency toward the end of the year. For these reasons related to the pixel detector, 2017 trigger performance is slightly worse than for the other years, in particular for the double-electron trigger, where the retuning of the tight working point does not have any effect. In figure 10, the 2017 efficiencies of the single- and double-electron HLT paths are reported as a function of the number of reconstructed primary vertices. In 2017 the majority of the high-pileup data was recorded at the end of the year, the same time the pixel DC-DC converters exhibited efficiency losses. Thus the efficiency loss versus number of vertices in the event is not solely due to the pileup. However, as figure 10 shows, the efficiency loss is significant only for $2.0 < |\eta| < 2.5$.
The combined L1 and HLT trigger efficiency for the lowest unprescaled single-electron trigger path is about 80% at the $p_\mathrm{T}$ plateau, with slightly lower values in the endcaps in 2016-2017. Because of the looser selection applied, the double-electron trigger has an efficiency close to unity for both objects. The increase in dead regions in the pixel tracker arising from DC-DC converter failure is difficult to simulate, and is one of the main causes of disagreement between data and simulation, in particular in 2017, especially at high $p_\mathrm{T}$. The discrepancy in the turn-on at low $E_\mathrm{T}$, seen for all years and $\eta$ values, is mainly because of the small differences that exist between the online and offline ECAL response corrections, as described above.

Figure 10. Efficiency of the HLT_Ele32_WPTight_Gsf (left) and HLT_Ele23_Ele12 (right) triggers, with respect to an offline reconstructed electron, as a function of the number of reconstructed primary vertices, obtained for different $|\eta|$ regions using the 2017 data set. Electron $p_\mathrm{T}$ is required to be >50 GeV. The bottom panel shows the data-to-simulation ratio. The efficiency measurements combine the effects of the L1 and HLT triggers. The vertical bars on the markers represent combined statistical and systematic uncertainties. In 2017 the majority of the high-pileup data was recorded at the end of the year, the same time the pixel DC-DC converters exhibited efficiency losses. Thus the efficiency loss versus number of vertices in the event is not solely due to the pileup. However, the efficiency loss is significant only for $2.0 < |\eta| < 2.5$.

be lost through lateral and longitudinal shower leakage, or in intermodule gaps or dead crystals; the shower energy can also be smaller than the initial electron energy because of the energy lost in the tracker.
These losses result in systematic variations of the energy measured in the ECAL. Without any corrections, this would lead to a degradation of the energy resolution for reconstructed electrons and photons. To improve the resolution, a multivariate technique is used to correct the energy estimation for these effects, as discussed below. The regression technique described in section 6.1 uses simulated events only, whereas the energy scale and spreading corrections detailed in section 6.2 are based on the comparison between data and simulation.

Energy corrections with multivariate regressions
A set of regression fits based on BDTs is applied to correct the energy of e/$\gamma$ objects [39]. The minimum $p_\mathrm{T}$ for electrons (photons) considered for the BDT training is 1 (5) GeV at the simulation level. Each of these energy regressions is built as follows. The regression target is the ratio between the true energy of an e/$\gamma$ and its reconstructed energy; thus the regression prediction for the target is the correction factor to be applied to the measured energy to obtain the best estimate of the true energy. The regression input variables, represented by the vector $\vec{x}$, include the object and event parameters most strongly correlated with the target. The regression is implemented as a gradient-boosted decision tree, and a negative log-likelihood loss function is employed [39]:
$$L = -\sum_i \ln p(y_i \,|\, \vec{x}_i),$$
where $p(y\,|\,\vec{x})$ is the estimated probability for an object to have the observed value $y$ of the target, given the input variables $\vec{x}$, and the sum runs over all objects in a simulated sample in which the true values of the object energies are known. The probability density function used in this regression algorithm is a double-sided Crystal Ball (DSCB) function [37] that has a Gaussian core with power law tails on both sides. The definition of the DSCB function is as follows:
$$f(y) = N \times \begin{cases} e^{-\xi^2/2} & \text{if } -\alpha_\mathrm{L} \le \xi \le \alpha_\mathrm{R}, \\[4pt] e^{-\alpha_\mathrm{L}^2/2} \left[ \dfrac{\alpha_\mathrm{L}}{n_\mathrm{L}} \left( \dfrac{n_\mathrm{L}}{\alpha_\mathrm{L}} - \alpha_\mathrm{L} - \xi \right) \right]^{-n_\mathrm{L}} & \text{if } \xi < -\alpha_\mathrm{L}, \\[4pt] e^{-\alpha_\mathrm{R}^2/2} \left[ \dfrac{\alpha_\mathrm{R}}{n_\mathrm{R}} \left( \dfrac{n_\mathrm{R}}{\alpha_\mathrm{R}} - \alpha_\mathrm{R} + \xi \right) \right]^{-n_\mathrm{R}} & \text{if } \xi > \alpha_\mathrm{R}, \end{cases}$$
where $N$ is the normalization constant, $\xi = (y - \mu)/\sigma$, the variables $\mu$ and $\sigma$ are the parameters of the Gaussian core, and the $\alpha_\mathrm{R}$ ($\alpha_\mathrm{L}$) and $n_\mathrm{R}$ ($n_\mathrm{L}$) parameters control the right (left) tails of the function. Through the training phase, the regression algorithm estimates the parameters of the DSCB probability density as functions of the input vector of the object and event characteristics $\vec{x}$. Subsequently, for an e/$\gamma$ candidate, the most probable value $\mu(\vec{x})$ is the estimate of the correction to the object's energy, and the width $\sigma(\vec{x})$ of the Gaussian core is the estimate of the per-object energy resolution. Both $\mu$ and $\sigma$ are predicted by the regression, as functions of the object and event parameter vector $\vec{x}$.
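For illustration, the DSCB shape (omitting the normalization constant $N$) can be implemented directly from its definition, with the Gaussian core matched continuously to the two power-law tails:

```python
import math

def dscb(y, mu, sigma, alpha_l, n_l, alpha_r, n_r):
    """Unnormalized double-sided Crystal Ball function: a Gaussian core
    with power-law tails on both sides; continuous at the boundaries."""
    xi = (y - mu) / sigma
    if -alpha_l <= xi <= alpha_r:
        return math.exp(-0.5 * xi * xi)
    if xi < -alpha_l:  # left power-law tail
        a = math.exp(-0.5 * alpha_l * alpha_l)
        b = (alpha_l / n_l) * (n_l / alpha_l - alpha_l - xi)
        return a * b ** (-n_l)
    # right power-law tail
    a = math.exp(-0.5 * alpha_r * alpha_r)
    b = (alpha_r / n_r) * (n_r / alpha_r - alpha_r + xi)
    return a * b ** (-n_r)
```

At $\xi = \alpha_\mathrm{R}$ the bracketed factor equals 1, so the tail joins the Gaussian core at $e^{-\alpha_\mathrm{R}^2/2}$, as required by continuity.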
The electron energy is corrected via the sequential application of three regressions: the first regression (step 1) provides the correction to the SC energy, the second regression (step 2) provides an estimate of the SC energy resolution, taking into account the additional spread in data due to real detector conditions, and the third regression (step 3) yields the final energy value, correcting the combined energy estimate from the SC and the electron track information. The photon energy is corrected using the same method, except that step 3 is omitted.
The electron and photon regressions are trained on samples of simulated events with two electrons or photons in each event, generated with a flat transverse momentum spectrum, where the true value of the e/$\gamma$ energy is known and the geometric condition $\Delta R < 0.1$ is used to match the reconstructed e/$\gamma$ objects to the true ones. The ECAL crystals exhibit slight variations in the light output for a given electromagnetic shower. This effect is corrected by the intercalibration of the crystals [19], and a corresponding modeling of this variation is applied in the simulation. In addition, the knowledge of the crystal intercalibrations is affected by random deviations [4], which impact the energy resolution. This effect is usually simulated by applying a random spreading of the crystal intercalibrations within the expected inaccuracy. To avoid the random spreading of the simulation, the regression fit corrects the data using two MC samples: a sample without the intercalibration spreading (called the ideal IC sample) is used to train the energy regression, and a sample with the intercalibration spreading (called the real IC sample) is used for the energy resolution estimation and for the SC and track combination.
The workflow for the electron and photon energy regressions is summarized in table 4. Each subsequent step depends on the output of the previous step. The step 1 regression primarily corrects for the energy that is lost in the tracker material or in intermodule gaps in the ECAL. The regression inputs include the energy and position of the SC, and the variable $R_9$, which is defined as the energy sum of the 3×3 crystal array centered around the most energetic crystal in the SC divided by the energy of the SC. Other quantities, including lateral shower shapes in $\eta$ and $\phi$, the number of saturated crystals, and other SC shape parameters, as well as an estimate of the pileup transverse energy density in the calorimeter, are also included.
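The $R_9$ variable can be sketched as follows, using a toy crystal map keyed by (ieta, iphi) indices (an assumed data layout, not the CMS event format):

```python
def r9(crystals, e_supercluster):
    """R9 = E(3x3) / E(SC): the energy in the 3x3 crystal array centered
    on the most energetic crystal, divided by the supercluster energy.
    R9 is close to 1 for showers contained in 3x3 crystals, and lower for
    converted photons or radiating electrons with laterally spread deposits.

    crystals: dict mapping (ieta, iphi) -> energy in GeV.
    """
    seed = max(crystals, key=crystals.get)
    e3x3 = sum(
        crystals.get((seed[0] + di, seed[1] + dj), 0.0)
        for di in (-1, 0, 1)
        for dj in (-1, 0, 1)
    )
    return e3x3 / e_supercluster
```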
Step 2 is performed to obtain an estimate of the per-object resolution. It uses the same inputs as in step 1, but the SC energy is scaled by the correction factor obtained from the step 1 regression, and the target of the step 2 regression is the ratio of the true energy of the particle to the measured energy corrected by step 1. Since imperfect intercalibration affects the spread of the energy response between crystals and not the mean value of the average response, in step 2 the mean $\mu(\vec{x})$ of the DSCB probability density function is fixed to that obtained from step 1. The primary result of the step 2 regression is the estimated value of the energy resolution $\sigma(\vec{x})$.
For electrons, since the energy measurement is performed independently in the ECAL and the tracker, an additional step combining the ECAL energy and the momentum estimate from the tracker is performed. A weighted combination of the two independent measurements can be formed as
$$E_\mathrm{comb} = \frac{E_\mathrm{ECAL}/\sigma_E^2 + p_\mathrm{tracker}/\sigma_p^2}{1/\sigma_E^2 + 1/\sigma_p^2},$$
where $E_\mathrm{ECAL}$ and $\sigma_E$ are the ECAL measurements of the energy and the energy resolution of the SC of the electron, corrected with the step 1 and 2 regressions, respectively, and $p_\mathrm{tracker}$ with $\sigma_p$ are the momentum magnitude and momentum resolution measured by the electron tracking algorithm (as described in section 4.3). This improves the predicted electron energy at low $p_\mathrm{T}$, especially where the momentum measurement from the tracker has a better resolution than the corresponding ECAL measurement. The average relative momentum resolution of the tracker ($\sigma_p/p$) and energy resolution of the ECAL ($\sigma_E/E$) are shown in figure 11. The momentum resolution of the tracker is better than the ECAL energy resolution for transverse momenta below 10-15 GeV and deteriorates at higher energies. The $E$-$p$ combination in CMS is only performed for electrons with energies less than 200 GeV. For higher-energy electrons only the SC energy is used, corrected by the above described regression steps. The step 3 regression uses as a target the ratio of the true electron energy to $E_\mathrm{comb}$, computed as the $E$-$p$ combination discussed above. The inputs for the regression include all quantities that enter the $E_\mathrm{comb}$ expression, plus several additional tracker quantities including the fractional amount of energy lost by the electron in the tracker, whether the electron was reconstructed as ECAL-driven or tracker-driven (as discussed in section 4.3), and a few other tracker-related parameters. In figures 11-14 the outcome of the step 3 regression is referred to as the $E$-$p$ combination.
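The weighted combination above is a standard resolution-weighted average and can be written compactly as:

```python
def combine_energy(e_ecal, sigma_e, p_tracker, sigma_p):
    """Resolution-weighted combination of the ECAL energy measurement and
    the tracker momentum measurement: each input is weighted by the
    inverse of its squared resolution."""
    w_e = 1.0 / sigma_e**2
    w_p = 1.0 / sigma_p**2
    return (e_ecal * w_e + p_tracker * w_p) / (w_e + w_p)
```

With equal resolutions the result is the plain average; when the tracker resolution is much better (low $p_\mathrm{T}$), the combination is pulled toward the tracker momentum, as described in the text.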
These regressions lead to significantly improved measurements of electron and photon energies and energy resolutions, as seen in figure 12. The primary improvement occurs in the regressions applied to the energy of the SC (steps 1 and 2). Correcting the $E$-$p$ combination, which already uses the improved SC energy, has a smaller impact. The effects of the regression corrections for the various steps of the correction procedure are illustrated in figure 12 for low-$p_\mathrm{T}$ electrons. The regressions are robust, and the performance is stable for electrons and photons in a wide energy range in all regions of ECAL, and as a function of pileup, as shown in figures 13 and 14.

Energy scale and spreading corrections
After applying the corrections described in section 6.1, small differences remain between data and simulation in both the electron and photon energy scales and resolutions. In particular, the resolution in simulation is better than that in data.
An additional spreading needs to be applied to the photon and electron energy resolutions in simulation to match that observed in data. The electron and photon energy scales are corrected by varying the scale in the data to match that observed in simulated events. The magnitude of the final correction is up to 1.5%, with a total uncertainty estimated to be smaller than 0.1 (0.3)% in the barrel (endcap). Two dedicated methods, the "fit method" and the "spreading method" [4], were developed in Run 1 to estimate these corrections from Z → ee events. In the fit method, an analytic fit is performed to the invariant mass distribution of the Z boson ($m_\mathrm{ee}$), using the convolution of a Breit-Wigner (BW) and a DSCB function. The invariant mass distributions obtained from data and from simulated events are fitted separately and the results are compared to extract a scale offset. The BW width is fixed to the world-average Z boson width, while the parameters of the DSCB function, which account for detector resolution effects and bremsstrahlung losses upstream of the ECAL, are free parameters of the fit. The spreading method, on the other hand, utilizes the simulated Z boson invariant mass distribution as a probability density function in a maximum likelihood fit to the data. The simulation already accounts for all known detector effects, reconstruction inefficiencies, and the Z boson kinematic properties. The residual discrepancy between data and simulation is described by an energy spreading function, which is applied to the simulation. A Gaussian spreading, which ranges from 0.1 to 1.5%, is applied to the simulated energy response; it is adequate to describe the data in all the examined categories of events. Compared with the fit method, the spreading method can accommodate a larger number of electron categories in which these corrections are derived.
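A minimal sketch of the Gaussian spreading applied to simulated energies (the parameter values and random-generator choice are illustrative):

```python
import random

def apply_spreading(energies, delta_sigma, seed=None):
    """Apply a multiplicative Gaussian smearing of relative width
    delta_sigma (e.g. 0.001 to 0.015, i.e. 0.1 to 1.5%) to a list of
    simulated energies, in the spirit of the spreading method."""
    rng = random.Random(seed)
    return [e * rng.gauss(1.0, delta_sigma) for e in energies]
```

The smearing leaves the average response unchanged while widening the simulated resolution to match the data.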
A multistep procedure is implemented, based on the fit and spreading methods, to fine-tune the electron and photon energy scales. To derive the corrections to the photon energy scale, electrons from Z boson decays are used, reconstructed using information exclusively from the ECAL.
In the first step, any residual long-term drifts in the energy scale in data are corrected by using the fit method, in approximately 18-hour intervals (corresponding approximately to one LHC fill). Further subcategories are defined based on various $|\eta|$ regions, owing to the different levels of radiation damage and of the amount of material budget upstream of the ECAL. There are two regions in the barrel, $|\eta| < 1.00$ and $1.00 < |\eta| < 1.44$. In the endcap, the two categories are defined by $1.57 < |\eta| < 2.00$ and $2.00 < |\eta| < 2.50$. After applying these time-dependent residual scale corrections, the energy scale in data is stable with time.
In the second step, corrections to both the energy resolution in the simulation and the scale for the data are derived simultaneously in bins of $|\eta|$ and $R_9$ for electrons, using the spreading method. The energy scale corrections are derived in 50 electron categories: 5 in $|\eta|$ and 10 in $R_9$. This is a significant improvement in granularity compared with Run 1 [4], where only 8 electron categories were used (4 in $|\eta|$ and 2 in $R_9$), thus leading to an improvement in the precision of the derived scale corrections. The $R_9$ value of each electron or photon SC is used to select electrons that interact or photons that undergo a conversion in the material upstream of the ECAL. The energy deposited by photons that convert before reaching the ECAL tends to have a wider transverse profile and thus lower $R_9$ values than those for unconverted photons. The same is true for electrons that radiate upstream of the ECAL.
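A hypothetical lookup of per-category scale corrections, mirroring the 5 × 10 binning in $|\eta|$ and $R_9$ described above; the bin edges and correction values below are invented for illustration:

```python
import bisect

# Invented bin edges: 5 |eta| bins and 10 R9 bins (edges are illustrative,
# not the CMS category boundaries).
ETA_EDGES = [0.0, 0.5, 1.0, 1.44, 2.0, 2.5]
R9_EDGES = [0.0, 0.5, 0.8, 0.9, 0.92, 0.94, 0.96, 0.97, 0.98, 0.99, 1.0]

def category(abs_eta, r9_value):
    """Return the (eta_bin, r9_bin) index pair for an electron or photon."""
    i = bisect.bisect_right(ETA_EDGES, abs_eta) - 1
    j = bisect.bisect_right(R9_EDGES, min(r9_value, 0.999)) - 1
    return i, j

def corrected_energy(energy, abs_eta, r9_value, scale_table):
    """Apply the per-category multiplicative scale correction.

    scale_table: dict mapping (i, j) -> correction factor (default 1.0).
    """
    return energy * scale_table.get(category(abs_eta, r9_value), 1.0)
```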
The energy scale corrections obtained from this step in fine bins of $R_9$ are shown in figure 15 for the 2017 data-taking period. The uncertainties shown are statistical only. The ECAL electronics operate with three gains: 1, 6, and 12, depending on the energy recorded in a single readout channel. Most events are reconstructed with gain 12, whereas events with the highest energies are reconstructed with gains 6 or 1. The gain switch from 12 to 6 (6 to 1) typically happens for electron/photon energies above 150 (300) GeV in the barrel, and at higher values in the endcaps. A residual scale offset of nearly 1% is measured for gain 6 both in the EB and EE, and of 2 (3)% for gain 1 in the EB (EE). Thus an additional gain-dependent residual correction is derived and applied.
The systematic uncertainties in the electron energy scale and resolution corrections are derived using Z → ee events by varying the distribution of $R_9$, the electron selections used, and the $p_\mathrm{T}$ thresholds on the electron pairs used in the derivation of the corrections. The contributions of these individual sources are added in quadrature to obtain the total uncertainty. This uncertainty in the energy scale is 0.05-0.1 (0.1-0.3)% for electrons in the EB (EE), where the range corresponds to the variation in the $R_9$ bins.
The performance of energy corrections in data, including the ones described in section 6.1, is illustrated by the reconstructed Z → ee mass distribution before and after corrections, as shown in figure 16. The regression clearly improves the mass resolution for electrons from Z boson decays, both in the barrel and endcaps, and the absolute energy scale correction shifts the dielectron mass distribution peak closer to the world-average Z boson mass value. The data-to-simulation agreement, after the application of residual scales to data and spreadings to simulated events, is shown in figure 17 for two representative categories.
The ultimate energy resolution after all the corrections (regression and scale corrections) ranges from 2 to 5%, depending on electron pseudorapidity and energy loss through bremsstrahlung in the detector material.

Performance and validation with data
Energy scale and spreading corrections, derived with electrons from Z boson decays with a mean $p_\mathrm{T}$ of around 45 GeV, are applied also to electrons and photons over a wide range of $p_\mathrm{T}$, up to several hundreds of GeV. Therefore, it is important to validate the performance of the residual energy corrections on a sample of unbiased photons and on high-energy e/$\gamma$ objects.
To validate with unbiased photons, a sample of Z → μμγ events selected from data with 99% photon purity is used. Events in both data and simulation are required to satisfy standard dimuon trigger requirements. An event is kept if there are at least two muons passing the tight muon identification requirements [40], with $p_\mathrm{T} > 30$ and 10 GeV and $|\eta| < 2.4$. The two muons must have opposite charges and an invariant mass ($m_{\mu\mu}$) greater than 35 GeV. Once the dimuon system is identified, a photon in the event is required to have $|\eta| < 2.5$, be reconstructed outside the barrel-endcap transition region, and have $p_\mathrm{T} > 20$ GeV. The $\mu\mu\gamma$ system is then selected by requiring that the photon is within $\Delta R = 0.8$ of at least one of the muons. After applying these criteria, roughly $140 \times 10^3$ ($230 \times 10^3$) events are selected with a photon in the EB for the 2016-2017 (2018) data sets, and roughly $40 \times 10^3$ ($80 \times 10^3$) with a photon in the EE for the 2016-2017 (2018) data sets. Figure 18 shows the invariant mass distribution of the Z → μμγ system ($m_{\mu\mu\gamma}$) obtained after applying the scale and spreading corrections derived with electrons from Z boson decays to the photons, shown separately for barrel and endcap photons in 2017 data and simulation, and in 2018 data and simulation. The photon energy scale is extracted for data and simulation from the mean of the distribution of a per-event estimator [19] defined as $s = (m^2_{\mu\mu\gamma} - m^2_\mathrm{Z})/(m^2_\mathrm{Z} - m^2_{\mu\mu})$, where $m_\mathrm{Z}$ denotes the Particle Data Group world-average Z boson mass [36]. The energy scale difference between data, corrected with the energy scale corrections derived with electrons from Z boson decays, and simulation, both from Z → μμγ events, is smaller than 0.1% for photons both in the barrel and in the endcaps.
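The per-event estimator defined above can be computed directly; the default Z boson mass is the world-average value, and the function name is illustrative:

```python
def scale_estimator(m_mmg, m_mm, m_z=91.1876):
    """Per-event photon energy scale estimator
    s = (m_mmg^2 - m_Z^2) / (m_Z^2 - m_mm^2),
    which vanishes when the reconstructed mu-mu-gamma mass equals the
    Z boson mass, and is approximately the relative photon energy offset
    (E_reco - E_true)/E_true. All masses in GeV."""
    return (m_mmg**2 - m_z**2) / (m_z**2 - m_mm**2)
```

The mean of this distribution over selected events, compared between data and simulation, gives the photon scale difference quoted in the text.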
This difference is within the quadratic sum of the statistical and systematic uncertainties associated with this scale extraction, which include the scale and spreading systematic uncertainties, as well as systematic uncertainties in the corrections applied to the muon momenta [40]. The performance of the energy corrections on high-energy e/$\gamma$ objects is validated by using Z → ee data and MC samples, with scale and spreading corrections applied. The residual corrections for $p_\mathrm{T}$ between 120 and 300 GeV are smaller than 0.8 (1.1)% in the barrel (endcaps). These values are used to derive the systematic uncertainties in the energy correction extrapolation above 300 GeV, where the number of events is very small. In this $p_\mathrm{T}$ range, the systematic uncertainty is conservatively taken to be 2 (3) times the systematic uncertainty in the EB (EE) for the $p_\mathrm{T}$ range 120-300 GeV.
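The behavior of the per-event estimator can be sketched as follows. This is a minimal illustration with our own function and variable names, not code from the CMS software:

```python
import math

M_Z = 91.1876  # Particle Data Group world-average Z boson mass, in GeV

def scale_estimator(m_mmg, m_mm, m_z=M_Z):
    """Per-event photon energy scale estimator
    s = (m_mmg^2 - m_Z^2) / (m_Z^2 - m_mm^2).

    Since m_mmg^2 - m_mm^2 is linear in the photon energy, a relative
    photon energy offset delta shifts the mean of s by approximately
    delta, while s = 0 for a perfectly calibrated photon.
    """
    return (m_mmg**2 - m_z**2) / (m_z**2 - m_mm**2)

# A 2% photon energy offset yields s = 0.02:
m_mm = 60.0
m_mmg = math.sqrt(m_mm**2 + 1.02 * (M_Z**2 - m_mm**2))
print(round(scale_estimator(m_mmg, m_mm), 4))  # 0.02
```

The mean of $s$ over the selected events therefore measures the residual photon energy scale directly.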

Impact of residual corrections in the H → $\gamma\gamma$ channel
The mass of the Higgs boson in the diphoton channel has recently been measured exploiting pp collision data collected at a center-of-mass energy of 13 TeV. A key requirement for this measurement was to measure and correct for nonlinear discrepancies between data and simulation in the energy scale, as a function of $p_\mathrm{T}$, using electrons from Z boson decays. Additional energy scale corrections were derived in bins of $|\eta|$ and $p_\mathrm{T}$ to account for any nonlinear response of the ECAL with energy for the purpose of this high-precision measurement. The corrections obtained from this step are shown in figure 19 for electrons, as functions of $p_\mathrm{T}$.

Electron and photon selection
Many physics processes under study at the LHC are characterized by the presence of electrons or photons in the final state. The performance of the identification algorithms for electrons and photons is therefore crucial for the physics reach of the CMS experiment. Two different techniques are used in CMS for the identification of electrons and photons. One is based on sequential requirements (cut-based), and the other is based on a multivariate discriminant. Although the latter is more suited for precision measurements and physics analyses with well-established final states, the former is widely used for model-independent searches for nonconventional signatures. We describe below in detail the main strategies of electron and photon identification and their performance throughout Run 2.

Electron and photon identification variables
Different strategies are used to identify prompt (produced at the primary vertex) and isolated electrons and photons, and to separate them from background sources. For prompt electrons, background sources can originate from photon conversions, hadrons misidentified as electrons, and secondary electrons from semileptonic decays of b or c quarks. The most important background to prompt photons arises from jets fragmenting mainly into light neutral mesons, $\pi^0$ or $\eta$, which subsequently decay promptly to two photons. For the energy range of interest, the $\pi^0$ or $\eta$ mesons are significantly boosted, such that the two photons from the decay are nearly collinear and are difficult to distinguish from a single photon incident on the calorimeter. Different working points are defined to identify either electrons or photons, corresponding to identification efficiencies of approximately 70, 80, and 90%. In all cases, data and simulation efficiencies are compatible within 1-5% over the full $\eta$ and $p_\mathrm{T}$ ranges for electrons and photons.

Isolation criteria
One of the most efficient ways to reject electron and photon backgrounds is the use of isolation energy sums, a generic class of discriminating variables constructed from the sum of the reconstructed energy in a cone around the electron or photon in different subdetectors. For this purpose, it is convenient to define cones in terms of an $\eta$-$\phi$ metric; the distance with respect to the reconstructed electron or photon direction is defined by $\Delta R = \sqrt{(\Delta\eta)^2 + (\Delta\phi)^2}$. To ensure that the energy from the electron or photon itself is not included in this sum, a veto region is defined inside the isolation cone and excluded from the isolation sum. Electron and photon isolation exploits the information provided by the PF event reconstruction [13]. The isolation variables are obtained by summing the transverse momenta of charged hadrons ($I_\mathrm{ch}$), photons ($I_\gamma$), and neutral hadrons ($I_\mathrm{n}$) inside an isolation cone of $\Delta R = 0.3$ with respect to the electron or photon direction. The larger the energy of the incoming electron or photon, the larger the amount of energy spread around its direction in the various subdetectors. For this reason, the thresholds applied to the isolation quantities are frequently parametrized as a function of the particle $p_\mathrm{T}$, as indicated in tables 5 and 6.
The isolation variables are corrected to mitigate the contribution from pileup. This contribution in the isolation region is estimated as $\rho A_\mathrm{eff}$, where $\rho$ is the median of the transverse energy density per unit area in the event, and $A_\mathrm{eff}$ is the area of the isolation region weighted by a factor that accounts for the dependence of the pileup transverse energy density on the object $\eta$ [4]. The quantity $\rho A_\mathrm{eff}$ is subtracted from the isolation quantities.
The distributions of the isolation variables for photons, after these corrections, are shown in figure 20 for photons in the EB and EE.
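The cone sum with the effective-area pileup subtraction can be sketched as follows. This is a simplified illustration; the candidate list, veto size, and $A_\mathrm{eff}$ value are hypothetical, not the official CMS ones:

```python
import math

def delta_r(eta1, phi1, eta2, phi2):
    """Distance in the eta-phi metric, with the phi difference wrapped to [-pi, pi]."""
    dphi = (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(eta1 - eta2, dphi)

def pf_isolation(candidates, eta0, phi0, rho, a_eff, cone=0.3, veto=0.02):
    """Pileup-corrected PF isolation sum for one particle type.

    `candidates` is a list of (pt, eta, phi) tuples for the PF particles
    of a given type (charged hadrons, photons, or neutral hadrons).
    Particles inside the veto region around the e/gamma direction are
    excluded, and the pileup contribution rho * A_eff is subtracted,
    clipping the result at zero.
    """
    raw = sum(pt for pt, eta, phi in candidates
              if veto < delta_r(eta, phi, eta0, phi0) < cone)
    return max(0.0, raw - rho * a_eff)

# Two candidates in the cone (2.0 + 1.5 GeV), one outside; rho*A_eff = 1.0 GeV.
cands = [(2.0, 0.10, 0.10), (1.5, -0.10, 0.05), (5.0, 1.0, 1.0)]
print(pf_isolation(cands, 0.0, 0.0, rho=20.0, a_eff=0.05))  # 2.5
```

Clipping at zero prevents low-activity events from yielding negative isolation values after the subtraction.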

Shower shape criteria
Another method to reject jets with high electromagnetic content exploits the shape of the electromagnetic shower in the ECAL. Even if the two photons from neutral-meson decays inside a jet cannot be fully resolved, a wider shower profile is expected, on average, compared with a single incident electron or photon. This is particularly true along the $\eta$ direction of the cluster, since the presence of upstream material combined with the effect of the magnetic field reduces the discriminating power of the shower profile in $\phi$: bending in the magnetic field can elongate the electromagnetic cluster in the $\phi$ direction both for converted photons and for pairs of photons from neutral-meson decays where at least one of the photons has converted. Several shower-shape variables are constructed to parametrize the differences between the geometrical shape of energy deposits from prompt photons or electrons and those caused by hadrons from jets. The following are two of the most relevant variables used for photon and electron identification. • Hadronic over electromagnetic energy ratio ($H/E$): the $H/E$ ratio is defined as the ratio between the energy deposited in the HCAL in a cone of radius $\Delta R = 0.15$ around the SC direction and the energy of the photon or electron candidate. Three sources contribute significantly to the measured hadronic energy ($H$) of a genuine electromagnetic object: HCAL noise, pileup, and leakage of electrons or photons through the inter-module gaps. For low-energy electrons and photons, the first two sources are the primary contributors, whereas for high-energy electrons the last contribution dominates. Therefore, to cover both the low- and high-energy regions, the $H/E$ selection requirement is of the form $H < a + b\rho + cE$, where $a$ and $b\rho$ represent the noise and pileup terms, respectively, and $c$ is a scaling term for high-energy electrons and photons. An example of the $H/E$ distribution in data and simulation is shown in figure 21 for electrons in the barrel and endcap regions.
The data-to-simulation ratio in figure 21 is mostly consistent with unity, except for $H/E > 0.16$, where background from events with nonprompt electrons starts to contribute.
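The sliding threshold described above can be expressed compactly. The coefficient values below are placeholders chosen for illustration, not the tuned CMS working-point values:

```python
def passes_h_over_e(h, e, rho, a=1.0, b=0.03, c=0.005):
    """Sliding H/E-type requirement of the form H < a + b*rho + c*E.

    The constant a absorbs HCAL noise, b*rho scales with the event
    pileup density, and c*E accounts for the leakage that dominates
    for high-energy electrons and photons. Coefficients here are
    illustrative only.
    """
    return h < a + b * rho + c * e

# Low-energy candidate: threshold driven by the noise and pileup terms.
print(passes_h_over_e(h=1.2, e=30.0, rho=20.0))   # True  (1.2 < 1.0 + 0.6 + 0.15)
# High-energy candidate: the c*E leakage term widens the threshold.
print(passes_h_over_e(h=3.0, e=500.0, rho=20.0))  # True  (3.0 < 1.0 + 0.6 + 2.5)
```

Parametrizing the cut on $H$ rather than on $H/E$ directly keeps the noise and pileup terms independent of the candidate energy.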
• $\sigma_{i\eta i\eta}$: the second moment of the log-weighted distribution of crystal energies in $\eta$, calculated in the 5×5 matrix around the most energetic crystal in the SC and rescaled to units of crystal size. The mathematical expression is

$$\sigma_{i\eta i\eta}^2 = \frac{\sum_i w_i \,(\eta_i - \bar{\eta}_{5\times5})^2}{\sum_i w_i}, \qquad w_i = \max\!\left(0,\; 4.7 + \ln\frac{E_i}{E_{5\times5}}\right). \tag{7.1}$$

Here, $\eta_i$ is the pseudorapidity of the $i$th crystal, $\bar{\eta}_{5\times5}$ denotes the mean pseudorapidity position, and $w_i$ is a weight factor. The weights reject ECAL noise by ensuring that only crystals with energy deposits of at least 0.9% of $E_{5\times5}$, the energy deposited in the 5×5 crystal matrix around the most energetic crystal, contribute to $\sigma_{i\eta i\eta}$. Because of the presence of upstream material and the magnetic field, the shower from an electron or a photon spreads over more than one crystal. The size of a crystal in $\eta$ is 0.0175 in the EB, and in the EE it varies from 0.0175 to 0.05. Following eq. (7.1), the $\sigma_{i\eta i\eta}$ variable essentially depends on the distance between two crystals in $\eta$; thus, the spread of $\sigma_{i\eta i\eta}$ in the EE is about twice as large as in the EB. The distribution of $\sigma_{i\eta i\eta}$ is expected to be narrow for single-photon or electron showers, and broad for two-photon showers that arise from neutral-meson decays. An example of the $\sigma_{i\eta i\eta}$ distribution in data and MC simulation is shown in figure 22 for photons in the barrel and in the endcap regions.
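Eq. (7.1) can be implemented directly. The sketch below uses our own naming and the barrel crystal size, and takes the crystal energies and pseudorapidities of the 5×5 matrix as input:

```python
import math

CRYSTAL_ETA_SIZE_EB = 0.0175  # eta extent of one EB crystal

def sigma_ieta_ieta(crystals, crystal_size=CRYSTAL_ETA_SIZE_EB):
    """Log-weighted second moment of crystal energies in eta, eq. (7.1).

    `crystals` is a list of (energy, eta) pairs for the 5x5 matrix
    around the seed crystal. The weights w_i = max(0, 4.7 + ln(E_i/E_5x5))
    vanish for crystals below ~0.9% of the matrix energy (exp(-4.7) is
    about 0.009), which suppresses noise. The result is rescaled to
    units of crystal size.
    """
    e_5x5 = sum(e for e, _ in crystals)
    weights = [max(0.0, 4.7 + math.log(e / e_5x5)) if e > 0.0 else 0.0
               for e, _ in crystals]
    w_sum = sum(weights)
    eta_bar = sum(w * eta for w, (_, eta) in zip(weights, crystals)) / w_sum
    variance = sum(w * (eta - eta_bar) ** 2
                   for w, (_, eta) in zip(weights, crystals)) / w_sum
    return math.sqrt(variance) / crystal_size

# Energy split evenly between two adjacent EB crystals: spread of half a crystal.
print(sigma_ieta_ieta([(5.0, 0.0), (5.0, 0.0175)]))  # 0.5
```

With the energy concentrated in one crystal the weighted variance collapses to zero, matching the narrow distribution expected for single showers.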
Another important variable is $R_9$, the ratio of the energy contained in the 3×3 crystal matrix around the most energetic crystal to the SC energy. Showers of photons that convert before reaching the calorimeter have wider transverse profiles and lower values of $R_9$ than those of unconverted photons. The energy-weighted $\eta$-width and $\phi$-width of the SC provide further information on the lateral spread of the shower. In the endcaps, where CMS is equipped with a preshower detector, the variable $\sigma_\mathrm{RR} = \sqrt{\sigma_{xx}^2 + \sigma_{yy}^2}$ is also used, where $\sigma_{xx}$ and $\sigma_{yy}$ measure the lateral spread in the two orthogonal directions of the sensor planes of the preshower detector.

Additional electron identification variables
Additional tracker-related variables are used for the identification of electrons. One such discriminating variable is $|1/E - 1/p|$, where $E$ is the SC energy and $p$ is the track momentum at the point of closest approach to the vertex. Another important variable for electron identification is $\Delta\eta_\mathrm{in}^\mathrm{seed}$, a track-cluster matching variable that uses the SC energy-weighted position in $\eta$ instead of the seed cluster $\eta$. An example of the $\Delta\eta_\mathrm{in}^\mathrm{seed}$ distribution in data and simulation is shown in figure 23 for electrons in the barrel and endcap regions.
An important source of background to prompt electrons arises from secondary electrons produced in conversions of photons in the tracker material. To reject this background, CMS algorithms exploit the pattern of hits associated with the electron track. When a photon conversion takes place inside the tracker volume, the first hit of the electron tracks from the converted photon is not likely to be located in the innermost tracker layer, and missing hits are therefore expected in the first tracker layers. For prompt electrons, whose trajectories start from the beamline, no missing hits are expected in the innermost layers. Distributions of some of the identification variables are shown in figures 21 and 23, for electrons from Z boson decays in data and simulation. The data-to-simulation ratio is close to unity, except in the very high tail, where background from events with nonprompt electrons starts to contribute.

Photon identification
A detailed description of photon identification strategies is given below.

Cut-based photon identification
Requirements are made on , / , and the isolation sums after correcting for pileup as detailed in section 7.1.1. A summary of the standard identification requirements for photons in the barrel and the endcaps is given in table 5 for the tight working point. The selection requirements were tuned using a MC sample with 2017 data-taking conditions, but these identification criteria are suitable for use in all three years of Run 2. The "loose" working point has an average signal efficiency of about 90%, and is generally used when backgrounds are low. The "medium" and "tight" working points have an average efficiency of about 80% and 70%, respectively, and are used in situations where the background is expected to be larger.

Electron rejection
Along with the cut-based photon identification criteria, a prescription is required to reject electrons in the photon identification scheme. The most commonly used method is the conversion-safe electron veto [5]. This veto requires the absence of charged particle tracks, with a hit in the innermost layer of the pixel detector not matched to a reconstructed conversion vertex, pointing to the photon cluster in the ECAL. A more efficient rejection of electrons can be achieved by rejecting any photon for which a pixel detector seed consisting of at least two hits in the pixel detector points to the ECAL within some window defined around the photon SC position. The conversion-safe electron veto is appropriate in the cases where electrons do not constitute a major background, whereas the pixel detector seed veto is used when electrons misidentified as photons are expected to be an important background.

Photon identification using multivariate techniques
A more sophisticated photon identification strategy is based on a multivariate technique, employing a BDT implemented in the TMVA framework [31]. Here, a single discriminant variable is built from multiple input variables, and provides excellent separation between signal (prompt photons) and background from misidentified jets. The signal is defined as reconstructed photons from a $\gamma$ + jets simulated sample that are matched at generator level with prompt photons within a cone of size $\Delta R = 0.1$, whereas the background is defined as reconstructed photons in the same sample that do not match a generated photon within a cone of size $\Delta R = 0.1$. Photon candidates with $p_\mathrm{T} > 15$ GeV, $|\eta| < 2.5$, and satisfying very loose preselection requirements are used for the training of the BDT. The preselection requirements consist of very loose cuts on $H/E$, $\sigma_{i\eta i\eta}$, $R_9$, PF photon isolation, and track isolation.
The variables used as input to the BDT include the shower-shape and isolation variables already presented above. Three more quantities are used that improve the discrimination between signal and background by including the dependencies of the shower-shape and isolation variables on the event pileup, $\eta$, and $p_\mathrm{T}$ of the candidate photon: the median energy per unit area, $\rho$; the $\eta$ of the SC; and the uncorrected energy of the SC corresponding to the photon candidate.
A comparison of the performance between the cut-based identification and the BDT identification for photons is shown in figure 24. The background efficiency as a function of the signal efficiency is reported for the multivariate identification (curves) and for the cut-based selection (discrete points).

Electron identification
A detailed description of electron identification strategies is given below.

Cut-based electron identification
The sequential electron identification selection includes requirements on seven identification variables, with thresholds as listed in table 6 for the representative tight working point. The selection requirements were tuned using a MC sample with 2017 data-taking conditions, and this selection is suitable for use in all three years of Run 2. The combined PF isolation is used, combining the information from $I_\mathrm{ch}$, $I_\gamma$, and $I_\mathrm{n}$. It is defined as $I_\mathrm{combined} = I_\mathrm{ch} + \max(0, I_\mathrm{n} + I_\gamma - I_\mathrm{PU})$, where $I_\mathrm{PU}$ is the correction related to the event pileup. The isolation-related variables are very sensitive to the extra energy from pileup interactions, which affects the isolation efficiency when there are many interactions per bunch crossing. The contribution from pileup in the isolation cone, which must be subtracted, is computed as $I_\mathrm{PU} = \rho A_\mathrm{eff}$, where $\rho$ and $A_\mathrm{eff}$ are defined in section 7.1.1. The variable $I_\mathrm{combined}$ is divided by the electron $p_\mathrm{T}$, and is called the relative combined PF isolation. For the cut-based electron identification, four working points are generally used in CMS. The "veto" working point, which corresponds to an average signal efficiency of about 95%, is used in analyses to reject events with more reconstructed electrons than expected from the signal topology. The "loose" working point corresponds to a signal efficiency of around 90%, and is used in analyses where backgrounds to electrons are low. The "medium" working point can be used for generic measurements involving W or Z bosons, and corresponds to an average signal efficiency of around 80%. The "tight" working point is around 70% efficient for genuine electrons, and is used when backgrounds are larger. Requirements on the maximum number of missing hits, together with the pixel conversion veto described in section 7.2.2, are also applied.
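The relative combined PF isolation can be sketched as follows. The naming is ours, and the $A_\mathrm{eff}$ value in the example is illustrative ($A_\mathrm{eff}$ is $\eta$-dependent in practice):

```python
def relative_combined_isolation(i_ch, i_n, i_gamma, rho, a_eff, pt):
    """Relative combined PF isolation, I_combined / pT, with
    I_combined = I_ch + max(0, I_n + I_gamma - rho * A_eff).

    Only the neutral components receive the pileup subtraction, since
    charged hadrons can be required to come from the primary vertex;
    the max(0, ...) clips over-subtraction in low-activity events.
    """
    i_combined = i_ch + max(0.0, i_n + i_gamma - rho * a_eff)
    return i_combined / pt

# 0.5 GeV charged + max(0, 1.0 + 1.5 - 1.0) = 2.0 GeV total, for pT = 40 GeV.
print(relative_combined_isolation(0.5, 1.0, 1.5, rho=20.0, a_eff=0.05, pt=40.0))  # 0.05
```

Dividing by $p_\mathrm{T}$ makes a single threshold usable across the full electron energy range.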

Electron identification using multivariate techniques
To further improve the performance of the electron identification, especially at $p_\mathrm{T}$ below 40 GeV, several variables are combined using a BDT. The set of observables is extended relative to the simpler sequential selection: the track-cluster matching observables are computed both at the ECAL surface and at the vertex. More cluster-shape and track-quality variables are also used. The fractional difference between the track momentum at the innermost and at the outermost tracker layer, $f_\mathrm{brem}$, is also included. Similar sets of variables are used for electrons in the barrel and in the endcaps. Electron candidates in DY + jets simulated events with $p_\mathrm{T}$ greater than 5 GeV and $|\eta| < 2.5$ are used to train several BDTs in bins of $p_\mathrm{T}$ and $\eta$ with the XGBoost algorithm [42]. The splits in pseudorapidity are at the barrel-endcap transition and at $|\eta| = 0.8$, because the tracker material budget increases steeply beyond this point. The split in $p_\mathrm{T}$ is at 10 GeV, allowing for a dedicated training in a region where the composition and amount of background are different. Signal electrons are defined as reconstructed electrons that match generated prompt electrons within a cone of size $\Delta R = 0.1$. Background electrons are defined as all reconstructed electrons that either match generated nonprompt electrons (usually electrons from hard jets) or that do not match any generated electron. Reconstructed electrons that match generated electrons from $\tau$ leptonic decays are considered neither signal nor background. For maximum flexibility at analysis level, the electron identification BDTs are trained both with and without the isolation variables. The performance of the BDT-based identification is reported in figure 25, and compared with a sequential selection in which the BDT and isolation selection requirements are applied one after the other.
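The binning of the trainings can be sketched as a simple category assignment. The splits at $p_\mathrm{T} = 10$ GeV and $|\eta| = 0.8$ follow the text; the barrel-endcap boundary value of 1.479 is an assumption on our part:

```python
def bdt_training_category(pt, abs_eta, eb_ee_boundary=1.479):
    """Assign an electron to one of the six BDT training bins.

    pT is split at 10 GeV (different background composition and rate),
    |eta| at 0.8 (the tracker material budget increases steeply beyond
    this point) and at the barrel-endcap transition (assumed here to
    be at |eta| = 1.479). Returns an integer category in 0-5.
    """
    pt_bin = 0 if pt < 10.0 else 1
    if abs_eta < 0.8:
        eta_bin = 0
    elif abs_eta < eb_ee_boundary:
        eta_bin = 1
    else:
        eta_bin = 2
    return 3 * pt_bin + eta_bin

print(bdt_training_category(7.0, 0.3))   # 0: low-pT, central barrel
print(bdt_training_category(35.0, 2.1))  # 5: high-pT, endcap
```

Training one BDT per category lets each model adapt to the locally different material budget and background mix.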
The BDT trained with isolation variables shows a clear advantage over the sequential approach, especially for high selection efficiencies. Whereas the BDT is optimized on background from DY + jets simulated events, the cut-based identification is optimized on background from $\mathrm{t\overline{t}}$ events, which contain a much higher fraction of nonprompt electrons. This choice optimizes the performance of the identification algorithms for the analyses in which they are applied, but prevents a like-for-like comparison between the two algorithms.
Although the BDT-based identifications have better background rejection for a given signal efficiency than the cut-based identifications, a significant fraction of physics analyses still prefer the cut-based identifications because it is easy to flip or undo a specific cut to perform a sideband study. This is particularly true for searches focusing on high-$p_\mathrm{T}$ ranges, where the background is so low that an improved electron identification does not bring any sizeable gain, and these analyses profit from the simplicity and flexibility of the cut-based identifications.

High-energy electron identification
The CMS experiment employs a dedicated cut-based identification method for the selection of high-$p_\mathrm{T}$ electrons, known as the high-energy electron pairs (HEEP) selection. Variables similar to those used for the general cut-based electron identification are used to select high-$p_\mathrm{T}$ electrons, starting at 35 GeV and extending up to about 2 TeV or more. This selection requires that the lateral spread of energy deposits in the ECAL is consistent with that of a single electron, and that the track is matched to the ECAL deposits and is consistent with a particle originating from the nominal interaction point. The associated energy in the HCAL around the electron direction must be less than 5% of the reconstructed energy of the electron, once the noise and pileup contributions are included.
The main difference between the high-$p_\mathrm{T}$ electron identification and the cut-based electron identification is the use of subdetector-based isolation instead of PF isolation. Although the two algorithms are expected to provide similar performance, the detector-based isolation behavior is better suited for high-$p_\mathrm{T}$ electrons.

Figure 25. Performance of the electron BDT-based identification algorithm with (red) and without (green) the isolation variables, compared with an optimized sequential selection using the BDT without the isolation variables followed by a selection requirement on the combined isolation (blue). Electrons are selected for the BDT training with a $p_\mathrm{T}$ of at least 20 GeV.
Very high-energy electrons can lead to saturation of the ECAL electronics. In the presence of a saturated crystal the shower-shape variables become biased, so the requirements on lateral shower-shape variables, for example the ratios of the energy collected in $n \times m$ arrays of crystals such as $E_{2\times5}/E_{5\times5}$, as well as $\sigma_{i\eta i\eta}$, are disabled if a saturated crystal occurs within the 5×5 crystal matrix around the central crystal. The selection requires that the electron be isolated in a cone of radius $\Delta R = 0.3$ in both the calorimeters ($I_\mathrm{ECAL}$ and $I_\mathrm{HCAL}$) and the tracker ($I_\mathrm{tracker}$). The HCAL subdetector had two readout depths available in the endcap regions in Run 2. Only the first longitudinal depth is used for the HCAL isolation, because the second one suffered from higher detector noise.
Only well-measured tracks that are consistent with originating from the same vertex as the electron are included in the isolation sum. Moreover, in the barrel, the $E_{2\times5}/E_{5\times5}$ and $E_{1\times5}/E_{5\times5}$ variables are used, since they are very effective at high $p_\mathrm{T}$.
As mentioned in section 4.3, as the $p_\mathrm{T}$ of electrons increases, they are less likely to be seeded by the tracker-driven approach. Electrons found only by the tracker-driven seeding are therefore rejected in the high-$p_\mathrm{T}$ electron identification algorithm, which is mostly meant for high-energy electrons. Requirements are applied on the minimum number of hits the electron leaves in the inner tracker and on the impact parameter relative to the center of the luminous region in the transverse plane ($d_{xy}$). This selection, which is about 90% efficient for electrons with $p_\mathrm{T} > 50$ GeV, is used in many searches for exotic particles published by CMS in Run 2 [43]. A summary of the requirements applied in the HEEP identification algorithm is shown in table 7. These selection criteria are valid for the entirety of Run 2.

Selection efficiency and scale factors
The electron and photon identification efficiencies, as well as the electron reconstruction efficiency, are measured in data using a tag-and-probe technique that utilizes Z → ee events [4], described in detail in section 4.7. The identification efficiency is measured for transverse energies above 10 GeV for electrons and above 20 GeV for photons. For the photon identification efficiency, no requirement is applied on the track and charge of the probe, so that electrons from Z boson decays are effectively identified as photons. The performance achieved for the two reference electron selections is shown in figure 26. The electron identification efficiency in data (upper panels) and the data-to-simulation efficiency ratios (lower panels) for the cut-based identification at the veto working point (left), and for the BDT-based identification at the loosest working point (right), are shown as functions of the electron $p_\mathrm{T}$. The efficiency is shown in four $|\eta|$ ranges. The vertical bars on the data-to-simulation ratios represent the combined statistical and systematic uncertainties. For the three years of data-taking, efficiencies and scale factors are measured with total uncertainties of the same order of magnitude. For 2017 data, where the latest calibrations were used to reconstruct the data, the measured data-to-simulation efficiency ratios are closer to unity by 3-5% over the entire energy range compared with 2016 and 2018 data.
The photon identification efficiency in data and the data-to-simulation efficiency ratios are shown in figure 27, for the loose cut-based (left) and loosest BDT-based (right) identification working points, as functions of the photon $p_\mathrm{T}$. The efficiency is shown in four $|\eta|$ ranges. The vertical bars on the data-to-simulation ratio represent the combined statistical and systematic uncertainties.
The electron and photon identification efficiency in data is very well described by the MC simulations, as reflected in the fact that the ratios are within 5% of unity for all the cut-based and BDT-based identification working points. Some effects are very difficult to simulate, for example the variation of the ECAL noise with time, which affects the $\sigma_{i\eta i\eta}$ variable for low- and medium-$p_\mathrm{T}$ electrons and photons in the endcaps. Such effects are accounted for by the correction factors.

Figure 27. Photon identification efficiency in data (upper panels) and data-to-simulation efficiency ratios (lower panels) for the loose cut-based (left) and loosest BDT-based (right) identification working points, as functions of the photon $p_\mathrm{T}$. The vertical bars on the markers represent combined statistical and systematic uncertainties.

Performance of recalibrated data sets

JINST 16 P05014
The Legacy reprocessing of the Run 2 data used an improved calibration of the subdetector components and the related physics objects. The simulation was also improved with a more accurate description of the data in terms of dynamic inefficiencies, radiation damage, and detector noise. Electron and photon reconstruction and identification performance depends strongly on such improvements in the ECAL measurements. An updated method to monitor and correct for the loss of ECAL crystal transparency due to radiation damage was introduced. In parallel, a more granular calibration of the crystals was performed, making it possible to precisely calibrate even the highest-pseudorapidity region of the calorimeter. All these actions have led to a better resolution and a better agreement between data and simulation. Figure 28 shows the improvement in resolution brought by the Legacy calibration of the ECAL for low-bremsstrahlung electrons ($R_9 > 0.94$) as a function of pseudorapidity. This resolution is estimated after all the corrections described in section 6 are applied, including the scale and spreading corrections of section 6.2. The relative resolution improves as a function of $|\eta|$, by more than 50% for $|\eta| > 2.0$. The improved detector calibration and simulation have also led to an improved agreement between data and simulation. Figure 29 shows the improvement in the PF-based relative neutral hadron isolation in the barrel. These results are obtained using Z → ee electrons, as described in section 7.1.3. Figure 30 shows the improvement in the electron reconstruction data-to-simulation efficiency correction factors: the magnitude of the correction factors in the bottom panel is below 2% for the Legacy calibration, compared with 3% for the EOY calibration. The number of misreconstructed electron candidates per event, reported in figure 31, shows a slight decrease in the Legacy data set, owing to the better conditions and calibrations. The energy resolution throughout the entire Run 2 period ranges from 1 to 3.4%, depending on the region considered.
In general, the agreement between data and simulation depends on the noise modeling in simulation and energy calibration for that object. Better calibration of the 2017 data, together with a more appropriate simulation of the noise levels in MC, have led to a better description of isolation variables.
The identification efficiencies for 2016, 2017, and 2018 are shown in figure 34 for the loose cut-based electron identification requirements. The efficiencies are stable within 5% over the full electron $p_\mathrm{T}$ range across the full ECAL. The correction factors are also stable within 3% over the three years.

Timing performance
In addition to the energy measurement, the ECAL provides a time of arrival for electromagnetic energy deposits, which can separate prompt electrons and photons from backgrounds with a broader time-of-arrival distribution. The fast decay time of the PbWO$_4$ ECAL crystals, comparable to the LHC bunch crossing interval (80% of the light is emitted in 25 ns), together with the use of electronic pulse shaping with a high sampling rate, provides excellent timing resolution [44]. The better the precision and synchronization of the timing measurement, the larger the rejection of the background. Background sources with a broad time distribution include cosmic rays, beam halo muons, electronic noise, spikes (hadrons that directly ionize the EB photodetectors), and out-of-time pileup. The ECAL timing performance was measured prior to data-taking using electrons from test beams, cosmic muons, and beam splash events [44]. The resolution for large energy deposits ($E > 50$ GeV) was estimated to be better than 30 ps, and the linearity of the time response was also verified. During collisions in the LHC, there are many additional effects that could worsen the performance, such as residual timing jitter in the electronics or in the clock distribution to the individual readout units, run-by-run variations, intercalibration effects, energy-dependent systematic uncertainties, and crystal damage due to radiation. A detailed description of the method used to measure the ECAL timing with the crystals is given in the next section.

Time resolution measurement using Z → ee events
The method used to extract the electron and photon time resolutions with the ECAL detector is based on comparisons of the time of arrival of the two electrons arising from Z boson decays. The time of arrival of each electron is the measured time of the most energetic hit of the energy deposit in the ECAL. This time is corrected for the electron time-of-flight, which is determined from the primary vertex position, obtained from the electron track. This correction is needed because the timestamp recorded by the ECAL crystal assumes a time-of-flight from the origin of the detector, such that the distribution of the most energetic hit time is centered around zero for all crystals.
The two electron clusters are required to be in the EB and to pass loose identification criteria on the cluster shape. Their resulting invariant mass has to be consistent with the Z boson mass ($60 < m_\mathrm{ee} < 120$ GeV). The energy of each of the two hits must fall within the range $10 < E < 120$ GeV. The lower threshold is motivated by the minimal energy constraint applied to reconstruct good-quality electrons, whereas the upper threshold is applied to include only ECAL signals below the lowest gain switch threshold described in section 6.2. The resulting resolution for the full Run 2 inclusive data set is shown in figure 35 as a function of the effective energy of the dielectron system, which depends on the individual energies of the two electrons measured in the two seed crystals as $E_\mathrm{eff} = E_1 E_2 / \sqrt{E_1^2 + E_2^2}$. For 2017 data, the Legacy calibration is used. The resolution is extracted as the $\sigma$ parameter of a Gaussian fit to the core of the distribution of the time difference between the two electrons. The trend of the ECAL timing resolution as a function of $E_\mathrm{eff}$ is modeled as $\sigma(t_1 - t_2) = N/E_\mathrm{eff} \oplus \sqrt{2}\,C$, where $N$ represents the noise term and $C$ is the constant term, which dominates at energies above 30-40 GeV. The noise term is very similar to that obtained prior to collisions [44], and the constant term is about 200 ps.
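The resolution model and the effective-energy definition can be written out directly. This is a sketch with illustrative parameter values; in practice $N$ and $C$ come from the fit described above:

```python
import math

def e_eff(e1, e2):
    """Effective energy of the dielectron system,
    E_eff = E1 * E2 / sqrt(E1^2 + E2^2)."""
    return e1 * e2 / math.hypot(e1, e2)

def timing_resolution(eeff, n, c):
    """sigma(t1 - t2) = N / E_eff (+) sqrt(2) * C, summed in quadrature.

    N is the noise term; the constant term C (about 200 ps in Run 2)
    dominates for E_eff above 30-40 GeV. The sqrt(2) reflects the two
    independent single-electron measurements entering the difference.
    """
    return math.hypot(n / eeff, math.sqrt(2.0) * c)

# For two 50 GeV seed hits, E_eff = 2500 / sqrt(5000) ~ 35.4 GeV; with
# an illustrative N = 3 GeV*ns and C = 0.2 ns, the constant term dominates.
ee = e_eff(50.0, 50.0)
print(round(ee, 1))                              # 35.4
print(round(timing_resolution(ee, 3.0, 0.2), 3)) # 0.295
```

Dividing the fitted width of the $t_1 - t_2$ distribution by $\sqrt{2}$ then gives the single-electron timing resolution.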

Electron and photon reconstruction performance in PbPb collisions
The quark-gluon plasma (QGP), a deconfined state of matter that is predicted [48] by quantum chromodynamics to exist at high temperatures and energy densities, is expected to be produced by colliding heavy nuclei at the LHC. Parton scattering reactions with large momentum transfer, which occur very early compared to QGP formation, provide tomographic probes of the plasma [49]. The outgoing partons interact strongly with the QGP and lose energy. This phenomenon has been observed at BNL RHIC [50-53] and at CERN LHC [54-58] using measurements of hadrons with high $p_\mathrm{T}$ and of jets, both created by the fragmentation of the high-momentum partons. Electroweak bosons such as photons and Z bosons that decay into leptons do not interact strongly with the QGP. The electroweak boson $p_\mathrm{T}$ reflects, on average, the initial energy of the associated parton that fragments into a jet, before any energy loss has occurred. Hence, measurements of jets produced in the same hard scattering as a photon or a Z boson have, in contrast to dijet measurements, a controlled configuration of the initial hard scattering.
The degree of overlap of the two colliding heavy nuclei (e.g., Pb) is defined using signals in the HF calorimeters, and is known as "centrality". Centrality is determined by the fraction of the total energy deposited in the HF, with 0% centrality corresponding to the most central collisions.
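This percentile definition can be sketched as follows: an event's centrality is the fraction of minimum-bias events with a larger total HF energy deposit. This is a simplified illustration; the actual CMS calibration uses a dedicated binning and efficiency corrections:

```python
import bisect

def centrality_percent(hf_energy, minbias_hf_sorted):
    """Centrality (in %) as the fraction of minimum-bias events whose total
    HF energy exceeds this event's; 0% = most central (largest deposit).
    `minbias_hf_sorted` is an ascending list of HF energy sums."""
    n = len(minbias_hf_sorted)
    n_above = n - bisect.bisect_right(minbias_hf_sorted, hf_energy)
    return 100.0 * n_above / n
```

For example, an event with the largest HF energy in the reference sample is assigned 0% centrality, while very peripheral (low-multiplicity) events sit near 100%.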
The typical particle multiplicity in central PbPb collisions is O(10^4), giving rise to a dense underlying event (UE). For this reason, the reconstruction, identification, and energy correction algorithms must be revised and optimized to perform in the extreme conditions of central PbPb collisions. The PbPb collisions were recorded in 2018 at a nucleon-nucleon center-of-mass energy of √s_NN = 5.02 TeV, corresponding to an integrated luminosity of 1.7 nb^−1.

Electron and photon reconstruction
Several changes have been made to the photon and electron reconstruction with respect to the algorithms used in pp collisions. The out-of-time pileup in PbPb collisions is negligible, hence out-of-time hits and photons were excluded from the reconstruction. The PF ECAL clustering algorithm described in section 4.2 uses a dynamic window in the φ direction that depends on the seed E_T, to recover the shower spreads in φ, which are considerable at low E_T. When applied to PbPb events, with their much denser underlying event, this algorithm gives worse energy resolution and higher misidentification rates than in pp collisions. To improve the performance, an upper bound of 0.2 was imposed on the extent of the SC in the φ direction. Following these changes, the misidentification rate decreased from 2.7% to 0.5% for photons with 40 < E_T < 60 GeV, and the energy resolution, estimated from the effective width of the distribution of the ratio of the uncorrected SC energy (E_SC,uncorr.) to the true energy (E_gen), improved: as shown in figure 36, the modified fixed-width PF algorithm results in an energy resolution between 8% and 3% at E_T = 20 and 100 GeV, respectively. In the simulation, the effect of the PbPb UE is modeled by embedding the pythia 8 output, created with the CMS CP5 set of parameters [24], in events generated using hydjet [59], which is tuned to reproduce the charged-particle multiplicity and p_T spectrum in PbPb data. This embedding is applied with an event-by-event weight factor, based on the average number of nucleon-nucleon collisions calculated with a Glauber model [60] for each PbPb centrality interval.
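The effect of capping the SC extent in φ can be sketched as follows. The seed-E_T-dependent half-window below is a hypothetical functional form chosen only for illustration (it is not the CMS parameterization); the 0.2 cap is the modification described above:

```python
def phi_half_window(seed_et, cap=0.2):
    """Hypothetical dynamic phi half-window that widens at low seed ET
    (illustrative form only), truncated at `cap` for PbPb reconstruction."""
    dynamic = 0.05 + 0.6 / max(seed_et, 1.0)
    return min(dynamic, cap)

def joins_supercluster(dphi_to_seed, seed_et):
    """A satellite cluster is merged into the SC only inside the capped window."""
    return abs(dphi_to_seed) <= phi_half_window(seed_et)
```

At low seed E_T the uncapped window would open well beyond 0.2 and sweep in UE energy, which is why the cap improves both the resolution and the misidentification rate in central PbPb events.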
In PbPb collisions, the large particle multiplicities involved often result in excessively long reconstruction times. As a result, the following modifications were made to the PbPb reconstruction algorithm to keep the reconstruction time at a reasonable level: (i) the tracker-driven electron seeds were removed, (ii) the tracking regions were changed to be centered in a narrow region around the primary vertex, and (iii) the SC energy was required to be at least 15 GeV. These changes reduced the overall reconstruction time by a factor of more than 5. The reconstruction performance for electrons with E_T > 20 GeV, the kinematic region of interest to the majority of analyses, was not affected.

Electron identification and selection efficiency
As described in section 7.3, several strategies are used in CMS to identify prompt isolated electrons and to separate them from background sources, such as photon conversions, jets misidentified as electrons, or electrons from semileptonic decays of c or b quarks. A cut-based technique was chosen for the electron identification in PbPb collisions, using shower-shape and track-related variables to separate the signal from the background. The selection requirements are optimized using the TMVA framework, with the working point target efficiencies remaining the same as in pp collisions. The input variables are σ_ηη, the H/E ratio computed from a single tower, 1/E − 1/p, |Δη_in^seed| between the ECAL seed crystal and the associated track, and |Δφ_in| between the ECAL SC and the associated track. An optimization is performed in two centrality bins (0-30% (central) and 30-100% (peripheral)), since most of the included variables are centrality dependent. Variables that do not depend on centrality, i.e., the number of expected missing inner hits and the three-dimensional impact parameter, were optimized in a second step.
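A minimal sketch of such a centrality-binned cut-based selection is given below; the threshold values are illustrative placeholders, not the optimized CMS working points:

```python
# Placeholder thresholds for the two centrality bins; NOT the CMS values.
CUTS = {
    "central":    {"sigma_eta_eta": 0.013, "h_over_e": 0.16,
                   "abs_inv_e_minus_inv_p": 0.30,
                   "abs_d_eta_seed_in": 0.008, "abs_d_phi_in": 0.10},
    "peripheral": {"sigma_eta_eta": 0.011, "h_over_e": 0.12,
                   "abs_inv_e_minus_inv_p": 0.25,
                   "abs_d_eta_seed_in": 0.006, "abs_d_phi_in": 0.08},
}

def pass_cut_based_id(variables, centrality):
    """Apply centrality-dependent cuts; track-matching variables are assumed
    already folded into absolute values. `centrality` is in percent."""
    bin_name = "central" if centrality < 30 else "peripheral"
    cuts = CUTS[bin_name]
    return all(variables[name] < threshold for name, threshold in cuts.items())
```

Tighter thresholds in the peripheral bin reflect the smaller UE contamination there; in the real optimization each threshold is tuned to hit the same target efficiency as the pp working points.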
The efficiency of the electron reconstruction and identification selection requirements is estimated in data and simulation using the tag-and-probe method, as described in section 4.7. Events are required to pass standard calorimeter noise filters, to fire the single-electron HLT with an E_T threshold of 20 GeV, and to have a primary vertex position |z| < 15 cm. The event must contain at least two reconstructed electrons. Each electron has to be within the acceptance region (20 < E_T < 200 GeV, |η| < 2.1), and must not be in the barrel-endcap transition region or in the problematic HCAL region for 2018 PbPb data: during the 2018 data-taking period a 40° section of one end of the hadron endcap calorimeter lost power, affecting the entire PbPb data set. Tag-and-probe electrons are defined as described in section 4.7. The tag-and-probe pairs are required to be oppositely charged and to have an invariant mass in the range 60-120 GeV.
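Once the passing and failing probes are counted (in practice the yields come from fits to the tag-probe invariant mass), the efficiency and the data-to-simulation correction factor follow directly. A minimal sketch with a simple binomial uncertainty:

```python
import math

def tnp_efficiency(n_pass, n_fail):
    """Efficiency from passing/failing probe yields, with a plain binomial
    uncertainty (the real analysis extracts yields from mass fits)."""
    n_total = n_pass + n_fail
    eff = n_pass / n_total
    err = math.sqrt(eff * (1.0 - eff) / n_total)
    return eff, err

def data_to_mc_correction(eff_data, eff_mc):
    """Scale factor applied to simulated events."""
    return eff_data / eff_mc
```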
For the loose identification working point, the data-to-simulation correction factor is smaller than 3%, both in the barrel and in the endcaps.
Several sources of systematic uncertainty are considered. The main uncertainty is related to the model used in the mass fit, and is estimated by comparing alternative distributions for signal and background. The second most important uncertainty is related to the tag requirement, varied from the tight to the medium working points. The total systematic uncertainty in the loose identification working point data-to-simulation correction factor is 2.0-4.5 (2.0-7.5)% in the barrel (endcaps).

Electron energy corrections
In heavy ion collisions, the UE activity can vary greatly between the most central and the most peripheral collisions. The additional energy deposited by particles from the UE in the ECAL can be clustered together with the energy deposited by genuine electrons, and thus affect the energy scale of reconstructed electrons in a centrality-dependent manner. The electron energy scale is studied using control samples in data (as described in section 6.1) based on the invariant mass of the Z boson, which is known precisely; the observed scale is within 5% of that in the MC simulation. The electron energy scale and resolution extracted from this study are used to correct the energy scale, and to smear the electron energy resolution in simulated samples, to match those observed in data.
The invariant mass of dielectron pairs from Z → ee decays is constructed from the ECAL energy component in three categories corresponding to the detector regions in which the two electrons are reconstructed, namely the EB and EE regions. The events are further subdivided into three centrality regions: 0-10, 10-30, and 30-100%. The electrons are required to have a minimum E_T of 20 GeV, to pass the loose identification selection, and to be located outside the ECAL transition region and the problematic HCAL region. The invariant mass distributions are fitted with a DSCB distribution, from which the mean values are extracted. This is performed separately for data and simulation, and the ratio of the extracted mean values to the world-average Z boson mass [36] is used as a correction factor applied to the mean energy scale. The energy resolutions are extracted after first applying the scale factors derived to shift the invariant mass distributions back to the nominal Z boson mass. The energy scale and resolution spreading correction factors are applied to the ECAL energy component of the reconstructed electrons, with the final electron momentum obtained by redoing the ECAL-tracker recombination. The first two sources of systematic uncertainty are evaluated by constructing the invariant mass distributions after varying the electron selection criteria. The variations considered are to tighten the selection criteria from the loose to the medium working point, and to increase the electron E_T threshold from 20 to 25 GeV. The difference between the mean values of the nominal and the varied distributions is used as an estimate of the systematic uncertainty. The residual discrepancy between the corrected and the nominal Z boson mass is also taken as a systematic uncertainty, and is smaller than 1 (3)% in the barrel (endcap) region.
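The scale and spreading corrections described above can be sketched as follows: the scale factor is the ratio of the world-average Z boson mass to the fitted peak mean, and the spreading is a Gaussian smearing of the simulated energy (a simplified one-parameter illustration of the procedure):

```python
import random

M_Z = 91.1876  # world-average Z boson mass in GeV

def scale_correction(fitted_peak_mean):
    """Multiplicative factor bringing the fitted dielectron mass peak
    back to the nominal Z boson mass."""
    return M_Z / fitted_peak_mean

def spread_energy(energy, extra_resolution, rng):
    """Gaussian 'resolution spreading' applied to a simulated electron energy
    so that the simulated width matches the one observed in data."""
    return energy * rng.gauss(1.0, extra_resolution)
```

In the actual procedure these corrections are applied to the ECAL energy component only, after which the ECAL-tracker combination is redone.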
A comparison of the Z → ee invariant mass peak between data and simulated Drell-Yan events generated with MadGraph5_aMC@NLO [21] is shown in figure 37. The electron energy in simulation has been corrected using the scale corrections and resolution spreading, and electron reconstruction and identification efficiency corrections have also been applied.

Summary
The performance of electron and photon reconstruction and identification in CMS during LHC Run 2 was measured using data collected in proton-proton collisions at √s = 13 TeV in 2016-2018, corresponding to a total integrated luminosity of 136 fb^−1.
A clustering algorithm developed to cope with the increasing pileup conditions is described, together with the use of the new pixel detector with one more layer and a reduced material budget. These are major changes affecting the electron and photon performance with respect to Run 1.
Multivariate algorithms are used to correct the electron and photon energy measured in the electromagnetic calorimeter (ECAL), as well as to estimate the electron momentum by combining independent measurements in the ECAL and in the tracker. The overall energy scale and resolution are both calibrated using electrons from Z → ee decays. The uncertainty in the electron and photon energy scale is within 0.1% in the barrel, and 0.3% in the endcaps, in the transverse energy (E_T) range from 10 to 50 GeV. The stability of this calibration is estimated to be within 2-3% for higher energies. The measured energy resolution for electrons produced in Z boson decays ranges from 2 to 5%, depending on electron pseudorapidity and energy loss through bremsstrahlung in the detector material. The energy scale and resolution corrections have been checked for photons using Z → μμγ events and are adequate within the assigned systematic uncertainties. The performance of electron and photon reconstruction and identification algorithms in data is studied with a tag-and-probe method using Z → ee events. Good agreement is observed between data and simulation for most of the variables relevant to both reconstruction and identification. The reconstruction efficiency in data is better than 95% in the E_T range from 10 to 500 GeV. The data-to-simulation efficiency ratios, both for electron reconstruction and for the various electron and photon selections, are compatible with unity within 2% over the full E_T range, down to an E_T as low as 10 (20) GeV for electrons (photons) when using the 2017 data reconstructed with the dedicated Legacy calibration. Identification efficiencies target three working points with selection efficiencies of 70, 80, and 90%, respectively.
The energy resolution and energy scale measurements, together with the relevant identification efficiencies, remain stable throughout the full Run 2 data-taking period (2016-2018). For the 2017 data-taking period, the dedicated Legacy calibration brings an improvement of up to 50% in terms of relative energy resolution in the ECAL, as well as an improved agreement between data and simulation, leading to smaller reconstruction and identification efficiency corrections over the entire E_T and η ranges. As a result of these calibrations, the electron and photon reconstruction and identification performance in Run 2 is similar to that in Run 1, despite the increased pileup and radiation damage. The evident success of the dedicated Legacy calibration of the 2017 data motivates a plan to pursue the same techniques for the 2016 and 2018 data.
The ECAL timing resolution is crucial at CMS to suppress noncollision backgrounds, as well as to perform dedicated searches for delayed photons or jets predicted in several models of physics beyond the standard model. A global timing resolution of 200 ps is measured for electrons from Z decays with the full Run 2 collision data.
Excellent performance in electron and photon reconstruction and identification has also been achieved in the case of lead-lead collisions at √s_NN = 5.02 TeV in 2018, corresponding to a total integrated luminosity of 1.7 nb^−1. Reconstruction, identification, and energy correction algorithms have been revised and optimized to perform in the extreme conditions of high underlying event activity in central lead-lead collisions. For electrons and photons reconstructed in lead-lead collisions, the uncertainty in the energy scale is estimated to be better than 1 (3)% in the barrel (endcap) region.