Electron and photon performance measurements with the ATLAS detector using the 2015-2017 LHC proton-proton collision data

This paper describes the reconstruction of electrons and photons with the ATLAS detector, as employed for measurements and searches exploiting the complete LHC Run 2 dataset. An improved energy clustering algorithm is introduced, and its implications for the measurement and identification of electrons and photons are discussed in detail. Corrections and calibrations that affect performance, including energy calibration, identification and isolation efficiencies, and the measurement of the charge of reconstructed electron candidates are determined using up to 81 fb$^{-1}$ of proton-proton collision data collected at $\sqrt{s}=$ 13 TeV between 2015 and 2017.


Introduction
With an integrated luminosity of about 147 fb$^{-1}$, the proton-proton (pp) collision dataset collected by the ATLAS detector between 2015 and 2018 at a centre-of-mass energy of $\sqrt{s}=$ 13 TeV will allow significant advances in the exploration of the electroweak scale. Optimal performance in the measurement of electrons and photons plays a fundamental role in searches for new particles, in the measurement of Standard Model cross-sections, and in the precise measurement of the properties of fundamental particles such as the Higgs and W bosons and the top quark.
The ATLAS collaboration published three papers describing the performance of the reconstruction, identification and energy measurement of electrons and photons with 36 fb$^{-1}$ of pp collision data collected in 2015 and 2016 [1][2][3]. New algorithms for electron and photon reconstruction were introduced in 2017. The present paper describes the performance of these algorithms, and extends the analysis to the dataset collected between 2015 and 2017, which corresponds to an integrated luminosity of about 81 fb$^{-1}$. The discussion is limited to electrons and photons reconstructed in the central calorimeters, covering the pseudorapidity range |η| < 2.5.
The transition from the reconstruction of electrons and photons based on fixed-size clusters of calorimeter cells towards a dynamical, topological cell clustering algorithm [4] represents the most important modification. The algorithms used for the identification of the candidates and the estimation of their energy have been updated accordingly. The performance of these changes is discussed in detail. In addition, methods allowing an improved rejection of misreconstructed or non-isolated candidates are presented, and are of particular importance for measurements of processes with low cross-sections or high backgrounds, such as the associated production of a Higgs boson with a top-quark pair, or vector-boson scattering at high energy.
After a summary of the experimental apparatus and the samples used for this analysis in sections 2 and 3, section 4 describes the new reconstruction of clusters of energy deposits in the electromagnetic (EM) calorimeter, the estimation of their energy, and the use of information from the inner tracking detector to distinguish between electrons and photons. Section 5 summarizes the energy calibration corrections and the associated systematic uncertainties. Sections 6 and 7 present the re-optimized electron and photon identification algorithms. Section 8 discusses the discrimination between prompt electrons and photons and backgrounds from hadron decays. Finally, studies dedicated to the electron and positron charge identification are reported in section 9.

Topo-cluster reconstruction
The topo-cluster reconstruction algorithm [4,26] begins by forming proto-clusters in the EM and hadronic calorimeters using a set of noise thresholds in which the cell initiating the cluster is required to have significance $|\varsigma^{\mathrm{EM}}_{\mathrm{cell}}| \geq 4$, where $\varsigma^{\mathrm{EM}}_{\mathrm{cell}} = E^{\mathrm{EM}}_{\mathrm{cell}} / \sigma^{\mathrm{EM}}_{\mathrm{noise,cell}}$, $E^{\mathrm{EM}}_{\mathrm{cell}}$ is the cell energy at the EM scale4 and $\sigma^{\mathrm{EM}}_{\mathrm{noise,cell}}$ is the expected cell noise. The expected cell noise includes the known electronic noise and an estimate of the pile-up noise corresponding to the average instantaneous luminosity expected for Run 2. In this initial stage, cells from the presampler and the first LAr EM calorimeter layer are excluded from initiating proto-clusters, to suppress the formation of noise clusters. The proto-clusters then collect neighbouring cells with significance $|\varsigma^{\mathrm{EM}}_{\mathrm{cell}}| \geq 2$. Each neighbour cell passing the threshold of $|\varsigma^{\mathrm{EM}}_{\mathrm{cell}}| \geq 2$ becomes a seed cell in the next iteration, collecting each of its neighbours in the proto-cluster. If two proto-clusters contain the same cell with $|\varsigma^{\mathrm{EM}}_{\mathrm{cell}}| \geq 2$ above the noise threshold, these proto-clusters are merged.
4The EM scale is the basic signal scale accounting correctly for the energy deposited in the calorimeter by electromagnetic showers.
A crown of nearest-neighbour cells is added to the cluster independently of their energy. In the presence of negative-energy cells induced by the calorimeter noise, the algorithm uses $|\varsigma^{\mathrm{EM}}_{\mathrm{cell}}|$ instead of $\varsigma^{\mathrm{EM}}_{\mathrm{cell}}$ to avoid biasing the cluster energy upwards, which would happen if only positive-energy cells were used. This set of thresholds is commonly known as '4-2-0' topo-cluster reconstruction. Proto-clusters with two or more local maxima are split into separate clusters; a cell is considered a local maximum when it has $E^{\mathrm{EM}}_{\mathrm{cell}} > 500$ MeV, at least four neighbours, and when none of the neighbours has a larger signal.
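The 4-2-0 growth described above can be illustrated with a minimal sketch. It assumes a toy two-dimensional grid of cells with eight neighbours each (the real calorimeter geometry is three-dimensional), assigns shared crown cells to the first cluster that reaches them, and omits the local-maximum splitting step. The function name and the grid layout are hypothetical and are not taken from the ATLAS software.

```python
from collections import deque


def topo_clusters_420(energy, noise, seed_thr=4.0, grow_thr=2.0):
    """Toy 4-2-0 clustering on a 2D grid (hypothetical geometry)."""
    n_rows, n_cols = len(energy), len(energy[0])
    # Absolute significance |E / sigma_noise|, so negative cells can also grow clusters.
    sig = [[abs(energy[r][c] / noise[r][c]) for c in range(n_cols)]
           for r in range(n_rows)]

    def neighbours(r, c):
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                rr, cc = r + dr, c + dc
                if (dr or dc) and 0 <= rr < n_rows and 0 <= cc < n_cols:
                    yield rr, cc

    # Seed cells (|significance| >= 4), most significant first.
    seeds = sorted(
        ((r, c) for r in range(n_rows) for c in range(n_cols)
         if sig[r][c] >= seed_thr),
        key=lambda rc: -sig[rc[0]][rc[1]],
    )

    owner = {}      # cell -> index of the cluster that claimed it
    clusters = []
    for seed in seeds:
        if seed in owner:
            # Already absorbed by an earlier cluster: equivalent to a merge.
            continue
        index = len(clusters)
        cells = {seed}
        owner[seed] = index
        queue = deque([seed])
        while queue:
            cell = queue.popleft()
            for nb in neighbours(*cell):
                if nb in owner:
                    continue
                owner[nb] = index
                cells.add(nb)
                # Cells with |significance| >= 2 keep growing the cluster;
                # all other neighbours form the outer crown (threshold 0).
                if sig[nb[0]][nb[1]] >= grow_thr:
                    queue.append(nb)
        clusters.append(cells)
    return clusters
```

Because growth continues through every cell above the second threshold, two seeds connected by such cells end up in the same cluster, which reproduces the merging rule without a separate merging pass.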
Electron and photon reconstruction starts from the topo-clusters but only uses the energy from cells in the EM calorimeter, except in the transition region of 1.37 < |η| < 1.63, where the energy measured in the presampler and the scintillator between the calorimeter cryostats is also added. This is referred to as the EM energy of the cluster, and the EM fraction (f EM) is the ratio of the EM energy to the total cluster energy. Only clusters with EM energy greater than 400 MeV are considered. The distribution of f EM is shown in figure 2a and the electron reconstruction efficiency for various cuts on f EM is shown in figure 2b, for electron clusters which have been simulated with µ = 0, and for pile-up clusters. A preselection requirement of f EM > 0.5 was chosen for the initial topo-clusters, as it rejects ∼ 60% of pile-up clusters without affecting the efficiency for selecting true electron topo-clusters. 5 These clusters are referred to as EM topo-clusters in the rest of this paper.
5In the transition region, some topo-clusters are also selected as EM clusters, even if they fail the requirement on f EM , when they satisfy E T > 1 GeV, in order to increase the reconstruction efficiency in that region.
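As an illustration, the EM topo-cluster preselection can be written as a short predicate. This is a sketch under stated assumptions: the function and argument names are hypothetical, and it is assumed that the 400 MeV requirement on the EM energy also applies to clusters recovered through the transition-region exception, which the text does not specify.

```python
def is_em_topocluster(em_energy_mev, total_energy_mev, abs_eta, et_mev):
    """Toy EM topo-cluster preselection (hypothetical interface)."""
    # Only clusters with EM energy above 400 MeV are considered.
    if em_energy_mev <= 400.0 or total_energy_mev <= 0.0:
        return False
    f_em = em_energy_mev / total_energy_mev
    if f_em > 0.5:
        return True
    # Transition-region recovery: accept clusters failing f_EM if E_T > 1 GeV.
    # Assumption: the 400 MeV requirement above still applies here.
    in_transition = 1.37 < abs_eta < 1.63
    return in_transition and et_mev > 1000.0
```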

Track reconstruction, track-cluster matching, and photon conversion reconstruction
Track reconstruction for electrons is unchanged with respect to refs. [1,2]. A summary of the changes applied for photons is given below. Standard track-pattern reconstruction [27] is first performed everywhere in the inner detector. However, fixed-size clusters in the calorimeter that have a longitudinal and lateral shower profile compatible with that of an EM shower are used to create regions-of-interest (ROIs). If the standard pattern recognition fails for a silicon track seed (a set of silicon detector hits used to start a track) within an ROI, a modified pattern recognition algorithm based on a Kalman filter formalism [28] is used, allowing for up to 30% energy loss at each material intersection. Track candidates are then fitted with the global χ 2 fitter [29], allowing for additional energy loss when the standard track fit fails. Additionally, tracks with silicon hits loosely matched6 to fixed-size clusters are re-fitted using a Gaussian sum filter (GSF) algorithm [30], a non-linear generalization of the Kalman filter, for improved track parameter estimation.
The loosely matched, re-fitted tracks are then matched to the EM topo-clusters described above, extrapolating the track from the perigee to the second layer of the calorimeter, and using either the measured track momentum or rescaling the magnitude of the momentum to match the cluster energy. The momentum rescaling is performed to improve track-cluster matching for electron candidates with significant energy loss due to bremsstrahlung radiation in the tracker. A track is considered matched if, with either momentum magnitude, |∆η| < 0.05 and −0.10 < q · (φ track − φ clus ) < 0.05, where q refers to the reconstructed charge of the track. The requirement on q · (φ track − φ clus ) is asymmetric because tracks sometimes miss some energy from radiated photons that clusters measure.
6The match must be within |∆η| < 0.05 and −0.20 < q · (φ track − φ clus ) < 0.05 when using the track energy to extrapolate from the last inner detector hit, or |∆η| < 0.05 and −0.10 < q · (φ track − φ clus ) < 0.05 when using the cluster energy to extrapolate from the track perigee; q refers to the reconstructed charge of the track.
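The matching windows quoted above can be sketched as follows. The function names are hypothetical; the only ingredients beyond the text are the wrapping of the azimuthal difference into [−π, π) and the default arguments, which correspond to the tighter window used for the final track-cluster matching.

```python
import math


def wrap_phi(dphi):
    """Wrap an azimuthal difference into [-pi, pi)."""
    return (dphi + math.pi) % (2.0 * math.pi) - math.pi


def track_matches_cluster(eta_trk, phi_trk, eta_clus, phi_clus, charge,
                          phi_low=-0.10, phi_high=0.05, max_deta=0.05):
    """Charge-signed, asymmetric track-cluster matching window."""
    deta = abs(eta_trk - eta_clus)
    q_dphi = charge * wrap_phi(phi_trk - phi_clus)
    return deta < max_deta and phi_low < q_dphi < phi_high
```

A track would be accepted if this test passes for either extrapolation, the one using the measured momentum or the one with the momentum rescaled to the cluster energy. The loose preselection of footnote 6 corresponds to calling the same function with `phi_low=-0.20` for the extrapolation from the last inner detector hit.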
If multiple tracks are matched to a cluster, they are ranked as follows. Tracks with hits in the pixel detector are preferred, then tracks with hits in the SCT but not in the pixel detector. Within each category, tracks with a better ∆R match to the cluster in the second layer of the calorimeter are preferred, unless the differences are small (less than 0.01). The extrapolation of the track through the calorimeter is done first with the track momentum rescaled to the cluster energy and successively without rescaling. If both the first and the second extrapolation result in small ∆R differences, the track with more pixel hits is preferred, giving an extra weight to a hit in the innermost layer. The highest-ranked track is used to define the reconstructed electron properties.
The photon conversion reconstruction is largely unchanged from the method described in ref. [1]. Tracks loosely matched to fixed-size clusters serve as input to the reconstruction of the conversion vertex. Both tracks with silicon hits (denoted Si tracks) and tracks reconstructed only in the TRT (denoted TRT tracks) are used for the conversion reconstruction. Two-track conversion vertices are reconstructed from two opposite-charge tracks forming a vertex consistent with that of a massless particle, while single-track vertices are essentially tracks without hits in the innermost sensitive layers. To increase the converted-photon purity, the tracks used to build conversion vertices must have a high probability to be electron tracks as determined by the TRT [31]. The requirement is loose for Si tracks but tight for TRT tracks used to build double-track conversions, and even tighter for tracks used to build single-track conversions.
Changes were made with respect to the reconstruction software described in ref. [1], both to improve the reconstruction efficiency of double-track Si conversions (conversions reconstructed with two Si tracks), and to reduce the fraction of unconverted photons mistakenly reconstructed as single- or double-track TRT conversions (conversions reconstructed with one or two TRT tracks). The efficiency for double-track Si conversions was improved by modifying the tracking ambiguity processor, which determines which track seeds are retained to reconstruct tracks. For double-track conversion topologies, the two tracks are expected to be close to each other, parallel, and potentially to have shared hits, so that frequently only one track is reconstructed. The optimization in the ambiguity processor results in the recovery of the second track that was previously discarded. Overall, these modifications result in a 2-4% improvement in efficiency for double-track Si conversions, with larger improvements of up to 9% for photons with conversion radii larger than 200 mm. In addition to reconstructing the second track of what would otherwise have been single-track Si conversions, the overall conversion reconstruction efficiency is improved by about 1% by reducing the fraction of low-radius converted photons that are only reconstructed as electrons.
To reduce the fraction of unconverted photons reconstructed as double- or single-track TRT conversions, requirements on the TRT tracks were tightened. The tracks are required to have at least 30% precision hits, where a precision hit is defined as a hit with a track-to-wire distance within 2.5 times its uncertainty [32]. In addition, the requirement on the probability of a track to correspond to an electron, as determined by the TRT, was tightened to 0.75 for tracks used in double-track TRT conversions and to 0.85 for tracks used in single-track TRT conversions, compared with the previous requirement of 0.7 for tracks used in both conversion types. The fraction of unconverted photons erroneously reconstructed as converted photons is below 5% for events with µ < 60, improving by a factor of two compared to the previous algorithm.
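The tightened TRT track requirements can be summarized in a small predicate. This is a sketch only: the function and argument names are hypothetical, and the use of a strict inequality on the electron probability is an assumption, since the text does not say whether the thresholds are inclusive.

```python
def passes_trt_conversion_track(n_precision_hits, n_trt_hits,
                                electron_probability, single_track):
    """Toy selection of TRT tracks used to build TRT conversions."""
    if n_trt_hits <= 0:
        return False
    # At least 30% of the TRT hits must be precision hits.
    precision_fraction = n_precision_hits / n_trt_hits
    # Electron probability threshold: 0.85 for single-track and 0.75 for
    # double-track TRT conversions (strict inequality is an assumption).
    min_probability = 0.85 if single_track else 0.75
    return precision_fraction >= 0.30 and electron_probability > min_probability
```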
The conversion vertices are then matched to the EM topo-clusters. 7 If there are multiple conversion vertices matched to a cluster, double-track conversions with two silicon tracks are preferred over other double-track conversions, followed by single-track conversions. Within each category, the vertex with the smallest conversion radius is preferred.
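The preference order among matched conversion vertices amounts to a two-level sort, which the following sketch makes explicit. The dictionary keys (`n_tracks`, `n_si_tracks`, `radius_mm`) are hypothetical and chosen only for illustration.

```python
def conversion_rank_key(vertex):
    """Sort key: lower is better. Category first, then conversion radius."""
    if vertex["n_tracks"] == 2 and vertex["n_si_tracks"] == 2:
        category = 0  # double-track conversion with two silicon tracks
    elif vertex["n_tracks"] == 2:
        category = 1  # any other double-track conversion
    else:
        category = 2  # single-track conversion
    return (category, vertex["radius_mm"])


def best_conversion_vertex(vertices):
    """Return the preferred vertex, or None if no vertex is matched."""
    return min(vertices, key=conversion_rank_key) if vertices else None
```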

Supercluster reconstruction
The reconstruction of electron and photon superclusters proceeds independently, each in two stages: in the first stage, EM topo-clusters are tested for use as seed cluster candidates, which form the basis of superclusters; in the second stage, EM topo-clusters near the seed candidates are identified as satellite cluster candidates, which may emerge from bremsstrahlung radiation or topo-cluster splitting. Satellite clusters are added to the seed candidates to form the final superclusters if they satisfy the necessary selection criteria.
The steps to build superclusters proceed as follows. The initial list of EM topo-clusters is sorted according to descending E T , calculated using the EM energy. 8 The clusters are tested one by one in the sort order for use as seed clusters. For a cluster to become an electron supercluster seed, it is required to have a minimum E T of 1 GeV and must be matched to a track with at least four hits in the silicon tracking detectors. For photon reconstruction, a cluster must have E T greater than 1.5 GeV to qualify as a supercluster seed, with no requirement made on any track or conversion vertex matching. A cluster cannot be used as a seed cluster if it has already been added as a satellite cluster to another seed cluster.
If a cluster meets the seed cluster requirements, the algorithm attempts to find satellite clusters, using the process summarized in figure 3. For both electrons and photons, a cluster is considered a satellite if it falls within a window of ∆η × ∆φ = 0.075 × 0.125 around the seed cluster barycentre, as these cases tend to represent secondary EM showers originating from the same initial electron or photon. For electrons, a cluster is also considered a satellite if it is within a window of ∆η × ∆φ = 0.125 × 0.300 around the seed cluster barycentre, and its 'best-matched' track is also the best-matched track for the seed cluster. For photons with conversion vertices made up only of tracks containing silicon hits, a cluster is added as a satellite if its best-matched (electron) track belongs to the conversion vertex matched to the seed cluster. These steps rely on tracking information to discriminate distant radiative photons or conversion electrons from pile-up noise or other unrelated clusters.
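A minimal sketch of the seed and satellite tests is given below. It assumes that the quoted ∆η × ∆φ windows are half-widths around the seed barycentre, which the text does not state explicitly, and it uses hypothetical dictionary keys (`eta`, `phi`, `track_id`, `conversion_track_ids`) to carry the cluster and matching information.

```python
import math


def delta_phi(phi_a, phi_b):
    """Azimuthal difference wrapped into [-pi, pi)."""
    return (phi_a - phi_b + math.pi) % (2.0 * math.pi) - math.pi


def is_electron_seed(et_gev, n_si_hits):
    # Minimum E_T of 1 GeV and a matched track with at least four silicon hits.
    return et_gev >= 1.0 and n_si_hits is not None and n_si_hits >= 4


def is_photon_seed(et_gev):
    # No track or conversion vertex requirement for photon seeds.
    return et_gev > 1.5


def in_window(seed, cluster, half_eta, half_phi):
    # Assumption: the quoted window sizes are half-widths around the seed.
    deta = abs(cluster["eta"] - seed["eta"])
    dphi = abs(delta_phi(cluster["phi"], seed["phi"]))
    return deta < half_eta and dphi < half_phi


def is_electron_satellite(seed, cluster):
    if in_window(seed, cluster, 0.075, 0.125):
        return True
    same_track = (seed["track_id"] is not None
                  and seed["track_id"] == cluster["track_id"])
    return same_track and in_window(seed, cluster, 0.125, 0.300)


def is_photon_satellite(seed, cluster):
    if in_window(seed, cluster, 0.075, 0.125):
        return True
    # Seed matched to a conversion vertex built only from silicon tracks:
    # accept clusters whose best-matched track belongs to that vertex.
    return cluster["track_id"] in seed.get("conversion_track_ids", ())
```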
The seed clusters with their associated satellite clusters are called superclusters. The final step in the supercluster-building algorithm is to assign calorimeter cells to a given supercluster. Only cells from the presampler and the first three LAr calorimeter layers are considered, except in the transition region of 1.4 < |η| < 1.6, where the energy measured in the scintillator between the calorimeter cryostats is also added. To limit the superclusters' sensitivity to pile-up noise, the size of each constituent topo-cluster is restricted to a maximal width of 0.075 or 0.125 in the η direction in the barrel or endcap region, respectively. Because the magnetic field in the ID is parallel to the beam-line, interactions between the electron or photon and detector material generally cause the EM shower to spread in the φ direction, so the restriction in η still generally allows the electron or photon energy to be captured. No restriction is applied in the φ-direction.
7If the conversion vertex has tracks with silicon hits, a conversion vertex is considered matched if, after extrapolation, the tracks match the cluster to within |∆η| < 0.05 and |∆φ| < 0.05. If the conversion vertex is made of only TRT tracks, then if the first track is in the TRT barrel, a match requires |∆η| < 0.35 and |∆φ| < 0.02, and if the first track is in the TRT endcap, a match requires |∆η| < 0.2 and |∆φ| < 0.02.
8An exception to the E T ordering is made for clusters in the transition region that fail the standard selection but pass a looser selection; these are added at the end.

Creation of electrons and photons for analysis
After the electron and photon superclusters are built, an initial energy calibration and position correction is applied to them, and tracks are matched to electron superclusters and conversion vertices to photon superclusters. The matching is performed the same way that the matching to EM topo-clusters was performed, but using the superclusters instead. Creating the analysis-level electrons and photons follows. Because electron and photon superclusters are built independently, a given seed cluster can produce both an electron and a photon. In such cases, the procedure presented in figure 4 is applied. The purpose is that if a particular object can be easily identified only as a photon (a cluster with no good track attached) or only as an electron (a cluster with a good track attached and no good photon conversion vertex), then only a photon or an electron object is created for analysis; otherwise, both an electron and a photon object are created. Furthermore, these cases are marked explicitly as ambiguous, allowing the final classification of these objects to be determined based upon the specific requirements of each analysis.
Because the energy calibration depends on matched tracks and conversion vertices, and the initial supercluster calibration is performed before the final track and conversion matching, the energies of the electrons and photons are recalibrated, following the procedure described in ref. [3].
Subsequently, shower shape and other discriminating variables [1,2] are calculated for electron and photon identification. A list is given in table 1. [...] energetic cell, so they are independent of the clustering used, provided the same most energetic cell is included in the clusters. More information about the variables and the identification methods is given in sections 6 and 7 for electrons and photons, respectively.
Figure 4. Flowchart showing the logic of the ambiguity resolution for particles initially reconstructed both as electrons and photons. An 'innermost hit' is a hit in the functioning pixel nearest to the beam-line along the track trajectory, E/p is the ratio of the supercluster energy to the measured momentum of the matched track, R conv is the radial position of the conversion vertex, and R firstHit is the smallest radial position of a hit in the track or tracks that make a conversion vertex.
Figure 5 shows the reconstruction efficiencies for electrons. The reconstruction efficiency at high p T approaches the tracking efficiency, as expected. One interesting feature, however, is the difference between the efficiency to reconstruct the cluster and track (green triangles) and the efficiency to reconstruct an electron (purple inverted triangles) at lower p T . The reason for this is that tracks with silicon hits are considered for matching to superclusters only if they have had a GSF re-fit performed. The fixed-size clusters used for choosing the tracks on which the GSF re-fit is performed introduce an E T threshold, which is the source of this inefficiency. To alleviate this feature, the EM topo-clusters as defined in section 4.1 could be used to seed the GSF fit.
The top plot in figure 6 shows the reconstruction efficiency for converted photons as a function of the true E T of the simulated photon for the previous version of the reconstruction software, described in ref. [1], and the current version, described in section 4.2, along with the contributions of the different conversion types. For a photon to be classified as a true converted photon, the true radius of the conversion must be smaller than 800 mm. Only simulated photons with transverse energy greater than 20 GeV are considered. The simulated photons are distributed uniformly in |η|, with most of the photons having a transverse momentum smaller than 200 GeV. The bottom left plot of figure 6 shows the reconstruction efficiency for converted photons along with the contributions of the different conversion types as a function of µ. The improvement (see section 4.2) in the reconstruction efficiency for double-track Si conversions and the corresponding reduction of single-track Si conversions is clearly visible in those two plots. A slight reduction in double- and single-track TRT conversion efficiency is also visible, with the purpose of significantly reducing the probability for true unconverted photons to be reconstructed as TRT conversions, as can be seen in [...]
Table 1. Discriminating variables used for electron and photon identification. The usage column indicates if the variables are used for the identification of electrons, photons, or both. For variables calculated in the first EM layer, if the cluster has more than one cell in the φ direction at a given η, the two cells closest in φ to the cluster barycentre are merged and the definitions below are given in terms of this merged cell. The sign of d 0 is conventionally chosen such that the coordinates of the perigee in the transverse plane are (x 0 , y 0 ) = (−d 0 sin φ, d 0 cos φ), where φ is the azimuthal angle of the track momentum at the perigee.

Category / Description / Name / Usage
Hadronic leakage: Ratio of E T in the first layer of the hadronic calorimeter to E T of the EM cluster (used over the ranges |η| < 0.8 and |η| > 1.37). Ratio of E T in the hadronic calorimeter to E T of the EM cluster (used over the range 0.8 < |η| < 1.37): R had , e/γ.
EM third layer: Ratio of the energy in the third layer to the total energy in the EM calorimeter.

An important reason for using superclusters is the improved energy resolution that superclusters provide by collecting more of the deposited energy. The peaks of the energy response, E calib /E true , where E true is the true energy of the simulated particle prior to any detector simulation, and E calib is the calibrated reconstructed energy, do not deviate from one by more than 0.5% for the different particles. To quantify the width (resolution) of the energy response, the effective interquartile range is used, defined as
$\mathrm{IQE} = \frac{Q_3 - Q_1}{1.349}$,
where Q 1 and Q 3 are the first and third quartiles of the distribution of E calib /E true , and the normalization factor is chosen such that the IQE of a Gaussian distribution would equal its standard deviation.
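The effective interquartile range can be computed with the Python standard library, as in the following sketch. The normalization is obtained from the Gaussian quantile function rather than hard-coded, which gives 2Φ⁻¹(0.75) ≈ 1.349 and makes the IQE of a Gaussian equal to its standard deviation. The function name is hypothetical, and the quartile estimator is the default of `statistics.quantiles`, which may differ slightly from the one used in the analysis.

```python
from statistics import NormalDist, quantiles


def iqe(values):
    """Effective interquartile range: (Q3 - Q1) / 1.349."""
    q1, _median, q3 = quantiles(values, n=4)
    # Interquartile range of a unit Gaussian, approximately 1.349.
    normalization = 2.0 * NormalDist().inv_cdf(0.75)
    return (q3 - q1) / normalization
```

For a Gaussian toy response with a width of 2%, `iqe` returns a value close to 0.02, while for distributions with non-Gaussian tails it is less sensitive to the tails than the standard deviation would be.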
Comparisons of the resolutions of the calibrated energy response of simulated single electrons, converted photons, and unconverted photons, built using fixed-size clusters and superclusters, are given in figure 7. In particular, figure 7 shows the IQE of the two approaches in different regions of |η true | and E true T . The reconstructed electrons and photons in these distributions are required to correspond to true primary electrons and photons and to satisfy loose identification requirements. After calibration, the supercluster algorithm shows a significant improvement in resolution compared with the sliding-window algorithm for electrons. In the absence of pile-up, an improvement in resolution of up to 20-30% is found in some bins in the endcap region of the detector, as well as in the central region for low-E T electrons. Similarly, a large improvement in the resolution is seen for converted photons, over 30% in a few bins. For unconverted photons, the overall change in performance is small, due to the generally narrower shower width. However, some improvement is observed for high E T bins in the endcap region. In the presence of pile-up, the improvement in resolution still reaches 15 to 20%, depending on η and E T .
An important consideration is the performance of the supercluster reconstruction at different pile-up levels. Figure 8 shows the calibrated energy response resolution at different µ levels for electrons, converted photons, and unconverted photons, in two |η| regions. The topo-cluster noise thresholds for the 'high-µ' data sample were tuned for µ ∼ 40. For electrons and converted photons, the IQE of the supercluster reconstruction generally remains better, although the supercluster-based response is more sensitive to pile-up, as seen by its larger slope as a function of µ. Part of the reason is that the topo-cluster noise thresholds remain fixed even though µ changes. For unconverted photons, however, the supercluster reconstruction shows worse IQE for µ > 15. This degradation could be mitigated in particular by limiting the growth of the size of the clusters.

Electron and photon energy calibration
The energy calibration of electrons and photons closely follows the procedure used in ref. [3], updated for the new energy reconstruction described in section 4. The energy resolution of the electron or photon is optimized using a multivariate regression algorithm based on the properties of the shower development in the EM calorimeter. The adjustment of the absolute energy scale using Z → ee decays is updated, together with systematic uncertainties related to pile-up and material effects. The universality of the energy scale is verified using radiative Z-boson decays.
Figure 7. Calibrated energy response resolution, expressed in terms of IQE, for electrons (top), converted photons (middle), and unconverted photons (bottom) simulated with µ = 0. Two representative pseudorapidity ranges are shown. The response resolution for fixed-size clusters based on the sliding window method is shown in dashed red, while the supercluster-based response resolution is shown in full blue. For all plots, the bottom panel shows the ratios between the IQE obtained using the supercluster reconstruction and using the sliding window method.

Energy scale and resolution measurements with Z → ee decays
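The difference in energy scale between data and simulation is defined as α i , where i corresponds to different regions in η. Similarly, the mismodelling of the energy resolution is parameterised as an η-dependent additional constant term, c i . The corresponding energy scale correction is applied to the data, and the resolution correction is applied to the simulation as follows:
$E^{\mathrm{data,corr}} = E^{\mathrm{data}}/(1+\alpha_i), \qquad (\sigma_E/E)^{\mathrm{MC,corr}} = (\sigma_E/E)^{\mathrm{MC}} \oplus c_i,$
where the symbol ⊕ denotes a sum in quadrature. For samples of Z → ee decays, with electrons reconstructed in η regions i and j, the effect of the energy scale correction on the dielectron invariant mass is given in first order by
$m_{ij}^{\mathrm{data,corr}} = m_{ij}^{\mathrm{data}}/(1+\alpha_{ij}), \qquad \alpha_{ij} = (\alpha_i+\alpha_j)/2,$
and the corresponding effect of the resolution correction on the relative mass resolution by
$(\sigma_m/m)_{ij}^{\mathrm{MC,corr}} = (\sigma_m/m)_{ij}^{\mathrm{MC}} \oplus c_{ij}, \qquad c_{ij} = (c_i \oplus c_j)/2.$
The values of α i j and c i j are determined by optimizing the agreement between the invariant mass distributions in data and simulation, separately for each (i, j) category. The α i and c i parameters are then extracted from a simultaneous fit of all categories.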
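The first-order relation between the per-electron and per-pair corrections can be checked numerically. The sketch below assumes that the corrections take the form E → E/(1 + α_i) for data and σ_E/E → σ_E/E ⊕ c_i for simulation, treats the electrons as massless, and uses hypothetical function names; it is an illustration and not the calibration code.

```python
import math


def quad_sum(a, b):
    """Sum in quadrature, a (+) b."""
    return math.hypot(a, b)


def correct_data_energy(e_data, alpha):
    """Energy scale correction applied to data."""
    return e_data / (1.0 + alpha)


def correct_mc_resolution(rel_resolution_mc, c):
    """Additional constant term applied to the simulated relative resolution."""
    return quad_sum(rel_resolution_mc, c)


def pair_scale(alpha_i, alpha_j):
    """First-order mass scale factor for electrons in regions i and j."""
    return 0.5 * (alpha_i + alpha_j)


def pair_constant_term(c_i, c_j):
    """Additional relative mass resolution for regions i and j."""
    return 0.5 * quad_sum(c_i, c_j)


def invariant_mass(e1, e2, cos_theta):
    """Invariant mass of two massless particles with opening angle theta."""
    return math.sqrt(2.0 * e1 * e2 * (1.0 - cos_theta))
```

For scale factors at the per-mille level, the mass computed from individually corrected energies and the mass corrected with α_ij agree to second order in α, which is why the fit can be performed in (i, j) categories of the dielectron mass.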
Two methods are used for this comparison and the difference is taken as a systematic uncertainty. In the first method, the best estimates of α i j and c i j are found by minimizing the χ 2 of the difference between data and simulation templates. The templates are created by shifting the mass scale in simulation by α i j and by applying an extra resolution contribution of c i j . In the second method, used as a cross-check, a sum of three Gaussian functions is fitted to the data and simulated invariant mass distributions in each (i, j) region; the α i and c i are extracted from the differences, between data and simulation, of the means and widths of the fitted distributions.
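The template method can be illustrated with a toy scan over the scale parameter only. This is a sketch under strong simplifications: the mass line shape is a plain Gaussian, the data and simulated samples are assumed to have equal size, no extra resolution term is fitted, and the χ² uses the Poisson variance of both histograms. All function names and numerical settings are hypothetical.

```python
def histogram(values, lo=80.0, hi=100.0, n_bins=40):
    """Fixed-width histogram of values in [lo, hi)."""
    counts = [0] * n_bins
    width = (hi - lo) / n_bins
    for value in values:
        if lo <= value < hi:
            counts[min(int((value - lo) / width), n_bins - 1)] += 1
    return counts


def chi2(data_counts, template_counts):
    """Chi-square between two histograms with Poisson uncertainties."""
    total = 0.0
    for d, t in zip(data_counts, template_counts):
        if d + t > 0:
            total += (d - t) ** 2 / (d + t)
    return total


def fit_alpha(data_masses, mc_masses, alphas):
    """Return the scale shift whose template best matches the data."""
    data_counts = histogram(data_masses)

    def template_chi2(alpha):
        # Template: simulated masses shifted by the trial scale factor.
        shifted = [m * (1.0 + alpha) for m in mc_masses]
        return chi2(data_counts, histogram(shifted))

    return min(alphas, key=template_chi2)
```

In a real fit the template would also be smeared by the trial constant term, and the minimum would be interpolated between scan points rather than read off the grid.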
Figures 9a and 9b show the results of α i and c i derived in 68 and 24 η intervals, respectively, separately for 2015, 2016 and 2017. The difference in α i for the different years is mainly due to two effects: variations of the LAr temperature, and the increase of the instantaneous luminosity. The former effect induces a variation in the charge/energy collection, affecting the energy response by about −2%/K [33]. The latter implies an increased amount of deposited energy in the liquid-argon gap that creates a current in the high-voltage lines, reducing the high voltage effectively applied to the gap and introducing a variation of the response of up to 0.1% in the endcap region. A prediction of the different effects that can impact the results is presented in ref. [3]. Given the small size of the observed dependence, well within 0.3%, dedicated energy scale corrections for each data taking year provide an adequate stability of the energy measurement.
For the constant term corrections c i , a dependence on the pile-up level is observed through the different values obtained for 2015 to 2017 data; this is addressed in section 5.2. A weighted average of the c i values for the different years is applied in the analyses of the complete dataset. The additional constant term of the energy resolution is typically less than 1% in most of the barrel and between 1% and 2% in the endcap.
Figure 10a shows the invariant mass distribution for Z → ee candidates for data and simulation after the energy scale correction has been applied to the data and the resolution correction to the simulation. No background contamination is taken into account in this comparison, but it is expected to be at the level of 1% over the full mass range shown. The uncertainty band corresponds to the propagation of the uncertainties in the α i and c i factors, as discussed in ref. [3]. Within these uncertainties, the data and simulation are in fair agreement. Figure 10b shows the stability of the reconstructed peak position of the dielectron mass distribution as a function of the average number of interactions per bunch crossing for the data collected in 2015, 2016 and 2017. The variation of the energy scale with µ is well below the 0.1% level in the data. The small increase of energy with µ observed in data is consistent with the MC expectation and is related to the new dynamical clustering used for the energy measurement, as introduced in section 4.

Systematic uncertainties
Several systematic uncertainties impact the measurement of the energy of electrons or photons in a way that depends on their transverse energy and pseudorapidity. These uncertainties were evaluated in ref. [3]. The amount of passive material located between the interaction point and the EM calorimeter is measured using the ratio of the energies deposited by electrons from Z-boson decays in the first and second layers of the EM calorimeter (E 1/2 ). The sensitivity of the calibrated energy to the detector material was re-evaluated to reflect the changes in the reconstruction described above. The systematic uncertainty due to the material description of the innermost pixel detector layer and the services of the pixel detector was updated with respect to ref. [3] using a more accurate description of these systems in the simulation [34]. The dependence of the constant term on the amount of pile-up, observed in figure 9b, is explained by the larger pile-up noise predicted by the simulation, compared with that observed in the data. Figure 11 shows an example of the evolution of the second central moment of the cell energy deposit in data and simulation as a function of µ for the second layer and 1.0 < |η| < 1.1 assuming φ symmetry. The contribution of the pile-up noise varies linearly with √ µ, while the electronic noise remains constant. An average difference of 10% between the pile-up noise in data and simulation is observed. This mismodelling is absorbed in the c i parameters for electrons of E T ∼ 40 GeV, the average E T value for electrons from Z → ee decays used to derive the energy corrections. The two methods used for the extraction of the energy resolution corrections, described in section 5.1, are compared and the full difference is taken as an uncertainty in the energy resolution. This uncertainty amounts to up to 0.2% in the barrel and is due to the different sensitivities of the two methods to the pile-up.
The impact of a 10% difference in pile-up noise at a different energy is propagated to the energy resolution uncertainty relying on the predicted dependence of the pile-up noise effect as a function of the energy. For electrons and photons in the transverse energy range 30-60 GeV, the uncertainty in the energy resolution is of the order of 5% to 10%. In order to mimic the pile-up noise estimation in the simulation, the pile-up rescaling factor, described in section 3, is changed from 1.03 to 1.2 for the 48b filling scheme and to 1.3 for the 8b4e filling scheme. A systematic uncertainty in the energy scale is derived comparing the results obtained with the two pile-up reweighting factors; it is of the order of 2 × 10 −4 in the barrel and of 5 × 10 −4 in the endcap. The total systematic uncertainty in the energy scale amounts to 4 × 10 −4 in the barrel and 2 × 10 −3 in the endcap.
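The noise behaviour described above (a constant electronic term and a pile-up term that grows linearly with √µ) can be illustrated with a short numerical sketch. It assumes that the two terms add in quadrature; the function names and the input values are hypothetical and are not taken from the ATLAS software or from the measurement.

```python
import math


def total_cell_noise(mu, electronic_noise, pileup_noise_at_mu1):
    """Total noise for an average of `mu` interactions per bunch crossing.

    Assumption of this sketch: the electronic term is constant, the pile-up
    term scales as sqrt(mu), and the two are added in quadrature.
    """
    pileup_noise = pileup_noise_at_mu1 * math.sqrt(mu)
    return math.hypot(electronic_noise, pileup_noise)


def noise_excess(mu, electronic_noise, pileup_noise_at_mu1, pileup_scale=1.10):
    """Extra noise (in quadrature) when the pile-up term is scaled by `pileup_scale`.

    The default of 1.10 mimics a 10% larger pile-up noise in one sample.
    """
    nominal = total_cell_noise(mu, electronic_noise, pileup_noise_at_mu1)
    scaled = total_cell_noise(mu, electronic_noise, pileup_scale * pileup_noise_at_mu1)
    return math.sqrt(max(scaled**2 - nominal**2, 0.0))
```

In this picture the excess vanishes at µ = 0 and grows with √µ, which is the qualitative behaviour expected for an additional constant term that depends on the pile-up level.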

Validation of the photon energy scale with Z → ℓℓγ decays
The energy scale corrections extracted from Z → ee decays, as described in section 5.1, are also applied to photons. A data-driven validation of the resulting photon energy scale is performed using radiative decays of the Z boson, probing mainly the low-energy region. Residual energy scale factors for photons, ∆α, are derived by comparing the mass distribution of the ℓℓγ system in data and simulation after applying the Z-based energy scale corrections. The mass distribution of the ℓℓγ system in the simulation is modified by applying ∆α to the photon energy, and the value of ∆α that minimizes the χ 2 between the data and the simulation is extracted. If the energy calibration is correct, ∆α should be consistent with zero within the uncertainties described in section 5.2. An alternative method, based on a binned extended maximum-likelihood fit with an analytic function to describe the mass distribution, is also used and gives consistent results. The electron and muon channels are analysed separately. In the electron channel, the electron energy scale uncertainty is accounted for in the determination of the residual photon energy scale. The electron and muon results are found to agree, and are combined. Figure 12 shows the measured ∆α as a function of E T and |η|, separately for converted and unconverted photons. The dominant sources of uncertainty in the extrapolation to photons of the energy corrections derived in Z → ee decays are related to the amount of passive material in front of the EM calorimeter, and to the intercalibration of the calorimeter layers. The value of ∆α is consistent with zero within about two standard deviations at most.
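The χ 2 scan used to extract ∆α can be sketched as follows. This is a toy illustration only: the function names are hypothetical, the χ 2 definition (Poisson variances, simulation normalized to data) is an assumption, and the toy mass is scaled linearly with the residual scale, whereas in a real analysis the three-body mass would be recomputed from the rescaled photon four-momentum.

```python
import numpy as np


def chi2_between(data_hist, mc_hist):
    """Chi-square between two histograms, with MC normalized to the data yield."""
    data_hist = np.asarray(data_hist, dtype=float)
    mc_hist = np.asarray(mc_hist, dtype=float)
    if mc_hist.sum() == 0:
        return float("inf")
    mc_scaled = mc_hist * data_hist.sum() / mc_hist.sum()
    variance = data_hist + mc_scaled  # rough Poisson variance (assumption)
    filled = variance > 0
    return float(np.sum((data_hist[filled] - mc_scaled[filled]) ** 2 / variance[filled]))


def scan_residual_scale(mass_data, mass_mc_for_scale, candidates, bin_edges):
    """Return the candidate residual scale with the smallest chi-square.

    `mass_mc_for_scale(s)` must return the simulated masses obtained after
    multiplying the photon energy by the factor s = 1 + delta_alpha.
    """
    data_hist, _ = np.histogram(mass_data, bins=bin_edges)
    chi2_values = np.array([
        chi2_between(data_hist, np.histogram(mass_mc_for_scale(1.0 + d), bins=bin_edges)[0])
        for d in candidates
    ])
    return float(candidates[int(np.argmin(chi2_values))]), chi2_values


# Toy input: the "data" are the simulated masses shifted by exactly 1%,
# so the scan should recover delta_alpha = 0.01.
rng = np.random.default_rng(seed=1)
toy_mc_mass = rng.normal(91.0, 3.0, size=20000)
toy_data_mass = toy_mc_mass * 1.01
candidates = np.linspace(-0.02, 0.02, 41)
best_delta_alpha, chi2_curve = scan_residual_scale(
    toy_data_mass, lambda scale: toy_mc_mass * scale, candidates, np.linspace(80.0, 100.0, 41)
)
```

Passing the simulated mass as a callable keeps the scan independent of how the photon energy enters the mass calculation.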

Energy scale and resolution corrections in low-pile-up data
Special data with low pile-up were collected in 2017 at 13 TeV, as described in section 3. Energy scale factors are derived for this sample using the baseline method, described in section 5.1. The measurement is done in 24 η regions given the small size of the sample.
An alternative approach, used for validation, consists of measuring the energy scale factors using high-pile-up data and extrapolating the results to the low-pile-up conditions. Two main effects are considered in the extrapolation, namely the explicit dependence of the energy corrections on µ, and differences between the clustering thresholds used for the two samples; other effects are sub-leading and are treated as systematic uncertainties.
To evaluate the first effect, the high-pile-up energy scale corrections are measured in five intervals of µ in the range 20 < µ < 60, in each of the 24 η regions considered for the low-pile-up sample. The results are parameterized using a linear function, which is extrapolated to µ = 2. Over this range, the energy correction is found to vary by about 0.01% in the barrel, and by about 0.1% in the endcap. The statistical uncertainty in the extrapolation is about 0.05% in each η region. The procedure is illustrated in figure 13, for representative η regions in the barrel and in the endcap.
Secondly, as described in section 4, the low-pile-up data were reconstructed with topo-cluster noise thresholds corresponding to µ = 0, while the standard runs used thresholds corresponding to µ = 40. This results in an increased cluster size and enhanced energy response for the low-pile-up samples. The difference between the enhancements in data and simulation is measured using Z-boson decays, and a correction is applied. The correction amounts to about 2 × 10 −3 in the barrel and 4 × 10 −3 in the endcap, with a typical uncertainty of 3 × 10 −4 . Figure 14a shows the comparison between the energy scale factors derived from low-pile-up data and extrapolated from high-pile-up data after correcting for the noise threshold effect. The observed difference is of the order of 0.1% in the barrel region and increases to 0.5% in the endcap region. Several systematic uncertainties were considered for the extrapolation approach. In addition to the systematic uncertainties in high-pile-up data discussed in section 5.2, systematic uncertainties related to the functional form chosen for the extrapolation or the number of µ intervals considered were evaluated and are of the order of a few times 10 −4 . The change in the LAr temperature, in the absence of collisions, between the low-pile-up and high-pile-up data-taking periods was found to induce a variation of the energy scale of 0.006%. A systematic uncertainty in the energy scale is also added for the non-linear variation of the LAr temperature with µ and amounts to a few times 10 −4 in the barrel and 10 −3 in the endcap. The total uncertainty in the extrapolated energy scale factors is about 0.05% in the barrel, and on average 0.15% in the endcap, as shown in figure 14b.
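The linear extrapolation to µ = 2 can be sketched with a weighted straight-line fit. The function name, the choice of interval centres and the input values in the usage note are hypothetical; the sketch assumes independent Gaussian uncertainties on the scale factors measured in each µ interval.

```python
import numpy as np


def extrapolate_energy_scale(mu_centres, alphas, alpha_errors, mu_target=2.0):
    """Fit alpha(mu) = slope * mu + intercept and evaluate it at `mu_target`.

    Returns the extrapolated value and its statistical uncertainty, obtained
    by propagating the covariance matrix of the fitted parameters.
    """
    mu_centres = np.asarray(mu_centres, dtype=float)
    alphas = np.asarray(alphas, dtype=float)
    weights = 1.0 / np.asarray(alpha_errors, dtype=float)  # polyfit expects 1/sigma
    coeffs, cov = np.polyfit(mu_centres, alphas, deg=1, w=weights, cov="unscaled")
    slope, intercept = coeffs
    jacobian = np.array([mu_target, 1.0])  # d(value)/d(slope), d(value)/d(intercept)
    value = slope * mu_target + intercept
    error = float(np.sqrt(jacobian @ cov @ jacobian))
    return float(value), error
```

For example, five intervals centred at µ = 24, 32, 40, 48 and 56 would be passed as `mu_centres`, with one call per η region. Because the target lies well outside the fitted range, the uncertainty on the slope dominates the extrapolated uncertainty.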

Electron identification
Further quality criteria, called 'identification selections' below, are used to improve the purity of selected electron and photon objects. The identification of prompt electrons relies on a likelihood discriminant constructed from quantities measured in the inner detector, the calorimeter and the combined inner detector and calorimeter. A detailed description is given in ref. [2]. Recent changes implemented as a result of the migration to the supercluster reconstruction algorithm and adjustments made in parallel are discussed in the following. The identification criteria apply to all reconstructed electron candidates (see section 4).

Variables in the electron identification
The quantities used in the electron identification are chosen according to their ability to discriminate prompt isolated electrons from energy deposits from hadronic jets, from converted photons and from genuine electrons produced in the decays of heavy-flavour hadrons. The variables can be grouped into properties of the primary electron track, the lateral and longitudinal development of the electromagnetic shower in the EM calorimeter, and the spatial compatibility of the primary electron track with the reconstructed cluster. They are described in table 1 and summarized here.
Figure 14. (a) Energy scale corrections derived from Z → ee candidate events as a function of η for the low-pile-up data, high-pile-up data and the extrapolated high-pile-up data after correction for the topo-cluster noise threshold difference. The shaded areas correspond to the statistical uncertainties. The bottom panel shows the differences between the energy scale corrections measured in the 2017 high-µ dataset without any correction or extrapolated to µ = 2 and the measurements using 2017 low-µ data only. (b) Uncertainties in the energy scale corrections as a function of η for the low-pile-up data.
The primary electron track is required to fulfil a set of quality requirements, namely hits in the two inner tracking layers closest to the beam line, as well as a number of hits in the silicon-strip detectors. The transverse impact parameter of the track and its significance are used to construct the likelihood discriminant. Furthermore, ∆p/p and particle identification in the TRT are used.
The lateral development of the electromagnetic shower is characterized with variables calculated separately in the first and second layer of the electromagnetic calorimeter. To reject clusters from multiple incident particles, w s tot is used (see table 1). The lateral shower development is measured with R φ and R η . All lateral shower shape variables are calculated by summing energy deposits in calorimeter cells relative to the cluster's most energetic cell, and no significant difference between fixed-size EM clusters and superclusters is expected in these variables, as shown in figure 15a for R φ .
For the longitudinal shower shape variables, the numbers of cells contributing to the energy measurement in each layer are chosen dynamically in the supercluster approach, compared with fixed numbers of cells in fixed-size clusters. The supercluster approach inherently suppresses noise in the calorimeter cells, resulting in lower values and narrower distributions. The electron identification uses f 1 and f 3 (see table 1). The distribution of f 3 is compared for fixed-size clusters and superclusters in figure 15b. The significant differences between data and simulation are caused by a known mismodelling of calorimeter shower shapes in the Geant4 detector simulation. These are accounted for in the optimisation of the electron identification (see section 6.3) and corrected with data-to-simulation efficiency ratios in analyses. Further discrimination against hadronic showers is achieved with R had .
The reconstructed track and the EM cluster are matched using ∆η 1 and ∆φ res .

Likelihood discriminant
A discriminant is formed from the likelihoods for a reconstructed electron to originate from signal, L S , or background, L B . They are calculated from probability density functions (pdfs), P, which are created by smoothing histograms of the n (typically 13) discriminating variables with an adaptive kernel density estimator (KDE [35]) as implemented in TMVA [36], separately for signal and background and in 9 bins in |η| and 7 bins of E T :
$L_{S(B)}(\mathbf{x}) = \prod_{i=1}^{n} P_{S(B),i}(x_i)$.
For signal and background the pdfs take the values P S,i (x i ) and P B,i (x i ), respectively, for the quantity i at value x i . The likelihood discriminant d L is defined as the natural logarithm of the ratio of L S and L B .
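The construction of d L from per-variable pdfs can be sketched as follows. The helper names and the Gaussian toy pdfs are hypothetical stand-ins for the KDE-smoothed pdfs; the small floor protecting the logarithm is an assumption of this sketch and not part of the definition.

```python
import math


def log_likelihood(values, pdfs, floor=1e-12):
    """Sum of log pdf values, i.e. the log of the product over all variables."""
    return sum(math.log(max(pdf(x), floor)) for pdf, x in zip(pdfs, values))


def likelihood_discriminant(values, signal_pdfs, background_pdfs):
    """d_L = ln(L_S / L_B), computed as a difference of log-likelihoods."""
    return log_likelihood(values, signal_pdfs) - log_likelihood(values, background_pdfs)


def gaussian_pdf(mean, sigma):
    """Toy pdf used only for illustration."""
    norm = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    return lambda x: norm * math.exp(-0.5 * ((x - mean) / sigma) ** 2)


# Two toy variables: signal peaks at 0, background peaks at 1.
toy_signal = [gaussian_pdf(0.0, 1.0), gaussian_pdf(0.0, 1.0)]
toy_background = [gaussian_pdf(1.0, 1.0), gaussian_pdf(1.0, 1.0)]
d_l = likelihood_discriminant([0.0, 0.0], toy_signal, toy_background)
```

Working with sums of logarithms rather than the product of pdf values avoids numerical underflow when many variables are combined. In a binned implementation, one set of pdfs would be selected per (|η|, E T ) bin before calling the function.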
The pdfs for signal were derived from Z → ee (for E T > 15 GeV) and J/ψ → ee events (for E T < 15 GeV) prior to the 2017 data-taking period in 36.9 fb −1 of data recorded in the years 2015 and 2016. A reconstructed electron is selected in these events using a tag-and-probe method [37]. One of the electrons must satisfy a strict requirement on the likelihood discriminant of the previous electron identification [2] and the other electron serves as a probe. To reduce the background contamination in the selected data, probe electrons are required to satisfy a very loose requirement on the likelihood discriminant. This requirement rejects approximately 95% of the background with a signal efficiency of 97%, causing only a mild distortion of the likelihood pdfs. Events with at least one reconstructed electron are selected to derive the pdfs for background. This sample primarily contains dijet events; contributions from genuine electrons, mainly from W → eν and Z → ee decays, are suppressed to a negligible level using dedicated selection criteria. Deriving the likelihood pdfs in data is an improvement compared to the previous likelihood-based identification, which used simulation. Compared to the mismodelling in simulation, the selection applied in data and differences in the run conditions between the years 2015, 2016, 2017 and 2018 cause only mild differences in the pdfs.
The electron likelihood identification imposes a selection on the likelihood discriminant and some additional requirements. The variable f 3 exhibits a dependence on the electron E T and η that cannot adequately be captured by the seven and nine bins, respectively, in which the pdfs are determined. It is therefore only used for electrons with |η| < 2.37 and E T < 80 GeV. Electrons are also rejected if a two-track silicon conversion vertex was reconstructed with a momentum closer to the cluster energy than that of the primary electron track. To pass the Tight operating point, electrons must moreover satisfy E/p < 10 and their primary track must satisfy p T > 2 GeV. These additional criteria aim to reject background from converted photons. For very high E T the energy dependence of the shower shape variables can cause a degradation of efficiencies for very strict requirements on the likelihood discriminant. To avoid efficiency losses in the Tight identification, the cuts on d L are chosen to be identical to the Medium identification for E T > 150 GeV, and the operating points differ only in the additional requirements and an η-dependent requirement on the shower width in the first calorimeter layer, applied to Tight electrons.

Efficiency of the electron identification
The operating points Loose, Medium and Tight are each optimized in 9 bins in |η| and 12 bins in E T such that reconstructed electrons meet the requirements on the likelihood discriminant with some predefined efficiency. The values of these requirements are determined in simulated events. For that purpose, the electromagnetic shower quantities and the combined track-cluster variables are shifted and adjusted in width such that the resulting distribution of the likelihood discriminant of the simulated electrons closely matches that in data. The discriminant threshold is adjusted linearly as a function of pile-up level to yield a stable rejection of background electrons. The number of reconstructed vertices n vtx serves as a measure for pile-up. Due to the deterioration of the discriminating power with pile-up, the approximately constant background rejection is accompanied by a reduction of signal efficiency as a function of the average number of interactions per bunch crossing, as shown in figure 16 for a pure sample of electrons from Z-boson decays.
The target efficiencies are the same as in the previous identification [2], as these have proven to suit a wide range of analyses and topologies. For typical electroweak processes they are, on average, 93%, 88% and 80% for the Loose, Medium, and Tight operating points and gradually increase from low to high E T . The reduced efficiency of the Medium and Tight operating points is accompanied by an improved rejection of background processes by factors of approximately 2.0 and 3.5, respectively, in the range 20 GeV < E T < 50 GeV. The background efficiency was evaluated in QCD two-to-two processes simulated as described in section 3.1. Figure 17 shows the resulting efficiencies in data. With increasing E T , the identification efficiency varies from 58% at E T = 4.5 GeV to 88% at E T = 100 GeV for the Tight operating point, and from 86% at E T = 20 GeV to 95% at E T = 100 GeV for the Loose operating point. In 2015, a different gas mixture was used in the TRT, causing higher efficiencies. Similar efficiencies are obtained for the data recorded in the years 2016 and 2017, and residual differences are caused by their dependence on pile-up. The discontinuity in the efficiency curve at E T = 15 GeV is caused by a known mismodelling of the variables used in the likelihood discriminant at low E T : performing the optimization of the discriminant cuts using simulated events leads to a higher efficiency in data in this region, resulting in the rise at low E T observed in the lower panels of figure 17.
The uncertainties in the efficiency are ±7% at E T = 4.5 GeV and decrease with transverse energy, becoming smaller than ±1% for 30 GeV < E T < 250 GeV. The systematic uncertainties in the measurements are dominated by background subtraction uncertainties at low E T , and are derived as described in ref. [2]. For larger values of E T , additional systematic uncertainties of ±0.5%, ±1.0% and ±1.5%, assigned due to variations in the electron efficiency with E T for Loose, Medium and Tight identification, respectively, limit the precision.

Photon identification

Optimization of the photon identification
The photon identification criteria are designed to efficiently select prompt, isolated photons and reject backgrounds from hadronic jets. The photon identification is constructed from one-dimensional selection criteria, or a cut-based selection, using the shower shape variables described in table 1. The variables using the EM first layer play a particularly important role in rejecting π 0 decays into two highly collimated photons.
Figure 17. The electron identification efficiency in Z → ee events in data as a function of E T (left) and as a function of η (right) for the Loose, Medium and Tight operating points. The efficiencies are obtained by applying data-to-simulation efficiency ratios measured in J/ψ → ee and Z → ee events to Z → ee simulation. The inner uncertainties are statistical and the total uncertainties are the statistical and systematic uncertainties in the data-to-simulation efficiency ratio added in quadrature. For both plots, the bottom panel shows the data-to-simulation ratios.
The primary identification selection is labelled as Tight, with less restrictive selections called Medium and Loose, which are used for trigger algorithms. The Loose identification criteria have remained unchanged since the beginning of Run 2, and Loose was the main selection used in the triggering of photon and diphoton events in 2015 and 2016. It uses the R had , R had 1 , R η , and w η 2 shower shape variables. The Medium selection, which adds a loose cut on E ratio , became the main trigger selection in the beginning of 2017, in order to maintain an acceptable trigger rate. Because the reconstruction of photons in the ATLAS trigger system does not differentiate between converted and unconverted photons, the Loose and Medium identification criteria are the same for converted and unconverted photons. The Tight identification criteria described in this paper are designed to select a subset of the photon candidates passing the Medium criteria. Because the shower shapes vary due to the geometry of the calorimeter, the cut-based selections for Loose, Medium and Tight are optimized separately in bins of |η|. The Tight identification presented here is also optimized in separate bins of E T , and compared with an earlier version of the Tight identification that makes an E T -independent selection.
The Tight identification is optimized using TMVA, and performed separately for converted and unconverted photons. The shower shapes of converted photons differ from unconverted photons due to the opening angle of the e + e − conversion pair, which is amplified by the magnetic field, and from the additional interaction of the conversion pair with the material upstream of the calorimeters.
The Tight identification is optimized using a series of MC samples that provide prompt photons and representative backgrounds at different transverse momenta. For photons with 10 < E T < 25 GeV, the Z → ℓℓγ MC sample with the selection described in section 3.1 is used as a signal. The corresponding background sample is obtained from data consisting of Z+jets events collected using a similar event selection, but with relaxed requirements on the dilepton and dilepton+photon invariant masses m ℓℓ and m ℓℓγ . Above E T = 25 GeV, the inclusive-photon production MC sample described in section 3.2 is compared with a dijet background MC sample that is enriched in high-E T energy deposits using a generator-level filter. No isolation selection is applied to the training samples, and the shower shape variables are corrected to match the shower shapes observed in data using the correction procedure described in ref. [1]. Figures 18 and 19 show the result of the Tight identification optimization in terms of the efficiencies as a function of E T for the signal and background MC training samples. The optimized selection, labelled E T -dependent, is compared with a reference selection that uses criteria that do not change with E T (E T -independent). The new, E T -dependent Tight identification allows the efficiencies of low- and high-E T photon regions to be tuned separately. The Tight identification is tuned to give a ∼20% higher efficiency at low E T , and an improved background rejection at high E T . The µ dependence of the photon identification is depicted in figure 20 for photons from Z → ℓℓγ decays.
Figure 18. Efficiencies of the Tight photon identification for unconverted (left) and converted (right) signal photons, plotted as a function of photon E T . The signal events are taken from the sample of Z → ℓℓγ photons with E T < 25 GeV, and from inclusive-photon production above 25 GeV. In each case, the E T -independent and E T -dependent selections are compared. The Loose isolation (see section 8.2) is applied as a preselection. For both plots, the bottom panel shows the ratios between the E T -dependent and the E T -independent identification efficiencies.

JINST 14 P12006

Efficiency of the photon identification
To assess the performance of the (E T -dependent) Tight photon identification on data, three photon efficiency measurements are performed using distinct data samples. The first uses an inclusive-photon production data selection, the second uses photons radiated from leptons in Z → ℓℓγ decays, and the third uses electrons from Z → ee decays, with a method that transforms the electron shower shapes to resemble the photon shower shapes. These efficiency measurements are described in detail in ref. [1], and summarized below. All three procedures measure photons that are isolated, using the Loose working-point definition (see section 8.2).
Figure 19. Efficiencies of the Tight photon identification for unconverted (left) and converted (right) background photons from jets, plotted as a function of photon E T . The background is taken from Z → ℓℓ+jets production below 25 GeV, and filtered dijet production above 25 GeV. In each case, the E T -independent and E T -dependent selections are compared. The Loose isolation (see section 8.2) is applied as a preselection. For both plots, the bottom panel shows the ratios between the E T -dependent and the E T -independent identification efficiencies.
The three measurements use a common method to characterize the imperfect modelling of shower shapes in simulated samples, in order to estimate its impact on the efficiency measurement in data. Nominally, the MC shower shapes are compared with data in control regions enriched in real photons and corrected by applying a simple shift to the distributions, whose magnitude is determined by a χ 2 minimization procedure. However, some data-MC differences cannot be corrected by this procedure, such as the widths of the distributions. In order to estimate any residual data-MC differences, the χ 2 minimization is repeated considering only the tail of the distribution, defined as the region containing 30% of the distribution on the side closer to the identification cut value. The shift value obtained when comparing the data and simulation tails is used to define a systematic uncertainty in the modelling of the shower shapes, and is derived for all variables for which a mismodelling is observed. Four variations are defined using sets of correlated variables; the variables within each set are shifted together: {R had }, {R φ }, {R η , w η 2 }, and {w s 3 , f side , w s tot }. The result is equivalent to four sets of MC simulated samples, which can be used to assign systematic uncertainties for mismodelling effects that impact the data measurement, and which are considered to be uncorrelated variations.
The method using Z → ℓℓγ decays selects data as described in section 3.1. Additional requirements on the invariant mass of the three-body system, 80 < m ℓℓγ < 100 GeV, and on the lepton-pair invariant mass, 40 < m ℓℓ < 83 GeV, select radiative Z-boson decays while rejecting backgrounds from Z + γ and Z+jets production. The efficiency and purity of the samples with and without the Tight identification requirement are determined from fits of signal and background templates, extracted from simulated Z → ℓℓγ and Z+jets events, to the observed three-body invariant-mass distribution.
The systematic uncertainties in the photon efficiency measurement using Z → ℓℓγ decays include a closure test using simulated signal and background samples to assess the validity of the measurement. To assess the impact of simulation mismodelling, the measurement is repeated comparing the Powheg-Pythia 8 and Sherpa Z → ℓℓγ samples and the difference is taken as a systematic uncertainty. The shower shape correction uncertainties are considered by repeating the measurement with each of the four sets of modified simulation samples, and the observed differences are added in quadrature. Finally, as a test of the background description, the fit range of the m ℓℓγ distribution is varied from its nominal value of [65, 105] GeV using two variations, [45, 95] GeV and [80, 120] GeV, and the efficiency differences are assigned as a systematic uncertainty.
The method to extract the photon efficiency using inclusive-photon production relies on data collected with prescaled photon triggers that feature a Loose identification requirement, as described in section 3.1. This data sample contains a mixture of real photons and backgrounds from jet production, and a matrix method is used to extract the photon efficiency. The matrix method constructs four regions by categorizing Loose photon candidates according to whether they pass or fail the Tight identification, and whether they pass or fail track-based isolation cuts. The four regions contain eight unknowns (i.e. the numbers of signal and background events in each region); if the isolation efficiencies for signal and background from each region are known, the efficiency for Loose photons to pass the Tight identification can be extracted. The isolation efficiencies for loosely and tightly identified signal photons are determined from the Monte Carlo samples, and the isolation efficiencies for backgrounds are obtained in a jet-enriched control region constructed by inverting identification criteria. Finally, the efficiency for reconstructed photon candidates to pass the Loose identification is determined from simulation, as this contribution is not measured in data by this method. The magnitude of the correction is typically less than 5%, and smaller at high E T .
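One way to write the algebra of the matrix method is sketched below. This is an illustrative formulation, not the implementation used in the measurement: the function names, the dictionary layout of the event counts and the numbers in the usage note are hypothetical. For any sample with N candidates, of which N iso pass the track isolation, and known isolation efficiencies for signal and background, the signal yield follows from solving N = N S + N B and N iso = ε S N S + ε B N B .

```python
def signal_yield(n_total, n_isolated, eff_iso_signal, eff_iso_background):
    """Solve N = N_S + N_B and N_iso = eff_S * N_S + eff_B * N_B for N_S."""
    denominator = eff_iso_signal - eff_iso_background
    if denominator == 0:
        raise ValueError("isolation must discriminate signal from background")
    return (n_isolated - eff_iso_background * n_total) / denominator


def tight_efficiency(counts, eff_sig_loose, eff_bkg_loose, eff_sig_tight, eff_bkg_tight):
    """Efficiency for Loose signal photons to pass the Tight identification.

    `counts` maps (passes_tight, passes_isolation) to the number of Loose
    candidates in that region; the four keys are the four regions.
    """
    n_tight = counts[(True, True)] + counts[(True, False)]
    n_tight_iso = counts[(True, True)]
    n_loose = sum(counts.values())
    n_loose_iso = counts[(True, True)] + counts[(False, True)]
    signal_tight = signal_yield(n_tight, n_tight_iso, eff_sig_tight, eff_bkg_tight)
    signal_loose = signal_yield(n_loose, n_loose_iso, eff_sig_loose, eff_bkg_loose)
    return signal_tight / signal_loose
```

As a toy usage, `tight_efficiency({(True, True): 915, (True, False): 185, (False, True): 445, (False, False): 1455}, 0.94, 0.21, 0.95, 0.30)` corresponds to 1000 signal and 2000 background Loose candidates, of which 900 and 200 pass Tight. The method degrades when the signal and background isolation efficiencies become similar, since the denominator then approaches zero.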
Systematic uncertainties assigned to the matrix method include a closure uncertainty that quantifies the agreement between the background isolation efficiencies derived in the data control region and in the regions to which they are applied. This effect is estimated using simulation, and is the largest source of uncertainty in the measurement. The robustness of the method is tested by varying the track-based isolation requirement, and assigning any difference in measured efficiency as a systematic uncertainty. The impact of uncertainties in the shower shape corrections is estimated using simulation; the effects of the four shower shape variations described above are added in quadrature. Finally, an uncertainty is assigned for a potential mismodelling in the MC-based correction to extrapolate from Loose to reconstructed photons. This uncertainty is based on the Loose identification efficiency measured with radiative photons in Z → ℓℓγ events.
Photon efficiencies can be estimated in a data sample of electrons from Z → ee decays whose shower shape variables have been modified to resemble photon shower shapes, a technique referred to as the electron extrapolation method. This efficiency measurement, described in ref. [1], uses the Z → ee sample defined in section 3.1, with the photon Loose isolation requirement applied to the electron candidates. Electron shower shape variables are modified using a Smirnov transform [38] derived from simulated Z → ee and inclusive-photon production samples. The candidate electrons in data contain a small background from W+jets and multijet production; this background is subtracted by fitting simulated signal samples and background templates derived from data control regions to the m ee data distributions. The electron candidates are counted for events in the range 70 < m ee < 110 GeV, and the efficiencies are measured using the tag-and-probe method as described in section 6.
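The idea behind the Smirnov transform can be sketched with empirical distributions: a value x of an electron shower shape is mapped to the photon value with the same cumulative probability, x' = F γ −1 (F e (x)). The function name and the toy samples below are hypothetical, and the use of empirical CDFs built from finite samples is an assumption of this sketch.

```python
import numpy as np


def smirnov_transform(values, source_sample, target_sample):
    """Map `values` from the source distribution onto the target distribution.

    The empirical CDF of the source sample is evaluated at each value, and the
    result is passed through the empirical quantile function of the target.
    """
    source_sorted = np.sort(np.asarray(source_sample, dtype=float))
    target = np.asarray(target_sample, dtype=float)
    cdf = np.searchsorted(source_sorted, values, side="right") / source_sorted.size
    return np.quantile(target, np.clip(cdf, 0.0, 1.0))


# Toy samples: the "photon" variable is a shifted and stretched copy of the
# "electron" variable, so the transform should approximate x -> 2x + 3.
toy_electron = np.linspace(0.0, 1.0, 1001)
toy_photon = 2.0 * toy_electron + 3.0
mapped = smirnov_transform(np.array([0.25, 0.5, 0.75]), toy_electron, toy_photon)
```

The mapping is monotonic by construction, so the ordering of candidates in a given variable is preserved. Each variable is transformed independently in this sketch, which ignores correlations between shower shapes.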
The systematic uncertainties in the electron extrapolation method are as follows. First, a closure test is performed to determine whether the transformed electrons can reproduce the expected photon efficiency, using the simulation and in the absence of background. The difference in relative efficiency, which can be as high as 3%, is applied as a correction to the measured data efficiency, and the magnitude of the correction is assigned as the systematic uncertainty. Systematic effects that affect the Smirnov transformations include the fraction of fragmentation photons in the simulated inclusive-photon sample, which is varied by ±50%, and the predicted fraction of true converted photons, which is varied by ±10%, to assess the impact of the imperfect simulation on the efficiency measurement. The uncertainty in the modelling of identification variables in simulation is assessed by defining Smirnov transformations for each of the four sets of variations of the shower shape modelling and recalculating the efficiency for each case; the total modelling uncertainty is taken as the sum in quadrature of the individual variations. The uncertainty due to the limited size of the MC samples used to derive the Smirnov transformations is assessed using the bootstrap method. Finally, the uncertainty associated with the subtraction of the W+jets and multijet backgrounds in the signal region is tested by reducing the level of background through a restriction of the selected invariant-mass range to 80 < m ee < 100 GeV, and repeating the measurement procedure. The resulting difference in the measured efficiency is taken as the systematic uncertainty.
The three efficiency measurements are compared with MC simulation in order to obtain scale factors, in bins of E T and |η|, that are used to correct the MC simulations so that the simulations closely resemble data. Before determining these scale factors, the shower shapes in these MC simulations were corrected to match data using the procedure described in ref. [1].
Figure 21. The photon identification efficiency, and the ratio of data to MC efficiencies, for unconverted photons with a Loose isolation requirement applied as preselection, as a function of E T in four different |η| regions. The combined scale factor, obtained using a weighted average of scale factors from the individual measurements, is also presented; the band represents the total uncertainty.
The scale factors from each of the three efficiency measurements are combined using a weighted average. The statistical and systematic uncertainties are assumed to be uncorrelated between the methods. The total uncertainty of the combined scale factors ranges between 7% at low E T and 0.5% at high E T for unconverted photons, and between 12% (low E T ) and less than 1% (high E T ) for converted photons. For E T > 1.5 TeV, where no measurement is performed, the scale factor measured in the E T bin [0.25, 1.5] TeV is used, with the same uncertainty.
Figure 22. The photon identification efficiency, and the ratio of data to MC efficiencies, for converted photons with a Loose isolation requirement applied as preselection, as a function of E T in four different |η| regions. The combined scale factor, obtained using a weighted average of scale factors from the individual measurements, is also presented; the band represents the total uncertainty.
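A weighted average of uncorrelated measurements can be sketched as below. The text does not specify the weights; inverse-variance weighting, which is the usual choice for uncorrelated uncertainties, is assumed here, and the function name is hypothetical.

```python
import math


def combine_scale_factors(values, uncertainties):
    """Inverse-variance weighted average of uncorrelated measurements.

    Returns the combined value and its uncertainty. Each uncertainty is the
    total (statistical and systematic) uncertainty of one measurement.
    """
    weights = [1.0 / u**2 for u in uncertainties]
    weight_sum = sum(weights)
    mean = sum(w * v for w, v in zip(weights, values)) / weight_sum
    return mean, 1.0 / math.sqrt(weight_sum)
```

For instance, two measurements of equal precision are simply averaged and the combined uncertainty shrinks by a factor √2, while a much less precise third measurement would contribute very little. The combination would be carried out separately in each (E T , |η|) bin.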

Electron and photon isolation
The activity near leptons and photons can be quantified from the tracks of nearby charged particles, or from energy deposits in the calorimeters, leading to two classes of isolation variables. The raw calorimeter isolation [2] (E isol T,raw ) is built by summing the transverse energy of positive-energy topological clusters whose barycentre falls within a cone centred around the electron or photon cluster barycentre. The topological cluster energy scale is the EM scale. The raw calorimeter isolation includes the EM particle energy (E T,core ), which is subtracted by removing the energy of the EM calorimeter cells contained in a ∆η × ∆φ = 5 × 7 (in EM-middle-layer units) rectangular cluster around the barycentre of the EM particle cluster. The advantage of this simple method is a stable subtraction for real or fake/non-prompt objects for any transverse momentum and pile-up. The disadvantage is that it does not subtract all the EM particle energy and an additional leakage correction is needed. This leakage is parameterized as a function of E T and |η| using MC samples of single electrons or photons without pile-up. Additionally, a correction for the pile-up and underlying-event contribution to the isolation cone is also estimated [39].
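Finally, the fully corrected calorimeter isolation variable is computed as:

$$E_{\mathrm{T}}^{\mathrm{coneXX}} = E_{\mathrm{T,raw}}^{\mathrm{isol}} - E_{\mathrm{T,core}} - E_{\mathrm{T,leakage}}(E_{\mathrm{T}}, \eta, \Delta R) - E_{\mathrm{T,pile\text{-}up}}(\eta, \Delta R),$$

where XX refers to the size of the employed cone, ∆R = XX/100. A cone size ∆R = 0.2 is used for the electron working points whereas cone sizes ∆R = 0.2 and 0.4 are used for photon working points.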
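The chain of corrections described above (core subtraction, leakage correction, pile-up and underlying-event correction) can be illustrated with a minimal sketch. The function and argument names are hypothetical, the inputs are assumed to be already available in GeV, and the parameterizations of the leakage and pile-up terms themselves are not reproduced here.

```python
def cone_radius(xx):
    """Cone size Delta R encoded by the XX suffix, Delta R = XX / 100."""
    return xx / 100.0


def corrected_calo_isolation(et_raw, et_core, et_leakage, et_pileup):
    """Fully corrected calorimeter isolation in GeV.

    et_raw     : sum of positive-energy topological clusters in the cone
    et_core    : energy of the 5 x 7 cell core around the EM particle
    et_leakage : residual EM particle energy leaking outside the core
    et_pileup  : pile-up and underlying-event contribution to the cone
    """
    return et_raw - et_core - et_leakage - et_pileup
```

With purely illustrative inputs, `corrected_calo_isolation(32.0, 28.5, 1.2, 0.8)` gives an isolation energy of 1.5 GeV in a cone of size `cone_radius(20)`.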
The track isolation variable (p coneXX T ) is computed by summing the transverse momentum of selected tracks within a cone centred around the electron track or the photon cluster direction. Tracks matched to the electron or converted photon are excluded. Since, for electrons produced in the decay of high-momentum heavy particles, other decay products can be very close to the electron direction, the track isolation for electrons is defined with a variable cone size (p varconeXX T ): the cone size shrinks for larger transverse momentum of the electron,

$$\Delta R = \min\left(\frac{10\ \mathrm{GeV}}{p_{\mathrm{T}}},\ \Delta R_{\max}\right),$$

where ∆R max is the maximum cone size (typically 0.2). The tracks considered are required to have p T > 1 GeV and |η| < 2.5, at least seven silicon (Pixel + SCT) hits, at most one shared hit (defined as n sh Pixel + n sh SCT /2, where n sh Pixel and n sh SCT are the numbers of hits assigned to several tracks in the Pixel and SCT detectors), at most two silicon holes (i.e. missing hits in the pixel and SCT detectors) and at most one pixel hole. In addition, for electron isolation, the tracks are required to have a loose vertex association, i.e. the track was used in the primary vertex fit, or it was not used in any vertex fit but satisfies |∆z 0 | sin θ < 3 mm, where |∆z 0 | is the longitudinal impact parameter relative to the chosen primary vertex; for photon isolation, all selected tracks satisfying |∆z 0 | sin θ < 3 mm are used.
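A minimal sketch of the variable-cone track isolation follows. It assumes tracks are given as plain dictionaries with hypothetical keys, takes the 10 GeV scale of the shrinking cone as an adjustable parameter, and omits the vertex-association requirement for brevity; it is an illustration rather than the ATLAS implementation.

```python
import math


def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance, with the azimuthal difference wrapped to [-pi, pi)."""
    dphi = (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(eta1 - eta2, dphi)


def variable_cone_size(pt_gev, dr_max=0.2, scale_gev=10.0):
    """Cone size that shrinks with the electron pT, capped at dr_max."""
    return min(scale_gev / pt_gev, dr_max)


def passes_track_selection(trk):
    """Track-quality requirements quoted in the text (vertex association omitted)."""
    shared = trk["n_sh_pixel"] + trk["n_sh_sct"] / 2.0
    return (
        trk["pt"] > 1.0
        and abs(trk["eta"]) < 2.5
        and trk["n_si_hits"] >= 7
        and shared <= 1.0
        and trk["n_si_holes"] <= 2
        and trk["n_pix_holes"] <= 1
    )


def track_isolation_varcone(electron, tracks, dr_max=0.2):
    """Sum of the pT of selected tracks inside the variable cone (GeV)."""
    cone = variable_cone_size(electron["pt"], dr_max)
    total = 0.0
    for trk in tracks:
        if trk.get("matched", False):  # tracks matched to the electron are excluded
            continue
        if not passes_track_selection(trk):
            continue
        if delta_r(electron["eta"], electron["phi"], trk["eta"], trk["phi"]) < cone:
            total += trk["pt"]
    return total
```

For an electron with p T = 100 GeV the cone shrinks to ∆R = 0.1, so a track at ∆R = 0.15 that would contribute for a 25 GeV electron no longer enters the sum.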
In this section, the isolation efficiency measurements are illustrated with the data recorded in 2017; nevertheless, the measurements are performed for the full high-µ dataset described in section 3.1.

Electron isolation criteria and efficiency measurements
The implementation of isolation criteria is specific to the physics analysis needs, as it results from a compromise between a highly-efficient identification of prompt electrons, isolated or produced in a busy environment, and a good rejection of electrons from heavy-flavour decays or light hadrons misidentified as electrons. The different electron-isolation working points used in ATLAS are presented in table 2.
The working points can be defined in two different ways, targeting a fixed value of efficiency or with fixed cuts on the isolation variables. The Gradient working point is designed to give an efficiency of 90% at p T = 25 GeV and 99% at p T = 60 GeV, uniform in η. The requirements on E cone20 T and p varcone20 T (cut maps) for this working point are derived from J/ψ → ee (E T < 15 GeV) and Z → ee (E T > 15 GeV) MC simulations with Tight identification requirements applied. The three other working points, HighPtCaloOnly, Loose and Tight, have a fixed requirement on the calorimeter and/or the track isolation variables.
Figure 23 shows the electron isolation efficiency measured in data recorded in 2017 and the corresponding data-to-MC simulation ratios as a function of the electron E T and η, and of the number of interactions per bunch crossing for the isolation working points summarized in table 2. The pile-up correction to the calorimeter isolation is applied, and reduces the pile-up dependence of the isolation efficiency by about a factor of five. These results are obtained using a sample enriched in Z → ee events, where the electrons satisfy the Medium identification. The method used to compute the electron isolation efficiency and the associated uncertainties are described in ref. [2]. For Gradient, a jump in the efficiency is observed at the transition point of 15 GeV because the value of the isolation efficiency is process dependent: the cut maps are optimized with J/ψ → ee events below 15 GeV, while the measurement is performed with Z → ee events in the full range. The Tight operating point gives the highest background rejection below 60 GeV and the most significant difference in shape in η. As the name suggests, HighPtCaloOnly gives the highest rejection in the high-E T region (E T > 100 GeV). The Gradient and Tight operating points give the highest pile-up dependency, the isolation efficiency decreasing from ∼95% at low µ to ∼85% when µ is around 70-80.
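The difference between a fixed-cut and a fixed-efficiency working point can be made concrete with a short sketch. It assumes, purely for illustration, that the target efficiency varies linearly in p T between the two quoted points (90% at 25 GeV, 99% at 60 GeV); the actual definition is the one given in table 2. The cut in a given bin is then taken as the quantile of a simulated isolation distribution that reaches the target. All names are hypothetical.

```python
import math


def gradient_target_efficiency(pt_gev):
    """Assumed linear target efficiency through (25 GeV, 0.90) and (60 GeV, 0.99)."""
    eff = 0.90 + (0.99 - 0.90) * (pt_gev - 25.0) / (60.0 - 25.0)
    return min(max(eff, 0.0), 1.0)  # keep the value inside [0, 1]


def cut_for_target_efficiency(iso_values, target_eff):
    """Smallest cut such that at least target_eff of the values satisfy value <= cut."""
    ordered = sorted(iso_values)
    k = math.ceil(target_eff * len(ordered))
    k = min(max(k, 1), len(ordered))
    return ordered[k - 1]
```

Repeating the quantile computation in bins of E T and η would yield a cut map of the kind described above, whereas a fixed-cut working point simply applies the same threshold everywhere.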
The overall differences between data and MC simulation are less than approximately 1-5% depending on the working point, with the largest difference observed for Tight isolation. For electrons with E T higher than 500 GeV no measurement can be performed because of the limited number of data events, and the results from the E T bin [300,500] GeV are used with an additional systematic uncertainty varying between 0.1% and 1.7%, depending on the isolation working point. The overall scale factor uncertainties range from about 5% for electrons with E T below 7 GeV, to less than 0.5% towards high E T .

Photon isolation criteria and efficiency measurements
Three photon isolation operating points are defined using requirements on the calorimeter and track isolation variables, as summarized in table 3. For the calorimeter-based photon isolation variables a discrepancy between the peak positions of their distributions in data and simulation has been observed since Run 1 [40], pointing to a mismodelling in simulation of the lateral profile development of the electromagnetic showers. As a result, the photon isolation efficiencies in data and simulations disagree, leading to scale factors significantly different from 1.
Table 2. Definition of the electron isolation working points and isolation efficiency ε. In the Gradient working point definition, the unit of p T is GeV. All working points use a cone size of ∆R = 0.2 for calorimeter isolation and ∆R max = 0.2 for track isolation.
Working point    Calorimeter isolation    Track isolation

These discrepancies are mitigated by applying data-driven shifts to the calorimeter isolation variables for photons in simulation. The shifts are obtained by performing fits to the calorimeter isolation variable distribution, using Crystal Ball pdfs [41], in regions dominated by real photons, in data and simulation. The fits are performed in bins of photon η, E T and conversion status, separately for E cone20 T and E cone40 T isolation variables. The difference in the fitted peak values between data and simulation defines the shift value, which is added to the photon calorimeter isolation values in simulation. Figure 24 illustrates the data-driven shifts obtained with 2017 data and the Pythia8 simulation for the E cone20 T and E cone40 T isolation variables in two η regions. Figure 25 shows the distribution of the E cone40 T isolation variable in 2017 data and simulation, using Z → ℓℓγ events after the data-driven shifts are applied.
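The logic of the data-driven shift (peak in data minus peak in simulation, added to the simulated values) can be sketched as follows. As a deliberate simplification, the peak is estimated here as the centre of the most populated histogram bin rather than from a Crystal Ball fit; the binning, range and function names are assumptions made for the example.

```python
def peak_position(values, lo, hi, n_bins):
    """Centre of the most populated bin: a crude stand-in for a fitted peak."""
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in values:
        if lo <= v < hi:
            index = min(int((v - lo) / width), n_bins - 1)
            counts[index] += 1
    best = max(range(n_bins), key=counts.__getitem__)
    return lo + (best + 0.5) * width


def data_driven_shift(data_values, mc_values, lo=-5.0, hi=15.0, n_bins=40):
    """Shift (GeV) to add to simulated isolation values: peak(data) - peak(MC)."""
    return peak_position(data_values, lo, hi, n_bins) - peak_position(
        mc_values, lo, hi, n_bins
    )


def apply_shift(mc_values, shift):
    """Return the simulated isolation values after the shift is added."""
    return [v + shift for v in mc_values]
```

In practice the function would be called once per bin of photon η, E T and conversion status, and separately for each cone size.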
The photon isolation efficiency is studied in two main signatures: radiative Z decays (valid for 10 < E T < 100 GeV) and inclusive photons (used in the 25 GeV < E T < ∼1.5 TeV range).

Measurement of photon isolation efficiency with radiative Z decays
As detailed in section 7, final-state radiation in Z-boson decays provides a clean environment to probe photons in the low-E T range. Using the same method as for the photon identification, photon isolation efficiencies are measured for the operating points presented in table 3. The evolution of the isolation efficiency measured in 2017 data as a function of η and E T is illustrated in figure 26, together with the data-to-simulation efficiency ratio. The overall differences between data and simulation are less than approximately 5%. The decrease of efficiency with increasing pile-up activity is shown in figure 27. A loss of efficiency of ∼10% is measured when increasing µ from 15 to 60. This loss is well described by the simulation.

Photon calorimeter isolation efficiency measurement with inclusive-photon events
Photon isolation studies with inclusive-photon events are performed using two different methods for the calorimeter-based and track-based isolations. This is because the distribution of the track isolation variable shows a large peak at p cone20 T = 0 followed by a 1 GeV gap, due to the selection of the tracks entering the p cone20 T computation, and by a small tail, and cannot be fitted with an analytic function. In consequence, the efficiency measurement is done separately for the track isolation and calorimeter isolation criteria applied to define the working points presented in table 3. When the measurement is performed for the track-based (calorimeter-based) isolation, the requirements on the calorimeter-based (track-based) isolation are applied at preselection level to reduce the background from jets.
The photon calorimeter isolation (calo-only) efficiency with inclusive-photon events is obtained by fitting the distribution of the calorimeter isolation, E cone40 T or E cone20 T , minus the relevant E T fraction (0.022 × E T for Tight and 0.065 × E T for Loose), hereafter simply called the isolation distribution. The measurement is performed in bins of photon η, E T , conversion status and data-taking period. The Pythia8 inclusive-photon sample described in section 3.2 is used for the true photon template.
A set of alternate selections is used to determine the isolation distributions for the background and their uncertainty. These criteria, denoted LoosePrimeN, select photon candidates that pass the Loose identification but fail at least one out of N shower shape cuts used in the Tight identification. The nominal background template is obtained from photon candidates passing the LoosePrime4 identification. As in the measurements of the data-driven shifts, the photon isolation efficiency is obtained by performing a set of fits in regions defined in simulation and data. Although background enriched, the sample passing LoosePrime4 also contains true photons that fail the Tight identification requirement; these are defined as 'leakage' photons and subtracted. The sequence of fits proceeds as follows:
1. A model for the isolation distribution for signal photons is defined from a fit, using a Crystal Ball function, to the isolation distribution obtained for tightly identified photons in simulation.
2. The corresponding model for leakage photons is defined from a fit to the isolation distribution obtained for LoosePrime4 photons in simulation.
Finally, the calo-only isolation efficiency in data is obtained by integrating the background-subtracted isolation distribution for tightly identified photons in data, up to the working point cut-off of 0 GeV (Loose) or 2.45 GeV (Tight and TightCaloOnly). Three sources of systematic uncertainty are considered: discrepancies between the fitted isolation distribution and that observed for photons in data; differences between results obtained using LoosePrime3 and LoosePrime5 instead of LoosePrime4 for the determination of the background templates; and uncertainties in the estimation of the number of leakage photons in the LoosePrime4 sample. A binomial statistical error in the scale factors is also calculated and added in quadrature to the systematic components.
The calo-only isolation efficiencies measured with inclusive-photon events in 2017 data are shown in figure 28. The overall differences between data and simulation increase from a few percent in the low E T region up to 15% at high E T (> 200 GeV) for the TightCaloOnly working point, and only up to 5% for Loose and Tight.
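The final integration step can be illustrated with a binned sketch. It assumes that the background yield per bin has already been estimated by the template fits, counts only bins lying entirely below the cut-off, and uses the background-subtracted yield as the number of trials in the binomial error, which is a simplification. All names are hypothetical.

```python
import math


def isolation_distribution_value(et_cone, et, fraction):
    """Isolation energy minus the E_T-dependent term, e.g. fraction = 0.022 or 0.065."""
    return et_cone - fraction * et


def calo_isolation_efficiency(bin_edges, data_counts, bkg_counts, cut):
    """Efficiency and binomial error from a background-subtracted histogram.

    bin_edges   : n + 1 increasing bin edges in GeV
    data_counts : n yields of tightly identified photon candidates
    bkg_counts  : n estimated background yields
    cut         : working-point cut-off in GeV (0 for Loose, 2.45 for Tight)
    """
    passed = 0.0
    total = 0.0
    bins = zip(bin_edges[:-1], bin_edges[1:], data_counts, bkg_counts)
    for _low, high, n_data, n_bkg in bins:
        signal = n_data - n_bkg
        total += signal
        if high <= cut:  # bin lies entirely below the cut-off
            passed += signal
    eff = passed / total
    err = math.sqrt(eff * (1.0 - eff) / total)
    return eff, err
```

Choosing bin edges that coincide with the cut-off avoids any interpolation inside a bin.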

Photon track-based isolation efficiency measurement with inclusive-photon events
As in the measurement of the calo-only photon isolation efficiency, the main source of background comes from jets misidentified as photons. This background is estimated with a template fit to the track isolation distribution, in a region enriched in background photons satisfying LoosePrime4 but failing the Tight identification criterion. The track-only photon isolation efficiency is measured in a signal region enriched in tightly identified photons, after the background is subtracted. To assign the systematic uncertainties, the fit range is varied as well as the definition of the background template, where the photons are required to pass LoosePrime2, LoosePrime3 or LoosePrime5 instead of the LoosePrime4 criterion. Efficiencies for each configuration are computed, and with them the corresponding scale factors. Once the different scale factors are calculated, a bin-by-bin scan is performed, keeping the largest deviation from the nominal value among the considered variations. The total uncertainty is obtained by adding the systematic and statistical components in quadrature.
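The bin-by-bin scan over variations and the quadrature sum can be sketched in a few lines. The scale factors are assumed to be stored as equal-length lists, one entry per bin; the names are illustrative.

```python
import math


def envelope_systematic(nominal, variations):
    """Largest absolute deviation from the nominal value in each bin."""
    return [
        max(abs(variation[i] - nom) for variation in variations)
        for i, nom in enumerate(nominal)
    ]


def total_uncertainty(stat, syst):
    """Statistical and systematic components added in quadrature, bin by bin."""
    return [math.hypot(s, y) for s, y in zip(stat, syst)]
```

Here `variations` would hold one list of scale factors per alternative configuration, for example one per background-template definition and one per fit range.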
The track-only isolation efficiencies measured with inclusive-photon events in 2017 data are shown in figure 29. The ratio of the data to MC simulation is close to unity.

Combination of photon isolation scale factors
The photon isolation scale factors are measured for the three isolation working points detailed in table 3 using radiative Z decays and inclusive-photon events. The different results are combined to obtain one set of scale factors per working point, data-taking year, and photon conversion status. The combination is performed in two steps. First, the track-only and calo-only scale factors determined with inclusive-photon events are multiplied together to obtain a single set per configuration. These inclusive-photon scale factors are further combined with those determined with radiative Z decay events using a simple weighted average. The uncertainties in the track-only and calo-only results obtained with inclusive-photon events are treated as fully correlated in the combination, while the uncertainties in the radiative-Z and inclusive-photon measurement results are treated as uncorrelated. The combination is performed for 25 < E T < 100 GeV; below 25 GeV, only results from radiative Z decays are available, while above 100 GeV the results are obtained with inclusive-photon events only. If, in a given (|η|, E T ) bin, the total uncertainty in the combined scale factor does not cover the difference between the values obtained from the two samples, it is scaled such that χ 2 = 1. Above 1.5 TeV, the results obtained in the last bin used for the measurement are considered, with no change in the systematic uncertainty.
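The two-step combination can be sketched as follows, under stated assumptions: fully correlated relative uncertainties are added linearly when the two inclusive-photon scale factors are multiplied, the average uses inverse-variance weights, and the χ 2 of the two inputs about their average is used to inflate the uncertainty whenever it exceeds 1. The function names are hypothetical.

```python
import math


def product_fully_correlated(sf_a, err_a, sf_b, err_b):
    """Product of two scale factors whose uncertainties are fully correlated."""
    sf = sf_a * sf_b
    err = sf * (err_a / sf_a + err_b / sf_b)  # relative errors add linearly
    return sf, err


def weighted_average(values, errors):
    """Inverse-variance weighted average of uncorrelated measurements.

    If chi2 > 1, the uncertainty is scaled by sqrt(chi2), which is equivalent
    to inflating the input errors until chi2 = 1.
    """
    weights = [1.0 / e**2 for e in errors]
    mean = sum(w * v for w, v in zip(weights, values)) / sum(weights)
    err = 1.0 / math.sqrt(sum(weights))
    chi2 = sum(((v - mean) / e) ** 2 for v, e in zip(values, errors))
    if chi2 > 1.0:
        err *= math.sqrt(chi2)
    return mean, err, chi2
```

For two measurements of 1.00 ± 0.01 and 1.02 ± 0.01, the average is 1.01 with χ 2 = 2, so the naive uncertainty of about 0.007 is inflated to 0.01, which covers the difference between each input and the average.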
For E T < 25 GeV, the measurements achieve a typical uncertainty of about 2%, and at worst 5-10% for E T < 15 GeV. For E T > 100 GeV, uncertainties around 1-2% are obtained. For 25 < E T < 100 GeV, the combination of the two channels reduces the scale factor uncertainties to about 1% on average.

Electron charge misidentification
The reconstruction of the electric charge of an electron relies solely on the measurement of the curvature of its associated track in the inner detector. Interactions of an electron with the detector material can create secondary particles: photons and electron-positron pairs. The production of these secondary particles can lead to distortions of the primary electron track, e.g. hits from the secondary particle being included in the fit of the primary electron track, and the presence of additional tracks of secondary particles in the vicinity of the primary electron track. Incorrect charge reconstruction can thus be caused either by an incorrect determination of the track curvature, or by the choice of an incorrect track.

Figure 29. Efficiency of the different track-only isolation working points for photons from inclusive-photon events, as a function of photon E T in two η bins (|η| < 0.6 top, and |η| > 1.81 bottom). The results are shown for converted (left) and unconverted (right) photons. The lower panel shows the ratio of the efficiencies measured in data and in simulation. The total uncertainties are shown, including the statistical and systematic components.
For electrons at high transverse momentum, the first effect becomes dominant and leads to an almost linear increase with energy in the probability to determine the sign of the curvature incorrectly. Final-state radiation emitted collinearly off the electron can also cause charge misidentification if the radiated photon subsequently converts to an electron-positron pair in the detector material. Here, the correct or incorrect charge is assigned with equal probability. The electric charge is heavily used as a selection criterion in measurements with the ATLAS experiment, and hence understanding the effects of charge misidentification is important. Some specific signatures also require the suppression of electron charge misidentification in order to reduce background.

Suppression of electron charge misidentification
The suppression of electron charge misidentification is based on the output discriminant of a boosted decision tree (BDT). A previous version, optimized for data recorded in 2015 and 2016, rejected 90% of electrons with incorrectly reconstructed charge, removing only 3% of electrons with correctly reconstructed charge [2]. The optimization was based on simulated electrons and showed a higher rejection than observed in data. In the following, a re-optimization of the BDT is described. Data from Z → ee decays are used to reduce efficiency losses due to mismodelling of the input variables in the BDT training. Furthermore, additional input variables have been studied.
To select a relatively clean sample of electrons with correctly and incorrectly reconstructed charge, one of the electrons is restricted to |η| < 0.6, required to satisfy Tight identification and to pass the 97% operating point of the previous BDT discriminant. These requirements minimize charge misidentification for this electron. Any additional reconstructed electron in the event is used to train the BDT, as a signal electron if it has an electric charge different from the first electron, and as a background electron if the electric charge is the same. To reduce background from converted photons from initial- or final-state radiation, the invariant mass of any pair of electrons must lie within 5 GeV of 90 GeV in opposite-charge events and within 5 GeV of 88 GeV in same-charge events. The lower value used in same-charge events accounts for the fact that electrons with the incorrect charge have a higher probability for energy loss, as discussed in section 9.2 and illustrated in figure 30a.
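The labelling of training electrons described above can be written as a short sketch. The function name and return values are hypothetical, and the selection of the first (well-measured) electron is assumed to have been applied beforehand.

```python
def label_training_electron(tag_charge, probe_charge, m_ee_gev):
    """Return 'signal', 'background', or None if the pair fails the mass window.

    Opposite-charge pairs must lie within 5 GeV of 90 GeV and are used as
    signal; same-charge pairs must lie within 5 GeV of 88 GeV and are used
    as background (electrons with incorrectly reconstructed charge).
    """
    opposite = tag_charge * probe_charge < 0
    centre = 90.0 if opposite else 88.0
    if abs(m_ee_gev - centre) > 5.0:
        return None
    return "signal" if opposite else "background"
```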
Input quantities to the BDT are the electron E T and η, and a set of additional variables. In decreasing order of separation power, these are: the transverse impact parameter multiplied by the electron electric charge q × d 0 , the average charge of all tracks matched to the electron weighted by their number of hits in the SCT detector q SCT , E/p and ∆φ res . With q SCT the BDT includes for the first time the reconstructed properties of additional tracks in the vicinity of the electron, which improves rejection in cases where the incorrect track is chosen as the primary electron track.
The efficiency of the requirement on the BDT is 98% in Z → ee events for electrons satisfying Medium or Tight identification with the Tight isolation requirement, and that have the correct electric charge. Approximately 90% of electrons with the same identification and isolation requirements but incorrect electric charge are removed. This re-optimization of the BDT variables has improved the efficiency of the selection criterion, leaving the rejection of electrons with misidentified charge unchanged.

Measurement of the probability for charge misidentification
The probability for electron charge misidentification is measured in seven bins in η and six E T bins in the range 20 GeV < E T < 95 GeV in Z → ee events. The events were collected with the dielectron triggers discussed in section 3.1 with transverse momentum thresholds of 17 GeV or less and Loose trigger identification, allowing the measurement to be extended to lower values of E T and looser identification criteria than previous measurements. Both electrons in the event are selected with the same identification and isolation criteria and, respectively, fall into bins i and j in η, E T , yielding N i j Z → ee events. Their invariant mass must lie within 10 GeV of the nominal Z-boson mass. The probabilities of the electron charge misidentification in bins i and j, ε i and ε j , maximize the Poisson probability P(λ i j | n sc i j ), where:

$$\lambda_{ij} = N_{ij}\left[\epsilon_i (1 - \epsilon_j) + \epsilon_j (1 - \epsilon_i)\right] + B^{\mathrm{sc}}_{ij},$$

and n sc i j is the number of same-charge Z → ee events. The number of background events in the sample where both electrons have the same electric charge, B sc i j , consists of misidentified electrons from multijet production and electrons from converted photons from the aforementioned final-state radiation. The two components are estimated in a sideband subtraction and from simulation, respectively. The selected data and the estimated background are shown in figure 30a for an example bin. Sources of systematic uncertainties in the measurement are the estimation of the background from multijet production and final-state radiation, and the restriction of the dielectron invariant mass. Possible biases in the experimental method used to perform the measurement are evaluated by comparing the charge misidentification probability obtained in the likelihood maximization in simulation with those obtained using generator-level information.
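The likelihood maximization can be illustrated with a self-contained sketch. It assumes that the expected number of same-charge events in a pair of bins is the total yield times the probability that exactly one of the two electrons has a wrong charge, plus the background, and it replaces a general-purpose minimizer with a simple coordinate-wise ternary search, which is adequate only for small, well-behaved examples. All names are hypothetical.

```python
import math


def expected_same_charge(n_total, eps_i, eps_j, bkg=0.0):
    """Expected same-charge yield: exactly one of the two charges is wrong."""
    return n_total * (eps_i * (1.0 - eps_j) + eps_j * (1.0 - eps_i)) + bkg


def neg_log_likelihood(eps, n_total, n_sc, bkg):
    """Poisson negative log-likelihood, dropping terms independent of eps."""
    nll = 0.0
    for (i, j), n_ij in n_total.items():
        lam = expected_same_charge(n_ij, eps[i], eps[j], bkg.get((i, j), 0.0))
        nll += lam - n_sc[(i, j)] * math.log(lam)
    return nll


def fit_misid_probabilities(n_total, n_sc, bkg, n_bins, sweeps=50):
    """Minimize the NLL one probability at a time with a ternary search."""
    eps = [0.01] * n_bins
    for _sweep in range(sweeps):
        for k in range(n_bins):
            lo, hi = 1e-6, 0.5
            for _step in range(60):
                m1 = lo + (hi - lo) / 3.0
                m2 = hi - (hi - lo) / 3.0
                eps[k] = m1
                f1 = neg_log_likelihood(eps, n_total, n_sc, bkg)
                eps[k] = m2
                f2 = neg_log_likelihood(eps, n_total, n_sc, bkg)
                if f1 < f2:
                    hi = m2
                else:
                    lo = m1
            eps[k] = 0.5 * (lo + hi)
    return eps
```

The inputs are dictionaries keyed by the bin pair (i, j); a closure test on pseudo-data generated with known probabilities is a natural first check of such a fit.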
The kinematic range of E T > 95 GeV is particularly relevant for searches for physics beyond the Standard Model with same-charge signatures. For a measurement with high granularity, the double differential charge misidentification probabilities are factorized into an η- and an E T -dependent part. This approach allows measurements in 5 bins in E T and 14 bins in η with reasonable statistical precision from a sample of approximately 9000 electrons with the incorrect charge assignment (for Tight identification). The systematic uncertainty in the parameterization is assessed by comparing, double differentially, the ratio of same-charge events and opposite-charge events, weighted with the charge misreconstruction probability, in data and simulation. The systematic uncertainty is derived by incrementing the uncertainty in steps of 1% until the χ 2 value falls below 1, separately in each bin in E T .
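The stepwise inflation of the systematic uncertainty can be sketched as below. Two assumptions are made that are not specified in the text: the χ 2 is taken per compared point (a reduced χ 2 ), and the 1% steps are interpreted as a relative uncertainty on the simulated ratio, added in quadrature to the statistical error. The names are illustrative.

```python
def chi2_per_point(data, sim, stat_err, rel_syst):
    """Average squared pull between data and simulation for a given relative systematic."""
    terms = [
        (d - s) ** 2 / (e**2 + (rel_syst * s) ** 2)
        for d, s, e in zip(data, sim, stat_err)
    ]
    return sum(terms) / len(terms)


def inflate_until_compatible(data, sim, stat_err, step=0.01, max_steps=1000):
    """Smallest multiple of `step` for which the chi2 per point falls below 1."""
    for n in range(max_steps + 1):
        rel_syst = n * step
        if chi2_per_point(data, sim, stat_err, rel_syst) < 1.0:
            return rel_syst
    raise ValueError("no compatible systematic uncertainty found")
```

The function would be called once per E T bin, with the lists running over the η bins of that E T slice.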
The interactions with material in the inner detector causing electron charge misidentification can also lead to significant energy loss and leakage of energy outside the EM cluster, introducing a correlation between the two effects. In figure 30b, the charge misidentification probability is shown as a function of the energy response, (p reco T − p true T )/p true T , in several bins of η. It increases with the difference between true and reconstructed electron energy. The same effect causes the differences in reconstructed invariant mass between opposite-charge and same-charge events shown in figure 30a. The correlation with the energy response complicates the measurement of charge misidentification probabilities in data. The probability measurement is blind as to which of the two electrons has the incorrect charge assignment. Hence, the probabilities determined from the likelihood maximization are used to form data-to-simulation probability ratios. No significant dependence of the data-to-simulation ratios on the dilepton invariant mass has been observed. The charge misidentification probabilities in data are obtained by multiplying the data-to-simulation probability ratios by the charge misidentification probabilities computed in the simulation, where the electron with the incorrect charge assignment is unambiguous. The probabilities in data are shown in figure 31 for several combinations of identification and isolation operating points. For Medium identification with Tight isolation, the electron charge misidentification probability in Z → ee events is smallest in the central region of the detector at 0.05%, and increases to 2.7% at high |η|. As a function of E T it increases approximately linearly from 0.28% at E T = 20 GeV to 1.7% at E T = 120 GeV. With Tight instead of Medium identification, a reduction of charge misidentification by 25%-50%, depending on E T and η, is seen. 
The BDT presented in section 9.1 further reduces the misidentification probability by a factor of about five, on average over the detector acceptance, and by up to a factor of 10 at high pseudorapidity.

Conclusions
The reconstruction of electrons and photons based on a dynamical, topological cell clustering algorithm has been described, and the corresponding updates to the methods used for the identification of the candidates and the estimation of their energy have been discussed. The rejection of non-isolated particles and of mismeasured electron candidates has been re-optimized accordingly.
The dynamical cell clustering algorithm provides an electron and photon reconstruction efficiency similar to that of the sliding-window reconstruction. A relative improvement of about 15% is obtained in the reconstruction efficiency for two-track photon conversions. The misclassification of unconverted photons as single-track TRT conversions is reduced by a factor of two, while the single-track conversion reconstruction efficiency only decreases by 5 to 10%. The present algorithm also provides a better energy measurement, with a relative improvement in resolution by about 15% in the barrel, and about 20-25% in the endcap, for electrons and converted photons. The resolution for unconverted photons is unchanged.
Energy scale and resolution corrections have been measured using electrons from Z → ee decays. A significant dependence of the corrections on the amount of pile-up has been observed, reflecting a mismodelling of the calorimeter activity in minimum-bias events. The uncertainty in the energy scale corrections ranges from 4 × 10 −4 in the barrel to 2 × 10 −3 in the endcap. The uncertainty in the constant-term resolution corrections is typically 1-2 × 10 −3 . The electron-based energy calibration has been verified for photons, using radiative Z-boson decays, to a precision of 0.5% at worst.
The identification of electrons and photons has been revisited to match the improved cell clustering procedure. For electrons, identification efficiencies vary from 93% for the Loose identification criterion, to 80% for the Tight criterion, for electrons from Z-boson decays. The simulation models these efficiencies to a precision of 2% for Loose electrons and 5% for Tight electrons, respectively. The efficiency correction factors are measured with a typical precision of 0.2%. In the case of photons, the identification efficiency reaches 92% for unconverted photons, and 98% for converted photons, for E T ∼ 70 GeV and above. The precision of the efficiency correction factors ranges from 7% at low E T to 0.5% at high E T for unconverted photons, and from 12% to 1% for converted photons.
Several electron and photon isolation selection criteria have been defined, targeting a range of processes with varying event activity. The efficiencies of the isolation selections vary from about 99% for the loosest, to about 90% for the tightest criterion, depending on the physics process. Tight isolation selections exhibit a steeply rising efficiency as a function of E T ; for all isolation criteria, the selection efficiency varies by about 10% as a function of µ , for the range of µ spanned by the present dataset. Differences in efficiency between data and simulation range from 1% to 5%, depending on |η| and E T .
A dedicated algorithm has been implemented to reject electrons with badly measured track parameters, with the main objective of reducing the fraction of electron candidates with wrongly measured charge. This fraction, rising from less than 0.1% in the barrel to about 3% at high |η| for all candidates, is reduced by a factor of three to five as a function of E T , and by up to a factor of ten at high |η|. The simulation is found to model the data within 20% for the residual fraction of wrong-charge electron candidates, and the corresponding correction factors are measured with about 50% precision.
The present results define the baseline performance of the ATLAS detector for searches and measurements using electrons and photons from LHC proton-proton collision data collected at √ s = 13 TeV.