Reconstruction of signal amplitudes in the CMS electromagnetic calorimeter in the presence of overlapping proton-proton interactions

A template fitting technique for reconstructing the amplitude of signals produced by the lead tungstate crystals of the CMS electromagnetic calorimeter is described. This novel approach is designed to suppress the increased out-of-time pileup contribution to the signal following the reduction of the accelerator bunch spacing from 50 to 25 ns at the start of Run 2 of the LHC. Execution of the algorithm is sufficiently fast for it to be employed in the CMS high-level trigger. It is also used in the offline event reconstruction. Results obtained from simulations and from collision data demonstrate a substantial improvement in the energy resolution of the calorimeter over a range of energies extending from a few GeV to several tens of GeV.


Introduction
The CMS experiment at the LHC is a general-purpose detector, based on a superconducting solenoid of 6 m internal diameter, providing a magnetic field of 3.8 T. It is equipped with several subdetectors, including a lead tungstate (PbWO₄) crystal electromagnetic calorimeter (ECAL), which is the focus of this report. A more detailed description of the CMS detector is given in Ref. [1].
The ECAL is located within the solenoid. It consists of 61 200 PbWO₄ crystals mounted in the barrel section (EB), covering the pseudorapidity range |η| < 1.48, closed by 7324 crystals in each of the two endcaps (EE), covering the range 1.48 < |η| < 3.0. The EB uses 23 cm long crystals with front-face cross sections of approximately 2.2×2.2 cm², while the EE contains 22 cm long crystals with a front-face cross section of 2.86×2.86 cm². The scintillation light is detected by avalanche photodiodes (APDs) in the EB and by vacuum phototriodes (VPTs) in the EE. The PbWO₄ crystals have a Molière radius of 2.19 cm, approximately matching the transverse dimensions of the crystals. A preshower detector, consisting of two planes of silicon sensors interleaved with lead for a total of 3 radiation lengths, is located in front of the EE [2].
The LHC operating conditions during Run 2 data taking (2015-18) were more challenging than those of Run 1 (2010-13) in several respects. The center-of-mass energy of the collisions was raised from 8 to 13 TeV, the bunch spacing (the time interval between neighboring bunches) was halved from 50 ns to the design value of 25 ns, and the instantaneous luminosity reached 2.05×10³⁴ cm⁻²s⁻¹, compared to the 0.75×10³⁴ cm⁻²s⁻¹ achieved in 2012.
The mean number of interactions in a single bunch crossing (BX), termed pileup (PU), in Run 2 was 38, with the tail of the distribution extending up to 80. For comparison, during Run 1 in 2012, the mean value was 21 interactions per BX, with an extreme value of 40. After shaping by the electronics, the ECAL signals extend over several hundred nanoseconds. Consequently, the decrease in the LHC bunch spacing from 50 to 25 ns results in an increased number of overlapping signals from neighboring BXs, referred to as out-of-time (OOT) pileup. These spurious signals effectively add to the electronic noise and degrade the energy resolution of the calorimeter. To reduce these effects, an innovative ECAL amplitude reconstruction procedure was introduced, named "multifit". The new algorithm replaces the one used during Run 1 ("weights" method) [3], which was based on a digital-filtering technique. The original algorithm performed well under the conditions of Run 1, but was not suitable for Run 2 because of the increased OOT pileup.
Each ECAL signal is digitized as a series of ten consecutive samples, spaced by 25 ns, one per time sample. The first three samples are read out before the signal pulse rises significantly from the pedestal baseline (presamples). The 50 ns rise time of the signal pulse after amplification results from the convolution of the 10 ns decay time of the crystal scintillation emission with the 40 ns shaping time of the multi-gain preamplifier (MGPA) [1-3].

Amplitude reconstruction of ECAL signals
During LHC Run 1, a weighting algorithm [3] was used to estimate the ECAL signal amplitudes, both online in the high-level trigger (HLT) [4] and in the offline reconstruction. With that algorithm the amplitude Â is estimated as a linear combination of the 10 samples s_i:

Â = Σ_i w_i s_i ,   (1)

where the weights w_i are calculated by minimizing the variance of Â. This algorithm was developed to provide an optimal reduction of the electronics noise and a dynamic subtraction of the pedestal, which is estimated on an event-by-event basis as the average of the presamples.
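The closed-form solution for such variance-minimizing weights, with the usual side conditions that the weights have unit gain on the signal shape and reject a constant pedestal, can be sketched as follows. This is a toy numpy illustration with an invented 10-sample pulse shape, not the measured ECAL templates or the actual CMS implementation:

```python
import numpy as np

def optimal_weights(p, C):
    """Weights minimizing Var(A_hat) subject to w.p = 1 (unit gain on the
    signal shape) and w.1 = 0 (insensitivity to a constant pedestal),
    solved in closed form with Lagrange multipliers."""
    G = np.column_stack([p, np.ones_like(p)])     # constraint matrix
    CinvG = np.linalg.solve(C, G)
    lam = np.linalg.solve(G.T @ CinvG, np.array([1.0, 0.0]))
    return CinvG @ lam

# toy 10-sample pulse shape (peak at sample 5) and white noise covariance
p = np.array([0, 0, 0, 0.1, 0.6, 1.0, 0.85, 0.65, 0.5, 0.4])
w = optimal_weights(p, np.eye(10))

true_A, pedestal = 25.0, 200.0
A_hat = w @ (pedestal + true_A * p)   # pedestal cancels by construction
```

With these constraints the pedestal cancels exactly for a noiseless pulse, while the covariance matrix C controls how the noise is suppressed.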
The LHC Run 2 conditions placed stringent requirements on the ECAL pulse reconstruction algorithm. Several methods were investigated to mitigate the effect of the increased OOT pileup, to achieve optimal noise performance. The methods that were studied included: using a single sample at the signal pulse maximum, a deconvolution method converting the discrete time signal into the frequency domain [5], and a template fit.
The new method uses a template fit with N_BX parameters, comprising one in-time (IT) and up to nine OOT amplitudes, with up to five occurring before and up to four after the IT one. The fit minimizes the χ² defined as:

χ² = (S − Σ_j A_j p_j)ᵀ C⁻¹ (S − Σ_j A_j p_j) ,   (2)

where the vector S comprises the 10 readout samples s_i after subtraction of the pedestal value, the p_j are the pulse templates for each BX, and the A_j, obtained from the fit, are the signal pulse amplitudes in ten consecutive BXs, with A_5 corresponding to the IT BX. The pulse templates p_j for each BX have the same shape, but are shifted in time by j multiples of 25 ns. The pulse templates are described by binned distributions with 15 bins of width 25 ns. An extension of five additional time samples after the 10th sample (the last digitized one) is used to obtain an accurate description of the contribution to the signal from early OOT pulses with tails that overlap the IT pulse.
The total covariance matrix C used in the χ² minimization of Eq. (2) includes the correlation of the noise and of the signal between the different time samples. It is defined as the quadratic sum of two contributions:

C = C_noise + Σ_j A_j² C_j^pulse ,   (3)

where C_noise is the covariance matrix associated with the electronics noise and C_j^pulse is the one associated with the pulse shape template of BX j. Each channel of the ECAL, i.e., a crystal with its individual readout, is assigned its own covariance matrix. Quadratic summation of the two components is justified since the variance of the pulse templates is uncorrelated with the electronic noise. In fact, the uncertainty in the shape of the signal pulses for a given channel is dominated by event-by-event fluctuations of the beam spot position along the z-axis, of the order of several cm [6], which affect the arrival time of particles at the front face of the ECAL.
The C_pulse matrix is calculated as:

C_{ik}^pulse = ⟨ (s̄_i(n) − P̄_i)(s̄_k(n) − P̄_k) ⟩ ,   (4)

where the s̄_i(n) are the sample values s_i(n) scaled, for each event n, such that s̄_5(n) = 1, the P̄_i are the expected values of the s̄_i, and the average ⟨·⟩ runs over many events. The value of P̄ for the presamples equals the average of the three presamples over many events. Both the templates and their covariance matrices are estimated from collision data and may vary with time, for reasons described in Section 3.1. The electronics noise dominates the uncertainty for low-energy pulses, whereas the uncertainty in the template shape dominates at higher energies. The determination of C_noise, which is calculated analogously to C_pulse but with dedicated data, is described in Section 3.2.
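The normalization s̄_5(n) = 1 has two direct consequences visible in the covariance matrix: the variance of sample 5 vanishes, and timing-like fluctuations induce an anti-correlation between the rising edge and the falling tail. A small numpy pseudo-experiment makes this explicit; the pulse shape and the jitter model are invented purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
shape = np.array([0, 0, 0, 0.1, 0.6, 1.0, 0.85, 0.65, 0.5, 0.4])

# pseudo-events: a common shape distorted by a small timing-like jitter,
# modeled to first order via the shape derivative (toy model)
jitter = 0.02 * rng.standard_normal((5000, 1))
pulses = shape + jitter * np.gradient(shape)

# Eq.-(4)-style normalization: scale each event so the peak sample is 1
norm = pulses / pulses[:, [5]]
C_pulse = np.cov(norm, rowvar=False, bias=True)
```

Sample 5 has exactly zero variance by construction, and the rising-edge samples come out anti-correlated with the tail samples, qualitatively reproducing the structure described in Section 3.1.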
The minimization of the χ² in Eq. (2) has to be robust and fast enough for use both in the offline CMS reconstruction and at the HLT. In particular, the latter has tight computation-time constraints, especially in the EB, where the number of channels read out for every triggered BX (typically 1000 and as high as 4000) poses a severe limitation on the time allowed for each minimization. Therefore, the use of general minimization algorithms like MINUIT [7] is excluded. Instead, the technique of nonnegative least squares [8] is used, with the constraint that the fitted amplitudes A_j are all nonnegative. The χ² minimization is performed iteratively: all the amplitudes are initially set to zero, and one nonzero amplitude at a time is added. The evaluation of the inverse matrix C⁻¹, which is the computationally intensive operation, is iterated until the χ² value in Eq. (2) converges (Δχ² < 10⁻³) [9]. Usually convergence is reached with fewer than 10 nonzero fitted amplitudes, so the system is, in general, over-constrained.
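The iterative scheme described above — start from all amplitudes at zero, activate one template at a time, and stop when the χ² no longer improves — can be sketched in numpy. The pulse shape, the 15-bin template construction, and the simplified active-set loop below are illustrative stand-ins for the actual CMS implementation, which uses a full Lawson-Hanson-style nonnegative least squares implemented with EIGEN:

```python
import numpy as np

# 15-bin toy pulse template (3 presamples, peak at bin 5, decaying tail);
# illustrative values, not a measured ECAL shape
shape15 = np.array([0, 0, 0, 0.1, 0.6, 1.0, 0.85, 0.65,
                    0.5, 0.4, 0.32, 0.26, 0.21, 0.17, 0.14])

def shifted_templates(shape15, n_bx=10, n_samples=10):
    """Column j is the template shifted by (j - 5) BXs, truncated to the
    10 readout samples; j = 5 is the in-time pulse."""
    P = np.zeros((n_samples, n_bx))
    for j in range(n_bx):
        for i in range(n_samples):
            k = i + 5 - j          # early pulses (j < 5) use the tail bins
            if 0 <= k < len(shape15):
                P[i, j] = shape15[k]
    return P

def multifit_nnls(S, P, C, tol=1e-3, max_active=10):
    """Nonnegative template fit: whiten with the Cholesky factor of C,
    then activate one amplitude at a time (simplified active-set loop;
    the feasibility pruning of full Lawson-Hanson is omitted)."""
    L = np.linalg.cholesky(C)
    M = np.linalg.solve(L, P)                  # whitened templates
    y = np.linalg.solve(L, S)                  # whitened samples
    A = np.zeros(P.shape[1])
    active = []
    prev_chi2 = float(y @ y)
    for _ in range(max_active):
        grad = M.T @ (y - M @ A)               # -(1/2) d(chi^2)/dA
        cand = [j for j in range(len(A)) if j not in active and grad[j] > 1e-9]
        if not cand:
            break
        active.append(max(cand, key=lambda j: grad[j]))
        A_sub, *_ = np.linalg.lstsq(M[:, active], y, rcond=None)
        if np.any(A_sub < 0):                  # simplification: stop, don't prune
            active.pop()
            break
        A = np.zeros(P.shape[1])
        A[active] = A_sub
        chi2 = float(np.sum((y - M @ A) ** 2))
        if prev_chi2 - chi2 < tol:             # convergence on the chi^2 change
            break
        prev_chi2 = chi2
    return A

# noiseless pseudo-event: in-time amplitude 10 plus an OOT pulse 3 BXs early
P = shifted_templates(shape15)
S = 10.0 * P[:, 5] + 3.0 * P[:, 2]
A = multifit_nnls(S, P, np.eye(10))
```

In this noiseless example the fit activates the in-time template first, then the early OOT one, and recovers both injected amplitudes; with real noise, the total covariance matrix C of Eq. (3) replaces the identity.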
In addition to data samples from calibration sources and collision data, two kinds of simulation samples are used. One is the full detector simulation used for physics analyses, implemented with GEANT4 [10]. These events are used to study the performance of the algorithm when the shower of an electromagnetic particle spreads across more than a single crystal, which is typical of most energy deposits in the ECAL. The second is a standalone simulation, in which single-crystal amplitudes are generated from the expected signal template. When a fast simulation of a single channel is sufficient, this standalone code generates pseudo-experiments using a parametric representation of the pulse shape and the measured covariance matrix, and overlays these signals with energy deposits from the typical PU of Run 2. Examples of fitted pulses in single crystals of the EB and EE are shown in Fig. 1 (right) and (left), respectively. They are obtained from a full detector simulation of photons with transverse momentum p_T = 10 GeV.
Since the only unknown quantities are the fitted amplitudes, the minimization corresponds to the solution of a system of linear equations with respect to a maximum of 10 nonnegative A_j values. The implementation uses a C++ template linear algebra library, EIGEN [11], which is versatile and highly optimized. The time required to compute the amplitudes of all the channels in one event is approximately 10 ms for typical Run 2 events, with 25 ns bunch spacing and an average of 40 PU interactions per BX. The timing was measured on an Intel Xeon E5-2650v2 processor, which was used expressly for the benchmark tests of the CMS HLT farm at the beginning of Run 2 in 2015 [12]. The CPU time needed is about 100 times less than that required to perform the equivalent minimization with MINUIT, and for all events is much less than the maximum time of 100 ms/event allowed for the HLT. The algorithm implementation has also been adapted for execution on GPUs for the new processor farm to be used in LHC Run 3, planned to begin in 2022.

Pulse shape templates
The templates for the p_j term in Eq. (2) are histograms with 15 bins, and represent the expected time distribution of a signal from an energy deposit in the ECAL crystals. The first 10 bins (0-9) correspond to the samples that are read out in a triggered event; bins 10-14 describe the tail of the signal pulse shape.
The pulse template differs from crystal to crystal, both because of intrinsic pulse shape differences and, more importantly, because of differences in the relative time position of the pulse maximum, T max , between channels. The pulse templates have also been found to vary with time, during Run 2, as a result of crystal irradiation. Both of these effects must be corrected for, and the time variation requires regular updates to the pulse shape templates during data taking.
The pulse templates are constructed in situ from collision data acquired with a zero-bias trigger, i.e., a beam activity trigger [4], and events recorded with a dedicated high-rate calibration data stream [13]. In the calibration data stream, the ten digitized samples from all single-crystal energy deposits above a predefined noise threshold are recorded, while the rest of the event is dropped to limit the trigger bandwidth. The energy deposits in these events receive contributions from both IT and OOT interactions. In a fraction of the LHC fills, the circulating beams are configured so that a few of the bunch collisions are isolated, i.e., occur between bunches that are not surrounded by other bunches. In these collisions, the nominal single-bunch intensity is achieved without OOT pileup, so a special trigger requirement to record them was developed. This allows a clean measurement of the templates of IT pulses only. An amplitude-weighted average pulse template is obtained, and only hits with amplitudes larger than approximately five times the root-mean-square spread of the noise are used.
During 2017, the pulse templates were recalibrated about 30 times. The LHC implemented collisions with isolated bunches only when the LHC was not completely filled with bunches, i.e., during the intensity ramp-up, typically at the beginning of the yearly data taking and after each technical stop. For all other updates, normal bunch collisions were used. For these, a minimum amplitude threshold was imposed at the level of 1 GeV, or 5σ_noise when this was greater, and the amplitude-weighted average of the templates suppressed the relative contribution of OOT PU pulses. It was verified that the pulse templates derived from isolated bunches are consistent with those obtained from nonisolated bunches. Anomalous signals in the APDs, which have a distorted pulse shape, are rejected on the basis of the single-crystal timing and the spatial distribution of the energy deposit among neighboring crystals [13,14].
The average pulse shape measured in the digitized time window of 10 BXs is extended by five additional time samples to model the falling tail of the pulse, which is used to fit the contribution of early OOT pileup. This is achieved by fitting the average template with a function of the form [15]:

f(t) = A (1 + Δt/(αβ))^α e^(−Δt/β) ,   (5)

where A represents the hit amplitude, Δt = t − T_max the time position relative to the peak, and α, β are two shape parameters. Examples of two average pulse shapes, obtained using this method, are shown in Fig. 2. The extrapolation of the pulse templates outside of the readout window was checked by injecting laser light into the crystals with a shifted readout phase. The tail of the pulse, measured in this way, agrees with the extrapolated templates.

The covariance matrix associated with the pulse template, C_pulse, is computed using Eq. (4), with the same sample of digitized templates used to determine the average pulse template and with the same normalization and weighting strategy. The correlation matrix of the pulse template, ρ_pulse, shown in Fig. 3, is defined as

ρ_{ik}^pulse = C_{ik}^pulse / (σ_i^pulse σ_k^pulse) ,

where (σ_i^pulse)² = C_{ii}^pulse is the variance of the pulse shape for bin i of the template. The values of σ_i^pulse are in the range 5×10⁻⁴ − 1×10⁻³, with the largest values for samples in the tail of the pulse template. The elements of the covariance matrix outside the digitization window, C_{ik}^pulse with i > 9 or k > 9, are estimated from simulations of single-photon events with the interaction time shifted by an integer number of BXs. It was checked that this simulation reproduces well the covariance matrix for the samples inside the readout window. The elements with i = 5 or k = 5 have zero variance by definition, since s̄_5 = 1 for all the hits. In Fig. 3, the elements ρ_{ik}^pulse with i < 3 or k < 3 correspond to the presamples, where no signal is expected, and are set to zero; those with i > 9 or k > 9 are estimated from the simulations with a shifted BX; the others are measured in collision data, as described in the text. All values in the figure are shown multiplied by 100 for readability.
The C_pulse matrix shows a strong correlation between the time samples within either the rising edge or the falling tail of the pulse. An anti-correlation is also observed between the time samples of the rising edge and those of the falling tail, mostly due to the spread in the particle arrival time at the ECAL surface, which reflects the spatial and temporal distribution of the LHC beam spot in CMS [16]. For the measured samples, the correlations between S_9 and S_8, S_7, S_6 are all close to 1, with values in the range 0.93-0.97. For the extrapolated samples, the correlations change from bin to bin: between S_14 and S_13, S_12, S_11 they are 0.70, 0.57, 0.46, respectively.
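The α-β parametrization used to fit the average template can be sketched in numpy, evaluated at the 15 template bin centers. The parameter values α = 1.5 and β = 40 ns, and the placement of the peak at sample 5, are invented for illustration; the real parameters are fitted for each template:

```python
import numpy as np

def alpha_beta_pulse(t, A, t_max, alpha, beta):
    """Pulse shape f(t) = A * (1 + dt/(alpha*beta))**alpha * exp(-dt/beta),
    with dt = t - t_max; defined for dt > -alpha*beta and zero before.
    The maximum of f is A, reached at t = t_max."""
    dt = np.asarray(t, dtype=float) - t_max
    base = 1.0 + dt / (alpha * beta)
    safe = np.clip(base, 1e-12, None)      # avoid NaN in the zeroed region
    return np.where(base > 0, A * safe**alpha * np.exp(-dt / beta), 0.0)

# 15 template bins of 25 ns, peak placed at sample 5 (illustrative choice)
t_bins = 25.0 * np.arange(15)
template = alpha_beta_pulse(t_bins, A=1.0, t_max=125.0, alpha=1.5, beta=40.0)
```

With the peak at sample 5, the first three bins fall before the support of the function and stay at zero, matching the three pedestal-only presamples, while the last five bins trace the extrapolated tail.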

Pedestals and electronic noise
The pedestal mean is used in the multifit method to compute the pedestal-subtracted template amplitudes A_j in Eq. (2). A bias in its measurement would translate almost linearly into a bias of the fitted amplitude, as discussed in Section 4.
The covariance matrix associated with the electronic noise enters the total covariance matrix of Eq. (3). It is constructed as C_noise = σ²_noise ρ_noise, where σ_noise is the measured single-sample noise of the channel, and ρ_noise is the noise correlation matrix. The C_noise is calculated with Eq. (4), where i, k are the sample indices, the s̄_i and s̄_k are the measured sample values, and P̄ is the expected value in the absence of any signal; the average is calculated over many events. The noise correlation matrix is defined as:

ρ_{ik}^noise = C_{ik}^noise / (σ_i^noise σ_k^noise) .   (6)

The average pedestal value and the electronic noise are measured separately for the three MGPA gains. For the largest gain value, data from empty LHC bunches [17,18] are used, collected by injecting laser light into the ECAL crystals during sequences in the LHC orbit where there are no collisions. This gain is used for the vast majority of the reconstructed pulses (up to 150 GeV) and is very sensitive to the electronics noise. One measurement per channel is acquired approximately every 40 minutes. For the other two MGPA gains, the pedestal mean and its fluctuations are measured in dedicated runs without LHC beams present.
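The construction C_noise = σ²_noise ρ_noise can be illustrated with pedestal pseudo-events. The exponential correlation model, the noise level of 1.1 ADC counts, and the pedestal value below are invented for illustration; in CMS these quantities are measured in situ per channel:

```python
import numpy as np

rng = np.random.default_rng(7)
n, sigma, tau = 10, 1.1, 2.0   # samples, noise (ADC counts), corr. time (samples)

# assumed model: correlation falling exponentially with the sample distance
idx = np.arange(n)
rho_true = np.exp(-np.abs(idx[:, None] - idx[None, :]) / tau)
C_true = sigma**2 * rho_true

# pedestal pseudo-events: pure noise fluctuating around a pedestal P
P = 200.0
L = np.linalg.cholesky(C_true)
events = P + rng.standard_normal((20000, n)) @ L.T

# Eq.-(4)-style estimate: average (s_i - P)(s_k - P) over many events
d = events - P
C_noise = d.T @ d / len(d)
rho_noise = C_noise / np.sqrt(np.outer(np.diag(C_noise), np.diag(C_noise)))
```

With enough pedestal events, the estimated single-sample noise and correlation matrix converge to the generating values, which is what makes the periodic in-situ updates of σ_noise effective.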
The time evolution of the pedestal mean in the EB during Run 2 is shown for the highest MGPA gain in Fig. 4 (left). A long-term, monotonic upward drift is visible, together with short-term (interfill) luminosity-related effects; the short-term variations are smaller when the LHC luminosity is lower. The long-term drift depends on the integrated luminosity, while the short-term effects depend on the instantaneous luminosity. The evolution of the electronic noise in the barrel is shown in Fig. 4 (right). It shows a monotonic increase with time, related to the increase of the APD dark current due to irradiation; no short-term luminosity-related effects are visible. For the barrel, where 1 ADC count ≅ 40 MeV, this translates to an energy-equivalent noise of about 65 MeV at the beginning of 2017 and 80 MeV at the end of the proton-proton running in the same year. A small decrease in the noise induced by the APD dark current is visible after long periods without irradiation, i.e., after the year-end LHC stops. For the endcaps, the single-channel noise related to the VPT signal does not evolve with time, and is approximately 2 ADC counts. Nevertheless, the energy-equivalent noise increases with time and with the absolute pseudorapidity |η| of the crystal because of the strong dependence of the crystal transparency loss on |η| and time. Consequently, the average noise at the end of 2017 in the endcaps translates to roughly 150 MeV up to |η| ≈ 2, and increases to as much as 500 MeV at the limit of the tracker acceptance (|η| ≈ 2.5). Thus, the relative contribution of C_noise to the total covariance matrix strongly depends on |η|. For hits with amplitudes larger than ≈20 ADC counts, equivalent to an energy of ≈1 GeV before applying the light transparency corrections, C_pulse dominates the covariance matrix for the whole ECAL.
The covariance matrix for the noise used in the multifit is obtained by multiplying the time-independent correlation matrix in Eq. (6) by the time-dependent squared single-sample noise, σ²_noise. The time evolution is automatically accounted for by updating the values in the conditions database [19] with the measurements obtained in situ.
Correlations between samples exist because of (1) the presence of low-frequency (less than 4 MHz) noise that has been observed during CMS operation [15], and (2) the effect of the feedback resistor in the MGPA circuit [20]. The correlation matrix of the electronic noise was measured with dedicated pedestal runs; it is very similar for all channels within either the EB or the EE, and stable with time. Consequently, it has been averaged over all the channels within a single subsystem. The matrix for the highest gain of the MGPA is shown in Fig. 5. The structure of the noise is such that the correlation depends almost solely on the time separation between the two samples, falling exponentially; for Δt > 100 ns, it flattens to a plateau corresponding to the low-frequency noise.

Sensitivity of the amplitude reconstruction to pulse timing and pedestal drifts
The multifit amplitude reconstruction uses as inputs the pedestal baseline values and the signal pulse templates, which are determined from dedicated periodic measurements. It is therefore sensitive to possible changes of these inputs with time. Figure 6 shows the absolute amplitude bias for pulses corresponding to a 50 GeV energy deposit in one crystal of the barrel, as a function of the pedestal baseline shift. The dependence for deposits in the endcaps is the same. A shift of ±1 ADC count produces an amplitude bias of up to 0.3 ADC counts in a single crystal, corresponding, in the barrel, to an energy-equivalent shift of about 300 MeV in a 5×5 crystal matrix. Since the drift of the pedestal baseline with time can be as much as 2 ADC counts in one year of data taking, as shown in Fig. 4 (left), and is coherent in all crystals, the induced bias is significant, in the range ≈(0.5-1)%, even in the typical energy range of decay products of the W, Z, and Higgs bosons. Therefore, it is important to monitor the pedestals and periodically correct them in the reconstruction inputs. The IT amplitude resulting from the χ² minimization of Eq. (2) is also more sensitive to a shift in the position of the maximum, T_max, of the signal pulse than that obtained from the weights method [3]. This timing shift can be caused by variations of the pulse shapes over time, both independently from crystal to crystal and coherently, as discussed in Section 3.1. A difference in the pulse maximum position between the measured signal pulse and the binned template will be absorbed into the χ² fit as nonzero OOT amplitudes A_j with j ≠ 5.
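The near-linear dependence of the fitted amplitude on a pedestal baseline shift can be reproduced with a stripped-down, single-template least-squares fit (no positivity constraint, toy pulse shape, identity covariance). The slope below is not the CMS value of ~0.3 ADC counts per ADC count of shift — in the multifit, spuriously fitted nonnegative OOT amplitudes absorb part of the baseline offset — but the linearity itself is generic:

```python
import numpy as np

p = np.array([0, 0, 0, 0.1, 0.6, 1.0, 0.85, 0.65, 0.5, 0.4])  # toy template
C = np.eye(10)
Cinv_p = np.linalg.solve(C, p)

def ls_amplitude(S):
    """Single-template generalized least-squares amplitude estimate."""
    return float((Cinv_p @ S) / (Cinv_p @ p))

# a pedestal mis-subtraction by `delta` ADC counts shifts every sample
true_A = 50.0
biases = [ls_amplitude(true_A * p + delta) - true_A for delta in (0.0, 1.0, 2.0)]
```

In this simplified fit the bias equals delta · (pᵀC⁻¹1)/(pᵀC⁻¹p), so doubling the baseline shift exactly doubles the amplitude bias.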
To estimate the sensitivity of the reconstructed amplitude to changes in the template timing ΔT_max, the amplitude of a given pulse is reconstructed several times, with increasing values of ΔT_max. The observed changes in the ratio of the reconstructed amplitude to the true amplitude, Â/A_true, as a function of ΔT_max, for single-crystal pulses of 50 GeV in the EB and EE, are shown in Fig. 7 (left) and (right), respectively. The difference in shape for positive and negative time shifts is related to the asymmetry of the pulse shape with respect to the maximum: spurious OOT amplitudes can be fitted more accurately using the time samples preceding the rising edge, where pedestal-only samples are expected, than using those on the falling tail. For positive ΔT_max, the net change is positive because the effect of an increase in the IT contribution is larger than the decrease in the signal amplitude caused by the misalignment of the template. The change in reconstructed amplitude at a given ΔT_max is similar for the barrel and the endcaps. Small differences arise mostly from the slightly different rise times of the barrel and endcap pulses and from the difference in the energy distributions of PU interactions in a single crystal in the two regions.
The effects of small channel-dependent differences between actual pulse shapes and the assumed templates are absorbed by the crystal-to-crystal energy intercalibrations. However, any changes with time in the relative position of the template will affect the reconstructed amplitudes, worsening the energy resolution. This implies the need to monitor T_max and periodically correct the templates for any observed drifts. The average correlated drift of T_max was constantly monitored throughout Run 2, measured with the algorithm of Ref. [21]. Its evolution during 2017 is shown in Fig. 8; for each point, the average over the hits reconstructed in all barrel and endcap channels is used, and at the beginning of the yearly data taking the timing is calibrated so that the average T_max = 0. The coherent variation can be up to 1 ns. The repeated sharp changes in T_max occur when data taking is resumed after a technical stop of the LHC. They are caused by a partial recovery in crystal transparency while the beam is off, followed by a rapid return to the previous value when irradiation resumes. A similar trend was measured in the other years of data taking during Run 2.
The measured time variation is crystal dependent, since the integrated radiation dose depends on the crystal position, and since there are small differences in the effect between crystals at the same η. For this reason the pulse templates are measured in situ multiple times during periods with collision data, and a specific pulse template is used for each channel. The measurement described in Section 3.1 is repeated after every LHC technical stop, when a change of the templates is expected because of the partial recovery of the crystal transparency, or whenever |ΔT_max| was larger than 250 ps.

Performance with simulations and collision data
In this section, the performance of the ECAL local reconstruction with the multifit algorithm is compared with the weights method [3]. Simulated events with a PU typical of Run 2 (a Poisson distribution with a mean of 40) and collision data collected in 2016-2018 are used. The data comparisons are performed for low-energy photons from π 0 → γγ decays, and for high-energy electrons from Z → e + e − decays.

Suppression of out-of-time pileup signals
The motivation for implementing the multifit reconstruction is to suppress the OOT pileup energy contribution, while reconstructing IT amplitudes as accurately as possible. To show how well the multifit reconstruction performs, the resolution of the estimated IT energy is compared for single crystals, as a function of the average number of PU interactions. This study was performed using simple pseudo-experiments, where the pulse shape is generated according to the measured template for a barrel crystal at |η| ≈ 0. The appropriate electronics noise, equal to the average value measured in Run 2, together with its covariance matrix, is included. The effect of the PU is simulated assuming that the number of additional interactions has a Poisson distribution about the mean expected value and that these interactions have an energy distribution corresponding to that expected for minimum bias events at the particular value of η of the crystal. The pseudo-experiments are performed for two fixed single-crystal energies: 2 and 50 GeV.
The amplitude resolution is estimated as the effective standard deviation σ eff , calculated as half of the smallest symmetrical interval around the peak position containing 68.3% of the events. The PU energy from IT interactions constitutes an irreducible background for both energy reconstruction methods. It is expected that event-by-event fluctuations of this component degrade the energy resolution in both cases as the PU increases. On the other hand, the fluctuations in the energy from all the OOT interactions are suppressed significantly by the multifit algorithm, in contrast to the situation for the weights reconstruction, where they contribute further to the energy resolution deterioration at large average PU. This is shown in Fig. 9, for the two energies considered in this study. The reconstructed energy is compared with either the true generated energy (corrected for both IT and OOT PU) or the sum of the energy from the IT pileup and the true energy (corrected only for the effect of OOT PU). In the latter case, the amplitude resolution for the multifit reconstruction does not depend on the number of interactions, showing that this algorithm effectively suppresses the contributions of the OOT PU. The offset in resolution in the case of no PU between the two methods, in this ideal case, is due to the improved suppression of the electronic noise resulting from the use of a fixed pedestal rather than the event-by-event estimate used in the weights method. In the data, additional sources of miscalibration may further worsen the energy resolution. Such effects are considered in the full detector simulation used for physics analyses, described below, but are not included in this standalone simulation. 
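The effective standard deviation used above can be computed directly from its definition. The histogram-based mode estimate and the interval-scan granularity below are simplifications chosen for illustration:

```python
import numpy as np

def sigma_eff(values, frac=0.683, n_grid=400):
    """Half of the smallest symmetric interval around the peak position
    that contains `frac` of the entries.  The peak is estimated with a
    coarse histogram mode (a simplification)."""
    v = np.sort(np.asarray(values, dtype=float))
    counts, edges = np.histogram(v, bins=50)
    m = int(np.argmax(counts))
    peak = 0.5 * (edges[m] + edges[m + 1])
    # grow a symmetric window around the peak until it holds `frac`
    for h in np.linspace(0.0, v[-1] - v[0], n_grid):
        lo = np.searchsorted(v, peak - h, side="left")
        hi = np.searchsorted(v, peak + h, side="right")
        if (hi - lo) / len(v) >= frac:
            return h
    return v[-1] - v[0]

# sanity check on a Gaussian sample: sigma_eff should approach the true sigma
rng = np.random.default_rng(3)
s = sigma_eff(rng.normal(0.0, 1.0, size=200_000))
```

For a pure Gaussian, σ_eff coincides with the standard deviation; its advantage is robustness against the non-Gaussian tails that appear once PU and containment losses are included.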
Simulations performed for an upgraded EB, planned for the high-luminosity phase of the LHC [22], have shown that the multifit algorithm can subtract OOT PU for energies down to the level of the electronic noise, for σ noise > 10 MeV, for PU values up to 200 with 25 ns bunch spacing. This future reconstruction method will benefit from a more frequent sampling of the pulse shape, at 160 MHz, and from a narrower signal pulse to be achieved with the upgraded front-end electronics [23].

Energy reconstruction with simulated data
The ability of the multifit algorithm to estimate the OOT amplitudes and, consequently, the IT amplitude is demonstrated in Fig. 10 (left). Simulated events are generated with an average of 40 PU interactions, with an energy spectrum per EB crystal as shown in Fig. 10 (right). The reconstructed energy assigned by the multifit algorithm to each BX from −5 to +4 is compared with the generated value. The IT contribution corresponds to BX = 0. Amplitudes with energy larger than 50 MeV are included, a value corresponding approximately to one standard deviation of the electronic noise [24]. The mode of the distribution of the ratio of the reconstructed to the true energies of OOT PU pulses, A^PU_BX/A^true_BX, with BX in the range [−5, +4], is equal to unity within ±2% for all the BXs. The OOT interactions simulated in these events cover a range from 12 BXs before to 3 BXs after the IT interaction, as is done in the full simulation used in CMS. The distribution of the measured to true energy becomes asymmetric at the boundaries of the pulse readout window (BX = −5, −4, and −3), because the contributions of earlier interactions cannot be resolved with the information provided by the 10 digitized samples. However, this does not introduce a bias in the IT amplitude, since the energy contribution from very early BXs below the maximum of the IT pulse is negligible. The remaining offset of ≈0.2% in the median of A^PU_BX/A^true_BX for BXs close to zero is due to the requirement that all the A_j values are nonnegative, i.e., any spuriously fitted OOT pulse can only subtract part of the in-time amplitude. This offset is absorbed in the absolute energy scale calibration and does not affect the energy resolution. The energy from an electromagnetic shower of a high-momentum electron or photon is deposited in several adjacent ECAL crystals.
A clustering algorithm is required to sum the deposits of adjacent channels that are associated with a single electromagnetic shower. Corrections are applied to account for the partial containment of the shower in the cluster. In the present work, we use a simple clustering algorithm that sums the energy in a 5×5 crystal matrix centered on the crystal with the maximum energy deposit. This approach is adequate for comparing the performance of the two reconstruction algorithms, especially in regions with little tracker material (e.g., |η| < 0.8), where the fraction of energy lost by electrons through bremsstrahlung (and subsequent photon conversions) is small. Here, more than 95% of the energy is contained in a 5×5 matrix. To reduce the fraction of events with partial cluster containment caused by early bremsstrahlung and photon conversions, a selection is applied to the electrons and photons. In the simulation, events with photon conversions are rejected using Monte Carlo information, whereas in data a variable that uses only information from the tracker is adopted, as described later.
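The 5×5 clustering used here reduces to a window sum around the most energetic crystal. A minimal numpy sketch on a toy energy grid (the real CMS clustering works on the detector geometry, with gaps and module boundaries):

```python
import numpy as np

def cluster_5x5(energy):
    """Sum the 5x5 crystal matrix centered on the most energetic crystal;
    the window is clipped at the edges of the grid."""
    e = np.asarray(energy, dtype=float)
    i, k = np.unravel_index(int(np.argmax(e)), e.shape)
    return float(e[max(0, i - 2):i + 3, max(0, k - 2):k + 3].sum())

# toy deposit: a 30 GeV seed crystal plus a 3x3 halo of 1 GeV per crystal
grid = np.zeros((10, 10))
grid[4, 4] = 30.0
grid[3:6, 3:6] += 1.0
e5x5 = cluster_5x5(grid)   # window fully contains the deposit: 30 + 9*1
```

Clipping the window at the grid edges mirrors the treatment of clusters near detector boundaries, where part of the 5×5 matrix is unavailable.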
The relative performance of the two reconstruction algorithms is evaluated on a simulated sample of single-photon events, generated with GEANT4 with a uniform distribution in η and a flat transverse momentum p_T spectrum extending from 1 to 100 GeV. Photons that do not undergo a conversion before reaching the ECAL surface are selected by excluding those that geometrically match electron-positron track pairs from conversions in the simulation. For the retained photons, the energy is mostly contained in a 5×5 matrix of crystals, and no additional corrections are applied.
The ratio between the reconstructed energy in the 5×5 crystal matrix and the generated photon energy, E_5×5/E_true, for nonconverted photons with a uniform distribution in the range 1 < p_T^true < 100 GeV is histogrammed. For both reconstruction algorithms, the distributions show a non-Gaussian tail towards lower values, caused by the energy leakage out of the 5×5 crystal matrix, which is not corrected for. To account for this, σ_eff, as defined in Section 5.1, is used to quantify the energy resolution. The average energy scale of the reconstructed clusters is shifted downwards for the multifit method, whereas it is approximately unity for the weights reconstruction. As stated earlier, this is because the amplitudes of the OOT pulses (A_j with BX ≠ 0) are constrained to be nonnegative. In the reconstruction of photons used by CMS, such a shift is corrected a posteriori by a dedicated multivariate regression, which simultaneously corrects the residual dependence of the energy scale on the cluster containment and IT pileup. This correction is applied in the HLT and, with a more refined algorithm, in the offline event reconstruction; this type of cluster containment correction was developed in Run 1 [24,25] and has been used subsequently. In the simplified analysis presented here, the shift is instead accounted for by scaling the resolution estimator σ_eff by the peak position of the distribution, m, estimated from a fit with a Gaussian function to the bulk of the distribution, and expressing the result in percent. The variation of σ_eff as a function of the true p_T of the photon is shown in Fig. 11. The improvement in the precision of the energy measurement is significant over the full range of p_T considered. Expressed as a contribution in quadrature to the total resolution, it varies from 10 (15)% in the barrel (endcaps) for photons with p_T < 5 GeV, to 0.5 (1.0)% at p_T = 100 GeV.
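As a minimal illustration of the resolution estimator, the sketch below assumes a common definition of σ_eff, namely half the width of the narrowest interval containing 68.3% of the entries (the exact definition used by the paper is in its Section 5.1), scaled by the peak position m and expressed in percent.

```python
import numpy as np

def sigma_eff(values, fraction=0.683):
    """Half-width of the narrowest interval containing `fraction` of the
    entries (assumed definition of the effective resolution)."""
    v = np.sort(np.asarray(values, dtype=float))
    n = len(v)
    k = int(np.ceil(fraction * n))        # entries the interval must contain
    widths = v[k - 1:] - v[:n - k + 1]    # widths of all candidate intervals
    return 0.5 * widths.min()

def relative_resolution(values, peak):
    """sigma_eff scaled by the peak position m, in percent."""
    return 100.0 * sigma_eff(values) / peak
```

Normalizing by the peak makes the estimator insensitive to an overall shift of the energy scale, which is the point of this construction for the multifit comparison.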
The improvement is larger at low p_T, since the relative contribution of the energy deposits from PU interactions, which have the characteristic momentum spectrum shown in Fig. 10 (right), is larger there. This is particularly relevant for suppressing the PU contribution to low-p_T particles that enter the reconstruction of jets and missing transverse momentum with the particle-flow algorithm used in CMS [26], thus preserving the resolution achieved during Run 1 [27-29]. The improvement grows with |η| both within the EB and within the EE, because of the increasing probability of overlapping pulses from PU. It is larger in the barrel, even though the PU contribution is smaller than in the endcaps, because the lower electronic noise allows a more stringent constraint on the amplitudes in the multifit. For electrons and photons, the improvement extends above p_T ≈ 50 GeV, because of the larger number of digitized samples of the pulse shape used and the suppression of the residual OOT PU contribution. The energy resolution becomes constant at very high energies, above a few hundred GeV, where it is dominated by sources other than the comparatively small contribution of OOT pileup energy.

Energy reconstruction with Run 2 data
The improvement in the energy resolution for low-energy clusters is quantified in data using π0 mesons decaying into two photons. The p_T spectrum of the photons, selected by a dedicated calibration trigger [13], falls steeply, and most of the photons have a p_T in the range of 1-2 GeV. The photon energy in this case is reconstructed by summing the energy of the crystals in a 3×3 matrix. Figure 12 shows the diphoton invariant masses when both clusters are in the EB (left) and when both are in the EE (right). The invariant mass distributions obtained with the weights and the multifit methods are compared, using a subset of the π0 calibration data collected during 2018. The position of the peak is affected by OOT PU in different ways: for the multifit, a downward shift is caused by the A_j being nonnegative for each BX, as described earlier. For the weights method, two effects compete: on the one hand, PU adds to the IT pulse, increasing the energy scale; on the other, early OOT interactions increase the mean pedestal value, thus decreasing the measured energy. In the latter case, the mean shift depends on the PU occupancy per crystal, which is |η|-dependent. Since the π0 → γγ process is only used to calibrate the relative response of a crystal with respect to the others, the absolute energy scale is not important here; the energy scale is determined separately by comparing the position of the Z → e+e− mass peak in data and simulation. At the end of 2017, the LHC operated for a period of about one month with a filling scheme with trains of 8 bunches alternated with 4 empty BXs. The resilience of the multifit method to OOT pileup was particularly beneficial in this period, since the bunch-to-bunch variations in OOT PU are larger than with the standard LHC filling schemes used in Run 2.
All the bunches of a given train provide approximately the same luminosity, about 5.5×10^27 cm^−2 s^−1, so the average number of PU interactions is typical of Run 2 (about 37, with peaks up to 80). Data from this period are used to assess the sensitivity of the algorithms to OOT interactions by estimating the invariant mass peak position of the π0 mesons as a function of the BX within each LHC bunch train. The measured invariant mass, normalized to that measured in the first BX of the train, is shown in Fig. 13 (left). The peak position estimated with the weights algorithm increases for BXs towards the middle of the bunch train, where the contribution from OOT collisions is larger, and then decreases again towards the end of the train. In contrast, for the multifit reconstruction, the peak position remains stable within ±0.4% with respect to the average over the bunch train. The overall resolution in the diphoton invariant mass improves significantly with the multifit algorithm and, within the precision of the measurement, is insensitive to the variation of OOT PU for different BXs within the train, as shown in Fig. 13 (right). The performance of the two algorithms for high-energy electromagnetic deposits is estimated using electrons from Z → e+e− decays. Electrons with p_T > 25 GeV are identified with tight electron identification criteria, using a discriminant based on a multivariate approach [25]. To decouple the effects of cluster containment corrections from the single-crystal resolution, 5×5 crystal matrices are used to form clusters. The sample is enriched in low-bremsstrahlung electrons by means of an observable based only on tracker information, f_brem, which represents the fraction of momentum, estimated from the track, lost before reaching the ECAL.
It is defined as f_brem = (p_in − p_out)/p_in, where p_in is the track momentum at the point of closest approach to the beam spot and p_out is the momentum estimated from the track at the last sensitive layer of the tracker. The variable f_brem is required to be smaller than 20%. In the range 0.8 < |η| < 2.0 [25], the resolution is dominated by the incomplete containment of the 5×5 crystal matrix, caused by the larger amount of tracker material in this region. Therefore, detailed performance comparisons are restricted to events with electromagnetic showers occurring in the central region of the EB. Figure 14 shows the invariant mass of 5×5 cluster pairs, for a portion of the 2016 data, selecting pairs of electrons, e1 and e2, that lie within a representative central region of the barrel (0.200 < max(|η1|, |η2|) < 0.435). The shift in the absolute energy scale for the simplified 5×5 clustering, caused by the multifit A_j being nonnegative for each BX, is not corrected for. The improvement is still significant for the p_T range characteristic of Z → e+e− decays, matching the expectation from the simulation shown in Fig. 11, namely an improvement in resolution of ≈1% in quadrature, after unfolding the natural width of the Z boson, for electrons and photons with 30 < p_T < 100 GeV.

Figure 14: Example of the Z → e+e− invariant mass distribution in a central region of the barrel (0.200 < max(|η1|, |η2|) < 0.435), with the single-crystal amplitude estimated using either the weights or the multifit method. A portion of collision data with typical Run 2 conditions, recorded during October 2016, is used. Error bars represent the statistical uncertainty. The energy is summed over a 5×5 crystal matrix. The reported values of σ_eff include the natural width of the Z boson and are expressed as a percentage of the position of the peak, m, of the corresponding invariant mass distribution.
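The low-bremsstrahlung selection based on f_brem reduces to a one-line computation; a minimal sketch, with function names and numerical inputs that are illustrative only:

```python
def f_brem(p_in, p_out):
    """Fraction of track momentum lost before the ECAL:
    f_brem = (p_in - p_out) / p_in."""
    return (p_in - p_out) / p_in

def is_low_brem(p_in, p_out, max_fbrem=0.20):
    """Selection enriching the sample in low-bremsstrahlung electrons
    (f_brem required to be smaller than 20%, as in the text)."""
    return f_brem(p_in, p_out) < max_fbrem
```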
A full comparison of the performance of the multifit algorithm in Run 2 with that of the weights algorithm in Run 1 would require a reanalysis of the Run 1 data, applying the more sophisticated clustering techniques used in Run 2. Nevertheless, it is instructive to make a straightforward comparison. For Run 1, the crystal energy was reconstructed with the default weights method, the electron energy was estimated with the simple 5×5 crystal cluster, and the optimal calibrations of the 2012 data set (√s = 8 TeV and 50 ns LHC bunch spacing) were used [25]. The effective resolution of the dielectron invariant mass distribution, normalized to its peak, is σ_eff/m = 4.59%. This is consistent with the value of 4.59% obtained in Run 2 with the multifit algorithm, shown in Fig. 14. This indicates that the multifit method maintains in Run 2, despite the substantially larger PU, the ECAL performance obtained during Run 1 in the p_T range relevant for most of the data analyses performed with CMS.
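Comparisons quoted "in quadrature", such as the ≈1% resolution improvement mentioned above, combine or subtract resolution contributions quadratically; unfolding the natural width of the Z boson from the measured mass resolution follows the same rule. A minimal sketch with illustrative numbers only:

```python
import math

def quadrature_difference(sigma_total, sigma_known):
    """Resolution left after removing a known contribution in quadrature:
    sqrt(sigma_total^2 - sigma_known^2)."""
    return math.sqrt(sigma_total**2 - sigma_known**2)

def quadrature_improvement(sigma_old, sigma_new):
    """Improvement of sigma_new over sigma_old, expressed in quadrature."""
    return math.sqrt(sigma_old**2 - sigma_new**2)
```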
The contribution to the average offset of the jet energy scale from the reconstructed electromagnetic component of each additional PU interaction was estimated in a simulated sample of pure noise in the CMS detector, by considering the energy contained in cones randomly chosen within the detector acceptance. This study shows that the contribution to the offset from ECAL signals is reduced to less than 10%, similar to that obtained in Run 1. Further details are given in Ref. [28].

Reconstruction of cluster shape variables
The relative contribution of the PU energy within a cluster is smaller for electrons from Z boson decays than for clusters from π0 meson decays, and the sample of events is smaller. For these reasons, it is difficult to estimate the variation of the energy scale within one LHC fill arising from this contribution. The effect on the cluster shapes is nevertheless significant, since they are computed using all the hits in a cluster, including the low-energy ones. One example is provided by the evolution, within an LHC fill, of the variable R9, defined as the ratio of the energy in a 3×3 crystal matrix centered on the seed hit of the cluster to the total energy of the cluster. This variable is an important measure of the cluster shape, since it is often used to distinguish showering or converted photons from those not undergoing bremsstrahlung or conversion within the tracker. For example, in studies of Higgs boson physics, it is used to separate H → γγ events into categories with different m_γγ mass resolutions. It is therefore important that the R9 variable remains stable over time. Figure 15 shows the median of the R9 distribution for clusters from electron pairs in the barrel with an invariant mass consistent with that of the Z boson, during an LHC fill in 2016 with an average PU decreasing from 42 at the beginning of the fill to 13 at the end. The stability of the cluster shape as a function of the instantaneous luminosity, obtained with the multifit algorithm, is clearly better than that obtained with the weights reconstruction. The main reason the median R9 values drift upwards during a fill is that the denominator of the R9 ratio, which includes contributions from low-energy hits located outside of the 3×3 matrix, decreases in the weights algorithm when the instantaneous luminosity (and hence the PU) decreases.
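The R9 computation can be sketched as follows, assuming a minimal hypothetical representation of a cluster as a map from crystal indices to hit energies:

```python
def r9(cluster_hits, seed):
    """R9 = energy in the 3x3 matrix centred on the seed crystal,
    divided by the total cluster energy.

    `cluster_hits` maps (ieta, iphi) -> energy; `seed` is the
    (ieta, iphi) of the seed crystal (illustrative representation).
    """
    total = sum(cluster_hits.values())
    si, sj = seed
    e3x3 = sum(e for (i, j), e in cluster_hits.items()
               if abs(i - si) <= 1 and abs(j - sj) <= 1)
    return e3x3 / total
```

Because the denominator includes every hit in the cluster, low-energy PU deposits outside the 3×3 matrix directly pull R9 down, which is why its stability tracks the quality of the per-hit amplitude reconstruction.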
Another effect that has been checked in data is the rejection power for anomalous signals ascribed to direct energy deposition in the APDs [14] by traversing particles. Unlike the hits in an electromagnetic shower, the anomalous signals generally occur in single channels of the calorimeter. They are rejected by a combination of a topological selection and a requirement on the hit timing. The topological selection rejects hits for which the quantity (1 − E4/E1) is close to 1, where E1 is the energy of the crystal and E4 is the sum of the energies of the four nearest neighboring crystals. A simulation of anomalous signals in the APDs is used, and the efficiency is defined as the fraction of the reconstructed hits in crystals with anomalous signals that are identified as such by the offline reconstruction. The rejection efficiency obtained with the multifit reconstruction is improved by as much as 15% compared to the weights method for hits with E < 15 GeV. The probability of rejecting hits from genuine energy deposits has been checked in data using hits within clusters from Z → e+e− decays, and is lower than 10^−3 over the entire p_T spectrum of electrons from Z boson decays for both methods.
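The topological part of the spike rejection can be sketched as below; the 0.95 threshold is illustrative, not the operational CMS value, and the timing requirement mentioned in the text is omitted.

```python
def is_anomalous(e1, neighbours, threshold=0.95):
    """Flag a hit as anomalous ('spike') when 1 - E4/E1 is close to 1,
    i.e. when almost no energy is shared with the four nearest crystals.

    `e1` is the crystal energy; `neighbours` holds the energies of the
    four nearest neighbouring crystals. The threshold is illustrative.
    """
    e4 = sum(neighbours)
    return (1.0 - e4 / e1) > threshold
```

A genuine shower spreads energy over neighbouring crystals, so 1 − E4/E1 stays well below 1, while a direct APD hit leaves the neighbours essentially empty.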

Summary
A multifit algorithm that uses a template fitting technique to reconstruct the energy of single hits in the CMS electromagnetic calorimeter has been presented. This algorithm was implemented before the start of the Run 2 data taking period of the LHC, replacing the weights method used in Run 1. The change was motivated by the reduction of the LHC bunch spacing from 50 to 25 ns, and by the higher instantaneous luminosity delivered in Run 2, which led to a substantial increase in both the in-time and out-of-time pileup. Procedures have been developed to provide regular updates of input parameters to ensure the stability of energy reconstruction over time.
Studies based on control samples in data, using π0 → γγ and Z → e+e− decays, show that the energy resolution is improved for deposits ranging from a few GeV to several tens of GeV. The gain is more significant for lower-energy electromagnetic deposits, for which the relative contribution of pileup is larger. This improves the reconstruction of jets and missing transverse momentum with the particle-flow algorithm used in CMS. These results are reproduced in simulation studies, which show that an improvement relative to the weights method is obtained at all energies, including those relevant for photons from Higgs boson decays.
Simulation studies show that the new algorithm will perform successfully at the high-luminosity LHC, where a peak pileup of about 200 interactions per bunch crossing, with 25 ns bunch spacing, is expected.