1 Introduction

Muons are a crucial component of the physics programme of the ATLAS experiment [1] at the Large Hadron Collider (LHC) [2]. The discovery of the Higgs boson [3, 4] and the measurement of its properties [5], precision tests of the Standard Model [6], and searches for physics processes beyond the Standard Model [7, 8] all strongly rely on the performance of muon identification and measurement with the ATLAS detector.

During its second data-taking campaign (Run 2; 2015–2018), ATLAS collected 139 fb\(^{-1}\) of pp collisions at a centre-of-mass energy of \(\sqrt{s}=13\) \(\text {TeV}\) , a data sample of unprecedented size. A recent publication presented the performance of the new and optimised muon reconstruction and identification techniques developed for the analysis of the full Run 2 data sample [9]. The muon momentum measurement was previously published for early Run 2 data, corresponding to 3.2 fb\(^{-1}\) collected in 2015 [10].

This paper describes the methods used to calibrate the momentum measurement of muons reconstructed by the ATLAS detector for the full Run 2 data sample. First, the potential charge-dependent bias on the scale of the muon momentum measurement, introduced by the imperfect knowledge of the real detector geometry, is measured in reconstructed collision data with a sample of \(Z \rightarrow \mu \mu \) events.Footnote 1 A dedicated correction is derived and applied to the data. Then, \(J/\psi \rightarrow \mu {}\mu \) and \(Z \rightarrow \mu \mu \) decays are selected in data and used to measure the resolution and scale of the muon momentum, which are compared with those predicted by simulation. A calibration procedure is applied to simulated events, to improve the agreement between the simulation and the data. Finally, a validation procedure of this calibration is performed using \(J/\psi \rightarrow \mu {}\mu \) and \(Z \rightarrow \mu \mu \) events, in addition to an independent sample of \(\varUpsilon \rightarrow \mu {}\mu \) decays.

In contrast to previous publications, this paper presents a completely new methodology to determine the charge-dependent bias, sensitive to global shifts in the detectors’ positions, which could not be measured with previous techniques. Furthermore, the inclusion of new \(J/\psi \rightarrow \mu {}\mu \) data collected with dedicated trigger strategies, the development of new fitting techniques with better convergence properties, and the significantly larger data sample result in an unprecedented precision of the momentum calibration procedure. In addition to the calibrations developed separately for the tracks measured in the inner detector and in the muon spectrometer of the ATLAS detector, for the first time a dedicated calibration for tracks obtained by combining information from both sub-detectors is derived. The validation of the calibration procedure is performed using \(\varUpsilon \rightarrow \mu {}\mu \) events as well as \(J/\psi \rightarrow \mu {}\mu \) and \(Z \rightarrow \mu \mu \) events in regions with finer granularity than those used in the calibration procedure.

This paper is structured as follows: the ATLAS detector is described in Sect. 2; the simulated and real data samples used for the measurements are presented in Sect. 3; the identification of muon candidates is discussed in Sect. 4; the methodologies for measuring the muon scale and momentum corrections are presented in Sect. 5; the results of the measurement and the validation of the data-driven corrections derived for simulated samples are presented in Sect. 6; final conclusions are provided in Sect. 7.

2 ATLAS detector

The ATLAS detector [1] at the LHC covers nearly the entire solid angle around the collision point.Footnote 2 It consists of an inner tracking detector surrounded by a thin superconducting solenoid, electromagnetic and hadron calorimeters, and a muon spectrometer incorporating three large superconducting air-core toroidal magnets.

The inner detector system (ID) is immersed in a 2 T axial magnetic field (bending the tracks of charged particles in the transverse plane) and provides charged-particle tracking in the range of \(|\eta | < 2.5\). The high-granularity silicon pixel detector covers the vertex region and typically provides four measurements per track. The first hit is normally found in the insertable B-layer (IBL) installed before Run 2 [11, 12]. It is followed by the silicon microstrip tracker (SCT), which typically provides eight measurements per track. These silicon detectors are complemented by the transition radiation tracker (TRT), which enables radially extended track reconstruction up to \(|\eta | = 2.0\), typically providing 30 measurements per track. The TRT also provides electron identification information based on the fraction of hits above a higher energy-deposit threshold corresponding to transition radiation.

The calorimeter system covers the pseudorapidity range \(|\eta | < 4.9\). In the region \(|\eta | < 3.2\), electromagnetic calorimetry is provided by barrel and endcap high-granularity lead/liquid-argon (LAr) calorimeters, with an additional thin LAr presampler covering \(|\eta | < 1.8\) to correct for energy loss in material upstream of the calorimeters. Hadron calorimetry is provided by the steel/scintillator-tile calorimeter, segmented into three barrel structures with \(|\eta | < 1.7\), and two copper/LAr hadron endcap calorimeters. The solid angle coverage is completed with forward copper/LAr and tungsten/LAr calorimeter modules optimised for electromagnetic and hadronic energy measurements respectively.

The muon spectrometer (MS) comprises separate trigger and high-precision tracking chambers measuring the deflection of muons in a magnetic field generated by the superconducting air-core toroidal magnets, which bend charged particles tracks in the rz plane, i.e. in the plane formed by the muon momentum vector and the beam axis. The field integral of the toroids ranges between 2.0 and 6.0 T m across most of the detector. Three layers of precision chambers, each consisting of layers of monitored drift tubes, cover the region \(|\eta | < 2.7\), complemented by cathode-strip chambers in the forward region, where the background is highest. The muon trigger system covers the range \(|\eta | < 2.4\) with resistive-plate chambers in the barrel, and thin-gap chambers in the endcap regions.

Events are selected by the first-level trigger system implemented in custom hardware, followed by selections made by algorithms implemented in software in the high-level trigger [13]. The first-level trigger accepts events from the 40 MHz bunch crossings at a rate below 100 kHz, which the high-level trigger further reduces to record events to disk at about 1 kHz.

An extensive software suite [14] is used in data simulation, in the reconstruction and analysis of real and simulated data, in detector operations, and in the trigger and data acquisition systems of the experiment.

3 Data and Monte Carlo samples

3.1 Data sample

The analysis is performed using the \(\sqrt{s}=13\) \(\text {TeV}\)pp collision data sample recorded by the ATLAS detector between 2015 and 2018. Only events collected in stable beam conditions and with all relevant ATLAS magnets and detector subsystems fully operational are used in the analysis. A combination of single-muon trigger algorithms (for \(Z \rightarrow \mu \mu \) events) and trigger algorithms dedicated to \(J/\psi \rightarrow \mu {}\mu \) and \(\varUpsilon \rightarrow \mu {}\mu \) topologies [15] were used. Starting from the middle of 2016, due to the increasing instantaneous luminosity of the LHC, events associated with \(J/\psi \rightarrow \mu {}\mu \) and \(\varUpsilon \rightarrow \mu {}\mu \) triggers were collected with a dedicated data acquisition stream for B-physics and low-mass final states, separated from the main physics stream so that their reconstruction could be delayed if the availability of processing resources was limited [16]. This additional stream is analysed for a muon momentum calibration for the first time in this paper, significantly increasing the number of \(J/\psi \rightarrow \mu {}\mu \) and \(\varUpsilon \rightarrow \mu {}\mu \) events available. For the data sample used in this analysis a procedure to calibrate the muon detectors was applied, as described in Ref. [17]. The analysed data corresponds to an integrated luminosity of 139 fb\(^{-1}\) after trigger and data-quality requirements, with an average number of pp collisions per bunch crossing \(\langle \mu \rangle =33.7\). The calibration derived in the analysis does not show any significant dependence on \(\langle \mu \rangle \).

3.2 Simulated samples

Samples of Monte Carlo (MC) simulated inclusive prompt production of \(J/\psi \rightarrow \mu {}\mu \) and \(\Psi (2S)\rightarrow \mu \mu \) events were generated at leading-order (LO) accuracy using  [18], for matrix element (ME) calculations, and for the modelling of the parton shower, hadronisation, and the underlying event. The  [19] parton density function (PDF) set and the A14 [20] set of tuned generator parameters were used. The effect of QED final-state radiation was simulated with  [21, 22]. To increase the effective number of events for the analysis, the generated events were filtered, requiring at least one muon in the event to satisfy \(p_{\text {T}}\) \(>6\) \(\text {GeV}\) and \(|\eta |<2.5\).

Samples of \(\varUpsilon (\text {1S}) \), \(\varUpsilon (\text {2S}) \), and \(\varUpsilon (\text {3S}) \) simulated events with subsequent decays into two muons were generated with a similar configuration to that of the \(J/\psi \rightarrow \mu {}\mu \) samples, though the filter required at least two muons with \(p_{\text {T}}\) \(>4\) \(\text {GeV}\) and \(|\eta |<2.7\) in the event.

The samples of \(Z \rightarrow \mu \mu \), \(Z \rightarrow \tau \tau \), \(W \rightarrow \mu \nu \), and \(Z^{*}/\gamma ^{*}(\rightarrow \mu \mu )\) events were simulated using the  [23,24,25,26] generator at next-to-leading-order (NLO) accuracy in perturbative QCD, interfaced to with the AZNLO [27] set of tuned generator parameters. The PDF set [28] was used for the hard-scattering processes, whereas the PDF set was used for the modelling of the parton showering, hadronisation, and underlying event. The effect of QED final-state radiation was simulated with . The program [29] was used to decay bottom and charm hadrons. To improve the agreement with the data, the simulated Z samples are reweighted to reproduce the muon-pair transverse momentum and rapidity distributions measured in data.

A configuration similar to the one used for the Z samples described above was used to simulate diboson processes decaying into final states containing muons, (ZZWZWW), using and  [30] in this case.

The production of \(t\bar{t}\) events was modelled using the  [23,24,25, 31] generator at NLO with the  [32] PDF set and the \(h_\textrm{damp}\) parameterFootnote 3 set to 1.5 \(m_{\textrm{top}}\) [33]. The events were interfaced to with the A14 set of tuned generator parameters and using the PDF set [34]. The decays of bottom and charm hadrons were performed by .

An alternative sample of \(Z \rightarrow \mu \mu \) events was simulated with the  [35] generator using NLO ME for up to two partons, and LO ME for up to four partons calculated with Comix [36] and  [37,38,39]. They were matched to the parton shower [40] using the MEPS@NLO prescription [41,42,43,44] using the set of tuned parameters developed by the authors. The PDF set [32] was used and the samples were normalised to a next-to-next-to-leading-order (NNLO) prediction [45]. This sample was used to check the impact of radiative effects on the muon momentum corrections, as features an alternative model than the nominal sample. The re-weighting procedure applied to the nominal \(Z \rightarrow \mu \mu \) events sample is also applied to this alternative sample.

The effect of multiple interactions in the same and neighbouring bunch crossings (pile-up) was modelled by overlaying the simulated hard-scattering event with inelastic proton–proton (pp) events generated with  [18] using the PDF set [34] and the A3 set of tuned parameters [46]. The MC events were weighted to reproduce the distribution of the average number of interactions per bunch crossing observed in the data.

The detector response was simulated using the Geant4 toolkit and reconstructed with the same algorithms as used for real data [47, 48]. The detector simulation assumes perfect alignment for the ID and MS sub-detectors. Dedicated simulated samples with misaligned detector geometry are used for the validation of the calibration procedure. These include samples with a rotation of the ID layers with a linear dependence on the radius [49] and a perfect geometry for the MS. They also include samples with the positions of the middle layer of the MS chambers shifted by amounts corresponding to the residual alignment precision [50].

4 Muon reconstruction, identification and event selection

4.1 Muon reconstruction and identification

The muon reconstruction is described in detail in Ref. [9]; only a brief summary is provided here. Muons are reconstructed using information from the ID and/or MS sub-detectors, which provide two independent measurements of muons crossing the ATLAS detector. In this analysis, the following reconstruction algorithms are used.

  • ID tracks: tracks reconstructed using ID hits only.

  • Standalone tracks (SA): muon candidates obtained using MS hits only; to define kinematic quantities at the interaction point, SA tracks are extrapolated to the beam-spot taking into account energy loss and multiple scattering in the material upstream of the muon spectrometer.

  • Combined tracks (CB): muon candidates obtained by starting from a MS track and matching it to an ID track. A combined fit to the hits, taking into account energy loss in the calorimeters and multiple scattering in the spectrometer, is performed.

Several working points (WPs) are introduced in Ref. [9], defining sets of quality cuts applied to reconstructed muons. The Medium WP is the baseline for this analysis; this algorithm only selects CB muons in \(|\eta |<2.5\) and applies a set of requirements to reject poorly reconstructed tracks. The \({High}\)-\(p_{\text {T}}\) WP applies tighter cuts on CB muons to ensure an optimal muon reconstruction for analyses interested in the very high \(p_{\text {T}}\) regime, with a better momentum resolution. Other working points target either lower background rates, at the expense of a lower efficiency, or higher efficiency but with a lower purity for muon identification; these WPs have a momentum resolution performance similar to the Medium one. The individual ID and SA tracks associated with the CB muons are still accessible and used in the analysis, as discussed in the following. For simplicity, muon candidates reconstructed using the ID, MS or combined ID+MS information are referred to as ID, MS or CB muons, respectively.

4.2 Event selection

Proton–proton collision vertices are constructed from reconstructed trajectories of charged particles in the ID with transverse momentum \(p_{\text {T}}\) \(> 500~\text {MeV} \). Events are required to have at least one collision vertex. The vertex with the highest \(\sum {p_{\text {T}} ^2}\) of reconstructed tracks is selected as the primary vertex of the hard interaction. The data are subjected to quality requirements to reject events in which detector components were not operating correctly [51].

Events are required to be selected by a combination of the following trigger chains:

  • For \(Z \rightarrow \mu \mu \) candidate events, unprescaled single-muon trigger chains with the lowest kinematic threshold available in each data sample are used.

    • Data collected in 2015: at least one muon with \(p_{\text {T}} >20~\text {GeV} \) and passing a loose isolation requirement based on the scalar sum of the \(p_{\text {T}}\) of tracks in a cone around the muon candidate track. Events are also retained if selected by a second chain requiring at least one muon with \(p_{\text {T}} >40~\text {GeV} \) without any isolation requirement.

    • Data collected in 2016, 2017, 2018: a similar strategy as the one used in 2015 was employed, but with the first chain requiring a muon with \(p_{\text {T}} >26~\text {GeV} \) passing a tighter isolation requirement, and the second chain requiring at least one muon with \(p_{\text {T}} >50\text {GeV} \).

  • For \(J/\psi \rightarrow \mu {}\mu \) and \(\varUpsilon \rightarrow \mu {}\mu \) candidate events, trigger chains requiring at least two muons in the event were used. The trigger algorithms also performed common vertex fits to pairs of oppositely charged muon candidates, requiring at least one of the fitted vertices to satisfy fit quality criteria and have an invariant mass consistent with that of a \(J/\psi \) or \(\varUpsilon \) resonance [15]. The last requirement is significantly looser than that applied by the offline analysis, to avoid introducing a bias. The trigger algorithms applied the following kinematic requirements on the muons associated with the candidate resonance:

    • Data collected in 2015: both muons must satisfy \(p_{\text {T}} >4~\text {GeV} \).

    • Data collected in 2016: both muons must satisfy \(p_{\text {T}} >6~\text {GeV} \).

    • Data collected in 2017 and 2018: both muons must satisfy \(p_{\text {T}} >6~\text {GeV} \). Additional requirements were also applied on the angular distance between the two muons associated with the vertex and on the displacement of the fitted vertex in the transverse plane. Events were also selected using auxiliary trigger algorithms requiring the leading (sub-leading) muon to have \(p_{\text {T}} >11~\text {GeV} \) (\(p_{\text {T}} >6~\text {GeV} \)) but without any additional requirement.

Events are required to have two oppositely charged CB muons satisfying the Medium WP requirements and spatially matched with the muon candidates reconstructed by the respective trigger chains. Further kinematic requirements are applied to ensure that selected muons are in the regions where the triggers used for the analysis were fully efficient:

  • For \(Z \rightarrow \mu \mu \) candidate events, the leading muon must satisfy \(p_{\text {T}} >27~\text {GeV} \).

  • For \(J/\psi \rightarrow \mu {}\mu \) and \(\varUpsilon \rightarrow \mu {}\mu \) candidate events, both muons must satisfy \(p_{\text {T}} >6.25~\text {GeV} \). For events selected only by the auxiliary trigger chain in the 2017 and 2018 data samples, the requirement on the leading muon is changed to \(p_{\text {T}} >11.25~\text {GeV} \).

For each muon, requirements are also applied on the transverse \((d_0)\) and longitudinal \((z_0)\) impact parameters. The \(d_0\) of a charged-particle track is defined in the plane transverse to the beam as the distance from the primary vertex to the track’s point of closest approach. The \(z_0\) is the distance in the z direction between this track point and the primary vertex; this distance is represented by \(z_0 \sin {\theta }\) in the longitudinal projection. The candidate muons are required to satisfy:

  • \(\frac{|d_{0}|}{\sigma (d_{0})}<3\), with \(\sigma (d_{0})\) representing the uncertainty in the measured \(d_{0}\) value;

  • \(|z_0 \sin {\theta }|<\)0.5 mm.

After application of these selections, the non-prompt component of \(J/\psi \rightarrow \mu {}\mu \) production is reduced to less than 0.5%.

Finally the invariant mass of the muon pair is required to satisfy:

  • \(2.6<m_{\mu \mu }<3.5~\text {GeV} \) for \(J/\psi \rightarrow \mu {}\mu \) candidate events;

  • \(7<m_{\mu \mu }<14~\text {GeV} \) for \(\varUpsilon \rightarrow \mu {}\mu \) candidate events;

  • \(70<m_{\mu \mu }<130~\text {GeV} \) for \(Z \rightarrow \mu \mu \) candidate events.

5 Momentum scale corrections

The muon momentum scale and resolution are studied using \(J/\psi \rightarrow \mu {}\mu \) and \(Z \rightarrow \mu \mu \) decays in data and simulated samples. While dedicated procedures are applied to correct the detector alignment [1, 49, 50], residual misalignments introduce a charge-dependent bias in the momentum measurement. Studies of this effect are discussed in Sect. 5.1, together with the determination of an appropriate set of corrections for the data. Simulated samples assume ideal detector alignment and thus do not require these corrections.

Although the simulation contains an accurate description of the ATLAS detector, the level of detail is not sufficient to reach the accuracy of 0.1% on the muon momentum scale and the percent level precision on the resolution measured in data. The analysis of this sample improves upon the excellent simulated description of the ATLAS detector and its interaction with muons, encompassing the best knowledge of the detector geometry, material distribution, and physics modelling at the time of event simulation. Sect. 5.2 details the measurement of momentum scale and resolution in data and simulation, and the determination of corresponding corrections for the simulated samples. Validation studies for the corrections are presented in Sect. 6.

5.1 Charge-dependent momentum scale calibration in data

After dedicated detector alignment procedures are applied, residual effects can still introduce a bias in the momentum measurement of the muon. These effects arise from both the ID and the MS sub-detectors.

The ID alignment is derived by a global \(\chi ^2\) minimisation of the track-hit residuals [49]. Residual detector displacements relative to the nominal detector geometry may however still be present, the weak modes, that are not fully corrected for by the current alignment procedures. ID alignment studies of these modes are periodically performed and data is reconstructed with an improved description of the detector geometry. Nevertheless, residual effects after this correction procedure still prevent the best possible precision being attained, motivating the study discussed in the following.

For the MS, the alignment procedures are less sensitive to charge-dependent biases, though residual effects may remain due to the limited precision of the procedure. An optical alignment system [1] monitors the position of the muon chambers relative to each other and relative to fiducial marks in the detector. The system compensates for chamber position variations as the data is collected. However, several aspects limit the accuracy of this system, including access shafts for detector operations and construction precision in some detector regions. Additionally, the analysis of dedicated data runs, either collecting cosmic-ray data or in proton–proton collisions with the magnetic field generated by the toroid systems switched off, allows the precision of the relative position of the muon chambers to be further constrained. This leads to a precision at the level of tens of micrometers on the sagitta of the muon track, and up to 120–130 \(\upmu \text {m}\) in specific detector regions [50]. This residual uncertainty in the alignment system manifests as charge-dependent effects in the muon spectrometer reconstruction procedure.

The precision of the muon momentum reconstruction can be further improved with a dedicated correction procedure accounting for the ID weak modes and MS alignment system uncertainties, which would otherwise degrade the momentum resolution. The charge-dependent bias can be approximated as:

$$\begin{aligned} \frac{q}{\hat{p}}=\frac{q}{p} + q \cdot \delta _s, \end{aligned}$$
(1)

where \(q=\pm 1\) is the charge of the muon, p and \(\hat{p}\) are the corrected and uncorrected momentum of the muon, respectively, and \(\delta _s\) is the strength of the bias. This parameterisation forms the basis for developing a correction to the data, recovering the residual bias and improving the momentum resolution.

To estimate (and later correct) the bias, the large sample of \(Z\rightarrow \mu \mu \) decays is used. The mass of the dimuon system, \(m_{\mu \mu }\), can be expressed as a function of the positively and negatively charged muons’ transverse momenta \(p_{\text {T}} ^+\) and \(p_{\text {T}} ^-\), respectively:

$$\begin{aligned} m_{\mu \mu }^{2} = 2 \cdot p_{\text {T}} ^+p_{\text {T}} ^- \cdot (\text {cosh}(\Delta \eta ) - \text {cos}(\Delta \phi )), \end{aligned}$$
(2)

where \(\Delta \eta \) and \(\Delta \phi \) are the differences between the pseudorapidity and azimuthal angles of the two muons. This expression is valid in the approximation of the muon mass being negligible with respect to the energies of the muons, and has been verified to hold in the context of these studies to sub-\(\text {MeV}\) precision. The bias is parameterised as a function of a grid of \(24\times 24\) equal-sized \(\eta \)\(\phi \) detector regions, a granularity chosen to ensure that the measurement is not affected by large statistical fluctuations while still correctly reflecting local biases or deformations. This high granularity allows the transverse momentum \(p_{\text {T}}\) of the measurement to be used instead of p, which is necessary when comparing ID and MS results, since the two sub-detectors have different bending planes due to their respective magnet systems. Combining the previous two equations, the biased invariant mass \(\hat{m}_{\mu \mu }^{2}\) can be expressed as:

$$\begin{aligned} \hat{m}_{\mu \mu }^{2} = \frac{m_{\mu \mu }^{2}}{(1+\delta _{s} p_{\text {T}} ^+)(1-\delta _{s} p_{\text {T}} ^-)}. \end{aligned}$$
(3)

Assuming the bias is small, Eq. (3) can be approximated as:

$$\begin{aligned} m_{\mu \mu }^{2} = \hat{m}_{\mu \mu }^{2}(1+\delta _{s}(\eta ,\phi ) p_{\text {T}} ^+ - \delta _{s}(\eta ,\phi ) p_{\text {T}} ^-). \end{aligned}$$
(4)

Since \(p_{\text {T}} ^{+}\sim p_{\text {T}} ^{-}\) in Z boson decays, charge-dependent biases largely cancel in the expression of \(m_{\mu \mu }^{2}\), hence do not impact the average dimuon mass over all detector regions, but only broaden the resolution of the \(Z\rightarrow \mu \mu \) peak. Therefore, the reconstructed invariant mass \(m_{\mu \mu }\) distribution defined in Eq. (4) is sensitive to the bias through its impact on the variance of the distribution. The values of the sagitta biases \(\hat{\delta }_{s}(\eta ,\phi )\) are evaluated by minimising the variance of the invariant mass distributions; then, the biased momentum of the muon \(\hat{p_{\text {T}}}\) is corrected using the following equation:

$$\begin{aligned} p_{\text {T}} = \frac{\hat{p_{\text {T}}}}{ 1 - q \hat{\delta }_{s}(\eta ,\phi )\hat{p_{\text {T}}}}. \end{aligned}$$
(5)

Finally, the updated \(p_{\text {T}}\) values are used to recalculate the invariant mass distribution and a new iteration is started. Given the large number of degrees of freedom induced by the dependence of the biases on \(\eta \) and \(\phi \), a direct solution of the equation becomes impractical. Instead, an iterative approach is chosen. At each iteration the values of \(\hat{\delta }_{s}(\eta ,\phi )\) obtained from the previous iteration are used as an input to \(\hat{p_{\text {T}}}\) of the next.

A complementary method to estimate this bias is also defined by approximating the average \(p_{\text {T}} \) of the muon to half the invariant mass of the dimuon pair. In this case, the bias can be approximated by:

$$\begin{aligned} \delta _s (\eta ,\phi ) = 4\cdot \frac{q (\hat{m}_{\mu \mu }(\eta ,\phi ) - \langle m_{\mu \mu } \rangle )}{\langle m_{\mu \mu } \rangle ^2}, \end{aligned}$$
(6)

where \(\langle m_{\mu \mu } \rangle \) is the average of the invariant mass of the dimuon pairs used to derive the correction, while \(\hat{m}_{\mu \mu }(\eta ,\phi )\) is the average invariant mass of the dimuon pairs when the muon with the highest transverse momentum is in the given \((\eta ,\phi )\) region.

Both methods additionally assume that the kinematic properties of positively and negatively charged muons produced by Z decays are symmetrical. In addition to the small asymmetry introduced by the weak mixing angle, this assumption breaks down due to the detector acceptance and selection requirements restricting the phase space of the selected muon pairs. The impact of these acceptance and kinematic restrictions can be estimated by applying the same procedure to events simulated with ideal detector alignment, obtaining a residual bias term \(\delta _{s}^{\text {MC}}\). The \(\delta _{s}^{\text {data}}\) measured in data can be corrected by subtracting the values measured in MC simulation, \(\delta _{s}^{\text {MC}}\), to define the final corrections:

$$\begin{aligned} \delta _{s}(\eta ,\phi ) = \delta _{s}^{\text {data}}(\eta ,\phi )-\delta _{s}^{\text {MC}}(\eta ,\phi ). \end{aligned}$$
(7)

Typical values for \(\delta _{s}^{\text {MC}}(\eta ,\phi )\) range from up to \(0.01~\text {TeV} ^{-1}\) in the central pseudorapidity region to up to about \(0.05~\text {TeV} ^{-1}\) in the forward pseudorapidity region, after event selection. The procedure is applied separately for the muon momentum measured using the CB, ID, and MS information, and dedicated corrections for each measurement are derived. Both procedures target a residual \(\delta _{s}\) of less than \(10^{-3}~\text {TeV} {}^{-1}\) through simultaneous iterative updates on the ID, MS, and CB momenta. The convergence is achieved once the target is reached for all momenta. For the first method this happens after five iterations on average, while the second typically requires 25 iterative updates. In both methods, the angular correlations between the two muons have a small impact on the estimate of \(\delta _s\), given the fine granularity chosen in \(\eta \times \phi \). Figure 1 shows the measured sagitta corrections evaluated on CB momenta, with biases up to 0.4 \(\text {TeV} {}^{-1}\). To avoid introducing a bias in the measurement due to the value assumed for \(m_{Z}\), the unbiased value of \(m_{\mu \mu }\) is taken from the data when integrating over all \(\eta \) and \(\phi \) bins for both methods. Results from the two methods, which rely on different assumptions, are compared. For the first method, a comparable bias for the leading and subleading \(p_{\text {T}} \) is assumed, while for the second method the approximation of \(p_{\text {T}} ^{+} \simeq p_{\text {T}} ^{-}\) is less precise at high \(\eta \) values. The \(\delta _s (\eta ,\phi )\) corrections obtained with the two methods have a correlation close to 100%, with an agreement between the two methods at the level of \(0.01~\text {TeV} {}^{-1}\) or better, with this difference taken as an additional systematic uncertainty. Results for the second method are given in Appendix A.

Fig. 1
figure 1

Charge-dependent biases on the muon \(p_{\text {T}}\) as evaluated on data for CB momenta after alignment and before applying the dedicated corrections. The biases are shown separately for the three data-taking periods, a years 2015 and 2016, b 2017, and c 2018. The bias is defined in Eq. (1)

Using the derived bias results, the transverse momenta of the muon are corrected based on Eq. (5) using the \(\hat{\delta }_{s}\) derived after the last iteration. The biases are re-evaluated after the application of the corrections and the results are shown in Fig. 2 for CB momenta. Both methods address relative differences between nearby \(\eta , \phi \) regions, as well as biases that are constant across full sub-detector regions. The biases are reduced to less than \(2\cdot 10^{-4}~\text {TeV} {}^{-1}\) in all regions of the detector, validating the method to several orders of magnitude better than the typical size of the correction.

To validate the correction procedure, simulated samples with misaligned detector configurations, as described in Sect. 3.2, were used. The injected biases were produced by a rotation of the ID detector layers with a linear dependence on the radial distance from the interaction point. The sizes of these biases in the simulation are the same as those observed in data from \(Z \rightarrow \mu \mu \) decays before applying the correction procedure. Figure 3 uses these biased detector geometry samples to compare the reconstructed and generated momentum distributions, as evaluated before and after the correction, plotted separately for positive and negative muons. The application of the correction reduces the differences between the positively and negatively charged muons, improving the reconstructed \(p_{\text {T}}\) resolution. Further closure tests are performed using simulated samples with distorted geometry on \(H\rightarrow ZZ\rightarrow \mu \mu \mu \mu \) and hypothetical narrow resonances decaying into two muons with masses greater than \(500~\text {GeV} \), to estimate the impact of the correction in muon momentum regimes far from that of muons originating from Z decays. Closure is observed at the same level of precision.

Fig. 2
figure 2

Residual charge-dependent biases on the muon \(p_{\text {T}}\) as evaluated on data for CB momenta after alignment and after application of the correction procedure. The residual biases are shown separately for the three data-taking periods, a years 2015 and 2016, b 2017, and c 2018. The bias is defined in Eq. (1)

Fig. 3
figure 3

The bias of the transverse momentum measured in simulated data. The upper panels show the leading muon in the event \((p_{\text {T}} ^{\text {lead}})\) compared with its true value separately for positive muons and negative muons, a for \(-1.05<\eta <0\) and b for \(0<\eta <1.05\). The dotted lines correspond to the momentum bias as evaluated on a simulation with a biased geometry, for positive and negative muons. The markers correspond to the bias evaluated with the same simulation after the application of the correction procedures. The lower panels show the ratios of positive to negative muons before and after correction

Several sources of systematic uncertainty associated with the correction procedure are studied. They relate to both the accuracy of the measurement of \(\delta _{s}\) and the assumptions used in the correction procedure.

The systematic uncertainty originating from the residual non-closure of the correction compared with a perfectly aligned detector simulation is evaluated as follows. First, the \(\delta _{s}\) value obtained after the last iteration on data is used to bias the \(p_{\text {T}} \) in a perfectly aligned simulation. This is done using Eq. (5), by flipping the sign of \(\delta _{s}\), and using the same granularity in \(\eta \) and \(\phi \) as used for deriving the correction. The resulting \(p_{\text {T}} \) is compared with the unbiased momentum and the difference is taken as a systematic uncertainty. A second component of the residual non-closure is estimated in simulation by injecting a set of constant known biases and by taking the difference of \(\delta _{s}\) between the injected and measured \(\delta _{s}\). The injected biases range from percent fractions up to twice those observed in the data at the first iteration of the correction procedure; the largest resulting \(\delta _{s}\) difference is taken as the uncertainty for all bins. Both methods show very good closure for the full range of injected biases. The uncertainties originating from the non-closures amount to about 0.02 \(\text {TeV} {}^{-1}\), depending on the pseudorapidity region.

A second contribution to the systematic uncertainties arises from extrapolations to momenta outside the phase-space region covered by Z decays. In fact, the relative contribution of the ID and MS detectors on the CB momentum measurement may vary in a non-trivial way. This may lead to biases on the estimate of \(\delta _{s}\) and of the correction. In this case, the residual bias is evaluated as the difference between simulation and data and as a function of \(\eta \) and \(p_{\text {T}} \) using the alternative method of Eq. (6). The difference compared with the simulation is used to account for the kinematic dependencies of the \(\delta _{s}\) estimate. The resulting uncertainty is small, compared with the non-closure described previously, in the central regions of pseudorapidity. However, in the forward pseudorapidity regions, it becomes sizeable and evolves as \(2~\times 10^{-4}~\text {TeV} {}^{-2}~\times (p_{\text {T}}\ - 0.045~\text {TeV})\).

The statistical uncertainties due to the observed and simulated number of Z decays are also taken into account. These are approximately an order of magnitude smaller than the first two components, but in the forward region they reach up to 0.02 \(\text {TeV} ^{-1}\).

Given the lack of sufficient \(Z \rightarrow \mu \mu \) decays above \(450~\text {GeV} \) in data to validate the methodology, the data are not corrected for charge-dependent biases beyond this value. Instead, an uncertainty accounting for the charge-dependent biases before any correction is assigned to the simulated momenta. This is estimated by biasing the simulated muon momenta in a perfectly aligned simulation, according to Eq. (5), using the values of \(\delta _{s}\) as measured before any correction. As done for the non-closure uncertainty, the uncertainty is derived by injecting bias into simulated events and taking the difference between the biased and the non-biased muon momentum values, resulting in an uncertainty of up to 0.4 \(\text {TeV} ^{-1}\) as shown in Fig. 1.

5.2 Muon momentum calibration procedure in simulation

In the following, the muon momentum calibration is defined as the procedure used to identify the corrections to the reconstructed muon transverse momenta in simulation in order to match the measurement of the same quantities in data. This procedure is performed after correcting for the charge-dependent bias discussed in the previous section. The transverse momenta of the ID and MS tracks associated with a CB muon, referred to as \(p_{\text {T}} ^{\text {ID}}\) and \(p_{\text {T}} ^{\text {MS}}\), respectively, are used in addition to the transverse momentum of the CB track, referred to as \(p_{\text {T}} ^{\text {CB}}\).

The corrected transverse momentum after applying the calibration procedure, \(p_{\text {T}} ^{{\text {Cor}},{\text {Det}}}\) (Det \(=\) CB, ID, MS), is described as:

$$\begin{aligned} p_{\text {T}} ^{{\text {Cor}},{\text {Det}}}= & {} \frac{p_{\text {T}} ^{{\text {MC}},{\text {Det}}} + \sum \nolimits ^{1}_{n=0} s^{\textrm{Det}}_{n}(\eta ,\phi )\left( p_{\text {T}} ^{{\text {MC}},{\text {Det}}}\right) ^{n}}{1+\sum \nolimits ^{2}_{m=0} \Delta r^{\textrm{Det}}_{m}(\eta ,\phi )\left( p_{\text {T}} ^{{\text {MC}},{\text {Det}}}\right) ^{m-1}g_{m}}, \end{aligned}$$
(8)

where \(p_{\text {T}} ^{{\text {MC}},{\text {Det}}}\) is the uncorrected transverse momentum in simulation, \(g_{m}\) are normally distributed random variables with zero mean and unit width, and the terms \(\Delta r_{m}^{\text {Det}}(\eta ,\phi )\) and \(s_n^{\text {Det}}(\eta , \phi )\) describe the momentum resolution smearing and scale corrections respectively, applied in a specific \((\eta , \phi )\) detector region. A possible \(s_2^{\text {Det}}(\eta , \phi )\) term in the numerator is neglected because it would model effects already corrected in data with the procedure described in Sect. 5.1.

The corrections described in Eq. (8) are defined in \(\eta \)\(\phi \) detector regions homogeneous in detector technology and performance. All corrections are divided into 18 pseudorapidity regions. In addition, the CB and MS corrections are divided into two \(\phi \) bins separating the two types of MS sectors: those that include the magnet coils (small sectors) and those between two coils (large sectors). The small and large MS sectors are subjected to independent alignment techniques and cover detector areas with different material distribution, leading to scale and resolution differences.

The numerator of Eq. (8) describes the momentum scales. The \(s_1^{\textrm{Det}}(\eta , \phi )\) term corrects for inaccuracy in the description of the magnetic field integral and the dimension of the detector in the direction perpendicular to the magnetic field. The \(s^{\textrm{Det}}_0(\eta , \phi )\) term models the effect on the CB and MS momentum from the inaccuracy in the simulation of the energy loss in the calorimeter and other material between the interaction point and the exit of the MS. As the energy loss between the interaction point and the ID is negligible, \(s^{\textrm{ID}}_0(\eta )\) is set to zero [10].

The denominator of Eq. (8) describes the momentum smearing that broadens the relative \(p_{\text {T}}\) resolution in simulation, \(\sigma (p_{\text {T}})/ p_{\text {T}} \), to properly describe the data. The corrections to the resolution assume that the relative \(p_{\text {T}}\) resolution can be parameterized as follows:

$$\begin{aligned} \frac{ \sigma (p_{\text {T}})}{p_{\text {T}}} = r_0/p_{\text {T}} \oplus r_1 \oplus r_2\cdot p_{\text {T}}, \end{aligned}$$
(9)

with \(\oplus \) denoting a sum in quadrature. In Eq. (9), the first term \((r_0)\) mainly accounts for fluctuations of the energy loss in the traversed material; the second term \((r_1)\) mainly accounts for multiple scattering, uncertainties related to, and inhomogeneities in, the modelling of the local magnetic field, and length-scale radial expansions of the detector layers; and the third term \((r_2)\) mainly describes intrinsic resolution effects caused by the spatial resolution of the hit measurements and residual misalignments between the different detector elements. The energy loss term has a negligible impact on the muon resolution in the momentum range considered, and therefore \(\Delta r^{\textrm{Det}}_{0}\) is set to zero.

In a second step, to cross-check the validity of the corrections obtained directly for the CB tracks, the corrected combined momenta from ID and MS measurements \(p_{\text {T}} ^{{\text {Corr}}\ {\text {ID}}+{\text {MS}}}\) is also obtained by combining the ID and MS corrected momenta with a weighted average:

$$\begin{aligned} p_{\text {T}} ^{{\text {Cor}},{\text {ID}}+{\text {MS}}} = f\cdot p_{\text {T}} ^{{\text {Cor}},{\text {ID}}} + (1 - f)\cdot p_{\text {T}} ^{{\text {Cor}}, {\text {MS}}}. \end{aligned}$$
(10)

The weight f is calculated by solving the following linear equation:

$$\begin{aligned} p_{\text {T}} ^{{\text {MC}},{\text {CB}}} = f\cdot p_{\text {T}} ^{{\text {MC}},{\text {ID}}} + (1 - f)\cdot p_{\text {T}} ^{{\text {MC}}, {\text {MS}}}, \end{aligned}$$
(11)

which assumes that the relative contribution of the two sub-detectors to the combined track remains unchanged before and after momentum corrections.

5.2.1 Determination of the \(p_{\text {T}}\) calibration constants

The CB, ID, and MS correction parameters contained in Eq. (8) are extracted from data using a fitting procedure that compares the invariant mass distributions for \(J/\psi \rightarrow \mu {}\mu \) and \(Z \rightarrow \mu \mu \) candidates in data and simulation, selected as discussed in Sect. 4.

When extracting correction parameters, muons are assigned to \(\eta \)\(\phi \) regions of fit (ROFs), which are defined separately for the ID and the MS. The values of \(\Delta r^{\textrm{ID}}_{0}\), \( \Delta r^{\textrm{MS}}_{0}\), \( \Delta r^{\textrm{CB}}_{0}\), and \(s^{\textrm{ID}}_0\), are set to zero as previously discussed, while \( \Delta r^{\textrm{MS}}_{2}\) is determined from alignment studies using cosmic-ray data and special runs with the toroidal magnetic field turned off [50], and from detector calibration procedures [17].

The corrections are extracted using the distributions of the dimuon invariant mass, \(m_{\mu \mu }^{\textrm{Det}}\). When fitting a specific ROF, one muon is required to belong to the ROF and the other can be anywhere in the detector. To enhance the sensitivity to \(p_{\text {T}} \)-dependent correction effects, the dimuon pair is classified according to the \(p_{\text {T}} \) of the muons. For \(J/\psi \rightarrow \mu {}\mu \) decays, the fit is performed in two exclusive categories defined by the subleading muon transverse momentum: \(6.25< p_{\text {T}} ^{{\text {Det}},{\text {sublead}}} < 9~\text {GeV} \) and \(p_{\text {T}} ^{{\text {Det}},{\text {sublead}}} > 9~\text {GeV} \). For \(Z \rightarrow \mu \mu \) decays the two categories are defined by the leading muon transverse momentum: \( 20< p_{\text {T}} ^{{\text {Det}},{\text {lead}}} < 50~\text {GeV} \) and \(p_{\text {T}} ^{{\text {Det}},{\text {lead}}}> 50~\text {GeV} \).

Templates for the \(m_{\mu \mu }^{\text {Det}}\) variables are built using \(J/\psi \rightarrow \mu {}\mu \) and \(Z \rightarrow \mu \mu \) simulated signal samples. The first step in the fitting procedure consists of estimating the contribution of the background processes. For the \(Z \rightarrow \mu \mu \) selection, the small background component (approximately 0.1%) is extracted from simulation and added to the templates. A much larger (up to 15%) non-resonant background from decays of light and heavy hadrons and from continuum Drell–Yan production is present for events selected in the \(J/\psi \rightarrow \mu {}\mu \) channel, estimated with a data-driven approach as this background is difficult to simulate. The dimuon invariant mass distribution in data is fitted in each ROF using a Crystal Ball function [52] with an exponential background distribution. This background model is then combined with the simulated signal templates used in the fit.

To estimate the scale and smearing parameters, a multi-stage procedure is used. A two-step random walk minimisation is first performed where the parameters are sampled within a specified range. The parameter configuration that leads to the best agreement for the invariant mass distribution between simulation and data is kept. In the first step, the best agreement is defined by a simplified metric where the compatibility between the means and standard deviations of the distributions is compared. In the second step, a binned \(\chi ^2\) is computed that compares the dimuon mass spectrum in simulation to that of data. Around this best-agreement parameter configuration, signal templates are then generated at discrete intervals and interpolated using moment morphing [53]. Using this signal model, a full minimisation using the binned \(\chi ^2\) is performed. This procedure is iteratively repeated, where the second muon outside the ROF definition is calibrated using results from the previous iteration, until the change of each parameter stays within the root mean squared of its values as estimated from the last five iterations. This happens after 16 iterations. Each calibration parameter’s values from the last five iterations are averaged to produce their final value.

Table 1 Summary of ID muon momentum resolution and scale corrections used in Eq. (8), averaged over the three main detector regions. The corrections are derived in 18 pseudorapidity regions, as described in Sect. 5, and averaged, with each region assigned a weight proportional to its \(\eta \) width. The uncertainties represent the sum in quadrature of the statistical and systematic uncertainties
Table 2 Summary of MS momentum resolution and scale corrections for small and large MS sectors, averaged over three main detector regions. The corrections for large and small MS sectors are derived in 18 pseudorapidity regions, as described in Sect. 5, and averaged, with each region assigned a weight proportional to its \(\eta \) width. The uncertainties represent the sum in quadrature of the statistical and systematic uncertainties. The larger uncertainties for some terms in the \(1.05\le |\eta |<2.0\) region are associated with the low bending power of the magnet system of the MS in part of that region
Table 3 Summary of CB momentum resolution and scale corrections for small and large MS sectors, averaged over three main detector regions. The corrections for large and small MS sectors are derived in 18 pseudorapidity regions, as described in Sect. 5, and averaged, with each region assigned a weight proportional to its \(\eta \) width. The energy loss term \(\Delta r_0^{\text {CB}}\) is negligible and therefore fixed to zero in the fit for all \(\eta \). The uncertainties represent the sum in quadrature of the statistical and systematic uncertainties

The calibration parameters obtained from the fits to the data are summarised in Tables 1, 2, and 3, averaged over three \(\eta \) regions. The sums in quadrature of the statistical and systematic uncertainties are shown, with the latter dominating. All sources of uncertainties are evaluated by varying the parameters of the template fit. The higher uncertainty for \(\Delta r_1^{\textrm{ID}}\) in the second bin is associated with the region of the ID with the largest amount of material, corresponding to the transition between the barrel and endcap region of its subdetectors. The increase in uncertainty as a function of \(|\eta |\) for \(s_1^{\textrm{ID}}\) is due to the decrease of the magnetic field integral of the solenoid as a function of \(|\eta |\). The larger uncertainties for some of the MS terms in the \(1.05\le |\eta |<2.0\) region are associated with the low bending power of the magnet system of the MS in part of that region.

The main contributions to the final systematic uncertainty are:

  • \(J/\psi \) only and Z only fit: The fit is repeated using only the \(J/\psi \rightarrow \mu {}\mu \) or \(Z \rightarrow \mu \mu \) decays with only \(s_1^{\textrm{Det}}\) left as a free parameter and the other parameters fixed to their nominal fitted value. The resulting uncertainty is defined by taking the difference relative to the nominal fitted \(s_1^{\textrm{Det}}\) and accounts for the extrapolation from the regions dominated by either the \(J/\psi \rightarrow \mu {}\mu \) or the \(Z \rightarrow \mu \mu \) in their respective \(p_{\text {T}}\) spectra.

  • Z kinematic reweighting: As discussed in Sect. 3.2, the reweighting maps are derived separately for each simulated sample. To evaluate the impact of the reweighting on the analysis, new templates of \(m_{\mu \mu }^{\textrm{ID}}\), \(m_{\mu \mu }^{\textrm{CB}}\), and \(m_{\mu \mu }^{\textrm{MS}}\) are produced. The results obtained when using the different \(Z \rightarrow \mu \mu \) samples with their respective reweighting scheme and the non-reweighted results for the nominal \(Z \rightarrow \mu \mu \) samples are used to derive the uncertainty. This systematic uncertainty impacts the overall Z lineshape by changing the relative weight of various momentum regimes and, by extension, the contributions of their dedicated corrections in the final fit.

  • Decay modelling and final state radiation modelling: The nominal simulated sample of \(Z \rightarrow \mu \mu \) events, of which the QED final-state radiation is handled by , is compared with \(Z \rightarrow \mu \mu \) events simulated with the sample which uses an alternative radiative modelling. The impact of the uncertainties in the decay modelling are significantly smaller than those originating from the components discussed above, partially due to the already discussed kinematic reweighting.

  • \(J/\psi \) and Z \(p_{\text {T}}\) template range: The \(p_{\text {T}}\) ranges of the \(J/\psi \) and Z templates are varied. For \(J/\psi \), the ranges of the two nominal \(p_{\text {T}}\) templates are \(p_{\text {T}} ^{J/\psi , {\text {nom}}} \in [6.25, 9]~\text {GeV} \) and \(p_{\text {T}} ^{J/\psi , {\text {nom}}} \in [9, 20]~\text {GeV} \). For Z, the ranges of two templates are \(p_{\text {T}} ^{Z, {\text {nom}}} \in ~[20, 50]~\text {GeV} \) and \(p_{\text {T}} ^{Z, {\text {nom}}} \in [50, 300]~\text {GeV} \). The boundary of the two templates for \(J/\psi \) is varied from 9 to 8 and 12 \(\text {GeV}\), and for Z is varied from 50 to 40 and 80 \(\text {GeV}\). This preserves the number of events, and covers any additional \(p_{\text {T}}\) dependencies.

  • Mass binning of \(J/\psi \) and Z: The number of bins for the \(m_{\mu \mu }\) templates is reduced from 200 to 150 for Z, and from 90 to 60 for \(J/\psi \). This systematic uncertainty covers any binning effect in generating templates.

  • Z mass window: New templates of \(m_{\mu \mu }\) are generated for the \(Z \rightarrow \mu \mu \) sample changing the \(m_{Z}\) window selection to \(75< m_{\mu \mu } < 115~\text {GeV} \) for all track types. The high and low mass regions away from the resonant pole of the Z are more sensitive to initial- and final-state radiation contributions, the running of \(\alpha ^{\text {EM}}_{Z}\), and other specific choices of the MC generators used in the \(Z \rightarrow \mu \mu \) simulation.

  • \(J/\psi \) mass window: The same approach as in the case of the Z is applied, changing the invariant mass window selection to \(2.75< m_{\mu \mu } < 3.4~\text {GeV} \) for all track types. The region furthest from the \(m_{J/\psi }\) peak is more sensitive to the shape variation of the combinatorial background, and the specific choices of the MC generators used in the \(J/\psi \rightarrow \mu {}\mu \) simulation.

  • \(J/\psi \) background: New \(m_{\mu \mu }\) templates are generated for the \(J/\psi \rightarrow \mu {}\mu \) samples using a background parameterisation different from the exponential function. For this systematic uncertainty, background parameterisations including a combination of two Crystal ball functions and a Chebyshev function are used.

  • Closure test and statistical uncertainty: The root mean squared of the last five iterations of each calibration parameter is used as the statistical uncertainty of the method. This accounts for any instabilities in the fit, and the correlations between different parameters. It was observed that small changes in the starting parameters of the fit do not affect its stability.

The largest component of the total uncertainty in the momentum scale originates from the comparison of the \(J/\psi \) and Z only fits. The remaining components are smaller by about a factor of two. For the resolution smearing terms, in the same kinematic regime, the largest uncertainty comes from the \(J/\psi \) and Z \(p_{\text {T}}\) template fit range. At significantly higher momenta the uncertainty in the \(J/\psi \) and Z \(p_{\text {T}}\) template fit range still dominates the resolution uncertainty in \(p_{\text {T}} ^{{\text {Cor}},{\text {CB}}}\).

As an additional check, the value of \(p_{\text {T}} ^{{\text {Cor}},{\text {ID}}+{\text {MS}}}\) is compared with \(p_{\text {T}} ^{{\text {Cor}},{\text {CB}}}\). Results from both methods are in agreement within their respective systematic uncertainties. The comparison is performed on simulated samples with a biased alignment of the MS on muons with momenta ranging from few tens of \(\text {GeV}\) to about 1 \(\text {TeV}\). The uncertainties in the momentum scale of \(p_{\text {T}} ^{{\text {Cor}},{\text {CB}}}\) are found to be smaller than those of \(p_{\text {T}} ^{{\text {Cor}}, {\text {ID}}+{\text {MS}}}\) with a slightly larger uncertainty increase in the resolution as a function of the transverse momentum, as shown in Appendix B. At high momenta the dominant uncertainty in \(p_{\text {T}} ^{\text {Cor},\text {ID}+\text {MS}}\) originates from the residual muon spectrometer misalignment.

Fig. 4
figure 4

Resolution of the muon \(p_{\text {T}}\) as obtained from simulation after derivation and application of all correction constants. Muons are selected using the \({High}\)-\(p_{\text {T}}\) WP. The resolution is shown as a function of the true \(p_{\text {T}}\) of the muon for a range from 1 \(\text {GeV}\) to 2.5 \(\text {TeV}\), a for muons with \(|\eta |<1.05\) and b for muons with \(|\eta |>1.05\). The resolution lines are obtained by interpolating between points sampled in steps of \(p_{\text {T}} \). The continuous lines correspond to the CB momentum, while the dashed lines to that from the ID, and the dotted lines to that of the MS. Statistical fluctuations, due to the simulated sample size result in small fluctuations on the interpolation between measured values

The resolution as a function of the muon momentum is shown in Fig. 4, (a) for muons in the barrel region \((|\eta |<1.05)\) and (b) for all other muons \((|\eta |>1.05).\) The resolution is estimated by taking half of the smallest interval containing the 68% of the distribution of the relative difference between the corrected and the generated single muon momentum in simulation for muons satisfying the \({High}\)-\(p_{\text {T}}\) WP. The resolution is approximately 2% (3%) at \(p_{\text {T}}\) values of 45 \(\text {GeV}\) in the \(|\eta |<1.05\) \((|\eta |>1.05)\) region and increases with \(p_{\text {T}}\). The resulting CB track momentum resolution is always better than that of the individual measurements in the ID or the MS. At low \(p_{\text {T}}\) the ID dominates while in the intermediate regime of few tens of \(\text {GeV}\) to a few hundreds of \(\text {GeV}\) both detector systems contribute equally to the measurement. At \(p_{\text {T}}\) values higher than few hundred \(\text {GeV}\) the resolution is dominated by the MS. The resolution is approximately 14% (19%) at 1 \(\text {TeV}\) in the \(|\eta |<1.05\) \((|\eta |>1.05)\) region.

6 Performance studies

The samples of \(J/\psi \rightarrow \mu {}\mu \), \(\varUpsilon \rightarrow \mu {}\mu \), and \(Z \rightarrow \mu \mu \) decays are used to validate the momentum corrections obtained with the template fit method described in the previous sections and measure the muon momentum reconstruction performance. The detector segmentation applied to \(J/\psi \rightarrow \mu {}\mu \) and \(Z \rightarrow \mu \mu \) decays is chosen to be at least twice as fine as that used for deriving the simulation corrections. Furthermore, the momentum requirements used in both cases are looser than those used for deriving the corrections. The combination of the finer segmentation and looser requirements allows the data-driven validation of the correction methods both within the bins assumed in the template fits and extrapolated in \(p_{\text {T}}\) beyond the range of the calibration procedure. To complete the data-driven validation and performance measurements, an additional study is performed using \(\varUpsilon \rightarrow \mu {}\mu \) decays, which are statistically fully independent of the samples used to derived the calibration corrections.

The invariant mass distributions for the \(J/\psi \rightarrow \mu {}\mu \) and \(Z \rightarrow \mu \mu \) candidates are shown in Fig. 5 and compared with corrected simulation. The lineshapes of the two resonances in simulation agree with the data within the systematic uncertainties, demonstrating the overall effectiveness of the \(p_{\text {T}}\) calibration.

Fig. 5
figure 5

Dimuon invariant mass distribution of a \(J/\psi \rightarrow \mu {}\mu \) and b \(Z \rightarrow \mu \mu \) candidate events reconstructed with CB muons. The upper panels show the invariant mass distribution for data and for the signal simulation plus the background estimate. The points show the data. The continuous line corresponds to the simulation with the MC momentum corrections applied. Background estimates are added to the signal simulation. The bands represent the effect of the systematic uncertainties in the MC momentum corrections. The lower panels show the ratios of the data to the MC simulation. In the Z sample, the MC background samples are added to the signal sample according to their expected cross-sections. In the \(J/\psi \) sample, the background is estimated from a fit to the data as described in the text. The sum of background and signal MC distributions is normalised to the data

When the two muons have similar momentum resolution and angular effects are neglected, the relative mass resolution, \(\sigma _{\mu \mu }/m_{\mu \mu }\), is directly proportional to the relative muon momentum resolution, \(\sigma _{p_{\mu }}/p_{\mu }\):

$$\begin{aligned} \frac{\sigma _{\mu \mu }}{m_{\mu \mu }} =\frac{1}{\sqrt{2}} \frac{\sigma _{p_{\mu }}}{p_{\mu }}. \end{aligned}$$
(12)

Similarly, the total muon momentum scale, defined as \(s = \langle (p^{\textrm{meas}} - p^{\textrm{true}})/ p^{\textrm{true}} \rangle \), is directly related to the dimuon mass scale, defined as \( s_{\mu \mu } = \langle (m_{\mu \mu }^{\textrm{meas}} - m_{\mu \mu }^{\textrm{true}})/ m_{\mu \mu }^{\textrm{true}} \rangle \):

$$\begin{aligned} s_{\mu \mu } = \sqrt{s_{\mu _1}s_{\mu _2}}, \end{aligned}$$
(13)

where \(s_{\mu _1}\) and \(s_{\mu _2}\) are the momentum scales of the two muons. The effectiveness of the momentum calibration is also measured by comparing the mean value \(m_{\mu \mu }\), and resolution \(\sigma (m_{\mu \mu })\) of the dimuon mass resonances. To measure such quantities, fully analytical fit functions are created for each resonance modelling both the signal and the background. Using the same definition of the resolution function for all resonances, this methodology allows a direct comparison of the resolution quantities as extracted separately for each resonance. In turn, this allows a precise measurement of the momentum reconstruction performance across a wide kinematic regime. The fitting procedures used to measure the mean value and resolution quantities are optimised, subsequently, for each resonance.

Fig. 6
figure 6

Fitted mean mass of the dimuon system for CB muons for a \(J/\psi \rightarrow \mu {}\mu \) and b \(Z \rightarrow \mu \mu \) events for data and corrected simulation as a function of the pseudorapidity of the highest-\(p_{\text {T}} \) muon. The upper panels show the fitted mean mass value for data and corrected simulation. The small variations of the invariant mass estimator as a function of pseudorapidity are due to imperfect energy loss corrections and magnetic field description in the muon reconstruction. Both effects are well reproduced in the simulation. The lower panels show the ratios of the data to the MC simulations. The error bars represent the statistical uncertainty; the shaded bands represent the systematic uncertainty in the correction and the systematic uncertainty in the extraction method added in quadrature

In \(J/\psi \rightarrow \mu {}\mu \) decays, the intrinsic width of the resonance is negligible compared with the experimental resolution. In contrast to Sect. 5.2, the mass window for the \(J/\psi \rightarrow \mu {}\mu \) validation fits is modified to \(2.8<m_{\mu \mu }<3.9~\text {GeV} \). The lower bound is raised to remove the abrupt spectrum sculpting due to the trigger criteria, which is difficult to model well analytically. The upper bound is extended to include the \(\Psi (2S)\) resonance and add a lever arm to constrain the estimation of the resolution.

The resonant peak of the \(J/\psi \) is modelled by a double-sided Crystal Ball function; the four parameters modelling the tails are extracted from simulation, and subsequently fixed in the fits to data. A secondary double-sided Crystal Ball function models the \(\Psi (2S)\) resonance that follows the 1s resonance in the \(J/\psi \) validation mass window. Similarly to the 1S resonance, the parameters modelling the tails are extracted from simulation and kept fixed when fitting to data. The mean value parameter of the resolution function is kept free floating for the \(\Psi (2S)\) resonance for fits on simulated and real data, while the resolution parameter is constrained to scale linearly with the resolution parameter of the 1S resonance using the ratio of the mean parameters. The non-resonant background or the background from mis-identified muons is described by an exponential function.

Fig. 7
figure 7

Dimuon invariant mass resolution for CB muons for a \(J/\psi \rightarrow \mu {}\mu \) and b \(Z \rightarrow \mu \mu \) events for data and corrected simulation as a function of the pseudorapidity of the highest-\(p_{\text {T}} \) muon. The upper panels show the fitted resolution value for data and corrected simulation. The lower panels show the ratios of the data to the MC simulations. The error bars represent the statistical uncertainty; the shaded bands represent the systematic uncertainty in the correction and the systematic uncertainty in the extraction method added in quadrature. For the \(Z \rightarrow \mu \mu \) the estimator shown in the plot is only sensitive to the detector resolution and not to the natural width of the Z boson, as discussed in the text

In \(Z \rightarrow \mu \mu \) decays, the fits use a convolution of the true lineshape (modelled by a Breit–Wigner function) with an experimental resolution function (a double-sided Crystal Ball). A fit range of \(75<m_{\mu \mu }<105~\text {GeV} \) is used. Similarly to the \(J/\psi \), the non-resonant background is described by an exponential function. The peak position and width of the Crystal Ball function are used as estimators for the \(m_{\mu \mu }\) and \(\sigma (m_{\mu \mu })\) variables in each of the \(\eta \) and \(p_{\text {T}}\) bins.

Figure 6 shows the position of the mean value of the invariant mass distribution, \(m_{\mu \mu }\), obtained from the fits to the Z boson and \(J/\psi \) samples as a function of the pseudorapidity of the highest-\(p_{\text {T}} \) muon for each decay. The distributions are shown for data and corrected simulation, with the ratios of the two in the lower panels. The simulation is in good agreement with the data. Minor deviations are within the momentum scale systematic uncertainties of \(0.05\%\) in the barrel region increasing with \(|\eta |\) to \(0.15\%\) for \(J/\psi \rightarrow \mu {}\mu \) decays, and to \(0.05\%\) in the barrel region increasing with \(|\eta |\) to \(0.1\%\) for \(Z \rightarrow \mu \mu \) decays. The systematic uncertainties shown in the plots include the effects of the uncertainties in the calibration constants described in Sect. 5.2. The observed level of agreement demonstrates that the \(p_{\text {T}}\) calibration for combined muon tracks described above provides an accurate description of the momentum scale in all \(\eta \) regions, over a wide \(p_{\text {T}}\) range. Similar levels of data/MC agreement are observed for the ID and MS components of the combined tracks.

Figure 7 displays the dimuon mass resolution \(\sigma (m_{\mu \mu })\) as a function of the leading-muon \(\eta \) for the two resonances. The dimuon mass resolution is about 1.3% and 1.6% at small \(\eta \) values for the \(J/\psi \) and Z bosons, respectively, and increases to 2.1% and 2.4% in the endcaps. This corresponds to a relative muon \(p_{\text {T}}\) resolution of 1.8% and 2.3% in the centre of the detector and 3.0% and 3.4% in the endcaps for \(J/\psi \) and Z boson decays, respectively. Uncertainties in the dimuon mass resolution range between 2% and 5% for the \(J/\psi \) and between 3% and about 6% for the Z boson, depending on the detector region.

Fig. 8
figure 8

Dimuon invariant mass distribution of \(\varUpsilon \rightarrow \mu {}\mu \) candidate events reconstructed with CB muons. The upper panels show the invariant mass distribution for data and for the signal simulation plus the background estimate. The points show the data. The continuous line corresponds to the simulation with the MC momentum corrections applied. The band represents the effect of the systematic uncertainties in the MC momentum corrections. The lower panel shows ratio of the data to the MC simulations. The background is estimated from a fit to the data as described in the text. The sum of background and signal MC distributions is normalised to the data

Using the same methodologies as for the \(J/\psi \rightarrow \mu {}\mu \) decays, the measurement of the scale dependence and resolution is repeated on the fully independent set of data from \(\varUpsilon \rightarrow \mu {}\mu \) decays. The same approach as for the \(J/\psi \) resonances models the 2S and 3S resonances of the \(\Upsilon \). This entails constraining the resolution parameter of the 3S resonance to scale linearly with that of the 2S resonance and that of the 2S resonance to scale linearly with that of the 1S resonance. The mean parameters of the resolution models of the 2S and 3S are kept free independently in data and simulation, similarly to what is done for the \(J/\psi \) resonance. The background is, however, modelled using a polynomial function derived from data. Figure 8 shows the data and simulation agreement after applying the momentum corrections on simulation for the invariant mass distributions of the three resonances of the \(\varUpsilon \rightarrow \mu {}\mu \). As done for the \(J/\psi \rightarrow \mu {}\mu \) and \(Z \rightarrow \mu \mu \) decays, the momentum scale dependency and the resolution are extracted as a function of the pseudorapidity of the leading muon in the event. Figure 9 compares the data and simulation after all corrections. The data agree with the simulation within the systematic uncertainties. The uncertainty in the momentum scale is up to 0.05% in the central region and up to 0.1% in the forward region. Similarly, the extracted resolution is in agreement within about 3%. Good agreement between the dimuon mass resolution measured in data and simulation is also observed when deriving the corrections independently for the ID and MS components of the combined tracks, as shown in Appendix B.

Fig. 9
figure 9

a Fitted mean mass of the dimuon system for CB muons for \(\varUpsilon \rightarrow \mu {}\mu \) events in data and corrected simulation as a function of the pseudorapidity of the highest-\(p_{\text {T}}\) muon. b Dimuon invariant mass resolution for CB muons for \(\varUpsilon \rightarrow \mu {}\mu \) events for data and corrected simulation as a function of the pseudorapidity of the highest-\(p_{\text {T}} \) muon. The upper panels show the results obtained on data and simulation after all corrections. The lower panels show the ratios of the data to the MC simulation. The error bars represent the statistical uncertainty; the shaded bands represent the systematic uncertainty in the correction and the systematic uncertainty in the extraction method added in quadrature

The relative dimuon mass resolution \(\sigma _{\mu \mu }/m_{\mu \mu }\) is approximately proportional to the average momentum of the muons, as shown in Eq. (12). A direct comparison of the momentum resolution function determined with \(J/\psi \) , \(\Upsilon \), and Z boson decays can therefore be performed. To remove the effect of the correlation between the measurement of the dimuon mass resolution and the \(p_{\text {T}} \) of the muons, the following definition of transverse momentum is used:

$$\begin{aligned} p_{\text {T}} ^{*} = \hat{m} \sqrt{ \frac{\sin {\theta _1}\sin {\theta _2}}{2 (1-\cos \alpha _{12})}}, \end{aligned}$$
(14)

where \(\hat{m}\) is a fixed value, corresponding to the best known value of the mass of the Z boson, the \(J/\psi \), and the \(\Upsilon \), respectively. The variables \(\theta _1\) and \(\theta _2\) are the polar angles of the two muons, and \(\alpha _{12}\) is the opening angle of the muon pair. The relative dimuon mass resolution from \(J/\psi \rightarrow \mu {}\mu \), \(\varUpsilon \rightarrow \mu {}\mu \), and \(Z \rightarrow \mu \mu \) events measured in data is compared with that obtained from calibrated simulation as a function of \(p_{\text {T}} ^{*}\) in Fig. 10. The resolutions are in good agreement. Due to the larger number of events, the \(J/\psi \rightarrow \mu {}\mu \) measurement extends to higher momenta than the \(\varUpsilon \rightarrow \mu {}\mu \) one.

Fig. 10
figure 10

The upper panel shows the dimuon invariant mass resolution divided by the dimuon invariant mass for CB muons measured from \(J/\psi \rightarrow \mu {}\mu \), \(\varUpsilon \rightarrow \mu {}\mu \) and \(Z \rightarrow \mu \mu \) events as a function of the \( p_{\text {T}} ^{*}\) variable defined in the text. To account for the resolution differences induced by the different pseudorapidity distributions among muons from the three resonances, the data and simulations are re-weighted according to the pseudorapidity distribution of the muon with the highest momentum from \(\varUpsilon \rightarrow \mu {}\mu \) decays. The lower panel shows the ratio of the data to the MC simulation. The error bars represent the statistical uncertainty, while the bands show the systematic uncertainties

7 Conclusion

The momentum performance of the ATLAS muon reconstruction is measured using 139 fb\(^{-1}\) of data from pp collisions at \(\sqrt{s} = 13\) \(\text {TeV}\) recorded during the second run of LHC between 2015 and 2018.

Events from the \(Z \rightarrow \mu \mu \) process are used to correct the reconstructed data for a charge-dependent bias in the muon momentum measurement associated with detector alignment effects. This bias is reduced on average from up to 0.4 \(\text {TeV} {}^{-1}\) to \(2\cdot 10^{-4}~\text {TeV} {}^{-1}\) for muons with \(p_{\text {T}} <450~\text {GeV} \) after the corrections are applied, with associated uncertainty at the level of 0.03 \(\text {TeV} {}^{-1}\) for muons with transverse momentum of about \(45~\text {GeV} \).

The scale and resolution of the muon momentum measurement is studied in detail using \(J/\psi \rightarrow \mu {}\mu \) and \(Z \rightarrow \mu \mu \) decays. These studies are used to correct the simulation, improving the agreement with the data and reducing the systematic uncertainties related to the muon calibration in physics analyses. The improvements in the \(p_{\text {T}}\) correction methods described in this paper and the substantial number of \(J/\psi \rightarrow \mu {}\mu \) and \(Z \rightarrow \mu \mu \) decays collected in the Run 2 data sample together improve the precision of the momentum scale by up to a factor of two relative to the previous publication based on 3.2 fb\(^{-1}\) of collected data [10]. For \(Z \rightarrow \mu \mu \) decays, the uncertainty in the momentum scale varies from a minimum of \(0.05\%\) for \(|\eta | <1\) to a maximum of \(0.15\%\) for \(|\eta |\sim 2.5\).

The dimuon mass resolution is about 1.3% (1.6%) at small values of pseudorapidity for \(J/\psi \) (Z) decays, and increases up to 2.1% (2.4%) in the endcaps. This corresponds to a relative muon \(p_{\text {T}}\) resolution of 1.8% (2.3%) at small values of pseudorapidity and 3.0% (3.4%) in the endcaps for \(J/\psi \) (Z) decays. After applying momentum corrections, the \(p_{\text {T}}\) resolution in data and simulation agree within the quoted uncertainties, which are at the level of, or better than, 5% (6%) for \(J/\psi \) (Z) decays depending on the \(\eta \) range.

Validation studies performed with \(Z \rightarrow \mu \mu \) and \(J/\psi \rightarrow \mu {}\mu \) show that the corrections applied to the simulation bring the agreement with the data to the level of their estimated uncertainties. An additional, statistically independent validation performed using \(\varUpsilon \rightarrow \mu {}\mu \) decays confirms the correctness of the correction procedure.