Measurement of the Higgs boson mass in the $H \rightarrow ZZ^* \rightarrow 4\ell$ decay channel using 139 fb$^{-1}$ of $\sqrt{s}=13$ TeV $pp$ collisions recorded by the ATLAS detector at the LHC

The mass of the Higgs boson is measured in the $H \rightarrow ZZ^* \rightarrow 4\ell$ decay channel. The analysis uses proton-proton collision data from the Large Hadron Collider at a centre-of-mass energy of 13 TeV recorded by the ATLAS detector between 2015 and 2018, corresponding to an integrated luminosity of 139 fb$^{-1}$. The measured value of the Higgs boson mass is $124.99\pm0.18\text{(stat.)}\pm0.04\text{(syst.)}$ GeV and is based on improved momentum-scale calibration for muons relative to previous publications. The measurement also employs an analytic model that takes into account the invariant-mass resolution of the four-lepton system on a per-event basis and the output of a deep neural network discriminating signal from background events. This measurement is combined with the corresponding measurement using 7 and 8 TeV $pp$ collision data, resulting in a Higgs boson mass measurement of $124.94\pm0.17\text{(stat.)}\pm0.03\text{(syst.)}$ GeV.


Introduction
The observation of a Higgs boson, , in 2012 by the ATLAS and CMS collaborations [1,2] in proton-proton ( ) collisions produced by the Large Hadron Collider (LHC) was a major step towards understanding the mechanism of the electroweak (EW) symmetry breaking [3][4][5].An unknown parameter of the Standard Model (SM), the Higgs boson mass   is related to the SM vacuum stability [6] and its value is required for precise calculations of EW observables, including the production and decay properties of the Higgs boson itself.These calculations are needed to test the coupling structure of the SM Higgs boson, as suggested in Ref. [7] and references therein.Therefore, it is important to determine the Higgs boson mass experimentally.
The mass of the Higgs boson was measured to be 125.09± 0.24 GeV [8] in a combined analysis performed by the ATLAS and CMS collaborations using approximately 25 fb −1 of √  = 7 and 8 TeV   collision data recorded in 2011 and 2012, respectively, commonly referred to as Run 1.The individual measurements, reported in Refs.[9,10], used the  →   * → 4ℓ (where ℓ =  or ) and  →  decay modes because of their excellent mass resolution.
The ATLAS Collaboration reported a measurement of   using the  →   * → 4ℓ and  →  channels with 36.1 fb −1 of √  = 13 TeV   collision data recorded in 2015 and 2016.This was combined with the Run 1 ATLAS measurement to obtain a value of   = 124.97± 0.24 GeV [11].The CMS Collaboration measured   using the  →   * → 4ℓ and  →  channels with 35.9 fb −1 of √  = 13 TeV   collision data recorded in 2016.This was combined with the Run 1 CMS measurement to obtain a value of   = 125.38 ± 0.14 GeV [12].This paper reports a new measurement of   in the  →   * → 4ℓ channel using a 139 fb −1 dataset of 13 TeV   collisions produced by the LHC and recorded by the ATLAS detector between 2015 and 2018, commonly referred to as Run 2. Most of the techniques used in this analysis are described in detail in Ref. [13].
The  →   * → 4ℓ process is identified by selecting four leptons (ℓ = , ) in the final state, and by identifying one pair of same-flavour leptons that is consistent with arising from an on-shell Z-boson decay.The mass measurement is performed using an analytic model to describe the distribution of the reconstructed 4ℓ invariant mass,  4ℓ , as a function of   and to fit to the observed distribution.In comparison with the previous ATLAS result from the  →   * → 4ℓ channel [11], the measurement has been improved by using the complete Run 2 dataset and a new high-precision muon momentum calibration.The   uncertainty has also been reduced by introducing a neural-network-based classifier to discriminate between signal and background processes and by including the event-by-event invariant-mass resolution of the four-lepton system in the analytic model used to fit the collision data.This paper is organised as follows.The ATLAS detector is described in Section 2, and the data samples and Monte Carlo simulations are presented in Section 3. The reconstruction of muons and electrons, as well as their calibration procedures, is discussed in Section 4, while the event selection is explained in Section 5.The statistical models describing the signal and background samples are defined in Section 6 and results of the measurement are presented in Section 7. The conclusions are provided in Section 8.

ATLAS detector
The ATLAS experiment [14] at the LHC is a multi-purpose particle detector with nearly 4 coverage in solid angle. 1 It includes an inner tracking detector (ID) used for charged-particle tracking, surrounded by a 2 T solenoid.Electromagnetic (EM) and hadronic calorimeters are placed outside the solenoid, followed by a muon spectrometer (MS).
The ID provides precise reconstruction of tracks within a pseudorapidity range || ≤ 2.5.The highgranularity silicon pixel detector covers the vertex region and typically provides four measurements per track, the first hit normally being in the insertable B-layer (IBL) installed before Run 2 [15,16].It is followed by the silicon microstrip tracker, which usually provides eight measurements per track.These silicon detectors are complemented by the transition radiation tracker (TRT), which enables radially extended track reconstruction up to || = 2.0.The TRT also provides electron-identification information based on the detection of transition radiation X-ray photons.The ATLAS calorimeter system covers the pseudorapidity range || < 4.9, with finer granularity over the region matching the inner detector.The lead/liquid-argon (LAr) EM calorimeter is divided into two half-barrels (|| < 1.475) and two endcap components (1.375 < || < 3.2).It is segmented into three longitudinal (depth) sections over the region || < 2.5, and into two depth sections for 2.5 < || < 3.2.Hadronic calorimetry is provided by the steel/scintillator-tile calorimeter, segmented into three barrel structures within || < 1.7, and two copper/LAr hadronic endcap calorimeters extending the coverage to || = 3.2.The solid angle coverage is completed with forward copper/LAr and tungsten/LAr calorimeter modules optimised for electromagnetic and hadronic energy measurements, respectively.
The muon spectrometer, located beyond the calorimeters, is designed to detect muons in the region || < 2.7 and to provide momentum measurements with a relative resolution better than 3% over a wide range of 1 ATLAS uses a right-handed coordinate system with its origin at the nominal interaction point (IP) in the centre of the detector and the -axis along the beam pipe.The -axis points from the IP to the centre of the LHC ring, and the -axis points upwards.Cylindrical coordinates (, ) are used in the transverse plane,  being the azimuthal angle around the -axis.
The pseudorapidity is defined in terms of the polar angle  as  = − ln tan(/2).Angular distance is measured in units of transverse momentum,  T , and up to 10% at  T ∼ 1 TeV.A system of three superconducting air-core toroidal magnets provides a magnetic field with a bending power of approximately 2.5 Tm in the barrel and up to 6 Tm in the endcaps.
A two-level trigger system [17] is used to select events.The first-level trigger is implemented in hardware and uses a subset of the detector information to accept events at a maximum rate of about 100 kHz.This is followed by a software-based trigger that reduces the accepted event rate to 1 kHz on average.An extensive software suite [18] is used in the reconstruction and analysis of real and simulated data, in detector operations, and in the trigger and data acquisition systems of the experiment.

Data and event simulation
This measurement uses data from   collisions with a centre-of-mass energy of 13 TeV collected between 2015 and 2018 using single-lepton, dilepton, and trilepton triggers [19,20] as detailed in Ref. [13].The combined efficiency of these triggers is approximately 98%, 99%, 97%, and 99% for the 4, 22, 22, and 4 final states, respectively, for the simulated  →   * → 4ℓ events passing the event selection described in Section 5 (assuming   = 125 GeV).After data-quality requirements are imposed, the integrated luminosity of the data sample is 139 fb −1 [21].
Details of the simulated Monte Carlo (MC) events used in this analysis can be found in Refs.[13,22].Higgs boson production via the gluon-gluon fusion (ggF) process was modelled at next-to-next-to-leading-order (NLO) accuracy in the strong coupling constant  s using the Powheg NNLOPS generator [23][24][25][26][27][28][29][30][31] with the PDF4LHC15nnlo set of parton distribution functions (PDFs) [32].Higgs bosons produced via vectorboson fusion (VBF), in association with a vector boson ( ) or in association with a top-quark pair ( t), were simulated at NLO accuracy with the Powheg Box generator [25][26][27], using the PDF4LHC15nlo PDF set.The loop-induced  →   process was simulated with the Powheg Box generator at leading-order (LO) accuracy.Higgs boson production in association with a top quark () or with a bottom-quark pair ( b) was simulated at NLO accuracy using the MadGraph5_aMC@NLO generator [33,34] with the NNPDF3.0PDF set [35].For all signal processes, the EvtGen 1.2.0 generator [36] was used for the simulation of the bottom-and charm-hadron decays.Correspondingly, the Pythia 8 generator [37] was used for the  →   * → 4ℓ decay as well as for parton showering, hadronisation, and simulation of the underlying event.Higgs bosons produced via the ggF and VBF processes or in association with a vector boson were simulated for a range of   values from 123 to 127 GeV.Higgs boson production in association with a top quark, a top-quark pair, or a bottom-quark pair were simulated assuming   = 125 GeV, as these processes make a negligible contribution.These samples are normalised to cross-sections obtained from the most recent predictions provided by the LHC Higgs Working Group [7].
The   * continuum background was modelled separately for quark-initiated ( ), gluon-initiated ( ), and vector boson scattering (EW  ) production.The   process was modelled at NLO accuracy in  s using the Sherpa 2.2.2 generator [38][39][40][41] with the NNPDF3.0nnloPDF set.The   process with 0 or 1 jet in the final state was modelled with a LO calculation using the same Sherpa generator setup, with higher-order corrections calculated using massless quark loops [42][43][44] in the heavy top-quark approximation [45].The resulting   cross-section was scaled by 1.7 ± 1.0 to account for the uncertainty in additional higher-order effects.Production of EW   was modelled at LO accuracy with the Sherpa v2.2.2 with matching of the matrix element to the parton shower using the ME+PS@NLO prescription [46].These samples were normalised to cross-sections obtained directly from the Sherpa simulation.Alternative   samples, produced with the Powheg Box v2 and MadGraph5_aMC@NLO generators, are used for studies of the systematic uncertainties.For the nominal and alternative   * continuum samples, the Pythia 8 generator was used for parton showering, hadronisation, and simulation of the underlying event.
Backgrounds from   production and  t events were modelled using the Powheg Box v2 generator, while  bosons produced in association with jets were simulated using the Sherpa 2.2.1 generator.Minor contributions from processes with three electroweak bosons, denoted by , were modelled using Sherpa 2.2.2.Small backgrounds originating from top-quark production in association with one or more electroweak bosons or additional top quarks, such as ,  , ,  , ,  , , , and  (denoted by   ), were simulated using the MadGraph5_aMC@NLO generator.The Pythia 8 generator was used for parton showering, hadronisation, and simulation of the underlying event.
Generated events were processed through the ATLAS detector simulation [47] within the Geant4 framework [48] and reconstructed in the same way as collision data.Additional   interactions in the same or neighbouring bunch crossings, referred to as pile-up, are included in the simulation.The pile-up events were modelled using the Pythia 8 generator with the A3 set of tuned parameters [49] and the NNPDF2.3loPDF set [50].
The normalisation of the non-resonant   * background component is determined by the fit of a signal and background model to the observed  4ℓ distributions.The normalisations of the backgrounds from hadrons or hadron-decay products misidentified as prompt leptons, and from +jets,  t and   processes (referred to as the reducible background), are determined using data-driven techniques, and are explained in detail in Ref. [13].

Muon and electron reconstruction
Muon candidates are reconstructed using a combination of different algorithms [51].The reconstruction of muon candidates within || < 2.5 is primarily performed by a global fit of reconstructed tracks in the ID and the MS.In the central detector region (|| < 0.1), where the MS has reduced geometrical coverage, muons are also identified by matching a reconstructed ID track to either an MS track segment ('segment-tagged muons') or a calorimetric energy deposit consistent with a minimum-ionising particle ('calorimeter-tagged muons').Calorimeter-tagged muons are required to have  T > 15 GeV.For both the segment-tagged and calorimeter-tagged muons, the muon momentum is measured from the ID track alone.In the forward MS region (2.5 < || < 2.7) outside the ID coverage, MS tracks with hits in three MS layers are accepted as 'stand-alone muons' and combined with track segments formed from hits in the silicon tracker, if they exist.Additionally, 'loose' muon-identification criteria [51] are applied to reject low-quality tracks that have missing hits in the MS or have poor agreement between the reconstructed MS and ID tracks.These requirements have an efficiency of at least 98% for muons with  T above 5 GeV.
Muons are required to be isolated by using both calorimeter-based and track-based isolation variables and applying the 'PflowLoose' criteria [51].The efficiency of the isolation selection is about 85% for muons with  T above 5 GeVand increases to >97% for muons with  T > 20 GeV.
Although the detector is well aligned overall, there are residual local misalignments that effect the reconstructed muon-track sagitta [52,53] and introduce a small, charge-dependent, momentum bias.The bias is measured with an iterative procedure that minimises the observed width of the line-shape of  →  +  − decays [54].This correction, which is applied to the collision data, reduces the width of the dimuon invariant-mass distribution in -boson decays by approximately 1% [54].A systematic uncertainty is assigned to this correction to cover residual differences between observed and simulated data.It is estimated to be, on average, 1% of the correction itself for muons from  →  +  − decays.
Corrections are also applied to the reconstructed muon momentum in simulation in order to precisely match the data.These corrections to the simulated momentum resolution and momentum scale are parameterised as an expansion in powers of the muon  T , with each coefficient measured as a function of  and .The corrections are extracted from large data samples of / →  +  − and  →  +  − decays, using techniques that reduce any residual biases and by directly calibrating the global re-fitted track [54].Compared to the previously used corrections, this reduces the associated uncertainties by approximately a factor of four.
The momentum-scale corrections range from 0.1% to 0.3% for muons with  T between 5 and 100 GeV.These account for inaccuracies in the estimation of the energy loss in the traversed material, local magnetic field inaccuracies, and geometrical distortions.The corrections to the momentum resolution for muons with  T of 5-100 GeV are at the percent level.The main sources of systematic uncertainty for these corrections are the residual biases in the calibration method and the consistency of calibrations using / →  +  − or  →  +  − decays separately.For muons from  →  +  − decays, which have an average  T of approximately 45 GeV, the momentum scale is determined with a precision better than 0.01%.The resolution is known with a precision of 0.4% for muons with || < 1, while for muons in high-|| regions the precision is approximately 1%.
A reconstructed electron consists of a cluster of energy deposits in the calorimeter and a matched ID track [55].Variable-size clusters are created dynamically from calorimeter-energy deposits, improving the invariant-mass resolution of the four-lepton system, especially when bremsstrahlung photons are present.Electron ID tracks are fitted using an optimised Gaussian-sum filter (GSF) [56] that accounts for non-linear effects arising from energy loss through bremsstrahlung.Quality criteria are used to improve the purity of selected electron candidates.The quality of an electron candidate is evaluated by using a likelihood method that employs measurements from the tracking system and the calorimeter system, and quantities that combine both tracking and calorimeter information [56].The 'loose' likelihood criteria, together with track hit requirements, are applied to electron candidates.Electrons are required to be isolated using both the calorimeter-based and track-based isolation variables as discussed in Ref. [13].
The energy of electrons is estimated from the calorimeter energy clusters, using a combination of simulationbased and data-driven corrections [55,57].A single simulation-based correction, which accounts for the energy lost in the material upstream of the calorimeter, the energy deposited in the cells neighbouring the cluster in  and , and the energy lost beyond the LAr calorimeter, is derived using multivariate regression algorithms.Data-driven corrections account for effects such as those associated with the material in front of the EM calorimeter and material between the presampler and the calorimeter, and the inter-calibration of the different calorimeter layers.The remaining energy-scale difference between data and simulation is parameterised by energy-independent linear corrections, defined in different regions of .
Similarly, deviations of the energy resolution in the simulation from that in the data are parameterised as an -dependent additional term.The electron reconstruction efficiency is >97% for electron  T > 18 GeVand the additional identification efficiency varies between 75 and 90% for the kinematic region relevant to this analysis.
The main sources of systematic uncertainties in the electron-energy scale include uncertainties in the method used to extract the energy scale correction, as well as uncertainties due to the extrapolation of the energy scale from  →  +  − events to electrons with different energies.A detailed explanation of these uncertainties can be found in Ref. [55].In the case of electrons with  T around 40 GeV, the total relative uncertainties range between 0.04% and 0.2% for most of the detector acceptance.For electrons with  T around 10 GeV the relative uncertainty ranges between 0.3% and 0.8%.
Systematic uncertainties in the calorimeter-energy resolution for electrons arise from uncertainties in the modelling of the sampling term and in the measurement of the constant term in -boson decays, in the amount of material in front of the calorimeter, and in the modelling of the contribution to the resolution due to fluctuations in the pile-up from additional   interactions in the same or neighbouring bunch crossings.
The uncertainty in the energy resolution for electrons with transverse energy between 30 and 60 GeV varies between 5% and 10%.

Event selection
Events are required to contain at least four isolated leptons emerging from a common vertex and forming two pairs of oppositely charged same-flavour leptons.Electrons are required to be within the geometrical acceptance of the inner detector (|| < 2.47) and to have  T > 7 GeV, while muons must be within the geometrical acceptance of the muon spectrometer (|| < 2.7) and have  T > 5 GeV (except for calorimeter-tagged muons, as explained in Section 4).At most one calorimeter-tagged or stand-alone muon is allowed per Higgs boson candidate.The three higher-( T or  T ) leptons in each quadruplet are required to pass thresholds of 20, 15, and 10 GeV, respectively, and have Δ(ℓ, ℓ ′ ) > 0.1.These thresholds are chosen to maximize signal yield consistent with selecting events with high trigger efficiency.Contributions from misidentified leptons are reduced by requiring the lepton tracks to have low transverse-impact-parameter significances and to be compatible with originating from a common vertex.A detailed description of the event selection can be found in Ref. [13].
The lepton pair with an invariant mass closest to the -boson mass in each quadruplet is referred to as the leading dilepton, while the remaining pair is referred to as the subleading dilepton.The selected quadruplets are separated into four subchannels, according to the flavour of the leading and subleading pairs.In order of decreasing expected selection efficiency and resolution, they are 4, 22, 22, and 4.
The  4ℓ resolution is about 1.5 GeV for subchannels with a subleading muon pair (4 and 22) and about 2.1 GeV for subchannels with a subleading electron pair (22 and 4).Only one quadruplet is selected from each event, based on the mass of the leading dilepton, the final state, and, for events with additional leptons, the value of the LO matrix element, as described in Ref. [13].
Final-state radiation (FSR) photons are searched for in all events following the procedure described in Ref. [13].FSR candidates are defined as collinear if their angular separation from the nearest lepton of the quadruplet satisfies Δ < 0.15, and non-collinear otherwise.Collinear FSR candidates are considered only for muons from the leading dilepton as the electron candidate reconstruction includes collinear FSR, while non-collinear FSR candidates are considered for both muons and electrons from either the leading or the subleading dilepton.Only one FSR candidate is included in the quadruplet as events with more than one FSR candidate are rare and have negligible effect on the four-lepton mass resolution.In cases where there are more one one FSR candidate, preference is given to collinear FSR and to the candidate with the highest  T .FSR photons are found in 4% of the events and their energy is included in the mass computation, improving the  4ℓ resolution by about 1%.
Finally, the four-momenta of leptons in the leading pair are recomputed by performing a kinematic fit which constrains their invariant mass to the -boson mass, as discussed in Ref. [58].The fit, which takes into account lepton and FSR kinematic information, their associated experimental uncertainties, and the -boson width, improves the  4ℓ resolution by about 17% [9].The lepton four-momenta reconstruction uncertainties were derived from simulation and are corrected to match those observed in data, using a large sample of -boson decays.
In the mass range 115 <  4ℓ < 130 GeV, 313 candidate events are observed.The yield is in agreement with the expectation of 321 ± 14 events, 65% of which are predicted to originate from signal processes, assuming   = 125 GeV.The  4ℓ distributions are shown in Figure 1 for each final state and in Figure 2 for the inclusive final state for the mass range of 105-160 GeV, which is used to determine the Higgs boson mass.
The dominant contribution to the background is non-resonant   * production, accounting for approximately 89% of the total background yield.The reducible background contributes approximately 9%.The  and    events are estimated to constitute approximately 2% of the total background.The residual combinatorial background, originating from events with additional prompt leptons, is found to be negligible [13].

Signal-background discriminant
To provide additional separation between the  →   * → 4ℓ signal and the   * → 4ℓ background, a deep feed-forward neural network (NN) is employed.The NN is trained with Keras [59] using the TensorFlow [60] backend, following the method described in Ref. [13].Signal events for training are taken from simulated samples with different masses of the Higgs boson, as described in Section 3, thus reducing the dependence of the discriminant on   .
The  T and  of the four-lepton system are used as inputs to the NN, together with a matrix-element-based kinematic discriminant    * [58].The discriminant    * is defined as ln( where M    * denotes the matrix element for leading-order (LO)  →   * → 4ℓ production and M   * the sum of the corresponding matrix elements for the  q →   * and  →   * continuum backgrounds, all calculated at LO with the MadGraph 2.2.1 generator [33].
The distribution of the output of the NN,   , is shown in Figure 3.The   for each event is included in the final fit as an additional observable, as explained in Section 6.The additional separation between signal and background provided by the NN improves the precision of this measurement by 2%.
The QRNN is trained on signal MC events using the Tensorflow library and the Keras Python application programming interface.The inputs are the  T , , and  of the individual leptons, as well as the four-lepton momentum, constrained by the -boson mass constraint, and its uncertainty.The output of the QRNN for each final state is the predicted quantile of the difference between the reconstructed mass and the true mass of the four-lepton system.The targeted quantile is calibrated using a step-wise scanning procedure so that it results in an estimate that produces 68% confidence-level intervals when tested on simulated events.
The   distributions are shown in Figure 4.Each event is assigned the   estimated by the QRNN and employed in the likelihood fit to   .Taking into account the per-event resolution reduces the total expected uncertainty in   by 1%.The observed and expected (pre-fit)  4ℓ distributions for the selected Higgs boson candidates.The predicted number of events for these distributions is taken from simulation for the signal,   * ,   , and  processes, while it is taken from the data-driven estimate (see Section 6) for the +jets and  t backgrounds.The total uncertainty in the prediction is shown by the hatched band, which also includes the theoretical uncertainties of the SM cross-section for the signal and the   * background.Higgs boson events in this plot are simulated with   = 125 GeV.The -value quantifying the agreement between the data and MC predictions is 0.36.The observed and expected (pre-fit)   distributions for the selected Higgs boson candidates.The predicted number of events for these distributions is taken from simulation for the signal,   * ,   , and  processes, while it is taken from the data-driven estimate (see Section 6) for the +jets and  t backgrounds.The total uncertainty in the prediction is shown by the hatched band, which also includes the theoretical uncertainties of the SM cross-section for the signal and the   * background.Higgs boson events in this plot are simulated with   = 125 GeV.The -value quantifying the agreement between the data and MC predictions is 0.28.The observed and expected (pre-fit) event-level resolution   distributions predicted by the QRNN for the selected Higgs boson candidates.The predicted number of events for these distributions is taken from simulation for the signal,   * ,   , and  processes, while it is taken from the data-driven estimate (see Section 6) for the +jets and  t backgrounds.The total uncertainty in the prediction is shown by the hatched band, which also includes the theoretical uncertainties of the SM cross-section for the signal and the   * background.Higgs boson events in this plot are simulated with   = 125 GeV.The 4 and 22 events typically have   ∼ 1.5 GeV, while the 22 and 4 events typically have   ∼ 2.1 GeV.The -value quantifying the agreement between the data and MC predictions is 0.81.

Signal and background model
The Higgs boson  4ℓ distribution is the result of the convolution of its theoretical line-shape, a narrow relativistic Breit-Wigner (BW) distribution of 4.1 MeV width [7] centred on   , with the detector response for the four-lepton invariant-mass distribution.The BW width is more than two orders of magnitude smaller than the  4ℓ detector resolution.Therefore, the signal line-shape in  4ℓ is completely dominated by the detector response.
The signal probability density function is modelled as a function of  4ℓ , signal versus   * discriminator   , and the per-event resolution estimator   .The model can be written as where the following approximations are used: • P (  |  ,   ) ≃ P (  |  ) because the neural network discriminant does not directly depend on the per-event  4ℓ resolution, • P (  |  ) ≃ P (  ) since the averaged per-event resolution does not depend on   within the range of 105 to 160 GeV used in this measurement, and • P (  ) is omitted from the probability density function because it was observed that it has approximately the same distribution for signal and background events in the Higgs boson peak region.Dedicated checks of the assumption that P (  ) can be omitted have shown that this has negligible impact on the measurement.
The probability density function P (  |  ) is determined through interpolation between neighbouring   values using simulated data.The probability density function P ( 4ℓ |  ,   ,   ) in each subchannel is described by a double-sided Crystal Ball [62] probability density function that consists of a Gaussian core and two power-law tails.
The mean of the Gaussian core is parameterised as a function of   and   for each subchannel  as in order to decorrelate the uncertainties of the shifts in the mean reconstructed  4ℓ due to the subchannel dependence from those due to the   dependence.The   parameters are consistent with unity within a few percent for all final states.The dependence of   on   is parameterised as a second-order polynomial, and its values lie in the 124-125 GeV range for the 22 and 4 final states, and between 124.6 GeV and 125 GeV for the 22 and 4 final states.The residual correlations between the   dependence of the mean and shape of the reconstructed  4ℓ distribution, and the   dependence of the mean have been checked with simulation and found to be negligible.The   parameters and the   functions are extracted from an unbinned maximum-likelihood fit to the combined ggF, VBF, and   samples, which were simulated with different values of   , as described in Section 3.
The standard deviation of the Gaussian core is expressed as a function of the predicted event-level resolution   and is parameterised as a function of   to account for a residual correlation between   and   .The parameters of the left tail of the double-sided Crystal Ball are parameterised as a function of   only, which results in few percent variation as a function of   , while those of the right tail are found to be constant for a given final state.The parameters of the signal model for each subchannel are derived by a simultaneous maximum-likelihood fit to signal MC samples for   values ranging from 123 GeV to 127 GeV.The resulting probability density function is shown in Figure 5 for the   = 125 GeV signal sample.The residual bias in the measured   for the derived model, estimated by pseudo-experiments sampling MC events, is at the level of 6 MeV and is treated as an additional systematic uncertainty.In the fit of the model to the data,   can vary freely while the other parameters are constrained to vary in accordance with the impact of the relevant systematic uncertainties.Negligible biases in   are observed when fitting the model to simulated Higgs signal samples with different values of   .
The probability density functions for the   * and    +  backgrounds are estimated as a function of  4ℓ and   for each subchannel from MC simulation.The size of the background contributions from misidentified and non-isolated leptons are estimated from a data-control region [13], with the dependence on  4ℓ and   parameterised using simulated samples.
The probability density functions for the backgrounds are smoothed using an adaptive kernel estimation technique, employing Gaussian kernels [63] to reduce statistical fluctuations.The effects of uncertainties that modify the shapes of probability density functions are described by using non-linear moment morphing [64] to interpolate between the nominal probability density functions and those accounting for the relevant systematic variations.Such uncertainties include the theoretical and experimental uncertainties for the   * and    +  probability density functions and imperfect knowledge of the relative background composition for the reducible background's probability density function.
The  4ℓ distribution is described by the sum of a signal distribution and the distributions of the backgrounds.
The normalisations of the signal contribution and the   * process in each subchannel are free parameters and extracted directly from the fit to data, while those of the reducible background in each channel are constrained according to estimates obtained from data, using minimal input from simulation and following the methodology described in Ref. [13].The normalisation of the    +  background is constrained according to the uncertainties in the theory prediction.
The fit model was validated by utilising a two-step unblinding process, where the same, unknown, bias is applied to all events in the observed dataset and the likelihood model, as described in Section 7, is fitted to it.No significant changes to the compatibility between channels or to any free or constrained parameters, with the exception of   , are observed when this bias is removed.

Results
The mass measurement is performed by maximising the profile-likelihood ratio [65,66] , where m and θ denote the unconditional maximum-likelihood estimates of the parameters of the likelihood function L, while θ is the conditional maximum-likelihood estimate of the parameters  for a fixed value of the parameter of interest,   .Systematic uncertainties and their correlations are modelled by introducing nuisance parameters  with priors described by Gaussian or log-normal functions that reflect the uncertainties in the values of the nuisance parameters.
The estimate of   is extracted by performing a simultaneous unbinned maximum-likelihood fit to the four subchannels (4, 22, 22, and 4) in the  4ℓ range between 105 and 160 GeV.The expected and observed yields, along with the signal-to-background ratio, are shown in Table 1 for the signal region defined by limiting the data to  4ℓ between 115 and 130 GeV.
The free parameters of the fit are   , the signal normalisation, and background normalisations for each of the four subchannels, while the nuisance parameters associated with the systematic uncertainties are constrained by their respective priors.Figure 6 shows the  4ℓ distribution of the data together with the result of the fit to the  →   * → 4ℓ candidates.The small shape fluctuations in the background prediction arise from the integrations of the background probability density functions; their effects on the mass fit are negligible.The fit results in   = 124.99± 0.18(stat.)± 0.04(syst.)GeV.
All fitted normalisations are found to be compatible with the SM expectations.By allowing the fit to determine the normalization in each subchannel, the result is insensitive to the SM predictions for each subchannel.There is negligible change in the fit results if the relative fractions are constrained to the SM.The fit for   is also performed independently for each decay channel.The  4ℓ distributions for the four channels are shown in Figure 7.The resulting profile-likelihood-ratio variation as a function of   and the resulting fitted values are compared with those of the combined fit in Figure 8.The   measurements from the individual channels are compatible with the combined measurement, with a -value of 0.82.
The statistical uncertainty of   is estimated by fixing all nuisance parameters associated with systematic uncertainties to their best-fit values, leaving all remaining parameters unconstrained.Uncertainties on Table 1: The observed and expected (pre-fit) yields for the different decay final states in the region with  4ℓ between 115 and 130 GeV.The predicted number of events is taken from simulation for the signal,   * ,   , and  processes, while for +jets and  t it is taken from the data-driven estimate (see Section 6).The uncertainty in the prediction includes the theoretical uncertainties of the SM cross-section for the signal and the   * background.Higgs boson yields are determined using simulation with   = 125 GeV.   the Higgs boson signal originating from the QCD scale, PDF variations, and modelling of FSR account result in an uncertainty in   of 14 MeV.Other sources, including background modelling and simulation statistics, contribute less than 5 MeV.The systematic uncertainties are applied following the strategy detailed in Ref. [13].The total systematic uncertainty is estimated by subtracting the square of the statistical uncertainty from the square of the total uncertainty and calculating the square root.Table 2 shows the leading contributions to the systematic uncertainty of   .The uncertainties in the electron energy scale and in the muon momentum scale, resolution, and sagitta bias correction are described in Section 4. Other sources, including background modelling and simulation statistics, contribute less than 5 MeV.The total uncertainty is in agreement with the expectation of 0.19 GeV and is dominated by the statistical component.Figure 9 shows the distribution of the expected uncertainty in   obtained by fitting a set of pseudo-experiments assuming   = 125 GeV.
The result of this measurement is combined with the measurement of   using the Run 1 dataset in the same final states, which was   = 124.51± 0.52 [9].The correlation scheme for the systematic uncertainties of the two measurements follows the one used in Ref. [11], where only the uncertainties in the

Systematic Uncertainty Contribution [MeV]
Muon momentum scale ±28 Electron energy scale ±19 Signal-process theory ±14 lepton calibration were considered correlated.In this combination, the electron calibration uncertainty is correlated between the two measurements, while the muon calibration systematic uncertainty is uncorrelated due to the use of improved and independent techniques, as described in Section 4. Systematic uncertainties are reduced by about 14% when using this correlation scheme.The combined result, calculated by combining the likelihoods of the two measurements is   = 124.94± 0.17(stat.)± 0.03(syst.)GeV, which has slightly smaller statistical and systematic uncertainties.The -value for the two individual results, assuming the Higgs boson mass is the combined value, is 0.4.

Summary
The mass of the Higgs boson is measured from a maximum-likelihood fit to the invariant mass and the predicted invariant-mass resolution of the  →   * → 4ℓ decay channel.The results are obtained from the full Run 2   collision data sample recorded by the ATLAS experiment at the CERN Large Hadron Collider at a centre-of-mass energy of 13 TeV, corresponding to an integrated luminosity of 139 fb −1 .The measurement is based on the latest calibrations of muons and electrons, and on improvements to the analysis techniques used to obtain the previous result using data collected by ATLAS in 2015 and 2016.
Thanks to a larger dataset, improved experimental techniques, and more precise lepton calibration, the statistical uncertainty of the measurement has been reduced by a factor of two and the systematic uncertainty by about 20% relative to the previous Run 2 published result.This measurement is combined with the previous one obtained in the same channel with ATLAS Run 1 data.The result of the combination is   = 124.94± 0.17(stat.)± 0.03(syst.)GeV.This is the most precise measurement of   in the  →   * → 4ℓ channel by the ATLAS Collaboration.

Figure 1 :
Figure1: The observed and expected (pre-fit)  4ℓ distributions for the selected Higgs boson candidates, for the different decay final states (a) 4, (b) 22, (c) 22, and (d) 4 events.The predicted number of events for these distributions is taken from simulation for the signal,   * ,   , and  processes, while it is taken from the data-driven estimate (see Section 6) for the +jets and  t backgrounds.The total uncertainty in the prediction is shown by the hatched band, which also includes the theoretical uncertainties of the SM cross-section for the signal and the   * background.Higgs boson events in this plot are simulated with   = 125 GeV.The -values quantifying the agreement between the data and MC predictions are (a) 0.81, (b) 0.43, (c) 0.77, and (d) 0.15.

Figure 2 :
Figure2: The observed and expected (pre-fit)  4ℓ distributions for the selected Higgs boson candidates.The predicted number of events for these distributions is taken from simulation for the signal,   * ,   , and  processes, while it is taken from the data-driven estimate (see Section 6) for the +jets and  t backgrounds.The total uncertainty in the prediction is shown by the hatched band, which also includes the theoretical uncertainties of the SM cross-section for the signal and the   * background.Higgs boson events in this plot are simulated with   = 125 GeV.The -value quantifying the agreement between the data and MC predictions is 0.36.

Figure 3 :
Figure3: The observed and expected (pre-fit)   distributions for the selected Higgs boson candidates.The predicted number of events for these distributions is taken from simulation for the signal,   * ,   , and  processes, while it is taken from the data-driven estimate (see Section 6) for the +jets and  t backgrounds.The total uncertainty in the prediction is shown by the hatched band, which also includes the theoretical uncertainties of the SM cross-section for the signal and the   * background.Higgs boson events in this plot are simulated with   = 125 GeV.The -value quantifying the agreement between the data and MC predictions is 0.28.

Figure 4 :
Figure4: The observed and expected (pre-fit) event-level resolution   distributions predicted by the QRNN for the selected Higgs boson candidates.The predicted number of events for these distributions is taken from simulation for the signal,   * ,   , and  processes, while it is taken from the data-driven estimate (see Section 6) for the +jets and  t backgrounds.The total uncertainty in the prediction is shown by the hatched band, which also includes the theoretical uncertainties of the SM cross-section for the signal and the   * background.Higgs boson events in this plot are simulated with   = 125 GeV.The 4 and 22 events typically have   ∼ 1.5 GeV, while the 22 and 4 events typically have   ∼ 2.1 GeV.The -value quantifying the agreement between the data and MC predictions is 0.81.

Figure 5 :
Figure 5: The probability density function derived from signal MC simulation using the per-event resolution (red line) is shown in comparison with the  4ℓ distribution in signal MC simulation (black points) for   = 125 GeV.The probability density function is derived for each channel separately and they are added together assuming predicted yields.Both the probability density function and the MC simulation are normalised to unity.

Figure 6 :
Figure 6: The  4ℓ data distribution from all subchannels combined (black points) is shown along with the signalplus-background post-fit probability density function (red line).The -value quantifying the agreement between the data and the fit model is 0.53.

Figure 8 :
Figure 8: The test statistic, −2ln(), values as a function of   are shown in (a) for the fit in each of the final states 4 (purple), 22 (green), 22 (orange), and 4 (blue), and for the combined fit (black), both with (solid lines) and without (dashed lines) systematic uncertainties.The horizontal dashed line indicates the location of the one- uncertainty.The fit results obtained in each final state are shown in (b) together with the combined result.The uncertainty bar on each point is the total statistical and systematic uncertainty; the brackets show the statistical uncertainty only.The vertical (red) line indicates the combined Run 2 result, and the grey band its total uncertainty.

Figure 9 :
Figure 9: The distributions of the total   uncertainty from pseudo-experiments assuming   = 125 GeV are shown, for when the fit does (black) and does not (blue) take into account systematic uncertainties.The solid lines correspond to the expected uncertainty distribution from pseudo-experiments, while the vertical dashed lines indicate the observed values of the uncertainties.The one-sided -value for compatibility between the observed and expected total uncertainties is 0.28.

Table 2 :
Largest contributions to the systematic uncertainty of   .