Running of the top quark mass from proton-proton collisions at √s = 13 TeV

Abstract The running of the top quark mass is experimentally investigated for the first time. The mass of the top quark in the modified minimal subtraction ($\overline{\mathrm{MS}}$) renormalization scheme is extracted from a comparison of the differential top quark-antiquark ($t\bar{t}$) cross section as a function of the invariant mass of the $t\bar{t}$ system to next-to-leading-order theoretical predictions. The differential cross section is determined at the parton level by means of a maximum-likelihood fit to distributions of final-state observables. The analysis is performed using $t\bar{t}$ candidate events in the e±μ∓ channel in proton-proton collision data at a centre-of-mass energy of 13 TeV recorded by the CMS detector at the CERN LHC in 2016, corresponding to an integrated luminosity of 35.9 fb$^{-1}$. The extracted running is found to be compatible with the scale dependence predicted by the corresponding renormalization group equation. In this analysis, the running is probed up to a scale of the order of 1 TeV.


Introduction
Beyond leading order in perturbation theory, the fundamental parameters of the quantum chromodynamics (QCD) Lagrangian, i.e. the strong coupling constant α S and the quark masses, are subject to renormalization. As a result, these parameters depend on the scale at which they are evaluated. The evolution of α S and of the quark masses as a function of the scale, commonly referred to as "running", is described by renormalization group equations (RGEs).
The running of α S was experimentally verified on a wide range of scales using jet production in electron-proton, positron-proton, electron-positron, proton-antiproton, and proton-proton (pp) collisions, as summarized, e.g. in Refs. [1,2]. To determine the running, the value of α S evaluated at an arbitrary reference scale is extracted in bins of a physical energy scale Q and then converted to α S (Q) using the corresponding RGE [2]. The validity of this procedure lies in the fact that, in a calculation, the renormalization scale is normally identified with the physical energy scale of the process. The same procedure can be used to determine the running of the mass of a quark. In the modified minimal subtraction ($\overline{\mathrm{MS}}$) renormalization scheme, the dependence of a quark mass m on the scale μ is described by the RGE
$$\mu^2 \frac{\mathrm{d}m(\mu)}{\mathrm{d}\mu^2} = -\gamma(\alpha_S(\mu))\, m(\mu), \tag{1}$$
where γ (α S (μ)) is the mass anomalous dimension, which is known up to five-loop order in perturbative QCD [3,4]. The solution of Eq. (1) can be used to obtain the quark mass at any scale μ from the mass evaluated at an initial scale μ 0 . The running of the b quark mass was demonstrated [5] using data from various experiments at the CERN LEP [6-9], SLAC SLC [10], and DESY HERA [11] colliders. Measurements of charm quark pair production in deep inelastic scattering at the DESY HERA were used to determine the running of the charm quark mass [12]. These measurements represent a powerful test of the validity of perturbative QCD. Furthermore, RGEs can be modified by contributions from physics beyond the standard model, e.g. in the context of supersymmetric theories [13].
This Letter describes the first experimental investigation of the running of the top quark mass, m t , as defined in the $\overline{\mathrm{MS}}$ scheme. The running of m t is extracted from a measurement of the differential top quark-antiquark pair production cross section, σ tt , as a function of the invariant mass of the tt system, m tt . The differential cross section, dσ tt /dm tt , is determined at the parton level by means of a maximum-likelihood fit to distributions of final-state observables using tt candidate events in the e ± μ ∓ final state, extending the method described in Ref. [14] to the case of a differential measurement. This method allows the differential cross section to be constrained simultaneously with the systematic uncertainties. In this analysis, the parton level is defined before radiation from the parton shower, which allows for a direct comparison with fixed-order theoretical predictions. The measurement is performed using pp collision data at √ s = 13 TeV recorded by the CMS detector at the CERN LHC in 2016, corresponding to an integrated luminosity of 35.9 fb −1 . The running mass, m t (μ), is extracted at next-to-leading order (NLO) in QCD as a function of m tt by comparing fixed-order theoretical predictions at NLO to the measured dσ tt /dm tt . The running of m t is probed up to a scale of the order of 1 TeV.

The CMS detector and Monte Carlo simulation
The central feature of the CMS apparatus is a superconducting solenoid of 6 m internal diameter, providing a magnetic field of 3.8 T. Within the solenoid volume are a silicon pixel and strip tracker, a lead tungstate crystal electromagnetic calorimeter (ECAL), and a brass and scintillator hadron calorimeter, each composed of a barrel and two endcap sections. Forward calorimeters extend the pseudorapidity (η) coverage provided by the barrel and endcap detectors. Muons are detected in gas-ionization chambers embedded in the steel flux-return yoke outside the solenoid. A two-level trigger system selects events of interest for analysis [15]. A more detailed description of the CMS detector, together with a definition of the coordinate system used and the relevant kinematic variables, can be found in Ref. [16].
The particle-flow (PF) algorithm [17] aims to reconstruct and identify electrons, muons, photons, charged and neutral hadrons in an event, with an optimized combination of information from the various elements of the CMS detector. The energy of electrons is determined from a combination of the electron momentum at the primary interaction vertex as determined by the tracker, the energy of the corresponding ECAL cluster, and the energy sum of all bremsstrahlung photons spatially compatible with originating from the electron track [18]. The momentum of muons is obtained from the curvature of the corresponding track [19]. Jets are reconstructed from the PF candidates using the anti-k T clustering algorithm with a distance parameter of 0.4 [20,21], and the jet momentum is determined as the vectorial sum of all particle momenta in the jet. The missing transverse momentum vector is computed as the negative vector sum of the transverse momenta (p T ) of all the PF candidates in an event. Jets originating from the hadronization of b quarks (b jets) are identified (b tagged) using the combined secondary vertex [22] algorithm, using a working point that corresponds to an average b tagging efficiency of 41% for simulated tt events, and an average misidentification probability of 0.1% and 2.2% for light-flavour jets and c jets, respectively [22].
In this analysis, the same Monte Carlo (MC) simulations as in Ref. [14] are used. In particular, tt, tW, and Drell-Yan (DY) events are simulated using the powheg v2 [23-28] NLO MC generator interfaced to pythia 8.202 [29] for the modelling of the parton shower and using the CUETP8M2T4 underlying event tune [30,31]. In the simulation, the proton structure is described by means of the NNPDF3.0 [32] parton distribution function (PDF) set. The largest background contributions are represented by tW and DY production. Other background processes include W+jets production and diboson events, while the contribution from QCD multijet production is found to be negligible. Contributions from all background processes are estimated from simulation and are normalized to their predicted cross section. Further details on the MC simulation of the backgrounds can be found in Ref. [14].

Event selection and systematic uncertainties
Events are collected using a combination of triggers which require either one electron with p T > 12 GeV and one muon with p T > 23 GeV, or one electron with p T > 23 GeV and one muon with p T > 8 GeV, or one electron with p T > 27 GeV, or one muon with p T > 24 GeV. In the analysis, tight isolation requirements are applied to electrons and muons based on the ratio of the scalar sum of the p T of neighbouring PF candidates to the p T of the lepton candidate. Events are then required to contain at least one electron and one muon of opposite electric charge with p T > 25 GeV for the leading and p T > 20 GeV for the subleading lepton, and |η| < 2.4. This kinematic selection defines the visible phase space.
In events with more than two leptons, the two leptons of opposite charge with the highest p T are used. Jets with p T > 30 GeV and |η| < 2.4 are considered, but no requirement on the number of reconstructed jets or b-tagged jets is imposed. Further details on the event selection can be found in Ref. [14].
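The kinematic part of the lepton selection described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Lepton` class and the `select_emu_pair` function are hypothetical names, trigger and isolation requirements are omitted, and "the two leptons of opposite charge with the highest p T" is interpreted here as the opposite-charge eμ pair with the largest scalar p T sum.

```python
from dataclasses import dataclass
from itertools import product
from typing import List, Optional, Tuple


@dataclass
class Lepton:
    flavour: str   # "e" or "mu"
    pt: float      # transverse momentum in GeV
    eta: float     # pseudorapidity
    charge: int    # +1 or -1


def select_emu_pair(leptons: List[Lepton]) -> Optional[Tuple[Lepton, Lepton]]:
    """Return the selected (electron, muon) pair, or None if the event fails."""
    # Every candidate must satisfy |eta| < 2.4 and the subleading threshold pT > 20 GeV.
    good = [l for l in leptons if abs(l.eta) < 2.4 and l.pt > 20.0]
    electrons = [l for l in good if l.flavour == "e"]
    muons = [l for l in good if l.flavour == "mu"]
    # Keep opposite-charge e-mu pairs whose leading lepton has pT > 25 GeV.
    pairs = [
        (e, m) for e, m in product(electrons, muons)
        if e.charge * m.charge < 0 and max(e.pt, m.pt) > 25.0
    ]
    if not pairs:
        return None
    # Assumption: "highest pT" is taken to mean the largest scalar pT sum.
    return max(pairs, key=lambda pair: pair[0].pt + pair[1].pt)
```

An event with a 30 GeV electron and a 22 GeV muon of opposite charge, both within |η| < 2.4, passes this sketch, while the same event with same-sign leptons returns `None`.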
In events with at least two jets, the invariant mass of the tt system is estimated by means of the kinematic reconstruction algorithm described in Ref. [33]. The reconstructed invariant mass is indicated with m reco tt . The kinematic reconstruction algorithm examines all possible combinations of reconstructed jets and leptons and solves a system of equations under the assumptions that the invariant mass of the reconstructed W boson is 80.4 GeV and that the missing transverse momentum originates solely from the two neutrinos coming from the leptonic decays of the W bosons. In addition, the kinematic reconstruction algorithm requires an assumption on the value of the top quark mass, m kin t . Any possible bias due to the choice of this value is avoided by incorporating the dependence on m kin t in the fit described in Section 4. To estimate this dependence, the kinematic reconstruction and the event selection are repeated with three different choices of m kin t , corresponding to 169.5, 172.5, and 175.5 GeV, and the top quark mass used in the MC simulation, m MC t , is varied accordingly. The parameter m kin t = m MC t is then treated as a free parameter of the fit. The sources of systematic uncertainties are classified as experimental and modelling uncertainties. Experimental uncertainties are related to the corrections applied to the MC simulation. These include uncertainties associated with trigger and lepton identification efficiencies, jet energy scale [34] and resolution [35], lepton energy scales, b tagging efficiencies [22], and the uncertainty in the integrated luminosity [36]. 
Modelling uncertainties are related to the simulation of the tt signal, and include matrix-element scale variations in the powheg simulation [37,38], scale variations in the parton shower [31], variations in the matching scale between the matrix element and the parton shower [30], uncertainties in the underlying event tune [30], the PDFs [39], the B hadron branching fraction and fragmentation function [40,41], and uncertainties related to the choice of the colour reconnection model [42,43]. Furthermore, as in previous CMS analyses, e.g. [14,44,45], an uncertainty that accounts for the observed difference in the shape of the top quark p T distribution between data and simulation [33,46,47] is applied. The dependence on the top quark width has been investigated and was found to be negligible. Other sources of uncertainty include the modelling of the additional pp interactions within the same or nearby bunch crossings and the normalization of background processes. For the latter, an uncertainty of 30% is assigned to the normalization of each background process. Further details on the sources of systematic uncertainties and the considered variations can be found in Ref. [14].
The simulated tt sample is split into four subsamples corresponding to bins of m tt at the parton level. Each subsample is treated as an independent signal process, representing the tt production at the scale μ k , which is chosen to be the centre-of-gravity of bin k, defined as the mean value of m tt in that bin. The subsample corresponding to the bin k is denoted with "Signal (μ k )". The m tt bin boundaries, the corresponding fraction of simulated events in each bin, and the representative scales μ k are summarized in Table 1, where the values are estimated from the nominal powheg simulation. The width of each bin, Δm k tt , is chosen taking into account the resolution in m reco tt . Fig. 1 shows the distribution of m reco tt after the fit to the data, which is described in the next section.
Table 1: The m tt bin boundaries, the corresponding fraction of events in the powheg simulation, and the representative scale μ k .
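The definition of the representative scale as the mean parton-level m tt in each bin can be illustrated with a short Python sketch. The function name `representative_scales` is hypothetical, events are assumed to carry unit weights, the first and last bins are assumed to be open-ended, and the bin edges in the usage note are placeholders, not the boundaries of Table 1.

```python
from bisect import bisect_right
from typing import Dict, Sequence


def representative_scales(mtt_values: Sequence[float],
                          inner_edges: Sequence[float]) -> Dict[int, float]:
    """Mean m_tt per bin; bin k (1-based) is delimited by consecutive inner edges."""
    sums: Dict[int, float] = {}
    counts: Dict[int, int] = {}
    for mtt in mtt_values:
        # Three inner edges define four bins; the outer bins are open-ended.
        k = bisect_right(inner_edges, mtt) + 1
        sums[k] = sums.get(k, 0.0) + mtt
        counts[k] = counts.get(k, 0) + 1
    # Centre-of-gravity of each populated bin (unit event weights assumed).
    return {k: sums[k] / counts[k] for k in sorted(sums)}
```

For instance, with placeholder edges `[400.0, 600.0, 800.0]` and the values `[350, 370, 450, 550, 700, 900, 1100]` (in GeV), the sketch assigns two events to each of bins 1, 2, and 4, and one to bin 3, and returns their means as the scales μ k.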

Fit procedure and cross section results
The differential tt cross section at the parton level is measured by means of a maximum-likelihood fit to distributions of final-state observables where the systematic uncertainties are treated as nuisance parameters. In the likelihood, the number of events in each bin of any distribution of final-state observables is assumed to follow a Poisson distribution. With σ (μ k ) tt being the total tt cross section in the bin k of m tt , the expected number of events in the bin i of any of the considered final-state distributions, denoted with ν i , can be written as
$$\nu_i = \sum_{k=1}^{4} s_i^k\left(\sigma_{t\bar{t}}^{(\mu_k)}, m_t^{\mathrm{MC}}, \vec{\lambda}\right) + \sum_j b_i^j\left(m_t^{\mathrm{MC}}, \vec{\lambda}\right). \tag{2}$$
Here, s k i is the expected number of tt events from the bin k of m tt , which depends on σ (μ k ) tt , on m MC t , and on the nuisance parameters λ, while b j i is the expected number of events from the background source j. Equation (2), which relates the various σ (μ k ) tt (and hence the parton-level differential cross section) to distributions of final-state observables, embeds the detector response and its parametrized dependence on the systematic uncertainties. Therefore, the maximization of the likelihood function provides results for σ (μ k ) tt that are automatically unfolded to the parton level. This method (described, e.g. in Ref. [48]) is also referred to as maximum-likelihood unfolding and, unlike other unfolding techniques, allows the nuisance parameters to be constrained simultaneously with the differential cross section. The unfolding problem was found to be well-conditioned, and therefore no regularization is needed. The expected signal and background distributions contributing to the fit are modelled with templates constructed using simulated samples.
Selected events are categorized according to the number of b-tagged jets, as events with 1 b-tagged jet, 2 b-tagged jets, or a different number of b-tagged jets (zero or more than two). The effect of the systematic uncertainties on the normalization of the different signals in each of these categories is parametrized using multinomial probabilities. In particular, based on the tt topology, the number of events with one (S k 1b ), two (S k 2b ), or a different number of b-tagged jets (S k other ) in each bin of m tt is expressed as:
$$S_{1b}^k = \mathcal{L}\, \sigma_{t\bar{t}}^{(\mu_k)} A_{\mathrm{sel}}^k \epsilon_{\mathrm{sel}}^k \, 2 \epsilon_b^k \left(1 - C_b^k \epsilon_b^k\right), \tag{3}$$
$$S_{2b}^k = \mathcal{L}\, \sigma_{t\bar{t}}^{(\mu_k)} A_{\mathrm{sel}}^k \epsilon_{\mathrm{sel}}^k \, C_b^k \left(\epsilon_b^k\right)^2, \tag{4}$$
$$S_{\mathrm{other}}^k = \mathcal{L}\, \sigma_{t\bar{t}}^{(\mu_k)} A_{\mathrm{sel}}^k \epsilon_{\mathrm{sel}}^k \left[1 - 2 \epsilon_b^k \left(1 - C_b^k \epsilon_b^k\right) - C_b^k \left(\epsilon_b^k\right)^2\right]. \tag{5}$$
Here, L is the integrated luminosity, A k sel is the acceptance of the event selection in the m tt bin k, and ε k sel represents the efficiency for an event in the visible phase space to pass the full event selection. The acceptance A k sel is defined as the fraction of tt events in the bin k that, at the generator (particle) level, enter the visible phase space described in Section 3, while ε k sel includes experimental selection criteria, e.g. isolation and trigger requirements. Furthermore, ε k b represents the b tagging probability and the parameter C k b accounts for any residual correlation between the tagging of two b jets in a tt event. The quantities A k sel , ε k sel , ε k b , and C k b are determined from the signal simulation and, although they are not free parameters of the fit, they vary according to the parameters λ and m MC t . In each category, the remaining effects of the systematic uncertainties on signal processes are treated as shape uncertainties. The quantities s k i in Eq. (2) are then derived from the signal shape and normalization in the corresponding category. In this way, a precise parametrization of the dependence of signal normalizations on the nuisance parameters and m MC t is obtained. In fact, the parameters in Eqs. (3)-(5) are less subject to statistical fluctuations than the s k i .
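As an illustration of the multinomial parametrization, the following Python sketch computes the three category yields for one m tt bin. It assumes the usual form for two b jets per event, with probability 2ε_b(1 − C_b ε_b) for exactly one tag and C_b ε_b² for two tags, and the function name `btag_category_yields` and all numerical inputs are hypothetical.

```python
def btag_category_yields(lumi, sigma, acc, eff_sel, eff_b, c_b):
    """Expected signal yields with one, two, or another number of b-tagged jets."""
    n_sel = lumi * sigma * acc * eff_sel        # events passing the full selection
    p_2b = c_b * eff_b ** 2                     # both b jets tagged
    p_1b = 2.0 * eff_b * (1.0 - c_b * eff_b)    # exactly one b jet tagged
    p_other = 1.0 - p_1b - p_2b                 # zero or more than two tags
    return n_sel * p_1b, n_sel * p_2b, n_sel * p_other
```

By construction the three yields add up to the total number of selected signal events, so a change in the b tagging probability moves events between categories without altering the overall normalization.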
In order to constrain each individual σ (μ k ) tt , events with at least two jets are further divided into subcategories of m reco tt , using the same binning as for m tt (Table 1). The choice of the input distributions to the fit in the different event categories is summarized in Table 2. The total number of events is chosen as input to the fit for all subcategories with zero or more than two b-tagged jets, where the contribution of the background processes is the largest, in order to mitigate the sensitivity of the measurement to the shape of the distributions of background processes. The same choice is made for the subcategories corresponding to the last bin in m reco tt , where the statistical uncertainty in both data and simulation is large, and for events with fewer than two jets, where the kinematic reconstruction cannot be performed. In the remaining subcategories with one b-tagged jet, the minimum invariant mass found when combining the reconstructed b jet and a lepton, referred to as the m min ℓb distribution, is fitted. This distribution provides the sensitivity to constrain m MC t [49]. In the remaining subcategories with two b-tagged jets, the p T spectrum of the softest selected jet in the event is used to constrain jet energy scale uncertainties at small values of p T , the kinematic range where systematic uncertainties are the largest. The distributions used in the fit are compared to the data after the fit in the supplemental material.
Table 2: Input distributions to the fit in the different event categories. The number of jets, the number of b-tagged jets, the number of events, and the p T of the softest jet are denoted with N jets , N b , N events , and "jet p min T ", respectively, while the category corresponding to the bin k in m reco tt is indicated with "m reco tt k".
The efficiencies of the kinematic reconstruction in data and simulation have been investigated in Ref. [33] and they were found to differ by 0.2%. Therefore, the efficiency in the simulation is corrected to match the one in data. An uncertainty of 0.2% is assigned to each bin of m tt independently. The same uncertainty is also assigned to tt events with one or two b-tagged jets, independently. For tt events with zero or more than two b-tagged jets, where the combinatorial background is larger, an uncertainty of 0.5% is conservatively assigned. These uncertainties are treated as uncorrelated to account for possible differences between the different m tt bins and categories of b-tagged jet multiplicity. Similarly, an additional uncertainty of 1% is assigned to the sum of the background processes, independently for each bin of m reco tt , in order to reduce the correlation between the signal and the background templates. The impact of these uncertainties on the final results is found to be small compared to the total uncertainty.
The dependence of the signal shapes, of the parameters A k sel , ε k sel , ε k b , and C k b , and of the background contributions on m MC t and on the nuisance parameters λ is modelled using second-order polynomials [14]. In the fit, Gaussian priors are assumed for all the nuisance parameters. The negative log-likelihood is then minimized, using the Minuit program [50], with respect to σ (μ k ) tt , m MC t , and λ. Finally, the fit uncertainties in the various σ (μ k ) tt are determined using Minos [50]. Additional extrapolation uncertainties, which reflect the impact of modelling uncertainties on A k sel , are estimated without taking into account the constraints obtained in the visible phase space [14]. Moreover, an additional uncertainty arising from the limited statistical precision of the simulation is estimated using MC pseudo-experiments [14], where templates are varied within their statistical uncertainties taking into account the correlations between the nominal templates and the templates corresponding to the systematic variations. The template dependencies are then rederived and the fit to the data is repeated more than ten thousand times. For each parameter of interest, the root-mean-square of the best fit values obtained with this procedure is taken as an additional uncertainty and added in quadrature to the total uncertainty from the fit.
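The structure of the function being minimized can be sketched as a binned negative log-likelihood with Gaussian penalty terms. This is a minimal illustration, not the analysis code: it assumes Poisson-distributed bin contents, nuisance parameters expressed in units of their pre-fit uncertainty, and expected yields already evaluated for the current parameter values. The name `negative_log_likelihood` is hypothetical.

```python
import math
from typing import Sequence


def negative_log_likelihood(observed: Sequence[int],
                            expected: Sequence[float],
                            nuisances: Sequence[float]) -> float:
    """Poisson NLL summed over bins plus unit-Gaussian priors on the nuisances."""
    nll = 0.0
    for n, nu in zip(observed, expected):
        # -ln Poisson(n | nu) = nu - n ln(nu) + ln(n!)
        nll += nu - n * math.log(nu) + math.lgamma(n + 1)
    # Each nuisance parameter contributes lambda^2 / 2 (Gaussian prior, unit width).
    nll += 0.5 * sum(lam ** 2 for lam in nuisances)
    return nll
```

In a fit of the kind described here, the expected yields would themselves be functions of the cross sections, of m MC t , and of the nuisance parameters, and a minimizer would vary all of them simultaneously.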
The measured σ (μ k ) tt are shown in Fig. 2 and compared to fixed-order theoretical predictions in the $\overline{\mathrm{MS}}$ scheme at NLO [51] implemented for the purpose of this analysis in the mcfm v6.8 program [52,53]. In the calculation, the renormalization scale, μ r , and factorization scale, μ f , are both set to m t . The $\overline{\mathrm{MS}}$ mass of the top quark evaluated at the scale μ = m t is denoted with m t (m t ). The calculation is interfaced with the ABMP16_5_nlo PDF set [54], which is the only available PDF set where m t is treated in the $\overline{\mathrm{MS}}$ scheme and where the correlations between the gluon PDF, α S , and m t are taken into account. In the calculation, the value of α S at the Z boson mass, α S (m Z ), is set to the value determined in the ABMP16_5_nlo fit, which in the central PDF corresponds to 0.1191 [54]. In order to demonstrate the sensitivity to the top quark mass, predictions for dσ tt /dm tt obtained with different values of m t (m t ) are shown. Furthermore, it is worth noting that this method provides a cross section result with significantly improved precision compared to measurements that perform unfolding as a separate step, such as the one described in Ref. [33].
The dominant uncertainties in the measured σ (μ k ) tt are associated with the integrated luminosity, the lepton identification efficiencies, the jet energy scales and, at large m tt , the modelling of the top quark p T . The two latter uncertainties are marginally constrained in the fit, while the first two are not constrained. Furthermore, the post-fit values of all nuisance parameters are found to be compatible with their pre-fit value, within one standard deviation.
The numerical values of the measured σ (μ k ) tt , their correlations, the impact of the various sources of uncertainty, and the pulls and constraints of the nuisance parameters related to the modelling uncertainties can be found in the supplemental material.

Extraction of the running of the top quark mass
The measured differential cross section is used to extract the running of the top quark $\overline{\mathrm{MS}}$ mass at NLO as a function of the scale μ = m tt . The procedure is similar to the one used to extract the running of the charm quark mass [12]. The value of m t (m t ) is determined independently in each bin of m tt from a χ 2 fit of fixed-order theoretical predictions at NLO to the measured σ (μ k ) tt . The theoretical predictions are obtained as described in Section 4 for Fig. 2. The χ 2 definition follows the one described in Ref. [55], which accounts for asymmetries in the input uncertainties. The extracted m t (m t ) are then converted to m t (μ k ) using the CRunDec v3.0 program [56], where μ k is the representative scale of the process in a given bin of m tt , as described in Section 3. As appropriate for an NLO calculation, the conversion is performed with one-loop precision, assuming five active flavours (n f = 5) and α S (m Z ) = 0.1191, consistent with the PDF set used.
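The one-loop conversion from m t (m t ) to m t (μ k ) can be approximated with the analytic leading-order solution of the RGE. The sketch below is a stand-in written for illustration and does not use the CRunDec interface. It assumes the conventions μ² dα S /dμ² = −β0 α S ² with β0 = (33 − 2 n f )/(12π) and a one-loop mass anomalous dimension γ = α S /π, which give m(μ)/m(μ 0 ) = [α S (μ)/α S (μ 0 )]^(12/(33 − 2 n f )). The Z boson mass value and the function names are assumptions.

```python
import math

M_Z = 91.1876  # Z boson mass in GeV (assumed value, not quoted in the text)


def alpha_s_one_loop(mu, alpha_s_mz=0.1191, n_f=5):
    """One-loop strong coupling at the scale mu (in GeV)."""
    beta0 = (33.0 - 2.0 * n_f) / (12.0 * math.pi)
    return alpha_s_mz / (1.0 + alpha_s_mz * beta0 * math.log(mu ** 2 / M_Z ** 2))


def run_mass_one_loop(m_ref, mu_ref, mu, alpha_s_mz=0.1191, n_f=5):
    """Evolve a running quark mass from mu_ref to mu with the one-loop RGE solution."""
    exponent = 12.0 / (33.0 - 2.0 * n_f)  # gamma_0 / beta_0, equal to 12/23 for n_f = 5
    ratio = (alpha_s_one_loop(mu, alpha_s_mz, n_f)
             / alpha_s_one_loop(mu_ref, alpha_s_mz, n_f))
    return m_ref * ratio ** exponent
```

Calling `run_mass_one_loop(m, m, mu_k)` with an extracted m t (m t ) = `m` returns the corresponding m t (μ k ). Because α S decreases with increasing scale, the evolved mass is smaller than m t (m t ) for any μ k above m t .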
This procedure is equivalent to extracting directly m t (μ k ) in each bin. Furthermore, the result does not depend on the exact choice of μ k , provided that it is representative of the physical energy scale of the process. In fact, a change in μ k would correspond to a change in m t (μ k ) according to the RGE. The extracted values of m t (μ k ) and their uncertainties can be found in the supplemental material.
In order to benefit from the cancellation of correlated uncertainties in the measured σ (μ k ) tt , the ratios of the various m t (μ k ) to m t (μ 2 ) are considered. In particular, the quantities r 12 = m t (μ 1 )/m t (μ 2 ), r 32 = m t (μ 3 )/m t (μ 2 ), and r 42 = m t (μ 4 )/m t (μ 2 ) are extracted. With this approach the running of m t , i.e. the quantity predicted by the RGE (Eq. (1)), is accessed directly. The measurement at the scale μ 2 is chosen as a reference in order to minimize the correlation between the extracted ratios.
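The benefit of taking ratios can be made concrete with generic first-order uncertainty propagation for r = a/b, where the relative variance is (σ_a/a)² + (σ_b/b)² − 2ρ (σ_a/a)(σ_b/b). The sketch below is an illustration with a hypothetical function name and invented inputs, not the procedure or the values of the analysis.

```python
import math


def ratio_with_uncertainty(a, sigma_a, b, sigma_b, rho):
    """Ratio a/b and its uncertainty from first-order (linear) error propagation."""
    r = a / b
    rel_a = sigma_a / a
    rel_b = sigma_b / b
    # A positive correlation rho reduces the variance of the ratio.
    rel_var = rel_a ** 2 + rel_b ** 2 - 2.0 * rho * rel_a * rel_b
    # Guard against tiny negative values caused by floating-point rounding.
    return r, abs(r) * math.sqrt(max(rel_var, 0.0))
```

With equal relative uncertainties and ρ = 1 the uncertainty in the ratio vanishes, whereas for ρ = 0 the two relative uncertainties add in quadrature. This is the sense in which correlated uncertainties cancel in the ratios.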
Four different types of systematic uncertainty are considered for the ratios: the uncertainty in the various σ (μ k ) tt in the visible phase space (referred to as fit uncertainty), the extrapolation uncertainties, the uncertainties in the proton PDFs, and the uncertainty in the value of α S (m Z ). The fit uncertainty includes experimental and modelling uncertainties described in Section 3. Scale variations in the mcfm predictions are not performed, since the scale dependence of m t is being investigated at a fixed order in perturbation theory. In fact, scale variations in the hard scattering cross section are conventionally performed as a means of estimating the effect of missing higher order corrections and are therefore not applicable in this context.
Uncertainties in the proton PDFs affect the mcfm prediction and therefore the extracted values of the various m t (μ k ). In order to estimate their impact, the calculation is repeated for each eigenvector of the PDF set and the differences in the extracted ratios are added in quadrature to yield the total PDF uncertainties.
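The quadrature sum over PDF eigenvectors amounts to the following short sketch. It assumes one variation per eigenvector and a symmetric treatment of the shifts. The function name and the numbers in the usage note are illustrative only.

```python
import math
from typing import Sequence


def pdf_uncertainty(nominal: float, eigenvector_results: Sequence[float]) -> float:
    """Add the shifts of a quantity under each PDF eigenvector variation in quadrature."""
    return math.sqrt(sum((value - nominal) ** 2 for value in eigenvector_results))
```

For a nominal ratio of 1.00 and two eigenvector results of 1.03 and 0.96, the shifts of 0.03 and 0.04 combine to a total of 0.05.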
In the ABMP16_5_nlo PDF set, α S (m Z ) is determined simultaneously with the PDFs, therefore its uncertainty is incorporated in that of the PDFs. However, the uncertainty in α S (m Z ) also affects the CRunDec conversion from m t (m t ) to m t (μ k ). This effect is estimated independently and is found to be negligible. The impact of extrapolation uncertainties is estimated by varying the measured σ (μ k ) tt within their extrapolation uncertainty, separately for each source and simultaneously in the different bins in m tt , taking the correlations into account. The various contributions are added in quadrature to yield the total extrapolation uncertainty.
The correlations between the extracted masses arising from the fit uncertainty are estimated using MC pseudo-experiments, taking the correlations between the measured σ (μ k ) tt as inputs. The uncertainties are then propagated to the ratios using linear uncertainty propagation, taking the estimated correlations into account. The numerical values of the ratios are determined to be:
−0.017 (PDF+α S ) +0.017 −0.013 (extr).
Here, the fit uncertainty (fit), the combination of PDF and α S uncertainty (PDF+α S ), and the extrapolation uncertainty (extr) are given. The most relevant sources of experimental uncertainty are the integrated luminosity, the lepton identification efficiencies, and the jet energy scale and resolution. Among modelling uncertainties related to the powheg+pythia 8 simulation of the tt signal, the largest contributions originate from the scale variations in the parton shower, the uncertainty in the shape of the p T spectrum of the top quark, and the matching scale between the matrix element and the parton shower. The statistical uncertainties are found to be negligible.
The correlations between the ratios arising from the fit uncertainty are investigated using a pseudo-experiment procedure which consists in repeating the extraction of the ratios using
The result shows agreement between the extracted running and the RGE prediction at one-loop precision within 1.1 standard deviations in the Gaussian approximation and excludes the no-running hypothesis at above 95% confidence level (2.1 standard deviations) in the same approximation.

Summary
In this Letter, the first experimental investigation of the running of the top quark mass, m t , is presented. The running is extracted from a measurement of the differential top quark-antiquark (tt) cross section as a function of the invariant mass of the tt system, m tt . The differential tt cross section, dσ tt /dm tt , is determined at the parton level using a maximum-likelihood fit to distributions of final-state observables, using tt candidate events in the e ± μ ∓ channel. This technique allows the nuisance parameters to be constrained simultaneously with the differential cross section in the visible phase space and therefore provides results with significantly improved precision compared to conventional procedures in which the unfolding is performed as a separate step. The analysis is performed using proton-proton collision data at a centre-of-mass energy of 13 TeV recorded by the CMS detector at the CERN LHC in 2016, corresponding to an integrated luminosity of 35.9 fb −1 .
The running mass m t (μ), as defined in the modified minimal subtraction ($\overline{\mathrm{MS}}$) renormalization scheme, is extracted at one-loop precision as a function of m tt by comparing fixed-order theoretical predictions at next-to-leading order to the measured dσ tt /dm tt .
The extracted running of m t is found to be in agreement with the prediction of the corresponding renormalization group equation, within 1.1 standard deviations, and the no-running hypothesis is excluded at above 95% confidence level. The running of m t is probed up to a scale of the order of 1 TeV.

Acknowledgements
We congratulate our colleagues in the CERN accelerator departments for the excellent performance of the LHC and thank the technical and administrative staffs at CERN and at other CMS institutes for their contributions to the success of the CMS effort. In addition, we gratefully acknowledge the computing centres and personnel of the Worldwide LHC Computing Grid for delivering so effectively the computing infrastructure essential to our analyses.

Appendix A. Supplementary material
Supplementary material related to this article can be found online at https://doi.org/10.1016/j.physletb.2020.135263.