Measurements of the groomed jet radius and momentum splitting fraction with the soft drop and dynamical grooming algorithms in pp collisions at $\sqrt{s}=5.02$ TeV

This article presents measurements of the groomed jet radius and momentum splitting fraction in pp collisions at $\sqrt{s}=5.02$ TeV with the ALICE detector at the Large Hadron Collider. Inclusive charged-particle jets are reconstructed at midrapidity using the anti-$k_{\rm{T}}$ algorithm for transverse momentum $60<p_{\mathrm{T}}^{\rm{ch\; jet}}<80$ GeV/$c$. We report results using two different grooming algorithms: soft drop and, for the first time, dynamical grooming. For each grooming algorithm, a variety of grooming settings are used in order to explore the impact of collinear radiation on these jet substructure observables. These results are compared to perturbative calculations that include resummation of large logarithms at all orders in the strong coupling constant. We find good agreement of the theoretical predictions with the data for all grooming settings considered.

Jet grooming techniques, such as soft drop [13–15] and dynamical grooming [16–19], reduce the magnitude of non-perturbative contributions to jet substructure cross sections in pp collisions by selectively removing soft large-angle radiation. This allows for well-controlled comparisons of measurements to perturbative QCD (pQCD) calculations. Grooming techniques have also previously been applied to heavy-ion collisions, in order to explore whether the quark-gluon plasma modifies the hard substructure of jets [19–29]. Several measurements of groomed jet observables have been made in pp and heavy-ion collisions at the LHC and RHIC [30–37], as well as in $e^{+}e^{-}$ collisions [38]. The benefits of different jet grooming algorithms remain a topic of ongoing study, since different grooming algorithms have different perturbative structure and offer different flexibility via grooming parameters that can be adapted to specific physics goals in either proton-proton or heavy-ion collisions (see e.g. Refs. [19,26,29]). In this article, we explore both the soft drop and dynamical grooming algorithms, and test the ability of pQCD calculations to describe their behavior for a variety of grooming parameters.
Jet grooming algorithms rely on procedures to recluster the constituents of reconstructed jets into a structure that better isolates perturbative emissions in the jet. One such structure is the primary Lund plane, which approximately represents the angular and momentum phase space of partonic emissions off the leading hard-scattered parton. The soft drop and dynamical grooming algorithms each identify a single splitting in the primary Lund plane [39] that satisfies a grooming condition. The two algorithms are further described in Section 3. In this article, we consider two observables that define the kinematics of the identified splitting: $z_{\rm g}$, the groomed jet momentum splitting fraction, and $\theta_{\rm g}$, the (scaled) groomed jet radius, as shown in Fig. 1. The groomed jet momentum splitting fraction is defined as the fraction of transverse momentum ($p_{\rm T}$) relative to the beam that the sub-leading prong in the splitting carries relative to its parent:

$$ z_{\rm g} \equiv \frac{p_{\rm T,subleading}}{p_{\rm T,leading} + p_{\rm T,subleading}}. \tag{1} $$

The (scaled) groomed jet radius is defined as the angular distance between the two prongs of the identified hard splitting, scaled by the jet radius:

$$ \theta_{\rm g} \equiv \frac{R_{\rm g}}{R} \equiv \frac{\sqrt{\Delta y^{2} + \Delta\varphi^{2}}}{R}, \tag{2} $$

where $R$ is the jet radius and $R_{\rm g}$ is the rapidity-azimuth ($y$-$\varphi$) separation of the identified splitting.
The soft drop $z_{\rm g}$ and $\theta_{\rm g}$ distributions have recently been calculated in pp collisions at Next-to-Leading Logarithmic (NLL$'$) accuracy [40,41]. Measurements of $z_{\rm g}$ and $\theta_{\rm g}$ serve to test these analytical predictions, in particular the role of beyond-LL pQCD effects, and to constrain the role of non-perturbative effects. Moreover, by measuring these observables for a variety of grooming conditions $\beta$ (see Section 3.1 for further details), one can systematically study the role of collinear radiation in jet substructure, since increasing $\beta$ removes less and less collinear radiation in the grooming process. Measurements of both $z_{\rm g}$ and $\theta_{\rm g}$ for $\beta = 0$, 1, and 2 have been performed by the ATLAS Collaboration [32] for dijet events with leading $p_{\rm T}^{\rm jet} > 300$ GeV/$c$, and several measurements of $z_{\rm g}$ and $\theta_{\rm g}$ have been performed for $\beta = 0$ across a wide range of jet $p_{\rm T}$ [31,33,36,38]. In this article, we complement these studies by measuring $z_{\rm g}$ and $\theta_{\rm g}$ for $\beta = 0$, 1, and 2 for $60 < p_{\rm T}^{\rm ch\; jet} < 80$ GeV/$c$.
Figure 1: Graphical representation of the angularly-ordered Cambridge-Aachen reclustering of jet constituents and subsequent grooming procedure, with the identified splitting denoted in black and the splittings that were groomed away in light blue (axes: $y$, $\varphi$).
The dynamically groomed $z_{\rm g}$ and $\theta_{\rm g}$ distributions have recently been calculated in pp collisions at Next-to-Next-to-Double Logarithmic (N$^{2}$DL) accuracy [16,18]. In this article, we perform the first measurement of dynamically groomed jet substructure observables, providing the first test of these calculations.
We report measurements in pp collisions at center-of-mass collision energy $\sqrt{s} = 5.02$ TeV with the ALICE detector. Charged-particle jets are reconstructed in the pseudorapidity range $|\eta_{\rm jet}| < 0.5$ for jet radius $R = 0.4$ with $60 < p_{\rm T}^{\rm ch\; jet} < 80$ GeV/$c$. The $z_{\rm g}$ and $\theta_{\rm g}$ distributions are measured using both the soft drop and dynamical grooming procedures, each with a variety of grooming settings. These results are compared to pQCD calculations as well as the PYTHIA8 [42,43] Monte Carlo (MC) event generator. While track-based jet observables are collinear-unsafe [44–46], they can be measured with greater precision than calorimeter-based jet observables, and recent measurements have demonstrated that for many substructure observables track-based distributions are compatible with the corresponding collinear-safe distributions [32]. Comparisons of theoretical calculations to our track-based jet substructure measurements are discussed further in Section 5.

Experimental setup and data sets
A description of the ALICE detector and its performance can be found in Refs. [47,48]. The pp data set used in this analysis was collected in 2017 during LHC Run 2 at $\sqrt{s} = 5.02$ TeV using a minimum-bias trigger defined by the coincidence of the signals from two scintillator arrays in the forward region (V0 detectors) [49]. The event selection includes a primary vertex selection, where the primary vertex is required to be within 10 cm from the center of the detector along the beam direction. Events with more than one reconstructed primary vertex were classified as pileup and rejected [50]. After these selections, the pp data sample contains 870 million events and corresponds to an integrated luminosity of $18.0 \pm 0.4$ nb$^{-1}$ [51].
The analysis uses charged-particle tracks reconstructed with information from the Time Projection Chamber (TPC) [52] and the Inner Tracking System (ITS) [53]. Two types of tracks are defined: global tracks and complementary tracks. Global tracks are required to include at least one hit in the silicon pixel detector (SPD), which comprises the first two layers of the ITS, and to satisfy several track quality selections. Complementary tracks satisfy all the selection criteria of global tracks except for the requirement of a hit in the SPD. They are refitted using the primary vertex to constrain their trajectory in order to preserve a good momentum resolution, especially at high transverse momentum. Including this second class of tracks ensures approximately uniform azimuthal acceptance, while preserving a $p_{\rm T}$ resolution similar to that of tracks with SPD hits. Tracks with $0.15 < p_{\rm T} < 100$ GeV/$c$ are accepted over the pseudorapidity range $|\eta| < 0.9$ and azimuthal angle $0 < \varphi < 2\pi$.
The instrumental performance of the detector is estimated with a MC simulation using PYTHIA8 [42] with the Monash 2013 tune [43] for the event generation and GEANT3 [54] for the transport code propagating particles through the simulated ALICE apparatus. The tracking efficiency in pp collisions is approximately 67% at track $p_{\rm T} = 0.15$ GeV/$c$, rises to approximately 84% at $p_{\rm T} = 1$ GeV/$c$, and remains above 75% at higher $p_{\rm T}$. The momentum resolution $\sigma(p_{\rm T})/p_{\rm T}$ was estimated from the covariance matrix of the track fit [48], and is approximately 1% at track $p_{\rm T} = 1$ GeV/$c$ and 4% at $p_{\rm T} = 50$ GeV/$c$.

Analysis method
Jets are reconstructed from charged-particle tracks with FastJet 3.2.1 [55] using the anti-$k_{\rm T}$ algorithm with E-scheme recombination and resolution parameter $R = 0.4$ [56,57]. All tracks are assigned a mass equal to the $\pi^{\pm}$ meson mass. The jet axis is required to be within the fiducial volume of the TPC, $|\eta_{\rm jet}| < 0.5$, where $\eta_{\rm jet}$ is the jet pseudorapidity. The jet reconstruction performance for this data set is described in Ref. [30]. The underlying event (UE) contributes approximately 1 GeV/$c$ of $p_{\rm T}$ per jet, and is not subtracted. Therefore, UE corrections must be included in theoretical calculations when comparing to the data.

Grooming algorithms
The soft drop and dynamical grooming algorithms identify a single splitting in the primary Lund plane [39] that satisfies a grooming condition. The $i^{\rm th}$ splitting in the primary Lund plane is defined by

$$ z_i \equiv \frac{p_{\rm T,subleading}}{p_{\rm T,leading} + p_{\rm T,subleading}}, \qquad \theta_i \equiv \frac{\Delta R_i}{R}, \tag{3} $$

where $\Delta R_i = \sqrt{\Delta y_i^{2} + \Delta\varphi_i^{2}}$ is the rapidity-azimuth separation of the $i^{\rm th}$ splitting. Note that when reconstructing the primary Lund plane, one must choose a reclustering radius $R_{\rm recluster}$; for soft drop $R_{\rm recluster} = R$ is used, which results in $\theta_{\rm g} \leq 1$, whereas for our implementation of dynamical grooming $R_{\rm recluster} = \infty$ is used, which results in $\theta_{\rm g} > 1$ for fewer than 1% of cases, which we neglect.
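As a minimal sketch of the splitting kinematics described above, the following Python function computes $(z, \theta)$ for a single two-prong splitting. It assumes that $z$ is the softer prong's share of the scalar $p_{\rm T}$ sum and that $\theta$ is the rapidity-azimuth separation divided by $R$; the function name, the `(pt, y, phi)` tuple layout, and the numerical inputs are hypothetical and chosen only for illustration.

```python
import math

def splitting_kinematics(prong_a, prong_b, jet_radius=0.4):
    """Return (z, theta) for one splitting; each prong is a (pt, y, phi) tuple."""
    pt_a, y_a, phi_a = prong_a
    pt_b, y_b, phi_b = prong_b
    # Momentum fraction carried by the softer (sub-leading) prong.
    z = min(pt_a, pt_b) / (pt_a + pt_b)
    # Wrap the azimuthal difference into (-pi, pi] before measuring the distance.
    dphi = math.atan2(math.sin(phi_a - phi_b), math.cos(phi_a - phi_b))
    delta_r = math.hypot(y_a - y_b, dphi)
    # Scale the opening angle by the jet radius.
    return z, delta_r / jet_radius

# Illustrative prongs straddling the phi = 0 / 2*pi boundary.
z, theta = splitting_kinematics((50.0, 0.10, 0.05), (20.0, -0.05, 6.20))
print(z, theta)
```

The azimuthal wrap matters in practice: without it, two prongs on either side of $\varphi = 0$ would appear to be separated by almost $2\pi$.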
In the soft drop grooming algorithm, the grooming condition is given by

$$ z_i > z_{\rm cut}\, \theta_i^{\beta}, \tag{4} $$

where $z_{\rm cut}$ and the exponent $\beta$ are tunable free parameters of the grooming algorithm. The first splitting to pass the grooming condition defines the soft drop groomed jet splitting. As the grooming parameter $\beta$ increases, the quantity $z_{\rm cut}\theta_i^{\beta}$ becomes small for collinear radiation. This causes the algorithm to be less likely to drop collinear radiation, corresponding to less grooming overall and particularly less grooming for collinear radiation. Note that for the values $\beta \geq 0$ considered here, $z_{\rm g}$ is Sudakov safe [15] and $\theta_{\rm g}$ is infrared-collinear safe [40].
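The effect of $\beta$ can be illustrated with a short sketch. It assumes the grooming condition takes the standard soft drop form $z_i > z_{\rm cut}\theta_i^{\beta}$ and that the primary Lund splittings are already available as `(z, theta)` pairs ordered from wide to narrow angle; the function name and the splitting values are hypothetical, and this is not the analysis code.

```python
def soft_drop(splittings, z_cut=0.1, beta=0.0):
    """Return the first (z, theta) with z > z_cut * theta**beta, or None if none passes."""
    for z, theta in splittings:
        if z > z_cut * theta ** beta:
            return z, theta
    return None  # the jet has no splitting satisfying the grooming condition

# Illustrative primary Lund splittings, widest angle first.
primary = [(0.085, 0.90), (0.06, 0.50), (0.25, 0.20)]
selected = {beta: soft_drop(primary, z_cut=0.1, beta=beta) for beta in (0, 1, 2)}
for beta, splitting in selected.items():
    print(beta, splitting)
```

For these inputs, $\beta = 0$ grooms down to the narrow, symmetric splitting, while $\beta = 1$ and $\beta = 2$ stop at progressively wider and softer splittings, in line with the statement that larger $\beta$ grooms less.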
The dynamical grooming algorithm, on the other hand, identifies the splitting that maximizes

$$ z_i (1 - z_i)\, p_{{\rm T},i}\, \theta_i^{a} \tag{5} $$

over all splittings in the primary Lund plane, where $p_{{\rm T},i}$ is the transverse momentum of the parent of the $i^{\rm th}$ splitting and the exponent $a$ is a continuous free parameter. The grooming parameter $a$ defines the density with which the phase space of the Lund plane is groomed away. The case $a \to 0$ selects the splitting with largest $z$, and is somewhat similar to soft drop with $\beta = 0$, which grooms away splittings below a certain $z$. The case $a = 1$ selects the splitting with largest transverse momentum, and is roughly analogous to soft drop with $\beta = -1$, which grooms away splittings below a certain transverse momentum (see Ref. [16] for further details). Since the grooming condition in dynamical grooming defines a maximum rather than an explicit cut (as in the case of soft drop), every dynamically groomed jet will always return a splitting, whereas in soft drop it is possible that a jet does not contain any splitting satisfying the grooming condition.
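A minimal sketch of the selection step follows, assuming the hardness measure of Ref. [16] has the form $z_i(1-z_i)\,p_{{\rm T},i}\,\theta_i^{a}$, with $p_{{\rm T},i}$ the parent transverse momentum. The function name, the `(z, theta, pt)` tuple layout, and the splitting values are hypothetical.

```python
def dynamical_grooming(splittings, a=1.0):
    """Return the (z, theta, pt) splitting that maximizes z*(1-z)*pt*theta**a."""
    if not splittings:
        return None
    # Unlike soft drop there is no explicit cut: the hardest splitting always wins.
    return max(splittings, key=lambda s: s[0] * (1.0 - s[0]) * s[2] * s[1] ** a)

# Illustrative primary Lund splittings as (z, theta, parent pT in GeV/c).
primary = [(0.085, 0.90, 70.0), (0.06, 0.50, 64.0), (0.25, 0.20, 60.0)]
for a in (0.1, 1.0, 2.0):
    print(a, dynamical_grooming(primary, a=a))
```

With these inputs a small exponent picks the symmetric, narrow splitting, whereas $a = 1$ and $a = 2$ pick the wide-angle one, since the angular weight $\theta^{a}$ then dominates.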

Corrections
The reconstructed $p_{\rm T}^{\rm ch\; jet}$ and $z_{\rm g}$ ($\theta_{\rm g}$) differ from their true values due to tracking inefficiency, particle-material interactions, and track $p_{\rm T}$ resolution. To account for these effects, events are simulated using PYTHIA8 Monash 2013 [42,43] for the event generation and GEANT3 [54] for the transport code propagating particles through the simulated ALICE apparatus, as described in Section 2. The truth-level jets are constructed from the charged primary particles of the PYTHIA8 event, defined as all particles with a mean proper lifetime larger than 1 cm/$c$, and excluding the decay products of these particles [58]. A 4D response matrix is constructed that describes the detector response in $p_{\rm T}^{\rm ch\; jet}$ and $z_{\rm g}$ (and similarly for $\theta_{\rm g}$). Then, a 2D unfolding is performed in $p_{\rm T}^{\rm ch\; jet}$ and $z_{\rm g}$ using the iterative Bayesian unfolding algorithm [59,60] implemented in the RooUnfold package [61]. The distributions are corrected for "misses", in which a jet exists inside the considered truth-level range but not inside the detector-level range. The rate of "fakes", in which a jet exists inside the considered detector-level range but not inside the truth-level range, is negligible. The number of iterations, which sets the strength of regularization, is chosen by minimizing the quadrature sum of the statistical and systematic unfolding uncertainties. This results in an optimal number of iterations equal to 3 in all cases.
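To make the role of the iteration count concrete, the sketch below implements a one-dimensional version of iterative Bayesian unfolding in plain Python. It is a simplified stand-in, not the 2D RooUnfold procedure used in the analysis: the function name, the two-bin response matrix, and the yields are all hypothetical. Columns of the response that sum to less than one encode the "miss" probability.

```python
def bayesian_unfold(measured, response, prior, n_iter=3):
    """1D iterative Bayesian unfolding.

    response[j][i] is the probability that a jet generated in truth bin i
    is reconstructed in detector bin j.
    """
    n_det, n_true = len(response), len(response[0])
    # Reconstruction efficiency per truth bin (corrects for misses).
    efficiency = [sum(response[j][i] for j in range(n_det)) for i in range(n_true)]
    truth = list(prior)
    for _ in range(n_iter):
        # Fold the current truth estimate to detector level.
        folded = [sum(response[j][i] * truth[i] for i in range(n_true))
                  for j in range(n_det)]
        # Bayes' theorem gives P(truth i | det j); reweight the measured yields.
        truth = [
            sum(response[j][i] * truth[i] / folded[j] * measured[j]
                for j in range(n_det)) / efficiency[i]
            for i in range(n_true)
        ]
    return truth

response = [[0.7, 0.1],
            [0.1, 0.7]]          # 80% efficiency per truth bin
true_yield = [100.0, 50.0]
measured = [75.0, 45.0]          # response applied to true_yield
one_iter = bayesian_unfold(measured, response, prior=[1.0, 1.0], n_iter=1)
three_iter = bayesian_unfold(measured, response, prior=[1.0, 1.0], n_iter=3)
print(one_iter, three_iter)
```

Starting from a flat prior, each iteration moves the estimate closer to the generated yields; stopping early keeps some dependence on the prior, which is the regularization referred to in the text.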
To validate the performance of the unfolding procedure, refolding tests are performed, in which the response matrix is multiplied by the unfolded solution and compared to the original detector-level spectrum. Closure tests are also performed, in which the shape of the generated MC spectrum is modified to account for the fact that the true distribution may be different from the MC spectrum. In all cases, successful closure within statistical and systematic uncertainties is achieved.

Systematic uncertainties
Systematic uncertainties due to the tracking efficiency, the unfolding procedure, and the MC generator model dependence are considered. Table 1 summarizes the systematic uncertainty contributions from each of these sources. The total systematic uncertainty is calculated as the sum in quadrature of all of the individual systematic uncertainties described below.
The systematic uncertainty due to the uncertainty in the tracking efficiency is evaluated using random rejection of additional tracks in the jet finding. The tracking efficiency uncertainty, estimated from the variation of the track selection criteria and a detailed study of the ITS-TPC track-matching efficiency uncertainty, is 4%. In order to assign a systematic uncertainty to the nominal result, an alternative response matrix is constructed by randomly rejecting an additional 4% of tracks in jet finding, and the unfolding procedure is repeated. This result is compared to the nominal result, with the differences in each bin taken as the systematic uncertainty. The uncertainty on the track momentum resolution is a sub-leading effect to the tracking efficiency and is taken to be negligible.
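The track-rejection variation can be sketched as follows. This is an illustration only: the function names, the seed, and the track and bin values are hypothetical, and the real analysis rebuilds the response matrix and repeats the unfolding between the two steps shown here.

```python
import random

def reject_tracks(tracks, extra_inefficiency=0.04, seed=0):
    """Randomly drop a fraction of tracks before jet finding."""
    rng = random.Random(seed)
    return [t for t in tracks if rng.random() >= extra_inefficiency]

def bin_by_bin_uncertainty(nominal, varied):
    """Absolute difference between varied and nominal results in each bin."""
    return [abs(v - n) for n, v in zip(nominal, varied)]

tracks = list(range(100000))      # stand-in for a track collection
kept = reject_tracks(tracks)
print(len(kept) / len(tracks))    # close to 0.96

# Illustrative unfolded results with and without the extra rejection.
print(bin_by_bin_uncertainty([1.0, 2.0], [1.1, 1.8]))
```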
The systematic uncertainty due to the unfolding procedure is estimated from four groups of variations:
- [...] nominal result is taken as the systematic uncertainty.
- The prior distribution is varied. For the $z_{\rm g}$ analysis, it is scaled by a power law in $p_{\rm T}^{\rm ch\; jet}$ and $z_{\rm g}$, $p_{\rm T}^{\pm 0.5}\, z_{\rm g}^{\pm 0.5}$. For the $\theta_{\rm g}$ analysis, a linear scaling in $\theta_{\rm g}$ by $\pm 50\%$ over its reported range, $p_{\rm T}^{\pm 0.5}\,[1 \pm 0.5(2\theta_{\rm g} - 1)]$, is applied. The average difference between the result unfolded with this prior and the original is taken as the systematic uncertainty.
- The binnings in $z_{\rm g}$ and $\theta_{\rm g}$ are varied to be finer and coarser than the nominal binning.
- The lower bound of the detector-level charged-particle jet transverse momentum $p_{\rm T,det}^{\rm ch\; jet}$ range is extended up and down by 5 GeV/$c$.

The total unfolding systematic uncertainty is then the standard deviation of the variations, $\sqrt{\sum_{i=1}^{N} \sigma_i^{2} / N}$, where $N = 4$ and $\sigma_i$ is the systematic uncertainty due to a single group of variations, since they each comprise independent estimates of the same underlying systematic uncertainty in the regularization.
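The two combination rules can be written compactly. The sketch assumes the unfolding uncertainty is the root mean square of the $N = 4$ group uncertainties (reading the garbled expression as $\sqrt{\sum_i \sigma_i^2 / N}$) and that the total is a plain quadrature sum over sources; the function names and the numerical values are hypothetical and are not taken from Table 1.

```python
import math

def unfolding_uncertainty(group_uncertainties):
    """Root mean square of the per-group unfolding uncertainties."""
    n = len(group_uncertainties)
    return math.sqrt(sum(s * s for s in group_uncertainties) / n)

def total_uncertainty(sources):
    """Quadrature sum of independent systematic sources."""
    return math.sqrt(sum(s * s for s in sources))

# Illustrative relative uncertainties in one bin.
unfolding = unfolding_uncertainty([0.01, 0.02, 0.02, 0.03])
total = total_uncertainty([0.03, unfolding, 0.02])  # tracking, unfolding, generator
print(unfolding, total)
```

Averaging inside the square root, rather than summing, reflects that the four groups estimate the same underlying uncertainty and should not be counted four times.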
The systematic uncertainty due to the model dependence of the generator used to construct the response matrix is estimated by comparing results obtained with PYTHIA8 Monash 2013 [42,43] to those obtained with Herwig7 (default tune) [62]. The tracking efficiency and track $p_{\rm T}$ resolution are parameterized using fast simulations, and response matrices are built using these two generators. These response matrices are then used to unfold the measured data, and the differences between the two unfolded results in each interval are taken as a symmetric uncertainty.

Results
We report the $z_{\rm g}$ and $\theta_{\rm g}$ distributions in the $p_{\rm T}^{\rm ch\; jet}$ interval between 60 and 80 GeV/$c$. All presented results use $R = 0.4$ jets reconstructed from charged particles at midrapidity, and are corrected for detector effects. The distributions are reported as normalized differential cross sections,

$$ \frac{1}{\sigma_{\rm jet}} \frac{{\rm d}\sigma}{{\rm d}z_{\rm g}} = \frac{1}{N_{\rm jet}} \frac{{\rm d}N}{{\rm d}z_{\rm g}}, \tag{6} $$

where $N_{\rm jet}$ ($\sigma_{\rm jet}$) is the number (cross section) of inclusive charged-particle jets within the given $p_{\rm T}^{\rm ch\; jet}$ interval, and $N$ ($\sigma$) is the number (cross section) of groomed splittings. The same normalization as in Eq. 6 is used for $\theta_{\rm g}$. With this normalization, the integral of Eq. 6 is equal to the fraction of jets that pass the grooming condition.

Figures 2 and 3 show the measured $z_{\rm g}$ and $\theta_{\rm g}$ distributions for jets with soft drop grooming for grooming parameters $z_{\rm cut} = 0.1$ and $\beta = 0$, 1, and 2. The $z_{\rm g}$ distributions fall with increasing $z_{\rm g}$, as is typical of the Altarelli-Parisi splitting functions [63]. The $z_{\rm g}$ distribution for $\beta = 0$ cannot populate $z_{\rm g} < 0.1$ due to the grooming condition. However, for $\beta > 0$ it is possible for sufficiently narrow splittings with $z_{\rm g} < z_{\rm cut}$ to pass the grooming condition. The $z_{\rm g}$ distributions are generally described by PYTHIA8 [42] within approximately 20%. The $\theta_{\rm g}$ distributions exhibit a peak at increasingly large $\theta_{\rm g}$ as $\beta$ increases, due to the angular component in the grooming condition. The $\theta_{\rm g}$ distributions are described by PYTHIA8 [42] typically within 20%, but with deviations at low $\theta_{\rm g}$ of up to approximately 50%.

Due to the ill-defined perturbative accuracy of general-purpose MC generators such as PYTHIA and the fact that they are highly tuned to reproduce data, including jet-related observables [43], it is difficult to draw detailed physics conclusions from their comparison to data. Because of this, we instead turn our attention to comparisons with analytical calculations based on pQCD, where deeper insight can be obtained.
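As a brief aside on the normalization, the statement that the integral of the normalized distribution equals the fraction of jets passing the grooming condition can be checked with a toy histogram. The sketch assumes the per-jet normalization $(1/N_{\rm jet})\,{\rm d}N/{\rm d}z_{\rm g}$; the function name, binning, and $z_{\rm g}$ values are hypothetical.

```python
def normalized_distribution(values, n_jets, bin_edges):
    """Return (1/N_jet) dN/dz_g per bin; values holds z_g only for groomed-tagged jets."""
    counts = [0] * (len(bin_edges) - 1)
    for v in values:
        for k in range(len(counts)):
            if bin_edges[k] <= v < bin_edges[k + 1]:
                counts[k] += 1
                break
    widths = [bin_edges[k + 1] - bin_edges[k] for k in range(len(counts))]
    return [c / (n_jets * w) for c, w in zip(counts, widths)], widths

# Ten inclusive jets, of which eight contain a splitting passing the grooming condition.
zg_values = [0.12, 0.15, 0.18, 0.22, 0.25, 0.31, 0.38, 0.45]
density, widths = normalized_distribution(zg_values, n_jets=10,
                                           bin_edges=[0.1, 0.2, 0.3, 0.4, 0.5])
integral = sum(d * w for d, w in zip(density, widths))
print(integral)  # tagged fraction, 8 out of 10 jets
```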

Soft drop
Theoretical calculations with soft drop grooming have been carried out within the Soft-Collinear Effective Theory (SCET) framework [64] for $\theta_{\rm g}$ [40] and $z_{\rm g}$ [41]. These calculations include all-order resummations of large logarithms to NLL$'$ accuracy [40]. In order to compare these parton-jet predictions to our measurement using charged-particle jets, a "forward folding" procedure is applied to account for hadronization and charged-particle effects, followed by a bin-by-bin scaling to account for multi-parton interactions. These corrections are carried out following the procedure outlined in Ref. [30]. Given that the scale $\theta_{\rm g} z_{\rm g} p_{\rm T} R$ becomes non-perturbative at low $\theta_{\rm g}$, and that our measurements of the $z_{\rm g}$ distribution do not include a lower cutoff in $\theta_{\rm g}$, we forgo these comparisons for the $z_{\rm g}$ distribution and refer the reader to Ref. [41]. Instead, we focus on comparison of the measured $\theta_{\rm g}$ distribution to the SCET calculations.

Figure 4 compares the measured $\theta_{\rm g}$ distributions with pQCD calculations based on SCET [40] using either PYTHIA8 [42] or Herwig7 [62] to account for non-perturbative corrections. The PYTHIA8 and Herwig7 corrections show generally similar behavior. Systematic uncertainties on the analytical predictions are estimated by systematically varying combinations of scales that emerge in the calculation. The softest of these scales determines a transition between the perturbative and non-perturbative regimes:

$$ \theta_{\rm g}^{\rm NP} = \left( \frac{\Lambda}{z_{\rm cut}\, p_{\rm T}\, R} \right)^{\frac{1}{1+\beta}}, \tag{7} $$

where $\Lambda$ is the energy scale at which $\alpha_{\rm s}$ becomes non-perturbative. This transition is indicated by a dashed vertical blue line at $\Lambda = 1$ GeV/$c$, taking $p_{\rm T}$ to be the weighted average $p_{\rm T}^{\rm ch\; jet}$ in the considered interval scaled by 20% to approximately translate the $p_{\rm T}$ scale from charged-particle jets to full jets. The cross section is normalized according to the integral of the distribution in the perturbative region,

$$ \sigma_{\theta_{\rm g} > \theta_{\rm g}^{\rm NP}} \equiv \int_{\theta_{\rm g}^{\rm NP}}^{1} \frac{{\rm d}\sigma}{{\rm d}\theta_{\rm g}}\, {\rm d}\theta_{\rm g}. \tag{8} $$

The measured $\theta_{\rm g}$ distributions agree with the SCET calculations within uncertainties in the perturbative region (i.e. to the right of the dashed line), whereas they diverge at low values of $\theta_{\rm g}$, where non-perturbative effects dominate and the perturbative calculation is expected to break down. This holds for all values of $\beta$. Note that the perturbative regime contains an increasingly small fraction of the distribution as $\beta$ grows, which demonstrates that at these $p_{\rm T}^{\rm ch\; jet}$ values, the majority of the $\theta_{\rm g}$ distribution can only be captured by pQCD for sufficiently small $\beta$.

Figure 4: Measured $\theta_{\rm g}$ distributions compared with calculations based on SCET [40] and corrected for non-perturbative effects using either PYTHIA8 [42] or Herwig7 [62]. The distributions are normalized such that the integral of the perturbative region defined by $\theta_{\rm g} > \theta_{\rm g}^{\rm NP}$ (to the right of the dashed vertical blue line) is unity. The non-perturbative scale in Eq. 7 is taken to be $\Lambda = 1$ GeV/$c$. In determining the normalization, intervals that overlap with the dashed blue line are considered to be in the non-perturbative (left) region.

Figures 5 and 6 show the $z_{\rm g}$ and $\theta_{\rm g}$ distributions in pp collisions for jets with dynamical grooming for several values of the grooming parameter $a$. For small values of $a$, the grooming condition favors splittings with symmetric longitudinal momenta, which is reflected in the distributions skewing towards large $z_{\rm g}$ and small $\theta_{\rm g}$. As $a$ increases, the grooming condition favors splittings with large angular separation, which is reflected in the distributions skewing towards small $z_{\rm g}$ and large $\theta_{\rm g}$. The results are compared with PYTHIA8 Monash 2013 [42,43], which generally describes the data within approximately 20%.
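Returning briefly to the soft drop transition angle of Eq. 7, a numerical illustration shows why the perturbative region shrinks as $\beta$ grows. The sketch assumes the transition angle takes the form $\theta_{\rm g}^{\rm NP} = (\Lambda / (z_{\rm cut}\, p_{\rm T}\, R))^{1/(1+\beta)}$, which follows from setting the scale $z_{\rm cut}\theta^{1+\beta} p_{\rm T} R$ equal to $\Lambda$. The values $\Lambda = 1$ GeV/$c$, $z_{\rm cut} = 0.1$, and $R = 0.4$ are from the text; $p_{\rm T} = 80$ GeV/$c$ is a hypothetical full-jet scale chosen only for illustration, not the weighted average used in the analysis.

```python
def theta_np(lam, z_cut, pt, jet_radius, beta):
    """Transition angle (Lambda / (z_cut * pT * R)) ** (1 / (1 + beta))."""
    return (lam / (z_cut * pt * jet_radius)) ** (1.0 / (1.0 + beta))

# pt = 80 GeV/c is an assumed illustrative value, not taken from the measurement.
boundaries = {beta: theta_np(1.0, 0.1, 80.0, 0.4, beta) for beta in (0, 1, 2)}
for beta, boundary in boundaries.items():
    print(beta, round(boundary, 2))
```

Under these assumptions the boundary moves from roughly 0.31 at $\beta = 0$ to roughly 0.56 and 0.68 at $\beta = 1$ and 2, so a growing part of the $\theta_{\rm g}$ range falls on the non-perturbative side.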

Dynamical grooming
In Figs. 7 and 8, we compare the $z_{\rm g}$ and $\theta_{\rm g}$ distributions, respectively, to pQCD calculations described in Ref. [18]. The theoretical calculations include non-perturbative corrections based on MC event generators, which are implemented in Ref. [18]. The theoretical uncertainty bands account for scale variations together with non-perturbative effects, the latter generally being the dominant contribution. The calculations generally describe the data within the precision of the statistical and systematic uncertainties of the data and the theoretical uncertainties of the calculation, demonstrating that pQCD predictions, when coupled with corrections for non-perturbative effects, provide a sufficient description of the data even at the moderate $p_{\rm T}^{\rm ch\; jet}$ considered here.

Conclusions
We have presented new measurements of the groomed jet radius and momentum splitting fraction in pp collisions at $\sqrt{s} = 5.02$ TeV with the ALICE detector at the Large Hadron Collider. We studied two grooming algorithms, soft drop and dynamical grooming, each with a variety of grooming settings in order to study their impact on soft and wide-angle radiation. These studies have provided the first measurement of a jet substructure observable with the dynamical grooming procedure. We compared these results to perturbative calculations that include resummation of large logarithms at all orders in the strong coupling constant, and generally found agreement of the theoretical predictions with the data in the perturbative regime. This conclusion holds for all grooming settings considered. However, the soft drop $\theta_{\rm g}$ distributions increasingly deviate from the perturbative calculations at small $\theta_{\rm g}$ as the grooming parameter $\beta$ is increased (corresponding to grooming away less collinear radiation). This is in accordance with the predicted limitation of the perturbative calculation in describing the non-perturbative region, and provides guidance for the regimes within which perturbative QCD can be used to describe the observables. These measurements can be used to test future perturbative calculations and models of non-perturbative effects, and can serve as a baseline reference for future measurements in heavy-ion collisions.

Figures 5 and 6 (caption): distributions in pp collisions at $\sqrt{s} = 5.02$ TeV with dynamical grooming [16] for three values of the grooming parameter $a$, compared with PYTHIA8 Monash 2013 [42,43] calculations.