New Angles on Energy Correlation Functions

Jet substructure observables, designed to identify specific features within jets, play an essential role at the Large Hadron Collider (LHC), both for searching for signals beyond the Standard Model and for testing QCD in extreme phase space regions. In this paper, we systematically study the structure of infrared and collinear safe substructure observables, defining a generalization of the energy correlation functions to probe $n$-particle correlations within a jet. These generalized correlators provide a flexible basis for constructing new substructure observables optimized for specific purposes. Focusing on three major targets of the jet substructure community---boosted top tagging, boosted $W/Z/H$ tagging, and quark/gluon discrimination---we use power-counting techniques to identify three new series of powerful discriminants: $M_i$, $N_i$, and $U_i$. The $M_i$ series is designed for use on groomed jets, providing a novel example of observables with improved discrimination power after the removal of soft radiation. The $N_i$ series behave parametrically like the $N$-subjettiness ratio observables, but are defined without respect to subjet axes, exhibiting improved behavior in the unresolved limit. Finally, the $U_i$ series improves quark/gluon discrimination by using higher-point correlators to simultaneously probe multiple emissions within a jet. Taken together, these observables broaden the scope for jet substructure studies at the LHC.


Introduction
With the Large Hadron Collider (LHC) rapidly acquiring data at a center-of-mass energy of 13 TeV, jet substructure observables are playing a central role in a large number of analyses, from Standard Model measurements [1-12] to searches for new physics. As the field of jet substructure matures [35-38], observables are being designed for increasingly specific purposes, using a broader set of criteria to evaluate their performance beyond simply raw discrimination power. Continued progress relies on achieving a deeper understanding of the QCD dynamics of jets, allowing for more subtle features within a jet to be exploited. This understanding has progressed rapidly in recent years, due both to advances in explicit calculations of jet substructure observables and to the development of techniques for understanding the dominant properties of substructure observables using analytic [64-66] and machine learning [67-73] approaches. A particularly powerful method for constructing jet substructure observables is power counting, introduced in Ref. [65]. Given a basis of infrared and collinear (IRC) safe observables, power counting can identify which combinations are optimally sensitive to specific parametric features within a jet. Furthermore, power counting elucidates the underlying physics probed by the observable. This approach was successfully applied to the energy correlation functions [74], leading to a powerful 2-prong discriminant called $D_2$ [65]. Vital to the power counting approach, though, is a sufficiently flexible basis of IRC safe observables to allow the construction of discriminants with specific properties.
In this paper, we exploit the known properties of IRC safe observables to systematically identify a useful basis for jet substructure, which we call the generalized energy correlation functions. These observables, denoted by ${}_v e_n^{(\beta)}$ and defined in Eq. (3.3), are an extension of the original energy correlation functions with a more flexible angular weighting. Specifically, these new observables correlate $v$ pairwise angles among $n$ particles, whereas the original correlators were restricted to $v = \binom{n}{2}$. Using these generalized correlators, we apply power counting to identify new jet substructure observables for each of the major jet substructure applications at the LHC: 3-prong boosted top tagging, 2-prong boosted $W/Z/H$ tagging, and 1-prong quark/gluon discrimination. In each case, our new observables exhibit improved performance over traditional observables when tested with parton shower generators.
The flexibility of our basis, combined with insights from power counting, allows us to tailor our observables for specific purposes, beyond those that have been previously considered. As an interesting example, we are able to specifically design observables for use on groomed jets [41,42,75-78]. While grooming procedures are heavily used at the LHC to remove jet contamination from initial state radiation, underlying event, and pileup, most LHC analyses apply observables that were designed for use on ungroomed jets. Here, by understanding the impact of grooming on soft radiation, we introduce a 2-prong discriminant, $M_2$, which exhibits almost no discrimination power on ungroomed jets, but outperforms traditional observables when measured on groomed jets. This observable therefore acts both as a probe of the grooming procedure and as a powerful discriminant. We also show how the use of groomed observables leads to remarkably stable distributions as a function of the jet mass and $p_T$, even for distributions that are unstable before grooming, such as $D_2$. Such stability has recently been emphasized as a desirable feature for substructure observables, particularly to facilitate sideband calibration and produce smooth mass distributions for backgrounds [79]; observables modified to achieve stability have been used by both ATLAS and CMS [80,81].
The generalized energy correlation functions allow us to introduce a wide variety of new substructure observables, though we focus on three series with particularly nice properties. The first is the $M_i$ series, defined via the ratio
$$ M_i^{(\beta)} = \frac{{}_1 e_{i+1}^{(\beta)}}{{}_1 e_i^{(\beta)}}. \tag{1.1} $$
These observables identify jets with $i$ hard prongs, but, as mentioned above, are only effective for discrimination on suitably groomed jets. The second is the $N_i$ series, defined via the ratio
$$ N_i^{(\beta)} = \frac{{}_2 e_{i+1}^{(\beta)}}{\left({}_1 e_i^{(\beta)}\right)^2}, \tag{1.2} $$
which are designed to mimic the behavior of the $N$-subjettiness ratio $\tau_{i,i-1}$ [82,83]. The $N_i$ observables are defined without respect to subjet axes, and therefore exhibit improved behavior compared to $N$-subjettiness, particularly in the transition to the unresolved region, where the definition of subjet axes becomes ambiguous. The third is the $U_i$ series, defined as
$$ U_i^{(\beta)} = {}_1 e_{i+1}^{(\beta)}, \tag{1.3} $$
which probe multiple emissions within 1-prong jets and can be used to improve quark/gluon discrimination. In all cases, the parameter $\beta$ controls the overall angular scaling of these observables, and the $(\beta)$ superscript will often be dropped when clear from context. These observables will be made available in the EnergyCorrelator FastJet contrib [84,85] starting in version 1.2.0. To guide the reader, we summarize the particular applications studied in this paper, so that the (un)interested reader can skip to the relevant section.
• Boosted Top Tagging (Sec. 4):
  - $N_3$: An axes-free observable which reduces to the $N$-subjettiness ratio $\tau_{3,2}$ in the resolved limit, but exhibits improved performance in the unresolved limit on groomed jets.
• Boosted $W/Z/H$ Tagging (Sec. 5):
  - $N_2$: An axes-free observable which reduces to the $N$-subjettiness ratio $\tau_{2,1}$ in the resolved limit, but exhibits improved performance on both groomed and ungroomed jets.
• Quark/Gluon Discrimination (Sec. 6):
  - $U_i$: A new series of observables for quark/gluon discrimination which probes the structure of multiple soft gluon emissions from the hard jet core, leading to improved performance over the standard $C_1$ observable [74].
The specific form of these observables, and the origin of their discrimination power, will be analyzed using power counting. We verify all power-counting predictions using parton shower generators and compare the performance of our newly introduced observables to traditional observables for each of the above applications. The remainder of this paper is organized as follows. In Sec. 2, we review standard substructure and grooming techniques as well as the power counting approach for understanding soft and collinear scaling. In Sec. 3, we discuss the general structure of IRC safe observables and introduce the generalized energy correlation functions, ${}_v e_n$, as well as the $M_i$, $N_i$, and $U_i$ series. The three key case studies bulleted above appear in Secs. 4, 5, and 6. We conclude in Sec. 7 and discuss possible future directions for improving our understanding of jet substructure at the LHC.

Review of Substructure Approaches
In this section, we review a number of standard jet substructure techniques that will be used throughout this paper. We begin in Sec. 2.1 by defining the energy correlation functions [74] and $N$-subjettiness ratios [82,83], both of which are widely used in jet substructure. In Sec. 2.2, we review the soft drop/modified mass drop [41,42,86] algorithm, which we use as our default grooming procedure. Finally in Sec. 2.3, using the 2-point energy correlation function as an example, we review the power-counting approach for analyzing jet substructure observables, which features heavily in later discussions. Readers familiar with these topics can safely skip to Sec. 3, though we recommend reviewing the logic of Sec. 2.3.

Energy Correlation Functions and N-subjettiness
The energy correlation functions [74] are a convenient basis of observables for probing multi-prong substructure within a jet. In this paper, we use the 2-, 3-, and 4-point energy correlation functions, defined as
$$ e_2^{(\beta)} = \sum_{1 \le i < j \le n_J} z_i z_j \, \theta_{ij}^\beta, $$
$$ e_3^{(\beta)} = \sum_{1 \le i < j < k \le n_J} z_i z_j z_k \, \theta_{ij}^\beta \theta_{ik}^\beta \theta_{jk}^\beta, $$
$$ e_4^{(\beta)} = \sum_{1 \le i < j < k < l \le n_J} z_i z_j z_k z_l \, \theta_{ij}^\beta \theta_{ik}^\beta \theta_{il}^\beta \theta_{jk}^\beta \theta_{jl}^\beta \theta_{kl}^\beta, \tag{2.1} $$
where $n_J$ is the number of particles in the jet. The generalization to higher-point correlators is straightforward, though we will not use them here. For simplicity, we often drop the explicit angular exponent $\beta$, writing the observable as $e_n$. This simplified notation will also be used for other observables introduced in the text. It is convenient to work with dimensionless observables, written in terms of a generic energy fraction variable, $z$, and a generic angular variable, $\theta$. The precise definitions of the energy fraction and angle can be chosen depending on context and do not affect our power-counting arguments. For the case of $pp$ collisions at the LHC, which is the focus of our later studies, we work with longitudinally boost-invariant variables,
$$ z_i \equiv \frac{p_{Ti}}{\sum_{j \in \text{jet}} p_{Tj}}, \qquad \theta_{ij}^2 \equiv R_{ij}^2 = (\phi_i - \phi_j)^2 + (y_i - y_j)^2, \tag{2.2} $$
where $p_{Ti}$, $\phi_i$, and $y_i$ are the transverse momentum, azimuthal angle, and rapidity of particle $i$, respectively. Two other measures intended for $e^+e^-$ collisions are available in the EnergyCorrelator FastJet contrib [84,85]. The first is a definition based strictly on energies and opening angles,
$$ z_i \equiv \frac{E_i}{E_J}, \qquad \theta_{ij} \equiv \Theta_{ij}, \tag{2.3} $$
where $E_J$ is the total jet energy, and $\Theta_{ij}$ is the Euclidean angle between the 3-momenta $\vec p_i$ and $\vec p_j$. There is an alternative definition in terms of energies and Mandelstam invariants,
$$ z_i \equiv \frac{E_i}{E_J}, \qquad \theta_{ij}^2 \equiv \frac{2\, p_i \cdot p_j}{E_i E_j}, \tag{2.4} $$
which reduces to Eq. (2.3) in the collinear limit but is easier for analytic calculations. From Eq. (2.1), we see that the $n$-point energy correlation functions vanish in the soft and collinear limits, and therefore are natural resolution variables for $(n-1)$-prong substructure.
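To make the sums in Eq. (2.1) concrete, the following is a minimal brute-force sketch using the $pp$ measure of Eq. (2.2). The function names `delta_r` and `ecf`, the tuple layout `(pT, y, phi)`, and the sample jet are illustrative assumptions; this is not the EnergyCorrelator FastJet contrib interface, and the loop over all $n$-tuples is only practical for small particle multiplicities.

```python
from itertools import combinations
from math import pi, prod, sqrt


def delta_r(a, b):
    """Boost-invariant angle R_ij between two particles given as (pT, y, phi)."""
    dphi = abs(a[2] - b[2]) % (2 * pi)
    if dphi > pi:
        dphi = 2 * pi - dphi  # wrap the azimuthal difference into [0, pi]
    return sqrt((a[1] - b[1]) ** 2 + dphi ** 2)


def ecf(particles, n, beta=1.0):
    """n-point energy correlation function e_n^(beta), summed over i_1 < ... < i_n."""
    pt_jet = sum(p[0] for p in particles)
    total = 0.0
    for group in combinations(particles, n):
        z_product = prod(p[0] / pt_jet for p in group)
        # Product over all n-choose-2 pairwise angles within the group.
        angle_product = prod(delta_r(a, b) ** beta for a, b in combinations(group, 2))
        total += z_product * angle_product
    return total


# Hypothetical three-particle jet: two hard prongs plus one soft emission.
jet = [(100.0, 0.0, 0.0), (80.0, 0.3, 0.1), (5.0, -0.2, 0.4)]
e2 = ecf(jet, 2)
e3 = ecf(jet, 3)
print(e2, e3)
```

A jet with fewer than $n$ particles yields no $n$-tuples, so `ecf` returns zero in that case.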
A number of powerful 2-prong discriminants have been formed from the energy correlation functions [65,74], namely
$$ C_2^{(\beta)} = \frac{e_3^{(\beta)}}{\left(e_2^{(\beta)}\right)^2}, \qquad D_2^{(\beta)} = \frac{e_3^{(\beta)}}{\left(e_2^{(\beta)}\right)^3}. \tag{2.5} $$
Figure 1: Schematic depiction of the phase space for (a) the energy correlation functions $e_2$, $e_3$ and (b) the $N$-subjettiness observables $\tau_1$, $\tau_2$. In both cases, contours of the relevant ratio observable, $D_2$ or $\tau_{2,1}$, are shown as white dashed curves. These ratios are chosen such that the contours cleanly separate the 1- and 2-prong regions of phase space.
Beyond their discrimination power, these observables have nice analytic properties. First, since they can be written as a sum over particles in the jet without reference to external axes, they are automatically "recoil-free" [74,87-90]. Second, since they have well-defined behavior in various soft and collinear limits, they are amenable to resummed calculations; in Ref. [58], $D_2$ was calculated to next-to-leading-logarithmic (NLL) accuracy in $e^+e^-$ for both signal (boosted $Z$) and background (QCD) jets.
The basic structure of the $e_2$, $e_3$ phase space is shown in Fig. 1a and discussed in more detail in Refs. [58,65]. Signal jets which have resolved 2-prong structure live in the region of phase space satisfying $e_3 \ll (e_2)^3$, whereas QCD background jets with 1-prong structure live in the phase space region defined by $(e_2)^3 \lesssim e_3 \lesssim (e_2)^2$. The observable $D_2$ is designed to define contours which cleanly separate the 1-prong and 2-prong regions of phase space, and therefore identifies the extent to which a jet is 1- or 2-prong-like.
Observables for boosted top tagging have also been proposed using the energy correlation functions, namely the $C_3$ observable [74],
$$ C_3^{(\beta)} = \frac{e_4^{(\beta)} \, e_2^{(\beta)}}{\left(e_3^{(\beta)}\right)^2}, \tag{2.6} $$
and the $D_3$ observable [66], Eq. (2.7). Here, $x$ and $y$ are constants in the definition of $D_3$, given in Ref. [66], that depend on the jet mass and $p_T$. The $C_3$ observable does not exhibit particularly good discrimination power, and while $D_3$, which was constructed using the power counting approach, performs well, it has a complicated functional form. For the boosted top study in Sec. 4, we compare to a simplified version of the $D_3$ observable obtained by setting $x = y = 0$, which behaves well on groomed jets. Unlike its more complicated cousin, this simplified $D_3$ has only a single angular exponent. We also find it interesting to compare our new observables to $N$-subjettiness. The (normalized) $N$-subjettiness observable $\tau_N$ [82,83] is defined as
$$ \tau_N^{(\beta)} = \sum_{1 \le i \le n_J} z_i \, \min\left\{ \theta_{i1}^\beta, \theta_{i2}^\beta, \ldots, \theta_{iN}^\beta \right\}. \tag{2.9} $$
Here, the angle $\theta_{iK}$ is measured between particle $i$ and subjet axis $K$ in the jet. As for the case of the energy correlation functions, a number of different possible measures can be used to define $\theta_{iK}$. For our LHC studies, we take $\theta_{iK} = R_{iK}$, analogously to Eq. (2.2). Unlike the energy correlation functions of Eq. (2.1), which correlate groups of $n$ particles within the jet, $N$-subjettiness divides a jet into $N$ sectors and correlates the particles in each sector with their corresponding axis. Thus, implicit in the definition of $N$-subjettiness in Eq. (2.9) is the definition of appropriate $N$-subjettiness axes. Different definitions of the axes can lead to different behaviors of the observable, particularly away from the resolved limit [94]. A natural definition is to choose the axes that minimize the value of $\tau_N$ itself [83], as is done for the classic $e^+e^-$ event shape thrust [95]. Exact minimization is computationally challenging, though, so a number of definitions which approximate the minimum are used instead; these are provided in the Nsubjettiness FastJet contrib [84,85].
The relevant $N$-subjettiness ratio observables are
$$ \tau_{2,1}^{(\beta)} = \frac{\tau_2^{(\beta)}}{\tau_1^{(\beta)}}, \qquad \tau_{3,2}^{(\beta)} = \frac{\tau_3^{(\beta)}}{\tau_2^{(\beta)}}. \tag{2.10} $$
Here, $\tau_{2,1}$ is designed to be small when a jet has well-resolved 2-prong substructure, making it useful for boosted $W/Z/H$ tagging. Similarly, $\tau_{3,2}$ is designed to be small in the 3-prong limit, useful for boosted tops. The observable $\tau_{2,1}$ was calculated in $e^+e^-$ collisions for signal (boosted $Z$) jets at N$^3$LL accuracy [39]. The phase space for $\tau_1$, $\tau_2$ is shown schematically in Fig. 1b, along with contours of constant $\tau_{2,1}$. Background QCD jets are defined by the linear scaling $\tau_2 \sim \tau_1$, whereas signal jets are defined by $\tau_2 \ll \tau_1$. This phase space structure is different from that of the $e_2$ and $e_3$ observables shown in Fig. 1a, where the phase space for background QCD jets is defined by two boundaries with distinct scalings. It is this fact which ultimately leads to many of the differences seen between $D_2$ and $\tau_{2,1}$, including the fact that the $\tau_{2,1}$ distribution is more stable as a function of jet mass and $p_T$. The phase space for $\tau_{3,2}$ is similar to that for $\tau_{2,1}$, in contrast with the complicated phase space for $D_3$ [66]. Using the generalized energy correlation functions, we can define new axes-free observables that mirror the phase space structures of $\tau_{2,1}$ and $\tau_{3,2}$, thereby exhibiting similar scaling and stability behaviors, particularly for groomed jets. This will be discussed for $\tau_{3,2}$ in Sec. 4 and for $\tau_{2,1}$ in Sec. 5.
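As a hedged illustration of how the axis dependence enters, the sketch below evaluates $\tau_N = \sum_i z_i \min_K \theta_{iK}^\beta$ for axes that are simply supplied by hand. The function `tau_n`, the `(y, phi)` axis format, and the sample jet and axes are assumptions made for this example; no axis minimization is attempted, and this is not the Nsubjettiness FastJet contrib interface.

```python
from math import pi, sqrt


def delta_r(y1, phi1, y2, phi2):
    """Rapidity-azimuth distance, with the azimuthal difference wrapped into [0, pi]."""
    dphi = abs(phi1 - phi2) % (2 * pi)
    if dphi > pi:
        dphi = 2 * pi - dphi
    return sqrt((y1 - y2) ** 2 + dphi ** 2)


def tau_n(particles, axes, beta=1.0):
    """Normalized N-subjettiness for fixed, user-supplied axes.

    particles: list of (pT, y, phi); axes: list of (y, phi), one per subjet.
    """
    pt_jet = sum(p[0] for p in particles)
    return sum(
        (pt / pt_jet) * min(delta_r(y, phi, ya, pa) ** beta for ya, pa in axes)
        for pt, y, phi in particles
    )


# Hypothetical 2-prong jet with one extra soft particle.
jet = [(60.0, 0.0, 0.0), (35.0, 0.4, 0.0), (5.0, 0.1, 0.2)]
tau1 = tau_n(jet, axes=[(0.15, 0.0)])              # one axis between the prongs
tau2 = tau_n(jet, axes=[(0.0, 0.0), (0.4, 0.0)])   # one axis on each hard prong
tau21 = tau2 / tau1
print(tau1, tau2, tau21)
```

With axes placed on the two hard prongs, only the soft particle contributes to $\tau_2$, so $\tau_{2,1}$ comes out small for this 2-prong configuration.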

Soft Drop Grooming
Two powerful tools which have emerged from the study of jet substructure are groomers [41,42,75-78] and pileup mitigation techniques [96-102], both of which remove soft radiation from a jet. Groomers have proven to be useful both for removing jet contamination and for identifying hard multi-prong substructure within a jet. In this paper, we use the soft drop [86] groomer with $\beta = 0$, which coincides with the modified mass drop procedure [41,42] with $\mu = 1$. The soft drop groomer exhibits several theoretical advantages over other groomers; in particular, it removes non-global logarithms [103] to all orders, and it mitigates the process dependence of jet spectra. The soft-dropped groomed jet mass has recently been calculated to NNLL accuracy [60,61].
Starting from a jet identified with an IRC safe jet algorithm (such as anti-$k_t$ [104]), the soft drop algorithm is defined using Cambridge/Aachen (C/A) reclustering [105-107]. Specializing to the case of $\beta = 0$, the algorithm proceeds as follows:
1. Recluster the jet using the C/A clustering algorithm, producing an angular-ordered branching history for the jet.
2. Step through the branching history of the reclustered jet. At each step, check the soft drop condition
$$ \frac{\min(p_{Ti}, p_{Tj})}{p_{Ti} + p_{Tj}} > z_{\text{cut}}, \tag{2.11} $$
where $i$ and $j$ label the two branches at that step. Here, $z_{\text{cut}}$ is a parameter defining the scale below which soft radiation is removed. If the soft drop condition is not satisfied, then the softer of the two branches is removed from the jet. This process is then iterated on the harder branch.
3. The soft drop procedure terminates once the soft drop condition is satisfied.
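The steps above can be sketched as follows, under simplifying assumptions that are not part of the original algorithm definition: the C/A branching history is taken as already built and is represented as nested `(left, right)` tuples whose leaves are particle $p_T$ values, and the $p_T$ of a branch is approximated by the scalar sum of its constituents rather than by four-vector recombination. The names `branch_pt` and `soft_drop` are hypothetical.

```python
def branch_pt(node):
    """Scalar-sum pT of a branch: a leaf is a float, a node is a (left, right) pair."""
    if isinstance(node, tuple):
        return branch_pt(node[0]) + branch_pt(node[1])
    return float(node)


def soft_drop(node, z_cut=0.1):
    """Soft drop with beta = 0 applied to a pre-built C/A branching tree."""
    while isinstance(node, tuple):
        left, right = node
        pt_l, pt_r = branch_pt(left), branch_pt(right)
        if min(pt_l, pt_r) / (pt_l + pt_r) > z_cut:
            return node  # soft drop condition satisfied: grooming terminates
        # Condition failed: discard the softer branch and recurse on the harder one.
        node = left if pt_l >= pt_r else right
    return node  # reached a single particle without passing the condition


# Hypothetical history: two wide-angle soft emissions around a hard 60/40 splitting.
tree = (((60.0, 40.0), 3.0), 2.0)
groomed = soft_drop(tree, z_cut=0.1)
print(groomed)  # the (60.0, 40.0) splitting survives
```

In this example the two outermost emissions carry momentum fractions of roughly 2% and 3%, below $z_{\text{cut}} = 0.1$, so both are removed before the hard splitting passes the condition.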
Given a jet that has been groomed with the soft drop procedure, we can then measure any IRC safe observable on this jet and it will remain IRC safe. As we will see, because soft drop removes soft radiation from a jet, power-counting arguments for groomed jets can be dramatically different than those for ungroomed jets. This is previewed in Fig. 2.

Figure 2: Same as Fig. 1, but after applying jet grooming. The upper boundary of phase space for $D_2$ is modified by removing soft radiation, while the parametric behavior of $\tau_{2,1}$ is unchanged. This modified phase space for 2-prong discriminants will be discussed in more detail in Sec. 5.

More generally, the soft drop groomer has an angular weighting exponent $\beta$, which controls the aggressiveness of the groomer, and we expect deviations away from our default of $\beta = 0$ to yield similar behavior, so long as the groomer continues to remove parametrically soft particles. We also expect that other groomers such as trimming [78], which is used heavily by the ATLAS experiment, will behave similarly for the same value of $z_{\text{cut}}$. We leave a detailed study of other groomers to future work.

Power Counting the Soft/Collinear Behavior
An efficient approach for studying jet substructure is power counting [65], which allows one to determine the parametric scaling of observables. This parametric behavior is determined by the soft and collinear limits of QCD and is robust to hadronization or modeling in parton shower generators. Here, we briefly review the salient features of power counting, using the 2-point energy correlator as an example. We refer readers interested in a more detailed discussion to the original paper.
High-energy QCD jets are dominated by soft and collinear radiation, a language which will be used frequently throughout this paper. Since QCD is approximately conformal, there is no intrinsic energy or angular scale associated with this radiation. By applying a measurement to a jet, though, one introduces a scale, which then determines the scaling of soft and collinear radiation. The simple observation that all scales are set by the measurement itself allows for a powerful understanding of the jet's energy and angular structure. Arguments along these lines are ubiquitous in the effective field theory (EFT) community. For example, in Soft Collinear Effective Theory (SCET) [108-111], they are used to identify the appropriate EFT modes required to describe a particular set of measurements.
Figure 3: [...] and collinear-soft (orange) radiation, as well as the characteristic scales, $z_s$, $\theta_{cc}$, $z_{cs}$, and $\theta_{12}$.
In the context of power counting, soft and collinear emissions are defined by their parametric scalings. A soft emission, denoted by $s$, is defined by
$$ z_s \ll 1, \qquad \theta_{sx} \sim 1. \tag{2.12} $$
Here, $z_s$ is the momentum fraction, as defined in Eq. (2.2), and $\theta_{sx}$ is the angle to any other particle $x$ in the jet, including other soft particles. The scaling $\theta_{sx} \sim 1$ means that $\theta_{sx}$ is not assigned any parametric scaling associated with the measurement. A collinear emission, denoted by $c$, is defined by
$$ z_c \sim 1, \qquad \theta_{cc} \ll 1, \qquad \theta_{cs} \sim 1. \tag{2.13} $$
Here, $\theta_{cc}$ is the angle between two collinear particles, while $\theta_{cs}$ is the angle between a collinear particle and a soft particle. In an EFT context, overlaps between soft and collinear regions are systematically removed using the zero-bin procedure [112], but this is not relevant for the arguments here. The soft and collinear modes are illustrated in Fig. 3a and their scalings are summarized in Table 1a.
We now use the simple example of $e_2$ to demonstrate how an applied measurement sets the scaling of soft and collinear radiation (footnote 7). The analysis of more general observables proceeds analogously. Recall that
$$ e_2^{(\beta)} = \sum_{1 \le i < j \le n_J} z_i z_j \, \theta_{ij}^\beta. \tag{2.14} $$
If we only consider regions of phase space where $e_2 \ll 1$, such that we have a well-defined collimated jet, all particles in the jet either have small $z_i$ or small $\theta_{ij}$. In this phase space region, the observable is indeed dominated by soft and collinear emissions.
Footnote 7: In this analysis, we do not consider the scale set by the jet radius, $R$. For $R \ll 1$, the jet radius must also be considered in the power counting and the scale $R$ appears in perturbative calculations. For recent work on the resummation of logarithms associated with this scale, see Refs. [113-116].
To determine the scaling of $z_s$ and $\theta_{cc}$ in terms of the observable, we can consider the different possible contributions to $e_2$: soft-soft correlations, soft-collinear correlations, and collinear-collinear correlations. Parametrically, $e_2$ can therefore be written as
$$ e_2 \sim \sum_{s} z_s z_s \, \theta_{ss}^\beta + \sum_{s,c} z_s z_c \, \theta_{cs}^\beta + \sum_{c} z_c z_c \, \theta_{cc}^\beta. \tag{2.15} $$
Expanding this result to leading order in $z_s$ and $\theta_{cc}$, using $z_c \sim 1$ and $\theta_{ss} \sim \theta_{cs} \sim 1$, we find
$$ e_2 \sim z_s + \theta_{cc}^\beta. \tag{2.16} $$
Since we have only measured a single observable, $e_2$, it sets the only scale in the jet, and there is no measurement to further distinguish the scalings of soft and collinear particles. We therefore find the scaling of $z_s$ and $\theta_{cc}$ in terms of the observable,
$$ z_s \sim e_2, \qquad \theta_{cc} \sim (e_2)^{1/\beta}. \tag{2.17} $$
More generally, after identifying all parametrically different modes that can contribute to a set of measurements, the scaling of those modes is determined by the measured observables.
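As a minimal worked illustration of this argument (an example added here under the stated assumptions, not a calculation from the original analysis), consider a jet containing only two particles, with energy fractions $z$ and $1-z$ separated by an angle $\theta$. The 2-point correlator then has a single term, and its soft and collinear limits reproduce the scalings $z_s \sim e_2$ and $\theta_{cc} \sim (e_2)^{1/\beta}$:

```latex
% Two-particle jet: energy fractions z and 1 - z, opening angle \theta.
\begin{align*}
  e_2^{(\beta)} &= z\,(1-z)\,\theta^{\beta}, \\
  \text{soft limit } (z = z_s \ll 1,\ \theta \sim 1):
    \quad e_2 &\sim z_s
    \;\Rightarrow\; z_s \sim e_2, \\
  \text{collinear limit } (z \sim 1-z \sim 1,\ \theta = \theta_{cc} \ll 1):
    \quad e_2 &\sim \theta_{cc}^{\beta}
    \;\Rightarrow\; \theta_{cc} \sim (e_2)^{1/\beta}.
\end{align*}
```

In both limits the single measured value of $e_2$ fixes the one small parameter of the configuration, which is the content of the power-counting statement above.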
In this paper, we are interested not only in jets with soft and collinear radiation, but also in jets which have well-resolved substructure. In addition to the strictly soft and collinear modes which are found in Fig. 3a, a jet with well-resolved substructure also includes radiation emitted from the dipoles within the jet, shown in orange for the particular case of a 2-prong jet in Fig. 3b. This radiation is referred to as "collinear-soft" (or just "c-soft") as it has a characteristic angle $\theta_{12}$ defined by the opening angle of the subjets, as well as a momentum fraction $z_{cs} \ll 1$, both of which are set by the measurement. The appropriate EFT description for multi-prong substructure is referred to as SCET$_+$ [55,117-119], and the scaling of the collinear-soft mode is summarized in Table 1b. Using the mode structure of multi-prong jets, it is straightforward to apply power-counting arguments to a wide variety of $n$-prong jet substructure observables, as demonstrated in Secs. 4 and 5.
We also apply power-counting arguments to groomed jets after soft drop has been applied. The effect of the grooming algorithm is not just to remove jet contamination, but also to modify the power counting in interesting, and potentially useful, ways. As discussed in Sec. 2.2, soft drop with $\beta = 0$ is defined with a single parameter $z_{\text{cut}}$, which determines the scale below which soft radiation is removed. To perform a proper power-counting analysis, one should also incorporate the scale $z_{\text{cut}}$ and consider different cases depending on the relative scaling of $z_{\text{cut}}$ and $z_s$. For simplicity, we ignore this complication through most of this paper and assume that the soft drop procedure simply removes the soft modes. That said, the residual soft scaling will matter for the quark/gluon study in Sec. 6. For a more detailed discussion, and a proper treatment of the scale $z_{\text{cut}}$ involving collinear-soft modes, see Refs. [60,61].

Enlarging the Basis of Jet Substructure Observables
An important goal of jet substructure is to design observables that efficiently identify particular features within a jet. A popular, and theoretically well-motivated, approach is to construct observables from combinations, often ratios, of IRC safe jet shapes (footnote 8). Such observables are widely employed at the LHC, and have proven to be both experimentally useful and theoretically tractable. Indeed, the observables reviewed in Sec. 2.1 ($\tau_{2,1}$, $\tau_{3,2}$, $C_2$, and $D_2$) are all of this form.
Essential to this approach is a flexible basis of IRC safe observables from which to build discriminants. While the original energy correlators are indeed a useful basis, they are still somewhat restrictive. For example, the phase space structure of $e_2$ and $e_3$ in Fig. 1a is completely fixed, as are all of the parametric properties inherited from this structure, such that $D_2$ is the only combination that parametrically distinguishes 1- and 2-prong substructure.
In this section, we enlarge the basis of jet substructure observables by defining generalizations of the energy correlation functions, allowing for a more general angular dependence than considered in Eq. (2.1). These new observables are flexible building blocks, which we use in the rest of this paper to identify promising tagging observables using power-counting techniques (footnote 9).
Footnote 8: These ratios are not themselves IRC safe, but are instead Sudakov safe [120,121]. For a discussion of Sudakov safety for the case of $D_2$, see Ref. [58]. For this reason, the ratio observables we construct in this paper cannot be written in the form of Eq. (3.1), even though their ${}_v e_n^{(\beta)}$ ingredients can.
Footnote 9: An alternative approach to identifying specific features within jets is machine learning, which has seen significant recent interest [67-73]. The contrast between these strategies has been dubbed "deep thinking" versus "deep learning". In the deep thinking approach pursued here, the goal is to identify the physics principles that lead to discrimination power, focusing on observables with desirable properties for first-principles calculations. In the deep learning approach, the goal is to use reliable training samples to optimize the discrimination power and, in many cases, visualize the underlying physics. Ultimately, one would want to merge these two approaches, which could help avoid theoretical blindspots in the cataloging of observables and mitigate modeling uncertainties inherent in training samples. Detailed studies in data, ideally with high purity samples, will also be needed for a complete understanding.

General Structure of Infrared/Collinear Safe Observables
In order to engineer the phase space structure of observables to have specific properties, we first need to systematically understand the structure of IRC safe observables that probe $n$-particle correlations. The general structure of an IRC safe observable is shown schematically in Fig. 4, where any IRC safe observable can be constructed from the energy deposits and angular information on the sphere. In the $pp$ case, of course, one typically uses the longitudinally boost-invariant quantities $p_T$ and $R_{ij}$, but the following argument is insensitive to that coordinate change.
Figure 4: Schematic depiction of a hard scattering event. A general IRC safe observable can be constructed by summing over all energy deposits, $E_i$, in an event, with a symmetric angular weighting function depending on the dimensionless unit vectors $\hat p_i$.
As shown in Refs. [122-125], any IRC safe observable can be constructed from the following (complete) basis of observables (footnote 10),
$$ \mathcal{C}_{f_N} = \sum_{i_1} \sum_{i_2} \cdots \sum_{i_N} E_{i_1} E_{i_2} \cdots E_{i_N} \, f_N(\hat p_{i_1}, \hat p_{i_2}, \ldots, \hat p_{i_N}), \tag{3.1} $$
where $E_i$ is the energy of particle $i$, $\hat p_i$ is a dimensionless unit vector describing its direction, and $f_N$ is a symmetric function of its arguments. For IRC safety, we must further demand that the function $f_N$ vanishes when any two particles become collinear. Note that Eq. (3.1) is linear in the energy of each particle and symmetric in the angles. The observables in this basis are referred to in the literature as C-correlators [122-125].
Footnote 10: With a completely generic angular weighting function, $f_N$, this basis is of course overcomplete.
Since the above discussion is completely general, it is not immediately obvious that it is useful for jet substructure studies. Still, Eq. (3.1) has the interesting feature that, while the dependence on the energies is fixed by IRC safety, the angular function $f_N$ is much less restricted and can be chosen for specific purposes. The original energy correlators in Eq. (2.1) are a specific case of Eq. (3.1), where, up to an overall normalization, the angular weighting function is
$$ f_N(\hat p_{i_1}, \hat p_{i_2}, \ldots, \hat p_{i_N}) = \prod_{s < t \in \{i_1, i_2, \ldots, i_N\}} \theta_{st}^\beta. \tag{3.2} $$
The key observation is that by considering alternative angular weighting functions for $n$-point correlators beyond Eq. (3.2), we can define a more flexible basis of observables for jet substructure studies.

New Angles on Energy Correlation Functions
There are many known decompositions of the angular function $f_N$, including Fox-Wolfram moments [126,127] and orthogonal polynomials on the sphere [128], but these are not necessarily optimal for jet substructure. The reason is that jets with well-resolved subjets exhibit a hierarchy of distinct angular scales, so we need to design $f_N$ to identify hierarchical (instead of averaged) features within a jet. As seen in Eq. (3.2), the original energy correlation functions do capture multiple angular scales, but they do so all at once; it would be preferable if $f_N$ could identify one angular scale at a time in order to isolate different physics effects. Furthermore, to make power-counting arguments more transparent, we want $f_N$ to exhibit homogeneous angular scaling, such that each term in Eq. (3.1) has a well-defined scaling behavior without having to perform a non-trivial expansion in the soft and collinear limits.
With these criteria in mind, we can now translate the general language of IRC safe observables into a useful basis for jet substructure studies. The angular function f N has to be symmetric in its arguments, and the simplest symmetric function that preserves homogeneous scaling is the min function. 11 This leads us to the generalized energy correlation functions, which depend on n factors of the particle energies and v factors of their pairwise angles, v e (β) n = where min (m) denotes the m-th smallest element in the list. For a jet consisting of fewer than n particles, v e n is defined to be zero. More explicitly, the three arguments of the generalized energy correlation functions are as follows.
• The subscript n, appearing to the right of the observable, denotes the number of particles to be correlated. This plays the same role as the n subscript for the standard e n energy correlators in Eq. (2.1). 11 The appearance of min can also be viewed as the lowest-order Taylor expansion of a more generic observable, which should be a good approximation in the case of small radius jets. This can be seen explicitly in App. A, where different functional forms are compared that give the same quantitative behavior as the min version here. Another motivation for the min definition is that it naively behaves more similarly to thrust [95] or N -jettiness [91], though we emphasize that v en does not rely on external axes.
• The subscript v, appearing to the left of the observable, denotes the number of pairwise angles entering the product. By definition, we take v ≤ n 2 , and the minimum then isolates the product of the v smallest pairwise angles.
• The angular exponent β > 0 can be used to adjust the weighting of the pairwise angles, as in Eq. (2.1).
For the special case of $v = \binom{n}{2}$, the generalized energy correlators reduce to the standard ones in Eq. (2.1), with 1 e 2 ≡ e 2 , 3 e 3 ≡ e 3 , 6 e 4 ≡ e 4 , and so on for the higher-point correlators.
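The definition above can be made concrete with a brute-force sketch. This is only an illustration under stated assumptions: the function name `generalized_ecf` and the input layout (a list of energy fractions `z` and a symmetric matrix `theta` of pairwise angles) are hypothetical choices made here, and the loop over all n-particle subsets is not how an optimized implementation would proceed.

```python
from itertools import combinations
from math import comb


def generalized_ecf(z, theta, n, v, beta=1.0):
    """Brute-force sketch of the generalized correlator v e_n^(beta).

    z     : list of energy fractions, one per particle (hypothetical layout)
    theta : symmetric matrix, theta[i][j] = pairwise angle between i and j
    n     : number of particles correlated
    v     : number of pairwise angles kept (the v smallest ones)
    """
    if not 1 <= v <= comb(n, 2):
        raise ValueError("v must satisfy 1 <= v <= n choose 2")
    total = 0.0
    # combinations() yields nothing when the jet has fewer than n particles,
    # so the correlator is zero in that case, as in the definition.
    for idx in combinations(range(len(z)), n):
        weight = 1.0
        for i in idx:
            weight *= z[i]
        # all pairwise angles of this n-particle subset, smallest first
        angles = sorted(theta[s][t] ** beta for s, t in combinations(idx, 2))
        product = 1.0
        for a in angles[:v]:  # product of the v smallest angles
            product *= a
        total += weight * product
    return total
```

With v equal to the number of pairs, every angle enters the product and the standard correlator is recovered. The cost grows like the number of n-particle subsets, so this form is only practical for small toy jets.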
Compared to the original energy correlators, the generalization in Eq. (3.3) allows more flexibility in the angular scaling; this simplifies the construction of useful ratios and extends the possible applications of energy correlators. In the case of boosted top tagging, for example, the standard e 4 = 6 e 4 observable involves six different pairwise angles. A decaying boosted top quark, however, does not have six characteristic angular scales, so most of these angles are redundant and only serve to complicate the structure of the observable. This is reflected in the definition of D 3 in Eq. (2.7), which involves three distinct terms [66].
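To make more explicit the definition in Eq. (3.3), we summarize the particular correlators used in our case studies below. For boosted 2-prong tagging in Sec. 5, we use the 2-point energy correlation function
$$ {}_1e_2^{(\beta)} \equiv e_2^{(\beta)} = \sum_{i<j \in J} z_i z_j\, \theta_{ij}^{\beta} , $$ (3.4)
whose definition is unique, since it involves only a single pairwise angle. We also need the 3-point correlators, which have three variants probing different angular structures: 12
$$ {}_1e_3^{(\beta)} = \sum_{i<j<k \in J} z_i z_j z_k \min\left\{ \theta_{ij}^{\beta},\ \theta_{ik}^{\beta},\ \theta_{jk}^{\beta} \right\} , $$
$$ {}_2e_3^{(\beta)} = \sum_{i<j<k \in J} z_i z_j z_k \min\left\{ \theta_{ij}^{\beta}\theta_{ik}^{\beta},\ \theta_{ij}^{\beta}\theta_{jk}^{\beta},\ \theta_{ik}^{\beta}\theta_{jk}^{\beta} \right\} , $$
$$ {}_3e_3^{(\beta)} \equiv e_3^{(\beta)} = \sum_{i<j<k \in J} z_i z_j z_k\, \theta_{ij}^{\beta}\theta_{ik}^{\beta}\theta_{jk}^{\beta} . $$ (3.5)
Interestingly, we are able to construct powerful observables from each of these three 3-point correlators, resulting in different tagging properties.
For boosted top tagging in Sec. 4, we also need the 4-point correlators. There are six possible variants, but we only study three of them in the body of the text:
$$ {}_1e_4^{(\beta)} = \sum_{i<j<k<l \in J} z_i z_j z_k z_l \min\left\{ \theta_{ij}^{\beta},\ \theta_{ik}^{\beta},\ \theta_{il}^{\beta},\ \theta_{jk}^{\beta},\ \theta_{jl}^{\beta},\ \theta_{kl}^{\beta} \right\} , $$
$$ {}_2e_4^{(\beta)} = \sum_{i<j<k<l \in J} z_i z_j z_k z_l \left( \min_{s<t \in \{i,j,k,l\}} \left\{ \theta_{st}^{\beta} \right\} \right) \left( \min^{(2)}_{s<t \in \{i,j,k,l\}} \left\{ \theta_{st}^{\beta} \right\} \right) , $$
$$ \vdots $$
$$ {}_6e_4^{(\beta)} \equiv e_4^{(\beta)} = \sum_{i<j<k<l \in J} z_i z_j z_k z_l\, \theta_{ij}^{\beta}\theta_{ik}^{\beta}\theta_{il}^{\beta}\theta_{jk}^{\beta}\theta_{jl}^{\beta}\theta_{kl}^{\beta} , $$ (3.6)
where min (2) is again the second smallest element in the list. Here, we see the simplicity in the angular structure of 1 e 4 and 2 e 4 , as compared to 6 e 4 which involves all six angles. The vertical dots denote other 4-point correlation functions; we have not found them to be particularly useful, but they might have applications in (and beyond) jet substructure.
When constructing jet substructure observables, it is often desirable to work with ratios that are approximately boost invariant. Since the different generalized correlators probe a different number of energy fractions and pairwise angles, each scales differently under Lorentz boosts. Under a boost γ along the jet axis and assuming a narrow jet, the energies and angles scale as
$$ z_i \to z_i , \qquad \theta_{ij} \to \gamma^{-1}\, \theta_{ij} . $$ (3.7)
This implies that the transformation of v e n under boosts along the jet axis is determined by the number of pairwise angles v,
$$ {}_v e_n^{(\beta)} \to \gamma^{-v\beta}\; {}_v e_n^{(\beta)} . $$ (3.8)
Therefore, another way of interpreting the different v e n is as ways of probing n particle correlations with different properties under Lorentz boosts. The v index therefore broadens the set of boost-invariant combinations that can be formed. Finally, we remark that the definition in Eq. (3.3) is certainly not unique, and we explore a few alternative definitions in App. A that reduce to the min function in collinear limits.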
To further generalize Eq. (3.3) while maintaining homogeneous scaling, one could use different angular exponents depending on the ordering of the angles. For the cases that we consider, though, we find that v e n is sufficiently general to provide excellent performance while keeping the form of the observable (relatively) simple. That said, we expect alternative f N functions to also be useful, and their performance could be studied using the same power-counting techniques pursued here.

New Substructure Discriminants
Our case studies are based primarily on three series of observables formed from the generalized correlators. We summarize their definitions here, and study their discrimination power in the forthcoming sections using both power-counting arguments and parton shower generators.

The M i Series
The M i series of observables is defined as
$$ M_i^{(\beta)} = \frac{{}_1e_{i+1}^{(\beta)}}{{}_1e_i^{(\beta)}} . $$ (3.9)
This observable is dimensionless, being formed as a ratio of dimensionless observables. As can be seen from Eq. (3.8), it is also invariant to boosts along the jet axis, since one angular factor appears in both the numerator and denominator. These observables are constructed to identify i hard prongs, but due to their limited angular structure, they are only effective when acting on suitably groomed jets. The main example of the M i series that we will consider explicitly in this paper is
$$ M_2^{(\beta)} = \frac{{}_1e_3^{(\beta)}}{{}_1e_2^{(\beta)}} , $$ (3.10)
which provides an example of a 2-prong substructure observable that only performs well after grooming. In App. B.1, we briefly discuss the behavior of M 3 for boosted top tagging, where we argue that a more aggressive grooming strategy would be needed to make M 3 performant.

The N i Series
We also define the N i series of observables as
$$ N_i^{(\beta)} = \frac{{}_2e_{i+1}^{(\beta)}}{\left({}_1e_i^{(\beta)}\right)^2} . $$ (3.11)
As with the M i series, the N i series is dimensionless, and from Eq. (3.8), it is boost invariant, as two angular factors appear in both the numerator and denominator. Indeed, the fact that the 2-point correlation function appears squared in the denominator is fixed by boost invariance.
Two particular examples we find useful for this paper are
$$ N_2^{(\beta)} = \frac{{}_2e_3^{(\beta)}}{\left({}_1e_2^{(\beta)}\right)^2} , $$ (3.12)
which is a powerful boosted W/Z/H tagger, and
$$ N_3^{(\beta)} = \frac{{}_2e_4^{(\beta)}}{\left({}_1e_3^{(\beta)}\right)^2} , $$ (3.13)
which is a powerful boosted top tagger on groomed jets. More generally, N i should be effective as an i-prong tagger, as discussed in App. C, at least for groomed jets. The N i observables take their name from the fact that in the limit of a resolved jet, they behave parametrically like the N -subjettiness ratio observables, as discussed in Secs. 4 and 5. Despite their similarity to N -subjettiness, the N i observables achieve their discrimination power in a completely different manner, which has both theoretical and experimental advantages.

The U i Series
Finally, we consider the U i series of observables defined as
$$ U_i^{(\beta)} = {}_1e_{i+1}^{(\beta)} , $$ (3.14)
which are designed for quark/gluon discrimination. Note that unlike M i and N i , the U i observables are not boost invariant. For the case i = 1, U 1 coincides with the usual quark/gluon discriminants formed from the energy correlation functions [74], namely
$$ U_1^{(\beta)} = {}_1e_2^{(\beta)} \equiv e_2^{(\beta)} , $$ (3.15)
which probe single soft particle correlations within the jet. For i > 1, the U i observables probe multi-particle correlations within the jet in a specific way that is useful for quark/gluon discrimination.
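The boost properties quoted for the three series can be checked by counting powers of γ. The following worked equations are a sketch that assumes the narrow-jet scaling in which each pairwise angle contributes one factor of $\gamma^{-\beta}$ (taken here to be the content of Eq. (3.8)), and that assumes the ratio forms $M_i = {}_1e_{i+1}/{}_1e_i$, $N_i = {}_2e_{i+1}/({}_1e_i)^2$, and $U_i = {}_1e_{i+1}$ implied by the descriptions in the text.

```latex
\begin{align*}
{}_v e_n^{(\beta)} &\to \gamma^{-v\beta}\, {}_v e_n^{(\beta)} , \\
M_i^{(\beta)} = \frac{{}_1e_{i+1}^{(\beta)}}{{}_1e_i^{(\beta)}}
  &\to \frac{\gamma^{-\beta}}{\gamma^{-\beta}}\, M_i^{(\beta)} = M_i^{(\beta)} , \\
N_i^{(\beta)} = \frac{{}_2e_{i+1}^{(\beta)}}{\big({}_1e_i^{(\beta)}\big)^2}
  &\to \frac{\gamma^{-2\beta}}{\big(\gamma^{-\beta}\big)^2}\, N_i^{(\beta)} = N_i^{(\beta)} , \\
U_i^{(\beta)} = {}_1e_{i+1}^{(\beta)}
  &\to \gamma^{-\beta}\, U_i^{(\beta)} .
\end{align*}
```

The single leftover factor of $\gamma^{-\beta}$ in the last line is why the U i observables are not boost invariant, while the powers of γ cancel in the M i and N i ratios.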
In this section, we use the generalized energy correlation functions to construct N 3 , a simple but powerful boosted top tagger designed for use on groomed jets. Unlike τ 3,2 , N 3 is defined without reference to external axes, allowing it to achieve better background rejection at high signal efficiencies. Interestingly, in the limit of well-resolved subjets and acting on groomed jets, N 3 has identical power counting to N -subjettiness. The behavior on ungroomed jets is discussed in App. B.2.

Constructing the N 3 Observable
To detect boosted top jets with hard 3-prong substructure, we can use combinations of 2-point, 3-point, and 4-point correlators. Due to the large number of possible combinations, the power counting approach becomes essential to systematically study the behavior of these observables.
Crucially, one must consider not only the case of three subjets with equal energies and opening angles, as shown in Fig. 5a, but also the strongly-ordered limit, shown in Fig. 5b. When the opening angles are hierarchical, the emission modes for each of the dipoles are distinct and must be treated separately, as discussed in Ref. [66]. For lack of a better name, we call these additional modes collinear-collinear-soft modes (shown in magenta in Fig. 5b) to distinguish them from collinear-soft modes (shown in orange). A summary of these different modes, and the scaling of their angles and energies, is given in Table 2. These modes satisfy the relations
$$ z_{cs} \ll z_{ccs} \ll 1 , \qquad \theta_{cc} \ll \theta_{23} \ll \theta_{12} \ll 1 . $$ (4.1)
Note the reversal of the energy and angle hierarchies: collinear-collinear-soft modes have smaller angles but higher energies than collinear-soft modes. With this slight modification, the power-counting analysis proceeds identically to the simpler case shown in Fig. 3b.

Figure 5: Configurations used in the power-counting analysis for N 3 , showing the modes and scales entering the description of the jets. In (a), the three subjets carry equal energies, and there is no hierarchy between the angles. In (b), each of the subjets carries equal energies, but there is a hierarchy in the opening angles of the jets, requiring an extra collinear-collinear-soft mode, shown in magenta, in the power-counting analysis.

Mode        Energy    Angle
soft        z s       1
collinear   1         θ cc
c-soft      z cs      θ 12
cc-soft     z ccs     θ 23
Table 2: A summary of the modes in Fig. 5b which enter the power-counting analysis for boosted top quarks.

Many experimental analyses use jet shapes as measured on groomed jets, even if the original jet shapes were proposed without grooming. Grooming has the advantage of making jet properties resistant to pileup contamination and it also leads to observables that are more stable as the jet mass and p T are varied. More generally, grooming techniques minimize sensitivity to low momentum particles and the corresponding experimental uncertainties associated with their reconstruction. It is also possible to use a combination of groomed and ungroomed (or lightly groomed) substructure discriminants [154,155]. Here, we design our observable specifically for use on groomed jets, since it will help us identify discriminants that are both performant and stable. From the perspective of power counting, grooming simplifies the scaling properties of observables, since we can ignore regions of phase space with soft wide-angle subjets. In the past, such regions caused complications in designing top tagging observables based on energy correlators [66], as seen in the definition of D 3 in Eq. (2.7). After jet grooming, we can drop soft radiation (shown in green in Fig. 5) for the purposes of power counting.
From Eq. (3.6), we have six 4-point correlators we could use to form ratio observables with the 2- and 3-point correlators. To reduce the number of possibilities, we restrict our attention to boost-invariant combinations, but this still leaves many ratios to test. In App. B.3, we outline a systematic strategy to isolate the most promising 3-prong discriminants using power counting. Here, we focus on the most performant observable, which was presented in Sec. 3.3.2 as a member of the N i series.
To understand why N 3 is a powerful discriminant on groomed jets, we need to contrast the phase space for 3-prong signal jets versus 2-prong background jets. For the 3-prong top signal, it is sufficient to study the strongly-ordered limit in Fig. 5b, since the balanced case of Fig. 5a can be obtained by setting z ccs = z cs and θ 23 = θ 12 . Using the methods of Sec. 2.3 on the modes from Table 2, we find the following parametric scaling:
The dominant backgrounds to boosted top quarks are gluon and quark jets, particularly bottom quarks when subjet b-tagging is used [8,[143][144][145][146][147][148][149]. While we ordinarily think of these as being 1-prong backgrounds (see Fig. 3a), they are mainly relevant when they feature 2-prong substructure from a hard parton splitting. Therefore, the phase space configuration we have to consider for the background is that of Fig. 3b. Using the modes from Table 1b, we find 2-prong background (groomed):
From these power-counting relations, we now want to derive the scaling of 2 e 4 versus 1 e 3 . For signal jets, 2 e 4 is always smaller than 1 e 3 , since they share a factor of θ β 23 , but each term in 2 e 4 is also multiplied by a parametrically small quantity. In particular, θ cc ≪ θ 23 by the assumption of Eq. (4.1). For background jets, each term in 2 e 4 is the product of two terms in 1 e 3 , so we have the relation
$$ {}_2e_4 \ll \left({}_1e_3\right)^2 \ \ \text{(3-prong signal)} , \qquad {}_2e_4 \sim \left({}_1e_3\right)^2 \ \ \text{(2-prong background)} . $$
This shows that the particular combination chosen to define N 3 is indeed appropriate, since we can isolate the top signal region by making a cut of N 3 ≪ 1. These phase space relations are shown in Fig. 6a.
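The background scaling can be checked directly from the definition of v e n . The worked equations below are a leading-power sketch under stated assumptions, not a reproduction of the display equations of the text: the groomed 2-prong background jet is taken to contain two hard subjets separated by θ 12 , collinear radiation (energy fraction of order 1, angle θ cc ), and collinear-soft radiation (energy fraction z cs , angle θ 12 ), and only the dominant contribution from each grouping of particles is kept.

```latex
% Groomed 2-prong background: two hard subjets at angle \theta_{12}
\begin{align*}
{}_1e_3^{(\beta)} &\sim \theta_{cc}^{\beta} + z_{cs}\,\theta_{12}^{\beta} , \\
{}_2e_4^{(\beta)} &\sim \theta_{cc}^{2\beta}
   + z_{cs}\,\theta_{cc}^{\beta}\,\theta_{12}^{\beta}
   + z_{cs}^{2}\,\theta_{12}^{2\beta}
   \;\sim\; \left({}_1e_3^{(\beta)}\right)^{2} , \\
N_3^{(\beta)} &= \frac{{}_2e_4^{(\beta)}}{\left({}_1e_3^{(\beta)}\right)^{2}} \;\sim\; 1 .
\end{align*}
```

In this sketch, the term $\theta_{cc}^{2\beta}$ comes from adding two collinear particles to the two hard subjets, the cross term from adding one collinear and one collinear-soft particle, and $z_{cs}^{2}\theta_{12}^{2\beta}$ from adding two collinear-soft particles. Each is a product of two terms of 1 e 3 , so background jets sit at N 3 of order one.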
To further improve our understanding, it is instructive to compare this with the N-subjettiness ratio τ 3,2 , whose phase space is shown in Fig. 6b. For strongly-ordered 3-prong substructure, we find
For 2-prong background jets, we find
Remarkably, in both cases, this leads to the relations
$$ {}_1e_3 \sim \tau_2 , \qquad {}_2e_4 \sim \tau_2\, \tau_3 . $$ (4.9)
Therefore, on groomed jets, the N 3 and τ 3,2 observables are parametrically identical:
$$ N_3 = \frac{{}_2e_4}{\left({}_1e_3\right)^2} \sim \frac{\tau_3}{\tau_2} = \tau_{3,2} . $$ (4.10)
This result is quite surprising. By summing over groups of four particles and taking double products of their pairwise angles, we have achieved an observable that behaves parametrically like an N -subjettiness ratio. The observables N 3 and τ 3,2 achieve their discrimination power in completely different ways, as shown schematically in Fig. 7. Each term in 2 e 4 is sensitive to multiple energies and angles and contains cross terms like θ β 12 θ β cc . By contrast, N -subjettiness does not contain such cross terms; after determining the axes, each term in the N -subjettiness sum is independent of the presence of other subjets. Despite these differences, Eq. (4.9) shows that the 4-point correlation function factorizes into a product of lower-point N -subjettiness observables, yielding the same parametric behavior in the resolved limit.
While there are no parametric differences between N 3 and τ 3,2 , our parton shower study will show that N 3 exhibits improved discrimination power on groomed jets, particularly at high efficiencies. Part of the reason for this is that N 3 is defined without respect to subjet axes. This not only offers the practical advantage of not needing to specify an axes-finding algorithm, but it also has an effect on the behavior of N 3 away from the power-counting regime. Recall that N -jettiness was originally designed to isolate regions of phase space where there are N well-resolved jets [91]. In this limit, the axes are well defined and independent of the particular axes definition up to power corrections. When used in jet substructure, however, N -subjettiness is used both in the limit of well-resolved subjets as well as in the limit of unresolved subjets. Indeed, in many substructure analyses, relatively loose requirements are placed on N -subjettiness, such that the τ 3,2 cut is placed precisely in the unresolved region. Here, N -subjettiness can exhibit pathological behavior related to the axes choice [94]. By contrast, the N 3 observable, being composed simply as sums over the jet constituents, is well behaved throughout the entire jet spectrum, and this will be reflected in its improved performance.

Performance in Parton Showers
Having understood the power counting of N 3 on groomed jets, we now study its behavior in parton shower generators, comparing N 3 with both τ 3,2 and the simplified version of D 3 defined in Eq. (2.8). The comparison to τ 3,2 is particularly interesting, since the parametrics in Eq. (4.10) suggest it should perform similarly to N 3 in the resolved limit.
For our parton shower study, we generate background QCD jets from pp → jj events, where we consider separately the cases of j = g (gluon) and j = u (representative of light quarks). We also consider the case of b-quark backgrounds, which are interesting to treat separately due to recent advances in b-tagged substructure [8,[143][144][145][146][147][148][149]; heavy quarks were generated from the process pp → bb. The boosted top signal is generated from pp → tt events, with both tops decaying hadronically.
Events were generated with MadGraph5 2.3.3 [156] at the 13 TeV LHC and showered with Pythia 8.219 [157,158] with underlying event and hadronization implemented with the default settings. Anti-k T [104] jets with radius R = 1.0 were clustered in FastJet 3.2.0 [84] using the Winner Take All (WTA) recombination scheme [90,159]. 13 The energy correlation functions and N -subjettiness ratio observables were calculated using the EnergyCorrelator and Nsubjettiness FastJet contribs [84,85]. For N -subjettiness, we use one-pass WTA minimization with β = 1. As a concrete example of a groomer, we use β = 0 soft drop [86] (a.k.a. modified mass drop with µ = 1 [41,42]) with z cut = 0.1, though our general observations should be independent of the particular choice of groomer.
13 WTA axes align with a hard prong within the jet. They are nice theoretically, as they avoid recoil due to soft emissions [74,87-90]. For low p T tops, however, the use of WTA axes can potentially lead to lopsided axes. We explicitly checked that our results are unmodified if standard E-scheme recombination is used instead.
[Figure panel residue: panel title "Top vs. b-Quark (Groomed)", Pythia; axis label "Top Efficiency".]
[Figure caption fragment:] Here, light quark jets are used as representative of the background; gluons and b-jets behave similarly. The shift in the signal distribution in the lowest p T J bin is due to a high fraction of top quarks whose decay products are not fully captured by the R = 1.0 jet radius.
As discussed in Sec. 4.1, we focus on the behavior of the observables on groomed jets, where N 3 was designed to perform well and where N 3 behaves parametrically like τ 3,2 . In App. B.2, we study boosted top tagging without grooming, where N 3 is still a reasonably powerful discriminant on ungroomed jets, but not as strong as τ 3,2 . We also discuss the behavior of M 3 in App. B.1, using power-counting arguments to show why it is a poor discriminant with standard groomers, but might perform better with a more aggressive grooming strategy.
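For readers unfamiliar with the groomer used here, the soft drop condition can be sketched in a few lines. This is a toy illustration, not the FastJet implementation: the clustering tree is represented by hypothetical nested dictionaries with keys `pt`, `delta_r`, and `children`, and the function names are invented for this example. The defaults mirror the settings quoted in the text (β = 0, z cut = 0.1, R = 1.0).

```python
def passes_soft_drop(pt1, pt2, delta_r, z_cut=0.1, beta=0.0, r0=1.0):
    """Soft drop condition for one splitting: z > z_cut * (delta_r / r0)**beta."""
    z = min(pt1, pt2) / (pt1 + pt2)
    return z > z_cut * (delta_r / r0) ** beta


def soft_drop(node, z_cut=0.1, beta=0.0, r0=1.0):
    """Walk a (toy) clustering tree from the root, discarding the softer
    branch until a splitting passes the soft drop condition.

    A node is a dict with key 'pt'; internal nodes also carry 'delta_r'
    (angle between the two children) and 'children' (a pair of nodes).
    """
    while node.get("children"):
        a, b = node["children"]
        if passes_soft_drop(a["pt"], b["pt"], node["delta_r"], z_cut, beta, r0):
            return node  # first splitting that is hard enough: stop grooming
        node = a if a["pt"] >= b["pt"] else b  # follow the harder branch
    return node  # no splitting passed: a single particle remains
```

For β = 0 the angular factor is identically one, so the condition reduces to a pure momentum-fraction cut, which is why wide-angle soft radiation is removed regardless of its angle.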
In Fig. 8, we show distributions for groomed N 3 , comparing the top jet signal to the backgrounds of b-quark, light quark, and gluon jets. A groomed mass cut of m SD > 80 GeV is applied, following a recent ATLAS study [160]. Here, we use β = 2 as the angular exponent for N 3 ; power counting does not, in this case, predict a preferred value of β, so it could be optimized for experimental performance. The behavior of these distributions is quite interesting, particularly for p T J > 500 GeV in Fig. 8b, where the top quarks are truly boosted. The signal distribution drops off sharply above N 3 ≈ 1.5, while the background distribution extends to larger values for all three samples. This behavior leads to excellent performance at high signal efficiencies, and is quite different from that of τ 3,2 (see Fig. 24 in App. B.2). Note that these distributions are calculated after the soft drop mass cut, so the region where N 3 exhibits improved performance is the one directly relevant for LHC searches.
In Fig. 9, we show signal efficiency versus background rejection (ROC) curves for boosted top discrimination against b-quark, light quark, and gluon jets. The baseline efficiencies for the m SD > 80 GeV mass selection are
p T J = 200 GeV : E t = 61%, E b = 2.4%, E q = 2.8%, E g = 6.6%, (4.11)
p T J = 500 GeV : E t = 87%, E b = 10%, E q = 10%, E g = 19%, (4.12)
and we normalize the ROC curves to show just the gain in performance from adding a 3-prong substructure cut. Comparing p T J = 200 GeV and p T J = 500 GeV, we conclude that the behavior of N 3 is reasonably robust as a function of p T J (see Fig. 10 for higher p T J values). The simplified version of D 3 with this choice of angular exponent gives rather poor discrimination power, especially for gluon jets; the apparent negative discrimination power for certain ROC curves in Fig. 9 is due to the use of a (non-optimal) one-sided cut. It is also satisfying to see the behavior predicted from the power-counting analysis. At lower top efficiencies, where there are well-resolved jets, N 3 and τ 3,2 exhibit similar discrimination power, but at higher efficiencies, where there are not well-resolved jets, the structure of N 3 leads to considerably improved performance. It would be interesting to see whether these parton shower predictions remain true in LHC data.
Finally, another important feature of the soft-dropped N 3 observable is its stability as the mass and p T of the jet are varied. This has recently been emphasized in Ref. [79] as a highly desirable feature of jet substructure observables, as it removes mass sculpting. In Fig. 10, we show the signal and background distributions for three different values of the jet p T J , namely p T J = {200, 500, 1000} GeV following Ref. [160]. Remarkable stability of the N 3 distribution is seen, with the main distortion appearing for the top sample in the lowest p T J bin, where the R = 1.0 jet radius is not always large enough to capture all of the top decay products.
Between p T J = 500 GeV and 1000 GeV, there are almost no changes to either the signal or background distributions. We conclude that soft-dropped N 3 is a powerful boosted top tagger that exhibits many experimentally desirable features.
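The ROC curves described above, built from a one-sided cut and normalized to jets that already pass the mass selection, can be sketched as follows. The function name `roc_points` and the input lists are hypothetical; the lists are assumed to contain the discriminant value (for example N 3 ) only for jets that passed the mass cut, so the efficiencies measure the gain from the substructure cut alone.

```python
def roc_points(signal, background, cuts):
    """Efficiencies for the one-sided selection 'value < cut'.

    signal, background : discriminant values for jets that already passed
                         the mass selection (so efficiencies are relative
                         to that baseline)
    cuts               : cut values to scan
    Returns a list of (signal_efficiency, background_efficiency) pairs.
    """
    points = []
    for c in cuts:
        eff_s = sum(x < c for x in signal) / len(signal)
        eff_b = sum(x < c for x in background) / len(background)
        points.append((eff_s, eff_b))
    return points
```

Because the cut is one-sided, a discriminant whose signal distribution sits at larger values than the background would give a signal efficiency below the background efficiency for some cuts, which is the origin of the apparent negative discrimination power mentioned in the text for the simplified D 3 .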

New Observables for 2-prong Substructure
Jet substructure techniques have played an increasingly important role in recent LHC searches, especially for new resonances with decays involving boosted W/Z/H bosons [21][22][23][24][25][26][27][28][161][162][163][164][165]. In order to understand any possible hint of new physics in diboson analyses, it is essential to have exceptional control over the behavior of jet substructure discriminants, to allay concerns about possible analysis artifacts [166,167]. In our view, echoing the perspective of Ref. [79], properties like stability with jet p T and resilience to mass sculpting are just as important as (and perhaps more so than) absolute tagging performance.
In this section, we use the generalized correlators to construct robust and performant 2-prong taggers. This is an application where the original energy correlators have already proven useful through the C 2 and D 2 ratios [65,74]. Here, we propose three new ratios: M 2 , N 2 , and D (1,2) 2 . Of these, M 2 and D (1,2) 2 were constructed to only be performant on groomed jets. Therefore, these observables are probes not only of 2-prong jet substructure but also of any grooming procedure applied to the jet. 14

Power-Counting Analysis and Observable Phase Space
The power counting for 2-prong discriminants follows straightforwardly from Sec. 2.3, using the modes summarized in Fig. 3 and Table 1. Since the phase space is much simpler than in the 3-prong case, we can study the behavior of M 2 , N 2 , and D (α,β) 2 both before and after jet grooming.
To begin, we consider the 1-prong background in Table 3 and power count the contributions to e 2 and v e 3 from every possible triplet of soft and collinear modes. We do the same for the 2-prong signal in Table 4, where we also have to consider collinear-soft modes, though we do not show the power-suppressed triplets for brevity. These tables show that while the 3-point correlators have similar behavior for soft particles, they have different behavior for correlations among collinear particles (cf. the first row of Table 3 and the  second and third row of Table 4). This is expected given the different number of pairwise angles in the definition of each v e 3 . We discuss the consequences of this power counting for each of the proposed ratios in the following subsections.

M 2
The observable M 2 is based on 1 e 3 :
$$ M_2^{(\beta)} = \frac{{}_1e_3^{(\beta)}}{{}_1e_2^{(\beta)}} . $$ (5.2)
14 See also Ref. [168] for an example of an observable designed specifically to probe the grooming procedure by measuring non-global correlations, and Ref. [44] for an example of improving discrimination power by understanding the behavior of the grooming procedure.
Table 4: Same as Fig. 3, but for a jet with a resolved 2-prong substructure. The different contributions arise from correlations among soft (S), collinear (C i ), and collinear-soft (C s ) radiation. Power-suppressed contributions are not shown.
We first consider its behavior on 1- and 2-prong jets without grooming. For 1-prong background jets from Table 3, we find
This exhibits a non-trivial phase space with boundaries 1 e 3 ∼ (e 2 ) 2 when the jet is dominated by soft radiation, and 1 e 3 ∼ e 2 when the jet is dominated by collinear radiation. For 2-prong signal jets from Table 4, we find
From the fact that z cs ≪ 1 and θ cc ≪ θ 12 , one therefore finds the inequality 1 e 3 ≪ e 2 . The phase space for M 2 is shown in Fig. 11a. This power-counting analysis demonstrates that before any grooming has been applied, there is considerable overlap between the parametric phase space regions occupied by 1- and 2-prong jets. Therefore, M 2 has limited discrimination power on ungroomed jets. The power-counting analysis also makes clear why M 2 performs so poorly: 1-prong jets are dominated by soft radiation with scaling 1 e 3 ∼ (e 2 ) 2 , which overlaps with the 2-prong signal region with 1 e 3 ≪ e 2 . The fact that this overlap is caused only by soft radiation also suggests that it can be eliminated by applying a jet grooming procedure to remove soft radiation.
In Fig. 11b, we show the phase space for M 2 after grooming. Soft drop removes the z s contributions from Eq. (5.3), which pushes 1-prong background jets to the upper boundary of the phase space with 1 e 3 ∼ e 2 . By contrast, the parametric scaling of the signal jets is unaffected by the soft drop procedure. 15 This yields a triangular phase space that resembles the case of τ 2,1 in Fig. 1b, where 1-prong background jets live on the upper boundary and 2-prong signal jets live in the bulk. Perhaps counterintuitively, the soft drop procedure pushes the background to larger values of M 2 , achieving better discrimination power.
15 As stated at the end of Sec. 2.3, for simplicity we do not power count the grooming parameter z cut . It is well understood how to properly incorporate z cut into the power-counting analysis (see e.g. [60,61]), but this has a negligible impact for understanding the qualitative behavior of 2-prong discriminants.
Figure 11: Parametric phase space for the M 2 observable (a) before grooming and (b) after grooming. The grooming procedure removes wide-angle soft radiation, pushing 1-prong jets to the upper boundary of the phase space.
The M 2 observable therefore provides an interesting example of a discriminant that only performs well after grooming. It emphasizes the parametric effect that grooming procedures can have on radiation within a jet, beyond simply removing jet contamination. For this reason, we expect precision calculations of the M 2 distribution to provide useful insights into the behavior of such grooming procedures.
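The effect of grooming on M 2 can be traced through a short leading-power estimate. The equations below are a sketch under stated assumptions rather than the display equations of the text: soft radiation is taken to have energy fraction z s and angle of order one, collinear radiation energy fraction of order one and angle θ cc , and collinear-soft radiation energy fraction z cs and angle θ 12 ; grooming is modeled simply by setting z s to zero.

```latex
\begin{align*}
\text{1-prong, ungroomed:}\quad
  & e_2 \sim z_s + \theta_{cc}^{\beta} , \quad
    {}_1e_3 \sim z_s^{2} + \theta_{cc}^{\beta}
    \;\Rightarrow\; (e_2)^2 \lesssim {}_1e_3 \lesssim e_2 , \\
\text{1-prong, groomed } (z_s \to 0)\text{:}\quad
  & e_2 \sim \theta_{cc}^{\beta} , \quad
    {}_1e_3 \sim \theta_{cc}^{\beta}
    \;\Rightarrow\; M_2 = \frac{{}_1e_3}{e_2} \sim 1 , \\
\text{2-prong, groomed:}\quad
  & e_2 \sim \theta_{12}^{\beta} , \quad
    {}_1e_3 \sim \theta_{cc}^{\beta} + z_{cs}\,\theta_{12}^{\beta}
    \;\Rightarrow\; M_2 \sim \left(\frac{\theta_{cc}}{\theta_{12}}\right)^{\beta} + z_{cs} \ll 1 .
\end{align*}
```

Before grooming, a soft-dominated 1-prong jet has M 2 of order z s , which is small and therefore overlaps with the signal; once the soft term is removed, the background is pushed to M 2 of order one while the signal stays at small values.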

N 2
The observable N 2 is based on 2 e 3 ,
$$ N_2^{(\beta)} = \frac{{}_2e_3^{(\beta)}}{\left({}_1e_2^{(\beta)}\right)^2} . $$ (5.5)
The power-counting argument for N 2 closely parallels M 2 . We will see that the phase space for N 2 is parametrically unmodified by the grooming procedure, making it performant on both groomed and ungroomed jets. We again begin by analyzing the parametric behavior of the observable on ungroomed jets. Using Table 3 for 1-prong background jets, we find
In contrast to M 2 , the 1-prong background jets exhibit only a single scaling, 2 e 3 ∼ (e 2 ) 2 , for jets dominated by either soft or collinear radiation. Using Table 4 for 2-prong signal jets, we find
$$ {}_2e_3 \sim z_s\, \theta_{12}^{\beta} + z_{cs}\, \theta_{12}^{2\beta} + \theta_{cc}^{\beta}\, \theta_{12}^{\beta} . $$ (5.7)
Signal jets satisfy the inequality 2 e 3 ≪ (e 2 ) 2 , explaining the definition of the N 2 observable. The phase space before grooming is summarized in Fig. 12a, where there is clear separation between 1-prong background jets, which live on the upper boundary of the phase space, and 2-prong signal jets, which live in the bulk of the phase space, again resembling the case of τ 2,1 in Fig. 1b.
Figure 12: Same as Fig. 11 but for N 2 . The grooming procedure does not modify the scaling of the phase space, so (a) and (b) are identical as far as power counting is concerned. Therefore, the N 2 observable exhibits good discrimination power both before and after grooming is applied.
Because the 1-prong background jets have a single scaling, removing z s from Eq. (5.6) has no effect on the parametric phase space. Similarly, removing z s from Eq. (5.7) does not change the parametrics of the 2-prong signal. Therefore, N 2 behaves more similarly to other 2-prong discriminants in the literature, since its discrimination power does not come entirely from the grooming procedure. The power-counting analysis also suggests that N 2 should be a powerful 2-prong discriminant both before and after grooming is applied; this will be verified in the parton shower studies below.
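The insensitivity of N 2 to grooming can be made explicit by dividing the signal scaling of Eq. (5.7) by the square of e 2 . The estimate below is a sketch that assumes e 2 ∼ θ β 12 for a resolved 2-prong jet and z s ≪ θ β 12 , and that uses the same mode assignments as the tables referenced in the text; the 1-prong line is our own leading-power estimate.

```latex
\begin{align*}
\text{2-prong signal:}\quad
  & N_2 = \frac{{}_2e_3}{(e_2)^2}
    \sim \frac{z_s}{\theta_{12}^{\beta}} + z_{cs}
       + \left(\frac{\theta_{cc}}{\theta_{12}}\right)^{\beta} \ll 1 , \\
\text{2-prong signal, groomed } (z_s \to 0)\text{:}\quad
  & N_2 \sim z_{cs} + \left(\frac{\theta_{cc}}{\theta_{12}}\right)^{\beta} \ll 1 , \\
\text{1-prong background:}\quad
  & {}_2e_3 \sim z_s^{2} + z_s\,\theta_{cc}^{\beta} + \theta_{cc}^{2\beta}
    \sim (e_2)^2
    \;\Rightarrow\; N_2 \sim 1 .
\end{align*}
```

Dropping z s changes individual terms but not the conclusion on either line: the signal stays at N 2 ≪ 1 and the background at N 2 of order one, in contrast to the M 2 case.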
It is also interesting to contrast the N 2 phase space in Fig. 12a with that of D 2 in Fig. 1a. For D 2 , the background jets are bounded by two different scaling behaviors, whereas for N 2 , the background jets exhibit a single scaling and therefore live entirely on the boundary of phase space. Since this boundary is purely geometric, the N 2 distributions are remarkably insensitive to the mass or p T of the jet, even before grooming is applied.
[Figure caption fragment:] Only after grooming is applied can the 1- and 2-prong regions of phase space be separated.
Just as N 3 is related to τ 3,2 (see Sec. 4.1), N 2 behaves parametrically like τ 2,1 in the resolved limit. The power-counting analysis proceeds identically as for N 3 and will not be repeated here; see App. C for the general argument relating N i to τ i,i−1 . We want to emphasize again that, in analogy to Fig. 7, N 2 exhibits τ 2,1 -like behavior without reference to any axes within the jet. It therefore does not exhibit the axes pathologies that arise for N -subjettiness in the limit of unresolved substructure, and N 2 can therefore be expected to have improved performance compared to τ 2,1 , particularly at high efficiencies.

D (1,2) 2
Our final example of a 2-prong discriminant is based on 3 e 3 = e 3 , where we reconsider the D 2 observable with two distinct angular exponents,
$$ D_2^{(\alpha,\beta)} = \frac{{}_3e_3^{(\alpha)}}{\left({}_1e_2^{(\beta)}\right)^{3\alpha/\beta}} . $$ (5.8)
The case of α = β was first defined in Ref. [65] and analytically calculated in Ref. [58]. While the phase space for D (α,β) 2 was discussed in detail in Ref. [58], we focus on the impact that α ≠ β has on groomed jet discrimination.
With distinct angular exponents α and β, 1-prong background jets exhibit the scaling The background therefore occupies a non-trivial phase space with boundaries e As discussed in Ref. [58] for ungroomed jets, the observable D (α,β) 2 only provides good discrimination between 1-prong and 2-prong jets for 3α > 2β . (5.11) When this relation is violated, the phase space regions for signal and background jets overlap. This is shown in Fig. 13a, where contours of D (α,β) 2 cannot separate the 1-and 2-prong regions when Eq. (5.11) is violated. After a grooming procedure is applied, though, the overlapping phase space region is removed, as shown schematically in Fig. 13b. Now the constraint in Eq. (5.11) no longer applies, and the angular exponents can be chosen with a particular focus on discrimination power on groomed jets.
The choice of (α, β) exponents could be tuned to optimize performance, but we advocate that α = 1, β = 2 is a natural choice for groomed 2-prong discrimination. This choice explicitly violates Eq. (5.11), so D such that a cut on the jet mass, or p T , is effectively a cut on e 2 . Without grooming, one would typically take α = 2, but with grooming, one can lower the angular exponent α to 1 to more directly probe collinear emissions. Importantly, by considering the observable with separate α and β exponents, we are able to satisfy both the requirement that it behaves sensibly under a mass cut, as well as improve its sensitivity to collinear emissions. This is not possible with the α = β version of the D 2 observable, and indeed, D on groomed jets at the LHC.

Performance in Parton Showers
We now perform a parton shower study to verify the predictions of the above power-counting analysis. It is useful to briefly summarize our robust predictions:
• The M 2 observable should provide little discrimination power before grooming, but will act as a powerful discriminant after the removal of wide-angle soft radiation.
• The N 2 observable will act as a powerful discriminant both before and after grooming, matching the behavior of τ 2,1 in the resolved limit.
• The D (1,2) 2 observable will behave similarly to M 2 , providing good discrimination power only after grooming has been applied.
These predictions rely only on parametric scalings and are therefore independent of the implementation details of the perturbative parton shower or the hadronization model. For conciseness, we only show results generated with Pythia 8.219, though we used Vincia 2.0.01 [169][170][171][172][173][174][175] to check that the same results could be obtained with an alternative perturbative shower. We have not yet studied hadronization uncertainties, but we expect them to be small, particularly for groomed jets.
To verify these power-counting predictions, we use the same analysis and generation strategy as Sec. 4.2, again using a jet radius of $R = 1.0$. We generate background QCD jets from $pp \to Zj$ events, where we consider separately the cases of $j = g$ (gluon) and $j = u, d, s$ (light quark), letting the $Z$ decay leptonically to avoid additional hadronic activity. The 2-prong signal of boosted $Z$ bosons is generated from $pp \to ZZ$ events, with one $Z$ decaying leptonically and the other decaying to light quarks, $q = u, d, s$. We do not address in this paper the issue of sample dependence and the impact of color connections to the rest of the event. While it would be interesting to compare the discrimination power of $N_2$ against the more prevalent $pp \to jj$ background, we expect the conclusions from $pp \to Zj$ to be robust, especially after grooming has been applied.
For concreteness, we always set the angular exponent in the energy correlator to β = 2, such that a mass cut directly corresponds to a cut on the denominator of the observable, see Eq. (5.12). While this is a nice theoretical feature, it is by no means necessary, and the value of β could be optimized for experimental performance. To focus on the phase space where tagging performance actually matters, we place a cut of m ∈ [80, 100] GeV for all of the ungroomed distributions and a cut of m SD ∈ [80, 100] GeV for all of the groomed distributions. We only present distributions with a cut of p T > 500 GeV, though other p T ranges exhibit similar behaviors.
In Fig. 14, we show normalized distributions of M 2 , N 2 , and D (1,2) 2 before and after soft drop grooming. Despite all being derived from 3-point correlators, they exhibit rather different behaviors. As expected, M 2 is a poor discriminant before grooming is applied; amusingly, the distributions of the Z boson signal and quark jet background are essentially identical. As predicted by the power-counting analysis, the soft drop grooming procedure pushes the background M 2 distributions to larger values while leaving the signal distribution largely unmodified. We are not aware of another substructure discriminant with such a dramatic shift in behavior after jet grooming.
Turning to N 2 , it exhibits good discrimination power both before and after grooming is applied, even though the shapes of the distributions are substantially modified by grooming. Before grooming, the N 2 distribution exhibits a sharp edge at its upper boundary. This arises because 1-prong background jets have a single parametric scaling and are therefore compressed along the upper boundary of the phase space (see Fig. 12). After grooming, N 2 remains a powerful discriminant, as the phase space is parametrically unchanged by the grooming procedure. As expected, the peak values of the distributions decrease as soft radiation is groomed away, but the range spanned by the distribution remains approximately  constant. This highlights the fact that parametric arguments give robust predictions about the boundaries of phase space but not the specific shapes of the distributions. In App. C, we also verify that N 2 and τ 2,1 exhibit the same parametric behaviors in the resolved limit.
Finally, the $D_2^{(1,2)}$ observable, while only a fair discriminant before grooming, exhibits good discrimination power after soft drop is applied. We have therefore seen that all of the power-counting predictions are borne out in the parton shower generators, suggesting that parametric scalings dominate the behavior of these observables, at least for the purposes of 2-prong substructure tagging. From Fig. 14, we see that some of the observables behave quite differently for the quark and gluon samples. We revisit the possibility of using ${}_v e_3$ for quark/gluon discrimination in Sec. 6, where we introduce the $U_2$ observable, which is based on ${}_1 e_3$, similar to $M_2$.
To study the discrimination power more quantitatively, we show ROC curves before and after grooming in Fig. 15, considering the quark and gluon backgrounds separately. The baseline efficiencies for the ungroomed and groomed mass selections are
$$ \begin{aligned} m \in [80, 100]~\text{GeV}: &\quad E_Z = 27\%, \quad E_q = 17\%, \quad E_g = 15\%, \\ m_{\rm SD} \in [80, 100]~\text{GeV}: &\quad E_Z = 37\%, \quad E_q = 2.6\%, \quad E_g = 4.3\%, \end{aligned} \tag{5.13} $$
where we again normalize the ROC curves to show only the gains from the new 2-prong discriminants.$^{16}$ We use $D_2$ (with $\beta = 2$) as a standard reference, since it is currently used by the ATLAS experiment for its excellent tagging performance [5, 6, 9, 24, 26-28, 161, 162, 176, 177].$^{17}$

Of the three new observables, only $N_2$ is designed to act as a discriminant on ungroomed jets. In both Figs. 15a and 15c, we see that $N_2$ outperforms the standard $D_2$ observable in discriminating against both quark and gluon jets. From power-counting arguments, we cannot predict the relative performance between the quark and gluon samples, but the fact that $N_2$ sees significant performance gains on the gluon sample is very encouraging. As discussed in Sec. 5.1, the discrimination power of $N_2$ is closely related to $\tau_{2,1}$ in the resolved limit, but with an improved behavior in the transition to the unresolved region. We discuss this relation in more detail in App. C, showing that $N_2$ has slightly improved performance compared to $\tau_{2,1}$ on ungroomed jets, but considerably improved performance after grooming.
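As a sketch of how ROC curves normalized to a mass selection can be built, the snippet below scans a one-sided cut on an observable for jets that have already passed the mass window, and then folds in the baseline mass-cut efficiency separately. The function names and the sample values are hypothetical; the only number taken from the text is the groomed signal baseline of 37%.

```python
def roc_points(signal, background):
    """(signal efficiency, background mistag) for every cut 'observable <= cut'."""
    points = [(0.0, 0.0)]
    for cut in sorted(set(signal) | set(background)):
        sig_eff = sum(s <= cut for s in signal) / len(signal)
        bkg_eff = sum(b <= cut for b in background) / len(background)
        points.append((sig_eff, bkg_eff))
    return points


def mistag_at(points, target_signal_eff):
    """Lowest background mistag rate among cuts that reach the target signal efficiency."""
    return min(bkg for sig, bkg in points if sig >= target_signal_eff)


def total_efficiency(mass_cut_eff, tagger_eff):
    """Overall efficiency when the ROC curve is normalized to jets passing the mass cut."""
    return mass_cut_eff * tagger_eff


# Made-up observable values for jets that already pass the mass window;
# 2-prong signal jets are assumed to sit at smaller values than the background.
signal = [0.10, 0.20, 0.30, 0.40]
background = [0.25, 0.35, 0.45, 0.55]
points = roc_points(signal, background)
print(mistag_at(points, 0.75))
print(total_efficiency(0.37, 0.75))
```

Keeping the mass-cut efficiency as a separate factor mirrors the normalization described above, so that the curve itself shows only the gain from the substructure discriminant.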
After jet grooming, shown in Figs. 15b and 15d, all three new observables offer improved discrimination power over $D_2$. Comparing the results before and after grooming, we see dramatic gains in performance for $M_2$ and $D_2^{(1,2)}$, as expected from power counting. It is rather curious that after grooming, all three observables offer comparable discrimination power, even though they are based on ${}_v e_3$ correlators with different characteristic behaviors. It would be interesting to study the correlations between these observables to see if they are probing complementary physics effects. Such correlations go beyond the power-counting analysis of this paper, so we leave a study to future work.

$^{16}$ Note the improved signal significance in the groomed case, which offsets the apparent decrease in discrimination performance when comparing the ungroomed and groomed ROC curves.
$^{17}$ Note that ATLAS uses $D_2$ after jet trimming [78], which has a similar parametric behavior to $D_2$ after soft drop in the region we are considering.

Figure 15: ROC curves for boosted $Z$ boson (left column) before grooming and (right column) after grooming. The discrimination power is shown against (top row) quark jets and (bottom row) gluon jets. As predicted by power counting, the application of grooming greatly modifies the relative performance of the different observables. Note that an ungroomed mass cut is applied in the left column, while a groomed mass cut is applied in the right column. Efficiencies from these mass cuts are given in Eq. (5.13). See Fig. 27 in App. C for a comparison to $\tau_{2,1}$, and see Fig. 28 in App. D for a hybrid strategy using a groomed mass cut but ungroomed discriminants.
Thus far, we have only considered observables measured entirely on either groomed or ungroomed jets. Experimentally, though, it may be desirable to measure ungroomed observables after the application of a groomed mass cut (see e.g. [178]); we refer to this as a "hybrid" strategy. In App. D, we present ROC curves for M 2 , N 2 , D 2 , and τ 2,1 using this hybrid strategy and analyze their behavior using power counting. We leave a more detailed study of the optimal use of mixed groomed/ungroomed observables to future work.

Stability in Parton Showers
In addition to their absolute performance, our new 2-prong discriminants exhibit stable behavior, especially after grooming. As recently emphasized in Ref. [79], stability of background distributions as a function of mass and p T cuts is an important consideration when designing jet substructure observables. Excessive dependence on jet mass and p T can lead to mass sculpting, which can increase systematic uncertainties in sideband fits, counteracting gains from improved tagging performance.
To illustrate how the phase space structure controls the stability of the observable, it is interesting to study the stability of $D_2$, $M_2$, and $N_2$ before and after grooming. These three observables represent the three scaling behaviors we have encountered in this paper. Prior to grooming, we have:
• $D_2$ in Fig. 1a: The background occupies a non-trivial phase space region that does not overlap with the signal.
• $M_2$ in Fig. 11: The background occupies a non-trivial phase space region overlapping with the signal.
• $N_2$ in Fig. 12: The background is confined to a single scaling on the boundary of phase space.
The D (1,2) 2 observable has a similar phase space structure to M 2 , and will therefore behave similarly, so we do not show it explicitly in this section. Note that τ 2,1 has the same phase space structure as N 2 , so it exhibits related stability properties.
In Fig. 16, we use parton showers to test the stability of $D_2$, $M_2$, and $N_2$ on the light quark background as the jet mass cut is varied.$^{18}$ Prior to grooming, only the $N_2$ observable exhibits any degree of stability on the background. After grooming, all three observables have a nicely stable peak position and shape, and the residual variation could be compensated using the decorrelation technique of Ref. [79]. We can now use a power-counting analysis to demonstrate how these behaviors are dictated by the form of the phase space. Although we focus on light quark jets in Fig. 16, similar stability properties are observed for gluon jets. This is expected from the power-counting argument, which is insensitive to the quark or gluon nature of the jet.
We begin by considering the observables before grooming. For $D_2$ in Fig. 1a, the background region is defined by two different scalings, one of which defines the upper boundary of the phase space and one of which defines the scaling of the boundary between the signal and background, and therefore the scaling of the desired cut value for discrimination. The upper boundary of the phase space is defined by the scaling $e_3 \sim (e_2)^2$, leading to the maximum value
$$ D_2^{\max} \sim \frac{(e_2)^2}{(e_2)^3} = \frac{1}{e_2} . \tag{5.14} $$
Simplifying to the case of $\beta = 2$, and using Eq. (5.12), we have
$$ D_2^{\max} \sim \frac{p_{TJ}^2}{m_J^2} , \tag{5.15} $$
which depends sensitively on $m_J$ and $p_{TJ}$. This behavior can be clearly seen in Fig. 16a, where the $D_2$ distribution shifts dramatically with the jet mass cut, an undesirable feature for the purposes of sideband calibration.

For $M_2$, with the phase space given in Fig. 11, we see quite different behavior. In this case, the upper boundary of the phase space is defined by ${}_1e_3 \sim e_2$, and therefore $M_2$ has a maximum value
$$ M_2^{\max} \sim \frac{e_2}{e_2} \sim 1 , $$
which is largely independent of the jet mass, $p_T$, and the angular exponent $\beta$. Stability of the maximal value (endpoint), though, is not sufficient to guarantee stability of the distribution. Indeed, the scaling of the lower boundary of the phase space for the background is ${}_1e_3 \sim (e_2)^2$, so we expect a sharp drop in the background, and therefore a peak in the distribution, around
$$ M_2^{\rm peak} \sim \frac{(e_2)^2}{e_2} = e_2 \sim \frac{m_J^2}{p_{TJ}^2} \quad (\beta = 2) , $$
which depends sensitively on $m_J$ and $p_{TJ}$, but in exactly the opposite way as $D_2$. This behavior is observed in Fig. 16c.

Finally, for $N_2$ shown in Fig. 12, the background region is defined by a single scaling, namely ${}_2e_3 \sim (e_2)^2$, which defines the upper boundary. Since there is a single scaling, we expect the peak for the background distribution to be defined by the same scaling. This means that $N_2$ has a maximum value and a peak location that both scale like
$$ N_2^{\max} \sim N_2^{\rm peak} \sim \frac{(e_2)^2}{(e_2)^2} \sim 1 , \tag{5.19} $$
which is largely independent of the jet mass, $p_T$, and the angular exponent $\beta$. This is well verified in the parton shower analysis, as shown in Fig. 16e. Thus, we see that by carefully engineering the phase space of an observable, one can achieve properties, such as stability, that are important experimentally. In this specific case, the stability of the full $N_2$ distribution gives further evidence that $N_2$ is a promising 2-prong tagger, even without grooming.

$^{18}$ We could alternatively vary the cut on the jet $p_T$. From the power-counting analysis, all stability properties are determined by functions of the ratio $m_J/p_{TJ}$, and therefore it is straightforward to understand the $p_{TJ}$ dependence from the $m_J$ dependence.
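The ungroomed scalings discussed above can be tabulated numerically as a rough illustration. This sketch assumes $\beta = 2$ with $e_2 \simeq m_J^2/p_{TJ}^2$, the ratio definitions $D_2 = e_3/(e_2)^3$, $M_2 = {}_1e_3/e_2$, and $N_2 = {}_2e_3/(e_2)^2$, and sets all $O(1)$ coefficients to unity, so only the trend with the mass cut is meaningful. The function names are hypothetical.

```python
def e2_from_mass(m_jet, pt_jet):
    """Parametric relation for beta = 2: e2 ~ m_J^2 / pT_J^2."""
    return (m_jet / pt_jet) ** 2


def ungroomed_scalings(m_jet, pt_jet):
    """Parametric endpoints and peaks before grooming, with O(1) factors set to 1."""
    e2 = e2_from_mass(m_jet, pt_jet)
    return {
        "D2_max": 1.0 / e2,  # upper boundary e3 ~ e2^2, divided by e2^3
        "M2_max": 1.0,       # upper boundary 1e3 ~ e2, divided by e2
        "M2_peak": e2,       # lower boundary 1e3 ~ e2^2, divided by e2
        "N2_max": 1.0,       # single scaling 2e3 ~ e2^2, divided by e2^2
    }


# Scan the mass cut at fixed jet pT (GeV); values are illustrative.
for mass in (80.0, 90.0, 100.0):
    scalings = ungroomed_scalings(mass, 500.0)
    print(mass, {key: round(val, 3) for key, val in scalings.items()})
```

In this toy scan the $D_2$ endpoint and the $M_2$ peak move in opposite directions as the mass cut changes, while the $N_2$ entries stay fixed, which is the pattern described in the text.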
After grooming away soft radiation, we see from Figs. 16b, 16d, and 16f that all the distributions are stable, and from our power counting analysis, it is easy to understand why this is true. For D 2 , grooming has a dramatic impact (note the change in the x-axis range), since it removes the region of phase space that leads to the undesired scaling behavior in Eq. (5.15) (see also Fig. 2a). In this way, the endpoint for groomed D 2 (as well as the whole distribution) becomes remarkably robust to the jet mass cut. For the M 2 observable, the grooming removes the background in the bulk of phase space and pushes it to the upper boundary, as shown in Fig. 11, stabilizing the peak of the M 2 distribution but leaving the endpoint largely unchanged. After jet grooming, the parametric phase space for N 2 is unmodified, so the endpoint and peak scaling in Eq. (5.19) should not change. Comparing Figs. 16e and 16f, we see that the specific value of the N 2 endpoint and peak is modified, but the stability with varying mass cut is robust.
Therefore, in all cases after grooming, the relevant endpoint or peak of the background distribution is parametrically independent of $m_J$ and $p_{TJ}$. This demonstrates three distinct ways of generating a stable distribution: engineering the background phase space to directly have the desired boundary (e.g. $N_2$), or grooming soft radiation to stabilize the boundary (e.g. $D_2$) or the peak (e.g. $M_2$) of the background distribution. It is important to emphasize that the power-counting analysis can only identify the power-law scaling of the distribution in $m_J$ or $p_{TJ}$. Removing this power-law scaling does not, however, guarantee complete numerical stability of the distribution. For this, techniques such as designing decorrelated taggers (DDT) [79] can be used. We expect that methods like DDT will be most powerful when applied to variables that are already naturally stable, but we leave a study to future work.

Improving Quark/Gluon Discrimination
A major challenge in the field of jet substructure is reliable quark/gluon discrimination. Despite its many potential applications, there has been significant difficulty both in understanding the behavior of quark/gluon discriminants in parton showers and in developing analytically tractable observables that surpass the Casimir scaling limit (see Eq. (6.1) below). For detailed discussions of these issues, we refer the reader to Refs. [43, 74, 179-182], as well as to studies in data [10, 183-185]. Quark/gluon discrimination has mostly been studied using IRC safe observables, such as the angularities [131, 186] or 2-point energy correlation functions $C_1 = e_2$ [74], which are set by a single emission at LL accuracy.$^{19}$ At LL order, and ignoring nonperturbative effects, one can show that the discrimination power of such observables is set by the Casimir scaling relation
$$ \text{disc}(x) = x^{C_A/C_F} , \tag{6.1} $$
where $x$ is the fraction of quarks retained by the cut and $\text{disc}(x)$ is the fraction of gluons retained. In this way, discrimination power is capped by the ratio of the gluon and quark color charges, $C_A/C_F = 9/4$. Casimir scaling arises because after a single emission, the discrimination power is set only by the color factor associated with the hard jet core, independent of the particular details of the observable. Beyond LL accuracy, where one is sensitive to physics beyond the leading emission, improved discrimination power is observed. In Ref. [74], an analytic calculation of $C_1$ was performed at NLL accuracy, and a noticeable increase in discrimination power beyond the Casimir limit was found for $\beta < 1$ (though not confirmed in an ATLAS study [185]). For small values of $\beta$, however, one is highly sensitive to nonperturbative effects, which must be modeled or extracted from data. Particularly for gluon jets, which are not well constrained by LEP event shape data [201-204], this leads to significant discrepancies between distributions obtained from different parton shower generators.$^{20}$ This in turn leads to rather large uncertainties in the predicted quark/gluon efficiencies; see Refs. [43, 182] for detailed studies.

$^{19}$ Important exceptions are (IRC unsafe) multiplicity-based observables, which have a long history in QCD [187-198] (see [199] for a recent experimental study), and more recently, shower deconstruction [200].
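For orientation, the Casimir scaling baseline can be evaluated directly. The sketch below assumes the standard LL form $\text{disc}(x) = x^{C_A/C_F}$ with $C_F = 4/3$ and $C_A = 3$, where $x$ is the quark efficiency and the result is the gluon efficiency at the same cut. The function names are illustrative.

```python
C_F = 4.0 / 3.0  # quark color factor
C_A = 3.0        # gluon color factor


def casimir_gluon_efficiency(quark_eff):
    """Fraction of gluons retained at a given quark efficiency under Casimir scaling."""
    return quark_eff ** (C_A / C_F)


def casimir_gluon_rejection(quark_eff):
    """Inverse gluon efficiency, the quantity usually quoted as gluon rejection."""
    return 1.0 / casimir_gluon_efficiency(quark_eff)


for x in (0.3, 0.5, 0.7):
    print(f"quark eff {x:.1f}: gluon eff {casimir_gluon_efficiency(x):.3f}, "
          f"rejection {casimir_gluon_rejection(x):.2f}")
```

Any single-emission observable at LL accuracy lands on this curve, so it serves as the reference against which multi-emission observables can be compared.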
Given the Casimir scaling limit of single-emission observables, a promising approach for improving quark/gluon discrimination is to design observables that are directly sensitive to multiple emissions within the jet, even at lowest order. In this section, we define a series of observables U i specifically intended for this purpose. Since these observables exhibit different behavior from standard single-emission observables, they may also prove useful in improving the parton shower description of quark and gluon jets. We will particularly emphasize the stability of their discrimination power as a function of the angular exponent β, which could be helpful for disentangling perturbative and nonperturbative effects.

Probing Multiple Emissions with U i
A standard observable for quark/gluon discrimination is the 2-point energy correlation function $e_2$, whose scaling was derived already in Eq. (2.17) for 1-prong jets:
$$ e_2^{(\beta)} \sim z_s + \theta_{cc}^{\beta} . \tag{6.2} $$
As discussed, $e_2$ is set at LL accuracy by a single emission from the hard core. Note that the scaling is the same for quarks and gluons, since $C_F = 4/3$ versus $C_A = 3$ is not a parametric difference between the samples. To go beyond this single-emission behavior, we consider the 3-point correlators, ${}_v e_3$, which explicitly probe two emissions from the hard jet core. Using the modes in Table 1a, we derive the following scalings (which were already given in Sec. 5.1):
$$ {}_1e_3^{(\beta)} \sim z_s^2 + \theta_{cc}^{\beta}, \qquad {}_2e_3^{(\beta)} \sim z_s^2 + \theta_{cc}^{\beta} z_s + \theta_{cc}^{2\beta}, \qquad {}_3e_3^{(\beta)} \sim z_s^2 + \theta_{cc}^{\beta} z_s + \theta_{cc}^{3\beta} . \tag{6.3} $$
We can draw a number of interesting conclusions from Eq. (6.3). First, in the majority of phase space there is a direct relationship between the last two 3-point correlators and the 2-point correlator: ${}_2e_3 \sim (e_2)^2$ and ${}_3e_3 \sim (e_2)^2$.$^{21}$ We therefore do not expect ${}_2e_3$ or ${}_3e_3$ to yield improved quark/gluon discrimination power compared to $e_2$; this illustrates the importance of understanding parametric correlations between different observables. By contrast, ${}_1e_3$ does not obey such a relation to $e_2$, since only for ${}_1e_3$ is the cross term $\theta_{cc}^{\beta} z_s$ power suppressed. Since ${}_1e_3$ directly probes the double-soft limit of a jet, without soft/collinear cross talk at leading power, we can expect it to carry more information about the flavor of the jet's initiating parton. This intuition will be verified in our parton shower study.
Another interesting feature of 1 e 3 is the relative scaling between the collinear and soft modes, as can be seen from comparing Eq. (6.3) to Eq. (6.2). To improve quark/gluon efficiency with e 2 , one typically needs to use small values of the angular exponent β. Since 1 e 3 already has a suppressed soft scaling, it can achieve good quark/gluon discrimination at comparatively higher values of the angular exponent. In the parton shower study below, we will find that the performance of 1 e 3 with β = 2 is comparable to e 2 with β = 0.2. This relative scaling also modifies the structure of nonperturbative corrections, although we will not discuss this aspect further in this paper. 22 Note that the discrimination power as a function of β is not a prediction of power counting and can only be obtained by explicit calculations (or measurements) of the distributions.
Seeing the potential of ${}_1e_3$, it is natural to consider higher-point correlators. For an $(n+1)$-point correlator, we have
$$ {}_1e_{n+1}^{(\beta)} \sim z_s^{n} + \theta_{cc}^{\beta} , \tag{6.4} $$
which probes the $n$-soft limit, again without soft/collinear cross talk at leading power. We are therefore led to define the $U_i$ series of observables,
$$ U_i^{(\beta)} = {}_1e_{i+1}^{(\beta)} , $$
for quark/gluon discrimination. More generally, we hope that these observables will prove useful for probing the structure of the QCD shower. From the power counting in Eq. (6.4), we see that the scaling of the soft modes for $U_i$ depends on the index $i$ as $z_s^i$. One might therefore naively think that after grooming is applied, all the $U_i$ observables would be identical. This is not the case for a fixed value of $z_{\rm cut}$, however, since the soft scale increases as a function of $i$. To emphasize this point, the average values of $U_i$ are typically $U_2 = 0.05$ and $U_3 = 0.01$ (see Fig. 18 from our parton shower study below). By Eq. (6.4), these correspond to $z_s$ values of $z_s \simeq 0.25$ and $z_s \simeq 0.4$, respectively, both of which are well above the $z_{\rm cut} = 0.1$ scale that we use as our grooming benchmark. Therefore, the emissions that dominate the $U_2$ and $U_3$ distributions are not actually removed by our grooming procedure. Thus, the behavior of $U_i$ is expected to be more resilient to grooming for larger values of $i$.

$^{21}$ Because of the $\theta_{cc}^{3\beta}$ term, this parametric relation is strictly speaking not true for ${}_3e_3$, but the difference is power suppressed in much of the phase space.
$^{22}$ Our reluctance to weigh in on nonperturbative corrections is because a standard shape function analysis [205-209], which is applicable for $e_2$, does not hold for ${}_1e_3$. In future work, we might hope to extend the shape function logic to non-additive observables like ${}_1e_3$.
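A minimal sketch of the $U_i$ series follows, assuming $U_i$ is the $(i+1)$-point correlator weighted by the single smallest pairwise angle in each subset (so that $U_1 = e_2$). Particles are given as (momentum fraction, rapidity, azimuth) with fractions already normalized, and azimuthal wrap-around is ignored for brevity; both are simplifying assumptions, and the function name is hypothetical.

```python
from itertools import combinations
from math import hypot, prod


def u_observable(particles, i, beta):
    """U_i as the (i+1)-point correlator with the minimum pairwise angle as weight."""
    total = 0.0
    for group in combinations(particles, i + 1):
        z = prod(p[0] for p in group)
        # Smallest rapidity-azimuth separation among all pairs in this subset.
        min_angle = min(hypot(a[1] - b[1], a[2] - b[2])
                        for a, b in combinations(group, 2))
        total += z * min_angle ** beta
    return total


# Toy jet with three constituents: (z, rapidity, azimuth); values are illustrative.
toy_jet = [(0.5, 0.0, 0.0), (0.4, 0.4, 0.0), (0.1, 0.0, 0.3)]
for i in (1, 2, 3):
    print(f"U_{i} = {u_observable(toy_jet, i, beta=2.0):.4f}")
```

Because the sum runs over subsets of $i+1$ particles, $U_i$ vanishes identically for jets with $i$ or fewer constituents, which matters once grooming has reduced the particle multiplicity.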

Performance in Parton Showers
We now use a parton shower study to verify the above power-counting predictions and to assess quantitatively the potential improvements in quark/gluon discrimination power achievable using higher-point correlators. For reasons of computational time, we restrict our study of the $U_i$ series to $i = 1, 2, 3$.$^{23}$ The quark and gluon jets are generated from the same Pythia $pp \to Z + j$ samples described in Sec. 5.2, and the same overall analysis strategy applies, though no cut is placed on jet masses. Furthermore, we use a smaller jet radius of $R = 0.6$. Given known parton shower uncertainties, it would be interesting to study different shower and hadronization algorithms to understand the degree to which LHC measurements of $U_i$ could provide insight into quark/gluon tagging; we leave such studies to future work.

Figure 18: Distributions of (a) $U_2$ and (b) $U_3$ for $\beta = 0.2$, as measured on quark and gluon jets.
We begin by verifying the power-counting argument of Eq. (6.3), which suggested that 2 e 3 and e 3 should be highly correlated with U 1 = C 1 = e 2 . Even though 2 e 3 and e 3 probe three particle correlations, they have a fixed scaling relation with respect to e 2 , and are therefore not expected to provide new information for quark/gluon tagging. Taking 2 e 3 as a representative example in Fig. 17a, we compare the distributions of 2 e 3 and 1 2 (e 2 ) 2 ; they are remarkably similar so we conclude that power counting is indeed capturing the dominant scaling relation. From the ROC curves in Fig. 17b, we see that the discrimination power of e 2 , 2 e 3 , and 3 e 3 are very similar for the same value of β, with limited improvement observed by including 3-particle correlations. This emphasizes that probing multi-particle correlations does not, in and of itself, improve quark/gluon discrimination, since higherpoint correlation functions can be correlated with lower-point correlation functions.
We now consider the behavior of U 2 and U 3 , which were designed to exploit multiparticle correlations to improve quark/gluon discrimination. In Fig. 18, we show distributions of U 2 and U 3 with β = 0.2, indicating good separation of the quark and gluon samples. This is quantified in Fig. 19a, which shows ROC curves for U i comparing i = 1, 2, 3. Recall that U 1 = C 1 = e 2 is a standard quark/gluon discriminant and a useful baseline to assess performance gains (even if Pythia itself skews optimistic about quark/gluon separation power [43,74]). Going from U 1 to U 2 to U 3 , the discrimination power at high efficiencies does increase with more emissions being probed, though the change is relatively small going from i = 2 to i = 3.
Beyond absolute performance gains, it is also interesting to study the relative performance of $U_i$ as a function of the angular exponent $\beta$. In Fig. 19b, we show the gluon rejection at 70% quark efficiency as a function of $\beta$.$^{24}$ Unlike for $U_1 = e_2$, where the discrimination power falls off rapidly with increasing $\beta$, for $U_2$, and even more so for $U_3$, the discrimination power remains well above the Casimir scaling limit, even into the large $\beta$ regime where $U_i$ should be amenable to fixed-order or resummed perturbative calculations. We find this much flatter behavior of the discrimination power with respect to $\beta$ to be one of the most interesting features of these observables, suggestive that multiple soft emissions are just as important as hard collinear emissions for discriminating quarks from gluons. Full ROC curves for different values of the angular exponents are provided in App. E.

Figure 19: Comparison of the quark/gluon discrimination power for $U_1$, $U_2$, and $U_3$ to the prediction from Casimir scaling and the result for hadron multiplicity. (a) ROC curves (gluon mistag rate versus quark efficiency) demonstrating the improvement in performance as more emissions are probed. (b) Gluon rejection at 70% quark efficiency as a function of the angular exponent $\beta$ (Pythia 8.219, $R = 0.6$, $p_T > 500$ GeV). The performance of the $U_i$ observables appears to asymptote to hadron multiplicity as $i$ is increased.
It would be interesting to see if there is asymptotic behavior as $i \to \infty$, though this is likely only meaningful in the context of a comparative study of parton shower generators, since it depends sensitively on the assumptions made for correlated soft emissions. As a first step in this direction, in Fig. 19 we compare $U_i$ to hadron multiplicity, which is known to be a powerful quark/gluon discriminant. Remarkably, the performance of the $U_i$ observables appears to asymptote to multiplicity as $i$ is increased, both in the shape of the ROC curves and in the behavior as a function of $\beta$. It would be interesting to understand whether this connection can be made formal, and whether the $U_i$ observables can be used to give an IRC safe definition of a multiplicity-like observable.
Finally, we want to test whether this improvement in quark/gluon discrimination power is robust to grooming. In Fig. 20, we compare the $U_2$ and $U_3$ distributions before and after grooming has been applied. At large values of the observables, relatively little difference is observed for our baseline grooming parameters, as expected from the power-counting analysis of Sec. 6.1. At smaller values of the observables, there is a distortion in the distributions due to the fact that grooming substantially decreases the overall particle multiplicity. In particular, there are expected features at $U_2 = 0$ ($U_3 = 0$), from when the grooming leaves fewer than three (four) particles in the jet. In this regime, power-counting arguments are no longer applicable since the distribution is dominated by nonperturbative effects. That said, as shown in App. E, the ROC curves after grooming exhibit the same features as in the ungroomed case, with $U_2$ and $U_3$ outperforming $U_1$, indicating that this parametric prediction is still robust.

Figure 20: Distributions of (a) $U_2$ and (b) $U_3$ before and after grooming. At large values of the observable, grooming has no impact on either the quark or gluon distribution, as expected. The corresponding groomed ROC curves are given in Fig. 30. In (b), the bin at zero is due to jets that have three or fewer particles after grooming.
It would be of great interest to perform explicit calculations of U 2 to understand its exact dependence on the color Casimirs, as well as on the angular exponent β. A resummed calculation, in particular, would shed light onto the all-orders structure of multiple-emission observables, which have not been widely explored in the literature. 25 It would also be useful to understand whether the measurement of multiple U i observables with different β values could be used to improve quark/gluon discrimination. The multi-differential cross section for U 1 = e 2 with two different angular exponents was calculated in Refs. [54,55] and the gains in performance for quark/gluon discrimination were studied in Ref. [43] from the perspective of mutual information. In preliminary investigations, we find that correlations among the U i are indeed helpful, but we leave a detailed study to future work.

Conclusions
Continued progress in jet substructure relies on the ability to devise observables that can probe increasingly detailed aspects of jets. In this paper, we used the known structures imposed by IRC safety to motivate the generalized energy correlation functions, ${}_v e_n$, a flexible basis for constructing new substructure discriminants. These generalized correlators incorporate an angular weighting function, allowing them to probe different angular structures within a jet. We presented a number of case studies of relevance to the jet substructure community (boosted top tagging, boosted $W/Z/H$ tagging, and quark/gluon discrimination), demonstrating the ability of power-counting techniques to design discriminants for specific purposes. In each case, our newly developed observables outperform standard jet shapes in parton shower studies.
The three series of observables introduced in this paper, $M_i$, $N_i$, and $U_i$, exhibit new ways to probe the soft and collinear limits of QCD. The $M_i$ series is designed for tagging groomed jets, showing that the removal of soft radiation can dramatically change the phase space of $i$-prong discriminants. The $N_i$ series is designed to mimic $N$-subjettiness in the limit of resolved substructure, showing how to probe radiation patterns around collinear prongs without requiring external axes. Finally, the $U_i$ series is designed to evade the usual quark/gluon limitations imposed by Casimir scaling, showing the importance of multiple soft emissions for quark/gluon radiation patterns. Taken together, these observables widen the scope for jet substructure investigations, allowing more handles to optimally use jets at the LHC.
Given their tagging performance, it would be interesting to calculate these observables from first principles. This would provide insights into the impact of jet grooming on multi-prong observables, the difference between axes-based and axes-free observables, and the structure of multiple emissions within quark/gluon jets. We are particularly interested in the differences between groomed and ungroomed distributions, since jet grooming not only changes the power counting of observables, but it also changes the logarithmic structure and power corrections in analytic calculations [41,42,44,60,61,86]. Beyond jet substructure, we suspect that the generalized correlators could eventually be useful as a tool for performing NNLO calculations; powerful slicing schemes have been devised using N -jettiness [210,211] and v e n -based slicing could potentially be valuable in regimes where axes are inappropriate or cumbersome.
One aspect of jet substructure that has not been studied here is the correlations between discriminants. We did apply power-counting techniques to identify correlations among basis elements to define optimal discriminants, but we did not consider whether power-counting could reveal parametric relationships between different proposed discriminants. Along similar lines, we did not consider in detail the hybrid strategy of using both groomed and ungroomed observables. In preliminary investigations, we find that, not surprisingly, discriminants with the same power counting are highly correlated. When discriminants have different power counting, though, there appears to be additional information gained through multi-variate combinations. At the moment, our application of power counting does not tell us what these multi-variate correlations are or whether we can robustly predict performant combinations. We look forward to developing more sophisticated power-counting strategies to exploit these correlations in the future.
Finally, we want to emphasize the importance of first-principles calculations and unfolded experimental measurements of $U_1$, $U_2$, and $U_3$. While the expected tagging performance of 2- and 3-prong discriminants (like $M_2$, $N_2$, $D_2^{(1,2)}$, and $N_3$) can be seen directly from power-counting arguments, this is not the case for quark/gluon discriminants, since $C_F$ and $C_A$ are not parametrically different quantities. For 1-prong jets, power counting can tell us which soft/collinear features are probed by the $U_i$ series, but it cannot reliably predict their expected parametric behavior or relative performance. In parton shower studies, we do find that $U_2$ and $U_3$ exhibit improved performance over naive Casimir scaling, even in the larger $\beta$ regime where they are under better perturbative control, suggesting that the $U_i$ series is a sensitive probe of the QCD shower. Therefore, measurements of the $U_i$ series, along with comparisons to parton shower (and eventually analytic) predictions, are likely to lead to a deeper understanding of jets in QCD.

A Alternative Angular Weighting Functions
As discussed in Sec. 3.1, any symmetric function of the angles, $f_N(p_{i_1}, p_{i_2}, \ldots, p_{i_N})$, that vanishes in the collinear limits can in principle be used in Eq. (3.1). While we argued in Sec. 3.2 that the min function is particularly effective due to its ability to isolate hierarchical angular structures, other functional forms can certainly be used. In this appendix, we study two alternative definitions of the angular weighting function, which, from a power-counting perspective, are identical to those considered in the text.
For concreteness, we study variants of the $N_2$ observable from Sec. 5, which was based on a 3-point correlator:

$$N_2^{(\beta)} = \frac{{}_2e_3^{(\beta)}}{\left({}_1e_2^{(\beta)}\right)^2}. \qquad \text{(A.1)}$$

One variant, $R_2$, uses an angular weighting function that smoothly approximates the min function (A.2).$^{26}$ Another variant, $A_2$, uses the geometric fact that in the collinear limit, the minimum product of pairwise distances is parametrically the same as the area of the triangle spanned by the three points;$^{27}$ in its definition, $s = (\theta_{ij} + \theta_{jk} + \theta_{ik})/2$ comes from Heron's formula. While $A_2$ is parametrically identical to $N_2$, it has the interesting property that it vanishes when the vectors defining the three particles are coplanar, similar to dipolarity introduced in Ref. [212].

Even though the $N_2$, $R_2$, and $A_2$ observables have identical power counting, their distributions could in principle differ by $O(1)$ numbers, possibly allowing for improved discrimination power. In Fig. 21a we compare the distributions of these three observables in Pythia, showing that they are rather similar. To aid the eye, we have rescaled the $R_2$ and $A_2$ distributions to match the $N_2$ distribution. Turning to the $Z$ versus quark ROC curve in Fig. 21b, the performance is nearly identical. This further emphasizes that the behavior of the observables is dominated by parametric scalings. Since we did not find any gains from using these more complicated variants, we restricted the study in the text to the definition given in Eq. (3.3).

$^{26}$ The $r$ notation is motivated by the resistance formula for a set of parallel resistors.
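To make the role of the angular weighting function concrete, the following is a minimal sketch of a generalized correlator with the min-type weighting, together with the resulting $N_2$ and a Heron-formula triangle area for comparison. It assumes that particles are given as hypothetical `(pt, rapidity, phi)` tuples, that momentum fractions are $z_i = p_{Ti}/\sum_j p_{Tj}$, and that the weighting is the product of the $v$ smallest pairwise angles raised to the power $\beta$. The function names are illustrative, and `heron_area` is not the paper's definition of the $A_2$ correlator.

```python
from itertools import combinations
import math


def pairwise_angle(p, q):
    """Rapidity-azimuth distance between two (pt, rapidity, phi) tuples."""
    dy = p[1] - q[1]
    dphi = abs(p[2] - q[2]) % (2.0 * math.pi)
    if dphi > math.pi:
        dphi = 2.0 * math.pi - dphi
    return math.hypot(dy, dphi)


def correlator(particles, n, v, beta=1.0):
    """Sketch of {}_v e_n^(beta): sum over n-particle subsets of the product
    of momentum fractions times the product of the v smallest pairwise
    angles, each raised to the power beta (assumed min-type weighting)."""
    pt_jet = sum(p[0] for p in particles)
    total = 0.0
    for subset in combinations(particles, n):
        z_product = math.prod(p[0] / pt_jet for p in subset)
        angles = sorted(
            pairwise_angle(a, b) ** beta for a, b in combinations(subset, 2)
        )
        total += z_product * math.prod(angles[:v])
    return total


def n2(particles, beta=1.0):
    """N_2 = {}_2 e_3 / ({}_1 e_2)^2 built from the sketch above."""
    return correlator(particles, 3, 2, beta) / correlator(particles, 2, 1, beta) ** 2


def heron_area(a, b, c):
    """Area of a triangle with side lengths a, b, c via Heron's formula."""
    s = 0.5 * (a + b + c)
    return math.sqrt(max(s * (s - a) * (s - b) * (s - c), 0.0))
```

For three equal-momentum particles whose pairwise angles form a 0.3, 0.4, 0.5 triangle, the product of the two smallest angles is 0.12 while the Heron area is 0.06, so in this example the two weightings differ only by an $O(1)$ factor, in line with the parametric equivalence described above.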
It is still an interesting question whether other choices of angular weighting functions might lead to improved performance in more complicated jet substructure applications. It seems unlikely, however, since for small-radius jets, one can Taylor expand the angular function in the small $\theta$ limit, and observables with the same power counting must have the same lowest-order expansion. In practice, smoother definitions that approximate the min function might be useful for performing perturbative calculations.

B Aspects of 3-prong Tagging
We can see from a power-counting analysis, however, that even with grooming, $M_3$ will not perform well. Following the notation of Sec. 4.1, we consider a strongly-ordered 3-prong signal jet after grooming.
For signal jets, we have the relation ${}_1e_4 \ll {}_1e_3$, so we would like the background to satisfy ${}_1e_4 \sim {}_1e_3$. That desired relation is violated, though, by contributions of the collinear-soft modes to ${}_1e_4$, due to the different $z_{cs}$ scalings in Eq. (B.3). We therefore predict from power counting that $M_3$ should be a poor discriminant.
In Fig. 22, we show the distribution of $M_3$ for boosted top jets compared to those from QCD jet backgrounds, where little discrimination power is observed. Similar to how ordinary grooming was required for $M_2$ to become an effective discriminant in the 2-prong case, it is likely that another layer of grooming is needed to remove the undesired collinear-soft contributions to $M_3$ and make it an effective 3-prong tagger. While we do not pursue $M_3$ further in this paper, it would be interesting to consider alternative grooming methods designed to isolate 3-prong structure and mitigate both soft and collinear-soft radiation. As a starting point, one could consider doubly-soft-dropped boosted top jets, where after an initial application of soft drop, one reapplies soft drop to the two remaining prongs.
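One possible reading of the doubly-soft-dropped procedure is sketched below on a toy binary clustering tree. This is an illustration under stated assumptions rather than a definitive implementation: the `Node` class and its fields are hypothetical, the soft drop condition is taken to be the standard $z > z_{\rm cut} (\Delta R / R_0)^{\beta}$ with illustrative defaults $z_{\rm cut} = 0.1$ and $\beta = 0$, and the groomed transverse momentum is approximated by a scalar sum of the surviving prongs.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Node:
    """Hypothetical clustering-tree node: a leaf has no children."""
    pt: float
    delta_r: float = 0.0  # angular separation of the two children
    children: Optional[Tuple["Node", "Node"]] = None


def soft_drop(node, z_cut=0.1, beta=0.0, r0=1.0):
    """Follow the harder branch until the splitting satisfies
    z > z_cut * (delta_r / r0)**beta, or until a leaf is reached."""
    while node.children is not None:
        a, b = node.children
        z = min(a.pt, b.pt) / (a.pt + b.pt)
        if z > z_cut * (node.delta_r / r0) ** beta:
            return node
        node = a if a.pt >= b.pt else b
    return node


def double_soft_drop(node, **kwargs):
    """Apply soft drop once, then reapply it to each of the two prongs
    of the surviving splitting (assumed interpretation)."""
    first = soft_drop(node, **kwargs)
    if first.children is None:
        return first
    groomed = tuple(soft_drop(child, **kwargs) for child in first.children)
    return Node(
        pt=sum(child.pt for child in groomed),  # scalar-sum approximation
        delta_r=first.delta_r,
        children=groomed,
    )
```

The second pass removes soft branches inside each prong that the first pass cannot reach, which is the kind of collinear-soft radiation identified above as problematic for $M_3$.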

B.2 N 3 Without Grooming
In Sec. 4.1, we argued that on groomed jets with well-resolved substructure, $N_3$ behaves parametrically like $\tau_{3,2}$, but exhibits improved discrimination power in the transition to the unresolved region. On ungroomed jets, however, $N_3$ behaves differently from $\tau_{3,2}$, and in particular, it does not provide good discrimination in regions of phase space where there is a soft wide-angle subjet. This same issue was discussed in detail for the case of $D_3$ in Ref. [66]; the treatment of the soft subjet region of phase space required the addition of two extra terms to $D_3$, leading to the complicated form shown in Eq. (2.7). To avoid the soft subjet issue, and to advocate for the stability of groomed observables, we explicitly focused on the case of groomed top jets in Sec. 4.1.
Here, we compare $N_3$ and $\tau_{3,2}$ on ungroomed jets. Although $N_3$ was not designed for use on ungroomed jets, it still provides reasonably good discrimination power, albeit not as good as $\tau_{3,2}$. Distributions of ungroomed $N_3$ are shown in Fig. 23a, where we use an alternative mass window cut of $m \in [160, 240]$ GeV. The discrimination performance for the top signal against the $b$-quark, light quark, and gluon jet backgrounds is shown in Figs. 23b, 23c, and 23d, respectively. The best performance is seen in rejecting quark jets, whereas ungroomed $N_3$ has worse performance on gluon jets. Interestingly, similar quark/gluon differences were seen for $D_3$ in Ref. [140], although the nature of this behavior is not understood and is not necessarily connected in any way to the use of energy correlators.
Though $N_3$ was designed for use on groomed jets, we believe that $N_3$ is a sufficiently good discriminant on ungroomed jets to merit further investigations. At minimum, ungroomed $N_3$ distributions could be measured as a baseline to test the impact of jet grooming. We offer a bounty to the first group that identifies an axes-free observable with the same power counting as ungroomed $\tau_{3,2}$.
For completeness, in Fig. 24, we show the $N$-subjettiness observable $\tau_{3,2}$ as measured on the same samples, both before and after grooming. As expected, excellent discrimination power is observed before grooming. After grooming, the discrimination power is worsened primarily due to the behavior in the unresolved region, namely as $\tau_{3,2} \to 1$. It is in this region that $N_3$ exhibits improved performance, as seen already in the behavior of the distributions in Fig. 8 and the performance in the ROC curve in Fig. 9.

There are, however, a large number of other possible observables that could be formed from combinations of the different 2-, 3-, and 4-point correlators. In this appendix, we describe in more detail the justification for our focus on $N_3$. It is interesting that this process happens to identify an observable with the same parametric behavior as the $N$-subjettiness ratio $\tau_{3,2}$.

As discussed in the text, we focus on the case of groomed jets. This means that we can ignore soft radiation for our power-counting analysis. For groomed boosted top jets, it is sufficient to consider a 3-prong configuration with hierarchical angles, as illustrated in Fig. 5b. In particular, we do not have to consider the soft subjet phase space region from Ref. [66], which has hierarchical energies, since those configurations are removed by the grooming procedure. For the 3-prong signal, the scalings of the 2-point and higher-point correlators follow from this configuration. While the background scalings of the 4-point correlators would be needed to verify signal/background separation, as was done in Sec. 4.1, they are not needed to restrict the combinations under consideration. Since their form is not particularly illuminating, we do not show them here.
While there are a large number of observables listed above, the analysis can be simplified by noting that for both the signal and background, the information contained in the 3-point correlators ${}_2e_3$ and ${}_3e_3$ is redundant, since it can be expressed in terms of $e_2$ and ${}_1e_3$. Furthermore, any observable derived from power counting will be linear in the 4-point correlator and will have the 3-point correlator appearing in the denominator raised to some power. Finally, from Eq. (B.7), we see that $\theta_{23}$ appears at most raised to the third power; it therefore suffices to consider ${}_1e_3$ raised at most to the third power. The power of $e_2$ is then fixed by Lorentz invariance.
The above logic allows us to write down a parametrically complete set of potential 3-prong observables, which we denote $O_{v,y}$. At this point, one can either power count each of these options explicitly to test for background isolation, or simply evaluate their performance in a parton shower generator.
To limit the number of options to consider, one can apply the further constraint that $e_2$ should not appear explicitly in the observable, to mitigate correlations with the jet mass. This is equivalent to setting $y = v$, and gives ratios of the form ${}_ve_4 / ({}_1e_3)^v$. Note that $v = 1$ gives $M_3$ and $v = 2$ gives $N_3$. Among all of the $O_{v,y}$ observables, we found that the best performing one in Pythia was $N_3$, which then became the focus of our boosted top study.

B.4 Power Counting N 3
While the identification of the parametrically optimal discriminant is usually fairly straightforward given the parametric expressions for the observables, confusion can arise when the scalings have multiple terms. Here, we present more details for the signal analysis of the $N_3$ observable from Sec. 4.1, to illustrate how power counting can be performed systematically and thereby avoid such confusion when there are competing parametric relations. This same approach can be used in the other examples studied in the paper, though for the 1- and 2-prong case studies, we find that the more heuristic treatment in the text is just as illuminating as the systematic strategy. We begin by recalling the power counting for ${}_1e_3$ and ${}_2e_4$, considering the signal with (hierarchical) 3-prong substructure.

Figure 26: Distributions of $N_2$ and $\tau_{2,1}$ on the $Z$ signal and quark background (a) before grooming and (b) after grooming. To aid visual comparison, $\tau_{2,1}$ has been rescaled to match the endpoint of $N_2$.
In this appendix, we show that this is generically true, suggesting that $N_i$ is indeed an appropriate observable for identifying $i$-prong substructure on groomed jets.
Since we work with groomed jets, we do not have to consider soft subjet configurations (i.e. $i$-prong jets with hierarchical energies). Instead, the power counting is determined by the generalization of Fig. 5b with hierarchical angles, where a jet has $i$ subjets, two of which become collinear and approach an $(i-1)$-subjet configuration. We label the two subjets that approach each other by 1 and 2, such that $\theta_{12}$ denotes the angle between them. By assumption, $\theta_{12}$ is smaller than the angles between any other subjets (which we power count as $\theta_{st} \sim 1$), but larger than the typical collinear scale $\theta_{cc}$.
By considering the contributions from collinear modes aligned along subjets 1 and 2, we find the parametric relation of Eq. (C.1), where all other pairwise combinations of modes are power suppressed. Here, we are assuming that the $N$-subjettiness axes are defined such that one axis is aligned with subjet 1 or 2, with the remaining $i-2$ axes aligned along the other subjets; this is indeed the configuration that minimizes $\tau_{i-1}$ in the small $\theta_{12}$ limit, assuming balanced energies. Adding an extra axis yields the scaling of $\tau_i$ in Eq. (C.2), where now the $i$ axes align with the $i$ subjets.
For the correlator involving two angles, the power-counting analysis yields ${}_2e_{i+1}^{(\beta)} \sim \theta_{12}^{\beta}\,\theta_{cc}^{\beta} + \ldots$ (C.3), where the ellipses denote contributions from collinear-soft modes, which depend on the other angles between the subjets. To understand the appearance of $\theta_{12}^{\beta}\,\theta_{cc}^{\beta}$, note that the largest contribution to ${}_2e_{i+1}$ comes from selecting two collinear modes from one subjet and one collinear mode from each of the remaining $i-1$ subjets; for that configuration, the two smallest pairwise angles are indeed $\theta_{cc}$ and $\theta_{12}$.
Generalizing the argument in App. B.4, Eqs. (C.1) and (C.3) imply ${}_2e_{i+1} \ll ({}_1e_i)^2$ on $i$-prong signal jets, such that the appropriate $i$-prong discriminant is

$$N_i^{(\beta)} = \frac{{}_2e_{i+1}^{(\beta)}}{\left({}_1e_i^{(\beta)}\right)^2} \sim \tau_{i,i-1}^{(\beta)},$$

where the last relation should be understood in the power-counting sense. Therefore, as advertised, the $N_i$ observable is indeed related to the $N$-subjettiness ratio $\tau_{i,i-1}$, and both are expected to be good $i$-prong discriminants.
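For orientation, the chain of scalings behind this statement can be collected in one place. The following is a sketch reconstructed from the surrounding discussion, not a reproduction of Eqs. (C.1)-(C.3): it assumes balanced subjet energies ($z \sim 1$), the hierarchy $\theta_{cc} \ll \theta_{12} \ll \theta_{st} \sim 1$, and it drops the collinear-soft contributions denoted by ellipses.

```latex
\begin{align}
{}_1e_i^{(\beta)} &\sim \tau_{i-1}^{(\beta)} \sim \theta_{12}^{\beta}, \\
\tau_i^{(\beta)} &\sim \theta_{cc}^{\beta}, \\
{}_2e_{i+1}^{(\beta)} &\sim \theta_{12}^{\beta}\,\theta_{cc}^{\beta} + \ldots, \\
N_i^{(\beta)} = \frac{{}_2e_{i+1}^{(\beta)}}{\bigl({}_1e_i^{(\beta)}\bigr)^{2}}
  &\sim \frac{\theta_{cc}^{\beta}}{\theta_{12}^{\beta}}
  \sim \frac{\tau_i^{(\beta)}}{\tau_{i-1}^{(\beta)}} = \tau_{i,i-1}^{(\beta)}.
\end{align}
```

Under these assumptions, $\theta_{cc} \ll \theta_{12}$ gives $\theta_{12}^{\beta}\theta_{cc}^{\beta} \ll \theta_{12}^{2\beta}$, which is the statement that ${}_2e_{i+1}$ is parametrically smaller than $({}_1e_i)^2$ on signal jets.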
As an example to demonstrate this parametric relation, we consider the case $i = 2$, which was alluded to in Sec. 5. The relevant observables are shown schematically in Fig. 25. In Fig. 26, we show distributions of $\tau_{2,1}$ and $N_2$ before and after grooming for $\beta = 2$, taking quarks as representative of the background. To aid in a visual comparison, we have rescaled the $\tau_{2,1}$ distributions by a common factor to match the $N_2$ endpoint. Before grooming, the shapes of the two distributions are quite different, with $N_2$ being much more peaked towards the endpoint for the background. After soft drop has been applied, the distributions for the two observables are quite similar, as predicted by the power-counting discussion above.
Still, there is a non-parametric difference between the $\tau_{2,1}$ and $N_2$ distributions, which leads to improved tagging performance for $N_2$. This can be seen by eye in the groomed plot in Fig. 26b, where the background distribution for $N_2$ is pushed to higher values while the signal distribution is more rapidly falling toward the endpoint. More quantitatively, we can consider the ROC curves in Fig. 27. For the ungroomed case, the discrimination power is similar, with $N_2$ showing slightly improved behavior at higher efficiencies. For the groomed case, there are significant gains to be had in using $N_2$ instead of $\tau_{2,1}$.$^{28}$

D Hybrid Strategies for 2-prong Observables
Throughout the text, we focused on discriminants formed from combinations (often ratios) of either groomed or ungroomed observables. It is also interesting to consider discriminants formed from mixtures of groomed and ungroomed observables [154,155], which we will refer to as a hybrid strategy. While we will not explore this topic in detail, we take as a simple example ungroomed 2-prong observables after the application of a groomed mass cut. In Fig. 28, we show the ROC curves for boosted $Z$ discrimination, showing light quark and gluon backgrounds separately; this should be contrasted with Fig. 15.

Figure 28: Same as Figs. 15b and 15d but using a hybrid strategy where a cut is placed on the groomed jet mass, but the discriminants are ungroomed.

The behavior of these hybrid observables can be understood using the power-counting analysis of Sec. 5.3, where we analyzed the stability of the observables as a function of $m_J$ and $p_{TJ}$. For signal jets, a cut on the groomed mass has little effect due to the color singlet nature of the $Z$ boson, and therefore the hybrid observables should have a similar behavior to the ungroomed observables. For background QCD jets, however, applying a groomed mass cut in the same mass window enforces a higher effective cut on the ungroomed mass. This, in turn, enters the scaling relations for the background distributions given in Sec. 5.3. For $M_2$, using a groomed mass cut has the interesting effect of pushing the ungroomed background distribution to higher values, thereby improving discrimination power. For $N_2$, the distribution is parametrically unmodified, and therefore similar discrimination power is expected for the ungroomed and hybrid observables. For $D_2^{(2)}$, larger effective mass values push the distribution to lower values, thereby worsening discrimination power. These power-counting predictions are seen clearly in Fig. 28.

$^{28}$ While it is possible that different axes choices for $N$-subjettiness could provide improved performance, it seems to us that any axes definition will be ambiguous in the unresolved region. This also highlights the nice property that $N_2$ is defined without respect to subjet axes.
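The hybrid selection itself is simple to state in code. The sketch below assumes that each jet is a dictionary with hypothetical keys `groomed_mass` and `n2_ungroomed`, and that the mass window is an illustrative $[80, 100]$ GeV range around $m_Z$; the text specifies only that the soft-dropped mass window is relatively narrow, so the window values here are placeholders.

```python
def hybrid_select(jets, mass_window=(80.0, 100.0), key="n2_ungroomed"):
    """Return the ungroomed discriminant for jets whose groomed mass
    falls inside the (assumed, illustrative) mass window in GeV."""
    low, high = mass_window
    return [jet[key] for jet in jets if low < jet["groomed_mass"] < high]
```

The design point is that the mass cut and the discriminant are evaluated on different versions of the same jet: the cut uses the groomed constituents, while the discriminant is computed before grooming.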
The above behavior is perhaps counterintuitive, especially the poor performance of $D_2^{(2)}$ and the good performance of $M_2$, but it follows straightforwardly from the power-counting analysis. That said, the quantitative discrimination power depends crucially on the choice of mass window, and one must keep in mind that this study is based on a relatively narrow soft-dropped mass cut around $m_Z$. Further studies are therefore warranted to test whether discrimination performance can indeed be improved by simultaneously using information before and after grooming.

Figure 30: Same as Fig. 19, but after grooming. The improved performance of $U_2$ and $U_3$ relative to $U_1$ is robust to removing soft radiation.

E Supplemental Quark/Gluon Plots
In Fig. 19b, we emphasized the stability of $U_i$ for $i = 2, 3$ as a function of the angular exponent $\beta$. In Fig. 29, we show the full ROC curves for both $U_2$ and $U_3$ as a function of $\beta$. Neither observable asymptotes to the Casimir scaling prediction, even at high efficiencies or high $\beta$ values. Furthermore, the $U_3$ distributions exhibit stability as a function of $\beta$ throughout the whole ROC curve. This would be interesting to verify in an analytic calculation.
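As a reference point for reading such ROC curves, the Casimir scaling prediction can be written down explicitly. The sketch below assumes the standard leading-logarithmic form, in which the gluon efficiency equals the quark efficiency raised to the power $C_A/C_F = 9/4$, and it assumes that jets are tagged as quark-like when the observable falls below a cut. The function names and the toy inputs are illustrative.

```python
C_F = 4.0 / 3.0  # quark color factor
C_A = 3.0        # gluon color factor


def casimir_gluon_efficiency(quark_efficiency):
    """Gluon mistag rate at a given quark efficiency under Casimir scaling."""
    return quark_efficiency ** (C_A / C_F)


def roc_point(quark_values, gluon_values, cut):
    """Fractions of quark and gluon jets with observable value below the cut."""
    eff_q = sum(v < cut for v in quark_values) / len(quark_values)
    eff_g = sum(v < cut for v in gluon_values) / len(gluon_values)
    return eff_q, eff_g


def beats_casimir(quark_values, gluon_values, cut):
    """True if the gluon mistag rate at this cut is below the Casimir value."""
    eff_q, eff_g = roc_point(quark_values, gluon_values, cut)
    return eff_g < casimir_gluon_efficiency(eff_q)
```

Scanning `cut` over the range of the observable traces out the full ROC curve, which can then be compared point by point against `casimir_gluon_efficiency`.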
In Fig. 30a, we show the ROC curves for $U_2$ and $U_3$ after grooming for $\beta = 0.2$, showing that the $U_i$ series continues to perform better for larger values of $i$. In Fig. 30b, we show the performance as a function of $\beta$, demonstrating the stability of $U_3$, even after grooming.