Identification of high transverse momentum top quarks in $pp$ collisions at $\sqrt{s}$ = 8 TeV with the ATLAS detector

This paper presents studies of the performance of several jet-substructure techniques, which are used to identify hadronically decaying top quarks with high transverse momentum contained in large-radius jets. The efficiency of identifying top quarks is measured using a sample of top-quark pairs and the rate of wrongly identifying jets from other quarks or gluons as top quarks is measured using multijet events collected with the ATLAS experiment in 20.3 fb$^{-1}$ of 8 TeV proton-proton collisions at the Large Hadron Collider. Predictions from Monte Carlo simulations are found to provide an accurate description of the performance. The techniques are compared in terms of signal efficiency and background rejection using simulations, covering a larger range in jet transverse momenta than accessible in the dataset. Additionally, a novel technique is developed that is optimized to reconstruct top quarks in events with many jets.


Introduction
Conventional top-quark identification methods reconstruct the products of a hadronic top-quark decay (t → bW → bqq̄′) as jets with a small radius parameter R (typically R = 0.4 or 0.5). There are usually several of these small-R jets in a high-energy, hard proton-proton (pp) collision event at the Large Hadron Collider (LHC). Hadronic top-quark decays are reconstructed by taking those jets which, when combined, best fit the kinematic properties of the top-quark decay, such as the top-quark mass and the W-boson mass. These kinematic constraints may also be fulfilled for a collection of jets which do not all originate from the same top-quark decay chain.
In analyses of LHC pp collisions, conventional top-quark identification methods are inefficient at high top-quark energies because the top-quark decay products are collimated and the probability of resolving separate small-R jets is reduced. Top quarks with high transverse momentum (p T ≳ 200 GeV) may instead be reconstructed as a jet with large radius parameter, R ≥ 0.8 (large-R jet) [1][2][3][4][5][6][7][8][9][10][11][12][13]. An analysis of the internal jet structure is then performed to identify and reconstruct hadronically decaying top quarks (top tagging).
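The collimation argument can be made concrete with a common rule of thumb, which is not taken from this paper: the decay products of a particle with mass m and transverse momentum p T are separated by roughly ∆R ≈ 2m/p T. The sketch below evaluates this estimate for a top quark; the function name is hypothetical and the relation is only an approximation. It gives ∆R ≈ 1.7 at 200 GeV and ∆R ≈ 1.0 at 350 GeV, comparable to the large-R jet radius parameters used in this paper.

```python
def approx_opening_angle(mass_gev: float, pt_gev: float) -> float:
    """Rule-of-thumb angular separation, dR ~ 2 m / pT, of the decay products.

    This is an approximation used for illustration only.
    """
    if pt_gev <= 0.0:
        raise ValueError("pt_gev must be positive")
    return 2.0 * mass_gev / pt_gev


# Top-quark mass used for the simulated tt samples in this paper.
TOP_MASS_GEV = 172.5

if __name__ == "__main__":
    for pt_gev in (200.0, 350.0, 700.0):
        dr = approx_opening_angle(TOP_MASS_GEV, pt_gev)
        print(f"pT = {pt_gev:.0f} GeV: dR ~ {dr:.2f}")
```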
Since a single jet that contains all of the decay products of a massive particle has different properties from a jet of the same transverse momentum originating from a light quark or gluon, it is possible to use the substructure of large-R jets to distinguish top quarks with high p T from jets from other sources, for example from multijet production. These differences in the jet substructure can be better resolved after contributions from soft gluon radiation or from additional pp interactions in the same or adjacent bunch crossings (pile-up) are removed from the jets. Such methods are referred to as jet grooming and consist of either an adaptive modification of the jet algorithm or a selective removal of soft radiation during the process of iterative recombination in jet reconstruction [14][15][16].
The jet-substructure approach aims to reduce combinatorial background from assigning small-R jets to top-quark candidates in order to achieve a more precise reconstruction of the top-quark four-momentum and a higher background rejection. In searches for top-anti-top quark (tt) resonances, the improved kinematic reconstruction leads to a better mass resolution for large resonance masses (≥ 1 TeV) compared to the conventional approach, resulting in an increased sensitivity to physics beyond the Standard Model (SM) [17].
ATLAS has published performance studies of jet-substructure methods for top tagging at a pp centre-of-mass energy of √ s = 7 TeV [18]. In the paper presented here, the performance of several approaches to top tagging at √ s = 8 TeV is documented. Top tagging based on the combination of jet-substructure variables, Shower Deconstruction [19,20], and the HEPTopTagger [21,22] is studied, as described in Section 5. A new method, HEPTopTagger04, is introduced. Optimised for top tagging in events with many jets, it uses a preselection of small-R jets as input to the HEPTopTagger algorithm.
Monte-Carlo (MC) simulation is used to compare the efficiencies and misidentification rates of all approaches over a large kinematic range. The performance of the different methods is studied in data using two different event samples: a signal sample enriched with top quarks and a background sample dominated by multijet production. The signal sample is used to measure top-tagging efficiencies from data, which are compared to the predictions obtained from MC simulations. Quantifying the degree to which MC simulations correctly model the top-tagging efficiency observed in data is crucial for any physics analysis in which top-tagging methods are used because MC simulations are commonly used to model signal and background processes. The signal sample is also used to determine the energy scale of subjets in situ from the reconstructed top-quark mass distribution. Top-tagging misidentification rates are measured in the background sample and are also compared to the prediction of MC simulations.

The ATLAS detector
The ATLAS detector consists of an inner tracking detector system (ID), which is surrounded by electromagnetic (EM) and hadronic calorimeters and a muon spectrometer (MS). The ID consists of silicon pixel and strip detectors and a transition-radiation tracker covering |η| < 2.5, and it is immersed in a 2 T axial magnetic field. The EM calorimeters use lead/liquid argon (LAr) technology to provide calorimetry for |η| < 3.2, with copper/LAr used in the forward region 3.1 < |η| < 4.9. In the region |η| < 1.7, hadron calorimetry is provided by steel/scintillator calorimeters. In the forward region, copper/LAr and tungsten/LAr calorimeters are used for 1.5 < |η| < 3.2 and 3.1 < |η| < 4.9, respectively. The MS surrounds the calorimeter system and consists of multiple layers of trigger and tracking chambers within a toroidal magnetic field generated by air-core superconducting magnets, which allows for the measurement of muon momenta for |η| < 2.7. ATLAS uses a three-level trigger system [23] with a hardware-based first-level trigger, which is followed by two software-based trigger levels with an increasingly fine-grained selection of events at lower rates. A detailed description of the ATLAS detector is given in Ref. [24].

Monte-Carlo simulation
MC simulations are used to model different SM contributions to the signal and background samples. They are also used to study and compare the performance of top-tagging algorithms over a larger kinematic range than accessible in the data samples.
Top-quark pair production is simulated with POWHEG-BOX r2330.3 [25][26][27][28] interfaced with PYTHIA v6.426 [29] with the set of tuned parameters (tune) Perugia 2011C [30] and the CT10 [31] set of parton distribution functions (PDFs). The h damp parameter, which effectively regulates the high-p T gluon radiation in POWHEG, is left at the default value of h damp = ∞. This MC sample is referred to as the POWHEG+PYTHIA tt sample. Alternative tt samples are used to evaluate systematic uncertainties. A sample generated with MC@NLO v4.01 [32,33] interfaced to Herwig v6.520 [34] and JIMMY v4.31 [35] with the AUET2 tune [36], again simulated using the CT10 PDF set, is used to estimate the uncertainty related to the choice of generator. To evaluate the impact of variations in the parton shower and hadronization models, a sample is generated with POWHEG-BOX interfaced to Herwig and JIMMY. The effects of variations in the QCD (quantum chromodynamics) initial- and final-state radiation (ISR and FSR) modelling are estimated with samples generated with ACERMC v3.8 [37] interfaced to PYTHIA v6.426 with the AUET2B tune and the CTEQ6L1 PDF set [38], where the parton-shower parameters are varied in the range allowed by data [39]. For the study of systematic uncertainties on kinematic distributions resulting from PDF uncertainties, a sample is generated using POWHEG-BOX interfaced with PYTHIA v6.427 and using the HERAPDF set [40]. For all tt samples, a top-quark mass of 172.5 GeV is used.
In measurements of the differential tt production cross section as a function of the top-quark p T , a discrepancy between data and MC predictions was observed in 7 TeV data [54]. Based on this measurement, a method of sequential reweighting of the top-quark-p T and tt-system-p T distributions was developed [55], which gives better agreement between the MC predictions and 8 TeV data. In this paper, this reweighting technique is applied to the POWHEG+PYTHIA tt sample, for which the technique was developed. The predicted total tt cross section at NNLO+NNLL is not changed by the reweighting procedure.
Single-top-quark production in the s- and Wt-channels is modelled with POWHEG-BOX and the CT10 PDF set interfaced to PYTHIA v6.426 using Perugia 2011C. Single-top-quark production in the t-channel is generated with POWHEG-BOX in the four-flavour scheme (in which b-quarks are generated in the hard scatter and the PDF does not contain b-quarks) using the four-flavour CT10 PDF set interfaced to PYTHIA v6.427. The overlap between Wt production and tt production is removed with the diagram-removal scheme [56] and the different single-top-production processes are normalized to the approximate NNLO cross-section predictions [57][58][59].
Events with a W or a Z boson produced in association with jets (W+jets or Z+jets) are generated with ALPGEN [60] interfaced to PYTHIA v6.426 using the CTEQ6L1 PDF set and Perugia 2011C. Up to five additional partons are included in the calculation of the matrix element, as well as additional c-quarks, cc-quark pairs, and bb-quark pairs, taking into account the masses of these heavy quarks. The W+jets contribution is normalized using the charge asymmetry in W-boson production in data [61, 62] by selecting µ+jets events and comparing to the prediction from MC simulations. The Z+jets contribution is normalized to the calculation of the inclusive cross section at NNLO in QCD obtained with FEWZ [63].
For the comparison of the different top-tagging techniques using MC simulation only, multijet samples are generated with PYTHIA v8.160 with the CT10 PDF set and AU2. As a source of high-transverse-momentum top quarks, samples of events with a hypothetical massive Z′ resonance decaying to top-quark pairs, Z′ → tt, are generated with resonance masses ranging from 400 GeV to 3000 GeV and a resonance width of 1.2% of the resonance mass [64] using PYTHIA v8.175 with the MSTW2008 68% CL LO PDF set [49,50] and AU2.
For a study of top-quark reconstruction in a final state with many jets, the process pp → H⁺t(b) → tbt(b) is generated in a type-II 2HDM model [65] with a mass of 1400 GeV of the charged Higgs boson using POWHEG-BOX interfaced to PYTHIA v8.165 with AU2 and the CT10 PDF set. The width of the charged Higgs boson is set to zero and the five-flavour scheme is used. The additional b-quark (in parentheses above) can be present or not, depending on whether the underlying process is gg → H⁺tb or gb → H⁺t.
All MC samples are passed through a full simulation of the ATLAS detector [66] based on GEANT4 [67], except for the tt samples used to estimate systematic uncertainties due to the choice of MC generator, parton shower, and amount of ISR/FSR, which are passed through a faster detector simulation with reduced complexity in the description of the calorimeters [68]. All MC samples are reconstructed using the same algorithms as used for data and have minimum-bias events simulated with PYTHIA v8.1 [69] overlaid to match the pile-up conditions of the collision data sample.

Object reconstruction and event selection

Object reconstruction
Electron candidates are reconstructed [70,71] from clusters in the EM calorimeter and are required to have a track in the ID, associated with the main primary vertex [72], which is defined as the one with the largest $\sum p_{\mathrm{T,track}}^{2}$. They must have E T > 25 GeV and |η cluster | < 2.47 excluding the barrel/end-cap calorimeter transition region 1.37 < |η cluster | < 1.52, where η cluster is the pseudorapidity of the cluster in the EM calorimeter. The shape of the cluster in the calorimeter must be consistent with the typical energy deposition of an electron and the electron candidate must satisfy the mini-isolation [17, 73] requirement to reduce background contributions from non-prompt electrons and hadronic showers: the scalar sum of track transverse momenta within a cone of size ∆R = 10 GeV/E el T around the electron track must be less than 5% of the electron transverse energy E el T (only tracks with p T > 1 GeV are considered in the sum, excluding the track matched to the electron cluster).
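The mini-isolation requirement described above can be sketched as follows. This is a minimal illustration, not the ATLAS implementation: the dictionary layout of the lepton and track records and all function and parameter names are assumptions, while the default values (10 GeV cone scale, 5% fraction, 1 GeV track threshold) are the ones quoted in the text.

```python
import math


def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance in eta-phi space, with the phi difference wrapped to [-pi, pi]."""
    dphi = math.remainder(phi1 - phi2, 2.0 * math.pi)
    return math.hypot(eta1 - eta2, dphi)


def passes_mini_isolation(lepton, tracks, cone_scale_gev=10.0,
                          max_fraction=0.05, min_track_pt_gev=1.0):
    """Return True if the lepton satisfies the mini-isolation requirement.

    lepton: dict with "et" (GeV), "eta", "phi" (hypothetical layout).
    tracks: dicts with "pt" (GeV), "eta", "phi"; the track matched to the
            lepton itself must already be excluded by the caller.
    """
    cone_size = cone_scale_gev / lepton["et"]  # cone shrinks as E_T grows
    sum_pt = sum(
        trk["pt"]
        for trk in tracks
        if trk["pt"] > min_track_pt_gev
        and delta_r(trk["eta"], trk["phi"], lepton["eta"], lepton["phi"]) < cone_size
    )
    return sum_pt < max_fraction * lepton["et"]
```

For a 100 GeV electron the cone size is 0.1 and the track-p T sum must stay below 5 GeV, so the cone tightens for the high-p T leptons expected close to a b-jet in boosted top-quark decays.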
Muons are reconstructed [74] using both the ID and the MS and must be associated with the main primary vertex of the event. Muons are required to have p T > 25 GeV and |η| < 2.5 and are required to be isolated with requirements similar to those used for electron candidates: the scalar sum of the track transverse momenta within a cone of size ∆R = 10 GeV/p µ T around the muon track must be less than 5% of p µ T , where p µ T is the transverse momentum of the muon.
Jets are built [75] from topological clusters of calorimeter cells, which are calibrated to the hadronic energy scale [76] using a local cell-weighting scheme [77]. The clusters are treated as massless and are combined by adding their four-momenta, leading to massive jets. The reconstructed jet energy is calibrated using energy- and η-dependent corrections obtained from MC simulations. These corrections are obtained by comparing reconstructed jets with geometrically matched jets built from stable particles (particle level). The corrections are validated using in situ measurements of small-R jets [78].
are corrected to be, on average, equal to the particle-level jet mass, p T , and pseudorapidity using MC simulations [18,83]. An illustration of trimming is given in Figure 4 of Ref. [18].
The C/A R = 1.5 jets are required to satisfy p T > 200 GeV. These jets are used as input to the HEPTopTagger, which employs an internal pile-up suppression, and are therefore left ungroomed. For trimmed anti-k t R = 1.0 jets, the minimum p T is raised to 350 GeV to reduce the fraction of jets not containing all top-quark decay products due to the smaller jet radius parameter. All large-R jets must satisfy |η| < 2.0.
The missing transverse momentum is calculated from the vector sum of the transverse energy of clusters in the calorimeters, and it is corrected for identified electrons, muons and anti-k t R = 0.4 jets, for which specific object-identification criteria are applied [84]. The magnitude of the missing transverse momentum is denoted by E miss T .

Event selection
The data used in this paper were taken in 2012 at a centre-of-mass energy √ s = 8 TeV and correspond to an integrated luminosity of 20.3 fb −1 [85]. Data are used only if all subsystems of the detector as well as the trigger system were fully functional. Baseline quality criteria are imposed to reject contamination from detector noise, non-collision beam backgrounds, and other spurious effects. Events are required to have at least one reconstructed primary vertex with at least five associated ID tracks, each with a p T larger than 400 MeV. This vertex must be consistent with the LHC beam spot [72]. In addition, all anti-k t R = 0.4 jets in the event which have p T > 20 GeV are required to satisfy the "looser" quality criteria discussed in detail in Ref. [78]; otherwise the event is rejected.
Two different event samples are used to study the performance of top-tagging algorithms in data: a signal sample enriched in hadronically decaying top quarks and a background sample consisting mainly of multijet events.

Signal sample
For the signal sample, a selection of tt events in the lepton+jets channel is used, in which one of the W bosons from tt → W + bW −b decays hadronically and the other W boson decays leptonically. The selection is performed in the muon channel and the electron channel.
The selection criteria for the muon and electron channels differ only in the requirements imposed on the reconstructed leptons. For the muon channel, the events are required to pass at least one of two muon triggers, where one is optimized to select isolated muons with a transverse momentum of at least 24 GeV and the other selects muons with at least 36 GeV without the isolation requirement. Exactly one muon with p T > 25 GeV is required as defined in Section 4.1. Muons are rejected if they are close to an anti-k t R = 0.4 jet that has p T > 25 GeV. The rejection occurs if ∆R(µ, jet) < (0.04 + 10 GeV/p µ T ). Events in the muon channel are rejected if they contain an additional electron candidate.
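The p T-dependent muon-jet overlap criterion can be illustrated with a short sketch. The function names and the representation of jets as (p T, ∆R to muon) pairs are assumptions made for this example; the cone formula and the 25 GeV jet threshold are the ones stated above.

```python
def muon_jet_overlap_cone(muon_pt_gev: float) -> float:
    """Cone size 0.04 + 10 GeV / pT(mu) used for the muon-jet overlap check."""
    return 0.04 + 10.0 / muon_pt_gev


def muon_is_rejected(muon_pt_gev, jets, jet_pt_min_gev=25.0):
    """Return True if the muon lies too close to a jet above the pT threshold.

    jets: iterable of (jet_pt_gev, delta_r_to_muon) pairs (hypothetical layout).
    """
    cone = muon_jet_overlap_cone(muon_pt_gev)
    return any(jet_pt > jet_pt_min_gev and dr < cone for jet_pt, dr in jets)
```

The cone shrinks from about 0.44 for a 25 GeV muon to about 0.14 for a 100 GeV muon, so a high-p T muon is kept even when it is fairly close to a jet.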
For the electron channel, events are required to pass at least one of two triggers. The first is designed for isolated electrons with p T > 24 GeV and the second trigger requires electrons with p T > 60 GeV without the isolation requirement. Exactly one electron is required with E T > 25 GeV as defined in Section 4.1. An electron-jet overlap removal is applied based on the observation that the electron p T contributes a significant fraction of the p T of close-by anti-k t R = 0.4 jets. Therefore, the electron momentum is subtracted from the jet momentum before kinematic requirements are applied to the jet, so that jets close to an electron often fall below the jet p T threshold. If the electron-subtracted jet still fulfils the kinematic requirements for anti-k t R = 0.4 jets and the electron is still close, the electron is considered not isolated. In this case, the electron is removed from the event and the original non-subtracted jet is kept. Events in the electron channel are rejected if they also contain a muon candidate.
To select events with a leptonically decaying W boson, the following requirements are imposed. The events are required to have missing transverse momentum E miss T > 20 GeV. Additionally, the scalar sum of E miss T and the transverse mass of the leptonic W-boson candidate must satisfy $E_{\mathrm{T}}^{\mathrm{miss}} + m_{\mathrm{T}}^{W} > 60$ GeV, where the transverse mass $m_{\mathrm{T}}^{W} = \sqrt{2 p_{\mathrm{T}} E_{\mathrm{T}}^{\mathrm{miss}} (1 - \cos\Delta\phi)}$ is calculated from the transverse momentum of the lepton, p T , and E miss T in the event. The variable ∆φ is the azimuthal angle between the lepton momentum and the E miss T direction.
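The leptonic W-boson requirements can be sketched in a few lines, using the standard definition of the transverse mass in terms of the lepton p T, E miss T, and ∆φ. The function names are hypothetical, and the thresholds are exposed as parameters whose defaults are illustrative rather than authoritative.

```python
import math


def transverse_mass(lepton_pt_gev, met_gev, delta_phi):
    """Transverse mass of the lepton + missing-momentum system, in GeV."""
    return math.sqrt(2.0 * lepton_pt_gev * met_gev * (1.0 - math.cos(delta_phi)))


def passes_leptonic_w_selection(lepton_pt_gev, met_gev, delta_phi,
                                met_min_gev=20.0, sum_min_gev=60.0):
    """Apply the E_T^miss cut and the cut on E_T^miss + m_T^W.

    The default thresholds are illustrative assumptions of this sketch.
    """
    if met_gev <= met_min_gev:
        return False
    m_t = transverse_mass(lepton_pt_gev, met_gev, delta_phi)
    return met_gev + m_t > sum_min_gev
```

A back-to-back lepton and E miss T of 40 GeV each give a transverse mass of 80 GeV, whereas a collinear configuration gives zero, so the combined requirement suppresses events in which the missing momentum points along the lepton.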
To reduce contamination from W+jets events, each event must contain at least two b-tagged anti-k t R = 0.4 jets with p T > 25 GeV and |η| < 2.5. A neural-network-based b-tagging algorithm [86] is employed, which uses information on the impact parameters of the tracks associated with the jet, the secondary vertex, and the decay topology as its input. The operating point chosen for this analysis corresponds to a b-tagging identification efficiency of 70% in simulated tt events. In tt events with high-momentum top quarks, the direction of the b-quark from the leptonic decay of a top quark is often close to the lepton direction. Hence, at least one b-tagged jet is required to be within ∆R = 1.5 of the lepton direction. A second b-tag away from the lepton is required that fulfils ∆R(lepton, b-tag) > 1.5. This b-tagged jet is expected to originate from the b-quark from the hadronic top-quark decay, and is expected to be well separated from the decay products of the leptonically decaying top quark.
Each event is required to contain at least one large-R jet that fulfils the requirement ∆R(lepton, large-R jet) > 1.5. This criterion increases the probability that the large-R jet originates from a hadronically decaying top quark. The large-R jet has to fulfil |η| < 2 and exceed a p T threshold. The jet algorithm, the radius parameter, and the p T threshold depend on the top tagger under study. An overview is given in Table 1. The top taggers are introduced in Section 5, where the choice of particular large-R jet types is also motivated. If several large-R jets in an event satisfy these criteria, only the jet with the highest p T is considered. This choice does not bias the measurements presented in this paper, because the top-tagging efficiencies and misidentification rates are measured as a function of the large-R jet kinematics.
In simulated events containing top quarks, large-R jets are classified as matched or not matched to a hadronically decaying top quark. The classification is based on the distance ∆R between the axis of the large-R jet and the flight direction of a generated hadronically decaying top quark. The top-quark flight direction at the top-quark decay vertex is chosen, so as to take into account radiation from the top quark changing its direction. Matched jets are those with ∆R smaller than a predefined value R match , while not-matched jets are those with ∆R > R match . The radius R match is 0.75 for the anti-k t R = 1.0 jets and 1.0 for the C/A R = 1.5 jets. Changing R match to 1.0 for the anti-k t R = 1.0 jets has a negligible impact on the size of the not-matched tt contribution (less than 1%). Alternative matching schemes were tested but did not show improved matching properties, such as a higher matching efficiency.

Table 1: Large-R jet types used by the top taggers. Columns: tagger, jet algorithm, grooming, radius parameter, p T range, |η| range.
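The matching criterion can be sketched as follows. The record layout, the jet-collection labels, and the function names are assumptions made for this illustration; only the R match values come from the text. The text does not say how ∆R exactly equal to R match is treated, so the sketch counts that case as not matched.

```python
import math

# Matching radii quoted in the text, keyed by hypothetical jet-collection labels.
R_MATCH = {"antikt_r10_trimmed": 0.75, "ca_r15": 1.0}


def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance in eta-phi space, with the phi difference wrapped to [-pi, pi]."""
    dphi = math.remainder(phi1 - phi2, 2.0 * math.pi)
    return math.hypot(eta1 - eta2, dphi)


def classify_large_r_jet(jet, hadronic_tops, r_match):
    """Label a large-R jet as "matched" or "not matched".

    jet, hadronic_tops[i]: dicts with "eta" and "phi"; for the top quarks these
    describe the flight direction at the decay vertex (hypothetical layout).
    """
    for top in hadronic_tops:
        if delta_r(jet["eta"], jet["phi"], top["eta"], top["phi"]) < r_match:
            return "matched"
    return "not matched"
```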
Distributions for the signal selection with at least one trimmed anti-k t R = 1.0 jet with p T > 350 GeV are shown in Figure 1. The top-quark purity in this sample is 97%, with a small background contribution from W+jets production (3%). Single-top production accounts for 4% of the event yield and the tt prediction accounts for 93% (62% from matched and 31% from not-matched events). Not-matched tt events are an intrinsic feature of the signal selection. With different selection criteria the fraction of not-matched tt events varies, as does the total number of selected events. The chosen signal selection in the lepton+jets channel was found to be a good compromise between a reduced fraction of not-matched tt events and a sizeable number of selected events.
The mass and the transverse momentum of the highest-p T trimmed anti-k t R = 1.0 jet are shown in Figures 1(a) and 1(b), respectively. The systematic uncertainties shown in these plots are described in detail in Section 6. The mass distribution shows three peaks: one at the top-quark mass, a second at the W-boson mass and a third around 35 GeV. According to simulation, which describes the measured distribution within uncertainties, the top-quark purity in the region near the top-quark mass is very high, with the largest contribution being matched tt. The peak at the position of the W-boson mass originates from hadronically decaying top quarks where the b-jet from the decay is not contained in the large-R jet.
Even smaller masses are obtained if one of the decay products of the hadronically decaying W boson is not contained in the large-R jet or if only one top-quark-decay product is captured in the large-R jet. In these cases, a small mass is obtained due to the kinematic requirements imposed during trimming. The fraction of not-matched tt increases for decreasing large-R jet mass indicating a decreasing fraction of jets with a close-by hadronically decaying top quark. Only a small fraction of the peak at small mass is due to matched tt. The large-R jet p T exhibits a falling spectrum, and the application of the sequential p T reweighting to the simulation (cf. Section 3) yields a good description of the data.
The dominant systematic uncertainties in Figure 1 result from uncertainties in the large-R jet energy scale (JES), the PDF, and the tt generator. The contributions from these sources are approximately equal in size, except for large-R jets with p T > 500 GeV where the choice of tt generator dominates. These uncertainties affect mostly the normalization of the distributions. For the PDF and tt generator uncertainties, this normalization uncertainty comes about as follows: while the total tt cross section is fixed when the different MC event samples are compared, the p T dependence of the cross section varies from sample to sample, leading to a change in normalization for the phase space considered here (p T > 350 GeV).
Distributions for events fulfilling the signal selection with at least one C/A R = 1.5 jet with p T > 200 GeV, to be used in the HEPTopTagger studies, are shown in Figure 2. According to the simulation, the top quark purity in this sample is 97%. The only non-negligible background process is W+jets production (3%). The tt prediction is split into a matched part (59%) and a not-matched part (29%). Single-top production contributes 9% to the total event yield. The mass of the highest-p T C/A R = 1.5 jet with p T > 200 GeV is shown in Figure 2(a) and it exhibits a broad peak around 190 GeV. The large-R-jet mass distributions from not-matched tt, single-top production, and W+jets production have their maxima at smaller values than the distribution from matched tt. No distinct W-boson peak is visible, because the C/A R = 1.5 jets are ungroomed. The p T spectrum of the highest-p T C/A R = 1.5 jet is smoothly falling and well described by simulation after the sequential p T reweighting is applied (Figure 2(b)).
The C/A R = 1.5 jet distributions are described by the simulation within the uncertainties. The systematic uncertainties are slightly smaller than those in the distributions shown in Figure 1 for anti-k t R = 1.0 jets with p T > 350 GeV because the tt modelling uncertainties increase with large-R jet p T . The uncertainties in the large-R JES, the b-tagging efficiency, the prediction of the tt cross section, and tt modelling uncertainties from the choice of generator, parton shower, and PDF set all contribute to the systematic uncertainty in the large-R-jet mass distribution. The uncertainty from the choice of generator increases in the high-mass tail, which is particularly sensitive to additional radiation close to the hadronically decaying top quark. The modelling uncertainties for the large-R-jet p T distribution increase with p T due to increasing uncertainties from the large-R JES, the b-tagging efficiency, and the tt modelling uncertainties. The increase of the tt modelling uncertainty with large-R-jet p T is an observation consistent with Figure 1. Distributions of other kinematic variables are also well described by the simulation and are shown in Appendix A.

Figure 1: Detector-level distributions of variables reconstructed in events passing the signal-sample selection (tt) with at least one trimmed anti-k t R = 1.0 jet with p T > 350 GeV. Shown in (a) is the mass and in (b) the transverse momentum of the highest-p T anti-k t R = 1.0 jet. The vertical error bar indicates the statistical uncertainty of the measurement. Also shown are distributions for simulated SM contributions with systematic uncertainties (described in Section 6) indicated as a band. The tt prediction is split into a matched part for which the large-R jet axis is within ∆R = 0.75 of the flight direction of a hadronically decaying top quark and a not matched part for which this criterion does not hold. The ratio of measurement to prediction is shown at the bottom of each subfigure and the error bar and band give the statistical and systematic uncertainties of the ratio, respectively. The impacts of experimental and tt modelling uncertainties are shown separately for the ratio.

Background sample
Due to the high threshold of the unprescaled jet triggers, such triggers do not provide an unbiased background sample of large-R jets from multijet production. Therefore, the misidentification rate is measured in a multijet sample collected with single-electron triggers, where the event is triggered by an object which in the detailed offline analysis fails the electron-identification requirements.
For the electron candidate used at the trigger level, the requirements on the pseudorapidity of the cluster of calorimeter cells are the same as for reconstructed electrons (cf. Section 4.1). Events with an offline reconstructed electron satisfying loose identification requirements [71] (these loose identification requirements do not include isolation criteria) are rejected to reduce contributions from electroweak processes. Only large-R jets well separated from the electron-trigger candidate are studied. This selection provides a sample that is largely dominated by multijet production, for which the electron-trigger candidate is a jet misidentified as an electron. Events are required to be selected by the trigger for electrons with p T > 60 GeV and not by the trigger for isolated electrons with a threshold of 24 GeV (described in Section 4.2.1). Not using the isolated electron trigger reduces top-quark contamination in the selected jet sample. The fraction of tt events before requiring a tagged top candidate is negligible. After requiring a tagged top candidate, the tt events are subtracted for the top taggers for which they present a non-negligible part of the sample, as detailed in Section 8.2.

Figure 2: Detector-level distributions of (a) the mass and (b) the transverse momentum of the highest-p T C/A R = 1.5 jet in events passing the signal-sample selection (tt) with at least one C/A R = 1.5 jet with p T > 200 GeV. The vertical error bar indicates the statistical uncertainty of the measurement. Also shown are distributions for simulated SM contributions with systematic uncertainties (described in Section 6) indicated as a band. The tt prediction is split into a matched part for which the large-R jet axis is within ∆R = 1.0 of the flight direction of a hadronically decaying top quark and a not matched part for which this criterion does not hold. The ratio of measurement to prediction is shown at the bottom of each subfigure and the error bar and band give the statistical and systematic uncertainties of the ratio, respectively. The impacts of experimental and tt modelling uncertainties are shown separately for the ratio.
At least one large-R jet is required with a jet axis separated from the electron-trigger object by ∆R > 1.5. The algorithm, radius parameter, and p T threshold of the jet depend on the particular top-tagging algorithm under study (see Table 1). If several large-R jets satisfying these criteria are found, only the jet with the highest p T is considered for the study of the misidentification rate. This choice does not bias the measurements, because the misidentification rate is measured as a function of the large-R-jet p T .

Top-tagging techniques
Top tagging classifies a given large-R jet as a top jet if its substructure satisfies certain criteria. This paper examines several top-tagging methods, which differ in their substructure analysis and which are described in the following subsections.
Due to the different substructure criteria applied, the methods have different efficiencies for tagging signal jets and different misidentification rates for background jets. High efficiency is obtained for loose criteria and implies a high misidentification rate. The performance of the taggers in terms of efficiencies and misidentification rates is provided in Section 7.1.

Substructure-variable taggers
The choice of trimmed anti-k t R = 1.0 jets (as defined in Section 4.1) for substructure-based analyses has been previously studied in detail [18], including comparisons of different grooming techniques and parameters. The following jet-substructure variables are used for top tagging in this analysis:
• trimmed mass - The mass, m, of the trimmed anti-k t R = 1.0 jets is less susceptible to energy depositions from pile-up and the underlying event than the mass of the untrimmed jet. On average, large-R jets containing top-quark decay products have a larger mass than background jets.
• k t splitting scales - The k t splitting scales [87] are a measure of the scale of the last recombination steps in the k t algorithm, which clusters high-momentum and large-angle proto-jets last. Hence, the k t splitting scales are sensitive to whether the last recombination steps correspond to the merging of the decay products of massive particles. They are determined by reclustering the constituents of the trimmed large-R jet with the k t algorithm and are defined as $\sqrt{d_{ij}} = \min(p_{\mathrm{T}i}, p_{\mathrm{T}j}) \times \Delta R_{ij}$, in which ∆R i j is the distance between two subjets i and j in η-φ space, and p Ti and p T j are the corresponding subjet transverse momenta. Subjets merged in the last k t clustering step provide the √ d 12 observable, and √ d 23 is the splitting scale of the second-to-last merging. The expected value of the first splitting scale √ d 12 for hadronic top-quark decays captured fully in a large-R jet is approximately m t /2, where m t is the top-quark mass. The second splitting scale √ d 23 targets the hadronic decay of the W boson with an expected value of approximately m W /2. The use of the splitting scale for W-boson tagging in 8 TeV ATLAS data is explored in Ref. [88]. Background jets initiated by hard gluons or light quarks tend to have smaller values of the splitting scales and exhibit a steeply falling spectrum.
• N-subjettiness - The N-subjettiness variables τ N [89,90] quantify how well jets can be described as containing N or fewer subjets. The N subjets found by an exclusive k t clustering of the constituents of the trimmed large-R jet define axes within the jet. The quantity τ N is given by the p T -weighted sum of the distances of the constituents from the subjet axes:

$$\tau_N = \frac{1}{d_0} \sum_k p_{\mathrm{T}k} \times \Delta R_k^{\min}, \quad \text{with} \quad d_0 \equiv \sum_k p_{\mathrm{T}k} \times R, \tag{2}$$

in which $p_{\mathrm{T}k}$ is the transverse momentum of constituent k, $\Delta R_k^{\min}$ is the distance between constituent k and the axis of the closest subjet, and R is the radius parameter of the large-R jet. The ratio τ 3 /τ 2 (denoted τ 32 ) provides discrimination between large-R jets formed from hadronically decaying top quarks with high transverse momentum (top jets), which have a 3-prong subjet structure (small values of τ 32 ), and non-top jets with two or fewer subjets (large values of τ 32 ). Similarly, the ratio τ 2 /τ 1 ≡ τ 21 is used to separate large-R jets with a 2-prong structure (hadronic decays of Z or W bosons) from jets with only one hard subjet, such as those produced from light quarks or gluons.
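As a rough numerical illustration of the splitting-scale definition, √d_ij = min(p_Ti, p_Tj) × ∆R_ij, the following Python sketch evaluates it for one pair of subjets. The function names and the example kinematics are hypothetical and are not taken from the analysis. In the measurement the two subjets would come from the k t reclustering of the trimmed jet constituents, which is not reproduced here.

```python
import math

def delta_r(eta1, phi1, eta2, phi2):
    """Distance in eta-phi space; the phi difference is wrapped into [-pi, pi)."""
    dphi = (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(eta1 - eta2, dphi)

def kt_splitting_scale(subjet_i, subjet_j):
    """sqrt(d_ij) = min(pT_i, pT_j) * DeltaR_ij for subjets given as (pT, eta, phi)."""
    pt_i, eta_i, phi_i = subjet_i
    pt_j, eta_j, phi_j = subjet_j
    return min(pt_i, pt_j) * delta_r(eta_i, phi_i, eta_j, phi_j)

# Hypothetical pair of subjets (pT in GeV) merged in the last clustering step.
sqrt_d12 = kt_splitting_scale((250.0, 0.0, 0.0), (150.0, 0.3, 0.4))
print(f"sqrt(d12) = {sqrt_d12:.1f} GeV")  # 150 GeV * DeltaR of 0.5 = 75.0 GeV
```

With these toy inputs the value lies close to m t /2, the scale expected for a fully contained hadronic top-quark decay.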
The variable τ 21 is studied in the context of W-boson tagging with the ATLAS and CMS detectors in Ref. [88] and Ref. [91], respectively. A method that distinguishes hadronically decaying high-p T Z bosons from W bosons is studied in Ref. [92].
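The N-subjettiness ratio can be illustrated with a small sketch that evaluates τ N as the p T -weighted sum of constituent distances to the nearest axis, normalized by the summed constituent p T times R. It assumes that the subjet axes are already known. Here they are placed by hand on the hard constituents, whereas the analysis obtains them from an exclusive k t clustering. All names and numbers are hypothetical.

```python
import math

def delta_r(eta1, phi1, eta2, phi2):
    """Distance in eta-phi space; the phi difference is wrapped into [-pi, pi)."""
    dphi = (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(eta1 - eta2, dphi)

def n_subjettiness(constituents, axes, radius=1.0):
    """tau_N for constituents (pT, eta, phi) and N axes (eta, phi)."""
    d0 = sum(pt for pt, _, _ in constituents) * radius
    weighted = sum(
        pt * min(delta_r(eta, phi, a_eta, a_phi) for a_eta, a_phi in axes)
        for pt, eta, phi in constituents
    )
    return weighted / d0

# Toy jet: three hard prongs plus one soft constituent (pT in GeV).
constituents = [(100.0, 0.0, 0.0), (100.0, 0.4, 0.0), (100.0, 0.0, 0.4), (20.0, 0.1, 0.0)]
axes_3 = [(0.0, 0.0), (0.4, 0.0), (0.0, 0.4)]   # hand-placed axes (assumption)
axes_2 = [(0.0, 0.0), (0.4, 0.0)]

tau3 = n_subjettiness(constituents, axes_3)
tau2 = n_subjettiness(constituents, axes_2)
tau32 = tau3 / tau2
print(f"tau3 = {tau3:.5f}, tau2 = {tau2:.5f}, tau32 = {tau32:.3f}")
```

For this 3-prong toy configuration τ 32 comes out small, as expected for top jets. Removing one of the hard prongs from the constituent list would push the ratio towards larger values.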
Distributions of the k t splitting scales and N-subjettiness variables for large-R jets in a top-quark-enriched event sample (cf. Section 4.2.1) are shown in Figure 3. The √ d 12 distribution shows a broad shoulder at values above 40 GeV and the matched tt contribution exhibits a peak near m t /2 as expected. For the not-matched tt contribution and the W+jets process, √ d 12 takes on smaller values and the requirement of a minimum value of √ d 12 can be used to increase the ratio of top-quark signal to background (S /B). For the second splitting scale √ d 23 , signal and background are less well separated than for √ d 12 , but √ d 23 also provides signal-background discrimination. The distribution of τ 32 shows the expected behaviour, with the matched tt contribution having small values, because the hadronic top-quark decay is better described by a three-subjet structure than by two subjets. For not-matched tt and W+jets production, the distribution peaks at ≈ 0.75. Requiring a maximum value of τ 32 increases the signal-to-background ratio. For τ 21 , the separation of signal and background is less pronounced, but values above 0.8 are obtained primarily for background. Thus, τ 21 also provides signal-background discrimination.
The distributions are well described by the simulation of SM processes within systematic uncertainties, which are described in Section 6. For all distributions shown, the large-R JES, tt generator, and parton-shower uncertainties give sizeable contributions, as do the uncertainties of the modelling of the respective substructure variables shown. The uncertainties for √ d 12 and √ d 23 are dominated by the tt generator and ISR/FSR uncertainties, respectively, for low values of the substructure variable. Low values of these variables are mainly present for not-matched tt, for which the modelling is particularly sensitive to the amount of high-p T radiation in addition to tt, because these large-R jets do not primarily originate from hadronically decaying top quarks. The modelling of additional radiation in tt events is also an important uncertainty for the number of events at low values of τ 32 and τ 21 , for which the tt ISR/FSR uncertainties dominate the total uncertainty. The modelling of the substructure variables themselves dominates for high values of √ d 12 , √ d 23 , τ 32 , and τ 21 .
Different top taggers, based on these substructure variables, are defined (Table 2). A large-R jet is tagged as a top jet by the corresponding tagger if the top-tagging criteria are fulfilled. Substructure tagger III was optimized for a search for tt resonances in the single-lepton channel [17]. Compared to other taggers, it has a rather high efficiency and misidentification rate because the analysis required only little background rejection, as the background was already much reduced by a lepton requirement. Removing the mass requirement or the requirement on √ d 12 further increases the efficiency (taggers I and II). The W′ top tagger was optimized for a search for tb resonances (W′) in the fully-hadronic decay mode [2], where a high background suppression is required. The efficiency of this tagger is therefore lower than that of taggers I to III. Taggers IV and V are introduced to study the effect of a requirement on √ d 23 in addition to the requirements of tagger III.
Distributions of the p T and mass of trimmed anti-k t R = 1.0 jets after applying the six different taggers based on substructure variables are shown in Figures 4 and 5, respectively, for events passing the full signal selection of Section 4.2.1. While the p T spectra look similar after tagging by the different taggers, the mass spectra differ significantly due to the different substructure-variable requirements imposed by the taggers. Taggers II to V require the mass to be greater than 100 GeV, and this cut-off is visible in the distributions. The mass distribution after the √ d 12 > 40 GeV requirement of Tagger I (Figure 5(a)) differs from that of the pre-tag distribution (Figure 1 [...] the W′ top tagger on the mass spectrum is visible by comparing Figure 5(f) with the pre-tag distribution (Figure 1(a)). The prominent peak around the top-quark mass shows that the sample after tagging is pure in jets which contain all three decay products of the hadronic top-quark decay.
All distributions are described by the MC simulation within uncertainties, indicating that the kinematics and the substructure of tagged large-R jets are well modelled by simulation. The uncertainty in the large-R jet p T after requiring a top tag is dominated by the large-R JES and the parton-shower and tt generator uncertainties. Hence, the same uncertainties dominate in the different regions of the p T spectrum as before requiring a top tag (Section 4.2.1). The uncertainty in the large-R-jet mass distributions is dominated by the jet-mass scale uncertainty for all substructure taggers. The large-R JES as well as tt modelling uncertainties also contribute, but have a smaller impact. For all substructure taggers, the uncertainties in the substructure variables used in the respective taggers have a non-negligible impact, in particular for low large-R jet masses, i.e. in the regime which is sensitive to the modelling of not-matched tt and extra radiation.
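The cut-based logic of the substructure taggers can be sketched in a few lines of Python. Only the requirements quoted in the text are encoded: √ d 12 > 40 GeV for Tagger I and a trimmed mass above 100 GeV for Taggers II to V. That Tagger III combines exactly these two requirements is inferred from the description of taggers I and II as relaxed versions of it. The √ d 23 thresholds of Taggers IV and V and the W′ top tagger criteria are defined in Table 2 and are not reproduced here, so the `sqrt_d23_min` argument is a hypothetical placeholder.

```python
MASS_MIN = 100.0      # GeV, trimmed-mass requirement quoted for Taggers II to V
SQRT_D12_MIN = 40.0   # GeV, first splitting-scale requirement quoted for Tagger I

# Which requirements each tagger applies (Tagger III inferred, see lead-in).
TAGGERS = {
    "I": {"use_mass": False, "use_d12": True},
    "II": {"use_mass": True, "use_d12": False},
    "III": {"use_mass": True, "use_d12": True},
}

def is_top_tagged(jet, tagger, sqrt_d23_min=None):
    """Apply the cuts of one tagger to a jet given as a dict of values in GeV.

    sqrt_d23_min is a placeholder for the extra requirement of Taggers IV and V;
    its value must be supplied by the caller and is not taken from the paper.
    """
    config = TAGGERS[tagger]
    if config["use_mass"] and not jet["mass"] > MASS_MIN:
        return False
    if config["use_d12"] and not jet["sqrt_d12"] > SQRT_D12_MIN:
        return False
    if sqrt_d23_min is not None and not jet["sqrt_d23"] > sqrt_d23_min:
        return False
    return True

# Hypothetical jets: one top-like, one light but with a large first splitting.
top_like = {"mass": 170.0, "sqrt_d12": 80.0, "sqrt_d23": 30.0}
light = {"mass": 60.0, "sqrt_d12": 50.0, "sqrt_d23": 5.0}
for name in TAGGERS:
    print(name, is_top_tagged(top_like, name), is_top_tagged(light, name))
```

The toy jets show the efficiency ordering described above: a jet that fails the mass requirement can still pass Tagger I, while Tagger III accepts only jets passing both requirements.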

Shower Deconstruction
In Shower Deconstruction (SD) [19,20], likelihoods are separately calculated for the scenario that a given large-R jet originates from a hadronic top-quark decay and for the scenario that it originates from a background process. The likelihoods are calculated from theoretical hypotheses, which for the application in this paper correspond to the SM. The signal process is the hadronic decay of a top quark and for the background process, the splitting of hard gluons into qq is considered. For signal and background, the effect of the parton shower is included in the calculation of the likelihood. Subjets of the large-R jet are used as proxies for partons in the underlying model and a weight is calculated for each possible shower that leads to the observed subjet configuration. This weight is proportional to the probability that the assumed initial particle generates the final configuration, taking into account the SM amplitude for the underlying hard process and the Sudakov form factors for the parton shower. A discriminating variable χ is calculated as the ratio of the sum of the signal-hypothesis weights to the sum of the background-hypothesis weights. For a set {p κ i } of N observed subjet four-momenta p κ i , in which i ∈ [1, N], the value of χ is given by

$$\chi(\{p^\kappa_i\}) = \frac{\sum_{\mathrm{perm.}} P(\{p^\kappa_i\}|\mathrm{signal})}{\sum_{\mathrm{perm.}} P(\{p^\kappa_i\}|\mathrm{background})}, \tag{3}$$

with $P(\{p^\kappa_i\}|\mathrm{signal})$ being the weight for the hypothesis that a signal process leads to the observed configuration {p κ i }. The sum in the numerator is over all showers in which signal processes lead to this configuration. Similarly, the denominator sums the weights for the background processes. If χ is larger than a certain cut value, the large-R jet is tagged as a top jet. By adjusting the threshold value for χ, the tagging efficiency can be changed continuously.
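The tagging decision built on χ can be sketched as follows, assuming the per-shower weights have already been computed. The calculation of the weights themselves, which involves the SM amplitudes and Sudakov factors, is not attempted here. The function names and example weights are hypothetical, and returning no value for a vanishing sum is a choice made for this sketch.

```python
import math

def ln_chi(signal_weights, background_weights):
    """Natural log of the ratio of summed signal to summed background weights.

    Returns None when either sum vanishes, e.g. for jets whose signal weight
    is zero; such jets cannot be tagged (convention of this sketch).
    """
    signal_sum = sum(signal_weights)
    background_sum = sum(background_weights)
    if signal_sum <= 0.0 or background_sum <= 0.0:
        return None
    return math.log(signal_sum) - math.log(background_sum)

def is_sd_tagged(signal_weights, background_weights, ln_chi_min):
    """Tag the jet if ln(chi) exceeds the chosen threshold."""
    value = ln_chi(signal_weights, background_weights)
    return value is not None and value > ln_chi_min

# Hypothetical weights for one large-R jet (one entry per shower history).
signal = [2e-10, 1e-10]
background = [1e-11, 5e-12]
print(ln_chi(signal, background))              # ln(20), roughly 3.0
print(is_sd_tagged(signal, background, 2.5))   # threshold chosen by the caller
```

Raising `ln_chi_min` lowers both the efficiency and the misidentification rate, which is how a continuous set of working points is obtained.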
The inputs to SD are the four-momenta of the subjets in the large-R jet. SD has an internal mechanism to suppress pile-up, which is based on the fact that the weights of the likelihood ratio contain the probability that a subset of the subjets did not originate from the hard interaction but are the result of pile-up. Details can be found in Refs. [19,20]. In this paper, trimmed anti-k t R = 1.0 jets are used as input to SD, but the subjets of the untrimmed jet are fed to the SD algorithm, and the kinematic properties (p T , η) of the trimmed jet are only used to preselect the signal sample. This procedure avoids interference of the trimming with the SD-internal pile-up suppression.
To obtain the best SD performance, the smallest structures in the flow of particles should be resolved by the subjets used as input to SD. Therefore, C/A R = 0.2 subjets are used, as they are the jets with the smallest radius parameter for which ATLAS calibrations and calibration uncertainties have been derived [18,76]. Only the nine hardest subjets of the large-R jet are used in the present study to reduce the processing time per event, which grows with the number of subjets considered in the calculation. The signal weight is zero for large-R jets with fewer than three subjets because a finite signal weight requires the existence of at least three subjets which are identified with the three partons from the top-quark decay. To speed up the computation of the signal weights, the signal weight is set to zero if no combination of at least three subjets can be found that has an invariant mass within a certain range around the top-quark mass. The rationale for this mass requirement is that subjet combinations outside of this mass range would receive only a very small (but finite) weight due to the Breit-Wigner distribution assumed for the signal hypothesis.
Similarly, a subset of the subjets which have a combined invariant mass close to the top-quark mass must give an invariant mass within a given range around the W-boson mass. Due to detector effects, the values of these ranges around the top-quark mass and the W-boson mass must be tuned to optimize the performance and cannot be extracted directly from the model. The values used in this study are a range of 40 GeV around a top-quark mass of 172 GeV and a range of 20 GeV around a W-boson mass of 80.4 GeV. For the background hypothesis, no constraint on the subjet multiplicity is present and also no mass-range requirements are imposed.
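The mass-window preselection for the signal weight can be sketched as a combinatorial search over subjet four-momenta. Several choices below are assumptions of this sketch rather than statements about the SD implementation: the quoted ranges are interpreted as symmetric windows of ±40 GeV and ±20 GeV, the W-boson candidate is taken to be a proper subset of at least two subjets, and four-momenta are represented as (E, px, py, pz) tuples.

```python
import math
from itertools import combinations

M_TOP, TOP_RANGE = 172.0, 40.0   # GeV; range read as +-40 GeV (assumption)
M_W, W_RANGE = 80.4, 20.0        # GeV; range read as +-20 GeV (assumption)
MAX_SUBJETS = 9                  # only the nine hardest subjets are used

def invariant_mass(vectors):
    """Invariant mass of the summed four-momenta (E, px, py, pz)."""
    e, px, py, pz = (sum(v[c] for v in vectors) for c in range(4))
    return math.sqrt(max(e * e - px * px - py * py - pz * pz, 0.0))

def in_window(mass, centre, half_width):
    return abs(mass - centre) <= half_width

def passes_mass_windows(subjets):
    """True if some set of >= 3 subjets is top-like and contains a W-like subset."""
    hardest = sorted(subjets, key=lambda v: math.hypot(v[1], v[2]), reverse=True)
    hardest = hardest[:MAX_SUBJETS]
    for n in range(3, len(hardest) + 1):
        for combo in combinations(hardest, n):
            if not in_window(invariant_mass(combo), M_TOP, TOP_RANGE):
                continue
            for k in range(2, n):  # proper subsets of at least two subjets
                if any(in_window(invariant_mass(sub), M_W, W_RANGE)
                       for sub in combinations(combo, k)):
                    return True
    return False

def symmetric_triplet(energy):
    """Toy input: three massless subjets at 120 degrees in the transverse plane."""
    s = math.sqrt(3.0) / 2.0
    return [(energy, energy, 0.0, 0.0),
            (energy, -0.5 * energy, s * energy, 0.0),
            (energy, -0.5 * energy, -s * energy, 0.0)]

print(passes_mass_windows(symmetric_triplet(57.0)))  # three-subjet mass of 171 GeV
print(passes_mass_windows(symmetric_triplet(10.0)))  # three-subjet mass of 30 GeV
```

Jets failing this check would be assigned a signal weight of zero without evaluating the full set of shower histories, which is the speed-up described above.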
Distributions of the multiplicity and p T of C/A R = 0.2 subjets found in the untrimmed anti-k t R = 1.0 jets from the signal selection are shown in Figure 6. These subjets are used as input to SD and must satisfy the kinematic constraints p T > 20 GeV and |η| < 2.1. The subjet multiplicity of the large-R jet is shown in Figure 6(a). Most of the large-R jets have two or three subjets and only a small fraction have more than four subjets. Of the large-R jets, 41% have fewer than three subjets and are hence assigned an SD signal weight of zero. The simulation describes the data within statistical and systematic uncertainties, indicating that the inputs to the SD algorithm, the subjet multiplicity and kinematics, are well described. For two and three subjets, the uncertainty is dominated by uncertainties in the large-R JES and the PDF. For one subjet, and likewise for four or more subjets, the uncertainty is dominated by the subjet energy-resolution uncertainty. The source of most events with only one subjet is not-matched tt, for which the modelling of additional low-p T radiation exceeding the minimum subjet p T depends on the precision of the subjet energy scale and resolution. The same effect is present for four or more subjets, because hadronically decaying top quarks are expected to give rise to a distinct three-subjet structure and additional subjets may be due to additional low-p T radiation close to the top quark.
The p T distributions of the three hardest subjets are shown in Figures 6(b)-6(d). The p T of the highest-p T subjet is larger than ≈ 100 GeV and has a broad peak from 200 to 400 GeV. The shoulder at 370 GeV is caused by large-R jets from not-matched tt and W+jets background, as many of these jets have only one subjet, as shown in Figure 6(a), and in that case the single subjet carries most of the momentum of the large-R jet, i.e. most of the momentum is concentrated in the core of the jet. Therefore, the shoulder at 370 GeV is due to the requirement p T > 350 GeV for the large-R jet. The systematic uncertainty in the region mainly populated by jets with one dominant subjet (p T > 350 GeV) or by jets with many subjets (100 < p T < 150 GeV) in Figure 6(a) has sizeable contributions from the modelling of the subjet properties, here the subjet energy scale. While the large-R JES also contributes for 100 < p T < 150 GeV, it is dominant for jets mainly showing the expected distinct two-subjet or three-subjet structure (150 < p T < 350 GeV). For p T > 500 GeV, the largest uncertainty results from the difference between the tt generators, as this is the main source of uncertainties for the modelling of tt events in the upper range of the p T spectrum studied.

Figure 6: Detector-level distributions of C/A R = 0.2 subjets found in the untrimmed anti-k t R = 1.0 jet corresponding to the highest-p T trimmed anti-k t R = 1.0 jet with p T > 350 GeV in the signal selection: (a) the subjet multiplicity, and (b) the p T of the highest-p T subjet, (c) the second-highest-p T subjet, and (d) the third-highest-p T subjet. The vertical error bar indicates the statistical uncertainty of the measurement. Also shown are distributions for simulated SM contributions with systematic uncertainties (described in Section 6) indicated as a band. The tt prediction is split into a matched part for which the large-R jet axis is within ∆R = 0.75 of the flight direction of a hadronically decaying top quark and a not matched part for which this criterion does not hold. The ratio of measurement to prediction is shown at the bottom of each subfigure and the error bar and band give the statistical and systematic uncertainties of the ratio, respectively. The impacts of experimental and tt modelling uncertainties are shown separately for the ratio.

For the second-highest subjet p T , the background distribution peaks near the 20 GeV threshold. These are subjets in large-R jets with only two subjets where the highest-p T subjet carries most of the large-R jet momentum. These asymmetric configurations, where the highest-p T subjet carries a much larger p T than the second-highest-p T subjet, are seen mainly for the not-matched tt and W+jets processes. The acceptance limit at 20 GeV cuts into the p T distributions of all but the highest-p T subjet, as also seen for the distribution of the third-highest-p T subjet. The uncertainties in the distributions of the second-highest-p T and third-highest-p T subjet are again dominated by the uncertainty of the subjet modelling, i.e. the subjet energy-resolution and energy-scale modelling, for low values of p T (mostly populated by not-matched tt events) and for high values of p T . For intermediate values (60-150 GeV for the second-highest-p T subjet and 40-100 GeV for the third-highest-p T subjet), where jets with a distinct top-like subjet structure dominate the distributions, the large-R JES uncertainty dominates. If 40 < p T < 60 GeV for the second-highest subjet, the large-R JES uncertainty contributes significantly, but does not dominate due to significant contributions from the PDF and generator uncertainties.
The following invariant masses of combinations of the C/A R = 0.2 subjets are shown in Figure 7 for events fulfilling the signal selection: the mass of the two highest-p T subjets, m 12 , the mass of the second-highest-p T and third-highest-p T subjet, m 23 , and the mass of the three hardest subjets, m 123 . These distributions illustrate some of the masses built from subjet combinations which are used by SD to reject subjet combinations that lead to masses outside the top-quark and W-boson mass ranges. These distributions are also described by the simulation within statistical and systematic uncertainties and give further confidence in the description of the inputs to the SD algorithm. The uncertainty for large values of m 12 , m 23 and m 123 , i.e. for values larger than 140 GeV, 120 GeV and 165 GeV, respectively, is dominated by the subjet energy-scale uncertainty, consistent with this uncertainty also being dominant for large values of the subjet transverse momenta (Figure 6). The parts of the distributions which are populated with jets showing primarily a distinct top-like substructure again show large contributions from the large-R JES uncertainty (60 < m 12 < 140 GeV, 80 < m 23 < 120 GeV, 135 < m 123 < 165 GeV), where the ISR/FSR and the subjet JES uncertainties also contribute for m 23 . For lower values, the three different invariant masses are all sensitive to radiation effects in a region populated by not-matched tt events, i.e. jets which do not originate from a hadronically decaying top quark. ISR/FSR uncertainties contribute to 20 < m 12 < 30 GeV, the subjet energy resolution contributes significantly to m 23 < 60 GeV and m 123 < 135 GeV, and also the PDF uncertainty has an increasing effect with increasing m 23 for 10 < m 23 < 60 GeV with the uncertainty from the subjet energy resolution decreasing with increasing m 23 . For 20 < m 12 < 30 GeV, the large-R JES uncertainty dominates the total uncertainty together with the ISR/FSR uncertainty.
For m 23 < 10 GeV, the uncertainty is dominated by the uncertainty on the subjet energy resolution and the differences between the tt generators. For 30 < m 12 < 60 GeV, the choice of tt generator and the large-R JES dominate the total uncertainty.
The distributions of the SD weights and the ratio of the weights, i.e. the final discriminant χ (Eq. (3)), are shown in Figure 8 for events fulfilling the signal-selection criteria. For ≈ 60% of the large-R jets, the signal weight is zero because there are fewer than three subjets or the top-quark or W-boson mass-window requirements are not met. These cases are not shown in Figure 8. The natural logarithm of the sum $\sum_{\mathrm{perm.}} P(\{p^\kappa_i\}|\mathrm{signal})$ of all weights obtained with the assumption that the subjet configuration in the large-R jet is the result of a hadronic top-quark decay is shown in Figure 8(a). The logarithm of the sum of all weights for the background hypothesis is shown in Figure 8(b). For the signal hypothesis the distribution peaks between −23 and −21, while for the background hypothesis the peak is at lower values, between −26 and −25. The logarithm of the ratio of the sums of the weights, ln χ, is shown in Figure 8(c). The ln χ distribution is also shown in Figure 8(d) for large-R jet p T > 550 GeV, which defines a different kinematic regime for which the probability to contain all top-quark decay products in the large-R jet is higher than for the lower threshold of 350 GeV. All distributions of SD output variables are described by simulation within the statistical and systematic uncertainties. The subjet energy-resolution uncertainty dominates for low values of the logarithm of the SD signal weight (region < −26), the logarithm of the SD background weight (region < −30) and ln χ (region < 1 in Figure 8 [...]

Figure 8: [...] (d) The same distribution as in (c) but for the requirement that the trimmed large-R jet p T be larger than 550 GeV. The vertical error bar indicates the statistical uncertainty of the measurement. Also shown are distributions for simulated SM contributions with systematic uncertainties (described in Section 6) indicated as a band. The tt prediction is split into a matched part for which the large-R jet axis is within ∆R = 0.75 of the flight direction of a hadronically decaying top quark and a not matched part for which this criterion does not hold. The ratio of measurement to prediction is shown at the bottom of each subfigure and the error bar and band give the statistical and systematic uncertainties of the ratio, respectively. The impacts of experimental and tt modelling uncertainties are shown separately for the ratio.

Distributions of the p T and the mass of anti-k t R = 1.0 jets tagged as top jets by SD using the requirement ln(χ) > 2.5 are shown in Figure 9 for events passing the signal selection. The p T (Figure 9(a)) and the mass (Figure 9(b)) are shown for the trimmed version of the anti-k t R = 1.0 jet. The p T spectrum is smoothly falling and the mass spectrum is peaked at m t . Both distributions are described by the simulation within the uncertainties. The uncertainty of the simulation for p T < 400 GeV is dominated by the uncertainties in the subjet energy scale and on the PDF. From 400 to 500 GeV, important contributions come from the PDF, ISR/FSR, the large-R JES, and the parton shower. Between 500 and 550 GeV, the large-R JES gives the largest contribution. For p T > 550 GeV, the dominant uncertainties are the ones on the PDF and the large-R JES. For masses below 160 GeV, the uncertainty is dominated by the uncertainties in the subjet energy scale and resolution. For masses greater than 210 GeV, the differences between the generators and the PDF uncertainty dominate, consistent with previous figures, where the large-R jet mass distribution receives significant contributions from the generator uncertainty for high mass values. In the mass region 160-210 GeV, multiple sources contribute significantly to the uncertainty.

Figure 9: [...] (c) The mass of the top-quark candidate, where the four-momentum is calculated by taking the weighted average of each signal-hypothesis four-momentum. The vertical error bar indicates the statistical uncertainty of the measurement. Also shown are distributions for simulated SM contributions with systematic uncertainties (described in Section 6) indicated as a band. The tt prediction is split into a matched part for which the large-R jet axis is within ∆R = 0.75 of the flight direction of a hadronically decaying top quark and a not matched part for which this criterion does not hold. The ratio of measurement to prediction is shown at the bottom of each subfigure and the error bar and band give the statistical and systematic uncertainties of the ratio, respectively. The impacts of experimental and tt modelling uncertainties are shown separately for the ratio.

A top-quark mass distribution can be constructed differently, making use of the SD weights. The signal weights are related to the likelihood of a set of subjets to originate from a top-quark decay. For each set of subjets, a combined four-momentum is built by adding the four-momenta of all subjets in the set. A top-quark four-momentum is then reconstructed as a weighted average of the four-momenta of all possible subjet combinations:

$$p^\kappa_{\mathrm{SD}} = \frac{\sum_{\mathrm{perm.}} P(\{p^\kappa_i\}|\mathrm{signal}) \times \sum_{i \in \mathrm{perm.}} p^\kappa(i)}{\sum_{\mathrm{perm.}} P(\{p^\kappa_i\}|\mathrm{signal})}, \tag{4}$$

where $p^\kappa(i)$ is the four-momentum of the i-th subjet. The mass $\sqrt{p^2_{\mathrm{SD}}}$ is shown in Figure 9(c). For the background, this mass takes on values closer to the top-quark mass than in Figure 9(b) because of the use of the signal weights in Eq. (4). Although not directly used in the SD tagging decision, this mass offers a glimpse into the inner workings of SD. The distribution is similar to the distribution of the trimmed jet mass. While the width in the central peak region from 140 to 200 GeV is similar, outliers in the weighted mass are significantly reduced. The distribution is well described by the simulation within statistical and systematic uncertainties. The systematic uncertainties are dominated by the uncertainties in the subjet energy scale and resolution.
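A minimal sketch of the weighted-average construction described in the text: each subjet set contributes its summed four-momentum, weighted by its signal weight, and the mass of the resulting average four-momentum is returned. The representation of four-momenta as (E, px, py, pz) tuples, the function name and the example inputs are assumptions made for illustration.

```python
import math

def sd_candidate_mass(weighted_sets):
    """Mass of the signal-weighted average four-momentum.

    weighted_sets: list of (signal_weight, [four-vectors (E, px, py, pz)]),
    one entry per subjet set. Returns None if all signal weights vanish.
    """
    total_weight = sum(weight for weight, _ in weighted_sets)
    if total_weight <= 0.0:
        return None
    average = [0.0, 0.0, 0.0, 0.0]
    for weight, subjets in weighted_sets:
        for c in range(4):
            # Combined four-momentum of the set, weighted by its signal weight.
            average[c] += weight * sum(v[c] for v in subjets) / total_weight
    e, px, py, pz = average
    return math.sqrt(max(e * e - px * px - py * py - pz * pz, 0.0))

# Two hypothetical subjet sets (GeV), each at rest, with combined masses
# of 170 GeV and 180 GeV and relative signal weights of 3 and 1.
sets = [
    (3.0, [(100.0, 50.0, 0.0, 0.0), (70.0, -50.0, 0.0, 0.0)]),
    (1.0, [(180.0, 0.0, 0.0, 0.0)]),
]
print(sd_candidate_mass(sets))  # 0.75 * 170 + 0.25 * 180 = 172.5
```

Because sets with mass far from the top-quark mass receive small signal weights, their influence on the average is suppressed, which is consistent with the reduced outliers noted above.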

HEPTopTagger
C/A R = 1.5 jets are analysed with the HEPTopTagger algorithm [21,22], which identifies the hard jet substructure and tests it for compatibility with the 3-prong pattern of hadronic top-quark decays. This tagger was developed to find top quarks with p T > 200 GeV and to achieve a high rejection of background, which is largest for low-p T large-R jets. The HEPTopTagger studied in this paper is the original algorithm which does not employ multivariate techniques. An extended version, HEPTopTagger2, has been developed in Ref. [93]. The algorithm makes use of the fact that in C/A jets, large-angle proto-jets are clustered last. The HEPTopTagger has internal parameters that can be changed to optimize the performance, and the settings used in this paper are given in Table 3 and are introduced in the following brief summary of the algorithm.
In the first step, the large-R jet is iteratively broken down into hard substructure objects using a mass-drop criterion [14]. The procedure stops when all substructure objects have a mass below the value m cut . In the second phase, all combinations of three substructure objects are tested for kinematic compatibility with a hadronic top-quark decay. Energy contributions from underlying event and pile-up are removed using a filtering procedure: small distance parameter C/A jets are built from the constituents of the substructure objects using a radius parameter that depends on the distance between these objects but has at most the value R max filt . The constituents of the N filt highest-p T jets found in this way (filter jets) are then clustered into three top-quark subjets using the exclusive C/A algorithm. In the final step, kinematic requirements are applied to differentiate hadronic top-quark decays from background. One of the criteria is that one pair of subjets must have an invariant mass in the range 80.4 GeV × (1 ± f W ) around the W-boson mass, with f W being a parameter of the algorithm. If all criteria are met, the top-quark candidate is built by adding the four-momenta of the N filt highest-p T filter jets. The large-R jet is considered to be tagged if the top-quark-candidate mass is between 140 and 210 GeV and the top-quark-candidate p T is larger than 200 GeV. An illustration of the HEPTopTagger algorithm is given in Figure 6 of Ref. [18].
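The last stage of the algorithm can be sketched as below. Only the criteria spelled out in the text are encoded: the W-boson mass window for one subjet pair, and the mass and p T requirements on the top-quark candidate. The additional kinematic criteria of the HEPTopTagger are not reproduced. The value of f W is given in Table 3 and is treated as a free parameter here, so the value 0.15 in the example is a hypothetical choice. Four-momenta are (E, px, py, pz) tuples.

```python
import math
from itertools import combinations

M_W = 80.4  # GeV

def invariant_mass(vectors):
    """Invariant mass of the summed four-momenta (E, px, py, pz)."""
    e, px, py, pz = (sum(v[c] for v in vectors) for c in range(4))
    return math.sqrt(max(e * e - px * px - py * py - pz * pz, 0.0))

def passes_final_selection(subjets, filter_jets, f_w):
    """Partial HEPTopTagger-style selection (sketch, see lead-in for omissions).

    subjets:     the three exclusive top-quark subjets
    filter_jets: the N_filt highest-pT filter jets forming the candidate
    f_w:         relative half-width of the W-boson mass window
    """
    low, high = M_W * (1.0 - f_w), M_W * (1.0 + f_w)
    if not any(low < invariant_mass(pair) < high
               for pair in combinations(subjets, 2)):
        return False
    candidate = tuple(sum(v[c] for v in filter_jets) for c in range(4))
    candidate_pt = math.hypot(candidate[1], candidate[2])
    candidate_mass = invariant_mass([candidate])
    return 140.0 < candidate_mass < 210.0 and candidate_pt > 200.0

# Hypothetical massless subjets (GeV): a b-like subjet and a W-like pair.
subjets = [(250.0, 250.0, 0.0, 0.0), (50.0, 30.0, 40.0, 0.0), (50.0, 30.0, -40.0, 0.0)]
print(passes_final_selection(subjets, subjets, f_w=0.15))
```

In this toy input the two softer subjets have an invariant mass of 80 GeV and the candidate has a mass of about 162 GeV with p T = 310 GeV, so the jet passes for the assumed window.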
Distributions of the HEPTopTagger substructure variables after requiring a top tag are shown in Figure 10, together with the p T and mass distributions of the top-quark candidate for events passing the signal selection. The purity of processes with top quarks (tt and single-top production) in this sample is more than 99%. The variable m 12 (m 23 ) is the invariant mass of the highest-p T (second-highest-p T ) and the second-highest-p T (third-highest-p T ) subjet found in the final, i.e. exclusive, subjet clustering step. The variable m 13 is defined analogously, and the variable m 123 is the mass of the three exclusive subjets. The ratio m 23 /m 123 is used internally in the HEPTopTagger algorithm and is displayed in Figure 10(a). It shows a peak at m W /m t , which indicates that in most of the cases, the highest-p T subjet corresponds to the b-quark. The inverse tangent of the ratio m 13 /m 12 is also used internally in the HEPTopTagger algorithm and its distribution is shown in Figure 10(b). The HEPTopTagger top-quark-candidate p T (Figure 10(c)) is peaked at ≈ 250 GeV and falls smoothly at higher p T . At around 200 GeV, the tagging efficiency increases strongly with p T (cf. Section 8.1) and therefore there are fewer entries in the lowest p T interval from 200 to 250 GeV than would be expected from a falling p T distribution. The HEPTopTagger top-quark-candidate mass (Figure 10(d)) is peaked near the top-quark mass with tails to lower and higher values. To be considered as HEPTopTagger-tagged, the top-quark candidate must have a mass between 140 and 210 GeV.
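The two internal variables can be computed from three p T -ordered subjets as in the following sketch. The four-momentum representation (E, px, py, pz), the function names and the example subjets are hypothetical. The requirements that the HEPTopTagger places on these variables are not reproduced.

```python
import math

def invariant_mass(vectors):
    """Invariant mass of the summed four-momenta (E, px, py, pz)."""
    e, px, py, pz = (sum(v[c] for v in vectors) for c in range(4))
    return math.sqrt(max(e * e - px * px - py * py - pz * pz, 0.0))

def mass_variables(subjets):
    """Return (m23/m123, arctan(m13/m12)) for three subjets.

    The subjets are ordered by decreasing pT, so index 1 denotes the
    highest-pT subjet, as in the text.
    """
    j1, j2, j3 = sorted(subjets, key=lambda v: math.hypot(v[1], v[2]), reverse=True)
    m12 = invariant_mass([j1, j2])
    m13 = invariant_mass([j1, j3])
    m23 = invariant_mass([j2, j3])
    m123 = invariant_mass([j1, j2, j3])
    # atan2 equals arctan(m13/m12) for positive masses and avoids a division.
    return m23 / m123, math.atan2(m13, m12)

# Hypothetical massless subjets (GeV): a hard b-like subjet and two softer ones.
subjets = [(250.0, 250.0, 0.0, 0.0), (60.0, 36.0, 48.0, 0.0), (40.0, 24.0, -32.0, 0.0)]
ratio, angle = mass_variables(subjets)
print(f"m23/m123 = {ratio:.3f}, arctan(m13/m12) = {angle:.3f}")
```

For this toy configuration m 23 /m 123 is about 0.48, close to m W /m t ≈ 0.47, which is the situation in which the highest-p T subjet plays the role of the b-quark.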
The distributions of m 23 /m 123 and arctan(m 13 /m 12 ), as well as the top-quark-candidate p T and mass are well described by the simulation within statistical and systematic uncertainties. For the two ratios of subjet invariant masses, important sources of systematic uncertainty are the subjet JES, the b-tagging efficiency and the tt modelling uncertainties from the choice of the PDF set and the ISR/FSR settings. The choice of PDF set dominates the uncertainty for m 23 /m 123 for very low and very high values of the ratio. These uncertainties also contribute to the modelling of the top-quark-candidate p T and η. The uncertainty in the top-quark-candidate p T increases with p T due to increasing uncertainties from the subjet JES, the b-tagging efficiency and the choice of PDF set, as well as from additional tt modelling uncertainties due to the choice of generator and parton shower.
A variant of the HEPTopTagger has been developed that uses a collection of small-R jets as input, instead of large-R jets. This variant is referred to as HEPTopTagger04, because it is based on small-R jets with R = 0.4. This approach can be useful when aiming for a full event reconstruction in final states with many jets in events in which the top quarks have only a moderately high transverse momentum (p T > 180 GeV). The advantages of the method are explained using the performance in MC simulation in Section 7.2.
The HEPTopTagger04 technique proceeds as follows. All sets of up to three anti-k t R = 0.4 jets (small-R jets in the following) are considered, and an early top-quark candidate (not to be confused with the HEPTopTagger candidate) is built by adding the four-momenta of these jets. Only sets with m candidate > m min and p T,candidate > p T,min are kept and all small-R jets in the set must satisfy ∆R i,candidate < ∆R max . The values of these parameters are given in Table 4. The constituents of the selected small-R jets are then passed to the HEPTopTagger algorithm to be tested for compatibility with a hadronically decaying top quark. The same parameters as given in Table 3 [...] the HEPTopTagger algorithm based on the small-R jets' constituents, it is called a HEPTopTagger04 top-quark candidate. If more than one HEPTopTagger04 top-quark candidate is found in an event, they are all kept if they do not share a common input jet. In the case that top-quark candidates share small-R input jets, the largest possible set of top-quark candidates which do not share input jets is chosen. If multiple such sets exist, the set for which the average top-quark-candidate mass is closest to the top-quark mass is selected.
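The resolution of overlapping candidates can be sketched as a brute-force search: find the largest set of candidates with no shared input jets and break ties by the average candidate mass. The candidate representation, the function name and the reference mass of 172 GeV used for the tie-break are assumptions of this sketch. The exhaustive search is only practical for the small number of candidates expected per event.

```python
from itertools import combinations

M_TOP = 172.0  # GeV, reference mass for the tie-break (assumed value)

def select_candidates(candidates):
    """Pick the largest set of candidates that share no input jets.

    candidates: list of (mass, frozenset of input-jet indices).
    Ties between equally large sets are broken by choosing the set whose
    average candidate mass is closest to M_TOP.
    """
    for size in range(len(candidates), 0, -1):
        valid_sets = [
            combo for combo in combinations(candidates, size)
            if all(a[1].isdisjoint(b[1]) for a, b in combinations(combo, 2))
        ]
        if valid_sets:
            best = min(valid_sets,
                       key=lambda combo: abs(sum(c[0] for c in combo) / size - M_TOP))
            return list(best)
    return []

# Hypothetical event: candidates A and B share small-R jet 2, C is independent.
cand_a = (170.0, frozenset({0, 1, 2}))
cand_b = (175.0, frozenset({2, 3, 4}))
cand_c = (160.0, frozenset({5, 6, 7}))
selected = select_candidates([cand_a, cand_b, cand_c])
print([mass for mass, _ in selected])  # [175.0, 160.0]
```

In the example both {A, C} and {B, C} are maximal. Their average masses are 165 GeV and 167.5 GeV, so {B, C} is retained.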
Post-tag distributions from the HEPTopTagger04 approach for events passing the signal selection (but omitting all requirements related to a large-R jet) are shown in Figure 11 and show features similar to the ones described for the HEPTopTagger. Events are classified as matched or not-matched based on the angular distance between hadronically decaying top quarks and the top-quark candidate, and not the large-R jet as in the other tagging techniques, because for the HEPTopTagger04 no large-R jet exists. The distributions are well described by the simulation within statistical and systematic uncertainties. The systematic uncertainty of the predicted event yield after tagging is approximately 16%, with the largest contributions from the subjet energy scale (8.1%), the uncertainty in initial-state and final-state radiation (8.9%), the tt cross-section normalization (6.2%), the PDF uncertainty (5.2%), and the uncertainty in the b-tagging efficiency (5.1%). The uncertainties related to the anti-k t R = 0.4 jets used as input to the HEPTopTagger04 method have a negligible impact (< 1%), as the anti-k t R = 0.4 jet energies are only used to select the early top-quark candidate in the HEPTopTagger04 procedure and the HEPTopTagger algorithm is run on the constituents of these anti-k t R = 0.4 jets.

Systematic uncertainties
The measurements presented in this paper are performed at the detector level, i.e. differential in reconstructed kinematic quantities and not corrected for detector effects such as limited efficiency and resolution. The measured distributions are compared with SM predictions obtained from MC-generated events which have been passed through a simulation of the detector and are reconstructed in the same way as the data. Systematic uncertainties of the predictions can be grouped into different categories: uncertainties related to the simulation of the detector response and the luminosity measurement, and uncertainties related to the modelling of the physics processes (production cross sections, parton shower, hadronization, etc.).
Systematic uncertainties in the results presented in this paper are obtained by varying parameters of the simulation (one parameter at a time) and repeating the analysis with this varied simulation to determine its impact. The change from the nominal prediction is taken as the 1σ uncertainty related to the uncertainty in the varied parameter. The systematic uncertainties are considered uncorrelated unless otherwise specified.
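The one-parameter-at-a-time procedure can be illustrated with a minimal sketch. It assumes that uncorrelated sources are combined in quadrature, which is the usual convention but is not spelled out above; the source names and yields are hypothetical.

```python
import math

def total_systematic(nominal, varied):
    """Combine one-at-a-time variations into a total uncertainty.

    nominal: nominal prediction (e.g. an event yield)
    varied:  dict mapping a (hypothetical) source name to the prediction
             obtained with only that parameter varied
    """
    # The change from the nominal prediction is taken as the 1-sigma
    # uncertainty associated with each source.
    shifts = {name: value - nominal for name, value in varied.items()}
    # Assumption: uncorrelated sources are added in quadrature.
    total = math.sqrt(sum(s * s for s in shifts.values()))
    return shifts, total

# Hypothetical yields, for illustration only.
shifts, total = total_systematic(1000.0, {"source_a": 1030.0, "source_b": 960.0})
```

Sources that are specified as correlated would need to be combined differently, for example by adding their shifts linearly before the quadrature sum.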

Experimental uncertainties
The uncertainty in the integrated luminosity is 2.8%. It is derived from a calibration of the luminosity scale using beam-separation scans, following the methodology detailed in Ref.
Systematic uncertainties related to jet reconstruction are considered as follows. The uncertainty in the energy scale of anti-k t R = 0.4 jets is determined using a combination of in situ techniques exploiting the transverse-momentum balance between a jet and a reference object such as a photon or a Z boson [78]. The uncertainty in the energy resolution of anti-k t R = 0.4 jets is found to have negligible impact on the results presented here.
The large-R jets and subjets used in this analysis are reconstructed from calorimeter information. Systematic uncertainties related to the modelling of the calorimeter response in simulation are estimated by comparing these jets to tracks which are matched to the jets [18]. Uncertainties in the following quantities are estimated in this way: the energy scale of the large-R jets; the k t splitting scales, the N-subjettiness ratios, and the mass of trimmed anti-k t R = 1.0 jets; the subjet energy scale for SD. For trimmed anti-k t R = 1.0 jets with p T < 900 GeV, the uncertainty is derived not from the track-jet method but from γ+jet events, together with an additional uncertainty based on the difference between the calorimeter's response to QCD jets and jets from tt decays. The uncertainties in the k t splitting scales, the N-subjettiness ratios and the trimmed mass are 4-7% for p T between 350 and 700 GeV, depending on the jet p T , η and the ratio m/p T . For values of m/p T < 0.1, the uncertainties are larger and reach values of up to 10%. The subjet energy-scale uncertainty for the HEPTopTagger is determined in situ from the reconstructed top-quark mass peak as described in Section 6.2. The correlations between the uncertainties in the substructure variables used by taggers I-V and the W top tagger have not been determined; the largest observed variations are used based on testing different combinations of zero and full (anti-)correlation of the systematic uncertainties of the different substructure variables.
The energy-resolution uncertainties for C/A R = 1.5 jets and for subjets used by SD and the HEPTopTagger are determined using the p T balance in dijet events [18]. To determine the impact of the energy-resolution uncertainty for trimmed anti-k t jets with R = 1.0, the energy resolution in simulation is scaled by 1.2. The impact of the mass-resolution uncertainty for trimmed anti-k t R = 1.0 jets is estimated analogously.

In situ determination of the subjet energy scale for the HEPTopTagger
The top-quark candidates identified with the HEPTopTagger in the µ+jets channel of the signal selection are used to determine the subjet energy scale for the HEPTopTagger. For this study, the signal selection with only the b-tag close to the lepton is used and the second b-tag requirement with ∆R > 1.5 from the lepton direction is omitted. With this change, the µ+jets channel alone provides sufficient events to perform this study. The four-momentum of the top-quark candidate is obtained in the HEPTopTagger by combining the calibrated subjet four-momenta. A change in the subjet p T is therefore reflected in a change of the top-quark-candidate momentum. The top-quark peak in the distribution of the top-quark-candidate mass can be used to constrain the energy-scale uncertainty of the subjets as suggested in Ref. [96]. The method consists of varying the energy scale of the calibrated subjets in simulation and comparing the resulting top-quark mass distribution to the one from data. A higher (lower) subjet energy scale shifts the predicted distribution to larger (smaller) masses. This shift is constrained by the necessity to describe the measured mass peak within uncertainties.
The subjet energy-scale uncertainty is determined by calculating a χ 2 value for different variations of the energy scale. The χ 2 is calculated in the mass window from 133 to 210 GeV, in 11 bins of width 7 GeV. The statistical uncertainties of the measured and predicted number of top-quark candidates in each bin are taken into account, as well as all systematic uncertainties other than that of the subjet energy scale itself. The systematic uncertainties due to the imperfect modelling of the physics processes (Section 6.3) are considered, including a systematic uncertainty in the top-quark mass of ±1 GeV.
Variations of the subjet energy scale are considered by raising or lowering all subjet transverse momenta in a correlated way:
$p_{\mathrm{T}} \to p_{\mathrm{T}} \, (1 \pm f)$,
in which $f$ is a function which specifies the relative variation. Three different scenarios for the dependence of $f$ on the subjet $p_{\mathrm{T}}$ are considered (the parameters $k_i$ are constants):
• $f = k_1 \sqrt{p_{\mathrm{T}}}$ (larger variation for high-energy subjets),
• $f = k_2 / p_{\mathrm{T}}$ (larger variation for low-energy subjets),
• $f = k_3$ (no $p_{\mathrm{T}}$ dependence, variation by a constant factor).
Separate χ 2 values are determined for all three functional forms and for different values of the parameters k i . The HEPTopTagger top-quark-candidate mass distribution is shown in Figure 12(a). The simulation is shown for the nominal energy scale and, as an example, for the case of the variation with f = k 2 /p T with k 2 = 1 GeV. For subjets with p T = 100 GeV, this corresponds to a relative change of the transverse momentum of ±1%. The description of the measured distribution is improved by the +1% variation. The level of agreement between the measured and predicted distributions is quantified in terms of the χ 2 value shown in Figure 12(b) for different values of k 2 . The variation is expressed as the relative p T change for subjets with p T = 100 GeV (JES shift). A parabola is fitted to the χ 2 values as a function of the JES shift. The best agreement is obtained for a JES shift of +1%, which leads to the smallest χ 2 , χ 2 min . This result can be used to correct the subjet p T scale in the simulation. This is left to future studies. Here, an uncertainty in the p T scale is determined as follows. From the two JES-shift values that correspond to χ 2 = χ 2 min + 1, the larger absolute value is used as the 1σ systematic uncertainty of the p T scale. In Figure 12(b) this uncertainty is 2.2%.
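The parabola fit and the χ 2 min + 1 criterion can be sketched as follows. This is an illustration under stated assumptions, not the analysis code: the fit is an unweighted least-squares fit solved through the normal equations, the function names are hypothetical, and the scan points are synthetic rather than taken from Figure 12(b).

```python
import math

def fit_parabola(x, y):
    """Unweighted least-squares fit of y = a*x^2 + b*x + c."""
    s = [sum(xi ** k for xi in x) for k in range(5)]                 # sums of x^0 .. x^4
    t = [sum(yi * xi ** k for xi, yi in zip(x, y)) for k in range(3)]
    # Normal equations M * (c, b, a) = t, solved with Cramer's rule.
    m = [[s[0], s[1], s[2]],
         [s[1], s[2], s[3]],
         [s[2], s[3], s[4]]]

    def det3(q):
        return (q[0][0] * (q[1][1] * q[2][2] - q[1][2] * q[2][1])
                - q[0][1] * (q[1][0] * q[2][2] - q[1][2] * q[2][0])
                + q[0][2] * (q[1][0] * q[2][1] - q[1][1] * q[2][0]))

    d = det3(m)
    coeffs = []
    for col in range(3):
        mc = [row[:] for row in m]
        for r in range(3):
            mc[r][col] = t[r]
        coeffs.append(det3(mc) / d)
    c, b, a = coeffs
    return a, b, c

def jes_uncertainty(shifts, chi2):
    """Return (best shift, chi2_min, 1-sigma uncertainty) from a chi2 scan."""
    a, b, c = fit_parabola(shifts, chi2)
    best = -b / (2.0 * a)
    chi2_min = c - b * b / (4.0 * a)
    # The fitted parabola reaches chi2_min + 1 at best +/- 1/sqrt(a).
    half_width = 1.0 / math.sqrt(a)
    # The larger absolute value of the two crossings is the uncertainty.
    one_sigma = max(abs(best - half_width), abs(best + half_width))
    return best, chi2_min, one_sigma

# Synthetic scan: JES shift in percent and an exactly parabolic chi2.
scan_shifts = [-2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
scan_chi2 = [(x - 1.0) ** 2 + 9.0 for x in scan_shifts]
best, chi2_min, one_sigma = jes_uncertainty(scan_shifts, scan_chi2)
```

Taking the larger absolute crossing rather than the half-width of the interval means that a non-zero best-fit shift is absorbed into the quoted uncertainty, which matches the choice of not correcting the simulation.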
The subjet energy-scale uncertainty is determined in two bins of large-R-jet p T (< 320 GeV, > 320 GeV) and two bins of large-R jet pseudorapidity (|η| < 0.7, 0.7 < |η| < 2.0). The results are shown in Figure 13. The largest relative uncertainty is 10% at a subjet p T of 20 GeV, dropping with 1/p T to 2.5% at 90 GeV and then rising proportionally to √ p T , reaching 3.5-4.0% at 200 GeV. The uncertainty depends weakly on the large-R jet p T and η.
In the HEPTopTagger analysis, the impact on each studied quantity (the number of tagged large-R jets, the tagging efficiency, and the mistag rate) is determined for all three functional forms. The largest of the three changes in the quantity is then used as the uncertainty related to the imperfectly known subjet energy scale.
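A minimal sketch of the three variation scenarios and of the "largest of the three changes" prescription is given below. It assumes that the variation acts multiplicatively, p T → p T (1 ± f), which is consistent with f being a relative variation and with the ±1% example at p T = 100 GeV for k 2 = 1 GeV. The function names, the scenario labels and the numerical values of the tagging efficiency are hypothetical.

```python
import math

def vary_pt(pt, form, k, sign=+1):
    """Return the varied subjet pT in GeV, assuming pT -> pT * (1 +/- f)."""
    if form == "sqrt":          # f = k1 * sqrt(pT): grows with subjet pT
        f = k * math.sqrt(pt)
    elif form == "inverse":     # f = k2 / pT: largest for soft subjets
        f = k / pt
    elif form == "constant":    # f = k3: no pT dependence
        f = k
    else:
        raise ValueError(f"unknown functional form: {form}")
    return pt * (1.0 + sign * f)

def largest_change(nominal, varied_by_form):
    """Largest absolute change of a quantity among the functional forms."""
    return max(abs(value - nominal) for value in varied_by_form.values())

# k2 = 1 GeV shifts a 100 GeV subjet by 1%.
pt_up = vary_pt(100.0, "inverse", 1.0)
pt_down = vary_pt(100.0, "inverse", 1.0, sign=-1)

# Hypothetical tagging efficiencies obtained with each varied energy scale.
uncertainty = largest_change(0.40, {"sqrt": 0.41, "inverse": 0.38, "constant": 0.405})
```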

Uncertainties in the modelling of physics processes
Uncertainties related to the tt simulation are taken into account as follows. If the uncertainties are estimated from samples not generated with the nominal tt generator POWHEG+PYTHIA, then the sequential p T reweighting mentioned in Section 3 is not applied, because the reweighting used only applies to POWHEG+PYTHIA: the nominal POWHEG+PYTHIA prediction without reweighting is compared to the prediction from the alternative simulation without reweighting. The tt cross-section uncertainty of $^{+13}_{-15}$ pb quoted in Section 3 is used and an additional normalization uncertainty of $^{+7.6}_{-7.3}$ pb from a variation of the top-quark mass by ±1.0 GeV is added in quadrature, leading to a total relative normalization uncertainty of $^{+5.9}_{-6.6}$%. For the evaluation of the other tt modelling uncertainties mentioned below, the total tt cross section of the generated event samples is set to the value given in Section 3, so that no double-counting of normalization uncertainties occurs.
To account for uncertainties in the parton shower, the prediction from POWHEG+Herwig is compared to the prediction from POWHEG+PYTHIA. Uncertainties in the choice of tt generator are estimated by comparing the prediction from MC@NLO+Herwig with the prediction from POWHEG+Herwig. The uncertainty in the amount of ISR and FSR is estimated using two ACERMC+PYTHIA tt samples with increased and decreased radiation.
PDF uncertainties affect the normalization of the total tt cross section and this is taken into account as described in Section 3. They additionally affect the tt cross section in the phase space examined by this analysis and the distributions of kinematic variables. These effects are determined by comparing the prediction based on CT10 to the prediction based on HERAPDF1.5. The cross-section difference obtained when comparing these two PDF sets was found to match the difference due to the CT10 PDF uncertainty [54] for this region of phase space.
The factorization and renormalization scales are varied by factors of two and one half and the impact on the total tt cross section is included in the cross-section uncertainty. The impact in the phase space examined by this analysis and on the distributions of kinematic variables is evaluated by comparing dedicated tt samples in which the two scales are varied independently. The variation of the renormalization scale has a significant impact, while the analysis is not sensitive to variations of the factorization scale beyond the change of the total tt cross section.
The effect on the top-quark-candidate mass peak of varying the top-quark mass in the generator by ±1.0 GeV is taken into account in the in situ determination of the subjet energy scale in Section 6.2. For the efficiency and misidentification-rate measurements this uncertainty is negligible compared to other sources of systematic uncertainty.
The uncertainties in the normalization of the single-top, W+jets, and Z+jets background contributions were found to have a negligible impact.

Study of top-tagging performance using Monte-Carlo simulation

Comparison of top-tagging performance
The performance of the different top-tagging approaches is compared using MC simulations to relate the different large-R jets used by the taggers and to extend the comparison in large-R jet p T beyond the kinematic reach of the 8 TeV data samples.
The performance is studied in terms of the efficiency for tagging signal large-R jets and the background rejection, defined as the reciprocal of the tagging rate for background large-R jets. Signal jets are obtained from Z′ → tt events and background jets are obtained from multijet events. Multijets typically pose the largest background in tt analyses in the fully hadronic channel. The W+jets background, where the W boson decays hadronically, is less important because of the smaller cross section. Also, in the kinematic region considered in the comparison presented here, it was shown for the HEPTopTagger that the mistag rate is similar for multijet background and background from W → q q [18]. In the lepton+jets channel, W+jets tends to be the most important background if the W boson decays leptonically, and then the background from the additional jets is very similar to the multijets case. The conclusions drawn in this section can therefore be extended to the context of this W+jets background.
Stable-particle jets are built in all MC events using the anti-k t algorithm and a radius parameter R = 1.0. These jets are trimmed with the same parameters as described in Section 4.1 for the detector-level jets. These particle-level jets are used to relate the different jet types used at reconstruction level. The different types of large-R jets used by the tagging algorithms are listed in Table 1. Each reconstructed large-R jet must be geometrically matched to a particle-level jet within ∆R = 0.75 for the trimmed anti-k t R = 1.0 jets, and within ∆R = 1.0 for the C/A R = 1.5 jets. The fraction of reconstructed large-R jets with no matching particle-level jet is negligible. In addition, particle-level jets in the signal sample must be geometrically matched to a hadronically decaying top quark within ∆R = 0.75. The top-quark flight direction at the top-quark decay vertex is chosen, consistent with the matching procedure discussed in Section 4.2.1. The particle-level jet p T spectrum of the signal sample is reweighted to the p T spectrum of the background sample to remove the dependence on a specific signal model. However, since the results in this section are given for different ranges of p T , the conclusions are believed to hold, approximately independently of the choice of specific underlying p T spectrum.
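The geometric matching can be sketched as below, assuming the usual definition ∆R = √((∆η)² + (∆φ)²) with the azimuthal difference wrapped into [−π, π]. Jets are represented as hypothetical (η, φ) tuples, and picking the closest candidate inside the cone is an assumption of this sketch; the text only requires a match within the stated ∆R.

```python
import math

def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance between two directions in the (eta, phi) plane."""
    # math.remainder wraps the azimuthal difference into [-pi, pi].
    dphi = math.remainder(phi1 - phi2, 2.0 * math.pi)
    return math.hypot(eta1 - eta2, dphi)

def match(jet, candidates, max_dr):
    """Return the closest candidate within max_dr of the jet, or None."""
    best = min(candidates,
               key=lambda c: delta_r(jet[0], jet[1], c[0], c[1]),
               default=None)
    if best is not None and delta_r(jet[0], jet[1], best[0], best[1]) < max_dr:
        return best
    return None

# Hypothetical reconstructed jet and particle-level jets as (eta, phi).
reco_jet = (0.0, 0.0)
particle_jets = [(0.5, 0.5), (1.0, 1.0)]
matched = match(reco_jet, particle_jets, max_dr=0.75)
```

The same helper could be used with max_dr = 1.0 for the C/A R = 1.5 jets and with max_dr = 0.75 for the match between particle-level jets and hadronically decaying top quarks.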
The comparison is performed in bins of the p T of the particle-level jet, p true T , in the range 350 < p true T < 1500 GeV in which all taggers are studied. For the performance comparison, the statistical uncertainties of the simulated efficiencies and rejections are taken into account, while no systematic uncertainties are considered.
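How a scan over a single substructure-variable cut translates into points of background rejection versus tagging efficiency can be sketched as follows. The sketch is unweighted, whereas the signal sample in this study is reweighted in p T; the variable values and function names are hypothetical, and a jet is taken to be tagged when its variable exceeds the cut.

```python
def efficiency_and_rejection(signal, background, cut):
    """Efficiency and background rejection for the requirement value > cut."""
    eff = sum(1 for v in signal if v > cut) / len(signal)
    mistag = sum(1 for v in background if v > cut) / len(background)
    # Rejection is the reciprocal of the background tagging rate.
    rejection = float("inf") if mistag == 0 else 1.0 / mistag
    return eff, rejection

def scan(signal, background, cuts):
    """One (cut, efficiency, rejection) point per cut value."""
    return [(cut, *efficiency_and_rejection(signal, background, cut))
            for cut in cuts]

# Hypothetical values of a substructure variable (e.g. a splitting scale in GeV).
signal_jets = [90.0, 110.0, 120.0, 130.0, 60.0]
background_jets = [20.0, 30.0, 40.0, 50.0, 100.0, 10.0, 15.0, 25.0, 35.0, 45.0]
eff, rej = efficiency_and_rejection(signal_jets, background_jets, cut=80.0)
points = scan(signal_jets, background_jets, cuts=[40.0, 80.0, 125.0])
```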
The background rejection is shown as a function of the tagging efficiency in Figures 14 and 15. When using only a single substructure-variable cut, the best performing variables in all studied p true T intervals are the splitting scale √ d 12 at high efficiency and √ d 23 at lower efficiency. At an efficiency of 80%, a cut on √ d 12 achieves a background rejection of ≈ 3-6 over the full range in p true T . At an efficiency of 40%, a cut on √ d 23 achieves a rejection of ≈ 25 for lower values of p true T , decreasing to a rejection of 15 for 700 < p true T < 1000 GeV and 11 for 1000 < p true T < 1500 GeV, respectively. The efficiency at which the rejection of a cut on √ d 23 is higher than the rejection for the trimmed-mass cut depends on p true T : it is ≈ 45% for 350 < p true T < 400 GeV and increases to 90% for 1000 < p true T < 1500 GeV. A cut on the trimmed mass performs similarly to the √ d 12 cut. A cut on τ 32 performs significantly worse. For high efficiencies and the ranges of lower p true T (e.g. ≈ 60-90% for 350 < p true T < 400 GeV), the cut on the trimmed mass shows only a small increase in the rejection with decreasing signal efficiency. For lower efficiencies, the rejection increases more strongly with decreasing signal efficiency. This is due to the two distinct W-boson and top-quark mass peaks in signal, as exemplified in Figure 1(a).
Figure 14: The background rejection as a function of the tagging efficiency of large-R jets, as obtained from MC simulations for 350 GeV < p T < 400 GeV and 550 GeV < p T < 600 GeV for trimmed anti-k t R = 1.0 particle-level jets to which the large-R jets are geometrically matched. The HEPTopTagger uses C/A R = 1.5 jets; the other taggers use trimmed anti-k t R = 1.0 jets. For SD, the cut value of the discriminant ln χ is scanned over. Substructure-variable-based taggers are also shown including single scans over the trimmed mass, √ d 12 , √ d 23 , τ 32 and scans over cuts on √ d 23 and τ 32 for substructure tagger V and the W top tagger, respectively. The curves are not shown if the background efficiency is higher than the signal efficiency, which for some substructure-variable scans occurs for very low signal efficiencies, i.e. for scans in the tails of the distributions. The statistical uncertainty from the simulation is smaller than the symbols for the different working points and it is no larger than the width of the lines shown.
Figure 15: The background rejection as a function of the tagging efficiency of large-R jets, as obtained from MC simulations for 700 GeV < p T < 1000 GeV and 1000 GeV < p T < 1500 GeV for trimmed anti-k t R = 1.0 particle-level jets to which the large-R jets are geometrically matched. The HEPTopTagger uses C/A R = 1.5 jets; the other taggers use trimmed anti-k t R = 1.0 jets. For SD, the cut value of the discriminant ln χ is scanned over. Substructure-variable-based taggers are also shown including single scans over the trimmed mass, √ d 12 , √ d 23 , τ 32 and scans over cuts on √ d 23 and τ 32 for substructure tagger V and the W top tagger, respectively. The curves are not shown if the background efficiency is higher than the signal efficiency, which for some substructure-variable scans occurs for very low signal efficiencies, i.e. for scans in the tails of the distributions. The statistical uncertainty from the simulation is smaller than the symbols for the different working points and it is no larger than the width of the lines shown.
≈ 80% for 1000 < p true T < 1500 GeV. By varying the τ 32 requirement in the W top tagger, rejections close to the ones of SD and the HEPTopTagger can be achieved at the same efficiency.
For SD, the cut value of the discriminant ln χ is varied. The maximum efficiency is ≈ 50% in the lowest p T bin studied (350 < p true T < 400 GeV). For higher p T , the efficiency rises up to 70%. The maximum efficiency is determined by the requirement of having at least three subjets which combine to an invariant mass near the top-quark mass and a subset of these subjets to give a mass near the W-boson mass. The increase of the maximum efficiency from approximately 50% at 350-400 GeV to approximately 70% at 550-1000 GeV is a result of the larger average containment of the top-quark decay products in the large-R jet at higher p T . At the highest p T values (1000-1500 GeV), the use of R = 0.2 subjets limits the efficiency as the top-quark decay products cannot be fully resolved for an increasing fraction of large-R jets, resulting in a maximum efficiency of ≈ 50%.
For 350 < p true T < 400 GeV, the HEPTopTagger has an efficiency of 34% at a rejection of 47. For p true T > 550 GeV, the efficiency is ≈ 40% and the rejection is ≈ 35, approximately independent of p true T . The HEPTopTagger performance was also investigated for 200 < p true T < 350 GeV (not shown): efficiency and rejection are 18% and 300, respectively, for 200 < p true T < 250 GeV, 22% and 130 for 250 < p true T < 300 GeV, and 28% and 65 for 300 < p true T < 350 GeV. For 350 < p true T < 450 GeV, the performance of SD, the HEPTopTagger, and the W top tagger are comparable. For 450 < p true T < 1000 GeV, SD offers the best rejection in simulation, up to its maximum efficiency. Top tagging efficiencies above 70% can be achieved with cuts on substructure variables, where, depending on p true T , optimal or close-to-optimal performance can be achieved with a requirement on √ d 12 alone. For 1000 < p true T < 1500 GeV, of all the top-tagging methods studied, the HEPTopTagger offers the best rejection (≈ 30) at an efficiency of ≈ 40%, making it a viable option for high-p T searches despite not having been optimized for this p T regime. The only tagger studied for 200 < p true T < 350 GeV is the HEPTopTagger.

HEPTopTagger04 performance
The efficiencies for hadronically decaying top quarks to be reconstructed as top-quark candidates with the HEPTopTagger04 and HEPTopTagger methods are shown in Figure 16 as a function of the true p T of the top quark in simulated tt events. The events are selected according to the criteria described in Section 4.2.1, except that none of the requirements related to large-R jets are applied in the case of HEPTopTagger04. For these efficiencies, a top quark is considered tagged if a top-quark candidate is reconstructed with a momentum direction within ∆R = 1.0 of the top-quark momentum direction. The definition of the efficiency is therefore different from the large-R-jet-based one used in Section 7.1, where also a different event selection and different matching criteria are applied. The efficiency of the HEPTopTagger04 method increases with the p T of the top quark and reaches values of ≈ 50% for p T > 500 GeV. The efficiency of the HEPTopTagger04 method is lower than the efficiency of the HEPTopTagger, but follows the trend of the HEPTopTagger efficiency closely. The HEPTopTagger efficiency reaches higher values than in Section 7.1 primarily because the event selection here requires two b-tagged jets.
This efficiency, however, does not take into account the specific needs of event reconstruction in final states with top quarks and many additional jets, for which the HEPTopTagger04 was designed. An example of such a topology in an extension of the SM is the associated production of a top quark and a charged Higgs boson, H + , decaying to tb, i.e. pp → H +t (b) → tbt(b). After the decay of the top quarks, the final state contains three or four b-quarks. Up to two b-jets not associated with a top-quark decay can in principle be reconstructed, and they should not be part of the reconstructed top-quark candidates. In ATLAS, b-jets are usually reconstructed using the anti-k t algorithm with R = 0.4. For large H + masses, for which the top quarks from its decay may have large p T , ensuring no overlap between the top-quark candidates and the unassociated b-jets may not be trivial. In this case, hadronically decaying top quarks may be reconstructed with large-R jet substructure analysis. The reconstruction of anti-k t R = 0.4 and large-R jets, however, proceeds independently, so that the same clusters may be present in anti-k t R = 0.4 and large-R jets. If the anti-k t R = 0.4 jet and the large-R jet overlap, the b-tagged anti-k t R = 0.4 jet might also originate from the hadronic top quark decay, which prevents an unambiguous reconstruction of the final state. Moreover, clusters included in both objects may lead to a double-counting of deposited energy, which is an issue if for example an invariant mass is formed from the tagged top and a close-by b-jet targeting the H + → tb decay.
In the case of the HEPTopTagger, subjets of the large-R jet are explicitly reconstructed, and it would be an option to only consider anti-k t R = 0.4 jets not matched to one of the three subjets which form the top-quark candidate as being not associated with a hadronically decaying top. This approach, however, is not straightforward because of the different jet algorithms and jet radii used for HEPTopTagger subjets and b-tagging. A simple approach is to require an angular separation ∆R between the top-quark candidate and the anti-k t R = 0.4 jets in the event, denoted HEPTopTagger+∆R in the following. The HEPTopTagger04 is therefore compared to HEPTopTagger+∆R, using the latter as a benchmark.
In Figure 17(a), the energy shared by anti-k t R = 0.4 jets and C/A R = 1.5 jets is shown for simulated tt events. The shared energy is calculated from the clusters of calorimeter cells included as constituents in the small-R and large-R jets. The C/A jets are required to fulfil |η| < 2.1 and p T > 180 GeV, and the anti-k t jets must fulfil |η| < 2.5 and p T > 25 GeV. All combinations of large-R C/A jets and small-R anti-k t jets in each event are shown. The shared energy is normalized to the total energy of the small-R jet and this shared energy fraction is shown as a function of the angular separation ∆R of the small-R and large-R jets. The region of small angular separation is populated by combinations where a large fraction of the energy of the small-R jet is included in the large-R jet, i.e. where the two jets originate from the same object. However, for larger values of ∆R, a significant fraction of the energy of the small-R jet can still be shared with the large-R jet.
Figure 17: (a) Energy fraction of clusters included in anti-k t jets with R = 0.4 also included in C/A jets with R = 1.5 in tt MC simulation as a function of the angular separation of the two jets. The C/A jets have to fulfil |η| < 2.1 and p T > 180 GeV, and all combinations of large-R and small-R jets in each event are shown. (b) Efficiency for the H + selection for the HEPTopTagger04 method for a 1400 GeV H + signal (blue, full circles) and for HEPTopTagger for which an angular separation ∆R is required between the top-quark candidate and the closest anti-k t R = 0.4 jet (or lepton) in the event (red open circles), HEPTopTagger+∆R. The efficiency of an alternative H + selection with three b-tagged anti-k t R = 0.4 jets is shown in addition for HEPTopTagger+∆R. For HEPTopTagger+∆R, the efficiency is shown as a function of ∆R, while the HEPTopTagger04 algorithm is independent of ∆R.
The HEPTopTagger04 approach solves the issue of overlap between large-R and small-R jets by passing only the constituents of a set of small-R jets to the HEPTopTagger algorithm and by removing these small-R jets from the list of jets considered for the remaining event reconstruction, i.e. the identification of extra b-jets.
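The shared energy fraction shown in Figure 17(a) can be illustrated with a short sketch. It assumes that each jet is available as a mapping from a cluster identifier to the cluster energy, which is a hypothetical data layout chosen for illustration; the energies are invented.

```python
def shared_energy_fraction(small_jet, large_jet):
    """Fraction of the small-R jet energy carried by clusters that are
    also constituents of the large-R jet.

    small_jet, large_jet: dicts mapping cluster id -> cluster energy (GeV)
    """
    total = sum(small_jet.values())
    shared = sum(energy for cluster_id, energy in small_jet.items()
                 if cluster_id in large_jet)
    # Normalized to the total energy of the small-R jet.
    return shared / total

# Hypothetical constituents: clusters 2 and 3 belong to both jets.
small = {1: 30.0, 2: 10.0, 3: 10.0}
large = {2: 10.0, 3: 10.0, 4: 50.0, 5: 80.0}
fraction = shared_energy_fraction(small, large)
```

A fraction above zero for a jet pair at large ∆R is exactly the double-counting situation that removing the small-R jets used by the HEPTopTagger04 is meant to avoid.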
The charged-Higgs-boson process mentioned above is used to illustrate the advantage of the HEPTopTagger04 approach. A basic event selection for events with an H + boson is introduced in order to study the performance of the HEPTopTagger04 in this topology using simulated events only. It consists of the signal selection for tt events as detailed in Section 4.2.1 requiring at least one top-quark candidate reconstructed with the HEPTopTagger04 method and two b-tagged anti-k t R = 0.4 jets not considered as part of the HEPTopTagger04 candidate (H + selection). The b-tagged anti-k t R = 0.4 jets are allowed to be identical to the b-tagged jets required in the signal selection, if these jets are not part of the HEPTopTagger04 candidate.
The HEPTopTagger04 method is compared with HEPTopTagger+∆R in the H + selection. Only those b-tagged anti-k t R = 0.4 jets that are more than ∆R away from the top-quark candidate are considered in the H + selection for HEPTopTagger+∆R. Moreover, the top-quark candidate is required to be separated from the reconstructed lepton by at least ∆R. Figure 17(b) shows the efficiency of the H + selection for a 1400 GeV H + signal MC sample for HEPTopTagger+∆R as a function of ∆R, and for the HEPTopTagger04 method, which is independent of ∆R. The HEPTopTagger04 leads to a higher efficiency than the simple HEPTopTagger+∆R benchmark for values of ∆R > 0.5. In order to avoid energy sharing, larger values of ∆R would be appropriate (cf. Figure 17(a)). For small values of ∆R, HEPTopTagger+∆R shows a higher efficiency than the HEPTopTagger04 method, because at least one b-tagged jet largely overlaps with the top-quark candidate and can be identified with the b-quark from the top-quark decay and not with one of the additional b-quarks from the pp → H +t (b) → tbt(b) process. An additional b-tagged anti-k t R = 0.4 jet can be required in the event selection for HEPTopTagger+∆R to address this issue, which leads to a lower efficiency for HEPTopTagger+∆R than for the HEPTopTagger04 method for all values of ∆R.
To determine the optimal method for a particular application, the mistag rates of the two approaches should also be compared using the exact selection of that analysis, because they depend critically on the dominant background composition and on the kinematic region.

Measurement of the top-tagging efficiency and mistag rate
In this section, the signal and background samples introduced in Sections 4.2.1 and 4.2.2 are used to study the top tagging efficiency and the mistag rate for the different top taggers introduced in Section 5.

Top-tagging efficiency
The large-R jets in the signal selection are identified with a high-p T hadronically decaying top quark in lepton+jets tt events and are therefore used to measure the top-tagging efficiency in data as a function of the kinematic properties of the large-R jet (p T , η). The tagging efficiency is given by the fraction of tagged large-R jets after background has been statistically subtracted using simulation. In each large-R jet p T and η bin i, the efficiency is defined as
$f_{\mathrm{data},i} = \left( \dfrac{N^{\mathrm{tag}}_{\mathrm{data}} - N^{\mathrm{tag}}_{t\bar{t}\ \mathrm{not\ matched}} - N^{\mathrm{tag}}_{\mathrm{non}\text{-}t\bar{t}}}{N_{\mathrm{data}} - N_{t\bar{t}\ \mathrm{not\ matched}} - N_{\mathrm{non}\text{-}t\bar{t}}} \right)_i$   (6)
in which
• $N^{(\mathrm{tag})}_{\mathrm{data}}$ is the number of measured (tagged) large-R jets;
• $N^{(\mathrm{tag})}_{t\bar{t}\ \mathrm{not\ matched}}$ is the number of (tagged) not-matched large-R jets, i.e. jets not matched to a hadronically decaying top quark (cf. Section 4.2), according to the POWHEG+PYTHIA simulation;
• $N^{(\mathrm{tag})}_{\mathrm{non}\text{-}t\bar{t}}$ is the number of (tagged) large-R jets predicted by simulation to arise from other background contributions, such as W+jets, Z+jets and single-top production.
Systematic uncertainties affecting the numerator and the denominator do not fully cancel in the ratio, in particular because the not-matched tt contribution is non-negligible before the top-tagging requirement but is much reduced after requiring a top-tagged jet.
The measurement is shown for p T bins in which the relative statistical uncertainty of the efficiency is less than 30% and the relative systematic uncertainty is less than 65%. Two regions in large-R jet pseudorapidity are chosen, |η| < 0.7 and 0.7 < |η| < 2.0, in which approximately equal numbers of events are expected.
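The measured efficiency is compared to the efficiency in simulated tt events, which is defined as
$f_{\mathrm{MC},i} = \left( \dfrac{N^{\mathrm{tag}}_{\mathrm{MC}}}{N_{\mathrm{MC}}} \right)_i$   (7)
in which $N^{(\mathrm{tag})}_{\mathrm{MC}}$ is the number of (tagged) large-R jets in matched tt events which pass the signal selection.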
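The two efficiency definitions can be written out as a short sketch for a single p T and η bin. It assumes that the measured efficiency is the ratio of background-subtracted tagged jets to background-subtracted jets before tagging, as the definitions of the individual yields imply; the function names and all yields are hypothetical.

```python
def efficiency_data(n_tag_data, n_data,
                    n_tag_not_matched, n_not_matched,
                    n_tag_non_tt, n_non_tt):
    """Measured efficiency in one bin after statistical background subtraction."""
    # Subtract not-matched ttbar and non-ttbar jets, as predicted by
    # simulation, from both the tagged and the inclusive yields.
    tagged = n_tag_data - n_tag_not_matched - n_tag_non_tt
    total = n_data - n_not_matched - n_non_tt
    return tagged / total

def efficiency_mc(n_tag_mc, n_mc):
    """Predicted efficiency from matched ttbar jets passing the signal selection."""
    return n_tag_mc / n_mc

# Hypothetical yields in one bin, for illustration only.
f_data = efficiency_data(n_tag_data=520.0, n_data=1000.0,
                         n_tag_not_matched=30.0, n_not_matched=150.0,
                         n_tag_non_tt=20.0, n_non_tt=100.0)
f_mc = efficiency_mc(n_tag_mc=450.0, n_mc=740.0)
ratio = f_data / f_mc
```

Because the subtracted yields are a much larger fraction of the denominator than of the numerator in this example, a change in the background prediction moves the two by different relative amounts, which is why the systematic uncertainties do not fully cancel in the ratio.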

Efficiency of the substructure-variable taggers
The measured and predicted top-tagging efficiencies for the top taggers I-V and the W top tagger are studied as a function of the p T of the trimmed anti-k t R = 1.0 jet in the two pseudorapidity regions. In Figures 18 and 19, the efficiencies in the lower |η| region are shown. The efficiencies of the different top taggers are similar in the two η regions, as seen in Figure 20, in which the efficiencies of tagger III and the W top tagger in the higher |η| region are shown.
When a large-R jet is considered matched according to the geometric matching of the jet axis to the direction of the top quark, this does not necessarily imply that all decay products of the top quark are contained inside the large-R jet. Even after subtracting the not-matched contribution in Eq. (6), a significant fraction of the large-R jets with lower p T therefore do not contain all top-quark decay products. The tagging efficiency is high when all decay products are contained in the large-R jet. The efficiency is therefore low for large-R jets with small p T and it rises with p T because of the tighter collimation.
The efficiency decreases with increasing tagger number from tagger I to tagger V and the lowest efficiency of the tested taggers based on substructure variables is found for the W top tagger. The efficiencies vary between 40% and 90%, depending on the tagger and the p T of the large-R jet. The efficiencies are similar in the two η regions but the measurement is more precise for |η| < 0.7.
The measurement of the efficiency is limited by the systematic uncertainties resulting from the subtraction of background jets. The uncertainties in the measured efficiency include uncertainties related to the choice of generator used for tt production. In the lowest large-R jet p T bin, the relative uncertainties of the efficiency for |η| < 0.7 are 10% to 14%, depending on the tagger, and for 0.7 < |η| < 2.0 they vary between 11% and 17%. For |η| < 0.7, the systematic uncertainties in the interval 500 to 600 GeV vary between approximately 17% and 29%. For 0.7 < |η| < 2.0 the uncertainties from 450 to 500 GeV are 18 to 26%. The systematic uncertainty is dominated by the different efficiencies from using POWHEG or MC@NLO for the generation of the tt contribution for |η| < 0.7. In the range 0.7 < |η| < 2.0, the large-R JES, the PDF, the parton-shower and the ISR/FSR uncertainties also contribute significantly to the total systematic uncertainty.
Also shown in the figures is the prediction for f MC obtained from the simulated POWHEG+PYTHIA tt events using the nominal simulation parameters and not considering systematic uncertainties. The prediction obtained in this way is consistent with the measured efficiency within the uncertainties of the measurement. In the simulation, for which the statistical uncertainty is much smaller than for the data, the efficiencies continue to rise with p T , indicating that a plateau value is not reached in the p T range studied here.
The ratio f data / f MC is shown in the bottom panels of Figures 18-20. The nominal POWHEG+PYTHIA prediction is used for f MC . For this ratio, the full systematic uncertainties of f MC are considered, including the uncertainty from the choice of tt generator. The full correlation with the uncertainty of f data is taken into account in the systematic uncertainty of the ratio. The ratio is consistent with unity within the uncertainty in all measured p T and η ranges. For |η| < 0.7, the uncertainty of f data / f MC is 8-16% (depending on the tagger) for large-R jet p T from 350 to 400 GeV and 17-28% for 500-600 GeV. For 0.7 < |η| < 2.0, the uncertainty is 10-19% for 350-400 GeV and 19-28% for 450-500 GeV.
Figure 18: The efficiency f data , as defined in Eq. (6), for tagging trimmed anti-k t R = 1.0 jets with |η| < 0.7 with top taggers based on substructure variables (taggers I-IV) as a function of the large-R jet p T . Background (BG) is statistically subtracted from the data using simulation. The vertical error bar indicates the statistical uncertainty of the efficiency measurement and the data uncertainty band shows the systematic uncertainties. Also shown is the predicted tagging efficiency f MC , as defined in Eq. (7), from POWHEG+PYTHIA without systematic uncertainties. The ratio f data / f MC of measured to predicted efficiency is shown at the bottom of each subfigure and the error bar gives the statistical uncertainty and the band the systematic uncertainty. The systematic uncertainty of the ratio is calculated taking into account the systematic uncertainties in the data and the prediction and their correlation.
Figure 19: The efficiency f data , as defined in Eq. (6), for tagging trimmed anti-k t R = 1.0 jets with |η| < 0.7 with top taggers based on substructure variables (tagger V and W top tagger) as a function of the large-R jet p T . Background (BG) is statistically subtracted from the data using simulation. The vertical error bar indicates the statistical uncertainty of the efficiency measurement and the data uncertainty band shows the systematic uncertainties. Also shown is the predicted tagging efficiency f MC , as defined in Eq. (7), from POWHEG+PYTHIA without systematic uncertainties. The ratio f data / f MC of measured to predicted efficiency is shown at the bottom of each subfigure and the error bar gives the statistical uncertainty and the band the systematic uncertainty. The systematic uncertainty of the ratio is calculated taking into account the systematic uncertainties in the data and the prediction and their correlation.
Figure 20: The efficiency f data , as defined in Eq. (6), for tagging trimmed anti-k t R = 1.0 jets with 0.7 < |η| < 2.0 based on substructure variables (tagger III and W top tagger) as a function of the large-R jet p T . Background (BG) is statistically subtracted from the data using simulation. The vertical error bar indicates the statistical uncertainty of the efficiency measurement and the data uncertainty band shows the systematic uncertainties. Also shown is the predicted tagging efficiency f MC , as defined in Eq. (7), from POWHEG+PYTHIA without systematic uncertainties. The ratio f data / f MC of measured to predicted efficiency is shown at the bottom of each subfigure and the error bar gives the statistical uncertainty and the band the systematic uncertainty. The systematic uncertainty of the ratio is calculated taking into account the systematic uncertainties in the data and the prediction and their correlation.

Efficiency of Shower Deconstruction
The measurement of the efficiency for tagging anti-k t R = 1.0 jets with SD, using the requirement ln(χ) > 2.5, is presented in Figure 21. The signal weights are calculated assuming that all top-quark decay products are included in the large-R jet. This containment assumption leads to a rising efficiency with top-quark p T because of the tighter collimation at high p T . The SD efficiency is approximately 30% in the region with the lowest p T of the large-R jet (350-400 GeV), increases with p T and reaches ≈ 45% for 500-600 GeV in the lower |η| range and for 450-500 GeV in the higher |η| range. Within uncertainties, the measured efficiencies are compatible between the two η regions.
In the lowest measured p T region, the relative uncertainty is ≈ 16%, with the largest contributions coming from the difference observed when changing the tt generator from POWHEG to MC@NLO (12%). The uncertainties in the subjet energy scale and resolution have a much smaller impact of 0.6% and 0.4%, respectively. For p T between 500 and 600 GeV in the lower |η| range, the relative uncertainty is ≈ 32%, with the largest contributions resulting from the generator choice (27%).
The efficiency from POWHEG+PYTHIA follows the trend of the measured efficiency and the predicted and measured efficiencies agree within uncertainties, but the predicted efficiency is systematically higher. The ratio f data / f MC is approximately 80% throughout the considered p T range. The relative uncertainty of the ratio is ≈ 25% for |η| < 0.7. For 0.7 < |η| < 2.0, the uncertainty varies between ≈ 25% and ≈ 35%.
Figure 21: The efficiency f data , as defined in Eq. (6), for tagging trimmed anti-k t R = 1.0 jets with Shower Deconstruction, using the requirement ln(χ) > 2.5, as a function of the large-R jet p T . The large-R jets are selected in the signal selection and have pseudorapidities (a) |η| < 0.7 and (b) 0.7 < |η| < 2.0. Background (BG) is statistically subtracted from the data using simulation. The vertical error bar indicates the statistical uncertainty of the efficiency measurement and the data uncertainty band shows the systematic uncertainties. Also shown is the predicted tagging efficiency f MC , as defined in Eq. (7), from POWHEG+PYTHIA without systematic uncertainties. The ratio f data / f MC of measured to predicted efficiency is shown at the bottom of each subfigure and the error bar gives the statistical uncertainty and the band the systematic uncertainty. The systematic uncertainty of the ratio is calculated taking into account the systematic uncertainties in the data and the prediction and their correlation.

Efficiency of the HEPTopTagger
The efficiency for tagging C/A R = 1.5 jets with the HEPTopTagger is shown in Figure 22 as a function of the large-R jet p T . In the lowest p T interval from 200 to 250 GeV the efficiency is ≈ 10%. The efficiency increases with p T because of the geometric collimation effect and reaches ≈ 40% for p T between 350 and 400 GeV and 45-50% for p T > 500 GeV. The efficiencies in the two η regions are very similar. The measurement is systematically limited. In the lowest measured jet p T interval from 200 to 250 GeV, the relative systematic uncertainty is 8.5% with similar contributions coming from several sources, the three largest ones being the difference between POWHEG and MC@NLO as the tt generator (3.9%), the large-R jet energy scale (3.3%), and the b-tagging efficiency (3.3%). The contributions from the imperfect knowledge of the subjet energy scale and resolution are 2.5% and 2.7%, respectively. For large-R jet p T between 600 and 700 GeV, the relative uncertainty is 54%, and the largest contributions are from the generator choice (44%) and the large-R JES (22%), while the subjet energy scale (2.1%) and resolution (0.6%) have only a small impact.
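As a worked illustration of how the quoted contributions combine, the sketch below adds the five itemized sources for the 200-250 GeV interval in quadrature. It assumes that the per-interval total is built in the same way as the total in Table 5 (a quadrature sum of independent sources); the helper name is hypothetical.

```python
import math

def quadrature_sum(contributions):
    """Combine independent relative uncertainties (in percent) in quadrature."""
    return math.sqrt(sum(c * c for c in contributions))

# Contributions quoted in the text for 200 < pT < 250 GeV (percent):
# tt generator, large-R JES, b-tagging, subjet energy scale, subjet resolution.
listed = [3.9, 3.3, 3.3, 2.5, 2.7]
total_quoted = 8.5

partial = quadrature_sum(listed)                        # about 7.1
remainder = math.sqrt(total_quoted**2 - partial**2)     # roughly 4.7

print(f"itemized sources: {partial:.1f}%, all other sources combined: {remainder:.1f}%")
```

Under this assumption the five itemized sources account for about 7.1% of the quoted 8.5%, which is consistent with the statement that several sources contribute at a similar level.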
When clustering objects (particles or clusters of calorimeter cells) with the C/A algorithm using R = 1.5 and comparing the resulting jet with the jet obtained by clustering the same particles with the anti-k t algorithm using R = 1.0 and then trimming the anti-k t jet, the p T is larger for the C/A jet than for the trimmed anti-k t jet. In this paper, the p T interval 600-700 GeV for the C/A R = 1.5 jets corresponds approximately to the interval 500-600 GeV for the trimmed anti-k t R = 1.0 jets. Beyond this p T , the statistical and systematic uncertainties become larger than 30% and 65%, respectively.
The efficiency predicted by the POWHEG+PYTHIA simulation agrees with the measurement within the uncertainties. The ratio f data / f MC is consistent with unity, within uncertainties of ≈ 30% in the lowest and highest measured p T intervals and ≈ 15% between 250 and 450 GeV.
The total systematic uncertainty of the efficiency measurements when integrating over the full p T range and the range 0 < |η| < 2 is given in Table 5. The total uncertainty is 12-20% for the substructure-variable-based taggers, 22% for SD, and 9.9% for the HEPTopTagger. The largest uncertainty results from the choice of tt generator for the subtraction of the not-matched tt contribution, which introduces a normalization uncertainty in the acceptance region of the measurement (high top-quark p T ), because the p T -dependence of the cross section is different between POWHEG and MC@NLO. This difference is larger at high p T , which translates to a larger uncertainty for the substructure-variable-based taggers and SD, which use trimmed anti-k t R = 1.0 jets with p T > 350 GeV, whereas the HEPTopTagger uses C/A R = 1.5 jets with p T > 200 GeV. For the same reason, the uncertainties in the parton shower and the PDF have a larger impact for higher large-R jet p T .
The large-R JES uncertainty affects the HEPTopTagger efficiency less strongly than the efficiencies of the other taggers (Table 5). This is due to the requirement placed on the top-quark-candidate transverse momentum (p T > 200 GeV). The HEPTopTagger algorithm rejects some of the large-R jet constituents in the process of finding the hard substructure objects (mass-drop criterion) and when applying the filtering against underlying-event and pile-up contributions. The top-quark-candidate p T is determined by the subjet four-momenta and is smaller than the large-R jet p T , so that the requirement p T (top-quark candidate) > 200 GeV is stricter than the requirement p T (large-R jet) > 200 GeV. This is also the reason why the subjet energy-scale uncertainty has a larger impact on the efficiency of the HEPTopTagger compared to SD, because for SD no p T requirement on the top-quark candidate is included in the signal- and background-hypothesis weights.
Figure 22: The efficiency f data , as defined in Eq. (6), for tagging C/A R = 1.5 jets with the HEPTopTagger as a function of the large-R jet p T . The large-R jets are selected in the signal selection and have pseudorapidities (a) |η| < 0.7 and (b) 0.7 < |η| < 2.0. Background (BG) is statistically subtracted from the data using simulation. The vertical error bar indicates the statistical uncertainty of the efficiency measurement and the data uncertainty band shows the systematic uncertainties. Also shown is the predicted tagging efficiency f MC , as defined in Eq. (7), from POWHEG+PYTHIA without systematic uncertainties. The ratio f data / f MC of measured to predicted efficiency is shown at the bottom of each subfigure and the error bar gives the statistical uncertainty and the band the systematic uncertainty. The systematic uncertainty of the ratio is calculated taking into account the systematic uncertainties in the data and the prediction and their correlation.

Table 5: The relative uncertainty of the measured top-tagging efficiency (in percent) due to different sources of systematic uncertainty and the total systematic uncertainty obtained by adding the different contributions in quadrature.
Relative uncertainty of top-tagging efficiency (%)
Source | Tagger I | Tagger II | Tagger III | Tagger IV | Tagger V | W top tagger | SD | HEPTopTagger
Total | 13 | 12 | 14 | 15 | 18 | 20 | 22 | 9.9

Mistag rate
Large-R jets identified in the background selection are used to measure the top-tagging misidentification rate (mistag rate). In each large-R jet p T bin i, the mistag rate is defined as
$$ f^{\mathrm{mistag}}_{\mathrm{data},i} = \frac{N^{\mathrm{tag}}_{\mathrm{data},i}}{N_{\mathrm{data},i}} \qquad (8) $$
with $N^{(\mathrm{tag})}_{\mathrm{data},i}$ the number of measured (tagged) large-R jets. The contamination from tt events is negligible before requiring a tagged top candidate. After requiring a HEPTopTagger-tagged top candidate, the average contamination is ≈ 3% (200 < p T < 700 GeV). It is smaller than 3% for p T < 350 GeV. For larger values of p T , however, the contamination from tt increases, as the large-R jet p T spectrum falls more steeply for multijet production than for tt events, leading to a contamination of up to ≈ 5% for 350 < p T < 600 GeV and ≈ 11% for 600 < p T < 700 GeV.
For SD, the average contamination after requiring a tagged top candidate is ≈ 8% (350 < p T < 700 GeV). Although the HEPTopTagger gives higher background rejection than SD with ln(χ) > 2.5, the contamination for SD is larger on average, because the contamination increases with large-R jet p T and the SD is only studied for trimmed anti-k t R = 1.0 jets with p T > 350 GeV. For the substructure-variable taggers, the average contamination is smaller than 1.6%. Hence only for the top taggers with high rejection, SD and the HEPTopTagger, the contribution from tt events is subtracted from the numerator of Eq. (8) before calculating the mistag rate. The systematic uncertainty of the tt contribution is estimated to be ≈ 50% in each p T interval. This uncertainty influences the measurement of the mistag rate by a negligible amount compared to the statistical uncertainty that results from the finite number of tagged large-R jets in data. Therefore, only the statistical uncertainty is reported.
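A minimal sketch of the per-bin calculation described above is given below. It assumes that the mistag rate is the ratio of tagged to all selected large-R jets in a p T bin, with the simulated tt yield subtracted from the numerator only for the high-rejection taggers (SD and the HEPTopTagger). The binomial form of the statistical uncertainty, the function name and all numbers are illustrative assumptions, not values from the measurement.

```python
import math

def mistag_rate(n_tagged, n_total, n_tt_tagged=0.0):
    """Return (rate, statistical uncertainty) for one large-R jet pT bin.

    n_tagged    : tagged large-R jets observed in data
    n_total     : all large-R jets observed in data in this bin
    n_tt_tagged : simulated tt contamination among the tagged jets
                  (left at 0 for the substructure-variable taggers)
    """
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    rate = (n_tagged - n_tt_tagged) / n_total
    # Assumed binomial uncertainty from the finite number of tagged jets in data.
    p = n_tagged / n_total
    stat_unc = math.sqrt(p * (1.0 - p) / n_total)
    return rate, stat_unc

# Invented example: 300 tagged out of 10000 jets, 3% tt contamination of the tagged jets.
rate, stat_unc = mistag_rate(300, 10000, n_tt_tagged=9.0)
print(f"mistag rate = {100 * rate:.2f}% +- {100 * stat_unc:.2f}% (stat.)")
```

In this invented example a 50% uncertainty on the subtracted tt yield would shift the rate by about 0.05 percentage points, well below the statistical uncertainty of about 0.17 percentage points, which illustrates why only the statistical uncertainty is reported.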
The measured mistag rate is compared to the mistag rate observed in multijet events simulated with PYTHIA, which is defined as
$$ f^{\mathrm{mistag}}_{\mathrm{MC},i} = \frac{N^{\mathrm{tag}}_{\mathrm{MC},i}}{N_{\mathrm{MC},i}} \qquad (9) $$
in which $N^{(\mathrm{tag})}_{\mathrm{MC},i}$ is the number of (tagged) large-R jets which pass a looser background selection than required in data. The electron-trigger requirement, the minimum distance requirement between the electron-trigger object and the large-R jet, and the veto on reconstructed electrons are removed. Including these requirements in the simulation reduces the event yield significantly: the simulation then still describes the measured mistag rates, but with large statistical uncertainties and hence less predictive power.
Removing the requirements mentioned above from the background selection for the simulation is expected not to bias f mistag MC,i . The low-p T threshold of the electron trigger avoids biases towards dijet events with a well defined hard scattering axis, and a possible trigger bias is reduced by using only large-R jets away from the trigger object, i.e. jets with ∆R > 1.5. The specific requirements applied only for data are therefore designed to allow for a measurement of the mistag rate in pure multijet events which avoids trigger biases and can hence be compared to the mistag rate observed in MC simulations.
The electron-trigger requirement is fulfilled preferentially for trigger objects with high p T . The p T of the electron-trigger object and that of the large-R jet under study for the mistag-rate determination are correlated through the common hard parton-parton scattering process. The large-R jet p T spectrum is therefore different for events in which the electron-trigger combination is activated compared to those events in which this trigger combination is inactive. As the trigger requirement is not applied in simulation, the average p T of the large-R jets in simulation is observed to be lower than in data. The reconstructed MC p T distribution of the large-R jets is therefore reweighted to the p T distribution observed in data. This reweighting procedure has only a small impact on the mistag rate, which is measured in bins of large-R jet p T .
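The reweighting step can be sketched as a per-bin ratio of normalized spectra. This is only an illustration, assuming that the weights are derived from binned large-R jet p T histograms; the actual procedure and binning are not specified here, and the function name and example counts are invented.

```python
def pt_reweighting_factors(data_counts, mc_counts):
    """Per-bin weights that reshape the MC pT spectrum to the data shape.

    Both inputs are lists of event counts in the same pT bins. Each
    spectrum is normalized to unit area before the ratio is taken, so
    the weights change the shape but not the overall MC normalization.
    """
    data_total = float(sum(data_counts))
    mc_total = float(sum(mc_counts))
    weights = []
    for d, m in zip(data_counts, mc_counts):
        if m > 0:
            weights.append((d / data_total) / (m / mc_total))
        else:
            weights.append(0.0)  # no MC events in this bin to reweight
    return weights

# Invented spectra: the data are harder (more high-pT jets) than the simulation.
data = [50, 30, 20]
mc = [60, 30, 10]
weights = pt_reweighting_factors(data, mc)
reweighted_mc = [m * w for m, w in zip(mc, weights)]
```

A weight that is constant within a measurement bin cancels in the ratio of tagged to all jets in that bin, which is in line with the small impact of the reweighting noted above; any residual effect would come from the spectrum shape inside a bin.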

Mistag rate for the substructure-variable taggers
The mistag rate f mistag data is shown in Figures 23-24 for the different top taggers as a function of the large-R jet p T . Anti-k t R = 1.0 jets are used for SD. The mistag rates rise with the p T of the large-R jet, because increased QCD radiation at higher p T produces structures inside the jets that resemble the structures in top jets. For taggers with high efficiency a larger mistag rate is found than for those with lower efficiency, because these looser top-tagging criteria are met by a larger fraction of the background jets.
The mistag rates for trimmed anti-k t R = 1.0 jets tagged using substructure-variable requirements are shown in Figure 23. In the lowest p T interval from 350 to 400 GeV, the mistag rates for the taggers I-V and the W top tagger are approximately 22%, 20%, 16%, 12%, 6%, and 4%, respectively. The measured mistag rate increases with p T and reaches values between 24% and 36% for taggers I-IV in the p T interval 600-700 GeV. In this highest p T interval, the mistag rate is ≈ 16% for tagger V and ≈ 6% for the W top tagger. The predicted mistag rate f mistag MC from PYTHIA is also shown with an uncertainty band that includes systematic uncertainties due to the large-R JES and resolution uncertainties, and uncertainties of the modelling of the substructure variables. Within the uncertainties, the prediction from PYTHIA agrees with the measurement for all taggers. The uncertainties on the ratio f data / f MC are 5-9% for taggers I-IV, and, depending on the large-R jet p T , ≈ 10% for tagger V and ≈ 20% for the W top tagger. The systematic uncertainties of tagger V and the W top tagger are larger than for taggers I-IV because of the conservative treatment of the correlation between the variations of the different substructure variables as mentioned in Section 6.

Mistag rate for Shower Deconstruction
For SD, the mistag rate increases from 1% for p T between 350 and 400 GeV to ≈ 4% for 600-700 GeV. The prediction from PYTHIA shows the same trend as in data and agrees well with the measurement within relative systematic uncertainties between ≈ 40% at low p T and ≈ 13% at high p T , which result from the uncertainties in the energy scales and resolutions of the subjets and the large-R jets. Integrated over p T , the subjet energy-scale and energy-resolution uncertainties lead to relative uncertainties of 15% and 13%, respectively, while the uncertainty in the large-R JES contributes 10%. The large-R jet energy-resolution uncertainty has a negligible impact (< 1%).

Mistag rate for the HEPTopTagger
For the HEPTopTagger, the mistag rate increases from 0.5% for large-R jet p T between 200 and 250 GeV to 3% for 450-500 GeV. Above 500 GeV, the statistical uncertainties of the measured rate become large. The PYTHIA simulation agrees well with the measurement. The systematic uncertainty of the simulation is given by uncertainties in the large-R JES and resolution, and the energy scale and resolution of the subjets. The relative systematic uncertainty decreases with p T : it is 90% in the lowest measured p T bin and 8% in the highest p T bin. This behaviour is driven by the subjet energy-resolution and energy-scale uncertainties, because at low large-R jet p T a larger fraction of the HEPTopTagger subjets have momenta near the 20 GeV threshold. The mistag-rate uncertainty at low p T is dominated by the subjet energy-resolution uncertainty. The impact of the large-R jet uncertainties is significantly smaller.
Figure 23: The mistag rate f mistag data , as defined in Eq. (8), for trimmed anti-k t R = 1.0 jets as a function of the large-R jet p T using the substructure-variable taggers I-V and the W top tagger. The large-R jets are selected with the background selection and have pseudorapidities |η| < 2.0. The vertical error bar indicates the statistical uncertainty in the measurement of the mistag rate. Also shown is the predicted mistag rate f mistag MC , as defined in Eq. (9), from PYTHIA with systematic uncertainties included. The ratio of measured to predicted mistag rate is shown at the bottom of each subfigure and the error bar gives the statistical uncertainty of the measurement.

Summary and conclusions
Jet substructure techniques are used to identify high-transverse-momentum top quarks produced in proton-proton collisions at $\sqrt{s}$ = 8 TeV at the LHC. The 2012 ATLAS dataset is used, corresponding to an integrated luminosity of 20.3 ± 0.6 fb$^{-1}$.
Jets with a large radius parameter R are reconstructed and their substructure is analysed using a range of techniques that are sensitive to differences between hadronic top-quark decay and background processes. Jets are tagged as top jets by requirements imposed on the jet mass, splitting scales, and N-subjettiness, and by using the more elaborate algorithms of Shower Deconstruction (SD) and the original (not multivariate) HEPTopTagger. Six different combinations of requirements on substructure variables are investigated: five combinations denoted taggers I-V, and the W top tagger. For these taggers and for Shower Deconstruction, trimmed anti-k t R = 1.0 jets with p T > 350 GeV are used. Cambridge/Aachen (C/A) R = 0.2 subjets with p T > 20 GeV are used for SD. The HEPTopTagger was designed for, and is used with, ungroomed C/A R = 1.5 jets down to jet transverse momenta of 200 GeV. The difference in the jet algorithms, radii and grooming implies that the same top quark leads to a higher p T for the C/A R = 1.5 jet. A variant of the HEPTopTagger algorithm is introduced, HEPTopTagger04, which operates on the constituents of a set of anti-k t R = 0.4 jets instead of one C/A R = 1.5 jet. This technique is optimized to avoid energy overlap when different types of jets and jet radius parameters are used to reconstruct the full event final state. The advantage of this technique compared to a separation requirement applied to the C/A R = 1.5 jet is studied for simulated events with charged-Higgs-boson decays.
The performance of the various top-tagging techniques is compared using simulation by matching the different reconstructed jets to trimmed anti-k t R = 1.0 jets formed at the particle level. The reciprocal of the mistag rate, the background rejection, is studied as a function of the efficiency in intervals of the particle-level jet transverse momentum, p true T , ranging from 350 to 1500 GeV, while the efficiency and rejection of the HEPTopTagger is also studied for 200 < p true T < 350 GeV. For 350 < p true T < 1000 GeV, SD offers the best rejection up to its maximum achievable efficiency. Top-tagging efficiencies above 70% can be achieved with cuts on substructure variables, for example, yielding rejections of approximately 3-6 for an efficiency of 80%. A rejection of ≈ 15-20 at an efficiency of ≈ 50% can be achieved with the W top tagger over the range 450 < p true T < 1000 GeV. For 1000 < p true T < 1500 GeV, of all the top-tagging methods studied, the HEPTopTagger offers the best rejection (≈ 30) at an efficiency of ≈ 40%.
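Because the background rejection is defined above as the reciprocal of the mistag rate, the quoted rejections translate directly into mistag rates. The following minimal sketch illustrates the conversion; the function names are hypothetical and the printed values are simple arithmetic on the rejections quoted in this paragraph.

```python
def background_rejection(mistag_rate):
    """Rejection = 1 / mistag rate, with the mistag rate given as a fraction in (0, 1]."""
    if not 0.0 < mistag_rate <= 1.0:
        raise ValueError("mistag rate must be a fraction in (0, 1]")
    return 1.0 / mistag_rate

def mistag_from_rejection(rejection):
    """Inverse conversion: mistag rate (as a fraction) for a given rejection >= 1."""
    if rejection < 1.0:
        raise ValueError("rejection must be at least 1")
    return 1.0 / rejection

# Rejections quoted in the text for the different taggers and efficiencies.
for rej in (3.0, 6.0, 15.0, 20.0, 30.0):
    print(f"rejection {rej:4.0f} -> mistag rate {100.0 * mistag_from_rejection(rej):.1f}%")
```

For example, a rejection of 3-6 at 80% efficiency corresponds to mistag rates of roughly 17-33%, while a rejection of ≈ 30 corresponds to a mistag rate of about 3%.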
An event sample enriched in top-quark pairs is used to study the distributions of substructure variables. Simulations of Standard Model processes describe the relevant distributions well for the six substructure-variable taggers, SD, HEPTopTagger and HEPTopTagger04 within the uncertainties. The uncertainty in the energy scale of the subjets used by the HEPTopTagger is derived by comparing the mass of the top-quark candidate reconstructed in data and simulation. The relative subjet p T uncertainty varies between 1% and 10%, depending on p T and the functional form chosen to describe the p T dependence.
The sample enriched in top-quark pairs is used to measure the efficiency to tag jets containing a hadronic top-quark decay. The efficiency is determined for jet p T between 200 and 700 GeV for the C/A R = 1.5 jets and for 350-600 GeV for the trimmed anti-k t R = 1.0 jets. The reach in p T is limited by statistical and systematic uncertainties, which become large at high p T . Jets not originating from hadronic top-quark decays are subtracted using simulation and the subtraction leads to systematic uncertainties in the measured efficiency. Integrated over the measured p T range, the relative systematic uncertainty of the efficiency varies between ≈ 10% and ≈ 20% for the different substructure-variable-based taggers, and is ≈ 20% for SD and ≈ 10% for the HEPTopTagger. The dominant source of uncertainty is the modelling of tt events, and it increases with large-R jet p T . The quoted p T -integrated uncertainties are smaller for the HEPTopTagger efficiency, because the measurement extends to smaller large-R jet p T . Simulated events generated with POWHEG+PYTHIA, with the h damp parameter set to infinity and the tt and top-quark p T spectra sequentially reweighted to describe the tt cross section measured at 7 TeV, describe the efficiency within the uncertainties of the measurement.
A sample enriched in multijet events is used to measure the mistag rate of the algorithms. The misidentification rate increases with the p T of the large-R jet and, in the range of p T studied, reaches values of 6-36% for the different substructure-variable taggers, ≈ 4% for SD, and ≈ 3% for the HEPTopTagger. The measured mistag rate is well described by simulations using PYTHIA within the modelling uncertainties and the statistical uncertainties of the measurement.
For top-tagging analyses with a low background level, e.g. tt resonance searches at top quark p T > 700 GeV in the final state with one charged lepton, it is recommended to use a top tagger with high efficiency, such as the substructure-variable-based taggers I-IV studied in this paper. If high rejection is required, e.g. for an all-hadronic final state, then for p T > 1000 GeV, one of the following taggers is likely to give the best sensitivity, depending on the details of the analysis: the W top tagger, the HEPTopTagger, or SD. For p T between 450 and 1000 GeV, SD is the tagger of choice if high rejection is required. Only the performance of the HEPTopTagger has been studied for p T down to 200 GeV. In final states with high jet multiplicity where the full event needs to be reconstructed, the HEPTopTagger04 method is a useful approach to avoid energy sharing between small-R and large-R jets.
In analyses, the uncertainty in the top-tagging efficiency for Standard-Model and beyond-the-Standard-Model predictions comprises detector-related uncertainties and theoretical modelling uncertainties. The background in analyses should be determined by employing data-driven methods, as was done for the ATLAS Run 1 analyses, because the mistag rate was observed to depend strongly on the choice of trigger, and small deficiencies in the trigger simulation can have a large impact on the analysis.
The energy scale of the HEPTopTagger subjets should be determined using the in situ method pioneered in this paper. This method takes into account all subjets used by the HEPTopTagger, even those with radius parameter R < 0.2, for which the MC-based calibrations determined for R = 0.2 are used.
It is demonstrated in this paper that the substructure of top jets shows the expected features and that it is well modelled by simulations. Top tagging has been used in LHC Run 1 analyses and its importance will increase in Run 2 with more top quarks produced with high transverse momentum due to the higher centre-of-mass energy.
Figure 25: Detector-level distributions of variables reconstructed in events passing the signal-sample selection (tt) with at least one trimmed anti-k t R = 1.0 jet with p T > 350 GeV. (a) The transverse momentum of the charged lepton and (b) the distance in (η, φ) between the highest-p T b-jet within ∆R = 1.5 of the lepton and the highest-p T trimmed anti-k t R = 1.0 jet. The vertical error bar indicates the statistical uncertainty of the measurement. Also shown are distributions for simulated SM contributions with systematic uncertainties (described in Section 6) indicated as a band. The tt prediction is split into a matched part for which the large-R jet axis is within ∆R = 0.75 of the flight direction of a hadronically decaying top quark and a not matched part for which this criterion does not hold. The ratio of measurement to prediction is shown at the bottom of each subfigure and the error bar and band give the statistical and systematic uncertainties of the ratio, respectively. The impacts of experimental and tt modelling uncertainties are shown separately for the ratio.
Distributions for events fulfilling the signal selection with at least one C/A R = 1.5 jet with p T > 200 GeV, as used in the HEPTopTagger studies, are shown in Figure 26. The distribution of the transverse mass m W T is shown in Figure 26(a). It exhibits a peak near the W-boson mass, which is expected if the reconstructed charged lepton and the E miss T correspond to the charged lepton and neutrino from the W decay and the momenta of the two particles lie in the transverse plane. The missing-transverse-momentum distribution (Figure 26(b)) displays a peak around 55 GeV and a smoothly falling spectrum for larger values.
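For orientation, the transverse mass can be computed as in the sketch below. It assumes the standard definition m W T = sqrt(2 p T (lepton) E miss T (1 − cos ∆φ)), where ∆φ is the azimuthal angle between the charged lepton and the missing transverse momentum; the paper's own definition lies outside this section, and the function name and the example momenta are illustrative.

```python
import math

def transverse_mass(lepton_pt, met, dphi):
    """Transverse mass (GeV) of the lepton + missing-transverse-momentum system.

    lepton_pt : transverse momentum of the charged lepton (GeV)
    met       : magnitude of the missing transverse momentum (GeV)
    dphi      : azimuthal angle between lepton and missing momentum (rad)
    """
    return math.sqrt(2.0 * lepton_pt * met * (1.0 - math.cos(dphi)))

# A W boson at rest decaying in the transverse plane: lepton and neutrino are
# back to back, each carrying half of an (approximate) W mass of 80.4 GeV.
m_t = transverse_mass(40.2, 40.2, math.pi)
print(f"m_T = {m_t:.1f} GeV")
```

In this idealized configuration the transverse mass equals the W-boson mass, which is the origin of the peak described above; any longitudinal momentum component of the decay products lowers the value.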
All distributions are described by the simulation within the uncertainties. Important sources of systematic uncertainty for the m W T and E miss T distributions are the large-R JES, the b-tagging efficiency, the prediction of the tt cross section, and tt modelling uncertainties from the choice of generator, parton shower, and PDF set. None of these uncertainties dominates.
Figure 26: Detector-level distributions of (a) the transverse mass m W T and (b) the missing transverse momentum E miss T for events passing the signal selection with at least one C/A R = 1.5 jet with p T > 200 GeV. The vertical error bar indicates the statistical uncertainty of the measurement. Also shown are distributions for simulated SM contributions with systematic uncertainties (described in Section 6) indicated as a band. The tt prediction is split into a matched part for which the large-R jet axis is within ∆R = 1.0 of the flight direction of a hadronically decaying top quark and a not matched part for which this criterion does not hold. The ratio of measurement to prediction is shown at the bottom of each subfigure and the error bar and band give the statistical and systematic uncertainties of the ratio, respectively. The impacts of experimental and tt modelling uncertainties are shown separately for the ratio.