Identification of heavy-flavour jets with the CMS detector in pp collisions at 13 TeV

Many measurements and searches for physics beyond the standard model at the LHC rely on the efficient identification of heavy-flavour jets, i.e. jets originating from bottom or charm quarks. In this paper, the discriminating variables and the algorithms used for heavy-flavour jet identification during the first years of operation of the CMS experiment in proton-proton collisions at a centre-of-mass energy of 13 TeV are presented. Heavy-flavour jet identification algorithms have been improved compared to those used previously at centre-of-mass energies of 7 and 8 TeV. For jets with transverse momenta in the range expected in simulated tt̄ events, these new developments result in an efficiency of 68% for the correct identification of a b jet for a probability of 1% of misidentifying a light-flavour jet. The improvement in relative efficiency at this misidentification probability is about 15%, compared to previous CMS algorithms. In addition, for the first time algorithms have been developed to identify jets containing two b hadrons in Lorentz-boosted event topologies, as well as to tag c jets. The large data sample recorded in 2016 at a centre-of-mass energy of 13 TeV has also allowed the development of new methods to measure the efficiency and misidentification probability of heavy-flavour jet identification algorithms. The b jet identification efficiency is measured with a precision of a few per cent at moderate jet transverse momenta (between 30 and 300 GeV) and about 5% at the highest jet transverse momenta (between 500 and 1000 GeV).

The CMS collaboration
E-mail: cms-publication-committee-chair@cern.ch

Introduction
The success of the physics programme of the CMS experiment at the CERN LHC requires the particles created in the LHC collisions to be reconstructed and identified as accurately as possible. With the exception of the top quark, quarks and gluons produced in pp collisions develop a parton shower and eventually hadronize, giving rise to jets of collimated particles observed in the CMS detector. Heavy-flavour jet identification techniques exploit the properties of the hadrons in the jet to discriminate between jets originating from b or c quarks (heavy-flavour jets) and those originating from light-flavour quarks or gluons (light-flavour jets). The CMS Collaboration presented in ref. [1] a set of b jet identification techniques used in physics analyses performed on LHC Run 1 pp collision data, collected in 2011 and 2012 at centre-of-mass energies of 7 and 8 TeV. This paper presents a comprehensive summary of the newly developed and optimized techniques compared to our previous results. In particular, the larger data set of pp collisions recorded at a centre-of-mass energy of 13 TeV during Run 2 of the LHC in 2016 allows the study of rarer high-momentum topologies in which daughter jets from a Lorentz-boosted parent particle merge into a single jet. Examples of such topologies include the identification of boosted Higgs bosons decaying to two b quarks, and of b jets from boosted top quarks. The identification of c jets is also of significant interest, e.g. for the study of Higgs boson decays to a pair of c quarks, and for top squark searches in the c quark plus neutralino final-state topology.
The paper is organized as follows. A brief summary of particle and jet reconstruction in the CMS detector is given in section 2. Details about the simulated proton-proton collision samples and the data-taking conditions are given in section 3. The properties of heavy-flavour jets and the variables used to discriminate between these and other jets are discussed in section 4, while the algorithms are presented in sections 5 and 6. For some physics processes, it is important to identify b jets at the trigger level. This topic is discussed in section 7. The large recorded number of proton-proton (pp) collisions permits the exploration of new methods to measure the efficiency of the heavy-flavour jet identification algorithms using data. These new methods, as well as the techniques used during Run 1, are summarized in sections 8 and 9 for efficiency measurements in nonboosted and boosted event topologies, respectively.

The CMS detector
The central feature of the CMS apparatus is a superconducting solenoid of 6 m internal diameter and a magnetic field of 3.8 T. Within the solenoid volume are a silicon pixel and strip tracker, a lead tungstate crystal electromagnetic calorimeter (ECAL), and a brass and scintillator hadron calorimeter (HCAL), each composed of a barrel and two endcap sections, together providing coverage in pseudorapidity (η) up to |η| = 3.0. Forward calorimeters extend the coverage to |η| = 5.2. Muons are detected in the pseudorapidity range |η| < 2.4 using gas-ionization chambers embedded in the steel flux-return yoke outside the solenoid.
The silicon tracker measures charged particles within the range |η| < 2.5. During the first two years of Run 2 operation at a centre-of-mass energy of 13 TeV, the silicon tracker setup did not change compared to Run 1 of the LHC. The trajectories of charged particles are reconstructed from the hits in the silicon tracking system using an iterative procedure with a Kalman filter. The tracking efficiency is typically over 98% for tracks with a transverse momentum (p T ) above 1 GeV. For nonisolated particles with 1 < p T < 10 GeV and |η| < 1.4, the track resolutions are typically 1.5% in p T and 25-90 µm in the transverse (longitudinal) impact parameter (IP) [2]. The pp interaction vertices are reconstructed by clustering tracks on the basis of their z coordinates at their points of closest approach to the centre of the beam spot using a deterministic annealing algorithm [3]. The position of each vertex is estimated with an adaptive vertex fit [4]. The resolution on the position is around 20 µm in the transverse plane and around 30 µm along the beam axis for primary vertices reconstructed using at least 50 tracks [2].
The global event reconstruction, also called particle-flow (PF) event reconstruction [5], consists of reconstructing and identifying each individual particle with an optimized combination of all subdetector information. In this process, the identification of the particle type (photon, electron, muon, charged hadron, neutral hadron) plays an important role in the determination of the particle direction and energy. Photons, e.g. coming from neutral pion decays or from electron bremsstrahlung, are identified as ECAL energy clusters not linked to the extrapolation of any charged-particle trajectory to the ECAL. Electrons, e.g. coming from photon conversions in the tracker material or from heavy-flavour hadron semileptonic decays, are identified as combinations of charged-particle tracks reconstructed in the tracker and multiple ECAL energy clusters corresponding to both the passage of the electron through the ECAL and any associated bremsstrahlung photons. Muons, e.g. from the semileptonic decay of heavy-flavour hadrons, are identified as tracks reconstructed in the tracker combined with matching hits or tracks in the muon system, and matching energy deposits in the calorimeters. Charged hadrons are identified as charged particles not identified as electrons or muons. Finally, neutral hadrons are identified as HCAL energy clusters not matching any charged-particle track, or as ECAL and HCAL energy excesses with respect to the expected charged-hadron energy deposit.
For each event, particles originating from the same interaction vertex are clustered into jets with the infrared and collinear safe anti-$k_\mathrm{T}$ algorithm [6,7], using a distance parameter R = 0.4 (AK4 jets). Compared to the R = 0.5 jets that were used in Run 1 physics analyses, jets reconstructed with R = 0.4 are found to still contain most of the particles from the hadronization process, while at the same time being less sensitive to particles from additional pp interactions (known as pileup) appearing in the same or adjacent bunch crossings. For studies involving boosted topologies, jets are clustered with a larger distance parameter R = 0.8 (AK8 jets). The jet momentum is determined as the vectorial sum of all particle momenta in the jet. Jet energy corrections are derived from the simulation and are confirmed with in situ measurements using the energy balance in dijet, multijet, photon + jet, and leptonically decaying Z + jets events [8]. The jet energy resolution amounts typically to 15% at 10 GeV, 8% at 100 GeV, and 4% at 1 TeV [8]. For the studies presented here, jets are required to lie within the tracker acceptance (|η| < 2.4) and have p T > 20 GeV. The missing transverse momentum vector is defined as the projection of the negative vector sum of the momenta of all reconstructed particles in an event on the plane perpendicular to the beams. Its magnitude is referred to as $p_\mathrm{T}^\mathrm{miss}$. The reconstructed vertex with the largest value of summed physics-object $p_\mathrm{T}^2$ is taken to be the primary pp interaction vertex (PV). The physics objects are the jets, clustered using the jet finding algorithm with the tracks assigned to the vertex as inputs, and the associated missing transverse momentum, taken as the negative vector sum of the p T of those jets.
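As a rough illustration of the primary-vertex choice described above, the following Python sketch picks the vertex whose physics objects have the largest summed squared transverse momentum. The input layout (one list of object p T values per vertex) and the name `select_primary_vertex` are assumptions made for this example and do not correspond to CMS software interfaces.

```python
def select_primary_vertex(vertex_object_pts):
    """Return the index of the vertex with the largest sum of squared object pT.

    vertex_object_pts: one list per reconstructed vertex, holding the pT (GeV)
    of the physics objects (track-based jets and the associated missing pT)
    assigned to that vertex. This layout is hypothetical.
    """
    if not vertex_object_pts:
        raise ValueError("no reconstructed vertices")
    sums = [sum(pt * pt for pt in pts) for pts in vertex_object_pts]
    # Index of the largest summed pT^2.
    return max(range(len(sums)), key=sums.__getitem__)


# Toy event: vertex 1 has fewer objects but a much harder jet.
toy_vertices = [[12.0, 9.0, 7.5], [85.0, 30.0], [5.0]]
pv_index = select_primary_vertex(toy_vertices)
```

Squaring the p T values favours vertices with a few hard objects over vertices with many soft ones, which is why vertex 1 is chosen in the toy event.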
The energy of electrons is determined from a combination of the track momentum at the main interaction vertex, the corresponding ECAL cluster energies, and the energies of all bremsstrahlung photons associated with the track. The momentum resolution for electrons with p T ≈ 45 GeV from Z → ee decays ranges from 1.7% for nonshowering electrons, i.e. not producing additional photons and electrons, in the barrel region (|η| < 1.48), to 4.5% for showering electrons in the endcaps (1.48 < |η| < 3.0) [9]. Muons with 20 < p T < 100 GeV have a relative p T resolution of 1.3-2.0% in the barrel and less than 6% in the endcaps. The p T resolution in the barrel is better than 10% for muons with p T up to 1 TeV [10]. The energy of charged hadrons is determined from a combination of the track momentum and the corresponding ECAL and HCAL energy deposits, corrected for zero-suppression effects and for the response function of the calorimeters to hadronic showers. Finally, the energy of neutral hadrons is obtained from the corresponding corrected ECAL and HCAL energy deposits.
Events of interest are selected using a two-tiered trigger system [11]. The level-1 trigger (L1), composed of custom hardware processors, uses information from the calorimeters and muon detectors to select events at a rate of around 100 kHz. The second level, known as the high-level trigger (HLT), consists of a farm of processors running a version of the full event reconstruction software optimized for fast processing, and reduces the event rate to less than 1 kHz before data storage.
A more detailed description of the CMS detector, together with a definition of the coordinate system used and the relevant kinematic variables, can be found in ref. [12].

Data and simulated samples
The results presented in this paper are based on the pp collision data set recorded at a centre-of-mass energy of 13 TeV by the CMS detector in 2016, corresponding to an integrated luminosity of 35.9 fb⁻¹. Various event generators are used to model the relevant physics processes. The interactions between particles and the material of the CMS detector are simulated using GEANT4 [13-15]. The data and simulated samples are used to determine the heavy-flavour jet identification efficiency in various event topologies. When measuring the heavy-flavour jet identification efficiency or when comparing the data to the simulation, the number of simulated events is large enough to neglect the statistical uncertainty in the simulation unless mentioned otherwise.
Top quark pair production and electroweak single top quark production are simulated with the POWHEG 2.0 generator at next-to-leading order (NLO) accuracy [16-21]. The value of the top quark mass used for the generation of the simulated samples is 172.5 GeV. The systematic uncertainty related to the value of the top quark mass m t is evaluated by varying it by ±1 GeV. Alternative samples are used to assess parton shower uncertainties, as well as factorization and renormalization scale uncertainties at the matrix element and parton shower levels. Diboson WW, WZ, and ZZ events, referred to collectively as "VV" events, are generated at NLO accuracy with the MadGraph5_aMC@NLO 2.2.2 generator [22], including MadSpin [23] and the FxFx merging scheme [24] between jets from matrix element calculations and the parton shower description, or with the POWHEG 2.0 generator [25,26]. The Z + jets and W + jets events are generated with MadGraph5_aMC@NLO 2.2.2 at leading order (LO), using the MLM matching scheme [27]. Samples of events with a Kaluza-Klein graviton [28] decaying to two Higgs bosons are also simulated with MadGraph5_aMC@NLO 2.2.2 at LO for graviton masses ranging between 1 and 3.5 TeV. Background events comprised uniquely of jets produced through the strong interaction (multijet events) are generated with PYTHIA 8.205 [29] in different p̂ T bins, where p̂ T is defined as the average p T of the final-state partons. Muon-enriched multijet samples are produced by forcing the decay of charged pions and kaons into muons and by requiring a generated muon with p T > 5 GeV. PYTHIA 8.205 is also used for the parton showering and hadronization of all the simulated samples with the CMS underlying event tune CUETP8M1 [30] using the NNPDF 2.3 [31] parton distribution functions. In the case of top quark pair production a modification of this tune is used, CUETP8M2T4 [32], using the NNPDF 3.0 [33] parton distribution functions.
Pileup interactions are modelled by overlaying the simulated events with additional minimum bias collisions generated with PYTHIA 8.205. The simulated events are then reweighted to match the observed number of pileup interactions or the primary vertex multiplicity in data.
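The reweighting step can be pictured with a small Python sketch. It assumes, as one common approach that the text does not spell out, that each simulated event receives a weight equal to the ratio of the normalized data and simulation distributions of the number of pileup interactions. The histogram layout and the name `pileup_weights` are illustrative only.

```python
def pileup_weights(data_counts, sim_counts):
    """Per-bin weights: normalized data fraction / normalized simulation fraction.

    Both arguments map a number of pileup interactions to an event count.
    Bins that are empty in the simulation are skipped (nothing to reweight).
    """
    data_total = sum(data_counts.values())
    sim_total = sum(sim_counts.values())
    weights = {}
    for n_pu, n_sim in sim_counts.items():
        if n_sim == 0:
            continue
        data_frac = data_counts.get(n_pu, 0) / data_total
        sim_frac = n_sim / sim_total
        weights[n_pu] = data_frac / sim_frac
    return weights


# Toy histograms (hypothetical numbers, not CMS data).
data_counts = {10: 100, 20: 300, 30: 100}
sim_counts = {10: 200, 20: 200, 30: 100}
weights = pileup_weights(data_counts, sim_counts)

# The weighted simulation keeps its overall normalization.
weighted_total = sum(sim_counts[n] * w for n, w in weights.items())
```

Because both distributions are normalized before taking the ratio, the weights change the shape of the simulated pileup distribution without changing the total simulated yield.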

Heavy-flavour jet discriminating variables

Properties of heavy-flavour jets
Algorithms for heavy-flavour jet identification use variables connected to the properties of heavy-flavour hadrons present in jets resulting from the radiation and hadronization of b or c quarks. For instance, the lifetime of hadrons containing b quarks is of the order of 1.5 ps, while the lifetime of c hadrons is 1 ps or less. This leads to typical displacements of a few mm to one cm for b hadrons, depending on their momentum, thus giving rise to displaced tracks from which a secondary vertex (SV) may be reconstructed, as illustrated in figure 1. The displacement of tracks with respect to the primary vertex is characterized by their impact parameter, which is defined as the distance between the primary vertex and the tracks at their points of closest approach. The vector pointing from the primary vertex to the point of closest approach is referred to as the impact parameter vector. The impact parameter value can be defined in three spatial dimensions (3D) or in the plane transverse to the beam line (2D). The longitudinal impact parameter is defined in one dimension, along the beam line. The impact parameter is defined to be positive or negative, with a positive sign indicating that the track is produced "upstream". This means that the angle between the impact parameter vector and the jet axis is smaller than π/2, where the jet axis is defined by the primary vertex and the direction of the jet momentum. In addition, b and c quarks have a larger mass and harder fragmentation compared to the light quarks and massless gluons. As a result, the decay products of the heavy-flavour hadron have, on average, a larger p T relative to the jet axis than the other jet constituents. In approximately 20% (10%) of the cases, a muon or electron is present in the decay chain of a heavy b (c) hadron.
Hence, apart from the properties of the reconstructed secondary vertex or displaced tracks, the presence of charged leptons is also exploited for heavy-flavour jet identification techniques and for measuring their performance in data.
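The sign convention for the impact parameter can be made concrete with a short Python sketch. It assumes the primary vertex, the point of closest approach, and the jet direction are available as 3D Cartesian coordinates. The function name and argument layout are hypothetical. An angle smaller than π/2 between the impact parameter vector and the jet axis is equivalent to a positive scalar product, which is what the code tests.

```python
import math


def signed_impact_parameter(pv, pca, jet_dir, sigma):
    """Return (signed IP, signed IP significance).

    pv      : primary vertex position (x, y, z) in cm
    pca     : point of closest approach of the track to the PV, in cm
    jet_dir : jet momentum direction (need not be normalized)
    sigma   : uncertainty on the IP value, in cm
    """
    ip_vec = [c - p for c, p in zip(pca, pv)]
    ip = math.sqrt(sum(x * x for x in ip_vec))
    # Angle below pi/2 between IP vector and jet axis <=> positive dot product.
    dot = sum(a * b for a, b in zip(ip_vec, jet_dir))
    sign = 1.0 if dot > 0 else -1.0
    return sign * ip, sign * ip / sigma


ip_pos, sip_pos = signed_impact_parameter(
    (0.0, 0.0, 0.0), (0.003, 0.004, 0.0), (1.0, 0.0, 0.0), 0.001)
ip_neg, sip_neg = signed_impact_parameter(
    (0.0, 0.0, 0.0), (0.003, 0.004, 0.0), (-1.0, 0.0, 0.0), 0.001)
```

The same track displacement yields a positive value when it points along the jet direction and a negative value when it points against it.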
JINST 13 P05011

In order to design and optimize heavy-flavour identification techniques, a reliable method is required for assigning a flavour to jets in simulated events. The jet flavour is determined by clustering not only the reconstructed final-state particles into jets, but also the generated b and c hadrons that do not have b and c hadrons as daughters respectively. To prevent these generated hadrons from affecting the reconstructed jet momentum, the modulus of the hadron four-momentum is set to a small number, retaining only the directional information. This procedure is known as ghost association [34]. Jets containing at least one b hadron are defined as b jets; the ones containing at least one c hadron and no b hadron are defined as c jets. The remaining jets are considered to be light-flavour (or "udsg") jets. Since pileup interactions are not included during the hard-scattering event generation, jets from pileup interactions ("pileup jets") in the simulation are tentatively identified as jets without a matched generated jet. The generated jets are reconstructed with the jet clustering algorithm mentioned in section 2 applied to the generated final-state particles (excluding neutrinos). The matching between the reconstructed PF jets and the generated jets with p T > 8 GeV is performed by requiring the angular distance between them to be $\Delta R = \sqrt{(\Delta\eta)^2 + (\Delta\phi)^2} < 0.25$. Using this flavour definition, jets arising from gluon splitting to bb̄ are considered as b jets. In sections 6, 8 and 9, these g → bb̄ jets are often shown as a separate category. In this case, two b hadrons without b hadron daughters should be clustered in the jet. The studies presented in sections 4 and 5 are based on simulated events. For these studies, jets are removed if they are closer than ∆R = 0.4 to a generated charged lepton from a direct V boson decay. In addition, electrons or muons originating from gauge boson decays that are reconstructed as jets are removed if they carry more than 60% of the jet p T, i.e. $p_\mathrm{T}^{\ell}/p_\mathrm{T}^\mathrm{jet} < 0.6$ is required, where $p_\mathrm{T}^{\ell}$ ($p_\mathrm{T}^\mathrm{jet}$) is the p T of the lepton (jet). No additional identification or isolation requirements are applied for muons or electrons.
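The flavour definition and the generated-jet matching described above can be sketched in Python as follows. The hadron counts are assumed to come from a ghost-association step that is not reproduced here, and the tuple layout `(pt, eta, phi)` and all function names are hypothetical choices for this example.

```python
import math


def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance sqrt(deta^2 + dphi^2), with dphi wrapped into [-pi, pi)."""
    dphi = (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(eta1 - eta2, dphi)


def jet_flavour(n_b_hadrons, n_c_hadrons):
    """Flavour label from the number of ghost-associated b and c hadrons."""
    if n_b_hadrons >= 1:
        return "b"
    if n_c_hadrons >= 1:
        return "c"
    return "udsg"


def has_gen_match(reco_jet, gen_jets, max_dr=0.25, min_gen_pt=8.0):
    """True if a generated jet with pT > 8 GeV lies within Delta R < 0.25.

    reco_jet and the entries of gen_jets are (pt, eta, phi) tuples.
    Reconstructed jets without such a match are pileup-jet candidates.
    """
    _, eta, phi = reco_jet
    return any(
        gen_pt > min_gen_pt and delta_r(eta, phi, gen_eta, gen_phi) < max_dr
        for gen_pt, gen_eta, gen_phi in gen_jets
    )
```

The wrap of ∆φ matters for jets on either side of φ = ±π: without it, two nearly collinear jets at φ = 3.1 and φ = −3.1 would appear to be separated by 6.2 instead of about 0.08.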

Track selection and variables
The properties of the tracks clustered within the jet represent the basic inputs of all heavy-flavour jet identification (tagging) algorithms. Input variables for the tagging algorithms are constructed from the tracks after applying appropriate selection criteria. In particular, to ensure a good momentum and impact parameter resolution, tracks are required to have p T > 1 GeV, a $\chi^2$ value of the trajectory fit normalized to the number of degrees of freedom below 5, and at least one hit in the pixel layers of the tracker detector. The last of these requirements is less stringent than the requirement used for b jet identification in Run 1, where at least eight hits were required in the pixel and strip tracker combined, of which at least two were pixel detector hits. The requirement on the number of hits was relaxed to cope with saturation effects that were observed at high occupancy in the readout electronics of the strip tracker during the first part of the 2016 data taking, leading to a reduced tracking and b tagging performance. The issues with the readout electronics have been fully resolved, with no side effects on the tracking performance, but the relaxed requirement on the number of hits was kept since there was no impact on the final b tagging performance. Apart from the requirements on the quality of the tracks, the presence of tracks from long-lived $K^0_\mathrm{S}$ or Λ hadrons as well as from material interactions is reduced by requiring the track decay length, defined as the distance from the primary vertex to the point of closest approach between the track and the jet axis, to be less than 5 cm. The contribution from tracks originating from pileup vertices is reduced with the following set of requirements: the absolute value of the transverse (longitudinal) impact parameter of the track is required to be smaller than 0.2 (17) cm and the distance between the track and the jet axis at their point of closest approach is required to be less than 0.07 cm.
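The track requirements listed above can be collected into a single predicate. The following Python sketch uses a hypothetical `Track` container whose field names are invented for this example. Only the thresholds are taken from the text.

```python
from dataclasses import dataclass


@dataclass
class Track:
    pt: float                # transverse momentum, GeV
    chi2_ndof: float         # normalized chi2 of the trajectory fit
    n_pixel_hits: int        # hits in the pixel layers
    decay_length: float      # PV to closest approach of track and jet axis, cm
    ip2d: float              # transverse impact parameter, cm
    ipz: float               # longitudinal impact parameter, cm
    dist_to_jet_axis: float  # track-to-jet-axis distance at closest approach, cm


def passes_track_selection(trk):
    """Apply the quality, long-lived-hadron, and pileup requirements."""
    return (
        trk.pt > 1.0
        and trk.chi2_ndof < 5.0
        and trk.n_pixel_hits >= 1
        and trk.decay_length < 5.0
        and abs(trk.ip2d) < 0.2
        and abs(trk.ipz) < 17.0
        and abs(trk.dist_to_jet_axis) < 0.07
    )


good = Track(2.5, 1.2, 3, 0.8, 0.01, 0.5, 0.02)
far_from_axis = Track(2.5, 1.2, 3, 0.8, 0.01, 0.5, 0.10)
no_pixel_hit = Track(2.5, 1.2, 0, 0.8, 0.01, 0.5, 0.02)
```

Keeping all thresholds in one function makes it straightforward to compare the number of tracks before and after selection for each jet flavour.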
Figure 2 presents typical distributions of the latter variable for jets in tt̄ events after applying the rest of the track selection requirements, showing the origin of each track separately. The origin of a track is labelled with "b hadron" if the track corresponds to a particle originating from a b hadron decay. A track corresponding to a particle from the decay of a c hadron that itself originates from the decay of a b hadron is also labelled as "b hadron". The category with the "c hadron" label contains only tracks corresponding to a particle from the decay of a c hadron without a b hadron ancestor. The label "uds hadron" indicates tracks corresponding to particles without heavy-flavour hadron ancestors. The label "pileup" refers to tracks from charged particles originating from a different primary vertex. A category with mismeasured tracks is defined containing tracks that are more likely to have been misreconstructed, e.g. by wrongly combining hits created by different particles. A track belongs to this category if the ratio of the number of hits from the simulated charged particle closest to the track to the number of hits associated with the track is less than 75%. This category is labelled as "fake". In figure 3, the impact of the track selection requirements on the number of tracks in a given category is shown for various jet flavours in tt̄ events. The track selection requirements clearly enhance the fraction of tracks originating from heavy-flavour hadron decays in bottom and charm jets. The track selection requirements reduce the number of tracks in the fake and pileup categories to a few per cent for all jet flavours. Figure 4 shows the track multiplicity dependence on the jet p T and |η| for various jet flavours in tt̄ events before and after applying the track selection requirements. For b jets, the average track multiplicity is higher than for light-flavour jets, before and after applying the track selection requirements, and the ratio of the average track multiplicity for b jets to other jet flavours is roughly constant. The average track multiplicity increases with increasing jet p T for all jet flavours. Before the track selection, the average track multiplicity is almost constant with respect to the jet |η|. The small variations seen are due to the tracker geometry that has an impact on the track reconstruction efficiency. In addition, since the η of the jet is defined as the η of the jet axis, some of the charged particles in the jet are outside the tracker acceptance for high jet |η| values, resulting in a lower track multiplicity in the highest bin. When the track selection requirements are applied, the average track multiplicity decreases with respect to the jet |η|, because of the relatively larger impact of the track selection requirements near the edge of the acceptance window for the tracker.
Figure 2. Distribution of the distance between a track and the jet axis at their point of closest approach for tracks associated with b (left) and light-flavour (right) jets in tt̄ events. This distance is required to be smaller than 0.07 cm, as indicated by the arrow. The tracks are divided into categories according to their origin as defined in the text. The distributions are normalized such that their sum has unit area. The last bin includes the overflow entries. (Panel labels: CMS Simulation, 13 TeV, 2016; tt̄ + jets; jets with p T > 20 GeV.)
The aforementioned track selection requirements are always applied when reconstructing the variables used in the tagging algorithms. An exception is given by the variables relying on the inclusive vertex finding algorithm, as discussed in section 4.3. Figure 5 shows the distribution of the 3D impact parameter and its significance for the different jet flavours. The impact parameter significance is defined as the impact parameter value divided by its uncertainty, IP/σ. In addition, the lower panels in figure 5 also show the distribution of the 2D impact parameter significance for the track with the highest and second-highest 2D impact parameter significance for different jet flavours. From figure 5 it is clear that tracks in heavy-flavour jets have larger impact parameter and impact parameter significance compared to tracks in light-flavour jets.
Figure 5. Distribution of the 3D impact parameter value (upper left) and significance (upper right) for tracks associated with jets of different flavours in tt̄ events. Distribution of the 2D impact parameter significance for the track with the highest (lower left) and second-highest (lower right) 2D impact parameter significance for jets of different flavours in tt̄ events. The distributions are normalized to unit area. The first and last bin include the underflow and overflow entries, respectively.
Figure 5 also shows that tracks with a large impact parameter significance are present in light-flavour jets. These originate from the decays of relatively long-lived hadrons, for example $K^0_\mathrm{S}$ or Λ, or from heavy-flavour hadrons where the tracks have been incorrectly clustered into a light-flavour jet. For the track with the second-highest impact parameter significance in light-flavour jets, the distribution is much more symmetric, as expected for hadrons with a short lifetime.

Secondary vertex reconstruction and variables
If the secondary vertex from the decay of a heavy-flavour hadron is reconstructed, powerful discriminating variables can be derived from it. An example is the (corrected) secondary vertex mass, which is directly related to the mass of the heavy-flavour hadron. The corrected secondary vertex mass is defined as $\sqrt{M_\mathrm{SV}^2 + p^2 \sin^2\theta} + p \sin\theta$, where $M_\mathrm{SV}$ is the invariant mass of the tracks associated with the secondary vertex, p is the secondary vertex momentum obtained from the tracks associated with it, and θ is the angle between the secondary vertex momentum and the vector pointing from the primary vertex to the secondary vertex, which is referred to as the secondary vertex flight direction. Using this definition, the secondary vertex mass is corrected for the observed difference between its flight direction and its momentum, taking into account particles that were not reconstructed or which failed to be associated with the secondary vertex. It should be noted that the energy of a track is obtained using its momentum and assuming the π± mass [35]. Another example of a discriminating secondary vertex variable is its flight distance (significance), defined as the 2D or 3D distance between the primary and secondary vertex positions (divided by the uncertainty on the secondary vertex flight distance). Reconstructing the secondary vertex from the heavy-flavour hadron decay is not always possible for two main reasons: the heavy-flavour hadron decays too close to the primary vertex, or there are fewer than two selected tracks. The latter may be due to having fewer than two charged particles in the decay, fewer than two reconstructed tracks, or fewer than two tracks passing the selection requirements.
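The corrected secondary vertex mass, sqrt(M_SV² + p² sin²θ) + p sinθ, can be evaluated with a few lines of Python. The sketch below is only a numerical illustration of that expression. The function name and the example values are made up for this purpose.

```python
import math


def corrected_sv_mass(m_sv, p, theta):
    """Corrected secondary vertex mass.

    m_sv  : invariant mass of the tracks at the secondary vertex, GeV
    p     : magnitude of the secondary vertex momentum, GeV
    theta : angle between the vertex momentum and the flight direction, rad
    """
    # Momentum component transverse to the flight direction; it accounts for
    # particles that were not reconstructed or not attached to the vertex.
    p_perp = p * math.sin(theta)
    return math.sqrt(m_sv ** 2 + p_perp ** 2) + p_perp


aligned = corrected_sv_mass(3.0, 8.0, 0.0)             # no correction
misaligned = corrected_sv_mass(3.0, 8.0, math.pi / 6)  # p_perp = 4 GeV
```

When the momentum is aligned with the flight direction the correction vanishes and the result equals M_SV. For a 30° misalignment in the example, the corrected mass rises from 3 GeV to about 9 GeV.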
Two algorithms for reconstructing secondary vertices are used. The first one is the adaptive vertex reconstruction (AVR) algorithm [36]. This secondary vertex reconstruction algorithm was used for b jet identification by the CMS Collaboration during the LHC Run 1 [1]. The algorithm uses the tracks clustered within jets and passing the selection requirements discussed in section 4.2. In addition, the tracks are required to be within ∆R < 0.3 of the jet axis and to have a track distance below 0.2 cm. The vertex pattern recognition iteratively fits all tracks with an outlier-resistant adaptive vertex fitter [4]. At each iteration, tracks close enough to the fitted vertex are removed and a new iteration is made with the remaining tracks. Given that the first iteration often finds a vertex close to the primary vertex, the first iteration is explicitly run with a constraint on the primary vertex. Vertices are rejected if it is found that they share more than 65% of their tracks with the primary vertex, or if their 2D secondary vertex flight distance is more than 2.5 cm or less than 0.01 cm. In addition, the 2D secondary vertex flight distance significance is required to be larger than 3. To reduce the impact of long-lived hadron decays and material interactions, only secondary vertices with $M_\mathrm{SV}$ < 6.5 GeV are considered. Pairs of tracks are rejected if they are compatible with the mass of the relatively long-lived $K^0_\mathrm{S}$ hadron within 50 MeV. Additionally, the angular distance between the jet axis and the secondary vertex flight direction should satisfy ∆R < 0.4. When all these requirements are fulfilled, the reconstructed AVR secondary vertex is associated with the jet.
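The acceptance requirements on AVR vertices listed above can be summarized in a short Python sketch. The function names and arguments are hypothetical, and the approximate K0S mass of 0.4976 GeV is a value supplied for this example rather than one quoted in the text. The vertex fit itself is not reproduced.

```python
K0S_MASS_GEV = 0.4976  # approximate K0S mass, supplied for this sketch


def is_k0s_candidate(pair_mass_gev, window_gev=0.05):
    """True if a track pair is compatible with the K0S mass within 50 MeV."""
    return abs(pair_mass_gev - K0S_MASS_GEV) < window_gev


def avr_vertex_accepted(pv_shared_fraction, flight2d_cm, flight2d_significance,
                        m_sv_gev, dr_jet_flight):
    """Apply the AVR secondary vertex requirements quoted in the text."""
    if pv_shared_fraction > 0.65:                  # too many tracks shared with the PV
        return False
    if flight2d_cm > 2.5 or flight2d_cm < 0.01:    # 2D flight distance window
        return False
    if flight2d_significance <= 3.0:               # must be larger than 3
        return False
    if m_sv_gev >= 6.5:                            # only M_SV < 6.5 GeV is kept
        return False
    return dr_jet_flight < 0.4                     # flight direction close to jet axis


accepted = avr_vertex_accepted(0.2, 0.3, 8.0, 2.5, 0.05)
too_far = avr_vertex_accepted(0.2, 3.0, 8.0, 2.5, 0.05)
```

Each early return corresponds to one sentence of the selection, so a rejected vertex can be traced back to the requirement that removed it.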
At the start of LHC Run 2, the inclusive vertex finding (IVF) algorithm was adopted as the standard secondary vertex reconstruction algorithm used to define variables for heavy-flavour jet tagging. In contrast with AVR, which uses as input the selected tracks clustered in the reconstructed jets, IVF uses as input all reconstructed tracks in the event with p T > 0.8 GeV and a longitudinal IP < 0.3 cm. The algorithm was initially developed to perform a measurement of the angular correlations between the b jets in bb̄ pair production [37]. It is well suited for b hadron decays at small relative angle giving rise to overlapping, or completely merged, jets. The IVF procedure starts by identifying seed tracks with a 3D impact parameter value of at least 50 µm and a 2D impact parameter significance of at least 1.2. After identifying the seed tracks, the procedure includes the following steps:
• Track clustering: the compatibility between a seed track and any other track is evaluated using requirements on the distance at the point of closest approach of the two tracks and the angle between them. In addition, the distance between the seed track and any other track at their points of closest approach is required to be smaller than the distance between the track and the primary vertex at their points of closest approach.
• Secondary vertex fitting and cleaning: in order to determine the position of the secondary vertices, the sets of clustered tracks are fitted with the adaptive vertex fitter also used in the AVR algorithm. After the fit, secondary vertices with a 2D (3D) flight distance significance smaller than 2.5 (0.5) are removed. For IVF vertices used in the c tagging algorithm presented in section 5.2.1, the threshold is relaxed to 1.25 (0.25). In addition, if two secondary vertices share 70% or more of their tracks, or if the significance of the flight distance between the two secondary vertices is less than 2, one of the two secondary vertices is dropped from the collection of secondary vertices.
• Track arbitration: at this stage, a track could be assigned to both the primary vertex and secondary vertex. To resolve this ambiguity, a track is discarded from the secondary vertex if it is more compatible with the primary vertex. This is the case if the angular distance between the track and the secondary vertex flight direction is ∆R > 0.4, and if the distance between the secondary vertex and the track is larger than the absolute impact parameter value of the track.
• Secondary vertex refitting and cleaning: the secondary vertex position is refitted after track arbitration and if there are still two or more tracks associated with the secondary vertex. After refitting the secondary vertex positions, a second check for duplicate vertices is performed. This time, a secondary vertex is removed from the collection of secondary vertices when it shares at least 20% of its tracks with another secondary vertex and the significance of the flight distance between the two secondary vertices is less than 10.
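Two of the decisions in the IVF procedure, the seed-track requirement and the track arbitration rule, reduce to simple threshold comparisons. The sketch below illustrates them under stated assumptions: the `Track` container, its field names, and the use of absolute values for the impact parameter quantities are hypothetical choices made for illustration; only the numerical thresholds come from the text.

```python
from dataclasses import dataclass


@dataclass
class Track:
    """Hypothetical track summary; field names are illustrative assumptions."""
    ip3d_cm: float            # 3D impact parameter value w.r.t. the primary vertex, in cm
    ip2d_significance: float  # 2D impact parameter divided by its uncertainty


def is_seed_track(track: Track) -> bool:
    """Seed requirement: 3D IP of at least 50 micrometres and 2D IP significance of at least 1.2."""
    fifty_micrometres_in_cm = 50e-4
    return (abs(track.ip3d_cm) >= fifty_micrometres_in_cm
            and abs(track.ip2d_significance) >= 1.2)


def more_compatible_with_primary_vertex(delta_r_track_sv: float,
                                        distance_track_sv_cm: float,
                                        ip_cm: float) -> bool:
    """Track arbitration: the track is dropped from the secondary vertex if
    Delta R(track, SV flight direction) > 0.4 and its distance to the
    secondary vertex exceeds its absolute impact parameter value."""
    return delta_r_track_sv > 0.4 and distance_track_sv_cm > abs(ip_cm)
```

Both conditions of the arbitration rule must hold for a track to be removed, so a track close in angle to the flight direction is always kept.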
The selection criteria applied to the remaining IVF secondary vertices are mostly the same as in the case of the AVR vertices. However, to maximize the secondary vertex reconstruction efficiency, some requirements are relaxed. In particular, secondary vertices are rejected when they share 80% or more of their tracks, and when the 2D flight distance significance is less than 2 (1.5) for secondary vertices used in b (c) tagging algorithms. The remaining secondary vertices are then associated with the jets by requiring the angular distance between the jet axis and the secondary vertex flight direction to satisfy ∆R < 0.3.
Figure 6 shows the discriminating power between the various jet flavours for the IVF secondary vertex mass (left) and 2D flight distance significance (right). The secondary vertex mass for b jets peaks at higher values compared to that of the other jet flavours. For c jets, a peak is observed around 1.5 GeV, as expected from the lower mass of c compared to b hadrons. The secondary vertex reconstruction efficiency for jets is defined as the number of jets containing a reconstructed secondary vertex divided by the total number of jets. For jets with p T > 20 GeV in tt̄ events, the efficiency for reconstructing a secondary vertex for b (udsg) jets using the IVF algorithm is about 75% (12%), compared to 65% (4%) for reconstructing a secondary vertex with the AVR algorithm. However, the efficiency gain is largest for c jets with an IVF secondary vertex reconstruction efficiency of about 37%, compared to 23% for the efficiency of the AVR algorithm. Averaged over all jet flavours, 66% of the IVF secondary vertices in jets are also found by the AVR algorithm. The other way around, 86% of the AVR secondary vertices are also found by the IVF algorithm. Figure 7 (left) compares the number of secondary vertices in b jets for the IVF and AVR algorithms. As expected, more secondary vertices are reconstructed with the IVF algorithm because of the inclusive approach of using all tracks instead of only those associated with the jet and passing the selection requirements. The right panel in figure 7 shows the correlation between the corrected mass of the secondary vertices obtained with the two approaches. From the correlation it is clear that the same secondary vertex is found in most cases.
Since the efficiency of the IVF algorithm is higher, IVF secondary vertices are used to compute the secondary vertex variables for the heavy-flavour jet identification algorithms. AVR secondary vertices are only used in one of the b jet identification algorithms discussed in section 5.

Soft-lepton variables
Although an electron or muon is present in only 20% (10%) of the b (c) jets, the properties of this low-energy nonisolated "soft lepton" (SL) permit the selection of a pure sample of heavy-flavour jets. Therefore, some of the heavy-flavour taggers use the properties of these soft leptons. Soft muons are defined as particles clustered in the jet passing the loose muon identification criteria and with a p T of at least 2 GeV [10]. Electrons are associated with a jet by requiring ∆R < 0.4. Soft electrons should pass the loose electron identification criteria, have an associated track with at least three hits in the pixel layers, and be identified as not originating from a photon conversion [9].
Figure 7. Distribution of the number of secondary vertices in b jets for the two vertex finding algorithms described in the text (left). The distributions are normalized to unit area. Correlation between the corrected secondary vertex mass for the vertices obtained with the two vertex finding algorithms (right). Both panels show jets in tt̄ events.
Discriminating variables using soft lepton information are typically similar to the variables based on track information alone. As an example, figure 8 shows the distribution of the 3D impact parameter value of soft leptons associated with jets. The 3D impact parameter value of the soft lepton discriminates between the various jet flavours. For the low-p T muons expected from the heavy-flavour hadron decays, it should be noted that the impact parameter resolution is worse than at high p T [10], which is reflected in the relatively large spread of the impact parameter values. The soft lepton variables are used in the soft lepton algorithms discussed in section 5.1.3 and in the c tagger discussed in section 5.2.1.

The b jet identification
The jet probability (JP) and combined secondary vertex (CSV) taggers used during Run 1 [1] are also used for the Run 2 analyses. Likewise, the combined multivariate analysis (cMVA) tagger, which combines the discriminator values of various taggers, was retrained. Apart from the retraining, the CSV algorithm was also optimized and the new version is referred to as CSVv2. In addition, another version of the CSV algorithm was developed that uses deep machine learning [38] (DeepCSV). These taggers are presented in more detail in sections 5.1.1 to 5.1.3. The new developments result in a performance that is significantly better than that of the Run 1 taggers, as discussed in section 5.1.4.

Jet probability taggers
There are two jet probability taggers, the JP and JBP algorithms. The JP algorithm is described in ref. [1] and uses the signed impact parameter significance of the tracks associated with the jet to obtain a likelihood for the jet to originate from the primary vertex. This likelihood, or jet probability, is obtained as follows. The negative impact parameter significance of tracks from light-flavour jets reflects the resolution of the measured track impact parameter values. Hence, the distribution of the negative impact parameter significance is used as a resolution function. The probability for a track to originate from the primary vertex, P tr , is obtained by integrating the resolution function R(s) from −∞ to the negative of the absolute track impact parameter significance, −|IP|/σ:

\[ P_{\mathrm{tr}} = \int_{-\infty}^{-|\mathrm{IP}|/\sigma} R(s)\,\mathrm{d}s. \tag{5.1} \]

The resolution function depends strongly on the quality of the reconstructed track, e.g. the number of hits in the pixel and strip layers of the tracker. Moreover, the probability for a given track to originate from the primary vertex will be smaller for tracks with a large number of missing hits. Therefore, different resolution functions are defined for various track quality classes. In addition, the track quality may be different in data and simulated events. To calibrate the JP algorithm, the resolution functions are determined separately for data and simulation. Using eq. (5.1), tracks corresponding to particles from the decay of a displaced particle will have a low track probability, indicating that the track is not compatible with the primary vertex. The individual track probabilities are combined to obtain a jet probability P j as follows:

\[ P_{j} = \Pi \cdot \sum_{\mathrm{tr}=0}^{N-1} \frac{(-\ln \Pi)^{\mathrm{tr}}}{\mathrm{tr}!}, \tag{5.2} \]

where Π is the product over the track probabilities, P tr , and the sum runs over the selected tracks index tr, with N the number of selected tracks associated with the jet. To avoid instabilities due to the multiplication of small track probabilities, the probability is set to 0.5% for track probabilities below 0.5%.
Only tracks with a positive impact parameter and for which the angular distance between the track and the jet axis satisfies ∆R < 0.3 are used. A variant of the JP algorithm also exists for which the four tracks with the highest impact parameter significance get a higher weight in the jet probability calculation. This algorithm is referred to as jet b probability (JBP) and uses tracks with ∆R < 0.4. For a light-flavour jet misidentification probability of around 10%, the JBP algorithm has a b jet identification efficiency of 80% compared to 78% for the JP algorithm. The discriminators for the jet probability algorithms were constructed to be proportional to − ln P j . Figure 9 shows the distributions of the discriminator values for the JP and JBP algorithms. The discontinuities in the discriminator distributions are due to the minimum track probability threshold of 0.5%. The jet probability algorithms are interesting for two reasons. First, the fact that the calibration of the resolution function is performed independently for data and simulation results in a robust reference tagger. Second, these algorithms rely only on the impact parameter information of the tracks. Therefore, they are used by some methods when measuring the efficiency of other b jet identification algorithms that rely on secondary vertex or soft lepton information, as discussed in sections 8 and 9.
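The JP computation can be illustrated with a short sketch. It assumes the combination P_j = Π · Σ_{tr=0}^{N−1} (−ln Π)^tr / tr!, with Π the product of the track probabilities after applying the 0.5% floor, and a track probability given by the integral of the resolution function up to −|IP|/σ. The function names are hypothetical, and `toy_resolution_cdf` is a half-Gaussian stand-in chosen only for illustration; the calibrated resolution functions per track quality class are not reproduced here.

```python
import math
from typing import Callable, Sequence

MIN_TRACK_PROBABILITY = 0.005  # 0.5% floor quoted in the text


def toy_resolution_cdf(x: float) -> float:
    """Toy assumption: integral from -inf to x (x <= 0) of a unit-width
    half-Gaussian normalized to unity on the negative side."""
    return math.erfc(-x / math.sqrt(2.0))


def track_probability(ip_significance: float,
                      resolution_cdf: Callable[[float], float]) -> float:
    """P_tr: integral of the resolution function from -inf to -|IP|/sigma."""
    return resolution_cdf(-abs(ip_significance))


def jet_probability(track_probabilities: Sequence[float]) -> float:
    """Combine N track probabilities into P_j = Pi * sum_{tr<N} (-ln Pi)^tr / tr!."""
    n_tracks = len(track_probabilities)
    if n_tracks == 0:
        raise ValueError("at least one selected track is required")
    # Work with ln(Pi) to avoid underflow when multiplying small numbers.
    log_pi = sum(math.log(max(p, MIN_TRACK_PROBABILITY)) for p in track_probabilities)
    series = sum((-log_pi) ** k / math.factorial(k) for k in range(n_tracks))
    return math.exp(log_pi) * series


def jp_discriminator(track_probabilities: Sequence[float]) -> float:
    """Quantity proportional to -ln P_j (the proportionality constant is omitted)."""
    return -math.log(jet_probability(track_probabilities))
```

With a single track the jet probability reduces to the track probability itself, and a jet whose tracks are all compatible with the primary vertex has a jet probability close to one, i.e. a discriminator close to zero.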

Combined secondary vertex taggers
The CSVv2 tagger. The CSVv2 algorithm is based on the CSV algorithm described in ref. [1] and combines the information of displaced tracks with the information on secondary vertices associated with the jet using a multivariate technique. Two variants of the CSVv2 algorithm exist according to whether IVF or AVR vertices are used. IVF vertices are used by default in the CSVv2 algorithm; the variant based on AVR vertices is referred to as CSVv2 (AVR). At least two tracks per jet are required. When calculating the values of the track variables, the tracks are required to have an angular distance with respect to the jet axis of ∆R < 0.3. Moreover, any combination of two tracks compatible with the mass of the K 0 S meson in a window of 30 MeV is rejected. Jets that have neither a selected track nor a secondary vertex are assigned a default output discriminator value of −1.
In a first step, the algorithm has to learn the features, e.g. input variable distributions corresponding to the various jet flavours, and combine them into a single discriminator output value. This step is the so-called "training" of the algorithm. During this step, it is important to ensure that the algorithm does not learn any unwanted behaviour, such as b jets having a higher jet p T , on average, compared to other jets in a sample of tt̄ events. To avoid discrimination between jet flavours caused by different jet p T and η distributions, these distributions are reweighted to obtain the same spectrum for all jet flavours in the training sample. The training is performed on inclusive multijet events in three independent vertex categories:
• RecoVertex: The jet contains one or more secondary vertices.
• PseudoVertex: No secondary vertex is found in the jet but a set of at least two tracks with a 2D impact parameter significance above two and a combined invariant mass at least 50 MeV away from the K 0 S mass are found. Since there is no real secondary vertex reconstruction, no fit is performed, resulting in a reduced number of variables.
• NoVertex: The jet is not assigned to either of the previous two categories. Only the information of the selected tracks is used.
The following discriminating variables are used:
• The "SV 2D flight distance significance", defined as the 2D flight distance significance of the secondary vertex with the smallest uncertainty on its flight distance for jets in the RecoVertex category.
• The "number of SV", defined as the number of secondary vertices for jets in the RecoVertex category.
• The "track η rel ", defined as the pseudorapidity of the track relative to the jet axis for the track with the highest 2D impact parameter significance for jets in the RecoVertex and PseudoVertex categories.
• The "corrected SV mass", defined as the corrected mass of the secondary vertex with the smallest uncertainty on its flight distance for jets in the RecoVertex category or the invariant mass obtained from the total summed four-momentum vector of the selected tracks for jets in the PseudoVertex category.
• The "number of tracks from SV", defined as the number of tracks associated with the secondary vertex for jets in the RecoVertex category or the number of selected tracks for jets in the PseudoVertex category.
• The "SV energy ratio", defined as the energy of the secondary vertex with the smallest uncertainty on its flight distance divided by the energy of the total summed four-momentum vector of the selected tracks.
• The "∆R(SV, jet)", defined as the ∆R between the flight direction of the secondary vertex with the smallest uncertainty on its flight distance and the jet axis for jets in the RecoVertex category, or the ∆R between the total summed four-momentum vector of the selected tracks and the jet axis for jets in the PseudoVertex category.
• The "3D IP significance of the first four tracks", defined as the signed 3D impact parameter significances of the four tracks with the highest 2D impact parameter significance.
• The "track p T,rel ", defined as the track p T relative to the jet axis, i.e. the track momentum perpendicular to the jet axis, for the track with the highest 2D impact parameter significance.
• The "∆R(track, jet)", defined as the ∆R between the track and the jet axis for the track with the highest 2D impact parameter significance.
• The "track p T,rel ratio", defined as the track p T relative to the jet axis divided by the magnitude of the track momentum vector for the track with the highest 2D impact parameter significance.
• The "track distance", defined as the distance between the track and the jet axis at their point of closest approach for the track with the highest 2D impact parameter significance.
• The "track decay length", defined as the distance between the primary vertex and the track at the point of closest approach between the track and the jet axis for the track with the highest 2D impact parameter significance.
• The "summed tracks E T ratio", defined as the transverse energy of the total summed four-momentum vector of the selected tracks divided by the transverse energy of the jet.
• The "∆R(summed tracks, jet)", defined as the ∆R between the total summed four-momentum vector of the tracks and the jet axis.
• The "first track 2D IP significance above c threshold", defined as the 2D impact parameter significance of the first track that raises the combined invariant mass of the tracks above 1.5 GeV. This track is obtained by summing the four-momenta of the tracks, adding one track at a time. Every time a track is added, the total four-momentum vector is computed. The 2D impact parameter significance of the first track whose addition results in a mass of the total four-momentum vector above the aforementioned threshold is used as a variable. The threshold of 1.5 GeV is related to the c quark mass.
• The number of selected tracks.
• The jet p T and η.
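Among the variables listed above, the "first track 2D IP significance above c threshold" is defined procedurally and can be made concrete with a short sketch. The code below is illustrative only: the tuple layout of a track record and the function names are assumptions, and the tracks are taken to be ordered by decreasing 2D impact parameter significance before their four-momenta are summed one at a time.

```python
import math
from typing import Optional, Sequence, Tuple

# Assumed record layout: (E, px, py, pz) in GeV, followed by the 2D IP significance.
TrackRecord = Tuple[float, float, float, float, float]

CHARM_MASS_THRESHOLD_GEV = 1.5  # threshold quoted in the text


def invariant_mass(e: float, px: float, py: float, pz: float) -> float:
    """Invariant mass of a four-momentum; negative m^2 from rounding is clipped to zero."""
    m2 = e * e - (px * px + py * py + pz * pz)
    return math.sqrt(max(m2, 0.0))


def first_track_above_charm_threshold(tracks: Sequence[TrackRecord]) -> Optional[float]:
    """Return the 2D IP significance of the first track whose addition raises the
    summed invariant mass above 1.5 GeV, or None if the threshold is never reached."""
    ordered = sorted(tracks, key=lambda t: t[4], reverse=True)
    e = px = py = pz = 0.0
    for t_e, t_px, t_py, t_pz, ip2d_significance in ordered:
        e, px, py, pz = e + t_e, px + t_px, py + t_py, pz + t_pz
        if invariant_mass(e, px, py, pz) > CHARM_MASS_THRESHOLD_GEV:
            return ip2d_significance
    return None
```

When the threshold is never reached the sketch returns `None`; how such jets are treated in the actual algorithm (for example with a default value) is not specified here.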
The discriminating variables in each vertex category are combined into a neural network, specifically a feed-forward multilayer perceptron with one hidden layer [39]. The number of nodes in the hidden layer is different for the three different vertex categories and is set to twice the number of input variables. The discriminator values of the three vertex categories are combined with a likelihood ratio taking into account the fraction of jets of each flavour expected in tt̄ events. The fraction of jets of each flavour is obtained as a function of the jet p T and |η|, using 19 exclusive bins in total. Two dedicated trainings are performed, one with c jets, and one with light-flavour jets as background. The final discriminator value is a linear combination of the output of these two trainings with relative weights of 1 : 3 for the output of the network trained against c and light-flavour jets, respectively. The value of these relative weights is inspired by tt̄ events where one of the two W bosons decays into quarks and the other into leptons, and provides the best performance for a wide variety of physics topologies compared to alternative relative weights.
The main differences from the Run 1 version of the CSV algorithm are the following: • The secondary vertex reconstruction algorithm: the secondary vertices are reconstructed with the IVF algorithm.
• Input variables: table 1 lists the variables used for the Run 1 version of the CSV algorithm and for the CSVv2 algorithm. Figure 11 shows two of the variables used for the CSVv2 algorithm and not for the CSV algorithm.
• Multilayer perceptron: in the previous version of the algorithm the input variables in a certain vertex category were combined with a likelihood ratio. Depending on the type of correlations present between the input variables, the likelihood ratio performs at a comparable level to the other multivariate methods. The likelihood ratio is particularly useful because of its simplicity and when a small number of variables are used. However, to increase the performance of the algorithm, more input variables were added and combined into an artificial neural network.
• Jet p T and η dependence: the correlation of some of the input variables with the jet p T and η is taken into account by including the jet kinematics as input variables, after reweighting the distributions to be the same for all jet flavours. In the past, the training was performed in bins of the jet kinematics. In the current procedure, the bins of jet kinematics are only used to combine the vertex categories after the training. Figure 12 shows the distribution of the discriminator values for the various jet flavours for both versions of the CSVv2 algorithm.
Table 1. Input variables used for the Run 1 version of the CSV algorithm and for the CSVv2 algorithm. The symbol "x" ("-") means that the variable is (not) used in the algorithm.

The DeepCSV tagger. The identification of jets from heavy-flavour hadrons can be improved by using the advances in the field of deep machine learning [38]. A new version of the CSVv2 tagger, "DeepCSV", was developed using a deep neural network with more hidden layers, more nodes per layer, and a simultaneous training in all vertex categories and for all jet flavours. The same tracks and IVF secondary vertices are used in this approach as for the CSVv2 tagger. The same input variables are also used, with only one difference, namely that for the track-based variables up to six tracks are used in the training of the DeepCSV. Jets are randomly selected in such a way that similar jet p T and η distributions are obtained for all jet flavours. These jet p T and η distributions are also used as input variables in the training to take into account the correlation between the jet kinematics and the other variables. The distribution of all input variables is preprocessed to centre the mean of each distribution around zero and to obtain a root-mean-square value of unity. All of the variables are presented to the multivariate analysis (MVA) in the same way because of the preprocessing. This speeds up the training. In case a variable cannot be reconstructed, e.g. because there are fewer than six selected tracks (or no secondary vertex), the variable values associated with the missing track or vertex are set to zero after the preprocessing.
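The preprocessing of the DeepCSV inputs, including the treatment of variables that cannot be reconstructed, can be sketched as follows. This is a minimal illustration using only the Python standard library; the function names are hypothetical, and representing a missing value by `None` is an assumption made for the example.

```python
import math
from typing import Optional, Sequence, Tuple


def fit_standardization(values: Sequence[float]) -> Tuple[float, float]:
    """Return the mean and the root-mean-square spread around the mean of a variable."""
    mean = sum(values) / len(values)
    rms = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    return mean, rms


def standardize(value: Optional[float], mean: float, rms: float) -> float:
    """Centre a value around zero and scale it to unit RMS.

    A missing value (None) is mapped to zero after the preprocessing,
    i.e. onto the centre of the transformed distribution."""
    if value is None:
        return 0.0
    if rms == 0.0:
        return 0.0  # degenerate constant variable; guard added for the sketch
    return (value - mean) / rms
```

Setting missing entries to zero after the transformation places them at the mean of the standardized distribution rather than at an arbitrary raw value.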
The training is performed using jets with p T between 20 GeV and 1 TeV, and within the tracker acceptance. The relative ratio of the number of jets of each flavour is set to 2 : 1 : 4 for b : c : udsg jets. A mixture of tt̄ and multijet events is used to reduce the possible dependency of the training on the heavy-flavour quark production process.
The training of the deep neural network is performed using the Keras [40] deep learning library, interfaced with the TensorFlow [41] library that is used for low-level operations such as convolutions. The neural network uses four hidden layers that are fully connected, each with 100 nodes. Increasing the number of hidden layers and the number of nodes per layer had negligible effects on the performance. Each node in one of the hidden layers uses a rectified linear unit as its activation function to define the output of the node given the input values. For the nodes in the last layer, a normalized exponential function is used for the activation to be able to interpret the output value as a probability for a certain jet flavour category, P( f ). The output layer contains five nodes corresponding to five jet flavour categories used in the training. These categories are defined according to whether the jet contains exactly one b hadron, at least two b hadrons, exactly one c hadron and no b hadrons, at least two c hadrons and no b hadrons, or none of the aforementioned categories. Each of these categories is completely independent of the others. The reason for defining five flavour categories in the training is to provide analyses with the possibility to identify jets containing two b or c hadrons.
Figure 13 shows the discriminator distribution for each of the DeepCSV probabilities P( f ). The lower right panel in figure 13 also shows the P(b) + P(bb) discriminator used to tag b jets in physics analyses. It has been checked that summing the probabilities for these two categories is equivalent to using a combined training for these categories.
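The network topology described above (four fully connected hidden layers of 100 nodes with rectified linear units, followed by a five-node layer with a normalized exponential activation) can be illustrated with a dependency-free forward pass. This is a sketch under stated assumptions, not the trained DeepCSV model: the weights are random placeholders, the number of inputs is arbitrary, and the category labels `"cc"` and `"udsg"` are shorthand chosen here for the last three categories.

```python
import math
import random
from typing import Dict, List, Sequence, Tuple

FLAVOUR_CATEGORIES = ("b", "bb", "c", "cc", "udsg")  # labels are illustrative shorthand
N_HIDDEN_LAYERS = 4
N_NODES_PER_LAYER = 100

Layer = Tuple[List[List[float]], List[float]]  # (weight rows, biases)


def relu(values: Sequence[float]) -> List[float]:
    """Rectified linear unit applied element-wise."""
    return [max(0.0, v) for v in values]


def softmax(values: Sequence[float]) -> List[float]:
    """Normalized exponential; subtracting the maximum keeps exp() stable."""
    largest = max(values)
    exps = [math.exp(v - largest) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def dense(inputs: Sequence[float], layer: Layer) -> List[float]:
    """Fully connected layer: one weighted sum plus bias per output node."""
    weights, biases = layer
    return [sum(w * x for w, x in zip(row, inputs)) + b for row, b in zip(weights, biases)]


def build_untrained_network(n_inputs: int, rng: random.Random) -> List[Layer]:
    """Random placeholder weights with the layer sizes quoted in the text."""
    sizes = [n_inputs] + [N_NODES_PER_LAYER] * N_HIDDEN_LAYERS + [len(FLAVOUR_CATEGORIES)]
    layers = []
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        scale = 1.0 / math.sqrt(n_in)
        weights = [[rng.gauss(0.0, scale) for _ in range(n_in)] for _ in range(n_out)]
        layers.append((weights, [0.0] * n_out))
    return layers


def predict(network: Sequence[Layer], inputs: Sequence[float]) -> Dict[str, float]:
    """Forward pass returning one probability P(f) per flavour category."""
    activations = list(inputs)
    for layer in network[:-1]:
        activations = relu(dense(activations, layer))
    return dict(zip(FLAVOUR_CATEGORIES, softmax(dense(activations, network[-1]))))


def b_tag_discriminator(probabilities: Dict[str, float]) -> float:
    """P(b) + P(bb), the quantity used to tag b jets."""
    return probabilities["b"] + probabilities["bb"]
```

Because the output layer is a normalized exponential, the five probabilities sum to one, so P(b) + P(bb) can be read as the probability that the jet contains at least one b hadron.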

Soft-lepton and combined taggers
Soft leptons, i.e. electrons or muons reconstructed as described in section 4.4, are sometimes present in a jet. When they are, the information related to the charged lepton is used to construct a soft-electron (SE) and soft-muon (SM) tagger. The discriminating variables that are used as input for the boosted decision tree (BDT) are the 2D and 3D impact parameter significance of the lepton, the angular distance between the jet axis and the lepton, ∆R, the ratio of the p T of the lepton to that of the jet, and the p T of the lepton relative to the jet axis, p rel T . In the case of the SE algorithm an MVA-based electron identification variable is also used as input. The distributions of the SE and SM discriminator values are shown in figure 14. The different range for the algorithm output values is related to different settings in the training when combining the input variables with a BDT. As a soft lepton is only present in a relatively small fraction of heavy-flavour jets, the soft lepton taggers are not always able to discriminate heavy-flavour jets from other jets. Therefore they are not used standalone, but rather as input for a combined tagger. The combined tagger, cMVAv2, uses six b jet identification discriminators as input variables, namely the two variants of the JP algorithm, the SE and SM algorithms, and the two variants of the CSVv2 algorithm. The training is performed using the open source scikit-learn package [42] and the variables are combined using a gradient boosting classifier (GBC) as BDT. Prior to the training, the jet p T and η distributions are reweighted to obtain a similar distribution for all jet flavours. Although the correlation between the two CSVv2 discriminator values is close to 100%, a small improvement is seen in the case where the vertex finding algorithms reconstruct different secondary vertices.
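The kinematic reweighting applied before the training can be illustrated in one dimension. The sketch below derives per-bin weights that map the normalized p T histogram of each flavour onto that of a reference flavour; the binning, the function names, and the choice of b jets as the reference are assumptions made for the example, and the same procedure would extend to bins in (p T , η).

```python
from bisect import bisect_right
from collections import Counter
from typing import Dict, List, Sequence


def bin_index(value: float, edges: Sequence[float]) -> int:
    """Index of the bin [edges[i], edges[i+1]) containing value, or -1 if outside."""
    index = bisect_right(edges, value) - 1
    return index if 0 <= index < len(edges) - 1 else -1


def normalized_shape(values: Sequence[float], edges: Sequence[float]) -> List[float]:
    """Fraction of in-range entries falling into each bin."""
    counts = Counter(bin_index(v, edges) for v in values)
    counts.pop(-1, None)  # drop entries outside the binning
    total = sum(counts.values()) or 1  # avoid division by zero for empty samples
    return [counts.get(i, 0) / total for i in range(len(edges) - 1)]


def flavour_weights(pt_by_flavour: Dict[str, Sequence[float]],
                    edges: Sequence[float],
                    reference: str = "b") -> Dict[str, List[float]]:
    """Per-flavour, per-bin weights so that every flavour follows the reference shape."""
    shapes = {f: normalized_shape(pts, edges) for f, pts in pt_by_flavour.items()}
    reference_shape = shapes[reference]
    return {
        flavour: [r / s if s > 0 else 0.0 for r, s in zip(reference_shape, shape)]
        for flavour, shape in shapes.items()
    }
```

Each jet would then enter the training with the weight of its flavour and bin, so that the classifier cannot exploit differences in the kinematic spectra of the flavours.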
Figure 15 shows the correlation between the input variables of the cMVAv2 algorithm for b jets as well as the distribution of the cMVAv2 discriminator values for various jet flavours obtained in a tt̄ sample. The correlation between the input variables is similar for other jet flavours. Adding the SL taggers or one of the JP taggers as input variables for the cMVAv2 algorithm results in a similarly large performance gain with respect to the CSVv2 algorithm. Adding the other JP tagger and CSVv2 (AVR) algorithm results only in a modest performance gain. The performance of the cMVAv2 tagger for discriminating b jets against other jet flavours is discussed more extensively in section 5.
It is relevant to note that the DeepCSV discriminator output was not included as an input variable, as this algorithm was developed after the cMVAv2 tagger. Further optimizations are ongoing, in particular in the context of the new pixel tracker installed in 2017 [43].

Performance in simulation
The tagging efficiency of the JP, CSVv2, cMVAv2, and DeepCSV taggers is determined using simulated pp collision events. The efficiency (misidentification probability) to correctly (wrongly) tag a jet with flavour f is defined as the number of jets of flavour f passing the tagging requirement divided by the total number of jets of flavour f . Figure 16 shows the b jet identification efficiency versus the misidentification probability for either c or light-flavour jets in simulated tt̄ events requiring jets with p T > 20 GeV and |η| < 2.4 for various b taggers. In this figure, the tagging efficiency is integrated over the p T and η distributions of the jets in the tt̄ sample. The tagging efficiency is also shown for the Run 1 version of the CSV algorithm. It should be noted that the CSV algorithm was trained on simulated multijet events at a centre-of-mass energy of 7 TeV using anti-k T jets clustered with a distance parameter R = 0.5. Therefore, the comparison is not completely fair. The performance improvement expected from a retraining is typically of the order of 1%. The absolute improvement in the b jet identification efficiency for the CSVv2 (AVR) algorithm with respect to the CSV algorithm is of the order of 2-4% when the comparison is made at the same misidentification probability value for light-flavour jets. An additional improvement of the order of 1-2% is seen when using IVF vertices instead of AVR vertices in the CSVv2 algorithm. The cMVAv2 tagger performs around 3-4% better than the CSVv2 algorithm for the same misidentification probability for light-flavour jets. The DeepCSV P(b) + P(bb) tagger outperforms all the other b jet identification algorithms, when discriminating against c jets or light-flavour jets, except for b jet identification efficiencies above 70% where the cMVAv2 tagger performs better when discriminating against light-flavour jets.
The absolute b identification efficiency improves by about 4% with respect to the CSVv2 algorithm for a misidentification probability for light-flavour jets of 1%. Three standard working points are defined for each b tagging algorithm using jets with p T > 30 GeV in simulated multijet events with 80 < p T < 120 GeV. The average jet p T in this sample of events is about 75 GeV. These working points, "loose" (L), "medium" (M), and "tight" (T), correspond to thresholds on the discriminator after which the misidentification probability is around 10%, 1%, and 0.1%, respectively, for light-flavour jets. The efficiency for correctly identifying b jets in simulated tt̄ events for each of the three working points of the various taggers is summarized in table 2.
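The definitions of the tagging efficiency and of the working points translate directly into code. The sketch below is illustrative: the function names are hypothetical, the discriminator values in the usage example are invented toy numbers, and the simple scan over candidate thresholds is chosen for clarity rather than speed.

```python
from typing import Dict, Sequence

# Target light-flavour misidentification probabilities quoted in the text.
WORKING_POINT_TARGETS: Dict[str, float] = {"loose": 0.10, "medium": 0.01, "tight": 0.001}


def tag_rate(discriminators: Sequence[float], threshold: float) -> float:
    """Number of jets passing the tagging requirement divided by the total number of jets.

    Applied to b jets this is the efficiency; applied to light-flavour jets
    it is the misidentification probability."""
    return sum(d > threshold for d in discriminators) / len(discriminators)


def working_point_threshold(light_discriminators: Sequence[float],
                            target_misid: float) -> float:
    """Smallest observed discriminator value above which the light-flavour
    misidentification probability does not exceed the target."""
    for candidate in sorted(set(light_discriminators)):
        if tag_rate(light_discriminators, candidate) <= target_misid:
            return candidate
    raise ValueError("empty discriminator sample")


# Usage with toy numbers (not taken from the paper):
toy_light = [i / 100 for i in range(100)]
toy_b = [0.5, 0.9, 0.95, 0.99]
loose_threshold = working_point_threshold(toy_light, WORKING_POINT_TARGETS["loose"])
loose_b_efficiency = tag_rate(toy_b, loose_threshold)
```

The thresholds are derived from light-flavour jets only; the b jet efficiency at each working point is then evaluated on an independent sample, as is done with tt̄ events in the text.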
The tagging efficiency depends on the jet p T , η, and the number of pileup interactions in the event. This dependency is illustrated for the DeepCSV P(b) + P(bb) tagger in figure 17 using jets with p T > 20 GeV in tt̄ events. A parameterization of the efficiency as a function of the jet p T is provided in appendix A. The efficiency for correctly identifying b jets is maximal for jets with p T ≈ 100 GeV and decreases at low- and high-p T values. The lower efficiency at low jet p T is due to the larger uncertainty on the track impact parameter resolution. At high jet p T , there are two main effects. First, the misidentification probability for light-flavour jets increases because of the larger number of tracks present in the jet, as can be seen from figure 4. Second, at higher jet transverse momenta, jets are more collimated and their charged particles are closer together, resulting in merged hits in the innermost layers of the tracking system. This effect impacts the track reconstruction efficiency and hence also the b jet identification efficiency. Due to the higher track reconstruction efficiency and the better resolution of the track parameters at small |η| values [2], the algorithms are more efficient in identifying b jets in the barrel region of the CMS silicon tracker (|η| < 1). The efficiency for misidentifying light-flavour jets increases with an increasing number of pileup interactions. This is explained as follows. First, the increasing number of pileup interactions results in a higher probability to choose the wrong primary vertex resulting in light-flavour jets that are displaced, and b jets for which the displacement is wrong. Second, the increasing number of pileup interactions results in a higher occupancy in the tracker, leading to a larger number of wrongly reconstructed tracks as well as more tracks from a different interaction vertex that are clustered in the jets associated with the primary vertex. It was checked that all taggers presented in table 2 show a similar dependence with respect to the number of pileup interactions, and jet p T and |η|.

Table 2. Taggers, working points, and corresponding efficiency for b jets with p T > 20 GeV in simulated tt̄ events. The numbers in this table are for illustrative purposes since the b jet identification efficiency is integrated over the p T and η distributions of jets.

The c jet identification
As can be seen from figures 5, 6, and 8 in section 4, the distributions of the tagging variables for c jets lie in between the distributions for b and light-flavour jets. This is due to the lifetime of the c hadrons being shorter than that of the b hadrons. In addition, the secondary vertex multiplicity is also lower and the smaller c quark mass results in a smaller track p T relative to the jet axis. Therefore, it is particularly challenging to efficiently identify jets originating from c quarks.

Algorithm description
The c jet identification algorithm uses properties related to displaced tracks, secondary vertices, and soft leptons inside the jets. The secondary vertices are obtained using the IVF algorithm with modified parameters for c jets as described in section 4.3. Based on the presence or absence of a secondary vertex associated with a jet, three secondary vertex categories are defined in the same way as for the CSVv2 algorithm. The presence or absence of a soft lepton, as discussed in the previous paragraph, leads to the definition of three soft lepton categories, independent of the secondary vertex categories:
• NoSoftLepton: no soft lepton was found inside the jet;
• SoftMuon: at least one soft muon was found inside the jet;
• SoftElectron: no soft muon, but at least one soft electron was found inside the jet.
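The categorization of jets for the c tagger can be written down compactly. In the sketch below the function names are hypothetical, and `has_pseudo_vertex` is an assumed, precomputed flag standing for the PseudoVertex condition of the CSVv2 algorithm; only the category names and their precedence are taken from the text.

```python
from typing import Tuple


def vertex_category(n_secondary_vertices: int, has_pseudo_vertex: bool) -> str:
    """Secondary vertex category, defined in the same way as for the CSVv2 algorithm."""
    if n_secondary_vertices > 0:
        return "RecoVertex"
    if has_pseudo_vertex:
        return "PseudoVertex"
    return "NoVertex"


def soft_lepton_category(n_soft_muons: int, n_soft_electrons: int) -> str:
    """Soft lepton category; a soft muon takes precedence over a soft electron."""
    if n_soft_muons > 0:
        return "SoftMuon"
    if n_soft_electrons > 0:
        return "SoftElectron"
    return "NoSoftLepton"


def vertex_lepton_category(n_secondary_vertices: int, has_pseudo_vertex: bool,
                           n_soft_muons: int, n_soft_electrons: int) -> Tuple[str, str]:
    """The two categories are independent, giving 3 x 3 possible combinations."""
    return (vertex_category(n_secondary_vertices, has_pseudo_vertex),
            soft_lepton_category(n_soft_muons, n_soft_electrons))
```

Checking for muons first is what assigns a jet containing both a soft muon and a soft electron to the SoftMuon category.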
With this categorization, jets containing a muon and an electron will be assigned to the SoftMuon category. As for the b tagging algorithms, the displaced tracks are ordered by decreasing 2D impact parameter significance, and the secondary vertices are ordered by increasing uncertainty on their 3D flight distance. Some variables are only defined if a secondary vertex was reconstructed or if a soft lepton was found inside the jet. Whenever a variable is not available, a default value is assigned to it. The variables used are similar to the ones used in the CSVv2 algorithm (section 5.
• The vertex-lepton category.
• The 2D and 3D impact parameter significance of the first two tracks, and the 3D impact parameter significance of the first two leptons.
• The pseudorapidity of the track (lepton) relative to the jet axis for the first two tracks (leptons).
• The track (lepton) p T relative to the jet axis, i.e. the track momentum perpendicular to the jet axis, for the first two tracks (leptons).
• The track p T relative to the jet axis divided by the magnitude of the track momentum vector, for the first two tracks.
• The track momentum parallel to the jet direction, for the first two tracks.
• The track momentum parallel to the jet direction divided by the magnitude of the track momentum vector, for the first two tracks.
• The ∆R between the track (lepton) and the jet axis for the first two tracks (leptons).
• The distance between the track and the jet axis at their point of closest approach, for the first two tracks.
• The track decay length, i.e. the distance between the primary vertex and the track at the point of closest approach between the track and the jet axis, for the first two tracks.
• The transverse energy of the total summed four-momentum vector of the selected tracks divided by the transverse energy of the jet.
• The ∆R between the total summed four-momentum vector of the tracks and the jet axis.
• The 2D and 3D impact parameter significance of the first track that raises the combined invariant mass of the tracks above 1.5 GeV. This track is obtained by summing the four-momenta of the tracks, adding one track at a time. Every time a track is added, the total four-momentum vector is computed. The 2D impact parameter significance of the first track whose addition raises the mass of the total four-momentum vector above the aforementioned threshold is used as a variable. The threshold of 1.5 GeV is related to the c quark mass.
• The lepton p T divided by the jet p T , for the first two leptons.
• The lepton momentum parallel to the jet direction divided by the magnitude of the jet momentum, for the first two leptons.
• The 2D and 3D flight distance significance of the first secondary vertex.
• The secondary vertex energy ratio, defined as the energy of the secondary vertex with the smallest uncertainty on its flight distance divided by the energy of the total summed four-momentum vector of the selected tracks.
• The "massVertexEnergyFraction" variable, which is defined as X/(X + 0.04), where X is the corrected secondary vertex mass divided by the average b meson mass [35], multiplied by the scalar sum of the track energies (assuming the pion mass) for tracks associated with the secondary vertex divided by the scalar sum of the track energies for tracks associated with the jet:
$$ X = \frac{m_{\mathrm{SV}}^{\mathrm{corr}}}{m_{\mathrm{B}}} \, \frac{\sum_{\mathrm{SV\ tracks}} E}{\sum_{\mathrm{jet\ tracks}} E}. $$
This variable is first defined in section 7 of ref. [44].
• The "vertexBoost" variable, defined as Y^2/(Y^2 + 10), where Y is the square root of the average b meson mass [35] multiplied by the scalar sum of the track p T for tracks associated with the vertex, divided by the product of the corrected secondary vertex mass and the square root of the jet p T . This variable is related to the boost of the secondary vertex and is first defined in section 7 of ref. [44].
• The number of tracks associated with the first secondary vertex.
• The number of secondary vertices.
• The number of tracks associated with the jet.
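The soft lepton categorization and the mass-threshold track described above can be sketched in a few lines of Python. This is only an illustration: the function names, the tuple layout of a track, the use of the charged pion mass for every track, and the sentinel default of -999 are assumptions and are not taken from the CMS software.

```python
import math

PION_MASS = 0.13957  # GeV; assumption: every track is treated as a charged pion


def soft_lepton_category(n_soft_muons, n_soft_electrons):
    """Return the soft lepton category; a muon takes precedence over an electron."""
    if n_soft_muons > 0:
        return "SoftMuon"
    if n_soft_electrons > 0:
        return "SoftElectron"
    return "NoSoftLepton"


def first_track_above_mass(tracks, threshold=1.5, default=-999.0):
    """Return the 2D IP significance of the first track that lifts the mass above threshold.

    tracks: iterable of (px, py, pz, ip2d_significance) in GeV, already ordered
    by decreasing impact parameter significance. The hypothetical `default`
    is returned when the threshold is never crossed.
    """
    e_sum = px_sum = py_sum = pz_sum = 0.0
    for px, py, pz, ip_sig in tracks:
        # Add one track at a time and recompute the total four-momentum.
        e_sum += math.sqrt(px * px + py * py + pz * pz + PION_MASS ** 2)
        px_sum += px
        py_sum += py
        pz_sum += pz
        m2 = e_sum ** 2 - (px_sum ** 2 + py_sum ** 2 + pz_sum ** 2)
        if math.sqrt(max(m2, 0.0)) > threshold:
            return ip_sig
    return default
```

With two back-to-back 1 GeV tracks, the first track alone stays below 1.5 GeV and the second one crosses the threshold, so its significance is returned.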
The training of the algorithm was performed on simulated multijet events. As in the case of the DeepCSV tagger, the variables are first preprocessed to centre their mean at zero and obtain a root-mean-square of unity. Two weights are applied for each jet in the training. To avoid introducing any unwanted dependence on the jet kinematics in the tagger, a first weight is applied to flatten the jet p T and η distributions in the whole training sample for all jet flavours. Simultaneously, a second weight adjusts the relative contribution of the different secondary vertex categories in the multijet sample to match the ones observed in the tt sample. Two trainings are performed: one for discriminating c jets from light-flavour jets (CvsL) and another one for discriminating c jets from b jets (CvsB). The training of the two discriminators was performed with the package of ref. [42], using a gradient boosting classifier (GBC) as the implementation of the BDT.
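As an illustration of the preprocessing and of the first training weight, the following sketch standardizes one input variable and assigns each jet a weight inversely proportional to the population of its (flavour, p T , |η|) bin, which flattens the weighted kinematic distributions per flavour. The bin edges and function names are hypothetical, and the second weight (matching the secondary vertex categories to the tt sample) is not covered.

```python
import math
from collections import Counter


def standardize(values):
    """Shift to zero mean and scale to unit root-mean-square."""
    n = len(values)
    mean = sum(values) / n
    rms = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    if rms == 0.0:
        return [0.0] * n
    return [(v - mean) / rms for v in values]


def bin_index(x, edges):
    """Index of the bin containing x; values outside the range go to the outer bins."""
    for i in range(len(edges) - 1):
        if x < edges[i + 1]:
            return i
    return len(edges) - 2


def flattening_weights(jets, pt_edges, eta_edges):
    """jets: list of (pt, eta, flavour). Returns one weight per jet.

    Each populated (flavour, pt bin, |eta| bin) cell ends up with a total
    weight of one, so the weighted distributions are flat for every flavour.
    """
    keys = [
        (flavour, bin_index(pt, pt_edges), bin_index(abs(eta), eta_edges))
        for pt, eta, flavour in jets
    ]
    counts = Counter(keys)
    return [1.0 / counts[key] for key in keys]
```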
The GBC settings were optimized by varying them over a wide range of values, to ensure the optimal setting was contained within the scanned range. Both the CvsL and CvsB trainings were optimized by scanning a range of the parameters and comparing the final performance curves. The best performance was achieved with the number of boosting stages set to 500, the learning rate to 0.05, the minimum number of samples required to split an internal node to 0.6% and a maximum depth of the individual regression estimators of 15 (8) for the CvsL (CvsB) training. Some of the optimized values did not change the performance visibly when being varied, but they were chosen to reduce the computation time without a loss in performance. Figure 18 shows the output discriminator distributions for the CvsL and CvsB taggers. The discriminator distributions exhibit spikes, which originate from the default values for most input variables if a jet has no track passing the selection criteria. These spikes do not affect any physics analyses, as the discriminator thresholds defining the working points do not lie immediately before or after a spike.
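The optimized settings quoted above can be collected in a small configuration helper. The keyword names follow the conventions of scikit-learn's GradientBoostingClassifier, which is an assumption here since the text only cites ref. [42]; the function name is hypothetical.

```python
def gbc_settings(training):
    """Return the quoted hyperparameters for the 'CvsL' or 'CvsB' training."""
    if training not in ("CvsL", "CvsB"):
        raise ValueError("training must be 'CvsL' or 'CvsB'")
    return {
        "n_estimators": 500,         # number of boosting stages
        "learning_rate": 0.05,
        "min_samples_split": 0.006,  # 0.6% of the samples, expressed as a fraction
        "max_depth": 15 if training == "CvsL" else 8,
    }
```

If the package is indeed scikit-learn, the dictionary could be passed as `GradientBoostingClassifier(**gbc_settings("CvsL"))`.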

Performance in simulation
The performance is evaluated using jets with p T > 20 GeV and |η| < 2.4 in a sample of simulated tt events. The left panel in figure 19 shows the correlation between the CvsL and CvsB discriminators for various jet flavours. Discriminator values close to one correspond to signal-like c jets. Therefore, the c jets populate the upper right corner of this figure, whereas b jets and light-flavour jets populate the region near the bottom right and the upper left corners, respectively. In the upper left corner there is a relatively large fraction of c jets because of the similarity of c jets and light-flavour jets at CvsL discriminator values below −0.3 and CvsB discriminator values above +0.5, as can be seen in figure 18. In order to discriminate c jets from other jet flavours and to evaluate the performance of the c tagger, thresholds are applied on both CvsL and CvsB to select the upper right corner of this phase space. Three working points have been defined corresponding to the efficiency for correctly identifying c jets. These are indicated by the dashed lines. The loose working point has a high efficiency for c jets and rejects primarily b jets, whereas the tight working point rejects primarily light-flavour jets. Table 3 summarizes the efficiencies for the three working points.
Table 3. Efficiency for the working points of the c tagger and corresponding efficiency for the different jet flavours obtained using jets with p T > 20 GeV in simulated tt events. The numbers quoted are for illustrative purposes since the efficiency is integrated over the p T and η distributions of the jets.
The right panel in figure 19 shows the light-flavour and b jet misidentification probabilities for constant c tagging efficiencies. The arrows indicate the c jet identification efficiency and misidentification probability for b and light-flavour jets corresponding to the three working points.
The discontinuous transition in each of the curves for c tagging efficiencies between 0.35 and 0.7 is due to the largest spike in the CvsL distribution in the left panel in figure 18. In figure 20 the performance of the CvsL and CvsB taggers is compared with the cMVAv2 and CSVv2 b tagging algorithms. In the right panel of this figure, the transition in the performance curve at a c jet identification efficiency around 0.4 is due to the largest spike in the CvsL discriminator distribution. The performance of the CvsB tagger is similar to the performance of both b taggers, except at small b jet misidentification probabilities, where the CvsB tagger performs slightly worse than the cMVAv2 tagger. The CvsL tagger outperforms the cMVAv2 and CSVv2 taggers for small light-flavour jet misidentification probabilities. The DeepCSV tagger described in section 5.1.2 outperforms the dedicated c tagger. For the discrimination between c and b jets, the DeepCSV probabilities corresponding to the five flavour categories defined in section 5.1.2 are combined in the following way:
$$ \mathrm{CvsB} = \frac{P(\mathrm{c}) + P(\mathrm{cc})}{1 - P(\mathrm{udsg})}, $$
where the numerator corresponds to the probability to identify c jets and the denominator to the probability to identify b or c jets. Similarly, for the discrimination between c and light-flavour jets, the discriminator is constructed as
$$ \mathrm{CvsL} = \frac{P(\mathrm{c}) + P(\mathrm{cc})}{1 - \left(P(\mathrm{b}) + P(\mathrm{bb})\right)}, $$
with the numerator giving the probability to identify c jets and the denominator the probability to identify light-flavour or c jets. The comparison with the DeepCSV algorithm used for c tagging should be considered as an illustration of the performance of future c taggers, since the working points are not yet defined and the efficiency in data is not yet measured.
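The combination of DeepCSV probabilities into c tagging discriminators, and the two-threshold working point selection, can be sketched as follows. The category keys b, bb, c, cc and udsg, the fallback value of -1 and the thresholds passed to `passes_c_tag` are assumptions made for illustration; the denominators are written as explicit sums, which equal one minus the remaining probabilities when the five outputs sum to unity.

```python
def deepcsv_c_discriminators(p):
    """p: dict with keys 'b', 'bb', 'c', 'cc', 'udsg'. Returns (CvsL, CvsB)."""
    p_c = p["c"] + p["cc"]    # probability to identify c jets
    p_b = p["b"] + p["bb"]    # probability to identify b jets
    cvsb_den = p_c + p_b      # b or c jets
    cvsl_den = p_c + p["udsg"]  # light-flavour or c jets
    cvsb = p_c / cvsb_den if cvsb_den > 0.0 else -1.0
    cvsl = p_c / cvsl_den if cvsl_den > 0.0 else -1.0
    return cvsl, cvsb


def passes_c_tag(cvsl, cvsb, cvsl_min, cvsb_min):
    """Select the upper right corner of the (CvsL, CvsB) plane.

    cvsl_min and cvsb_min are hypothetical thresholds; the values defining
    the working points are not reproduced here.
    """
    return cvsl > cvsl_min and cvsb > cvsb_min
```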

Boosted b jet identification with the CSVv2 algorithm
At the high centre-of-mass energy of the LHC, particles decaying to b quarks can be produced with a large Lorentz boost. Examples are boosted top quarks decaying to bW → bqq, or boosted Higgs or Z bosons decaying to bb. As a result of the large boost of the parent particle, the decay products often give rise to overlapping jets. In order to capture all the decay products, the jets are reconstructed with a distance parameter of R = 0.8 (AK8). Jet substructure techniques can then be applied to resolve the subjets corresponding to the decay products in the AK8 jet [45][46][47][48]. In this paper, the soft-drop algorithm [45,46], which recursively removes soft wide-angle radiation from a jet, is used to resolve the substructure of the AK8 jets. The subjet axes are obtained by reclustering the jet constituents using the anti-k T algorithm and undoing the last step of the clustering procedure. When the decay of the boosted particle contains a b quark, b tagging can be applied either on the AK8 jet or on its subjets. In both cases the CSVv2 algorithm is used. In the first approach the CSVv2 algorithm is applied to the AK8 jet but using looser requirements for the track-to-jet and vertex-to-jet association criteria, consistent with the R = 0.8 parameter. In the second approach the CSVv2 algorithm is applied to the subjets. The two approaches are illustrated by the scheme in figure 21 (left and middle).
To illustrate the performance of b tagging in various boosted topologies, AK8 and subjet b tagging are compared in figures 22 and 23. For these studies, jets originating from the decay of boosted top quarks (boosted top quark jets) are obtained from a Z′ sample, where the Z′ decays to tt, with t → bW → bqq. The boosted top quark jets are then defined as jets containing at least one b hadron.
Jets originating from the decay of boosted Higgs bosons (H → bb jets) are obtained from a Kaluza-Klein graviton sample, where the graviton decays to two Higgs bosons, with H → bb. The H → bb jets are then defined as jets containing at least two b hadrons. Jets from a sample of inclusive multijet events are used to determine the misidentification probability.
To obtain a performance similar to what is expected in physics analyses, the jet mass is used to select jets consistent with the top quark or Higgs boson mass. While the jet mass for these particles arises from the kinematics of the decay products present in the jet, the single-parton jet mass arises mostly from soft-gluon radiation. This soft radiation can be removed by applying jet grooming methods [49][50][51], shifting the single-parton jet mass to smaller values. In this paper, jet pruning [51] is applied to the AK8 jets. The jet mass obtained from the jet four-momentum after pruning is referred to as the pruned jet mass. Jets are then selected when they have a pruned jet mass between 50 (135) and 200 GeV for H → bb (boosted top quark) jets.
Figure 22. Misidentification probability for jets in an inclusive multijet sample versus the efficiency to correctly tag boosted top quark jets. The CSVv2 algorithm is applied to three different types of jets: AK8 jets, their subjets, and AK4 jets matched to AK8 jets. The AK8 jets are selected to have a pruned jet mass between 135 and 200 GeV, and 300 < p T < 500 GeV (left), or 1.2 < p T < 1.8 TeV (right).
Figure 22 shows the b tagging efficiency for boosted top quark jets versus the misidentification probability using jets from a background sample of multijet events. The performance of AK8 and subjet b tagging is compared. When b tagging is applied to the subjets of boosted top quark jets, at least one of the subjets is required to be tagged. In addition, the performance of b tagging applied to AK4 jets matched to AK8 jets within ∆R(AK4,AK8) < 0.4 is also shown. When b tagging is applied to AK4 jets matched to the AK8 jet, at least one of the AK4 jets is required to be tagged. In figure 22 (left), for jets with 300 < p T < 500 GeV, the AK8 jet b tagging is more efficient than AK4 jet b tagging. In contrast, in figure 22 (right), for jets with p T > 1200 GeV, AK8 and AK4 jet b tagging perform similarly.
This is because at large jet p T most of the tracks and the secondary vertex are also present in the AK4 jet, owing to the larger boost. In both cases, subjet b tagging is more efficient than AK8 jet b tagging when identifying the b jet from the boosted top quark decay.

Figure 23. Misidentification probability using jets in a multijet sample (upper), for g → bb jets (middle), and for single b jets (lower), versus the efficiency to correctly tag H → bb jets. The CSVv2 algorithm is applied to three different types of jets: AK8 jets, their subjets, and AK4 jets matched to AK8 jets. For the subjet b tagging curves, both subjets are required to be tagged. The double-b tagger, described in section 6.2, is applied to AK8 jets. The AK8 jets are selected to have a pruned jet mass between 50 and 200 GeV, and 300 < p T < 500 GeV (left), or 1.2 < p T < 1.8 TeV (right).
Figure 23 shows the efficiency for identifying H → bb jets versus the misidentification probability using jets from a background sample of inclusive multijet events, g → bb jets or single b jets. When b tagging is applied to the subjets of the H → bb jet, both subjets are required to be tagged. Similarly, both AK4 jets matched with the AK8 jet are required to be tagged.
When the misidentification probability is determined using inclusive multijet events, as illustrated in the upper panels of figure 23, AK8 jet b tagging performs well at the highest H → bb jet tagging efficiencies, while subjet b tagging performs better at lower H → bb jet tagging efficiencies. This can be understood as follows. Some of the input variables used in the CSVv2 tagger rely on the jet axis, as mentioned in section 5.1.2. An example is the ∆R between the secondary vertex flight direction and the jet axis. This variable is expected to have, on average, a smaller value for b jets compared to other jets, as can be seen in the right panel of figure 10. When b tagging is applied to the AK8 jet, the AK8 jet axis is used to calculate some of the variables. However, when two b hadrons are present in the jet, the ∆R between the secondary vertex flight direction and the AK8 jet axis or between the track and the AK8 jet axis may be quite large. Therefore, it is better to calculate these variables with respect to their respective subjet axes. On the other hand, at the highest H → bb jet tagging efficiencies, subjet b tagging does not fully use variables that rely on the information of the full AK8 jet, such as the number of secondary vertices. This results in a worse performance of subjet b tagging compared to AK8 jet b tagging at the highest H → bb jet tagging efficiencies.
The middle panels of figure 23 show the efficiency for H → bb jets versus the misidentification probability for g → bb in multijet events. Both for jets with 300 < p T < 500 GeV and 1.2 < p T < 1.8 TeV, subjet b tagging performs better than AK8 jet b tagging. This is understood as due to the fact that the information from both b hadrons is better used by the subjet b tagging approach.
As can be seen in the bottom panels of figure 23, also in the case where the background is composed of single b jets, subjet b tagging performs better. The lower misidentification probability at the same efficiency is explained by the fact that for the subjet b tagging, the two subjets are required to be tagged. Requiring both subjets to be tagged while there is only one b hadron present in the background jets results in a lower misidentification probability. It is worth noting that these performance curves look very similar to the performance curves obtained when b jets from boosted top quarks are considered as background instead of single b jets from multijet events.
The left panels in figure 23 demonstrate that AK8 jet b tagging is more efficient than AK4 jet b tagging using jets with 300 < p T < 500 GeV. The reason is that at low jet p T not all the tracks and secondary vertex are associated with the two AK4 jets, while they are associated with the AK8 jet. In contrast, using jets with p T > 1200 GeV, requiring the two AK4 jets to be tagged results in a similar performance or better than when the AK8 jet is required to be tagged. This can be explained by the fact that the high jet p T results in tracks and secondary vertices that are more collimated and fully contained in the AK4 jets. Figures 22 and 23 demonstrate that the performance of subjet and AK8 jet b tagging depends not only on the signal jets to be b tagged and on the background jets under consideration, but also on the jet p T .
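The tagging requirements compared in figures 22 and 23 can be summarized in a short sketch: AK4 jets are matched to the AK8 jet within ∆R < 0.4, and at least one (boosted top quark) or two (H → bb) of the subjets or matched AK4 jets must pass the discriminator threshold. The dictionary layout, the function names and the threshold value are hypothetical.

```python
import math


def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance, with the azimuthal difference wrapped into (-pi, pi]."""
    dphi = math.atan2(math.sin(phi1 - phi2), math.cos(phi1 - phi2))
    return math.hypot(eta1 - eta2, dphi)


def match_ak4_to_ak8(ak8, ak4_jets, max_dr=0.4):
    """Return the AK4 jets within max_dr of the AK8 jet axis."""
    return [
        jet for jet in ak4_jets
        if delta_r(jet["eta"], jet["phi"], ak8["eta"], ak8["phi"]) < max_dr
    ]


def passes_b_tag(discriminators, threshold, signal):
    """discriminators: CSVv2 values of the subjets or of the matched AK4 jets.

    signal = 'top': at least one object tagged; signal = 'hbb': two tagged.
    """
    required = {"top": 1, "hbb": 2}[signal]
    return sum(d > threshold for d in discriminators) >= required
```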

The double-b tagger
As mentioned in the previous section, the approaches of b tagging AK8 jets, as well as applying subjet b tagging, have limitations when identifying H → bb jets. In this section, a novel approach is presented to discriminate H → bb candidates from single-parton jets in multijet events. The strategy followed when developing the new "double-b" tagging algorithm is to fully use not only the presence of two b hadrons inside the AK8 jet but also the correlation between the directions of the momenta of the two b hadrons. Although the algorithm is developed using simulated H → bb events, any dependence of the algorithm performance on the mass or p T of the bb pair is avoided. This strategy allows the usage of the tagger in physics analyses with a large range of jet p T . The dependence on the jet mass is avoided as this variable is often used to define a region for the estimation of the background. In addition, this strategy also permits the use of the double-b tagger for the identification of boosted Z → bb jets or any other boosted bb resonance where the kinematics of the decay products are similar.

JINST 13 P05011
A variable sensitive to the substructure is the N-subjettiness, τ N [47], a jet shape variable computed under the assumption that the jet has N subjets. It is defined as the p T -weighted distance between each jet constituent and its nearest subjet axis (∆R):
$$ \tau_N = \frac{1}{d_0} \sum_k p_T^k \, \min\left(\Delta R_{1,k}, \Delta R_{2,k}, \ldots, \Delta R_{N,k}\right), $$
where k runs over all jet constituents. The normalization factor is $d_0 = \sum_k p_T^k R_0$ and R 0 is the original jet distance parameter, i.e. R 0 = 0.8. The τ N variable has a small value if the jet is consistent with having N or fewer subjets. The subjet axes are used as a starting point for the τ N minimization. After the minimization, the τ N axes, also called τ axes, are obtained. These are then used to estimate the directions of the partons giving rise to the subjets, as schematically illustrated in figure 21 (right).
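For a fixed set of axes, the N-subjettiness can be evaluated as in the following sketch. It only computes τ N for the axes it is given; the minimization over the axis directions that yields the τ axes is not implemented, and the tuple layout and function names are assumptions.

```python
import math


def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance, with the azimuthal difference wrapped into (-pi, pi]."""
    dphi = math.atan2(math.sin(phi1 - phi2), math.cos(phi1 - phi2))
    return math.hypot(eta1 - eta2, dphi)


def n_subjettiness(constituents, axes, r0=0.8):
    """constituents: list of (pt, eta, phi); axes: list of N (eta, phi) pairs."""
    if not axes:
        raise ValueError("at least one axis is required")
    d0 = r0 * sum(pt for pt, _, _ in constituents)
    if d0 <= 0.0:
        raise ValueError("the jet has no transverse momentum")
    total = 0.0
    for pt, eta, phi in constituents:
        # pT-weighted distance to the nearest candidate subjet axis
        total += pt * min(delta_r(eta, phi, a_eta, a_phi) for a_eta, a_phi in axes)
    return total / d0
```

For two equal-p T constituents at φ = ±0.2, a single axis at φ = 0 gives τ 1 = 0.25, whereas two axes placed on the constituents give τ 2 = 0, as expected for a two-prong jet.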
Many of the CSVv2 variables are also used in the double-b tagger algorithm. The variables rely on reconstructed tracks, secondary vertices obtained using the IVF algorithm, as well as the system of two secondary vertices. Tracks with p T > 1 GeV are associated with jets in a cone of ∆R < 0.8 around the jet axis. Each track is then associated with the closest τ axis, where the distance of a track to the τ axis is defined as the distance at their point of closest approach. The selection requirements applied to tracks in the CSVv2 algorithm are also applied here, using the τ axis instead of the jet axis. The reconstructed secondary vertices are associated first with jets in a cone ∆R < 0.7 and then to the closest τ axis within that jet. For each τ axis, the track four-momenta of the constituent tracks from all the secondary vertices associated with a given τ axis are added to compute the secondary vertex mass and p T for that τ axis.
Input variables are selected that discriminate between H → bb jets and other jet flavours, and that improve the discrimination against the background from inclusive multijet production by at least 5% compared to the performance of the tagger without the variable. In addition, as mentioned earlier, variables are chosen that do not have a strong dependence on the jet p T or jet mass. This procedure resulted in the following list of variables:
• The four tracks with the highest impact parameter significance.
• The impact parameter significance of the first two tracks ordered in decreasing impact parameter significance, for each τ axis.
• The 2D impact parameter significance of the first two tracks (first track) that raise the total mass above 5.2 (1.5) GeV. These tracks are obtained as explained in section 5.1.2 in the context of the CSVv2 algorithm. In the case of the higher threshold, the second track above the threshold mass is also used. The thresholds of 5.2 GeV and 1.5 GeV are related to the b and c hadron masses, respectively.
• The secondary vertex energy ratio, defined as the total energy of all secondary vertices associated with a given τ axis divided by the total energy of all the tracks associated with the AK8 jet that are consistent with the primary vertex, for each of the two τ axes.
• The number of secondary vertices associated with the jet.
• The 2D secondary vertex flight distance significance, for the secondary vertex with the smallest uncertainty on the 3D flight distance, for each of the two τ axes.
• The ∆R between the secondary vertex with the smallest 3D flight distance uncertainty and its τ axis, for each of the two τ axes.
• The relative pseudorapidity, η rel , of the tracks from all secondary vertices with respect to their τ axis for the three leading tracks ordered in increasing η rel , for each of the two τ axes.
• The total secondary vertex mass, defined as the invariant mass of all tracks from secondary vertices associated with the same τ axis, for each of the two τ axes.
• The information related to the system of two secondary vertices, the z variable, defined as:
$$ z = \Delta R(\mathrm{SV}_0, \mathrm{SV}_1) \, \frac{p_T(\mathrm{SV}_1)}{m(\mathrm{SV}_0, \mathrm{SV}_1)}, \qquad (6.2) $$
where SV 0 and SV 1 are the secondary vertices with the smallest 3D flight distance uncertainty associated with the two τ axes, p T (SV 1 ) is the p T of the secondary vertex associated with the second τ axis, ∆R(SV 0 , SV 1 ) is the distance between the two secondary vertices, and m(SV 0 , SV 1 ) is the invariant mass corresponding to the summed four-momenta of the two secondary vertices.
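The z variable can be illustrated with a minimal sketch that takes the two secondary vertices as dictionaries holding a four-momentum and a direction. The dictionary keys, the choice of passing η and φ explicitly, and the default of -1 for an undefined mass are assumptions made for this example.

```python
import math


def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance, with the azimuthal difference wrapped into (-pi, pi]."""
    dphi = math.atan2(math.sin(phi1 - phi2), math.cos(phi1 - phi2))
    return math.hypot(eta1 - eta2, dphi)


def z_variable(sv0, sv1, default=-1.0):
    """z = DeltaR(SV0, SV1) * pT(SV1) / m(SV0, SV1).

    sv0, sv1: dicts with keys 'e', 'px', 'py', 'pz' (GeV) and 'eta', 'phi'.
    """
    dr = delta_r(sv0["eta"], sv0["phi"], sv1["eta"], sv1["phi"])
    pt1 = math.hypot(sv1["px"], sv1["py"])
    # Invariant mass of the summed four-momenta of the two vertices
    e = sv0["e"] + sv1["e"]
    px = sv0["px"] + sv1["px"]
    py = sv0["py"] + sv1["py"]
    pz = sv0["pz"] + sv1["pz"]
    m2 = e * e - (px * px + py * py + pz * pz)
    if m2 <= 0.0:
        return default  # hypothetical default when the mass is undefined
    return dr * pt1 / math.sqrt(m2)
```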
The most discriminating variables are the impact parameter significance for the most displaced tracks, the 2D impact parameter significance for the first track above the b hadron mass threshold (5.2 GeV), and the secondary vertex energy ratio for the secondary vertex with the smallest 3D flight distance uncertainty (SV 0 ). Figure 24 shows the distributions of some of the input variables for signal H → bb jets and for jets from inclusive multijet production containing zero, one, or two b quarks. The multijet distributions are shown separately for g → bb, single b quark, and light-flavour jet production. The secondary vertex multiplicity and the vertex energy ratio for SV 0 , along with the impact parameter significance of the first track raising the total invariant mass of all tracks above the b hadron mass threshold, show a good separation between the H → bb jets and the different background contributions. The z variable, eq. (6.2), shows good discrimination against g → bb jets since it uses the different kinematic properties of the H → bb and g → bb decays.
Several variables related to the properties of soft leptons arising from the b hadron decay were also investigated. Despite a small gain in performance, these variables were excluded as input variables since they could introduce a bias in the efficiency measurement from data. The bias could arise when using muon information both to define input variables and to select a sample of jets containing a muon for the efficiency measurement in data, presented in section 9. The discriminating variables are combined using a BDT and the package [52]. The training is performed using H → bb jets from simulated events with a Kaluza-Klein graviton decaying to two Higgs bosons as signal, and jets from inclusive multijet production as background. Jets are selected when they have a pruned mass between 50 and 200 GeV and p T between 300 and 2500 GeV. The jet p T distributions for the simulated signal and background jet samples are similar, therefore no dedicated reweighting of the samples was performed.
The distribution of the double-b discriminator values is shown in the upper panel of figure 25. Four working points are defined corresponding to about 75, 65, 45 and 25% signal efficiency for a jet p T of about 1 TeV. The signal efficiencies and misidentification probabilities as functions of the jet p T for these four working points are shown in the lower panels of figure 25. The decreasing signal efficiency at high jet p T originates from the larger collimation of particles, which results in a lower track reconstruction efficiency due to close-by hits and hence in a lower tagging efficiency.
The performance of the double-b tagger is compared with that of the CSVv2 tagger applied to AK8 jets or their subjets. The top and middle panels in figure 23 show the performance when the background consists of jets from inclusive multijet production or g → bb jets. In these cases, the double-b tagger outperforms the AK8 jet and subjet b tagging approaches for all jet p T ranges. At high jet p T the improvement is larger compared to low jet p T , thereby providing an important gain in the searches for heavy resonances where mostly high-p T jets are expected. When the background is composed of single b jets, as shown in the bottom panels of figure 23, subjet b tagging outperforms the double-b tagger at low jet p T , while the two approaches are similar at high jet p T . The lower misidentification probability for single b jets at the same H → bb jet tagging efficiency for subjet b tagging at low jet p T is explained by the fact that the two subjets are very well separated at low jet p T and the variables related to the AK8 jet used in the double-b tagger are less efficient. In contrast, at high jet p T the subjets are much closer together, resulting in shared tracks and secondary vertices and thereby leading to a more similar performance.
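A working point defined by a target signal efficiency corresponds to a threshold on the discriminator. The sketch below shows one simple way to derive such a threshold from a list of signal discriminator values and to evaluate the resulting efficiency or misidentification probability; it is an illustration under assumed names and does not reproduce the procedure used to define the four working points.

```python
def threshold_for_efficiency(signal_scores, target_efficiency):
    """Return the threshold such that about target_efficiency of the signal passes."""
    ordered = sorted(signal_scores, reverse=True)
    n_keep = max(1, round(target_efficiency * len(ordered)))
    return ordered[n_keep - 1]


def fraction_passing(scores, threshold):
    """Fraction of jets with a discriminator value at or above the threshold.

    Applied to signal jets this is the efficiency; applied to background
    jets it is the misidentification probability.
    """
    return sum(score >= threshold for score in scores) / len(scores)
```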
Whether it is better to use subjet b tagging or the double-b tagger in a physics analysis depends strongly on the flavour composition and p T distribution of the jets from the signal and background processes under consideration.

Performance of b jet identification at the trigger level
The identification of b jets at the trigger level is essential to collect events that do not pass standard lepton, jet, or missing p T triggers, and to increase the purity of the recorded sample for analyses requiring b jets in the final state. The L1 trigger uses information from the calorimeters and muon detectors to reconstruct objects such as charged leptons and jets. Identification of b jets is not possible at that stage as it relies on the reconstructed tracks from charged particles available only at the HLT. In this section, we describe b jet identification at the HLT. A detailed description of the CMS trigger system can be found in ref. [11].
Because of latency constraints at the HLT, it is not feasible to reconstruct the tracks and primary vertex with the algorithms used for offline reconstruction. The time needed for track finding can be significantly reduced if the position of the primary vertex is known. While the position in the transverse plane is defined with a precision of 20 µm, its position along the beam line is not known [2]. However, it is possible to obtain a rough estimate of the primary vertex position along the beam line by projecting onto the z direction the position of the silicon pixel tracker hits (pixel detector hits) compatible with the jets. A pixel tracker hit in the barrel (endcap) is compatible with a jet when the difference in azimuthal angle between the hit and the jet is less than 0.21 (0.14). The region along the beam line with the highest number of projected pixel detector hits is most likely to correspond to the position of the primary vertex. This concept is illustrated in figure 26: the direction of the tracks in a jet is assumed to be approximately the same as the direction of the jet obtained using the calorimeter information. This fast primary vertex (FPV) finding algorithm is sensitive to pixel detector hits from pileup interactions. Therefore, a number of selection requirements based on the shape of the charge deposition clusters associated with the pixel detector hits are applied to select those that most likely correspond to a particle with a large p T . In addition, only pixel detector hits compatible with up to four leading jets with p T > 30 GeV and |η| < 2.4 are used. Finally, each pixel detector hit is assigned a weight reflecting the probability that it corresponds to a track in one of the considered jets. The weight is obtained by using information related to the shape of the charge deposition cluster, the azimuthal angle between the jet and the cluster, and the jet p T . 
Since the spread of projected hits from the primary vertex is proportional to the distance from the beam line, a larger weight is assigned to pixel detector hits closer to the beam line.
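The idea behind the FPV algorithm can be sketched as follows: each pixel hit compatible in azimuth with a jet is extrapolated to the beam line along the jet direction, using cot θ = sinh η, and the z region with the largest summed hit weight is taken as the vertex estimate. The histogram bin width, the dictionary layout, the use of the first compatible jet, and the externally supplied hit weights are assumptions; the cluster-shape selection is not modelled.

```python
import math
from collections import defaultdict


def delta_phi(phi1, phi2):
    """Absolute azimuthal difference wrapped into [0, pi]."""
    return abs(math.atan2(math.sin(phi1 - phi2), math.cos(phi1 - phi2)))


def fast_primary_vertex_z(hits, jets, bin_width=0.1):
    """Estimate the primary vertex z (cm) from pixel hits and calorimeter jets.

    hits: dicts with 'r', 'z' (cm), 'phi', 'barrel' (bool), optional 'weight'.
    jets: dicts with 'eta', 'phi'. Returns None if no hit is compatible.
    """
    histogram = defaultdict(float)
    for hit in hits:
        max_dphi = 0.21 if hit["barrel"] else 0.14
        for jet in jets:
            if delta_phi(hit["phi"], jet["phi"]) < max_dphi:
                # Straight-line extrapolation to r = 0 along the jet direction
                z0 = hit["z"] - hit["r"] * math.sinh(jet["eta"])
                histogram[math.floor(z0 / bin_width)] += hit.get("weight", 1.0)
                break  # assumption: use the first compatible jet only
    if not histogram:
        return None
    best_bin = max(histogram, key=histogram.get)
    return (best_bin + 0.5) * bin_width
```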
Figure 27 (left) shows that the resolution of the primary vertex along the beam line, ∆z, is about 3 mm for simulated multijet events with 35 pileup interactions on average. Here, events are selected if the scalar sum of the calorimeter jet transverse momenta exceeds 250 GeV. The double-peak structure is caused by a bias in the FPV reconstruction, which in the simulation finds the primary vertex closer to the centre of the CMS detector than it really is. This bias originates from the higher number of projected hits at the centre of the detector because of the detector geometry and pileup interactions. The efficiency of the FPV algorithm to reconstruct the primary vertex within 1.5 cm of its true position along the beam line is close to 99%.
Since b tagging relies on the precise measurement of the displaced tracks with respect to the primary vertex, it is crucial to use tracks that use the information of both the pixel and the silicon strip tracker to improve the spatial and momentum resolutions. To reduce the HLT algorithm processing time, these tracks are only reconstructed when originating near the primary vertex and if they are close to the direction of the leading jets, sorted according to decreasing jet p T . Up to eight jets with p T > 30 GeV and |η| < 2.4 are considered in an event. In the first step, the trajectories of charged particles are reconstructed from the pixel detector hits. To reduce the reconstruction time, tracks are only reconstructed when they have a longitudinal (transverse) impact parameter below 15 (2) mm and are compatible with the direction of one of the jets. For simulated tt events with 35 pileup interactions on average, this approach of regional pixel tracking reduces the track reconstruction time by a factor of almost 40 with respect to pixel tracking without constraints. Using the reconstructed pixel tracks, the efficiency to find the primary vertex within 0.2 mm of its true position along the beam line is around 97.5%.
To increase the efficiency even further, the variable
\[
R = \frac{\sum_{j=1,2}\sum_{i} p_{\mathrm{T}}^{i,j}}{\sum_{j=1,2} p_{\mathrm{T}}^{j}}
\]
is defined, where p T i,j is the p T of track i associated with the leading or subleading jet (j = 1 or 2) and p T j is the p T of jet j obtained from the calorimeter deposits. To calculate R, tracks from the two leading jets are used if they have a χ 2 of the track fit below 20, which reduces the effect of tracks reconstructed from a wrong combination of pixel hits. The impact of mismeasured tracks is reduced by setting the track p T to 20 GeV if it is larger than this value. If the primary vertex position is not correctly reconstructed, the value of R will be small. If R < 0.10, the reconstruction of pixel detector tracks is run without the primary vertex position, using instead the direction of the two leading jets. The pixel detector tracks obtained in this way are then used to obtain a new position for the primary vertex, partially recovering the efficiency loss. The primary vertex position for all events is refitted using the reconstructed pixel detector tracks, resulting in a much improved resolution, as can be seen in figure 27 (right). Pairs of vertices that are closer than 70 µm to each other are merged into a single vertex. After the full procedure, the efficiency to find the primary vertex within 0.2 mm of its true position is larger than 98.5%, and the resolution on the position of the primary vertex along the beam line is less than 60 µm, using simulated multijet events with 35 pileup interactions on average.
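The recovery decision based on R can be illustrated with a short sketch. This is not the HLT implementation: the `Track` container and the function names are hypothetical, and R is assumed here to be the ratio of the summed track p T of the two leading jets (after the χ 2 requirement and the 20 GeV cap) to the summed calorimeter p T of those jets.

```python
from dataclasses import dataclass


@dataclass
class Track:
    """Hypothetical minimal track record."""
    pt: float    # transverse momentum in GeV
    chi2: float  # chi-square of the track fit


def vertex_quality_ratio(tracks_per_jet, jet_pts, max_chi2=20.0, pt_cap=20.0):
    """Assumed form of R: capped track pT sum over jet pT sum, two leading jets only."""
    track_sum = 0.0
    for tracks in tracks_per_jet[:2]:
        for trk in tracks:
            if trk.chi2 < max_chi2:                # drop badly fitted tracks
                track_sum += min(trk.pt, pt_cap)   # cap mismeasured high-pT tracks
    jet_sum = sum(jet_pts[:2])
    return track_sum / jet_sum if jet_sum > 0 else 0.0


def needs_vertex_recovery(r_value, threshold=0.10):
    """A small R suggests that the primary vertex position is wrong."""
    return r_value < threshold


# Illustrative (made-up) input: tracks of the leading and subleading jet.
example_tracks = [
    [Track(35.0, 3.0), Track(5.0, 1.0), Track(10.0, 40.0)],
    [Track(8.0, 2.0)],
]
example_r = vertex_quality_ratio(example_tracks, [100.0, 65.0])
```

In this made-up example the 35 GeV track is capped at 20 GeV and the track with χ 2 = 40 is discarded, so the vertex would be kept and no recovery step would be run.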
In the second step, the tracks are reconstructed using the information from the pixel and strip detectors. An iterative procedure is applied that is similar to the offline track reconstruction except for the number of iterations and the seeds used for track finding in each iteration. In the first iteration, the pixel tracks reconstructed as described above with p T > 0.9 GeV are used as seeds if they have a transverse (longitudinal) impact parameter below 1 (3) mm. For the second iteration, triplets of pixel hits are used with p T > 0.5 GeV and a transverse (longitudinal) impact parameter <0.5 (1) mm. The last iteration uses pairs of pixel hits with p T > 1.2 GeV and a transverse (longitudinal) impact parameter <0.25 (0.5) mm. It is worth noting that the requirements on the impact parameter do not have a large impact on the reconstruction efficiency for displaced tracks. When refitting the primary vertex using the reconstructed tracks, the resolution on its position along the beam line further improves to less than 30 µm, as shown in figure 27.
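The seeding requirements of the three iterations can be collected in a small lookup table. The sketch below only restates the thresholds quoted above; the field and function names are hypothetical and do not correspond to the CMS software configuration.

```python
# Seeding requirements of the three HLT tracking iterations as quoted in the text.
# Field names are hypothetical; pT is in GeV, impact parameters are in mm.
SEEDING_ITERATIONS = (
    {"seed": "pixel tracks", "min_pt": 0.9, "max_dxy": 1.0, "max_dz": 3.0},
    {"seed": "pixel hit triplets", "min_pt": 0.5, "max_dxy": 0.5, "max_dz": 1.0},
    {"seed": "pixel hit pairs", "min_pt": 1.2, "max_dxy": 0.25, "max_dz": 0.5},
)


def seed_passes(iteration, pt, dxy, dz):
    """Return True if a seed satisfies the requirements of iteration 0, 1 or 2."""
    cfg = SEEDING_ITERATIONS[iteration]
    return (
        pt > cfg["min_pt"]
        and abs(dxy) < cfg["max_dxy"]   # transverse impact parameter
        and abs(dz) < cfg["max_dz"]     # longitudinal impact parameter
    )
```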
The reconstructed tracks and the refitted primary vertex are then used to reconstruct secondary vertices with the IVF vertex reconstruction algorithm. These vertices and tracks are then used as input for the CSVv2 algorithm described in section 5. No dedicated training of the CSVv2 algorithm is used at the HLT, as studies have not shown any improvement in performance. The processing time of regional tracking used for b tagging with up to eight leading jets with p T > 30 GeV is on average 87 ms, not including the jet reconstruction time. The processing time was evaluated using data with the highest number of pileup interactions observed in 2016 (49 pileup interactions on average) and selecting events using a trigger threshold of 250 GeV on the scalar sum of the calorimeter jet transverse momenta. As a comparison, the average global processing time of the HLT farm is limited to about 200 ms per event. The b tagging algorithm was run in about 6% of the events accepted by the L1 trigger.
The performance of b tagging at the HLT is evaluated using data collected during 2016, selecting events with at least four calorimeter jets with p T > 45 GeV and |η| < 2.4 and with the sum of the p T of the jets at the HLT above 800 GeV. Offline CSVv2 discriminator distributions are shown in figure 28 using all jets (in red) as well as using jets with an HLT CSVv2 discriminator exceeding 0.56 (in blue). An estimate of the reduction factor for the trigger rate when requiring a single b tagged jet at HLT is determined as the number of jets passing the initial trigger, based on the sum of the p T of the jets, divided by the number of jets passing the trigger and having an HLT CSVv2 discriminator above 0.56. The b tagging efficiency for a threshold of 0.56 on the HLT CSVv2 discriminator is shown as a function of the offline CSVv2 discriminator value in figure 28 (right). In both panels, the structure at a discriminator value of ≈ 0.5 is caused by jets from pileup interactions. In the right panel, the discontinuity indicates that these jets do not behave exactly in the same manner at the HLT and offline, due to their different track reconstruction. The larger efficiency for CSVv2 discriminator values below 0.05 is due to jets for which the chosen primary vertex at the HLT and offline is different. In particular, the primary vertex position is wrongly reconstructed at the HLT, resulting in an apparent displaced jet with a high CSVv2 discriminator value at the HLT and a small offline CSVv2 discriminator value. The impact of this effect is relatively small since there are only a few jets with an offline CSVv2 discriminator value below 0.05, as can be seen in the left panel. As expected, the b tagging performance of the offline reconstruction is better than at the HLT.
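The two quantities discussed above, the rate reduction factor and the HLT tagging efficiency as a function of the offline discriminator, are simple counting ratios. A minimal sketch, with hypothetical function names and jets represented as (offline, HLT) discriminator pairs, could look as follows.

```python
from bisect import bisect_right


def rate_reduction_factor(hlt_discriminators, threshold=0.56):
    """Number of jets passing the initial trigger over those also passing the HLT tag."""
    n_all = len(hlt_discriminators)
    n_tagged = sum(1 for d in hlt_discriminators if d > threshold)
    if n_tagged == 0:
        raise ValueError("no jet passes the HLT discriminator threshold")
    return n_all / n_tagged


def hlt_efficiency_per_offline_bin(jets, bin_edges, threshold=0.56):
    """Fraction of jets tagged at the HLT in bins of the offline discriminator.

    jets: iterable of (offline_discriminator, hlt_discriminator) pairs.
    Returns one efficiency per bin, or None for an empty bin.
    """
    n_bins = len(bin_edges) - 1
    total = [0] * n_bins
    passed = [0] * n_bins
    for offline, hlt in jets:
        k = bisect_right(bin_edges, offline) - 1   # bin index for [low, high)
        if 0 <= k < n_bins:
            total[k] += 1
            if hlt > threshold:
                passed[k] += 1
    return [p / t if t else None for p, t in zip(passed, total)]
```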
The maximum b jet identification efficiency at the HLT is ≈ 95% because of three effects that occur more frequently at the HLT:
• The primary vertex is not reconstructed or not identified as the vertex corresponding to the jets on which the b tagging algorithm is applied.
• Since the track reconstruction efficiency at the HLT is lower, it happens more often that fewer than two tracks are associated with the jet, resulting in no valid discriminator value being assigned to the jet.
• There are at least two reconstructed tracks, but they do not pass the track selection requirements applied in the CSVv2 algorithm.
In the future, the b tagging performance at the HLT will be further improved by replacing the CSVv2 tagger with the DeepCSV tagger.

Measurement of the tagging efficiency using data
In the previous sections, the performance of the taggers was studied on simulated samples. In this section, we present the methods used to measure the efficiency of the heavy-flavour tagging algorithms applied on the data. In section 8.1, the data are compared to the simulation for a few input variables as well as for the output discriminator distributions. The measurement of the misidentification probability in the data is presented in section 8.2. The tagging efficiency for c and b jets is presented in sections 8.3 and 8.4, respectively. Section 8.5 summarizes a method to measure data-to-simulation scale factors as a function of the discriminator value for the various jet flavours. The results of the various measurements are compared and discussed in section 8.6.

Comparison of data with simulation
The data are compared to simulation in different event topologies, chosen for their different jet flavour composition, and selected according to the following criteria:
• Inclusive multijet sample: events are selected if they satisfy a trigger selection requiring the presence of at least one AK4 jet with p T > 40 GeV. Because of the high event rates only a fraction of the events that fulfill the trigger requirement are selected (prescaled trigger). The fraction of accepted events depends on the prescale value, which varies during the data-taking period according to the instantaneous luminosity. The data are compared to simulated multijet events using jets with 50 < p T < 250 GeV. This topology is dominated by light-flavour jets and also contains a contribution of jets from pileup interactions.
• Muon-enriched jet sample: events are considered if they satisfy an online selection requiring at least two AK4 jets with p T > 40 GeV of which at least one contains a muon with p T > 5 GeV. Also in this case, the trigger was prescaled. The data are compared to a sample of jets with 50 < p T < 250 GeV and containing a muon, selected from simulated muon-enriched multijet events. Because of the muon requirement this topology is dominated by jets containing heavy-flavour hadrons.
• Dilepton tt sample: at trigger level, events are selected by requiring the presence of at least one isolated electron and at least one isolated muon. Offline, the leading muon and electron are required to have p T > 25 GeV and be isolated, as expected for leptonic W boson decays [9,10]. Events are further considered if they contain at least two AK4 jets with p T > 20 GeV. In this event sample we expect an enrichment in b jets from top quark decays. There is also a small contribution from jets from pileup interactions due to the relatively low threshold on jet p T .
• Single-lepton tt sample: events are selected at trigger level by requiring the presence of at least one isolated electron or muon [9,10]. Offline, exactly one isolated electron or muon is required, satisfying tight identification criteria. The electron (muon) is required to have a p T > 40 (30) GeV and |η| < 2.4. Events are further considered if they contain at least four jets with p T > 25 GeV. In this event sample a higher fraction of c jets is expected in comparison with the other samples. These c jets arise from the decay of the W boson to quarks.
The distributions of all input variables and output discriminators in the four aforementioned event topologies are monitored to assess the agreement between data and simulation. Figure 30 shows a selection of four input variables. For the secondary vertex variables that are shown the secondary vertices are reconstructed with the IVF algorithm, discussed in section 4.3. In the top left panel, the 3D impact parameter significance of the tracks is shown for jets in the dilepton tt sample. The observed discrepancy around zero is explained by the sensitivity of this variable to the tracker alignment and the uncertainty in the track parameters. The top right panel shows the corrected secondary vertex mass for the leading secondary vertex (sorted according to increasing uncertainty in the 3D flight distance), using jets in an inclusive multijet sample. The bottom left panel shows the 3D flight distance significance of the leading secondary vertex using jets in the muon-enriched jet sample. As was the case for the impact parameter significance, the disagreement between the data and the simulation is related to the sensitivity of this variable to the tracker alignment and the uncertainty in the track parameters and hence on the secondary vertex position. The bottom right panel shows the "massVertexEnergyFraction" variable, defined in section 5.2.1, using jets in the single-lepton tt sample.
While the simulation models the secondary vertex mass reasonably well, some discrepancies are observed for the impact parameter significance of the tracks and the secondary vertex flight distance. The imperfect modelling of the input variables will also have an impact on the modelling of the output discriminator distributions, which are shown in figure 31. The upper panels show the JP and cMVAv2 discriminators using jets in the dilepton tt sample. The discontinuities in the distribution of the JP discriminator values are due to the minimum track probability requirement of 0.5%, as explained in section 5.1.1. The middle panels show the CSVv2 and DeepCSV discriminators using jets in the muon-enriched sample. The lower panels show the CvsL and the CvsB discriminators, using jets in the inclusive multijet sample. The discontinuities in both distributions arise from jets for which no tracks pass the track selection criteria, as discussed in section 5.2.1. Deviations of up to 20% are observed at the highest discriminator values. These deviations may be related to the modelling of the detector in the simulation and to the accuracy of the generators in their modelling of the parton shower and hadronization. It is therefore important to measure the efficiencies directly from the data. In physics analyses, the difference between the tagging efficiency in the data and simulation is then corrected for by taking into account a per jet data-to-simulation scale factor
\[
SF_f = \frac{\varepsilon_f^{\text{data}}(p_{\mathrm{T}}, \eta)}{\varepsilon_f^{\text{MC}}(p_{\mathrm{T}}, \eta)},
\]
where ε data f (p T , η) and ε MC f (p T , η) are the tagging efficiencies for a jet with flavour f in data and simulation, respectively. For most of the efficiency measurements, the number of jets in the data is too limited to provide a dependence on the jet |η|. For those methods, only the dependence on the jet p T is measured.
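As an illustration of the per jet scale factor, i.e. the ratio of the tagging efficiency in data to that in simulation, the sketch below evaluates it in bins of jet p T from tagged and total jet counts. The function names and the numbers are purely illustrative and are not taken from the measurement.

```python
def efficiency(n_tagged, n_total):
    """Tagging efficiency as the fraction of tagged jets."""
    if n_total <= 0:
        raise ValueError("empty jet sample")
    return n_tagged / n_total


def scale_factors(data_counts, mc_counts):
    """Data-to-simulation scale factor per pT bin.

    Each argument is a list of (n_tagged, n_total) pairs, one pair per pT bin,
    for jets of a single flavour.
    """
    if len(data_counts) != len(mc_counts):
        raise ValueError("data and simulation must use the same pT binning")
    return [
        efficiency(*data_bin) / efficiency(*mc_bin)
        for data_bin, mc_bin in zip(data_counts, mc_counts)
    ]


# Made-up counts for two pT bins.
example_sf = scale_factors([(630, 1000), (280, 500)], [(600, 1000), (300, 500)])
```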
In simulation, the b/c tagging efficiency (misidentification probability) is defined as the number of b/c (light-flavour) jets that are tagged, according to the working point of a given algorithm (section 5), with respect to the total number of b/c (light-flavour) jets. Using simulated events, the number of jets with flavour f is determined by matching the jets with the generated hadrons. In data, the tagging efficiency is measured with a pure sample of jets with a certain flavour f , using selection requirements that do not bias the jets with respect to the variables used in the tagging algorithm.

The misidentification probability
The misidentification probability for light-flavour jets is measured with a sample of inclusive multijet events. The inclusive multijet data are collected using triggers requiring at least one jet above a certain p T threshold, with p T > 40 GeV being its lowest value. Because of the high trigger rates for the lowest trigger thresholds, the triggers are prescaled. The selected events are reweighted to take into account the different prescales for each trigger threshold in order to obtain the same jet p T distribution as if unprescaled triggers were used. The simulated events are reweighted to match the distribution of the number of pileup interactions in the data. The negative-tag method [1] is used for the measurement of the misidentification probability and the data-to-simulation scale factor, SF l . The method is based on the definition of positive and negative taggers, which are identical to the default algorithms, except that for each jet only tracks with either positive or negative impact parameter values and secondary vertices with either positive or negative flight distance are used. To first order, the discriminator values for negative and positive taggers are expected to be symmetric for light-flavour jets, with nonzero values of the impact parameter and flight distance arising because of resolution effects. Some asymmetries are present for light-flavour jets due to long-lived hadrons, such as K 0 S and Λ hadrons. The positive and negative discriminator distributions are presented in figure 32 using jets with p T > 50 GeV. For convenience, the discriminator values of the negative taggers are shown with a negative sign. Note that since the cMVAv2 and c tagger discriminator values range between −1 and 1, a shift was introduced such that the positive cMVAv2 discriminator is defined between 0 and 2, while the negative discriminator is shown with a negative sign and obtains values between −2 and 0. 
Deviations of up to 10% are observed between the data and simulation for some discriminator values. We define negative-tagged (positive-tagged) jets as the jets with a discriminator value of the negative (positive) tagger passing the working point of the tagger. The misidentification probability, ε l , is determined from the fraction of negative-tagged jets passing the working point, ε − , in an inclusive jet sample:
\[
\varepsilon_{l} = \varepsilon^{-} \, R_{\text{LF}},
\]
where the correction factor R LF = ε MC l /ε −,MC is the ratio of the misidentification probability of light-flavour jets to the negative tagging probability of all jets in simulation. The correction factor R LF is typically between 0.3 and 1, with the exact value depending on the working point and tagger.
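The arithmetic of the negative-tag method can be written out explicitly. In the sketch below the misidentification probability in data is taken as the negative-tag rate in data multiplied by R LF , with R LF obtained from simulation as defined above. Function names and the numerical inputs are hypothetical.

```python
def correction_factor(eps_light_mc, eps_neg_mc):
    """R_LF: light-flavour mistag rate over the negative-tag rate of all jets (simulation)."""
    return eps_light_mc / eps_neg_mc


def misid_probability(eps_neg_data, r_lf):
    """Light-flavour misidentification probability in data."""
    return eps_neg_data * r_lf


def light_scale_factor(eps_light_data, eps_light_mc):
    """Data-to-simulation scale factor SF_l."""
    return eps_light_data / eps_light_mc


# Made-up inputs: 1% mistag rate and 2% negative-tag rate in simulation,
# 2.2% negative-tag rate in data.
r_lf = correction_factor(eps_light_mc=0.01, eps_neg_mc=0.02)
eps_l = misid_probability(eps_neg_data=0.022, r_lf=r_lf)
sf_l = light_scale_factor(eps_l, 0.01)
```

With these definitions SF l reduces to the ratio of the negative-tag rates in data and simulation, which is why the systematic uncertainties are discussed in terms of their effect on R LF .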
Systematic uncertainties in the misidentification probability are related to possible effects that may have an impact on R LF . In particular, the following systematic uncertainties are evaluated: • Fraction of heavy-flavour jets: if the fraction of jets from heavy-flavour quarks in the negative-tag sample increases, the value of R LF decreases. The fraction of b jets has been measured by the CMS collaboration to agree with the simulation within ±20% [53]. To assess the effect of this systematic uncertainty, the fraction of heavy-flavour jets in the simulation is varied by ±20%.
• Gluon fraction: the fraction of gluon jets affects the misidentification probability in the simulation as well as the negative tagging probability, because of the larger track multiplicity in gluon jets compared to jets originating from light-flavour quarks. In addition, the fraction of gluon jets depends on the parton density and parton showering in the simulation. The systematic effect due to the uncertainty in the fraction of gluon jets is evaluated by varying the gluon fraction by ±20% [54].
• K 0 S and Λ decays (V 0 ): the observed numbers of reconstructed K 0 S and Λ hadrons are found to be a factor of 1.30 ± 0.30 and 1.50 ± 0.50 larger than expected [55,56], respectively. To determine the nominal value of the data-to-simulation scale factor, the amount of reconstructed K 0 S or Λ hadrons is reweighted in the simulation to be consistent with the observed yields. To obtain the size of the systematic effect due to the reweighting, the fraction of K 0 S and Λ hadrons is varied by the uncertainty in the measured fraction, i.e. by ±30 and ±50%, respectively.
• Secondary interactions: the rate of secondary interactions from photon conversions or nuclear interactions in the pixel tracker layers has been measured with a precision of ±5% [55,56]. The number of secondary interactions is varied by this amount to obtain the systematic uncertainty in the data-to-simulation scale factor.
• Mismeasured tracks: according to the simulation, there are more positive- than negative-tagged jets containing a reconstructed track that cannot be associated with a genuine charged particle. This is expected because the positive-tagged light-flavour jets contain K 0 S or Λ hadrons, resulting in more hits and hence a higher probability for a wrong combination of those hits leading to a mismeasured track. To correct for this residual effect of mismeasured tracks, a ±50% variation of this contribution is taken into account for the systematic uncertainty in R LF .
• Sign flip: the number of jets with a negative tag is sensitive to the angular resolution on the jet axis and 3D impact parameter since these may affect the impact parameter sign. In particular, a difference between data and simulation in the probability of sign flips will affect the ratio of the negative tagging probability in data to that in simulation. The difference between data and simulation in the fraction of negative-tagged jets with respect to all tagged jets is measured with a muon-enriched jet sample and used to estimate the size of this systematic effect.
• Sampling: the dependence of the data-to-simulation scale factor on the event topology is estimated from the trigger dependence of the scale factor. The scale factor is computed separately for each of the trigger requirements used to select the inclusive multijet sample. The maximum variation of the scale factor for these different measurements with respect to the nominal value using the unbiased jet p T spectrum is taken as the size of the systematic effect.
• Pileup: the simulated events are reweighted according to the observed amount of pileup interactions in data. A 5% uncertainty in the total inelastic cross section of pp collisions [57] is propagated to the distribution of the number of pileup interactions to assess the impact of the uncertainty in the pileup reweighting.
• Statistical uncertainty in the simulation: the limited amount of simulated multijet events is taken into account as an additional systematic uncertainty.
Figure 33 shows an example of the measured misidentification probabilities, data-to-simulation scale factors, and relative systematic uncertainties for the medium working point of the DeepCSV and cMVAv2 taggers. In the top right panel of figure 33, the "step" in the misidentification probability around 450 GeV is caused by the p T - (and |η|-) dependent weights for the jet flavours in the vertex categories in the training of the CSVv2 algorithm, discussed in section 5.1.2. The middle panels in figure 33 show the scale factors as a function of the jet p T with the result of the fit superimposed. The fit functions are typically parameterized by a third degree polynomial with four free parameters. The dashed lines around the fit function represent the overall statistical and systematic uncertainty in the measurement. For jets with p T > 1000 GeV the uncertainty in the scale factor is doubled. The scale factors are typically larger than one in a broad jet p T range. The relative precision that is achieved on the scale factors for light-flavour jets when using b tagging algorithms is 5-10% for the loose working point and rises to 20-30% for the tight working point using jets with 20 < p T < 1000 GeV. The statistical uncertainty is typically a factor of 10 times smaller than the systematic uncertainty. For the c tagger, the relative precision varies between 3 and 7% for the loose and tight working points, respectively. The reason for the smaller uncertainty for the c tagger compared to the b taggers is the different definition of the working points. The working points for the c tagger have a much higher misidentification probability for light-flavour jets, ranging from over 90% for the loose working point to about a per cent for the tight working point, compared to 10% and 0.1%, respectively, for the b tagging algorithms (section 5). The tight working point of the c tagger corresponds to a misidentification probability that is in between the loose and medium working points of the b taggers. Taking this into account, the corresponding systematic uncertainties are of a similar size.

The c jet identification efficiency
In this section, the methods are presented to obtain a jet sample enriched in c quark content, which is subsequently employed to measure the efficiency for (mis)identifying c jets in data. The efficiency in data and simulation is then used to determine the data-to-simulation scale factor for c jets, SF c , for each algorithm and working point. The first method relies on the W + c topology. The second method uses c jets from the W boson decay to quarks in the single-lepton tt topology, where one of the W bosons decays into quarks and the other one into leptons.

Measurement relying on W + c events
The efficiency to identify c jets using heavy-flavour jet identification algorithms is measured with a sample enriched in c jets obtained from events with a W boson produced in association with a c quark. At leading order, the production of a W boson in association with a c quark proceeds mainly through s + g → W − + c and s̄ + g → W + + c̄, as shown in figure 34 (left and middle). A key property of this production process is that the c quark and W boson have opposite-sign (OS) electric charge.
The dominant background is W + qq̄ events, which are produced with an equal amount of OS and same-sign (SS) events, as can be seen in figure 34 (right). After the event selection, a sample with a high purity of W + c events is obtained by subtracting the SS distribution of a variable from the OS distribution for that variable. The remaining events are referred to as "OS-SS". The W + c events are selected according to the criteria of ref. [58]. Events are selected by requesting one isolated electron (muon) with a p T above 34 (26) GeV and satisfying medium (tight) identification criteria [9,10]. When the event has more than one isolated electron or muon satisfying the selection criteria, the highest-p T lepton is considered as the lepton from the W boson decay. The contribution from Z + jets events is reduced by vetoing events with a same-flavour dilepton invariant mass between 70 and 110 GeV. To reduce the background from multijet events to a negligible level, the transverse mass $M_{\mathrm{T}} = \sqrt{2\, p_{\mathrm{T}}^{\ell}\, p_{\mathrm{T}}^{\text{miss}}\, [1 - \cos(\phi^{\ell} - \phi^{p_{\mathrm{T}}^{\text{miss}}})]}$ is required to be larger than 55 GeV. In this expression, $\phi^{\ell}$ and $\phi^{p_{\mathrm{T}}^{\text{miss}}}$ ($p_{\mathrm{T}}^{\ell}$ and $p_{\mathrm{T}}^{\text{miss}}$) are the azimuthal angles (transverse momenta) of the isolated lepton and the missing transverse momentum vector, respectively. At least one jet is required in the tracker acceptance, with p T > 25 GeV and separated from the isolated lepton by ∆R > 0.5. In addition, the leading jet should contain a nonisolated soft muon among the jet constituents with p T < 25 GeV. The charge of the c quark is determined from the charge of the soft muon inside the jet. The OS (SS) events are then defined as events for which the muon in the jet has the opposite (same) charge as the isolated lepton from the W decay. After these requirements, the expected signal purity is about 60% for W → µν events and 80% for W → eν events. Remaining Z + jets and tt events are the main sources of background for the W → µν channel, and tt events for the W → eν channel.
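Two ingredients of this selection lend themselves to a short numerical sketch: the transverse mass of the lepton and missing transverse momentum system, computed with the standard expression M T = sqrt(2 p T (lepton) p T (miss) [1 − cos ∆φ]), and the bin-by-bin OS-SS subtraction. Function names and the example numbers are hypothetical.

```python
import math


def transverse_mass(lep_pt, lep_phi, met_pt, met_phi):
    """Transverse mass (GeV) of the lepton and missing transverse momentum system."""
    dphi = lep_phi - met_phi
    return math.sqrt(2.0 * lep_pt * met_pt * (1.0 - math.cos(dphi)))


def os_minus_ss(os_hist, ss_hist):
    """Bin-by-bin difference of the opposite-sign and same-sign distributions."""
    if len(os_hist) != len(ss_hist):
        raise ValueError("OS and SS histograms must share the same binning")
    return [n_os - n_ss for n_os, n_ss in zip(os_hist, ss_hist)]


# Back-to-back lepton and missing transverse momentum of 40 GeV each.
example_mt = transverse_mass(40.0, 0.0, 40.0, math.pi)
passes_mt_cut = example_mt > 55.0

# Made-up bin contents of some distribution in OS and SS events.
example_os_ss = os_minus_ss([120, 80], [40, 30])
```

Backgrounds that populate OS and SS events equally, such as W + qq̄, cancel on average in the subtracted distribution, which is what raises the W + c purity.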
As an example, the distributions of the c tagger discriminators are shown in figure 35 for the OS-SS sample, for the W → µν and W → eν channels combined.
The efficiency to tag a c jet using a certain working point and tagger is obtained as the fraction of tagged c jets over the total number of c jets in W + c events in the OS-SS sample.
Apart from the statistical uncertainty, the measurement may also be affected by several sources of systematic effects:
• Background subtraction: the number of W + c events in data is obtained under the assumption that the fraction of (tagged) background events in data and simulation is the same. The effect of this assumption is quantified by varying the background fractions obtained from simulation before and after tagging, f MC bkg and f tagged,MC bkg , by 50%. The impact on the measured efficiency for tagging c jets is of the order of 2%, making this one of the dominant uncertainties.
• Branching fraction of D → µX and fragmentation of c → D: the branching fractions for D → µX are varied to match the latest PDG data [35]. In particular, the branching fractions are shifted by −2% for D + → µX, +13% for D 0 → µX, and +16% for D s → µX. In addition, the fragmentation rate of a c quark to a D meson is varied to be consistent with the PDG data [59]. This implies the following variations: +37% for c → D + , −9% for c → D 0 , and −33% for c → D s . The difference in the measured c jet tagging efficiency after this simultaneous variation is less than 1% and is taken as a systematic uncertainty.
• Number of tracks: the uncertainty in the modelling of the number of selected tracks per jet in the simulation is taken into account by reweighting the distribution to match the data and remeasuring the data-to-simulation scale factor. The difference between the nominal scale factor value and the one after reweighting is less than 1%.
• Soft-muon requirement: requiring a muon in a jet may introduce a potential bias in the efficiency measurement when the tagger also relies on muon variables, as is the case for the c tagger and the cMVAv2 tagger. The bias may arise if the tagger response is different for jets with and without a soft lepton. The potential bias is estimated by repeating the measurement using a modified version of the tagger, which treats the muon as a track and assigns a default value to the soft-muon input variables. The difference between the values measured with the modified tagger and the default one is taken as systematic uncertainty. The effect of this variation is less than 3%. This is the dominant systematic uncertainty.
• Jet energy scale: since the measurements are performed in bins of jet p T , the fraction of jets in each bin may vary depending on the jet energy corrections. The data-to-simulation scale factors are remeasured after varying the jet energies by ±1 standard deviation of the nominal jet energy scale. The systematic effect due to this variation is less than 1%.
• Electron and muon efficiency: the uncertainties related to the lepton reconstruction and identification are taken into account by varying the corresponding correction factors within their uncertainty and reevaluating the efficiency for tagging c jets. The effect of this variation is smaller than 1%.
• Pileup: the effect of the uncertainty in the number of additional pileup interactions is evaluated as described in section 8.2, having an impact on the c tagging efficiency below 1%.
• Factorization and renormalization scales: in ref. [58] the normalized cross section for W+c events has been measured and the impact of the factorization and renormalization scales used at matrix element and parton shower levels was evaluated. The systematic uncertainty related to the variation of these scales was found to be well below 1% because of the cross section normalization. When measuring data-to-simulation scale factors, this systematic uncertainty also cancels in the ratio.
• Parton distribution functions: the NNPDF parton densities are varied within their uncertainties resulting in additional templates for the systematic uncertainty. The effect was found to be less than 1%.
The total systematic uncertainty in the data-to-simulation scale factor measurement is obtained as the quadratic sum of the individual systematic uncertainties. The c jet tagging efficiency and the data-to-simulation scale factor SF c are computed as a function of jet p T and presented in figure 36 for the loose and medium working points of the c tagger. Scale factors for misidentifying c jets are also derived for the b tagging algorithms.
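The quadratic sum used for the total systematic uncertainty can be sketched as below. The source names mirror the list above, but the numerical values are placeholders chosen only to respect the quoted upper bounds; they are not measured values.

```python
import math


def quadratic_sum(uncertainties):
    """Combine independent uncertainties in quadrature."""
    return math.sqrt(sum(u * u for u in uncertainties))


# Placeholder relative uncertainties per source (illustrative only).
example_sources = {
    "background subtraction": 0.02,
    "branching and fragmentation": 0.01,
    "number of tracks": 0.01,
    "soft-muon requirement": 0.03,
}
example_total = quadratic_sum(example_sources.values())
```

Because the sources add in quadrature, the total is driven by the largest contributions; in this placeholder example it is about 0.039, only moderately above the largest single term.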

Measurement relying on the single-lepton tt events
If a W boson decays hadronically, the decay contains a c quark in about 50% of the cases. Therefore, in a pure sample of single-lepton tt events, about one event out of two will contain a c jet. Because of the particular decay chain of the top quark, the energy of up-type quarks from the W boson decay is, on average, larger than for down-type quarks. This property, verified in simulated tt events, is used to obtain samples of jets enriched and depleted in c quarks. The c tagging efficiency is obtained by fitting the distribution of a variable in both of these samples simultaneously to the data, as will be explained in the following.
Events are selected by requiring exactly one isolated muon satisfying the tight identification criteria and with a p T exceeding 30 GeV [10]. In addition, exactly four jets with p T > 30 GeV are required. All objects are required to be within the tracker acceptance. The background from multijet events is reduced to a negligible level by requiring the reconstructed transverse mass formed by the muon and the missing transverse momentum vector to satisfy M T (µ, p miss T ) > 50 GeV. The tt event is reconstructed by assigning the jets to the quarks from which they originate, using a mass discriminant λ M . This mass discriminant is defined as the 2D probability for the invariant mass of a correct combination of two jets to be consistent with the W boson mass, and the invariant mass of a correct three-jet combination to be consistent with the top quark mass. The jet-quark assignment for which the negative logarithm of λ M is minimal is chosen as the reconstructed tt topology candidate for the event. The two jets assigned to the b quarks from the top quark decay are required to be b-tagged; one jet should pass the tight working point of the CSVv2 tagger and the other one its loose working point. By requiring those jets to be b-tagged only after the jet-quark assignment is done, a bias on the c jet tagging efficiency measurement is avoided.
Figure 37 shows the distribution of λ M and of the highest (leading) and second-highest (subleading) energy for the two jets corresponding to the W decay after the full event selection. The tt simulation is divided into three different subsamples:
• tt, right W h : the W boson is correctly reconstructed, hence the two jets are correctly assigned to the quarks from the W boson decay.
• tt, wrong W h : the W boson is wrongly reconstructed, hence at least one of the two jets is not correctly assigned to the quarks from the W boson decay.
• Other tt decay: the generated event is not a single-lepton tt event.
The non-tt background is relatively small, with contributions from single top quark, W + jets, Z + jets, and multijet production. From figure 37 it is clear that the λ M distribution has discrimination power to separate jets that are correctly associated with the W boson decay and jets for which this is not the case. Therefore, this distribution is used to measure the efficiency and data-to-simulation scale factor for c jets.
Four event categories are defined according to whether or not the jets that are assigned to the W boson decay (i.e. the probe jets) pass the tagging working point for which the efficiency is to be measured:
• Notag: both probe jets fail the tagging requirement;
• Leadtag: only the most energetic probe jet passes the tagging requirement;
• Subleadtag: only the least energetic probe jet passes the tagging requirement; and
• Ditag: both probe jets pass the tagging requirement.
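For tt events in the "right W h " subsample, the number of events in the various categories can be written as:

\[
\begin{aligned}
N_{\text{Notag}} &= N_T \left[ f_1 (1-\varepsilon^{c}_{1})(1-\varepsilon^{\text{LF}}_{2}) + f_2 (1-\varepsilon^{\text{LF}}_{1})(1-\varepsilon^{c}_{2}) + (1-f_1-f_2)(1-\varepsilon^{\text{LF}}_{1})(1-\varepsilon^{\text{LF}}_{2}) \right], \\
N_{\text{Leadtag}} &= N_T \left[ f_1\, \varepsilon^{c}_{1} (1-\varepsilon^{\text{LF}}_{2}) + f_2\, \varepsilon^{\text{LF}}_{1} (1-\varepsilon^{c}_{2}) + (1-f_1-f_2)\, \varepsilon^{\text{LF}}_{1} (1-\varepsilon^{\text{LF}}_{2}) \right], \\
N_{\text{Subleadtag}} &= N_T \left[ f_1 (1-\varepsilon^{c}_{1})\, \varepsilon^{\text{LF}}_{2} + f_2 (1-\varepsilon^{\text{LF}}_{1})\, \varepsilon^{c}_{2} + (1-f_1-f_2)(1-\varepsilon^{\text{LF}}_{1})\, \varepsilon^{\text{LF}}_{2} \right], \\
N_{\text{Ditag}} &= N_T \left[ f_1\, \varepsilon^{c}_{1} \varepsilon^{\text{LF}}_{2} + f_2\, \varepsilon^{\text{LF}}_{1} \varepsilon^{c}_{2} + (1-f_1-f_2)\, \varepsilon^{\text{LF}}_{1} \varepsilon^{\text{LF}}_{2} \right],
\end{aligned}
\tag{8.4}
\]

with N T the total number of events, f 1,2 the fraction of leading (subscript 1) and subleading (subscript 2) c jets, and ε c,LF 1,2 the tagging efficiencies for leading and subleading jets for c (superscript c) and light-flavour (superscript LF) quarks. The three terms in each bracket correspond to the different components, namely the probability for a jet pair to be composed of (c, light), (light, c), and (light, light) jet flavours as (leading, subleading) jets from the W boson decay. The (c, c) pair is not present since it is unphysical. Instead of measuring the efficiency ε c,LF 1,2 , the data-to-simulation scale factor is measured. Therefore, ε c,LF 1,2 is replaced with SF c,l ε c,LF 1,2 (MC) in eq. (8.4). In the latter expression ε c,LF 1,2 (MC) is the efficiency obtained from simulation and SF c (SF l ) is the scale factor for c (light-flavour) jets. To reduce the number of unknown parameters, the value for SF l is taken to be the measured value using the negative-tag method presented in section 8.2.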
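As an illustration of how the category yields follow from the (c, light), (light, c), and (light, light) flavour combinations, the sketch below computes the four expected yields for the "right W h " subsample. It is a minimal sketch under the assumption that the two probe jets are tagged independently; the function name, argument layout, and the numbers in the usage note are hypothetical.

```python
def category_yields(n_total, f1, f2, eff_c_mc, eff_lf_mc, sf_c=1.0, sf_l=1.0):
    """Expected yields in the Notag, Leadtag, Subleadtag and Ditag categories.

    f1, f2     : fractions of events where the leading / subleading jet is a c jet
    eff_c_mc   : (leading, subleading) simulated c tagging efficiencies
    eff_lf_mc  : (leading, subleading) simulated light-flavour efficiencies
    sf_c, sf_l : data-to-simulation scale factors multiplying the efficiencies
    """
    ec1, ec2 = (sf_c * e for e in eff_c_mc)
    el1, el2 = (sf_l * e for e in eff_lf_mc)
    # (probability of the flavour pair, leading-jet eff., subleading-jet eff.)
    pairs = [
        (f1, ec1, el2),              # (c, light)
        (f2, el1, ec2),              # (light, c)
        (1.0 - f1 - f2, el1, el2),   # (light, light)
    ]
    yields = {"Notag": 0.0, "Leadtag": 0.0, "Subleadtag": 0.0, "Ditag": 0.0}
    for prob, e1, e2 in pairs:
        yields["Notag"] += prob * (1.0 - e1) * (1.0 - e2)
        yields["Leadtag"] += prob * e1 * (1.0 - e2)
        yields["Subleadtag"] += prob * (1.0 - e1) * e2
        yields["Ditag"] += prob * e1 * e2
    return {name: n_total * value for name, value in yields.items()}
```

For example, `category_yields(1000, 0.3, 0.2, (0.4, 0.35), (0.1, 0.1))` distributes the 1000 events over the four categories, and the four yields always add up to the total. In the measurement itself only SF c would be left free, with SF l fixed to its measured value.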
A maximum likelihood fit is performed on the binned λ M distributions using the signal and background distributions (templates) obtained from the simulated events. The measurement is performed inclusively, since the selected number of events is not sufficient for a precise measurement in bins of jet p T . Systematic uncertainties are included in the fit as nuisance parameters that are profiled. Each nuisance parameter is floating with a Gaussian constraint around the central value with a standard deviation proportional to the systematic uncertainty. It is possible to group the systematic uncertainties in two sets based on their effect on the templates. The following systematic effects only affect the normalization of the templates:
• Scale factor for light-flavour jets: the data-to-simulation scale factor for the misidentification of light-flavour jets is varied within its uncertainty. This is the dominant uncertainty in the scale factor measurement for c jets.
• Cross sections of the simulated processes: an uncertainty of 16, 50 and 20% is assumed in the cross section of the tt [60], single top quark [61,62], and the combined W + jets and Z + jets [63,64] processes, respectively. The limited number of simulated W + jets and Z + jets events requires an additional uncertainty in their yield, fully uncorrelated among the event categories.
• Integrated luminosity and pileup: the uncertainty in the integrated luminosity measurement of 6.2% [65] and on the number of additional pileup interactions are considered as yield uncertainties. These uncertainties as well as the uncertainty in the cross sections for the simulated processes are the same for each working point probed, and are applied to the related samples in a correlated way between categories.
• Scale factors for b tagging: since b tagging is applied for the event selection, the uncertainty in the b tagging data-to-simulation scale factor is considered as a systematic effect. The simulation has been processed with b jet scale factors shifted by their uncertainties. In case the b-tagged jets in the event selection are actually originating from c quarks, the scale factor is varied by a conservative 50%. The size of the combined effect due to the uncertainty of correctly tagging b jets and wrongly tagging c or light-flavour jets for the event selection depends on the samples, the categories, and the working points considered. However, the effect of these uncertainties has limited impact on the final result, being fully correlated across samples and categories.
A potential source of systematic uncertainty for the normalization of the templates may arise from the uncertainty in the cross section of tt events produced in association with heavy-flavour jets, which is constrained to within 35% [66]. Such an uncertainty is covered by the systematic variation on the inclusive tt production cross section and the uncertainty in the b tagging scale factor for the event selection, which is taken to be 50% for jets arising from c quarks. In addition to a possible impact on the normalization of the templates, the following systematic effects affect the shape of the templates:
• Jet energy scale: new templates are constructed by varying the jet energy scale by ±1 standard deviation from its nominal value. The uncertainty is propagated to the fraction of c jets in leading and subleading jets.
• Jet energy resolution: for the nominal efficiency measurement, the jet energies in the simulation are smeared according to a Gaussian function to accommodate the slightly worse resolution in data. The uncertainty in the jet energy resolution is propagated to the data-to-simulation scale factor measurement by varying the standard deviation of the Gaussian function by its uncertainty.
• Factorization and renormalization scales: the factorization and renormalization scales used at matrix element and parton shower levels affect the number of additional jets from initial-state radiation (ISR) and final-state radiation (FSR), and may impact the fraction of leading and subleading c jets. The factorization and renormalization scales used at matrix element level are varied independently and simultaneously by factors of 2 and 0.5 with respect to their default values. Also the scale for ISR (FSR) in the parton shower is varied by a factor of 2 (√2) and 0.5 (√0.5) [67]. A different way to assess the uncertainty in the modelling of ISR and FSR is to vary the "hdamp" parameter. This parameter is used to limit the resummation of higher-order effects using a reference energy scale. The real emissions are reweighted by a damping function h^2/(p_T^2 + h^2), where h is the hdamp parameter and p T is the transverse momentum of the top quark in the tt rest frame. The hdamp parameter is varied between 0.5m t and 2m t to evaluate the uncertainty related to additional jets from ISR and FSR. The upward and downward variations with the largest impact on the templates are used to repeat the data-to-simulation scale factor measurements independently for ISR and FSR. The deviation from the nominal scale factor value is taken as the uncertainty. Together with the uncertainties in the jet energy scale and resolution, the effect is 1% for both the leading and subleading jets.

JINST 13 P05011
• Top quark mass: the uncertainty in the top quark mass may affect the measurement of the data-to-simulation scale factor. The size of the uncertainty is estimated using alternative simulated samples with a mass that is shifted within the uncertainty in the measured value [35].
• Parton distribution functions: the uncertainty in the parton densities is evaluated in the same way as in section 8.3.1 and found to be negligible.
• Bin-by-bin statistical uncertainty: statistical uncertainties related to the single bin population in the templates have been addressed through bin-by-bin variations, i.e. fully uncorrelated shape uncertainties in which only one bin of the template is shifted according to its uncertainty. In order to reduce the computational time required by the fit to converge, this uncertainty is only considered for template bins having an uncertainty larger than 5% of the yield observed in the same bin, thus rejecting most of the low-yield backgrounds.

Table 4 summarizes the values of the measured data-to-simulation scale factors for all tagging requirements. The uncertainties in the scale factors in the table are a combination of the statistical and systematic uncertainties obtained from the fit.

Combination of the measured c tagging efficiencies
In the previous sections two methods have been described to measure the c tagging efficiency. In this section, a combination of the measurements is performed. The combination is a weighted average taking into account the full covariance matrix for the uncertainties using the best linear unbiased estimator (BLUE) method [68]. This technique was also used for combining the SF b measurements in Run 1 [1], but here it has been extended to fit all the jet p T bins simultaneously [69], treating more correctly the bin-to-bin correlations for the systematic uncertainties. Systematic uncertainties shared by the two measurements are treated as correlated in the combination. The averaging has been done using the finer jet p T binning of the W + c topology. The relative contribution of the single-lepton tt measurement in each jet p T bin is taken into account by assigning weights to this

For the c tagging algorithm, the relative precision on the data-to-simulation scale factors for c jets is 2% (4%) for the loose (tight) working point. For the b tagging algorithms, the relative precision is 3-5% for the loose working points, and 10-38% for the tight working points. Overall, the statistical uncertainty is 40-90% of the total uncertainty.
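The idea behind a BLUE combination can be sketched for the simplest case of several measurements of a single quantity with a known covariance matrix: the weights are proportional to the row sums of the inverse covariance matrix and add up to one. The sketch below is not the simultaneous multi-bin extension used in the paper; the function names and the toy inputs in the usage note are hypothetical, and the linear system is solved with plain Gaussian elimination to keep the example free of external dependencies.

```python
def solve_linear(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    m = [list(row) + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def blue_combine(values, covariance):
    """Return (combined value, variance, weights) for one measured quantity."""
    u = solve_linear(covariance, [1.0] * len(values))  # C^-1 applied to (1,...,1)
    norm = sum(u)
    weights = [ui / norm for ui in u]
    combined = sum(w * v for w, v in zip(weights, values))
    return combined, 1.0 / norm, weights
```

With two uncorrelated toy scale factors 1.0 and 1.2 and variances 0.01 and 0.04, `blue_combine([1.0, 1.2], [[0.01, 0.0], [0.0, 0.04]])` gives weights of 0.8 and 0.2. Off-diagonal covariance terms would enter in the same way for systematic uncertainties shared by the two measurements.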

The b jet identification efficiency
The data-to-simulation scale factor for b jets, SF b , is obtained using a sample of jets enriched in b quark content, e.g. by selecting multijet events with at least one jet containing a muon, or tt events that contain two b jets from the decay of the two top quarks. To enhance the purity when selecting tt events, the decay of one or both of the W bosons into leptons is required. This section describes the various SF b measurements and their combination.

Measurements relying on a muon-enriched topology
Events are selected using various online criteria requiring the presence of two jets with at least one of those jets containing a muon. The different prescales of the various triggers are taken into account by reweighting the selected events according to the value of the prescale. Offline, the sample is enriched with events containing b jets by requiring that at least one jet has a muon with p T > 5 GeV and with ∆R < 0.4 from the jet axis, referred to as the "muon jet". The selected simulated events are reweighted to match the pileup profile observed in the data. The muon jet sample is used for three measurements, using the PtRel, LifeTime (LT), and System-8 methods [1]. As discussed in section 8.3.1, the muon enrichment may introduce a bias for the efficiency measurement of taggers that rely on soft muon information, such as the cMVAv2, CvsB, and CvsL discriminators. Therefore, the methods described in this section are only used to derive data-to-simulation scale factors for the other taggers.
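The offline definition of a muon jet relies on the angular distance ∆R between the muon and the jet axis. The sketch below assumes the usual definition ∆R = sqrt(∆η^2 + ∆φ^2) with ∆φ wrapped into [−π, π); the dictionary-based event representation and the function names are hypothetical.

```python
import math


def delta_phi(phi1, phi2):
    """Azimuthal difference wrapped into [-pi, pi)."""
    return (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi


def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance in the (eta, phi) plane."""
    return math.hypot(eta1 - eta2, delta_phi(phi1, phi2))


def is_muon_jet(jet, muons, min_pt=5.0, max_dr=0.4):
    """True if any muon with pT > min_pt lies within max_dr of the jet axis."""
    return any(
        mu["pt"] > min_pt
        and delta_r(jet["eta"], jet["phi"], mu["eta"], mu["phi"]) < max_dr
        for mu in muons
    )
```

The wrap-around in `delta_phi` matters for objects close to φ = ±π, which would otherwise appear to be far apart.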
PtRel method. The p T of the muon relative to the jet axis, p rel T , is a variable that is able to discriminate between b jets and non-b jets. On average, this variable is expected to be larger for muons coming from the decay of b hadrons because of the large mass of these hadrons. Therefore, this variable can be used to measure the efficiency for tagging b jets with algorithms relying on track and secondary vertex variables. The fraction of b jets in data can be estimated by fitting the observed p rel T distribution to the sum of the templates for the different jet flavours. The p rel T templates for the different flavours are obtained from the simulated muon-enriched multijet samples.
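For concreteness, p rel T can be computed as the component of the muon momentum perpendicular to the jet axis, |p(µ) × p(jet)| / |p(jet)|. This is a minimal sketch; whether the jet axis includes the muon momentum is a convention that the text does not specify, and the function name is hypothetical.

```python
import math


def pt_rel(p_muon, p_jet):
    """Muon momentum transverse to the jet axis.

    Both arguments are Cartesian 3-vectors (px, py, pz) in GeV.
    """
    cx = p_muon[1] * p_jet[2] - p_muon[2] * p_jet[1]
    cy = p_muon[2] * p_jet[0] - p_muon[0] * p_jet[2]
    cz = p_muon[0] * p_jet[1] - p_muon[1] * p_jet[0]
    cross_norm = math.sqrt(cx * cx + cy * cy + cz * cz)
    jet_norm = math.sqrt(sum(c * c for c in p_jet))
    return cross_norm / jet_norm
```

A muon exactly along the jet axis has p rel T = 0, while heavier decaying hadrons tend to give their decay muons a larger transverse kick.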
To reduce the fraction of non-b jets, the presence of a second jet is required away from the first one ("away jet") with ∆R > 1.5 and exceeding a JBP discriminator value corresponding to the medium working point. For light-flavour jets, a difference is observed between data and simulation in the distribution of the number of charged particles per jet. Therefore, the jets are reweighted with the ratio of the distribution of the observed number of charged particles in inclusive multijet data to that expected in simulation (without the muon enrichment). The template for b jets is corrected by applying a factor corresponding to the ratio of the p rel T distribution in data to that in simulation for b jets passing the tight JP tagging requirement. The fraction of non-b jets in the JP-tagged samples is found to be a few per cent and is subtracted. After this correction, we apply the algorithm working point for which the efficiency is to be measured. The observed p rel T distribution is then fitted with the templates for the jet flavours to obtain the number of b jets passing (N tagged b ) or failing (N vetoed b ) the requirement. The b tagging efficiency in data is obtained as

\[
\varepsilon_b = \frac{N_b^{\text{tagged}}}{N_b^{\text{tagged}} + N_b^{\text{vetoed}}}.
\]

Examples of the fitted p rel T distributions using jets passing and failing the medium working point of the CSVv2 algorithm and with 50 < p T < 70 GeV, are shown in figure 39.

LifeTime method. The muon jet sample used in the LT method is the same as for the PtRel method, except that the away jet is not required to be tagged. Also the strategy is similar to the PtRel method, but the fit is performed on the JP discriminator distribution. The track probabilities are calibrated using templates with negative impact parameter tracks in multijet events. The calibration is done separately on data and simulation to take into account a potential difference in the impact parameter resolution between both samples. The fraction of b jets is fitted including all shape systematic uncertainties via a correlation matrix. The tagging efficiency is then obtained as the ratio of the number of b jets obtained from the fit after and before applying the algorithm working point

\[
\varepsilon_b = C_b \, \frac{N_b^{\text{tag}}}{N_b},
\]

where N tag b and N b are the numbers of b jets obtained from the fit after and before applying the working point. The factor C b is a correction factor, which takes into account the fraction of jets for which the JP discriminant can be computed. It is defined as

\[
C_b = \frac{N_{b,\text{MC}} / n_{b,\text{MC}}}{N^{\text{tag}}_{b,\text{MC}} / n^{\text{tag}}_{b,\text{MC}}},
\]

with N b,MC the number of b jets with JP information, n b,MC the number of all selected b jets, N tag b,MC the number of b jets with JP information passing the algorithm working point for which the efficiency is being measured and n tag b,MC the number of b jets passing the tagging requirement for which the data-to-simulation scale factor is being measured. The fraction of jets without a JP discriminant value is maximum at very low jet p T (8%) and drops below 1% using jets with p T > 120 GeV.
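The two template-fit efficiencies can be written as small helper functions. The sketch below is illustrative only. It assumes that the PtRel efficiency is the tagged fraction of the fitted b jets, and that C b is the ratio of the simulated efficiency for all b jets to that for b jets with JP information. The second assumption is inferred from the four quantities defined above and is not quoted from the paper. All function names and toy numbers are hypothetical.

```python
def ptrel_efficiency(n_b_tagged, n_b_vetoed):
    """Tagged fraction of the fitted b jets (PtRel method)."""
    return n_b_tagged / (n_b_tagged + n_b_vetoed)


def jp_correction_factor(n_all, n_all_tag, n_jp, n_jp_tag):
    """Assumed form of C_b, built from simulated b jet counts.

    n_all, n_all_tag : all selected b jets, before / after the tag
    n_jp, n_jp_tag   : b jets with JP information, before / after the tag
    """
    eff_all = n_all_tag / n_all
    eff_with_jp = n_jp_tag / n_jp
    return eff_all / eff_with_jp


def lt_efficiency(n_fit_tag, n_fit_all, c_b):
    """LT efficiency: fitted b jets after / before the tag, corrected by C_b."""
    return c_b * n_fit_tag / n_fit_all


def scale_factor(eff_data, eff_mc):
    """Data-to-simulation scale factor."""
    return eff_data / eff_mc
```

With toy counts of 1000 selected b jets of which 600 are tagged, and 920 b jets with JP information of which 598 are tagged, C b is about 0.92, in line with a fraction of up to 8% of jets without a JP value at low p T .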
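As an illustration, figure 40 shows the fitted JP distributions using jets with 200 < p T < 300 GeV before and after applying the medium working point of the CSVv2 algorithm.

System-8 method. In contrast with the two methods described before, the System-8 method [70] does not rely on simulated templates of a discriminating variable. Instead, it is based on the usage of two weakly correlated b taggers and two samples containing muons within jets. The first b tagging requirement corresponds to the working point of the algorithm for which the efficiency is to be measured (tag); the second b tagging requirement is p rel T > 0.8 GeV. This requirement is weakly correlated with the working points for algorithms that do not rely on soft-muon information. The first sample consists of all events with a muon jet (sample n); the second sample is a subset where an away jet satisfies the medium working point of the JBP algorithm (sample p). For each combination of a sample with either zero, one of the two, or both tagging requirements applied, the observed number of jets can be written as the sum of the two (b and non-b) flavour contributions. The efficiency of the algorithm working point under study and the efficiency of the p rel T > 0.8 GeV requirement are assumed to be factorizable modulo a correlation factor that is determined from simulated events. In total eight equations can be written, with eight unknown parameters, namely the efficiencies of the two requirements for b and non-b jets and the number of b and non-b jets in the two samples:

\[
\begin{aligned}
n &= n_b + n_{c,udsg}, \\
p &= p_b + p_{c,udsg}, \\
n^{\text{tag}} &= \varepsilon_b^{\text{tag}}\, n_b + \varepsilon_{c,udsg}^{\text{tag}}\, n_{c,udsg}, \\
p^{\text{tag}} &= \beta\, \varepsilon_b^{\text{tag}}\, p_b + \alpha\, \varepsilon_{c,udsg}^{\text{tag}}\, p_{c,udsg}, \\
n^{p_T^{\text{rel}}} &= \varepsilon_b^{p_T^{\text{rel}}}\, n_b + \varepsilon_{c,udsg}^{p_T^{\text{rel}}}\, n_{c,udsg}, \\
p^{p_T^{\text{rel}}} &= \delta\, \varepsilon_b^{p_T^{\text{rel}}}\, p_b + \gamma\, \varepsilon_{c,udsg}^{p_T^{\text{rel}}}\, p_{c,udsg}, \\
n^{\text{tag},\,p_T^{\text{rel}}} &= \kappa_b\, \varepsilon_b^{\text{tag}} \varepsilon_b^{p_T^{\text{rel}}}\, n_b + \kappa_{c,udsg}\, \varepsilon_{c,udsg}^{\text{tag}} \varepsilon_{c,udsg}^{p_T^{\text{rel}}}\, n_{c,udsg}, \\
p^{\text{tag},\,p_T^{\text{rel}}} &= \kappa_b\, \beta\, \delta\, \varepsilon_b^{\text{tag}} \varepsilon_b^{p_T^{\text{rel}}}\, p_b + \kappa_{c,udsg}\, \alpha\, \gamma\, \varepsilon_{c,udsg}^{\text{tag}} \varepsilon_{c,udsg}^{p_T^{\text{rel}}}\, p_{c,udsg},
\end{aligned}
\]

where α, β, γ, δ, κ b and κ c,udsg are the correlation factors. This system of eight equations is solved numerically. The solution has to pass some physical constraints, e.g. the b tagging efficiency is required to be larger than the non-b tagging efficiency, and the fraction of b jets in the initial sample needs to be smaller than the fraction of non-b jets in the sample.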
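To make the structure of the System-8 equations concrete, the sketch below evaluates the eight predicted jet counts for a given set of unknowns, together with the residuals that a numerical solver would drive to zero. It assumes the conventional form of the system, in which α and β (γ and δ) relate the tag (p rel T ) efficiencies in sample p to those in sample n for non-b and b jets, respectively, and κ b and κ c,udsg correct the factorization of the two requirements. This assignment, the dictionary keys, and the function names are assumptions made for illustration; non-b quantities carry the suffix `_x`.

```python
def system8_predictions(u, corr=None):
    """Predicted jet counts for unknowns u and correlation factors corr.

    u needs n_b, n_x, p_b, p_x (b and non-b jets in samples n and p) and
    eff_tag_b, eff_tag_x, eff_rel_b, eff_rel_x (efficiencies of the tagger
    working point and of the pTrel requirement). Missing correlation
    factors default to 1, i.e. no correlation.
    """
    c = {"alpha": 1.0, "beta": 1.0, "gamma": 1.0, "delta": 1.0,
         "kappa_b": 1.0, "kappa_x": 1.0}
    c.update(corr or {})
    tb, tx = u["eff_tag_b"], u["eff_tag_x"]
    rb, rx = u["eff_rel_b"], u["eff_rel_x"]
    return {
        "n": u["n_b"] + u["n_x"],
        "p": u["p_b"] + u["p_x"],
        "n_tag": tb * u["n_b"] + tx * u["n_x"],
        "p_tag": c["beta"] * tb * u["p_b"] + c["alpha"] * tx * u["p_x"],
        "n_rel": rb * u["n_b"] + rx * u["n_x"],
        "p_rel": c["delta"] * rb * u["p_b"] + c["gamma"] * rx * u["p_x"],
        "n_tag_rel": (c["kappa_b"] * tb * rb * u["n_b"]
                      + c["kappa_x"] * tx * rx * u["n_x"]),
        "p_tag_rel": (c["kappa_b"] * c["beta"] * c["delta"] * tb * rb * u["p_b"]
                      + c["kappa_x"] * c["alpha"] * c["gamma"] * tx * rx * u["p_x"]),
    }


def residuals(u, observed, corr=None):
    """Differences between predicted and observed counts (zero at the solution)."""
    predicted = system8_predictions(u, corr)
    return {key: predicted[key] - observed[key] for key in predicted}


def is_physical(u):
    """Constraints quoted in the text: eff_b > eff_non-b, fewer b than non-b jets."""
    return u["eff_tag_b"] > u["eff_tag_x"] and u["n_b"] < u["n_x"]
```

A root finder applied to `residuals` with the eight unknowns as free parameters would then play the role of the numerical solution, with `is_physical` used to reject unphysical roots.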
Systematic uncertainties. Various systematic uncertainties are taken into account that may affect the SF b measurement. For the three measurements based on the muon-enriched jet samples, the following systematic effects are considered:
• Gluon splitting: a variation in the fraction of b and c jets from gluon splitting may have an important impact on the b tagging efficiency since heavy-flavour jets from gluon splitting have a higher track multiplicity. The fraction of events with b jets from gluon splitting is varied by ±25% [71] to estimate the potential effect. For the tight working point of the taggers, this is one of the dominating uncertainties for the System-8 method. In the case of the LT method, the fraction of events with c jets from gluon splitting is also varied by this amount. For the System-8 and PtRel methods, the fraction of c jets in the non-b template is varied when evaluating other systematic effects.
• b quark fragmentation: the modelling of the b quark fragmentation may affect the p T distribution of the b jets in the sample. The size of this effect is estimated by varying the p T of the primary b hadron in the muon jet by ±5%, which is the observed variation between the distributions of the energy fraction of the b jet carried by the b hadron in two different event generators. This variation between the generators is typically larger than the variation observed between simulation and data.
• Branching fraction of D → µX and fragmentation of c → D: these systematic effects are evaluated in the same way as described in section 8.3.1, with the exception that the PDG 2008 values [72] are used for the fragmentation rates. While the nominal values and uncertainties vary slightly in the PDG 2008 and 2016 references, they are fully consistent.
• K 0 S and Λ decays (V 0 ): this systematic effect is evaluated in the same way as described in section 8.2.
• Muon p T and ∆R: the fraction of muons that reach the muon chambers depends on the muon p T . The threshold on the muon p T is varied between 5 and 8 GeV to assess the size of the systematic uncertainty. In addition, the dependence of the measured data-to-simulation scale factor on the ∆R requirement is tested by tightening the requirement to ∆R < 0.3. These systematic effects are among the dominant uncertainties for the System-8 method.
• Away jet tag: the dependence of the b tagging efficiency on the away jet tagging requirement is studied by repeating the data-to-simulation scale factor measurement after changing the tagging requirement from the medium to the loose or tight working points. The largest deviation from the scale factor value obtained using the default away-jet tagging requirement is taken as the size of the systematic effect. This systematic effect is typically the dominant uncertainty for the PtRel method, and it is one of the dominating uncertainties for the System-8 method.
• JP correction factor C b : for the LT method, the fit is performed using only jets that have a JP discriminant value. The applicability of the measured data-to-simulation scale factor to all jets is ensured through the correction factor C b . The systematic uncertainty associated with C b is defined as (δC b ) SF = ± 1−C b 2 . This systematic effect induces an uncertainty in the measured scale factor of a few per cent using jets in the lowest p T bin and is negligible at high jet p T .
• JP calibration: the LT method relies on the calibrated JP discriminator distribution. For the nominal data-to-simulation scale factor value, the calibration of the impact parameter resolution derived from data is applied to the data, and the calibration derived from simulated events is applied to the simulation. However, a bias could be induced in the measurement if there are significant differences between data and simulation in the distribution of track impact parameter resolutions used. Therefore, an additional uncertainty is taken into account by applying the calibration derived on simulation also to the data. The difference in the measured scale factor is included as an additional systematic uncertainty. The inverse approach was also tested, i.e. applying the JP calibration derived on data to both data and simulation. In that case, the shape changed in a similar way, yielding consistent results for the size of the systematic effect. The systematic effect due to the JP calibration is the dominating uncertainty for the LT method.
• JP bin-to-bin correlation: for the LT method the systematic uncertainties are taken into account via a correlation matrix. This requires an assumption on the bin-to-bin correlation factors. To assess the impact of an uncertainty in these correlation factors, the data-to-simulation scale factors were remeasured when varying the bin-to-bin factors within ±25%. The size of the systematic effect is given by the maximal difference with the nominal SF value.
• Muon p rel T requirement: for the System-8 method, the default requirement of p rel T > 0.8 GeV on the muon is set to a value of 0.5 or 1.2 GeV. The largest deviation from the measured nominal data-to-simulation scale factor is taken as a systematic uncertainty.
• udsg-to-c jet ratio: in the PtRel method the c and light-flavour jets are combined in a single template. The ratio of light-flavour to c jets is varied by ±30% to cover the observed discrepancy in the fraction of light-flavour jets in inclusive and muon-enriched multijet events.
• Non-b jet template correction: for the PtRel method, the non-b jet templates are corrected to accommodate the difference in the number of selected tracks for data and simulation. The difference between the measured data-to-simulation scale factors when applying these corrections or not is considered as the size of the systematic effect.
• b jet template correction: similarly as in the case for the non-b jet template, also for the b jet template the difference between the nominal data-to-simulation scale factor value and that measured without template correction is taken as an additional uncertainty for the PtRel method.
• Jet energy scale: the impact of the uncertainty in the jet energy corrections is evaluated as described in section 8.3.1.
• Pileup: the effect of the uncertainty in the number of additional pileup interactions is evaluated as described in section 8.2.
For the System-8 and PtRel methods the largest deviation from the nominal data-to-simulation scale factor value is taken as the size of the systematic effects. For the LT method, the shape variations are taken into account in the template fit. Table 5 summarizes the list of systematic effects taken into account for each of the three methods to measure SF b .
Results. The measurements of SF b obtained on muon-enriched multijet events are combined using the BLUE method as described in section 8.3.3. The weighted average is calculated taking into account the correlations between the three methods. The combination is performed as a function of the jet p T , ranging from 20 to 1000 GeV. Jets with a higher p T are included in the last bin. The PtRel and LT methods provide measurements on the full jet-p T range from 20 to 1000 GeV, while the sensitivity of the System-8 method is limited to the lower part of the spectrum, 20 < p T < 140 GeV.
The PtRel, System-8, and LT methods are applied on the same events. However, the requirement on the second jet is different for each method. The fraction of events with b quarks that is in common between each pair of methods is obtained from simulated events, and is used to estimate the statistical correlation in the combination of the results. Systematic uncertainties that are in common for two or three methods are treated as correlated. Some of the systematic effects that induce a large uncertainty are however related to a specific method and are treated as uncorrelated.
The data-to-simulation scale factor measurements obtained with the PtRel, System-8, and LT methods for the loose (tight) working point of the CSVv2 (DeepCSV) algorithm as a function of the jet p T are compared in the upper panel of figure 41. For each point, the thick error bar corresponds to the statistical uncertainty and the thin one to the overall statistical and systematic uncertainty. The combined SF b value is displayed as a hatched area in both panels with its overall uncertainty. In the lower panel the result of a fit function is superimposed. The function used in the fit is SF b (p T ) = α (1 + βp T )/(1 + γp T ), where α, β, γ are free parameters. The combined statistical and systematic uncertainty is centred around the fit result. The measured data-to-simulation scale factors for the loose working point of the CSVv2 algorithm range from 0.96 to 1.03, and from 0.9 to 1.0 for the tight working point of the DeepCSV algorithm. The relative precision on the scale factors is 1-1.5% using jets with 70 < p T < 100 GeV and rises to 3-5% at the highest considered jet p T .

Table 5. Summary of the potential sources of systematic effects taken into account for the muon-enriched SF b measurements. The symbol "x" means that the uncertainty is considered, "-" means that it is negligible, and "n/a" that it is not applicable. The systematic effects are separated by horizontal lines according to the type of uncertainty. The first set indicates the modelling uncertainty of heavy-flavour jets in the simulation, the second set are uncertainties related to the selection requirements or to the method that is applied, and the third set covers any other type of uncertainty.

Systematic effect                PtRel   LT    System-8
--------------------------------------------------------
Gluon splitting to bb
Branching fraction of D → µX     n/a     x     n/a
c → D fragmentation rate         n/a     x     n/a
K 0 S (Λ) production fraction    n/a     x     n/a
--------------------------------------------------------
Muon p T and ∆R                  x       -     x
Away jet tag                     x       n/a   x
Fraction of jets with JP         n/a     x     n/a
JP calibration                   n/a     x     n/a
JP bin-by-bin correlation        n/a     x     n/a
p rel T requirement              n/a     n/a   x
udsg-to-c jet ratio              x       n/a   n/a
Non-b template correction        x       n/a   n/a
b template correction            x       n/a   n/a
--------------------------------------------------------
JES                              x       x     x
Pileup                           x       -     x
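The parametrization fitted to the combined scale factors, SF b (p T ) = α (1 + βp T )/(1 + γp T ), can be evaluated as in the sketch below. The parameter values in the usage note are made up for illustration and are not fit results from the paper. Clamping the jet p T to the measured range of 20 to 1000 GeV is an assumption motivated by the statement that jets with higher p T are included in the last bin.

```python
def sf_b(pt, alpha, beta, gamma, pt_min=20.0, pt_max=1000.0):
    """Evaluate alpha * (1 + beta * pT) / (1 + gamma * pT).

    pT (in GeV) is clamped to [pt_min, pt_max]; the clamping is an
    illustrative choice, not a prescription from the paper.
    """
    pt = min(max(pt, pt_min), pt_max)
    return alpha * (1.0 + beta * pt) / (1.0 + gamma * pt)
```

For example, `sf_b(100.0, 0.9, 0.01, 0.02)` evaluates to 0.6. For β = γ the function reduces to the constant α, and at large p T it tends towards αβ/γ.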

Measurements relying on the dilepton tt topology
The b jet identification efficiency is also measured using dilepton tt events, where two b jets are expected from the decay of the top quark pair. Events are selected with exactly two isolated leptons (muons or electrons) fulfilling tight identification criteria [9,10] with opposite charge and p T above 25 GeV. Events are selected if there are at least two jets with p T > 30 GeV. All aforementioned objects are required to be in the tracker acceptance.
Kinematic selection method. For the kinematic selection (Kin) method, events are further selected by requiring the presence of exactly one isolated electron and one isolated muon with opposite sign and with a dilepton invariant mass M µe > 90 GeV. These requirements significantly reduce the background from Z + jets events. In addition, p miss T is required to be larger than 40 GeV. While two jets are expected in dilepton tt events, it is possible that more than two jets (or the wrong two jets) are selected because of, e.g. ISR and FSR. A discriminator is constructed that is able to separate b
• ∆η(ℓ, j) and ∆φ(ℓ, j): difference in pseudorapidity and in azimuthal angle between the lepton and jet for the (ℓ, j) pair with the smallest and the largest ∆R(ℓ, j).
• ∆η(ℓℓ, j) and ∆φ(ℓℓ, j): difference in pseudorapidity and azimuthal angle between the dilepton system and an (ℓ, j) pair for the (ℓ, j) pair with the smallest and the largest ∆R(ℓ, j).
The variables in the first two items are sensitive to (ℓ, j) pairs originating from the same top quark decay, while the variables in the two latter items use the correlation between the spin of the top quark and the top antiquark that is present in tt events [73]. The 12 variables listed above are combined with a BDT using the TMVA package [52]. Prior to the training on simulated tt events, jets in the event are classified according to their rank when ordered according to decreasing p T . In particular, the training is performed in three different categories for the leading, subleading, and other jets. This classification helps to better use the correlations between the variables for the signal and background, in particular for events with a high jet multiplicity. The parameters of the BDT, such as the number of trees, the depth and the shrinkage factor of the gradient learning algorithm, were roughly optimized to obtain a smooth background shape at large discriminator values without reducing the discriminating power.

A binned likelihood fit is performed on the kinematic discriminator of jets passing and failing the b tagging requirement, inclusively for all jets together. For each flavour f , the total number of jets N f can be expressed as a function of the tagging efficiency in simulation, ε MC f , and the data-to-simulation scale factor SF f : where N MC, tagged f is the expected number of jets of flavour f passing the requirement determined from simulation. The templates for light-flavour and c jets are similar. The measured value of the mistag data-to-simulation scale factor, as presented in section 8.2, is used to correct for the different misidentification probability in the data and simulation. It is not necessary to use a dedicated scale factor for c jets since the fraction of c jets is expected to be less than 1% and fully covered by the systematic uncertainties. The scale factor for b jets is the only free parameter to be determined from the fit.
The fit is performed simultaneously in bins of jet multiplicity, with up to four jets. For convenience, the discriminator values are transformed from [−1, 1] to [−1, 1] + 2(N jets − 2).
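The transformation of the discriminator from [−1, 1] to [−1, 1] + 2(N jets − 2) places the distributions for the different jet multiplicities side by side on a single axis. A minimal sketch, with a hypothetical function name, could look as follows; restricting the multiplicity to between two and four jets reflects the fit range quoted above.

```python
def shifted_discriminator(value, n_jets):
    """Map a discriminator value in [-1, 1] to [-1, 1] + 2 * (n_jets - 2)."""
    if not -1.0 <= value <= 1.0:
        raise ValueError("discriminator value must lie in [-1, 1]")
    if not 2 <= n_jets <= 4:
        raise ValueError("fit is performed for events with two to four jets")
    return value + 2.0 * (n_jets - 2)
```

Events with two jets keep the range [−1, 1], those with three jets occupy [1, 3], and those with four jets occupy [3, 5].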
Several sources of systematic uncertainties are considered:
• Factorization and renormalization scales: the uncertainty in the factorization and renormalization scales is evaluated in the same way as in section 8.3.2, except that the scale for FSR in the parton shower is varied by a factor of two up and down, and not by a factor of √2. In addition, both the variation of the scale in the parton shower and the variation of the hdamp parameter are taken into account to assess the impact of ISR and FSR instead of using the largest variation. Although these systematic uncertainties are correlated, they are conservatively treated as uncorrelated.
• Cross section of background processes: the cross section of each non-tt background process is varied by 30% to assess the systematic effect due to the uncertainty in the background contributions.
• Top quark mass: the uncertainty in the top quark mass is evaluated in the same way as in section 8.3.2.
• Scale factor for non-b jets: the data-to-simulation scale factor for light-flavour jets, SF l , is applied to correct the expected fraction of light-flavour and c jets. To evaluate the uncertainty related to SF l , the value is changed to SF l ± 1σ, where σ represents the uncertainty in SF l . The effect of this variation on the measured value of SF b is taken as the size of the uncertainty due to SF l .
• Selection efficiency: the uncertainty in the lepton identification and isolation efficiency is propagated to the measurement by reweighting the simulation using a lepton efficiency scale factor that is shifted up or down by one standard deviation with respect to the nominal value.
• Pileup: the uncertainty in the pileup modelling is assessed as described in section 8.2.
The systematic effect induced by the uncertainty in the parton distribution functions is negligible.
To determine the dependence of the data-to-simulation scale factor on the jet p T , independent fits are performed in mutually exclusive bins of jet p T . An example of the fitted distribution in the jet p T range between 100 and 140 GeV using jets passing and failing the medium working point of the CSVv2 algorithm is shown in the left and right panels of figure 42, respectively. Discrepancies between the data and simulation are covered by the combined statistical and systematic uncertainty. The scale factor as a function of the jet p T is shown in figure 43 for the three working points of the CSVv2 and DeepCSV algorithms.
Two-tag counting method. The two-tag counting (TagCount) method is mainly used as a cross check of the Kin method. While the Kin method is able to determine data-to-simulation scale factors at higher jet p T , the TagCount method is a simple and robust approach to assess the size of the scale factors. The dilepton tt events are selected by requiring the dilepton invariant mass M > 12 GeV. If the two leptons have the same flavour, the contribution from Z + jets events is reduced by applying a veto around the Z boson mass, |M − M Z | > 10 GeV, and requiring p miss T > 50 GeV. In addition, each event is required to have exactly two jets. The b jet identification efficiency, ε b , can be obtained by counting the number of events with two b-tagged jets in the selected sample of events:

\[ N_{\text{2 b-tagged}} = \varepsilon_b^2 \, n_{\text{2 b jets}} + N^{\text{non-b jet}}_{\text{2 b-tagged}}, \tag{8.10} \]

where N 2 b-tagged is the number of events with two b-tagged jets from data, N non-b jet 2 b-tagged is the number of events with two b-tagged jets with at least one of them being a light-flavour or c jet, and n 2 b jets is the number of events with two true b jets. This equation can be solved for ε b if N non-b jet 2 b-tagged and n 2 b jets are known. To reduce the dependence on the tt production cross section, the equation is divided by the number of selected events,

\[ F_{\text{2 b-tagged}} = \varepsilon_b^2 \, f_{\text{2 b jets}} + F^{\text{non-b jet}}_{\text{2 b-tagged}}, \]

where F 2 b-tagged is the fraction of events with two b-tagged jets, F non-b jet 2 b-tagged is the fraction of events with two b-tagged jets of which at least one is a non-b jet, and f 2 b jets is the fraction of events with two true b jets. The two latter fractions are obtained from simulation. When the method is used to measure the efficiency as a function of the jet p T , the two tagged jets are required to be in the same jet p T bin.
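The counting relation can be made concrete with a short numerical sketch. It assumes that the two b jets are tagged independently with the same efficiency, so that the fraction of events with two b-tagged jets equals ε b squared times the fraction of events with two true b jets, plus the non-b contribution. The function and argument names are illustrative and do not come from the analysis code.

```python
import math


def tagcount_efficiency(frac_two_tagged: float,
                        frac_two_tagged_non_b: float,
                        frac_two_true_b: float) -> float:
    """Solve F = eff**2 * f + F_non_b for the b tagging efficiency eff.

    frac_two_tagged        fraction of events with two b-tagged jets (data)
    frac_two_tagged_non_b  same, with at least one non-b jet (simulation)
    frac_two_true_b        fraction of events with two true b jets (simulation)
    """
    numerator = frac_two_tagged - frac_two_tagged_non_b
    if frac_two_true_b <= 0.0 or numerator < 0.0:
        raise ValueError("inputs do not yield a physical efficiency")
    return math.sqrt(numerator / frac_two_true_b)


def scale_factor(eff_data: float, eff_mc: float) -> float:
    """Data-to-simulation scale factor as the ratio of efficiencies."""
    return eff_data / eff_mc
```

Because the efficiency enters squared, a given relative shift in the non-b fraction changes the measured efficiency by roughly half of the corresponding relative shift in the numerator.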
While the method is sensitive to the uncertainties in the predicted fraction of events with non-b jets F non-b jet 2 b-tagged , using the fraction of events ensures that systematic uncertainties related to the number of tt events cancel out. The dominant uncertainties originate from the normalization of background events and the fraction of non-b jet events in the bin with two b-tagged jets. The following systematic effects were studied:
• The fraction of non-b jets (F non-b jet 2 b-tagged ): a conservative variation of 50% is used to estimate the uncertainty in the fraction of non-b jets. This represents the leading uncertainty in the final data-to-simulation scale factor for the loose working point of the b jet identification algorithms.

JINST 13 P05011
• Background yield: the effect of the uncertainty in the background estimation for the Z + jets background obtained from data is evaluated by varying its normalization by 50%. For the background yields that are estimated from the simulation, an uncertainty of 30% is assumed. This uncertainty is the subleading source of uncertainty.
• Factorization and renormalization scales: the uncertainty in the factorization and renormalization scales is assessed as described in section 8.3.2, except for the scale for FSR in the parton shower, which is varied by a factor of two up and down, and not by a factor of √2.
• Jet energy scale: the uncertainty in the jet energy scale is propagated to an uncertainty in the data-to-simulation scale factor as described in section 8.3.1.
• Jet energy resolution: the uncertainty in the jet energy resolution is addressed as described in section 8.3.2.
• Pileup: the systematic effect related to the uncertainty in the number of pileup interactions is evaluated as described in section 8.2.
The systematic effects related to the uncertainty in the top quark mass and the parton distribution functions are negligible compared to the impact of the uncertainty in the background yield and the number of non-b jets. The b jet identification efficiency is determined in bins of jet p T and the corresponding data-to-simulation scale factors are shown in figure 44. Large bin-to-bin variations are observed for low-p T jets, in particular for the tight working points of the taggers, for which the statistical uncertainty dominates.

Tag-and-probe technique using single-lepton tt events
In addition to dilepton tt events, one can also use the tt topology where only one of the W bosons decays to leptons. In this case, two b jets are expected from the top quark decays as well as two non-b jets from the decay of one of the W bosons. The decay chain t → bW → bqq is referred to as the hadronic side, while t → bW → bℓν is the leptonic side. The event selection criteria are similar to those described in section 8.3.2, requiring exactly one isolated muon or electron with p T > 30 GeV and satisfying tight identification criteria [9,10] and exactly four jets with p T > 30 GeV to reduce the possible number of jet-quark assignments.
To enhance the b quark content in the jet sample on which the b tagging efficiency will be determined, the jets need to be correctly assigned to the quarks from which they originate. To achieve this, a likelihood method is used that is described in detail in ref. [60]. The reconstruction of the tt topology is enhanced by determining first the four-momentum of the neutrino $p_\nu$ using the W boson and top quark mass constraints, $(p_\nu + p_\ell)^2 = m_W^2$ and $(p_\nu + p_\ell + p_{b,\ell})^2 = m_t^2$, with $p_\ell$ and $p_{b,\ell}$ being the four-momenta of the charged lepton and of the b jet candidate on the leptonic side, respectively. If both equations need to be satisfied, the possible solutions are found on an ellipsoid in the 3D momentum space of the neutrino. For each solution, the distance D ν is computed between the ellipse projection on the transverse plane and the $\vec{p}_{\mathrm{T}}^{\,\mathrm{miss}}$ vector. The solution of $p_\nu$ for which this distance is minimal, D ν,min , is used. More details on this procedure and its performance can be found in ref. [74]. Once the neutrino momentum is defined, the jets are assigned to the quarks by choosing the jet-quark assignment that minimizes the negative logarithm of the likelihood λ. For each permutation − log(λ) is obtained as:

Once the jets are assigned to the quarks, a tag-and-probe (TnP) technique is applied to determine the b tagging efficiency from data. As a tagging requirement, the medium working point of the CSVv2 algorithm is applied to either the b jet on the hadronic or leptonic side while the b jet from the other side is used as probe. The event is rejected if the tagging requirement is not satisfied. The probe jets are used to determine the b tagging efficiency of a given working point for each tagger under consideration. To achieve that, the distributions of − log(λ) and p miss T for probe jets passing and failing the tagging requirement are fitted with their expected templates to determine their number in data for the correctly-reconstructed tt events.
The normalization of the template for the non-tt background is naturally constrained by the p miss T distribution in the simultaneous fit. The b tagging efficiency in data is then obtained from the fitted fraction of probe jets passing the tagging requirement with respect to all probe jets, as in eq. (8.5). To increase the number of probe jets and to avoid a possible bias in the measurement, each b jet is used once as tag and once as probe. While the measurements are performed separately with either the b jet from the hadronic or the leptonic side as probe jet, they are afterwards combined by treating all systematic uncertainties as correlated. The measurement is performed in bins of jet p T . Figure 45 shows an example of the fitted − log(λ) and p miss T distributions for probe jets from the leptonic side, with 70 < p T < 100 GeV, passing and failing the medium working point of the  obtained from simulation. The template distribution for single top quark events is also taken from simulation, and its normalization is constrained within 20% of the expected standard model yield. The non-tt background is composed of multijet, Z + jets, and W + jets events and the combined template for these background processes is derived from data in a control region. The control region contains the events for which the jet with the highest CSVv2 discriminator value is below 0.6.
Several sources of systematic effects may impact the measurement of the b tagging efficiency. Some of these effects are related to the data-taking conditions or to the uncertainty in the object reconstruction, affecting the selection of events and reconstruction of the tt topology. Others are related to the modelling of the tt production and decay. In particular, the following sources of systematic effects have been taken into account:
• Factorization and renormalization scales: the uncertainty due to the factorization and renormalization scales is assessed as described in section 8.3.2.
• Top quark mass: the uncertainty in the top quark mass is propagated to the data-to-simulation scale factor measurement as described in section 8.3.2.
• Background: the non-tt background template is derived using events for which the jet with the highest CSVv2 discriminator value is below 0.6. The systematic effect due to this requirement is evaluated by varying its value to less than 0.3, or to values between 0.4 and 0.7. Although these alternative selections result in a different relative fraction of b and non-b jets, as well as in a different background composition, the overall template shape and the fitted value for the number of correctly reconstructed tt events is stable.
• Gluon splitting: the uncertainty in the gluon splitting into a heavy quark pair is estimated by reweighting events with at least one additional heavy quark that is not originating from the tt decay. Events with an additional c and b quark are reweighted by ±15% [75] and ±25% [71], respectively. As can be seen in figure 46 (right) the effect is relatively small.
• b quark fragmentation: the uncertainty in the b quark fragmentation function is estimated by varying the Bowler-Lund parameterization within the tune uncertainties. In particular, the parameter StringZ:rFactB is varied by +0.184 and −0.197 to obtain alternative distributions for the ratio of the b hadron p T to the jet p T . The tt simulation is then reweighted using these functions and the impact on the measured data-to-simulation scale factor is taken as the size of the systematic effect.
• Branching fraction of B → X: the systematic uncertainty induced by the values of the branching fractions of the semileptonic decay of b hadrons may affect the b jet energy response. It is evaluated by reweighting the fractions to the values in ref. [35]. In particular, the branching fraction to leptons is varied by 2.7% for B 0 , by 8% for B s , by 2.5% for B + , and by 21% for Λ B . As can be seen in figures 46 (right) and 47 (right), the impact of this variation, labelled "b hadron decay", is negligible compared to the other systematic effects.
• Jet energy scale: the impact of the uncertainty in the jet energy scale and its propagation to p miss T is assessed as described in section 8.3.2.
• Jet energy resolution: the uncertainty in the jet energy resolution is propagated to the data-to-simulation scale factor measurement as described in section 8.3.2.
• p miss T : the uncertainty in the lepton, photon, and unclustered energy is estimated by changing p miss T within its uncertainty and repeating the measurement.
• Pileup: the uncertainty in the pileup modelling is assessed as described in section 8.2.
The TnP method is applied to derive data-to-simulation scale factors for the three working points of the CSVv2, DeepCSV, cMVAv2, and c taggers. An example of the size of the systematic uncertainties as a function of the jet p T is shown in figure 46 (right). In the same figure the scale factor SF b as a function of the jet p T is also shown for the medium working point of the CSVv2 algorithm. As discussed previously, SF b is derived separately for b jets from the hadronic or leptonic side of the single-lepton tt decay. As expected, both results are consistent over the full jet p T range. To reduce the overall uncertainty, the results are combined using the BLUE method, assuming fully correlated systematic uncertainties and uncorrelated statistical uncertainties. Figure 47 (left) shows the data-to-simulation scale factors for b jets for the medium working point of the c tagger as a function of the jet p T . Since the probability to tag non-b jets is higher for the c tagger than for the b taggers, the systematic uncertainties are larger. In addition, since the probability to tag b jets is smaller for the c tagger than for the b taggers, the statistical uncertainty also increases. This can be seen in figure 47 (right): while the statistical uncertainty in the measured scale factors still dominates, the systematic uncertainties are significantly larger compared to figure 46 (right). As a result, the total uncertainty for the scale factors for b jets is larger for the c tagger than for the b taggers.
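The combination of the hadronic-side and leptonic-side results can be illustrated with a minimal sketch of a two-measurement BLUE combination. It assumes a single systematic component per measurement that is fully correlated between the two, and uncorrelated statistical uncertainties; the function name and the closed-form two-measurement solution are illustrative choices, not the implementation used in the analysis.

```python
import math
from typing import Tuple


def blue_combine_two(x1: float, stat1: float, syst1: float,
                     x2: float, stat2: float, syst2: float) -> Tuple[float, float]:
    """Best linear unbiased estimate of two measurements of one quantity.

    Statistical uncertainties are taken as uncorrelated and systematic
    uncertainties as fully correlated (correlation coefficient of 1).
    Returns the combined value and its total uncertainty.
    """
    # Elements of the 2x2 covariance matrix.
    c11 = stat1 ** 2 + syst1 ** 2
    c22 = stat2 ** 2 + syst2 ** 2
    c12 = syst1 * syst2
    denom = c11 + c22 - 2.0 * c12
    if denom <= 0.0:
        raise ValueError("degenerate covariance matrix")
    # Weights minimizing the variance under the constraint w1 + w2 = 1.
    w1 = (c22 - c12) / denom
    w2 = 1.0 - w1
    combined = w1 * x1 + w2 * x2
    variance = (c11 * c22 - c12 ** 2) / denom
    return combined, math.sqrt(variance)


# Two equally precise measurements: the statistical part averages down,
# while the fully correlated systematic part does not.
value, uncertainty = blue_combine_two(0.98, 0.02, 0.01, 1.00, 0.02, 0.01)
```

In this example the combined variance is stat²/2 + syst², which shows why the combination mainly reduces the statistical component of the total uncertainty.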

Combination of the data-to-simulation scale factors from multijet and tt events
For the CSVv2 and DeepCSV taggers, the data-to-simulation scale factors measured with the muon-enriched multijet events are combined with the ones measured in tt events using the Kin and TnP methods. Since the c tagger and the cMVAv2 tagger rely on the information from muons from the b hadron decay, their scale factors are measured only with tt events, because the muon enrichment of the multijet sample may bias the scale factor measurement. Since the Kin method relies on dilepton tt events and the TnP method on single-lepton tt events, the two scale factor measurements are statistically independent. As for the combination of the scale factors on the muon-enriched sample, the correlations between the systematic uncertainties are taken into account when combining all measurements with the BLUE method. In particular, when combining the scale factors measured with the TnP and Kin methods, the systematic uncertainty associated with final-state radiation for the TnP method is assessed in the same way as done for the Kin method. Figure 48 shows the combination of tt measurements for the medium working point of the cMVAv2 tagger (right), and for the loose working point of the DeepCSV algorithm (left). For jets with a p T as found in tt events, the data-to-simulation scale factors vary from about 0.99 for the loose working point to 0.95 for the tight working point. The achieved relative precision on the scale factor for b jets is 1 to 1.5% using jets with 70 < p T < 100 GeV and rises to 3-5% at the highest considered jet p T . Overall, the statistical uncertainty is 15-30% of the total uncertainty.
In some physics analyses of precision measurements, a correlation is present between the quantity to measure and the method to derive the b tagging scale factors. An example is the measurement of the tt production cross section in an analysis requiring one or more b-tagged jets. In that case, the scale factors derived from tt events are correlated with the production cross section to be measured and the scale factor measured with muon-enriched multijet events should be used.

Measurement of the data-to-simulation scale factors as a function of the discriminator value
The last method to measure data-to-simulation scale factors is a technique of iterative fitting (IterativeFit) first described in ref. [76], which aims at correcting the full discriminator shape. This method is designed to meet the needs of analyses in which the full distribution of the b tagging discriminator values is used instead of applying a working point of the algorithm to select jets or events. If the full discriminator distribution is used, the distribution using jets in simulated events has to be corrected to match the one observed in data. Scale factors for both b and light-flavour jets are derived as a function of the discriminator value in bins of jet p T and η. An iterative procedure is used based on a tag-and-probe technique to measure the scale factors for both b and light-flavour jets simultaneously. The scale factors are derived from events with two oppositely charged leptons (electron or muon) within the tracker acceptance and satisfying the tight identification and isolation requirements [9,10]. The leading (subleading) lepton is required to have p T > 25 (15) GeV. Exactly two jets with p T > 20 GeV and within the tracker acceptance are required.
The data-to-simulation scale factors for b jets are derived from events passing the above requirements. In addition, for the events with same-flavour leptons, the dilepton invariant mass is required to be away from the Z boson mass, |M − M Z | > 10 GeV, and p miss T > 30 GeV. These two requirements reduce the contribution from Z + jets events. The tag jet should pass the medium working point of the algorithm for which the scale factor is to be measured. The other jet is used as probe. After these criteria have been applied, the simulated event sample is composed of 87% tt, 6% single top and 7% Z + jets events. Other backgrounds are reduced to a negligible level.
The data-to-simulation scale factors for light-flavour jets are measured with Z + jets events selected among the same-flavour dilepton events with a dilepton invariant mass close to that of the Z boson, |M − M Z | < 10 GeV, and inverting the requirement on p miss T . A b jet veto is applied on the tag jet using the loose working point of the tagger for which the scale factor is to be measured. After the event selection, the sample is very pure in Z + jets events (99.9%).
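The split of the dilepton sample into a b-enriched and a light-flavour-enriched region can be sketched as a small classifier. This is an illustrative sketch only: the value used for the Z boson mass is an assumed nominal value, the function name is hypothetical, it is assumed that the Z veto and the p miss T requirement apply only to same-flavour events, and the tag-jet requirements (medium working point tag or loose working point veto) are not included.

```python
Z_MASS_GEV = 91.1876  # assumed nominal Z boson mass, not quoted in the text


def classify_dilepton_event(same_flavour: bool, m_ll: float, met: float,
                            z_window: float = 10.0,
                            met_cut: float = 30.0) -> str:
    """Return 'b_enriched', 'light_enriched' or 'rejected'.

    m_ll is the dilepton invariant mass and met the missing transverse
    momentum, both in GeV.
    """
    near_z = abs(m_ll - Z_MASS_GEV) < z_window
    if not same_flavour:
        # Assumption: no Z veto or p_T^miss requirement for e-mu events.
        return "b_enriched"
    if not near_z and met > met_cut:
        # Away from the Z peak with genuine missing momentum: tt-like.
        return "b_enriched"
    if near_z and met < met_cut:
        # On the Z peak with the p_T^miss requirement inverted: Z + jets.
        return "light_enriched"
    return "rejected"
```

Same-flavour events that are on the Z peak with large p miss T, or off the peak with small p miss T, fall into neither region in this sketch.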
After the event selection and tagging or vetoing one of the two jets, the data-to-simulation scale factors are measured using the other jet in the event as the probe. The scale factors are extracted by first normalizing the b tagging discriminator distribution of the probe jets in simulation to that observed in data. Then, when measuring the scale factor for b jets, the contribution from non-b jets is subtracted using the simulated events. Similarly, when measuring the scale factor for light-flavour jets, the expected contributions from b and c jets are subtracted. The scale factor is determined separately in exclusive bins of the b tagging discriminator distribution, p T , and η (for light-flavour jets). Since the scale factors for light-flavour jets have an impact on the measured scale factors for b jets, an iterative procedure is performed. In the first iteration no scale factor is applied, while for the next iteration the background is subtracted using the scale factors obtained in the previous iteration. The iterative procedure stops once the scale factors obtained in the current iteration are stable with respect to those obtained in the previous iteration. Convergence is typically achieved after three iterations. When estimating the scale factor for b jets and light-flavour jets, the scale factor for c jets is set to unity with an uncertainty that is twice the uncertainty in the scale factor for b jets.
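The iterative procedure can be sketched on toy inputs. The sketch below makes several simplifying assumptions that are not taken from the analysis: a single p T and η bin, simulation that is already normalized to data, a scale factor for c jets fixed to unity, and invented toy yields. All names are hypothetical.

```python
from typing import List, Sequence, Tuple


def iterative_fit(data_hf: Sequence[float], b_hf: Sequence[float],
                  c_hf: Sequence[float], l_hf: Sequence[float],
                  data_lf: Sequence[float], b_lf: Sequence[float],
                  c_lf: Sequence[float], l_lf: Sequence[float],
                  tol: float = 1e-6,
                  max_iter: int = 50) -> Tuple[List[float], List[float], int]:
    """Derive per-bin scale factors for b and light-flavour jets.

    *_hf: yields per discriminator bin in the b-enriched sample,
    *_lf: yields per discriminator bin in the light-flavour-enriched sample.
    """
    n_bins = len(data_hf)
    sf_b = [1.0] * n_bins  # first iteration: no scale factor applied
    sf_l = [1.0] * n_bins
    for iteration in range(1, max_iter + 1):
        # Subtract non-b jets using the light-flavour SF of the previous pass.
        new_sf_b = [(data_hf[i] - c_hf[i] - sf_l[i] * l_hf[i]) / b_hf[i]
                    for i in range(n_bins)]
        # Subtract b and c jets using the b SF of the previous pass.
        new_sf_l = [(data_lf[i] - c_lf[i] - sf_b[i] * b_lf[i]) / l_lf[i]
                    for i in range(n_bins)]
        change = max(abs(new - old) for new, old
                     in zip(new_sf_b + new_sf_l, sf_b + sf_l))
        sf_b, sf_l = new_sf_b, new_sf_l
        if change < tol:
            return sf_b, sf_l, iteration
    raise RuntimeError("scale factors did not stabilize")


# Toy closure test: build pseudo-data from known scale factors.
true_sf_b = [1.10, 1.00, 0.95]
true_sf_l = [0.98, 1.05, 1.20]
b_hf, c_hf, l_hf = [100.0, 200.0, 700.0], [10.0, 10.0, 5.0], [50.0, 20.0, 5.0]
b_lf, c_lf, l_lf = [1.0, 2.0, 5.0], [10.0, 5.0, 2.0], [900.0, 80.0, 10.0]
data_hf = [sb * b + c + sl * l for sb, sl, b, c, l
           in zip(true_sf_b, true_sf_l, b_hf, c_hf, l_hf)]
data_lf = [sb * b + c + sl * l for sb, sl, b, c, l
           in zip(true_sf_b, true_sf_l, b_lf, c_lf, l_lf)]
sf_b, sf_l, n_iter = iterative_fit(data_hf, b_hf, c_hf, l_hf,
                                   data_lf, b_lf, c_lf, l_lf)
```

The procedure stabilizes quickly when the contamination of each sample by the other flavour is small, since the error from the previous iteration is suppressed by the product of the two contamination fractions.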
For the IterativeFit method, the following list of systematic uncertainties is considered. This list covers possible shape discrepancies between data and simulation for the tagger discriminator distribution.
• Sample purity: several systematic uncertainties impact the sample purity. These need to be taken into account when measuring the data-to-simulation scale factor for light-flavour or b jets. The sample purity may be affected by background processes or the modelling of the signal in the simulation, e.g. related to the production of additional jets in association with the top quark pair when measuring the scale factor for b jets. All sources of systematic uncertainties influencing the sample purity are combined in a single systematic uncertainty. For the scale factor for light-flavour jets, the expected contribution from processes other than Z + jets is negligible. However, the sample purity can be contaminated by heavy-flavour jets produced in association with the Z boson. The fraction of heavy-flavour jets in the sample is conservatively varied upwards and downwards by 20% when calculating SF l . For SF b , the dominant contribution originates from tt events. The dilepton tt events are selected requiring exactly two jets, consistent with the two b jets expected from the tt decay. However, because of ISR and FSR and the acceptance of the event selection, non-b jets are also selected. The rate of tt events produced with ≥ 2 additional partons varies by up to 20%. Therefore, the fraction of non-b jets is varied by this amount to evaluate the uncertainty in the purity of the sample.
• Jet energy scale: the uncertainty in the jet energy scale is assessed in the same way as described in section 8.3.1.
• Statistical uncertainty: an uncertainty arises due to the limited number of entries in each bin of the discriminator distribution, resulting in statistical fluctuations in certain regions, e.g. at high discriminator values for light-flavour jets and at low discriminator values for b jets. Two types of fluctuations are considered, described by linear and quadratic functions f i (x), where x corresponds to the central value of a discriminator bin. The linear function parameterizes the effect of statistical fluctuations that would tilt the discriminator distribution. In contrast, the quadratic function represents fluctuations that would increase or decrease the data-to-simulation scale factor in the centre of the discriminator distribution compared to the low and high discriminator values. To assess the size of the systematic uncertainty related to statistical fluctuations, the scale factor value is varied according to ±σ(x) f i (x), where σ(x) is the statistical uncertainty in the scale factor in that bin. The scale factors are refitted after applying these variations, resulting in two independent functions that span an envelope around the nominal scale factor function for each of the two types of statistical fluctuations.
• Treatment of SF c : for c jets the data-to-simulation scale factor, SF c , is set to unity. The uncertainty in this value is obtained by doubling the aforementioned relative uncertainties in the scale factor for b jets and adding them in quadrature to obtain a relative uncertainty in SF c . Similarly as for the statistical uncertainty, two separate uncertainties are constructed using linear and quadratic functions f i (x). The scale factor value is then varied according to ±σ(x) f i (x), where σ(x) is the relative uncertainty in SF c . These linear and quadratic variations of SF c are applied independently from the other uncertainties after which the scale factors are refitted to obtain the functions corresponding to the uncertainty in SF c .
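The ±σ(x) f i (x) variations used for the statistical and SF c uncertainties can be sketched as follows. The explicit forms of the linear and quadratic functions are assumptions chosen for illustration (a tilt that vanishes at the centre, and a curvature term with opposite sign at the centre and at the edges, for a discriminator ranging from 0 to 1); they are not given in the text above. The subsequent refit of the scale factor function is not included.

```python
from typing import Dict, List, Sequence


def linear_shape(x: float) -> float:
    """Assumed linear function: tilts the distribution around x = 0.5."""
    return 1.0 - 2.0 * x


def quadratic_shape(x: float) -> float:
    """Assumed quadratic function: centre moves against the two edges."""
    return 1.0 - 6.0 * x * (1.0 - x)


def shape_variations(bin_centres: Sequence[float],
                     scale_factors: Sequence[float],
                     sigmas: Sequence[float]) -> Dict[str, List[float]]:
    """Return the scale factors shifted by +-sigma(x) * f_i(x) in each bin."""
    variations: Dict[str, List[float]] = {}
    for name, shape in (("linear", linear_shape),
                        ("quadratic", quadratic_shape)):
        for label, sign in (("up", 1.0), ("down", -1.0)):
            variations[f"{name}_{label}"] = [
                sf + sign * sigma * shape(x)
                for x, sf, sigma in zip(bin_centres, scale_factors, sigmas)
            ]
    return variations


varied = shape_variations([0.0, 0.5, 1.0], [1.0, 1.0, 1.0], [0.1, 0.1, 0.1])
```

With these assumed shapes the linear variation leaves the central bin unchanged while moving the two edges in opposite directions, and the quadratic variation moves the central bin in the opposite direction to both edges.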

Figures 50 and 51 show an example of the distribution for the CSVv2 tagger and the derived data-to-simulation scale factors using jets with 40 < p T < 60 GeV in a topology enriched in b jets and a topology enriched in light-flavour jets, respectively. The scale factors are parameterized as a function of the CSVv2 discriminator value. The scale factor for light-flavour jets as a function of the discriminator value is fitted with a sixth-order polynomial function. For the scale factor for b jets, no satisfactory parameterization was found. Therefore, a smooth function is obtained by interpolating between the scale factors measured in bins of the CSVv2 discriminator distribution. No interpolation is done between the bin below 0, which includes jets with a negative CSVv2 discriminator value, and the first bin above 0.
The data-to-simulation scale factors obtained with the IterativeFit method have been validated in various control regions. One example is the validation in a control region dominated by single-lepton tt events. The flavour composition in this control region is very different from both the dilepton tt and Z + jets topologies used to derive the scale factors, thereby providing a powerful cross check. Events are selected requiring an isolated electron or muon with p T > 30 GeV and |η| < 2.1 and exactly four jets with p T > 30 GeV, of which exactly two are b tagged according to the medium working point of the CSVv2 algorithm. The distribution of the CSVv2 discriminator values is shown in figure 52 for all the jets in the control region. The agreement between the data and simulation improves significantly after applying the measured scale factors, and the remaining fluctuations are covered by the systematic uncertainties.

Figure 51. Distribution of the CSVv2 discriminator values for jets with 40 < p T < 60 GeV and 0.8 < |η| < 1.6 before the data-to-simulation scale factors are applied in the Z + jets sample (left). The simulation is normalized to the number of entries for data. Measured scale factors for light-flavour jets as a function of the CSVv2 discriminator value (right). The line represents a polynomial fit to the scale factors measured in each bin of the CSVv2 discriminator distribution. The bin below 0 contains the jets with a default discriminator value.

Comparison of the measured data-to-simulation scale factors
In most cases, the measured data-to-simulation scale factors for heavy-(light-) flavour jets are smaller (larger) than unity. This is expected because the quantities of relevance for heavy-flavour jet identification are not perfectly modelled by the simulation. The scale factors derived with the various methods are compared to each other after averaging the measured scale factor following the p T spectrum for tt events. Figure 53 compares the measured scale factors. However, this figure should not be used to decide which method performs best, since, e.g. for the TagCount method the scale factors were remeasured inclusively over the jet p T range, resulting in a smaller uncertainty than when the weighted average is used over the measurements in bins of jet p T . This is because for the measurement as a function of the jet p T the two tagged jets are required to be in the same jet p T bin, resulting in a loss of events compared to the inclusive measurement. Moreover, to allow a comparison, the scale factors for the IterativeFit method are remeasured using only one bin above the discriminator value corresponding to the working point for which the scale factor is derived. As can be seen from figure 53, the measured scale factors are consistent within their uncertainties. Only for the tight working point of the CSVv2 and DeepCSV taggers there is a hint of tension between the TagCount method and the other methods. This is explained by the fact that the central value of the TagCount method is quite sensitive to the background subtraction and the sample purity. The scale factor for b jets for the cMVAv2 and c tagger working points is not measured with muon-enriched multijet events to avoid a bias due to the muon information used in these taggers. 
The right panels in figure 53 show that the precision on the scale factors for c jets for the loose and medium working points of the b taggers is on the same level as the precision reached on the scale factors for b jets, for jets with a p T distribution as expected in tt events. For the tight working point of the b taggers, the uncertainty in the average scale factor is relatively large because of the low number of c jets passing the tagging requirement. Similarly, as can be seen from the lower left panel in figure 53, the uncertainty in the average scale factor for b jets for the c tagger working points is larger compared to the corresponding uncertainty for the working points of the b taggers, for two reasons. First, the uncertainty for the c tagger tight working point is large because of the low efficiency for b jets to pass this tagging requirement (section 5.2.2), resulting in a relatively large statistical uncertainty. Second, the uncertainty for the c tagger loose and medium working points is large due to the larger contribution from light-flavour jets resulting in a larger systematic uncertainty. It was also checked that the scale factor for light-flavour jets obtained with the IterativeFit method is consistent with the one obtained using the negative tag method.
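The averaging of the per-bin scale factors over the jet p T spectrum of tt events, used for the comparison of the methods, can be sketched as a weighted mean. The function name is hypothetical, and the weights are assumed to be the fractions (or yields) of jets from tt events in each p T bin; the propagation of uncertainties is not included.

```python
from typing import Sequence


def spectrum_weighted_average(scale_factors: Sequence[float],
                              jet_fractions: Sequence[float]) -> float:
    """Average per-bin scale factors, weighted by the jet p_T spectrum."""
    if len(scale_factors) != len(jet_fractions):
        raise ValueError("one weight is needed per p_T bin")
    total = sum(jet_fractions)
    if total <= 0.0:
        raise ValueError("weights must have a positive sum")
    return sum(sf * w for sf, w in zip(scale_factors, jet_fractions)) / total


# Toy example: three quarters of the jets populate the first p_T bin.
average_sf = spectrum_weighted_average([0.98, 1.00], [3.0, 1.0])
```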

Measurement of the tagging efficiency for boosted topologies
In section 6, the performance of b tagging algorithms in boosted topologies was discussed and the double-b tagger was presented to identify boosted particles decaying to two b quarks. This section summarizes the efficiency measurements for b tagging in boosted topologies. In section 9.1 the data are compared to the simulation for two topologies: a sample of muon-enriched subjets of AK8 jets and a sample of double-muon-tagged AK8 jets. Section 9.2 discusses the methods to measure the efficiency for b tagging subjets with the CSVv2 tagger. The efficiency measurement of the double-b tagger is presented in section 9.3. In both cases, the data-to-simulation scale factors are measured as a function of the jet p T . At this stage, the size of the jet sample is not yet large enough to provide also scale factors as a function of the jet |η|.

Comparison of data with simulation
The data are compared to the simulation using jets in boosted topologies. Jets are selected from events satisfying the following description:
• Muon-enriched boosted subjets sample: a sample of muon-enriched multijet events is obtained using a combination of single-jet (AK4 and AK8) triggers requiring a muon in the jet. The data are compared to the simulation for soft-drop subjets (section 6) of AK8 jets with p T > 350 GeV and within the tracker acceptance. The subjets are required to contain at least one muon with p T > 7 GeV and ∆R < 0.4. In addition, to reduce the contribution from prompt muons, the ratio of the p T of the muon to that of the jet is required to be smaller than 0.5. The subjet p T distribution in simulation is reweighted to match the observed distribution.
• Double-muon-tagged boosted jet sample: a second sample of muon-enriched multijet events is obtained by combining the triggers used to select the previous sample with dijet triggers with a lower jet p T threshold, and by requiring a muon in each of the two jets. In this way, the sample contains also AK8 jets with 250 < p T < 350 GeV. Each subjet is required to contain a muon with p T > 7 GeV and ∆R < 0.4. The sum of the p T of the two muons with respect to the p T of the AK8 jet is required to be less than 0.6. Some of the triggers are prescaled. The p T distribution of the AK8 jet in the simulation is reweighted to match the observed distribution in data.
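The muon requirements for the muon-enriched subjet sample can be sketched as a simple predicate. Several points are assumptions made for illustration: ∆R is taken between the muon and the subjet axis, the p T ratio is computed with respect to the subjet, and the requirements on the AK8 jet (p T threshold and tracker acceptance) are not checked. The function names are hypothetical.

```python
import math


def delta_phi(phi1: float, phi2: float) -> float:
    """Azimuthal difference wrapped into [-pi, pi)."""
    return (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi


def delta_r(eta1: float, phi1: float, eta2: float, phi2: float) -> float:
    """Angular distance in the (eta, phi) plane."""
    return math.hypot(eta1 - eta2, delta_phi(phi1, phi2))


def passes_muon_subjet_selection(subjet_pt: float, subjet_eta: float,
                                 subjet_phi: float, muon_pt: float,
                                 muon_eta: float, muon_phi: float) -> bool:
    """Muon with p_T > 7 GeV, Delta R < 0.4 and p_T ratio below 0.5."""
    close_to_axis = delta_r(subjet_eta, subjet_phi, muon_eta, muon_phi) < 0.4
    # The ratio requirement suppresses prompt muons that dominate the jet.
    soft_enough = muon_pt / subjet_pt < 0.5
    return muon_pt > 7.0 and close_to_axis and soft_enough
```

The wrapping of the azimuthal difference matters for muons and subjets on opposite sides of the φ = ±π boundary, which would otherwise be assigned a spuriously large ∆R.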
In figure 54 the data are compared to the simulation for subjets in the muon-enriched sample. The distributions of a few selected input variables are shown as well as the CSVv2 discriminator output distribution. The agreement is reasonable, with variations of up to 20%. Similarly, figure 55 shows the simulation and data for double-muon-tagged AK8 jets. Some of the input variables of the double-b tagger are shown as well as the discriminator output distribution itself.

Misidentification probability
The CSVv2 algorithm is used when applying b jet identification on subjets of AK8 jets. Data-to-simulation scale factors for light-flavour subjets from AK8 jets are derived with the negative-tag method used to measure the scale factors for light-flavour jets in section 8.2. A sample of inclusive multijet events is selected using single-jet triggers with different p T thresholds ranging from 140 to 500 GeV. The AK8 jet is required to have an offline reconstructed soft-drop jet mass between 50 and 200 GeV, where the jet mass is obtained from the invariant mass of the two subjets. The scale factors are measured for the loose and medium working points of the CSVv2 tagger using subjets with p T > 20 GeV within the tracker acceptance. The same sources of systematic effects are taken into account as for the scale factor measurement for AK4 light-flavour jets.
The measured data-to-simulation scale factors are shown in figure 56 for the loose and medium working points of the CSVv2 algorithm as a function of the subjet p T. The measurement is compared to the corresponding AK4 jet scale factors, and within the uncertainty both scale factors agree for jets with p T > 200 GeV. The difference at low jet p T arises from the very different environment for low-p T subjets in a boosted AK8 jet compared to low-p T AK4 jets.

Measurement of the b tagging efficiency
The data-to-simulation scale factors for subjets originating from b quarks are measured on subjets of AK8 jets using the selection requirements described in section 9.1. The LifeTime (LT) method presented in section 8.4.1 is applied to measure the scale factors for the loose and medium working points of the CSVv2 algorithm. The templates of the JP distribution for the various flavours obtained from simulation are fitted to the distribution observed in the data before and after applying the tagging requirement. An example of the fitted JP distribution for subjets with 240 < p T < 450 GeV is shown in figure 57 for all subjets and for subjets passing the medium working point of the CSVv2 algorithm. The systematic uncertainties associated with the scale factor measurements are the same as evaluated for AK4 jets discussed in section 8. […] considered here, the calibration of the track probabilities is derived from simulation and applied to both data and simulation. The systematic effect is evaluated from the difference between the nominal scale factor and that obtained by applying to the data the calibration of the track probabilities derived from the data. The uncertainty due to jets without a JP discriminator value is found to be negligible because of the higher jet p T.

Figure 56. Data-to-simulation scale factors for light-flavour subjets of AK8 jets as a function of the subjet p T, as well as for AK4 jets as a function of jet p T, for the loose (left) and medium (right) working points of the CSVv2 algorithm. The solid curve is the result of a fit to the scale factors, and the dashed lines represent the overall statistical and systematic uncertainty of the measurements.

The measured data-to-simulation scale factors for the loose and medium working points of the CSVv2 tagger are presented as a function of the subjet p T in figure 58. As a comparison, the scale factors for AK4 jets obtained with the LT method are also shown. The scale factors for AK4 jets and subjets are consistent within their uncertainties.

Figure 58. Data-to-simulation scale factors for b subjets of AK8 jets as a function of the subjet p T, as well as for AK4 jets as a function of jet p T, for the loose (left) and medium (right) working points of the CSVv2 algorithm. The hatched band around the scale factors represents the overall statistical and systematic uncertainty of the measurements.
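As a rough illustration of how a scale factor follows from template fits performed before and after the tagging requirement, the sketch below computes a b tagging efficiency from fitted b fractions and divides it by the efficiency in simulation. The function names and the input numbers are hypothetical, and any correction for jets without a JP value is omitted, in line with the statement that its effect is negligible here.

```python
def b_tag_efficiency(n_all, f_b_all, n_tag, f_b_tag):
    """Efficiency = (b jets after tagging) / (b jets before tagging).

    n_all, n_tag     : number of jets before and after the tagging requirement
    f_b_all, f_b_tag : b jet fractions obtained from the two template fits
    """
    return (f_b_tag * n_tag) / (f_b_all * n_all)


def scale_factor(eff_data, eff_sim):
    """Data-to-simulation scale factor for one pT bin and working point."""
    return eff_data / eff_sim


# Hypothetical inputs, chosen only to show the arithmetic.
eff_data = b_tag_efficiency(n_all=1000, f_b_all=0.40, n_tag=300, f_b_tag=0.90)
sf = scale_factor(eff_data, eff_sim=0.70)
```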

Measurement of the double-b tagging efficiency
To measure the efficiency of the four working points of the double-b tagger defined in section 6.2, a pure sample of boosted bb jets needs to be selected from data. The measurement is performed using a sample of high-p T jets enriched with g → bb jets. The enrichment is achieved by requiring each AK8 jet to be double-muon tagged, as described in section 9.1. While additional systematic uncertainties arise from using bb jets from gluon splitting, the statistical uncertainty of a measurement performed on boosted H → bb jets would be too large. Z → bb events cannot easily be used either, because of the difficulty of obtaining a pure sample of those events. Using the simulation, it has been verified that the g → bb jets can be used as a proxy for the H → bb jets signal. Indeed, after the selection, the distributions of the double-b tagger discriminator values and its input variables were compared for simulated g → bb and H → bb jets. Since a different shape was observed for the discriminator distribution, the g → bb events were reweighted using the distribution of the z variable and the secondary vertex energy ratio, which are the variable distributions with the largest shape difference. The data-to-simulation scale factors were then computed using either the reweighted g → bb simulation or the original g → bb simulation. Both scale factors were found to be compatible, which confirms that the g → bb events allow for an unbiased measurement of the efficiency.
The efficiency and the corresponding data-to-simulation scale factor SF double-b are measured using data for the working points of the double-b tagger defined in section 6. The measurement is performed using the LT method, presented in section 8.4.1 and also used in section 9.2.2. The expected templates of the JP discriminator after the tagging requirement consist of two contributions, one arising from g → bb jets and one from jets not stemming from this process (background jets). These two templates are used to fit the fraction of each contribution to the JP discriminator in data. The fit is performed in three bins of jet p T for the loose, medium-1, and medium-2 working points, and in two bins of jet p T for the tight working point. An example of the fitted distributions is shown in figure 59 for AK8 jets with 350 < p T < 430 GeV before and after applying the loose working point of the double-b algorithm. The background jets are shown separately for b and g → cc jets and for c and light-flavour jets. However, the templates of these two components are merged for the tagged jet sample when performing the fit. The measurement is sensitive to the flavour composition of the background sample. The uncertainty due to the flavour composition is estimated by varying by ±50% the normalization of each flavour in the background templates. As a cross check, the potential systematic effect of merging all background jets in a single template is assessed by remeasuring the data-to-simulation scale factor using a separate template for each flavour in the fit. The template variation results in a systematic uncertainty of up to 2.3% in the measured scale factor. The uncertainty related to the track probability calibration for the resolution function used in the JP discriminator is evaluated as described in section 9.2.2, and results in an uncertainty of 2.9% in the measured scale factors.
The uncertainty in the number of pileup interactions results in an uncertainty of 1.3% in the scale factors. The following systematic uncertainties were found to be negligible: bin-by-bin correlations, jet energy corrections, the number of tracks, the branching fractions for c hadrons to muons, the b fragmentation function, the fragmentation rate of a c quark to various D mesons, and the $\mathrm{K^0_S}$ and Λ production fractions. The data-to-simulation scale factor SF double-b is presented in figure 60 for two working points of the double-b tagger. The measurement is performed using jets with p T > 250 GeV. Jets with p T > 840 GeV are included in the last bin. The scale factor is positioned at the average jet p T value of the jets populating that bin.
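The two-template fit of the JP discriminator described above can be illustrated with a minimal sketch. It assumes unit-normalized templates and uses an unweighted linear least-squares fit solved in closed form, which is a simplification swapped in for illustration and not the fit procedure of the measurement; the function name and the toy histograms are hypothetical.

```python
def fit_two_templates(data, signal_template, background_template):
    """Fit data ~ a * signal + c * background by unweighted least squares.

    The templates are assumed to be normalized to unit area, so a and c are
    the fitted yields of g -> bb jets and background jets, respectively.
    """
    s_s = sum(s * s for s in signal_template)
    b_b = sum(b * b for b in background_template)
    s_b = sum(s * b for s, b in zip(signal_template, background_template))
    s_d = sum(s * d for s, d in zip(signal_template, data))
    b_d = sum(b * d for b, d in zip(background_template, data))

    det = s_s * b_b - s_b * s_b
    if det == 0:
        raise ValueError("templates are degenerate; the fit is not defined")

    # Solve the 2x2 normal equations with Cramer's rule.
    a = (s_d * b_b - s_b * b_d) / det
    c = (s_s * b_d - s_b * s_d) / det
    return a, c


# Toy JP histograms in three bins (hypothetical numbers).
signal = [0.1, 0.2, 0.7]
background = [0.6, 0.3, 0.1]
observed = [450.0, 270.0, 280.0]   # built as 300 * signal + 700 * background

n_signal, n_background = fit_two_templates(observed, signal, background)
signal_fraction = n_signal / (n_signal + n_background)
```

Repeating the fit with the normalization of one background flavour changed by ±50% before the templates are merged would give the kind of flavour-composition variation described in the text.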

Measurement of the misidentification probability for top quarks
The probability to misidentify a boosted top quark jet corresponding to the decay t → bW → bqq for the four working points of the double-b tagger is estimated from the data. Semileptonic tt events are selected by requiring exactly one isolated muon with p T > 50 GeV and |η| < 2.1. The muon is used to define two hemispheres in the event. The leptonic hemisphere is defined as $|\phi_{\text{jet}} - \phi_{\mu}| < \tfrac{2}{3}\pi$, and the hadronic hemisphere is its complement. At least one AK4 jet is required in each hemisphere, with p T > 30 GeV and within the tracker acceptance. In addition, the AK4 jet in the leptonic hemisphere should pass the loose working point of the CSVv2 algorithm. At least one AK8 jet is required in the hadronic hemisphere with p T > 250 GeV, |η| < 2.4, and a pruned jet mass between 50 and 200 GeV. The N-subjettiness parameters $\tau_1$ and $\tau_2$ (section 6) should satisfy the condition $\tau_2/\tau_1 < 0.6$. If more than one such jet is present, the one with the highest p T is considered. The aforementioned selection is referred to as the "2-prong" selection.
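The hemisphere assignment and the AK8 jet requirements of the 2-prong selection can be written compactly. The sketch below is illustrative only: the function names are hypothetical, the azimuthal difference is wrapped into (−π, π] before the 2π/3 requirement is applied, and the mass window is taken as inclusive, which the text does not specify.

```python
import math


def delta_phi(phi1, phi2):
    """Azimuthal difference wrapped into the interval (-pi, pi]."""
    d = (phi1 - phi2) % (2.0 * math.pi)
    if d > math.pi:
        d -= 2.0 * math.pi
    return d


def hemisphere(phi_jet, phi_mu):
    """Leptonic if |phi_jet - phi_mu| < 2*pi/3, hadronic otherwise."""
    if abs(delta_phi(phi_jet, phi_mu)) < 2.0 * math.pi / 3.0:
        return "leptonic"
    return "hadronic"


def passes_two_prong(pt, abs_eta, pruned_mass, tau1, tau2):
    """AK8 jet requirements of the 2-prong selection (pT and mass in GeV)."""
    return (
        pt > 250.0
        and abs_eta < 2.4
        and 50.0 <= pruned_mass <= 200.0   # assumption: inclusive window
        and tau1 > 0.0                     # guard against division by zero
        and tau2 / tau1 < 0.6
    )
```

Among the AK8 jets in the hadronic hemisphere that pass `passes_two_prong`, the one with the highest p T would be kept.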
After the event selection, the simulated events are normalized to the yield observed in the data. Figure 61 shows the distribution of the double-b discriminator and the pruned jet mass for the selected 2-prong events. The purity of the sample is high and the AK8 jet mass distribution is consistent with the decay of the W boson to quarks.

Figure 61. Distribution of the double-b tagger discriminator (left) and pruned jet mass (right) for AK8 jets passing the 2-prong event selection as described in the text. The simulation is normalized to the observed number of events.

JINST 13 P05011
The probability to misidentify a boosted top quark jet in data is obtained as follows:

$$\varepsilon^{\text{data}} = \frac{N^{\text{data}}_{\text{bb-tagged}} - N^{\text{bkg,MC}}_{\text{bb-tagged}}}{N^{\text{data}} - N^{\text{bkg,MC}}}, \tag{9.1}$$

where $N^{\text{data}}_{\text{bb-tagged}}$ and $N^{\text{data}}$ are the number of events with a tagged AK8 jet in data and the total number of events in data, respectively. Similarly, $N^{\text{bkg,MC}}_{\text{bb-tagged}}$ and $N^{\text{bkg,MC}}$ are the simulated number of background events with a tagged AK8 jet and the number of simulated background events before applying the working point of the double-b tagger, respectively. The data-to-simulation scale factors are measured both inclusively and in bins of the AK8 jet p T. The main systematic effect arises from the normalization of the background processes. An uncertainty of 30% is assigned to the cross section of each background contribution. An additional systematic uncertainty is related to the reweighting of the top quark p T spectrum. The shape of the p T distribution for top quarks in data is observed to be softer than in the simulation [77,78]. For the nominal scale factor measurements, a reweighting procedure is applied to correct for the observed difference. To assess the size of any systematic effect due to the reweighting, the uncertainty is obtained as the difference between the nominal scale factor values and the scale factors obtained when repeating the measurement without applying the reweighting procedure. The systematic uncertainty is found to be 1-2%.
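The background-subtracted misidentification probability and the effect of the 30% background normalization uncertainty can be sketched as follows. The function names and event counts are hypothetical, and varying one background contribution at a time, with its tagged and total yields scaled together, is an assumption about how the cross section uncertainty is propagated.

```python
def mistag_probability(n_data_tagged, n_data, bkg_tagged, bkg_total):
    """(tagged data - tagged background) / (all data - all background).

    bkg_tagged and bkg_total hold one simulated yield per background process.
    """
    return (n_data_tagged - sum(bkg_tagged)) / (n_data - sum(bkg_total))


def background_normalization_shifts(n_data_tagged, n_data,
                                    bkg_tagged, bkg_total, rel_unc=0.30):
    """Shift of the probability when each background is scaled by 1 +/- rel_unc."""
    nominal = mistag_probability(n_data_tagged, n_data, bkg_tagged, bkg_total)
    shifts = []
    for i in range(len(bkg_total)):
        for sign in (+1.0, -1.0):
            tagged = list(bkg_tagged)
            total = list(bkg_total)
            # A cross section change scales tagged and total yields alike.
            tagged[i] *= 1.0 + sign * rel_unc
            total[i] *= 1.0 + sign * rel_unc
            varied = mistag_probability(n_data_tagged, n_data, tagged, total)
            shifts.append(varied - nominal)
    return shifts


# Hypothetical yields with a single background process.
nominal = mistag_probability(120.0, 1100.0, [20.0], [100.0])
shifts = background_normalization_shifts(120.0, 1100.0, [20.0], [100.0])
```

Dividing the probability obtained in data by the corresponding tagging probability of simulated top quark jets would give the data-to-simulation scale factor.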
The data-to-simulation scale factors for the misidentification of boosted top quark jets for two of the working points of the double-b tagger are shown as a function of the jet p T in figure 62. The scale factors are positioned at the average jet p T value of the jets populating that bin.

Summary
A variety of discriminating variables and algorithms used by the CMS experiment for the identification of heavy-flavour (charm and bottom) jets in proton-proton collisions at 13 TeV have been reviewed. Detailed simulation studies have allowed the reoptimization of existing b tagging algorithms and, in addition, new algorithms have been developed for the first time to identify c jets, as well as bb jets in events with boosted topologies. The performance of these heavy-flavour jet identification algorithms has been studied with simulations of different final states with heavy- and light-flavour quarks. The efficiency to correctly identify b jets in resolved tt events is 68% at a misidentification probability for light-flavour jets of 1%, which is an improvement of 15% in relative efficiency compared to the best performing algorithm used during LHC Run 1.
The variables and discriminators have also been compared to the data collected by the CMS experiment in 2016 for various event topologies enriched in heavy- or light-flavour jets. Various methods have been presented to determine the data-to-simulation scale factors for the heavy-flavour jet identification efficiency, as well as for the probability to misidentify light-flavour jets. A precision of a few per cent is obtained in the tagging efficiency for b jets with 30 < p T < 300 GeV. For b jets with p T > 500 GeV, the precision is of the order of 5%. For scale factors measured in boosted topologies and for c jets in resolved topologies, the total uncertainty is 5-10%, and the statistical uncertainty in the tagging efficiency dominates over the full jet p T range.
With the increasing integrated luminosity delivered by the LHC, the precision of the data-to-simulation scale factors for the specified topologies, jet flavours, and p T ranges will increase further. Differential studies of the heavy-flavour identification performance as a function of jet pseudorapidity, and of the number of multiple proton-proton interactions in the same bunch crossing, will also become viable.

A Parameterization of the efficiency
To facilitate phenomenological studies relying on b jet identification, we provide the b jet identification efficiency as a function of the jet p T for the three operating points of the DeepCSV algorithm.

The efficiency is obtained using jets with p T > 20 GeV in a simulated tt sample and is multiplied by the data-to-simulation scale factor to obtain the tagging efficiency expected in data. This efficiency is shown in figure 63 for the three jet flavours. Polynomial functions are used to fit the dependence of the efficiency on the jet p T for jets with 20 < p T < 1000 GeV. It is worth noting that the parameterization of the fitted functions is not reliable outside this jet p T range. The parameterizations are summarized in table 6.

Figure 63. Efficiency for b tagging jets for the three different working points of the DeepCSV algorithm multiplied by the measured data-to-simulation scale factor. The efficiencies are shown as a function of the jet p T using jets with p T > 20 GeV in tt events for b jets (upper), c jets (middle), and light-flavour jets (lower). The solid lines represent the functions used to fit the dependence on the jet p T. The last bin includes the overflow.

Table 6. Polynomial functions used to fit the efficiency of the three working points of the DeepCSV algorithm for the three jet flavours as a function of the jet p T for jets with 20 < p T < 1000 GeV.
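For phenomenological use, a polynomial parameterization of this kind could be evaluated as sketched below. The function name is hypothetical and the coefficients are placeholders chosen only to make the example run; they are not the values of table 6. The range check reflects the statement that the fitted functions are not reliable outside 20 < p T < 1000 GeV.

```python
def tagging_efficiency(pt, coefficients, pt_min=20.0, pt_max=1000.0):
    """Evaluate a polynomial efficiency parameterization at a given jet pT.

    coefficients are ordered from the constant term upwards, i.e.
    eff(pT) = c0 + c1 * pT + c2 * pT**2 + ...
    """
    if not pt_min < pt < pt_max:
        raise ValueError("parameterization is not valid for pT = %g GeV" % pt)
    # Horner's scheme: evaluate from the highest-order coefficient down.
    result = 0.0
    for c in reversed(coefficients):
        result = result * pt + c
    return result


# Placeholder coefficients (NOT taken from table 6).
example_coefficients = [0.5, 1.0e-3, -1.0e-6]
eff_at_100_gev = tagging_efficiency(100.0, example_coefficients)
```

Raising an exception outside the fitted range, rather than extrapolating silently, is a design choice made for this sketch.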