Measurement of the track reconstruction efficiency at LHCb

The determination of track reconstruction efficiencies at LHCb using J/ψ→μ+μ- decays is presented. Efficiencies above 95% are found for the data taking periods in 2010, 2011, and 2012. The ratio of the track reconstruction efficiency of muons in data and simulation is compatible with unity and measured with an uncertainty of 0.8 % for data taking in 2010, and at a precision of 0.4 % for data taking in 2011 and 2012. For hadrons an additional 1.4 % uncertainty due to material interactions is assumed. This result is crucial for accurate cross section and branching fraction measurements in LHCb.


Introduction
The track reconstruction efficiency is an important quantity in many physics analyses, especially those that aim at measuring a production cross section or a branching fraction. The uncertainty on the track reconstruction efficiency was a source of large systematic uncertainties with early LHCb data [1]. The method presented in this paper has significantly reduced this uncertainty for recent measurements [2].
In physics analysis, the track reconstruction efficiency is usually estimated with simulated events. To take possible differences between simulation and data into account, a data-driven correction procedure is applied. A clean sample of J/ψ → µ+µ− decays is selected in data with a tag-and-probe approach. J/ψ → µ+µ− decays are ideal candidates for efficiency measurements as they are abundant, clean, and the decay products cover the momentum spectrum needed in most physics analyses in LHCb. The purity of the sample is enhanced by selecting J/ψ from b-hadron decays. The tag track is fully reconstructed and is well identified as a muon. The probe track is only partially reconstructed, without using information from at least one subdetector, which is the one being probed. The track reconstruction efficiency is determined by checking for the existence of a fully reconstructed track corresponding to the probe track, as this gives the efficiency of the subdetector that is not used in the reconstruction of the probe track. It is calculated as a function of the momentum of the probe track, p, its pseudorapidity, η, and the track multiplicity of the event, N_track. These are chosen because the efficiency is most affected by them. No strong dependence on the azimuthal angle φ is observed. The main result of this paper is the track reconstruction efficiency ratio between data and simulation for prompt tracks and tracks from B and D mesons. This ratio is used in physics analyses to correct the track reconstruction efficiency in simulated events and to determine its uncertainty. The measurement is performed on several data samples to meet the requirements of the analyses performed at LHCb.
In this paper, the results are presented for the three data samples from run I, corresponding to different running conditions, proton-proton (pp) centre-of-mass energies and integrated luminosities: data taken in 2010 at √s = 7 TeV corresponding to 29 pb⁻¹, data taken in 2011 at √s = 7 TeV corresponding to 1 fb⁻¹, and data taken in 2012 at √s = 8 TeV corresponding to 2 fb⁻¹. The 2010 results are valid for the full 2010 data set, corresponding to a luminosity of 37 pb⁻¹, since the same running conditions and track reconstruction were used throughout this period.

Detector and software description
The LHCb detector [3] is a single-arm forward spectrometer covering the pseudorapidity range 2 < η < 5, designed for the study of particles containing b or c quarks. The detector includes a high-precision tracking system consisting of a silicon-strip vertex detector, VELO [4], surrounding the pp interaction region; a large-area silicon-strip detector, TT [5], located upstream of a dipole magnet with a bending power of about 4 Tm; and three stations of silicon-strip detectors (Inner Tracker) [6] and straw drift tubes (Outer Tracker) [7] placed downstream of the magnet, called T stations. The tracking system provides a measurement of momentum, p, with a relative uncertainty that varies from 0.4% at low momentum to 0.6% at 100 GeV/c. The minimum distance of a track to a primary vertex (PV), the impact parameter (IP), is measured with a resolution of (15 + 29/p_T) µm, where p_T is the component of p transverse to the beam, in GeV/c. The polarity of the dipole magnet is reversed periodically throughout data taking. The configuration with the magnetic field vertically upwards (downwards) bends positively (negatively) charged particles in the horizontal plane towards the centre of the LHC. Different types of charged hadrons are distinguished using information from two ring-imaging Cherenkov detectors, RICH1 and RICH2. Photon, electron, and hadron candidates are identified by a calorimeter system consisting of scintillating-pad and preshower detectors, an electromagnetic calorimeter, and a hadronic calorimeter. Muons are identified by a system composed of alternating layers of iron and multiwire proportional chambers [8]. The trigger [9] consists of a hardware stage, based on information from the calorimeter and muon systems, followed by a software stage, which applies a full event reconstruction. In the simulation, pp collisions are generated using PYTHIA 6.4 [10] with a specific LHCb configuration [11].
Decays of hadronic particles are described by EVTGEN [12], in which final state radiation is generated using PHOTOS [13]. The interaction of the generated particles with the detector and its response are implemented using the GEANT4 toolkit [14,15] as described in ref. [16]. Hit inefficiencies, e.g. due to dead channels, are typically in the range 1-2% and are included in the simulation. Differences in the positioning of the sensors between data and simulation are at the level of 0.5 mm. Both effects have a negligible impact on the tracking efficiency. The simulated events used in this study are required to contain at least one J/ψ → µ + µ − decay.
Differences in the response of the detectors in simulation and data could potentially lead to a different behaviour of the track reconstruction. The hit efficiencies have been measured in data using tracks. For the different subdetectors, they range from 98% to 100%. Dead channels are included in the simulation, using an average over the data taking period. From simulations it is known that the (high) hit efficiency does not have any impact on the track reconstruction, as the algorithms have been written to be robust against small hit inefficiencies. The sizes of the sensitive detector elements are known very accurately and the positioning of the sensitive elements in the simulation is accurate at the level of 0.5 mm. Compared to the overall size of the tracking system, any inaccuracy at this level has negligible impact on the acceptance of the detector.

Track reconstruction at LHCb
Owing to the design of the LHCb detector, which consists of tracking detectors mainly outside the magnetic field, charged particle tracks are approximately straight line segments in the upstream part (VELO and TT) and in the downstream part (T stations). Figure 1 shows an overview of the different track types defined in the LHCb reconstruction: VELO tracks, which have hits in the VELO; upstream tracks, which have hits in the two upstream trackers; T tracks, which have hits in the T stations; downstream tracks, which have hits in TT and the T stations; and long tracks, which have hits in the VELO and the T stations. The latter tracks can additionally have hits in TT.
If a particle is reconstructed more than once, as different track types, only the track best suited for analysis purposes is kept. In this selection, long tracks are preferred over any other track type, upstream tracks are preferred over VELO tracks, and downstream tracks are preferred over T tracks. The number of unique tracks in an event, N_track, is used in this study as a measure for the event multiplicity; it is strongly correlated with the number of hits in the tracking detectors. The number of tracks is chosen over the number of hits in a tracker to give a balanced measure of the upstream and the downstream occupancy.
The reconstruction of long tracks starts with a search for VELO tracks [17,18]. VELO tracks are reconstructed exploiting the fact that tracks form straight lines due to the absence of a magnetic field in the VELO. Two algorithms promote these VELO tracks to long tracks. The first algorithm, called forward tracking [19], combines VELO tracks with hits in the three T stations. For a given VELO track and a single hit in one of the T stations the momentum is fixed, enabling the algorithm to project hits in the T stations along the trajectory. Hits which form clusters in the projection are used to define the final long track. In the second algorithm, called track matching [20,21], long tracks are made combining VELO tracks with T tracks, which are found by a standalone track finding algorithm [22].
If hits compatible with the long track trajectory are found in TT, they are added to the track to improve the momentum resolution and as discrimination against fake tracks. This procedure is identical for the forward tracking and the track matching.
Most analyses use long tracks because they provide the best momentum and spatial resolution among all track types. Unless otherwise stated, track reconstruction at LHCb refers to the reconstruction of long tracks. In a typical signal-triggered event in 2011 or 2012, around 60 long tracks are reconstructed. Other track types, such as downstream tracks [23], are used for the reconstruction of decay products of long-lived particles such as K_S^0 mesons, or for internal alignment of the tracking detectors. They are reconstructed from T tracks, which are propagated back through the magnetic field to find corresponding hits in the TT stations.
The efficiency to reconstruct charged particles as long tracks is determined in two approaches. The first approach measures the track reconstruction efficiency in the VELO and in the T stations individually and combines these efficiencies to a single measurement. The second approach determines the efficiency to reconstruct a long track directly.

Tag-and-probe methods
The tag-and-probe method uses two-prong decays, where one of the decay products, the "tag", is fully reconstructed as a long track, while the other particle, the "probe", is only partially reconstructed. The probe should carry enough momentum information that the invariant mass of the parent particle can be reconstructed with a sufficiently high resolution. The invariant mass of the two-prong decay allows for a discrimination against background. The track reconstruction efficiency for long tracks is then obtained by matching the partially reconstructed probe track to a long track. If a match is found, the probe track is defined as efficient. The three methods described below all use J/ψ → µ + µ − decays, as the daughter particles have information in the muon system which can be exploited in the reconstruction of the probe track. The approaches, however, use different combinations of tracking detectors for the partial reconstruction of the probe track.

VELO method
The track reconstruction efficiency in the VELO is measured using downstream tracks as probes, as illustrated in figure 2(a). A downstream probe track and the long track reconstructed from the same particle ideally share all hits in the T stations. Therefore, a probe track is considered to be found as a long track if there is a long track with at least 50% common hits in the T stations. In simulated events the fraction of 50% common hits is found to be an appropriate and stable matching criterion.

T-station method
The measurement of the track reconstruction efficiency in the T stations for particles that have VELO and muon segments is illustrated in figure 2(b). A dedicated algorithm reconstructs muons as straight tracks starting from hits in the last muon station, see for example refs. [24,25]. These are subsequently matched to VELO tracks.
A long track is considered to be matched to a probe track if two requirements are met. Firstly, the probe track and the long track have to be reconstructed from the same VELO seed. Secondly, at least two hits on the probe track in the muon stations have to be compatible with the extrapolation of the long track into the muon stations. It is found in simulated events that requiring two common hits in the muon stations is sufficient to ensure compatible trajectories of the long track and the VELO-muon probe track.
Table 1. Settings of the software trigger selection as a function of data taking period. Only the tag muon is required to pass the selection. For more information see refs. [9, 26-28].

Long method
The long method uses probe tracks that have hits in the TT and in the muon stations as illustrated in figure 2(c). This method measures the efficiency to reconstruct long tracks because the long-track-finding algorithms do not require the presence of TT hits. Therefore, the efficiency to find a long track is, to first order, independent of the efficiency to find such a TT-muon track. These TT-muon tracks are found by a dedicated reconstruction of tracks in the muon stations, which are subsequently matched to TT hits. A TT-muon track is considered to be reconstructed as a long track if more than 70% of the hits in the muon stations are compatible with the extrapolation of the long track into the muon stations. If the long track has TT hits, it needs to share at least 60% of the TT hits as well. These fractions have been optimised in simulation and the results are stable with respect to small differences in data and simulation.
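The hit-overlap criteria quoted for the VELO method and the long method can be sketched as follows. This is a minimal illustration, not the LHCb implementation: hits are assumed to be sets of integer channel identifiers, the function names are hypothetical, the shared fraction is assumed to be taken relative to the probe track's hits, and the number of muon hits compatible with the long-track extrapolation is assumed to be supplied by an earlier step.

```python
def shared_fraction(probe_hits, candidate_hits):
    """Fraction of the probe's hits that also appear on the candidate track."""
    probe_hits = set(probe_hits)
    if not probe_hits:
        return 0.0
    return len(probe_hits & set(candidate_hits)) / len(probe_hits)


def velo_method_match(probe_t_hits, long_t_hits, min_fraction=0.5):
    """VELO method: at least 50% common hits in the T stations."""
    return shared_fraction(probe_t_hits, long_t_hits) >= min_fraction


def long_method_match(n_muon_hits, n_muon_compatible, probe_tt_hits, long_tt_hits):
    """Long method: more than 70% of the muon hits compatible with the
    long-track extrapolation and, if the long track has TT hits,
    at least 60% shared TT hits."""
    if n_muon_hits == 0 or n_muon_compatible / n_muon_hits <= 0.7:
        return False
    if long_tt_hits:  # TT requirement applies only when the long track has TT hits
        return shared_fraction(probe_tt_hits, long_tt_hits) >= 0.6
    return True
```

A probe is then counted as efficient if any long track in the event satisfies the relevant predicate.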

Trigger and selection requirements
The candidate decays are first required to pass a hardware trigger, which selects muons in the muon system with a transverse momentum p_T > 1.48 GeV/c, or dimuons where the product of the two transverse momenta satisfies p_T1 × p_T2 > (1.296 GeV/c)². In 2012 these thresholds were raised to p_T > 1.76 GeV/c and p_T1 × p_T2 > (1.6 GeV/c)², respectively. The reconstruction of both muons in the hardware trigger does not bias the determination of the track reconstruction efficiency since it does not use information from the tracking system (VELO, TT, and T stations). The subsequent software stage reconstructs the tag muon in the entire tracking system and in the muon system. The tag muon is required to have high p_T, high p, large IP and χ²_IP with respect to all PVs in the event, where χ²_IP is defined as the difference in χ² of a given PV reconstructed with and without the considered track. Furthermore, a good χ² per degree of freedom (χ²/ndf) of the trigger track fit is required. Different selection criteria are used during data taking as listed in table 1 to fit different data taking conditions. The IP and χ²_IP requirements restrict the sample to J/ψ originating from b hadron decays. Only the tag muon is required to be reconstructed in the software trigger in order to avoid any bias on the track reconstruction efficiency caused by fully reconstructing the two-prong decay with two long tracks.
Table 2. Selection requirements on the tag and probe tracks and on the combination into a J/ψ candidate for the three different methods.
Columns: VELO method | T-station method | Long method. Tag: long track used in single muon trigger.
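The hardware-trigger thresholds quoted in this section can be written as a small predicate. The sketch below is illustrative only: the function name and the `raised_thresholds` flag (selecting the values quoted for 2012) are hypothetical, and using the two highest-p_T muon candidates for the dimuon condition is an assumption.

```python
def passes_hardware_trigger(muon_pts, raised_thresholds=False):
    """Return True if the single-muon or dimuon p_T condition is met.

    muon_pts: transverse momenta of muon candidates in GeV/c.
    raised_thresholds: use the thresholds quoted for 2012.
    """
    if raised_thresholds:
        single_thr, dimuon_thr = 1.76, 1.6
    else:
        single_thr, dimuon_thr = 1.48, 1.296

    pts = sorted(muon_pts, reverse=True)
    # Single-muon condition: one candidate above the p_T threshold.
    if pts and pts[0] > single_thr:
        return True
    # Dimuon condition: product of the two p_T values above threshold squared.
    return len(pts) >= 2 and pts[0] * pts[1] > dimuon_thr ** 2
```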
Further selection criteria are applied as listed in table 2: the χ²/ndf from the track fit of the tag tracks must be small to reduce the number of fake tracks. Tag tracks have to fulfil the standard muon selection, which requires hits in the muon stations in a search window around the track extrapolation as explained in ref. [29]. Both the tag and probe tracks have minimal p and p_T requirements to remove badly reconstructed tracks and combinatorial background. In order to remove contamination from hadrons, the particle identification system is used. The difference between the logarithms of the likelihoods of the tag to be a muon and to be a pion, DLL_µπ, is computed and only tag tracks with a high DLL_µπ are used. The range of the invariant mass of the µ+µ− combination, M_µµ, is chosen sufficiently large to estimate the background contribution from the mass sidebands. Finally, the χ²/ndf from the vertex fit of the tag and probe tracks has to be small, in order to remove combinatorial background; and the number of J/ψ decays per event (N_J/ψ) must be one, to simplify the association procedure described in the preceding subsections. Additionally, the T-station method only considers J/ψ candidates with a momentum greater than 7 GeV/c, and the long method only J/ψ candidates with an IP smaller than 0.8 mm, as both selections are effective in reducing background contamination without biasing the efficiency determination. After the full selection chain the sample amounts to about 6 000 decays for 2010 for the long and the T-station methods, while for the VELO method 12 000 decays are selected. The 2011 and 2012 data samples comprise more than 300 000 decays in total for all methods and data taking periods.

Mass resolution
To illustrate the mass resolutions that can be achieved, the dimuon invariant mass distributions from J/ψ candidates in the three methods are shown in figure 3 using the 2011 data sample.

Calculation of efficiency
The track reconstruction efficiency is calculated as the fraction of reconstructed J/ψ decays where the probe track can be matched to a long track. To estimate the number of J/ψ decays, an unbinned extended maximum likelihood fit is performed to the mass distributions. For the VELO and T-station methods the mass distributions are described by a single Gaussian function for the signal and an exponential function for the combinatorial background. This model is preferred over the aforementioned sum of two Gaussian functions to improve the fit stability when measuring the dependence of the track reconstruction efficiency on kinematic variables and other event parameters. For the long method, a Crystal Ball function [30] is used for the signal, to take the tail on the left-hand side of the mass peak into account. Since the number of decays in the 2010 data is relatively low, in this case a simple sideband subtraction is applied for the VELO and T-station methods. All shape parameters were allowed to vary in the fit for the denominator of the efficiency; for the numerator they were constrained to the values found in that fit. This procedure was performed to stabilise the fit, as no difference in the shape of the numerator and denominator could be observed. It has been checked that the choice of the model for the mass distribution has a negligible effect on the efficiency determination.
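The efficiency definition and the sideband subtraction can be illustrated with a short sketch. It is a simplified stand-in, not the fit procedure used in the paper: the function names are hypothetical, the background is assumed to be flat across the mass window and sidebands, and the uncertainty is a plain binomial error that neglects any background-related term.

```python
import math


def sideband_subtracted_yield(n_window, n_sidebands, window_width, sideband_width):
    """Signal yield in the mass window after subtracting the background
    estimated from the sidebands (flat background assumed)."""
    scale = window_width / sideband_width
    return n_window - scale * n_sidebands


def efficiency(n_matched, n_total):
    """Fraction of J/psi decays with a matched probe, with a simple
    binomial uncertainty."""
    eff = n_matched / n_total
    err = math.sqrt(eff * (1.0 - eff) / n_total)
    return eff, err


# Illustrative numbers only, not taken from the measurement.
n_total = sideband_subtracted_yield(1100, 200, window_width=100.0, sideband_width=200.0)
eff, err = efficiency(960, n_total)
```

In this example the denominator is 1000 signal decays and the efficiency is 0.96 with an uncertainty of about 0.006.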
The efficiencies obtained from the VELO and T-station methods are assumed to be uncorrelated, aside from effects due to dependencies on the track kinematics and the event multiplicity. The data sample is binned in kinematic variables and N track to combine the VELO and T-station efficiencies. The efficiencies obtained with the VELO and T-station methods can be multiplied in each bin to obtain the efficiency for finding long tracks. This combined efficiency can be compared with the efficiency found by the long method, giving two independent methods to probe the long track reconstruction efficiency.
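The per-bin combination described above amounts to a product of two efficiencies. The sketch below assumes uncorrelated uncertainties within a bin and propagates them to first order; the function name and the example values are hypothetical.

```python
import math


def combine_velo_and_t(eff_velo, err_velo, eff_t, err_t):
    """Long-track efficiency in one bin as the product of the VELO and
    T-station efficiencies, with relative errors added in quadrature."""
    eff = eff_velo * eff_t
    rel_err = math.hypot(err_velo / eff_velo, err_t / eff_t)
    return eff, eff * rel_err


# One (p, eta, N_track) bin with illustrative inputs.
eff_long, err_long = combine_velo_and_t(0.98, 0.002, 0.97, 0.003)
```

Applying the function bin by bin gives a combined efficiency map that can be compared with the one obtained from the long method.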
There are, however, small differences between these two approaches. The long method measures the efficiency for tracks that pass through TT. In the combined method, only the VELO method requires this. Furthermore, both the VELO method and T-station method include the efficiency that, given that both the VELO and the T-station segment tracks are reconstructed, the corresponding long track is found. Therefore, in the combined efficiency, this so-called matching efficiency is counted twice. All these effects can lead to small differences in the measured long-track efficiency. For this reason, the ratio between the efficiencies in data and simulation is used to compare the methods, as these uncertainties are common for simulated and real decays and cancel when the ratio of efficiencies is formed.
On simulated events the track reconstruction efficiency is commonly defined as the fraction of simulated charged particles with sufficient hits in the VELO and T stations that can be associated to a track that shares at least 70% of the hits in each participating subdetector with this particle. For all methods, this so-called hit-based efficiency in simulation agrees within 1% with the efficiency measured with the tag-and-probe methods. Furthermore the matching efficiency was determined to be very close to 100%. The very small matching inefficiency does not affect the agreement between the hit-based efficiency and the tag-and-probe based efficiency in simulation. By taking the ratio between the efficiencies on data and simulation, these discrepancies are reduced to a negligible level.

Efficiency dependencies
Using the momentum spectrum of the J/ψ decay products obtained with the VELO method from data as a benchmark, the average track reconstruction efficiency for long tracks is measured to be (95.4 ± 0.7)% for 2010 data, (97.78 ± 0.07)% for 2011 data and (96.99 ± 0.05)% for 2012 data. All results confirm the good performance of the LHCb tracking system. The uncertainties on these numbers are statistical only; they are binomial errors with additional terms to account for the statistical uncertainty on the number of background events. Systematic uncertainties are discussed in section 8. The difference in the efficiencies between the three years is a consequence of changes in the track reconstruction and the higher centre-of-mass energy, leading to a higher track multiplicity and hence lower reconstruction efficiency for the 2012 running period.
Table 3. Track reconstruction efficiencies in % for the individual running periods using the long method for positive and negative muons and different magnetic field polarities (statistical uncertainties only).

Dependencies of track reconstruction efficiency
The efficiency to reconstruct long tracks mainly depends on the particle kinematics and the number of charged particles in an event. As a parametrisation p, η and N_track are chosen, as the track reconstruction efficiency shows the largest dependence on these three observables. The simulated events are weighted according to the N_track distribution observed in data. The track reconstruction efficiencies for the combination of the VELO and T-station methods and for the long method are shown for the different data-taking periods in figures 4-6 as a function of p, η, N_track, and as a function of the number of reconstructed primary vertices, N_PV. The efficiency coming from the combination of the VELO and the T-station method is calculated by multiplying the individual efficiencies. Overall, a reasonable agreement is found between simulated and real data for all data-taking periods. As the agreement between the tag-and-probe based track reconstruction efficiency and the true track reconstruction efficiency (based on hit information) is within 1%, the results shown in figures 4-6 give an accurate description of the efficiency in simulation.

Efficiency ratios
The efficiency ratio is defined as the efficiency measured in data divided by the efficiency measured in simulation,
$$\text{ratio} = \frac{\varepsilon_{\text{data}}}{\varepsilon_{\text{sim}}}. \tag{7.1}$$
The efficiency dependence versus N_track and N_PV is reasonably well described in the simulation, see figures 4-6: when fitting a first-order polynomial to the efficiency distributions in simulation and real data, the slopes agree with each other within 2 standard deviations, except for the efficiency …
Figure 7 shows the efficiency ratio versus p for run I, weighted by the event track multiplicity observed in data; the data are split into two ranges of η. Overall a good agreement of the track finding efficiency is found between events in simulation and in data for all data taking periods and most momenta and pseudorapidity regions. The difference between the track finding efficiencies is generally smaller than 1% between events from simulation and data and no trend can be observed for the 2011 and 2012 dataset, with the number of events being too low to draw conclusions from the 2010 dataset. The agreement is worse for tracks with momentum below 10 GeV/c, which might point to a less accurate modelling of multiple scattering effects in the simulation.
The overall efficiency ratio and its uncertainty depend on the particle distribution of the data in terms of p and η. Using the momentum spectrum of the J/ψ decay products obtained with the VELO method from data, an average efficiency ratio of 0.994 ± 0.007 is found for 2010 data, 0.9983 ± 0.0009 for 2011 data and 1.0053 ± 0.0008 for 2012 data. The uncertainties represent the statistical uncertainties only. The ratio is close to one in all three cases as different features seen in the efficiency distributions in simulation and data average out when integrating over the full momentum spectrum or pseudorapidity range.
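How a per-bin ratio and a spectrum-weighted average of the kind described here might be computed is sketched below. The function names are hypothetical, the per-bin uncertainties are assumed to be uncorrelated, and the example inputs are illustrative rather than measured values.

```python
import math


def efficiency_ratio(eff_data, err_data, eff_sim, err_sim):
    """Ratio of data to simulation efficiency in one bin, with
    first-order propagation of uncorrelated uncertainties."""
    ratio = eff_data / eff_sim
    err = ratio * math.hypot(err_data / eff_data, err_sim / eff_sim)
    return ratio, err


def weighted_average(ratios, errors, weights):
    """Average of per-bin ratios, weighted by the fraction of tracks
    in each bin of the chosen momentum (or pseudorapidity) spectrum."""
    total = sum(weights)
    avg = sum(w * r for w, r in zip(weights, ratios)) / total
    err = math.sqrt(sum((w * e) ** 2 for w, e in zip(weights, errors))) / total
    return avg, err


# Three illustrative bins; weights play the role of the track spectrum.
avg, avg_err = weighted_average([0.99, 1.00, 1.01], [0.01, 0.01, 0.01], [1, 2, 1])
```

Because the weights come from the spectrum of the sample under study, the averaged ratio and its uncertainty change with the kinematic distribution of the tracks, which is why an analysis would apply the binned ratios to its own spectrum.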

Systematic uncertainties
Small differences in the ratio of efficiencies are seen when reweighting the simulated samples in different parameters such as the number of primary vertices, or the number of hits or tracks in the different subdetectors. The largest of these differences is taken as a systematic uncertainty and amounts to 0.4%. No systematic uncertainty is assigned for the agreement of the track reconstruction efficiency determined by the tag-and-probe method and the hit-based method (which is on the order of 1%), as the differences cancel when forming the efficiency ratio. Likewise, no systematic uncertainties are assigned for the fit model as these cancel when forming the fraction of reconstructed J/ψ decays where the probe can be matched to a long track. It has been checked that this is true for a range of fit models, the largest variation being 0.2%. Furthermore, no systematic uncertainty is assigned to the possible matching of a correctly reconstructed probe track to a fake long track, as the requirement for a large overlap in the subdetectors ensures that both reconstructed tracks are either real tracks or fake tracks, where the latter would not peak at the J/ψ mass. No systematic uncertainty is assigned for the fact that the VELO + T-station method and the long method show slightly different results in figures 4-6, as both methods probe different momentum spectra and any residual difference will cancel when forming the ratio with simulation. No systematic uncertainty is assigned for the double-counting of the matching efficiency in the combined method, as this efficiency is very close to 100%, and any uncertainty would get further reduced when forming the ratio with simulation. No systematic uncertainty is assigned for the large difference for the VELO + T-station efficiency between simulation and data at low momenta in 2011 and 2012, as this is automatically taken into account when forming the ratio of efficiencies.
Despite this difference, the integrated track reconstruction efficiencies between simulation and data are in agreement due to compensation of this effect for high momenta, where the efficiency is higher in simulation than in data.

Hadronic interactions
The methods presented in this paper are based on muons and require that they reach the muon stations. Thus, these methods are not sensitive to the effects from hadronic interactions and large-angle scatterings with the detector material. For hadrons, the largest effect is due to hadronic interactions. The cross section depends on the particle type, charge and the momentum. A simulation of B0 → J/ψ K*0 decays (where K*0 → K+π−) shows that about 11% of the kaons (averaged over positive and negative kaons) and about 14% of the pions cannot be reconstructed due to hadronic interactions that occur before the last T station. This number depends primarily on the momentum of the particle. Due to the uncertainty on the material budget and consequently on the interaction with the detector material, the reconstruction efficiency obtained from simulation has an intrinsic uncertainty, which is not accounted for in the track reconstruction efficiencies measured with muons. When assuming that the total material budget in the simulation has an uncertainty of 10%, the systematic uncertainty due to hadronic interactions is between 1.1% and 1.4%. The 10% uncertainty is used as a conservative upper limit and is composed as follows: for the VELO a calculation in ref. [4] shows an uncertainty on the material budget of 6%. No direct measurements exist for the T and TT stations. However, weight measurements for the silicon sensors and the detector boxes of the Inner Tracker give an accuracy of 2%, while an agreement of 5% is reached for the cables and the support structure [31,32]. The Outer Tracker modules have been weighed and this measurement is precise to about 1% [33]. Furthermore, the sum of the weights of the individual components of a module adds up to the total weight of a module within the uncertainties. Taking into account that some level of detail is missing in the detector description in the simulation, an uncertainty of 5% is assumed for the Outer Tracker.
Weight measurements for the sensor modules and the insulation material of TT have been performed. Given the detail of the detector description [34] an uncertainty of 5% on the material budget is well justified. The beam-pipe was implemented in the software following the design drawings, where a precision better than 10% for all pieces was confirmed following measurements after production. The solid radiator (aerogel) and the gas radiator (C4F10) contribute more than two-thirds of the material budget for the RICH1 detector [35]. The amount of aerogel is known up to 2% and the differences between 2011 and 2012 are accounted for in the simulation. The density of the C4F10 was monitored, with the RMS of the distribution being about 1%. The other components of RICH1 have a smaller contribution to the interaction length. The overall uncertainty of 10% for the full material budget was then chosen to also take uncertainties on the GEANT4 cross-sections and additional uncertainties, coming from simplified descriptions of the detector elements in the simulation, into account.
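The quoted range of 1.1% to 1.4% appears consistent with a first-order estimate in which the fraction of hadrons lost to interactions scales linearly with the amount of material. The sketch below works through that arithmetic; the linear-scaling assumption and the function name are illustrative and are not stated in this form in the text.

```python
def hadronic_systematic(lost_fraction, material_rel_uncertainty=0.10):
    """Uncertainty on the reconstruction efficiency from hadronic
    interactions, assuming the lost fraction scales linearly with
    the material budget."""
    return lost_fraction * material_rel_uncertainty


kaon_syst = hadronic_systematic(0.11)  # about 11% of kaons lost in simulation
pion_syst = hadronic_systematic(0.14)  # about 14% of pions lost in simulation
```

With a 10% material uncertainty this gives 0.011 for kaons and 0.014 for pions, that is, 1.1% and 1.4%.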

Conclusion
Track reconstruction efficiencies at LHCb have been measured using a tag-and-probe method with J/ψ → µ + µ − decays. The average efficiency is better than 95% in the momentum region 5 GeV/c < p < 200 GeV/c and in the pseudorapidity region 2 < η < 5, which covers the phase space of LHCb. The uncertainty per track is below 0.5% for muons and below 1.5% for pions and kaons, where the larger uncertainty takes the uncertainty on hadronic interactions into account. All uncertainties have been added in quadrature. Furthermore, the ratio of the track reconstruction efficiency of muons in data and simulation is measured, where an uncertainty of 0.8 % for data collected in 2010 and an uncertainty of 0.4 % for data collected in 2011 and 2012 is achieved. The integrated efficiency ratios for all three years of data taking are compatible with unity. This result presents a significant improvement over the uncertainties determined with previous methods ranging from 3 to 4%.
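As an illustration of adding uncertainties in quadrature, the sketch below combines the 0.4% uncertainty on the muon efficiency ratio for 2011 and 2012 with the upper value of 1.4% for hadronic interactions. Treating these as the only two contributions is an assumption of this sketch, and the function name is hypothetical.

```python
import math


def add_in_quadrature(*terms):
    """Combine independent uncertainties by summing their squares."""
    return math.sqrt(sum(t * t for t in terms))


# Inputs in percent: muon-based ratio uncertainty and hadronic interactions.
hadron_total = add_in_quadrature(0.4, 1.4)
```

The result is about 1.46%, which is below the 1.5% quoted for pions and kaons.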