Description and performance of track and primary-vertex reconstruction with the CMS tracker

A description is provided of the software algorithms developed for the CMS tracker both for reconstructing charged-particle trajectories in proton-proton interactions and for using the resulting tracks to estimate the positions of the LHC luminous region and individual primary-interaction vertices. Despite the very hostile environment at the LHC, the performance obtained with these algorithms is found to be excellent. For tt̄ events under typical 2011 pileup conditions, the average track-reconstruction efficiency for promptly-produced charged particles with transverse momenta of pT > 0.9 GeV is 94% for pseudorapidities of |η| < 0.9 and 85% for 0.9 < |η| < 2.5. The inefficiency is caused mainly by hadrons that undergo nuclear interactions in the tracker material. For isolated muons, the corresponding efficiencies are essentially 100%. For isolated muons of pT = 100 GeV emitted at |η| < 1.4, the resolutions are approximately 2.8% in pT, and 10 μm and 30 μm in the transverse and longitudinal impact parameters, respectively. The position resolution achieved for reconstructed primary vertices that correspond to interesting pp collisions is 10–12 μm in each of the three spatial dimensions. The tracking and vertexing software is fast and flexible, and easily adaptable to other functions, such as fast tracking for the trigger, or dedicated tracking for electrons that takes into account bremsstrahlung.


Introduction
At an instantaneous luminosity of 10³⁴ cm⁻² s⁻¹, typical of that expected at the Large Hadron Collider (LHC), with the proton bunches crossing at intervals of 25 ns, the Compact Muon Solenoid (CMS) tracker is expected to be traversed by about 1000 charged particles at each bunch crossing, produced by an average of more than twenty proton-proton (pp) interactions. These multiple interactions are known as pileup, to which prior or later bunch crossings can also contribute because of the finite time resolution of the detector. Reconstructing tracks in such a high-occupancy environment is immensely challenging. It is difficult to attain high track-finding efficiency while keeping the fraction of fake tracks small. Fake tracks are falsely reconstructed tracks that may be formed from a combination of unrelated hits or from a genuine particle trajectory that is badly reconstructed through the inclusion of spurious hits. In addition, the tracking software must run sufficiently fast to be used not only for offline event reconstruction (of ≈10⁹ events per year), but also for the CMS High-Level Trigger (HLT), which processes events at rates of up to 100 kHz. The scientific goals of CMS [1,2] place demanding requirements on the performance of the tracking system. Searches for high-mass dilepton resonances, for example, require good momentum resolution for transverse momenta pT of up to 1 TeV. At the same time, efficient reconstruction of tracks with very low pT of order 100 MeV is needed for studies of hadron production rates and to obtain optimum jet energy resolution with particle-flow techniques [3]. In addition, it is essential to resolve nearby tracks, such as those from 3-prong τ-lepton decays. Furthermore, excellent impact parameter resolution is needed for a precise measurement of the positions of primary pp interaction vertices as well as for identifying b-quark jets [4].
While the CMS tracker [5] was designed with the above requirements in mind, the track-finding algorithms must fully exploit its capabilities, so as to deliver the desired performance. The goal of this paper is to describe the algorithms used to achieve this and show the level of performance attained. The focus here is purely on pp collisions, with heavy ion collisions being beyond the scope of this document. Section 2 introduces the CMS tracker, and section 3 describes the reconstruction of the hits created by charged particles crossing the tracker's sensitive layers. The algorithms used to reconstruct tracks from these hits are explained in section 4, and the performance obtained in terms of track-finding efficiency, proportion of fake tracks and track parameter resolution is presented in section 5. Primary vertices from pp collisions are distributed over a luminous region known as the beam spot. Reconstruction of the beam spot and of the primary vertex positions is described in section 6. This is intimately connected with tracking, since on the one hand, the beam spot and primary vertices are found using reconstructed tracks, and on the other hand, an approximate knowledge of their positions is needed before track finding can begin. All results shown in this paper are based on pp collision data collected or events simulated at a centre-of-mass energy of √s = 7 TeV in 2011. The simulated events include a full simulation of the CMS detector response based on GEANT4 [6]. All events are reconstructed using software from the same period. The track-reconstruction algorithms have been steadily evolving since then, but still have a similar design now.
The CMS detector [5] was commissioned initially using cosmic ray muons and subsequently using data from the first LHC running period. Results obtained using cosmic rays in 2008 [7] are extensively documented in several publications pertaining to the pixel detector [8], strip detector [9], tracker alignment [10], and magnetic field [11], and are of particular relevance to the present paper. Results from the commissioning of the tracker using pp collisions in 2010 are presented in [12].

The CMS tracker
The CMS collaboration uses a right-handed coordinate system, with the origin at the centre of the detector, the x-axis pointing to the centre of the LHC ring, the y-axis pointing up (perpendicular to the plane of the LHC ring), and with the z-axis along the anticlockwise-beam direction. The polar angle θ is defined relative to the positive z-axis and the azimuthal angle φ is defined relative to the x-axis in the x-y plane. Particle pseudorapidity η is defined as η = −ln[tan(θ/2)].
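As an illustration of the pseudorapidity definition above, the following minimal sketch converts between the polar angle θ and η. The function names are hypothetical and chosen only for this example; they are not part of the CMS software.

```python
import math


def pseudorapidity(theta: float) -> float:
    """Return eta = -ln[tan(theta/2)] for a polar angle theta in radians."""
    if not 0.0 < theta < math.pi:
        raise ValueError("theta must lie strictly between 0 and pi")
    return -math.log(math.tan(theta / 2.0))


def polar_angle(eta: float) -> float:
    """Invert the definition: theta = 2 * atan(exp(-eta))."""
    return 2.0 * math.atan(math.exp(-eta))
```

With this definition, a particle emitted perpendicular to the beam (θ = π/2) has η = 0, and the tracker acceptance limit of |η| = 2.5 corresponds to a polar angle of roughly 9.4° from the beam axis.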
The CMS tracker [5] occupies a cylindrical volume 5.8 m in length and 2.5 m in diameter, with its axis closely aligned to the LHC beam line. The tracker is immersed in a co-axial magnetic field of 3.8 T provided by the CMS solenoid. A schematic drawing of the CMS tracker is shown in figure 1. The tracker comprises a large silicon strip tracker with a small silicon pixel tracker inside it. In the central pseudorapidity region, the pixel tracker consists of three co-axial barrel layers at radii between 4.4 cm and 10.2 cm and the strip tracker consists of ten co-axial barrel layers extending outwards to a radius of 110 cm. Both subdetectors are completed by endcaps on either side of the barrel, each consisting of two disks in the pixel tracker, and three small plus nine large disks in the strip tracker. The endcaps extend the acceptance of the tracker up to a pseudorapidity of |η| < 2.5.
The pixel detector consists of cylindrical barrel layers at radii of 4.4, 7.3 and 10.2 cm, and two pairs of endcap disks at z = ±34.5 and ±46.5 cm. It provides three-dimensional (3-D) position measurements of the hits arising from the interaction of charged particles with its sensors. The hit position resolution is approximately 10 µm in the transverse coordinate and 20-40 µm in the longitudinal coordinate, while the third coordinate is given by the sensor plane position. In total, its 1440 modules cover an area of about 1 m² and have 66 million pixels.
The strip tracker has 15 148 silicon modules, which in total cover an active area of about 198 m² and have 9.3 million strips. It is composed of four subsystems. The Tracker Inner Barrel (TIB) and Disks (TID) cover r < 55 cm and |z| < 118 cm, and are composed of four barrel layers, supplemented by three disks at each end. These provide position measurements in rφ with a resolution of approximately 13-38 µm. The Tracker Outer Barrel (TOB) covers r > 55 cm and |z| < 118 cm and consists of six barrel layers providing position measurements in rφ with a resolution of approximately 18-47 µm. The Tracker EndCaps (TEC) cover the region 124 < |z| < 282 cm. Each TEC is composed of nine disks, each containing up to seven concentric rings of silicon strip modules, yielding a range of resolutions similar to that of the TOB.
Figure 1 (caption fragment). The strip tracker modules that provide 3-D hits each consist of two back-to-back strip modules, in which one module is rotated through a 'stereo' angle. The pixel modules, shown by the red lines, also provide 3-D hits. Within a given layer, each module is shifted slightly in r or z with respect to its neighbouring modules, which allows them to overlap, thereby avoiding gaps in the acceptance.
To refer to the individual layers/disks within a subsystem, we use a numbering convention whereby the barrel layer number increases with its radius and the endcap disk number increases with its |z|-coordinate. When referring to individual rings within an endcap disk, the ring number increases with the radius of the ring.
The modules of the pixel detector use silicon of 285 µm thickness, and achieve resolutions that are roughly the same in rφ as in z, because of the chosen pixel cell size of 100 × 150 µm² in rφ × z. The modules in the TIB, TID and inner four TEC rings use silicon that is 320 µm thick, while those in the TOB and the outer three TEC rings use silicon of 500 µm thickness. In the barrel, the silicon strips usually run parallel to the beam axis and have a pitch (i.e., the distance between neighbouring strips) that varies from 80 µm in the inner TIB layers to 183 µm in the inner TOB layers. The endcap disks use wedge-shaped sensors with radial strips, whose pitch varies from 81 µm at small radii to 205 µm at large radii.
The modules in the innermost two layers of both the TIB and the TOB, as well as the modules in rings 1 and 2 of the TID, and 1, 2 and 5 of the TEC, carry a second strip detector module, which is mounted back-to-back to the first and rotated in the plane of the module by a 'stereo' angle of 100 mrad. The hits from these two modules, known as 'rφ' and 'stereo' hits, can be combined into matched hits that provide a measurement of the second coordinate (z in the barrel and r on the disks). The achieved single-point resolution of this measurement is an order of magnitude worse than in rφ.
The principal characteristics of the tracker are summarized in table 1. Figure 2 shows the material budget of the CMS tracker, both in units of radiation lengths and nuclear interaction lengths, as estimated from simulation. The simulation describes the tracker material budget with an accuracy better than 10% [13], as was established by measuring the distribution of reconstructed nuclear interactions and photon conversions in the tracker.
Table 1. A summary of the principal characteristics of the various tracker subsystems. The number of disks corresponds to that in a single endcap. The location specifies the region in r (z) occupied by each barrel (endcap) subsystem.
Figure 2. Total thickness t of the tracker material traversed by a particle produced at the nominal interaction point, as a function of pseudorapidity η, expressed in units of radiation length X₀ (left) and nuclear interaction length λ_I (right). The contribution to the total material budget of each of the subsystems that comprise the CMS tracker is shown, together with contributions from the beam pipe and from the support tube that surrounds the tracker.

Reconstruction of hits in the pixel and strip tracker
The first step of the reconstruction process is referred to as local reconstruction. It consists of the clustering of zero-suppressed signals above specified thresholds in pixel and strip channels into hits, and then estimating the cluster positions and their uncertainties defined in a local orthogonal coordinate system (u, v) in the plane of each sensor. A pixel sensor consists of 100 × 150 µm² pixels with the u-axis oriented parallel to the shorter pixel edge. In the strip sensors, the u-axis is chosen perpendicular to the central strip in each sensor (which in the TEC is not parallel to the other strips in the same sensor).

Hit reconstruction in the pixel detector
In the data acquisition system of the pixel detector [14], zero-suppression is performed in the readout chips of the sensors [15], with adjustable thresholds for each pixel. This pixel readout threshold is set to a single-pixel threshold corresponding to an equivalent charge of 3200 electrons. Offline, pixel clusters are formed from adjacent pixels, including both side-by-side and corner-by-corner adjacent cells. Each cluster must have a minimum charge equivalent to 4000 electrons. For comparison, a minimum ionizing particle typically deposits around 21000 electrons. Residual charge miscalibrations, caused by pixel-to-pixel differences of the charge injection capacitors that are used to calibrate the pixel gain, are extracted from laboratory measurements and included in the Monte Carlo (MC) simulation. Two algorithms are used to determine the position of pixel clusters. A fast algorithm (described in section 3.1.1) is used during track seeding and pattern recognition, and a more precise algorithm (section 3.1.2), based on cluster shapes, is used in the final track fit.
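The offline pixel clustering described above can be sketched as a flood fill over side-by-side and corner-by-corner neighbours. This is only an illustrative sketch: the input format (a dictionary keyed by pixel row and column), the function name, and the use of inclusive threshold comparisons are assumptions made for the example, not details taken from the CMS software.

```python
from collections import deque

PIXEL_THRESHOLD_E = 3200    # single-pixel readout threshold quoted in the text
CLUSTER_THRESHOLD_E = 4000  # minimum cluster charge quoted in the text


def cluster_pixels(pixels):
    """Group pixels into clusters using 8-connectivity.

    pixels: dict mapping (row, col) -> charge in electrons (assumed format).
    Returns a list of clusters, each a dict mapping (row, col) -> charge.
    """
    # Emulate zero-suppression: keep only pixels at or above the readout threshold.
    active = {rc: q for rc, q in pixels.items() if q >= PIXEL_THRESHOLD_E}
    unvisited = set(active)
    clusters = []
    while unvisited:
        seed = unvisited.pop()
        members = {seed: active[seed]}
        queue = deque([seed])
        while queue:
            row, col = queue.popleft()
            # Visit all eight neighbours (edge-sharing and corner-sharing).
            for d_row in (-1, 0, 1):
                for d_col in (-1, 0, 1):
                    neighbour = (row + d_row, col + d_col)
                    if neighbour in unvisited:
                        unvisited.remove(neighbour)
                        members[neighbour] = active[neighbour]
                        queue.append(neighbour)
        # Keep the cluster only if its total charge passes the cluster threshold.
        if sum(members.values()) >= CLUSTER_THRESHOLD_E:
            clusters.append(members)
    return clusters
```

In this sketch, two diagonally adjacent pixels end up in the same cluster, while an isolated pixel that passes the readout threshold but carries less than 4000 electrons is discarded.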

First-pass hit reconstruction
The position of a pixel cluster along the transverse (u) and longitudinal (v) directions on the sensor is obtained as follows. The procedure is described only for the case of the u coordinate, but is identical for the v coordinate.
The cluster is projected onto the u-axis by summing the charge collected in pixels with the same u-coordinate [16]. The result is referred to as a projected cluster. For projected clusters that are only one pixel large, the u-position is given by the centre of that pixel, corrected for the Lorentz drift of the collected charge in the CMS magnetic field. For larger projected clusters, the hit position u_hit is determined using the relative charge in the two pixels at each end of the projected cluster:
$$ u_{\text{hit}} = u_{\text{geom}} + \frac{Q_{\text{last}} - Q_{\text{first}}}{2\,(Q_{\text{last}} + Q_{\text{first}})}\,\left| W^{u} - W^{u}_{\text{inner}} \right| - \frac{L_u}{2}, \tag{3.1} $$
where Q_first and Q_last are the charges collected in the first and last pixel of the projected cluster, respectively; u_geom is the position of the geometrical centre of the projected cluster; and the parameter L_u/2 = D tan Θ^u_L / 2 is the Lorentz shift along the u-axis, where Θ^u_L is the Lorentz angle in this direction, and D is the sensor thickness. For the pixel barrel, the Lorentz shift is approximately 59 µm. The parameter W^u_inner is the geometrical width of the projected cluster, excluding its first and last pixels. It is zero if the width of the projected cluster is less than three pixels. The charge width W^u is defined as the width expected for the deposited charge, as estimated from the angle of the track with respect to the sensor, and equals
$$ W^{u} = D\,\left| \tan\!\left(\alpha_u - \pi/2\right) + \tan\Theta^{u}_{L} \right|, \tag{3.2} $$

JINST 9 P10009
where the angle α_u is the impact angle of the track relative to the plane of the sensor, measured after projecting the track into the plane perpendicular to the v-axis. If no track is available, α_u is calculated assuming that the particle producing the hit moved in a straight line from the centre of the CMS detector. The motivation for eq. (3.1) is that the charge deposited by the traversing particle is expected to only partially cover the two pixels at each end of the projected cluster. The quantity W^u − W^u_inner is expected to have a value between zero and twice the pixel pitch (a modified version of eq. (3.1) is used for any hits that do not meet this expectation). It provides an estimate of the total extension of charge into these two outermost pixels, while the relative charge deposited in these two pixels provides a way to deduce how this total distance is shared between them. The distance that the charge extends into each of the two pixels can thereby be deduced. This gives the position of the two edges of the charge distribution, and the mean value of these edges, corrected for the Lorentz drift, equals the position of the cluster.
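The edge-sharing argument above can be turned into a short numerical sketch. It assumes that eq. (3.1) has the form u_hit = u_geom + (Q_last − Q_first) / [2 (Q_last + Q_first)] · |W^u − W^u_inner| − L_u/2 and that the charge width is W^u = D |tan(α_u − π/2) + tan Θ^u_L|, which is what the reasoning in the text implies. The function names, argument layout and sign convention of the Lorentz shift are assumptions made for illustration, and the modified formula used when W^u − W^u_inner falls outside its expected range is not reproduced.

```python
import math


def charge_width(thickness, alpha_u, lorentz_angle):
    """Assumed form of eq. (3.2): W^u = D * |tan(alpha_u - pi/2) + tan(Theta_L)|."""
    return thickness * abs(math.tan(alpha_u - math.pi / 2.0) + math.tan(lorentz_angle))


def first_pass_position(charges, pitch, u_first_centre, thickness, alpha_u, lorentz_angle):
    """Estimate u_hit for a projected cluster (lengths in one unit, e.g. micrometres).

    charges: projected charge in each pixel along u, ordered from first to last.
    u_first_centre: u-coordinate of the centre of the first pixel (assumed input).
    """
    n_pixels = len(charges)
    lorentz_shift = thickness * math.tan(lorentz_angle)      # L_u
    u_geom = u_first_centre + 0.5 * (n_pixels - 1) * pitch   # geometrical centre
    if n_pixels == 1:
        # Single-pixel projection: pixel centre corrected for Lorentz drift.
        return u_geom - 0.5 * lorentz_shift
    w_inner = max(n_pixels - 2, 0) * pitch  # zero for fewer than three pixels
    w = charge_width(thickness, alpha_u, lorentz_angle)
    q_first, q_last = charges[0], charges[-1]
    sharing = (q_last - q_first) / (2.0 * (q_last + q_first))
    return u_geom + sharing * abs(w - w_inner) - 0.5 * lorentz_shift
```

When the two end pixels carry equal charge, the sharing term vanishes and the estimate reduces to the geometrical centre shifted by L_u/2. A larger charge in the last pixel moves the estimate towards that pixel.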

Template-based hit reconstruction
The high level of radiation exposure of the pixel detector can significantly affect the collection of charge by the pixels during the detector's useful life. This particularly degrades the performance of the standard hit reconstruction algorithm, sketched in the previous section, as this algorithm only uses the end pixels of projected clusters when determining hit positions. The reconstructed positions of hits can be biased by up to 50 µm in highly irradiated sensors, and the hit position resolution can be severely degraded. In the template-based reconstruction algorithm, the observed distribution of the cluster charge is compared to expected projected distributions, called templates, to estimate the positions of hits [17].
The templates are generated based on a large number of simulated particles traversing pixel modules, which are modelled using the detailed PIXELAV simulation [18][19][20]. Since the PIXELAV program can describe the behaviour of irradiated sensors, new templates can be generated over the life of the detector to maintain the performance of the hit reconstruction. To allow the template-based algorithm to be applied to tracks crossing the silicon at various angles, different sets of templates are generated for several ranges of the angle between the particle trajectory and the sensor. Working in each dimension independently, each pixel is subdivided into nine bins along the u (or v) axis, where each bin has a width of one-eighth of the size of a pixel and the end bins are centred on the pixel boundaries. The u (or v) coordinate of the point of interception of the particle trajectory and the pixel (defined as the position at which the track crosses the plane that lies halfway between the front and back faces of the sensor) is used to assign the interception point to one of the nine bins, j, indicating its location within the pixel. The charge profile of the cluster produced by each particle is projected into an array that is 13 pixels long along the u axis (or 23 pixels long along the v axis) and centred on the intercepted pixel. The resulting charge in each element i of this array is recorded. Only clusters with a charge below some specified angle-dependent maximum, determined from simulation, are used, as the charge distributions can be distorted by the significant ionization caused by energetic delta rays. This procedure provides an accurate determination of the projected cluster distributions, determined by effects of geometry, charge drift, trapping, and charge induction. In each dimension, the mean charge S_{i,j} in bin (i, j), averaged over all the particles, is then determined. In addition, the RMS charge distributions for the two projected pixels at the two ends of the cluster are extracted, as are the charge in the projected pixel that has the highest charge within the cluster, and the cluster charge, both averaged over all tracks.
The charge distribution of a reconstructed cluster, projected onto either the u or v axis, can be described in terms of a charge P_i in each pixel i of the cluster. This can be compared to the expected charge distributions S_{i,j} stored in the templates, so as to determine the bin j where the particle is likely to have crossed the sensor, and hence the best estimate of the reconstructed hit position. This is accomplished by minimizing a χ² function for several or all of the bins:
$$ \chi^2(j) = \sum_i \left( \frac{P_i - N_j\,S_{i,j}}{\Delta P_i} \right)^2, \tag{3.3} $$
where
$$ N_j = \sum_i \frac{P_i}{(\Delta P_i)^2} \Bigg/ \sum_i \frac{S_{i,j}}{(\Delta P_i)^2}. \tag{3.4} $$
In these expressions, ΔP_i is the expected RMS of a charge P_i from the PIXELAV simulation and N_j represents a normalization factor between the observed cluster charge and the template. While a sum over all the template bins yields an absolute minimum, different strategies can be used to optimize the performance of the algorithm as a function of allowed CPU time. As described in section 4.3, this χ² is also used to reject outliers during track fitting, in particular pixel hits on a track that are incompatible with the distribution expected for the reconstructed track angle. A simplified estimate of the position of a hit is performed for cluster projections consisting of a single pixel by correcting the position of the hit for bias from Lorentz drift and possible radiation damage. The bias is defined by the average residual of all single-pixel clusters, as detailed below.
For cluster projections consisting of multiple pixels, the estimate of the hit position is further refined. The charge template expected for a track crossing the pixel at an arbitrary position r near the best bin j is approximated by the expression (1 − r) S_{i,j−1} + r S_{i,j+1}. Substituting this expression in place of S_{i,j} in eq. (3.3), and minimizing χ² with respect to r, yields an improved estimate of the hit position.
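The template comparison and the interpolation step can be sketched as follows. The sketch assumes a χ² of the form Σ_i [(P_i − N_j S_{i,j}) / ΔP_i]² with N_j = Σ_i [P_i/(ΔP_i)²] / Σ_i [S_{i,j}/(ΔP_i)²], and it holds the normalization fixed while minimizing with respect to r, which is a simplification adopted here rather than a statement about the production algorithm. All function names are hypothetical.

```python
def template_chi2(P, dP, S_j):
    """Return (chi2, N_j) for observed charges P against one template bin S_j."""
    weights = [1.0 / (d * d) for d in dP]
    norm = (sum(p * w for p, w in zip(P, weights))
            / sum(s * w for s, w in zip(S_j, weights)))
    chi2 = sum(w * (p - norm * s) ** 2 for p, s, w in zip(P, S_j, weights))
    return chi2, norm


def best_template_bin(P, dP, templates):
    """Scan every bin j and return (j_best, chi2_best)."""
    chi2_values = [template_chi2(P, dP, S_j)[0] for S_j in templates]
    j_best = min(range(len(templates)), key=chi2_values.__getitem__)
    return j_best, chi2_values[j_best]


def interpolate_position(P, dP, S_lo, S_hi, norm):
    """Minimize chi2 for the template (1 - r) * S_lo + r * S_hi with respect to r.

    The normalization is kept fixed (an assumption of this sketch), which makes
    chi2 quadratic in r and gives a closed-form minimum, clamped to [0, 1].
    """
    numerator = 0.0
    denominator = 0.0
    for p, d, lo, hi in zip(P, dP, S_lo, S_hi):
        slope = norm * (hi - lo)
        numerator += (p - norm * lo) * slope / (d * d)
        denominator += slope * slope / (d * d)
    if denominator == 0.0:
        return 0.5  # the two neighbouring templates are identical
    return min(1.0, max(0.0, numerator / denominator))
```

A coarse scan over j followed by the closed-form refinement in r mirrors the two-stage structure described in the text. Scanning only a subset of bins would trade precision for CPU time.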
Finally, the above-mentioned hit reconstruction algorithm is applied to the same PIXELAV MC samples originally used to generate the templates. Since the true hit position is known, any bias in the reconstructed hit position can be determined and accounted for when the algorithm is run on collision data. In addition, the RMS of the difference between the reconstructed and true hit position is used to define the uncertainty in the position of a reconstructed hit.

Hit reconstruction in the strip detector
The data acquisition system of the strip detector [21] runs algorithms on off-detector electronics (namely, on the modules of the front-end driver (FED) [22]) to subtract pedestals (the baseline signal level when no particle is present) and common mode noise (event-by-event fluctuations in the baseline within each tracker readout chip), and to perform zero-suppression. Zero-suppression accepts a strip if its charge exceeds the expected channel noise by at least a factor of five, or if both the strip and one of its neighbours have a charge exceeding twice the channel noise. As a result, information for only a small fraction of the channels in any given event is retained for offline storage.
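The zero-suppression rule stated above can be written as a small function. This is a sketch of the selection logic only: the list-based input, the function name, and the treatment of strips at the ends of the list are assumptions, and pedestal and common-mode subtraction are taken to have been applied already.

```python
def zero_suppress(charges, noise):
    """Return the indices of strips that pass zero-suppression.

    charges, noise: equal-length lists of strip charge and channel noise,
    ordered by strip number (assumed input format).
    A strip is kept if its charge is at least 5x its noise, or if both it and
    at least one adjacent strip exceed twice their own noise.
    """
    n_strips = len(charges)
    above_two_sigma = [charges[i] > 2.0 * noise[i] for i in range(n_strips)]
    kept = []
    for i in range(n_strips):
        if charges[i] >= 5.0 * noise[i]:
            kept.append(i)
            continue
        has_neighbour = ((i > 0 and above_two_sigma[i - 1])
                         or (i + 1 < n_strips and above_two_sigma[i + 1]))
        if above_two_sigma[i] and has_neighbour:
            kept.append(i)
    return kept
```

In this sketch an isolated strip just above twice the noise is dropped, whereas the same strip would be retained if a neighbour were also above twice its noise.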
Offline, clusters are seeded by any channel passing zero-suppression that has a charge at least a factor of three greater than the corresponding channel noise [1]. Neighbouring strips are added to each seed, if their strip charge is more than twice the strip noise. A cluster is kept if its total charge is a factor of five larger than the cluster noise, defined as σ_cluster = √(Σ_i σ_i²), where σ_i is the noise for strip i, and the sum runs over all the strips in the cluster.
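The seeding, growing and cluster-level selection described above can be sketched as follows. The list-based input (with zero charge for suppressed strips), the function name, and the simple left-to-right processing order are assumptions made for illustration.

```python
import math


def cluster_strips(charges, noise, seed_cut=3.0, neighbour_cut=2.0, cluster_cut=5.0):
    """Build strip clusters from zero-suppressed charges.

    charges, noise: equal-length lists ordered by strip number; suppressed
    strips are assumed to carry zero charge. Returns a list of clusters,
    each given as a list of strip indices.
    """
    n_strips = len(charges)
    used = [False] * n_strips
    clusters = []
    for seed in range(n_strips):
        if used[seed] or charges[seed] < seed_cut * noise[seed]:
            continue
        low = high = seed
        # Grow the cluster to both sides while neighbours exceed twice their noise.
        while low > 0 and not used[low - 1] and charges[low - 1] > neighbour_cut * noise[low - 1]:
            low -= 1
        while (high + 1 < n_strips and not used[high + 1]
               and charges[high + 1] > neighbour_cut * noise[high + 1]):
            high += 1
        strips = list(range(low, high + 1))
        for i in strips:
            used[i] = True
        total_charge = sum(charges[i] for i in strips)
        # Cluster noise: quadratic sum of the strip noises.
        sigma_cluster = math.sqrt(sum(noise[i] ** 2 for i in strips))
        if total_charge >= cluster_cut * sigma_cluster:
            clusters.append(strips)
    return clusters
```

Because the cluster noise grows only as the square root of the number of strips, a wide cluster of moderately charged strips can pass the final selection even when a single seed strip on its own would not.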
The position of the hit corresponding to each cluster is determined from the charge-weighted average of its strip positions, corrected by approximately 10 µm (20 µm) in the TIB (TOB) to account for the Lorentz drift. One additional correction is made to compensate for the fact that charge generated near the back-plane of the sensitive volume of the thicker silicon sensors is inefficiently collected. This inefficiency shifts the cluster barycentre along the direction perpendicular to the sensor plane by approximately 10 µm in the 500 µm thick silicon, while its effect is negligible in the 320 µm thick silicon. The inefficient charge collection from the sensor backplane is caused by the narrow time window during which the APV25 readout chip [23] integrates the collected charge, and whose purpose is to reduce background from out-of-time hits.
The uncertainty in the hit position is usually parametrized as a function of the expected width of the cluster obtained from the track angle (i.e., the 'charge width' defined in section 3.1.1). However, in rare cases, when the observed width of a cluster exceeds the expected width by at least a factor of 3.5, and is incompatible with it, the uncertainty in the position is then set to the 'binary resolution', namely, the width of the cluster divided by √12. This broadening of the cluster is caused by capacitive coupling between the strips or energetic delta rays.
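A minimal sketch of the strip hit position and of the fallback to the binary resolution is given below. The parametrization of the uncertainty versus expected width is not given in the text, so it is passed in as a precomputed number. The sign of the Lorentz correction, the function names, and the reduction of the compatibility requirement to a single width-ratio test are assumptions of this example.

```python
import math


def strip_hit_position(strip_positions, charges, lorentz_correction=0.0):
    """Charge-weighted average of the strip positions, minus a Lorentz-drift correction."""
    total_charge = sum(charges)
    centroid = sum(x * q for x, q in zip(strip_positions, charges)) / total_charge
    return centroid - lorentz_correction


def strip_hit_uncertainty(observed_width, expected_width, parametrized_uncertainty,
                          max_ratio=3.5):
    """Return the position uncertainty for a strip cluster.

    If the observed width exceeds the expected width by at least max_ratio,
    fall back to the binary resolution, observed_width / sqrt(12). Otherwise
    return the caller-supplied parametrized value.
    """
    if observed_width >= max_ratio * expected_width:
        return observed_width / math.sqrt(12.0)
    return parametrized_uncertainty
```

The factor 1/√12 is the standard deviation of a uniform distribution of unit width, which is the appropriate estimate when the charge sharing within the cluster carries no usable position information.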

Hit efficiency
The hit efficiency is the probability to find a cluster in a given silicon sensor that has been traversed by a charged particle.
In the pixel detector, the efficiency is measured using isolated tracks originating from the primary vertex. The pT is required to be >1 GeV, and the tracks are required to be reconstructed with a minimum of 11 hits measured in the strip detector. Hits from the pixel layer under study are not removed when the tracks are reconstructed. To minimize any ensuing bias, all tracks are required to have hits in the other two pixel layers, ensuring thereby that they would be found even without using the studied layer. A restrictive selection is set on the impact parameter to reduce false tracks and tracks from secondary interactions. To avoid inactive regions and to allow for residual misalignment, track trajectories passing near the edges of the sensors or their readout chips are excluded. Specifically, they must not pass within 0.6 mm (1.0-1.5 mm) of a sensor edge in the pixel endcap (barrel) or within 0.6 mm of the edge of a pixel readout chip. The efficiency is determined from the fraction of tracks for which a hit in the layer under study is either associated with the track or found within 500 µm of its predicted position. Given the high track density, only tracks that have no additional trajectories within 5 mm are considered so as to reduce false track-to-cluster association. The average efficiency for reconstructing hits is >99%, as shown in figure 3 (left), when excluding the 2.4% of the pixel modules known to be defective. The hit efficiency depends on the instantaneous luminosity and on the trigger rate, as shown in figure 3 (right). The systematic uncertainty in these measurements is estimated to be 0.2%. Several sources of loss have been identified. First, the limited size of the internal buffer of the readout chips causes a dynamic inefficiency that increases with the instantaneous luminosity and with the trigger rate. Second, single-event upsets temporarily cause loss of information at a negligible rate of approximately two readout chips per hour. Finally, readout errors signalled by the FED modules depend on the rate of beam-induced background.
Figure 3. The average hit efficiency for layers or disks in the pixel detector excluding defective modules (left), and the average hit efficiency as a function of instantaneous luminosity (right). The peak luminosity ranged from 1 to 4 nb⁻¹ s⁻¹ during the data taking.
The efficiency in the strip tracker is measured using tracks that have a minimum of eight hits in the pixel and strip detectors. Where two hits are found in one of the closely-spaced double layers, which consist of rφ and stereo modules, both hits are counted separately. The efficiency in any given layer is determined using only the subset of tracks that have at least one hit in subsequent layers, further away from the beam spot. This requirement ensures that the particle traverses the layer under study, but also means that the efficiency cannot be measured in the outermost layers of the TOB (layer 6) and the TEC (layer 9). To avoid inactive regions and to take account of any residual misalignment, tracks that cross a module within five standard deviations from the sensor's edges, based on the uncertainty in the extrapolated track trajectory, are excluded from consideration. The efficiency is determined from the fraction of traversing tracks with a hit anywhere within the non-excluded region of a traversed module. In the strip tracker, 2.3% of the modules are excluded because of short circuits of the high voltage, communication problems with the front-end electronics, or other faults. Once the defective modules are excluded from the measurement, the overall hit efficiency is 99.8%, as shown in figure 4. This number is compatible with the 0.2% fraction of defective channels observed during the construction of the strip tracker.
All defective components of the tracker are taken into account, both in the MC simulation of the detector and in the reconstruction of tracks.

Hit resolution
The hit resolution in the pixel and strip barrel sensors has been studied by measuring residuals, defined by the difference between the measured and the expected hit position as predicted by the fitted track. Each trajectory is refitted excluding the hit under study in order to minimize biases of the procedure.
The resolution of the pixel detector is measured from the RMS width of the hit residual distribution in the middle of the three barrel layers, using only tracks with pT > 12 GeV, for which multiple scattering between the layers does not affect the measurement. The expected hit position in the middle layer, as determined from the track trajectory, has an uncertainty that is dominated by the resolution of the hits assigned to the track in the first and third barrel layers. Assuming that the three barrel layers all have the same hit resolution σ_hit, and because they are approximately equally spaced in radius from the z-axis of CMS, this uncertainty is given by σ_hit/√2. Adding this in quadrature with the uncertainty σ_hit in the measured position of the hit in the middle layer shows that the RMS width of the residual distribution is given by σ_hit √(3/2). The measured hit resolution σ_hit in the rφ coordinate, as derived using this formula, is 9.4 µm. The resolution in the longitudinal direction is shown in figure 5, and found to agree within 1 µm with MC simulation. The longitudinal resolution depends on the angle of the track relative to the sensor. For longer clusters, sharing of charge among pixels improves the resolution, with optimal resolution reached for interception angles of ±30°.
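The arithmetic behind the √(3/2) factor can be checked with a few lines of code. For equally spaced layers the predicted position in the middle layer is the average of the two outer hits, so its uncertainty is √(σ_hit²/4 + σ_hit²/4) = σ_hit/√2, and the residual width follows by adding σ_hit in quadrature. The function names below are hypothetical and used only for this illustration.

```python
import math


def expected_residual_rms(sigma_hit):
    """RMS of the middle-layer residual for three equally spaced, equal-resolution layers."""
    interpolation_uncertainty = sigma_hit / math.sqrt(2.0)  # average of two outer hits
    return math.hypot(sigma_hit, interpolation_uncertainty)  # quadrature sum


def hit_resolution_from_residual_rms(residual_rms):
    """Invert the relation RMS = sigma_hit * sqrt(3/2)."""
    return residual_rms / math.sqrt(1.5)
```

Under these assumptions, the quoted resolution of 9.4 µm corresponds to a residual distribution with an RMS width of about 11.5 µm.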
Because of multiple scattering, the uncertainty in track position in the strip detector is usually much larger than the inherent resolution; consequently, individual residuals of hits are not sensitive to the resolution. However, the difference in a track's residuals for two closely spaced modules can be measured with much greater precision. Any offset in a track's position caused by multiple scattering will be largely common to both modules. A technique based on tracks passing through overlapping modules from the same tracker layer is employed to compare the difference in residuals for the two measurements in the overlapping modules [24]. The difference in hit positions (Δx_hit) can be compared to the difference in predicted positions (Δx_pred) derived from the track trajectory, and their difference, fitted to a Gaussian function, provides a hit resolution convoluted with the uncertainty from the trajectory propagation. The bias from translational misalignment between modules affects only the mean of the Gaussian distribution, and not its RMS width. As the two overlapping modules are expected to have the same resolution, the resolution of a single sensor is determined by dividing this RMS width by √2.

Only tracks of high purity (defined in section 4.4) are used for the above-described study. To reduce the uncertainty from multiple Coulomb scattering, the track momenta are required to be >10 GeV. The χ 2 probability of the track fit is required to be >0.1%, and the tracks are required to be reconstructed using a minimum of six hits in the strip detector. Tracks in the overlapping barrel modules are analysed only when the residual rotational misalignment is less than 5 µm. Remaining uncertainties from multiple scattering and rotational misalignment for the overlapping modules are included as systematic uncertainties of the measurement.
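The overlap technique can be illustrated with a short sketch that forms the double difference Δx_hit − Δx_pred for each track and converts its spread into a single-sensor resolution. The sketch uses the population standard deviation as a stand-in for the width of the fitted Gaussian and does not subtract the contribution from trajectory propagation, so it is a simplified illustration rather than the measurement procedure itself. The function name and input format are assumptions.

```python
import math
import statistics


def single_sensor_resolution(dx_hit, dx_pred):
    """Estimate the single-sensor resolution from overlapping-module pairs.

    dx_hit: per-track difference of the measured hit positions in the two modules.
    dx_pred: per-track difference of the predicted positions from the trajectory.
    """
    double_difference = [hit - pred for hit, pred in zip(dx_hit, dx_pred)]
    # Spread of the double difference; a constant offset (translational
    # misalignment) changes the mean but not this width.
    width = statistics.pstdev(double_difference)
    # Two sensors with equal resolution contribute in quadrature.
    return width / math.sqrt(2.0)
```

Adding the same constant to every entry of dx_hit leaves the result unchanged, which reflects the statement that translational misalignment affects only the mean of the distribution.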
Sensor resolution depends strongly on the size of the cluster and on the pitch of the sensor. The resolutions for the strip detector are shown in table 2, where they are compared to the predictions from MC simulation. The resolution varies not only as a function of the cluster width, but also as a function of pseudorapidity, as the energy deposited by a charged particle in the silicon depends on the angle at which it crosses the sensor plane. The resolution is worse in simulation than in data, implying the need for additional tuning of the MC simulation. The results in the table are valid only for tracks with momenta >10 GeV. At lower momenta, the simulations indicate that the resolution in hit position improves, but this is not important for tracking performance, as the resolution of the track parameters for low-momentum tracks is dominated by multiple scattering and not by the hit resolution.
Table 2. A comparison of hit resolution in the barrel strip detector as measured in data with the corresponding prediction from simulation, for track momenta >10 GeV. The resolution is given as a function of both the barrel layer and the width of the cluster in strips. Since the resolution is observed to vary with φ and η, a range of resolution values is quoted in each case.

Sensor   Pitch (µm)
TIB 1-2  80

Track reconstruction
Track reconstruction refers to the process of using the hits, obtained from the local reconstruction described in section 3, to obtain estimates for the momentum and position parameters of the charged particles responsible for the hits (tracks). As part of this process, a translation between the local coordinate system of the hits and the global coordinate system of the track is necessary. This translation takes into account discrepancies between the assumed and actual location and surface deformation of detector elements as found through the alignment process [25]. In addition, the uncertainty in the detector element location is added to the intrinsic uncertainty in the local hit position.
Reconstructing the trajectories of charged particles is a computationally challenging task. An overview of the difficulties and solutions can be found in review articles [26-28]. The tracking software at CMS is commonly referred to as the Combinatorial Track Finder (CTF), which is an adaptation of the combinatorial Kalman filter [29-31], which in turn is an extension of the Kalman filter [32] to allow pattern recognition and track fitting to occur in the same framework.

The collection of reconstructed tracks is produced by multiple passes (iterations) of the CTF track reconstruction sequence, in a process called iterative tracking. The basic idea of iterative tracking is that the initial iterations search for tracks that are easiest to find (e.g., of relatively large pT, and produced near the interaction region). After each iteration, hits associated with tracks are removed, thereby reducing the combinatorial complexity, and simplifying subsequent iterations in a search for more difficult classes of tracks (e.g., low-pT, or greatly displaced tracks).

The presented results reflect the status of the software in use from May through August, 2011, which is applied in a series of six iterations of the track reconstruction algorithm. Later versions of the software retain the same basic structure but with different iterations and tuned values for the configurable parameters to adapt to the higher pileup conditions. Iteration 0, the source of most reconstructed tracks, is designed for prompt tracks (originating near the pp interaction point) with pT > 0.8 GeV that have three pixel hits. Iteration 1 is used to recover prompt tracks that have only two pixel hits. Iteration 2 is configured to find low-pT prompt tracks. Iterations 3-5 are intended to find tracks that originate outside the beam spot (luminous region of the pp collisions) and to recover tracks not found in the previous iterations.
At the beginning of each iteration, hits associated with high-purity tracks (defined in section 4.4) found in previous iterations are excluded from consideration (masked).
Each iteration proceeds in four steps:
• Seed generation provides initial track candidates found using only a few (2 or 3) hits. A seed defines the initial estimate of the trajectory parameters and their uncertainties.
• Track finding is based on a Kalman filter. It extrapolates the seed trajectories along the expected flight path of a charged particle, searching for additional hits that can be assigned to the track candidate.
• The track-fitting module is used to provide the best possible estimate of the parameters of each trajectory by means of a Kalman filter and smoother.
• Track selection sets quality flags, and discards tracks that fail certain specified criteria.
The main differences between the six iterations lie in the configuration of the seed generation and the final track selection.
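The control flow of iterative tracking, including the masking of hits from high-purity tracks, can be sketched as follows. This is only a schematic under simplifying assumptions: each iteration is represented by a hypothetical callable that bundles the four steps (seeding, finding, fitting, selection) and returns the hits of each track together with a high-purity flag. None of these names correspond to the actual CMS interfaces.

```python
def iterative_tracking(hits, iterations):
    """Run a sequence of tracking iterations with hit masking.

    hits       : set of hit identifiers in the event
    iterations : list of callables; each takes the set of unmasked hits and
                 returns a list of (track_hits, is_high_purity) tuples.
                 Each callable stands for one full iteration (hypothetical).
    """
    masked = set()
    all_tracks = []
    for run_iteration in iterations:
        # Hits of high-purity tracks from earlier iterations are excluded
        # at the beginning of each iteration.
        available = hits - masked
        for track_hits, is_high_purity in run_iteration(available):
            all_tracks.append(track_hits)
            if is_high_purity:
                masked |= set(track_hits)
    return all_tracks, masked
```

Because the set of available hits is fixed at the start of each pass, tracks found within one iteration do not mask hits from each other; only later iterations see the reduced hit collection.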

Seed generation
The seeds define the starting trajectory parameters and associated uncertainties of potential tracks. In the quasi-uniform magnetic field of the tracker, charged particles follow helical paths and therefore five parameters are needed to define a trajectory. Extraction of these five parameters requires either three 3-D hits, or two 3-D hits and a constraint on the origin of the trajectory based on the assumption that the particle originated near the beam spot. (A '3-D hit' is defined to be any hit that provides a 3-D position measurement). To limit the number of hit combinations, seeds are required to satisfy certain weak restrictions, for example, on their minimum p T and their consistency with originating from the pp interaction region.
In principle, it is possible to construct seeds in the outermost regions of the tracker, where the track density is smallest, and then construct track candidates by searching inwards from the seeds for additional hits at smaller distances from the beam-line. However, there are several reasons why an alternative approach, of constructing seeds in the inner part of the tracker and building the track candidates outwards, has been chosen instead.
First, although the track density is much higher in the inner region of the tracker, the high granularity of the pixel detector ensures that the channel occupancy (fraction of channels that are hit) of the inner pixel layer is much lower than that of the outer strip layer. This can be seen in figure 6, which shows the mean channel occupancy in strip and pixel sensors in data collected with a 'zero-bias' trigger (which took events from randomly selected non-empty LHC bunch crossings). These data had a mean of about nine pp interactions per bunch crossing. The channel occupancy is 0.002-0.02% in the pixel detector and 0.1-0.8% in the strip detector. Second, the pixel layers produce 3-D spatial measurements, which provide more constraints and better estimates of trajectory parameters. Finally, generating seeds in the inner tracker leads to a higher efficiency for reconstructing tracks. Although most high-pT muons traverse the entire tracker, a significant fraction of the produced pions interact inelastically in the tracker (figure 7). In addition, many electrons lose a significant fraction of their energy to bremsstrahlung radiation in the tracker. Therefore, to ensure high efficiency, track finding begins with trajectory seeds created in the inner region of the tracker. This also facilitates reconstruction of low-momentum tracks that are deflected by the strong magnetic field before reaching the outer part of the tracker.

Seed generation requires information on the position of the centre of the reconstructed beam spot, obtained prior to track finding using the method described in section 6.3. It also requires the locations of primary vertices in the event, including those from pileup events. This information is obtained by running a very fast track and vertex reconstruction algorithm, described in section 6.2, that uses only hits from the pixel detector.
The tracks and primary vertices found with this algorithm are known as pixel tracks and pixel vertices, respectively.
The seed generation algorithm is controlled by two main sets of parameters: seeding layers and tracking regions. The seeding layers are pairs or triplets of detector layers in which hits are searched for. The tracking regions specify the limits on the acceptable track parameters, including the minimum p T , and the maximum transverse and longitudinal distances of closest approach to the assumed production point of the particle, taken to be located either at the centre of the reconstructed beam spot or at a pixel vertex. If the seeding layers correspond to pairs of detector layers, then seeds are constructed using one hit in each layer. A hit pair is accepted as a seed if the corresponding track parameters are consistent with the requirements of the tracking region. If the seeding layers correspond to triplets of detector layers, then, after pairs of hits are found in the two inner layers of each triplet, a search is performed in the outer detector layer for another hit. If the track parameters derived from the three hits are compatible with the tracking region requirements, the seed is accepted. It is also possible to check if the hits associated with the seed have the expected charge distribution from the track parameters: a particle that enters the detector at a grazing angle will have a larger cluster size than a particle that enters the detector at a normal angle. Requiring the reconstructed charge distribution to match the expected charge distribution can remove many fake seeds.
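As an illustration of how a hit triplet can be tested against a tracking region, the sketch below works in the transverse plane only: it fits a circle through three hits, converts the radius to pT for a unit-charge particle in a uniform field, and computes the transverse distance of closest approach to an assumed origin. The 3.8 T field is the nominal CMS value and the 0.8 GeV threshold is the Iteration 0 value quoted above; the 2 mm limit on the transverse distance and all function names are placeholders chosen for this example.

```python
import math

GEV_PER_TESLA_METRE = 0.299792458  # pT [GeV] = 0.2998 * B [T] * R [m], unit charge

def circle_from_three_points(p1, p2, p3):
    """Return centre (cx, cy) and radius of the circle through three points."""
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    d = 2.0 * (x1 * (y2 - y3) + x2 * (y3 - y1) + x3 * (y1 - y2))
    if abs(d) < 1e-12:
        raise ValueError("hits are collinear: radius is unbounded")
    s1, s2, s3 = x1 * x1 + y1 * y1, x2 * x2 + y2 * y2, x3 * x3 + y3 * y3
    cx = (s1 * (y2 - y3) + s2 * (y3 - y1) + s3 * (y1 - y2)) / d
    cy = (s1 * (x3 - x2) + s2 * (x1 - x3) + s3 * (x2 - x1)) / d
    return cx, cy, math.hypot(x1 - cx, y1 - cy)

def seed_in_region(hits_xy, b_tesla=3.8, min_pt=0.8, max_d0=0.002,
                   origin=(0.0, 0.0)):
    """Check a hit triplet (positions in metres) against a tracking region.

    max_d0 is a placeholder value, not a setting taken from table 3.
    Returns (accepted, pT in GeV, transverse distance to origin in metres).
    """
    cx, cy, radius = circle_from_three_points(*hits_xy)
    pt = GEV_PER_TESLA_METRE * b_tesla * radius
    d0 = abs(math.hypot(origin[0] - cx, origin[1] - cy) - radius)
    return (pt >= min_pt and d0 <= max_d0), pt, d0
```

A full seed would also need the longitudinal parameters and their uncertainties, which are omitted here for brevity.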
In simulated tt̄ events at √s = 7 TeV, more than 85% of the charged particles produced within the geometrical acceptance of the tracker (|η| < 2.5) cross three pixel layers and can therefore be reconstructed starting from trajectory seeds obtained from triplets of pixel hits. Nevertheless, other trajectory seeds are also needed, partially to compensate for inefficiencies in the pixel detector (from gaps in coverage, non-functioning modules, and saturation of the readout), and partially to reconstruct particles not produced directly at the pp collision point (decay products of strange hadrons, electrons from photon conversions, and particles from nuclear interactions). To improve the speed and quality of the seeding algorithm, only 3-D space points are used, either from a pixel hit or a matched strip hit. Matched strip hits are obtained from the closely-spaced double strip layers, which are composed of two sensors mounted back-to-back, one providing an rφ view and one providing a stereo view (rotated by 100 mrad relative to the other, in the plane of the sensor). The rφ and stereo hits in such a layer are combined into a matched hit, which provides a 3-D position measurement.

Table 3 shows the seeding requirements for each of the six tracking iterations. The seeding layers listed in this table are defined as follows:
• Pixel triplets are seeds produced from three pixel hits. These seeds are used to find most of the tracks corresponding to promptly produced charged particles. The three precise 3-D space points provide seeds of high quality and with well-measured starting trajectories. A mild constraint on the compatibility of these trajectories with the centre of the beam spot is employed, to remove seeds inconsistent with promptly produced particles. Also, the charge distribution of each pixel hit is required to be compatible with that expected for the crossing angle of the seed trajectory and the corresponding sensor.
• Mixed pairs with vertex constraint are seeds that use two hits and a third space-point given by the location of a pixel vertex. If more than one pixel vertex is found in an event, which often happens because of pileup, all are considered in turn. The pixel vertices are required to pass quality criteria; the most important is that a vertex must contain at least four pixel tracks. The two hits used for these seeds can be provided by the pixel tracker, or by the two inner rings of the three inner TEC layers, where the TEC layers are used to increase coverage in the very forward regions.
• Mixed triplets are seeds produced from three hits formed from a combination of pixel hits and matched strip hits. Each triplet contains between one and three pixel hits and fewer than three strip hits. The iteration that uses these seeds is implemented for finding displaced tracks and prompt tracks that do not have three hits in the pixel detector. The beam-spot constraint is less restrictive, providing higher efficiency for finding tracks arising from decays of hadrons containing s, c, or b quarks, photon conversions, and nuclear interactions.
• Strip pairs are seeds constructed using two matched hits from the strip detector. Iteration 4 uses the two inner TIB layers and rings 1-2 of the TID/TEC, which are the same strip layers used in Iteration 3. In Iteration 5, hits from the two inner TOB layers and ring 5 of the TEC are used for seeds. These two iterations have even weaker constraints on the compatibility of the seed trajectory with the centre of the beam spot than has Iteration 3, and they do not require pixel hits. These iterations are therefore useful for finding tracks produced outside of the pixel detector volume or tracks that do not leave hits in the pixel detector.

Table 3. The configuration of the track seeding for each of the six iterative tracking steps. Shown are the layers used to seed the tracks, as well as the requirements on the minimum pT and the maximum transverse (d0) and longitudinal (z0) impact parameters relative to the centre of the beam spot. The Gaussian standard deviation corresponding to the length of the beam spot along the z-direction is σ. The asterisk symbol indicates that the longitudinal impact parameter is calculated relative to a pixel vertex instead of to the centre of the beam spot.


Track finding
The track-finding module of the CTF algorithm is based on the Kalman filter method [29-32]. The filter begins with a coarse estimate of the track parameters provided by the trajectory seed, and then builds track candidates by adding hits from successive detector layers, updating the parameters at each layer. The information needed at each layer includes the location and uncertainty of the detected hits, as well as the amount of material crossed, which is used to estimate the effects of multiple Coulomb scattering and energy loss. The track finding is implemented in the four steps listed below.

The first step (navigation) uses the parameters of the track candidate, evaluated at the current layer, to determine which adjacent layers of the detector can be intersected through an extrapolation of the trajectory, taking into account the current uncertainty in that trajectory. The navigation service can be configured to propagate along or opposite to the momentum vector, and uses a fast analytical propagator to find the intercepted layers. The analytical propagator assumes a uniform magnetic field, and does not include effects of multiple Coulomb scattering or energy loss. With these assumptions, the track trajectory is a perfect helix, and the propagator can therefore extrapolate the trajectory from one layer to the next using rapid analytical calculations. In the barrel, the cylindrical geometry makes navigation particularly easy, since the extrapolated trajectory can only intercept the layer adjacent to the current one. In the endcap and barrel-endcap transition regions, navigation is more complex, as the crossing from one layer does not uniquely define the next one.
The second step involves a search for compatible silicon modules in the layers returned by the navigation step. A module is considered compatible with the trajectory if the position at which the trajectory intercepts the module surface is no more than some given number (currently three) of standard deviations outside the module boundary. The propagation of the trajectory parameters, and of the corresponding uncertainties, to the sensor surface involves mathematical operations and routines that are generally quite time-consuming [33]. Hence, the code responsible for searching for compatible modules has been optimized to limit the number of sensors that are considered, while preserving an efficiency of >99% in finding the relevant sensors. A complication is that the design of the CMS tracker is such that sensors often slightly overlap their neighbours, meaning that a particle can cross two sensors in the same layer. This possibility is accommodated by dividing the compatible modules in each layer into groups of mutually exclusive modules, defined such that if a particle passes through one member of a group, it is not physically possible for it to pass through a second member of the same group. Any two modules that have some overlap are not mutually exclusive, and are therefore assigned to different groups. This feature is used in the third and fourth steps of the track finding, described next.
The third step forms groups of hits, each of which is defined by the collection of all the hits from one of the module groups. A configurable parameter allows a ghost hit to be added to represent the possibility that the particle failed to produce a hit in the module group, for example, as a result of module inefficiency. The hit positions and uncertainties are refined using the trajectory direction on the sensor surface, to calculate more accurately the Lorentz drift of the ionization-charge carriers inside the silicon bulk. A χ² test is used to check which of the hits are compatible with the extrapolated trajectory. The current (configurable) requirement is χ² < 30 for one degree of freedom (dof). The χ² calculation takes into account both the hit and trajectory uncertainties. In the endcap regions and the barrel-endcap transition regions, the extrapolation distances and the amount of material traversed are generally greater, with correspondingly larger uncertainties in the trajectory, and the probability of finding spurious hits compatible with the track therefore tends to be greater.
The fourth and last step is to update the trajectories. From each of the original track candidates, new track candidates are formed by adding exactly one of the compatible hits from each module grouping (where this hit may be a ghost hit). As the modules in a given group are mutually exclusive, it would not be expected that a track would have more than one hit contributing from each group. The trajectory parameters for each new candidate are then updated at the location of the module surface, by combining the information from the added hits with the extrapolated trajectory of the original track candidate.
For the above second, third, and fourth steps of the procedure, a more accurate material propagator is used when extrapolating the track trajectory, which includes the effect of the material in the tracker. This differs from the method of the simple analytical propagator, in that it increases the uncertainty in the trajectory parameters according to the predicted RMS scattering angle in the tracker material. It also adjusts the momentum of the trajectory by the predicted mean energy loss of the Bethe-Bloch equation. Since all detector material is assumed to be concentrated in the detector layers, the track propagates along a simple helix between the layers, allowing the material propagator to extrapolate the track analytically. The ghost hits include the effect of material without providing position information to the propagator.
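The compatibility test of the third step and the parameter update of the fourth step can be illustrated with a one-dimensional Kalman filter. The sketch below is a deliberately reduced example: the real filter acts on a five-parameter state with full covariance matrices, whereas here a single position and its variance are used, and the function name is hypothetical. The threshold of 30 is the value quoted above for one degree of freedom.

```python
def chi2_and_update(x_pred, var_pred, x_hit, var_hit, chi2_max=30.0):
    """One-dimensional Kalman compatibility test and update.

    x_pred, var_pred : extrapolated position and its variance, which already
                       includes any inflation from multiple scattering
    x_hit, var_hit   : measured hit position and its variance
    Returns (chi2, updated_state), where updated_state is None if the hit
    is incompatible with the extrapolated trajectory.
    """
    residual = x_hit - x_pred
    # The chi2 takes both the hit and the trajectory uncertainties into account.
    chi2 = residual ** 2 / (var_pred + var_hit)
    if chi2 >= chi2_max:
        return chi2, None
    gain = var_pred / (var_pred + var_hit)
    x_new = x_pred + gain * residual
    var_new = (1.0 - gain) * var_pred
    return chi2, (x_new, var_new)
```

In this picture a ghost hit corresponds to skipping the update, so that the candidate carries forward the extrapolated state with its enlarged uncertainty.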
All resulting track candidates found at each layer are then propagated to the next compatible layers, and the procedure is repeated until a termination condition is satisfied. However, to avoid a rapid increase in the number of candidates, only a limited number (default is 5) of the candidates are retained at each step, with the best candidates chosen based on the normalized χ², with a bonus given for each valid hit and a penalty for each ghost hit. By default, the building of a candidate terminates when the track reaches the end of the tracker, contains too many missing hits (the limit is N_lost), or has a pT that drops below a user-specified value. The number of missing hits on a track is equal to the number of ghost hits, except that hits not found because of known detector conditions, for example, a detector module that is turned off, are not counted. The building of a trajectory can also be terminated when the uncertainty in its parameters falls below a given threshold or the number of hits is above a threshold; these kinds of termination conditions tend to be used only in the high-level trigger (HLT), where the required accuracy on track parameters is often reached after 5 or 6 hits are added to the track candidate, and continuing the track building would waste CPU time.
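The pruning of candidates at each step can be sketched as a simple ranking. The text specifies only that the ranking uses the normalized χ² with a bonus per valid hit and a penalty per ghost hit; the linear form of the score and the two weights below are assumptions made for this illustration, as are the dictionary keys.

```python
def keep_best_candidates(candidates, max_candidates=5,
                         hit_bonus=5.0, ghost_penalty=30.0):
    """Retain the best-ranked track candidates (lower score is better).

    candidates : list of dicts with keys 'chi2_norm', 'n_valid', 'n_ghost'
    hit_bonus and ghost_penalty are hypothetical weights, not CMS defaults.
    """
    def score(cand):
        return (cand["chi2_norm"]
                - hit_bonus * cand["n_valid"]
                + ghost_penalty * cand["n_ghost"])
    return sorted(candidates, key=score)[:max_candidates]
```

Limiting the list in this way bounds the combinatorial growth at each layer, at the cost of occasionally discarding a candidate that would have recovered with later hits.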
When the search for hits in the outward direction reveals a minimum number of valid hits (N_rebuild), an inwards search is initiated for additional hits. Otherwise, the track candidate remains as formed. The inwards search starts by taking all of the hits assigned to the track, excluding those belonging to the track seed, and using them to fit the track trajectory. In case this exclusion of the seeding hits leaves fewer than N_rebuild hits to fit, some of the seeding hits are also used (taking first the outer contributions) so as to obtain at least N_rebuild hits. Then, as in the outward track building, the trajectory is propagated inwards through the seeding layers and then further, until the inner edge of the tracker is reached or too many ghost hits are found. There are three reasons for this inward search. First, additional hits can be found in the seeding layers (for example, from overlapping sensors). Second, hits can be found in layers closer to the interaction region than the seeding layers. Third, when strip layers are used in seeding, matched hits are used to increase computational speed and reduce the combinations of hits available for seeding. However, some rφ or stereo hits are not part of any matched hit. While these hits are not available during seeding, they can be found during the inward track building process. The effect of the inward search is an increase in the mean number of hits per track by 0.15 (i.e., a 1% increase relative to a total of ≈14 hits), which translates to a better signal-to-background ratio, impact parameter resolution, and pT resolution, with maximum improvements of 2%, 1%, and 0.5%, respectively.
The track of a single charged particle can be reconstructed more than once, either starting from different seeds, or when a given seed develops into more than one track candidate. To remedy this feature, a trajectory cleaner is applied after all the track candidates in a given iteration have been found. The trajectory cleaner calculates the fraction of shared hits between two track candidates:

$$ f_\mathrm{shared} = \frac{N^\mathrm{hits}_\mathrm{shared}}{\min\left(N^\mathrm{hits}_1,\, N^\mathrm{hits}_2\right)}, $$

where $N^\mathrm{hits}_\mathrm{shared}$ is the number of hits common to both candidates, and $N^\mathrm{hits}_1$ and $N^\mathrm{hits}_2$ are the numbers of hits used in forming the first and second track candidates, respectively. If this fraction exceeds the (configurable) value of 19% (determined empirically), the trajectory cleaner removes the track with the fewest hits; if both tracks have the same number of hits, the track with the largest χ² value is discarded. The procedure is repeated iteratively on all pairs of track candidates. The same algorithm is applied when tracks from the six iterations are combined into a single track collection.
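The trajectory cleaner can be sketched as a loop over pairs of candidates. The example assumes that the shared-hit fraction is normalised to the smaller of the two hit counts, that hits are represented by hashable identifiers, and that each candidate is a dictionary with hypothetical keys 'hits' and 'chi2'.

```python
def shared_fraction(hits_a, hits_b):
    """Fraction of shared hits, normalised to the candidate with fewer hits."""
    shared = len(set(hits_a) & set(hits_b))
    return shared / min(len(hits_a), len(hits_b))

def clean_trajectories(candidates, max_shared=0.19):
    """Remove duplicate candidates by pairwise comparison.

    For each pair whose shared fraction exceeds max_shared, the candidate
    with fewer hits is dropped; for equal hit counts, the one with the
    larger chi2 is dropped.
    """
    alive = [True] * len(candidates)
    for i in range(len(candidates)):
        for j in range(i + 1, len(candidates)):
            if not (alive[i] and alive[j]):
                continue
            a, b = candidates[i], candidates[j]
            if shared_fraction(a["hits"], b["hits"]) <= max_shared:
                continue
            if len(a["hits"]) != len(b["hits"]):
                loser = i if len(a["hits"]) < len(b["hits"]) else j
            else:
                loser = i if a["chi2"] > b["chi2"] else j
            alive[loser] = False
    return [c for c, ok in zip(candidates, alive) if ok]
```

The result of such a pairwise procedure can depend on the order in which pairs are visited; the sketch simply uses the input order.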
The requirements applied during the track-finding stage are shown in table 4 for each tracking iteration. In addition to the requirement on N_lost, the completed track candidates must also pass requirements on the minimum number of hits (N_hits) and minimum track pT. The minimum pT requirements have very little effect, as they are weaker than those applied to the seeds, given in table 3. Since the later iterations do not have strong requirements that the tracks originate close to the centre of the beam spot, the probability of random hits forming tracks increases, which leads to more fake tracks and greater usage of CPU time. To compensate for this tendency, the criteria for the minimum number of hits, and maximum number of lost hits, are tightened in the later iterations.

Table 4. Selection requirements applied to track candidates during the six iterative steps of track finding: the minimum pT, the minimum number of hits N_hits, and the maximum number of missing hits N_lost. Also shown is the minimum number of hits needed to be found in the outward track building step to trigger the inward track building step, N_rebuild, although candidates failing this requirement are not rejected.

Track fitting
For each trajectory, the track-finding stage yields a collection of hits and an estimate of the track parameters. However, the full information about the trajectory is only available at the final hit of the trajectory (when all hits are known). Furthermore, the estimate can be biased by constraints, such as a beam spot constraint applied to the trajectory during the seeding stage. The trajectory is therefore refitted using a Kalman filter and smoother. The Kalman filter is initialized at the location of the innermost hit, with the trajectory estimate obtained by performing a Kalman filter fit to the innermost hits (typically four) on the track. The corresponding covariance matrix is scaled up by a large factor (10 for the last iteration and 100 for the other iterations) in order to limit the bias. The fit then proceeds in an iterative way through the full list of hits, from the inside outwards, updating the track trajectory estimate sequentially with each hit. For each valid hit, the estimated hit position uncertainty is reevaluated using the current values of the track parameters. In the case of pixel hits, the estimated hit position is also reevaluated. This first filter is followed by the smoothing stage, whereby a second filter is initialized with the result of the first one (except for the covariance matrix, which is scaled by a large factor), and is run backward towards the beam-line. The track parameters at the surface associated with any of its hits can then be obtained from the weighted average of the track parameters of these two filters, evaluated on this same surface, as one filter uses information from all the hits found before, and the other uses information from all the hits found after the surface. This provides the optimal track parameters at any point, including the innermost and outermost hit on the track, which are used to extrapolate the trajectory to the interaction region and to the calorimeter and muon detectors, respectively.
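The weighted average performed by the smoother can be written down explicitly for a single parameter. The sketch below assumes two independent one-dimensional estimates on the same surface, one from the forward filter and one from the backward filter, each with its own variance; the real smoother combines five-parameter states with their covariance matrices. The function name is hypothetical.

```python
def smoothed_estimate(x_fwd, var_fwd, x_bwd, var_bwd):
    """Inverse-variance weighted average of forward and backward estimates.

    Assumes that the two inputs are statistically independent, i.e. that
    the hit on this surface enters only one of them.
    Returns the combined value and its variance.
    """
    w_fwd, w_bwd = 1.0 / var_fwd, 1.0 / var_bwd
    var = 1.0 / (w_fwd + w_bwd)
    return var * (w_fwd * x_fwd + w_bwd * x_bwd), var
```

The combined variance is always smaller than either input variance, which is why the smoothed parameters at an interior surface are more precise than those of either filter alone.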
A configurable parameter determines whether the silicon strip matched hits are used as is or split into their component rφ and stereo hits. For the standard offline reconstruction, the split hits are used to improve the track resolution, while for the HLT, the matched hits are used to improve speed.
To obtain the best precision, this filtering and smoothing procedure uses a Runge-Kutta propagator to extrapolate the trajectory from one hit to the next. This not only takes into account the effect of material, but it also accommodates an inhomogeneous magnetic field. The latter means that the particle may not move along a perfect helix, and its equations of motion in the magnetic field must therefore be solved numerically. To do so, the Runge-Kutta propagator divides the distance to be extrapolated into many small steps. It extrapolates the track trajectory over each of these steps in turn, using a well-known mathematical technique for solving first-order differential equations, called the fourth-order Runge-Kutta method, so called because it is accurate to fourth order in the step size. The optimal step size is chosen automatically, according to how non-linear the problem is. This automatic determination of step size employs the method of ref. [34], which is based on how well the fourth and fifth order Runge-Kutta predictions agree with each other. Use of the Runge-Kutta propagator is most important in the region |η| > 1, where the magnetic field inhomogeneities are greatest. For example, in this region, tracks fitted using the simple material propagator are biased by up to 1% for particles with pT = 10 GeV. This bias is almost completely eliminated when using the Runge-Kutta propagator. To ensure an accurate extrapolation of the track trajectory, the Runge-Kutta propagator uses a detailed map of the magnetic field, which was measured before LHC collisions to a precision of <0.01%.
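A minimal sketch of a fourth-order Runge-Kutta propagation of a charged particle through a magnetic field is given below. It integrates the position and unit direction vector as a function of path length, with the field supplied as an arbitrary function of position. Unlike the propagator described above, this example uses a fixed step size and ignores material effects; the function names and the choice of state variables are assumptions made for illustration.

```python
import math

KAPPA = 0.299792458  # GeV / (T m): curvature = KAPPA * q * B / p, in 1/m

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def derivative(state, q_over_p, field):
    """state = (x, y, z, tx, ty, tz); returns d(state)/ds along the path."""
    pos, t = state[:3], state[3:]
    k = KAPPA * q_over_p
    dt = tuple(k * c for c in cross(t, field(pos)))
    return t + dt  # tuple concatenation: (dx/ds, ..., dtz/ds)

def rk4_step(state, q_over_p, field, h):
    def shifted(s, d, f):
        return tuple(si + f * di for si, di in zip(s, d))
    k1 = derivative(state, q_over_p, field)
    k2 = derivative(shifted(state, k1, h / 2), q_over_p, field)
    k3 = derivative(shifted(state, k2, h / 2), q_over_p, field)
    k4 = derivative(shifted(state, k3, h), q_over_p, field)
    return tuple(s + h / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def propagate(state, q_over_p, field, length, n_steps):
    """Propagate over a path length (metres) using n_steps fixed RK4 steps."""
    h = length / n_steps
    for _ in range(n_steps):
        state = rk4_step(state, q_over_p, field, h)
    return state
```

In a uniform field the result can be checked against the analytical helix, for which the radius of curvature is pT/(KAPPA·B); an adaptive scheme would instead adjust the step size from the difference between two estimates of different order.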
Estimates of the track trajectory at any other points, such as the point of closest approach to the beam-line, can be obtained by extrapolating the trajectory evaluated at the nearest hit to that very point. This extrapolation also uses the Runge-Kutta propagator.
After filtering and smoothing, a search is made for spurious hits (outliers), incorrectly associated to the track. Such hits can be related to an otherwise well-defined track, e.g., from δ-rays, or unrelated, such as hits from nearby tracks or electronic noise. Two methods are used to find outliers. One uses the measured residual between a hit and the track to reject hits whose χ² compatibility with the track exceeds a configurable threshold (20 for Iterations 0-4 and 30 for Iteration 5). While a χ² requirement of 30 on each hit is already applied during track finding, the outlier rejection criterion provides a more powerful restriction as it uses information from the full fit [32].

The other method calculates a probability that a pixel hit is consistent with the track, taking into account the charge distribution of the pixel hit, which generally comprises several pixel channels. This probability corresponds to the χ² defined in eq. (3.3). After removing the outlier, the track is again filtered and smoothed and another check for outliers is made. This continues until no more outliers are found. In cases where removing an outlier results in two consecutive ghost hits, the track is terminated and the remaining outer hits discarded (although not used, a configurable parameter is available to allow the track fitting to continue). If a track is found to have fewer than three hits after outlier rejection, or if the track fit fails, the track is discarded (although not used, a configurable parameter is available to return the original track).
The default value of 20 for the χ² requirement is chosen to reject a significant fraction of outliers, while removing few genuine hits. With this value, approximately 20% of the spurious outliers are removed from tracks reconstructed in high-density dijet events, whereas <0.2% of the good hits are removed.
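The refit-and-reject loop of the residual-based method can be illustrated with a toy model in which the track is reduced to a single constant position and the fit is a weighted mean. This replaces the Kalman filter and smoother by a far simpler estimator, and the χ² used here ignores the uncertainty of the fit itself; only the loop structure, the default threshold of 20, and the minimum of three hits are taken from the text.

```python
def reject_outliers(hits, chi2_max=20.0, min_hits=3):
    """Iteratively remove the worst hit and refit until no outlier remains.

    hits : list of (position, variance) tuples
    Toy model: the 'track' is a constant, fitted as a weighted mean.
    Returns (fit, surviving_hits), or None if too few hits remain.
    """
    hits = list(hits)
    while True:
        if len(hits) < min_hits:
            return None  # the track is discarded
        weights = [1.0 / var for _, var in hits]
        fit = sum(w * x for w, (x, _) in zip(weights, hits)) / sum(weights)
        chi2 = [(x - fit) ** 2 / var for x, var in hits]
        worst = max(range(len(hits)), key=chi2.__getitem__)
        if chi2[worst] <= chi2_max:
            return fit, hits
        del hits[worst]  # remove one outlier, then refit
```

Removing only one hit per pass matters because a single bad hit can pull the fit and inflate the χ² of genuine hits; after the refit those hits typically become compatible again.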

Track selection
In a typical LHC event containing jets, the track-finding procedure described above yields a significant fraction of fake tracks, where a fake track is defined as a reconstructed track not associated with a charged particle, as defined in section 5. The fake rate (fraction of reconstructed tracks that are fake) can be reduced substantially through quality requirements. Tracks are selected on the basis of the number of layers that have hits, whether their fit yielded a good χ²/dof, and how compatible they are with originating from a primary interaction vertex. If several primary vertices are present in the event, as often happens due to pileup, all are considered. To optimize the performance, several requirements are imposed as a function of the track η and pT, and on the number of layers (N_layers) with an assigned hit (where a layer with both rφ and stereo strip modules is counted as a single layer). The selection criteria are as follows.
• A requirement on the minimum number of layers in which the track has at least one associated hit. This differs from selections based on the number of hits on the track, because more than one hit in a given layer can be assigned to a track, as in the case of layers with overlapping sensors or double-sided layers in which two sensors are mounted back-to-back.
• A requirement on the minimum number of layers in which the track has an associated 3-D hit (i.e., in the pixel tracker or matched hits in the strip tracker).
• A requirement on the maximum number of layers intercepted by the track containing no assigned hits, not counting those layers inside its innermost hit or outside its outermost hit, nor those layers where no hit was expected because the module was known to be malfunctioning.

The parameters $\alpha_i$ and $\beta$ are configurable constants. The track's impact parameters are $d_0^\mathrm{BS}$ and $z_0^\mathrm{PV}$, where $d_0^\mathrm{BS}$ is the distance from the centre of the beam spot in the plane transverse to the beam-line and $z_0^\mathrm{PV}$ is the distance along the beam-line from the closest pixel vertex. These pixel vertices, described in section 6.2, are required to have at least three pixel tracks. If no pixel vertices meet this requirement, then $z_0^\mathrm{PV}$ is required to be within 3σ of the z-position of the centre of the beam spot, where σ is the Gaussian standard deviation corresponding to the length of the beam spot in the z-direction. The above selection criteria include requirements on the transverse $|d_0^\mathrm{BS}|/\delta d_0$ and longitudinal $|z_0^\mathrm{PV}|/\delta z_0$ impact parameter significances of the track, where the impact parameter uncertainties, $\delta d_0$ and $\delta z_0$, are calculated from the covariance matrix of the fitted track trajectory. A second pair of requirements is also imposed on these significances, but calculated differently, with the uncertainties in the impact parameters being parametrized in terms of $p_T$ and polar angle $\theta$ of the track:

$$ \sigma_{d_0}(p_T) = a \oplus \frac{b}{p_T}, \qquad \sigma_{z_0}(p_T, \theta) = \frac{1}{\sin\theta}\left(a \oplus \frac{b}{p_T}\right), $$

where ⊕ represents the sum in quadrature and a and b are parameters. Their nominal values are a = 30 µm and b = 10 µm GeV, but b increases to 100 µm GeV for the loose and tight selection criteria used (and defined below) in Iterations 0 and 1.
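The parametrized uncertainties can be evaluated numerically as in the sketch below. It assumes the form a ⊕ b/pT for the transverse uncertainty, which follows from the units quoted for a and b, and it assumes that the longitudinal uncertainty is obtained by scaling with 1/sin θ = cosh η. The function names are hypothetical, and the sketch does not reproduce the selection thresholds themselves.

```python
import math

def sigma_d0(pt_gev, a_um=30.0, b_um_gev=10.0):
    """Parametrized transverse impact parameter uncertainty in micrometres.

    Implements a (+) b/pT, where (+) denotes the sum in quadrature.
    """
    return math.hypot(a_um, b_um_gev / pt_gev)

def sigma_z0(pt_gev, eta, a_um=30.0, b_um_gev=10.0):
    """Parametrized longitudinal uncertainty in micrometres.

    Assumption: the polar-angle dependence is 1/sin(theta) = cosh(eta).
    """
    return sigma_d0(pt_gev, a_um, b_um_gev) * math.cosh(eta)

def significance(value_um, sigma_um):
    """Impact parameter significance |value| / sigma."""
    return abs(value_um) / sigma_um
```

With the nominal values, the constant term dominates above a few GeV, while with b = 100 µm GeV the pT-dependent term already exceeds it for pT below about 3.3 GeV.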
The fraction of fake tracks decreases roughly exponentially as a function of the number of layers in which the track has associated hits: $dN_{\mathrm{fake}}/dN_{\mathrm{layers}} \sim \exp(-\omega N_{\mathrm{layers}})$, with ω in the range 0.9-1.3 depending on the $p_T$ of the track. As a consequence, weaker selection criteria can be applied for tracks having many hit layers, which is the reason for the chosen selection criteria. For tracks with hits in at least 10 layers, the selection requirements on $\chi^2$ and impact parameters are found to reject no tracks. However, the criteria become far more stringent for tracks with relatively few hit layers.
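To give a feeling for the size of this effect, the short sketch below evaluates the relative fake-track yield implied by the exponential law when a track gains additional hit layers. The function name is hypothetical, and the law is taken at face value as an exact exponential, which the text only states to hold roughly.

```python
import math

def fake_suppression(extra_layers, omega):
    """Relative fake yield after gaining `extra_layers`, for dN/dN_layers ~ exp(-omega * N)."""
    return math.exp(-omega * extra_layers)

if __name__ == "__main__":
    for omega in (0.9, 1.3):  # range of omega quoted in the text
        print(omega, fake_suppression(3, omega))
```

With three additional layers the fake yield drops to about 7% of its starting value for ω = 0.9 and to about 2% for ω = 1.3.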
The above quality criteria were initially optimized as a function of track $p_T$ and $N_{\mathrm{layers}}$, so as to maximize the quality $Q(\rho) = s/\sqrt{s + \rho b}$, where s is the number of selected genuine (non-fake) tracks, b is the number of selected fake tracks and $\rho \approx 10$ inflates the importance of the fake tracks to achieve low fake rates (below 1% for PYTHIA QCD events with $\hat{p}_T$ of the two outgoing partons in the range 170-230 GeV). As data taking conditions have evolved, the parameters have been adjusted to maintain high efficiency and low fake rate.
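The role of ρ can be illustrated with a small numerical comparison. The sketch below assumes the figure of merit has the form s/√(s + ρb) with ρ = 10; the track counts for the two working points are invented purely for illustration.

```python
import math

def quality(s, b, rho=10.0):
    """Figure of merit, ASSUMED form s / sqrt(s + rho * b); rho inflates the cost of fakes."""
    return s / math.sqrt(s + rho * b)

if __name__ == "__main__":
    # Hypothetical working points: (genuine tracks kept, fake tracks kept).
    tight = (900, 10)
    loose = (950, 50)
    print("tight:", quality(*tight))
    print("loose:", quality(*loose))
```

With these invented numbers the tighter working point scores higher even though it keeps fewer genuine tracks, because each retained fake is weighted ten times more heavily than a genuine track in the denominator.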
The track selection criteria for each iteration are given in table 5. The loose criteria denote the minimum requirements for a track to be kept in the general track collection. The tight and high-purity criteria provide progressively more stringent requirements, which reduce both the efficiency and the fake rate. In general, high-purity tracks are used for scientific analysis, although in cases where efficiency is essential and purity is not a major concern, the loose tracks can be used. The criteria for the initial tracking iterations emphasise compatibility with originating from a primary vertex as a means of assuring quality, while the criteria used for the later iterations rely on other measures of track quality such as fit $\chi^2$ and the number of hits, ensuring thereby that they are still useful for selecting displaced tracks. This matches the seeding and track-finding requirements shown in tables 3-4, and is aligned with the goals for the six iterations.
After the track selection is complete, the tracks found by each of the six iterations are merged into a single collection.
Table 5. Parameter values used in selecting tracks reconstructed by each of the six iterative tracking steps. The first table shows the three requirements on the number of layers that contain hits assigned to tracks and the parameter $\alpha_0$ that controls selection criteria based on $\chi^2$/dof. The second table shows the parameters $\alpha_i$ and $\beta$ that define compatibility of impact parameters with the interaction point. Each parameter has three entries, corresponding to the loose (L), tight (T), and high-purity (H) selection requirements. Iterations 2 and 3 use two paths that emphasise track quality (Trk) or primary-vertex compatibility (Vtx). A track produced by these iterations is retained if it passes either of these criteria.

Specialized tracking
The track reconstruction described above produces the main track collection used by the CMS collaboration. However, variants of this software are also used for more specialized purposes, as described in this section.

Electron track reconstruction
Electrons, being charged particles, can be reconstructed through the standard track reconstruction. However, as electrons lose energy primarily through bremsstrahlung, rather than ionization, large energy losses are common. For example, about 35% of electrons radiate more than 70% of their initial energy before reaching the electromagnetic calorimeter (ECAL) that surrounds the tracker. The energy loss distribution is highly non-Gaussian, and therefore the standard Kalman filter, which is optimal when all variables have Gaussian uncertainties, is not appropriate. As a result, the efficiency and resolution of the standard tracking are not particularly good for electrons and therefore electron candidates are reconstructed using a combination of two techniques that make use of information, not only from the tracker, but also from the ECAL. As this is a subject beyond the scope of this paper, only a brief description of these methods is given.
The first method [35] starts by searching for clusters of energy in the ECAL. The curvature of electrons in the strong CMS magnetic field means that bremsstrahlung photons emitted by the electrons will, in general, strike the ECAL at η values similar to that of the electron, but at different azimuthal coordinates (φ). To recover this radiated energy, ECAL superclusters are formed, by merging clusters of similar η over some range of φ. The knowledge of the energy and position of each supercluster, and the assumption that the electron originated near the centre of the beam spot, constrains the trajectory of the electron through the tracker (aside from a two-fold ambiguity introduced by its unknown charge). Tracker seeds compatible with this trajectory are sought in the pixel tracker (and also in the TEC to improve efficiency in the forward region).
The second method [36] takes the standard track collection (excluding tracks found by Iteration 5, as described in table 3) and attempts to identify a subset of these tracks that are compatible with being electrons. Electrons that suffer only little bremsstrahlung loss can be identified by searching for tracks extrapolated to the ECAL that pass close to an ECAL cluster. Electrons that suffer large bremsstrahlung loss can be identified by the fact that the fitted track will often have poor $\chi^2$ or few associated hits. The track seeds originally used to generate these electron-like tracks are retained.
The seed collections obtained by using these two methods are merged, and used to initiate electron track finding. This procedure is similar to that used in standard tracking, except that the $\chi^2$ threshold, used by the Kalman filter to decide whether a hit is compatible with a trajectory, is weakened from 30 to 2000. This is to accommodate tracks that deviate from their expected trajectory because of bremsstrahlung. In addition, the penalties assigned to track candidates for passing through a tracker layer without being assigned a hit are adjusted. This is necessary because bremsstrahlung photons can convert into $e^+e^-$ pairs with the track-finding algorithm incorrectly forming a track by combining hits from the primary electron with one of the conversion electrons.
To obtain the best parameter estimates, the final track fit is performed using a modified version of the Kalman filter, called the Gaussian Sum Filter (GSF) [37]. In essence, the fractional energy loss of an electron, as it traverses material of a given thickness, is expected to have a distribution described by the Bethe-Heitler formula. This distribution is non-Gaussian, making it unsuitable for use in a conventional Kalman filter algorithm. The GSF technique solves this by approximating the Bethe-Heitler energy-loss distribution as the sum of several Gaussian functions, whose means, widths, and relative amplitudes are chosen so as to optimize this approximation. The parameters of these Gaussian energy-loss functions are determined only once. Each track trajectory is also represented by a mixture of several 'trajectory components', where each trajectory component has helix parameters with Gaussian uncertainties, and a 'weight' corresponding to the probability that it correctly describes the true path of the particle. Initially, a track trajectory is described by a single such component; as it is propagated through a tracker layer, each trajectory component is split into several new ones, each with the energy loss in that layer described by the corresponding Gaussian component of the energy-loss distribution. To avoid an exponential explosion in the number of trajectory components being followed, as the track candidate is propagated through successive tracker layers, the less probable trajectory components are dropped or merged (by grouping together similar trajectory components), so as to limit their number to 12. Each trajectory component will also be updated by the Kalman filter if an additional hit is assigned to it when passing through a layer. When this happens, the weight of the trajectory component is further adjusted according to its compatibility with the hit.
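The limiting of the number of trajectory components can be illustrated in one dimension. The sketch below repeatedly merges the two components with the closest means into a single Gaussian that preserves their combined weight, mean and variance, until at most 12 remain. The one-dimensional state, the `Component` class and the closest-means criterion are simplifying assumptions for illustration; the text does not specify the similarity measure used in the CMS implementation.

```python
from dataclasses import dataclass

@dataclass
class Component:
    weight: float
    mean: float
    var: float

def merge(c1, c2):
    """Moment-preserving merge of two weighted 1-D Gaussians."""
    w = c1.weight + c2.weight
    mean = (c1.weight * c1.mean + c2.weight * c2.mean) / w
    # Merged variance = weighted within-component variance + spread of the means.
    var = (c1.weight * (c1.var + (c1.mean - mean) ** 2)
           + c2.weight * (c2.var + (c2.mean - mean) ** 2)) / w
    return Component(w, mean, var)

def limit_components(components, max_components=12):
    """Merge nearest neighbours (ASSUMED criterion: closest means) until the cap is met."""
    comps = list(components)
    while len(comps) > max_components:
        comps.sort(key=lambda c: c.mean)
        i = min(range(len(comps) - 1),
                key=lambda k: comps[k + 1].mean - comps[k].mean)
        comps[i:i + 2] = [merge(comps[i], comps[i + 1])]
    return comps

if __name__ == "__main__":
    start = [Component(1 / 20, float(k), 1.0) for k in range(20)]
    print(len(limit_components(start)))
```

Because each merge preserves the first two moments of the pair, the total weight and the overall mean of the mixture are unchanged by the reduction.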
The GSF fit provides estimates of the track parameters, whose uncertainties are described not by a single Gaussian distribution, but instead by the sum of several Gaussian distributions, each corresponding to the uncertainty on one of the trajectory components that make up the track. For each parameter, the mode of this distribution is used as it is found to provide the best estimates of the parameters.
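The difference between the mode and the mean of such a mixture can be seen in a simple one-dimensional example. The sketch below locates the mode by a grid scan; the two-component mixture (a narrow core plus a broad, shifted tail) and the function names are invented for illustration and do not correspond to any fitted CMS track.

```python
import math

def mixture_pdf(x, comps):
    """comps: list of (weight, mean, sigma) for a 1-D Gaussian mixture."""
    return sum(w * math.exp(-0.5 * ((x - m) / s) ** 2) / (s * math.sqrt(2 * math.pi))
               for w, m, s in comps)

def mixture_mode(comps, n_grid=2001):
    """Grid scan over +-3 sigma around all components; returns the grid point of maximum density."""
    lo = min(m - 3 * s for _, m, s in comps)
    hi = max(m + 3 * s for _, m, s in comps)
    step = (hi - lo) / (n_grid - 1)
    xs = [lo + i * step for i in range(n_grid)]
    return max(xs, key=lambda x: mixture_pdf(x, comps))

def mixture_mean(comps):
    return sum(w * m for w, m, _ in comps)

if __name__ == "__main__":
    comps = [(0.7, 1.0, 0.05), (0.3, 0.6, 0.2)]  # hypothetical core + radiative tail
    print(mixture_mode(comps), mixture_mean(comps))
```

In this example the mode stays close to the narrow core at 1.0, whereas the mean is pulled to 0.88 by the tail component.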
The performance of the GSF electron tracking has been studied both with simulations [37] and with data [38], with good agreement observed between the two.

Track reconstruction in the high-level trigger
The CMS high-level trigger (HLT) [39] uses a processor farm running C++ software to achieve large reductions in data rate. The HLT filters events selected at rates of up to 100 kHz using the Level-1 (hardware) trigger. Whereas Level-1 uses information only from the CMS calorimeters and muon detectors, the HLT is also able to capture information from the tracker, thereby adding the powerful tool of track reconstruction to the HLT. Some examples of how this improves the HLT performance are listed below.
• Requiring muon candidates that are reconstructed in the muon detectors to be confirmed through the presence of a corresponding track in the tracker greatly reduces the false reconstruction rate and substantially improves momentum resolution.
• Energy clusters found in the electromagnetic calorimeters can be identified as electrons or photons through the presence of a track of appropriate momentum pointing to the cluster.
• The background rejection rate for lepton triggers can be enhanced by requiring leptons to be isolated. One method of doing this is to use a veto on the presence of (too many) tracks in a cone around the lepton direction.
• Triggering on jets produced by b quarks can be done by counting the number of tracks in a jet that have transverse impact parameters statistically incompatible with the track originating from the beam-line.
• Triggers on τ decays $\tau \to \ell\nu_\ell\nu_\tau$, where $\ell$ = e or µ, can be extended to $\tau \to h\nu_\tau$ decays, where h represents one or more charged hadrons, by reconstructing a narrow, isolated jet using tracks in combination with calorimeter information.
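As an illustration of the track-based isolation veto mentioned in the list above, the sketch below counts tracks inside a cone in η-φ around a lepton direction. The cone size, track $p_T$ threshold, allowed track count and function names are all hypothetical values chosen for the example, not HLT settings taken from this paper.

```python
import math

def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance in eta-phi, with the phi difference wrapped into [-pi, pi)."""
    dphi = (phi1 - phi2 + math.pi) % (2 * math.pi) - math.pi
    return math.hypot(eta1 - eta2, dphi)

def is_isolated(lepton, tracks, cone=0.3, max_tracks=0, min_pt=1.0):
    """lepton: (eta, phi); tracks: iterable of (pt, eta, phi), lepton's own track excluded.

    All thresholds are illustrative assumptions.
    """
    n_in_cone = sum(1 for pt, eta, phi in tracks
                    if pt > min_pt and delta_r(lepton[0], lepton[1], eta, phi) < cone)
    return n_in_cone <= max_tracks

if __name__ == "__main__":
    print(is_isolated((0.0, 3.1), [(5.0, 0.0, -3.1)]))  # track just across the phi wrap
```

The wrap of Δφ matters in practice: a track at φ = -3.1 is only about 0.08 rad away from a lepton at φ = 3.1 and must be counted as inside the cone.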
The HLT uses track reconstruction software that is identical to that used for offline reconstruction, but it must run much faster. This is achieved by using a modified configuration of the track reconstruction.
Tracks can be reconstructed from triplets of hits found using only the pixel tracker, as documented in sections 4.1 and 6.2. This is extremely fast, and can be used with great effect in the reconstruction of the primary-vertex position in the HLT, described in section 6.2.
Tracks can also be reconstructed in the HLT using hits from both the pixel and strip detectors. Such tracks have superior momentum resolution and a lower probability of being fake. However, this requires much more CPU time than just reconstructing pixel tracks, since the strip tracker does not provide the precise 3-D hits of the pixel tracker, and suffers from a higher hit occupancy. This can be mitigated using some or all of the following techniques (the details vary significantly, depending on the type of trigger).
• Rather than trying to reconstruct all tracks in the event, regional track reconstruction can be performed instead, where the software is used to reconstruct tracks lying within a specified η-φ region around some object of interest (which might be a muon, electron, or jet candidate reconstructed using the calorimeters or muon detectors). This saves CPU time, and is accomplished by using regional seeding. This method differs from the track seeding described in section 4.1, in that it only forms seeds from combinations of hits that are consistent with a track heading into the desired η-φ region. Another important ingredient of regional tracking concerns the extraction of hits. As discussed in section 3.2, hits are reconstructed after unpacking the original data blocks produced by the FED readout boards. Significant time is saved by unpacking only the data from those FED units that read out tracker modules within the region of interest [40]. This is not used in the offline reconstruction as the track reconstruction searches the entire η-φ region and therefore needs all hits.
• Further gains in speed can be made by performing just a single iteration in the iterative tracking, such that only seeds made from pairs of pixel hits are considered, where these hits are compatible with a track originating within a few millimetres of a primary pixel vertex. Furthermore, the HLT uses a higher $p_T$ requirement when forming the seeds (usually >1 GeV) than is used for offline reconstruction. These stringent requirements on track impact parameter and $p_T$ reduce the number of seeds, and thereby the amount of time spent building track candidates.
• Track finding can differ from that described in section 4.2, in that it can rely on partial track reconstruction. With this technique, the building of each track candidate is stopped once a specific condition is met, for example, a given minimum number of hits (typically eight), or a certain precision requirement on the track parameters. As a consequence, the hits in the outermost layers of the tracker tend not to be used. While such partially reconstructed tracks will have slightly poorer momentum resolution and higher fake rates than fully reconstructed tracks, they also take less CPU time to construct.
• Other changes in the tracking configuration can further enhance the speed of reconstruction. For example, when building track candidates from a given seed, the offline track reconstruction retains at most the best five partially reconstructed candidates for extrapolating to the next layer. Changing this configurable parameter to retain fewer candidates can save CPU time.
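Two of the speed-ups listed above, limiting the number of retained candidates and stopping track building early, can be sketched as follows. The candidate representation (a dictionary with `n_hits` and `chi2`) and the ranking rule (more hits first, then lower $\chi^2$) are assumptions made for illustration; only the default of five candidates and the typical eight-hit stopping condition come from the text.

```python
import heapq

def prune_candidates(candidates, max_candidates=5):
    """Keep the best few partially built candidates before extrapolating to the next layer.

    ASSUMED ranking: more hits first, then lower chi2.
    """
    return heapq.nsmallest(max_candidates, candidates,
                           key=lambda c: (-c["n_hits"], c["chi2"]))

def stop_building(candidate, min_hits=8):
    """Partial reconstruction: stop once the candidate has enough hits."""
    return candidate["n_hits"] >= min_hits

if __name__ == "__main__":
    cands = [{"n_hits": 5, "chi2": 12.0}, {"n_hits": 5, "chi2": 3.0},
             {"n_hits": 4, "chi2": 1.0}, {"n_hits": 6, "chi2": 8.0}]
    print(prune_candidates(cands, max_candidates=2))
```

Lowering `max_candidates` reduces the number of extrapolations per layer, at the cost of sometimes discarding the candidate that would eventually have given the best track.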
Pixel tracking and other aspects of track reconstruction absorb about 20% of the total HLT CPU time. This is kept low by performing track reconstruction only when necessary, and only after other requirements have been satisfied, so as to reduce the rate at which tracking must be performed. Track reconstruction is employed in a variety of ways to satisfy different needs in the HLT. Examples of track reconstruction at the HLT include seeds originating in the muon detector, tracking in a specific η-φ region defined by a jet, and searching for tracks over the full detector. Even the most comprehensive (and slowest) track reconstruction configuration at the HLT is more than ten times faster than the offline reconstruction of tracks in events representative of data taken in 2011 ($t\bar{t}$ + 10 pileup events).

Track reconstruction performance
In this section, the performance of the CTF tracking algorithm is evaluated in terms of tracking efficiency and fake rate, track parameter resolutions, and the CPU time required for processing collision events. Two different categories of simulated samples are used: isolated particles and pp collision events. Comparing the results helps one understand both the performance of the tracking for isolated particles and to what extent it is degraded in a high hit occupancy environment.
Simulated events offer the possibility of detailed studies of track reconstruction, such as the way characteristics of the tracker and the design of the track reconstruction algorithms influence its performance over a wide range of particle momenta and rapidities, and how much its performance depends on the type of charged particle being reconstructed, and on whether this particle is isolated or not. The performance in simulation can be compared with that in data in certain regions of phase space to verify that the results from simulation are realistic. The CMS collaboration demonstrated previously that its simulation describes the momentum resolution of muons from J/ψ decay to an accuracy of better than 5% [41]; and does similarly well in describing the dimuon mass resolution of muons from Z boson decay [42]. The transverse and longitudinal impact parameters of tracks reconstructed in typical multijet events agree in data and simulation to better than 10% [43]. The CMS collaboration also showed that the tracking efficiency for particles from J/ψ and charmed hadron decays is simulated with a precision better than 5% [44]. A similar comparison for the higher-momentum muons from Z boson decay will be presented in section 5.1.3 of the present work.
The isolated particle samples that are used here consist of simple events with just a single generated muon, pion or electron, although secondary particles may also be present due to interactions with the detector material. The single particles are generated with a flat distribution in pseudorapidity inside the tracker acceptance |η| < 2.5. Their transverse momenta are either fixed to 1, 10 or 100 GeV, or are generated according to a flat distribution in $\ln(p_T)$. The former set of particles with fixed momenta is used for studying the tracking performance as a function of η, while the latter is used to quantify the performance as a function of $p_T$.
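A flat distribution in $\ln(p_T)$ populates each decade of $p_T$ equally, which is convenient when studying performance over several orders of magnitude. A minimal sketch of such a generation is given below; the $p_T$ range of 0.1 to 100 GeV and the function names are assumptions for illustration, as the text does not state the generated range.

```python
import math
import random

def sample_pt_flat_in_log(pt_min, pt_max, rng):
    """Draw pT such that ln(pT) is uniform between ln(pt_min) and ln(pt_max)."""
    log_min, log_max = math.log(pt_min), math.log(pt_max)
    return math.exp(log_min + rng.random() * (log_max - log_min))

def sample_eta_flat(rng, eta_max=2.5):
    """Draw pseudorapidity uniformly inside the tracker acceptance."""
    return rng.uniform(-eta_max, eta_max)

if __name__ == "__main__":
    rng = random.Random(1)
    pts = [sample_pt_flat_in_log(0.1, 100.0, rng) for _ in range(9000)]
    for lo, hi in ((0.1, 1.0), (1.0, 10.0), (10.0, 100.0)):
        print(lo, hi, sum(lo <= p < hi for p in pts))
```

Each of the three decades in the example receives roughly one third of the generated particles.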
For pp collisions, simulated inclusive $t\bar{t}$ events are used, either with or without superimposed pileup events. The average number of pileup collisions per LHC bunch crossing depends on the instantaneous luminosity of the machine and on the period of data-taking over which the luminosity is averaged. For the sake of simplicity, the number of pileup interactions superimposed on each simulated $t\bar{t}$ event is randomly generated from a Poisson distribution with mean equal to 8. This amount of pileup corresponds roughly to what was delivered by the LHC, when averaged over the whole 2011 running period. The $t\bar{t}$ events, and also the minimum-bias events used for the pileup, are generated with the PYTHIA 6 program [45].
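The pileup multiplicity per event can be drawn as sketched below. Knuth's multiplication method is used here only because it needs nothing beyond the Python standard library; it is not claimed to be the method used in the CMS simulation, and the function name is hypothetical.

```python
import math
import random

def sample_poisson(mean, rng):
    """Knuth's multiplication method; adequate for small means such as 8."""
    limit = math.exp(-mean)
    k = 0
    prod = rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

if __name__ == "__main__":
    rng = random.Random(42)
    n_pileup = [sample_poisson(8.0, rng) for _ in range(10)]
    print(n_pileup)  # number of minimum-bias events to overlay on each signal event
```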
Simulated particles are paired to reconstructed tracks for evaluating tracking efficiency, fake rate, and other quantities discussed in this section. A simulated particle is associated with a reconstructed track if at least 75% of the hits assigned to the reconstructed track originate from the simulated particle. The association of simulated hits with reconstructed hits is possible because the simulation software records the particles responsible for the signal in each channel of the tracker. Strip and pixel response to electronic noise is also recorded. Reconstructed tracks that are not associated with a simulated particle are referred to as fake tracks.
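The association rule can be sketched as follows, assuming for simplicity that each reconstructed hit is attributed to at most one simulated particle (or to none, for noise hits). The input format and function name are hypothetical.

```python
from collections import Counter

def associate(track_hit_particles, min_fraction=0.75):
    """Return the simulated-particle id matched to a track, or None for a fake track.

    track_hit_particles: one entry per hit on the track, giving the id of the
    simulated particle that produced it, or None for a noise hit (ASSUMED format).
    """
    if not track_hit_particles:
        return None
    counts = Counter(p for p in track_hit_particles if p is not None)
    if not counts:
        return None
    particle, n_shared = counts.most_common(1)[0]
    if n_shared / len(track_hit_particles) >= min_fraction:
        return particle
    return None

if __name__ == "__main__":
    print(associate([7, 7, 7, 9]))     # 3 of 4 hits from particle 7
    print(associate([7, 7, 9, None]))  # only 2 of 4 hits from particle 7
```

With a threshold above 50%, at most one simulated particle can satisfy the requirement for a given track, so the association is unambiguous.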
Results for tracking efficiencies and for fake rates are presented in section 5.1. While the latter is evaluated only using simulated samples, the former is also measured in data as described in section 5.1.3. The resolution obtained for track parameters is discussed in section 5.2. Unless indicated otherwise, all results pertaining to the performance are obtained using the set of 'high-purity' tracks defined in section 4.4. Finally, section 5.3 provides estimates of the CPU time required for different components of track reconstruction.

Tracking efficiency and fake rate
For simulated samples, the tracking efficiency is defined as the fraction of simulated charged particles that can be associated with corresponding reconstructed tracks, where the association criterion is the one described at the beginning of this section. This definition of efficiency depends not only on the quality of the track-finding algorithm, but also upon the intrinsic properties of the tracker, such as its geometrical acceptance and material content. Using the same association criterion as used for the efficiency, the fake rate is defined as the fraction of reconstructed tracks that are not associated with any simulated particle. This quantity represents the probability that a reconstructed track is either a combination of unrelated hits or a genuine particle trajectory that is badly reconstructed through the inclusion of spurious hits. The efficiency and fake rate presented in this section are given as a function of p T and η of the simulated particle and reconstructed track, respectively. The efficiency is obtained for simulated particles generated within |η| < 2.5, with a production point <3 cm and <30 cm from the centre of the beam spot for r and |z|, respectively. These criteria select fairly prompt particles. We also require p T > 0.9 GeV, for the study of efficiency as a function of η, or p T > 0.1 GeV for studying efficiency over the entire p T spectrum. Since the 'high-purity' requirement described in section 4.4 is the default track selection for the majority of analyses in CMS, unless otherwise stated efficiency and fake rate are measured and presented here using only the subset of reconstructed tracks that are identified as 'high-purity'.
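Given the outcome of the association for each reconstructed track, the two quantities defined above reduce to simple ratios. The sketch below assumes that the selection of simulated particles (|η|, production point and $p_T$ requirements) and of high-purity tracks has already been applied upstream; the input format and function name are hypothetical.

```python
def efficiency_and_fake_rate(sim_particle_ids, track_matches):
    """Compute (efficiency, fake rate) from association results.

    sim_particle_ids: ids of the selected simulated charged particles.
    track_matches: for each reconstructed track, the id of the associated
    simulated particle, or None if the track is not associated (fake).
    """
    matched = {m for m in track_matches if m is not None}
    n_sim = len(sim_particle_ids)
    n_trk = len(track_matches)
    eff = sum(1 for p in sim_particle_ids if p in matched) / n_sim if n_sim else float("nan")
    fake = sum(1 for m in track_matches if m is None) / n_trk if n_trk else float("nan")
    return eff, fake

if __name__ == "__main__":
    # Particle 3 is lost, particle 2 is reconstructed twice, one track is fake.
    print(efficiency_and_fake_rate([1, 2, 3, 4], [1, 2, None, 2, 4]))
```

Note that the two ratios have different denominators: the efficiency is normalized to simulated particles, the fake rate to reconstructed tracks, so a particle reconstructed twice affects neither numerator.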

Results from simulation of isolated particles
This section presents the performance of the CTF tracking software in reconstructing trajectories of particles in events containing just a single muon, a pion or an electron.
Muons are reconstructed better than any other charged particle in the tracker, as they mainly interact with the silicon detector through ionization of the medium and, unlike electrons, their energy loss through bremsstrahlung is negligible. Muons therefore tend to cross the entire volume of the tracking system, producing detectable hits in several sensitive layers of the apparatus. Finally, muon trajectories are altered almost exclusively by Coulomb scattering and energy loss, whose effects are straightforward to include within the formalism of the Kalman filter. For isolated muons with 1 < $p_T$ < 100 GeV, the tracking efficiency is >99% over the full η-range of tracker acceptance, and does not depend on $p_T$ (figure 8, top). The fake rate is completely negligible.
Charged pions, like muons, undergo multiple scattering and energy loss through ionization as they cross the tracker volume. However, like all hadrons, pions are also subject to elastic and inelastic nuclear interactions. The elastic nuclear interactions introduce long tails in the distribution of the scattering angle, well beyond expectations from Coulomb scattering. The current implementation of the track-finding algorithm assumes a track trajectory modelled by the material propagator described in section 4.2. This takes into account Coulomb scattering, but neglects elastic nuclear interactions. As a result, the formation of a track can be interrupted if a hadron undergoes a large-angle elastic nuclear scattering. Hence, a hadron can be reconstructed as a single track with fewer hits, or as two separate tracks, or it may not be found at all. A loss of hits also degrades the precision with which the parameters of the trajectory can be estimated (section 5.2). Inelastic nuclear interactions are the main source of tracking inefficiency for hadrons, particularly in those regions of the tracker with large material content. Depending on η, up to 20% of the simulated pions are not reconstructed (figure 8, middle). This effect is most significant for hadrons with $p_T \lesssim 700$ MeV, because of the larger cross sections for nuclear interactions at low energies [46]. The tracking efficiency is also affected, along with the fake rate (figure 9, top), by the secondary particles produced in inelastic processes. This is because the products of nuclear interactions are often emitted with trajectories approximately aligned to that of the traversing pion, particularly for large pion momenta. As a result, it is common for the trajectory builder to combine hits of the incoming pion with those of a secondary particle into a single track. The degradation in efficiency and the increase in fake rate are correlated, as expected, and the loss in performance is greatest for the highest-momentum pions.
In general, the merging of separate trajectories during reconstruction is more common in the region of the barrel to endcap transition and in the endcap regions of the tracker, as these regions contain large amounts of material. In the transition region, the proportion of fake tracks is also high because the distances between successive hits on each track are longer, particularly when passing from a hit in the barrel to a hit in an endcap detector. These longer distances result in correspondingly larger uncertainties in the track trajectory extrapolation that is performed during track building. This makes it more probable that spurious hits, such as those from secondary particles, will be incorrectly assigned to the track. Although the extrapolation uncertainties would be equally large for muons, the fake rate remains very small for muons, as they rarely produce secondary particles. While the fake rate is generally <2-3% for tracks reconstructed in the sample of single pions with p T = 1 or 10 GeV, in a sample of single pions with a p T of 100 GeV, the fake rate peaks at ≈15% for |η| ≈ 1.3.
Electrons lose a large fraction of their energy via bremsstrahlung radiation before they reach the outer layers of the silicon tracker. Such radiation has an impact on the reconstruction of electrons, similar to that of inelastic nuclear interactions on the reconstruction of charged hadrons. First, if an electron loses most of its energy before reaching the outer layer of the tracker, the number of hits assigned to the track can be reduced significantly. Second, if a radiated photon converts to an electron-positron pair or induces an electromagnetic shower, the track finder can assign a mixture of hits from the primary electron and from the secondary particles to a single track. This reduces tracking efficiency, increases fake rate, and is the principal source of misidentification of charge for electrons. The efficiency and fake rate of the CTF algorithm for reconstructing electrons are shown in figure 8 (bottom) and figure 9 (bottom), respectively. In the barrel, the efficiency for electrons exceeds 90% for $p_T$ > 0.4 GeV, and the fake rate is very small. However, the performance is significantly worse in the endcap and barrel-endcap transition regions, because of the larger amount of material and the correspondingly greater chance of an electron to produce an electromagnetic shower within the tracker volume. The fake rate is particularly high in the sample of electrons with $p_T$ = 100 GeV, since any secondary particle they produce will tend to be emitted tangentially to the direction of the original electron, with the consequence that the tracking algorithm tends to reconstruct the primary and the secondary particle as a single track. It is important to note that, in practice, CMS achieves considerably better performance for electron reconstruction, by using the dedicated GSF algorithm [38], described in section 4.5.1, rather than the standard CTF algorithm.
Figure 9. Tracking fake rate for single, isolated pions (top) and electrons (bottom) passing high-purity quality requirements. Results are shown, as a function of the reconstructed η (left), for generated $p_T$ = 1, 10, and 100 GeV. Results are also shown as a function of the reconstructed $p_T$ (right), for the barrel, transition, and endcap regions, which are defined by the η intervals of 0-0.9, 0.9-1.4 and 1.4-2.5, respectively. The results for $p_T$ are obtained using particles generated with a flat distribution in $\ln p_T$. NB: The measured fake rate depends strongly on the $p_T$ distribution of the generated particles, since, for example, if no particles are generated in a given $p_T$ range, most tracks reconstructed in that range must necessarily be fake. The generated particles used to make the plots of fake rate versus η have a different $p_T$ spectrum to those used to make the plots of fake rate versus $p_T$, therefore the measured fake rates in these two sets of plots are not directly comparable.

Results from simulated pp collision events
This section presents the performance of the CTF tracking software for reconstructing trajectories of non-isolated charged particles generated in simulated LHC collisions. Compared to the results shown in the previous section for isolated particles, the tracking performance discussed in this section is affected by an additional important feature of LHC events: the large number of hits produced in the tracker at each LHC bunch crossing. These hits originate from the hundreds of primary particles and their interactions selected by the CMS triggers. Their number is increased by the combined effects of low-energy particles spiralling in the CMS magnetic field ("loopers"), and particles produced in temporally overlapping pileup collisions. In the following, we give examples of the kind of difficulties encountered by the tracking algorithm during the reconstruction of these events.
• Many particles can be emitted within highly collimated jets and the hits they produce are closer to each other than the typical uncertainty in the position of extrapolated trajectories at the sensors. In such situations, the trajectory builder is unable to assign unambiguously the hits to the corresponding trajectories. For example, hits corresponding to two distinct charged particles can be mixed into one or two reconstructed tracks that do not describe accurately either of the trajectories of the two particles.
• Trajectories of nearby particles can be separated sufficiently in the outer layers of the tracker so as to be correctly identified by the track-finding module. Nevertheless, their hits in the innermost layers can be so near to each other that the reconstruction algorithm often assigns the hits incorrectly to the relevant trajectories. Particularly in the innermost pixel layer, particles can be so close to each other that their ionization signals can merge into a single cluster. In this case, even if the individual trajectories are reconstructed, and their momenta are well measured, the resolutions in their impact parameter are degraded by the formation of this merged cluster.
• Many of the low-$p_T$ particles from the underlying event of the hard collision, or from the other pileup collisions, have such a low transverse momentum that they cannot escape the volume of the tracker, but instead spiral in the magnetic field, producing many hits in the detector, increasing the complexity of the track-finding task. Even when these circulating particles are not close to each other at their production vertex, their large number of hits increases the probability of having uncorrelated hits accepted as legitimate trajectories, thereby generating fake reconstructed tracks.
Since most charged particles produced in LHC collisions are hadrons, all the sources of inefficiency discussed for single, isolated pions similarly affect the $t\bar{t}$ results. The efficiency for reconstructing charged particles in $t\bar{t}$ events, which is shown in figure 10 (top), closely resembles the reconstruction efficiency for isolated pions shown in figure 8 (middle). The similarity between the two indicates that tracking efficiency is not strongly degraded by particle multiplicity in typical $t\bar{t}$ events. The tracking efficiency as a function of $p_T$ is approximately constant for 1 < $p_T$ < 80 GeV, but, at small $p_T$, the efficiency decreases quickly (figure 10, top-right) for several reasons.
• The pion-nucleus cross section increases rapidly for pions of energies below 0.7 GeV.
• Track selection criteria (see section 4.4) are much more stringent for trajectories of small momentum, as they correspond to the main source of fake tracks.
• When estimating the RMS scattering angle and mean energy loss in the detector material, the trajectory propagator assumes that all particles have a pion mass, since the pion is the most common particle produced in LHC collisions. While this assumption is good for relativistic particles, it breaks down at low energies when particle masses become more important.
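The impact of the pion-mass assumption on the expected scattering angle can be estimated with the widely used Highland parametrization of multiple Coulomb scattering. The sketch below uses the older form of that formula (without the velocity term inside the logarithm) for unit-charge particles; whether the CMS propagator uses exactly this expression is an assumption, and the material thickness of 2% of a radiation length is an illustrative value only.

```python
import math

PION_MASS_GEV = 0.13957
PROTON_MASS_GEV = 0.93827

def highland_theta0(p_gev, mass_gev, x_over_x0):
    """RMS projected scattering angle in radians (Highland form, unit charge).

    ASSUMPTION: this standard parametrization stands in for the propagator's
    actual material model, which is not specified in the text.
    """
    beta = p_gev / math.hypot(p_gev, mass_gev)
    return (0.0136 / (beta * p_gev)) * math.sqrt(x_over_x0) \
        * (1.0 + 0.038 * math.log(x_over_x0))

def proton_to_pion_ratio(p_gev, x_over_x0=0.02):
    """How much the scattering angle of a proton is underestimated under a pion hypothesis."""
    return (highland_theta0(p_gev, PROTON_MASS_GEV, x_over_x0)
            / highland_theta0(p_gev, PION_MASS_GEV, x_over_x0))

if __name__ == "__main__":
    for p in (0.3, 1.0, 10.0):
        print(p, proton_to_pion_ratio(p))
```

In this parametrization the ratio depends only on the velocities: at a momentum of 0.3 GeV a proton scatters about three times more than the pion hypothesis predicts, whereas at 10 GeV the difference is below 1%.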
For the considered tt̄ events, charged particles with p T larger than 80 GeV are mostly produced inside the core of collimated jets. The inability of the trajectory builder to cope fully with regions of the tracker characterised by an extremely high density of particles is reflected in the drop in tracking efficiency at large p T values.
The fake rate, shown in figure 10 (bottom), has a similar dependence on η as that observed for isolated pions in figure 9. However, the fake rate has a very different dependence on p T for the two cases. In the pp collisions, the fake rate increases for p T values <1 GeV. This is because the smaller the p T of an initial trajectory seed, the larger the search windows that must be used (because of multiple scattering) when searching for additional hits to form the corresponding track candidates. This increases the probability to assign wrong hits to a track. The fake rate also increases at large p T , as was the case for the single-pion samples, partially because of the production of secondary particles in nuclear interactions, and partially because comparatively few high-p T particles are produced in pp collisions.
The distributions in efficiency and fake rate in figure 10 are generated for two sets of reconstructed tracks: all the tracks produced using the default tracking software, and only those tracks that pass the high-purity requirements. For a 1-2% reduction in efficiency, the quality requirement reduces the fake rate over the entire p T range by more than a factor of two. Figure 11 shows the efficiency and fake rate plots for tt̄ events simulated either with or without superimposed pileup interactions. After applying the quality requirement, the presence of pileup significantly degrades the efficiency and fake rate only for tracks with p T < 1 GeV.
The CMS tracker is capable of reconstructing highly displaced tracks, such as pions from K 0 decay, or particles produced in nuclear interactions and photon conversions. This is very useful for studies of B physics, photon reconstruction, and for improving energy resolution for particle-flow reconstruction [3]. This capability also makes it possible to search for signatures of new phenomena, such as new long-lived particles that decay with displaced tracks. Reconstruction of displaced tracks is carried out in Iterations 3-5 of the 6-step iterative tracking scheme described in section 4. Charged particles originating outside the pixel detector can also be reconstructed. The efficiency for reconstructing this kind of charged particle as a function of the radius of its point of production is shown in figure 12 for tt̄ events.

Figure 10. Tracking efficiency (top) and fake rate (bottom) for simulated tt̄ events (CMS simulation, √s = 7 TeV) that include superimposed pileup collisions. The number of pileup interactions superimposed on each simulated event is generated randomly from a Poisson distribution with mean value of 8. Plots are for all reconstructed tracks, and also for the subset of tracks passing high-purity requirements. The efficiency and fake rate plots are plotted for |η| < 2.5, and the efficiency for charged particles refers to those generated less than 3 cm (30 cm) from the centre of the beam spot in r (z) directions. The efficiency as a function of η is for generated particles with p T > 0.9 GeV.

Efficiency estimated from data
A "tag-and-probe" method [47,48] allows an extraction of muon tracking efficiency directly from decays of known resonances. For example, Z → µ + µ − candidates are reconstructed using pairs of oppositely charged muons identified in the muon chambers. Each Z candidate must have one tag muon, meaning that it is reconstructed in both the tracker and muon chambers, and one probe muon, meaning that it is reconstructed just in the muon chambers, with no requirement on the tracker. The invariant mass of each µ + µ − candidate is required to be within the 50-130 GeV range, around the 91 GeV mass of the Z boson [46].
For both data and simulated events, the tracking efficiency can be estimated as the fraction of the probe muons in Z → µ + µ − events that can be associated with a track reconstructed in the tracker. A correction must be made for the fact that some of the probe muons are not genuine. This correction is obtained by fitting the dilepton mass spectrum in order to subtract the non-resonant background, since only genuine dimuons will contribute to the resonant peak. This must be done separately for µ + µ − candidates in which the probe is associated (or not) with a track in the tracker.

Figure 11. Tracking efficiency (top) and fake rate (bottom) for tt̄ events simulated with and without superimposed pileup collisions. The number of pileup interactions superimposed on each simulated event is generated randomly from a Poisson distribution with mean value of 8. Plots are produced for the subset of tracks passing the high-purity quality requirements. The efficiency and fake rate plots cover |η| < 2.5. The efficiency results are for charged particles produced less than 3 cm (30 cm) from the centre of the beam spot in r (z) directions. The efficiency as a function of η is for generated particles with p T > 0.9 GeV.
The results of fits using the tag-and-probe method are shown for data and simulation in figure 13 as a function of the η of the probe, as well as the number of reconstructed primary vertices in the event. The measured tracking efficiency is >99% in both data and simulation. The data display a 0.3% drop in tracking efficiency with increasing pileup, which is not reproduced in the simulation. This may originate from the dynamic (pileup dependent) inefficiency of the pixel detector, discussed in section 3.3, which is not modelled in the simulation. The structure in the tracking efficiency when shown as a function of η is caused by inactive modules and residual misalignment of the tracker. As the figure shows, these detector conditions are well reproduced in simulation.

Figure 12. Cumulative contributions to the overall tracking performance from the six iterations in track reconstruction. The tracking efficiency for simulated tt̄ events is shown as a function of transverse distance (r) from the beam axis to the production point of each particle, for tracks with p T > 0.9 GeV and |η| < 2.5, transverse (longitudinal) impact parameter <60 (30) cm. The reconstructed tracks are required to pass the high-purity quality requirements.
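The last step of the tag-and-probe procedure, turning the two background-subtracted yields into an efficiency, can be sketched as follows. This is only an illustration: the function name, the made-up yields, and the simple binomial uncertainty are assumptions, and the mass fits that would provide the yields are not reproduced here.

```python
import math

def tracking_efficiency(n_pass, n_fail):
    """Efficiency from background-subtracted Z -> mu mu signal yields.

    n_pass: fitted signal yield for candidates whose probe muon is
            associated with a tracker track.
    n_fail: fitted signal yield for candidates whose probe muon is not.
    """
    n_total = n_pass + n_fail
    eff = n_pass / n_total
    # Simple binomial uncertainty (an assumption; a real analysis would
    # propagate the uncertainties of the two mass fits instead).
    err = math.sqrt(eff * (1.0 - eff) / n_total)
    return eff, err

# Hypothetical yields, chosen only to illustrate the arithmetic.
eff, err = tracking_efficiency(9950.0, 50.0)
print(f"efficiency = {eff:.4f} +/- {err:.4f}")
```

The two yields must come from separate fits to the passing and failing mass spectra, since the non-resonant background fraction generally differs between the two categories.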

Resolution in the track parameters
In the context of the reconstruction software of CMS, the five parameters used to describe a track are: d 0 , z 0 , φ , cot θ , and the p T of the track, defined at the point of closest approach of the track to the assumed beam axis. This point is called the impact point, with global coordinates (x 0 , y 0 , z 0 ). Thus d 0 and z 0 define the coordinates of the impact point in the radial and z directions (d 0 = −y 0 cos φ + x 0 sin φ ). The azimuthal and polar angles of the momentum vector of the track are denoted by φ and θ , respectively.
The resolution in the parameters is studied using simulated events, and estimated from track residuals, which are defined as the differences between the reconstructed track parameters and the corresponding parameters of the generated particles. For each of the five track parameters, the resolution is plotted as a function of the η or p T of the simulated charged particle. In every bin of η or p T , the distribution in track residuals defines the resolution as the half-width of the interval that satisfies both of the following requirements.
• The width contains 68% of all entries (including underflows and overflows) in the distribution of the residuals.
• The interval is centred on the most probable value (mode) of the residuals, where this value is taken from the peak of a double-tailed Crystal Ball function [49] fitted to the residuals. The function must provide different parametrizations of the tails on the left and right sides of the residuals distribution as, especially for electrons, the distribution can be very asymmetric.
For all resolution plots, we also provide a second measure of the resolution, defined such that the interval contains 90% of the track residuals. This quantifies the impact of the extreme values, whereas the resolutions for the 68% intervals represent the core of the distribution.
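The interval-based definition above can be sketched in a few lines. In this illustration the mode is passed in as an argument, standing in for the peak of the fitted double-tailed Crystal Ball function; the function name and the toy residuals are assumptions.

```python
import math

def half_width(residuals, mode, coverage=0.68):
    """Half-width of the interval centred on `mode` that contains at least
    `coverage` of all residuals (every entry counts, including outliers)."""
    distances = sorted(abs(r - mode) for r in residuals)
    k = math.ceil(coverage * len(distances))  # number of entries to enclose
    return distances[k - 1]

# Toy residuals: 101 equally spaced values, symmetric about zero.
residuals = list(range(-50, 51))
print(half_width(residuals, mode=0.0, coverage=0.68))  # core of the distribution
print(half_width(residuals, mode=0.0, coverage=0.90))  # sensitive to the tails
```

Because every entry is counted, a handful of badly reconstructed tracks widens the 90% interval far more than the 68% one, which is what makes the pair of numbers informative.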

Results from simulation of isolated particles
Muons do not undergo strong interactions, and therefore they tend to traverse the entire volume of the tracker, so the hits on their trajectories provide a long lever arm for reconstruction. Figure 14 shows the dependence on η of the resolution for the five track parameters, for isolated muons with p T = 1, 10, and 100 GeV. The same resolutions, but as a function of p T , are shown in figure 15. The resolutions in both the impact parameters and the angular parameters generally deteriorate for larger values of |η| because the extrapolation from the innermost hit to the beam axis, where the parameters are calculated, becomes larger. The resolutions in the transverse and longitudinal impact parameters d 0 and z 0 are shown in the first two plots of figures 14 and 15. At high momentum, the impact parameter resolution is dominated by the position resolution of the innermost hit in the pixel detector, while at lower momenta, the resolution is degraded progressively by multiple scattering. The improvement in z 0 resolution as |η| increases to 0.4 can be attributed to the beneficial effect of charge sharing in the estimation of position of pixel clusters (see figure 5); in the barrel, as the crossing angle for the tracks in the pixel layers increases, the clusters broaden, thereby distributing the signal over more than one pixel and improving the resolution in position. The resolutions in the φ and cot θ parameters, shown in the middle two panels of figures 14 and 15, behave similarly to those found for d 0 and z 0 , respectively, for similar reasons. However, as the contribution of the strip tracker to the measurement of φ and θ is important,

JINST 9 P10009
the influence of charge sharing in the pixel tracker is smaller. As a function of η, the resolutions in the four track parameters d 0 , z 0 , φ , and θ , are not exactly symmetric around η = 0. This effect is not caused by the tracker geometry, but is rather due to the noisy and dead channels of pixel and strip modules, whose defective components are taken into account in simulation to reproduce the condition of the real detector. The resolution in p T is shown in the bottom panel of figures 14 and 15. At high transverse momentum (100 GeV), the resolution is ≈2-3% up to |η| = 1.6, but deteriorates at higher |η| values, because of the shorter lever arm of these tracks in the x-y plane of the tracker. The degradation at |η| ≈ 1.0 and beyond is due to the gap between the barrel and the endcap disks (figure 1), and due to the inferior hit resolution of the last hits of the track measured in TEC ring 7 compared to the hit resolution in TOB layers 5 and 6 (table 2). At a transverse momentum of 100 GeV, the material in the tracker accounts for between 20 and 30% of the transverse momentum resolution; at lower momenta, the resolution is dominated by multiple scattering and its value reflects the amount of material traversed by the track. The relative precision in p T is measured to be best for tracks with p T ≈ 3 GeV.
Charged pions that do not undergo nuclear interactions behave similarly to muons, as they are subjected to the same multiple scattering effects and to the same mechanism of energy loss through ionization. The trajectories of this subset of pions are reconstructed using the CTF algorithm with a precision that is close to that achieved for muons, and therefore these trajectories populate the core of the distributions of residuals. The five plots in figure 16 show resolutions in the five track parameters as a function of η. As expected, the results are very close to those observed for muons in figure 14. However, the resolutions obtained for the 90% interval have a somewhat different pattern for muons than for pion tracks crossing the barrel-endcap transition region of the tracker. The residuals are generally larger for pions, as they can interact inelastically, and thereby fail to reach the outer layers of the tracking system. Their trajectories are measured therefore using smaller lever arms, with degraded resolutions.
Three of the track parameters (d 0 , φ and p T ) for electrons have very asymmetric residual distributions because of bremsstrahlung, and we therefore alter the definition of their resolution. The distribution in track residuals is split into two regions, separated at the mode of the distribution. Only one of these two regions contains long, non-Gaussian tails due to bremsstrahlung and the resolution is now redefined using only the distribution in this region. It is given by the width of an interval that starts at the mode of the distribution and is wide enough to include 68% of the entries in the region. A similar definition of the resolution corresponding to the width of a 90% probability interval is used to quantify the size of the non-Gaussian tails. Note that if the distribution of residuals had been symmetric, then the results obtained with these new definitions of the resolution would be identical to those that would have been obtained with the original definitions from the beginning of section 5.2. The other two parameters (cot θ and z 0 ) are less affected by bremsstrahlung, and we therefore continue to use the same definition of resolution as for muons and pions.
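A sketch of this one-sided definition follows. The mode and the side that carries the bremsstrahlung tail are supplied by the caller here, which is an assumption made for illustration; the function name is hypothetical.

```python
import math

def one_sided_width(residuals, mode, tail_side, coverage=0.68):
    """Width of the interval that starts at `mode`, extends into the region
    holding the non-Gaussian tail, and contains `coverage` of the entries
    in that region only."""
    if tail_side == "low":
        distances = sorted(mode - r for r in residuals if r <= mode)
    elif tail_side == "high":
        distances = sorted(r - mode for r in residuals if r >= mode)
    else:
        raise ValueError("tail_side must be 'low' or 'high'")
    k = math.ceil(coverage * len(distances))
    return distances[k - 1]

# Symmetric toy distribution: both sides give the same width.
symmetric = list(range(-50, 51))
print(one_sided_width(symmetric, 0.0, "low"), one_sided_width(symmetric, 0.0, "high"))
```

For this symmetric toy sample the result coincides with the half-width of the 68% interval centred on the mode, in line with the remark above about symmetric distributions.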
In figure 17, we show the resolutions in the d 0 , φ , and p T track parameters as a function of η for single, isolated electrons with simulated p T values of 10 and 100 GeV. These resolutions are calculated using both the standard CTF algorithm and the GSF algorithm, described in section 4.5.1. However, the GSF requirements described in section 4.5.1 for consistency of tracks with energy depositions in the ECAL were not applied, as they are beyond the scope of this discussion. Because the GSF algorithm handles bremsstrahlung better, both the 68% and the 90% resolutions are significantly improved relative to those obtained with CTF. Similar effects can also be observed for the resolution in the cot θ and z 0 parameters, as shown in figure 18. Figure 19 shows the bias in the reconstructed p T of electrons as a function of η. The bias is defined by the mode of the distribution of residuals. An alternative definition, based on the mean value of residuals, is also shown. The momenta are systematically underestimated by the CTF algorithm for electrons outside the barrel region. However, the bias is almost completely recovered using the GSF algorithm except for electrons with |η| > 2.0, where it is affected more severely by the large amount of material in the pixel endcaps.

Figure 16. Resolution, as a function of pseudorapidity, in the five track parameters for single, isolated pions with transverse momenta of 1, 10, and 100 GeV. From top to bottom and left to right: transverse and longitudinal impact parameters, φ , cot θ and p T . For each bin in η, the solid (open) symbols correspond to the half-width for 68% (90%) intervals centered on the mode of the distribution in residuals, as described in the text.

Results from simulated pp collision events
The resolutions for tracks in tt̄ events, with superimposed pileup interactions, are shown as a function of track p T in figure 20. For the five track parameters, the functional dependence is very similar to that observed for single particles (figure 15), except for p T beyond 20-30 GeV and η corresponding to the regions outside the tracker barrel.
The impact of pileup on these resolutions is generally negligible.

CPU execution time
Track reconstruction is, by far, the most computationally challenging part of CMS data reconstruction: for processing pp events with pileup, it requires almost as much CPU time as all the other reconstruction modules together. Furthermore, as the number of pileup events increases, the number of tracks increases in proportion, but the number of hit combinations that can be assembled into seeds and track candidates increases much more quickly, leading to a far more rapid increase in the required CPU time. The mean CPU time per event for reconstructing tracks is shown in table 6, broken down by tracking iteration and by computational step (track seeding, finding, fitting, etc). The CPU times are given for tt̄ events, simulated either without pileup or with an average of 8 pileup events. As the table shows, the presence of pileup significantly increases the total required CPU time, for example, by a factor of 2.4 for Iteration 0 and a factor of 8.6 for Iteration 1. Figure 21 shows the number of tracks per event reconstructed in each iteration. The presence of pileup increases the number of low-p T tracks, and as these are mainly reconstructed in Iterations 1-3, pileup has the biggest effect on these three iterations, thereby increasing both the number of tracks and the use of CPU time.
6 Beam spot and primary-vertex reconstruction and its performance

Primary-vertex reconstruction
The goal of primary-vertex reconstruction [50] is to measure the location, and the associated uncertainty, of all proton-proton interaction vertices in each event, including the 'signal' vertex and any vertices from pileup collisions, using the available reconstructed tracks. It consists of three steps: (i) selection of the tracks, (ii) clustering of the tracks that appear to originate from the same interaction vertex, and (iii) fitting for the position of each vertex using its associated tracks. Track selection involves choosing tracks consistent with being produced promptly in the primary interaction region, by imposing requirements on the maximum value of significance of the transverse impact parameter (<5) relative to the centre of the beam spot (which is reconstructed as described in section 6.3), the number of strip and pixel hits associated with a track (≥2 pixel layers, pixel+strip ≥5), and the normalized χ 2 from a fit to the trajectory (<20). To ensure high reconstruction efficiency, even for minimum-bias events, there is no requirement on the p T of the tracks. The selected tracks are then clustered on the basis of their z-coordinates at their point of closest approach to the centre of the beam spot. This clustering allows for the reconstruction of any number of proton-proton interactions in the same LHC bunch crossing. The clustering algorithm must balance the efficiency for resolving nearby vertices in cases of high pileup against the possibility of accidentally splitting a single, genuine interaction vertex into more than one cluster of tracks.
A simple 'gap clustering' algorithm was used in past reconstruction of the CMS data recorded in 2010 [43], with all tracks ordered according to the z-coordinate of their point of closest approach to the centre of the beam spot. Whenever two neighbouring elements in this ordered set of coordinates were separated by a gap exceeding a distance $z_\mathrm{sep} = 2$ mm, the gap was used to split the tracks on either side into separate vertices. Interaction vertices separated by less than $z_\mathrm{sep}$ were merged in this algorithm, making it a poor choice for high-pileup LHC conditions. Track clustering is therefore now performed using a deterministic annealing (DA) algorithm [51], which finds the global minimum for a problem with many degrees of freedom, in a way that is analogous to a physical system approaching a state of minimal energy through a series of gradual temperature reductions. The z-coordinates of the points of closest approach of the tracks to the centre of the beam spot are referred to as $z_i^T$, and their associated uncertainties as $\sigma_i^z$. The tracks must be assigned to some unknown number of vertices at positions $z_k^V$. 'Hard' assignments, where a track is assigned to one and only one vertex, can be represented by values of probability $p_{ik}$ that equal 1 if track i is assigned to vertex k, and 0 otherwise. In the DA framework, assignments are 'soft', meaning tracks can be associated with more than one vertex, with probability $p_{ik}$ between 0 and 1 that can be interpreted as the probability of the assignment of track i to vertex k in a large ensemble of possible assignments. Postulating that a priori every possible configuration is equally likely, this is analogous to calculations in statistical mechanics if the vertex χ 2 plays the role of the energy. The most probable vertex positions at "temperature" T follow from the minimization of the analogue of the free energy in statistical mechanics,

$$F = -T \sum_i^{\#\mathrm{tracks}} p_i \log \sum_k^{\#\mathrm{vertices}} \rho_k \exp\left[-\frac{1}{T}\,\frac{(z_i^T - z_k^V)^2}{(\sigma_i^z)^2}\right], \tag{6.1}$$

relative to the positions of the vertices $z_k^V$ with vertex weights $\rho_k$.
The sums run over the tracks i, and the set of vertices k that reflect the temperature T. Tracks enter with constant weights, $p_i$, reflecting their consistency with originating from the beam spot. The number of prototype vertices can be chosen to be arbitrarily large, but after minimizing F with respect to the $z_k^V$, many of the prototype positions coincide. Then a finite number of effective vertices emerge at distinct positions, independent of the number of prototypes. It is computationally more efficient to use those effective vertices with weights $\rho_k$ that correspond to the fraction of unweighted prototypes that coincide at position $z_k^V$. The weights are variable, but the sum is always constrained to unity. (This version of DA is called "mass-constrained clustering" in [51], because $\sum_k \rho_k = 1$.) The assignment probabilities are given by

$$p_{ik} = \frac{\rho_k \exp\left[-\frac{1}{T}\,\frac{(z_i^T - z_k^V)^2}{(\sigma_i^z)^2}\right]}{\sum_{k'} \rho_{k'} \exp\left[-\frac{1}{T}\,\frac{(z_i^T - z_{k'}^V)^2}{(\sigma_i^z)^2}\right]}, \tag{6.2}$$

where the resolutions $\sigma_i^z$ are effectively scaled by $\sqrt{T}$. At very high T, all $p_{ik}$ become equal, and all tracks become compatible with a single vertex. For T → 0 every track becomes compatible with exactly one vertex, resulting in hard assignment.
The DA algorithm is initiated at a very high temperature with a single vertex. T is gradually decreased, and $\partial F/\partial z_k^V = \partial F/\partial \rho_k = 0$ is implemented iteratively at each new temperature, starting with the result of the previous step in temperature. Because local minima are smeared out by the effective scaling of resolutions as a function of T, this procedure traces the global minimum of F from high to low temperature.
The number of vertices increases as the temperature falls, rising each time a minimum of F turns into a saddle point at lower temperature. This happens whenever T falls below the critical temperature of one of the vertices,

$$T_c^k = 2 \sum_i \frac{p_i\, p_{ik}}{(\sigma_i^z)^2} \left(\frac{z_i^T - z_k^V}{\sigma_i^z}\right)^2 \Bigg/ \sum_i \frac{p_i\, p_{ik}}{(\sigma_i^z)^2}. \tag{6.3}$$

When this happens, the vertex involved is replaced by two nearby vertices before the temperature is decreased again. The sum of the weights $\rho_k$ of the two resultant vertices is initially set equal to the weight of the parent. The DA process thereby finds not only positions and assignments of tracks to vertices but also the number of vertices. The starting temperature for the whole process is chosen to be above the first critical temperature, evaluated for $\rho_1 = p_{i1} = 1$. The temperature is decreased at every step by a cooling factor of 0.6. The 'annealing' is continued down to a minimum temperature of $T_\mathrm{min} = 4$, which represents a compromise between the resolving power and the possibility of incorrectly splitting true vertices.
Because of the inherently tentative assignment of tracks in the DA framework, there is a possibility that tracks can be assigned to multiple vertices. For the final assignment, the annealing is continued down to T = 1, but without more splitting.
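The annealing loop described above can be sketched with NumPy as follows. This is a minimal illustration under stated assumptions, not the CMS implementation: all track weights $p_i$ are set to 1, the stationarity conditions are solved by fixed-point iteration, a vertex is split whenever its critical temperature (twice the weighted mean of the track-vertex χ² contributions, as implied by the free energy) exceeds T, the split offset `delta` and the starting factor of 1.5 are arbitrary choices, and the function names are hypothetical.

```python
import numpy as np

def soft_assign(z, sigma, zv, rho, T):
    """Assignment probabilities p_ik and chi2_ik = ((z_i - zv_k) / sigma_i)^2."""
    chi2 = ((z[:, None] - zv[None, :]) / sigma[:, None]) ** 2
    # Subtract the per-track minimum before exponentiating (numerical stability).
    w = rho[None, :] * np.exp(-(chi2 - chi2.min(axis=1, keepdims=True)) / T)
    return w / w.sum(axis=1, keepdims=True), chi2

def relax(z, sigma, p, zv, rho, T, n_iter=1000, tol=1e-9):
    """Iterate dF/dz_k = dF/drho_k = 0 at fixed temperature T."""
    for _ in range(n_iter):
        pik, _ = soft_assign(z, sigma, zv, rho, T)
        w = p[:, None] * pik / sigma[:, None] ** 2
        zv_new = (w * z[:, None]).sum(axis=0) / w.sum(axis=0)
        rho = (p[:, None] * pik).sum(axis=0) / p.sum()
        converged = np.max(np.abs(zv_new - zv)) < tol
        zv = zv_new
        if converged:
            break
    return zv, rho

def critical_temperature(z, sigma, p, zv, rho, T):
    """Temperature below which each vertex becomes unstable against splitting."""
    pik, chi2 = soft_assign(z, sigma, zv, rho, T)
    w = p[:, None] * pik / sigma[:, None] ** 2
    return 2.0 * (w * chi2).sum(axis=0) / w.sum(axis=0)

def da_cluster(z, sigma, t_min=4.0, cooling=0.6, delta=1e-3):
    z, sigma = np.asarray(z, float), np.asarray(sigma, float)
    p = np.ones_like(z)                      # assumption: unit track weights
    zv = np.array([np.sum(z / sigma**2) / np.sum(1.0 / sigma**2)])
    rho = np.array([1.0])
    # Start above the first critical temperature (single vertex, p_i1 = 1).
    T = 1.5 * critical_temperature(z, sigma, p, zv, rho, 1.0)[0]
    while T > t_min:                         # annealing with splitting
        T = max(cooling * T, t_min)
        zv, rho = relax(z, sigma, p, zv, rho, T)
        unstable = critical_temperature(z, sigma, p, zv, rho, T) > T
        if unstable.any():
            # Replace each unstable vertex by two nearby ones sharing its weight.
            zv = np.concatenate([zv[~unstable], zv[unstable] - delta, zv[unstable] + delta])
            rho = np.concatenate([rho[~unstable], rho[unstable] / 2, rho[unstable] / 2])
            zv, rho = relax(z, sigma, p, zv, rho, T)
    while T > 1.0:                           # final cooling, no more splitting
        T = max(cooling * T, 1.0)
        zv, rho = relax(z, sigma, p, zv, rho, T)
    pik, _ = soft_assign(z, sigma, zv, rho, 1.0)
    order = np.argsort(zv)
    return zv[order], rho[order], pik[:, order]

# Toy input: two groups of five tracks around z = -1 and z = +1, sigma = 0.1.
offsets = np.array([-0.10, -0.05, 0.0, 0.05, 0.10])
z_tracks = np.concatenate([-1.0 + offsets, 1.0 + offsets])
zv, rho, pik = da_cluster(z_tracks, np.full(10, 0.1))
print(zv, rho)
```

In this toy case the single starting vertex splits once, when T first drops below its critical temperature, and the two resulting vertices stay stable down to the minimum temperature. Outlier rejection and the selection of vertex candidates are left out of the sketch.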
As described, the DA algorithm is not robust against outliers, such as secondary or mismeasured tracks. Above $T_\mathrm{min}$, outlier rejection competes with splitting, and is therefore not used. Below $T_\mathrm{min}$, an outlier rejection term $Z_i = \exp(-\mu^2/T)$ is added to the vertex sums in eq. (6.1), which acts as a cutoff for the assignment probabilities in the denominator of eq. (6.2). Tracks that are more than µ standard deviations away from the nearest vertex are down-weighted, and the algorithm becomes a one-dimensional robust adaptive multi-vertex fit [52]. The default value for the cutoff is µ = 4.
Outliers tend to create false vertices when other tracks, typically worse in resolution, are available nearby. Candidate vertices are therefore retained only if at least two of their tracks are incompatible with originating from other vertices. The tracks assigned to the rejected candidate vertices are not removed but reassigned to other vertices through another minimization of F. The outlier rejection term at this stage allows individual tracks to have low assignment probability to all remaining vertex candidates. A minimal probability of 0.5 is required for making the final assignment when T = 1 has been reached.
After identifying candidate vertices based on the DA clustering in z, those candidates containing at least two tracks are then fitted using an adaptive vertex fitter [53] to compute the best estimate of vertex parameters, including its x, y and z position and covariance matrix, as well as the indicators for the success of the fit, such as the number of degrees of freedom for the vertex, and weights of the tracks used in the vertex. In the adaptive vertex fit, each track in the vertex is assigned a weight between 0 and 1, which reflects the likelihood that it genuinely belongs to the vertex. Tracks that are consistent with the position of the reconstructed vertex have a weight close to 1, whereas tracks that lie more than a few standard deviations from the vertex have small weights. The number of degrees of freedom in the fit is defined as

$$n_\mathrm{dof} = -3 + 2 \sum_{i=1}^{\#\mathrm{tracks}} w_i, \tag{6.4}$$

where $w_i$ is the weight of the ith track, and the sum runs over all tracks associated with the vertex. The value of $n_\mathrm{dof}$ is therefore strongly correlated with the number of tracks compatible with arising from the interaction region. For this reason, $n_\mathrm{dof}$ can also be used to select true proton-proton interactions.

Primary-vertex resolution
The resolution in a reconstructed primary-vertex position depends strongly on the number of tracks used to fit the vertex and the p T of those tracks. In this section, we introduce a 'splitting method' for measuring the resolution as a function of the number of tracks emanating from a vertex. The tracks used in any given vertex are split equally into two sets. During the splitting procedure, the tracks are first sorted in descending order of p T , and then grouped in pairs starting from the track with the largest p T . For each pair, tracks are assigned randomly to one or the other set of tracks. This ensures that the two sets of tracks have, on average, the same kinematic properties. These two sets of tracks are then fitted independently with the adaptive vertex fitter. To extract the resolution, the distributions in the difference of the fitted vertex positions for a given number of tracks are fitted using a single Gaussian distribution, whose fitted RMS width is then divided by √ 2, because the two measurements of the vertex used in the difference have the same resolution. The range of the fit is constrained to be within twice the RMS of the distribution.
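The pairing step of the splitting method can be sketched as follows. The function names are illustrative, and dropping the unpaired lowest-p T track when the number of tracks is odd is an assumption made here, since the text does not say how that case is handled.

```python
import math
import random

def split_tracks(track_pts, rng):
    """Split track indices into two sets with similar kinematics.

    Tracks are sorted by descending pT and grouped in consecutive pairs;
    within each pair, one track goes to each set at random.
    """
    order = sorted(range(len(track_pts)), key=lambda i: track_pts[i], reverse=True)
    set_a, set_b = [], []
    for j in range(0, len(order) - 1, 2):
        pair = [order[j], order[j + 1]]
        rng.shuffle(pair)
        set_a.append(pair[0])
        set_b.append(pair[1])
    return set_a, set_b

def single_vertex_resolution(width_of_difference):
    """Resolution of one vertex, given the Gaussian width of the difference
    between two independent fits with equal resolution."""
    return width_of_difference / math.sqrt(2.0)

pts = [5.0, 1.0, 3.0, 0.5, 2.0, 8.0, 0.7]  # toy pT values in GeV
set_a, set_b = split_tracks(pts, random.Random(1))
print(set_a, set_b)
```

Each set would then be passed to the vertex fitter independently, and the width of the distribution of position differences converted with `single_vertex_resolution`.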
Results from a study of the primary-vertex resolution in x and z as a function of the number of tracks associated to the vertex, using both minimum-bias and jet-enriched data samples, are shown in figure 22. The resolution in y is almost identical to that in x, and is therefore omitted. The minimum-bias sample is collected from a suite of triggers requiring, for example, only a coincidence of signals from the Beam Scintillator Counters or minimal requirements on the hit or track multiplicity in the pixel detectors. The jet-enriched samples are produced by requiring each event to have a reconstructed jet with transverse energy E T > 20 GeV. The tracks in these events have significantly higher mean p T , resulting in better resolution in the track impact parameter and consequently better vertex resolution. For minimum-bias events, the resolutions in x and z are, respectively, less than 20 µm and 25 µm, for primary vertices reconstructed using at least 50 tracks. The resolution is better for the jet-enriched sample across the full range of the number of tracks used to fit the vertex, approaching 10 µm in x and 12 µm in z for primary vertices using at least 50 tracks. The primary-vertex resolution for the minimum-bias data from pp collisions has also been compared with simulated minimum-bias events (PYTHIA 6, Tune Z2 [54]), and found to be in excellent agreement. The difference between the measured vertex positions, divided by the sum of the contributions to the uncertainty from the fit, taken in quadrature, is referred to as the "pull." The standard deviation of the Gaussian function fitted to the pull distribution is roughly independent of the number of tracks at the vertex and is found to be approximately 0.93 in data and 0.90 in simulation, indicating that the position uncertainty from the fit to a vertex is slightly overestimated for both. This is consistent with the slightly overestimated track uncertainties observed in MC studies.

Efficiency of primary-vertex reconstruction
Given an input set of reconstructed tracks, the primary-vertex reconstruction efficiency is evaluated based on how often a vertex is reconstructed successfully and its position found consistent with the true value. Neither the tracking efficiency nor the probability to produce a minimal number of charged particles in a minimum-bias interaction is considered in the extraction of the efficiency for reconstruction of the vertex.
Just as in the measurement of the resolution, the efficiency for primary-vertex reconstruction depends strongly on the number of tracks in the cluster. The same splitting method described in the previous section can be used to also extract the reconstruction efficiency as a function of the number of tracks in the vertex cluster. In this implementation of the method, the tracks used at the vertex are sorted first in descending order of p T and then split into two different sets, such that two-thirds (one-third) of the tracks are randomly assigned as tag (probe) tracks. The asymmetric splitting is used to increase the number of vertices with a small number of tracks, where the efficiency is expected to be lowest. The sets of tag and probe tracks are then clustered and fitted independently to extract the vertex reconstruction efficiency. While each event is not entirely reclustered, the contribution to the efficiency from such clustering is not neglected, as the possibility still remains that a new cluster, using the reduced set of tracks following splitting, will not be formed. The effect of pileup on the measurement of the vertexing efficiency has been checked in simulation, and found to be small.
The efficiency is calculated based on the number of times the probe vertex is reconstructed and matched to the original vertex, given that the tag vertex is reconstructed and matched to the original vertex. A tag or probe vertex is considered to be matched to the original vertex if the tag or probe vertex position in z is within 5σ from the original vertex. The value of σ here is chosen to be the larger of the uncertainty in the fit to a vertex for the tag or probe tracks and the uncertainty in the original vertex. Figure 23 shows the efficiency of the primary-vertex reconstruction as a function of the number of tracks clustered in z. The results are obtained using the splitting method described above, applied to both minimum-bias data and to MC simulation, and show agreement between the two samples. The primary-vertex efficiency is estimated to be close to 100% when more than two tracks are used to reconstruct the vertex. The effect of pileup on the efficiency is checked using simulated minimum-bias events, with and without added pileup, and the loss of efficiency is found to be < 0.1% for the pileup with a mean value of 8.
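The matching criterion and the conditional efficiency described above can be sketched as follows. The tuple layout `(z, uncertainty)` and the function names are assumptions made for illustration; a vertex that was not reconstructed is represented by `None`.

```python
def is_matched(sub_vertex, original, n_sigma=5.0):
    """True if a tag or probe vertex lies within n_sigma of the original
    vertex in z, using the larger of the two uncertainties."""
    if sub_vertex is None:
        return False
    z_sub, err_sub = sub_vertex
    z_orig, err_orig = original
    return abs(z_sub - z_orig) <= n_sigma * max(err_sub, err_orig)

def vertex_efficiency(events):
    """Fraction of events with a matched probe vertex, among events in
    which the tag vertex is reconstructed and matched.

    events: iterable of (original, tag, probe) tuples.
    """
    n_tag = n_both = 0
    for original, tag, probe in events:
        if is_matched(tag, original):
            n_tag += 1
            if is_matched(probe, original):
                n_both += 1
    return n_both / n_tag

# Toy events (positions and uncertainties in cm, invented for illustration).
events = [
    ((0.0, 0.01), (0.01, 0.02), (0.03, 0.02)),  # tag and probe both matched
    ((1.0, 0.01), (1.00, 0.01), None),          # probe not reconstructed
    ((2.0, 0.01), None, (2.00, 0.01)),          # no tag: event not counted
    ((3.0, 0.01), (3.00, 0.01), (3.20, 0.02)),  # probe too far from original
]
print(vertex_efficiency(events))
```

Conditioning on the tag keeps events without a usable original vertex out of the denominator, so the result reflects only the clustering and fitting steps.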

Track and vertex reconstruction with the pixel detector
CMS has an independent reconstruction of tracks and primary vertices based purely on pixel hits. The pixel track reconstruction is extremely fast, because only three tracker layers are used, and the low occupancy and high 3-D spatial resolution of the pixel detector make it ideally suited to track finding. This speed makes pixel tracks and pixel primary vertices valuable tools for the HLT.
Pixel tracks are formed in the same fashion as the pixel triplets, described in section 4.1, requiring p T > 0.9 GeV. Vertex finding using pixel tracks provides a simple and efficient method for measuring the position of the primary vertex. The clustering of tracks is performed using a gap clustering algorithm, with vertex candidates having at least two tracks fitted using an adaptive vertex fit, as described in section 6.1.
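As an illustration of gap clustering in z, the following Python sketch orders the track z positions and starts a new cluster whenever the distance between neighbouring tracks exceeds a threshold. The function names and the `max_gap` parameter are hypothetical; the threshold used by CMS is not stated in this section, and the adaptive vertex fit applied to each candidate is not shown.

```python
def gap_cluster(z_values, max_gap):
    """Group track z positions into clusters separated by gaps > max_gap."""
    clusters = []
    for z in sorted(z_values):
        if clusters and z - clusters[-1][-1] <= max_gap:
            clusters[-1].append(z)   # close enough to the previous track
        else:
            clusters.append([z])     # gap too large: start a new cluster
    return clusters


def vertex_candidates(z_values, max_gap, min_tracks=2):
    """Keep only clusters with at least min_tracks tracks."""
    return [c for c in gap_cluster(z_values, max_gap) if len(c) >= min_tracks]
```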
The great speed with which pixel tracks and pixel primary vertices can be reconstructed also makes them a useful tool for many algorithms in the HLT. For example, counting the number of pixel tracks near a lepton can help determine if the lepton is isolated. Similarly, measuring the impact parameter of pixel tracks relative to their vertex can be used to identify the displaced tracks expected from b-hadron decays.
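A track-counting isolation of this kind could be sketched in Python as follows. The cone size of 0.3 in ΔR, the function names, and the `(eta, phi)` tuple layout are assumptions for illustration; the sketch also assumes that the pixel track of the lepton itself has already been removed from the list.

```python
import math


def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance, with the azimuthal difference wrapped to [-pi, pi)."""
    dphi = (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(eta1 - eta2, dphi)


def count_tracks_in_cone(lepton, tracks, cone_size=0.3):
    """Count pixel tracks within cone_size of the lepton direction.

    lepton: (eta, phi); tracks: iterable of (eta, phi) tuples.
    """
    eta_l, phi_l = lepton
    return sum(1 for eta, phi in tracks
               if delta_r(eta_l, phi_l, eta, phi) < cone_size)
```

A lepton would then be called isolated if the count is below a threshold chosen for the trigger path in question.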

Tracking efficiency and fake rate for pixel tracks
The reconstruction efficiency of pixel tracks is estimated by comparing the reconstructed tracks with the particles generated in simulation. Since pixel tracks have only three hits, all three hits are required to be produced by the same simulated particle for the pixel track and the simulated particle to be associated. The efficiency for reconstructing a particle as a pixel track is defined as the fraction of simulated particles that can be associated with a reconstructed pixel track. The fake rate is defined as the fraction of reconstructed tracks that are not associated with any simulated particle.
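These two definitions can be made concrete with a small Python sketch. Each reconstructed pixel track is represented here by the identifiers of the simulated particles that produced its three hits, with `None` standing for a hit not produced by any simulated particle; this representation and the function names are hypothetical.

```python
def associated_particle(hit_particle_ids):
    """Return the simulated-particle id shared by all three hits, else None."""
    first = hit_particle_ids[0]
    if first is not None and all(pid == first for pid in hit_particle_ids):
        return first
    return None


def efficiency_and_fake_rate(sim_particle_ids, reco_tracks):
    """Compute (efficiency, fake rate) from hit-level truth information.

    sim_particle_ids: ids of the simulated particles in the denominator.
    reco_tracks: list of 3-tuples of particle ids, one id per hit.
    """
    sim = set(sim_particle_ids)
    matched = [associated_particle(trk) for trk in reco_tracks]
    found = {pid for pid in matched if pid is not None}
    efficiency = len(found & sim) / len(sim) if sim else float("nan")
    n_fake = sum(1 for pid in matched if pid is None)
    fake_rate = n_fake / len(reco_tracks) if reco_tracks else float("nan")
    return efficiency, fake_rate
```

A simulated particle reconstructed twice counts only once in the efficiency, because the numerator is built from the set of associated particles.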
The top-left and top-right plots in figure 24 show the dependence of the measured pixel track efficiency on the simulated track η and p T , for inclusive tt̄ events with and without superimposed pileup (where the mean number of pileup interactions is 8, as mentioned in section 5). The maximum efficiency for the pixel tracks is ∼85%. The ∼15% inefficiency arises mainly from the presence of defective pixel modules (∼2.4% of the readout chips in the CMS pixel detector are inoperative) and geometric inefficiency. The asymmetry between positive and negative η reflects the non-uniform distribution of the affected pixel modules. In the top-right plot of figure 24, the efficiency drops at low p T because of the p T > 0.9 GeV requirement on pixel tracks. Figure 24 also shows that the addition of pileup events leads to only a small loss in efficiency.
The bottom-left and bottom-right plots in figure 24 show the fake rate as a function of η and p T , both with and without pileup. As observed for the full tracking algorithms in section 5.1, the fake rate increases significantly with |η| and p T . The effect of pileup is also clearly visible, as the fake rate increases by 50% with high pileup.

Resolution in the parameters of pixel tracks
The resolutions in transverse and longitudinal impact parameters d 0 and d z can be extracted from simulated events in the same way as in section 5.2. Figure 25 shows the resolutions for the five pixel track parameters as a function of pixel track p T , including the effect of pileup. The distributions are similar in form to those shown for standard tracking in figure 20, but the resolution is somewhat poorer. The pixel track resolution in p T degrades by over 30% for track p T > 10 GeV.

Position resolution for pixel based vertices
The position resolution for pixel vertices is extracted using the same method used to measure primary-vertex resolutions in section 6.1.1 (split method). Figure 26 shows the measured resolution in x (left) and z (right) as a function of the number of tracks, using both minimum-bias and jet-enriched data. (The resolution in y is almost identical to that in x, and hence it is omitted.) The resolution is better for the jet-enriched sample, across the full range of associated tracks used to reconstruct the pixel vertex. For example, with 50 tracks, the x resolution of a pixel vertex is 30 µm for the minimum-bias sample, compared to 25 µm for the jet-enriched sample. This is due to the fact that tracks in the jet-enriched data have higher mean p T compared to those in the minimum-bias sample. As before, the pixel vertex resolution in the minimum-bias data has also been compared with that in simulated minimum-bias events and again found to be in good agreement.

Figure 24. Pixel tracking efficiency (top) and fake rate (bottom) for tt̄ events simulated with and without superimposed pileup collisions. The number of pileup interactions superimposed on each simulated event is randomly generated from a Poisson distribution with mean equal to 8. The two plots of efficiency and fake rate as a function of pseudorapidity are produced applying a p T > 0.9 GeV selection.

Reconstruction of the LHC beam spot
The beam spot represents a 3-D profile of the luminous region, where the LHC beams collide in the CMS detector. The beam spot parameters are determined from an average over many events, in contrast to the event-by-event primary vertex that gives the precise position of a single collision. Measurements of the centre and dependence of the luminous region on r and z are important components of event reconstruction. The position of the centre of the beam spot, corresponding to the centre of the luminous region, is used, especially in the HLT, (i) to estimate the position of the interaction point prior to the reconstruction of the primary vertex; (ii) to provide an additional constraint in the reconstruction of all the primary vertices of an event; and (iii) to provide the primary interaction point in the full reconstruction of low-multiplicity data.

Determination of the position of the centre of the beam spot
The position of the centre of the beam spot can be determined in two ways. The first method is through the reconstruction of primary vertices (see section 6.1), which map out the collisions as a function of x, y, and z, and therefore the shape of the beam spot. The mean position in x, y, and z, and the size of the luminous region can be determined through a fit of a likelihood to the 3-D distribution of vertex positions. The second method utilises a correlation between d 0 and φ that appears when the centre of the beam spot is displaced relative to its expected position. The d 0 for tracks coming from a primary vertex can be parametrized as:
$$ d_0(\phi, z_p) = x_{BS} \sin\phi + \frac{dx}{dz} \sin\phi \, (z_p - z_{BS}) - y_{BS} \cos\phi - \frac{dy}{dz} \cos\phi \, (z_p - z_{BS}), $$
where φ and z p are the azimuthal angle and z coordinate of the track, x BS and y BS are the x and y positions of the beam at z = z BS (the centre of the beam spot along the beam direction), and dx/dz and dy/dz are the derivatives (slopes) of x and y relative to z. The fit of the beam spot [55] uses an iterative χ 2 fit to utilise this correlation between d 0 and φ . With a sample of 1000 tracks, the position can be determined with a statistical precision of approximately 5 µm. The two methods have been checked against each other, and provide consistent results.
The precision of the d 0 -φ fit is better in lower-multiplicity events; however, the width and length of the luminous region cannot be obtained with the same algorithm. Therefore, a combination of the two methods is used to measure the beam spot used in the full reconstruction of each event. The d 0 -φ fit is used to determine the centre of the luminous region in the transverse plane, (x BS , y BS ), and the slopes, dx/dz and dy/dz; while z BS and the RMS widths of the luminous region σ x , σ y , and σ z are all determined from the fit to the 3-D vertex distribution. The beam spot is determined in every luminosity section (LS), corresponding to the events collected during a period of 23 seconds. When the results from all LS intervals of a run are available, they are combined to extract the final beam spot values.
A weighted average is performed, with a check implemented to assure that no significant shift occurred in the parameters that might indicate a movement of the beam spot. To protect against slow drifts of the beam, no more than 60 consecutive LS are combined at a time. Figure 27 shows the fitted positions as a function of time for LHC fills during early 2011. The results demonstrate that, within a fill, the position is quite stable, while occasionally there are larger shifts between fills.
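The principle of the d 0 -φ method can be illustrated with a short Python sketch. It assumes the linear form d 0 = x BS sin φ + (dx/dz) sin φ (z − z BS ) − y BS cos φ − (dy/dz) cos φ (z − z BS ), including its sign convention, and performs a single unweighted least-squares pass. The fit described in the text is an iterative χ 2 fit, so this is a simplified stand-in and not the CMS implementation. The function names and the `(d0, phi, z)` track layout are hypothetical.

```python
import math


def solve_linear(matrix, rhs):
    """Solve a small dense linear system by Gaussian elimination with pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    solution = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * solution[c] for c in range(r + 1, n))
        solution[r] = (aug[r][n] - tail) / aug[r][r]
    return solution


def fit_beam_spot(tracks, z_bs=0.0):
    """Unweighted least-squares estimate of (x_bs, y_bs, dxdz, dydz).

    tracks: iterable of (d0, phi, z) tuples (hypothetical layout).
    z_bs is taken as known, e.g. from the 3-D vertex fit.
    """
    ata = [[0.0] * 4 for _ in range(4)]
    atb = [0.0] * 4
    for d0, phi, z in tracks:
        s, c, dz = math.sin(phi), math.cos(phi), z - z_bs
        # Derivatives of the assumed d0 model with respect to each parameter.
        row = [s, -c, s * dz, -c * dz]
        for i in range(4):
            atb[i] += row[i] * d0
            for j in range(4):
                ata[i][j] += row[i] * row[j]
    x_bs, y_bs, dxdz, dydz = solve_linear(ata, atb)
    return x_bs, y_bs, dxdz, dydz
```

Because the model is linear in the four parameters, the normal equations can be solved directly; a production fit would in addition weight each track by its d 0 uncertainty and iterate to reject outliers.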

Determining the size of the beam spot
The size of the luminous region is also determined with two methods. The first one is based on the reconstructed primary-vertex distribution, where the values of σ are obtained through the likelihood fit described above. The second method, described below, which measures only the transverse size, is based on event-by-event correlations between the transverse impact parameters of two tracks originating from the same vertex.
The displacement of the interaction point within the interaction region introduces a common displacement of trajectories of all particles from the interaction. This shift of the trajectories produces a correlation between the transverse impact parameters of tracks relative to the nominal beam position. The strength of the correlation reflects the transverse size of the beam. The correlation between the transverse impact parameters of two tracks from the vertex of one interaction (labelled (1) and (2)) can be expressed by the expectation value
$$ \langle d_0^{(1)} d_0^{(2)} \rangle = \frac{\sigma_x^2 + \sigma_y^2}{2} \cos(\phi_1 - \phi_2) + \frac{\sigma_y^2 - \sigma_x^2}{2} \cos(\phi_1 + \phi_2), \qquad (6.6) $$
where φ 1 and φ 2 are the azimuthal angles of the tracks measured at the point of closest approach to the beam. A particular feature of this correlation is that its size is independent of the resolutions in vertex positions and impact parameters, and therefore corrections to remove contributions from the resolution are not required. Assuming no correlation between φ 1 − φ 2 and φ 1 + φ 2 , the coefficients in eq. (6.6) can be obtained through the slopes of straight lines fitted to the respective dependence of d 0 (1) d 0 (2) on cos(φ 1 − φ 2 ) and cos(φ 1 + φ 2 ).
Both methods have been used to extract σ x and σ y and the results averaged over an LHC fill are found to be consistent to 2-3 µm [43]. Figure 28 shows σ x , σ y , and σ z as a function of time for LHC fills in early 2011, obtained using the likelihood fit to the primary-vertex distribution. The size of the beam grows with time during each fill, reflecting the growth of the beam emittance. The emittance growth has been directly observed with dedicated instrumentation and correlated with real-time measurements of the beam size [56].
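The extraction of the transverse widths from impact-parameter correlations can be sketched in Python. The sketch assumes that the expectation value of the product of the two impact parameters has the form A cos(φ 1 − φ 2 ) + B cos(φ 1 + φ 2 ), with A = (σ x ² + σ y ²)/2 and B = (σ y ² − σ x ²)/2, which follows if each d 0 is a projection of a common vertex displacement with uncorrelated x and y components. Instead of two separate straight-line fits, it performs one joint least-squares fit of both coefficients, which gives the same result when the two cosine terms are uncorrelated. The function name and the tuple layout are hypothetical.

```python
import math


def fit_beam_widths(pairs):
    """Estimate (sigma_x, sigma_y) from two-track impact-parameter products.

    pairs: iterable of (d0_1, d0_2, phi_1, phi_2), with d0 measured
    relative to the nominal beam position (hypothetical layout).
    """
    s_mm = s_pp = s_mp = s_my = s_py = 0.0
    for d1, d2, phi1, phi2 in pairs:
        c_minus = math.cos(phi1 - phi2)
        c_plus = math.cos(phi1 + phi2)
        prod = d1 * d2
        s_mm += c_minus * c_minus
        s_pp += c_plus * c_plus
        s_mp += c_minus * c_plus
        s_my += c_minus * prod
        s_py += c_plus * prod
    # Solve the 2x2 normal equations for the two coefficients.
    det = s_mm * s_pp - s_mp * s_mp
    a = (s_my * s_pp - s_py * s_mp) / det  # (sigma_x^2 + sigma_y^2) / 2
    b = (s_py * s_mm - s_my * s_mp) / det  # (sigma_y^2 - sigma_x^2) / 2
    # Clamp at zero: statistical fluctuations can make a variance negative.
    return math.sqrt(max(a - b, 0.0)), math.sqrt(max(a + b, 0.0))
```

The measurement errors of the two tracks are independent and average to zero in the product, which is why no resolution correction enters the fit.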

JINST 9 P10009
Summary and conclusions
CMS has developed sophisticated tracking and vertexing software algorithms, based on the Kalman filter, the Gaussian sum filter, and the deterministic annealing filter, to reconstruct the proton-proton collision data provided by the CMS silicon tracker. The implementation of these algorithms has been optimized for computational efficiency, required to keep up with the high data rates recorded using the CMS apparatus. The flexibility of this software is evident from the fact that, with only a few changes, it has been adapted to provide the fast tracking needed for the CMS high-level trigger, which processes events at rates of up to 100 kHz. Furthermore, a dedicated version of the software that accommodates bremsstrahlung energy loss in the tracker material is used to reconstruct electrons.
The tracking algorithms reconstruct tracks over the full pseudorapidity range of the tracker |η| < 2.5, finding charged particles with p T as low as 0.1 GeV, or produced as far as 60 cm from the beam line (such as pions from K 0 decay). Promptly produced, isolated muons of p T > 0.9 GeV are reconstructed with essentially 100% efficiency for |η| < 2.4. In the central region (|η| < 1.4), where the resolution is best, muons of p T = 100 GeV have resolutions of approximately 2.8% in p T , and 10 and 30 µm in transverse and longitudinal impact parameter, respectively.
For prompt, charged particles of p T > 0.9 GeV in simulated tt̄ events, under typical 2011 LHC pileup conditions, the average track-reconstruction efficiency is 94% in the barrel region (|η| < 0.9) of the tracker and 85% at higher pseudorapidity. Most of the inefficiency is caused by hadrons undergoing nuclear interactions in the tracker material. In the same p T range, the fraction of falsely reconstructed tracks is at the few percent level. In the central region, tracks with 1 < p T < 10 GeV have a resolution in p T of approximately 1.5%. The resolution in their transverse (longitudinal) impact parameters improves from 90 µm (150 µm) at p T = 1 GeV to 25 µm (45 µm) at p T = 10 GeV. In this momentum range, the resolution in the track parameters is dominated by multiple scattering.
Tracks are used to reconstruct the primary interaction vertices in each event. For vertices with many tracks, characteristic of interesting events, the achieved vertex position resolution is 10-12 µm in each of the three spatial dimensions.
When the LHC was first proposed, it was not at all certain that tracking of such high quality could be achieved. To make this possible, the CMS collaboration elected to build the world's largest all-silicon tracker, which would provide a relatively small number of high precision hit position measurements, and immersed it in a powerful coaxial magnetic field. The collaboration then devoted many years to the development and study of different tracking algorithms, before finally selecting the ones described in this paper. For example, it was thought initially that track finding should be seeded using hits in the outer layers of the tracker, where the channel occupancy is relatively low. Only later was it broadly appreciated that the pixel tracker is much better for this purpose, with its high granularity giving it excellent resolution in three dimensions and an even lower channel occupancy, despite the high track density. The CMS track and primary-vertex reconstruction software has already achieved or surpassed the performance levels predicted at the time that the tracker was originally designed [21]. Evolution and refinement of tracking and vertexing algorithms will continue in the future, in order to meet the challenges of ever increasing LHC luminosity.