Measurements of t t-bar charge asymmetry using dilepton final states in pp collisions at sqrt(s) = 8 TeV

The charge asymmetry in t t-bar events is measured using dilepton final states produced in pp collisions at the LHC at sqrt(s) = 8 TeV. The data sample, collected with the CMS detector, corresponds to an integrated luminosity of 19.5 inverse femtobarns. The measurements are performed using events with two oppositely charged leptons (electrons or muons) and two or more jets, where at least one of the jets is identified as originating from a bottom quark. The charge asymmetry is measured from differences in kinematic distributions, unfolded to the parton level, of positively and negatively charged top quarks and leptons. The t t-bar and leptonic charge asymmetries are found to be 0.011 +/- 0.011 (stat) +/- 0.007 (syst) and 0.003 +/- 0.006 (stat) +/- 0.003 (syst), respectively. These results, as well as charge asymmetry measurements made as a function of the invariant mass, rapidity, and transverse momentum of the t t-bar system, are in agreement with predictions of the standard model.


Introduction
The exceptionally large mass of the top quark, measured by this experiment as m_t = 172.44 ± 0.48 GeV [1], suggests the top quark could have an important connection to physics beyond the standard model (SM), particularly in the mechanism of electroweak (EW) symmetry breaking. Precision measurements of top quark properties have the potential to identify the first hints of new particles, particularly those with stronger couplings to top quarks than to other fundamental particles. The SM predicts a charge asymmetry in tt̄ production at hadron colliders through quark-antiquark annihilation. This asymmetry is caused by the interference between the Born and the box diagrams, as well as between the initial- and final-state radiation diagrams, and is predicted by quantum chromodynamics (QCD) calculations at next-to-leading order (NLO) [2,3]. Early measurements of this asymmetry by the CDF [4] and D0 [5] collaborations exceeded the NLO predictions [2,3] by about two standard deviations, and the discrepancy was more pronounced in the CDF events with large tt̄ invariant mass (M_tt̄ > 450 GeV). These results have led to considerations that the anomalous asymmetry might be generated by tree-level exchanges of new particles or by interference effects from new physics at higher mass scales, not directly observable at the LHC [6]. Recent developments in experimental techniques [7,8] and theoretical predictions such as the inclusion of EW [9-12] and next-to-next-to-leading-order (NNLO) QCD [13,14] corrections have largely resolved the disagreement between theory and the Tevatron measurements. Nonetheless, the charge asymmetry remains an important probe of new physics.
At the Tevatron, colliding valence quarks from the proton and antiproton beams result in asymmetric rapidity (y) distributions of top quarks and antiquarks. The proton-proton (pp) initial state at the LHC is expected to produce top quark and antiquark rapidity distributions that are symmetric about y = 0. However, since the quarks in the initial state can be valence quarks, while the antiquarks come from the sea, the larger average momentum fraction of quarks leads to an excess of top quarks produced in the forward directions. The rapidity distribution of top quarks in the SM is therefore broader than that of the more centrally produced top antiquarks, meaning ∆|y_t| = |y_t| − |y_t̄| is a suitable observable to measure the tt̄ charge asymmetry, defined in terms of event yields N as
\[ A_C = \frac{N(\Delta|y_t| > 0) - N(\Delta|y_t| < 0)}{N(\Delta|y_t| > 0) + N(\Delta|y_t| < 0)}. \]
While the measurement of A_C relies on the reconstruction of the top quark and antiquark directions, an advantage of the dilepton final state is that one can alternatively measure the leptonic charge asymmetry defined using only the lepton pseudorapidities [15] η_ℓ± as
\[ A_C^{\mathrm{lep}} = \frac{N(\Delta|\eta_\ell| > 0) - N(\Delta|\eta_\ell| < 0)}{N(\Delta|\eta_\ell| > 0) + N(\Delta|\eta_\ell| < 0)}, \]
where ∆|η_ℓ| = |η_ℓ+| − |η_ℓ−|. This observable is useful because it is free of the ambiguities associated with the top quark reconstruction, and because the correlation between the direction of a top quark and its decay products transmits an asymmetry in the parent top quarks to the daughter leptons. Furthermore, its dependence on the top quark polarization implies that it is not fully correlated with A_C and provides complementary information.
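As an illustration of the counting definition of the asymmetry, the following minimal sketch computes an asymmetry and a simple binomial statistical uncertainty from per-event differences. The function name, the toy rapidities, and the choice to ignore events with a difference of exactly zero are assumptions made for this example; the measurement itself uses unfolded distributions rather than raw counts.

```python
import numpy as np

def charge_asymmetry(delta):
    """Counting asymmetry and its binomial statistical uncertainty.

    `delta` holds one value per event, e.g. |y_t| - |y_tbar| or
    |eta_l+| - |eta_l-|. Events with delta == 0 are ignored (an assumption).
    """
    delta = np.asarray(delta, dtype=float)
    n_pos = int(np.count_nonzero(delta > 0))
    n_neg = int(np.count_nonzero(delta < 0))
    total = n_pos + n_neg
    if total == 0:
        raise ValueError("need at least one event with a non-zero difference")
    asym = (n_pos - n_neg) / total
    # For unweighted events, Var(A) = (1 - A^2) / N.
    stat_unc = float(np.sqrt((1.0 - asym**2) / total))
    return asym, stat_unc

# Toy rapidities, made up purely for illustration.
y_top = np.array([1.2, -0.3, 0.8, -2.0])
y_antitop = np.array([0.5, 0.9, -0.2, 1.0])
a_c, a_c_unc = charge_asymmetry(np.abs(y_top) - np.abs(y_antitop))
print(f"A_C = {a_c:.3f} +/- {a_c_unc:.3f}")
```

The same function applies unchanged to the leptonic asymmetry when it is given the difference of the absolute lepton pseudorapidities.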

Event selection and reconstruction
The event selection for this analysis is identical to that used in Ref. [25] and is only briefly described in this section. The particle-flow (PF) method [26, 27] is used to reconstruct final-state particles. Events are required to have exactly two isolated [25] leptons (electrons [28] or muons [29]) of opposite electric charge, with p_T > 20 GeV and |η| < 2.4. The dilepton pair invariant mass M_ℓℓ is required to be above 20 GeV. For same-flavor leptons, M_ℓℓ must also not be within 15 GeV of the Z boson mass to suppress the Drell-Yan (Z/γ*+jets) background.
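The dilepton requirements can be sketched as a small selection function. This is only an illustration: the `Lepton` class, the function names, the massless-lepton mass formula, and the numerical Z boson mass are assumptions of this example, and the isolation criteria of Ref. [25] are taken as already applied to the input list.

```python
import math
from dataclasses import dataclass

Z_MASS = 91.19  # GeV, approximate Z boson mass (assumed value, not from the text)

@dataclass
class Lepton:
    pt: float     # transverse momentum in GeV
    eta: float    # pseudorapidity
    phi: float    # azimuthal angle in radians
    charge: int   # +1 or -1
    flavor: str   # "e" or "mu"

def dilepton_mass(a, b):
    """Invariant mass of two leptons, treating both as massless."""
    return math.sqrt(
        2.0 * a.pt * b.pt * (math.cosh(a.eta - b.eta) - math.cos(a.phi - b.phi))
    )

def passes_dilepton_selection(isolated_leptons):
    """Apply the kinematic, charge, and mass requirements quoted in the text."""
    good = [l for l in isolated_leptons if l.pt > 20.0 and abs(l.eta) < 2.4]
    if len(good) != 2:
        return False
    first, second = good
    if first.charge * second.charge >= 0:
        return False  # leptons must have opposite charge
    mass = dilepton_mass(first, second)
    if mass <= 20.0:
        return False
    # Z veto applies to same-flavor pairs only.
    if first.flavor == second.flavor and abs(mass - Z_MASS) < 15.0:
        return False
    return True
```

For example, a back-to-back pair with p_T = 45.6 GeV each has a mass of about 91.2 GeV, so it is rejected as an electron pair but kept as an electron-muon pair.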
The anti-k_T clustering algorithm [30] with a distance parameter of 0.5 is used to form jets from the PF objects. The contribution to the jet energy from additional interactions in the same bunch crossing (pileup) is estimated for each event using the jet area method [31], and is subtracted from the overall jet p_T. At least two jets with p_T > 30 GeV and |η| < 2.4 are required in each event. At least one of these jets must be consistent with containing the decay of a heavy-flavor hadron, as identified using the medium operating point of the combined secondary vertex (CSV) b tagging algorithm [32]. We refer to such jets as b-tagged jets.
The missing transverse momentum vector p_T^miss is defined as the negative vector sum of the p_T of all PF objects over the full calorimeter coverage (|η| < 5). Its magnitude is referred to as E_T^miss. The calibrations that are applied to the energy measurements of jets are propagated to a correction of p_T^miss. The E_T^miss value is required to exceed 40 GeV in events with same-flavor leptons in order to further suppress the Drell-Yan background. There is no E_T^miss requirement for e±µ∓ events.
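The definition of the missing transverse momentum as a negative vector sum can be illustrated with a short sketch. The function names and the array-based interface are assumptions of this example, and the propagation of jet energy calibrations mentioned above is not modeled.

```python
import numpy as np

def missing_transverse_momentum(pt, phi):
    """Return (magnitude, azimuth) of the negative vector sum of the inputs.

    `pt` and `phi` are per-particle transverse momenta (GeV) and azimuthal
    angles (radians) of the reconstructed particles in one event.
    """
    pt = np.asarray(pt, dtype=float)
    phi = np.asarray(phi, dtype=float)
    px = -np.sum(pt * np.cos(phi))
    py = -np.sum(pt * np.sin(phi))
    return float(np.hypot(px, py)), float(np.arctan2(py, px))

def passes_met_requirement(met, same_flavor):
    """Require more than 40 GeV for same-flavor events only."""
    return met > 40.0 if same_flavor else True
```

A single 50 GeV particle gives a magnitude of 50 GeV pointing in the opposite direction, while two back-to-back particles of equal p_T balance to approximately zero.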
The inclusive measurement of A_C and all differential measurements presented here require reconstruction of the tt̄ system. Each signal event has two neutrinos, and there is also a twofold ambiguity in combining the b jets with the leptons. In 62% of the events passing the event selection requirements, only one of the selected jets is b tagged. In those events the untagged jet with the highest ranking by the CSV algorithm is assumed to be the second b jet. Solutions for the neutrino momenta are found analytically assuming m_t = 172.5 GeV. Each event can have up to eight possible solutions, and the one with the maximum weight obtained using the matrix weighting technique [33] is chosen as the most probable. For events with no physical solution, we attempt to find a solution for the sum of neutrino p_T as close as possible to the measured p_T^miss [34,35]. Nonetheless, no solution is found for approximately 16% of the events, both in data and simulation. Events with no solutions are used only in the inclusive measurement of A_C^lep, although the results do not significantly change if those events are excluded. The signs of ∆|y_t| and ∆|η_ℓ| are correctly reconstructed in 74.9% and 99.5% of selected simulated tt̄ events, respectively.

Event samples and background estimation
The simulated tt̄ events used in this analysis are generated using the MC@NLO 3.41 [36,37] Monte Carlo (MC) event generator, with m_t = 172.5 GeV and the CTEQ6M parton distribution functions (PDFs) [38]. The subsequent parton showering and fragmentation are done using HERWIG 6.520 [39]. Simulations with different values of m_t and the renormalization and factorization scales (µ_R and µ_F) are used to evaluate the associated systematic uncertainties. Events with dileptonic tt̄ decays, including tau leptons that decay leptonically, are defined as signal, while all other tt̄ decay modes are treated as background. Background events from the W+jets, Drell-Yan, diboson (WW, WZ, and ZZ), triboson, and tt̄+boson processes are generated with MADGRAPH 5.1.3.30 [40,41], while single top quark events are generated using POWHEG 1.0 [42-46]. The parton showering and fragmentation are performed using PYTHIA 6.4.22 [47], which is also used for an alternative tt̄ event sample generated using POWHEG. Cross sections calculated to NLO or NNLO are used to normalize the background samples [48-56].
For all MC generated events, pileup is simulated with PYTHIA and superimposed on the hard collisions using a pileup multiplicity distribution that reflects the luminosity profile of the analyzed data. The CMS detector response is simulated using a GEANT4-based model [57], and the events are reconstructed and analyzed with the same software used to process the data. The measured trigger efficiencies are used to weight the simulated events to account for the trigger requirement, while the lepton selection efficiencies (reconstruction, identification, and isolation) are consistent between data and simulation [25, 58]. The differences between b tagging efficiencies measured in data and simulation [32] are accounted for using correction factors.
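One common way to make a simulated pileup multiplicity distribution reflect the one in data is per-event reweighting by the ratio of the normalized distributions. The sketch below illustrates that idea only; the text does not state that this exact procedure was used, and the function name and toy histograms are assumptions.

```python
import numpy as np

def pileup_weights(data_hist, mc_hist):
    """Per-multiplicity weights that map the MC pileup shape onto the data shape."""
    data = np.asarray(data_hist, dtype=float)
    mc = np.asarray(mc_hist, dtype=float)
    data_pdf = data / data.sum()
    mc_pdf = mc / mc.sum()
    # Bins with no simulated events get weight 0 instead of a division by zero.
    return np.divide(data_pdf, mc_pdf, out=np.zeros_like(data_pdf), where=mc_pdf > 0)

# Toy histograms indexed by the number of pileup interactions (made up).
weights = pileup_weights([1, 2, 1], [1, 1, 2])

# Each simulated event then takes the weight of its own pileup multiplicity.
event_multiplicity = np.array([0, 2, 1, 1])
event_weights = weights[event_multiplicity]
```

Shifting the data histogram within its uncertainty and recomputing the weights gives a simple handle on the corresponding systematic variation.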
The total contribution from background events to the data sample is expected to be 9%, of which about half comes from single top quark production in association with a W boson (tW), with dileptonic decays. Several control regions (CRs) in data are used to validate the background estimates from simulation for tW and Z/γ*+jets production and for events with incorrectly identified leptons. The CRs are selected to have similar kinematic properties to the signal region, but with one or two requirements inverted, thus enriching them in different background contributions [25]. Agreement between data and simulation is observed in the tW CR, and we assign a 25% uncertainty in the tW cross section based on the recent CMS measurement of 23.4 ± 5.4 pb [59]. The other CRs are used to derive scale factors (SFs) to multiply the simulated event yields for the corresponding background process, with systematic uncertainties estimated from the envelope of variation in the SF value using the three dilepton flavor combinations and various alternative CRs.
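Two pieces of arithmetic in this paragraph can be made explicit: the relative precision of the quoted tW cross section measurement, and an envelope-style uncertainty on a scale factor. In the sketch below, the cross section numbers come from the text, while the scale factor values and function names are hypothetical and serve only to show the envelope calculation.

```python
def relative_uncertainty(value, uncertainty):
    """Uncertainty expressed as a fraction of the central value."""
    return uncertainty / abs(value)

def envelope_uncertainty(nominal, alternatives):
    """Largest absolute deviation of any alternative estimate from the nominal one."""
    return max(abs(alt - nominal) for alt in alternatives)

# tW cross section measurement quoted in the text: 23.4 +/- 5.4 pb.
tw_relative = relative_uncertainty(23.4, 5.4)

# Hypothetical scale factors from different flavor channels and alternative CRs.
sf_nominal = 1.30
sf_alternatives = [1.25, 1.38, 1.31]
sf_uncertainty = envelope_uncertainty(sf_nominal, sf_alternatives)

print(f"tW relative uncertainty: {tw_relative:.1%}")
print(f"SF = {sf_nominal:.2f} +/- {sf_uncertainty:.2f}")
```

The measured precision of about 23% is close to the 25% uncertainty assigned to the tW cross section.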
Other processes, including tt̄ production in association with a boson as well as diboson and triboson production, contribute less than 20% of the total background and are estimated from simulation alone. Recent CMS measurements [60-62] indicate agreement between the predicted and measured cross sections for these processes, and their small yields permit the choice of a conservative systematic uncertainty of 50% with negligible effect on the analysis precision.
A comparison of the observed and predicted distributions of ∆|y_t| and ∆|η_ℓ| can be found in Appendix A.

Unfolding the distributions
The measured distributions are distorted, relative to the true underlying distributions, by the acceptance of the detector, the efficiency of the trigger and event selection, and the finite resolution of the reconstructed kinematic quantities. After subtraction of the predicted background, we correct the measured distributions for these effects using an unfolding procedure that estimates the corresponding parton-level distributions. In the context of theoretical calculations and parton shower event generators, the parton-level top quark is defined before it decays and its kinematic properties include the effects of recoil from initial- and final-state radiation in the rest of the event and from final-state radiation from the top quark itself. The parton-level charged lepton, produced from the decay of the intermediate W boson, is defined before the lepton decays or radiates any photons.
We use six bins of varying width in the ∆|y_t| parton-level distribution that are well matched to the reconstruction resolution and contain approximately equal numbers of events. The ∆|η_ℓ| distribution depends only on lepton measurements, and the better resolution allows us to use 12 bins. For the reconstruction-level distributions, we use twice as many bins as those used for the parton-level distributions. The unfolding is performed using the TUNFOLD package [63], using regularization based on the curvature of the simulated signal distribution to suppress statistical fluctuations in the high-frequency components of the unfolded distribution. The regularization strength is optimized by minimizing the average global correlation coefficient in the unfolded distribution; the resulting regularization is relatively weak, contributing at the level of 5% to the total χ² minimized by the algorithm. An analogous unfolding procedure is used to measure A_C and A_C^lep differentially, after introducing a further three bins in each of the tt̄ system kinematic variables M_tt̄, |y_tt̄|, and p_T^tt̄.
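The idea of a curvature-regularized unfolding can be sketched with plain linear algebra. The example below is not the TUNFOLD implementation and does not reproduce its options or its choice of regularization strength; it is a generic Tikhonov-style least-squares fit with a second-difference penalty, and the toy response matrix, the value of `tau`, and all function names are assumptions. It mirrors the binning described above by using 12 reconstructed bins for 6 parton-level bins.

```python
import numpy as np

def curvature_matrix(n_bins):
    """Second-difference operator: row i encodes x[i] - 2 x[i+1] + x[i+2]."""
    L = np.zeros((n_bins - 2, n_bins))
    for i in range(n_bins - 2):
        L[i, i:i + 3] = (1.0, -2.0, 1.0)
    return L

def unfold(response, measured, covariance, tau, bias=None):
    """Minimize (y - A x)^T V^-1 (y - A x) + tau^2 |L (x - bias)|^2 over x."""
    A = np.asarray(response, dtype=float)
    y = np.asarray(measured, dtype=float)
    V_inv = np.linalg.inv(np.asarray(covariance, dtype=float))
    L = curvature_matrix(A.shape[1])
    x0 = np.zeros(A.shape[1]) if bias is None else np.asarray(bias, dtype=float)
    reg = tau**2 * (L.T @ L)
    # Normal equations of the regularized least-squares problem.
    lhs = A.T @ V_inv @ A + reg
    rhs = A.T @ V_inv @ y + reg @ x0
    return np.linalg.solve(lhs, rhs)

# Toy response matrix: columns are parton-level bins, rows are reconstructed bins.
n_true, n_reco = 6, 12
A = np.zeros((n_reco, n_true))
for j in range(n_true):
    A[2 * j, j] += 0.5
    A[2 * j + 1, j] += 0.3
    if j + 1 < n_true:
        A[2 * j + 2, j] += 0.1  # migration into the neighbouring bin

truth = np.array([100.0, 110.0, 120.0, 130.0, 140.0, 150.0])
measured = A @ truth  # noise-free pseudo-data
unfolded = unfold(A, measured, np.diag(measured), tau=0.1)

# Asymmetry from the unfolded bins, assuming a bin edge at zero so that the
# first three bins are negative and the last three are positive.
asym = (unfolded[3:].sum() - unfolded[:3].sum()) / unfolded.sum()
```

Because the toy truth is linear across bins it has zero curvature, so the penalty term vanishes and the noise-free unfolding returns the input spectrum. With fluctuating data, a larger `tau` trades variance for bias toward smooth shapes.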

Systematic uncertainties
Most of the systematic uncertainties concern detector performance and the modeling of the signal and background processes and are estimated from the change in the measurement when varying the simulated event samples used for the unfolding. The uncertainty from the jet energy scale corrections is estimated by varying the jet energies within their uncertainties [64] and propagating this to the p_T^miss. Similarly, the jet energy resolution is varied by 2-5%, depending on the η of the jet [64], and the electron energy scale is varied by ±0.6% (±1.5%) for barrel (endcap) electrons, as estimated from comparisons between measured and simulated Z boson events [28]. The uncertainty in muon energies is negligible. The uncertainty in the background subtraction is obtained by varying the normalization of each background component by the uncertainties described in Section 4.
Many of the signal modeling and simulation uncertainties are evaluated by using weights to vary the MC@NLO tt̄ sample: the simulated pileup multiplicity distribution is changed within its uncertainty; the correction factors between data and simulation for the b tagging efficiency [32], trigger efficiency, and lepton selection efficiency are shifted up and down by their uncertainties; and the PDFs are varied using the PDF4LHC procedure [65,66]. Previous CMS studies [67,68] have shown that the p_T distribution of the top quark in data is softer than in the NLO simulation of tt̄ production. Since the origin of the discrepancy is not fully understood, the change in the measurement when reweighting the MC@NLO tt̄ sample to match the top quark p_T spectrum in data is taken as a systematic uncertainty associated with signal modeling. Further signal modeling uncertainties are evaluated using the dedicated tt̄ samples: µ_R and µ_F are simultaneously varied up and down by a factor of 2, m_t is varied by ±1 GeV, and the tt̄ sample generated with POWHEG and PYTHIA is used to measure the uncertainty in hadronization modeling from the difference between the HERWIG and PYTHIA descriptions. The systematic uncertainty estimates evaluated using dedicated tt̄ samples have a significant statistical uncertainty governed by the number of events in the simulated samples. To avoid underestimation of these uncertainties, the maximum of the estimated systematic uncertainty and the statistical uncertainty in that estimate is taken as the final systematic uncertainty. The uncertainty in the unfolding procedure is dominated by the statistical uncertainty arising from the limited number of events in the MC@NLO tt̄ sample. The uncertainty from the regularization is found to be small in comparison. The systematic uncertainties in the inclusive charge asymmetry values obtained from the unfolded distributions are summarized in Table 1.
The individual terms are added in quadrature to estimate the total systematic uncertainties.
For both A_C and A_C^lep, the dominant systematic uncertainty arises from the limited number of simulated events used for the unfolding.
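The two combination rules described in this section, taking the larger of an estimated shift and its own statistical uncertainty, and adding the individual terms in quadrature, can be written compactly. The numbers below are hypothetical and do not correspond to entries of Table 1; the function names are likewise assumptions of this sketch.

```python
import math

def effective_systematic(shift, shift_stat_unc):
    """Use the larger of the observed shift and the statistical precision of that shift."""
    return max(abs(shift), shift_stat_unc)

def total_systematic(terms):
    """Combine independent terms in quadrature."""
    return math.sqrt(sum(term**2 for term in terms))

# Hypothetical sources: (shift in the asymmetry, statistical uncertainty of the shift).
terms = [
    effective_systematic(0.001, 0.003),   # estimate limited by simulated sample size
    effective_systematic(-0.004, 0.001),  # estimate dominated by the shift itself
]
total = total_systematic(terms)
print(f"total systematic uncertainty: {total:.4f}")
```

In this toy case the first term is replaced by its statistical uncertainty of 0.003, and the quadrature sum with 0.004 gives 0.005.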

Results
The unfolded normalized differential cross section from the selected events in data is shown as a function of ∆|y_t| and ∆|η_ℓ| in Fig. 1, along with the parton-level predictions for tt̄ production obtained from calculations at NLO in the SM gauge couplings (QCD+EW) [12] and with the MC@NLO generator (which does not include EW corrections). The corresponding A_C and A_C^lep values are presented in Table 2. Correlations between the contents of different bins, introduced by the unfolding process and from the systematic uncertainties, are accounted for in the calculation of the uncertainties. The measured values are consistent with the expectations from the SM.

Table 2: The inclusive charge asymmetry measurements obtained from the unfolded distributions and the parton-level predictions from the MC@NLO simulation and calculations at NLO (QCD+EW) [12]. For the data, the first uncertainty is statistical and the second is systematic. The uncertainties in the MC@NLO results are statistical and the uncertainties in the NLO calculations come from varying together µ_R and µ_F up and down by a factor of two.

Variable   Data                    MC@NLO          NLO (QCD+EW)
A_C        0.011 ± 0.011 ± 0.007   0.006 ± 0.001   0.0111 ± 0.0004
A_C^lep    0.003 ± 0.006 ± 0.003

The charge asymmetries as a function of M_tt̄, |y_tt̄|, and p_T^tt̄ are also measured. The results, which are shown in Fig. 2, are consistent with the MC@NLO simulation predictions, as well as with the NLO (QCD+EW) calculations for the M_tt̄ and |y_tt̄| dependencies. No comparison is made with NLO calculations for the p_T^tt̄ dependencies as it is expected that the effect of the parton shower process on the p_T^tt̄ distribution makes fixed-order calculations an inadequate approximation of the data.
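The statement that the measured A_C agrees with the predictions can be quantified with a simple pull, using the A_C values quoted above. This sketch treats the statistical, systematic, and prediction uncertainties as independent and Gaussian, which is a simplifying assumption of the example rather than a statement from the analysis; the function name is also hypothetical.

```python
import math

def pull(measured, stat, syst, predicted, predicted_unc=0.0):
    """Difference between measurement and prediction in units of the total uncertainty."""
    total_unc = math.sqrt(stat**2 + syst**2 + predicted_unc**2)
    return (measured - predicted) / total_unc

# A_C values as quoted in the text.
pull_nlo = pull(0.011, 0.011, 0.007, 0.0111, 0.0004)
pull_mcnlo = pull(0.011, 0.011, 0.007, 0.006, 0.001)
print(f"pull vs NLO (QCD+EW): {pull_nlo:+.2f}")
print(f"pull vs MC@NLO:       {pull_mcnlo:+.2f}")
```

Both pulls are well below one standard deviation in magnitude, in line with the agreement reported for the inclusive measurement.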

Summary
Measurements are presented of the charge asymmetry in tt̄ dilepton final states from distributions, unfolded to the parton level, of the absolute rapidity (pseudorapidity) difference of top quarks (leptons) with positive and negative charge. The data sample corresponds to an integrated luminosity of 19.5 fb⁻¹ from pp collisions at √s = 8 TeV, collected by the CMS experiment at the LHC. The tt̄ and leptonic inclusive charge asymmetries are found to be, respectively, 0.011 ± 0.011 (stat) ± 0.007 (syst) and 0.003 ± 0.006 (stat) ± 0.003 (syst). The charge asymmetries are also measured as a function of the invariant mass, absolute rapidity, and transverse momentum of the tt̄ system in the laboratory frame. Although statistically limited, all measurements are in agreement with the standard model predictions. Future measurements at √s = 13 TeV with larger data sets are expected to have better statistical precision, outweighing the dilution of the charge asymmetry from the decreased fraction of events with the quark-antiquark initial state.

References
[63] S. Schmitt, "TUnfold, an algorithm for correcting migration effects in high energy physics", JINST 7 (2012) T10003, doi:10.1088/1748-0221/7/10/T10003, arXiv:1205.6201.
[68] CMS Collaboration, "Measurement of the differential cross section for top quark pair production in pp collisions at

Figure caption: The simulated signal yield is normalized to that of the background-subtracted data. The vertical bars on the data points represent the statistical uncertainties. The lower panels show the ratio of the numbers of events from data and simulation.