Evidence for single top-quark production in the $s$-channel in proton-proton collisions at $\sqrt{s}=$8 TeV with the ATLAS detector using the Matrix Element Method

This Letter presents evidence for single top-quark production in the $s$-channel using proton-proton collisions at a centre-of-mass energy of 8 TeV with the ATLAS detector at the CERN Large Hadron Collider. The analysis is performed on events containing one isolated electron or muon, large missing transverse momentum and exactly two $b$-tagged jets in the final state. The analysed data set corresponds to an integrated luminosity of 20.3 fb$^{-1}$. The signal is extracted using a maximum-likelihood fit of a discriminant which is based on the matrix element method and optimized in order to separate single-top-quark $s$-channel events from the main background contributions, which are top-quark pair production and $W$ boson production in association with heavy-flavour jets. The measurement leads to an observed signal significance of 3.2 standard deviations and a measured cross-section of $\sigma_s=4.8\pm0.8$(stat.)$^{+1.6}_{-1.3}$(syst.) pb, which is consistent with the Standard Model expectation. The expected significance for the analysis is 3.9 standard deviations.


Introduction
In proton-proton (pp) collisions, top quarks are produced mainly in pairs via the strong interaction, but also singly via the electroweak interaction through a Wtb vertex. Therefore, single top-quark production provides a powerful probe for the electroweak couplings of the top quark. In the Standard Model (SM), three different production mechanisms are possible in leading-order (LO) QCD: an exchange of a virtual W boson either in the t-channel or in the s-channel (see Fig. 1), or the associated production of a top quark and a W boson. Among other interesting features, s-channel single top-quark production is sensitive to new particles proposed in several models of physics beyond the SM, such as charged Higgs boson or W′ boson production [1]. It also plays an important role in indirect searches for new phenomena that could be modelled as anomalous couplings in an effective quantum field theory [2]. Furthermore, s-channel production, like the other two production channels, provides a direct determination of the absolute value of the Cabibbo-Kobayashi-Maskawa (CKM) matrix element $V_{tb}$. Single top-quark production was first seen by the CDF and D0 collaborations in combined measurements of the s-channel and t-channel [3,4]. Recently, the s-channel alone was observed in a combination of the results from both collaborations [5]. At the Large Hadron Collider (LHC) [6] the production of single top quarks was observed both in the t-channel and in associated Wt production by the CMS [7, 8] and ATLAS collaborations [9, 10]. For the s-channel, results of a search at $\sqrt{s} = 8$ TeV using an integrated luminosity of 20.3 fb$^{-1}$ were published by ATLAS [11]. That analysis was based on a boosted decision tree (BDT) event classifier and led to an upper limit of 14.6 pb at the 95% confidence level. The obtained cross-section was $\sigma_s^{\rm BDT} = 5.0 \pm 4.3$ pb with an observed signal significance of 1.3 standard deviations.
Standard Model predictions are available for the production of single top quarks in next-to-leading-order (NLO) QCD [12-14] including resummed next-to-next-to-leading logarithmic (NNLL) corrections for soft gluon emissions [15-17]. For the s-channel the predicted total inclusive cross-section for pp collisions at a centre-of-mass energy $\sqrt{s} = 8$ TeV is $\sigma_s^{\rm th} = 5.61 \pm 0.22$ pb, while for the t-channel it is $\sigma_t^{\rm th} = 87.76^{+3.44}_{-1.91}$ pb, and $\sigma_{Wt}^{\rm th} = 22.37 \pm 1.52$ pb for associated Wt production. The given uncertainties include variations of the renormalization and factorization scales, as well as an estimate of the uncertainty of the parton distribution function (PDF) needed for the calculation.
In this Letter, a measurement of single top-quark s-channel production in pp collisions with $\sqrt{s} = 8$ TeV at the LHC is presented. Each of the two other single-top-quark production processes, t-channel and Wt production, is treated as a background process assuming its cross-section as predicted by NLO+NNLL QCD calculations. In the SM the top quark decays almost exclusively into a W boson and a b-quark. This analysis considers only the leptonic decays (e or µ) of the W boson, since the fully hadronic final states are dominated by an overwhelming multi-jet background. Some of the events containing a W boson decaying into a τ lepton which subsequently decays leptonically are also selected. At LO the final state contains two jets with large transverse momenta: one jet originating from the decay of the top quark into a b-quark ("b-jet"), and another b-jet from the Wtb vertex producing the top quark. Thus the experimental signature consists of an isolated electron or muon, large missing transverse momentum, $E_{\rm T}^{\rm miss}$, due to the undetected neutrino from the W boson decay, and two jets with large transverse momentum, $p_{\rm T}$, both of which are identified as containing b-hadrons ("b-tagged"). The electron and muon channels in this analysis are merged regardless of the lepton charge in order to measure the combined production cross-section of top quarks and top antiquarks.
In contrast to the aforementioned BDT-based analysis [11], the signal extraction in this analysis is based on the matrix element (ME) method [18,19]. The same data set is used in both analyses. This analysis takes advantage of enhanced simulation samples which reduce the statistical uncertainty and give a better description of the data. Furthermore, updated calibrations for the 2012 data are used, resulting in a reduction of systematic uncertainties. The event selection is improved by adding a veto on dileptonic events, which leads to a significant suppression of the background for top-quark pair (tt) production (see Section 5). The combination of all these measures results in a significant improvement in the sensitivity to the s-channel process. Approximately half of this improvement can be attributed to the change in method from BDT to ME. In particular, the BDT technique applied to this analysis is limited by the sample sizes available for the training, while the ME approach is not sensitive to this limitation.

The ATLAS detector
The ATLAS detector [20] is a multi-purpose detector consisting of a tracking system, calorimeters and an outer muon spectrometer. The inner tracking system contains a silicon pixel detector, a silicon microstrip tracker and a straw-tube transition radiation tracker. The system is surrounded by a thin solenoid magnet which produces a 2 T axial magnetic field, and it provides charged-particle tracking as well as particle identification in the pseudorapidity$^1$ region |η| < 2.5. The central calorimeter system covers the range of |η| < 1.7 and is divided into a liquid-argon electromagnetic sampling calorimeter with high granularity and a hadron calorimeter consisting of steel/scintillator tiles. The endcap regions are equipped with liquid-argon calorimeters for electromagnetic and hadronic energy measurements up to |η| = 4.9. The outer muon spectrometer is immersed in a toroidal magnetic field provided by air-core superconducting magnets and comprises tracking chambers for precise muon momentum measurements up to |η| = 2.7 and trigger chambers covering the range |η| < 2.4. The combination of all these systems provides efficient and precise reconstruction of leptons and photons in the range |η| < 2.5. Jets and $E_{\rm T}^{\rm miss}$ are reconstructed using energy deposits over the full coverage of the calorimeters, |η| < 4.9. A three-level trigger system [21] is used to reduce the recorded rate of uninteresting events and to select the events in question.

Background estimation
The two most important backgrounds are tt and W+jets production. The former is difficult to distinguish from the signal since tt events contain real top-quark decays. In its dileptonic decay mode, tt events can mimic the final-state signature of the signal if one of the two leptons escapes unidentified, whereas the semileptonic decay mode contributes to the selected samples if only two of its four jets are identified or if some jets are merged. The W+jets events can contribute to the background if they contain b-jets in the final state or due to mis-tagging of jets containing other quark flavours. Single top-quark t-channel production also leads to a sizeable background contribution, while associated Wt production has only a small effect.
A less significant background contribution is multi-jet production where jets, non-prompt leptons from heavy-flavour decays, or electrons from photon conversions are mis-identified as prompt isolated leptons. This background is estimated by using a data-driven matrix method [41], where the probability to misidentify an isolated electron or muon in an event is obtained by exploiting sum rules based on disjoint control samples, one almost pure electron or muon sample and another containing a high fraction of mis-identified leptons due to a relaxed lepton-isolation criterion. For both decay channels the amount of multi-jet background is below 2% in the final selection. Other minor backgrounds are from Z+jets and diboson production.
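As a hedged illustration of the general idea behind a matrix method (not the specific implementation of Ref. [41]), the sketch below solves the two counting equations that relate the events passing a loose and a tight lepton selection to the numbers of real and mis-identified leptons. The function name, the efficiencies `eff_real` and `eff_fake`, and all yields are hypothetical inputs chosen for illustration; in practice the efficiencies would be measured in control samples such as those described above.

```python
def fake_yield_tight(n_loose, n_tight, eff_real, eff_fake):
    """Estimate the number of mis-identified leptons passing the tight selection.

    Solves the two counting equations
        n_loose = n_real + n_fake
        n_tight = eff_real * n_real + eff_fake * n_fake
    for n_fake and returns eff_fake * n_fake.
    All inputs are illustrative assumptions, not values from the analysis.
    """
    if eff_real <= eff_fake:
        raise ValueError("eff_real must be larger than eff_fake")
    n_fake = (eff_real * n_loose - n_tight) / (eff_real - eff_fake)
    return eff_fake * n_fake


# Hypothetical example: 1000 loose events, 830 of which are also tight.
estimate = fake_yield_tight(n_loose=1000, n_tight=830, eff_real=0.9, eff_fake=0.2)
```

The estimate becomes unstable when the two efficiencies are close to each other, which is why the sketch rejects `eff_real <= eff_fake`.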
Apart from the data-driven multi-jet background, all samples are normalized to their predicted cross-sections. The samples for single top-quark production are normalized to their NLO+NNLL predictions (see Section 1), while for all tt samples a recent calculation with Top++ (v2.0) at NNLO in QCD including resummations of NNLL soft gluon terms of $\sigma_{t\bar{t}}^{\rm th} = 253^{+13}_{-15}$ pb is used for the normalization [42-47].

Event reconstruction and selection
For the selection of s-channel final states, a single high-$p_{\rm T}$ lepton, either electron or muon, exactly two b-tagged jets and a large amount of $E_{\rm T}^{\rm miss}$ are required.
Electrons are reconstructed as energy deposits in the electromagnetic calorimeter matched to charged-particle tracks in the inner detector and must pass tight identification requirements [48,49]. The transverse momentum of the electrons must satisfy $p_{\rm T} > 30$ GeV and be in the central region with pseudorapidity |η| < 2.47, excluding the region 1.37 < |η| < 1.52, which contains a large amount of inactive material. Muon candidates are identified using combined information from the inner detector and the muon spectrometer [50, 51]. They are required to have $p_{\rm T} > 30$ GeV and |η| < 2.5. Both the electrons and muons must fulfil additional isolation requirements, as described in Ref. [41], in order to reduce contributions from non-prompt leptons originating from hadron decays, and fake leptons.
Jets are reconstructed by using the anti-$k_t$ algorithm [52] with a radius parameter of 0.4 for calorimeter energy clusters calibrated with the local cluster weighting method [53]. For the jet calibration an energy- and η-dependent simulation-based scheme with in-situ corrections based on data [54] is employed. Only events containing exactly two jets with $p_{\rm T} > 40$ GeV for the leading jet and $p_{\rm T} > 30$ GeV for the second leading jet, as well as |η| < 2.5 for both jets are selected. Events involving additional jets with $p_{\rm T} > 25$ GeV and |η| < 4.5 are rejected. Both jets must be identified as b-jets. The identification is performed using a neural network which combines spatial and lifetime information from secondary vertices of tracks associated with the jets. The operating point of the tagging algorithm used in this analysis corresponds to a b-tagging efficiency of 70% and a rejection factor for light-flavour jets of about 140, while the rejection factor for charm jets is around 5 [55, 56].
The missing transverse momentum is computed from the vector sum of all clusters of energy deposits in the calorimeter that are associated with reconstructed objects, and the transverse momenta of the reconstructed muons [57,58]. The energy deposits are calibrated at the corresponding energy scale of the parent object. Since $E_{\rm T}^{\rm miss}$ is a measure for the undetectable neutrino originating from the top-quark decay, in this analysis only events with $E_{\rm T}^{\rm miss} > 35$ GeV are accepted. Furthermore, the transverse mass$^2$ of the W boson, $m_{\rm T}^{W}$, needs to be larger than 30 GeV to suppress multi-jet background. The main background at this stage of the selection originates from top-quark pair production, which is in turn dominated by dileptonic tt events. To reduce this background, a veto is applied to all events containing an additional reconstructed electron or muon identified with loose criteria. The minimum required $p_{\rm T}$ of these leptons is 5 GeV. By this measure the tt background is diminished by 30% while reducing the signal by less than half a per cent. After the application of all event selection criteria, a signal-to-background ratio of 4.6% is reached. The event yields for all samples in the signal region are collected in Table 1.
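The two kinematic requirements on the missing transverse momentum and the W transverse mass can be sketched as follows. This is a minimal illustration that assumes the standard definition of the transverse mass, m_T^W = sqrt(2 E_T^miss p_T (1 - cos Δφ)), with all momenta in GeV and angles in radians; the function and constant names are hypothetical and not taken from the analysis software.

```python
import math

MET_MIN_GEV = 35.0  # E_T^miss threshold quoted in the text
MTW_MIN_GEV = 30.0  # m_T^W threshold quoted in the text


def transverse_mass(lep_pt, lep_phi, met, met_phi):
    """Return m_T^W = sqrt(2 * met * lep_pt * (1 - cos(dphi))) in GeV."""
    dphi = lep_phi - met_phi  # the cosine is periodic, so no wrapping is needed
    # max() guards against tiny negative values from floating-point rounding
    return math.sqrt(max(0.0, 2.0 * met * lep_pt * (1.0 - math.cos(dphi))))


def passes_met_and_mtw(lep_pt, lep_phi, met, met_phi):
    """Apply the E_T^miss > 35 GeV and m_T^W > 30 GeV requirements."""
    mtw = transverse_mass(lep_pt, lep_phi, met, met_phi)
    return met > MET_MIN_GEV and mtw > MTW_MIN_GEV
```

For a lepton and missing transverse momentum that are back to back with 40 GeV each, the transverse mass is 80 GeV; for collinear directions it vanishes.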
Apart from the signal region, two more regions are defined to validate the modelling, one validation region for tt production and a control region for the W+jets background. The latter is used to constrain the normalization of the W+jets background in the final signal extraction, as explained in more detail in Section 8. The two regions are defined in the same way as the signal region, except that neither the veto on events with additional jets nor the one on dilepton events is applied. Top-quark pair production is enriched by selecting events containing exactly four jets with $p_{\rm T} > 25$ GeV, two of them b-tagged at the 70% working point. The W+jets control region is defined using a less stringent b-tag requirement (80% working point); in order to ensure that this region is disjoint from the signal region, it is required that at least one of the two jets fails to meet the signal region b-tagging criteria at the 70% working point.
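The b-tagging logic that keeps the signal region and the W+jets control region disjoint can be illustrated with a short sketch. It covers only the b-tagging part of the region definitions for an event with two selected jets; the function name and the region labels are hypothetical, and the per-jet working-point decisions are assumed to be available as booleans.

```python
def classify_btag_region(pass70, pass80):
    """Assign a two-jet event to a region based on b-tagging decisions only.

    pass70, pass80: sequences of two booleans, one per selected jet, stating
    whether the jet passes the 70% or the 80% working point (hypothetical inputs).
    """
    if len(pass70) != 2 or len(pass80) != 2:
        raise ValueError("exactly two selected jets are expected")
    if all(pass70):
        return "signal"  # both jets b-tagged at the 70% working point
    if all(pass80):
        # Both jets pass the looser 80% point, and at least one fails the
        # 70% point (otherwise the first branch would have returned).
        return "wjets_control"
    return None  # event enters neither region
```

Because the signal-region test is evaluated first, an event can never be assigned to both regions.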

Matrix element method
The ME method directly uses theoretical calculations to compute a per-event signal probability. This technique was used for the observation of single top-quark production at the Tevatron [3, 4, 59]. The discrimination between signal and background is based on the computation of likelihood values $P(X|H_{\rm proc})$ for the hypothesis that a measured event with final state X is of a certain process type $H_{\rm proc}$. Those likelihoods can be computed by means of the factorization theorem from the corresponding partonic cross-sections of the hard scatter. The mapping between the hadronic measured final state and the parton state is implemented by transfer functions which take into account the detector resolution functions, the reconstruction and b-tagging efficiencies, as well as all possible permutations between the partons and the reconstructed objects.
The phase-space integration of the differential partonic cross-sections is performed using the Monte Carlo integration algorithm Vegas [60] from the Cuba program library [61]. The required PDF sets are taken from the LHAPDF5 package [62], while the computation of the scattering amplitudes is based on codes from the MCFM program [63]. The parameterizations of the ATLAS detector resolutions used for the transfer functions are those used in the KLFitter kinematic fit framework [64,65].
$^2$ The transverse mass, $m_{\rm T}^{W}$, is computed from the lepton transverse momentum, $p_{\rm T}$, and the difference in azimuthal angle, $\Delta\phi$, between the lepton and the missing transverse momentum as $m_{\rm T}^{W} = \sqrt{2 E_{\rm T}^{\rm miss}\, p_{\rm T} \left(1 - \cos\Delta\phi(\vec{E}_{\rm T}^{\rm miss}, \vec{p}_{\rm T})\right)}$.
In total, eight different processes are considered for the computation of the likelihood values. For the s-channel signal, final states with two and three partons are included, while the single-top-quark t-channel process is modelled in the four-flavour scheme only. In the case of tt production semileptonic and dileptonic processes are evaluated separately. The remaining background processes are W boson production with two associated light-flavour jets, with one light-flavour jet and one charm jet, and W+bb production.
From the likelihood values of these processes the probability $P(S|X)$ for a measured event X to be a signal event S can be computed with Bayes' theorem by
$$P(S|X) = \frac{\sum_i \alpha_{S_i} P(X|S_i)}{\sum_i \alpha_{S_i} P(X|S_i) + \sum_j \alpha_{B_j} P(X|B_j)} . \qquad (1)$$
Here, $S_i$ and $B_j$ denote all signal and background processes that are being considered. The a priori probabilities $\alpha_{S_i}$ and $\alpha_{B_j}$ are given by the expected fraction of events of each process in the set of selected events within the signal region. The value of $P(S|X)$ is taken as the main discriminant in the signal extraction. The binning of the ME discriminant is optimized in the signal region utilizing a dedicated algorithm [66]. This results in a non-equidistant binning which exhibits wider bins in regions with a large signal contribution, while preserving a sufficiently large number of background events in each bin. In the following, all histograms showing the ME discriminant are drawn with a constant bin width, causing a non-linear horizontal scale. Values of $P(S|X)$ lower than 0.00015 are not taken into account for the signal extraction because of the large background domination in this range.
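The Bayesian combination of the per-process likelihoods into a signal probability can be sketched in a few lines. Each process contributes its a priori fraction times its likelihood, and the signal probability is the signal sum divided by the sum over all processes. The function name and the numerical values in the example are hypothetical and serve only to illustrate the arithmetic; the actual likelihoods come from the matrix element integration described above.

```python
def signal_probability(signal_terms, background_terms):
    """Return P(S|X) from per-process (prior, likelihood) pairs.

    signal_terms, background_terms: iterables of (alpha, P(X|process)) tuples,
    where alpha is the expected event fraction of the process.
    """
    num = sum(alpha * lik for alpha, lik in signal_terms)
    den = num + sum(alpha * lik for alpha, lik in background_terms)
    if den <= 0.0:
        return 0.0  # no process assigns any likelihood to this event
    return num / den


# Hypothetical event: two signal and two background hypotheses.
p_sig = signal_probability(
    signal_terms=[(0.03, 2e-3), (0.02, 1e-3)],
    background_terms=[(0.50, 1e-4), (0.45, 2e-4)],
)
```

In this example the weighted signal likelihood is 8e-5 and the weighted background likelihood is 1.4e-4, so the event receives a signal probability of 8/22.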
In order to validate the s-channel ME discriminant $P(S|X)$, a comparison of the discriminant between data and the simulation is shown in Fig. 2 for the W+jets control region and the tt validation region. For the latter, only the two b-tagged jets are considered for the ME discriminant computation. The normalization of each sample in Fig. 2, except for the data-driven multi-jet background, is obtained by a fit to data similar to the one described in Section 8. The only difference is that here all samples, including the signal, are varied within their SM prediction uncertainty, while for the signal extraction fit the signal is allowed to float freely. In both regions the data are described well by the simulation.

Systematic uncertainties
Apart from systematic effects in the signal acceptance and the background normalizations, the ME discriminant is subject to those systematic effects which change the four-momenta of the reconstructed objects. Therefore, systematic uncertainties such as the energy calibration of jets, electrons and muons are propagated through the whole analysis including the ME computation by variations in the modelling of the detector response.
The main sources of systematic uncertainties for jets are the energy scale, which is evaluated by a combination of in-situ techniques [54], the energy resolution [67] and the reconstruction efficiency. Contributions from energy deposits in the calorimeters not associated with any reconstructed objects are considered as well.
Potential mis-modelling in the simulation of the signal and the main background processes is also taken into account in the evaluation of the systematic uncertainties. This includes contributions from the modelling of the hard process, the parton showers, hadronization and ISR/FSR. The uncertainty caused by the choice of renormalization and factorization scales is evaluated for the signal process and tt production. All of these uncertainties are estimated by comparing simulation samples produced with different generators (see Section 3) or different parameter settings such as shower models or scales.
The normalization uncertainties of the different samples are taken from theory except for the multi-jet background, which is estimated by a data-driven method. Uncertainties of 6%, 4% and 7% are assigned to tt, single top-quark t-channel, and Wt production, respectively. For W+jets production, as well as for the combination of Z+jets and diboson production, an uncertainty of 60% each is considered [68,69]. The uncertainty for W+jets production is dominated by its heavy-flavour contribution. For the multi-jet background a normalization uncertainty of 50% is estimated.
The uncertainties associated with the PDFs are taken into account for all simulated samples by assessing a systematic uncertainty according to the PDF4LHC prescription [70], which makes use of the MSTW2008NLO [71], CT10, and NNPDF2.3 [72] sets. The uncertainty of the luminosity measurement is 2.8%, which was determined by dedicated beam-separation scans [22].
In addition to the impact of the systematic uncertainties on the signal acceptance and the background normalizations, their effect on the shape of the discriminant distributions is taken into account if it is significant. The significance is evaluated by performing $\chi^2$ tests between the nominal and the systematically varied distributions made from uncorrelated event samples. Only a small fraction of all systematic uncertainties exhibit a significant shape effect. These are mainly the impact of the jet energy scale and resolution on the single-top-quark s- and t-channel samples.
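A shape comparison of this kind could look like the following sketch. The text does not spell out the exact form of the test, so this assumes a simple bin-by-bin $\chi^2$ between two histograms from statistically independent samples, with the per-bin uncertainties added in quadrature; the function name and the example numbers are hypothetical.

```python
def chi2_between(nominal, varied, err_nominal, err_varied):
    """Return (chi2, n_bins_used) for two histograms with independent errors.

    Assumed form: chi2 = sum_i (a_i - b_i)^2 / (err_a_i^2 + err_b_i^2).
    Bins where both uncertainties vanish are skipped.
    """
    chi2 = 0.0
    n_used = 0
    for a, b, ea, eb in zip(nominal, varied, err_nominal, err_varied):
        variance = ea * ea + eb * eb
        if variance > 0.0:
            chi2 += (a - b) ** 2 / variance
            n_used += 1
    return chi2, n_used


# Hypothetical two-bin example: only the first bin differs.
chi2, n_bins = chi2_between([10.0, 20.0], [12.0, 20.0], [1.0, 1.0], [1.0, 1.0])
```

For shape-only comparisons both histograms would first be normalized to the same integral, a step omitted here for brevity.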
For all simulation samples the effect of their limited size is included in the systematic uncertainty.

Signal extraction
The amount of signal in the selected data set is measured by means of a binned maximum-likelihood fit of the ME discriminant in the signal region. In order to better constrain the W+jets background, the lepton charge in the W+jets-enriched control region is used as an additional discriminant variable in the fit, as it exploits the charge asymmetry of the incoming partons participating in the W+jets processes. The likelihood function used in the fit consists of a Poisson term for the overall number of observed events, a product of probability densities of the discriminants taken over all bins of the distributions and a product of Gaussian constraint terms for the nuisance parameters which incorporate all statistical and systematic uncertainties in the fit. While all backgrounds are constrained by their given uncertainties, the signal strength $\mu = \sigma_s/\sigma_s^{\rm th}$ is a free parameter in the fit. The significance of the fit result is obtained with a profile-likelihood-ratio test statistic which is used to determine how well the fit result agrees with the background-only hypothesis. Ensemble tests for all nuisance parameters are performed using the aforementioned likelihood function to get the expected distributions of the test statistic for the background-only and the signal-plus-background hypotheses. The significance is evaluated by integrating the probability density of the test statistic expected for the background-only hypothesis above the observed value. In a similar fashion the confidence interval of the measured signal strength can be estimated by studying its p-value dependence for the background-only hypothesis, as well as for the signal-plus-background hypothesis, by means of ensemble tests. The statistical evaluation used throughout this analysis is based on the RooStats framework [73].
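The ingredients of such a fit can be illustrated with a deliberately simplified sketch; it is not the RooStats implementation used in the analysis. It assumes a product of per-bin Poisson terms (equivalent to an overall Poisson term times the bin-wise shape probabilities), a single background with one Gaussian-constrained normalization nuisance parameter `theta`, and a p-value obtained as the fraction of background-only pseudo-experiments with a test statistic at least as large as the observed one. All function names and inputs are hypothetical.

```python
import math
from statistics import NormalDist


def nll(mu, theta, data, signal, background, rel_unc):
    """Negative log-likelihood: per-bin Poisson terms plus a Gaussian constraint.

    mu: signal strength; theta: nuisance parameter in units of its uncertainty;
    rel_unc: relative normalization uncertainty of the background.
    Assumes the expected yield in every bin is positive.
    """
    total = 0.5 * theta ** 2  # unit-width Gaussian constraint on theta
    for n, s, b in zip(data, signal, background):
        nu = mu * s + (1.0 + rel_unc * theta) * b  # expected yield in this bin
        total += nu - n * math.log(nu) + math.lgamma(n + 1.0)
    return total


def p_value_from_toys(q_obs, q_toys_bkg_only):
    """Fraction of background-only pseudo-experiments with q >= q_obs."""
    n_above = sum(1 for q in q_toys_bkg_only if q >= q_obs)
    return n_above / len(q_toys_bkg_only)


def z_from_p(p):
    """Convert a one-sided p-value (0 < p < 1) into Gaussian standard deviations."""
    return NormalDist().inv_cdf(1.0 - p)
```

With this one-sided convention, a p-value of about 6.9e-4 corresponds to 3.2 standard deviations.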

Results
The results of the maximum-likelihood fit are presented in Fig. 3, which shows the two discriminant distributions used in the fit for all samples scaled by the fit results. For the ME discriminant the signal contribution in the data after the subtraction of all background samples is given in Fig. 4. After the fit, none of the nuisance parameters is biased or further constrained by the fit, except for the W+jets normalization. Here, the rather conservative input uncertainty is halved by the fit to the signal and W+jets control regions. The observed signal strength obtained by the fit is $\mu = 0.86^{+0.31}_{-0.28}$ with an observed (expected) significance of 3.2 (3.9) standard deviations. Table 1 summarizes the pre-fit and post-fit event yields for the signal and all backgrounds.
This analysis measures a cross-section of $\sigma_s = 4.8 \pm 0.8$(stat.)$^{+1.6}_{-1.3}$(syst.) pb $= 4.8^{+1.8}_{-1.6}$ pb. The main sources of uncertainty are collected in Table 2. The largest contribution arises from the limited sample sizes for data and the simulation. The jet energy resolution plays a major role, as well as the modelling of the single-top-quark t-channel background and scale variations for the signal. All other systematic effects are negligible.
The measured cross-section can be interpreted in terms of the CKM matrix element $V_{tb}$. The ratio of the measured cross-section to the prediction is equal to $|f_{LV} V_{tb}|^2$, where the form factor $f_{LV}$ could be modified by new physics or radiative corrections through anomalous coupling contributions, for example those in Refs. [74-76]. The s-channel production and top quark decays through $|V_{ts}|$ and $|V_{td}|$ are assumed to be small. A lower limit on $|V_{tb}|$ is obtained for $f_{LV} = 1$ as in the SM, without assuming CKM unitarity [77,78]. The measured value of $|f_{LV} V_{tb}|$ is $0.93^{+0.18}_{-0.20}$, and the corresponding lower limit on $|V_{tb}|$ at the 95% confidence level is 0.5.
Table 2: Main statistical and systematic uncertainties contributing to the total uncertainty of the measured cross-section. The relative uncertainties reflect the influence of each systematic effect on the overall signal strength uncertainty. Apart from possible correlations between the systematic uncertainties, the total uncertainty contains several minor contributions which are all smaller than 1%.
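The central values quoted in this section are related by simple arithmetic, which the following sketch reproduces from the fitted signal strength and the predicted s-channel cross-section. Only central values are computed; the quoted uncertainties are not reproduced, since they depend on details not given in this section. The function names are hypothetical.

```python
import math

MU_OBS = 0.86            # observed signal strength from the fit
SIGMA_TH_S_PB = 5.61     # predicted s-channel cross-section in pb


def cross_section_from_mu(mu, sigma_th_pb):
    """sigma_s = mu * sigma_s^th, from the definition mu = sigma_s / sigma_s^th."""
    return mu * sigma_th_pb


def f_lv_vtb_from_mu(mu):
    """|f_LV V_tb| = sqrt(sigma_s / sigma_s^th) = sqrt(mu)."""
    return math.sqrt(mu)


sigma_s = cross_section_from_mu(MU_OBS, SIGMA_TH_S_PB)  # about 4.8 pb
coupling = f_lv_vtb_from_mu(MU_OBS)                      # about 0.93
```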

Conclusion
An analysis for s-channel single top-quark production in pp collisions at a centre-of-mass energy of 8 TeV recorded by the ATLAS detector at the LHC is presented. The analysed data set corresponds to an integrated luminosity of 20.3 fb$^{-1}$. The selected events consist of either an electron or a muon, two jets, both of which are identified as originating from a b-quark, and large $E_{\rm T}^{\rm miss}$. In order to separate the signal from the large background contributions, a matrix element method discriminant is used. The signal is extracted from the data utilizing a profile likelihood fit, which leads to a measured cross-section of $4.8^{+1.8}_{-1.6}$ pb. The result, which is in agreement with the SM prediction, corresponds to an observed significance of 3.2 standard deviations, while the expected significance of the analysis is 3.9 standard deviations.