1 Introduction

The top quark is the most massive elementary particle known, with a mass \(m_{\text {top}}={173.3 \pm 0.8}\,\mathrm{GeV}\) [1] close to the electroweak symmetry breaking scale. This makes it an excellent object with which to test the Standard Model (SM) of particle physics, as well as to search for phenomena beyond the SM.

At the LHC, top quarks are primarily produced in pairs via the strong interaction. In addition to the predominant pair-production process, top quarks are produced singly through three different subprocesses via the weak interaction: the t-channel, which is the dominant process, involving the exchange of a space-like W boson; the Wt associated production, involving the production of a real W boson; and the s-channel process involving the production of a time-like W boson.

As a consequence of the large value, which is close to one, of the \(V_{tb}\) element in the Cabibbo–Kobayashi–Maskawa (CKM) matrix, the predominant decay channel of top quarks is \(t\rightarrow Wb\). Transitions between top quarks and other quark flavours mediated by neutral gauge bosons, so-called flavour-changing neutral currents (FCNC), are forbidden at tree level and suppressed at higher orders in the SM [2]. However, several extensions to the SM exist that significantly enhance the production rate and hence the branching fractions, \(\mathcal {B}\), of FCNC processes. Examples of such extensions are the quark-singlet model [3–5], two-Higgs-doublet models with or without flavour conservation [6–11], the minimal supersymmetric standard model [12–18] or supersymmetry with R-parity violation [19, 20], models with extra quarks [21–23], or the topcolour-assisted technicolour model [24]. Reviews can be found in Refs. [25, 26]. Many of these models allow for enhanced FCNC production rates, e.g. by permitting FCNC interactions at tree level or introducing new particles in higher-order loop diagrams. The predicted branching fractions for top quarks decaying to a quark and a neutral boson can be as large as \(10^{-5}\)–\(10^{-3}\) for certain regions of the parameter space in the models mentioned. However, the experimental limits have not excluded any specific extension of the SM for the process \(t \rightarrow qg\) so far.

Among FCNC top-quark decays of the form \(t \rightarrow qX\) with \(X=Z,H,\gamma ,g\), modes involving a Z boson, a Higgs boson (H), or a photon (\(\gamma \)) are usually studied directly by searching for final states containing the corresponding decay particles. However, the mode \(t \rightarrow qg\), where q denotes either an up quark, u, or a charm quark, c, is nearly indistinguishable from the overwhelming background of multi-jet production via quantum chromodynamic (QCD) processes. For the \(t \rightarrow qg\) mode, much better sensitivity can be achieved by searching for anomalous single top-quark production (\(qg\rightarrow t\)) where a u- or c-quark and a gluon g, originating from the colliding protons, interact to produce a single top quark. A leading-order diagram for top-quark production in the \(qg \rightarrow t\) mode as well as a SM decay of the top quark is shown in Fig. 1.Footnote 1

Fig. 1

Leading-order Feynman diagram for FCNC top-quark production in the \(qg \rightarrow t\) mode followed by the decay of the top quark into a b-quark and a W boson, where the W boson decays into a lepton and a neutrino

Anomalous FCNC couplings can be described in a model-independent manner using an effective operator formalism [27], which assumes the SM to be the low-energy limit of a more general theory that is valid at very high energies. The effects of this theory below a lower energy scale, \(\Lambda \), are perceived through a set of effective operators of dimension higher than four. The formalism therefore allows the new physics to be described by an effective Lagrangian consisting of the SM Lagrangian and a series of higher-dimension operators, which are suppressed by powers of \(1/\Lambda \). The new physics scale, \({\Lambda }\), has a dimension of energy and is related to the mass cut-off scale above which the effective theory breaks down, hence characterising the energy scale at which the new physics manifests itself in the theory. A further method for simplifying the formalism is to only consider operators of interest that have no sizeable impact on physics below the TeV scale, following Ref. [28].

The interest of this paper lies in effective dimension-six operators, which contribute to flavour-changing interactions in the strong sector; thus no operators with electroweak gauge bosons are considered. In particular, the operators describing FCNC couplings to a single top quark are of interest here; they describe strong FCNC vertices of the form qgt and can be written as [29]:

$$\begin{aligned} \mathcal {O}_{uG\Phi }^{\,ij}=\bar{q}_{\mathrm L}^{\,i}\,\lambda ^a\,\sigma ^{\mu \nu }\,u_{\mathrm R}^j\,\tilde{\Phi }\,G^{a\mu \nu }\,, \end{aligned}$$

where \(u_R^j\) stands for a right-handed quark singlet, \(\bar{q}_{L}^{\,i}\) for a left-handed quark doublet, \(G^{a\mu \nu } \) is the gluon field strength tensor, \(\tilde{\Phi }\) the charge conjugate of the Higgs doublet, \(\lambda ^a\) are the Gell-Mann matrices and \(\sigma ^{\mu \nu }\) is the anti-symmetric tensor. The indices (ij) of the spinors are flavour indices indicating the quark generation. By requiring a single top quark in the interaction, one of the indices can always be set equal to 3 while the other index is either 1 or 2. Hence, the remaining fermion field in the interaction is either a u- or a c-quark. Apart from direct single top-quark production, these operators give rise to interactions of the form \(gg \rightarrow tq\) and \(gq \rightarrow tg\). The processes considered are a subset of these, where a u-quark, c-quark or gluon originating from the colliding protons interacts through an s-, t- or u-channel process to produce a single top quark, either via a \((2 \rightarrow 2)\) process or without the associated production of additional gluons or light quarks via a \((2 \rightarrow 1)\) process.

The corresponding strong FCNC Lagrangian usually is written as [29]:

$$\begin{aligned} \mathcal {L}_{\mathrm S} = -g_{\mathrm s} \sum _{q=u,c}\,\frac{\kappa _{qgt}}{\Lambda }\,\bar{q}\,\lambda ^a\,\sigma ^{\mu \nu }\,(f_{q} + h_{q}\gamma _5)\,t\,G^{a}_{\mu \nu } + \text {h.c.}\,, \end{aligned}$$

with the real and positive parameters \(\kappa _{qgt}\,(q=u,c)\) that relate the strength of the new couplings to the strong coupling strength, \(g_{\mathrm s}\), and where t denotes the top-quark field. The parameters \(f_{q}\) and \(h_{q}\) are real, vector and axial chiral parameters, respectively, which satisfy the relation \(|f_{q}|^2 + |h_{q}|^2 = 1\). This Lagrangian contributes to both the production and decay of top quarks.

Experimental limits on the branching fractions of the FCNC top-quark decay channels have been set by experiments at the LEP, HERA, Tevatron and LHC accelerators. At present the most stringent upper limits at 95 % confidence level (CL) for the coupling constants \(\kappa _{\gamma qt}\) and \(\kappa _{qgt}\) are \(\kappa _{\gamma qt}/m_{\text {top}}< {0.12}\,\mathrm{GeV^{-1}}\) [30] (ZEUS, HERA) and \(\mathcal {B}(t\rightarrow qg) < {5.7 \times 10^{-5}}\) (ugt) and \(\mathcal {B}(t\rightarrow qg) < {2.7 \times 10^{-4}}\) (cgt) [31] (ATLAS, LHC). In the case of \(t \rightarrow qZ\), upper limits on the branching fractions of the top-quark decay have been determined to be \(\mathcal {B}(t\rightarrow qZ) < {0.05}\,{\%}\) [32] (CMS, LHC). Finally, the most stringent limit for the decay \(t\rightarrow qH\) is measured to be \(\mathcal {B}(t\rightarrow qH) < {0.79}\,{\%}\) [33] (ATLAS, LHC).

In the allowed region of parameter space for \(\kappa _{qgt}/\Lambda \), the FCNC production cross-section for single top quarks is of the order of picobarns, while the branching fraction for FCNC decays is very small, i.e. below 1 %. Top quarks are therefore reconstructed in the SM decay mode \(t\rightarrow Wb\). The W boson can decay into a quark–antiquark pair (\(W\rightarrow q_1 \bar{q}_{2}\)) or a charged lepton–neutrino pair (\(W\rightarrow \ell \nu \)); only the latter is considered here. This search targets the signature from the \(qg \rightarrow t \rightarrow W(\rightarrow \ell \nu )\,b\) process. Events are characterised by an isolated high-energy charged lepton (electron or muon), missing transverse momentum from the neutrino and exactly one jet produced by the hadronisation of the b-quark. Events with a W boson decaying into a \(\tau \) lepton, where the \(\tau \) decays into an electron or a muon, are also included. Several SM processes have the same final-state topology and are considered as background to the FCNC analysis. The main backgrounds are V+jets production (especially in association with heavy quarks), where V denotes a W or a Z boson, SM top-quark production, diboson production, and multi-jet production via QCD processes. The studied process can be differentiated from SM single top-quark production, which is usually accompanied by additional jets. Furthermore, FCNC production has kinematic differences from the background processes, such as lower transverse momenta of the top quark.

This paper is organised as follows: Sect. 2 provides a description of the ATLAS detector. Section 3 gives an overview of the data and Monte Carlo (MC) samples used for the simulation of signal and expected background events from SM processes. In Sect. 4 the event selection is presented. The methods of event classification into signal- and background-like events using a neural network are discussed in Sect. 5 and sources of systematic uncertainty are summarised in Sect. 6. The results are presented in Sect. 7 and the conclusions are given in Sect. 8.

2 ATLAS detector

The ATLAS detector [34] is a multipurpose collider detector built from a set of sub-detectors, which cover almost the full solid angle around the interaction point.Footnote 2 It is composed of an inner tracking detector (ID) close to the interaction point surrounded by a superconducting solenoid providing a 2 T axial magnetic field, electromagnetic and hadronic calorimeters, and a muon spectrometer (MS). The ID consists of a silicon pixel detector, a silicon microstrip detector providing tracking information within pseudorapidity \(|\eta | < 2.5\), and a straw-tube transition radiation tracker that covers \(|\eta | < 2.0\). The central electromagnetic calorimeter is a lead and liquid-argon (LAr) sampling calorimeter with high granularity, and is divided into a barrel region that covers \(|\eta | < 1.475\) and endcap regions that cover \(1.375 < |\eta | < 3.2\). An iron/scintillator tile calorimeter provides hadronic energy measurements in the central pseudorapidity range. The endcap and forward regions are instrumented with LAr calorimeters for both the electromagnetic and hadronic energy measurements, and extend the coverage to \(|\eta | = 4.9\). The MS covers \(|\eta | < 2.7\) and consists of three large superconducting toroids with eight coils each, a system of trigger chambers, and precision tracking chambers.

3 Data and simulated samples

This analysis is performed using \(\sqrt{s}= {8} \, \mathrm{TeV} \) proton–proton (pp) collision data recorded by the ATLAS experiment in 2012. Stringent detector and data quality requirements are applied, resulting in a data sample with a total integrated luminosity of 20.3 \({{\rm fb}^{-1}}\).

3.1 Trigger requirements

ATLAS employs a three-level trigger system for selecting events to be recorded. The first level (L1) is built from custom-made hardware, while the second and third levels are software based and collectively referred to as the high-level trigger (HLT). The datasets used in this analysis are defined by high-\(p_{\text {T}}\) single-electron or single-muon triggers [35, 36].

For the L1 calorimeter trigger, which is based on reduced calorimetric information, a cluster in the electromagnetic calorimeter is required with \(E_{\text {T}}> {30}\,\mathrm{GeV}\) or with \(E_{\text {T}}> {18}\,\mathrm{GeV}\). The energy deposit must be well separated from other clusters. At the HLT, the full granularity of the calorimeter and tracking information is available. The calorimeter cluster is matched to a track and the trigger electron candidate is required to have \(E_{\text {T}}> {60}\,\mathrm{GeV}\) or \(E_{\text {T}}> {24}\,\mathrm{GeV}\) with additional isolation requirements.

The single-muon trigger is based on muon candidates reconstructed in the MS. The triggered events require a L1 muon trigger-chamber track with a 15 GeV threshold on the \(p_{\text {T}}\) of the track. At the HLT, the requirement is tightened to \(p_{\text {T}}> {24}\,\mathrm{GeV}\) with, or 36 GeV without, an isolation criterion.

3.2 Simulated events

Simulated event samples are used to evaluate signal and background efficiencies and uncertainties as well as to model signal and background shapes.

For the direct production of top quarks via FCNC, MEtop [29] is used for simulating strong FCNC processes at next-to-leading order (NLO) in QCD. It introduces strong top-quark FCNC interactions through effective operators. By comparing kinematic distributions for different FCNC couplings, it has been verified that the kinematics of the signal process are independent of the a priori unknown FCNC coupling strength. As a conservative approach, only left-handed top quarks (as in the SM) are produced, and the decay of the top quark is assumed also to be as in the SM.Footnote 3 The CT10 [37] parton distribution function (PDF) sets are used for the generation of the signal events and the renormalisation and factorisation scales are set to the top-quark mass.

The Powheg-box [38] generator with the CT10 PDF sets is used to generate \(t\bar{t}\) [39] and electroweak single top-quark production in the t-channel [40], s-channel [41] and Wt-channel [42]. All processes involving top quarks, including the strong FCNC processes, are produced assuming \(m_{\text {top}}= {172.5}\,\mathrm{GeV}\). The parton shower and the underlying event are added using Pythia 6.426 [43], where the parameters controlling the modelling are set to the values of the Perugia 2011C tune [44].

Vector-boson production in association with jets (V+jets) is simulated using the multi-leg leading-order (LO) generator Sherpa 1.4.1 [45] with its own parameter tune and the CT10 PDF sets. Sherpa is used not only to generate the hard process, but also for the parton shower and the modelling of the underlying event. W+jets and Z+jets events with up to five additional partons are generated. The CKKW method [46] is used to remove overlap between partonic configurations generated by the matrix element and by parton shower evolution. Double counting between the inclusive V+n parton samples and samples with associated heavy-quark pair production is avoided consistently by using massive c- and b-quarks in the shower.

Diboson events (WW, WZ and ZZ) are produced using Alpgen 2.14 [47] and the CTEQ6L1 PDF sets [48]. The partonic events are showered with Herwig 6.5.20 [49], and the underlying event is simulated with the Jimmy 4.31 [50] model using the ATLAS Underlying Event Tune 2 [51].

All the generated samples are passed through the full simulation of the ATLAS detector [52] based on Geant4 [53] and are then reconstructed using the same procedure as for data. The simulation includes the effect of multiple pp collisions per bunch crossing. The events are weighted such that the average distribution of the number of collisions per bunch crossing is the same as in data. In addition, scale factors are applied to the simulated events to take into account small differences observed between the efficiencies for the trigger, lepton identification and b-quark jet identification. These scale factors are determined using control samples.
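The pile-up reweighting described above can be illustrated with a minimal sketch. It assumes that data and simulation are both histogrammed in the same bins of the number of collisions per bunch crossing; the function name `pileup_weights` and the toy histograms are hypothetical and are not taken from the analysis software.

```python
def pileup_weights(data_counts, mc_counts):
    """Per-bin weights that make the simulated pile-up profile match data.

    Both inputs are histograms over the same bins of the number of
    collisions per bunch crossing. Each weight is the ratio of the
    normalised data fraction to the normalised simulation fraction.
    """
    data_total = sum(data_counts)
    mc_total = sum(mc_counts)
    weights = []
    for n_data, n_mc in zip(data_counts, mc_counts):
        if n_mc == 0:
            weights.append(0.0)  # no simulated events to reweight in this bin
        else:
            weights.append((n_data / data_total) / (n_mc / mc_total))
    return weights


# Toy profiles (invented numbers, for illustration only).
data_profile = [10.0, 30.0, 40.0, 20.0]
mc_profile = [25.0, 25.0, 25.0, 25.0]

weights = pileup_weights(data_profile, mc_profile)
reweighted_mc = [w * n for w, n in zip(weights, mc_profile)]
```

After reweighting, the simulated histogram has the same shape as the data histogram, while the efficiency scale factors mentioned above would be applied as further multiplicative event weights.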

4 Event selection

The expected signature of signal events is used to perform the event selection. Events containing exactly one isolated electron or muon, missing transverse momentum and one jet, which is required to be identified as a jet originating from a b-quark, are selected.

4.1 Object definition and event selection

Electron candidates are selected from energy deposits (clusters) in the LAr electromagnetic calorimeter associated with a well-measured track fulfilling strict quality requirements [54]. Electron candidates are required to satisfy \(p_{\text {T}}> {25}\,\mathrm{GeV}\) and \(|\eta _{\text {clus}}| < 2.47\), where \(\eta _{\text {clus}}\) denotes the pseudorapidity of the cluster. Clusters falling in the calorimeter barrel–endcap transition region, corresponding to \(1.37<|\eta _{\text {clus}}|<1.52\), are ignored. High-\(p_{\text {T}}\) electrons associated with the W-boson decay can be mimicked by hadronic jets reconstructed as electrons, electrons from the decay of heavy quarks, and photon conversions. Since electrons from the W-boson decay are typically isolated from hadronic jet activity, backgrounds can be suppressed by isolation criteria, which require minimal calorimeter activity and only allow low-\(p_{\text {T}}\) tracks in an \(\eta \)–\(\phi \) cone around the electron candidate. Isolation cuts are optimised to achieve a uniform cut efficiency of 90 % as a function of \(\eta _{\text {clus}}\) and transverse energy, \(E_{\text {T}}\). The direction of the electron candidate is taken as that of the associated track. For the calorimeter isolation a cone size of \(\Delta R = 0.2\) is used. In addition, the scalar sum of all track transverse momenta within a cone of size \(\Delta R = 0.3\) around the electron direction is required to be below a \(p_{\text {T}}\)-dependent threshold in the range between 0.9 and 2.5 GeV. The track belonging to the electron candidate is excluded from this requirement.

Muon candidates are reconstructed by matching track segments or complete tracks in the MS with tracks found in the ID [55]. The final candidates are required to have a transverse momentum \(p_{\text {T}}> {25}\,\mathrm{GeV}\) and to be in the pseudorapidity region \(|\eta |<2.5\). Isolation criteria are applied to reduce background events in which a high-\(p_{\text {T}}\) muon is produced in the decay of a heavy-flavour quark. An isolation variable [56] is defined as the scalar sum of the transverse momenta of all tracks with \(p_{\text {T}}\) above 1 GeV, except the one matched to the muon, within a cone of size \(\Delta R_{\text {iso}} = {10}\,\mathrm{GeV}/p_{\text {T}}(\mu )\). Muon candidates are accepted if they have an isolation to \(p_{\text {T}}(\mu )\) ratio of less than 0.05. An overlap removal is applied between the electrons and the muons, rejecting the event if the electron and the muon share the same ID track.
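The muon isolation requirement can be written down compactly. The following is a minimal sketch, assuming that muons and tracks are represented as dictionaries with `pt` (in GeV), `eta` and `phi` entries and that the track matched to the muon has already been removed from the track list; these data structures and function names are illustrative assumptions, not the ATLAS reconstruction interface.

```python
import math


def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance, with the azimuthal difference wrapped into (-pi, pi]."""
    dphi = math.atan2(math.sin(phi1 - phi2), math.cos(phi1 - phi2))
    return math.hypot(eta1 - eta2, dphi)


def muon_is_isolated(muon, tracks, max_ratio=0.05):
    """Variable-cone track isolation as described in the text.

    The cone size shrinks with muon pT: Delta R_iso = 10 GeV / pT(mu).
    Only tracks with pT above 1 GeV inside the cone enter the sum.
    """
    cone = 10.0 / muon["pt"]
    isolation = sum(
        trk["pt"]
        for trk in tracks
        if trk["pt"] > 1.0
        and delta_r(muon["eta"], muon["phi"], trk["eta"], trk["phi"]) < cone
    )
    return isolation / muon["pt"] < max_ratio


# A 40 GeV muon has a cone of size 0.25.
example_muon = {"pt": 40.0, "eta": 0.0, "phi": 0.0}
soft_track = {"pt": 1.5, "eta": 0.1, "phi": 0.0}   # inside the cone
hard_track = {"pt": 3.0, "eta": 0.1, "phi": 0.0}   # inside the cone
far_track = {"pt": 10.0, "eta": 0.3, "phi": 0.0}   # outside the cone
```

Because the cone size scales as \(1/p_{\text {T}}(\mu )\), high-\(p_{\text {T}}\) muons are tested against a narrower cone than low-\(p_{\text {T}}\) muons.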

Jets are reconstructed using the anti-\(k_{t}\) algorithm [57] with a radius parameter of 0.4, using topological clusters [58] as inputs to the jet finding. The clusters are calibrated with a local cluster weighting method [59]. Calibrated jets using an energy- and \(\eta \)-dependent simulation-based calibration scheme, with in situ corrections based on data, are at first required to have \(p_{\text {T}}> {25}\,\mathrm{GeV}\) and \(|\eta |<2.5\). The jet energy is further corrected for the effect of multiple pp interactions, both in data and in simulated events.

If any jet is within \(\Delta R = 0.2\) of an electron, the closest jet is removed, since in these cases the jet and the electron are very likely to correspond to the same physics object. Remaining electron candidates overlapping with jets within a distance \(\Delta R<0.4\) are subsequently rejected. To reject jets from pile-up events, a so-called jet-vertex fraction criterion is applied for jets with \(p_{\text {T}}< {50}\,\mathrm{GeV}\) and \(|\eta | <2.4\): at least 50 % of the scalar sum of the \(p_{\text {T}}\) of tracks within a jet is required to be from tracks compatible with the primary vertexFootnote 4 associated with the hard-scattering collision. The final selected jet is required to have \(p_{\text {T}}> {30}\,\mathrm{GeV}\) and must also be identified as a jet originating from a b-quark (b-tagged).
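The electron–jet overlap removal and the jet-vertex fraction criterion can be sketched as follows. The dictionary layout, the function names and the treatment of jets without any associated tracks (kept here) are assumptions made for illustration; the text does not specify them.

```python
import math


def delta_r(obj1, obj2):
    """Angular distance between two objects carrying 'eta' and 'phi'."""
    dphi = math.atan2(math.sin(obj1["phi"] - obj2["phi"]),
                      math.cos(obj1["phi"] - obj2["phi"]))
    return math.hypot(obj1["eta"] - obj2["eta"], dphi)


def remove_overlaps(electrons, jets):
    """Two-step overlap removal between electrons and jets."""
    jets = list(jets)
    # Step 1: for each electron, drop the closest jet within Delta R < 0.2.
    for ele in electrons:
        close = [jet for jet in jets if delta_r(ele, jet) < 0.2]
        if close:
            closest = min(close, key=lambda jet: delta_r(ele, jet))
            jets = [jet for jet in jets if jet is not closest]
    # Step 2: reject electrons still within Delta R < 0.4 of a remaining jet.
    kept = [ele for ele in electrons
            if all(delta_r(ele, jet) >= 0.4 for jet in jets)]
    return kept, jets


def passes_jvf(jet, min_fraction=0.5):
    """Jet-vertex fraction cut, applied only for pT < 50 GeV and |eta| < 2.4.

    jet["tracks"] is a list of (track_pt, from_primary_vertex) pairs.
    """
    if jet["pt"] >= 50.0 or abs(jet["eta"]) >= 2.4:
        return True
    total = sum(pt for pt, _ in jet["tracks"])
    if total == 0.0:
        return True  # assumption: jets without tracks are not rejected
    from_pv = sum(pt for pt, is_pv in jet["tracks"] if is_pv)
    return from_pv / total >= min_fraction
```

The order of the two overlap steps matters: a jet that duplicates the electron is removed first, so that the electron is only rejected if a second, genuinely distinct jet lies nearby.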

In this analysis, a b-tagging algorithm that is optimised to improve the rejection of c-quark jets is used, since \(W+c\) production is a major background. A neural-network-based algorithm is used, which combines three different algorithms exploiting the properties of a b-hadron decay in a jet [60]. The chosen working point corresponds to a b-tagging efficiency of 50 %, when cutting on the discriminant, and a c-quark jet and light-parton jet mistag acceptance of 3.9 and 0.07 %, respectively, as measured in \(t\bar{t}\) events [61, 62].

The missing transverse momentum (with magnitude \(E_{\text {T}}^{\text {miss}}\)) is calculated based on the vector sum of energy deposits in the calorimeter projected onto the transverse plane [63]. All cluster energies are corrected using the local cluster calibration scheme. Clusters associated with a high-\(p_{\text {T}}\) jet or electron are further calibrated using their respective energy corrections. In addition, contributions from the \(p_{\text {T}}\) of selected muons are included in the calculation of \(E_{\text {T}}^{\text {miss}}\). Due to the presence of a neutrino in the final state of the signal process, \(E_{\text {T}}^{\text {miss}}> {30}\,\mathrm{GeV}\) is required. Lepton candidates in multi-jet events typically arise from charged tracks being misidentified as leptons, electrons arising from converted photons and leptons from c- and b-hadron decays. Such candidates are collectively referred to as fake leptons. As such, the multi-jet events tend to have low \(E_{\text {T}}^{\text {miss}}\) and low W-boson transverse mass,Footnote 5 \(m_{\text {T}}(W)\), relative to single top-quark events. Therefore, an additional requirement on \(m_{\text {T}}(W)\) is an effective way to reduce this background. The selection applied is \(m_{\text {T}}(W)> {50}\,\mathrm{GeV}\). In order to further suppress the multi-jet background and also to remove poorly reconstructed leptons with low transverse momentum, a requirement on the transverse momentum of leptons and the azimuthal angle between the lepton and jet is applied:

$$\begin{aligned} p_{\text {T}}^{\ell } > {90}\,\mathrm{GeV} \left( 1- \frac{\pi - |\Delta \phi (\ell , \text {jet})|}{\pi -2}\right) \,. \end{aligned}$$
(1)

The parameters of the cut are motivated by the distribution of multi-jet events, obtained in the signal region, where the simulated backgrounds except the multi-jet contribution are subtracted from data. Almost no signal events are removed by this cut. The distribution of the transverse momentum of the lepton versus the azimuthal angle between the lepton and the jet is shown in Fig. 2.
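The event-level requirements on \(E_{\text {T}}^{\text {miss}}\), \(m_{\text {T}}(W)\) and the lepton transverse momentum from Eq. 1 can be combined into a short sketch. The transverse mass is computed with the standard definition \(m_{\text {T}}(W)=\sqrt{2\,p_{\text {T}}(\ell )\,E_{\text {T}}^{\text {miss}}\,[1-\cos \Delta \phi (\ell ,E_{\text {T}}^{\text {miss}})]}\), which is assumed here because the footnote defining it is not reproduced in this section; all function names are illustrative.

```python
import math


def delta_phi(phi1, phi2):
    """Azimuthal difference wrapped into (-pi, pi]."""
    return math.atan2(math.sin(phi1 - phi2), math.cos(phi1 - phi2))


def transverse_mass(lep_pt, lep_phi, met, met_phi):
    """W-boson transverse mass (standard definition, assumed)."""
    dphi = delta_phi(lep_phi, met_phi)
    return math.sqrt(2.0 * lep_pt * met * (1.0 - math.cos(dphi)))


def lepton_pt_threshold(dphi_lep_jet):
    """Right-hand side of Eq. 1 in GeV.

    Equals 90 GeV when lepton and jet are back-to-back and falls to zero
    at |Delta phi| = 2; for smaller angles it is negative, so the cut
    imposes no constraint there.
    """
    return 90.0 * (1.0 - (math.pi - abs(dphi_lep_jet)) / (math.pi - 2.0))


def passes_event_cuts(lep_pt, lep_phi, jet_phi, met, met_phi):
    """ETmiss > 30 GeV, mT(W) > 50 GeV and the Eq. 1 requirement."""
    if met <= 30.0:
        return False
    if transverse_mass(lep_pt, lep_phi, met, met_phi) <= 50.0:
        return False
    return lep_pt > lepton_pt_threshold(delta_phi(lep_phi, jet_phi))
```

For example, a 50 GeV lepton exactly back-to-back with the jet fails Eq. 1, whereas the same lepton at \(|\Delta \phi | = 2\) passes it.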

Fig. 2

The transverse momentum of the lepton versus the azimuthal angle between the lepton and the jet. The colours indicate the number of events in data after the simulated backgrounds except the multi-jet contribution have been subtracted and before the cut given by Eq. 1 is applied. The solid black line shows the cut

In addition to the signal region defined by this selection, a control region is defined with the same kinematic requirements, but with a less stringent b-tagging requirement with an efficiency of 85 %, and excluding events passing the tighter signal-region b-tagging selection. This control region is designed such that the resulting sample is dominated by W+jets production, which is the dominant background.

4.2 Background estimation

For all background processes except the multi-jet background, the normalisations are estimated by using Monte Carlo simulation scaled to the theoretical cross-section predictions, using \(m_{\text {top}}= {172.5}\,\mathrm{GeV}\). In order to check the modelling of kinematic distributions, correction factors to the normalisation of the W+jets and \(t\bar{t}\) and single-top processes are subsequently determined simultaneously in the context of the multi-jet background estimation.

The SM single top-quark production cross-sections are calculated to approximate next-to-next-to-leading-order (NNLO) precision. The production via the t-channel exchange of a virtual W boson has a predicted cross-section of 87 pb [64]. The cross-section for the associated production of an on-shell W boson and a top quark (Wt channel) has a predicted value of 22.3 pb [65], while the s-channel production has a predicted cross-section of 5.6 pb [66]. The resulting weighted average of the theoretical uncertainties including PDF and scale uncertainties of these three processes is 10 %.
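To give a feeling for the size of these backgrounds, the quoted cross-sections can be converted into numbers of produced events using \(N = \sigma \times L\) with the integrated luminosity of 20.3 \({{\rm fb}^{-1}}\). The sketch below is a simple arithmetic illustration; it refers to events produced before any branching fraction, acceptance or selection efficiency is applied, and the variable names are illustrative.

```python
# Integrated luminosity from the text: 20.3 fb^-1 = 20300 pb^-1.
LUMINOSITY_PB = 20300.0

# Predicted SM single top-quark cross-sections in pb, as quoted in the text.
CROSS_SECTIONS_PB = {"t-channel": 87.0, "Wt": 22.3, "s-channel": 5.6}


def produced_events(cross_section_pb, luminosity_pb=LUMINOSITY_PB):
    """Number of produced events, N = sigma * L, before any selection."""
    return cross_section_pb * luminosity_pb


produced = {name: produced_events(xs) for name, xs in CROSS_SECTIONS_PB.items()}

# Relative share of each channel in the summed single top-quark cross-section.
total_xs = sum(CROSS_SECTIONS_PB.values())
fractions = {name: xs / total_xs for name, xs in CROSS_SECTIONS_PB.items()}
```

With these inputs the t-channel alone corresponds to roughly \(1.8 \times 10^{6}\) produced events, about three quarters of the summed single top-quark cross-section.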

The cross-section of the \(t\bar{t}\) process is normalised to 238 pb, calculated at NNLO in QCD including resummation of next-to-next-to-leading logarithmic (NNLL) soft gluon terms [67–71] with Top++2.0 [72]. The PDF and \(\alpha _{\mathrm {s}}\) uncertainties are calculated using the PDF4LHC prescription [73] with the MSTW2008 NNLO [74, 75] at 68 % \(\text {CL}\), the CT10 NNLO [37, 76], and the NNPDF 2.3 [77] PDF sets, and are added in quadrature to the scale uncertainty, yielding a final uncertainty of 6 %.

The cross-sections for inclusive W- and Z-boson production are predicted with NNLO precision using the FEWZ program [78, 79], resulting in a LO-to-NNLO K-factor of 1.10 and an uncertainty of 4 %. The uncertainty includes the uncertainty on the PDF and scale variations. The scale factor is applied to the prediction based on the LO Sherpa calculation and the flavour composition is also taken from the MC samples. The modelling of the transverse momentum of the W boson in the W+jets sample is improved by reweighting the simulated samples to data in the W+jets-dominated control region.

LO-to-NLO K-factors obtained with MCFM [80] of the order of 1.3 are applied to the Alpgen LO predictions for diboson production. Since the diboson process is treated together with Z-boson production in the statistical analysis and the fraction of selected events is only 5 %, the same uncertainties as used for the Z+jets process are assumed.

Multi-jet events may be selected if a jet is misidentified as an isolated lepton or if the event has a non-prompt lepton that appears to be isolated. The normalisation of this background is obtained from a fit to the observed \(E_{\text {T}}^{\text {miss}}\) distribution, performed both in the signal and control regions. In order to construct a sample of multi-jet background events, different methods are adopted for the electron and muon channels. The ‘jet-lepton’ model is used in the electron channel while the ‘anti-muon’ model is used in the muon channel [81]. In the jet-lepton model, a shape for the multi-jet background is established using events from a Pythia dijet sample, which are selected using the same criteria as the standard selection, but with a jet used in place of the electron candidate. Each candidate jet has to fulfil the same \(p_{\text {T}}\) and \(\eta \) requirements as a standard lepton and deposit 80–95 % of its energy in the electromagnetic calorimeter. Events with an electron candidate passing the electron cuts described in Sect. 4.1 are rejected and an event is accepted if exactly one ‘jet-lepton’ is found. The anti-muon model is derived from collision data. In order to select a sample that is highly enriched with muons from multi-jet events, some of the muon identification cuts are inverted or changed, e.g. the isolation criteria are inverted.

To determine the normalisation of the multi-jet background template, a binned maximum-likelihood fit is performed on the \(E_{\text {T}}^{\text {miss}}\) distribution using the observed data, after applying all selection criteria except for the cut on \(E_{\text {T}}^{\text {miss}}\). Fits are performed separately in two \(\eta \) regions for electrons: in the endcap (\(|\eta | > 1.52\)) and central (\(|\eta | < 1.37\)) region of the electromagnetic calorimeter, i.e. the transition region is excluded. For muons, the complete \(\eta \) region is used. The multi-jet templates for both the electrons and the muons are fitted together with templates derived from MC simulation for all other background processes (top quark, W+light flavour (LF), W+heavy flavour (HF), Z+jets, dibosons). Acceptance uncertainties are accounted for in the fitting process in the form of additional constrained nuisance parameters. For the purpose of these fits, the contributions from \(W\)+LF and \(W\)+HF, the contributions from \(t\bar{t}\) and single top-quark production, and the contributions from Z+jets and diboson production are each combined into one template. The normalisation of the template for Z+jets and diboson production is fixed during the fit, as its contribution is very small.
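The idea behind the template fit can be illustrated with a strongly simplified sketch. It assumes a single free parameter, the scale of the multi-jet template, with all other backgrounds summed into one fixed template; the fit used in the analysis additionally floats the other templates and includes constrained nuisance parameters. The golden-section minimiser and the toy histograms are illustrative choices, not the procedure of the paper.

```python
import math


def poisson_nll(scale, data, multijet, others):
    """Binned Poisson negative log-likelihood (constant terms dropped).

    The expectation in each bin is scale * multijet + others.
    """
    nll = 0.0
    for n_obs, n_mj, n_bkg in zip(data, multijet, others):
        mu = scale * n_mj + n_bkg
        nll += mu - n_obs * math.log(mu)
    return nll


def fit_multijet_scale(data, multijet, others, lo=0.0, hi=10.0, tol=1e-8):
    """Minimise the NLL in the multi-jet scale by golden-section search.

    The NLL is convex in the scale, so a bracketing search is sufficient.
    """
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if poisson_nll(c, data, multijet, others) < poisson_nll(d, data, multijet, others):
            b = d
        else:
            a = c
    return 0.5 * (a + b)


# Toy ETmiss histograms (invented numbers): the multi-jet template peaks at
# low ETmiss, and the pseudo-data contain exactly twice the template.
multijet_template = [50.0, 30.0, 10.0, 5.0]
other_backgrounds = [100.0, 200.0, 150.0, 50.0]
pseudo_data = [200.0, 260.0, 170.0, 60.0]

fitted_scale = fit_multijet_scale(pseudo_data, multijet_template, other_backgrounds)
```

In this toy set-up the fitted scale comes out close to 2, the value used to build the pseudo-data, because the multi-jet template has a different \(E_{\text {T}}^{\text {miss}}\) shape from the other backgrounds.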

The \(E_{\text {T}}^{\text {miss}}\) distributions after rescaling the different backgrounds and the multi-jets template to their respective fit results are shown in Fig. 3 for both the electron and the muon channels. The fitted scale factors for the other templates are close to 1.

Fig. 3

Fitted distributions of the missing transverse momentum \(E_{\text {T}}^{\text {miss}}\) for (a) central electrons and (b) muons in the control region and for (c) central electrons and (d) muons in the signal region. The last histogram bin includes overflow events and the hatched error bands contain the MC statistical uncertainty combined with the normalisation uncertainty on the multi-jet background

Table 1 provides the event yields after the complete event selection for the control and signal regions. The yields are calculated using the acceptance from MC samples normalised to their respective theoretical cross-sections including the (N)NLO K-factors, while the number of expected events for the multi-jet background is obtained from the maximum-likelihood fit. Each event yield uncertainty combines the statistical uncertainty, originating from the limited size of the simulation samples, with the uncertainty on the cross-section or normalisation. The observed event yield in data agrees well with the background prediction. For comparison, a 1 pb FCNC cross-section would lead to 530 events in the signal region. The corresponding efficiency for selecting FCNC events is 3.1 %.

Table 1 Number of observed and expected events in the control and signal region for all lepton categories added together. The uncertainties shown are derived using the statistical uncertainty from the limited size of the samples and the uncertainty on the theoretical cross-section only or multi-jet normalisation. The scale factors obtained from the multi-jet background fit are not applied when determining the expected number of events

Kinematic distributions in the control region of the identified lepton, reconstructed jet, \(E_{\text {T}}^{\text {miss}}\) and \(m_{\text {T}}(W)\) are shown in Fig. 4 for the combined electron and muon channels. These distributions are normalised using the scale factors obtained in the \(E_{\text {T}}^{\text {miss}}\) fit to estimate the multi-jet background. Overall, good agreement between the observed and expected distributions is seen. The trends that can be seen in some of the distributions are covered by the systematic uncertainties.

Fig. 4

Kinematic distributions in the control region for the combined electron and muon channels. All processes are normalised to the result of the binned maximum-likelihood fit used to determine the fraction of multi-jet events. Shown are: (a) the transverse momentum and (b) pseudorapidity of the lepton, (c) the transverse momentum and (d) pseudorapidity of the jet, (e) the missing transverse momentum and (f) W-boson transverse mass. The last histogram bin includes overflow events and the hatched band indicates the combined statistical and systematic uncertainties, evaluated after the fit discussed in Sect. 7

5 Analysis strategy

As no single variable provides sufficient discrimination between signal and background events and the separation power is distributed over many correlated variables, multivariate analysis techniques are necessary to separate signal candidates from background candidates. A neural-network (NN) classifier [82] that combines a three-layer feed-forward neural network with a preprocessing of the input variables is used. The network infrastructure consists of one input node for each input variable plus one bias node, an arbitrary number of hidden nodes, and one output node, which gives a continuous output in the interval \([-1,1]\). The training is performed with a mixture of 50 % signal and 50 % background events, where the different background processes are weighted according to their number of expected events. Only processes from simulated events are considered in the training, i.e. no multi-jet events are used. In order to check that the neural network is not overtrained, 20 % of the available simulated events are used as a test sample. Subsequently, the NN classifier is applied to all samples.
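The network topology and the composition of the training sample can be sketched in a few lines. The sketch assumes a symmetric sigmoid activation, \(\tanh (x/2)\), which maps onto the interval \([-1,1]\) quoted above; the activation function, the weight layout and all function names are assumptions made for illustration and do not describe the classifier package of Ref. [82].

```python
import math


def activation(x):
    """Symmetric sigmoid 2 / (1 + exp(-x)) - 1, written as tanh(x / 2)."""
    return math.tanh(0.5 * x)


def forward(inputs, hidden_weights, output_weights):
    """Forward pass of a three-layer feed-forward network.

    hidden_weights has one row per hidden node; each row holds one weight
    per input variable plus a final weight for the bias node.
    output_weights holds one weight per hidden node.
    """
    input_layer = list(inputs) + [1.0]  # input nodes plus one bias node
    hidden = [activation(sum(w * x for w, x in zip(row, input_layer)))
              for row in hidden_weights]
    return activation(sum(w * h for w, h in zip(output_weights, hidden)))


def training_weights(n_signal_mc, bkg_expected, bkg_mc):
    """Per-event weights giving a 50 % signal / 50 % background mixture.

    Background processes share their 50 % in proportion to their expected
    event yields; bkg_mc gives the number of simulated events per process.
    """
    signal_weight = 0.5 / n_signal_mc
    total_expected = sum(bkg_expected.values())
    bkg_weights = {
        name: 0.5 * bkg_expected[name] / total_expected / bkg_mc[name]
        for name in bkg_expected
    }
    return signal_weight, bkg_weights
```

An output near \(+1\) would then be read as signal-like and an output near \(-1\) as background-like.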

Table 2 Variables used in the training of the neural network ordered by their descending importance

The \(qg\rightarrow t\rightarrow b\ell \nu \) process is characterised by three main differences from SM processes. Firstly, the \(p_{\text {T}}\) distribution of the top quark is much softer than the \(p_{\text {T}}\) distribution of top quarks produced through SM top-quark production, since the top quark is produced almost without transverse momentum. Hence, the W boson and b-quark from the top-quark decay are produced almost back-to-back in the transverse plane. Secondly, unlike in the W/Z+jets and diboson backgrounds, the W boson from the top-quark decay has a high momentum and its decay products tend to have small angles. Lastly, the top-quark charge asymmetry differs between FCNC processes and SM processes in the ugt channel. In pp collisions, the FCNC processes are predicted to produce four times more single top quarks than anti-top quarks, whereas in SM single top-quark production and in all other SM backgrounds this ratio is at most two.

Several categories of variables are considered as potential discriminators between the signal and background processes. Apart from basic event kinematics such as the \(m_{\text {T}}(W)\) or \(H_{\text {T}}\) (the scalar sum of the transverse momenta of all objects in the final state), various object combinations are considered as well. These include the basic kinematic properties of reconstructed objects like the W boson and the top quark, as well as angular distances in \(\eta \) and \(\phi \) between the reconstructed and final-state objects in the laboratory frame and in the rest frames of the W boson and the top quark. In order to reconstruct the four-vector of the W boson, a mass constraint is used. A detailed description of the top-quark reconstruction is given in Ref. [83]. Further, integer variables such as the charge of the lepton are considered.
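For orientation, the basic event-level quantities named above can be sketched as follows. The formulas are the standard definitions, assumed here rather than taken from this section: \(H_{\text {T}}\) as a scalar sum of transverse momenta, \(\Delta R = \sqrt{(\Delta \eta )^2 + (\Delta \phi )^2}\), and \(m_{\text {T}}(W) = \sqrt{2 p_{\text {T}}^{\ell } E_{\text {T}}^{\text {miss}} (1 - \cos \Delta \phi )}\). All function names and numerical values are hypothetical.

```python
import math

def delta_phi(phi1, phi2):
    """Azimuthal difference wrapped into (-pi, pi]."""
    dphi = (phi1 - phi2) % (2.0 * math.pi)
    return dphi - 2.0 * math.pi if dphi > math.pi else dphi

def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance in the eta-phi plane."""
    return math.hypot(eta1 - eta2, delta_phi(phi1, phi2))

def h_t(pts):
    """Scalar sum of the transverse momenta of all final-state objects."""
    return sum(pts)

def m_t_w(pt_lep, phi_lep, met, phi_met):
    """W-boson transverse mass from the lepton and the missing transverse momentum."""
    return math.sqrt(2.0 * pt_lep * met * (1.0 - math.cos(delta_phi(phi_lep, phi_met))))

# Hypothetical event: lepton, jet and missing transverse momentum (GeV).
example_ht = h_t([30.0, 40.0, 25.0])
example_mt = m_t_w(40.0, 0.0, 40.0, math.pi)  # back-to-back lepton and E_T^miss
```

The wrapping in `delta_phi` matters because \(\phi \) is periodic: two objects at \(\phi = 3.0\) and \(\phi = -3.0\) are close in azimuth, not almost \(2\pi \) apart.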

The ranking of the variables in terms of their discrimination power is automatically determined as part of the preprocessing step and is independent of the training procedure [84]. Only the highest-ranking variables are chosen for the training of the neural network. Each variable is tested beforehand for agreement between the background model and the distribution of the observed events in the control region. Using only variables with an a priori defined separation power, 13 variables remain in the network. Table 2 shows a summary of the variables used, ordered by their importance. The probability densities of the three most important discriminating variables for the dominant background processes together with the signal are displayed in Fig. 5.

The distributions for three of the four most important variables in the control and signal regions are shown in Fig. 6. The shape of the multi-jet background is obtained using the samples described in Sect. 4.2. The distribution of \(p_{\text {T}}^{\ell }\) is shown in Fig. 4a for the control region. The distributions are normalised using the scale factors obtained in the binned maximum-likelihood fit to the \(E_{\text {T}}^{\text {miss}}\) distribution.

The resulting neural-network output distributions for the most important background processes and the signal are displayed in Fig. 7 as probability densities and in Fig. 8a, b normalised to the number of expected events in the control and signal regions, respectively. Signal-like events have output values close to 1, whereas background-like events accumulate near \(-1\). Overall, good agreement within systematic uncertainties between data and the background processes is observed in both the control and signal regions.

Fig. 5

Probability densities of the three most important discriminating variables: a the transverse mass of the reconstructed top quark; b the transverse momentum of the charged lepton; and c the distance in the \(\eta \)–\(\phi \) plane between the charged lepton and the reconstructed top quark. The last histogram bin includes overflows

Fig. 6

Distributions of three important discriminating variables (except for the transverse momentum of the lepton): a, d the top-quark transverse mass in the control and signal regions; b, e the \(\Delta R\) between the lepton and the reconstructed top quark in the control and signal regions; c, f the \(\Delta \phi \) between the jet and the reconstructed top quark in the control and signal regions. All processes are normalised using the scale factors obtained in the binned maximum-likelihood fit to the \(E_{\text {T}}^{\text {miss}}\) distribution. The FCNC signal cross-section is scaled to 50 pb and overlaid on the distributions in the signal region. The last histogram bin includes overflow events and the hatched band indicates the combined statistical and systematic uncertainties, evaluated after the fit discussed in Sect. 7

Fig. 7

Probability density of the neural-network output distribution for the signal and the most important background processes

Fig. 8

Neural-network output distribution a in the control region and b in the signal region. The shape of the signal scaled to 50 pb is shown in b. All background processes are shown normalised to the result of the binned maximum-likelihood fit used to determine the fraction of multi-jet events. The hatched band indicates the combined statistical and systematic uncertainties, evaluated after the fit discussed in Sect. 7

6 Systematic uncertainties

Systematic uncertainties are assigned to account for detector calibration and resolution uncertainties, as well as the uncertainties on theoretical predictions. These can affect the normalisation of the individual backgrounds and the signal acceptance (acceptance uncertainties) as well as the shape of the neural-network output distribution (shape uncertainties). Quoted relative uncertainties refer to the acceptance of the respective processes unless stated otherwise.

6.1 Object modelling

The effects of the systematic uncertainties due to the residual differences between data and Monte Carlo simulation, uncertainties on jets, electron and muon reconstruction after calibration, and uncertainties on scale factors that are applied to the simulation are estimated using pseudo-experiments.

Uncertainties on the muon (electron) trigger, reconstruction and selection efficiency scale factors are estimated in measurements of \(Z \rightarrow \mu \mu \) (\(Z \rightarrow e e\) and \(W \rightarrow e\nu \)) production. The scale factor uncertainties are as large as 5 %. To evaluate uncertainties on the lepton momentum scale and resolution, the same processes are used [85]. The uncertainty on the charge misidentification acceptances was studied and found to be negligible for this analysis.

The jet energy scale (JES) is derived using information from test-beam data, LHC collision data and simulation. Its uncertainty varies between 2.5 and 8 %, depending on jet \(p_{\text {T}}\) and \(\eta \) [59]. This includes uncertainties in the fraction of jets induced by gluons and mismeasurements due to close-by jets. Additional uncertainties due to pile-up can be as large as 5 %. An additional jet energy scale uncertainty of up to 2.5 %, depending on the \(p_{\text {T}}\) of the jet, is applied for b-quark-induced jets to account for differences between jets containing b-hadrons and light-quark or gluon jets. Additional uncertainties arise from the modelling of the jet energy resolution and the missing transverse momentum, which accounts for contributions of calorimeter cells not matched to any jets, soft jets, and pile-up. The effect of uncertainties associated with the jet-vertex fraction is also considered for each jet.

Since the analysis makes use of b-tagging, the uncertainties on the b- and c-tagging efficiencies and the mistag acceptance [61, 62] are taken into account.

6.2 Multi-jet background

For the multi-jet background, an uncertainty on the estimated multi-jet fractions and the modelling is included. The systematic uncertainty on the fractions, as well as a shape uncertainty, are obtained by comparing to an alternative method, the matrix method [81]. The method estimates the number of multi-jet background events in the signal region based on loose and tight lepton isolation definitions, the latter selection being a subset of the former. The number of multi-jet events \(N^\text {tight}_\text {fake}\) passing the tight (signal) isolation requirements can be expressed as:

$$\begin{aligned} N^\text {tight}_\text {fake} = \frac{\epsilon _\text {fake}}{\epsilon _\text {real} - \epsilon _\text {fake}} \cdot (N^\text {loose} \epsilon _\text {real} - N^\text {tight})\,, \end{aligned}$$

where \(\epsilon _\text {real}\) and \(\epsilon _\text {fake}\) are the efficiencies for real and fake loose leptons being selected as tight leptons, \(N^\text {loose}\) is the number of selected events in the loose sample, and \(N^\text {tight}\) is the number of selected events in the signal sample. By comparing the two methods, the uncertainty on the fraction of multi-jet events is estimated to be 17 %. The shape uncertainty is constructed by comparing the neural-network output distributions of the jet-lepton and anti-muon samples with the distributions obtained using the matrix method.
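The matrix-method expression above can be checked numerically with a short sketch. The function name and all event counts and efficiencies are hypothetical; the closure test simply builds a loose sample from known numbers of real and fake leptons and verifies that the formula returns the number of fake leptons passing the tight selection.

```python
def n_fake_tight(n_loose, n_tight, eff_real, eff_fake):
    """Number of multi-jet (fake-lepton) events passing the tight isolation.

    Implements N_fake^tight = eff_fake / (eff_real - eff_fake)
                              * (N_loose * eff_real - N_tight).
    """
    if not eff_real > eff_fake:
        raise ValueError("the method requires eff_real > eff_fake")
    return eff_fake / (eff_real - eff_fake) * (n_loose * eff_real - n_tight)

# Hypothetical closure test: 8000 real and 2000 fake leptons in the loose sample.
eff_real, eff_fake = 0.8, 0.2
n_real_loose, n_fake_loose = 8000.0, 2000.0
n_loose = n_real_loose + n_fake_loose
n_tight = eff_real * n_real_loose + eff_fake * n_fake_loose
estimate = n_fake_tight(n_loose, n_tight, eff_real, eff_fake)
expected = eff_fake * n_fake_loose  # fake leptons that pass the tight selection
```

The estimate becomes unstable when \(\epsilon _\text {real}\) and \(\epsilon _\text {fake}\) are close to each other, since their difference appears in the denominator.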

6.3 Monte Carlo generators

Systematic effects from the modelling of the signal and background processes are taken into account by comparing different generator models and varying the parameters of the event generation. The effect of parton-shower modelling for the top-quark processes is tested by comparing two Powheg samples interfaced to Herwig and Pythia, respectively. There are also differences associated with the way in which double-counted events in the NLO corrections and the parton showers are removed. These are estimated by comparing samples produced with the MC@NLO method and the Powheg method.

The difference between the top-quark mass used in the simulations and the measured value has negligible effect on the results.

For the single top-quark processes, variations of initial- and final-state radiation (ISR and FSR) together with variations of the hard-process scale are studied. The uncertainty is estimated using events generated with Powheg interfaced to Pythia. Factorisation and renormalisation scales are varied independently by factors of 0.5 and 2.0, while the scale of the parton shower is varied consistently with the renormalisation scale using specialised Perugia 2012 tunes [44]. The uncertainty on the amounts of ISR and FSR in the simulated \(t\bar{t}\) sample is assessed using Alpgen samples, showered with Pythia, with varied amounts of initial- and final-state radiation, which are compatible with the measurements of additional jet activity in \(t\bar{t}\) events [86].

The effect of applying the W-boson \(p_{\text {T}}\) reweighting was studied and found to have negligible impact on the shape of the neural-network output distribution and the measured cross-section. Hence no systematic uncertainty due to this was assigned.

Finally, an uncertainty is included to account for statistical effects from the limited size of the MC samples.

6.4 Parton distribution functions

Systematic uncertainties related to the parton distribution functions are taken into account for all samples using simulated events. The events are reweighted according to each of the PDF uncertainty eigenvectors or replicas and the uncertainty is calculated following the recommendation of the respective PDF group [73]. The final PDF uncertainty is given by the envelope of the estimated uncertainties for the CT10 PDF set, the MSTW2008 PDF set and the NNPDF 2.3 PDF set.
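One common way to form such an envelope is sketched below, assuming each PDF set provides a central acceptance with asymmetric uncertainties; the envelope then spans the lowest lower edge to the highest upper edge across the sets. The exact prescription of Ref. [73] is not reproduced in this section, so this reading is an assumption, and the numerical values are hypothetical.

```python
def pdf_envelope(predictions):
    """Return (midpoint, half_width) of the envelope over several PDF sets.

    predictions: dict mapping a PDF set name to (central, down, up),
                 where down and up are positive uncertainties.
    """
    lower = min(c - d for c, d, _ in predictions.values())
    upper = max(c + u for c, _, u in predictions.values())
    return 0.5 * (upper + lower), 0.5 * (upper - lower)

# Hypothetical relative acceptances for the three PDF sets.
example = {
    "CT10": (1.00, 0.03, 0.04),
    "MSTW2008": (1.02, 0.02, 0.02),
    "NNPDF2.3": (0.99, 0.02, 0.02),
}
midpoint, half_width = pdf_envelope(example)
```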

6.5 Theoretical cross-section normalisation

The theoretical cross-sections and their uncertainties are given in Sect. 4.2 for each background process. Since the single top-quark t-, Wt-, and s-channel processes are grouped together in the statistical analysis, their uncertainties are added in proportion to their relative fractions, leading to a combined uncertainty of 10 %.

A cross-section uncertainty of 4 % is assigned for the W/Z+(0 jet) process, while Alpgen parameter variations of the factorisation and renormalisation scale and the matching parameter consistent with experimental data yield an uncertainty on the cross-section ratio of 24 %. For \(W\)+HF production, a conservatively estimated uncertainty on the HF fraction of 50 % is added. This uncertainty is also applied to the combined Z+jets and diboson background.

6.6 Luminosity

The uncertainty on the measured luminosity is estimated to be 2.8 %. It is derived from beam-separation scans performed in November 2012, following the same methodology as that detailed in Ref. [87].

7 Results

In order to estimate the signal content of the selected sample, a binned maximum-likelihood fit to the complete neural-network output distributions in the signal region is performed. Including all bins of the neural-network output distributions in the fit has the advantage of making maximal use of all signal events remaining after the event selection, and, in addition, allows the background acceptances to be constrained by the data.

The signal rates, the rate of the single top-quark and \(t\bar{t}\) background and the rate of the \(W\)+HF background are fitted simultaneously. The event yields of the multi-jet background, the \(W\)+LF and the combined Z+jets/diboson background are not allowed to vary in the fit, but instead are fixed to the estimates given in Table 1.

No significant rate of FCNC single top-quark production is observed. An upper limit is set using hypothesis tests. The compatibility of the data with the signal hypothesis, which depends on the coupling constants, and the background hypothesis is evaluated by performing a frequentist hypothesis test based on pseudo-experiments, corresponding to an integrated luminosity of 20.3 fb\(^{-1}\). Two hypotheses are compared: the null hypothesis, \(H_0\), and the signal hypothesis, \(H_1\), which includes FCNC single top-quark production. For both scenarios, ensemble tests, i.e. large sets of pseudo-experiments, are performed. Systematic uncertainties are included in the pseudo-experiments using variations of the signal acceptance, the background acceptances and the shape of the neural-network output distribution due to all sources of uncertainty described in the previous section.

To distinguish between the two hypotheses, the so-called Q value is used as a test statistic. It is defined as \(-2\) times the logarithm of the ratio of the likelihood function L evaluated for the two hypotheses:

$$\begin{aligned} Q = -2 \ln \left( \frac{L\left( \beta ^\text {FCNC} = 1 \right) }{ L\left( \beta ^\text {FCNC} = 0 \right) } \right) , \end{aligned}$$
(2)

where \(\beta ^\text {FCNC}\) is the scale factor for the number of events expected from the signal process for an assumed production cross-section. Systematic uncertainties are included by varying the predicted number of events for the signal and all background processes in the pseudo-experiments.
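As an illustration of Eq. (2), the sketch below evaluates Q for a simplified likelihood, assumed here to be a product of independent Poisson terms per histogram bin with expected yield \(\beta ^\text {FCNC} s_i + b_i\). In that case the factorials cancel and \(Q = 2 \sum _i \left[ s_i - n_i \ln (1 + s_i/b_i) \right] \). The fitted background rates and systematic variations of the actual analysis are omitted, and all names and yields are hypothetical.

```python
import math

def q_value(observed, signal, background):
    """Q = -2 ln(L(beta=1) / L(beta=0)) for a product of Poisson bins.

    observed:   observed event counts n_i per bin
    signal:     expected signal yields s_i per bin (for beta = 1)
    background: expected background yields b_i per bin (must be > 0)
    """
    q = 0.0
    for n, s, b in zip(observed, signal, background):
        q += 2.0 * (s - n * math.log(1.0 + s / b))
    return q

# Hypothetical three-bin neural-network output histogram.
q_obs = q_value([12, 7, 3], [0.5, 1.5, 2.0], [11.0, 6.0, 2.5])
```

With this sign convention, signal-like data give smaller (more negative) values of Q than background-like data.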

The \(\text {CL}_{s}\) method [88] is used to derive confidence levels (\(\text {CL}\)) for a certain value of \(Q^{\text {obs}}\) and \(Q^{\text {exp}}\). A particular signal hypothesis \(H_1\), determined by given coupling constants \(\kappa _{ugt}/\Lambda \) and \(\kappa _{cgt}/\Lambda \), is excluded at the 95 % \(\text {CL}\) if a \(\text {CL}_{s}< 0.05\) is found. The observed 95 % \(\text {CL}\) upper limit on the anomalous FCNC single top-quark production cross-section multiplied by the \(t \rightarrow Wb\) branching fraction, including all uncertainties, is 3.4 pb, while the expected upper limit is \({2.9^{+1.9}_{-1.2}}\) pb.
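A minimal sketch of the \(\text {CL}_{s}\) computation from two ensembles of pseudo-experiments is given below. It assumes the usual definition \(\text {CL}_{s} = \text {CL}_{s+b} / \text {CL}_{b}\), where each confidence level is the fraction of pseudo-experiments under the respective hypothesis with a Q value at least as background-like as the observed one (that is, \(Q \ge Q^{\text {obs}}\) for the sign convention of Eq. (2)). The function name and the tiny hand-made ensembles are hypothetical; a real ensemble test would use a large number of pseudo-experiments.

```python
def cl_s(q_obs, q_signal_plus_bkg, q_bkg_only):
    """CL_s from Q values of pseudo-experiments under H1 and H0."""
    cl_sb = sum(q >= q_obs for q in q_signal_plus_bkg) / len(q_signal_plus_bkg)
    cl_b = sum(q >= q_obs for q in q_bkg_only) / len(q_bkg_only)
    if cl_b == 0.0:
        raise ZeroDivisionError("CL_b is zero for this ensemble")
    return cl_sb / cl_b

# Hypothetical ensembles of Q values (eight pseudo-experiments each).
q_h1 = [-5.0, -4.0, -3.0, -2.0, -1.0, 0.0, 1.0, 2.0]  # signal + background
q_h0 = [-1.0, 0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0]      # background only
value = cl_s(1.0, q_h1, q_h0)
excluded_at_95 = value < 0.05
```

Dividing by \(\text {CL}_{b}\) protects against excluding signal hypotheses to which the analysis has little sensitivity, which is the motivation for using \(\text {CL}_{s}\) rather than \(\text {CL}_{s+b}\) alone.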

To visualise the observed upper limit in the neural-network output distribution, the FCNC signal process scaled to 3.4 pb stacked on top of all background processes is shown in Fig. 9.

The total uncertainty is dominated by the jet energy resolution uncertainty, the modelling of \(E_{\text {T}}^{\text {miss}}\) and the uncertainty on the normalisation and the modelling of the multi-jet background. A summary of all considered sources and their impact on the expected upper limit is shown in Table 3.

Table 3 The effect of a single systematic uncertainty in addition to the cross-section normalisation and MC statistical uncertainties alone (top row) on the expected 95 % \(\text {CL}\) upper limits on the anomalous FCNC single top-quark production \(qg\rightarrow t \rightarrow b\ell \nu \). The relative change quoted in the third column is with respect to the expected limit with normalisation and MC statistical uncertainties only

Fig. 9

a Neural-network output distribution in the signal region and b in the signal region with neural network output above 0.1. In both figures the signal contribution scaled to the observed upper limit is shown. The hatched band indicates the total posterior uncertainty as obtained from the limit calculation

Using the NLO predictions for the FCNC single top-quark production cross-section [89, 90] and assuming \(\mathcal {B}(t \rightarrow Wb) = 1\), the upper limit on the cross-section can be interpreted as a limit on the coupling constants divided by the scale of new physics: \(\kappa _{ugt}/\Lambda < {5.8 \times 10^{-3} \,\mathrm{TeV}^{-1}}\) assuming \(\kappa _{cgt}/\Lambda = 0\), and \(\kappa _{cgt}/\Lambda < {13 \times 10^{-3}\, \mathrm{TeV}^{-1}}\) assuming \(\kappa _{ugt}/\Lambda = 0\). Distributions of the upper limits on the coupling constants for combinations of cgt and ugt channels are shown in Fig. 10a.

Limits on the coupling constants can also be interpreted as limits on the branching fractions using \(\mathcal {B}(t \rightarrow qg) = \mathcal {C} \left( \kappa _{qgt} / \Lambda \right) ^{2}\), where \(\mathcal {C}\) is calculated at NLO [91]. Upper limits on the branching fractions of \(\mathcal {B}(t \rightarrow ug) < 4.0 \times 10^{-5}\), assuming \(\mathcal {B}(t \rightarrow cg)=0\), and \(\mathcal {B}(t \rightarrow cg) < {20 \times 10^{-5}}\), assuming \(\mathcal {B}(t \rightarrow ug)=0\), are derived and presented in Fig. 10b.
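The quadratic relation between branching fraction and coupling can be used as a consistency check on the quoted numbers. The sketch below inverts \(\mathcal {B} = \mathcal {C} (\kappa _{qgt}/\Lambda )^{2}\) to obtain the coefficient implied by each pair of limits. The value of \(\mathcal {C}\) from Ref. [91] is not quoted in the text, so the results are only what the rounded limits imply, and the function name is hypothetical.

```python
def implied_coefficient(branching_limit, coupling_limit):
    """Coefficient C (in TeV^2) implied by B = C * (kappa / Lambda)^2.

    coupling_limit is kappa / Lambda in units of TeV^-1.
    """
    return branching_limit / coupling_limit ** 2

# Limits quoted in the text for the ugt and cgt channels.
c_ugt = implied_coefficient(4.0e-5, 5.8e-3)
c_cgt = implied_coefficient(20e-5, 13e-3)
relative_difference = abs(c_ugt - c_cgt) / c_ugt
```

Both channels imply a coefficient of roughly 1.2 TeV\(^2\), agreeing to better than 1 %, which is consistent with a single flavour-independent \(\mathcal {C}\) given the rounding of the quoted limits.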

Fig. 10

a Upper limit on the coupling constants \(\kappa _{ugt}\) and \(\kappa _{cgt}\) and b on the branching fractions \(\mathcal {B}(t \rightarrow ug)\) and \(\mathcal {B}(t \rightarrow cg)\). The shaded band shows the one standard deviation variation of the expected limit

8 Conclusion

A search for anomalous single top-quark production via strong flavour-changing neutral currents in pp collisions at the LHC is performed. Data collected by the ATLAS experiment in 2012 at a centre-of-mass energy \(\sqrt{s} = {8}\,\mathrm{TeV}\), and corresponding to an integrated luminosity of 20.3 fb\(^{-1}\), are used. Candidate events for which a u- or c-quark interacts with a gluon to produce a single top quark are selected. To discriminate between signal and background processes, a multivariate technique using a neural network is applied. The final statistical analysis is performed using a frequentist technique. As no signal is seen in the observed output distribution, an upper limit on the production cross-section is set. The expected 95 % \(\text {CL}\) limit on the production cross-section multiplied by the \(t \rightarrow Wb\) branching fraction is \(\sigma _{qg \rightarrow t} \times \mathcal {B}(t \rightarrow Wb)< {2.9}\,\mathrm{pb}\) and the observed 95 % \(\text {CL}\) limit is \(\sigma _{qg \rightarrow t} \times \mathcal {B}(t \rightarrow Wb)< {3.4}\,\mathrm{pb}\). Upper limits on the coupling constants divided by the scale of new physics, \(\kappa _{ugt}/\Lambda < {5.8\times 10^{-3}}\,\mathrm{TeV}^{-1}\) and \(\kappa _{cgt}/\Lambda < {13 \times 10^{-3} \,\mathrm{TeV}^{-1}}\), and on the branching fractions, \(\mathcal {B}(t \rightarrow ug) < {4.0\times 10^{-5}} \) and \(\mathcal {B}(t \rightarrow cg) < {20 \times 10^{-5}} \), are derived from the observed limit. These are the most stringent limits published to date.