tWH associated production at the LHC

We study Higgs boson production in association with a top quark and a W boson at the LHC. At NLO in QCD, $tWH$ interferes with $t\bar{t}H$ and a procedure to meaningfully separate the two processes needs to be employed. In order to define $tWH$ production for both total rates and differential distributions, we consider the diagram removal and diagram subtraction techniques that have been previously proposed for treating intermediate resonances at NLO, in particular in the context of $tW$ production. These techniques feature approximations that need to be carefully taken into account when theoretical predictions are compared to experimental measurements. To this aim, we first critically revisit the $tW$ process, for which an extensive literature exists and where an analogous interference with $t\bar{t}$ production takes place. We then provide robust results for total and differential cross sections for $tW$ and $tWH$ at 13 TeV, also matching short-distance events to a parton shower. We formulate a reliable prescription to estimate the theoretical uncertainties, including those associated with the very definition of the process at NLO. Finally, we study the sensitivity to a non-Standard-Model relative phase between the Higgs couplings to the top quark and to the W boson in $tWH$ production.


Introduction
The study of the Higgs boson is one of the main pillars of the physics programme of the current and future LHC runs. Accurate measurements of the Higgs boson properties are crucial both to validate the standard model (SM) and to possibly discover new physics through the detection of deviations from the SM predictions. Another main pillar of the LHC research programme of the coming years is the study of the top quark. Being the heaviest quark, the top quark also plays a main role in Higgs boson phenomenology. In particular, the main production channel for the Higgs boson at the LHC entails a top-quark loop, while very soon Run II will be sensitive to on-shell top-antitop pair production in association with the Higgs boson, a process that will bring key information on the strength of the top-quark Yukawa interaction.
e-mail: federico.demartin@uclouvain.be
Exactly as when no Higgs is present in the final state, top quark and Higgs boson associated production can proceed either via top pair production mediated by QCD interactions, or as a single-top (anti-)quark process mediated by electroweak interactions. The latter case, despite being characterised by much smaller cross sections than the QCD production, displays a richness and peculiarities that make it phenomenologically very interesting. For example, it is sensitive to the relative phase between the Higgs coupling to the top quark and to the W boson. Single-top production (in association with a Higgs boson) can be conveniently classified in three main channels: $t$-channel, $s$-channel (depending on the virtuality of the intermediate W boson) and $tW(H)$ associated production. For the first two channels, this classification is unambiguous only up to next-to-leading order (NLO) accuracy if a five-flavour scheme (5FS) is used. Beyond NLO, the two processes interfere and cannot be uniquely separated. The associated $tW(H)$ production, on the other hand, can easily be defined only at leading-order (LO) accuracy and in the 5FS, i.e. through the partonic process $gb \to tW(H)$. At NLO, real corrections of the type $gg \to tWb(H)$ arise that can feature a resonant $\bar{t}$ in the intermediate state and therefore overlap with $gg \to t\bar{t}(H)$, i.e. with $t\bar{t}(H)$ production at LO. This would not necessarily be a problem per se, were it not for the fact that the cross section of $t\bar{t}(H)$ is one order of magnitude larger than that of $tW(H)$, and its subtraction (which can only be achieved within some approximation) leads to ambiguities that have to be carefully estimated and entails both conceptual issues and practical complications.
A fully consistent and theoretically satisfying treatment of resonant contributions can be achieved by starting from the complete final state $WbWb(H)$ in the four-flavour scheme (4FS), including all contributions, i.e. doubly, singly and non-resonant diagrams. Employing the complex-mass scheme [1,2] to deal with the finite width of the top quark guarantees the gauge invariance of the amplitude and the possibility of consistently going to NLO accuracy in QCD. This approach has already been followed for NLO calculations of $WbWb$ and other processes [3][4][5][6][7][8]. Recent advances have also proven that these calculations can be consistently matched to parton showers (PS) [9][10][11]. However, from the practical point of view, such calculations are computationally very expensive and would entail the generation of large samples including resonant and non-resonant contributions as well as their interference. This approach does not allow one to distinguish between top-pair and single-top production in the event generation. One would then need to generate signal and background together in the same sample (a procedure that would entail complications from the experimental point of view, for example in data-driven analyses) and communicate experimental results and their comparison with theory only via fiducial cross-section measurements. In any case, results for $WbWbH$ are currently available at NLO accuracy only with massless $b$ quarks [12], and therefore cannot be used for studying $tWH$.
A more pragmatic solution is to adopt a 5FS, define final states in terms of on-shell top quarks, and remove overlapping contributions by controlling the ambiguities to a level such that the NLO accuracy of the computation is not spoiled, and total cross sections as well as differential distributions can be meaningfully defined. To this aim, several techniques have been developed with different degrees of flexibility, some being suitable only to evaluate total cross sections, others being employable in event generators. They have been applied to $tW$ production and to the production of particles in SUSY or in other extensions of the SM, where the problem of resonances appearing in higher-order corrections is recurrent. Two main classes of such techniques exist for event generation, and they are generally dubbed diagram removal (DR) and diagram subtraction (DS). Unavoidably, all these approaches have their own shortcomings, some of them of a more theoretical nature, such as possible violation of gauge invariance (which, however, turns out not to be worrisome), or ambiguities in the far off-shell regions which need to be taken into account and studied on a process-by-process basis. As will be recalled in the following, DR and DS actually feature complementary virtues and vices. An important point of the 5FS approach is that the combination of the separate $t\bar{t}(H)$ and $tW(H)$ results ought not to depend on the technical details used to define the $tW(H)$ contribution, in the limit where the overlap is correctly removed and possible theoretical ambiguities are under control. In practice, the most common approach is to organise the perturbative expansion in poles of the top propagator, where $t\bar{t}(H)$ production is computed with on-shell top quarks (this approach can also be used in the 4FS [3][4][5][7]). In this case, the complementary $tW(H)$ contribution should encompass all the remaining effects, e.g. including the missing interference with $t\bar{t}(H)$ if that is not negligible. We are interested in finding a practical and reliable procedure to generate $tW(H)$ events under this scenario.
As already mentioned above, Higgs and top-quark associated processes can provide further information on the top-Higgs interaction. While in Run I the LHC experiments have not yet claimed observation of these processes, setting only limits on the signal strength [13][14][15][16][17][18][19], $t\bar{t}H$ is expected to be observed soon in Run II, allowing a first direct measurement of the top-quark Yukawa coupling $y_t$. Indeed, unlike the dominant Higgs production mode via gluon fusion, where the extraction of $y_t$ is indirect, in the case of $t\bar{t}H$ such an extraction is (rather) model-independent. In addition, $t\bar{t}H$ production is well known to be sensitive to the Higgs CP properties [20][21][22][23][24][25][26][27][28][29][30][31]. On the other hand, Higgs production in association with a single top quark ($tH$ and $tWH$), though rare, is very sensitive to departures from the SM, since the total rate can increase by more than an order of magnitude [32,33] due to constructive interference effects, becoming comparable to or even larger than $t\bar{t}H$. In particular, Higgs plus single top allows one to access the phase of $y_t$, which remains unconstrained in gluon fusion and $t\bar{t}H$; a preliminary, though not yet sufficiently sensitive, exploration has already been carried out at Run I [19]. At variance with $t$-channel and $s$-channel processes, predictions for $tWH$ cross sections are only available at LO. Accurate predictions for $tWH$ are not only important for the measurement of $tWH$ itself, but also as a possible background to $tH$ production, and in view of the observation of $t\bar{t}H$ and of the consequent extraction of Higgs couplings.
The main aim of this paper is to present the first predictions at NLO accuracy for $tWH$ cross sections at the LHC. In order to do that, we first review the different techniques that can be used to remove resonant contributions from NLO corrections and also make a proposal for an improved DS scheme. We then study the $tW$ process in detail, and compare our findings with the results already available in the literature. Finally, we apply these techniques to get novel results for $tWH$ production.
At this point, we stress that even though it is not really the original motivation of this work, a critical analysis of $tW$ is certainly welcome. The question of which approach ought to be used to describe $tW$ production is far from being only of academic interest: already during Run I, single-top production has been measured by both ATLAS and CMS in the $t$-channel [34][35][36][37], $s$-channel [38,39] and $tW$ [40][41][42] modes. In particular, in $tW$ analyses the difference between the two aforementioned methods, DR and DS (without including the $t\bar{t}$-$tW$ interference), has been added to the theoretical uncertainties. In view of the more precise measurements at Run II, a better understanding of the $t\bar{t}$-$tW$ overlap is desirable, in order to avoid any mismodelling of the process and incorrect estimates of the associated theoretical uncertainties, both in the total cross section and in the shape of distributions. Furthermore, given the large amount of data expected at Run II and beyond, a measurement aimed at studying the details of the $t\bar{t}$-$tW$ interference may become feasible, and this gives a further motivation to study the best modelling strategy. Finally, a sound understanding of $tW$ production will also be beneficial for the numerous analyses which involve $t\bar{t}$ production as a signal or as background. This is particularly true in analyses looking for a large number of jets in the final state, which typically employ Monte Carlo samples based on NLO merged [43][44][45] events, where stable top quarks are produced together with extra jets ($t\bar{t} + nj$). In this case, all kinds of non-top-pair contributions, like $tW$, need to be generated separately. While these effects are expected to be subdominant, their importance has still to be assessed and may become relevant after specific cuts, given also the plethora of analyses; an example is the background modelling in $t\bar{t}H$ or $tH$ searches.
Note that results for $WbWb$ plus one jet have been recently published [46,47], but the inclusion of extra radiation in merged samples is much more demanding if one starts from the $WbWb$ final state and thus may be impractical. Last but not least, a reliable 5FS description of $tW$ is desirable in order to assess residual flavour-scheme dependence between the 4FS ($WbWb$) and the 5FS ($t\bar{t} + tW$) modelling of this process. Such a comparison can offer insights on the relevance of initial-state logarithms resummed in the bottom-quark PDF, which are an important source of theoretical uncertainty.
The paper is organised as follows: in Sect. 2 we review the definitions of the DR and DS techniques, and we also include a proposal for an improved DS scheme. In Sect. 3 we describe our setup for NLO computations, also matched to parton shower. In Sect. 4 we review the results from these techniques in the well-studied case of $tW$ production, performing a thorough study of their possible shortcomings, considering the impact of interference effects between top-pair and single-top processes, and investigating what happens after typical cuts are imposed to define a fiducial region for the $tW$ process. In Sect. 5 we repeat a similar study for the SM $tWH$ process at NLO. We also extend the study of the $tWH$ process beyond the SM Higgs boson, investigating results from a generic CP-mixed Yukawa interaction between the Higgs and the top quark. Our study is complemented in the appendix by a quantitative assessment of the $tWb$ and $tWbH$ channels, studied as standalone processes in the 4FS and at the partonic level. In Sect. 6 we summarise our findings and propose an updated method to estimate the impact of theoretical systematics in the definition of $tW$ and $tWH$ at NLO in the 5FS.

Subtraction of the top quark pair contribution
As discussed in the introduction, the computation of higher-order corrections to $tW(H)$ requires the isolation of the $t\bar{t}(H)$ process, and its consequent subtraction. In this section we review the techniques to remove such a resonant contribution, which appears in the NLO real emissions of the $tW(H)$ process.
In the case of fixed-order calculations, and in particular when only the total cross section is computed, a global subtraction (GS) of the on-shell top quark can be employed, which just amounts to the subtraction of the total cross section for $t\bar{t}(H)$ production times the $t \to bW$ branching ratio [48,49]:
$$\sigma_{\mathrm{NLO}}^{\mathrm{GS}}(tW(H)) = \lim_{\Gamma_t \to 0}\left[\sigma_{\mathrm{NLO}}(tW(H)) - \sigma_{\mathrm{LO}}(t\bar{t}(H))\,\frac{\Gamma(t \to Wb)}{\Gamma_t}\right] \qquad (1)$$
where $\Gamma(t \to Wb)$ is the physical width, while $\Gamma_t$ is introduced in the resonant top-quark propagator as a regulator, and gauge invariance is ensured in the $\Gamma_t \to 0$ limit. A conceptually equivalent version, which can be applied locally in the virtuality of the resonant particle and in an analytic form, has been employed in the NLO computations for pair production of supersymmetric particles [50,51] and for charged Higgs boson production [52,53]. On the other hand, NLO+PS simulations require a subtraction which is fully local in the phase space. In order to achieve such a local subtraction, two main schemes have been developed, known as diagram removal (DR) and diagram subtraction (DS) [54]. These subtraction schemes have been studied in detail for $tW$ production matched to parton shower in MC@NLO [54,55] and in Powheg [56], as well as in the case of $tH^-$ [57] and for supersymmetric particle pair production [58][59][60][61].
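The global subtraction (GS) introduced above can be illustrated with a toy numerical sketch. It is not the calculation performed in this work: `toy_sigma_nlo` is a hypothetical stand-in for a real NLO rate, and the inputs (500 pb, 70 pb, the slope) are placeholders chosen only to show how the resonant piece, which grows like $1/\Gamma_t$, cancels before the regulator is extrapolated to zero.

```python
GAMMA_PHYS = 1.4803  # GeV; LO top-quark width quoted in the text


def toy_sigma_nlo(gamma_reg, sigma_ttbar, finite_part, slope):
    """Hypothetical stand-in for an NLO tW(H) rate computed with regulator gamma_reg.

    The resonant piece grows like sigma_ttbar * Gamma(t->Wb) / gamma_reg,
    while the remainder is modelled as finite_part + slope * gamma_reg.
    """
    return finite_part + slope * gamma_reg + sigma_ttbar * GAMMA_PHYS / gamma_reg


def gs_at_width(gamma_reg, sigma_ttbar, finite_part, slope):
    """Subtract the top-pair rate times the branching ratio at fixed regulator."""
    sigma_nlo = toy_sigma_nlo(gamma_reg, sigma_ttbar, finite_part, slope)
    return sigma_nlo - sigma_ttbar * GAMMA_PHYS / gamma_reg


def gs_zero_width_limit(sigma_ttbar, finite_part, slope, widths=(0.1, 0.01)):
    """Linear extrapolation of the subtracted rate to gamma_reg -> 0."""
    g1, g2 = widths
    s1 = gs_at_width(g1, sigma_ttbar, finite_part, slope)
    s2 = gs_at_width(g2, sigma_ttbar, finite_part, slope)
    return s2 - g2 * (s1 - s2) / (g1 - g2)


# Placeholder inputs in pb, chosen only to exercise the code.
result = gs_zero_width_limit(sigma_ttbar=500.0, finite_part=70.0, slope=0.5)
```

In this toy model the extrapolation recovers the finite part by construction; in a real computation the residual dependence on the regulator would have to be checked numerically.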
To keep the discussion as compact as possible, we focus on $tW$ production (see Fig. 1 for the LO diagrams) and consider the specific case of the $tW^-\bar{b}$ real emission and of its overlap with $t\bar{t}$ production. The extension to the process with an extra Higgs boson is straightforward. Strictly speaking, one should consider $t\bar{t}$ and $tW^-\bar{b}$ ($\bar{t}W^+b$) processes as doubly resonant and singly resonant contributions to $WbWb$ production, which also contains the set of non-resonant diagrams as shown in Fig. 2. However, as discussed in detail in the appendix, the contribution from non-resonant $WbWb$ production and off-shell effects for the final-state top quark are tiny, as well as possible gauge-dependent effects due to the introduction of a finite top width. Therefore, we will treat one top quark as a final-state particle with zero width, so that the only intermediate resonance appears in top-pair amplitudes. The squared matrix element for producing a $tW^-\bar{b}$ final state can be written as
$$|\mathcal{A}_{tWb}|^2 = |\mathcal{A}_{1t} + \mathcal{A}_{2t}|^2 = |\mathcal{A}_{1t}|^2 + 2\,\mathrm{Re}(\mathcal{A}_{1t}\mathcal{A}_{2t}^*) + |\mathcal{A}_{2t}|^2 \qquad (2)$$
where $\mathcal{A}_{1t}$ denotes the single-top amplitudes, considered as the real-emission corrections to the $tW$ process, while $\mathcal{A}_{2t}$ represents the resonant top-pair amplitudes describing $t\bar{t}$ production, where the intermediate $\bar{t}$ can go on-shell. The corresponding representative Feynman diagrams are shown in Fig. 2. In the following, we will discuss the DR and DS techniques in detail.
Fig. 2 Examples of doubly resonant (left), singly resonant (centre) and non-resonant (right) diagrams contributing to $WbWb$ production. The first two diagrams on the left (with the $t$ line cut) describe the NLO real-emission contribution to the $tW^-$ process
DR (diagram removal): Two different versions of DR have been proposed in the literature:
- DR1 (without interference): This was first proposed in [54] for $tW$ production and its implementation in MC@NLO. One simply sets $\mathcal{A}_{2t} = 0$, removing not only $|\mathcal{A}_{2t}|^2$, which can be identified with $t\bar{t}$ production, but also the interference term $2\,\mathrm{Re}(\mathcal{A}_{1t}\mathcal{A}_{2t}^*)$, so that the only contribution left is
$$|\mathcal{A}_{tWb}|^2_{\mathrm{DR1}} = |\mathcal{A}_{1t}|^2 \qquad (3)$$
This technique is the simplest from the implementation point of view and, since diagrams with intermediate top quarks are completely removed from the calculation, it does not need the introduction of any regulator.
- DR2 (with interference): This second version of DR was first proposed in [50] for squark-pair production. In this case, one removes only $|\mathcal{A}_{2t}|^2$, keeping the contribution of the interference between singly and doubly resonant diagrams
$$|\mathcal{A}_{tWb}|^2_{\mathrm{DR2}} = |\mathcal{A}_{1t}|^2 + 2\,\mathrm{Re}(\mathcal{A}_{1t}\mathcal{A}_{2t}^*) \qquad (4)$$
Note that the DR2 matrix element is not positive-definite, at variance with DR1. In this case, while the integral is finite even with $\Gamma_t \to 0$, in practice one has to introduce a finite $\Gamma_t$ in the amplitude $\mathcal{A}_{2t}$ in order to improve the numerical stability of the phase-space integration.
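The two DR definitions can be made concrete with a minimal sketch that treats the single-top and top-pair amplitudes as plain complex numbers. This is only an illustration: in a real generator the amplitudes carry helicity and colour indices and are summed accordingly, and the function name and input values below are hypothetical.

```python
def squared_matrix_elements(a1t: complex, a2t: complex) -> dict:
    """Split |A_1t + A_2t|^2 into its pieces and build the DR1/DR2 variants."""
    single = abs(a1t) ** 2                              # |A_1t|^2
    interference = 2.0 * (a1t * a2t.conjugate()).real   # 2 Re(A_1t A_2t^*)
    pair = abs(a2t) ** 2                                # |A_2t|^2
    return {
        "full": single + interference + pair,
        "DR1": single,                  # top-pair amplitude set to zero
        "DR2": single + interference,   # only |A_2t|^2 removed
    }


# Hypothetical amplitudes with destructive interference.
pieces = squared_matrix_elements(1.0 + 0.0j, -2.0 + 0.0j)
```

With these inputs the DR2 weight is negative while DR1 stays positive, which is the behaviour noted above for a matrix element that is not positive-definite.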
An important remark concerning the DR schemes is that, as they are based on removing contributions all over the phase space, they are not gauge invariant. However, for $tW$ the issue was investigated in detail in [54], and effects due to gauge dependence have been found to be negligible. We have confirmed this finding for both $tW$ and $tWH$ in a different way, and we discuss the details in the appendix, where we show that gauge dependence is not an issue if one uses a covariant gauge, such as the Feynman gauge implemented in MadGraph5_aMC@NLO.
DS (diagram subtraction): DS methods, first proposed for the MC@NLO $tW$ implementation, have been developed explicitly to avoid the problem of gauge dependence, which, at least in principle, affects the DR techniques. The DS matrix element is written as
$$|\mathcal{A}_{tWb}|^2_{\mathrm{DS}} = |\mathcal{A}_{1t} + \mathcal{A}_{2t}|^2 - \mathcal{C}_{2t} \qquad (5)$$
where the local subtraction term $\mathcal{C}_{2t}$, by definition, must [54,56]:
1. cancel exactly the resonant matrix element $|\mathcal{A}_{2t}|^2$ when the kinematics is exactly on top of the resonant pole;
2. be gauge invariant;
3. decrease quickly away from the resonant region.
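Given the above conditions, a subtraction term can be written as
$$\mathcal{C}_{2t}(\{p_i\}) = f(p_{Wb}^2)\,|\mathcal{A}_{2t}(\{q_i\})|^2 \qquad (6)$$
where $p_{Wb} = p_W + p_b$ and $\{q_i\}$ are the external momenta after a reshuffling that puts the internal antitop quark on mass-shell, i.e.
$$\{p_i\} \to \{q_i\}: \quad q_{Wb}^2 \equiv (q_W + q_b)^2 = m_t^2 \qquad (7)$$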
Such a reshuffling is needed in order to satisfy gauge invariance of $\mathcal{C}_{2t}$, which in turn implies gauge invariance of the DS matrix element of Eq. (5) in the $\Gamma_t \to 0$ limit. There is freedom to choose the prefactor $f(p_{Wb}^2)$, and the Breit-Wigner profile is a natural option to satisfy the third condition. Here, we consider two slightly different Breit-Wigner distributions:
- DS1:
$$f_1(s) = \frac{(m_t\Gamma_t)^2}{(s - m_t^2)^2 + (m_t\Gamma_t)^2} \qquad (8)$$
which is just the ratio between the two Breit-Wigner functions for the top quark computed before and after the momenta reshuffling, as implemented in MC@NLO and POWHEG for $tW$ [54,56].
- DS2:
$$f_2(s) = \frac{(\sqrt{s}\,\Gamma_t)^2}{(s - m_t^2)^2 + (\sqrt{s}\,\Gamma_t)^2} \qquad (9)$$
This off-shell profile of the resonance differs from DS1 by the replacement $m_t\Gamma_t \to \sqrt{s}\,\Gamma_t$ [62,63]. The exact shape of a resonance may be process-dependent, and in the specific case of $tW(H)$ we find that this profile is in better agreement than DS1 with the off-shell line shape of the amplitudes $|\mathcal{A}_{2t}|^2$ (away from the $Wb$ threshold), as can be seen in Fig. 3. In particular, we have checked that the agreement between the $|\mathcal{A}_{2t}|^2$ profile and the $\mathcal{C}_{2t}$ subtraction term in DS2 holds for the separate $q\bar{q}$ and $gg$ channels; at least in the $q\bar{q}$ channel there is no gauge-related issue, off-shell effects in top-pair production are correctly described by $|\mathcal{A}_{2t}|^2$, and DS2 captures these effects better. As will be shown later, this modification in the resonance profile leads to appreciable differences between the two DS methods at the level of total cross sections as well as differential distributions.
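The two Breit-Wigner prefactors can be compared numerically with a short sketch. The functional forms below are our reading of the DS1 and DS2 definitions (a ratio of Breit-Wigner functions, and the same with $m_t\Gamma_t$ replaced by $\sqrt{s}\,\Gamma_t$); the function names are hypothetical, and the default mass and width are the values quoted elsewhere in the text.

```python
M_TOP = 172.5       # GeV, top-quark mass used in the text
GAMMA_TOP = 1.4803  # GeV, LO top-quark width quoted in the text


def f_ds1(s, m=M_TOP, gamma=GAMMA_TOP):
    """DS1 prefactor: fixed-width Breit-Wigner ratio, equal to 1 at s = m^2."""
    return (m * gamma) ** 2 / ((s - m ** 2) ** 2 + (m * gamma) ** 2)


def f_ds2(s, m=M_TOP, gamma=GAMMA_TOP):
    """DS2 prefactor: same profile with m * gamma replaced by sqrt(s) * gamma."""
    return s * gamma ** 2 / ((s - m ** 2) ** 2 + s * gamma ** 2)


# Compare the two profiles at Wb invariant masses of 150, 172.5 and 200 GeV.
profile = [(mass, f_ds1(mass ** 2), f_ds2(mass ** 2)) for mass in (150.0, M_TOP, 200.0)]
```

Both prefactors equal one on the pole, as required by the first condition on the subtraction term. Under these forms DS2 subtracts more than DS1 above the pole and less below it, which is where the two schemes can differ.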
Apart from the different resonance line shapes, another important remark on DS concerns the reshuffling of the momenta. Such a reshuffling is not a Lorentz transformation, since it changes the mass of the $Wb$ system; therefore, different momenta transformations could result in different subtraction terms. Actually, there is an intrinsic arbitrariness in defining the on-shell reshuffling, potentially leading to different counterterms and effects. Thus, on the one hand DS ensures that gauge invariance is preserved in the $\Gamma_t \to 0$ limit, at variance with DR. On the other hand, it introduces a possible dependence on how the on-shell reshuffling is implemented, which is not present in the DR approach and needs to be carefully assessed. To our knowledge, this problem has not been discussed in depth in the literature; a more detailed study is under way and will be reported elsewhere. In this work, we adopt the reshuffling employed by MC@NLO and POWHEG [54,56], where the recoil is shared democratically among the initial-state particles, also rescaling by the difference in parton luminosities due to the change of the partonic centre-of-mass energy.
Finally, we comment on the introduction of a non-zero top-quark width in the DR2 and DS methods. In order to regularise the singularity of $\mathcal{A}_{2t}$, we have to modify the denominator of the resonant top-quark propagators as
$$\frac{1}{p_{Wb}^2 - m_t^2} \;\to\; \frac{1}{p_{Wb}^2 - m_t^2 + i\,m_t\Gamma_t} \qquad (10)$$
At variance with the case of a physical resonance, here $\Gamma_t$ is just a mathematical regulator that does not necessarily need to be equal to the physical top-quark width. In fact, one can set it to any number that satisfies $\Gamma_t/m_t \ll 1$ without affecting the numerical result in a significant way [58,60]. We have checked that the NLO DR2 and DS codes provide stable results with $\Gamma_t$ in the interval between 1.48 and 0.001 GeV.
After all the technical details exposed in this section, we summarise the key points in order to clearly illustrate our rationale in assessing the results in the next sections:
- Our starting point is to assume the (common) case where results for $t\bar{t}(H)$ production are generated with on-shell top quarks. Resonance profile and correlation among production and decay are partially recovered from the off-shell LO amplitudes with decayed top quarks, following the procedure illustrated in [64]. In particular, after this procedure the on-shell production cross section is not changed.
- From Fig. 3 and the related discussion, we already find DS2 to provide a better treatment than DS1 in the subtraction of the off-shell $t\bar{t}(H)$ contribution; the difference between DS1 and DS2 quantifies the impact of different off-shell profiles.
- DR is in general gauge dependent. The difference between GS and DR2 amounts to the impact of possible gauge-dependent contributions and off-shell effects. As will be shown, for the $tW$ and $tWH$ processes this difference is tiny. Finally, the difference between DR2 and DR1 amounts to the interference effects between $t\bar{t}(H)$ and $tW(H)$; the single-top process is well defined per se only if the impact of interference is small.
As a last comment, we argue that in practice gauge dependence in DR should not be an issue in our case. When using a covariant gauge and only transverse external gluons, any gauge-dependent term decouples from the $gg \to tWb$ amplitudes [54], and this remains valid also after adding a Higgs. An independent constraint on gauge-dependent effects comes also from the off-shell profiles in Fig. 3. In the $q\bar{q}$ channel, $|\mathcal{A}_{2t}|^2$ is free from gauge dependence and validates the $\mathcal{C}_{2t}$ DS2 off-shell profile for $tW(H)$; the gauge-invariant DS2 counterterm continues to agree with $|\mathcal{A}_{2t}|^2$ also in the $gg$ channel, which in turn limits the size of alleged gauge-dependent effects in DR2. Moreover, even in the case of a significant gauge dependence, its effects should cancel out in a consistent combination of $t\bar{t}(H)$ and $tW(H)$ events, if the off-shell amplitudes used to decay $t\bar{t}(H)$ have been computed in the same gauge as $tW(H)$.

Setup for NLO+PS simulation
The code and events for $tW$ production at hadron colliders at NLO-QCD accuracy can be generated in the MadGraph5_aMC@NLO framework by issuing the following commands:
  > import model loop_sm-no_b_mass
  > generate p p > t w- [QCD]
  > add process p p > t~ w+ [QCD]
  > output
  > launch
and similarly for $tWH$ production, adding the Higgs boson to the final state of each process. The output of these commands contains, among the NLO real emissions, the $tWb$ amplitudes that have to be treated with DR or DS. The technical implementation of DR1 (no interference) in the NLO code simply amounts to editing the relevant matrix_*.f files, setting the top-pair amplitudes to zero. To implement DR2, on the other hand, one subtracts the square of the top-pair amplitudes from the full matrix element. A subtlety is that the top-pair amplitudes (and only those) need to be regularised by introducing a non-zero width in the top-quark propagator. Note that, as we have already remarked in Sect. 2, this width is just a mathematical regulator. The DS is more complicated, since it also requires the implementation of the momenta reshuffling to put the top quark on-shell before computing the subtraction term $\mathcal{C}_{2t}$. The automation of such on-shell subtraction in the MadGraph5_aMC@NLO framework is under way and will become publicly available in the near future.
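For scripted runs, the command sequence can be assembled programmatically, as in the small sketch below. The helper `mg5_commands` is hypothetical. The $tWH$ variant, obtained by appending the Higgs label `h` to each process, is an assumption on our side: it follows the usual MadGraph5_aMC@NLO process syntax but is not spelled out in the text.

```python
def mg5_commands(with_higgs: bool = False) -> list:
    """Return the MadGraph5_aMC@NLO command lines for tW, or tWH if requested.

    The " h" suffix used for tWH is an assumption, not taken from the text.
    """
    extra = " h" if with_higgs else ""
    return [
        "import model loop_sm-no_b_mass",
        f"generate p p > t w-{extra} [QCD]",
        f"add process p p > t~ w+{extra} [QCD]",
        "output",
        "launch",
    ]


# One command per line, ready to be written to a command file.
tw_script = "\n".join(mg5_commands())
twh_script = "\n".join(mg5_commands(with_higgs=True))
```

The resulting strings can be saved to a file and passed to the generator's command-line interface. The DR or DS treatment of the $tWb$ real emissions still has to be applied to the generated code afterwards.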
In our numerical simulations we set the mass of the Higgs boson to $m_H = 125.0$ GeV and the mass of the top quark to $m_t = 172.5$ GeV, which are the reference values used by the ATLAS and CMS collaborations at the present time in Monte Carlo generations. We renormalise the top Yukawa coupling on-shell by setting it to $y_t/\sqrt{2} = m_t/v$, where $v \simeq 246$ GeV is the electroweak vacuum expectation value, computed from the Fermi constant $G_F = 1.16639 \times 10^{-5}$ GeV$^{-2}$; the electromagnetic coupling is also fixed to $\alpha = 1/132.507$. The $W$ and $Z$ boson masses are set to $m_W = 80.419$ GeV and $m_Z = 91.188$ GeV. In the 5FS the bottom-quark mass is set to zero in the matrix element, while $m_b = 4.75$ GeV determines the threshold of the bottom-quark parton distribution function (PDF), which affects the parton luminosities. We have found the contributions proportional to the bottom Yukawa coupling to be negligible, therefore we have set $y_b = 0$ as well.
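As a quick cross-check of the electroweak inputs, the vacuum expectation value and the on-shell top Yukawa coupling can be evaluated from the quoted parameters. The sketch assumes the standard tree-level relations $v = (\sqrt{2}\,G_F)^{-1/2}$ and $y_t = \sqrt{2}\,m_t/v$; the function names are hypothetical.

```python
import math

G_FERMI = 1.16639e-5  # GeV^-2, Fermi constant quoted in the text
M_TOP = 172.5         # GeV, top-quark mass quoted in the text


def higgs_vev(g_fermi: float = G_FERMI) -> float:
    """Electroweak vacuum expectation value v = (sqrt(2) * G_F)^(-1/2), in GeV."""
    return (math.sqrt(2.0) * g_fermi) ** -0.5


def top_yukawa(m_top: float = M_TOP, g_fermi: float = G_FERMI) -> float:
    """On-shell top Yukawa coupling y_t = sqrt(2) * m_t / v."""
    return math.sqrt(2.0) * m_top / higgs_vev(g_fermi)


v = higgs_vev()
y_t = top_yukawa()
```

With these inputs $v$ comes out close to 246.22 GeV and $y_t$ close to 0.99.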
The proton PDFs and their uncertainties are evaluated employing reference sets and error replicas from the NNPDF3.0 global fit [65], at LO or NLO as well as in the 5FS or 4FS (4FS numbers are shown in the appendix). The value of the strong coupling constant at LO is set to $\alpha_s^{(5F,\mathrm{LO})}(m_Z) = 0.130$, while at NLO we use $\alpha_s^{(5F,\mathrm{NLO})}(m_Z)$ as provided by the NLO reference set. The factorisation and renormalisation scales ($\mu_F$ and $\mu_R$) are computed dynamically on an event-by-event basis, by setting them equal to the reference scale $\mu_0^d = H_T/4$, where $H_T$ is the sum of the transverse masses of all outgoing particles in the matrix element. The scale uncertainty in the results is estimated by varying $\mu_F$ and $\mu_R$ independently by a factor of two around $\mu_0$. Additionally, we also show total cross sections computed with a static scale, which we fix to $\mu_0^s = (m_t + m_W)/2$ for $tW$ production and to a corresponding static scale $\mu_0^s$ for $tWH$ production. We use a diagonal CKM matrix with $V_{tb} = 1$, ignoring any mixing between the third generation and the first two. In particular, this means that the top quark always decays to a bottom quark and a $W$ boson, $\mathrm{Br}(t \to bW) = 1$, with a width computed at LO in the 5FS equal to $\Gamma_t = 1.4803$ GeV. Spin correlations can be preserved by decaying the events with MadSpin [21], following the procedure presented in [64]. We choose to leave the $W$ bosons stable, because we focus on the behaviour of the $b$ jets stemming either from the top decay or from the initial-state gluon splitting.
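The scale choices described above can be sketched in a few lines. The functions below are illustrative only and all names are hypothetical. We assume that each outgoing particle is given as a (mass, transverse momentum) pair in GeV, and that "independent variation by a factor of two" means the nine combinations of $\{1/2, 1, 2\}$ for $\mu_R$ and $\mu_F$; analyses that drop the two most extreme combinations would need a filter on top of this.

```python
import math
from itertools import product


def transverse_mass(mass: float, pt: float) -> float:
    """Transverse mass sqrt(m^2 + pT^2) of one outgoing particle, in GeV."""
    return math.sqrt(mass ** 2 + pt ** 2)


def dynamic_scale(particles) -> float:
    """Reference scale H_T / 4, with H_T the sum of transverse masses."""
    return sum(transverse_mass(m, pt) for m, pt in particles) / 4.0


def static_scale_tw(m_top: float = 172.5, m_w: float = 80.419) -> float:
    """Static reference scale (m_t + m_W) / 2 used for tW production."""
    return (m_top + m_w) / 2.0


def scale_variations(mu0: float, factor: float = 2.0) -> list:
    """All (mu_R, mu_F) pairs with each scale at mu0/factor, mu0 or mu0*factor."""
    factors = (1.0 / factor, 1.0, factor)
    return [(kr * mu0, kf * mu0) for kr, kf in product(factors, repeat=2)]


# A top quark and a W boson produced at rest in the transverse plane.
mu_dyn = dynamic_scale([(172.5, 0.0), (80.419, 0.0)])
variations = scale_variations(100.0)
```

The scale uncertainty would then be taken as the envelope of the cross sections evaluated at these points.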
Short-distance events are matched to the Pythia8 parton shower [66] by using the MC@NLO method [67]. Jets are defined using the anti-$k_T$ algorithm [68] implemented in FastJet [69], with radius $R = 0.4$, and are required to pass kinematic cuts. A jet is $b$-tagged if a $b$ hadron is found among its constituents (we ideally assume 100% $b$-tagging efficiency in our studies). The same kinematic cuts are applied for $b$ jets as for light-flavour jets in the inclusive study. In the fiducial phase space, on the other hand, a requirement on the pseudorapidity of $b$ jets is imposed, resembling the acceptance of the $b$-tagging methods employed by the experiments.

tW production
In this section we (re-)compute NLO+PS predictions for $tW$ production at the LHC, running with a centre-of-mass energy $\sqrt{s} = 13$ TeV. With the shorthand $tW$ we mean the sum of the two processes $pp \to tW^-$ and $pp \to \bar{t}W^+$, which have the same rates and distributions at the LHC. We carefully quantify the impact of theoretical systematics in the event generation. Our discussion is split in two parts, focusing first on the inclusive event generation and the related theoretical issues, and then on what happens when fiducial cuts are applied.

Inclusive results
We start by showing in Fig. 4 the renormalisation and factorisation scale dependence of the $pp \to tW$ cross section, computed at LO and NLO accuracy, keeping the top quark stable. Results are obtained by employing the static and dynamic scales $\mu_0^s$ and $\mu_0^d$ (defined in Sect. 3) in the left and right plot, respectively. We show results where we simultaneously vary the renormalisation and factorisation scales on the diagonal $\mu_R = \mu_F$; on top of this, for LO and NLO DR results, we also present two off-diagonal profiles where $\mu_R = \sqrt{2}\,\mu_F$ and $\mu_R = \mu_F/\sqrt{2}$. In the two plots we present predictions at LO and at NLO with the DR and DS schemes. The total cross sections are reported in Table 1; at variance with Fig. 4, in this case scale variations are computed by varying $\mu_F$ and $\mu_R$ independently by a factor of two around $\mu_0$. As expected, NLO corrections visibly reduce the scale dependence with respect to LO predictions. Comparing DR1 and DR2, we see that interference effects are negative at this centre-of-mass energy, and reduce significantly the NLO cross section, by about 13%. Also, the cross section scale dependence is different, in particular for very small scales. This effect is driven by the LO scale dependence in $t\bar{t}$ amplitudes, which is larger at low scales. Moving to DS, we find that DS1 and DS2 predictions show an 8% difference. Therefore, the dependence on the subtraction scheme is large, being comparable to the scale uncertainty or even larger.
Fig. 4 Scale dependence of the total cross section for $pp \to tW^-$ and $\bar{t}W^+$ at the 13-TeV LHC, computed in the 5FS at LO and NLO accuracy, presented for $\mu_F = \mu_R \equiv \mu$ using a static scale (left) and a dynamic scale (right). The NLO $tWb$ channels are treated using DR and DS; see Sect. 2 for more details. Furthermore, we show NLO results from GS (only for a static scale), and two off-diagonal profiles of the scale dependence, $\mu_R = \sqrt{2}\,\mu_F$ and $\mu_R = \mu_F/\sqrt{2}$, for LO and NLO DR. Finally, the scale dependence of $pp \to t\bar{t}$ at LO is also reported for comparison
Table 1 Total cross sections for $pp \to tW^-$ and $\bar{t}W^+$ at the 13-TeV LHC, in the 5FS at LO and NLO accuracy with different schemes, computed with a static scale $\mu_0^s = (m_t + m_W)/2$ and a dynamic scale $\mu_0^d = H_T/4$. We also report the scale and PDF uncertainties and the NLO-QCD K factors; the numerical uncertainty affecting the last digit is quoted in parentheses
We note that the total rate predictions obtained with DR2 and DS2 agree rather well within uncertainties, especially at the reference scale choice, and also agree with the predictions from the GS scheme. This result is quite satisfactory because it supports some important observations. First, the off-shell effects of the top-quark resonant diagrams are small, and indeed well described by the (gauge-invariant) parametrisation of Eq. (9). Second, possible gauge dependence in DR2 is in practice not an issue if one uses a covariant gauge, where the subtraction of |A 2t | 2 turns out to be very close to an on-shell gauge-invariant subtraction. On the other hand, DR1, which does not include the interference in the definition of the signal, and DS1, which has a different profile over the virtuality of the intermediate top quark, do not describe well the NLO effects and extrapolate to a biased total cross section, even in the Γ t → 0 limit. Thus, a third observation is that interference terms are not negligible, and it is mandatory to keep them in the definition of the t W process in order to have a complete simulation. Finally, a fourth point is that including interference effects is not enough: one also needs to subtract the top-pair process with an adequate profile over the phase space. This picture is confirmed at the level of differential distributions in the following discussion, and also at the total cross section level in the 4FS; see the appendix.

We now turn to differential distributions, and we show some relevant observables in Figs. 5 and 6. Here, we employ a dynamical scale choice, μ 0 = H T /4, and we do not impose any cut on the final-state particles. Note that, for simplicity and following the shorthand t W , we label as t both the undecayed top quark in t W − production and the antitop in t W + ; similarly, W indicates the W − in the first process and the W + in the second one, i.e. the boson produced in association with t, and not the one coming from the t decay. Particles (not) coming from the top decay are identified by using the event-record information. We see that the DR1 and DS1 simulations tend to produce harder and more central distributions, while the DR2 and DS2 results, very similar to one another, tend to be softer and more forward. In any case, NLO corrections cannot be accounted for by the LO scale uncertainty, nor described by a K factor, especially for the physics of b jets. The hardest b jet ( j b,1 ) dominantly comes from the top decay, while the second-hardest b jet is significantly softer due to the initial-state g → bb splitting. As seen for DR2, the high-p T W boson and b jets are highly suppressed due to the negative interference with the tt process. In fact, due to this interference the cross section can become negative in some corners of the phase space, for example in the high-p T tail of the second b jet. We interpret this fact as a sign that t W cannot be separated from tt in this region, and the two contributions must be combined in order to obtain a physically observable (positive) cross section.

In summary, the t W -tt interference significantly affects the inclusive total rate as well as the shapes of various distributions at NLO. In particular, different schemes give rise to different NLO results, with ambiguities which in principle can be larger than the scale uncertainty. Such differences arise from two sources: the interference between resonant (top-pair) and non-resonant (single-top) diagrams, which is relevant and ought to be taken into account, and (in the case of DS) the treatment of the off-shell tails of the top-pair contribution. These ambiguities are intrinsically connected to the attempt of separating two processes that cannot be physically separated in the whole phase space. 
On the other hand, we have also found that two of such schemes, DR2 and DS2, give compatible results among themselves and integrate up to the total cross section defined in a gauge-invariant way in the GS scheme. We are now ready to explore whether a region of phase space (possibly accessible from the experiments) exists where the two processes can be separated in a meaningful way.
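The role of the interference in the two DR variants can be illustrated with a toy decomposition of the squared amplitude. This is only a sketch under stated assumptions: A 1t and A 2t are represented by single complex numbers with arbitrary, hypothetical values, whereas in the actual calculation they are phase-space dependent matrix elements defined in Sect. 2.

```python
def dr_weights(a_1t: complex, a_2t: complex) -> dict:
    """Split |A_1t + A_2t|^2 into single-top, interference and top-pair parts.

    DR1 keeps only |A_1t|^2; DR2 also keeps the interference term,
    i.e. it removes only |A_2t|^2 from the full squared amplitude.
    """
    single = abs(a_1t) ** 2
    interference = 2.0 * (a_1t * a_2t.conjugate()).real
    double = abs(a_2t) ** 2
    return {
        "full": abs(a_1t + a_2t) ** 2,
        "DR1": single,
        "DR2": single + interference,
        "interference": interference,
        "double": double,
    }

# Hypothetical amplitudes chosen so that the interference is negative.
w = dr_weights(complex(1.0, 0.5), complex(-0.4, 0.1))
print(w)
```

Since the interference term has no definite sign, the DR2 weight is not guaranteed to be positive point by point, which is consistent with the negative differential cross sections mentioned above for some corners of phase space.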

Results with fiducial cuts
In this section we would like to investigate whether t W can be defined separately from tt at least in some fiducial region of the phase space, in the sense that in such a region interference terms between the two processes and thus theoretical ambiguities are suppressed. In practice, this goal can be achieved by comparing results among different NLO schemes, since the difference among them provides a measure of interference effects and related theoretical systematics (gauge dependence in DR, subtraction term in DS). We remark that the following toy analysis is mainly for illustrative purposes, since the same procedure can be applied to any set of fiducial cuts defined in a real experimental analysis, also imposing a selection on specific decay products of the W bosons.
Motivated by the b-jet spectra in Fig. 5 and by experimental t W searches, a popular strategy to suppress the tt background as well as t W -tt interference is to select events with exactly one central b jet [40][41][42]48,55,70]. We define our set of "fiducial cuts" for t W by selecting only events with
1. exactly one b jet with p T ( j b ) > 20 GeV and |η( j b )| < 2.5,
2. exactly two central W bosons with rapidity |y(W )| < 2.5.
In this regard we stress that the first selection is the key to suppress the contributions from tt amplitudes, hence both the pure tt "background" as well as the t W -tt interference (i.e. theoretical ambiguities). Note that we would like to draw general conclusions about the generation of t W events, therefore we have chosen to define a pseudo event category that does not depend on the particular decay channel of the W bosons. The second selection is added to mimic a good reconstructability of these bosons inside the detector regardless of their final-state daughters; it affects less than 7% of the events surviving selection 1.
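The two selections can be sketched as a simple event filter. The `Particle` container and its field names are hypothetical and introduced only for this illustration; a real analysis would read these quantities from the event record of the shower program.

```python
from dataclasses import dataclass

@dataclass
class Particle:
    kind: str         # "bjet" or "W" (hypothetical labels)
    pt: float         # transverse momentum in GeV
    eta: float = 0.0  # pseudorapidity, used for jets
    y: float = 0.0    # rapidity, used for W bosons

def passes_tw_fiducial(particles) -> bool:
    """Selection 1: exactly one b jet with pT > 20 GeV and |eta| < 2.5.
    Selection 2: exactly two W bosons with |y| < 2.5."""
    bjets = [p for p in particles
             if p.kind == "bjet" and p.pt > 20.0 and abs(p.eta) < 2.5]
    central_w = [p for p in particles if p.kind == "W" and abs(p.y) < 2.5]
    return len(bjets) == 1 and len(central_w) == 2

# Toy events: single-top-like (one central b jet) and top-pair-like (two).
tw_like = [Particle("bjet", 50.0, eta=1.0),
           Particle("W", 80.0, y=0.5), Particle("W", 60.0, y=-1.2)]
tt_like = tw_like + [Particle("bjet", 45.0, eta=-0.3)]
print(passes_tw_fiducial(tw_like), passes_tw_fiducial(tt_like))
```

A b jet failing the p T or pseudorapidity threshold is simply not counted, so an event with one central and one soft or forward b jet still passes selection 1.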
Looking at Table 2 we can see that, before any cut is applied, the event category is largely dominated by the tt contribution. Once the above fiducial cuts are applied, the tt contribution is reduced by more than a factor of 16, while the t W rate shrinks by only about one third (for DR2 and DS2), bringing the signal-to-background ratio σ (t W )/σ (tt) close to unity, which is exactly the aim of t W searches. The impact of interference has been clearly reduced by the cuts: the fiducial cross sections computed with the different NLO schemes agree much better with each other than before the selections are applied. Still, there is a minor residual difference in the rates, which amounts to about 2%.
From the distributions in Figs. 7 and 8 we can see once more an improved agreement among the different NLO schemes in the fiducial region. The lower panels show flatter and positive K factors and a lower scale dependence in the high-p T tail than before the cuts, since we have suppressed the contributions from tt amplitudes. Monte Carlo information shows that the central b jet coincides with the one stemming from the top decay ( j b,t ) for the vast majority of events. In the high-p T region, however, the b jet can also originate from a hard initial-state g → bb splitting, similar to the case of t-channel t H production [33]. This suggests that, if on top of the fiducial cuts we also demand the central b jet to unambiguously originate from the top quark, then we may be able to suppress even further the t W -tt interference and the related theoretical systematics. In fact, we can see from Table 2 and from the right plot in Fig. 8 that, after such a requirement is included in the event selection, the total rates as well as the distributions end up in almost perfect agreement, and one can effectively talk about t W and tt as separate processes in this region: interference effects have been suppressed at or below the level of numerical uncertainty in the predictions. A possible remark is that the top-reconstruction requirement shaves off another ∼ 2 pb of the cross section, i.e. more than the residual discrepancy between the different NLO schemes before this last selection is applied.

Fig. 8 Same as Fig. 7, but for the central b-tagged jet. For the right plot, in addition to the fiducial cuts, the top reconstruction is required

To summarise, a naturally identified region of phase space exists where t W is well defined, i.e. gauge invariant and basically independent of the scheme used (either DR1, DR2, DS1, DS2) to subtract the tt contribution. 
Given the fact that DS2 and DR2 also give consistent results outside the fiducial region and integrate to the same total cross section, equal to the GS one, they can both be used in MC simulations. In practice, given that the gauge-dependent effects are small when employing a covariant gauge, and that the implementation in the code is rather easy, DR2 is certainly a very convenient scheme to use in simulations of t W production in the 5FS, including the effects of interference with the tt contribution. In addition, one can use the difference between DR1 and DR2 (i.e. the amount of t W -tt interference) to assess whether the fiducial region where the measurements are performed is such that the process-definition uncertainties are under control (smaller than the missing higher-order uncertainties), and to estimate the residual process-definition systematics. We have seen that requiring the presence of exactly one central b jet is a rather effective way to identify such a fiducial region.

Fig. 10 Examples of doubly resonant (first on the left), singly resonant (second two) and non-resonant (last two) diagrams contributing to W bW bH production. The first three diagrams (with the t line cut) describe the NLO real-emission contribution to the t W − H process

We have also found that, especially in the DR2 and DS2 schemes, the perturbative series for the t W process is well behaved: NLO-QCD corrections mildly affect the shape of distributions but reduce the scale dependence considerably with respect to LO. A further handle to suppress process-definition systematics can be given by a reconstruction of the top quark, identifying the central b jet as coming from its decay. 
Top-tagging techniques are being developed (theoretical and experimental reviews can be found at [71] and [72,73]), and may help to define a sharper fiducial region, although this may depend on the trade-off between the top-tagging efficiency and the amount of residual process-definition ambiguities to be suppressed.

tW H production
In this section we present novel NLO+PS results for t W H production in the 5FS at the 13-TeV LHC (diagrams are shown in Figs. 9,10). Similar to what we have done for t W in the previous section, we address the theoretical systematics both at the inclusive level and with fiducial cuts. We anticipate that our findings for t W H are qualitatively similar to the ones for t W , but the larger numerical ratio between the top-pair and single-top contributions enhances the impact of interference effects and exacerbates theoretical systematics in the simulation, which are clearly visible in the t, W , H and b-jet observables. We will see that this can be alleviated after applying suitable cuts. Finally, we investigate the impact of non-SM couplings of the Higgs boson on this process.

Inclusive results
As for t W , we start by showing the renormalisation and factorisation scale dependence of the t W H cross section in Fig. 11, both at LO and NLO accuracy, using different schemes to treat the t W bH real-emission channels (the details for the various NLO schemes can be found in Sect. 2). The values of the total rate computed at the central scale μ 0 are also quoted in Table 3. Unlike in Fig. 11, in this case scale variations are computed by varying μ F and μ R independently by a factor two around μ 0 . The same pattern we have found for t W is repeated. Comparing DR results obtained by neglecting (DR1, red) or taking into account (DR2, orange) interference with tt H, we observe again that these interference effects are negative, but their relative impact on the cross section is even more sizeable. The interference reduces the NLO rate by about 5 fb, which amounts to a hefty −25%, leading to a K factor close to 1. Since interference effects are driven by the LO tt H contribution, they grow larger for lower scale choices. The cross sections obtained employing the two DS techniques, DS1 (blue) and DS2 (green), show large differences which go beyond the missing higher orders estimated by scale variations, and can be traced back to the different Breit-Wigner prefactor in the subtraction term C 2t . As was the case for t W production, we find that DR2 and DS2 are in good agreement with GS.
In complete analogy with the case of the t W b channel in t W production at NLO, we perform a study of the theoretical systematics in the modelling of the t W bH channel (employing the 4FS to isolate this contribution), which can be found in the appendix.
In Figs. 12 and 13 we collect some differential distributions. Observables related to the Higgs boson can essentially be described by a constant K factor for each subtraction scheme. On the other hand, similar to the t W case, the NLO distributions for the top quark and the W boson are quite different among the four NLO techniques. As we know, these differences are driven essentially by whether the interference with tt H is included or not (in DR), and by the profile  of the subtraction term (in DS). These NLO effects are quite remarkable for the b jets, since the negative interference with tt H drastically suppresses central hard b jets. Summarising, in analogy with the t W process, effects due to the interference between tt H and t W H which appear in NLO corrections of the latter process are significant, and hence the details of how the tt H contribution is subtracted enormously affect the predictions for both the total rate and the shape of distributions. On the one hand, a LO description of t W H in the 5FS is apparently not sufficient. On the other hand, the NLO prediction strongly depends on the subtraction scheme employed. This last point is only a relative issue, if we take into account the fact that DR2 and DS2 results are quite consistent with each other and integrate to the same total cross section as GS, which suggests that they provide a better description of the physics not included in tt H than DR1 and DS1. Nevertheless, as in the case of t W production, it is clear that fiducial cuts are crucial to obtain a meaningful separation of t W H from tt H, and their effects will be discussed in the next subsection.

Results with fiducial cuts
We now move to investigate whether the separation between t W H and tt H can become meaningful in a fiducial region, where interference between the two processes and theoretical systematics are suppressed. The problem is exactly analogous to the t W -tt separation. In practice, for any selection defined by suitable cuts, one needs to quantify the residual difference among different subtraction schemes and see if it is small enough.

The lower panels provide information on the differential K factors with the scale uncertainties

Motivated by the same rationale behind our t W discussion, we define our set of "fiducial cuts" for t W H selecting only events with
1. exactly one b jet with p T ( j b ) > 20 GeV and |η( j b )| < 2.5,
2. exactly two central W bosons with |y(W )| < 2.5,
3. exactly one central Higgs boson with |y(H )| < 2.5.
We recall that the first selection is the key to suppress the double-top amplitudes and hence t W H-tt H interference and theoretical ambiguities. We do not assume any particular decay channel for the heavy bosons and hence the second and third selections are added to mimic a good reconstructability of the W and H bosons in the detector. However, they are not crucial, since they affect just 5% of the events surviving selection 1. Our pseudo event category is defined mainly for illustrating the issues behind the simulation of the t W H signal, but the same procedure can be applied to any realistic set of fiducial cuts in experimental analyses, including a selection on specific decay products of the W and H bosons.
Looking at Table 4, we can see that the situation for t W H is very similar to the one we have already seen for t W . Before the fiducial cuts, the category is largely dominated by tt H events. Once the fiducial cuts are applied, the contribution from tt H is reduced by more than a factor 20, while the one from t W H just by about 1/4 (for DR2), enhancing the signal-to-background ratio (t W H/tt H) to about 0.5, which is encouraging from the search point of view. The interference with LO tt H amplitudes has been visibly reduced, with fiducial cross sections among the four techniques agreeing much better than in the inclusive case; this is also apparent in the differential distributions of Figs. 14 and 15, and in particular in the much smaller scale dependence in the tails of t W H distributions at NLO.
Nevertheless, a residual difference of about 6% (0.7 fb) is present between the DR1 and DR2 fiducial cross sections, and this discrepancy is also visible in the shape of some p T distributions. Once again, if we use MC information to additionally require the central b jet to come unambiguously from the top quark, the residual interference effects are further reduced to less than 1% at a tiny cost on the signal efficiency. This brings the differential predictions in excellent agreement among the four schemes and with this selection one can effectively consider t W H and tt H as separate processes.

Fig. 13 Same as Fig. 12, but for the b-tagged jets. Note that the second-hardest b jet is described by the parton shower at LO, and by the matrix element at NLO

Table 4 Total cross sections in fb at the LHC 13 TeV for the processes pp → tt H and pp → t W H, in the 5FS at NLO+PS accuracy. Results are presented before any cut (left, "No cuts"), after fiducial cuts (centre, "Fiducial cuts"), and also adding top reconstruction on the event sample (right, "Fiducial cuts + top reco."). We also report the scale and PDF uncertainties, as well as the cut efficiency with respect to the case with no cuts. All numbers are computed with the reference dynamic scale μ 0 = H T /4, and the numerical uncertainty affecting the last digit is reported in parentheses

Same as Fig. 12, but after applying the fiducial cuts to suppress interference between t W Hb and tt H

Finally, we briefly comment on the possibility to observe the t W H signal at the LHC. Naturally, one may wonder whether it will be possible to observe it over the (already quite rare) tt H process, in an experimental analysis that applies a selection similar to our fiducial cuts. For example, the LHC Run II is expected to deliver an integrated luminosity in the 100 fb −1 ballpark. In our pseudo event category (with top reconstruction), the difference between including or excluding the t W H contribution amounts to

tt H only: 2147 ± 46 (stat.) +101/−204 (theo.) events,
tt H + t W H : 3251 ± 57 (stat.) +147/−257 (theo.) events.

Unfortunately, once branching ratios of the Higgs and W bosons and realistic efficiencies are taken into account, these numbers disfavour the possibility to observe t W H over tt H at the Run II. On top of that, there are many more background processes contributing to our event category than just tt H. This makes the searches for the SM t W H signal extremely challenging, and the high-luminosity upgrade of the LHC is definitely needed in order to have a sufficient number of events.
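The statistical uncertainties on the event counts above are consistent with simple Poisson counting, N = σ L with uncertainty √N. The short check below assumes an integrated luminosity of exactly 100 fb −1 (the text only says "ballpark"), so the cross sections of 21.47 fb and 32.51 fb are values inferred from the quoted event counts under that assumption, not numbers copied from Table 4.

```python
import math

def expected_events(sigma_fb: float, lumi_inv_fb: float):
    """Return (N, sqrt(N)) for a cross section in fb and a luminosity in fb^-1."""
    n = sigma_fb * lumi_inv_fb
    return n, math.sqrt(n)

LUMI = 100.0  # fb^-1, assumed

# Cross sections inferred from the quoted event counts (see lead-in).
n_tth, stat_tth = expected_events(21.47, LUMI)
n_sum, stat_sum = expected_events(32.51, LUMI)

print(f"ttH only : {n_tth:.0f} +- {stat_tth:.0f} (stat.)")
print(f"ttH + tWH: {n_sum:.0f} +- {stat_sum:.0f} (stat.)")
```

These counts precede any branching ratio or detector efficiency, which is why the text regards an observation at Run II as disfavoured despite the sizeable difference between the two numbers.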
On the other side, simulated t W H events should be taken into account in other searches for Higgs boson and top quark associated production, which are not necessarily going to apply t W H-specific fiducial cuts, in order to complete the MC modelling. In particular, this will be relevant in searches for the tt H signal, and also for the t-channel t H process (also called t Hq by experiments) with Higgs decay into a pair of bottom quarks (H → bb), where semileptonic t W H events can lurk in the signal region defined by a large (b-)jet multiplicity. In fact, including the t W H simulation in the signal definition (as opposed to considering it a background) in the case of either tt H or t-channel t H searches will lead to a more comprehensive view on Higgs boson and top-quark associated production, e.g. being relevant when setting limits or measuring the signal strength.

Higgs characterisation
In this section we explore the sensitivity of t W H production to beyond the standard model (BSM) physics in the Higgs sector. In particular, we start by studying the total production rate in the so-called "κ-framework" [74,75] where the SM Higgs interactions are simply rescaled by a dimensionless constant κ. Then we move to characterising the Yukawa interaction between the Higgs boson and the top quark, which in general can be a mixture of CP-even and CP-odd terms, similar to what has been done for t-channel t H production in Sect. 5 of [33]. To describe the Yukawa interaction, we consider the following Lagrangian for a generic spin-0 mass eigenstate X 0 that couples to both scalar and pseudoscalar fermionic currents:

$$\mathcal{L}_0^t = -\bar{\psi}_t \left( c_\alpha \kappa_{Htt}\, g_{Htt} + i\, s_\alpha \kappa_{Att}\, g_{Att}\, \gamma_5 \right) \psi_t\, X_0 , \qquad (13)$$

where c α ≡ cos α and s α ≡ sin α are the cosine and sine of the CP-mixing phase α; κ Htt,Att are real dimensionless parameters that rescale the magnitude of the CP-even and CP-odd couplings, and g Htt = g Att = m t /v, with v ≃ 246 GeV. While redundant (only two independent real quantities are needed to parametrise the most general CP-violating interaction between a spin-0 particle and the top quark at dimension four), this parametrisation has the practical advantage of easily interpolating between the purely CP-even (c α = 1, s α = 0) and purely CP-odd (c α = 0, s α = 1) cases, and of easily recovering the SM when c α = 1, κ Htt = 1. In the κ-framework c α = 1, and only the part proportional to κ Htt is considered. On the other hand, the SM-like interactions between the Higgs and the EW vector bosons are rescaled by κ SM and involve the couplings g HVV = 2m 2 V /v (V = W, Z ). For the full Higgs characterisation (HC) Lagrangian, including CP-even and CP-odd higher-dimensional X 0 V V operators, we refer to [76,77]. The Feynman rules from these Lagrangians are coded in the publicly available HC_NLO_X0 model [78]. 
The code and events for t W X 0 production at NLO can be generated in a way completely analogous to SM t W H:

> import model HC_NLO_X0-no_b_mass
> generate p p > t w- x0 [QCD]
> add process p p > t~ w+ x0 [QCD]

In this section we show results obtained only with the DR techniques. We start by showing results in the κ-framework in Fig. 16. We can see that a CP-even Higgs boson is highly sensitive to the relative sign of Higgs couplings to fermions (t) and EW bosons (W ). Depending on the (κ Htt , κ SM ) configuration, the inclusive t W H rate (DR2, including interference with tt H) can be enhanced from 15 fb to almost 800 fb. The t W H process can thus be exploited to further constrain the allowed regions in the two-dimensional plane spanned by κ Htt and κ SM together with the already sensitive t H production.
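The sensitivity to the relative sign can be made explicit with a schematic decomposition. This is a sketch under assumptions not spelled out in the text: a CP-even X 0 , single-top amplitudes only (no interference with tt H), and shorthand symbols (A t , A W , σ t , σ W , σ int ) that are introduced here purely for illustration.

$$\mathcal{A} = \kappa_{Htt}\,\mathcal{A}_t + \kappa_{\rm SM}\,\mathcal{A}_W \quad\Longrightarrow\quad \sigma(\kappa_{Htt},\kappa_{\rm SM}) = \kappa_{Htt}^2\,\sigma_t + \kappa_{\rm SM}^2\,\sigma_W + \kappa_{Htt}\,\kappa_{\rm SM}\,\sigma_{\rm int}$$

$$\sigma(-1,+1) - \sigma(+1,+1) = -2\,\sigma_{\rm int}$$

With destructive interference in the SM configuration, σ int < 0, flipping the relative sign of the two couplings raises the rate by 2|σ int |, while the two squared terms are unchanged.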
Fig. 16 Note that the standard model configuration (+1, +1) almost lies in a minimum, which means the process is suited for constraining this point, owing to enhanced rates for deviations from the SM. Right: the t W H cross section is shown for three different intensities of the X 0 W W coupling κ SM , as a function of κ Htt , where DR1 results are also reported, to gauge the impact of interference with tt H

Given the experimental constraints after the LHC Run I [79], we can reasonably fix the Higgs interaction with the EW bosons to be the SM one, and turn to study CP-mixing effects in the Higgs-fermion sector. It is also reasonable to assume that gluon fusion is dominated by the top-quark loop, and consequently the X 0 -top interaction must reproduce the SM gluon-fusion rate at NLO accuracy to comply with experimental results. This fixes the values of the rescaling factors κ Htt and κ Att in Eq. (13), leaving the value of the CP-mixing angle α free. In Fig. 17 we plot the total NLO cross section for Higgs production in association with a top-quark pair tt X 0 (red), and for the combined contribution of tt X 0 and t W X 0 including their interference (orange), which is simply obtained by adding the t W X 0 DR2 cross section to the tt X 0 one. We can immediately see that the inclusion of the t W X 0 process lifts the y t → −y t degeneracy that is present in tt X 0 production. For a flipped-sign Yukawa coupling, the interference between single-top diagrams where the Higgs couples to the top and the ones where it couples to the W becomes constructive, and the total cross section is augmented from roughly 500 fb (SM, α = 0°) to more than 600 fb (α = 180°). This enhancement can help in a combined analysis of the Higgs interactions, though it is less striking than the one which takes place in the t-channel Higgs plus single-top process (which is also reported in blue for comparison).

Fig. 17 NLO cross sections (with scale uncertainties) for pp → tt X 0 , pp → t W X 0 (with DR2) and pp → t X 0 (t-channel) at the 13-TeV LHC as a function of the CP-mixing angle α, where κ Htt and κ Att are set to reproduce the SM gluon-fusion cross section for every value of α. The tt X 0 and t W X 0 processes have been computed using the dynamic scale μ 0 = H T /4, while t X 0 results are taken from [33]

For the sake of clarity we point out that, going along the α-axis in Fig. 17, the t W X 0 cross section includes in fact two different interference effects. On the one hand, there is the interference between single-top amplitudes with Higgs-to-fermion and Higgs-to-gauge-boson interactions, similar to the t H process. This is already present at LO, and it drives the growth of the cross section from the SM case (maximally destructive interference) to the case of a reversed-sign top Yukawa (maximally constructive). On the other hand, employing DR2 for the computation of the t W X 0 NLO cross section means that also the interference with tt H is included. This is an effect present only at NLO, and its size depends as well on the CP-mixing angle α (due to the different ratio between tt H and t W H amplitudes).
In Fig. 18 we compare some differential distributions for the SM hypothesis (blue), the purely CP-odd scenario (red) and the flipped-sign CP-even case (green), before any cuts. We can see that the interference between the doubly resonant tt H and the singly resonant t W H amplitudes is largest for the SM case. For the case of flipped Yukawa coupling the interference gives a minor contribution, while for the CP-odd case it is very tiny because the doubly resonant contribution is at its minimum. The W and Higgs transverse momentum distributions become harder when the mixing angle is larger. Once the fiducial cuts are applied (Fig. 19), the difference between DR1 and DR2 decreases as expected.
In conclusion, we find that the t W H process can help to lift the y t → −y t degeneracy for tt H and put constraints on BSM Yukawa interactions of the Higgs boson in a combined analysis, on top of the most sensitive t-channel t H production mode. Finally we recall that, if one also assumes a SM interaction between the Higgs and the W bosons, one can further include the γ γ decay channel data to put limits on the CP-mixing phase α.

Fig. 19 Same as in Fig. 18, but after applying the fiducial cuts

Summary
In this work we have provided for the first time NLO accurate predictions for the t W H process, including parton-shower effects. In order to achieve a clear understanding of the ambiguities associated to the very definition of the process at NLO accuracy due to its mixing with tt H, we have revisited the currently available subtraction schemes in the case of t W production. We have therefore carefully analysed t W at NLO in the five-flavour scheme, and then we have proceeded in an analogous way for t W H. On the one hand, NLO corrections to these processes are crucial for a variety of reasons, ranging from a reliable description of the b quark kinematics to a better modelling of backgrounds in searches for Higgs production in association with a single top quark or a top pair. On the other hand, they introduce the issue of interference with tt or tt H production, which has a significant impact on the phenomenology of these processes. Our first aim has been to study the pros and cons of the various techniques (which fall in the GS, DR and DS classes) that are available to subtract the resonant contributions appearing in the NLO corrections. At the inclusive level these techniques can deliver rather different results, with differences which can often exceed the theoretical uncertainties on the NLO cross sections estimated via scale variations. These differences have been traced back to whether a given technique accounts for the interference between the t W (H ) and tt(H ) processes, and to how the off-shell tails of the resonant diagrams are treated. They become visible at the total cross section level as well as in distributions, particularly those involving b-jet related observables. We find the DR2 and DS2 techniques to provide a more faithful description of the underlying physics in t W and t W H than DS1 and DR1, and we therefore deem them preferable for generating events for these two processes at NLO. 
We stress that the aim of our work is to provide a practical and reliable technique to simulate t W and t W H at NLO, when the corresponding tt and tt H processes are generated separately in the on-shell approximation. Our results have no claim of generality, and cannot be immediately extended to other SM or BSM processes. A study of subtraction techniques should be performed on a process-by-process basis, in particular for BSM physics, where different width-to-mass ratios and different amplitude structures (i.e. resonance profiles) can appear.
Our second aim has been to study what happens once event selections similar to those performed in experimental analyses are applied, and in general whether one can find a fiducial region where the single-top processes t W and t W H can be considered well defined per se, and are stable under perturbative corrections. A simple cut such as requiring exactly one b-tagged jet in the central detector (which becomes three b jets in the case of t W H if the Higgs decays to bottom quarks) can greatly reduce interference effects, and thus all the process-definition systematics of t W (H ) at NLO. In such a fiducial region, we find the perturbative description of t W (H ) to be well behaved, and the inclusion of NLO corrections significantly decreases the scale dependence; differences between the various DR and DS subtraction techniques are reduced below those due to missing perturbative orders, making the separation of the single-top and top-pair processes meaningful. Given a generic set of cuts, we have provided a simple and robust recipe to estimate the left-over process-definition systematics, i.e. use the difference between the DR1 and DR2 predictions (which amounts to the impact of interference effects). In general, such an approach provides a convenient way to quantify the limits in the separation of tt(H ) and t W (H ) and the quality of fiducial regions. In particular, this is essential for a reliable extraction of the Higgs couplings in t W H production.
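The recipe based on the DR1 and DR2 difference can be sketched in a few lines. The input numbers below are round placeholders chosen only to give a 6% difference of the size discussed for the t W H fiducial region; they are not values from the tables. Comparing against the smaller side of the scale band is a conservative choice made here, not a prescription from the text.

```python
def process_definition_check(sigma_dr1, sigma_dr2, scale_down, scale_up):
    """Estimate the process-definition systematics as |DR1 - DR2| and compare
    it with the scale uncertainty (used as a proxy for missing higher orders).

    scale_down and scale_up are absolute variations in the same units as sigma.
    """
    delta = abs(sigma_dr1 - sigma_dr2)
    scale_band = min(abs(scale_down), abs(scale_up))  # conservative comparison
    return {
        "delta": delta,
        "relative": delta / sigma_dr2,
        "under_control": delta < scale_band,
    }

# Placeholder inputs in fb (hypothetical, for illustration only).
result = process_definition_check(sigma_dr1=10.6, sigma_dr2=10.0,
                                  scale_down=-1.0, scale_up=1.0)
print(result)
```

If `under_control` comes out false for a given set of cuts, the selection does not isolate a region where the single-top and top-pair processes can be meaningfully separated.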
Finally, we have investigated the phenomenological consequences of considering a generic CP-mixed Yukawa interaction between the Higgs boson and the top quark in t W H production. While the SM cross section is tiny, due to maximally destructive interference between the H -t and H -W interactions, and direct searches for this process may only be feasible after the high-luminosity upgrade of the LHC, BSM Yukawa interactions tend to increase the production rate. For example, in the case of a reversed-sign Yukawa coupling with respect to the SM, the t W H cross section is enhanced by an order of magnitude, similar to what happens for the dominant single-top associated mode, i.e. the t-channel t H production. The large event rate predicted after the combination of these Higgs plus single-top modes will help to exclude a reversed-sign top Yukawa coupling already during the LHC Run II.

the completion of this work. This work has been performed in the framework of the ERC grant 291377 "LHCtheory: Theoretical predictions and analyses of LHC physics: advancing the precision frontier" and of the FP7 Marie Curie Initial Training Network MCnetITN (PITN-GA-2012-315877). It is also supported in part by the Belgian Federal Science Policy Office through the Interuniversity Attraction Pole P7/37. The work of FD and FM is supported by the IISN "MadGraph" convention 4.4511.10 and the IISN "Fundamental interactions" con-

Appendix: The tWb and tWbH channels in the 4FS
In this appendix we study the various ways to treat the tWb channel; in particular, we discuss the performance and shortcomings of the diagram removal and diagram subtraction techniques, which are used to eliminate the tt̄ resonant contribution. Since the issue appears just in the matrix-element description, the study in this appendix is simply performed at the partonic level. The tWb channel is more easily addressed in the 4FS, where it appears as a finite and independent LO contribution and can thus be isolated from the other channels contributing to tW. The only difference from the 5FS is that bottom-mass effects are included in the 4FS description and act as an IR cutoff; the Feynman diagrams are the same ones describing the 5FS NLO real-emission channel, and the features and shortcomings of DR and DS are independent of the flavour scheme employed. An analogous study is then repeated for the tWbH channel in the 4FS.
The problem of the LO tt̄ contribution in the tW⁻b̄ channel was first addressed in [48], where it is subtracted at the cross-section level (see Eq. (4) in the reference). This global subtraction procedure (GS) is described in Sect. 2; an important point in the calculation is that the two pieces (tW⁻b̄ and tt̄) are separately integrated before the subtraction is performed. The GS procedure ensures that the remainder of the subtraction converges to a well-defined limit for Γ_t → 0, where the result is fully gauge invariant, and that exactly the LO on-shell tt̄ contribution, and nothing else, is subtracted. Therefore, combining the tt̄ simulation with the tW⁻b̄ one obtained this way, one gets a well-defined total rate for producing the common physical final state, without double counting and including interference effects; this procedure provides a consistent way to define the tW cross section.
Actually, the only way to perform a theoretically consistent simulation that encompasses both the top-pair and the single-top contributions, that is gauge invariant and that includes interference and other finite-Γ_t effects, is to compute pp → W⁺bW⁻b̄ in the 4FS using a complex top-quark mass. This WbWb simulation will also contain the contribution from amplitudes without any resonant top propagator, A_0t, as well as the interference between single-top and single-antitop contributions, A_1t A*_1t̄, which are not present in the tWb simulation. Nonetheless, we expect the last two lines in Eq. (A.1) to be negligible compared to the previous two lines, which encompass top-pair tt̄ and single-top tWb production. In the end, the reference result will be the difference between the WbWb cross section (computed in the complex-mass scheme, with a physical Γ_t) and the tt̄ cross section (computed with on-shell tops), which in general guarantees a correct description of tWb production. If the non-resonant contributions A_0t to WbWb, the A_1t A*_1t̄ interference, and the off-shell effects related to the single top being kept stable in tWb simulations are small enough, this cross section will be close to the one obtained from GS.
Global subtraction schemes cannot be applied to event generation, where a fully local subtraction of the top-pair contribution must be performed in the 2 → 3 phase space; this is exactly the reason why alternative techniques such as DR and DS have been developed and implemented in MC@NLO and POWHEG for tW production. Nevertheless, a simple but powerful test of the adequacy of DR and DS can be carried out by comparing their total cross section with the GS one, which is the number we expect to be returned by a consistent local subtraction scheme. We perform this comparison in Table 5, where cross sections are computed with the static scale μ_0^s, also showing the cross-section ratio R defined in Eq. (A.2).

From the results in Table 5 we first notice that the WbWb − tt̄ cross section (computed with a physical Γ_t) is in good agreement with the tWb one computed with the GS prescription (which is independent of the actual value of Γ_t), thus either can be considered as the reference value. This also confirms that non-resonant contributions from A_0t and from the A_1t A*_1t̄ interference are small, and justifies the 5FS treatment where one top is always on-shell. Of the two diagram removal techniques, the DR1 modelling does not capture the A_2t A*_1t interference, which amounts to more than 9 pb (this was evident already in Table 1). On the other hand, there is excellent agreement between the DR2 cross section and the desired one from WbWb − tt̄, thus any possible violation of gauge invariance in the DR2 total rate must be negligible.6 When we compute |A_WbWb|^2 − |A_2t|^2 (namely WbWb − |A_2t|^2 in Table 5), we can see that the difference with tWb DR2 is a modest 2%; this provides a further confirmation that effects related to A_0t, the A_1t A*_1t̄ interference, and off-shell t are small. The subtraction of |A_2t|^2 in a covariant gauge turns out to be almost equivalent to an on-shell tt̄ subtraction (compare WbWb − tt̄ and WbWb − |A_2t|^2).
Moving to diagram subtraction, we can see that DS2 is in rather good agreement with GS and DR2, while DS1 clearly overestimates the total rate, which tends to be much closer to DR1.
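For orientation, the matrix-element content of the schemes compared above can be summarised schematically. This is a sketch inferred from the discussion in this appendix, with A_1t the singly resonant amplitude, A_2t the doubly resonant one and C_2t the local subtraction term; the explicit form of C_2t, which is what distinguishes DS1 from DS2, is not reproduced here.

```latex
\begin{align}
  |A_{tWb}|^2 &= |A_{1t}|^2 + 2\,\mathrm{Re}\!\left(A_{1t} A_{2t}^{*}\right) + |A_{2t}|^2 , \\
  \text{DR1:}\quad & |A_{1t}|^2 , \\
  \text{DR2:}\quad & |A_{1t}|^2 + 2\,\mathrm{Re}\!\left(A_{1t} A_{2t}^{*}\right)
                     = |A_{1t} + A_{2t}|^2 - |A_{2t}|^2 , \\
  \text{DS:}\quad  & |A_{1t} + A_{2t}|^2 - C_{2t} .
\end{align}
```

In this notation DR1 and DR2 differ exactly by the interference term, which is why their difference can serve as a measure of interference effects.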
The situation can be understood also at the differential level by looking at the m_Wb distribution in Fig. 20. The absence of interference in DR1 leads to an underestimate of the rate in the low-mass region m_Wb < m_t, and to an overestimate in the tail m_Wb > m_t; at the LHC energy, the latter region dominates, leading to a net overestimate of the total rate.7 DR2 and DS2 nicely reproduce the peak-dip interference pattern, with small differences between the two curves; since DS2 is gauge invariant, this suggests that gauge effects in DR2, when employing a covariant gauge, are small also at the level of differential shapes. Finally, while DS1 includes interference effects as well, it also introduces a significant distortion in the profile of the subtraction term C_2t, as already shown in Fig. 3; the net effect is an unreliable m_Wb profile, with an inverted dip-peak structure and too large a tail.

We now move on to studying the tWbH channel in tWH production at NLO, which overlaps with LO tt̄H. We follow a procedure completely analogous to the one employed for tWb, therefore we do not repeat all the details in the following discussion.

Table 6 LO cross sections in the 4FS at the LHC with √s = 13 TeV for the processes pp → W⁺bW⁻b̄H (complex-mass scheme), pp → tt̄H (t stable), and singly resonant pp → tW⁻b̄H plus pp → t̄W⁺bH computed using the GS, DR and DS prescriptions. For these tWbH results we also report the ratio R, which is analogous to the one defined in Eq. (A.2). All numbers are computed using the static scale μ_0^s = (m_t + m_W + m_H)/2, and the numerical uncertainty affecting the last digit is reported in parentheses
Our reference total rate is the difference between the WbWbH cross section, computed in the complex top-quark mass scheme, and the tt̄H cross section, computed in the approximation of stable final-state top quarks. Once again we find GS to be in very good agreement with this reference value, so both results can be taken as a reference for comparison with DR and DS; see Table 6.
We can see that the ratio between top-pair and single-top amplitudes is even higher than for tt̄ versus tW, and this exacerbates the same problems we have observed in that case. Interference effects are very large and neglecting them results in an error of O(100%) in DR1, where the cross section is more than twice that from GS. Once again, we find DR2 results to be in excellent agreement within the numerical accuracy. The impact of non-resonant amplitudes and of interference between single-top and single-antitop contributions is very small, less than 2% of the DR2 rate in this channel. The rate obtained from DS1 is overestimated by more than a factor of two, while DS2 is again in better agreement with GS and DR2, although there is a residual difference of about 0.7 fb (slightly larger than the 0.3 fb in the 5FS scheme).

7 We have verified that the net sum of interference effects in the total rate is positive at collider energies below ∼2 TeV, while it becomes more and more negative at higher energies, where the phase space for m_Wb > m_t is larger.
In Fig. 21 we show the m_Wb differential distribution. A pattern similar to the one for tWb is repeated: interference effects are large and positive in the m_Wb < m_t region, and negative for m_Wb > m_t, where DR1 clearly overestimates the event rate. The interference pattern is nicely reproduced by the DR2 and DS2 shapes, although there are some minor differences between the two methods; DS1 instead fails to return a physical shape, due to the visibly distorted profile of the subtraction term C_2t; see Fig. 3.
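The sign change of the interference across m_Wb = m_t can be reproduced with a toy model. The sketch below is not the calculation performed in this work: it assumes a constant real singly resonant amplitude and a single Breit-Wigner propagator for the doubly resonant one, with illustrative values for m_t and Γ_t and a relative sign between the two amplitudes chosen so as to reproduce the pattern described above.

```python
def interference_term(m_wb: float,
                      m_t: float = 172.5,       # GeV, illustrative value
                      gamma_t: float = 1.5,     # GeV, illustrative value
                      coupling_product: float = -1.0) -> float:
    """Toy interference 2 Re(A_1t A_2t^*) as a function of m_Wb.

    A_1t is modelled as a real constant (set to 1) and
    A_2t = coupling_product / (m_Wb^2 - m_t^2 + i m_t Gamma_t).
    The negative default for coupling_product is an assumption.
    """
    a_1t = complex(1.0, 0.0)
    propagator = complex(m_wb**2 - m_t**2, m_t * gamma_t)
    a_2t = coupling_product / propagator
    return 2.0 * (a_1t * a_2t.conjugate()).real


if __name__ == "__main__":
    # Scan a few invariant masses around the top-quark mass.
    for mass in (150.0, 170.0, 172.5, 175.0, 200.0):
        print(f"m_Wb = {mass:6.1f} GeV  interference = {interference_term(mass):+.3e}")
```

In this toy the interference is proportional to (m_Wb^2 − m_t^2) divided by the squared modulus of the propagator, so it vanishes at m_Wb = m_t and has opposite signs on the two sides of the resonance; dropping it, as in DR1, therefore shifts the rate in opposite directions below and above m_t.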
We would like to stress one final remark: the fact that gauge dependence is apparently not an issue in the DR2 procedure should be regarded as a peculiarity of the tWb and tWbH channels, and not as a general result. We cannot exclude that gauge dependence could become a significant issue at higher perturbative orders (NNLO tW(H)), or in other processes with a more complex colour flow, or using a different (i.e.