Search for associated production of a Higgs boson and a single top quark in proton-proton collisions at $\sqrt{s} =$ 13 TeV

A search is presented for the production of a Higgs boson in association with a single top quark, based on data collected in 2016 by the CMS experiment at the LHC at a center-of-mass energy of 13 TeV, which corresponds to an integrated luminosity of 35.9 fb$^{-1}$. The production cross section for this process is highly sensitive to the absolute values of the top quark Yukawa coupling, $y_\mathrm{t}$, the Higgs boson coupling to vector bosons, $g_\mathrm{HVV}$, and, uniquely, to their relative sign. Analyses using multilepton signatures, targeting H$\to$WW, H$\to\tau\tau$, and H$\to$ZZ decay modes, and signatures with a single lepton and a $\mathrm{b\overline{b}}$ pair, targeting the H$\to\mathrm{b\overline{b}}$ decay, are combined with a reinterpretation of a measurement in the H$\to\gamma\gamma$ channel to constrain $y_\mathrm{t}$. For a standard model-like value of $g_\mathrm{HVV}$, the data favor positive values of $y_\mathrm{t}$ and exclude values of $y_\mathrm{t}$ below about $-$0.9 $y_\mathrm{t}^\mathrm{SM}$.


Introduction
The scalar resonance discovered by the CMS and ATLAS Collaborations at the LHC [1-3] in 2012 has been found to have properties consistent with the predictions of the standard model (SM) for a Higgs boson with a mass of about 125 GeV [4]. In particular, its couplings to bosons ($g_\mathrm{HVV}$) and fermions ($y_\mathrm{f}$) corroborate an SM-like dependence on the respective masses. Furthermore, data indicate that it has zero spin and positive parity [5]. Recently, the associated production of top quark pairs with a Higgs boson (ttH) and Higgs boson decays to pairs of bottom quarks have been observed [6-8], thereby directly probing the Yukawa interactions between the Higgs boson and top as well as bottom quarks for the first time. In addition to measuring the absolute strengths of Higgs boson couplings, it is pertinent to assess the possible existence of relative phases among the couplings, as well as their general Lorentz structure. Hence a broad sweep of Higgs boson production mechanisms and decay modes must be considered to reveal any potential deviations from the SM expectations.
The production rate of ttH is sensitive only to the magnitude of the top quark-Higgs boson Yukawa coupling $y_\mathrm{t}$ and has no sensitivity to its sign. Measurements of processes such as Higgs boson decays to photon pairs [9] or the associated production of Z and Higgs bosons via gluon-gluon fusion [10], on the other hand, do have sensitivity to the sign, because of indirect effects in loop interactions. Those measurements currently disfavor a negative value of the coupling [11,12], but rely on the assumption that only SM particles contribute to the loops [13].
In contrast, the production of Higgs bosons in association with single top quarks in proton-proton (pp) collisions proceeds via two categories of Feynman diagrams [14-17], where the Higgs boson couples either to the top quark or to the W boson. The two leading order (LO) diagrams for the t channel production process (tHq) are shown in Fig. 1. Because of the interference of these diagrams, the production cross section is uniquely sensitive to the magnitude as well as the relative sign and phase of the couplings. In the SM, the interference between these two diagrams is maximally destructive and leads to very small production cross sections of about 71, 16, and 2.9 fb for the t channel, tW process, and s channel, respectively, at a center-of-mass energy $\sqrt{s}$ = 13 TeV [18,19]. Hence measurements using the data collected at the LHC so far are not yet sensitive to the SM production. However, in the presence of new physics, the t-H and W-H couplings may have opposite relative signs, which leads to constructive interference and enhances the cross sections by an order of magnitude or more. In such scenarios, realized, e.g., in some two-Higgs doublet models [20], the tHq production rate would exceed that of ttH, making it accessible with current LHC data sets. In this paper, the tHq and tHW processes are collectively referred to as tH production, while s channel production is neglected.
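The interference argument can be made explicit with a schematic decomposition of the amplitude. The symbols $\mathcal{A}_\mathrm{t}$ and $\mathcal{A}_\mathrm{W}$ (the parts of the amplitude in which the Higgs boson couples to the top quark and to the W boson, respectively) are notation introduced here for illustration only, and the coupling modifiers $\kappa_\mathrm{t} = y_\mathrm{t}/y_\mathrm{t}^\mathrm{SM}$ and $\kappa_\mathrm{V} = g_\mathrm{HVV}/g_\mathrm{HVV}^\mathrm{SM}$ are assumed to be real:

```latex
\begin{align}
\mathcal{A} &= \kappa_\mathrm{t}\,\mathcal{A}_\mathrm{t} + \kappa_\mathrm{V}\,\mathcal{A}_\mathrm{W}, \\
\sigma \propto |\mathcal{A}|^2 &= \kappa_\mathrm{t}^2\,|\mathcal{A}_\mathrm{t}|^2
  + \kappa_\mathrm{V}^2\,|\mathcal{A}_\mathrm{W}|^2
  + 2\,\kappa_\mathrm{t}\kappa_\mathrm{V}\,\mathrm{Re}\!\left(\mathcal{A}_\mathrm{t}\mathcal{A}_\mathrm{W}^{*}\right).
\end{align}
```

In this picture the last term is negative in the SM ($\kappa_\mathrm{t} = \kappa_\mathrm{V} = 1$), which gives the destructive interference; reversing the sign of $\kappa_\mathrm{t}$ alone flips the sign of that term and turns the interference constructive, while the two squared terms are unchanged.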
The event topology of tHq production is that of two heavy objects (the top quark and the Higgs boson) in the central portion of the detector recoiling against one another, while a light-flavor quark and a soft b quark escape in the forward-backward regions of the detector. Leptonic top quark decays produce high-momentum electrons and muons that can be used to trigger the detector readout. Higgs boson decays to vector bosons or τ leptons (H → WW$^*$, ZZ$^*$, or ττ), which subsequently decay to light leptons ($\ell$ = e$^\pm$, $\mu^\pm$), lead to a multilepton final state with comparatively small background contributions from other processes. Higgs boson decays to bottom quark-antiquark pairs (H → bb), on the other hand, provide a larger event rate albeit with challenging backgrounds from tt+jets production. In contrast, the rarer Higgs boson decays to two photons (H → γγ) result in easily triggered and relatively clean signals for both leptonic and fully hadronic top quark decays, with backgrounds mainly from other production modes of the Higgs boson. The production of tHW lacks the presence of forward activity and involves three heavy objects and therefore does not exhibit the defining features of tHq events, while closely resembling the ttH topologies.
The CMS Collaboration has previously searched for anomalous tHq production in pp collision data at $\sqrt{s}$ = 8 TeV, assuming a negative sign of the top quark Yukawa coupling relative to its SM value, $y_\mathrm{t} = -y_\mathrm{t}^\mathrm{SM}$, using all the relevant Higgs boson decay modes, and set limits on the cross section of this process [21]. This paper describes two new analyses targeting multilepton final states and single-lepton + bb final states, using a data set of pp collisions at $\sqrt{s}$ = 13 TeV corresponding to an integrated luminosity of 35.9 fb$^{-1}$, collected in 2016. Furthermore, a previous measurement of Higgs boson properties in the H → γγ final state at 13 TeV [22] has been reinterpreted in the context of tHq signal production and the results are included in a combination with those from the other channels. This paper is structured as follows: the experimental setup and data samples are described in Sections 2 and 3, respectively; the two analysis channels and their event selection, background estimations, and signal extraction techniques are described in Sections 4 and 5; the reinterpretation of the H → γγ result is described in Section 6; and the results and interpretation are given in Section 7. The paper is summarized in Section 8.

The CMS experiment
The central feature of the CMS apparatus is a superconducting solenoid of 6 m internal diameter, providing a magnetic field of 3.8 T along the beam direction. Within the solenoid volume are a silicon pixel and strip tracker, a lead tungstate crystal electromagnetic calorimeter, and a brass and scintillator hadron calorimeter, each composed of a barrel and two endcap sections providing pseudorapidity coverage of |η| < 3.0. Forward calorimeters employing Cherenkov light detection extend the acceptance to |η| < 5.0. Muons are detected in gas-ionization chambers embedded in the steel flux-return yoke outside the solenoid with a fiducial coverage of |η| < 2.4. The silicon tracker system measures charged particles within the range |η| < 2.5. The impact parameters in the transverse and longitudinal direction are measured with an uncertainty of about 10 and 30 µm, respectively [23]. Tracks of isolated muons of transverse momentum $p_\mathrm{T}$ ≥ 100 GeV and |η| < 1.4 are reconstructed with an efficiency close to 100% and a $p_\mathrm{T}$ resolution of about 1.3 to 2%, and smaller than 6% for higher values of η. For $p_\mathrm{T}$ ≤ 1 TeV the resolution in the central region is better than 10%. A two-level trigger system is used to reduce the rate of recorded events to a level suitable for data acquisition and storage.
The first level of the CMS trigger system [24], composed of custom hardware processors, uses information from the calorimeters and muon detectors to select the most interesting events in a time interval of less than 4 µs. The high-level trigger processor farm further decreases the event rate from around 100 kHz to about 1 kHz. A more detailed description of the CMS detector, together with a definition of the coordinate system and the kinematic variables used in the analysis, can be found in Ref. [25].
A full event reconstruction is performed by the particle-flow (PF) algorithm using optimized and combined information from all the subdetectors [26]. The individual PF candidates reconstructed are muons, electrons, photons, and charged and neutral hadrons, which are then used to reconstruct higher-level objects such as jets, hadronic taus, and missing transverse momentum ($p_\mathrm{T}^\mathrm{miss}$). Additional quality criteria are applied to the objects to improve the selection purity.
Collision vertices are reconstructed using a deterministic annealing algorithm [27,28]. The reconstructed vertex position is required to be compatible with the location of the LHC beam in the x-y plane. The vertex with the largest value of summed physics-object $p_\mathrm{T}^2$ is considered to be the primary pp interaction vertex (PV). Charged particles, which are subsequently reconstructed, are required to be compatible with originating from the selected PV.
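The primary-vertex choice described above amounts to a simple maximization, sketched below. The `Vertex` container and its field names are hypothetical and introduced only for this illustration; the actual reconstruction software is not described in this paper.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Vertex:
    z_cm: float              # position along the beam axis (illustrative field)
    object_pts: List[float]  # pT in GeV of the physics objects assigned to the vertex


def sum_pt2(vertex: Vertex) -> float:
    """Sum of squared transverse momenta of the objects attached to a vertex."""
    return sum(pt * pt for pt in vertex.object_pts)


def select_primary_vertex(vertices: List[Vertex]) -> Vertex:
    """Return the vertex with the largest summed pT^2."""
    if not vertices:
        raise ValueError("no reconstructed vertices in the event")
    return max(vertices, key=sum_pt2)


vertices = [
    Vertex(z_cm=-1.2, object_pts=[5.0, 5.0, 5.0, 5.0]),  # many soft objects
    Vertex(z_cm=0.3, object_pts=[40.0, 30.0]),           # fewer but harder objects
]
pv = select_primary_vertex(vertices)
```

Squaring the momenta favors vertices with a few hard objects over pileup vertices with many soft ones, which is why the second vertex is chosen in this toy event.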
The identification of muons is based on linking track segments reconstructed in the silicon tracker and in the muon system [29]. If a link can be established, the track parameters are recomputed using the combination of hits in the inner and outer detectors. Quality requirements are applied on the multiplicity of hits in the track segments, on the number of matched track segments, and on the quality of the track fit [29].
Electrons are reconstructed using an algorithm that matches tracks found in the silicon tracker with energy deposits in the electromagnetic calorimeter while limiting deposits in the hadronic calorimeter [30]. A dedicated algorithm takes into account the emission of bremsstrahlung photons and determines the energy loss [31]. A multivariate analysis (MVA) approach based on boosted decision trees (BDT) is employed to distinguish real electrons from hadrons mimicking an electron signature. Additional requirements are applied in order to remove electrons originating from photon conversions [30]. Both muons and electrons from signal events are expected to be isolated, while those from heavy-flavor decays are often situated near jets. Lepton isolation is quantified using the scalar $p_\mathrm{T}$ sum over PF candidates reconstructed within a cone centered on the lepton direction and shrinking with increasing lepton $p_\mathrm{T}$. The effect of additional pp interactions in the same and nearby bunch crossings (pileup) on the lepton isolation is mitigated by considering only charged particles consistent with the PV in the sum, and by subtracting an estimate of the contribution from neutral pileup particles within the cone area.
Jets are reconstructed from charged and neutral PF candidates using the anti-$k_\mathrm{T}$ algorithm [32,33] with a distance parameter of 0.4, and with the constraint that the charged particles are compatible with the selected PV. Jets originating from the hadronization of b quarks are identified using the "combined secondary vertex" (CSVv2) algorithm [34], which exploits observables related to the long lifetime of b hadrons and to the higher particle multiplicity and mass of b jets compared to light-quark and gluon jets. Two working points of the CSVv2 discriminant output are used: a "medium" one, with a tagging efficiency for real b jets of 69% and a probability of wrongly tagging jets from light-flavor quarks and gluons of about 1%, and a "loose" one, with a tagging efficiency of 83% and a mistag rate for light-flavor jets of 8%. Finally, the missing transverse momentum is defined as the magnitude of the vectorial $p_\mathrm{T}$ sum of all PF candidates in the event.
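The definition of the missing transverse momentum can be illustrated with a short sketch. The representation of PF candidates as (pT, φ) pairs is an assumption made for this example, as is the convention of quoting the direction opposite to the vector sum; only the magnitude enters the definition given above.

```python
import math
from typing import Iterable, Tuple


def missing_pt(candidates: Iterable[Tuple[float, float]]) -> Tuple[float, float]:
    """Return (magnitude, phi) of the missing transverse momentum.

    candidates: (pt, phi) pairs in GeV and radians (assumed input format).
    The magnitude is that of the vector pT sum; the direction is taken
    opposite to the sum, following the usual convention (an assumption here).
    """
    cands = list(candidates)
    sum_px = sum(pt * math.cos(phi) for pt, phi in cands)
    sum_py = sum(pt * math.sin(phi) for pt, phi in cands)
    magnitude = math.hypot(sum_px, sum_py)
    direction = math.atan2(-sum_py, -sum_px)
    return magnitude, direction


# Toy event: two back-to-back candidates with unequal momenta.
met, met_phi = missing_pt([(30.0, 0.0), (10.0, math.pi)])
```

In the toy event the visible momentum is unbalanced by 20 GeV along φ = 0, so the missing transverse momentum has that magnitude and points in the opposite direction.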

Data and simulation
Collision events for this analysis are selected by the following high-level trigger algorithms. Events in the multilepton channels must pass at least one of single-lepton, dilepton, or trilepton triggers with loose identification and isolation requirements and with a minimum $p_\mathrm{T}$ threshold based on the lepton multiplicity in the final state. Events in the single-lepton + bb channels must pass the same single-lepton triggers, or a dilepton trigger for the control region described in Section 5. The minimum $p_\mathrm{T}$ threshold for single-lepton triggers is 24 GeV for muons and 27 GeV for electrons. For dilepton triggers, the $p_\mathrm{T}$ thresholds on the leading and subleading leptons are 17 GeV and 8 GeV for muons, and 23 GeV and 12 GeV for electrons, respectively. For the trilepton trigger, the third hardest lepton $p_\mathrm{T}$ must be greater than 5 GeV for muons and 9 GeV for electrons.
The data are compared to signal and background estimations based on Monte Carlo (MC) simulated samples and techniques based on control samples in data. All simulated samples include the response of the CMS detector based on the GEANT4 [35] toolkit and are generated with a Higgs boson mass of 125 GeV and a top quark mass of 172.5 GeV. The event generator used for the tHq and tHW signal samples is MADGRAPH5_aMC@NLO (version 2.2.2) [36] at LO precision [37], using the NNPDF3.0 set of parton distribution functions (PDF) [38] with the PDF4LHC prescription [39,40]. The samples are normalized to next-to-leading order (NLO) SM cross sections at 13 TeV of 71.0 and 15.6 fb for tHq and tHW, respectively [18,19].
The Higgs boson production cross sections and branching fractions are expressed as functions of Higgs boson coupling modifiers in the kappa framework [41], where the coupling modifiers κ are defined as the ratio of the actual value of a given coupling to the one predicted by the SM. Particularly relevant for the tH case are the top quark and vector boson coupling modifiers: $\kappa_\mathrm{t} \equiv y_\mathrm{t}/y_\mathrm{t}^\mathrm{SM}$ and $\kappa_\mathrm{V} \equiv g_\mathrm{HVV}/g_\mathrm{HVV}^\mathrm{SM}$, where V stands for either W or Z bosons. The dependence of the tHq and tHW production cross sections on $\kappa_\mathrm{t}$ and $\kappa_\mathrm{V}$ is taken from a calculation at NLO using MADGRAPH5_aMC@NLO [17-19].
Event weights are produced in the generation of both samples corresponding to 33 values of $\kappa_\mathrm{t}$ between −6.0 and +6.0, and for $\kappa_\mathrm{V}$ = 1.0. The tHq events are generated with the four-flavor scheme (4FS) while the tHW process uses the five-flavor scheme (5FS) to disentangle the LO interference with the ttH process [19].
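Because the amplitude is linear in each coupling, the cross sections scale as a quadratic form in the coupling modifiers. The sketch below shows how such a parametrization can be evaluated. The SM cross sections are those quoted in the text, whereas the numerical coefficients in `COEFFS` are assumptions inserted for illustration and are not taken from the text above; they are chosen such that the SM point returns unity, and the values from Refs. [17-19] should be used for any real calculation.

```python
# SM cross sections at 13 TeV in fb, as quoted in the text.
SIGMA_SM_FB = {"tHq": 71.0, "tHW": 15.6}

# ASSUMED coefficients (A, B, C) of A*kt^2 + B*kV^2 + C*kt*kV.
# Illustrative values only; they satisfy A + B + C = 1 so that the
# SM point (kt = kV = 1) reproduces the SM cross section.
COEFFS = {
    "tHq": (2.63, 3.58, -5.21),
    "tHW": (2.91, 2.31, -4.22),
}


def scale_factor(process: str, kappa_t: float, kappa_v: float = 1.0) -> float:
    """Ratio of the cross section to its SM value for given coupling modifiers."""
    a, b, c = COEFFS[process]
    return a * kappa_t**2 + b * kappa_v**2 + c * kappa_t * kappa_v


def cross_section_fb(process: str, kappa_t: float, kappa_v: float = 1.0) -> float:
    """Cross section in fb for given coupling modifiers."""
    return scale_factor(process, kappa_t, kappa_v) * SIGMA_SM_FB[process]


sm_ratio = scale_factor("tHq", kappa_t=1.0)      # SM point
inverted = scale_factor("tHq", kappa_t=-1.0)     # inverted top quark coupling
```

With these assumed coefficients the negative interference term changes sign for $\kappa_\mathrm{t}$ = −1, and the tHq cross section grows by roughly an order of magnitude relative to the SM, in line with the enhancement discussed in the Introduction.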
The MADGRAPH5_aMC@NLO generator is also used for simulation of the ttH process and the main backgrounds: associated production of tt pairs with vector bosons, ttW and ttZ, at NLO [42], and with additional jets or photons, tt+jets and ttγ+jets, at LO. All the rates are normalized to next-to-next-to-leading order cross sections. In particular, the ttH production cross section is taken as 0.507 pb [18]. A set of minor backgrounds are also simulated with MADGRAPH5_aMC@NLO at LO, or with other generators, such as NLO POWHEG v2 [43-48]. All generated events are interfaced with PYTHIA (8.205) [49] for the parton shower and hadronization steps.
The object reconstruction in MC events uses the same algorithm as used in data. Furthermore, the trigger selection is simulated and applied for generated signal events. However, the triggering and selection efficiencies for leptons are different between data and simulation, at the level of 1%. All simulated events used in the analyses are corrected by applying small data-to-MC scale factors to improve the modeling of the data. Separate scale factors are applied to correct for the difference in trigger efficiency, lepton reconstruction and selection efficiency, as well as the b tagging efficiency and the resolution of the missing transverse momentum.
Simulated events are weighted according to the number of pileup interactions so that the distribution of additional pp interactions in the simulated samples matches that observed in data, as estimated from the measured bunch-to-bunch instantaneous luminosity and the total inelastic cross section.
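A minimal sketch of such a pileup reweighting is given below, assuming that the pileup distributions in data and simulation are available as histograms with identical binning. The function name and the toy histograms are hypothetical; the treatment of empty simulation bins is a choice made for this example.

```python
from typing import List, Sequence


def pileup_weights(data_counts: Sequence[float], mc_counts: Sequence[float]) -> List[float]:
    """Per-bin weights that map the simulated pileup distribution onto the data one.

    Both inputs are histograms over the number of pileup interactions with
    identical binning (assumed). Each is normalized to unit area before the
    ratio is taken, so the weights preserve the overall simulated yield.
    """
    if len(data_counts) != len(mc_counts):
        raise ValueError("data and simulation histograms must share the same binning")
    data_total = float(sum(data_counts))
    mc_total = float(sum(mc_counts))
    weights = []
    for d, m in zip(data_counts, mc_counts):
        # Bins that are empty in simulation cannot be reweighted; give them weight 0.
        weights.append((d / data_total) / (m / mc_total) if m > 0 else 0.0)
    return weights


data_pu = [10.0, 30.0, 40.0, 20.0]   # toy pileup distribution in data
mc_pu = [25.0, 25.0, 25.0, 25.0]     # toy pileup distribution in simulation
weights = pileup_weights(data_pu, mc_pu)
```

Each simulated event then receives the weight of the bin corresponding to its number of pileup interactions.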

Multilepton channels
Signal tH events where the top quark decay produces leptons and the Higgs boson decays to vector bosons or τ leptons can lead to final states containing multiple isolated, high-$p_\mathrm{T}$ leptons with different charge and flavor configurations. Of particular interest among these are those with three or more charged leptons or with two leptons of the same electric charge, as they appear with comparatively low backgrounds. Selecting such events in pp collisions while requiring the presence of b-tagged jets typically yields a mixture of mostly tt+jets events with nonprompt leptons and events from the associated production of tt with a vector boson (ttW and ttZ) or with a Higgs boson (ttH) that decay to additional prompt leptons. The analysis described in this section separates the tHq signal from these two dominant background sources by training two multivariate classifiers using features such as the forward light jet, the difference in multiplicities of jets and b-tagged jets ("b jets"), as well as the kinematic properties of the leptons. The two classifier outputs are combined into a single binned distribution, which is then fit to the data to extract the signal yield and constrain the background contributions.

Event and object selections
In the multilepton channels, events are selected with trigger algorithms involving one, two, or three leptons passing the given $p_\mathrm{T}$ thresholds. At the offline analysis level, a distinction is made between prompt signal leptons (from W, Z, or leptonic τ decays) and nonprompt leptons (either genuine leptons from heavy-flavor hadron decays, asymmetric γ conversions, or jets misidentified as leptons). For this purpose an MVA classifier is used [50], exploiting the properties of the jet associated with individual leptons in addition to the lepton kinematics, isolation, and reconstruction quality. Leptons that pass a given threshold on the classifier output are referred to as "tight" leptons; a lower threshold defines a relaxed selection of "loose" leptons.
The final tH event selection targets signatures with H → WW and t → Wb → $\ell\nu$b, which results in three W bosons, one b quark, and a light quark at high rapidity. Three mutually exclusive channels are defined based on the number of tight leptons and their flavors: exactly two same-sign leptons (2$\ell$ss), either $\mu^\pm\mu^\pm$ or $\mathrm{e}^\pm\mu^\pm$, or exactly three leptons ($\ell\ell\ell$, $\ell$ = µ or e). The same-sign dielectron channel suffers from larger backgrounds and does not add sensitivity and is therefore not included in the analysis. There is an additional requirement of at least one b-tagged jet (using the medium working point of the CSVv2 algorithm) and at least one light-flavor (untagged, using the loose working point) jet. The full selection is summarized in Table 1.
About one quarter of the events in the finally selected sample are from H → ττ and H → ZZ decays, with the rest coming from H → WW decays, as determined from the tHq signal simulation. A significant fraction of selected events also pass the selection used in the dedicated search for ttH in multilepton channels [50]: about 50% in the dilepton channels and about 80% in the $\ell\ell\ell$ channel.
Table 1: Event selection for the multilepton channels.
Same-sign channel ($\mu^\pm\mu^\pm$ or $\mathrm{e}^\pm\mu^\pm$)
  Exactly two tight same-sign (SS) leptons with $p_\mathrm{T}$ > 25/15 GeV
  No loose leptons with $m_{\ell\ell}$ < 12 GeV
  One or more b-tagged jets with $p_\mathrm{T}$ > 25 GeV and |η| < 2.4
  One or more untagged jets with $p_\mathrm{T}$ > 25 GeV for |η| < 2.4 and $p_\mathrm{T}$ > 40 GeV for |η| > 2.4
$\ell\ell\ell$ channel
  Exactly three tight leptons with $p_\mathrm{T}$ > 25/15/15 GeV
  No lepton pair with $|m_{\ell\ell} - m_\mathrm{Z}|$ < 15 GeV
  No loose leptons with $m_{\ell\ell}$ < 12 GeV
  One or more b-tagged jets with $p_\mathrm{T}$ > 25 GeV and |η| < 2.4
  One or more untagged jets with $p_\mathrm{T}$ > 25 GeV for |η| < 2.4 and $p_\mathrm{T}$ > 40 GeV for |η| > 2.4
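To illustrate how the same-sign dilepton criteria of Table 1 combine, a minimal sketch follows. The `Lepton` and `Jet` containers and their field names are hypothetical, all leptons passed to the function are assumed to satisfy the loose selection, the low-mass veto is interpreted as applying to any pair of loose leptons, and the pair mass uses a massless-lepton approximation; none of these details is specified in the text.

```python
import math
from dataclasses import dataclass
from itertools import combinations
from typing import List


@dataclass
class Lepton:
    pt: float      # GeV
    eta: float
    phi: float
    charge: int
    flavor: str    # "e" or "mu"
    tight: bool    # passes the tight selection (loose is assumed for all)


@dataclass
class Jet:
    pt: float      # GeV
    eta: float
    btag_medium: bool
    btag_loose: bool


def pair_mass(a: Lepton, b: Lepton) -> float:
    """Invariant mass of two leptons, neglecting their masses."""
    m2 = 2.0 * a.pt * b.pt * (math.cosh(a.eta - b.eta) - math.cos(a.phi - b.phi))
    return math.sqrt(max(0.0, m2))


def passes_2lss(leptons: List[Lepton], jets: List[Jet]) -> bool:
    tight = sorted((l for l in leptons if l.tight), key=lambda l: l.pt, reverse=True)
    if len(tight) != 2:
        return False
    lead, sub = tight
    if lead.charge != sub.charge:
        return False
    if lead.flavor == "e" and sub.flavor == "e":
        return False  # the same-sign dielectron channel is not used
    if lead.pt <= 25.0 or sub.pt <= 15.0:
        return False
    if any(pair_mass(a, b) < 12.0 for a, b in combinations(leptons, 2)):
        return False
    n_btag = sum(1 for j in jets if j.btag_medium and j.pt > 25.0 and abs(j.eta) < 2.4)
    n_untagged = sum(
        1 for j in jets
        if not j.btag_loose and j.pt > (25.0 if abs(j.eta) < 2.4 else 40.0)
    )
    return n_btag >= 1 and n_untagged >= 1


leptons = [Lepton(40.0, 0.5, 0.0, +1, "mu", True), Lepton(20.0, -1.0, 2.0, +1, "mu", True)]
jets = [Jet(50.0, 1.0, True, True), Jet(60.0, 3.2, False, False)]
selected = passes_2lss(leptons, jets)
```

The toy event contains two tight same-sign muons, one central b-tagged jet, and one forward untagged jet, so it passes the sketch of the selection.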

Backgrounds
The background processes contributing to the signal sample can be divided into two classes, reducible and irreducible, and are estimated respectively from data and MC simulation. Irreducible physics processes, such as the associated production of an electroweak boson with a top quark pair (ttV, V = W, Z), give rise to final states very similar to the tHq signal and are directly estimated from MC simulation. However, the dominant contribution is from the reducible background arising from nonprompt leptons, mainly from tt production. This background is suppressed to a certain extent by tightening the lepton selection criteria. The background estimation methods employed here and summarized below are identical to those used in the dedicated search for ttH in multilepton channels [50].
The yield of reducible backgrounds is estimated from the data, using a "tight-to-loose" ratio measured in a control region dominated by nonprompt leptons. The ratio represents the probability with which the nonprompt leptons that pass the looser selection can also pass the tight criteria, and is measured in categories of the lepton $p_\mathrm{T}$ and η. A sideband region in data which has loosely selected leptons is then extrapolated with this ratio to obtain the nonprompt background contribution.
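The extrapolation from the sideband can be sketched as follows, using a common formulation of the tight-to-loose method in which each sideband event is weighted according to the ratios f of its leptons that pass the loose but fail the tight selection. This formulation is an assumption made here for illustration; the exact procedure is that of Ref. [50] and may differ in detail.

```python
from typing import Iterable, Sequence


def application_weight(ratios: Sequence[float]) -> float:
    """Weight of one sideband event.

    ratios: tight-to-loose ratios f (looked up in bins of lepton pT and eta)
    of the leptons that pass the loose but fail the tight selection.
    One failing lepton gives f/(1-f); two failing leptons give
    -f1*f2/((1-f1)*(1-f2)), which avoids double counting.
    """
    if not ratios:
        raise ValueError("sideband events need at least one lepton failing the tight selection")
    weight = -1.0
    for f in ratios:
        if not 0.0 <= f < 1.0:
            raise ValueError("tight-to-loose ratio must lie in [0, 1)")
        weight *= -f / (1.0 - f)
    return weight


def nonprompt_yield(sideband_events: Iterable[Sequence[float]]) -> float:
    """Estimated nonprompt yield in the signal region."""
    return sum(application_weight(ratios) for ratios in sideband_events)


# Toy sideband: two events with one failing lepton, one event with two.
toy_sideband = [[0.2], [0.1], [0.2, 0.5]]
estimate = nonprompt_yield(toy_sideband)
```

The negative weight for events with two failing leptons corrects for the fact that such events are also partially counted through the single-failing-lepton term.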
A further background in the same-sign dilepton channels arises from events where the charge of one lepton is wrongly assigned. This can be estimated from the data, by measuring the charge misidentification probability using the Z boson mass peak in same-sign dilepton events, and weighting events with opposite-sign leptons to determine the yield in the signal region. The effect is found to be negligible for muons but sizable for electrons.
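The weighting of opposite-sign events can be sketched as below. The per-lepton misidentification probabilities are hypothetical inputs, and the weight is taken to be the probability that exactly one of the two lepton charges is misassigned, which is an assumption about the implementation rather than a statement from the text.

```python
from typing import Iterable, Tuple


def flip_weight(p1: float, p2: float) -> float:
    """Probability that exactly one of two lepton charges is misassigned."""
    return p1 * (1.0 - p2) + p2 * (1.0 - p1)


def same_sign_yield(os_events: Iterable[Tuple[float, float]]) -> float:
    """Predicted same-sign yield from opposite-sign events.

    os_events: one (p1, p2) pair of charge misidentification
    probabilities per opposite-sign event.
    """
    return sum(flip_weight(p1, p2) for p1, p2 in os_events)


# Toy example: 1000 opposite-sign e-mu events with a hypothetical electron
# misidentification probability of 1e-3 and a negligible one for muons.
os_events = [(1e-3, 0.0)] * 1000
charge_flip_background = same_sign_yield(os_events)
```

For small probabilities the weight reduces to the sum p1 + p2, so the muon contribution drops out when its misidentification probability is negligible.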
The production of WZ pairs with leptonic Z boson decays has similar leptonic features as the signal, but usually lacks the hadronic activity required in the signal selection. To determine the corresponding diboson contribution in the signal region, simulated WZ events have been used along with a normalization scale factor determined from data in an exclusive control region.
Other subdominant backgrounds are estimated from MC simulation and include additional multiboson production, such as ZZ, W$^\pm$W$^\pm$qq, VVV, same-sign W boson production from double-parton scattering (DPS), associated production of top quarks with Z bosons (tZq, tZW), events with four top quarks, and tt production in association with photons and subsequent asymmetric conversions.
The expected and observed event yields after the selections described in Table 1 are shown in Table 2.
Table 2: Data yields and expected backgrounds after the event selection for the three multilepton search channels in 35.9 fb$^{-1}$ of integrated luminosity. Quoted uncertainties include statistical uncertainties reflecting the limited size of MC samples and data sidebands, and unconstrained systematic uncertainties.

Signal extraction
After applying the event selection of the multilepton channels, only about one percent of selected events are expected to be from tH production (assuming the SM cross sections), while roughly 10% of events are from ttH production. To discriminate this small signal from the backgrounds, an MVA method is employed: a classification algorithm is trained twice with tHq events as the signal class, and either ttV (mixing ttW and ttZ according to their respective cross sections) or tt+jets as background classes. The two separate trainings allow the exploitation of the different jet and b jet multiplicity distributions, and of the different kinematic properties of the leptons in the two dominant background classes. Several machine learning algorithms were studied for potential use, and the best performance was obtained with a gradient BDT using a maximum tree depth of three and an ensemble of 800 trees [51]. Events from tHW and ttH production are not used in the training and, because of their close kinematic similarity with the ttV background, tend to be classified as backgrounds.
As observed above, the features of the tHq signal can be split into three broad categories: those related to the forward jet activity; those related to jet and b-jet multiplicities; and those related to kinematic properties of leptons, as well as their total charge. A set of ten observables is used as input features to the classification training; they are listed in Table 3. The training is performed separately for the 2$\ell$ss and the $\ell\ell\ell$ channels with the same or equivalent input features.
Table 3: Input observables to the signal discrimination classifier.
  Number of jets with $p_\mathrm{T}$ > 25 GeV, |η| < 2.4
  Maximum |η| of any (untagged) jet ("forward jet")
  Sum of lepton charges
  Number of untagged jets with |η| > 1.0
  ∆η between forward light jet and leading b-tagged jet
  ∆η between forward light jet and subleading b-tagged jet
  ∆η between forward light jet and closest lepton
  ∆φ of highest-$p_\mathrm{T}$ same-sign lepton pair
  Minimum ∆R between any two leptons
  $p_\mathrm{T}$ of subleading (or 3rd) lepton
A selection of the main discriminating input observables is shown in Figs. 2-4, comparing the data and the estimated distribution of signal and background processes.
The six classifier output distributions, trained against ttV and tt+jets processes for each of the three channels, are shown in Fig. 5, before a fit to the data. The events are then sorted into ten categories depending on the output of the two BDT classifiers according to an optimized binning strategy, resulting in a one-dimensional histogram with ten bins. Figure 6 shows the post-fit categorized classifier output distributions for each of the three channels, after the combined maximum likelihood fit to extract the limits, as described in Section 7.
Figure 2: Distributions of discriminating observables for the same-sign $\mu^\pm\mu^\pm$ channel, normalized to 35.9 fb$^{-1}$, before fitting the signal discriminant to the data. The grey band represents the unconstrained (pre-fit) statistical and systematic uncertainties. In the panel below each distribution, the ratio of the observed and predicted event yields is shown. The shape of the two tH signals for $\kappa_\mathrm{t}$ = −1.0 is shown, normalized to their respective cross sections for $\kappa_\mathrm{t}$ = −1.0, $\kappa_\mathrm{V}$ = 1.0.

Figure 3: Distributions of discriminating observables for the same-sign $\mathrm{e}^\pm\mu^\pm$ channel, normalized to 35.9 fb$^{-1}$, before fitting the signal discriminant to the data. The grey band represents the unconstrained (pre-fit) statistical and systematic uncertainties. In the panel below each distribution, the ratio of the observed and predicted event yields is shown. The shape of the two tH signals for $\kappa_\mathrm{t}$ = −1.0 is shown, normalized to their respective cross sections for $\kappa_\mathrm{t}$ = −1.0, $\kappa_\mathrm{V}$ = 1.0.

Systematic uncertainties
The yield of signal and background events after the selection, as well as the shape of the classifier output distributions for signal and background processes, have systematic uncertainties from a variety of sources, both experimental and theoretical. Experimental uncertainties relate either to the reconstruction of physics objects or to imprecisions in estimating the background contributions. Uncertainties in the efficiency of reconstructing and selecting physics objects affect all yields and kinematic shapes taken from MC simulation, for both signal and background. Background contributions estimated from the data are not affected by these.
Uncertainties from unknown higher-order contributions to tHq and tHW production are estimated by varying the factorization and renormalization scales to double and half their initial values, evaluated separately for each value of $\kappa_\mathrm{t}$. The ttH component has an uncertainty of 5.8-9.3% from scale variations and an additional 3.6% from the knowledge of PDFs and the strong coupling constant $\alpha_\mathrm{S}$ [18]. Uncertainties related to the choice of the PDF set and its scale are estimated to be 3.7% for tHq and 4.0% for tHW. The effect of missing higher-order corrections to the kinematic shape of the classifier outputs is taken into account for the tH, ttH, and ttV components by independent changes of the renormalization and factorization scales to double and half the nominal value.
The cross sections of ttZ and ttW production are known with uncertainties of +9.6%/−11.2% and +12.9%/−11.5%, respectively, from missing higher-order corrections to the perturbative expansion. The corresponding values due to uncertainties in the PDFs and $\alpha_\mathrm{S}$ are 3.4 and 4.0%, respectively [18].
The efficiency for events passing the combination of trigger requirements is measured separately for events with two or more leptons, and has an uncertainty in the range of 1-3%. Efficiencies for the reconstruction and selection of muons and electrons are measured as a function of their $p_\mathrm{T}$, using a tag-and-probe method with uncertainties of 2-4% [52]. The energy scale of jets is determined using event balancing techniques and carries uncertainties of a few percent, depending on $p_\mathrm{T}$ and η of the jets [53]. Their impact on the kinematic distributions used in the …
The uncertainty in the integrated luminosity is 2.5% [54] and affects the normalization of all processes modeled in simulation.

The estimate of events containing nonprompt leptons is subject to uncertainties in the determination of the tight-to-loose ratio on the one hand, and to the inherent bias in the selection of the control region dominated by nonprompt leptons, as tested in simulated events, on the other hand. The measurement of the lepton tight-to-loose ratio has statistical as well as systematic uncertainties from the removal of residual prompt leptons in the control region, amounting to a total uncertainty of 10-40%, depending on the flavor of the leptons and their $p_\mathrm{T}$ and η. The validity of the method itself is tested in simulated events and contributes a small additional uncertainty both to the normalization and shape of the classifier distributions for such events.
The estimate of backgrounds from electron charge misidentification in the $\mathrm{e}^\pm\mu^\pm$ channel carries an uncertainty of about 30% from the measurement of the misidentification probability. It is composed of a dominant statistical component from the limited event yields, and one related to the residual disagreement observed when testing the prediction in a control region.
The estimate of backgrounds from WZ production is normalized in a control region with three leptons and carries uncertainties due to its limited event count (10%), the residual non-WZ backgrounds (20%), the b tagging rate (10%), and the theoretical uncertainties related to the flavor composition of jets produced in association with the boson pair (up to 10%). In the dilepton channels, this uncertainty is increased to 50% to account for the differences with respect to the control region.
Additional smaller backgrounds which have not yet been observed at the LHC (VVV, same-sign W boson production, tZq, tZW, tttt) are assigned a normalization uncertainty of 50%.
Figure 5: Pre-fit classifier outputs, for the $\mu^\pm\mu^\pm$ channel (left), $\mathrm{e}^\pm\mu^\pm$ channel (center), and three-lepton channel (right), for training against ttV (top row) and against tt+jets (bottom row). In the box below each distribution, the ratio of the observed and predicted event yields is shown. The shape of the two tH signals for $\kappa_\mathrm{t}$ = −1.0 is shown, normalized to their respective cross sections for $\kappa_\mathrm{t}$ = −1.0, $\kappa_\mathrm{V}$ = 1.0. The grey band represents the unconstrained (pre-fit) statistical and systematic uncertainties.
Of these sources of systematic uncertainties, the ones with largest impact on the final result are found to be those related to the normalization of the nonprompt backgrounds, the scale variations for the ttV and ttH processes, and the lepton selection efficiencies.

Single-lepton + bb channels
Events from a tH signal where the Higgs boson decays to a bottom quark-antiquark pair (H → bb) produce final states with at least three central b jets and a hard lepton from the top quark decay used for triggering. Selecting such events leads to challenging backgrounds from tt production with additional heavy-flavor quarks, which can be produced in gluon splittings from initial- or final-state radiation. The analysis described in this section uses two selections aimed at identifying signal events, with either three or four b-tagged jets, and a separate sample with opposite-sign dileptons, dominated by tt+jets events, to control tt + heavy-flavor (tt+HF) events in a simultaneous fit. A multivariate classification algorithm is trained to discriminate different tt+jets background components in the control region. Additional multivariate algorithms are used to optimize the jet-parton assignment used to reconstruct kinematic properties of signal and background events which, in turn, are used to distinguish these components.
Figure 6: Post-fit categorized classifier output distributions for the $\mu^\pm\mu^\pm$ channel (left), $\mathrm{e}^\pm\mu^\pm$ channel (center), and three-lepton channel (right), for 35.9 fb$^{-1}$. In the box below each distribution, the ratio of the observed and predicted event yields is shown. The shape of the tH signal is indicated with 10 times the SM.

Selection
Selected events in the single-lepton + bb signal channels must pass a single-lepton trigger. Each event is required to contain exactly one muon or electron. Muon (electron) candidates are required to satisfy p T > 27 (35) GeV and |η| < 2.4 (2.1), motivated by the trigger selection, and to be isolated and fulfill strict quality requirements. Events with additional leptons that have p T > 15 GeV and pass less strict quality requirements are rejected. At the analysis level, the selection criteria target the H → bb and t → Wb → ℓνb decay channels. With these decays, the final state of the tHq process consists of one W boson, three b quarks, and the light-flavor quark recoiling against the top quark-Higgs boson system. In addition, a fourth b quark is expected because of the initial gluon splitting, but it often falls outside the detector acceptance. The main signal region is therefore required to have either three or four b-tagged jets and at least one additional untagged jet, both defined using the medium working point. Central jets with |η| < 2.4 are required to have p T > 30 GeV, while jets in the forward region (2.4 ≤ |η| ≤ 4.7) are required to have p T > 40 GeV.
The noninteracting neutrino is accounted for by requiring a minimal amount of missing transverse momentum of p miss T > 35 GeV in the muon channel and p miss T > 45 GeV in the electron channel. This renders the background from QCD multijet events negligible.
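The signal-region requirements described above can be collected into a short sketch. This is only an illustration under stated assumptions: the `Lepton` and `Jet` containers and the function names are hypothetical, the trigger, isolation, lepton-quality, and additional-lepton veto requirements are not modeled, and b-tagged jets are assumed to be counted only in the central region (|η| < 2.4).

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Lepton:
    flavor: str  # "mu" or "e" (hypothetical labels)
    pt: float    # GeV
    eta: float


@dataclass
class Jet:
    pt: float          # GeV
    eta: float
    btag_medium: bool  # passes the medium b tagging working point


def passes_jet_kinematics(jet: Jet) -> bool:
    """Central jets need pT > 30 GeV, forward jets (up to |eta| = 4.7) pT > 40 GeV."""
    abs_eta = abs(jet.eta)
    if abs_eta < 2.4:
        return jet.pt > 30.0
    if abs_eta <= 4.7:
        return jet.pt > 40.0
    return False


def passes_signal_selection(lepton: Lepton, jets: List[Jet], met: float) -> bool:
    """Single-lepton + bb signal region: one lepton, pTmiss cut, 3 or 4 b tags, >= 1 untagged jet."""
    if lepton.flavor == "mu":
        lepton_ok = lepton.pt > 27.0 and abs(lepton.eta) < 2.4
        met_ok = met > 35.0
    elif lepton.flavor == "e":
        lepton_ok = lepton.pt > 35.0 and abs(lepton.eta) < 2.1
        met_ok = met > 45.0
    else:
        return False

    good_jets = [j for j in jets if passes_jet_kinematics(j)]
    # Assumption: only central jets can count as b tagged.
    n_tagged = sum(1 for j in good_jets if j.btag_medium and abs(j.eta) < 2.4)
    n_untagged = len(good_jets) - n_tagged
    return lepton_ok and met_ok and n_tagged in (3, 4) and n_untagged >= 1
```

In this sketch a forward jet always counts as untagged, which matches the expected topology of a light-flavor quark recoiling against the top quark-Higgs boson system.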
In addition to the signal regions, a control region is defined to constrain the main background contribution from top quark pair production. Events selected for this control region must pass a dilepton trigger. Each event is required to contain exactly two oppositely charged leptons, where their flavor can be any combination of muons or electrons. Two jets in each event must be b tagged. Furthermore, at least one additional jet must pass the loose b tagging requirement. Similarly to the signal regions, each event is further required to have a minimum amount of missing transverse momentum. All selection criteria are summarized in Table 4.

Table 4 (control region rows):
Control region
  Two leptons: p T > 20/20 GeV (µ ± µ ∓ ) or p T > 20/15 GeV (e ± e ∓ /µ ± e ∓ )
  No additional loose leptons
  Two medium b-tagged jets: p T > 30 GeV and |η| < 2.4
  One or more additional loose b-tagged jets: p T > 30 GeV and |η| < 2.4
  p miss T > 40 GeV

Backgrounds
The main background contribution in the single-lepton + bb channels arises from SM processes with multiple b quarks. The modeling and estimation of all background processes are done using samples of simulated events.
In particular, the dominant background process is top quark pair production because of the similar final state and, comparatively, a large cross section. As the modeling of the additional heavy-flavor partons in tt events is theoretically difficult, the sample of simulated tt events is further divided into different subcategories, defined by the flavor of possible additionally radiated quarks and taking into account a possible merging of b hadrons into single jets. The control region is specifically designed to separate the tt+HF and tt + light-flavor (tt+LF) components with a multivariate approach. The different categories are listed in Table 5.

Table 5:
  tt+bb: Two additional jets arising from b hadrons
  tt+2b: One additional jet arising from two merged b hadrons
  tt+b: One additional jet arising from one b hadron
  tt+cc: The three former categories combined for c hadrons instead of b hadrons
  tt+LF: All events that do not meet the criteria of the other four categories

Other backgrounds contributing to the signal region are single top quark production and top quark pair production in association with electroweak bosons, namely ttW and ttZ. An irreducible background for the tHq processes comes from tZq production with Z → bb. Background contributions also arise from Z+jets production, especially in the dilepton control region.
The expected and observed event yields for the signal and control regions are listed in Table 6.

Signal extraction
As the assignment of final state quarks to reconstructed jets is non-trivial for the multijet environment of the 3 and 4 tag signal regions, the jet-to-quark assignment is achieved with dedicated jet assignment BDTs (JA-BDTs). Each event is reconstructed under three different hypotheses: tHq signal event, tHW signal event, or tt+jets background event. Each assignment hypothesis utilizes a separate BDT, which is trained with correct and wrong jet-to-quark assignments of the respective process. When a JA-BDT is applied, all possible jet-to-quark assignments are evaluated and the one with the highest JA-BDT output value is chosen for the given hypothesis. The matching efficiency for a complete tHq event is 58 (45)% in the 3 (4) tag signal region, for a complete tHW event 38 (29)% and for a complete tt event 58 (31)%.
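The selection of the best jet-to-quark assignment described above amounts to an exhaustive search over permutations, which can be sketched as follows. This is a minimal illustration, not the analysis code: `best_assignment`, the role names, and the toy scoring function are hypothetical, and the toy score merely stands in for the output of a trained JA-BDT.

```python
from itertools import permutations


def best_assignment(jets, roles, score):
    """Try every way of assigning distinct jets to the quark roles and keep the
    assignment with the highest score. Returns (score, {role: jet}), or None
    if there are fewer jets than roles."""
    best = None
    for combo in permutations(range(len(jets)), len(roles)):
        assignment = {role: jets[i] for role, i in zip(roles, combo)}
        value = score(assignment)
        if best is None or value > best[0]:
            best = (value, assignment)
    return best


# Hypothetical roles for the tHq hypothesis: three b quarks and the recoiling light quark.
ROLES = ("b_top", "b_higgs_1", "b_higgs_2", "light_quark")


def toy_score(assignment):
    """Stand-in for a JA-BDT: rewards high b-tag values on the b roles
    and a forward jet in the light-quark role."""
    b_part = sum(assignment[r]["btag"] for r in ROLES[:3])
    return b_part + 0.1 * abs(assignment["light_quark"]["eta"])


jets = [
    {"name": "j0", "btag": 0.90, "eta": 0.3},
    {"name": "j1", "btag": 0.80, "eta": -1.0},
    {"name": "j2", "btag": 0.95, "eta": 0.5},
    {"name": "j3", "btag": 0.10, "eta": 3.2},
]
best_value, best_map = best_assignment(jets, ROLES, toy_score)
```

Running the same search once per hypothesis (tHq, tHW, tt+jets), each with its own scoring function, gives one reconstructed event interpretation per hypothesis. The number of permutations grows quickly with the jet multiplicity, so a real implementation would likely restrict the candidate jets per role.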
The different assignment hypotheses provide sensitive variables, which can be exploited in a further signal classification BDT (SC-BDT) to separate the tHq and tHW processes from the main background of the analysis, tt events. Global event variables that do not rely on any particular jet-to-quark assignment are used in addition to the assignment-dependent variables. The input variables used for the SC-BDT are listed in Table 7 with the result of the training illustrated in Fig. 7.
In addition, a dedicated flavor classification BDT (FC-BDT) is used in the dilepton region to constrain the contribution of different tt + X components. The training is performed with tt+LF as signal process and tt+bb as background process. This FC-BDT exploits information on the number of jets per event and their response to b and c tagging algorithms. The full list of input variables is provided in Table 8 and the result of the training of the FC-BDT is shown in Fig. 8.

To determine the signal yield, the output distributions of the SC-BDT in the three and four b tag regions are fitted simultaneously with the output of the FC-BDT in the dilepton region. The SC-BDT output distributions before the fit are shown in Fig. 9 and the result of the fit is shown in Fig. 10. The pre- and post-fit distributions of the FC-BDT are shown in Fig. 11.

Systematic uncertainties
Many systematic uncertainties affect the result of the analysis, arising both from experimental and theoretical sources. All uncertainties are parametrized as nuisance parameters in the statistical inference performed in the final analysis step described in Section 7.
The uncertainty in the signal normalization due to the choice of factorization and renormalization scales is evaluated by changing their values to double and half of the nominal values. A rate uncertainty of around 5% is assigned to each process to account for the choice of PDFs, since shape variations are found to be negligible. Furthermore, for each tt+HF category, an individual 50% rate uncertainty is assigned, since the modeling of these components is theoretically difficult.

The observed top quark p T spectrum in tt events is found to be softer than the theoretical prediction [59]. A systematic uncertainty for this effect is derived by applying event-by-event weights that correct the disagreement.
Efficiency corrections for the selection of isolated leptons by the trigger and quality requirements are evaluated with a tag-and-probe method. Uncertainties in correcting the distribution of PV interactions are accounted for by varying the total inelastic cross section by 4.6% [60]. The corrections applied to the jet energy scale and resolution are varied within their given uncertainties and the migration between different categories is used to determine the effect. In addition, the contribution to p miss T of unclustered particles is varied within the resolution of each particle [61]. The b tagging efficiencies for jets are measured in QCD multijet and tt enriched samples and varied within their uncertainties [34].
As for the multilepton channel, an uncertainty of 2.5% is assigned to the integrated luminosity [54] and affects the normalization of all processes.
The dominant systematic uncertainties are those related to the factorization and renormalization scales, as well as the uncertainties in the overall normalization of the tt+HF processes and the jet energy corrections.

Reinterpretation of the H → γγ measurement
The standard model tHq and tHW signal processes with H → γγ were included in previous measurements of the Higgs boson properties in the inclusive diphoton final state [22]. Events with two prompt high-p T photons were divided into different event categories, each enriched with a particular production mechanism of the Higgs boson. The tHq and tHW processes contribute mostly to the "ttH hadronic" and "ttH leptonic" categories as defined in Ref. [22], which target the ttH process for fully hadronic top quark decays and for single-lepton or dilepton decay modes, respectively. Events in the ttH leptonic category are required to have at least one well-reconstructed lepton that is well separated from the photons, as well as at least two jets, of which at least one passes the medium b tagging requirement. The ttH hadronic category is defined as events with no identified leptons and at least three jets, of which at least one is b tagged with the loose working point.
The signal is modeled with a sum of Gaussian functions describing the diphoton invariant mass (m γγ ) shape derived from simulation. The background contribution is determined from the data without reliance on simulated events, using the discrete profiling method [4, 62,63]. Different classes of models describing the falling m γγ distribution in the background processes are used as input to the method. Sources of systematic uncertainties affecting the signal model and leading to migrations of signal events among the categories are considered.
The inputs to Ref. [22] from the ttH categories are used here in a combination with the multilepton and single-lepton + bb channels to put constraints on the coupling modifier κ t and on the production cross section of tH events. The coupling modifiers κ t and κ V affect both the tH and ttH production cross sections, as well as the Higgs boson decay branching fraction into two photons through the interference of bosonic and fermionic loops. Changes in the kinematic properties of the tH signal arising from the modified couplings are taken into account by considering their effect on the signal acceptance and selection efficiency. Figure 12 shows the modified tHq and tHW selection efficiencies including acceptances for the two relevant categories of the H → γγ measurement as a function of the ratio of coupling modifiers κ t /κ V . The signal diphoton mass shape is found to be independent of κ t /κ V .
The dependence of the signal acceptance and efficiency on κ t /κ V is implemented in the same statistical framework as that of Ref. [22], modifying the signal only in the ttH categories.
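The sensitivity of the tH production rate to the relative sign of κ t and κ V comes from an interference term, which a quadratic parametrization makes explicit. The sketch below is illustrative only: the function name is hypothetical, and the numerical coefficients are an assumption (they are not given in this section and are quoted from memory of a commonly used 13 TeV parametrization), so they should be checked against the equations given earlier in the paper before any use.

```python
def quadratic_scale(kappa_t, kappa_v, a, b, c):
    """Cross section relative to the SM for a process with two interfering
    amplitudes: a*kt^2 + b*kV^2 + c*kt*kV, with a + b + c = 1."""
    return a * kappa_t**2 + b * kappa_v**2 + c * kappa_t * kappa_v


# ASSUMPTION: illustrative coefficients, not taken from this section.
THQ_COEFFS = (2.63, 3.58, -5.21)
THW_COEFFS = (2.91, 2.31, -4.22)

for label, coeffs in (("tHq", THQ_COEFFS), ("tHW", THW_COEFFS)):
    sm_like = quadratic_scale(1.0, 1.0, *coeffs)
    inverted = quadratic_scale(-1.0, 1.0, *coeffs)
    print(f"{label}: SM-like scale = {sm_like:.2f}, inverted-coupling scale = {inverted:.2f}")
```

With a negative interference coefficient, flipping the sign of κ t turns destructive interference into constructive interference, which is why an inverted top quark Yukawa coupling would strongly enhance the tH rate.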

Results and interpretation
The different discriminator output distributions in the multilepton and single-lepton + bb channels and the γγ invariant mass distributions in the diphoton channel are compared to the data in a combined maximum likelihood fit for various assumptions on the signal kinematics and normalizations, and are used to derive constraints on the signal yields.
The event selections in the different channels are mutually exclusive, therefore allowing a straightforward combination. Common systematic uncertainties such as the integrated luminosity normalization, the b tagging uncertainties, and the theoretical uncertainties related to the signal modeling are taken to be fully correlated among different channels.
A profile likelihood scan is performed as a function of the coupling modifier κ t , which affects the production cross sections of the three signal components tHq, tHW, and ttH, as well as the Higgs boson branching fractions. Effects on Higgs boson decays via fermion and boson loops to γγ, Zγ, and gluon-gluon final states also affect the branching fractions in other channels. Furthermore, the kinematic properties of the two tH processes and thereby the shape of the classifier outputs entering the fit depend on the value of κ t . Systematic uncertainties are included in the form of nuisance parameters in the fit and are treated via the frequentist paradigm, as described in Refs. [64,65]. Uncertainties affecting the normalization are constrained either by Γ-function distributions, if they are statistical in origin, or by log-normal probability density functions. Systematic uncertainties that affect both the normalization and shape in the discriminating observables are included in the fit using the technique detailed in Ref. [66], and represented by Gaussian probability density functions. Table 9 shows the impact of the most important groups of nuisance parameters on the tH + ttH signal yield. Pre-fit systematic uncertainties of the same groups are shown for comparison.

Figure 12: Acceptance and selection efficiency for the tHq (red) and tHW (blue) signal processes as a function of κ t /κ V , for the ttH leptonic (solid lines) and ttH hadronic categories (dashed lines) of the H → γγ measurement.

To derive constraints on κ t for a fixed value of κ V = 1.0, a scan of the likelihood ratio L(κ t )/L(κ̂ t ) is performed, where κ̂ t is the best fit value of κ t . Figure 13 shows the negative of twice the logarithm of this likelihood ratio (−2∆ ln (L)), for scans on the data, and for an Asimov data set [67] with SM expectations for ttH and tH. On this scale, a 95% confidence [...]

An excess of observed over expected events is seen both in the multilepton and γγ channels, with a combined significance of about two standard deviations. Consequently, floating a signal strength modifier (defined as the ratio of the fitted signal cross section to the SM expectation) of a combined tH + ttH signal yields a best fit value of 2.00 ± 0.53 under the SM hypothesis. These results are in agreement with those from the dedicated ttH searches [6], as expected, since they share a large fraction of events with the data set used here.
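The likelihood-ratio scan over κ t described above can be illustrated with a toy single-bin counting experiment. This is a sketch under strong assumptions, not the analysis fit: no nuisance parameters are profiled, the yields and the quadratic signal scaling are invented for illustration, and the threshold of 3.84 is the standard asymptotic 95% quantile of a χ² distribution with one degree of freedom.

```python
import math


def poisson_nll(n_obs, expected):
    """Poisson negative log-likelihood, dropping the constant ln(n_obs!) term."""
    return expected - n_obs * math.log(expected)


def toy_signal_scale(kappa_t):
    """Invented quadratic scaling with minimum 1.0 at kappa_t = 1 (illustration only)."""
    return 2.5 * kappa_t**2 + 3.5 - 5.0 * kappa_t


def likelihood_scan(n_obs, background, signal_sm, grid):
    """Return -2 * Delta ln L at each grid point, relative to the best grid point."""
    nll_values = [
        poisson_nll(n_obs, background + signal_sm * toy_signal_scale(k)) for k in grid
    ]
    best = min(nll_values)
    return [2.0 * (value - best) for value in nll_values]


# Asimov-like toy: the observed count equals the SM expectation (invented numbers).
background, signal_sm = 100.0, 10.0
n_obs = background + signal_sm
grid = [i / 10 for i in range(-20, 21)]
scan = likelihood_scan(n_obs, background, signal_sm, grid)

# Grid points compatible with the toy data at 95% CL in the asymptotic approximation.
allowed = [k for k, q in zip(grid, scan) if q < 3.84]
```

In the real fit, each scan point would additionally minimize the likelihood over all nuisance parameters, and the signal shapes as well as the yields would change with κ t .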
To establish limits on tH production, a different signal strength parameter is introduced for the combination of tHq and tHW, not including ttH. A maximum likelihood fit for this signal strength is then performed based on the profile likelihood test statistic [64,65] at fixed points of κ t . Upper limits on the signal strength are then derived using the CL s method [68,69] and using asymptotic formulae for the distribution of the test statistic [67]. They are multiplied by the κ t -dependent tH production cross section times the combined Higgs boson branching fractions to WW * + ττ + ZZ * + bb + γγ and are shown in Fig. 14. Limits for the SM and for a scenario with κ t = −1.0 for the individual channels are shown in Table 10. The ttH contribution is kept fixed to its κ t -dependent expectation. The fiducial cross section for SM-like tH production is limited to about 1.9 pb, with an expected limit of 0.9 pb, corresponding, respectively, to about 25 and 12 times the expected cross section times branching fraction in the combination of the channels explored. The significant discrepancy between observed and expected limits around κ t = 0.0 is caused by the fact that the predicted ttH cross section vanishes while the data favor even larger than expected yields for ttH in both the γγ and multilepton channels.
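The quoted factors of about 25 and 12 follow from dividing the limits by the expected SM product of tH cross section and branching fractions, 0.077 pb, as given in the caption of Table 10. A short check of this arithmetic (variable names are illustrative):

```python
# Values in pb, taken from the text and the Table 10 caption.
sm_xsec_times_br = 0.077
observed_limit = 1.9
expected_limit = 0.9

observed_ratio = observed_limit / sm_xsec_times_br
expected_ratio = expected_limit / sm_xsec_times_br

print(f"observed limit / SM expectation = {observed_ratio:.1f}")
print(f"expected limit / SM expectation = {expected_ratio:.1f}")
```

Both ratios round to the values quoted in the text, since the limits themselves are given to two significant figures.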

Summary
Events from proton-proton collisions at √ s = 13 TeV compatible with the production of a Higgs boson (H) in association with a single top quark (t) have been studied to derive constraints on the magnitude and relative sign of Higgs boson couplings to top quarks and vector bosons. [...] at the 95% confidence level. An excess of events compared with the expected backgrounds is compatible with the standard model expectation of tH + ttH production.

Table 10: Expected and observed 95% CL upper limits on the tH production cross section times H → WW * + ττ + ZZ * + bb + γγ branching fraction for a scenario of inverted couplings (κ t = −1.0, top rows) and for an SM-like signal (κ t = 1.0, bottom rows), in pb. The expected limit is calculated on a background-only data set, i.e., without tH contribution, but including a κ t -dependent contribution from the ttH production. The ttH normalization is kept fixed in the fit, while the tH signal strength is allowed to float. Limits can be compared to the expected product of tH cross sections and branching fractions of 0.83 and 0.077 pb for the inverted top quark Yukawa coupling and for the SM, respectively.
[7] ATLAS Collaboration, "Observation of Higgs boson production in association with a top quark pair at the LHC with the ATLAS detector", Phys.
[11] CMS Collaboration, "Precise determination of the mass of the Higgs boson and tests of compatibility of its couplings with the standard model predictions using proton collisions at 7 and 8 TeV", Eur. Phys. J. C 75 (2015) 212, doi:10.1140/epjc/s10052-015-3351-7, arXiv:1412.8662.