Measurement of the production cross section for a Higgs boson in association with a vector boson in the H → WW* → ℓνℓν channel in pp collisions at √s = 13 TeV with the ATLAS detector

A measurement of the Higgs boson production cross sections via associated WH and ZH production using H → WW* → ℓνℓν decays, where ℓ stands for either an electron or a muon, is presented. Results for combined WH and ZH production are also presented. The analysis uses events produced in proton–proton collisions collected with the ATLAS detector at the Large Hadron Collider in 2015 and 2016. The data correspond to an integrated luminosity of 36.1 fb$^{-1}$ recorded at a centre-of-mass energy of 13 TeV. The products of the H → WW* branching fraction times the WH and ZH cross sections are measured to be $0.67^{+0.31}_{-0.27}\,\text{(stat.)}\,^{+0.18}_{-0.14}\,\text{(syst.)}$ pb and $0.54^{+0.31}_{-0.24}\,\text{(stat.)}\,^{+0.15}_{-0.07}\,\text{(syst.)}$ pb respectively, in agreement with the Standard Model predictions.
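As a rough illustration of how the quoted statistical and systematic components relate to an overall uncertainty, the sketch below adds them in quadrature. This is an assumption made only for illustration: it treats the two components as uncorrelated, and it is not necessarily how the total uncertainty is obtained in the analysis. The function name `total_uncertainty` is hypothetical.

```python
import math

def total_uncertainty(stat, syst):
    """Combine two uncertainty components in quadrature (assumes no correlation)."""
    return math.hypot(stat, syst)

# Values quoted in the abstract, in pb:
# (central value, +stat, -stat, +syst, -syst)
results = {
    "WH": (0.67, 0.31, 0.27, 0.18, 0.14),
    "ZH": (0.54, 0.31, 0.24, 0.15, 0.07),
}

for channel, (central, stat_up, stat_dn, syst_up, syst_dn) in results.items():
    up = total_uncertainty(stat_up, syst_up)
    dn = total_uncertainty(stat_dn, syst_dn)
    print(f"{channel}: {central:.2f} +{up:.2f} -{dn:.2f} pb (quadrature sum)")
```

The statistical component dominates in both channels, so the quadrature sum stays close to the statistical uncertainty alone.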


Introduction
Higgs boson production in association with a W or Z boson, denoted by WH and ZH respectively and collectively referred to as VH associated production in the following, provides direct access to the Higgs boson couplings to weak bosons. In particular, in the WH mode with subsequent H → WW* decay, the Higgs boson couples only to W bosons, at both the production and decay vertices. This paper presents a measurement of the corresponding production cross sections through the decay H → WW* → ℓνℓν. The analysis is performed using events with three (3ℓ) or four (4ℓ) charged leptons (electrons or muons) in the final state, targeting the WH and ZH channels respectively. Leptonic decays of τ leptons, from H → WW* → τντν or H → WW* → τνℓν or from the associated vector bosons, are considered as signal, while no specific selection is performed for events with hadronically decaying τ leptons in the final state. Events from VH production with H → ττ are considered as background. The leading-order Feynman diagrams for the WH and ZH production processes are depicted in Fig. 1.

E-mail address: atlas.publications@cern.ch.
In the W H channel, multivariate discriminants are used to maximise the sensitivity to the Higgs boson signal, while in the Z H channel the analysis is performed through selection requirements. The distribution of these W H discriminants, together with event counts in background control regions and the signal regions in the Z H channel, are combined in a binned maximum-likelihood fit to extract the signal yield and the background normalisations. The maximum-likelihood fit provides results for the W H and the Z H channels separately and for their combination V H, assuming the Standard Model (SM) prediction for the relative cross sections of the two production processes.

ATLAS detector
The ATLAS experiment [10-12] is a multi-purpose particle detector with a forward-backward symmetric cylindrical geometry and a near 4π coverage in solid angle.¹ It consists of an in-

¹ ATLAS uses a right-handed coordinate system with its origin at the nominal interaction point (IP) in the centre of the detector and the z-axis along the beam pipe. The x-axis points from the IP to the centre of the LHC ring, and the y-axis points upwards. Cylindrical coordinates (r, φ) are used in the transverse plane, φ being the azimuthal angle around the z-axis. The pseudorapidity is defined in terms of the polar angle θ as $\eta = -\ln\tan(\theta/2)$. Angular distance is measured in units of $\Delta R \equiv \sqrt{(\Delta\eta)^2 + (\Delta\phi)^2}$. Transverse momentum and energy are defined as $p_T = p\sin\theta$ and $E_T = E\sin\theta$ respectively.
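The coordinate definitions above translate directly into a few helper functions. The following is a minimal sketch; the function names are illustrative and not part of any ATLAS software, and the wrapping of the azimuthal difference into [-π, π) is a standard convention assumed here rather than stated in the text.

```python
import math

def pseudorapidity(theta):
    """eta = -ln tan(theta / 2), with theta the polar angle in radians."""
    return -math.log(math.tan(theta / 2.0))

def delta_phi(phi1, phi2):
    """Azimuthal difference wrapped into [-pi, pi) (assumed convention)."""
    return (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi

def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance Delta R = sqrt(Delta eta^2 + Delta phi^2)."""
    return math.hypot(eta1 - eta2, delta_phi(phi1, phi2))

def transverse_momentum(p, theta):
    """p_T = p sin(theta)."""
    return p * math.sin(theta)

# A particle emitted perpendicular to the beam axis has eta close to zero.
print(pseudorapidity(math.pi / 2.0))
```

Wrapping the azimuthal difference matters for objects on either side of φ = ±π, which are close in angle even though their raw φ values differ by almost 2π.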

Table 1
Monte Carlo generators used to model the signal and background processes. Alternative generators, underlying event and parton-showering models, used to estimate systematic uncertainties, are shown in parentheses. In the last column the prediction order for the total cross section is shown. "Pythia6" refers to version 6.428, "Pythia8" refers to versions 8.210

The muon spectrometer surrounds the calorimeters and is based on three large air-core toroidal superconducting magnet systems that provide a field integral between 2.0 and 6.0 T m across most of the detector. The muon spectrometer includes a system of precision tracking chambers covering the region |η| < 2.7 and fast detectors for triggering within the range |η| < 2.4. A two-level trigger system is used to select events [13].

Signal and background Monte Carlo simulation
Monte Carlo (MC) event generators are used to model signal and background processes. All signal samples were generated with a Higgs boson mass of 125 GeV [14,15]. For most processes, separate programs were used to generate the hard scattering process and to model the underlying event and the parton showering (UEPS). A description of the MC samples is given in Table 1. They are normalised to cross-section predictions calculated with the QCD and electroweak (EW) orders specified in the last column of Table 1.
The qq → W H and qq → Z H processes were generated with Powheg-Box v2 [30] MiNLO interfaced to Pythia8 [31], with the AZNLO set of tuned parameters (tune) [32] for underlying event, showering and hadronisation. The gg → Z H process was simulated with Powheg-Box v2 + Pythia8 with the AZNLO tune for underlying event, showering and hadronisation. For the V H samples, the PDF4LHC15 parton distribution function (PDF) set [33] was used for the hard scattering process in Powheg-Box v2 and the CTEQ6L1 PDF set [34] was used for the parton showering in Pythia8. Herwig 7 [35], with the MMHT2014lo68cl PDF set [36], was used as an alternative parton-showering model for V H. The uncertainty due to the PDF choice is smaller than the uncertainty obtained by using Herwig as an alternative parton shower model (Section 7).
The gluon-gluon fusion (ggF) events were generated with Powheg-Box v2 NNLOPS [37] interfaced to Pythia8 with the AZNLO tune. The vector-boson fusion (VBF) events were generated with Powheg-Box v2, interfaced to Pythia8. For the ggF and VBF samples, the PDF4LHC15 PDF set was used for the hard scattering process in Powheg-Box v2 and the CTEQ6L1 PDF set was used for the parton showering in Pythia8. The contribution from the $t\bar{t}H$ and $tH$ production modes is negligible. The top-quark pair production ($t\bar{t}$) was simulated with Powheg-Box v2 [38] using the NNPDF 3.0 NNLO PDF set [39] and interfaced to Pythia8 using the NNPDF 2.3 PDF set [40] for parton showering, with the A14 tune [41]. For $t\bar{t}$ production, Sherpa [42] 2.2.1, with the NNPDF 3.0 PDF set, was used as an alternative generator while Herwig 7, with the MMHT2014lo68cl PDF set, was used as an alternative UEPS model. The single-top-quark production $Wt$ was generated with Powheg-Box v1 [23] interfaced to Pythia6 [43] for parton showering with the Perugia2012 tune [44]. EvtGen 1.2.0 [45] was used for the simulation of b-quark and c-quark decays. The $t\bar{t}W/Z$ and $tZ$ processes were generated at leading order (LO) with MG5_aMC@LO [25] version 2.2.2 ($t\bar{t}W/Z$) and 2.2.1 ($tZ$) interfaced to Pythia8 ($t\bar{t}W/Z$) and Pythia6 ($tZ$), using the NNPDF2.3 LO PDF set.
The qq/qg → V V * samples with final states ν and [46] were generated with Sherpa 2.2.2, with the exception of the Z Z * sample in the W H analysis for which Sherpa 2.1 was used; the CT10 PDF set [47] and the NNPDF 3.0 PDF set were used for versions 2.1 and 2.2.2, respectively. Powheg-Box v2 [48] was used as an alternative generator for V V * , with Herwig++, using the CTEQ6L1 PDF set, for parton showering. Among the loop-induced gg-initiated diboson processes, the only relevant process in this analysis is gg → Z Z * , for which a K -factor of 1.55 was used [28].
This process was simulated with Sherpa 2.1.1, using the CT10 PDF set.
The triboson V V V samples were generated with Sherpa 2.2.2 and the NNPDF 3.0 PDF set. MG5_aMC@NLO was used as an alternative generator for V V V , with Pythia8, using the NNPDF2.3 LO PDF set. The same PDF sets were used for the hard scattering and the parton showering in all the Sherpa samples described above.
All simulated samples include the effect of pile-up from multiple interactions in the same and neighbouring bunch crossings. This was achieved by overlaying minimum-bias events, simulated using Pythia8 with the A2 tune [49] and MSTW2008LO PDF set [50]. All samples were processed through the Geant4 [51] ATLAS detector simulation [52].

Event reconstruction
Candidate signal events are selected using triggers that require a single isolated lepton with minimum transverse momentum (p T ) thresholds between 24 GeV and 26 GeV for electrons and between 20 GeV and 26 GeV for muons, depending on the data-taking period. At least one of the leptons reconstructed offline is required to have triggered the event and to have a p T higher than the nominal trigger threshold by at least 1 GeV. The single-lepton trigger efficiencies on the plateau are approximately 70% for single muons with |η| < 1.05, 90% for single muons in the range 1.05 < |η| < 2.40 and greater than 90% for single electrons in the range |η| < 2.47. The trigger efficiency for the signal events, estimated after the preselection, is 94% for W H and 98.5% for Z H.
Selected events are required to have at least one primary vertex reconstructed from at least two associated tracks, each with transverse momentum $p_T > 400$ MeV, as described in Ref. [53]. If an event has more than one reconstructed primary vertex, the vertex with the largest track $\sum p_T^2$ is selected for the analysis. Electrons are reconstructed from clusters of energy deposits in the EM calorimeter matched to ID tracks, and are identified using criteria based on the calorimeter shower shape, the quality of the match between the track and the cluster and the amount of transition radiation emitted in the ID, as described in Ref. [54]. Electrons are required to satisfy |η| < 2.47, excluding 1.37 < |η| < 1.52, which corresponds to the transition region between the barrel and the endcap EM calorimeters. The efficiency for electron identification ranges from 88% to 94%, depending on electron $p_T$ and η.
Muons are reconstructed by combining ID and MS tracks with consistent trajectories and curvatures. An overall fit of hits from the ID track, energy loss in the calorimeter and the hits of the track in the muon system is used to form muon candidates, as described in Ref. [55]. The efficiency for muon identification is close to 95% over the full instrumented η range. To suppress particles misidentified as leptons, several identification requirements as well as impact parameter, calorimeter and track isolation criteria [54,55] are applied.

Event selection
In the WH channel, exactly three isolated leptons with $p_T > 15$ GeV are required with a total charge of ±1. The lepton with unique charge is labelled $\ell_0$, the lepton closest to $\ell_0$ in angular distance $\Delta R$ is labelled $\ell_1$, and the remaining lepton is labelled $\ell_2$. In signal events leptons $\ell_0$ and $\ell_1$ are most likely to originate from the H → WW* decay, with probabilities of 99% and 85% respectively.
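The lepton labelling in the WH channel can be sketched in a few lines. The `Lepton` class and the function names below are hypothetical and serve only to illustrate the rule stated above: the unique-charge lepton is labelled first, and the other two are ordered by their angular distance from it.

```python
import math
from dataclasses import dataclass

MIN_PT = 15.0  # GeV, lepton pT threshold quoted for the WH channel

@dataclass(frozen=True)
class Lepton:
    pt: float     # transverse momentum in GeV
    eta: float    # pseudorapidity
    phi: float    # azimuthal angle in radians
    charge: int   # +1 or -1

def delta_r(a, b):
    """Angular distance between two leptons, with the phi difference wrapped."""
    dphi = (a.phi - b.phi + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(a.eta - b.eta, dphi)

def label_wh_leptons(leptons):
    """Return (l0, l1, l2): unique-charge lepton, closest to it, remaining one."""
    if len(leptons) != 3:
        raise ValueError("exactly three leptons are required")
    if any(lep.charge not in (1, -1) or lep.pt <= MIN_PT for lep in leptons):
        raise ValueError("leptons must have charge +-1 and pT above threshold")
    total_charge = sum(lep.charge for lep in leptons)
    if abs(total_charge) != 1:
        raise ValueError("total charge must be +1 or -1")
    # With total charge +-1, the unique-charge lepton carries the opposite sign.
    l0 = next(lep for lep in leptons if lep.charge == -total_charge)
    others = [lep for lep in leptons if lep is not l0]
    l1, l2 = sorted(others, key=lambda lep: delta_r(l0, lep))
    return l0, l1, l2
```

Because the total charge is ±1, exactly one lepton has the opposite sign to the sum, so no explicit counting of charges is needed.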
The most prominent background processes to the WH channel are WZ/Wγ* production and top-quark processes with either three prompt leptons, e.g. $t\bar{t}V$, or two prompt leptons and one non-prompt lepton from a b-hadron decay, e.g. $t\bar{t}$. Other important background processes are ZZ* (including Zγ*), Zγ and Z+jets production; they may satisfy the signal selection requirements if a lepton is undetected, in the case of ZZ*, or if they contain a misidentified or non-prompt lepton, in the case of Zγ and Z+jets production. Processes with three prompt leptons in the final state such as tribosons, in particular WWW, also contribute to the background. Contributions from background processes that include more than one misidentified lepton, such as W+jets production and inclusive $b\bar{b}$ pair production, are negligible. The background from top-quark production is suppressed by vetoing events if they contain any b-tagged jet. The analysis of the WH channel separates events with at least one same-flavour opposite-sign charge (SFOS) lepton pair from events with zero SFOS lepton pairs, which have different signal-to-background ratios. Due to the presence of Z → ℓℓ decays as a dominant background, the former is hereafter referred to as the Z-dominated category, while the latter is referred to as the Z-depleted category. In the Z-dominated category, the major background processes are those involving Z bosons. Events are vetoed if they contain more than one jet. This requirement further suppresses top-quark events with an additional non-prompt lepton from b-hadron decays. In order to select final states with neutrinos, $E_T^{\mathrm{miss}}$ is required to be above 30 GeV.

The WZ/Wγ* and top-quark background processes are normalised with the normalisation factors from the control region analysis (Table 4). The Z+jets and Zγ background processes are estimated with the data-driven technique described in Section 6.
The invariant masses $m_{\ell\ell}$ of all SFOS pairs are required to satisfy a Z-veto selection: $|m_{\ell\ell} - m_Z| > 25$ GeV. The last two requirements suppress WZ/Wγ* and ZZ* events, and improve the Z+jets rejection. In order to suppress background events from heavy-flavour quarkonia, the smallest invariant mass of SFOS pairs is required to be greater than 12 GeV. The signal efficiency of this selection with respect to the preselection is 34.6%. A discriminant based on a boosted decision tree (BDT) [63,64] is used to achieve a further separation between signal and background processes. The main purpose of the multivariate classifier, named $\mathrm{BDT}_{Z\mathrm{dom}}$, is to distinguish between signal and the dominant WZ/Wγ* and ZZ* background processes, and hence it is trained against these two background processes. The $\mathrm{BDT}_{Z\mathrm{dom}}$ uses seven input variables. They are the magnitude of the vectorial sum of lepton transverse momenta ($|\sum_i \vec{p}_T^{\,i}|$), the invariant masses of the first lepton pair ($m_{\ell_0\ell_1}$) and of the three leptons ($m_{\ell\ell\ell}$), the angular distance $\Delta R_{\ell_0\ell_1}$, $E_T^{\mathrm{miss}}$ and the pseudorapidity separation between the leptons with the same charge ($\Delta\eta_{\ell_1\ell_2}$). Moreover, the BDT uses the transverse mass built from the $p_T^{\mathrm{miss}}$ and the lepton $\ell_W$, which is the lepton not belonging to the SFOS lepton pair with invariant mass closer to the Z boson mass, and could be either $\ell_1$ or $\ell_2$. Fig. 2 shows the distribution of $\Delta R_{\ell_0\ell_1}$, which is the most powerful variable in the $\mathrm{BDT}_{Z\mathrm{dom}}$ training, and the $\mathrm{BDT}_{Z\mathrm{dom}}$ distribution in the Z-dominated category, before applying any selection requirement on the $\mathrm{BDT}_{Z\mathrm{dom}}$ score.
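The cut-based part of the Z-dominated selection (b-jet veto, at most one jet, missing transverse momentum above 30 GeV, Z veto and low-mass veto on SFOS pairs) can be sketched as follows. This is an illustration under stated assumptions: leptons are plain dictionaries with hypothetical keys, lepton masses are neglected in the invariant mass, and the numerical value of the Z boson mass is the world-average value, which is not quoted in the text.

```python
import math
from itertools import combinations

M_Z = 91.1876  # GeV; assumed world-average value, not quoted in the text

def invariant_mass(a, b):
    """Two-lepton invariant mass in the massless approximation."""
    m2 = 2.0 * a["pt"] * b["pt"] * (
        math.cosh(a["eta"] - b["eta"]) - math.cos(a["phi"] - b["phi"])
    )
    return math.sqrt(max(m2, 0.0))

def sfos_pairs(leptons):
    """All same-flavour opposite-sign lepton pairs in the event."""
    return [
        (a, b)
        for a, b in combinations(leptons, 2)
        if a["flavour"] == b["flavour"] and a["charge"] == -b["charge"]
    ]

def passes_z_dominated_cuts(leptons, n_jets, n_bjets, met):
    """Apply the selection requirements described in the text (before the BDT)."""
    pairs = sfos_pairs(leptons)
    if not pairs:
        return False          # zero SFOS pairs: Z-depleted category instead
    if n_bjets > 0 or n_jets > 1:
        return False          # b-jet veto and at most one jet
    if met <= 30.0:
        return False          # missing transverse momentum above 30 GeV
    masses = [invariant_mass(a, b) for a, b in pairs]
    if any(abs(m - M_Z) <= 25.0 for m in masses):
        return False          # Z veto applied to every SFOS pair
    return min(masses) > 12.0  # reject low-mass quarkonia
```

The Z veto is applied to every SFOS pair rather than only the best one, following the statement that all SFOS pair masses must satisfy it.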
The Z-dominated signal region (SR), defined as the events with high-ranking $\mathrm{BDT}_{Z\mathrm{dom}}$ score ($\mathrm{BDT}_{Z\mathrm{dom}} > 0.3$), is divided into three bins with increasing sensitivity: $0.3 \le \mathrm{BDT}_{Z\mathrm{dom}} < 0.5$, $0.5 \le \mathrm{BDT}_{Z\mathrm{dom}} < 0.7$ and $0.7 \le \mathrm{BDT}_{Z\mathrm{dom}} < 1.0$. This binning is the result of an optimisation to find the minimal number of BDT bins that gives the highest sensitivity. The expected signal-to-background ratio in these bins is about 0.05, 0.09 and 0.19, respectively. The full Z-dominated event selection is summarised in Table 2.
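The assignment of an event to one of the three Z-dominated SR bins follows directly from the bin edges quoted above. The short sketch below uses hypothetical names; only the edges and the approximate signal-to-background ratios are taken from the text.

```python
# Bin edges of the BDT_Zdom score in the Z-dominated signal region.
ZDOM_EDGES = (0.3, 0.5, 0.7, 1.0)
# Approximate expected signal-to-background ratio per bin, as quoted in the text.
ZDOM_EXPECTED_S_OVER_B = (0.05, 0.09, 0.19)

def zdom_sr_bin(score):
    """Return the SR bin index (0, 1 or 2), or None if the score is outside the SR."""
    for index, (low, high) in enumerate(zip(ZDOM_EDGES[:-1], ZDOM_EDGES[1:])):
        if low <= score < high:
            return index
    return None

for score in (0.1, 0.35, 0.6, 0.9):
    index = zdom_sr_bin(score)
    if index is None:
        print(f"score {score}: outside the signal region")
    else:
        print(f"score {score}: bin {index}, expected S/B about {ZDOM_EXPECTED_S_OVER_B[index]}")
```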
In the Z-depleted category, the two major background processes are WZ/Wγ* and $t\bar{t}$. WZ/Wγ* has the same signature as the signal, namely three prompt leptons, while $t\bar{t}$ contains a misidentified or non-prompt lepton from a jet. Two separate BDTs, named $\mathrm{BDT}_{WZ}$ and $\mathrm{BDT}_{t\bar{t}}$, are trained to allow an optimal background rejection. The $\mathrm{BDT}_{WZ}$ uses 11 input variables, of which three common to the $\mathrm{BDT}_{Z\mathrm{dom}}$ are $m_{\ell_0\ell_1}$, $E_T^{\mathrm{miss}}$ and $\Delta\eta_{\ell_1\ell_2}$; the other variables are the transverse momenta of the three leptons ($p_T^{\ell_0}$, $p_T^{\ell_1}$, $p_T^{\ell_2}$), the transverse mass ($m_T^{\ell_0\ell_1}$) built from $\ell_0$, $\ell_1$ and $p_T^{\mathrm{miss}}$, the invariant mass of the electrons with same-sign charge ($m_{ee}$), the transverse impact parameter significance of the lepton with lowest $p_T$ ($|d_{0,\mathrm{sig,min}}|$), the transverse impact parameter significance of the lepton with second-lowest $p_T$ and opposite charge with respect to the lepton with lowest $p_T$ ($|d_{0,\mathrm{sig,mid}}|$) and the compatibility of the event with the WZ hypothesis.²

A definition of the most likely lepton from heavy-flavour decays ($\ell_{\mathrm{HFL}}$) is crucial for an optimal performance of the $\mathrm{BDT}_{t\bar{t}}$. For this purpose, a $\mathrm{BDT}_{\mathrm{HFL}}$ is trained purely on data using track and calorimeter isolation as well as impact-parameter variables as input. The lepton with the minimal $\mathrm{BDT}_{\mathrm{HFL}}$ output is selected as $\ell_{\mathrm{HFL}}$. The $\mathrm{BDT}_{t\bar{t}}$ uses nine input variables, of which two common to the $\mathrm{BDT}_{Z\mathrm{dom}}$ and $\mathrm{BDT}_{WZ}$ are $m_{\ell_0\ell_1}$ and $\Delta\eta_{\ell_1\ell_2}$, one common to the $\mathrm{BDT}_{Z\mathrm{dom}}$ is $\Delta R_{\ell_0\ell_1}$, and the other input variables are the number of jets ($N_{\mathrm{jet}}$), the transverse momentum of the leading jet ($p_T^{j_{\mathrm{lead}}}$), the invariant mass of the leptons with same-sign charge ($m_{\ell_1\ell_2}$), and three $\ell_{\mathrm{HFL}}$-related variables: its $\mathrm{BDT}_{\mathrm{HFL}}$ output, its transverse momentum, $p_T^{\mathrm{HFL}}$, and the invariant mass built from it together with the closest opposite-charge lepton ($m_{\ell_{\mathrm{HFL}}\ell_{\mathrm{cloc}}}$). Fig. 3 shows the outputs of $\mathrm{BDT}_{WZ}$ and $\mathrm{BDT}_{t\bar{t}}$ in the Z-depleted category, before applying any selection requirements on the BDT scores. The choice of input variables for the different BDTs was the result of an optimisation study in which several thousand BDTs using different sets of input variables were compared and the best-performing BDTs were selected for the final analysis.

² Given the reconstructed charged-lepton momenta and the $p_T^{\mathrm{miss}}$, the event kinematics can be calculated under the WZ with Z → ττ hypothesis and using the collinear approximation for the τ decays with one remaining unknown, e.g. the ratio of one τ energy to the energy of the lepton from this τ decay. This unknown is varied and the number of physical kinematic solutions is taken as a measure of the compatibility with the WZ hypothesis.
The full Z-depleted event selection is summarised in Table 2. The events with high-ranking BDT scores ($\mathrm{BDT}_{t\bar{t}} > 0.2$ and $\mathrm{BDT}_{WZ} > 0.15$) are used to define the Z-depleted SR. In this region, the BDT scores are used as discriminant variables in the statistical analysis, with three bins in $\mathrm{BDT}_{t\bar{t}}$, each being subdivided into two bins in $\mathrm{BDT}_{WZ}$ as shown in Table 3. The expected signal-to-background ratio in these bins ranges from about 0.07 in the first bin up to about 3.6 in the last bin. The signal efficiency with respect to the preselection, before any cut on the BDT scores, is 23.5%.
The ZH channel requires events to contain four isolated leptons with $p_T > 10$ GeV and total electric charge of zero. Events that contain a SFOS lepton pair with $m_{\ell\ell} < 10$ GeV are rejected to suppress the contamination from heavy-flavour quarkonia. Following this preselection, events are classified according to the number of SFOS lepton pairs: 1-SFOS and 2-SFOS. Events with no SFOS lepton pairs are not considered.

The Mis-Id background is estimated with the data-driven technique described in Section 6.
The reconstruction of the ZH process proceeds through the identification of the leptons from the Z boson, called $\ell_2$ and $\ell_3$, as the SFOS lepton pair with invariant mass closest to the Z boson mass, $m_Z$. Then, the remaining two leptons, labelled $\ell_0$ and $\ell_1$, are candidates for originating from the Higgs boson decay. The background in the ZH channel is almost exclusively due to ZZ* production. This constitutes ∼92% of the total background after the preselection is applied. Processes with four prompt leptons in the final state such as triboson production, in particular ZWW, which has the same signature as the signal, and $t\bar{t}Z$ also contribute to the background. Other background processes such as WZ/Wγ*, Z+jets and $tV$ may contribute when at least one jet, hadron or a converted photon is misidentified as a lepton.
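The pairing of leptons in the ZH channel can be sketched as a search over SFOS pairs for the one closest in mass to the Z boson. The sketch below makes several assumptions: leptons are dictionaries with hypothetical keys, lepton masses are neglected, the Z boson mass value is the world average (not quoted in the text), and no ordering within each returned pair is implied.

```python
import math
from itertools import combinations

M_Z = 91.1876  # GeV; assumed world-average value, not quoted in the text

def invariant_mass(a, b):
    """Two-lepton invariant mass in the massless approximation."""
    m2 = 2.0 * a["pt"] * b["pt"] * (
        math.cosh(a["eta"] - b["eta"]) - math.cos(a["phi"] - b["phi"])
    )
    return math.sqrt(max(m2, 0.0))

def pair_zh_leptons(leptons):
    """Split four leptons into (Z-candidate pair, Higgs-candidate pair).

    Returns None if the event contains no SFOS pair.
    """
    if len(leptons) != 4:
        raise ValueError("exactly four leptons are required")
    candidates = []
    for i, j in combinations(range(4), 2):
        a, b = leptons[i], leptons[j]
        if a["flavour"] == b["flavour"] and a["charge"] == -b["charge"]:
            candidates.append((abs(invariant_mass(a, b) - M_Z), i, j))
    if not candidates:
        return None
    _, i, j = min(candidates)  # SFOS pair with mass closest to m_Z
    z_pair = (leptons[i], leptons[j])
    higgs_pair = tuple(lep for k, lep in enumerate(leptons) if k not in (i, j))
    return z_pair, higgs_pair
```

Events for which the function returns `None` correspond to the zero-SFOS case, which the analysis does not consider.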
In order to suppress the $t\bar{t}Z$ process, events containing b-tagged jets are rejected and at most one and two jets are allowed in 2-SFOS and 1-SFOS classes, respectively. To reduce the ZZ* background process in events with two SFOS lepton pairs, a threshold of 45 GeV is applied to the $E_T^{\mathrm{miss}}$

Background estimation
The main background contamination originates from processes with the same final state as the signal, namely diboson production (WZ/Wγ*, ZZ*), top-quark processes with three or four prompt leptons such as $t\bar{t}V$, and triboson production. Other relevant background contributions arise from processes, such as $t\bar{t}$ or Z+jets, where the reconstructed leptons either originate from non-prompt decays of heavy-flavour hadrons or from jets misidentified as leptons.
Two dedicated regions, hereafter named control regions (CRs), are used to estimate the normalisation factors (NFs) of the main prompt background processes by fitting the expected yields from simulation to data: W Z/W γ * for the W H channel and Z Z * for the Z H channel in the 2-SFOS SR. In the 1-SFOS SR, Z Z * is estimated purely from simulation.
The CRs are made orthogonal to the corresponding SRs by inverting various selection criteria with respect to the SR definitions. The WZ CR is defined by reversing the Z-veto in the Z-dominated WH signal region. To improve the purity of the WZ CR, the minimum $E_T^{\mathrm{miss}}$ is increased from 30 GeV to 50 GeV. The ZZ CR is defined by inverting the $m_{\ell_0\ell_1}$ requirement defined in the ZH 2-SFOS SR. In order to increase the number of events, the $E_T^{\mathrm{miss}}$, $\Delta\phi^{\mathrm{boost}}_{\ell_0\ell_1}$, $p_T^{4\ell}$, and $m_{4\ell}$ requirements are removed. Table 4 summarises the event selection for the WZ and ZZ CRs and the NFs for the background processes, obtained from the fit described in

The background contributions with misidentified leptons are estimated using different techniques. The top-quark background in the WH channels is normalised using a CR (top-quark CR) defined by requiring at least one b-tagged jet. To improve the purity of the top-quark CR, the minimum $E_T^{\mathrm{miss}}$ is increased from 30 GeV to 50 GeV if at least one SFOS pair is present in the event. Processes with one misidentified lepton ($t\bar{t}$ and $Wt$) constitute 94% of the top-quark CR; the remaining events contain three prompt leptons from $t\bar{t}W$ decays. The full selection requirements applied to define the top-quark CR and the measured NF are also summarised in Table 4. Fig. 5(c) shows the $m_{\ell_0\ell_1}$ distribution in the top-quark CR as obtained from the final fit.

Table 5
Post-fit predictions and data yields in the four SRs. The uncertainties include those from the sample statistics, and the theoretical and experimental systematic uncertainties. The sum of the single contributions may differ from the total value due to rounding. Moreover, the total uncertainty differs from the sum in quadrature of the single-process uncertainties due to the correlations. "Other Higgs" contains Higgs production mechanisms and decay processes different from VH and H → WW* respectively, except for the H → ZZ* contribution which is included in the "ZZ*" row. The corresponding prefit predictions for the background processes differ only by the normalisation factors listed in Table 4.

A control sample where one or two of the lepton candidates fail to meet the nominal identification or isolation criteria but satisfy looser identification criteria, referred to as anti-identified leptons, is used. The contribution from misidentified leptons in the SR is then obtained by scaling the number of events in the control sample by extrapolation factors measured in a data sample enriched in Z+jets events. The latter is obtained by selecting events with two prompt leptons from a Z boson decay and a loosely identified lepton considered to be the misidentified lepton candidate. The extrapolation factors are defined as the ratio of the number of misidentified lepton candidates that pass the nominal identification criteria to the number that pass the anti-identification criteria. In both the control sample and the data samples enriched in Z+jets events, the contribution from background events not estimated with this method is subtracted using MC expectations. Details of this method can be found in Ref. [67]. The uncertainty in the data-driven background processes described in this section includes the statistical uncertainty in the Z+jets enriched sample, the uncertainty from Z+jets MC modelling, and the theory uncertainty from the subtraction of other processes. For the ZH channel, which can have events with two prompt leptons and two misidentified leptons, the uncertainty in the extrapolation from the control sample to the SRs is also included.
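The arithmetic of the extrapolation-factor method can be sketched as follows. This is a simplified illustration with hypothetical function and argument names: it covers the case of a single anti-identified lepton, uses one inclusive extrapolation factor with no dependence on lepton kinematics or flavour, and the event counts in the example are invented for illustration rather than taken from the analysis.

```python
def extrapolation_factor(n_identified, n_anti_identified,
                         mc_other_identified=0.0, mc_other_anti_identified=0.0):
    """Ratio of misidentified-lepton candidates passing the nominal criteria
    to those passing the anti-identification criteria, measured in the
    Z+jets-enriched sample after subtracting other processes using MC."""
    numerator = n_identified - mc_other_identified
    denominator = n_anti_identified - mc_other_anti_identified
    if denominator <= 0.0:
        raise ValueError("no anti-identified candidates left after subtraction")
    return numerator / denominator

def misidentified_yield(n_control, mc_other_control, factor):
    """Scale the MC-subtracted control-sample yield into the signal region."""
    return (n_control - mc_other_control) * factor

# Invented counts, for illustration only.
factor = extrapolation_factor(30.0, 200.0, mc_other_identified=10.0)
estimate = misidentified_yield(n_control=55.0, mc_other_control=5.0, factor=factor)
print(f"extrapolation factor = {factor:.3f}, estimated yield = {estimate:.2f}")
```

In practice the MC subtraction is what brings the theory uncertainty mentioned above into the estimate, since the subtracted yields depend on predicted cross sections.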

Systematic uncertainties
The systematic uncertainties can be categorised into those arising from experimental sources and those from theoretical sources. The dominant experimental uncertainties come from the misidentification of leptons (see Section 6), the mismodelling of the impact-parameter significance, the b-tagging efficiency [61] and the jet energy scale and resolution [58]. Other sources of uncertainty are due to the modelling of pile-up, the calibration of the missing transverse momentum measurement [62], and the luminosity measurement. The uncertainty in the combined 2015+2016 integrated luminosity is 2.1%. It is derived from the calibration of the luminosity scale using x-y beam-separation scans, following a methodology similar to that detailed in Ref.
The impact of the uncertainties on lepton energy (momentum) scale and resolution, and identification and isolation criteria [54,55,70] is negligible. The experimental uncertainties are varied in a correlated way across all background processes and all signal- and control-region bins, so that uncertainties in the extrapolation from control to signal regions are correctly propagated. The luminosity uncertainty is only applied to background processes that are normalised to theoretical predictions, and to the Higgs boson signal. The theoretical uncertainties are evaluated by comparing nominal and alternative event generators and UEPS models as described in Section 3 and by varying PDF sets and the QCD renormalisation and factorisation scales. The uncertainty due to the PDF choice for the signal process is 1% while the uncertainty obtained by using Herwig as an alternative parton shower model is 3–10%, depending on the signal region. All uncertainties are propagated through the full analysis chain and treated as being bin-dependent and region-dependent, i.e. potentially modifying not only the normalisation but also the shape of the BDT output distributions. Whenever the influence on the shape is found to be negligible, as in the case of the PDF and scale variations, only the normalisation uncertainties are used. A list of the systematic uncertainty sources and their impact on the cross-section measurement is shown later in Section 8.

Results
A binned likelihood function is constructed as a product of Poisson probability terms over the eleven bins of the different SRs defined in Section 5. The function has two independent scaling parameters: a signal strength parameter μ, defined as the ratio of the measured signal yield to that predicted by the SM, for each of the WH and the ZH processes. Additionally, one Poisson probability term is added for each CR to determine simultaneously the normalisation of the corresponding background processes. Systematic uncertainties enter as nuisance parameters in the likelihood function and their correlations are taken into account. The final results are obtained using the profile likelihood method [71]. The resulting post-fit prediction and data yields in the four SRs are shown in Table 5.

Fig. 8. The corresponding one-dimensional results, where the other parameter is left

Table 6
Breakdown of the main contributions to the total uncertainty in $\sigma_{WH} \cdot \mathcal{B}_{H \to WW^*}$ (left) and $\sigma_{ZH} \cdot \mathcal{B}_{H \to WW^*}$ (right). The individual sources of systematic uncertainties are grouped together. The sum in quadrature of the individual components differs from the total uncertainty due to correlations between the components. Systematic uncertainties that affect the shape of the fitted distribution are indicated by an asterisk.
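The structure of the binned Poisson likelihood described in this section can be illustrated with a heavily simplified sketch. The assumptions are substantial: a single signal strength μ instead of two, no control-region terms, no nuisance parameters, and invented toy yields. The minimisation uses a plain golden-section search rather than the profile likelihood machinery of the analysis, and all names are hypothetical.

```python
import math

def negative_log_likelihood(mu, signal, background, observed):
    """-ln L for a product of Poisson terms with expectation mu * s_i + b_i."""
    total = 0.0
    for s, b, n in zip(signal, background, observed):
        expected = mu * s + b
        if expected <= 0.0:
            return math.inf
        total += expected - n * math.log(expected) + math.lgamma(n + 1.0)
    return total

def fit_signal_strength(signal, background, observed, low=0.0, high=10.0, tol=1e-8):
    """Minimise -ln L in mu with a golden-section search (the function is convex in mu)."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = low, high
    while b - a > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if (negative_log_likelihood(c, signal, background, observed)
                < negative_log_likelihood(d, signal, background, observed)):
            b = d
        else:
            a = c
    return 0.5 * (a + b)

# Toy yields, invented for illustration only.
toy_signal = [2.0, 4.0]
toy_background = [10.0, 5.0]
toy_observed = [12, 9]
mu_hat = fit_signal_strength(toy_signal, toy_background, toy_observed)
print(f"fitted signal strength: {mu_hat:.3f}")
```

In the toy example the observed counts equal the signal-plus-background expectation in every bin, so the fitted signal strength comes out close to one.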

Acknowledgements
We thank CERN for the very successful operation of the LHC, as well as the support staff from our institutions without whom ATLAS could not be operated efficiently.
We acknowledge the support of ANPCyT, The crucial computing support from all WLCG partners is acknowledged gratefully, in particular from CERN, the ATLAS Tier-