1 Introduction

After the observation of a scalar particle compatible with the standard model (SM) Higgs boson by the ATLAS and CMS Collaborations in 2012 [1,2,3], the two experiments have focused on performing precision measurements of the properties of the new particle. The large data sample collected at the CERN LHC during the data taking periods through 2018 allowed the measurement of the Higgs boson quantum numbers and couplings to other SM particles with an unprecedented level of accuracy [4]. All results reported so far are compatible with the SM within the current uncertainties.

Among all the Higgs boson decay channels predicted by the SM, the one into a pair of W bosons has the second-largest branching fraction (\(\approx \) 22%), while benefitting from a lower background than the more probable decay into a pair of b quarks. This combination makes this channel one of the most sensitive for measuring the production cross section of the Higgs boson and its couplings to SM particles. This paper presents the measurement of the Higgs boson properties in the \(\text {H} \rightarrow \text {W} \text {W} \) decay channel targeting the gluon fusion (\(\text {g} \text {g} \text {H} \)) and vector boson fusion (VBF) production mechanisms, as well as associated production with a vector boson (V H, where V stands for either a W or a Z boson). The measurement uses final states with at least two charged leptons arising either from the associated vector boson or from the products of the \(\text {H} \rightarrow \text {W} \text {W} \) decays. In all cases, at least one of the W bosons originating from the Higgs boson is required to decay leptonically.

The properties of the Higgs boson are probed by measuring the inclusive cross sections for each production mechanism, as well as the production cross sections in finer phase spaces defined according to the simplified template cross section (STXS) framework [5]. In addition, measurements of the Higgs boson couplings to fermions and vector bosons are presented.

The analysis is based on proton-proton (\(\text {p} \text {p} \)) collision data produced at the LHC at \(\sqrt{s}=13\,\text {Te\hspace{-.08em}V} \) and collected by the CMS detector during 2016–2018, for a total integrated luminosity of about 138\(\,\text {fb}^{-1}\). This paper builds on previous analyses published by the CMS Collaboration in the \(\text {H} \rightarrow \text {W} \text {W} \) channel focused on the inclusive production cross section and coupling measurements at \(\sqrt{s}=7\), 8, and 13\(\,\text {Te\hspace{-.08em}V}\)  [6, 7], and on differential fiducial production cross section measurements at 8\(\,\text {Te\hspace{-.08em}V}\)  [8] and 13\(\,\text {Te\hspace{-.08em}V}\)  [9]. Similar measurements have also been reported in several Higgs boson decay channels by the ATLAS and CMS Collaborations [10,11,12,13,14].

The results reported in this paper show an overall improvement in measurement accuracy, owing to three factors: new analysis techniques specifically devised to increase the sensitivity to particular production mechanisms (e.g., VBF with a different-flavor lepton pair in the final state); the inclusion of channels that have not previously been investigated in Run 2, such as VBF and V H production with a same-flavor pair of charged leptons and a hadronically decaying V, and \(\text {Z} \text {H} \) production with a three-lepton final state; and the larger integrated luminosity analyzed. Moreover, \(\text {W} \text {H} \) production with two same-sign leptons is measured for the first time in CMS. Tabulated results are provided in the HEPData record for this analysis [15].

This paper is organized as follows. A brief overview of the CMS apparatus is given in Sect. 2. The data set and simulated samples used are described in Sect. 3. Sections 4–8 describe in detail the event selection and categorization strategy, as well as the discriminating variables used to target each final state. The estimation of the backgrounds is described in Sect. 9, and the sources of systematic uncertainty and their treatment are given in Sect. 10. Results are presented in Sect. 11. Finally, closing remarks are given in Sect. 12.

2 The CMS detector and event reconstruction

The CMS apparatus is a general purpose detector designed to tackle a wide range of measurements. The central feature of CMS is a superconducting solenoid of 6\(\text {\,m}\) internal diameter, providing a magnetic field of 3.8\(\text {\,T}\). Within the solenoid volume are a silicon pixel and strip tracker, a lead tungstate crystal electromagnetic calorimeter (ECAL), and a brass and scintillator hadron calorimeter (HCAL), each composed of a barrel and two endcap sections. Forward calorimeters extend the pseudorapidity (\(\eta \)) coverage provided by the barrel and endcap detectors. Muons are detected in gas-ionization chambers embedded in the steel flux-return yoke outside the solenoid.

The events of interest are selected using a two-tiered trigger system. The first level, composed of custom hardware processors, uses information from the calorimeters and muon detectors to select events at a rate of around 100\(\text {\,kHz}\) within a fixed latency of about 4\(\,\mu \text {s}\)  [16]. The second level, known as the high-level trigger (HLT), consists of a farm of processors running a version of the full event reconstruction software optimized for fast processing, and reduces the event rate to around 1\(\text {\,kHz}\) before data storage [17]. Events passing the trigger selection are stored for offline reconstruction. A more detailed description of the CMS detector, together with a definition of the coordinate system and the kinematic variables, can be found in Ref. [18].

Muons are identified and their momenta are measured in the range \(|\eta | < 2.4\) by matching tracks in the muon system and the silicon tracker. The single muon trigger efficiency exceeds 90% over the full \(\eta \) range, and the efficiency to reconstruct and identify muons is greater than 96%. The relative transverse momentum (\(p_{\textrm{T}}\)) resolution for muons with \(p_{\textrm{T}}\) up to 100\(\,\text {Ge\hspace{-.08em}V}\) is 1% in the barrel and 3% in the endcaps [19, 20].

Electrons are identified and their momenta are measured in the interval \(|\eta | < 2.5\) by combining tracks in the silicon tracker with spatially compatible energy deposits in the ECAL, also accounting for the energy of bremsstrahlung photons likely originating from the electron track. The single electron trigger efficiency exceeds 90% over the full \(\eta \) range. The efficiency to reconstruct and identify electrons ranges between 60 and 80% depending on the lepton \(p_{\textrm{T}}\). The momentum resolution for electrons with \(p_{\textrm{T}} \approx 45\,\text {Ge\hspace{-.08em}V} \) from \(\text {Z} \rightarrow \text {e} \text {e} \) decays ranges from 1.7 to 4.5% depending on the \(\eta \) region. The resolution is generally better in the barrel than in the endcaps and also depends on the bremsstrahlung energy emitted by the electron as it traverses the material in front of the ECAL [21].

In order to achieve better rejection of nonprompt leptons, increasing the sensitivity of the analysis, leptons are required to be isolated and well reconstructed by imposing a set of requirements on the quality of the track reconstruction, shape of calorimetric deposits, and energy flux in the vicinity of the particle trajectory. On top of these criteria, a selection on a dedicated multivariate analysis (MVA) tagger developed for the CMS \({{\text {t}} {}{\bar{\text {t}}}} \text {H} \) analysis [22], referred to as ttHMVA, is added in all analysis categories for muon candidates. In categories targeting the VH production modes with leptonically decaying V boson, it is found that adding a selection on the ttHMVA tagger for electrons improves the sensitivity of the analysis.

Multiple \(\text {p} \text {p} \) interaction vertices are identified from tracking information by use of the adaptive vertex fitting algorithm [23]. The primary vertex is taken to be the vertex corresponding to the hardest scattering in the event, evaluated using tracking information alone, as described in Section 9.4.1 of Ref. [24].

The particle-flow (PF) algorithm [25] aims to reconstruct and identify each individual particle in an event, with an optimized combination of information from the various elements of the CMS detector. The energy of muons is obtained from the curvature of the corresponding track. The energy of charged hadrons is determined from a combination of their momentum measured in the tracker and the matching ECAL and HCAL energy deposits, corrected for the response function of the calorimeters to hadronic showers. The energy of photons is obtained from the ECAL measurement. The energy of electrons is determined from a combination of the electron momentum at the primary interaction vertex as determined by the tracker, the energy of the corresponding ECAL cluster, and the energy sum of all bremsstrahlung photons spatially compatible with originating from the electron track. Finally, the energy of neutral hadrons is obtained from the corresponding corrected ECAL and HCAL energies.

Hadronic jets are reconstructed from PF objects using the infrared and collinear safe anti-\(k_{\textrm{T}}\) algorithm [26, 27] with a distance parameter of 0.4. The jet momentum is determined from the vector sum of all PF candidate momenta in the jet. From simulation, reconstructed jet momentum is found to be, on average, within 5–10% of the momentum of generator jets, which are jets clustered from all generator-level final-state particles excluding neutrinos, over the entire \(p_{\textrm{T}}\) spectrum and detector acceptance. Additional p p interactions within the same or nearby bunch crossings (pileup) can contribute additional tracks and calorimetric energy deposits to the jet momentum. To mitigate this effect, charged particles identified as originating from pileup vertices are discarded, and an offset correction is applied for remaining contributions from neutral pileup particles [25]. Jet energy corrections are derived from simulation studies so that the average measured response of jets becomes identical to that of generator jets. In situ measurements of the momentum imbalance in dijet, photon+jet, Z+jet, and multijet events are used to account for any residual differences in jet energy scale in data and simulation [28, 29]. The jet energy resolution amounts typically to 15% at 10\(\,\text {Ge\hspace{-.08em}V}\), 8% at 100\(\,\text {Ge\hspace{-.08em}V}\), and 4% at 1\(\,\text {Te\hspace{-.08em}V}\). Additional selection criteria are applied to each jet to remove jets potentially dominated by anomalous contributions from various subdetector components or reconstruction failures. Jets are measured in the range \(|\eta |<4.7\). In the analysis of data recorded in 2017, to eliminate spurious jets caused by detector noise, all jets in the range \(2.5<|\eta |<3.0\) were excluded [30].

Table 1 Trigger requirements on the data set used in the analysis

We refer to the identification of jets likely originating from b quarks as b tagging [31, 32]. For each jet in the event a score is calculated through a multivariate combination of different jet properties, making use of boosted decision trees (BDTs) and deep neural networks (DNNs). Jets are considered b tagged if their associated score exceeds a threshold, tuned to achieve a certain tagging efficiency as measured in \({{\text {t}} {}{\bar{\text {t}}}}\) events. Typically three thresholds, called working points (WPs) in the following, are provided, labeled loose, medium, and tight, corresponding to probabilities of mistagging a jet originating from a lighter quark as coming from a bottom quark of 10, 1, and 0.1%, respectively. Unless otherwise specified, the loose WP of the DeepCSV tagger is used throughout this paper.

The missing transverse momentum vector \({\vec {p}}_{\textrm{T}}^{\hspace{1.0pt}\text {miss}}\) is computed as the negative vector sum of the transverse momenta of all the PF candidates in an event, and its magnitude is denoted as \(p_{\textrm{T}} ^\text {miss}\)  [33]. The \({\vec {p}}_{\textrm{T}}^{\hspace{1.0pt}\text {miss}}\) is modified to account for corrections to the energy scale of the reconstructed jets in the event. The pileup per particle identification algorithm [34] is applied to reduce the pileup dependence of the \({\vec {p}}_{\textrm{T}}^{\hspace{1.0pt}\text {miss}}\) observable. The \({\vec {p}}_{\textrm{T}}^{\hspace{1.0pt}\text {miss}}\) is computed from the PF candidates weighted by their probability to originate from the primary interaction vertex [33].
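The definition above can be illustrated with a minimal sketch. The function name, the `(pt, phi, weight)` tuple layout, and the toy values are assumptions made for illustration only; the per-candidate weight stands in for the probability-based weighting mentioned in the text, and jet energy corrections are not included.

```python
import math

def missing_transverse_momentum(candidates):
    """Return (magnitude, phi) of the negative weighted vector sum of candidate pT.

    `candidates` is a list of (pt, phi, weight) tuples (hypothetical layout);
    a weight of 1.0 for every candidate reproduces the plain negative vector sum.
    """
    px = -sum(w * pt * math.cos(phi) for pt, phi, w in candidates)
    py = -sum(w * pt * math.sin(phi) for pt, phi, w in candidates)
    return math.hypot(px, py), math.atan2(py, px)

# Toy event (hypothetical values): two candidates back to back along the x axis.
toy_candidates = [(30.0, 0.0, 1.0), (10.0, math.pi, 1.0)]
ptmiss, ptmiss_phi = missing_transverse_momentum(toy_candidates)
print(f"pTmiss = {ptmiss:.1f} GeV")
```

In the toy event the imbalance points opposite to the harder candidate, as expected for a negative vector sum.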

3 Data sets and simulations

The data sets used in the analysis were recorded by the CMS detector in 2016, 2017, and 2018, corresponding to an integrated luminosity of 36.3, 41.5, and 59.7\(\,\text {fb}^{-1}\), respectively [35,36,37].

The events selected in the analysis are required to pass criteria based on HLT algorithms that require the presence of either one or two electrons or muons, satisfying isolation and identification requirements. For the 2016 data set, the single-electron trigger requires a \(p_{\textrm{T}}\) threshold of 25\(\,\text {Ge\hspace{-.08em}V}\) for electrons with \(|\eta |<2.1\) and 27\(\,\text {Ge\hspace{-.08em}V}\) for \(2.1<|\eta |<2.5\). For the single-muon trigger the \(p_{\textrm{T}}\) threshold is 24\(\,\text {Ge\hspace{-.08em}V}\) for \(|\eta |<2.4\). In the dielectron (dimuon) trigger the \(p_{\textrm{T}}\) thresholds of the leading (highest \(p_{\textrm{T}}\)) and trailing (second-highest \(p_{\textrm{T}}\)) electron (muon) are respectively 23 (17) and 12 (8)\(\,\text {Ge\hspace{-.08em}V}\). In the dilepton \(\text {e} {\upmu } \) trigger, the \(p_{\textrm{T}}\) thresholds are 23 and 12\(\,\text {Ge\hspace{-.08em}V}\) for the leading and trailing lepton, respectively. For the first part of data taking in 2016, a lower \(p_{\textrm{T}}\) threshold of 8\(\,\text {Ge\hspace{-.08em}V}\) for the trailing muon was used. In the 2017 data set, the \(p_{\textrm{T}}\) thresholds of the single electron and single muon triggers are raised respectively to 35 and 27\(\,\text {Ge\hspace{-.08em}V}\), while they are set to 32 and 24\(\,\text {Ge\hspace{-.08em}V}\) in the 2018 data set. For both 2017 and 2018 data sets, the \(p_{\textrm{T}}\) thresholds of the dilepton triggers are kept the same as the last part of the 2016 data set. The trigger selection is summarized in Table 1.

Monte Carlo (MC) event generators are used in the analysis to model the signal and background processes. Three independent sets of simulated events, corresponding to the 2016, 2017, and 2018 data sets, are used for each process of interest, in order to take into account year-dependent effects in the CMS detector, data taking, and event reconstruction. Despite different matrix element generators being used for different processes, all simulated events corresponding to a given data set share the same set of parton distribution functions (PDFs), underlying event (UE) tune, and parton shower (PS) configuration. The PDF set used is NNPDF 3.0 [38, 39] at NLO for 2016 and NNPDF 3.1 [40] at NNLO for 2017 and 2018. The CUETP8M1 [41] tune is used to describe the UE in 2016 simulations, while the CP5 [42] tune is adopted in 2017 and 2018 simulated events. For all the simulations, the matrix-element event generators are interfaced with pythia  [43] 8.226 in 2016, and 8.230 in 2017 and 2018, for the UE description, PS, and hadronization.

Simulated events are used in the analysis to model Higgs boson production through \(\text {g} \text {g} \text {H} \), VBF, V H, and associated production with top quarks (\({{\text {t}} {}{\bar{\text {t}}}} \text {H} \)) or bottom quarks (\({\text {b}} \bar{{\text {b}}} \text {H} \)), although \({{\text {t}} {}{\bar{\text {t}}}} \text {H} \) and \({\text {b}} \bar{{\text {b}}} \text {H} \) have a negligible contribution in the analysis phase space. All Higgs boson production processes except \({\text {b}} \bar{{\text {b}}} \text {H} \) are generated using the powheg v2 [44,45,46,47,48,49,50] event generator, which describes Higgs boson production at next-to-leading order (NLO) accuracy in quantum chromodynamics (QCD), including finite quark mass effects. Instead, \({\text {b}} \bar{{\text {b}}} \text {H} \) production is simulated using the MadGraph 5_amc@nlo v2.2.2 generator [51]. The \(\text {Z} \text {H} \) production process is simulated including both gluon- and quark-induced contributions. The minlo hvj [49] extension of powheg v2 is used for the simulation of \(\text {W} \text {H} \) and quark-induced \(\text {Z} \text {H} \) production, providing a description of V H+0- and 1-jet processes with NLO accuracy. For \(\text {g} \text {g} \text {H} \) production, the simulated events are reweighted to match the NNLOPS [52, 53] prediction in the hadronic jet multiplicity (\(N_{\text {jet}}\)) and Higgs boson transverse momentum (\(p_{\textrm{T}} ^{\text {H}}\)) distributions, according to a two-dimensional map constructed using these observables. Moreover, for a better description of the phase space with more than one jet, the minlo hjj [54] generator is used, giving NLO accuracy for \(N_{\text {jet}} \ge 2\) and leading order (LO) accuracy for \(N_{\text {jet}} \ge 3\). The simulated samples are normalized to the cross sections recommended in Ref. [55]; in particular, the next-to-next-to-next-to-leading order cross section is used to normalize the \(\text {g} \text {g} \text {H} \) sample. The Higgs boson mass (\(m_\text {H} \)) in the event generation is assumed to be 125\(\,\text {Ge\hspace{-.08em}V}\), while the value of 125.38\(\,\text {Ge\hspace{-.08em}V}\)  [56] is used for the calculation of cross sections and branching fractions, yielding values of \(48.31\text {\,pb} \), \(3.77\text {\,pb} \), \(1.36\text {\,pb} \), \(0.88\text {\,pb} \), and \(0.12\text {\,pb} \) for the \(\text {g} \text {g} \text {H} \), VBF, \(\text {W} \text {H} \), quark-induced \(\text {Z} \text {H} \), and gluon-induced \(\text {Z} \text {H} \) processes, respectively, and 22.0% for the \(\text {H} \rightarrow \text {W} \text {W} \) branching ratio [55]. The decay to a pair of W bosons and subsequently to leptons or hadrons is performed using the JHUgen [57] v5.2.5 generator in 2016, and v7.1.4 in 2017 and 2018, for \(\text {g} \text {g} \text {H} \), VBF, and quark-induced \(\text {Z} \text {H} \) samples. The Higgs boson and W boson decays are performed using pythia 8.212 for the other signal simulations. For the \(\text {g} \text {g} \text {H} \), VBF, and V H production mechanisms, additional Higgs boson simulations are produced using the powheg v2 generator, where the Higgs boson decays to a pair of \(\uptau \) leptons. These events are treated as signal in the analysis, with the exception of the measurement in the STXS framework, in which they are treated as background.
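To give a sense of scale, the quoted cross sections and branching fraction can be turned into the approximate number of \(\text {H} \rightarrow \text {W} \text {W} \) decays produced in the full data set. The sketch below is only an order-of-magnitude illustration: the dictionary and function names are our own, the luminosity is rounded to 138\(\,\text {fb}^{-1}\), and no W boson decay fractions, detector acceptance, or selection efficiencies are applied.

```python
# Cross sections in pb (for m_H = 125.38 GeV), as quoted in the text.
XSEC_PB = {
    "ggH": 48.31,
    "VBF": 3.77,
    "WH": 1.36,
    "qqZH": 0.88,
    "ggZH": 0.12,
}
BR_H_WW = 0.220   # H -> WW branching fraction quoted in the text
LUMI_FB = 138.0   # approximate total integrated luminosity in fb^-1

def produced_events(xsec_pb, branching_fraction, lumi_fb):
    """Number of produced decays: sigma * BR * L, using 1 pb = 1000 fb."""
    return xsec_pb * 1000.0 * branching_fraction * lumi_fb

for process, xsec in XSEC_PB.items():
    n_produced = produced_events(xsec, BR_H_WW, LUMI_FB)
    print(f"{process:5s}: about {n_produced:,.0f} H->WW decays produced")
```

Only a small fraction of these decays ends up in the analysis categories, since leptonic W decays and the selection requirements reduce the yields considerably.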

The background processes are simulated using several event generators. The quark-initiated nonresonant \(\text {W} \text {W} \) process is simulated using powheg v2 [58] with NLO accuracy for the inclusive production. The mcfm v7.0 [59,60,61] generator is used for the simulation of gluon-induced \(\text {W} \text {W} \) production at LO accuracy, and the normalization is chosen to match the NLO cross section [62]. The nonresonant electroweak (EW) production of \(\text {W} \text {W} \) pairs with two additional jets (in the vector boson scattering topology) is simulated at LO accuracy with MadGraph 5_amc@nlo v2.4.2 using the MLM matching and merging scheme [63]. Top quark pair production (\({{\text {t}} {}{\bar{\text {t}}}}\)), as well as single top quark processes, including \({\text {t}} \text {W} \), s-, and t-channel contributions, are simulated with powheg v2 [64,65,66]. The Drell–Yan (DY) production of charged-lepton pairs is simulated at NLO accuracy with MadGraph 5_amc@nlo v2.4.2 with up to two additional partons, using the FxFx matching and merging scheme [67]. Production of a W boson associated with an initial-state radiation photon (\(\text {W} {\upgamma } \)) is simulated with MadGraph 5_amc@nlo v2.4.2 at NLO accuracy with up to one additional parton in the matrix element calculations and the FxFx merging scheme. Diboson processes containing at least one Z boson or a virtual photon (\({\upgamma } ^{*}\)) with mass down to 100\(\,\text {Me\hspace{-.08em}V}\) are generated with powheg v2 [58] at NLO accuracy. Production of a W boson in association with a \({\upgamma } ^{*}\) (\(\text {W} {\upgamma } ^{*}\)) for masses below 100\(\,\text {Me\hspace{-.08em}V}\) is simulated by pythia 8.212 in the parton showering of \(\text {W} {\upgamma } \) events. Triboson processes with inclusive decays are also simulated at NLO accuracy with MadGraph 5_amc@nlo v2.4.2.

Table 2 Overview of the selection defining the analysis categories (a more detailed breakdown is given in Table 12)

For all processes, the detector response is simulated using a detailed description of the CMS detector, based on the Geant4 package [68]. The distribution of the number of pileup interactions in the simulation is reweighted to match the one observed in data. The average number of pileup interactions was 23 (32) in 2016 (2017 and 2018).
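The pileup reweighting described above can be sketched as a per-bin ratio of normalized distributions. This is a minimal illustration rather than the procedure actually used in the analysis; the function name and the toy histograms are hypothetical.

```python
def pileup_weights(data_counts, mc_counts):
    """Per-bin weights: normalized data distribution over normalized MC distribution."""
    if len(data_counts) != len(mc_counts):
        raise ValueError("histograms must share the same binning")
    data_total = float(sum(data_counts))
    mc_total = float(sum(mc_counts))
    weights = []
    for d, m in zip(data_counts, mc_counts):
        if m == 0:
            # No simulated events in this bin, so there is nothing to reweight.
            weights.append(0.0)
        else:
            weights.append((d / data_total) / (m / mc_total))
    return weights

# Toy histograms of the number of pileup interactions (hypothetical values).
data_hist = [10, 30, 40, 20]
mc_hist = [25, 25, 25, 25]
weights = pileup_weights(data_hist, mc_hist)
```

Each simulated event would then be weighted by the entry corresponding to its number of pileup interactions, which leaves the total simulated yield unchanged when the binning covers all events.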

The efficiency of the trigger system is evaluated in data on a per-lepton basis by selecting dilepton events compatible with originating from a Z boson. The per-lepton efficiencies are then combined probabilistically (i.e., the overall efficiency for an event passing any of the triggers listed above is calculated) to obtain the overall efficiencies of the trigger selections used in the analysis. The procedure has been validated by comparing the resulting efficiencies with MC simulation of the trigger. A correction has been derived as a function of \(\varDelta R =\sqrt{\smash [b]{(\varDelta \eta )^2+(\varDelta \phi )^2}}\) between the two leptons to account for any residual discrepancy, which is found to be on average below 1%. The resulting efficiencies are then applied directly on simulated events.
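As an illustration of the probabilistic combination, consider only the single-lepton triggers and assume that the trigger decisions of the individual leptons are independent. Both simplifications are assumptions of this sketch: the analysis also combines the dilepton trigger legs, which is not reproduced here, and the function name is hypothetical.

```python
def any_single_lepton_trigger_efficiency(per_lepton_efficiencies):
    """Probability that at least one lepton fires its single-lepton trigger.

    Assumes independent per-lepton decisions: eff = 1 - prod_i (1 - eff_i).
    """
    prob_none_fire = 1.0
    for eff in per_lepton_efficiencies:
        if not 0.0 <= eff <= 1.0:
            raise ValueError("efficiencies must lie in [0, 1]")
        prob_none_fire *= 1.0 - eff
    return 1.0 - prob_none_fire

# Toy dilepton event with hypothetical per-lepton efficiencies.
event_efficiency = any_single_lepton_trigger_efficiency([0.9, 0.8])
```

The combined efficiency is never lower than the largest per-lepton efficiency, which is why requiring any of several triggers increases the acceptance.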

4 Event selection and categorization

The analysis targets events in which a Higgs boson is produced via \(\text {g} \text {g} \text {H} \), VBF, or V H processes, and subsequently decays to a pair of W bosons. Events are selected by requiring at least two charged leptons (electrons or muons) with high \(p_{\textrm{T}}\), high \(p_{\textrm{T}} ^\text {miss}\), and a varying number of hadronic jets. Throughout this paper, unless otherwise specified, only hadronic jets with \(p_{\textrm{T}} > 30 \,\text {Ge\hspace{-.08em}V} \) are considered. Categories targeting Higgs bosons produced via \(\text {g} \text {g} \text {H} \), VBF, and V H with a hadronically decaying vector boson (V H2j) are subdivided into different-flavor (DF) and same-flavor (SF) categories by selecting e\(\upmu \) and ee/\(\upmu \) \(\upmu \) pairs, respectively. Categories targeting V H production with a leptonically decaying vector boson are subdivided into four subcategories based on the number of leptons and hadronic jets required: \(\text {W} \text {H} \) SS (same sign), \(\text {W} \text {H} \) 3\(\ell \), \(\text {Z} \text {H} \) 3\(\ell \), and \(\text {Z} \text {H} \) 4\(\ell \) targeting the \(\text {W} \text {H} \rightarrow \ell ^{\pm }\ell ^{\pm } 2{\upnu } {\text {q}} {\text {q}} \), \(\text {W} \text {H} \rightarrow 3\ell 3{\upnu } \), \(\text {Z} \text {H} \rightarrow 3\ell {\upnu } {\text {q}} {\text {q}} \), and \(\text {Z} \text {H} \rightarrow 4\ell 2{\upnu } \) processes, respectively. In all cases, events containing additional leptons with \(p_{\textrm{T}} > 10 \,\text {Ge\hspace{-.08em}V} \) are rejected. A summary of the different categories is given in Table 2, with a more detailed breakdown given in Table 12.

Across all categories, in the 2016 data set, events are required to pass single- or double-lepton triggers. An additional requirement is placed on the lepton \(p_{\textrm{T}}\) to be above 10\(\,\text {Ge\hspace{-.08em}V}\), and the highest \(p_{\textrm{T}}\) (leading) lepton in the event is furthermore required to have \(p_{\textrm{T}} >25\,\text {Ge\hspace{-.08em}V} \). In the 2017 and 2018 data sets the threshold for leptons is increased to 13\(\,\text {Ge\hspace{-.08em}V}\) because of a change in the trigger setup. Where yields suffice, events are further split according to the charge and \(p_{\textrm{T}}\) ordering of the dilepton system, \(p_{\textrm{T}}\) of the subleading lepton, and number of hadronic jets in the event, as detailed in the following sections. The numbers of expected and observed events in each category are given in Sect. 11.
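The year-dependent lepton thresholds can be summarized in a short sketch. It assumes that the 10 (13)\(\,\text {Ge\hspace{-.08em}V}\) requirement applies to every selected lepton other than the leading one, which is our reading of the text; the constant and function names are hypothetical.

```python
LEADING_PT_MIN = 25.0  # GeV, leading-lepton threshold quoted in the text
LEPTON_PT_MIN = {2016: 10.0, 2017: 13.0, 2018: 13.0}  # GeV, per data-taking year

def passes_lepton_pt(lepton_pts, year):
    """Check the leading and remaining lepton pT thresholds for a given year."""
    pts = sorted(lepton_pts, reverse=True)
    if len(pts) < 2:
        return False  # the analysis requires at least two charged leptons
    leading_ok = pts[0] > LEADING_PT_MIN
    others_ok = all(pt > LEPTON_PT_MIN[year] for pt in pts[1:])
    return leading_ok and others_ok
```

For example, an event with lepton \(p_{\textrm{T}}\) values of 30 and 12\(\,\text {Ge\hspace{-.08em}V}\) would pass this requirement in 2016 but fail it in 2017 and 2018.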

5 Gluon fusion categories

This section describes the categories targeting the \(\text {g} \text {g} \text {H} \) production mechanism, both in DF and SF final states. In DF final states, the main background processes are nonresonant \(\text {W} \text {W} \), top quark production (both single and pair), DY production of a pair of \(\uptau \) leptons that subsequently decay to an \(\text {e} {\upmu } \) pair and associated neutrinos, and W+jets events when a jet is misidentified as a lepton. Subdominant backgrounds include \(\text {W} \text {Z} \), \(\text {Z} \text {Z} \), \({\text {V}} {\upgamma } \), \({\text {V}} {\upgamma } ^{*}\), and \({\text {V}} {\text {V}} {\text {V}} \) production. In SF final states, the dominant background contribution is given by DY events, with subdominant components arising from top quark and \(\text {W} \text {W} \) production, as well as events with misidentified leptons.

5.1 Different-flavor ggH categories

On top of the common selection, the leading leptons are required to form an \(\text {e} {\upmu } \) pair with opposite charge. Contributions arising from top quark production are reduced by rejecting events containing any jet with \(p_{\textrm{T}} >20\,\text {Ge\hspace{-.08em}V} \) that is identified as originating from a bottom quark by the tagging algorithm. The dilepton invariant mass \(m_{\ell \ell }\) is required to be above 12\(\,\text {Ge\hspace{-.08em}V}\) to suppress QCD events with multiple misidentified jets. Events with no genuine missing transverse momentum (arising from the presence of neutrinos in signal events), as well as \({\uptau } {\uptau } \) events, are suppressed by requiring \(p_{\textrm{T}} ^\text {miss} >20\,\text {Ge\hspace{-.08em}V} \). The latter are further reduced by requiring the \(p_{\textrm{T}}\) of the dilepton system \(p_{\textrm{T}} ^{\ell \ell }\) to exceed 30\(\,\text {Ge\hspace{-.08em}V}\), as leptons arising from a \({\uptau } {\uptau } \) pair are found to have on average lower \(p_{\textrm{T}}\) than those coming from a \(\text {W} \text {W} \) pair. Finally, to further suppress contributions from \({\uptau } {\uptau } \) and W+jets events, where the subleading lepton does not arise from a W boson decay, the transverse mass built with \({\vec {p}}_{\textrm{T}}^{\hspace{1.0pt}\text {miss}}\) and the momentum of the subleading lepton \(m_{\textrm{T}} (\ell _2, p_{\textrm{T}} ^\text {miss})\) is required to be greater than 30\(\,\text {Ge\hspace{-.08em}V}\), where \(m_{\textrm{T}}\) for a collection of particles \(\{P_i\}\) with transverse momenta \({\vec p}_{\textrm{T}} {}_{,i}\) is defined as:

$$\begin{aligned} m_{\textrm{T}} (\{P_i\}) = \sqrt{ \left( \sum |{\vec p}_{\textrm{T}} {}_{,i} | \right) ^2 - \left| \sum {\vec p}_{\textrm{T}} {}_{,i} \right| ^2}. \end{aligned}$$
(1)
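Equation (1) translates directly into code. The sketch below assumes that each particle is given by the Cartesian components of its transverse momentum; the function name and input layout are our own choices.

```python
import math

def transverse_mass(pt_vectors):
    """Transverse mass of Eq. (1) for a collection of (px, py) transverse momenta."""
    vectors = list(pt_vectors)
    scalar_sum = sum(math.hypot(px, py) for px, py in vectors)
    sum_px = sum(px for px, _ in vectors)
    sum_py = sum(py for _, py in vectors)
    m2 = scalar_sum ** 2 - (sum_px ** 2 + sum_py ** 2)
    # Guard against tiny negative values caused by floating-point rounding.
    return math.sqrt(max(m2, 0.0))

# Toy inputs in GeV (hypothetical): subleading lepton and missing momentum.
mt_back_to_back = transverse_mass([(40.0, 0.0), (-40.0, 0.0)])
mt_collinear = transverse_mass([(40.0, 0.0), (20.0, 0.0)])
```

For two objects the expression reduces to \(\sqrt{2 p_{\textrm{T},1} p_{\textrm{T},2} (1-\cos \varDelta \phi )}\), so it vanishes for collinear momenta and is largest when they are back to back.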

Selected events are further split into subcategories in order to exploit the peculiar kinematics of the target final state. Events with zero, one, and more than one hadronic jet are separated into distinct categories. In order to better constrain the W+jets background, the 0- and 1-jet categories are subdivided into two categories each according to the charge and \(p_{\textrm{T}}\) ordering of the dilepton pair. This subdivision exploits the fact that the signal is charge symmetric, while in W+jets events \({\text {W}}^{+}\) bosons are more abundant than \({\text {W}}^{-}\) bosons. Finally, these subcategories are further divided according to whether the \(p_{\textrm{T}}\) of the subleading lepton (\(p_{\textrm{T}} {}_2\)) is above or below 20\(\,\text {Ge\hspace{-.08em}V}\). This results in a four-fold partitioning of the 0- and 1-jet DF \(\text {g} \text {g} \text {H} \) categories. In categories with more than one hadronic jet, a selection on the invariant mass of the leading dijet pair \(m_{\text {jj}}\) is added to ensure that there is no overlap with the VBF and V H categories.
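Part of this categorization can be sketched as a simple labeling function. The sketch covers only the jet multiplicity and \(p_{\textrm{T}} {}_2\) splits; the additional split by charge and \(p_{\textrm{T}}\) ordering and the \(m_{\text {jj}}\) requirement are omitted because their exact definitions are not given in this section. The labels and the treatment of events with \(p_{\textrm{T}} {}_2\) exactly equal to 20\(\,\text {Ge\hspace{-.08em}V}\) are assumptions.

```python
PT2_SPLIT = 20.0  # GeV, subleading-lepton pT boundary quoted in the text

def ggh_df_category(n_jets, pt2):
    """Label a DF ggH event by jet multiplicity and subleading-lepton pT."""
    if n_jets < 0:
        raise ValueError("the number of jets cannot be negative")
    if n_jets >= 2:
        return "2j"  # no pT2 split is described for this category
    pt2_label = "pt2gt20" if pt2 >= PT2_SPLIT else "pt2lt20"
    return f"{n_jets}j_{pt2_label}"
```

Combined with the two-way charge and \(p_{\textrm{T}}\) ordering split, the two \(p_{\textrm{T}} {}_2\) labels would give the four-fold partitioning of the 0- and 1-jet categories.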

Table 3 Summary of the selection used in different-flavor \(\text {g} \text {g} \text {H} \) categories

Given the presence of neutrinos in the final state, the mass of the Higgs boson candidate cannot be reconstructed in the \(\text {W} \text {W} \) channel. Nevertheless, specific features of the channel make it possible to achieve good sensitivity. In particular, the scalar nature of the Higgs boson results in the two final-state leptons being preferentially emitted in the same hemisphere. This compresses the distribution of \(m_{\ell \ell }\) for signal events to lower values with respect to the nonresonant \(\text {W} \text {W} \) process. This shape difference alone, however, is not sufficient to disentangle the signal from other background processes, such as DY production of \({\uptau } {\uptau } \) pairs and \({\text {V}} {\upgamma } \), that populate the low-\(m_{\ell \ell }\) phase space. The Higgs boson transverse mass \(m_{\textrm{T}} ^{\text {H}} = m_{\textrm{T}} (\ell \ell , p_{\textrm{T}} ^\text {miss})\) is thus introduced as a second discriminating variable. A selection on \(m_{\textrm{T}} ^{\text {H}}\) is applied by requiring its value to be above 60\(\,\text {Ge\hspace{-.08em}V}\) for signal events. It is found that signal and background events populate different regions of the \((m_{\ell \ell },m_{\textrm{T}} ^{\text {H}})\) plane. The signal extraction fit is therefore performed on a two-dimensional \((m_{\ell \ell },m_{\textrm{T}} ^{\text {H}})\) binned template, allowing for good signal-to-background discrimination.
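The construction of a two-dimensional binned template can be sketched as follows. The bin edges are hypothetical placeholders, since the binning used in the analysis is not given in this section, and placing overflow entries in the last bin is a choice made here for illustration.

```python
from bisect import bisect_right

# Hypothetical bin edges in GeV; the actual analysis binning is not specified here.
MLL_EDGES = [12.0, 25.0, 40.0, 60.0, 100.0, 200.0]
MTH_EDGES = [60.0, 90.0, 110.0, 130.0, 200.0]

def bin_index(value, edges):
    """Return the bin index, with overflow in the last bin and None for underflow."""
    if value < edges[0]:
        return None
    return min(bisect_right(edges, value) - 1, len(edges) - 2)

def fill_template(events):
    """Fill a 2D (mll, mTH) template from (mll, mth, weight) tuples."""
    template = [[0.0] * (len(MTH_EDGES) - 1) for _ in range(len(MLL_EDGES) - 1)]
    for mll, mth, weight in events:
        i = bin_index(mll, MLL_EDGES)
        j = bin_index(mth, MTH_EDGES)
        if i is None or j is None:
            continue  # fails the lower mll or mTH requirement
        template[i][j] += weight
    return template

# Toy weighted events (hypothetical values).
toy_events = [(30.0, 100.0, 1.0), (30.0, 100.0, 0.5), (500.0, 300.0, 2.0), (20.0, 50.0, 1.0)]
template = fill_template(toy_events)
```

One such template would be built per process and category, and the fit then compares the sum of the templates with the observed bin contents.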

Fig. 1

Observed distributions of the \(m_{\ell \ell }\) (upper) and \(m_{\textrm{T}} ^{\text {H}}\) (lower) fit variables in the 0-jet \(\text {g} \text {g} \text {H} \) \(p_{\textrm{T}} {}_2 <20\,\text {Ge\hspace{-.08em}V} \) (left) and \(p_{\textrm{T}} {}_2 >20\,\text {Ge\hspace{-.08em}V} \) (right) DF categories. The uncertainty band corresponds to the total systematic uncertainty in the templates after the fit to the data. The signal template is shown both stacked on top of the backgrounds, as well as superimposed. The yields are shown with their best fit normalizations from the simultaneous fit. Vertical bars on data points represent the statistical uncertainty in the data. The overflow is included in the last bin. The lower panel in each figure shows the ratio of the number of events observed in data to that of the total SM MC as extracted from the fit

Fig. 2

Observed distributions of the \(m_{\ell \ell }\) (upper) and \(m_{\textrm{T}} ^{\text {H}}\) (lower) fit variables in the 1-jet \(\text {g} \text {g} \text {H} \) \(p_{\textrm{T}} {}_2 <20\,\text {Ge\hspace{-.08em}V} \) (left) and \(p_{\textrm{T}} {}_2 >20\,\text {Ge\hspace{-.08em}V} \) (right) DF categories. A detailed description is given in the Fig. 1 caption

In order to optimize background subtraction in the signal region (SR), two additional orthogonal selections are defined for each jet multiplicity category. These define two sets of control regions (CRs), enriched in \({\uptau } {\uptau } \) and top quark events, respectively. They are defined by the same selection as the SR, but inverting the b jet veto for the top CR and the \(m_{\textrm{T}} ^{\text {H}}\) requirement for the \({\uptau } {\uptau } \) CR. The full selection and categorization strategy is summarized in Table 3. Observed distributions for \(m_{\ell \ell }\) and \(m_{\textrm{T}} ^{\text {H}}\) for the 0-, 1-, and 2-jet \(\text {g} \text {g} \text {H} \) categories are shown in Figs. 1, 2, and 3, respectively. The \(\text {W} \text {Z} \), \(\text {Z} \text {Z} \), \({\text {V}} {\upgamma } \), \({\text {V}} {\upgamma } ^{*}\), and \({\text {V}} {\text {V}} {\text {V}} \) backgrounds are shown together as minor backgrounds. The observed \(m_{\ell \ell }\) and \(m_{\textrm{T}} ^{\text {H}}\) distributions for the 0-, 1-, and 2-jet CRs enriched in top quark events are shown in Figs. 4, 5, and 6, and for the \({\uptau } {\uptau } \) CRs in Figs. 7, 8, and 9.
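The orthogonality of the signal and control regions can be illustrated with a minimal sketch based only on the two inverted requirements named above. Any further requirements listed in Table 3 are not reproduced, and the function and label names are hypothetical.

```python
MTH_MIN = 60.0  # GeV, signal-region mTH requirement quoted in the text

def df_region(has_b_tagged_jet, mth):
    """Assign an event to the SR, top CR, or tautau CR; return None otherwise."""
    passes_b_veto = not has_b_tagged_jet
    passes_mth = mth > MTH_MIN
    if passes_b_veto and passes_mth:
        return "SR"
    if not passes_b_veto and passes_mth:
        return "top_CR"      # b jet veto inverted, remaining SR selection kept
    if passes_b_veto and not passes_mth:
        return "tautau_CR"   # mTH requirement inverted, remaining SR selection kept
    return None              # fails both requirements: not used in any region
```

Because each event receives at most one label, the three regions are mutually exclusive by construction.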

Fig. 3

Observed distributions of the \(m_{\ell \ell }\) (left) and \(m_{\textrm{T}} ^{\text {H}}\) (right) fit variables in the 2-jet \(\text {g} \text {g} \text {H} \) DF category. A detailed description is given in the Fig. 1 caption

Fig. 4

Observed distributions of the \(m_{\ell \ell }\) (left) and \(m_{\textrm{T}} ^{\text {H}}\) (right) variables in the 0-jet DF top quark control region. A detailed description is given in the Fig. 1 caption

5.2 Same-flavor ggH categories

The categories described in this section target the \(\text {g} \text {g} \text {H} \) production mechanism in final states with either two electrons or two muons. The two leading leptons in the event are required to form an oppositely charged ee or \(\upmu \) \(\upmu \) pair. Events containing at least one b-tagged jet with \(p_{\textrm{T}} > 20\,\text {Ge\hspace{-.08em}V} \) are discarded. Low-mass resonances are suppressed by requiring \(m_{\ell \ell } >12\,\text {Ge\hspace{-.08em}V} \). The W+jets background is reduced by requiring the \(p_{\textrm{T}}\) of the dilepton system to exceed 30\(\,\text {Ge\hspace{-.08em}V}\). Events are also required to have \(p_{\textrm{T}} ^\text {miss} >20\,\text {Ge\hspace{-.08em}V} \) to enrich the selection in processes with genuine missing transverse momentum. Finally, to reduce the DY background, which is dominant in this channel, a veto is placed on events in which \(m_{\ell \ell }\) is within 15\(\,\text {Ge\hspace{-.08em}V}\) of the nominal mass of the Z boson (\(m_\text {Z} \)).
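The requirements listed above can be sketched as a single boolean function. This is only an illustration under stated assumptions: the function name, the argument layout, and the numerical value used for \(m_\text {Z} \) are hypothetical choices, and the subcategory-dependent selections and the DYMVA requirement described later are not included.

```python
# Nominal Z boson mass in GeV. The exact value used in the analysis is not
# stated in the text; this number is an assumption of the sketch.
M_Z = 91.1876


def passes_sf_ggh_preselection(flavors, charges, m_ll, pt_ll, pt_miss, n_btag_jets):
    """Apply the same-flavor ggH requirements quoted in the text.

    flavors     : flavors of the two leading leptons, e.g. ("e", "e")
    charges     : electric charges of the two leading leptons, e.g. (+1, -1)
    m_ll        : dilepton invariant mass in GeV
    pt_ll       : transverse momentum of the dilepton system in GeV
    pt_miss     : missing transverse momentum in GeV
    n_btag_jets : number of b-tagged jets with pT > 20 GeV
    """
    same_flavor = flavors[0] == flavors[1] and flavors[0] in ("e", "mu")
    opposite_charge = charges[0] * charges[1] < 0
    return (
        same_flavor
        and opposite_charge
        and n_btag_jets == 0          # b jet veto
        and m_ll > 12.0               # low-mass resonance suppression
        and pt_ll > 30.0              # W+jets reduction
        and pt_miss > 20.0            # genuine missing momentum
        and abs(m_ll - M_Z) > 15.0    # Z boson veto against DY
    )
```

For example, an ee event with \(m_{\ell \ell } = 50\,\text {Ge\hspace{-.08em}V} \) can pass, whereas the same event with \(m_{\ell \ell } = 90\,\text {Ge\hspace{-.08em}V} \) fails the Z boson veto.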

Events are divided into subcategories based on the number of hadronic jets, and further selections on \(m_{\textrm{T}} ^{\text {H}}\), \(m_{\ell \ell }\), and the azimuthal angle between the two leading leptons (\(\varDelta \phi _{\ell \ell }\)) are applied depending on the subcategory. A dedicated multivariate discriminant based on a DNN, called DYMVA in the following, is built and trained with the TensorFlow package [69] to distinguish signal events from DY events. The DNN is trained separately for each jet multiplicity subcategory. The architecture of the DNN is that of a feed-forward multilayer perceptron, taking 21, 22, and 27 input variables in the 0-, 1-, and 2-jet categories, respectively. These include kinematic information from the dilepton system, \({\vec {p}}_{\textrm{T}}^{\hspace{1.0pt}\text {miss}}\), and jets where present. To better constrain the top quark and \(\text {W} \text {W} \) backgrounds, two CRs are defined in each jet multiplicity subcategory, enriched in the respective processes. The full selection is given in Table 4. The selection efficiency of the requirement on the DYMVA score in 0-jet categories is found to be approximately 50, 7, and 30% for signal, DY, and total background events, respectively. In 1- and 2-jet categories the corresponding efficiencies are \(\approx \)50, 1, and 10%. Once the selection is performed, the signal is extracted via a simultaneous fit to the number of events in each category.

Fig. 5

Observed distributions of the \(m_{\ell \ell }\) (left) and \(m_{\textrm{T}} ^{\text {H}}\) (right) variables in the 1-jet DF top quark control region. A detailed description is given in the Fig. 1 caption

Fig. 6

Observed distributions of the \(m_{\ell \ell }\) (left) and \(m_{\textrm{T}} ^{\text {H}}\) (right) variables in the 2-jet DF top quark control region. A detailed description is given in the Fig. 1 caption

Fig. 7

Observed distributions of the \(m_{\ell \ell }\) (left) and \(m_{\textrm{T}} ^{\text {H}}\) (right) variables in the 0-jet DF \({\uptau } {\uptau } \) control region. A detailed description is given in the Fig. 1 caption

Fig. 8

Observed distributions of the \(m_{\ell \ell }\) (left) and \(m_{\textrm{T}} ^{\text {H}}\) (right) variables in the 1-jet DF \({\uptau } {\uptau } \) control region. A detailed description is given in the Fig. 1 caption

Fig. 9

Observed distributions of the \(m_{\ell \ell }\) (left) and \(m_{\textrm{T}} ^{\text {H}}\) (right) variables in the 2-jet DF \({\uptau } {\uptau } \) control region. A detailed description is given in the Fig. 1 caption

6 Vector boson fusion categories

This section describes the categories targeting the VBF production mechanism, both in DF and SF final states. This mode involves the production of a Higgs boson in association with a pair of forward-backward jets. The dijet system is characterized by a large \(m_{\text {jj}}\), large pseudorapidity separation \(\varDelta \eta _{\text {jj}}\), and low hadronic activity in the pseudorapidity region between the tagging jets. The fully leptonic final state in the VBF category therefore consists of two isolated leptons, large \(p_{\textrm{T}} ^\text {miss}\) from the two undetectable neutrinos, and a pair of forward-backward jets. The main background processes for the VBF categories are the same as for the \(\text {g} \text {g} \text {H} \) categories. An additional complication, however, arises from the entanglement of VBF and \(\text {g} \text {g} \text {H} \) events, given the identical decay mode and the fact that the \(\text {g} \text {g} \text {H} \) cross section is larger than the VBF one by one order of magnitude.

6.1 Different-flavor VBF categories

On top of the common global selection, the same requirements on leptons and \(p_{\textrm{T}} ^\text {miss}\) used in the DF \(\text {g} \text {g} \text {H} \) categories are applied. In this case, however, there are no subcategories based on jet multiplicity. Instead, exactly two jets with \(p_{\textrm{T}} >30\,\text {Ge\hspace{-.08em}V} \) and \(m_{\text {jj}} >120\,\text {Ge\hspace{-.08em}V} \) are required, while still requiring the absence of b-tagged jets with \(p_{\textrm{T}} >20\,\text {Ge\hspace{-.08em}V} \). In this category the DeepFlavor tagger [32] is used. Finally, \(60< m_{\textrm{T}} ^{\text {H}} < 125\,\text {Ge\hspace{-.08em}V} \) is required.

In order to separate the signal from the background, a DNN approach has been followed. The DNN is constructed to perform a multiclass classification of an event as either signal (VBF) or any of the three main background processes, namely: \(\text {W} \text {W} \), top quark production, and \(\text {g} \text {g} \text {H} \). As a result, a vector \(\vec {\varvec{o}}\) of four numbers is attributed to an event. Each number represents the degree of agreement of the event with the signal and the three background processes. Each of these outputs can be interpreted as a probability, since they are normalized to one. Therefore, for a given event, the process j with the highest output \(o_j\) is interpreted as the most probable process. For this reason, the four outputs are referred to as classifiers: \(C_{\textrm{VBF}}\), \(C_{{\text {t}}}\), \(C_{\text {W} \text {W} }\), and \(C_{\text {g} \text {g} \text {H} }\). In the SR four orthogonal categories are made using the classifiers. If, for a given event, \(C_j\) is higher than the other three, the event is classified in the j-like category, and \(C_j\) is used as the discriminating variable. A shape-based analysis is hence performed in these categories. The DNN is trained on a set of 26 input variables, including kinematic information from the dilepton system, \({\vec {p}}_{\textrm{T}}^{\hspace{1.0pt}\text {miss}}\), and jets. The variables with the most discrimination power are found to be \(m_{\text {jj}}\), \(\varDelta \eta _{\text {jj}}\) and \(m_{\ell \ell }\). As done in the DF \(\text {g} \text {g} \text {H} \) categories, in order to optimize background subtraction in the SR, two CRs are defined, enriched in \({\uptau } {\uptau } \) and top quark events, respectively. They are defined by the same selection as the SR, but inverting the b jet veto for the top quark CR and the \(m_{\textrm{T}} ^{\text {H}}\) requirement for the \({\uptau } {\uptau } \) CR. 
The full selection and categorization strategy is summarized in Table 5. Observed distributions for the \(C_{\textrm{VBF}}\) and \(C_{\text {g} \text {g} \text {H} }\) classifiers in the VBF-like and ggH-like categories respectively are shown in Fig. 10.
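The categorization by classifier output described above can be illustrated with a short sketch. The text only states that the four outputs are normalized to one; the use of a softmax for that normalization, the ordering of the processes, and all function names are assumptions made for illustration.

```python
import math

# Order of the four DNN outputs; this ordering is an assumption of the sketch.
PROCESSES = ("VBF", "top", "WW", "ggH")


def softmax(logits):
    """Normalize raw scores so that they are positive and sum to one."""
    shift = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def categorize(outputs):
    """Return (category, discriminant) for one event.

    The event is assigned to the j-like category whose output o_j is the
    largest, and that same output C_j is used as the discriminating variable.
    """
    j = max(range(len(outputs)), key=lambda i: outputs[i])
    return PROCESSES[j], outputs[j]
```

Because each event is assigned to exactly one category, the four resulting categories are orthogonal by construction.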

In order to verify that the simulated background processes agree with data in the DNN classifiers, the distributions are also checked at the level of the VBF SR global selection, i.e., before the further event categorization based on the classifier outputs. The \(C_{\textrm{VBF}}\) DNN output in the aforementioned global selection region is shown in Fig. 11.

6.2 Same-flavor VBF categories

On top of the common global selection, the same selection used in the SF \(\text {g} \text {g} \text {H} \) categories is applied. However, in this case, at least two jets with \(p_{\textrm{T}} > 30\,\text {Ge\hspace{-.08em}V} \) are required, with \(m_{\text {jj}} > 350\,\text {Ge\hspace{-.08em}V} \), while also rejecting events that contain any b-tagged jets with \(p_{\textrm{T}} >20\,\text {Ge\hspace{-.08em}V} \). To define a Higgs-boson-enriched phase space, a selection on the DYMVA DNN is added. The DNN is trained and optimized separately in each category. Two background CRs help in constraining the normalization of the top quark and \(\text {W} \text {W} \) backgrounds. These CRs consist of regions of phase space orthogonal but as close as possible to the signal phase space. This channel utilizes a simple counting experiment analysis; the event requirements are therefore chosen to maximize the expected signal significance. The full selection and categorization strategy is summarized in Table 6.

7 Vector boson associated production categories

This section describes categories targeting the V H production mode. Four subcategories are defined (\(\text {W} \text {H} \) SS, \(\text {W} \text {H} \) 3\(\ell \), \(\text {Z} \text {H} \) 3\(\ell \), and \(\text {Z} \text {H} \) 4\(\ell \)) to target final states in which the vector boson V, produced in association with the Higgs boson, decays leptonically. Two more categories (V H 2j DF/SF) select events in which the V boson decays into two resolved jets. An additional selection is applied in each category to reduce the background, as well as an event categorization, defining phase spaces more sensitive to either signal or specific backgrounds. Details on the event selection and categorization are given below.

7.1 WH SS categories

The \(\text {W} \text {H} \) SS category targets the \(\text {W} \text {H} \rightarrow 2\ell 2{\upnu } {\text {q}} {\text {q}} \) final state, where the two charged leptons are required to have same sign to reduce DY background. Therefore, the final state contains two same-sign leptons, \(p_{\textrm{T}} ^\text {miss}\), and at least one jet. The analysis requires the leading (subleading) lepton to have \(p_{\textrm{T}} >25\,(20)\,\text {Ge\hspace{-.08em}V} \). To remove contributions from low-mass resonances, \(m_{\ell \ell }\) is required to be greater than 12\(\,\text {Ge\hspace{-.08em}V}\). The two leptons must have a pseudorapidity separation (\(\varDelta \eta _{\ell \ell }\)) of less than two. Events are also required to have \(p_{\textrm{T}} ^\text {miss} > 30\,\text {Ge\hspace{-.08em}V} \), as well as no b-tagged jet with \(p_{\textrm{T}} > 20\,\text {Ge\hspace{-.08em}V} \).

Signal region events are further categorized based on the number of jets and the lepton flavor composition. Events in the 1-jet category are required to contain exactly one jet with \(p_{\textrm{T}} > 30\,\text {Ge\hspace{-.08em}V} \) and \(|\eta |<4.7\). Events in the 2-jet category are required to contain at least two jets with the same kinematic constraints. For events containing more than two jets, only the two jets with highest \(p_{\textrm{T}}\) are considered for the analysis. These jets must have \(m_{\text {jj}} <100\,\text {Ge\hspace{-.08em}V} \). The SRs are further divided into e\(\upmu \) and \(\upmu \) \(\upmu \) categories. Events with two electrons are not considered, as this flavor category is less sensitive to signal.

To improve discrimination between signal and background, the variable \(\widetilde{m}_{\text {H}}\) is defined, which serves as a proxy for \(m_\text {H} \). It is computed as the invariant mass of the sum of the dijet pair four-momentum \(P_\textrm{jj}=(E_\textrm{jj}, \vec {p}_\textrm{jj})\) and twice the four-momentum of the lepton closest to the dijet pair \(P_{\ell } = (p_{\ell },\vec {p}_{\ell })\):

$$\begin{aligned} \widetilde{m}_{\text {H}} =\sqrt{(P_\textrm{jj}+2P_{\ell })^2}. \end{aligned}$$
(2)

The second lepton four-momentum serves as a proxy for the neutrino. If an event in the 1-jet category contains a second jet with \(20< p_{\textrm{T}} < 30\,\text {Ge\hspace{-.08em}V} \), this jet is included in the computation of this variable; otherwise the four-momentum of the single jet is used. Events in all categories are required to have \(\widetilde{m}_{\text {H}} > 50\,\text {Ge\hspace{-.08em}V} \). A summary of the event selection is given in Table 7.
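As an illustration of Eq. (2), the following sketch computes \(\widetilde{m}_{\text {H}}\) from Cartesian four-momenta. The text does not specify the metric used to find the lepton closest to the dijet pair; the \(\varDelta R\) distance in the \(\eta \)-\(\phi \) plane is assumed here, and all function names are hypothetical.

```python
import math


def add(p, q):
    """Sum of two four-momenta given as (E, px, py, pz) in GeV."""
    return tuple(a + b for a, b in zip(p, q))


def scale(p, k):
    """Four-momentum multiplied by a scalar."""
    return tuple(k * a for a in p)


def inv_mass(p):
    """Invariant mass sqrt(E^2 - |p|^2), clipped at zero against rounding."""
    e, px, py, pz = p
    return math.sqrt(max(e * e - px * px - py * py - pz * pz, 0.0))


def eta_phi(p):
    """Pseudorapidity and azimuth (requires nonzero transverse momentum)."""
    _, px, py, pz = p
    return math.asinh(pz / math.hypot(px, py)), math.atan2(py, px)


def delta_r(p, q):
    """Angular distance in the eta-phi plane (assumed closeness metric)."""
    eta1, phi1 = eta_phi(p)
    eta2, phi2 = eta_phi(q)
    dphi = (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(eta1 - eta2, dphi)


def m_tilde_h(jets, leptons):
    """Eq. (2): invariant mass of P_jj + 2 * P_l for the closest lepton.

    jets    : one or two jet four-momenta (a single jet in the 1-jet case)
    leptons : the two lepton four-momenta
    """
    p_jj = jets[0]
    for jet in jets[1:]:
        p_jj = add(p_jj, jet)
    closest = min(leptons, key=lambda lep: delta_r(lep, p_jj))
    return inv_mass(add(p_jj, scale(closest, 2.0)))
```

Doubling the lepton four-momentum implements the statement that the second copy stands in for the unobserved neutrino.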

The main backgrounds in the \(\text {W} \text {H} \) SS category are \(\text {W} \text {Z} \), W+jets, \({\text {V}} {\upgamma } \), and \({\text {V}} {\upgamma } ^{*}\) production. Additional backgrounds are top quark, triboson, \(\text {W} \text {W} \), and \(\text {Z} \text {Z} \) production. The W+jets events pass the selection when a nonprompt lepton passes the lepton selection. This nonprompt background is estimated from data, as described in Sect. 9. The remaining backgrounds are estimated using MC simulation. The \(\text {W} \text {Z} \) background normalization is estimated in the 1- and 2-jet CRs shared with the \(\text {Z} \text {H} \) 3\(\ell \) category, described in Sect. 7.3.

To extract the Higgs boson production cross section, a binned fit is performed to the \(\widetilde{m}_{\text {H}}\) variable. Figure 12 shows the \(\widetilde{m}_{\text {H}}\) distribution after the fit to the data.

Table 4 Summary of the selection used in same-flavor \(\text {g} \text {g} \text {H} \) categories. The DYMVA threshold is optimized separately in each subcategory and data set
Table 5 Selection used in the different-flavor VBF categories
Fig. 10

Distributions for the \(C_{\textrm{VBF}}\) (left) and \(C_{\text {g} \text {g} \text {H} }\) (right) classifiers in the VBF-like and ggH-like VBF DF categories, respectively. A detailed description is given in the Fig. 1 caption

7.2 WH 3\(\ell \) categories

The \(\text {W} \text {H} \) 3\(\ell \) category targets the \(\text {W} \text {H} \rightarrow 3\ell 3{\upnu } \) decay. The final state therefore contains three leptons and \(p_{\textrm{T}} ^\text {miss}\). The analysis selects events containing three leptons with \(p_{\textrm{T}} >25\), 20, and 15\(\,\text {Ge\hspace{-.08em}V}\), respectively, and total charge (\(\text {Q}_{3\ell }\)) of ±1. The invariant mass of any dilepton pair is required to be greater than 12\(\,\text {Ge\hspace{-.08em}V}\) to remove low-mass resonances. Events are rejected if they contain a jet with \(p_{\textrm{T}} > 30\,\text {Ge\hspace{-.08em}V} \), or any b-tagged jet with \(p_{\textrm{T}} > 20\,\text {Ge\hspace{-.08em}V} \).

Events in the SR are categorized based on the flavor composition of the lepton pairs. Events with at least one opposite-sign SF (OSSF) lepton pair are placed in the OSSF category, while all other events are placed in the same-sign SF (SSSF) category. To reject backgrounds containing Z bosons, events in the OSSF SR must pass a Z boson veto, where all lepton pairs must satisfy \(|m_{\ell \ell }- m_\text {Z} | > 20\,\text {Ge\hspace{-.08em}V} \), as well as \(p_{\textrm{T}} ^\text {miss} > 40\,\text {Ge\hspace{-.08em}V} \).
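The OSSF/SSSF categorization and the additional OSSF requirements can be sketched as follows. The data layout (leptons as flavor-charge tuples, pair masses passed as a list), the function names, and the numerical value of \(m_\text {Z} \) are assumptions made for illustration.

```python
from itertools import combinations

# Nominal Z boson mass in GeV; the value is an assumption of this sketch.
M_Z = 91.1876


def has_ossf_pair(leptons):
    """True if any two leptons have the same flavor and opposite charge.

    leptons : list of (flavor, charge) tuples, e.g. ("e", +1)
    """
    return any(
        a[0] == b[0] and a[1] * b[1] < 0 for a, b in combinations(leptons, 2)
    )


def wh3l_category(leptons):
    """Assign the event to the OSSF category if it has an OSSF pair, else SSSF."""
    return "OSSF" if has_ossf_pair(leptons) else "SSSF"


def passes_ossf_sr(pair_masses, pt_miss):
    """Z boson veto on all lepton pairs plus the pT^miss requirement.

    pair_masses : invariant masses (GeV) of the lepton pairs in the event
    pt_miss     : missing transverse momentum in GeV
    """
    z_veto = all(abs(m - M_Z) > 20.0 for m in pair_masses)
    return z_veto and pt_miss > 40.0
```

In this sketch an e\(^+\)e\(^-\)\(\upmu ^+\) event falls in the OSSF category, while an e\(^+\)e\(^+\)\(\upmu ^-\) event falls in the SSSF category.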

The main backgrounds in the \(\text {W} \text {H} \) 3\(\ell \) category are \(\text {W} \text {Z} \), \(\text {Z} \text {Z} \), \({\text {V}} {\upgamma } \), and \({\text {V}} {\upgamma } ^{*}\) production, as well as backgrounds containing nonprompt leptons. Nonprompt backgrounds are estimated from data, as described in Sect. 9. The remaining backgrounds are estimated from simulated samples. The \(\text {W} \text {Z} \) and Z\(\upgamma \) backgrounds are normalized using dedicated CRs, matching the OSSF SR with the exception of an inverted Z boson veto, a differing \(p_{\textrm{T}} ^\text {miss}\) requirement, and an additional selection on the invariant mass of the full lepton system (\(m_{3\ell }\)). A summary of the event selection and categorization is given in Table 8.

To discriminate between signal and background, two BDTs, trained separately for the OSSF and SSSF categories, are used. The BDTs are built using the TMVA [70] package and trained on events passing the OSSF and SSSF SR selections without the \(|m_{\ell \ell }- m_\text {Z} |\) requirement. The number of input variables used in the BDT training is 19 and 15 in the OSSF and SSSF regions, respectively. They include kinematic information on the leptons, \({\vec {p}}_{\textrm{T}}^{\hspace{1.0pt}\text {miss}}\), b tagging scores for the leading jets, and various invariant masses built from leptons and \({\vec {p}}_{\textrm{T}}^{\hspace{1.0pt}\text {miss}}\), with the minimum invariant mass and \(\varDelta R\) separation of the opposite sign lepton pairs giving the most discrimination power. To extract the Higgs boson production cross section, a binned fit is performed to the BDT score. Figure 13 shows the BDT discriminant distributions after the fit to the data.

Fig. 11

Distribution of the \(C_{\textrm{VBF}}\) classifier in the VBF DF SR, before the further event categorization based on the classifier outputs. A detailed description is given in the Fig. 1 caption

7.3 ZH 3\(\ell \) categories

The \(\text {Z} \text {H} \) 3\(\ell \) category targets the \(\text {Z} \text {H} \rightarrow 3\ell {\upnu } {\text {q}} {\text {q}} \) decay. The final state therefore contains three leptons with total charge ±1. The invariant mass of any dilepton pair is required to be greater than 12\(\,\text {Ge\hspace{-.08em}V}\) to reject low-mass resonances. The event must contain an OSSF lepton pair with invariant mass \(|m_{\ell \ell }- m_\text {Z} | < 25\,\text {Ge\hspace{-.08em}V} \). Events are rejected if any b-tagged jet with \(p_{\textrm{T}} > 20\,\text {Ge\hspace{-.08em}V} \) passing the medium WP of the tagging algorithm is found.

Events are categorized based on the number of jets. Events in the 1-jet category contain exactly one jet with \(p_{\textrm{T}} > 30\,\text {Ge\hspace{-.08em}V} \) and \(|\eta | < 4.7\), while events in the 2-jet category contain at least two jets passing these requirements. Signal region events must also have an azimuthal separation between the two W bosons (\(\varDelta \phi (\ell p_{\textrm{T}} ^\text {miss}, j(j))\)), represented by the \(\ell \)+\(p_{\textrm{T}} ^\text {miss}\) and (di)jet systems respectively, below \(\pi /2\), and pass a Z boson internal conversion veto \(|m_{3\ell } - m_{\text {Z}} | > 20\,\text {Ge\hspace{-.08em}V} \).

Table 6 Selection used in the same-flavor VBF categories. The DYMVA threshold is optimized separately in each subcategory and data set
Table 7 Event selection and categorization in the \(\text {W} \text {H} \) SS category

The main backgrounds in the \(\text {Z} \text {H} \) 3\(\ell \) analysis are \(\text {W} \text {Z} \), \(\text {Z} \text {Z} \), and Z+jets events. The Z\(\upgamma \)/\({\upgamma } ^{*}\), \({\text {V}} {\text {V}} {\text {V}} \), and \({{\text {t}} {}{\bar{\text {t}}}}\)+jets processes also contribute. The Z+jets events pass the selection when a nonprompt lepton passes the lepton selection. This background is estimated from data as described in Sect. 9. The remaining backgrounds are modeled using MC simulation. The \(\text {W} \text {Z} \) normalization as a function of the number of jets is extracted from dedicated CRs, which are categorized by the number of jets in the same way as the SRs. The \(\text {W} \text {Z} \) CRs are also used to constrain the \(\text {W} \text {Z} \) background in the \(\text {W} \text {H} \) SS category. A summary of the event selection and categorization is shown in Table 9.

To extract the Higgs boson production cross section, a binned fit is performed to the \(m_{\textrm{T}} ^{\text {H}} = m_{\textrm{T}} (\ell p_{\textrm{T}} ^\text {miss}, j(j))\) variable, defined in Eq. (1). Figure 14 shows the \(m_{\textrm{T}} ^{\text {H}}\) distributions after the fit to the data.

Fig. 12

Observed distributions of the \(\widetilde{m}_{\text {H}}\) fit variable in the \(\text {W} \text {H} \) SS 1-jet e\(\upmu \) (upper left), 2-jet e\(\upmu \) (upper right), 1-jet \(\upmu \) \(\upmu \) (lower left), and 2-jet \(\upmu \) \(\upmu \) (lower right) SRs. A detailed description is given in the Fig. 1 caption

7.4 ZH 4\(\ell \) categories

The \(\text {Z} \text {H} \) 4\(\ell \) category targets the \(\text {Z} \text {H} \rightarrow 4\ell 2{\upnu } \) decay. The final state therefore contains four leptons and \(p_{\textrm{T}} ^\text {miss}\). The analysis selects events containing four leptons with \(p_{\textrm{T}} > 25\), 15, 10, and 10\(\,\text {Ge\hspace{-.08em}V}\), respectively, and null total charge (\(\text {Q}_{4\ell }\)). The invariant mass of any dilepton pair is required to be greater than 12\(\,\text {Ge\hspace{-.08em}V}\) to reject low-mass resonances. The opposite-sign SF lepton pair with \(m_{\ell \ell }\) closest to \(m_\text {Z} \) is designated as the Z boson candidate, while the remaining lepton pair is referred to as the X candidate. The Z boson candidate mass is required to be within 15\(\,\text {Ge\hspace{-.08em}V}\) of \(m_\text {Z} \). Events are rejected if they contain any b-tagged jet with \(p_{\textrm{T}} > 20\,\text {Ge\hspace{-.08em}V} \).
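The assignment of the Z boson and X candidates can be sketched as a search over all opposite-sign SF pairs. The event representation (flavor, charge, Cartesian four-momentum), the function names, and the numerical value of \(m_\text {Z} \) are assumptions of this illustration.

```python
import math
from itertools import combinations

# Nominal Z boson mass in GeV; the value is an assumption of this sketch.
M_Z = 91.1876


def pair_mass(p, q):
    """Invariant mass of two four-momenta given as (E, px, py, pz) in GeV."""
    e, px, py, pz = (a + b for a, b in zip(p, q))
    return math.sqrt(max(e * e - px * px - py * py - pz * pz, 0.0))


def select_z_and_x(leptons):
    """Pick the opposite-sign SF pair with mass closest to m_Z as Z candidate.

    leptons : list of (flavor, charge, four_momentum) tuples
    Returns ((i, j), x_indices, m_z_candidate), or None if no OSSF pair exists.
    The remaining leptons form the X candidate.
    """
    best = None
    for i, j in combinations(range(len(leptons)), 2):
        flavor_i, charge_i, p_i = leptons[i]
        flavor_j, charge_j, p_j = leptons[j]
        if flavor_i != flavor_j or charge_i * charge_j >= 0:
            continue  # not an opposite-sign same-flavor pair
        mass = pair_mass(p_i, p_j)
        if best is None or abs(mass - M_Z) < abs(best[2] - M_Z):
            best = (i, j, mass)
    if best is None:
        return None
    x_indices = tuple(k for k in range(len(leptons)) if k not in best[:2])
    return (best[0], best[1]), x_indices, best[2]
```

The requirement that the Z boson candidate mass lie within 15 GeV of \(m_\text {Z} \) would then be applied to the returned mass.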

Events are categorized based on the flavor of the lepton pair forming the X candidate. Events in the XSF category have an SF X lepton pair, while events in the XDF category have a DF X lepton pair. In the XSF category, events are required to satisfy \(m_{4\ell } > 140\,\text {Ge\hspace{-.08em}V} \), \(10< m_{\ell \ell }^{\text {X}} < 60\,\text {Ge\hspace{-.08em}V} \), and \(p_{\textrm{T}} ^\text {miss} > 35\,\text {Ge\hspace{-.08em}V} \). Events in the XDF category must have \(10< m_{\ell \ell }^{\text {X}} < 70\,\text {Ge\hspace{-.08em}V} \) and \(p_{\textrm{T}} ^\text {miss} > 20\,\text {Ge\hspace{-.08em}V} \).

Production of \(\text {Z} \text {Z} \) pairs is the main background in this category. Additional contributions arise from \({{\text {t}} {}{\bar{\text {t}}}}\)Z, \({\text {V}} {\text {V}} {\text {V}} \), and V\(\upgamma \) processes. These backgrounds are all modeled with MC simulation. The \(\text {Z} \text {Z} \) normalization is extracted from data in a dedicated CR defined by the requirements \(75< m_{\ell \ell }^{\text {X}} < 105\,\text {Ge\hspace{-.08em}V} \) and \(p_{\textrm{T}} ^\text {miss} < 35\,\text {Ge\hspace{-.08em}V} \). The event selection and categorization in the \(\text {Z} \text {H} \) 4\(\ell \) category is summarized in Table 10.

Table 8 Event selection and categorization in the \(\text {W} \text {H} \) 3\(\ell \) category
Fig. 13

Observed distributions of the BDT score in the \(\text {W} \text {H} \) 3\(\ell \) OSSF (left) and SSSF (right) SRs. A detailed description is given in the Fig. 1 caption

Fig. 14

Observed distributions of the \(m_{\textrm{T}} ^{\text {H}}\) fit variable in the \(\text {Z} \text {H} \) 3\(\ell \) 1-jet (left) and 2-jet (right) SRs. A detailed description is given in the Fig. 1 caption

Table 9 Event selection and categorization in the \(\text {Z} \text {H} \) 3\(\ell \) category
Table 10 Event selection and categorization in the \(\text {Z} \text {H} \) 4\(\ell \) category
Fig. 15

Observed distributions of the BDT score in the \(\text {Z} \text {H} \) 4\(\ell \) XDF (left) and XSF (right) SRs. A detailed description is given in the Fig. 1 caption

A BDT approach is used to discriminate between signal and background. The BDT is trained on events passing the global selection, with \(p_{\textrm{T}} ^\text {miss} > 20\,\text {Ge\hspace{-.08em}V} \) and \(10< m_{\ell \ell }^{\text {X}} < 70\,\text {Ge\hspace{-.08em}V} \). The number of inputs used in the BDT is eight, and these include separation in the \(\eta \)-\(\phi \) plane between the leptons in each dilepton pair, transverse masses of combinations of leptons and \({\vec {p}}_{\textrm{T}}^{\hspace{1.0pt}\text {miss}}\), as well as \(p_{\textrm{T}} ^\text {miss}\) itself. The kinematic variables of the X candidate give the most discriminating power, along with \(p_{\textrm{T}} ^\text {miss}\). To extract the Higgs boson cross section, a binned fit is performed on the BDT score. Figure 15 shows the BDT score distributions after the fit to the data.

7.5 Different-flavor V H2j categories

This category targets V H events in which the vector boson decays into two resolved jets and the Higgs boson decays to an e\(\upmu \) pair and neutrinos. The final state, and therefore the selection, is analogous to that of the \(\text {g} \text {g} \text {H} \) DF 2-jet category, with the added requirement that the dijet invariant mass be close to that of the W and Z bosons.

The main backgrounds in this category are top quark and nonresonant \(\text {W} \text {W} \) pair production, as well as \({\uptau } {\uptau } \) pair production. The top quark and \({\uptau } {\uptau } \) backgrounds are normalized to the data in dedicated CRs. The full selection is summarized in Table 11. The VH production is found to contribute about 30% of the total signal in the V H2j DF SR.

The signal extraction fit is performed on a binned template shape of \(m_{\ell \ell }\), which has a different profile for the signal and the nonresonant \(\text {W} \text {W} \) background. The distribution of \(m_{\ell \ell }\) after the fit to the data is shown in Fig. 16.

7.6 Same-flavor V H2j categories

This category targets V H events in which the vector boson decays into two jets and the Higgs boson decays to either an ee or a \(\upmu \) \(\upmu \) pair and neutrinos. The selection is identical to the 2-jet \(\text {g} \text {g} \text {H} \) SF categories described in Sect. 5.2 and Table 4, with the following modifications: the additional requirement \(65< m_{\text {jj}} < 105\,\text {Ge\hspace{-.08em}V} \) is imposed, the \(m_{\ell \ell }\) threshold is moved to 70\(\,\text {Ge\hspace{-.08em}V}\), a selection on \(m_{\textrm{T}} ^{\text {H}} < 150\,\text {Ge\hspace{-.08em}V} \) is added, and the angle between the two leptons in the transverse plane (\(\varDelta \phi _{\ell \ell } \)) is required to be less than 1.6. The threshold on the DYMVA is tuned to achieve the highest signal-to-background ratio. The signal is extracted via a simultaneous fit to the number of events in each category.

Table 11 Summary of the selection applied to different-flavor V H2j categories

8 The STXS measurement

Together with inclusive production cross sections, differential cross section measurements are also presented. These are performed within the STXS framework, using Stage 1.2 definitions [55]. In the STXS framework, the cross sections of different Higgs boson production mechanisms are measured in mutually exclusive regions of generator-level phase space, referred to as STXS bins, designed to enhance sensitivity to possible deviations from the SM. The full set of Stage 1.2 STXS bins is given in Fig. 17. The selections used in the STXS measurement match the ones described in the previous section, and the measurement is carried out by defining a set of analysis categories that target each STXS bin, as summarized in Fig. 18. The same CR setup as described in the previous section is maintained, and each CR is then subdivided to match the STXS categorization shown in Fig. 18. In all cases, the number of events is used as a fit variable in CRs. Results are then unfolded to the generator level, with the contribution from each STXS bin to each analysis category estimated from MC simulation, as shown in Fig. 19. Given the statistical power of the present data set, sensitivity to some of the Stage 1.2 bins is limited. Some bins are therefore measured together, by fixing the corresponding cross section ratios to the value predicted by the SM. We refer to this procedure as bin merging. Some STXS bins have been excluded, given the very low sensitivity. Groups of STXS bins merged with this procedure are highlighted in Fig. 17.

In the DF \(\text {g} \text {g} \text {H} \) and VBF categories, the discriminants of the same DNN explained in Sect. 6 are used for the categories which are common between VBF and \(\text {g} \text {g} \text {H} \) (\(m_{\text {jj}} >350\,\text {Ge\hspace{-.08em}V} \) and \(p_{\textrm{T}} ^{\text {H}} <200\,\text {Ge\hspace{-.08em}V} \)), and in the category exclusive to the VBF (\(m_{\text {jj}} >350\,\text {Ge\hspace{-.08em}V} \) and \(p_{\textrm{T}} ^{\text {H}} >200\,\text {Ge\hspace{-.08em}V} \)). The signal extraction fit is performed on the two-dimensional (\(m_{\ell \ell }\), \(m_{\text {jj}}\)) template in the VH2j DF category (\(60<m_{\text {jj}} <120\,\text {Ge\hspace{-.08em}V} \)), while either \(m_{\ell \ell }\) or (\(m_{\ell \ell }\), \(m_{\textrm{T}} ^{\text {H}}\)) templates are used in the remaining DF categories, depending on the number of expected events in each. In the same-flavor categories a similar approach is followed, but only the number of events is used for the fit.

In the V H categories with a leptonic decay of the V boson, to extract the cross section as a function of the vector boson \(p_{\textrm{T}}\), events are categorized into corresponding regions of reconstructed vector boson \(p_{\textrm{T}}\). The reconstructed vector boson \(p_{\textrm{T}}\) is defined differently depending on the vector boson type and decay channel. Because in the \(\text {W} \text {H} \) SS and \(\text {W} \text {H} \) 3\(\ell \) categories the W boson \(p_{\textrm{T}}\) (\({\vec p}_{\textrm{T}} ^{\text {W}}\)) cannot be fully reconstructed due to the unobserved neutrino, proxies are defined in both cases. In the \(\text {W} \text {H} \) SS category, the four-momenta of the lepton and neutrino from the associated W boson decay can be designated \(\ell _{\text {W}}\) and \({\upnu } _{\text {W}}\), while the four-momenta of the lepton and neutrino from the Higgs boson decay can be designated \(\ell _{\text {H}}\) and \({\upnu } _{\text {H}}\). The lepton from the W boson decay is identified as the one with the largest azimuthal separation from the jet or dijet. The transverse momentum of the W boson is defined as \(\vec {\ell }_{\text {W},\textrm{T}}+\vec {{\upnu }}_{\text {W},\textrm{T}}\), where \(\vec {{\upnu }}_{\text {W},\textrm{T}}\) is defined as:

$$\begin{aligned} \vec {{\upnu }}_{\text {W},\textrm{T}} = {\vec {p}}_{\textrm{T}}^{\hspace{1.0pt}\text {miss}}- \vec {{\upnu }}_{\text {H},\textrm{T}} = {\vec {p}}_{\textrm{T}}^{\hspace{1.0pt}\text {miss}}- \vec {\ell }_{\text {H},\textrm{T}} \left( \frac{125\,\text {Ge\hspace{-.08em}V}}{ |\vec {\ell }_{\text {H}} + \vec {jj} | } - 1 \right) \end{aligned}$$
(3)

for events with two jets, or \({\vec {p}}_{\textrm{T}}^{\hspace{1.0pt}\text {miss}}- \vec {\ell }_{\text {H},\textrm{T}}\) for events with fewer than two jets. Here \(\vec {jj}\) indicates the dijet momentum. In the \(\text {W} \text {H} \) 3\(\ell \) category, \({\vec p}_{\textrm{T}} ^{\text {W}}\) is difficult to resolve given the ambiguities from the three neutrinos in the final state. Instead, \(p_{\textrm{T}} (\ell _{\text {W}})\) is used as a proxy for the W boson \(p_{\textrm{T}}\) in the \(\text {W} \text {H} \) 3\(\ell \) category. Here, \(\ell _{\text {W}}\) is defined as the lepton pointing away from the opposite-sign dilepton pair with smallest angular separation \(\varDelta R\). In the \(\text {Z} \text {H} \) 3\(\ell \) and \(\text {Z} \text {H} \) 4\(\ell \) categories, the reconstructed Z boson \(p_{\textrm{T}}\) (\({\vec p}_{\textrm{T}} ^{\text {Z}}\)) is defined as the \(p_{\textrm{T}}\) of the OSSF dilepton pair with \(m_{\ell \ell }\) closest to \(m_\text {Z} \). The variables used in the fit are the same as described in Sect. 7.
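The \(\text {W} \text {H} \) SS proxy of Eq. (3) can be written out as a short sketch. Transverse vectors are represented as \((p_x, p_y)\) tuples, and the quantity \(|\vec {\ell }_{\text {H}} + \vec {jj}|\) is passed in as a precomputed number because its exact definition is not spelled out beyond Eq. (3); the function name and argument layout are hypothetical.

```python
import math


def w_pt_proxy(lep_w_t, lep_h_t, pt_miss_vec, norm_lh_jj=None):
    """Reconstructed W boson pT proxy in the WH SS category, following Eq. (3).

    lep_w_t     : transverse momentum (px, py) of the lepton assigned to the W
    lep_h_t     : transverse momentum (px, py) of the lepton assigned to the H
    pt_miss_vec : missing transverse momentum vector (px, py)
    norm_lh_jj  : value of |l_H + jj| in GeV for events with two jets;
                  None for events with fewer than two jets
    """
    if norm_lh_jj is None:
        # Fewer than two jets: the Higgs-side neutrino is approximated by l_H.
        nu_h_t = lep_h_t
    else:
        # Two jets: rescale l_H so that the Higgs system is tuned to 125 GeV.
        k = 125.0 / norm_lh_jj - 1.0
        nu_h_t = (k * lep_h_t[0], k * lep_h_t[1])
    # Neutrino from the W decay: missing momentum minus the Higgs-side neutrino.
    nu_w_t = (pt_miss_vec[0] - nu_h_t[0], pt_miss_vec[1] - nu_h_t[1])
    # W boson transverse momentum: lepton plus neutrino from the W decay.
    return math.hypot(lep_w_t[0] + nu_w_t[0], lep_w_t[1] + nu_w_t[1])
```

The selection of \(\ell _{\text {W}}\) as the lepton with the largest azimuthal separation from the jet or dijet is assumed to have been done before calling this function.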

A summary of the expected signal fraction of the considered STXS signal processes in each category is shown in Fig. 20, together with the total number of expected \(\text {H} \rightarrow \text {W} \text {W} \) signal events.

Fig. 16

Observed distribution of the \(m_{\ell \ell }\) fit variable in the V H2j DF SR. A detailed description is given in the Fig. 1 caption

Fig. 17

The STXS Stage 1.2 binning scheme. Each rectangle corresponds to one of the STXS Stage 1.2 bins. Dashed lines indicate a possible finer splitting of some of the bins (not used in this analysis). Bins fused together with solid colors are merged in the analysis, i.e., they are measured as a single bin. Crossed-out bins are not measured

Fig. 18

Analysis categories for the STXS measurement. The baseline \(\text {g} \text {g} \text {H} \), VBF, and VH selections are identical to those described in Sects. 5–7. All dimensional quantities are measured in \(\text {Ge\hspace{-.08em}V}\)

Fig. 19

Expected signal composition in each STXS bin. Generator-level bins are reported in the horizontal axis, and the corresponding analysis categories on the vertical axis. All quantities in the definitions of bins are measured in \(\text {Ge\hspace{-.08em}V}\)

Fig. 20

Expected relative fractions of different STXS signal processes in each category. The total number of expected \(\text {H} \rightarrow \text {W} \text {W} \) signal events in each category is also shown. All dimensional quantities in the definitions of bins are measured in \(\text {Ge\hspace{-.08em}V}\)

9 Background estimation

9.1 Nonprompt lepton background

The nonprompt lepton backgrounds originating from leptonic decays of heavy quarks, hadrons misidentified as leptons, and electrons from photon conversions are suppressed by the identification and isolation requirements imposed on electrons and muons. The nonprompt lepton background in the two-lepton final state primarily originates from W+jets events, while the nonprompt lepton background in the three-lepton final state primarily comes from Z+jets events. Top quark production with a jet misidentified as a lepton also contributes to the three-lepton final state. The nonprompt lepton background gives a negligible contribution in the four-lepton final state. This background is estimated from data, as described in detail in Ref. [7]. The rate at which a nonprompt lepton passing a loose selection further passes a tight selection (misidentification rate) is measured in a data sample enriched in events composed uniquely of jets produced through the strong interaction, referred to as QCD multijet events. The corresponding rate for a prompt lepton to pass this selection (prompt rate) is measured using a tag-and-probe method [71] in a data sample enriched in DY events. The misidentification and prompt rates are used to construct a relation between the number of leptons passing the loose selection, the number of leptons passing the tight selection, and the number of true prompt leptons in an event. This relation is applied as a transfer function to a data sample containing leptons passing the loose selection, weighting the events by the probability for N leptons to pass the tight selection while fewer than N leptons are truly prompt. The nonprompt background with two leptons is validated with data in a CR enriched with W+jets events, in which a pair of same-sign leptons is required, while the nonprompt background with three leptons is validated in a CR enriched with top quark events or DY events. 
The systematic uncertainty in the misidentification rate determination, which arises mainly from the different jet flavor composition between the QCD multijet sample and the analysis phase space, is estimated with a twofold approach. First, a validation check in the aforementioned CRs yields a normalization uncertainty of about 30% that fully covers any differences with respect to data in all the kinematic distributions of interest in this analysis. Second, a shape uncertainty is estimated by varying the jet \(p_{\textrm{T}}\) threshold used in the calculation of the misidentification rate in the 15–25\(\,\text {Ge\hspace{-.08em}V}\) range, in bins of the lepton \(\eta \) and \(p_{\textrm{T}}\). For each threshold variation, the misidentification rate is recomputed and the difference with respect to the nominal rate is taken as a systematic uncertainty.
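For a single loose lepton, the relation between the loose, tight, and prompt counts can be inverted in closed form. The sketch below is a simplified one-lepton illustration under that assumption, not the multilepton implementation of Ref. [7]; the function names and any numerical rates passed to them are hypothetical.

```python
def nonprompt_weight(passes_tight, f, p):
    """Per-lepton weight for the nonprompt yield in the tight selection.

    f: misidentification rate, p: prompt rate (both loose -> tight), f < p.
    Derived from  N_loose = N_p + N_f  and  N_tight = p*N_p + f*N_f,
    so that the sum of weights over loose leptons equals f*N_f.
    """
    if not 0.0 <= f < p <= 1.0:
        raise ValueError("require 0 <= f < p <= 1")
    if passes_tight:
        return -f * (1.0 - p) / (p - f)
    return f * p / (p - f)

def nonprompt_yield(n_loose, n_tight, f, p):
    """Sum of weights over n_loose loose leptons, n_tight of which are tight."""
    n_fail = n_loose - n_tight
    return (n_tight * nonprompt_weight(True, f, p)
            + n_fail * nonprompt_weight(False, f, p))
```

Leptons failing the tight selection receive a positive weight, while those passing it receive a small negative weight that removes the prompt contamination.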

9.2 Top quark background

The background contributions from top quark processes are estimated using a combination of MC simulations and dedicated regions in data. A reweighting of the top quark and antiquark \(p_{\textrm{T}}\) spectra at parton level is performed for the \({{\text {t}} {}{\bar{\text {t}}}}\) simulation in order to match the NNLO and next-to-next-to-leading logarithmic (NNLL) QCD predictions, including also the NLO EW contribution [72]. A shape uncertainty based on renormalization (\(\mu _\text {R}\)) and factorization (\(\mu _\text {F}\)) scale variations is taken into account. For the \(\text {g} \text {g} \text {H} \), VBF, and V H2j categories, in which the contribution of top quark backgrounds is dominant, the normalization of the simulated templates is left unconstrained in the fit separately for 0-, 1-, 2-jet \(\text {g} \text {g} \text {H} \), V H, and VBF categories. The normalizations in these phase spaces are therefore measured from the data, by constraining the free-floating normalization parameters in top quark enriched CRs.

9.3 Nonresonant \(\text {W} \text {W} \) background

The nonresonant \(\text {W} \text {W} \) background is estimated using a combination of MC simulations and dedicated regions in data, and the quark-induced \(\text {W} \text {W} \) simulated events are reweighted to match the diboson \(p_{\textrm{T}}\) spectrum computed at NNLO+NNLL QCD accuracy [73, 74]. The shape uncertainties related to the missing higher-order corrections are estimated by varying the \(\mu _\text {R}\) and \(\mu _\text {F}\) scales, as well as considering the independent variation of the resummation scale from its nominal value, taken as the mass of the W boson. For the \(\text {g} \text {g} \text {H} \), VBF, and V H2j categories, the normalizations of the quark-induced and gluon-induced \(\text {W} \text {W} \) backgrounds are left unconstrained in the fit (the ratio between the two is kept fixed within the uncertainty), keeping a different parameter for each signal phase space as done for the top quark background. In the DF final states the normalization parameters are constrained directly in the SRs without the need of defining CRs, as the SRs span the high-\(m_{\ell \ell }\) phase space enriched in \(\text {W} \text {W} \) events with a negligible Higgs boson signal contribution. Since in SF final states a counting analysis is performed, dedicated CRs enriched in \(\text {W} \text {W} \) events are defined selecting events with high \(m_{\ell \ell }\). The normalizations of the EW and QCD \(\text {W} \text {W} \)+2 jets backgrounds are instead fixed to the respective SM cross sections provided by the MC simulation, taking into account the theoretical uncertainties arising from the variation of the \(\mu _\text {R}\) and \(\mu _\text {F}\) scales.

9.4 Drell–Yan background

The backgrounds arising from DY+jets processes are estimated using a different approach depending on the signal category.

In the \(\text {g} \text {g} \text {H} \), VBF, and V H2j DF categories, the only source of DY background arises from \({\uptau } {\uptau } \) production with subsequent leptonic decays of the \(\uptau \) leptons. This background process is estimated with a data-embedding technique [75], in which \({{\upmu }}^{+} {{\upmu }}^{-} \) events with well-identified muons are selected in a data sample. In each event, the selected muons are removed and replaced with simulated \(\uptau \) leptons, keeping the same four-momentum of the initial muons. The embedded sample is then corrected using scale factors related to the simulation of \(\uptau \) leptons. The usage of the embedded sample allows for a better modeling of the observables that are sensitive to the detector response and calibration, such as \({\vec {p}}_{\textrm{T}}^{\hspace{1.0pt}\text {miss}}\) and other variables related to the hadronic activity in the event. Since the embedded sample takes into account all processes with a \({\uptau } {\uptau } \) pair decaying to either electrons or muons, simulated \({{\text {t}} {}{\bar{\text {t}}}}\), single top, and diboson background events that contain a \({\uptau } {\uptau } \) pair are not considered in the analysis to avoid any double counting. To correct for any additional discrepancy associated with the different acceptance of the \(\text {H} \rightarrow \text {W} \text {W} \) signal phase space, the normalization of the embedded samples is left unconstrained in the fit as done for top quark and \(\text {W} \text {W} \) backgrounds. An orthogonal \({\uptau } {\uptau } \) enriched CR is defined for the 0-, 1-, 2-jet \(\text {g} \text {g} \text {H} \) -like, 2-jet VH-like, and 2-jet VBF-like phase spaces to help in constraining the free normalization parameters. The embedded samples cover the events that pass the e \(\upmu \) triggers, which represent the vast majority of the events selected in the DF final state. 
The contribution of the remaining \({\uptau } {\uptau } \) events that enter the analysis phase space thanks to the single-lepton triggers (\(\approx \)5% of the total) is estimated using MC simulation.

In the \(\text {g} \text {g} \text {H} \), VBF, and V H2j SF categories, the dominant background contribution arises from DY production of \(\ell \ell \) pairs and is estimated using a data-driven technique described in Ref. [7]. The \(\ell \ell \) background contribution for events with \(|m_{\ell \ell }-m_\text {Z} |>7.5\,\text {Ge\hspace{-.08em}V} \) is estimated by counting the number of events in data passing a selection with an inverted \(m_{\ell \ell }\) requirement (i.e., under the Z boson mass peak), subtracting the non-Z-boson contribution from it, and scaling the obtained yield by the ratio of the numbers of events outside and inside the Z boson mass region in MC simulation. The contribution of processes such as top quark and \(\text {W} \text {W} \) production in the Z boson mass peak region, which have the same probability to decay into the e e, e \(\upmu \), \(\upmu \) e, and \(\upmu \) \(\upmu \) final states, is estimated by counting the number of \(\text {e} ^\pm {\upmu } ^{\mp }\) events in data, and applying a correction factor that accounts for the differences in the detection efficiency between electrons and muons. Other minor processes in the Z boson mass peak region (mainly \(\text {Z} \text {Z} \) and \(\text {W} \text {Z} \)) are subtracted based on MC simulations. The yield obtained with this approach outside the Z boson mass peak is further corrected with a scale factor that takes into account the different acceptances of the estimation region and the SRs. The method is validated in orthogonal CRs enriched in DY events with a negligible signal contribution. The residual mismodeling between data and the estimated DY contribution arising from this validation is taken into account as a systematic uncertainty. The same procedure is repeated separately for estimating and validating the DY contribution in the \({\text {e}}^{+} {\text {e}}^{-} \) and \({{\upmu }}^{+} {{\upmu }}^{-} \) final states.
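The arithmetic of this out/in extrapolation can be sketched as follows. This is a minimal illustration of the steps listed above for one SF channel; the function name, the argument names, and in particular the single factor `k_eff` used to convert the e\(\upmu \) count are assumptions made for the example and are not taken from Ref. [7].

```python
def dy_yield_outside_z(n_in_data, n_emu_in_data, k_eff, n_in_minor_mc,
                       n_out_mc, n_in_mc, acceptance_sf=1.0):
    """Estimate the DY yield outside the Z mass window for one SF channel.

    n_in_data:      data events inside the Z window (inverted m_ll cut)
    n_emu_in_data:  e-mu data events inside the Z window
    k_eff:          hypothetical factor converting the e-mu count into the
                    flavor-symmetric (top, WW) yield in this SF channel
    n_in_minor_mc:  simulated ZZ and WZ yield inside the Z window
    n_out_mc, n_in_mc: simulated DY yields outside and inside the window
    acceptance_sf:  correction between the estimation region and the SR
    """
    if n_in_mc <= 0:
        raise ValueError("simulated DY yield inside the Z window must be > 0")
    r_out_in = n_out_mc / n_in_mc
    n_in_dy = n_in_data - k_eff * n_emu_in_data - n_in_minor_mc
    return acceptance_sf * r_out_in * n_in_dy
```

For example, with 10000 data events in the Z window, 400 e\(\upmu \) events scaled by `k_eff = 0.5`, 300 simulated diboson events, and a simulated out/in ratio of 0.05, the estimate is \(0.05 \times 9500 = 475\) events.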

In the leptonic V H categories DY represents a minor background and is estimated using MC simulations.

9.5 Multiboson background

In categories with two charged leptons, the production of \(\text {W} \text {Z} \) and \(\text {W} {\upgamma } ^{*}\) contributes to the SRs whenever one of the three leptons is not identified. This background contribution is simulated as described in Sect. 3, and a data-to-simulation scale factor is derived in a three-lepton CR, orthogonal to the three-lepton SRs, as described in Ref. [7]. A normalization uncertainty of about 25% is associated with the scale factor determination. A different CR containing events with one pair of same-sign muons is also used as an additional validation of the \(\text {W} {\upgamma } ^{*}\) simulation. The \(\text {W} {\upgamma } \) process may also contribute as a background in two-lepton SRs due to photon conversions in the detector material when one of the three leptons is not identified. This process is estimated using MC simulation and validated using data in a two-lepton CR requiring events with a leading \(\upmu \) and a trailing e with the same sign and a separation in \(\varDelta R\) smaller than 0.5. These requirements mainly select events arising from \(\text {W} {\upgamma } \) production where the W boson decays to \(\upmu \) \({\upnu } _{{\upmu }}\) and the photon is produced as final-state radiation from the muon. The theoretical uncertainties in \(\text {W} {\upgamma } \) and \(\text {W} {\upgamma } ^{*}\) processes estimated using \(\mu _\text {R}\) and \(\mu _\text {F}\) scale variations are taken into account.

The \(\text {W} \text {Z} \) process represents one of the main backgrounds in the leptonic VH categories and its normalization is left as a free parameter in the fit, separately for different jet multiplicity categories. Dedicated 0-, 1- and 2-jet CRs are included in the fit to help constrain the \(\text {W} \text {Z} \) normalization parameters.

The production of a Z boson pair is the main background in the \(\text {Z} \text {H} \) 4\(\ell \) category and is estimated using MC simulation. The normalization of this background is left free to float and constrained using data in a \(\text {Z} \text {Z} \)-enriched CR.

Triple vector boson production is a minor background in all the considered categories and is estimated using MC simulation.

10 Statistical procedure and systematic uncertainties

The statistical approach used to interpret the selected data sets for this analysis and to combine the results from the independent categories has been developed by the ATLAS and CMS Collaborations in the context of the LHC Higgs Combination Group [76]. All selections have been optimized entirely on MC simulation and have been frozen before comparing the templates to data, in order to minimize possible biases. In all the categories considered, the signal extraction is performed using binned templates based on variables that allow for a good discrimination between signal and background, as summarized in Table 12. Therefore, the effect of each source of systematic uncertainty is either a change of the normalization of a given signal or background process, or a change of its template shape. The signal extraction is performed by a binned maximum likelihood fit, and each such change is modeled as a constrained nuisance parameter distributed according to a log-normal probability distribution function with standard deviation set to the size of the corresponding change. Where the change in shape of a template caused by a nuisance parameter is found to be negligible (i.e., its effect on the expected uncertainty on signal strength modifiers is well below 1%), only its effect on the normalization is considered.
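A log-normal constraint on a normalization is commonly implemented by scaling the yield with \(\kappa ^{\theta }\), where \(\theta \) is a unit-Gaussian nuisance parameter and \(\kappa \) is one plus the relative uncertainty. The sketch below illustrates this for a single-bin counting experiment. The parametrization and all names are assumptions for illustration and are not taken from the analysis code.

```python
import math

def scaled_yield(nominal, rel_unc, theta):
    """Yield with a log-normal normalization nuisance: nominal * kappa**theta."""
    kappa = 1.0 + rel_unc
    return nominal * kappa ** theta

def nll(n_obs, mu, signal, background, rel_unc, theta):
    """Negative log-likelihood for one bin (constant terms dropped).

    Poisson term for the observed count plus a unit-Gaussian constraint on
    theta, which makes the background normalization log-normally distributed.
    """
    expected = mu * signal + scaled_yield(background, rel_unc, theta)
    poisson = expected - n_obs * math.log(expected)
    constraint = 0.5 * theta * theta
    return poisson + constraint
```

Because the yield is multiplied by \(\kappa ^{\theta }\), it stays positive for any value of \(\theta \), which is the usual motivation for preferring a log-normal over a Gaussian constraint on normalizations.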

Table 12 Overview of the fit variables and CRs used in each analysis category. In all CRs, the number of events is used. The number of subcategories shown in the last column includes both SRs and CRs

The systematic uncertainties in this analysis arise either from an experimental or a theoretical source. The experimental uncertainties in the signal and background processes, as well as the theoretical uncertainties in the background processes, are taken into account for all the results discussed in Sect. 11. The treatment of the theoretical uncertainties in the signal processes is instead dependent on the measurement and interpretation being made. As an example, when measuring production cross sections for the STXS measurements, the theoretical uncertainties affecting the signal cross section in a given STXS bin are dropped and only the shape component is kept.

The following experimental uncertainties are included in the signal extraction fit.

  • The integrated luminosities for the 2016, 2017, and 2018 data-taking years have 1.2–2.5% individual uncertainties, while the overall uncertainty for the 2016–2018 period is 1.6% [35,36,37]. This uncertainty is partially correlated among the three data sets, and is applied to all samples that are purely based on simulation.

  • The uncertainties in the trigger efficiency and lepton reconstruction and identification efficiencies are modeled in bins of the lepton \(p_{\textrm{T}}\) and \(\eta \), independently for electrons and muons. These uncertainties cause both a normalization and a shape change of the signal and background templates and are kept uncorrelated among the three data sets. Their effect is \(\approx \)2% for electrons and \(\approx \)1% for muons.

  • The uncertainties in the determination of the lepton momentum scale, jet energy scale, and unclustered energy scale cause the migration of the simulated events inside or outside the analysis acceptance, as well as migrations across the bins of the signal and background templates. The impact of these sources on the template normalizations is 0.6–1.0% for the electron momentum scale, 0.2% for the muon momentum scale, and 1–10% for \({\vec {p}}_{\textrm{T}}^{\hspace{1.0pt}\text {miss}}\). The main contribution to these uncertainties arises from the limited data sample used for their estimation, and they are therefore treated as uncorrelated nuisance parameters among the three years. The jet energy scale uncertainty is modeled by implementing eleven independent nuisance parameters corresponding to different jet energy correction sources, six of which are correlated among the three data sets. Their effects vary in the range of 1–10%, according mainly to the jet multiplicity in the analysis phase space.

  • The uncertainty in the jet energy resolution smearing applied to simulated samples to match the \(p_{\textrm{T}}\) resolution measured in data causes both a normalization and a shape change of the templates. This uncertainty has a minor impact on all the analyzed categories (effect below \(\approx \) 1%) and is uncorrelated among the three data sets.

  • The uncertainty in the pileup jet identification efficiency is modeled in bins of the jet \(p_{\textrm{T}}\) and \(\eta \). It is considered for jets with \(p_{\textrm{T}} <50\,\text {Ge\hspace{-.08em}V} \), since pileup jet identification techniques are only used for low-\(p_{\textrm{T}}\) jets. This uncertainty produces a change in both normalization and shape of the signal and background templates and is kept uncorrelated among the three data sets. The effect of this uncertainty on the measured quantities is found to be below 1%.

  • The uncertainty in the b tagging efficiency is modeled by implementing seventeen nuisance parameters, five of which are related to the theoretical uncertainties involved in the measurements and are therefore correlated among the three data sets. The remaining four parameters per data set, which arise from the statistical accuracy of the efficiency measurement, are kept uncorrelated [31]. These uncertainties have an impact on both the shape of the templates and their normalization for all the simulated samples.

  • The uncertainties in the nonprompt lepton background estimation affect both the normalization and shape of the templates of this process. They arise from the limited size of the data set used for the misidentification rate measurement and the difference in the flavor composition of jets mismeasured as leptons between the measurement region and the signal phase space. Both sources are implemented as uncorrelated nuisance parameters between electrons and muons, given the different mismeasurement probabilities for the two flavors, and are uncorrelated among the three data sets. Their effects vary from a few percent to \(\approx \) 10% depending on the SR. A further normalization uncertainty of 30% is assigned to cover any additional mismodeling of the jet flavor composition using data in control samples, as described in Sect. 9. The latter uncertainty is correlated among the data sets, but uncorrelated among SRs containing different lepton flavor combinations, for which the main mechanism of nonprompt lepton production arises from different processes.

  • The statistical uncertainty due to the limited number of simulated events is associated with each bin of the simulated signal and background templates [77].

The theoretical uncertainties relevant to the simulated MC samples have different sources: the choice of the PDF set and the strong coupling constant \(\alpha _\textrm{S}\), missing higher-order corrections in the perturbative expansion of the simulated matrix elements, and modeling of the pileup. Template variations, both in shape and normalization, associated with the aforementioned sources are treated as correlated nuisance parameters for the three data sets.

The uncertainties in the PDF set and \(\alpha _\textrm{S}\) choice are found to have a negligible effect on the simulated templates (the effect of the shape variation on the expected uncertainties was found to be below 1%), therefore only the normalization change is considered, taking into account the effect due to the cross section and acceptance variation. These uncertainties are not considered for backgrounds with normalization constrained through data in dedicated CRs. For the Higgs boson signal processes, these theoretical uncertainties are computed by the LHC Higgs Cross Section Working Group [55] for each production mechanism.

The effect of missing higher-order corrections for the background processes is estimated by reweighting the MC simulation events with alternative event weights, where the \(\mu _\text {R}\) and \(\mu _\text {F}\) scales are varied by a factor of 0.5 or 2, and the envelopes of the varied templates are taken as the one standard deviation variation. All the combinations of the \(\mu _\text {R}\) and \(\mu _\text {F}\) scale variations are considered for computing the envelope, except for the extreme case where \(\mu _\text {R}\) is varied by 0.5 and \(\mu _\text {F}\) by 2, or vice versa. For backgrounds with normalization constrained using data in dedicated CRs, only the shape variation of the simulated templates arising from this procedure is considered. For the \(\text {W} \text {W} \) background, an uncertainty in the higher-order reweighting described in Sect. 9 is derived by shifting \(\mu _\text {R}\), \(\mu _\text {F}\), and the resummation scale. For the \(\text {g} \text {g} \text {H} \) signal sample, the uncertainties are decomposed into several sources according to Ref. [55], to account for the overall cross section, migrations of events among jet multiplicity and \(p_{\textrm{T}} ^{\text {H}}\) bins, choice of the resummation scale, and finite top quark mass effects. For the VBF signal sample, different sources of uncertainty are also decoupled to account for the overall normalization, migrations of events among Higgs boson \(p_{\textrm{T}}\), \(N_{\text {jet}}\), and \(m_{\text {jj}}\) bins, and EW corrections to the production cross section. The uncertainties due to missing higher-order corrections for the other signal samples are taken from Ref. [55]. For both PDF and missing higher-order uncertainties, the nuisance parameters are correlated for the \(\text {W} \text {H} \) and \(\text {Z} \text {H} \) processes and uncorrelated for the other ones.
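The envelope construction described above can be sketched as follows. This is a minimal illustration that assumes the envelope is taken bin by bin and that templates are plain lists of bin contents; the function name and the dictionary layout are hypothetical.

```python
def scale_envelope(nominal, variations):
    """Bin-by-bin envelope of muR/muF variations.

    nominal:    list of bin contents of the nominal template
    variations: dict mapping (muR_factor, muF_factor) -> list of bin contents
    The two opposite-direction extremes (0.5, 2) and (2, 0.5) are skipped.
    Returns the (down, up) templates.
    """
    excluded = {(0.5, 2.0), (2.0, 0.5)}
    kept = [t for key, t in variations.items() if key not in excluded]
    up = [max([n] + [t[i] for t in kept]) for i, n in enumerate(nominal)]
    down = [min([n] + [t[i] for t in kept]) for i, n in enumerate(nominal)]
    return down, up
```

With factors of 0.5, 1, and 2 for each scale, this leaves six varied templates in addition to the nominal one once the two excluded combinations are dropped.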

In order to assess the uncertainty in the pileup modeling, the total inelastic \(\text {p} \text {p} \) cross section of 69.2\(\text {\,mb}\)  [78, 79] is varied within a 5% uncertainty, which includes the uncertainty in the inelastic cross section measurement, as well as the difference in the primary vertex reconstruction efficiency between data and simulation.

A theoretical uncertainty due to the modeling of the PS and UE is taken into account for all the simulated samples. The uncertainty in the PS modeling is evaluated by varying the PS weights computed by pythia 8.212 on an event-by-event basis, keeping the variations of the weights related to initial- and final-state radiation contributions uncorrelated. The uncertainty in the UE modeling is evaluated by shifting the nominal templates according to alternative MC simulations generated with a variation of the UE tune within its uncertainty. The corresponding nuisance parameter is correlated among all samples and between 2017 and 2018 data sets. An uncorrelated nuisance parameter is used for the 2016 data set, as the corresponding simulations are based on a different UE tune. The PS uncertainty affects the shape of the templates mainly through the migration of events across jet multiplicity bins, while the UE uncertainty is found to have a negligible impact on the shape of the templates and a normalization effect of \(\approx \) 1.5%.

Additional theoretical uncertainties in specific background processes are also taken into account. A 15% uncertainty is assigned to the relative fraction of the gluon-induced component in the \(\text {W} \text {W} \) background process [62]. An uncertainty of 8% is assigned to the relative fraction of single top quark and \({{\text {t}} {}{\bar{\text {t}}}}\) processes. A 30% uncertainty is assigned to the \(\text {W} {\upgamma } ^{*}\) process associated with the measurement of the scale factor in the trilepton CR.

For the measurement of the signal cross sections in the STXS framework, the effect of theoretical uncertainties in the template normalizations is removed for signal processes in each STXS bin being measured. In cases where two or more STXS bins are measured together because of the lack of statistical accuracy in measuring single bin cross sections, the shape effect of theoretical uncertainties causing event migrations among the merged bins is kept. In addition, residual theoretical uncertainties arising from \(\mu _\text {R}\) and \(\mu _\text {F}\) variations are taken into account to describe the acceptance effects that cause a shape variation of the signal templates within each STXS bin. The latter uncertainties are correlated among STXS bins that share a similar phase space definition, for example, \(\text {g} \text {g} \text {H} \) 0-jet bins, \(\text {g} \text {g} \text {H} \) 1-jet bins, \(\text {g} \text {g} \text {H} \) high-\(p_{\textrm{T}}\) bins, and \(\text {g} \text {g} \text {H} \) in VBF topology bins. A similar approach is used for the VBF STXS bins. For the measurement of leptonic VH cross sections in STXS bins, the aforementioned theoretical uncertainties are found to have a marginal impact with respect to the measurement statistical accuracy and have been neglected.

The contributions of different sources of systematic uncertainty in the signal strength measurement are summarized in Table 13.

Table 13 Contributions of different sources of uncertainty in the signal strength measurement. The systematic component includes the combined effect from all sources besides background normalization and the size of the data set, which make up the statistical part

11 Results

Results are presented in terms of signal strength modifiers, STXS cross sections, and coupling modifiers. In all cases they are extracted via a simultaneous maximum likelihood fit to all the analysis categories, as explained in Sect. 10. The mass of the Higgs boson is assumed to be 125.38\(\,\text {Ge\hspace{-.08em}V}\), as measured by the CMS Collaboration [56]. The effect on event yields of varying \(m_\text {H} \) within its uncertainty is found to be below 1%. The number of expected and measured events for signal and background processes, as well as the number of observed events in each category, are reported in Tables 14, 15, 16 and 17. The normalization factors of the background contributions are found to be consistent with unity within their uncertainties. Figure 21 summarizes the full analysis template by showing the distribution of events as a function of the observed significance of the corresponding bins.

Table 14 Number of events by process in the \(\text {g} \text {g} \text {H} \) DF categories after the fit to the data, scaling the \(\text {g} \text {g} \text {H} \), VBF, \(\text {W} \text {H} \), and \(\text {Z} \text {H} \) production modes separately. The \({{\text {t}} {}{\bar{\text {t}}}} \text {H} \) contribution is fixed to its SM expectation. Numbers in parentheses indicate expected yields
Table 15 Number of events by process in the \(\text {g} \text {g} \text {H} \) SF categories after the fit to the data, scaling the \(\text {g} \text {g} \text {H} \), VBF, \(\text {W} \text {H} \), and \(\text {Z} \text {H} \) production modes separately. The \({{\text {t}} {}{\bar{\text {t}}}} \text {H} \) contribution is fixed to its SM expectation. Numbers in parentheses indicate expected yields
Table 16 Number of events by process in the VBF and V H2j categories after the fit to the data, scaling the \(\text {g} \text {g} \text {H} \), VBF, \(\text {W} \text {H} \), and \(\text {Z} \text {H} \) production modes separately. The \({{\text {t}} {}{\bar{\text {t}}}} \text {H} \) contribution is fixed to its SM expectation. Numbers in parentheses indicate expected yields
Table 17 Number of events by process in the \(\text {W} \text {H} \) SS, \(\text {W} \text {H} \) 3\(\ell \), \(\text {Z} \text {H} \) 3\(\ell \), and \(\text {Z} \text {H} \) 4\(\ell \) categories after the fit to the data, scaling the \(\text {g} \text {g} \text {H} \), VBF, \(\text {W} \text {H} \), and \(\text {Z} \text {H} \) production modes separately. The \({{\text {t}} {}{\bar{\text {t}}}} \text {H} \) contribution is fixed to its SM expectation. Numbers in parentheses indicate expected yields
Fig. 21

Distribution of events as a function of the statistical significance of their corresponding bin in the analysis template, including all categories. Signal and background contributions are shown after the fit to the data

The \(\text {H} \rightarrow \text {W} \text {W} \) selection is subject to some degree of contamination from events in which the Higgs boson decays to a pair of \(\uptau \) leptons that themselves decay leptonically. These events are included in the signal definition, and their contribution ranges from below 1% in the ggH and VBF categories up to \(\approx \) 10% in some of the \(\text {W} \text {H} \) categories. As described in previous sections, CRs are used to fix the normalization of dominant backgrounds from data. This is achieved by scaling the corresponding background contributions jointly in the CR and SR. Given that the procedure effectively amounts to a measurement of the cross section of the background in question, the contributions from the 2017 and 2018 data sets are scaled together. The 2016 data set is kept separate in this regard because a different pythia tune was used.

For inclusive measurements, results are extracted in the form of signal strength modifiers \(\mu \). These are defined as the product of the production cross section and the branching ratio to a W boson pair, normalized to the SM prediction (\(\sigma \mathcal {B}/(\sigma \mathcal {B})_{\textrm{SM}}\)). Couplings of the Higgs boson to fermions and vector bosons are measured in the \(\kappa \) framework [80], while STXS results are provided as cross sections.

11.1 Signal strength modifiers

The global signal strength modifier is extracted by fitting the template to data leaving all contributions coming from the Higgs boson free to float, but keeping the relative importance of the different production modes fixed to the values predicted by the SM. As such, this measurement gives information on the compatibility of the SM with the LHC Run 2 data set. The observed signal strength modifier is:

$$\begin{aligned} \mu = 0.95^{+0.10}_{-0.09} = 0.95\pm 0.05\,\text {(stat)} \pm 0.08\,\text {(syst)}, \end{aligned}$$
(4)

where the uncertainty has been broken down into its statistical and systematic components. The purely statistical component is extracted by fixing all nuisance parameters in the likelihood function to their best fit values and extracting the corresponding profile. The systematic component is obtained as the difference in quadrature between the total uncertainty and the statistical one. The observed and expected profile likelihood functions, both with the full set of uncertainty sources and with statistical ones only, are shown in Fig. 22.
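A minimal sketch of this procedure follows, under simplifying assumptions: a three-bin toy template with no nuisance parameters (so the scan corresponds to a statistical-only curve), a plain grid scan instead of a proper minimizer, and the usual criterion that the 68% interval is where twice the change in negative log-likelihood stays below 1. The function names and toy yields are hypothetical and not taken from the analysis.

```python
import math


def nll(mu, signal, background, data):
    """Binned Poisson negative log-likelihood in mu, constant terms dropped."""
    total = 0.0
    for s, b, n in zip(signal, background, data):
        expected = mu * s + b
        total += expected - n * math.log(expected)
    return total


def scan_mu(signal, background, data, lo=0.0, hi=3.0, steps=3000):
    """Grid scan: best-fit mu and the (down, up) errors where 2*DeltaNLL <= 1."""
    grid = [lo + (hi - lo) * i / steps for i in range(steps + 1)]
    values = [nll(mu, signal, background, data) for mu in grid]
    i_best = min(range(len(grid)), key=values.__getitem__)
    inside = [mu for mu, v in zip(grid, values)
              if 2.0 * (v - values[i_best]) <= 1.0]
    return grid[i_best], grid[i_best] - inside[0], inside[-1] - grid[i_best]


def syst_component(total, stat):
    """Systematic part as the difference in quadrature of total and stat."""
    return math.sqrt(total ** 2 - stat ** 2)


# Toy templates (hypothetical numbers, not from the analysis)
signal = [10.0, 20.0, 30.0]
background = [100.0, 80.0, 50.0]
data = [110.0, 100.0, 80.0]  # equals signal + background, so the best fit is mu = 1

mu_hat, err_down, err_up = scan_mu(signal, background, data)
print(f"stat-only toy: mu = {mu_hat:.3f} -{err_down:.3f} +{err_up:.3f}")

# Quadrature difference applied to the rounded values quoted in Eq. (4)
print(f"syst (up)   ~ {syst_component(0.10, 0.05):.3f}")
print(f"syst (down) ~ {syst_component(0.09, 0.05):.3f}")
```

Applied to the rounded uncertainties of Eq. (4), the quadrature difference gives values close to the quoted systematic component; exact agreement is not expected because the inputs are rounded to two decimal places.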

Fig. 22

Observed profile-likelihood function for the global signal strength modifier \(\mu \). The dashed curve corresponds to the profile-likelihood function obtained considering statistical uncertainties only

Results are also extracted for individual production modes, by performing a 4-parameter fit in which contributions from the \(\text {g} \text {g} \text {H} \), VBF, \(\text {W} \text {H} \), and \(\text {Z} \text {H} \) modes are left free to float independently. Contributions from the \({{\text {t}} {}{\bar{\text {t}}}} \text {H} \) and \({\text {b}} \bar{{\text {b}}} \text {H} \) production modes are fixed to their SM expected values within uncertainties, given that this analysis has little sensitivity to them. Results are summarized in Fig. 23, where the separate contributions of statistical and systematic sources of uncertainty are also shown. Results correspond to observed (expected) significances of 10.5 (11.8)\(\sigma \), 3.15 (4.74)\(\sigma \), 3.61 (1.82)\(\sigma \), and 3.73 (2.19)\(\sigma \) for the \(\text {g} \text {g} \text {H} \), VBF, \(\text {W} \text {H} \), and \(\text {Z} \text {H} \) modes, respectively. The correlation matrix among the signal strengths is given in Fig. 24. The compatibility of the result with the SM is found to be 7%.
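For orientation, the quoted significances can be translated into one-sided Gaussian tail probabilities, which is the standard convention for a significance Z. The second helper shows how a compatibility p-value could be obtained from a likelihood-ratio test statistic; it assumes (this is not stated in the text) that the statistic follows a chi-square distribution with as many degrees of freedom as fitted parameters, four in this fit. The function names are hypothetical, and the closed form used is valid for an even number of degrees of freedom only.

```python
import math


def one_sided_p_value(z):
    """One-sided Gaussian tail probability for a significance of z sigma."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def chi2_sf_even(q, ndf):
    """Chi-square survival function for even ndf (closed-form series).

    P(X > q) = exp(-q/2) * sum_{i=0}^{ndf/2 - 1} (q/2)^i / i!
    """
    if ndf <= 0 or ndf % 2 != 0:
        raise ValueError("this closed form needs a positive, even ndf")
    half = q / 2.0
    term, total = 1.0, 1.0
    for i in range(1, ndf // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


# Observed significances quoted in the text
observed = {"ggH": 10.5, "VBF": 3.15, "WH": 3.61, "ZH": 3.73}
for mode, z in observed.items():
    print(f"{mode}: Z = {z} -> one-sided p = {one_sided_p_value(z):.3g}")

# Hypothetical test-statistic value, for illustration of the 4-parameter case
print(f"p(q = 8.0, ndf = 4) = {chi2_sf_even(8.0, 4):.3f}")
```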

Fig. 23

Observed signal strength modifiers for the main SM production modes

Fig. 24

Correlation matrix between the signal strength modifiers of the main production modes of the Higgs boson

11.2 Higgs boson couplings

Given its large branching fraction and relatively low background, the \(\text {H} \rightarrow \text {W} \text {W} \) channel is a good candidate to measure the couplings of the Higgs boson to fermions and vector bosons. This is performed in the so-called \(\kappa \) framework. Two coupling modifiers \(\kappa _{{\text {V}}}\) and \(\kappa _\textrm{f}\) are defined, for couplings to vector bosons and fermions respectively. These scale the signal yield of the \(\text {H} \rightarrow \text {W} \text {W} \) channel as follows:

$$\begin{aligned} \sigma \mathcal {B}({\text {X}} _i\rightarrow \text {H} \rightarrow \text {W} \text {W} ) = \kappa _i^2\frac{\kappa _{{\text {V}}}^2}{\kappa _{\text {H}}^2}\,\sigma _{\textrm{SM}}\mathcal {B}_{\textrm{SM}}({\text {X}} _i\rightarrow \text {H} \rightarrow \text {W} \text {W} ), \end{aligned}$$
(5)

where \(\kappa _{\text {H}} = \kappa _{\text {H}}(\kappa _{{\text {V}}}, \kappa _{\textrm{f}})\) is the modifier to the total Higgs boson width, and \({\text {X}} _i\) are the different production modes. The corresponding coupling modifiers \(\kappa _i\) equal \(\kappa _{\textrm{f}}\) for the \(\text {g} \text {g} \text {H} \), \({{\text {t}} {}{\bar{\text {t}}}} \text {H} \), and \({\text {b}} \bar{{\text {b}}} \text {H} \) modes, and \(\kappa _{{\text {V}}}\) for the VBF and V H modes. Possible contributions to the total width of the Higgs boson coming from outside of the SM are neglected. The best fit values for the coupling modifiers are found to be \(\kappa _{{\text {V}}} = 0.99\pm 0.05\) and \(\kappa _\textrm{f} = 0.86^{+0.14}_{-0.11}\), where the better sensitivity to \(\kappa _{{\text {V}}}\) is due to the \(\text {H} \rightarrow \text {W} \text {W} \) decay vertex. The two-dimensional likelihood profile for the fit is shown in Fig. 25.
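The scaling in Eq. (5) can be sketched numerically as follows. The mapping of production modes to \(\kappa _{\textrm{f}}\) or \(\kappa _{{\text {V}}}\) follows the text, but the width modifier is a deliberately crude assumption: all decays are split into a vector boson class and a fermion-like class, with an assumed SM vector boson branching fraction `bf_vector = 0.24` (the \(\approx \) 22% for WW plus a rough allowance for ZZ), and loop-induced decays are not treated separately. The names `kappa_h_squared` and `yield_scale` are hypothetical.

```python
# Coupling that scales each production mode, as described in the text
PRODUCTION_COUPLING = {
    "ggH": "f", "ttH": "f", "bbH": "f",
    "VBF": "V", "WH": "V", "ZH": "V",
}


def kappa_h_squared(kappa_v, kappa_f, bf_vector=0.24):
    """Total-width modifier in a two-class approximation (assumption).

    bf_vector is an assumed SM branching fraction to vector bosons;
    everything else is taken to scale with kappa_f**2.
    """
    return bf_vector * kappa_v ** 2 + (1.0 - bf_vector) * kappa_f ** 2


def yield_scale(mode, kappa_v, kappa_f, bf_vector=0.24):
    """Ratio of sigma*B to its SM value for H -> WW, following Eq. (5)."""
    kappa_i = kappa_f if PRODUCTION_COUPLING[mode] == "f" else kappa_v
    width = kappa_h_squared(kappa_v, kappa_f, bf_vector)
    return kappa_i ** 2 * kappa_v ** 2 / width


# Evaluate at the best fit values quoted in the text
for mode in PRODUCTION_COUPLING:
    print(f"{mode}: scale = {yield_scale(mode, kappa_v=0.99, kappa_f=0.86):.3f}")
```

Every mode carries a factor \(\kappa _{{\text {V}}}^2\) from the decay vertex, while only some carry \(\kappa _{\textrm{f}}^2\) from production, which illustrates why the channel constrains \(\kappa _{{\text {V}}}\) more tightly.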

Fig. 25

Two-dimensional likelihood profile as a function of the coupling modifiers \(\kappa _{{\text {V}}}\) and \(\kappa _\textrm{f}\), using the \(\kappa \)-framework parametrization. The 95 and 68% confidence level contours are shown as continuous and dashed lines, respectively

Table 18 Observed cross sections of the \(\text {H} \rightarrow \text {W} \text {W} \) process in each STXS bin. The uncertainties in the observed cross sections and their ratio to the SM expectation do not include the theoretical uncertainties on the latter. In cases where the ratio to the SM cross section is measured below zero, an upper limit at 68% confidence level on the observed cross section is reported. All dimensional quantities in STXS bin definitions are measured in \(\text {Ge\hspace{-.08em}V}\)

Fig. 26

Observed cross sections of the \(\text {H} \rightarrow \text {W} \text {W} \) process in each STXS bin, normalized to the SM expectation

Fig. 27

Correlation matrix between the measured STXS bins. All dimensional quantities in bin definitions are measured in \(\text {Ge\hspace{-.08em}V}\)

11.3 STXS

As explained in Sect. 8, the STXS measurement is carried out under the Stage 1.2 framework, although not all STXS bins are measured independently because of sensitivity limitations. Results are shown in Table 18 and in Fig. 26, for the signal strength modifiers and cross sections. The uncertainties are reported separately for statistical (stat), theoretical (theo), and experimental (exp) systematic sources. The correlation matrix for the measured STXS bins is shown in Fig. 27. Since final results are reported as cross sections, the effect of theoretical uncertainties in the normalization of signal templates is dropped, while uncertainties in the shape of the templates, such as STXS bin migration, are accounted for. In cases where cross sections are measured to be zero, an upper limit is reported instead of a symmetric confidence interval, so that all intervals reported correspond to a 68% confidence level. The compatibility of the STXS fit with the SM is found to be 1%.

12 Summary

A measurement of production cross sections for the Higgs boson has been performed targeting the gluon fusion, vector boson fusion, and Z or W associated production processes in the \(\text {H} \rightarrow \text {W} \text {W} \) decay channel. Results are presented as signal strength modifiers, coupling modifiers, and differential cross sections in the simplified template cross section Stage 1.2 framework. The measurement has been performed on data from proton-proton collisions recorded by the CMS detector at a center-of-mass energy of 13\(\,\text {Te\hspace{-.08em}V}\) in 2016–2018, corresponding to an integrated luminosity of 138\(\,\text {fb}^{-1}\). Specific event selections targeting different final states have been employed, and results have been extracted via a simultaneous maximum likelihood fit to all analysis categories. The overall signal strength for production of a Higgs boson is found to be \(\mu = 0.95^{+0.10}_{-0.09}\). All results are in good agreement with the standard model expectation.