Measurement of the associated production of a Higgs boson decaying into b-quarks with a vector boson at high transverse momentum in pp collisions at √s = 13 TeV with the ATLAS detector



Introduction
Since the discovery of the Higgs boson (H) [1][2][3][4] with a mass of around 125 GeV [5] by the ATLAS and CMS Collaborations [6,7] in 2012, the analysis of proton-proton (pp) collision data at centre-of-mass energies of 7 TeV, 8 TeV and 13 TeV delivered by the Large Hadron Collider (LHC) [8] has led to precise measurements of the main production cross-sections and decay rates of the Higgs boson, as well as measurements of its mass and its spin and parity properties. In particular, the observation of the decay of the Higgs boson into b-quark pairs provided direct evidence for the Yukawa coupling of the Higgs boson to down-type quarks [9,10]. Finally, a combination of 13 TeV results searching for the Higgs boson produced in association with a leptonically decaying W or Z boson established the observation of this production process [9]. A first cross-section measurement as a function of the vector-boson transverse momentum was also carried out by the ATLAS Collaboration [11].
The previous ATLAS analyses [9,11] in this channel were mainly sensitive to vector bosons with transverse momentum ($p_T$) in the range of approximately 100-300 GeV. These analyses considered a pair of jets with radius parameter R = 0.4, referred to as small-radius (small-R) jets, to reconstruct the Higgs boson. For higher Higgs boson transverse momenta, the decay products can become [...] from a combined profile likelihood fit to the large-R jet mass, using several signal and control regions. The yield of diboson production VZ with $Z \to b\bar{b}$ is also measured using the same fit and provides a validation of the analysis. The cross-section measurements are performed within the simplified template cross-section (STXS) framework [16,17]. These measurements are then used to constrain anomalous couplings in a Standard Model effective field theory (SMEFT) [18].

ATLAS detector
The ATLAS detector [14] at the LHC is a multipurpose particle detector with a forward-backward symmetric cylindrical geometry and a near 4π coverage in solid angle.$^{1}$ It consists of an inner detector (ID) for tracking surrounded by a thin superconducting solenoid providing a 2 T axial magnetic field, electromagnetic and hadronic calorimeters, and a muon spectrometer. The ID covers the pseudorapidity range |η| < 2.5. It consists of silicon pixel, silicon microstrip, and transition radiation tracking detectors. An inner pixel layer, the insertable B-layer [19,20], was added at a mean radius of 3.3 cm during the long shutdown period between Run 1 and Run 2 of the LHC. Lead/liquid-argon (LAr) sampling calorimeters provide electromagnetic (EM) energy measurements with high granularity (|η| < 3.2). The hadronic calorimeter uses a steel/scintillator-tile sampling detector in the central pseudorapidity range (|η| < 1.7) and a copper/LAr detector in the region 1.5 < |η| < 3.2. The forward regions (3.2 < |η| < 4.9) are instrumented with copper/LAr and tungsten/LAr calorimeter modules optimised for electromagnetic and hadronic measurements, respectively. A muon spectrometer with an air-core toroid magnet system surrounds the calorimeters. Three layers of high-precision tracking chambers provide coverage in the range |η| < 2.7, while dedicated fast chambers allow muon triggering in the region |η| < 2.4. The ATLAS trigger system consists of a hardware-based first-level trigger followed by a software-based high-level trigger [21].

Data and Monte Carlo simulation
The data were collected in pp collisions at √s = 13 TeV during Run 2 of the LHC. The data sample corresponds to an integrated luminosity of 139 fb$^{-1}$ after requiring that all detector subsystems were operating normally and recording high-quality data [22]. The uncertainty in the combined 2015-2018 integrated luminosity is 1.7% [23], obtained using the LUCID-2 detector [24] for the primary luminosity measurements. Collision events considered for this analysis were recorded with a combination of triggers selecting events with high missing transverse momentum or with a high-$p_T$ lepton, depending on the analysis channel. More details of the trigger selection are given in Section 5. Monte Carlo (MC) simulated event samples processed with the ATLAS detector simulation [25] based on Geant4 [26] are used to model the signal and background contributions, except for the multijet production, whose contribution is estimated with data-driven techniques as detailed in Section 6. A summary of all the signal and background processes with the corresponding generators used for the nominal samples is shown in Table 1. All simulated processes are normalised using the most precise theoretical predictions currently available of their cross-sections. In addition to the hard scatter, each event was overlaid with additional pp collisions (pile-up) generated with Pythia 8.1 [27] using the ATLAS A3 set of tuned parameters [28] and the NNPDF23LO [29] parton distribution function (PDF) set. Simulated events were then reconstructed with the same algorithms as those applied to data and are weighted to match the pile-up distribution observed in the data.
$^{1}$ ATLAS uses a right-handed coordinate system with its origin at the nominal interaction point in the centre of the detector. The positive x-axis is defined by the direction from the interaction point to the centre of the LHC ring, with the positive y-axis pointing upwards, while the beam direction defines the z-axis. Cylindrical coordinates (r, φ) are used in the transverse plane, φ being the azimuthal angle around the z-axis. The pseudorapidity η is defined in terms of the polar angle θ by $\eta = -\ln\tan(\theta/2)$. The angular distance is defined as $\Delta R \equiv \sqrt{(\Delta\eta)^2 + (\Delta\phi)^2}$. Rapidity is defined as $y = 0.5 \ln[(E + p_z)/(E - p_z)]$, where E denotes the energy and $p_z$ is the component of the momentum along the beam direction.
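The coordinate definitions in footnote 1 (pseudorapidity, rapidity and angular distance) can be illustrated with a minimal Python sketch. The function names are hypothetical, and the wrapping of the azimuthal difference into (-π, π] is an assumption that the footnote does not spell out.

```python
import math

def pseudorapidity(theta):
    """eta = -ln tan(theta/2), with the polar angle theta in radians."""
    return -math.log(math.tan(theta / 2.0))

def rapidity(energy, pz):
    """y = 0.5 ln[(E + pz) / (E - pz)]."""
    return 0.5 * math.log((energy + pz) / (energy - pz))

def delta_phi(phi1, phi2):
    """Azimuthal difference wrapped into (-pi, pi] (assumed convention)."""
    dphi = (phi1 - phi2) % (2.0 * math.pi)
    if dphi > math.pi:
        dphi -= 2.0 * math.pi
    return dphi

def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance sqrt((delta eta)^2 + (delta phi)^2)."""
    return math.hypot(eta1 - eta2, delta_phi(phi1, phi2))
```

For a massless particle the rapidity and pseudorapidity coincide, which is why the two are often used interchangeably for light objects.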
For the signal events, the AZNLO [30] model of parton showers and the underlying event (UE) was used. For the top-quark pair and single-top-quark production processes, the UE model was taken from the ATLAS A14 [31] set of tuned Pythia 8.1 [27] parameters and for the other backgrounds the default Sherpa [32][33][34][35] tune set was used. For all samples of simulated events, except for those generated using Sherpa, the EvtGen v1.2.0 program [36] was used to describe the decays of bottom and charm hadrons. The nominal PDF set used for W/Z+jets and diboson processes was NNPDF3.0NNLO [37] while for the top-quark pair and single-top production the NNPDF3.0NLO [37] set was used. Samples produced with alternative generators which are used to estimate modelling systematic uncertainties are described in Section 7.
The nominal top-quark pair production generator was Powheg-Box v2 with real and virtual corrections at NLO accuracy in QCD and interfaced to Pythia 8.230 for the parton showering. The nominal top-quark pair production cross-section is from a resummed NNLO and next-to-next-to-leading logarithm (NNLL) prediction [59].
Single top-quark production was also generated with Powheg-Box v2 interfaced to Pythia 8.230. The nominal cross-section normalisations for the single top-quark production s- and t-channel processes were estimated from resummed calculations at NLO, while for the Wt process approximate NNLO was used [61,62,64]. At higher orders in QCD, the definition of the Wt process can correspond to leading-order top-quark pair production processes. To account for these ambiguities and related interference effects when generating the processes separately, the diagram removal (DR) subtraction scheme was used [68].
The nominal W/Z+jets background samples used Sherpa 2.2.1 [33][34][35] for the matrix element (ME) and parton shower with virtual corrections at NLO accuracy for up to two additional jets and at LO for up to four additional jets using OpenLoops [32,34,35]. In these samples, the simulation of the emission of hard partons matched with a parton shower was based on the Catani-Seymour subtraction term [32,34,35] and the multi-parton ME was merged with the parton shower using an improved CKKW matching procedure extended to NLO accuracy using the MEPS@NLO prescription [66]. The nominal normalisation of this background was obtained from an NNLO fixed-order estimate [67].
The diboson nominal samples were generated using Sherpa 2.2.1 for the dominant qq-initiated processes for which zero or one additional parton was calculated at NLO in the ME, while two or [...] For these samples, zero or one additional parton was calculated at LO in the ME. These generators also provided the nominal normalisation for this process.

Table 1: Signal and background processes with the corresponding generators used for the nominal samples. If not specified, the order of the cross-section calculation refers to the expansion in the strong coupling constant ($\alpha_S$). ( ) The events were generated using the first PDF in the NNPDF3.0NLO set and subsequently reweighted to the PDF4LHC15NLO set [38] using the internal algorithm in Powheg-Box v2. (†) The NNLO(QCD)+NLO(EW) cross-section calculation for the pp → ZH process already includes the gg → ZH contribution. The qq → ZH process is normalised using the cross-section for the pp → ZH process, after subtracting the gg → ZH contribution. An additional scale factor is applied to the qq → VH processes as a function of the transverse momentum of the vector boson, to account for electroweak (EW) corrections at NLO. This makes use of the VH differential cross-section computed with Hawk [39,40].

Object reconstruction
Of all the reconstructed pp collision vertices with at least two reconstructed trajectories of charged particles in the ID (tracks) with $p_T$ > 0.5 GeV, the hard-scattering primary vertex is selected as the one with the highest sum of squared transverse momenta of associated tracks [69].
Leptons are used for event categorisation as described in Section 5. Electrons are reconstructed from tracks in the ID associated with topological clusters of energy depositions in the calorimeter [70,71]. The identification criteria closely follow those described in Ref. [9]. Baseline electrons are required to have $p_T$ > 7 GeV and |η| < 2.47, to be isolated from other tracks and energy deposit clusters, to meet loose likelihood selection criteria based on shower shapes and to satisfy $|d_0/\sigma(d_0)| < 5$ and $|z_0 \sin\theta| < 0.5$ mm, where $d_0$ and $z_0$ are the transverse and longitudinal impact parameters defined relative to the primary vertex position$^{2}$ and $\sigma(d_0)$ is the $d_0$ uncertainty. Signal electrons are a subset of the baseline electron set and are selected using a tighter likelihood requirement, which also includes tracking and track-cluster matching variables, and using a tighter calorimeter-based isolation criterion.
Muon candidates are identified by matching ID tracks to full tracks or track segments reconstructed in the muon spectrometer within the inner detector coverage and using only information from the muon spectrometer outside of that coverage. Muons are required to have $p_T$ > 7 GeV and |η| < 2.7 and to have $|d_0/\sigma(d_0)| < 3$ and $|z_0 \sin\theta| < 0.5$ mm. Two muon categories are used in the analysis: baseline muons are selected using the 'loose' identification criterion of Ref. [72] and a loose track isolation; signal muons are required to have |η| < 2.5, to satisfy the 'medium' identification criterion [72] and a tighter track-based isolation criterion.
$^{2}$ For the computation of the impact parameters, the beam line is used to approximate the primary vertex position in the transverse plane.
The low-threshold (7 GeV) baseline leptons are used to define the three main channels requiring exactly zero, one and two leptons. The latter 1- and 2-lepton channels further require at least one signal lepton, with identification and isolation requirements chosen to optimise the suppression of the multijet background. Signal leptons must have $p_T$ > 27 GeV (except in the 1-lepton muon sub-channel, where $p_T$ > 25 GeV is used).
Calorimeter jets are reconstructed from noise-suppressed topological clusters (topoclusters) of calorimeter energy depositions [...]. Large-R jets are groomed using trimming [77,78] to improve the jet mass resolution and its stability with respect to pile-up by discarding the softer components of jets that originate from initial-state radiation, pile-up interactions, or the underlying event. This is done by reclustering the constituents of the initial large-R jet, using the $k_t$ algorithm [79,80], into subjets with radius parameter $R_{sub}$ = 0.2 and removing any subjet that has a $p_T$ less than 5% of the parent jet $p_T$. The large-R jet mass $m_J$ is computed using tracking and calorimeter information [81]. A dedicated MC-based calibration, similar to the procedure used in Ref. [81], is applied to correct the $p_T$ and mass of the trimmed jets to the particle level. Large-R jets are required to have $p_T$ > 250 GeV, $m_J$ > 50 GeV and |η| < 2.0, the last due to tracking acceptance.
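The subjet-removal step of trimming and the large-R jet requirements quoted above can be sketched in Python as follows. This is an illustration only: the reclustering into subjets is assumed to have been done already, and the function names and the plain-list representation of subjets are hypothetical.

```python
def trim_subjets(subjet_pts, parent_pt, f_cut=0.05):
    """Keep subjets carrying at least f_cut of the parent (untrimmed) jet pT.

    subjet_pts: pT values (GeV) of the R_sub = 0.2 subjets, assumed to be
    already reclustered from the constituents of the large-R jet.
    """
    return [pt for pt in subjet_pts if pt >= f_cut * parent_pt]

def passes_large_r_selection(pt_gev, mass_gev, eta):
    """Large-R jet requirements: pT > 250 GeV, m_J > 50 GeV, |eta| < 2.0."""
    return pt_gev > 250.0 and mass_gev > 50.0 and abs(eta) < 2.0
```

For a 400 GeV parent jet the removal threshold is 20 GeV, so subjets with 18 GeV and 12 GeV would be discarded while harder subjets are kept.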
Small-R jets are used in building the missing transverse momentum and event categorisation. They are calibrated with a series of simulation-based corrections and in situ techniques, including corrections to account for pile-up energy entering the jet area, as described in Ref. [76]. They are required to have $p_T$ > 30 GeV and |η| < 4.5. To reduce the number of small-R jets originating from pile-up interactions, small-R jets are required to pass the jet vertex tagger (JVT) [82] requirement if they are in the range $p_T$ < 120 GeV and |η| < 2.5 due to tracking acceptance.
Track-jets formed from charged-particle tracks are used to reconstruct a candidate two-body $H \to b\bar{b}$ decay within the large-R jet. Track-jets are built with the anti-$k_t$ algorithm with a variable radius (VR) $p_T$-dependent parameter, from tracks reconstructed in the inner detector with $p_T$ > 0.5 GeV and |η| < 2.5 [83][84][85]. VR track-jets have an effective jet radius $R_{eff}$ proportional to the inverse of the jet $p_T$ in the jet finding procedure: $R_{eff}(p_T) = \rho/p_T$, where the ρ-parameter is set to 30 GeV. There are two additional parameters, $R_{min}$ and $R_{max}$, used to set the minimum and maximum cut-offs on the jet radius, and these are set to 0.02 and 0.4, respectively. Only VR track-jets with $p_T$ > 10 GeV, |η| < 2.5 and with at least two constituents are considered [86]. VR track-jets are matched to the large-R calorimeter jets via ghost-association [87]. Track-jets not associated with large-R jets are also used in the analysis for event categorisation as described in Section 5.
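The variable-radius prescription above amounts to a clamped inverse-$p_T$ function. A minimal Python sketch, using the parameter values quoted in the text and a hypothetical function name:

```python
def vr_effective_radius(pt_gev, rho_gev=30.0, r_min=0.02, r_max=0.4):
    """Effective VR track-jet radius: rho / pT, clamped to [r_min, r_max]."""
    return min(r_max, max(r_min, rho_gev / pt_gev))
```

With these values the radius stays at 0.4 for track-jets below 75 GeV, shrinks as 30 GeV/$p_T$ above that, and would only reach the lower cut-off of 0.02 at 1.5 TeV.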
The 'truth' flavour labelling of track-jets in simulation is done by geometrically matching the jet to 'truth' hadrons, using 'truth' information from the generator's event record. If a b-hadron with $p_T$ above 5 GeV is found within $\Delta R$ = 0.3 of the direction of the track-jet, the track-jet is labelled as a b-jet. If the b-hadron is matched to more than one track-jet, only the closest track-jet is labelled as a b-jet. If no b-hadron is found, the procedure is repeated first for c-hadrons to label c-jets and then for τ-leptons to label τ-jets. As is the case for defining a b-jet, the labelling is also exclusive for c- and τ-jets. A jet for which no such matching can be made is labelled as a light-flavour jet.
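The labelling hierarchy described above (b first, then c, then τ, with each truth particle labelling only its closest track-jet) can be sketched in Python. The dictionary layout and function names are hypothetical, and applying the same 5 GeV threshold to c-hadrons and τ-leptons is an assumption, since the text quotes it only for b-hadrons.

```python
import math

def _delta_r(a, b):
    """Angular distance between two objects with 'eta' and 'phi' keys."""
    dphi = (a["phi"] - b["phi"] + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(a["eta"] - b["eta"], dphi)

def label_track_jets(jets, truth_particles, max_dr=0.3, min_pt=5.0):
    """Return one label per jet: 'b', 'c', 'tau' or 'light'.

    truth_particles: dicts with 'kind' ('b', 'c' or 'tau'), 'pt', 'eta', 'phi'.
    """
    labels = ["light"] * len(jets)
    for flavour in ("b", "c", "tau"):  # priority order from the text
        for particle in truth_particles:
            if particle["kind"] != flavour or particle["pt"] <= min_pt:
                continue
            matches = [(_delta_r(jet, particle), i) for i, jet in enumerate(jets)]
            matches = [(dr, i) for dr, i in matches if dr < max_dr]
            if not matches:
                continue
            _, closest = min(matches)  # exclusive: only the closest jet
            if labels[closest] == "light":  # keep higher-priority labels
                labels[closest] = flavour
    return labels
```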
To identify track-jets containing b-hadron decay products, track-jets are tagged using the multivariate algorithm MV2c10, which exploits the presence of large-impact-parameter tracks, the topological decay chain reconstruction and the displaced vertices from b-hadron decays [88,89]. The MV2c10 algorithm is configured to achieve an average efficiency of 70% for tagging jets labelled as b-jets in an MC sample of $t\bar{t}$ events. This requirement has corresponding rejection factors of 9 and 304 for jets labelled as c-jets and light-flavour jets, respectively, in simulated $t\bar{t}$ events. The tagging efficiencies per jet flavour are corrected in the simulation to match those measured in data [86,90,91].
Two additional corrections are applied to the large-R jets to improve the scale and the resolution of their energy and mass measurements. First, to account for semileptonic decays of the b-hadrons, the four-momentum of the closest reconstructed non-isolated muon candidate within $\Delta R = \min(0.4,\ 0.04 + 10\ \mathrm{GeV}/p_T^{\mathrm{muon}})$ of a track-jet matched to the large-R jet by ghost association is added to the calorimeter-based component of the large-R jet four-momentum while its expected calorimeter energy deposits are removed [85]. This is known as the muon-in-jet correction. Non-isolated muons satisfy the 'medium' identification criterion [72], but no isolation or impact parameter criteria are applied. Second, in the 2-lepton channel only, a per-event likelihood uses the full reconstruction of the event kinematics to improve the estimate of the energy of the b-jets [92]. The kinematic fit constrains the $\ell^+\ell^- b\bar{b}$ system and the additional small-R jets in the event to be balanced in the transverse plane and the dilepton system to the Z boson mass, by scaling the four-momentum of the objects in the event including the large-R jet, additional small-R jets and leptons within their detector response resolutions. The large-R jet mass is then scaled by the ratio of the energies after and before the correction. For the event selection detailed in Section 5, the large-R jet mass resolution improves by 5% to 10% after the first correction (depending on the lepton channel), while the second correction brings an additional improvement in the 2-lepton channel of up to 40%.
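The $p_T$-dependent matching cone of the muon-in-jet correction can be sketched in Python. The function names and the dictionary layout are hypothetical. Filtering muons by their own cone first and then taking the closest one is one possible reading of the text, assumed here for illustration.

```python
def muon_in_jet_cone(muon_pt_gev):
    """Matching cone: min(0.4, 0.04 + 10 GeV / pT of the muon)."""
    return min(0.4, 0.04 + 10.0 / muon_pt_gev)

def closest_muon_in_cone(muons):
    """Pick the closest muon lying inside its own pT-dependent cone.

    muons: dicts with 'pt' (GeV) and 'dr' (distance to the track-jet axis).
    Returns None when no muon qualifies.
    """
    inside = [m for m in muons if m["dr"] < muon_in_jet_cone(m["pt"])]
    return min(inside, key=lambda m: m["dr"]) if inside else None
```

The cone equals 0.4 for muons below roughly 28 GeV and narrows for harder muons, reflecting the increasing collimation of the b-hadron decay products.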
The presence of neutrinos in the $WH \to \ell\nu b\bar{b}$ and $ZH \to \nu\nu b\bar{b}$ signatures can be inferred from a momentum imbalance in the transverse plane. The missing transverse momentum $\vec{E}_T^{\mathrm{miss}}$ is reconstructed as the negative vector sum of the momenta of leptons and small-R jets in the event plus a 'soft term' built from additional tracks associated with the primary vertex [93]. Small-R jets used for the $\vec{E}_T^{\mathrm{miss}}$ reconstruction are required to have $p_T$ > 20 GeV. The magnitude of $\vec{E}_T^{\mathrm{miss}}$ is referred to as $E_T^{\mathrm{miss}}$. To suppress non-collision and multijet backgrounds in the 0-lepton channel, an additional track-based missing transverse momentum estimator, $\vec{E}_{T,\mathrm{trk}}^{\mathrm{miss}}$, is built independently as the negative vector sum of the transverse momenta of all tracks from the primary vertex.
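As an illustration of the negative vector sum, the following Python sketch computes the magnitude and azimuth of the missing transverse momentum from a list of transverse momenta. The function name and input layout are hypothetical, and the soft term is assumed to be supplied as the summed (px, py) of the soft tracks so that it enters the negated sum like any other object.

```python
import math

def missing_transverse_momentum(objects, soft_term=(0.0, 0.0)):
    """Return (magnitude, phi) of the negative transverse vector sum.

    objects: iterable of (pt, phi) for the selected leptons and small-R jets.
    soft_term: summed (px, py) of soft tracks (assumed convention).
    """
    px = -(sum(pt * math.cos(phi) for pt, phi in objects) + soft_term[0])
    py = -(sum(pt * math.sin(phi) for pt, phi in objects) + soft_term[1])
    return math.hypot(px, py), math.atan2(py, px)
```

The track-based estimator can be obtained from the same function by passing the primary-vertex tracks as `objects` and leaving the soft term at zero.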

Event selection
Events are categorised into the 0-, 1- and 2-lepton channels depending on the number of selected electrons and muons to target [...] The 0-lepton selection is applied to events selected with an $E_T^{\mathrm{miss}}$ trigger with thresholds varying from 70 to 110 GeV depending on the data-taking period to cope with increasing trigger rates at higher instantaneous luminosities. In the 1-lepton channel, single-electron events are required to be triggered by at least one of several unprescaled single-electron triggers. The lowest $E_T$ threshold of these unprescaled triggers varied with time from 24 to 26 GeV. Events in the single-muon channel were triggered using the same $E_T^{\mathrm{miss}}$ trigger as used in the 0-lepton channel. Given that muons do not enter in the online $E_T^{\mathrm{miss}}$ calculation and that uninstrumented regions affect the coverage of the muon spectrometer, the $E_T^{\mathrm{miss}}$ triggers translate into a requirement on the transverse momentum of the lepton and neutrino pair, $p_T^{\ell\nu}$, which is more efficient in the analysis phase space than the single-muon triggers. In the 2-lepton channel, the same trigger strategy as in the 1-lepton channel is adopted. The dielectron selection is applied to events triggered by at least one of the unprescaled single-electron triggers. The dimuon selection is applied to events triggered by an $E_T^{\mathrm{miss}}$ trigger. All triggers used in this analysis are fully efficient for the events selected using the requirements described below.
In all three channels, events are required to contain at least one large-R jet with $p_T$ > 250 GeV and |η| < 2.0. To select the Higgs boson candidate, the leading-$p_T$ large-R jet is chosen, at least two VR track-jets are required to be matched to it by ghost-association, and the two leading ones are required to be b-tagged. This jet is referred to as the 'Higgs-jet candidate' in the following. To avoid the ambiguous cases of concentric jets, events where the b-tagged VR track-jets overlap with other VR track-jets, satisfying $\Delta R/R_s < 1$ (where $\Delta R$ corresponds to the distance among any pair of VR track-jets and $R_s$ corresponds to the smaller radius of the considered pair), are removed. The reconstructed transverse momentum $p_T^V$ of the vector boson corresponds to $E_T^{\mathrm{miss}}$ in the 0-lepton channel, to the magnitude of the vector sum of $\vec{E}_T^{\mathrm{miss}}$ and the charged-lepton transverse momentum in the 1-lepton channel, and to the transverse momentum of the 2-lepton system in the 2-lepton channel. The $p_T^V$ is required to be above 250 GeV in all three channels. The event selection is detailed in Table 2, with further explanations provided below for the non-straightforward selection criteria.
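The channel-dependent definition of the reconstructed vector-boson transverse momentum and the concentric-jet overlap criterion described above can be sketched in Python. All names are hypothetical, and objects are represented simply as (pT, φ) pairs.

```python
import math

def vector_sum_pt(*objects):
    """Magnitude of the transverse vector sum of (pt, phi) pairs."""
    px = sum(pt * math.cos(phi) for pt, phi in objects)
    py = sum(pt * math.sin(phi) for pt, phi in objects)
    return math.hypot(px, py)

def reconstructed_ptv(n_leptons, met=None, leptons=()):
    """Vector-boson pT: MET (0-lepton), MET + lepton (1), dilepton (2)."""
    if n_leptons == 0:
        return met[0]
    if n_leptons == 1:
        return vector_sum_pt(met, leptons[0])
    if n_leptons == 2:
        return vector_sum_pt(leptons[0], leptons[1])
    raise ValueError("channel must have 0, 1 or 2 leptons")

def overlaps_concentrically(delta_r, radius_a, radius_b):
    """True when Delta R / R_s < 1, R_s being the smaller of the two radii."""
    return delta_r / min(radius_a, radius_b) < 1.0
```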
The multijet background in the 0-lepton channel originates mainly from jet energy mismeasurements. To reduce this background to a negligible level, three dedicated selection criteria are applied. Events are removed if the missing transverse momentum is pointing towards the direction of the Higgs-jet candidate [...]

Table 2 (surviving rows). Track-jets: at least two track-jets, $p_T$ > 10 GeV, |η| < 2.5, matched to the leading large-R jet. b-tagged jets: leading two track-jets matched to the leading large-R jet must be b-tagged (MV2c10, 70%).

In the 1-lepton channel, the isolation requirements remove most of the non-prompt lepton background. An additional $E_T^{\mathrm{miss}}$ requirement is applied in the electron sub-channel to reduce this background further. In order to reduce other backgrounds, such as top and W+jets production, a further selection on the rapidity difference between the Higgs-jet candidate and the vector boson is applied ($|\Delta y(V, H_{\mathrm{cand}})| < 1.4$). The W-boson rapidity is estimated assuming that $E_T^{\mathrm{miss}}$ is the $p_T$ of the neutrino, and the longitudinal momentum of the neutrino is estimated using the W-boson mass constraint. This method leads to a quadratic equation for the longitudinal momentum of the neutrino. In case of two real solutions, the retained solution is the one that minimises the difference between the longitudinal boost of the W boson and the Higgs boson. In case of no real solution, the imaginary part is set to 0.$^{3}$

In the 2-lepton channel, where two same-flavour leptons are required (in the dimuon sub-channel the two muons are further required to be of opposite sign; in the dielectron case this selection is not applied due to the comparatively higher charge misidentification), the rapidity difference requirement ($|\Delta y(V, H_{\mathrm{cand}})| < 1.4$) effectively reduces the main Z+jets background.
A requirement is imposed on [...] Since the signal-to-background ratio increases for large Higgs boson transverse momenta [12,96], events are further split into two $p_T^V$ bins with 250 < $p_T^V$ < 400 GeV and with $p_T^V$ ≥ 400 GeV.
$^{3}$ This procedure is equivalent to setting the reconstructed W transverse mass to the W mass.
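The W-mass constraint used in the 1-lepton channel to estimate the neutrino longitudinal momentum can be made explicit with a short Python sketch. It assumes a massless lepton and neutrino, which gives the quadratic $p_{T,\ell}^2\,p_{z,\nu}^2 - 2\mu\,p_{z,\ell}\,p_{z,\nu} + E_\ell^2 (E_T^{\mathrm{miss}})^2 - \mu^2 = 0$ with $\mu = m_W^2/2 + p_{T,\ell} E_T^{\mathrm{miss}} \cos\Delta\phi$. The function names, the W mass value of 80.4 GeV and the use of $\beta_z = p_z/E$ as the measure of the longitudinal boost are assumptions made for illustration, not details taken from the analysis.

```python
import math

M_W = 80.4  # GeV, approximate W boson mass (assumed value)

def neutrino_pz_solutions(lep_pt, lep_phi, lep_pz, met, met_phi, m_w=M_W):
    """Solve the W-mass constraint for the neutrino pz (massless particles).

    Returns two real solutions, or the single real part when the
    discriminant is negative (imaginary part set to zero).
    """
    lep_e = math.hypot(lep_pt, lep_pz)
    mu = 0.5 * m_w**2 + lep_pt * met * math.cos(lep_phi - met_phi)
    centre = mu * lep_pz / lep_pt**2
    disc = mu**2 - (lep_pt * met) ** 2
    if disc < 0.0:
        return [centre]
    half_width = lep_e * math.sqrt(disc) / lep_pt**2
    return [centre - half_width, centre + half_width]

def w_mass(lep_pt, lep_phi, lep_pz, met, met_phi, nu_pz):
    """Invariant mass of the massless lepton plus neutrino system."""
    lep_e = math.hypot(lep_pt, lep_pz)
    nu_e = math.hypot(met, nu_pz)
    dot = lep_pt * met * math.cos(lep_phi - met_phi) + lep_pz * nu_pz
    return math.sqrt(max(0.0, 2.0 * (lep_e * nu_e - dot)))

def pick_solution(solutions, lep_pt, lep_pz, met, higgs_beta_z):
    """Choose the solution whose W beta_z is closest to the Higgs beta_z."""
    def w_beta_z(nu_pz):
        energy = math.hypot(lep_pt, lep_pz) + math.hypot(met, nu_pz)
        return (lep_pz + nu_pz) / energy
    return min(solutions, key=lambda pz: abs(w_beta_z(pz) - higgs_beta_z))
```

The discriminant is negative exactly when the lepton-neutrino transverse mass exceeds the assumed W mass, which is consistent with footnote 3: dropping the imaginary part corresponds to the configuration where the transverse mass equals the W mass.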
The selection efficiency in the 0-, 1- and 2-lepton channels and two $p_T^V$ bins ranges between approximately 6% and 16% for the WH and ZH processes where the W and Z bosons decay leptonically and the Higgs boson decays into a pair of b-quarks. The analysis does not explicitly select τ-leptons but they are accounted for in the case of leptonically decaying τ-leptons in the 1- and 2-lepton channels and hadronically decaying τ-leptons in the 0-lepton channel if they are misidentified as jets.
As discussed in Section 1, the overlaps between the event selections presented herein and those of Ref. [15] are non-negligible. In the 250 GeV < $p_T^V$ < 400 GeV region, approximately 40% of the signal events are selected by both sets of selections, and the fraction of signal events uniquely selected by the large-R jet analysis varies between 5% and 30% with increasing $p_T^V$. In the $p_T^V$ > 400 GeV region, the overlap decreases progressively to reach approximately 15% and the unique large-R jet analysis signal events increase to 75% at a $p_T^V$ of around 700 GeV.
The $t\bar{t}$ process is a major background in the 0- and 1-lepton channels. For $t\bar{t}$ events, the b-tagged track-jets associated with the Higgs-jet candidate are mainly a b- and a c-jet (the former from a top-quark decay and the latter from the hadronic W boson decay) and therefore the second b-jet from the other top-quark is often expected to be identified as an additional b-tagged track-jet not associated with the Higgs-jet candidate. Taking this into account, signal regions (SR) in the 0- and 1-lepton channels are defined by vetoing on b-tagged track-jets outside the Higgs-jet candidate and control regions (CR), enriched in $t\bar{t}$ events, are built from events which fail this veto. The SRs and CRs are accounted for in the same way in the fit, but CRs are dominated by backgrounds and are used to constrain specific background components.
Events in the 0- and 1-lepton channels are further categorised depending on the number of small-R jets not matched to the Higgs-jet candidate, i.e. with $\Delta R(H_{\mathrm{cand}}, \text{small-}R\ \text{jet}) > 1.0$. Two categories are defined: a high-purity signal region (HP SR) with 0 small-R jets not matched to the Higgs-jet candidate and a low-purity signal region (LP SR) with ≥ 1 small-R jets not matched to the Higgs-jet candidate.
The ten SRs and the four CRs are summarised in Table 3.
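The region counting can be cross-checked with a short Python sketch that enumerates the categories described in this section. It assumes that the CRs are not split by purity and that the 2-lepton channel has a single SR per $p_T^V$ bin, which is an inference consistent with the quoted totals rather than a statement taken from Table 3. All names and bin labels are hypothetical.

```python
from itertools import product

PTV_BINS = ("250-400 GeV", ">=400 GeV")

def build_regions():
    """Enumerate (n_leptons, pTV bin, region) for all analysis regions."""
    regions = []
    for n_leptons, ptv_bin in product((0, 1), PTV_BINS):
        for region in ("HP SR", "LP SR", "CR"):
            regions.append((n_leptons, ptv_bin, region))
    for ptv_bin in PTV_BINS:
        regions.append((2, ptv_bin, "SR"))
    return regions

def classify(n_leptons, ptv_gev, n_btag_outside, n_small_r_outside):
    """Assign an event to a region, or None if pTV is not above 250 GeV."""
    if ptv_gev <= 250.0:
        return None
    ptv_bin = PTV_BINS[0] if ptv_gev < 400.0 else PTV_BINS[1]
    if n_leptons == 2:
        return (2, ptv_bin, "SR")
    if n_btag_outside > 0:  # fails the b-tagged track-jet veto
        return (n_leptons, ptv_bin, "CR")
    purity = "HP SR" if n_small_r_outside == 0 else "LP SR"
    return (n_leptons, ptv_bin, purity)
```

Under these assumptions the enumeration yields ten SRs and four CRs, matching the totals quoted above.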

Background composition and estimation
The background contribution in the SRs is different for each of the three channels. In the 0-lepton channel, the dominant background sources are Z+jets and $t\bar{t}$ events with a significant contribution from W+jets and diboson production. In the 1-lepton channel, the largest backgrounds are $t\bar{t}$ and W+jets production followed by the single-top background. In the 2-lepton channel, Z+jets production is the dominant background followed by the ZZ background. Contributions from $t\bar{t}V$ and $t\bar{t}H$ are negligible. The multijet background, due to semileptonic heavy-flavour-hadron decays or misidentified jets, is found to be negligible in the 0- and 2-lepton channels as well as in the 1-lepton muon sub-channel after applying the event selections described in Section 5, as confirmed using data-driven techniques. In the 1-lepton electron sub-channel its contribution is not neglected. All initial background distribution shapes prior to the fit (described in Section 8), except those for multijet, are estimated from the samples of simulated events. The multijet shape and normalisation are determined using data.
The W/Z+jets simulated event samples are split into 6 categories depending on the 'truth' labels of the track-jets ghost-associated to the Higgs-jet candidate: W/Z+bb, W/Z+bc, W/Z+bl, W/Z+cc, W/Z+cl and W/Z+ll; in this notation l refers to a light-flavour jet.$^{4}$ The W/Z+bb fraction corresponds to approximately 80% of the total W/Z+jets background. This categorisation is used in the uncertainty variations of the ratios V+bc/V+bb, V+bl/V+bb and V+cc/V+bb to cover uncertainties on the flavour composition in V+jets production, see Section 7.
In the statistical analysis described in Section 8, the components W/Z+bb, W/Z+bc, W/Z+bl and W/Z+cc are treated as a single background component denoted by W/Z+HF. The W+HF and Z+HF contributions, which together constitute 90% of V+jets background, are estimated separately, each with its own normalisation factor determined from the fit to data. The $t\bar{t}$ production background arises from topologies with decays of W bosons into τ-leptons which then decay hadronically in the 0-lepton channel and from W bosons decaying into e/μ in the 1-lepton channel. In the 2-lepton channel the $t\bar{t}$ contribution is much smaller. For the 0- and 1-lepton channels, two independent normalisation factors are considered and left floating in the fit, where they are constrained by the CRs.
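The six flavour categories of the W/Z+jets samples and the W/Z+HF grouping described above follow from the unordered pairs of the labels b, c and l. A minimal Python sketch, with hypothetical names and with τ-labelled jets folded into the light-flavour label as stated in footnote 4:

```python
from itertools import combinations_with_replacement

FLAVOURS = ("b", "c", "l")  # l denotes a light-flavour jet
CATEGORIES = ["".join(pair) for pair in combinations_with_replacement(FLAVOURS, 2)]
HEAVY_FLAVOUR = {"bb", "bc", "bl", "cc"}  # merged into W/Z+HF in the fit

def vjets_category(label_1, label_2):
    """Category from the truth labels of the two track-jets."""
    def normalise(label):
        # tau-jets and light jets share the 'l' label (footnote 4)
        return label if label in ("b", "c") else "l"
    first, second = sorted((normalise(label_1), normalise(label_2)), key=FLAVOURS.index)
    return first + second

def is_heavy_flavour(category):
    """True for the categories treated as the single W/Z+HF component."""
    return category in HEAVY_FLAVOUR
```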
Single-top production contributes to the 0- and 1-lepton channels and Wt production is the dominant process (s- and t-channel processes amount to less than 1% globally and less than 5% of the single-top contribution).
$^{4}$ When labelling jets in the V+jets backgrounds modelling, the labelling of τ-jets is omitted and the negligible τ-lepton contribution is included with light-flavour jets.
The diboson background process consists of final states arising mostly from WZ and ZZ events, where a Z boson decays into a pair of b-quarks. This process has a topology very similar to that of the signal, exhibiting a peak in $m_J$ at the mass of the hadronically decaying vector boson. Although it is a subdominant contribution, it provides an important reference for validation. Its normalisation is measured simultaneously with the VH signal.
In the 1-lepton channel, the multijet background originating from jets misidentified as leptons and/or due to semileptonic heavy-flavour-hadron decays cannot be neglected. Since MC simulation samples are statistically limited and are not expected to reproduce the multijet production in this corner of the phase space, it is estimated from a template fit using the data. The $m_J$ templates in the electron and muon sub-channels are taken from dedicated CRs enriched in multijet background, obtained from the inversion of the tight lepton isolation requirements and the removal of the $E_T^{\mathrm{miss}}$ requirement, and after subtraction of the other backgrounds. The multijet normalisations are estimated in the SRs from a fit to the transverse mass$^{5}$ distribution separately for the electron and muon sub-channels. The contribution of the multijet background is found to be negligible in the muon sub-channel. In the electron sub-channel it is approximately 2% of the total background, with an uncertainty of 55% estimated mainly from the statistical uncertainty of the transverse mass fit. This contribution and its associated uncertainty are taken into account in the signal extraction fit.

Systematic uncertainties
Systematic uncertainties can have an impact on the overall signal and background yields, on the shapes of the jet mass distributions, on the CR to SR extrapolations, and on the relative acceptances between the HP and LP SRs and between the $p_T^V$ bins. Systematic uncertainties are discussed herein for three main categories: experimental, signal modelling, and background modelling.

Experimental systematic uncertainties
The uncertainties in the small-R jet energy scale and resolution have contributions from in situ calibration studies, from the dependency on the pile-up activity and on the flavour composition of the jets [76]. For large-R jets, the uncertainties in the energy and mass scales are based on a comparison of the ratio of calorimeter-based to track-based measurements in dijet data and simulation, as described in Ref. [81]. The impact of the jet energy scale and resolution uncertainties on the large-R jet mass is assessed by applying different calibration scales and smearings to the jet observables in the simulation, according to the estimated uncertainties. An absolute uncertainty of 2% is used for the jet energy resolution while a relative uncertainty of 20% is used for the jet mass resolution, consistent with previous studies for trimmed jets [97,98].
$^{5}$ The transverse mass $m_T$ of the W boson candidate in the event is calculated using the lepton candidate and $E_T^{\mathrm{miss}}$ [...]
The b-tagging uncertainties are assessed from the calibration data in various kinematic regions and separately for b-, c-, and light-flavour jets. The uncertainties are then decomposed in each of the flavour categories into independent components. An additional uncertainty is included to account for the extrapolation to jets with $p_T$ beyond the kinematic reach of the data calibration (the thresholds are 250 GeV, 140 GeV and 300 GeV for b-, c- and light-flavour jets, respectively) [86,90,91].
Other experimental systematic uncertainties with a smaller impact are those in the lepton energy and momentum scales, in the lepton reconstruction and identification efficiencies, and in the efficiency of the triggers. An uncertainty associated with the modelling of pile-up in the simulation is included to cover the difference between the predicted and measured inelastic cross-sections [99]. The uncertainties in the energy scale and resolution of the small-R jets and leptons are propagated to the calculation of E miss T , which also has additional uncertainties from the scale, resolution and reconstruction efficiency of the tracks used to compute the soft term, along with the modelling of the underlying event [93].
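The resolution variations quoted earlier in this subsection for large-R jets can be pictured with the following sketch, which degrades a simulated jet mass resolution by a relative amount through additional Gaussian smearing. The text does not spell out the exact smearing procedure, so the quadrature-based width, the function names and the numerical inputs are assumptions for illustration only.

```python
import math
import random

def extra_smearing_width(nominal_resolution, relative_increase):
    """Width of the extra Gaussian needed to degrade a resolution.

    Widths add in quadrature, so going from sigma to sigma * (1 + r)
    requires an additional smearing of sigma * sqrt((1 + r)^2 - 1).
    """
    return nominal_resolution * math.sqrt((1.0 + relative_increase) ** 2 - 1.0)

def smear_masses(masses, nominal_resolution, relative_increase, seed=1):
    """Return jet masses with the additional Gaussian smearing applied."""
    rng = random.Random(seed)  # fixed seed so the variation is reproducible
    width = extra_smearing_width(nominal_resolution, relative_increase)
    return [m + rng.gauss(0.0, width) for m in masses]

# Hypothetical inputs: 10 GeV nominal mass resolution, 20% relative variation
width = extra_smearing_width(10.0, 0.20)
varied = smear_masses([125.0, 91.0, 80.0], 10.0, 0.20)
print(f"extra smearing width = {width:.2f} GeV")
```

The smeared masses would then be used to rebuild the jet mass template, and the difference from the nominal template defines the systematic variation.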

Signal modelling systematic uncertainties
The systematic uncertainties that affect the modelling of the signal are derived closely following the procedure outlined in Refs. [11,16,92] and in Refs. [100,101] for uncertainties specific to STXS. The systematic uncertainties in the calculations of the V H production cross-sections and the H → bb branching fraction are assigned following the recommendations of the LHC Higgs Cross Section Working Group [56,57,102-104]. Acceptance and shape systematic uncertainties are derived to account for missing higher-order QCD and EW corrections, for PDF+α S uncertainties, and for variations of the PS and UE models. Factorisation and renormalisation scales are varied by factors of 0.5 and 2. PDF-related uncertainties are derived following Ref. [38]. The effects of the uncertainties from missing higher-order EW corrections, PDF+α S and QCD scale variations on the jet mass shape are negligible. The PS and UE uncertainty is evaluated by comparing the nominal signal Powheg-Box samples showered by Pythia 8 with alternative samples showered by Herwig 7 [105].
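As an illustration of how scale variations by factors of 0.5 and 2 can be turned into an uncertainty, the sketch below takes the envelope of a set of varied yields around the nominal one. The text does not state how the variations are combined, so the envelope prescription, the choice of six variation points and the yields are assumptions.

```python
def scale_envelope(nominal, variations):
    """Relative up/down envelope of a set of scale-varied yields."""
    highest = max(variations.values())
    lowest = min(variations.values())
    up = max(highest - nominal, 0.0) / nominal
    down = max(nominal - lowest, 0.0) / nominal
    return up, down

# Keys are (muR, muF) multipliers; yields are invented for illustration
varied_yields = {
    (0.5, 0.5): 104.0, (0.5, 1.0): 103.0, (1.0, 0.5): 101.0,
    (1.0, 2.0): 99.0, (2.0, 1.0): 97.5, (2.0, 2.0): 96.0,
}
up, down = scale_envelope(100.0, varied_yields)
print(f"scale uncertainty: +{up:.1%} / -{down:.1%}")
```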

Background modelling systematic uncertainties
The principal additional modelling uncertainties considered for the backgrounds are the following: renormalisation and factorisation scale variations by factors of 0.5 and 2, to account for missing higher-order QCD corrections to the matrix element of the process; merging scale variations from multi-leg simulations; resummation scale or parton shower uncertainties; PDF uncertainties; and differences with respect to alternative MC generators. The impact of these systematic uncertainties in terms of normalisation, shape, acceptance and extrapolation between analysis regions is then estimated and included in the fit model (described in Section 8). Given that the analysis is based on the fit of the m J variable only, all shape uncertainties are estimated with respect to this observable.
The normalisations of the W /Z +HF backgrounds are free parameters in the fit. They are determined from the jet mass distributions in the SRs, once the tt background is constrained by the CRs enriched in tt events. In addition to scale variations within Sherpa 2.2.1, alternative samples generated with MadGraph interfaced to Pythia 8 were considered for acceptance and shape variations. Finally, variations in the V + bc/V + bb, V + bl/V + bb and V + cc/V + bb ratios are accounted for independently for the W- and Z-boson backgrounds.
For top-quark pair production, specific initial-state radiation (ISR) and final-state radiation (FSR) Pythia parameters are varied to assess the related modelling uncertainties. In addition to the typical scale variations, alternative NLO samples using the MadGraph5_aMC@NLO and Herwig 7 generators were considered. The tt normalisation is free in the fit and mainly constrained in the CRs for the 0- and 1-lepton channels. For the 2-lepton channel it is constrained to its nominal predicted value with an uncertainty of 20%. Because top-quark decays are not always fully contained within the large-R jet, the relative number of events where exactly two and where three or more VR track-jets are ghost-associated to the large-R jet can modify the large-R jet mass template. This is accounted for by an additional uncertainty estimated from the impact on the tt background template of a 20% variation in this relative ratio.
The normalisations, acceptances and shapes of all single-top production processes are constrained to their predictions within the corresponding uncertainties. For the dominant W t channel, ISR/FSR uncertainties as well as alternative generator samples, Herwig 7 and MadGraph5_aMC@NLO, are considered. Since the W t channel has the same flavour composition and a similar shape in the 0- and 1-lepton channels, the modelling uncertainties were studied in the 1-lepton channel and then propagated to the 0-lepton channel. An associated extrapolation uncertainty is taken into account.
To account for the ambiguities in the interference between tt and single-top production, an alternative sample generated with Powheg-Box interfaced to Pythia 8, using the diagram subtraction (DS) scheme, is used [68]. The difference between the DS and DR schemes for the W t single-top production is accounted for as an additional systematic uncertainty.
For diboson production, in addition to the scale variations for acceptance, extrapolation and shape systematic uncertainties, alternative diboson samples were generated using Powheg-Box interfaced to Pythia 8 and the difference with respect to the Sherpa nominal samples was used as an additional uncertainty.
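The comparison between nominal and alternative generator samples used throughout this section can be sketched as follows: the difference in total yield gives an acceptance-type variation, and the bin-by-bin ratio of the area-normalised templates gives a shape variation. The helper names and the m J template contents are hypothetical.

```python
def shape_ratio(nominal, alternative):
    """Per-bin ratio of alternative to nominal after normalising both to unit area."""
    n_tot = sum(nominal)
    a_tot = sum(alternative)
    return [(a / a_tot) / (n / n_tot) for n, a in zip(nominal, alternative)]

def acceptance_difference(nominal, alternative):
    """Relative difference in total yield between the two generators."""
    return sum(alternative) / sum(nominal) - 1.0

# Invented jet-mass templates for a nominal and an alternative generator
nominal = [50.0, 120.0, 80.0, 30.0]
alternative = [55.0, 110.0, 90.0, 25.0]

ratios = shape_ratio(nominal, alternative)
delta_acc = acceptance_difference(nominal, alternative)
print("shape ratios:", [round(r, 3) for r in ratios])
print(f"acceptance difference: {delta_acc:+.1%}")
```

Separating the two effects in this way lets normalisation and shape components enter a fit as independent nuisance parameters.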

Results
The results are obtained from a binned maximum-profile-likelihood fit to the data of the m J distribution, using all the signal and control regions defined in Section 5. The fit is performed using the RooStats framework [106,107]. Signal and background m J templates are determined from MC simulation (described in Section 3) in all cases except for the multijet background in the 1-lepton channel, which is extracted from the data as discussed in Section 6.
The likelihood function is constructed from the product of the Poisson probabilities of each bin of the mass distributions and auxiliary terms used to model systematic uncertainties. The likelihood function is described in more detail in Ref. Systematic uncertainties are modelled in the likelihood function by parameterised variations of the number of signal and background events, and of the templates, through nuisance parameters (NPs). Systematic variations of the templates that are subject to large statistical fluctuations are smoothed, and systematic uncertainties that have a negligible impact on the final results are pruned away region-by-region [108]. NPs corresponding to most uncertainties discussed in Section 7 are constrained using Gaussian or log-normal probability density functions as auxiliary terms in the likelihood. The normalisations of the largest backgrounds, tt (in the 0- and 1-lepton channels), W +HF and Z +HF, are left unconstrained in the fit. The background normalisation factors obtained from the fit are 0.88 ± 0.10 and 0.83 ± 0.09 for tt in the 0- and 1-lepton channels, respectively, 1.12 ± 0.14 for W +HF and 1.32 ± 0.16 for Z +HF; they are also summarised in Table 4. The fit model uses a single normalisation factor for Z +HF, and compatible results were found when using two different factors for the 0- and 2-lepton channels. To account for the uncertainty due to the limited size of the MC simulation samples, an NP is used for each bin of the templates [109].
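The structure of such a likelihood, Poisson terms per bin multiplied by auxiliary constraint terms, can be made concrete with a toy negative log-likelihood that has one signal strength and one Gaussian-constrained nuisance parameter. This is a minimal sketch and not the RooStats model used in the analysis: the linear response of the background to the nuisance parameter, the function signature and all yields are assumptions.

```python
import math

def nll(mu, theta, data, signal, background, bkg_rel_unc):
    """Negative log-likelihood: Poisson terms per bin plus a Gaussian constraint.

    mu    : signal strength (parameter of interest)
    theta : nuisance parameter in units of its prior uncertainty
    """
    total = 0.5 * theta ** 2  # Gaussian auxiliary term for the nuisance parameter
    for n, s, b in zip(data, signal, background):
        expected = mu * s + b * (1.0 + bkg_rel_unc * theta)
        total += expected - n * math.log(expected) + math.lgamma(n + 1.0)
    return total

# Invented three-bin templates; toy data generated at mu = 1, theta = 0
signal = [2.0, 10.0, 3.0]
background = [50.0, 60.0, 40.0]
data = [s + b for s, b in zip(signal, background)]

# Coarse scan of mu with theta fixed at 0 (a real fit would profile theta)
scan = [(0.1 * i, nll(0.1 * i, 0.0, data, signal, background, 0.10)) for i in range(31)]
best_mu = min(scan, key=lambda point: point[1])[0]
print(f"best-fit mu on the grid: {best_mu:.1f}")
```

An unconstrained normalisation, such as those of the tt, W +HF and Z +HF backgrounds, would correspond to a parameter without the Gaussian auxiliary term.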
The m J distributions with signal strengths, background normalisations and all NPs set at their best-fit values, are shown in Fig. 1 for all three channels' SRs and in Fig. 2 for the CRs. The low-purity and high-purity categories in the case of the 0-lepton and 1-lepton channels are merged in Fig. 1. In all SRs and CRs a good agreement between the data and the prediction is observed.
For a Higgs boson mass of 125 GeV, when all lepton channels are combined, the observed excess with respect to the background-only hypothesis has a significance of 2.1 standard deviations, to be compared with an expectation of 2.7 standard deviations. The Table 5.
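To relate significances in standard deviations to tail probabilities, the sketch below converts a significance to a one-sided p-value and evaluates the standard asymptotic formula for the median significance of a single counting experiment. The quoted significances come from the full profile likelihood fit, not from this formula; the counting-experiment yields are invented.

```python
import math

def asimov_significance(s, b):
    """Median discovery significance of a single-bin counting experiment (asymptotic)."""
    return math.sqrt(2.0 * ((s + b) * math.log(1.0 + s / b) - s))

def p_value_from_z(z):
    """One-sided p-value corresponding to a significance of z standard deviations."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

z_toy = asimov_significance(10.0, 100.0)  # invented signal and background yields
p_obs = p_value_from_z(2.1)               # observed significance quoted in the text
p_exp = p_value_from_z(2.7)               # expected significance quoted in the text
print(f"toy Z = {z_toy:.2f}, p(2.1 sigma) = {p_obs:.4f}, p(2.7 sigma) = {p_exp:.4f}")
```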
The m J distribution is shown in Fig. 3(a) summed over all channels and signal regions, weighted by their respective values of the ratio of the fitted Higgs boson signal to background yields and after subtraction of all backgrounds except for the W Z and Z Z diboson processes. Fig. 3(b) shows the results of: a fit with six V H POIs measuring the individual signal strengths in each of the three channels and p V T bins separately; a three V H POI fit measuring the combined signal strengths in each channel; a two V H POI fit combining all channels in the two p V T bins separately; and the overall single V H POI combination.
For V Z production, the fitted signal strength μ bb V Z is in agreement with the SM prediction and with the W ± Z differential cross-section measurement performed by ATLAS at high transverse momentum of the Z boson (p Z T > 220 GeV) in the fully leptonic channel (W ± Z → ℓ′ν ℓ + ℓ − ) [110]. The simultaneous fit tests the performance of the analysis on an irreducible background, the known V Z production, with a topology similar to the V H signal. With all three lepton channels combined, a significance of 5.4 standard deviations is observed for the V Z process, compared to an expectation of 5.7 standard deviations. The correlation with the μ bb V H signal strength is approximately 11%. The statistical uncertainties amount to approximately 60% of the total uncertainty. The dominant source of systematic uncertainty is the background modelling, which has an impact of approximately 0.16 on the result. The source of systematic uncertainty related to the large-R jet reconstruction follows closely, with an impact of approximately 0.09.
The cross-sections in the STXS framework are measured separately for Z H and W H production in two p V T regions, 250 GeV < p V T < 400 GeV and p V T ≥ 400 GeV. The analysis closely follows the strategy used in Ref. [11]. The expected signal distributions and acceptance times efficiencies for each STXS region are estimated from the simulated signal samples by selecting events using the generator's 'truth' information, in particular the 'truth' p V T , denoted by  [111]. The sources of systematic uncertainty are identical to those defined in Section 7, except for the theoretical cross-section and branching fraction uncertainties, which are not included in the likelihood function because they affect the signal strength measurements but not the STXS measurements. The cross-sections are not constrained to be positive in the fit.
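The categorisation of simulated signal events into the four measured STXS regions can be sketched as a simple lookup on the production mode and the truth-level p V T . The function name, the label format and the treatment of events at or below 250 GeV are hypothetical; only the bin boundaries are taken from the text.

```python
def reduced_stxs_bin(production_mode, truth_ptv_gev):
    """Return the reduced STXS region label, or None outside the measured range."""
    if production_mode not in ("WH", "ZH"):
        raise ValueError("production_mode must be 'WH' or 'ZH'")
    if truth_ptv_gev <= 250.0:
        return None  # below the regions measured in this analysis
    if truth_ptv_gev < 400.0:
        return f"{production_mode}, 250 < pTV < 400 GeV"
    return f"{production_mode}, pTV >= 400 GeV"

# Invented truth-level events: (production mode, truth pTV in GeV)
events = [("WH", 300.0), ("ZH", 400.0), ("ZH", 180.0), ("WH", 650.0)]
labels = [reduced_stxs_bin(mode, ptv) for mode, ptv in events]
print(labels)
```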
The measured reduced stage-1.2 V H cross-section times branching fraction σ × B in each STXS bin, together with the SM predictions, are summarised in Fig. 4, where the red error bands correspond to the theoretical uncertainty of the fiducial cross-section prediction in each bin. The measurements are also reported in Table 6 and are in agreement with the SM predictions from the signal MC sample. The principal sources of systematic uncertainties are similar to those affecting μ bb V H . These results complement and extend those obtained by the small-R jets analysis [15] using the same dataset. The latter provides a more precise measurement of the cross-section in the inclusive p V T > 250 GeV region. This can be attributed to its larger acceptance at lower p V T values, to more precise physics-object calibrations and to the use of multivariate analysis techniques. The results obtained by the two analyses in this region are compatible within one standard deviation.

Constraints on anomalous Higgs boson interactions
The STXS results presented in Section 8 are interpreted in an effective field theory approach where the scale of new physics is significantly larger than the SM electroweak scale so as to affect the measured observables at the LHC only through effective interactions among SM particles.
In this SMEFT approach, the SM Lagrangian is extended with higher-dimensional operators that capture the low-energy limit effects of a fundamental ultraviolet theory, without a priori knowledge of this theory [18]. The Warsaw formulation of dimension-6 operators [112] is used, taking into account only the lepton- and baryon-number-conserving ones. Furthermore, only the CP-even terms that respect a U(3)⁵ flavour symmetry and affect the pp → V (→ leptons)H(→ bb) process are considered [113]. The operators affecting the signal processes are listed in Table 7 [114]. The Wilson coefficients are used to parameterise the STXS and the Higgs boson decay rates [114] from leading-order predictions [113] and can be constrained using the STXS measurements presented in Section 8. The parameterisation of the STXS takes into account the linear terms originating from the interference between SM and non-SM amplitudes as well as the quadratic ones from the squared non-SM amplitudes. The former are of order 1/Λ² and the latter of order 1/Λ⁴, where Λ is the scale of new physics. Given that the current parameterisation takes neither next-to-leading-order effects nor the interference between SM and dimension-8 operators into account, the 1/Λ⁴ terms are incomplete. Where applicable, fit results will be shown for both the linear-only parameterisation and the case where quadratic terms are also included. Since the gg → Z H production cross-section is higher order in QCD, it is kept fixed to its SM expectation. The impact of the Wilson coefficients on the experimental analysis acceptance is not accounted for in this study. It was, however, verified that the impact was less than 20% of the SMEFT parameterisation in the kinematic range considered in this study.

Table 7
Wilson coefficients and their corresponding dimension-6 operators in the Warsaw formulation considered in this analysis [112,114].
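The linear and quadratic parameterisation described above amounts to scaling each SM cross-section by a polynomial in the Wilson coefficients. A minimal sketch follows, in which the coefficient names (`cHq3` for the c (3) Hq coefficient and a generic `cX`), the numerical factors and the function name are placeholders and not the values used in the analysis.

```python
def smeft_scaling(wilson, linear, quadratic=None):
    """Ratio sigma / sigma_SM for a given set of Wilson coefficients.

    linear    : {name: A_i}               interference terms, order 1/Lambda^2
    quadratic : {(name_i, name_j): B_ij}  squared non-SM terms, order 1/Lambda^4
    """
    ratio = 1.0
    for name, a in linear.items():
        ratio += a * wilson.get(name, 0.0)
    if quadratic:
        for (name_i, name_j), b in quadratic.items():
            ratio += b * wilson.get(name_i, 0.0) * wilson.get(name_j, 0.0)
    return ratio

# Placeholder factors for one STXS bin, NOT taken from the analysis
linear = {"cHq3": 2.0, "cX": 0.5}
quadratic = {("cHq3", "cHq3"): 4.0, ("cHq3", "cX"): 1.0}

point = {"cHq3": 0.1}  # all other coefficients at their SM value of 0
lin_only = smeft_scaling(point, linear)
lin_quad = smeft_scaling(point, linear, quadratic)
print(f"linear only: {lin_only:.3f}, linear + quadratic: {lin_quad:.3f}")
```

Passing `quadratic=None` reproduces the linear-only case, which mirrors the two fit configurations reported in the text.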
Due to the limited number of STXS bins, not all Wilson coefficients can be measured simultaneously. The interpretations in the SMEFT framework are carried out with two different approaches. In the first, the four Wilson coefficients to which the analysis is most sensitive (after removing degeneracies), in addition to the operator which affects the H → bb decay, are varied individually, while the others are kept at their Standard Model value of 0. In the second approach, a principal component decomposition is performed which provides a set of four linear combinations (corresponding to the number of STXS measurements) of all the Wilson coefficients of Table 7 Hq coefficient the linear-and-quadratic fit slightly favours a negative value of c (3) Hq , but is also compatible with the SM value of 0 at the 68% confidence level (CL). The negative-log-likelihood one-dimensional profiles for the individual fits of the five aforementioned Wilson coefficients are shown in the Appendix.
Table 8
Linear combinations of Wilson coefficients corresponding to the principal component decomposition eigenvectors (coefficients less than 0.10 have been omitted for better readability). The corresponding eigenvalues, representing in the Gaussian approximation the squared inverse uncertainty of the measured eigenvector, are also indicated.
Coefficient  Eigenvalue  Eigenvector combination

Fig. 6. The impact of the measured +1σ (solid) and −1σ (dashed) variations of the four eigenvectors on the reduced STXS bins.

Table 9
Expected and observed best-fit values and associated uncertainties (68% CL) from a simultaneous fit of the four coefficients corresponding to the eigenvector combinations.
Coefficient  Expected  Observed

The second approach aims at identifying the Wilson coefficients to which the measurements are most sensitive (those to which the measurements are not sensitive manifest themselves as flat directions in the likelihood) and determining a set of orthogonal linear combinations of Wilson coefficients. Only linear terms are considered and four coefficients denoted c EA , c EB , c EC and c ED , corresponding to linear combinations of the Wilson coefficients according to the eigenvectors of the principal component decomposition, ordered in terms of their experimental sensitivity, are extracted from the data. The parameterisation in terms of the main Wilson coefficients (reported in Table 7) and of the branching fraction of the Higgs boson to b-quarks, is given in Table 8 (coefficients less than 0.10 have been omitted for better readability). In contrast to the first approach where the STXS and the partial and total Higgs decay widths were parameterised independently, this second approach considers the full branching fraction H → bb as one linear parameter to remove redundancies in operators that only affect the total Higgs width. The leading combination c EA is dominated by the c Hq Wilson coefficient as expected from the fact that it is also the most constrained from the individual fits. The next-to-leading c EB combination is dominated by the c Hq . The impact of the variations corresponding to the measurement of each eigenvector combination on the expected cross-sections in the reduced STXS measurement bins is shown in Fig. 6. The results of the simultaneous fit are given in Table 9. No significant deviation from their expected SM value is observed.
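The principal component decomposition can be illustrated with a two-coefficient toy: the information matrix is built from the linear parameterisation and the measurement uncertainties, and its eigenvectors define the orthogonal combinations, with each eigenvalue equal to the squared inverse uncertainty of the corresponding combination in the Gaussian approximation. The diagonal covariance, the 2×2 restriction and every number below are assumptions chosen to keep the sketch short.

```python
import math

def fisher_matrix(param_matrix, sigmas):
    """H = P^T V^-1 P for a diagonal covariance V of the bin measurements."""
    n_coeff = len(param_matrix[0])
    h = [[0.0] * n_coeff for _ in range(n_coeff)]
    for row, sigma in zip(param_matrix, sigmas):
        for i in range(n_coeff):
            for j in range(n_coeff):
                h[i][j] += row[i] * row[j] / sigma ** 2
    return h

def eigen_symmetric_2x2(h):
    """Eigenvalues and unit eigenvectors of a symmetric 2x2 matrix, largest first."""
    a, b, d = h[0][0], h[0][1], h[1][1]
    if abs(b) < 1e-12:  # already diagonal
        pairs = [(a, (1.0, 0.0)), (d, (0.0, 1.0))]
        return sorted(pairs, key=lambda p: p[0], reverse=True)
    mean = 0.5 * (a + d)
    delta = math.hypot(0.5 * (a - d), b)
    result = []
    for lam in (mean + delta, mean - delta):
        vec = (b, lam - a)  # solves (H - lam * I) v = 0 when b != 0
        norm = math.hypot(vec[0], vec[1])
        result.append((lam, (vec[0] / norm, vec[1] / norm)))
    return result

# Invented linear dependence of four STXS bins on two Wilson coefficients
param_matrix = [[1.0, 0.5], [2.0, 1.0], [1.0, -0.5], [3.0, 0.0]]
sigmas = [0.5, 0.5, 1.0, 1.0]  # invented relative uncertainties of the bins

components = eigen_symmetric_2x2(fisher_matrix(param_matrix, sigmas))
for lam, vec in components:
    print(f"eigenvalue {lam:.2f}, uncertainty {1.0 / math.sqrt(lam):.3f}, direction {vec}")
```

Directions with very small eigenvalues correspond to the flat directions in the likelihood mentioned in the text.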

Conclusion
Measurements of a Higgs boson decaying into a b-quark pair, produced in association with a vector boson at high transverse momentum subsequently decaying into a pair of leptons (electrons, muons and/or neutrinos) are performed. The results are based on the Run 2 dataset of pp collision data collected at √ s = 13 TeV by the ATLAS detector at the LHC, corresponding to an integrated luminosity of 139 fb −1 . These are the most precise measurements currently available in the high transverse momentum regime for this process. The Higgs boson is reconstructed from a single large-R jet to enhance the sensitivity in the high-p T regime. The significance of the measurement of the SM V H process is 2.1 standard deviations. The expected significance corresponding to the production of the SM V H process is 2.7 standard deviations. For the diboson process (V Z), which is measured simultaneously, the observed significance is 5.4 standard deviations, in good agreement with the expected diboson significance of 5.7 standard deviations. The cross-sections for the associated production of a Higgs boson decaying into a b-quark pair with an electroweak gauge boson W or Z decaying into leptons are measured in the simplified template cross-section framework in two p V T regions: 250 GeV < p V T < 400 GeV and p V T ≥ 400 GeV. All results are in agreement with SM predictions and are interpreted in terms of constraints on anomalous couplings in the framework of a Standard Model effective field theory.

Declaration of competing interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgements
We thank CERN for the very successful operation of the LHC, as well as the support staff from our institutions without whom ATLAS could not be operated efficiently.

Appendix A
This appendix contains the negative log-likelihood profile for the individual fits of the Wilson coefficients (Fig. 7). In each fit all coefficients other than the one individually varied are set to zero. Each fit is performed twice, once with and once without the quadratic terms taken into account, as described in Section 9.
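How a one-dimensional negative log-likelihood profile translates into a 68% CL interval, with and without quadratic terms, can be sketched for a single Gaussian-distributed measurement. The measured value, its uncertainty and the linear and quadratic factors are invented; the real profiles come from the full fit.

```python
def delta_nll_scan(measured, sigma, a_lin, b_quad, c_values):
    """Return [(c, NLL(c) - NLL_min)] for one Gaussian-distributed measurement."""
    nll = [0.5 * ((1.0 + a_lin * c + b_quad * c * c - measured) / sigma) ** 2
           for c in c_values]
    nll_min = min(nll)
    return [(c, value - nll_min) for c, value in zip(c_values, nll)]

def interval_68(scan):
    """Outermost scanned values of c with NLL - NLL_min <= 0.5.

    With quadratic terms the allowed region can be disjoint; this helper
    only reports its outer edges.
    """
    inside = [c for c, dnll in scan if dnll <= 0.5]
    return min(inside), max(inside)

c_values = [i / 1000.0 for i in range(-1000, 1001)]
linear_scan = delta_nll_scan(0.8, 0.4, 2.0, 0.0, c_values)  # linear terms only
quad_scan = delta_nll_scan(0.8, 0.4, 2.0, 4.0, c_values)    # with quadratic term

lo_lin, hi_lin = interval_68(linear_scan)
lo_quad, hi_quad = interval_68(quad_scan)
print(f"linear: [{lo_lin:.3f}, {hi_lin:.3f}], linear+quadratic: [{lo_quad:.3f}, {hi_quad:.3f}]")
```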