Observation of electroweak $W^{\pm}Z$ boson pair production in association with two jets in $pp$ collisions at $\sqrt{s} =$ 13 TeV with the ATLAS detector

An observation of electroweak $W^{\pm}Z$ production in association with two jets in proton-proton collisions is presented. The data collected by the ATLAS detector at the Large Hadron Collider in 2015 and 2016 at a centre-of-mass energy of $\sqrt{s} =$ 13 TeV are used, corresponding to an integrated luminosity of 36.1 fb$^{-1}$. Events containing three identified leptons, either electrons or muons, and two jets are selected. The electroweak production of $W^{\pm}Z$ bosons in association with two jets is measured with an observed significance of 5.3 standard deviations. A fiducial cross-section for electroweak production including interference effects is measured to be $\sigma_{WZjj\mathrm{-EW}} = 0.57 \; ^{+ 0.14} _{- 0.13} \,(\mathrm{stat.}) \; ^{+ 0.07} _{- 0.06} \,(\mathrm{syst.}) \; \mathrm{fb}$. Differential cross-sections of $W^\pm Z jj$ production for several kinematic observables are also measured.


Introduction
The electromagnetic calorimeter covers the region |η| < 3.2 and is based on high-granularity, lead/liquidargon (LAr) sampling technology. The hadronic calorimeter uses a steel/scintillator-tile detector in the region |η| < 1.7 and a copper/LAr detector in the region 1.5 < |η| < 3.2. The most forward region of the detector, 3.1 < |η| < 4.9, is equipped with a forward calorimeter, measuring electromagnetic and hadronic energies in copper/LAr and tungsten/LAr modules.
The muon spectrometer comprises separate trigger and high-precision tracking chambers to measure the deflection of muons in a magnetic field generated by three large superconducting toroidal magnets arranged with an eightfold azimuthal coil symmetry around the calorimeters. The high-precision chambers cover the range |η| < 2.7 with three layers of monitored drift tubes, complemented by cathode strip chambers in the forward region, where the particle flux is highest. The muon trigger system covers the range |η| < 2.4 with resistive-plate chambers in the barrel and thin-gap chambers in the endcap regions.
A two-level trigger system is used to select events in real time [15]. It consists of a hardware-based first-level trigger and a software-based high-level trigger. The latter employs algorithms similar to those used offline to identify electrons, muons, photons and jets.

Phase space for cross-section measurements
The W ± Z j j electroweak cross-section is measured in a fiducial phase space that is defined by the kinematics of the final-state leptons, electrons or muons, associated with the W ± and Z boson decays. Leptons produced in the decay of a hadron, a τ-lepton, or their descendants are not considered in the definition of the fiducial phase space. At particle level, the kinematics of the charged lepton after quantum electrodynamics (QED) final-state radiation (FSR) are 'dressed' by including contributions from photons with an angular distance ∆R ≡ (∆η) 2 + (∆φ) 2 < 0.1 from the lepton. Dressed charged leptons, and final-state neutrinos that do not originate from hadron or τ-lepton decays, are matched to the W ± and Z boson decay products using a Monte Carlo (MC) generator-independent algorithmic approach, called the 'resonant shape' algorithm. This algorithm is based on the value of an estimator expressing the product of the nominal line shapes of the W and Z resonances as detailed in Ref. [10].
The fiducial phase space of the measurement matches the one used in Refs. [10,16] and is defined at particle level by the following requirements: the charged leptons from the Z boson decay are required to have transverse momentum p T > 15 GeV, the charged lepton from the W ± decay is required to have transverse momentum p T > 20 GeV, the charged leptons from the W ± and Z bosons are required to have |η| < 2.5 and the invariant mass of the two leptons from the Z boson decay must be within 10 GeV of the nominal Z boson mass, taken from the world average value, m PDG Z [17]. The W boson transverse mass, defined as m W T = 2 · p ν T · p T · [1 − cos ∆φ( , ν)], where ∆φ( , ν) is the angle between the lepton and the neutrino in the transverse plane and p ν T is the transverse momentum of the neutrino, is required to be m W T > 30 GeV. The angular distance between the charged lepton from the W ± decay and each of the charged leptons from the Z decay is required to be ∆R > 0.3, and the angular distance between the two leptons from the Z decay is required to be ∆R > 0.2. Requiring that the transverse momentum of the leading lepton be above 27 GeV reduces the acceptance of the fiducial phase space by only 0.02% and is therefore not added to the definition of the fiducial phase space, although it is present in the selection at the detector level presented in Section 5.
In addition to these requirements that define an inclusive phase space, at least two jets with p T > 40 GeV and |η j | < 4.5 are required. These particle-level jets are reconstructed from stable particles with a lifetime of τ > 30 ps in the simulation after parton showering, hadronisation, and decay of particles with τ < 30 ps. Muons, electrons, neutrinos and photons associated with W and Z decays are excluded. The particle-level jets are reconstructed using the anti-k t [18] algorithm with a radius parameter R = 0.4. The angular distance between all selected leptons and jets is required to be ∆R( j, ) > 0.3. If the ∆R( j, ) requirement is not satisfied, the jet is discarded. The invariant mass, m j j , of the two highest-p T jets in opposite hemispheres, η j1 · η j2 < 0, is required to be m j j > 500 GeV to enhance the sensitivity to VBS processes. These two jets are referred to as tagging jets. Finally, processes with a b-quark in the initial state, such as t Z j production, are not considered as signal. The production of t Z j results from a t-channel exchange of a W boson between a band a u-quark giving a final state with a t-quark, a Z boson and a light-quark jet, but does not include diagrams with gauge boson couplings.

Signal and background simulation
Monte Carlo simulation is used to model signal and background processes. All generated MC events were passed through the ATLAS detector simulation [19], based on G 4 [20], and processed using the same reconstruction software as used for the data. The event samples include the simulation of additional proton-proton interactions (pile-up) generated with P 8.186 [21] using the MSTW2008LO [22] parton distribution functions (PDF) and the A2 [23] set of tuned parameters.
Scale factors are applied to simulated events to correct for the differences between data and MC simulation in the trigger, reconstruction, identification, isolation and impact parameter efficiencies of electrons and muons [24,25]. Furthermore, the electron energy and muon momentum in simulated events are smeared to account for differences in resolution between data and MC simulation [25,26]. The S 2.2.2 MC event generator [27-34] was used to model W ± Z j j events. It includes the modelling of hard scattering, parton showering, hadronisation and the underlying event. A MC event sample, referred to as W Z j j−EW, includes processes of order six (zero) in α EW (α S ). In this sample, which includes VBS diagrams, two additional jets originating from electroweak vertices from matrixelement partons are included in the final state. Diagrams with a b-quark in either the initial or final state, i.e. b-quarks in the matrix-element calculation, are not considered. This sample provides a LO prediction for the W Z j j−EW signal process. A second MC event sample, referred to as W Z j j−QCD, includes processes of order four in α EW in the matrix-element with up to one parton at next-to-leading order (NLO) and two to three partons at leading order (LO). This W Z j j−QCD sample includes matrixelement b-quarks. Both S samples were generated using the NNPDF3.0 [35] PDF set. Interferences between the two processes were estimated at LO using the M G 5_aMC@NLO 2.3 [36] MC event generator with the NNPDF3.0 PDF set, including only contributions to the squared matrix-element of order one in α S . They are found to be positive and approximately 10% of the W Z j j−EW cross-section in the fiducial phase space and are treated as an uncertainty in the measurement, as discussed in Section 8. For the estimation of modelling uncertainties, alternative MC samples of W Z j j−QCD and W Z j j−EW processes were generated with M G 5_aMC@NLO 2.3 at LO in QCD, including up to two partons in the matrix-element for W Z j j−QCD, and using the NNPDF3.0 PDF set. MC samples of inclusive W ± Z production generated at NLO in QCD with the P -B v2 [37][38][39][40] generator, interfaced to P 8.210 or H ++ 2.7.1 [41] for simulation of parton showering and hadronisation are also used for tests of the modelling of W Z j j−QCD events.
The qq → Z Z ( * ) processes were generated with S 2.2.2 and the NNPDF3.0 PDF set. Similarly to W ± Z simulation, the Z Z j j−QCD and Z Z j j−EW processes are generated separately with in the matrix-element up to one parton at NLO and two jets at LO from electroweak vertices, respectively. The S 2.1.1 MC event generator was used to model the gg → Z Z ( * ) and VVV processes at LO using the CT10 [42] PDF set. The ttV processes were generated at NLO with the M G 5_aMC@NLO 2.3 MC generator using the NNPDF3.0 PDF set interfaced to the P 8.186 parton shower model. The associated production of a single top quark and a Z boson was simulated at LO with M G 5_aMC@NLO 2.3 using the NNPDF3.0 PDF set and interfaced to P 8.186 for parton shower.

Event selection
Candidate events were selected using single-leptons triggers [15] that required at least one electron or muon. The transverse momentum threshold of the leptons in 2015 was 24 GeV for electrons and 20 GeV for muons satisfying a loose isolation requirement based only on ID track information. Due to the higher instantaneous luminosity in 2016 the trigger threshold was increased to 26 GeV for both the electrons and muons and tighter isolation requirements were applied. Possible inefficiencies for leptons with large transverse momenta were reduced by including additional electron and muon triggers that did not include any isolation requirements with transverse momentum thresholds of p T = 60 GeV and 50 GeV, respectively. Finally, a single-electron trigger requiring p T > 120 GeV or p T > 140 GeV in 2015 and 2016, respectively, with less restrictive electron identification criteria was used to increase the selection efficiency for high-p T electrons. The combined efficiency of these triggers is close to 100% for W ± Z j j events. Only data recorded with stable beam conditions and with all relevant detector subsystems operational are considered.
Events are required to have a primary vertex reconstructed from at least two charged-particle tracks and compatible with the pp interaction region. If several such vertices are present in the event, the one with the highest sum of the p 2 T of the associated tracks is selected as the production vertex of the W ± Z. All final states with three charged leptons (electrons or muons) and neutrinos from W ± Z leptonic decays are considered.
Muon candidates are identified by tracks reconstructed in the muon spectrometer and matched to tracks reconstructed in the inner detector. Muons are required to satisfy a 'medium' identification selection that is based on requirements on the number of hits in the ID and the MS [25]. The efficiency of this selection averaged over p T and η is > 98%. The muon momentum is calculated by combining the MS measurement, corrected for the energy deposited in the calorimeters, with the ID measurement. The transverse momentum of the muon must satisfy p T > 15 GeV and its pseudorapidity must satisfy |η| < 2.5.
Electron candidates are reconstructed from energy clusters in the electromagnetic calorimeter matched to ID tracks. Electrons are identified using a likelihood function constructed from information from the shape of the electromagnetic showers in the calorimeter, track properties and track-to-cluster matching quantities [24]. Electrons must satisfy a 'medium' likelihood requirement, which provides an overall identification efficiency of 90%. The electron momentum is computed from the cluster energy and the direction of the track. The transverse momentum of the electron must satisfy p T > 15 GeV and the pseudorapidity of the cluster must be in the ranges |η| < 1.37 or 1.52 < |η| < 2.47.
Electron and muon candidates are required to originate from the primary vertex. The significance of the track's transverse impact parameter relative to the beam line must satisfy |d 0 /σ d 0 | < 3 (5) for muons (electrons), and the longitudinal impact parameter, z 0 (the difference between the value of z of the point on the track at which d 0 is defined and the longitudinal position of the primary vertex), is required to satisfy |z 0 · sin(θ)| < 0.5 mm.
Electrons and muons are required to be isolated from other particles, according to calorimeter-cluster and ID-track information. The isolation requirement for electrons varies with p T and is tuned for an efficiency of at least 90% for p T > 25 GeV and at least 99% for p T > 60 GeV [24]. Fixed thresholds values are used for the muon isolation variables, providing an efficiency above 90% for p T > 15 GeV and at least 99% for p T > 60 GeV [25].
Jets are reconstructed from clusters of energy depositions in the calorimeter [43] using the anti-k t algorithm [18] with a radius parameter R = 0.4. Events with jets arising from detector noise or other non-collision sources are discarded [44]. All jets must have p T > 25 GeV and be reconstructed in the pseudorapidity range |η| < 4.5. A multivariate combination of track-based variables is used to suppress jets originating from pile-up in the ID acceptance [45]. The energy of jets is calibrated using a jet energy correction derived from simulation and in situ methods using data [46]. Jets in the ID acceptance with p T > 25 GeV containing a b-hadron are identified using a multivariate algorithm [47, 48] that uses impact parameter and reconstructed secondary vertex information of the tracks contained in the jets. Jets initiated by b-quarks are selected by setting the algorithm's output threshold such that a 70% b-jet selection efficiency is achieved in simulated tt events.
The transverse momentum of the neutrino is estimated from the missing transverse momentum in the event, E miss T , calculated as the negative vector sum of the transverse momentum of all identified hard (high p T ) physics objects (electrons, muons and jets), as well as an additional soft term. A track-based measurement of the soft term [49,50], which accounts for low-p T tracks not assigned to a hard object, is used.
Events are required to contain exactly three lepton candidates satisfying the selection criteria described above. To ensure that the trigger efficiency is well determined, at least one of the candidate leptons is required to have p T > 25 GeV or p T > 27 GeV for the 2015 or 2016 data, respectively, and to be geometrically matched to a lepton that was selected by the trigger.
To suppress background processes with at least four prompt leptons, events with a fourth lepton candidate satisfying looser selection criteria are rejected. For this looser selection, the p T requirement for the leptons is lowered to p T > 5 GeV and 'loose' identification requirements are used for both the electrons and muons. A less stringent requirement is applied for electron isolation based only on ID track information and electrons with cluster in the range 1.37 ≤ |η| ≤ 1.52 are also considered.
Candidate events are required to have at least one pair of leptons of the same flavour and of opposite charge, with an invariant mass that is consistent with the nominal Z boson mass [51] to within 10 GeV. This pair is considered to be the Z boson candidate. If more than one pair can be formed, the pair whose invariant mass is closest to the nominal Z boson mass is taken as the Z boson candidate.
The remaining third lepton is assigned to the W boson decay. The transverse mass of the W candidate, computed using E miss T and the p T of the associated lepton, is required to be greater than 30 GeV.
Backgrounds originating from misidentified leptons are suppressed by requiring the lepton associated with the W boson to satisfy more stringent selection criteria. Thus, the transverse momentum of these leptons is required to be p T > 20 GeV. Furthermore, leptons associated with the W boson decay are required to satisfy the 'tight' identification requirements, which have an efficiency between 90% and 98% for muons and an efficiency of 85% for electrons. Finally, muons must also satisfy a tighter isolation requirement, tuned for an efficiency of at least 90% (99%) for p T > 25 (60) GeV.
To select W ± Z j j candidates, events are further required to be associated with at least two 'tagging' jets. The leading tagging jet is selected as the highest-p T jet in the event with p T > 40 GeV. The second tagging jet is selected as the one with the highest p T among the remaining jets that have a pseudorapidity of opposite sign to the first tagging jet and a p T > 40 GeV. These two jets are required to verify m j j > 150 GeV, in order to minimise the contamination from triboson processes.
The final signal region (SR) for VBS processes is defined by requiring that the invariant mass of the two tagging jets, m j j , be above 500 GeV and that no b-tagged jet be present in the event.

Background estimation
The background sources are classified into two groups: events where at least one of the candidate leptons is not a prompt lepton (reducible background) and events where all candidates are prompt leptons or are produced in the decay of a τ-lepton (irreducible background). Candidates that are not prompt leptons are also called 'misidentified' or 'fake' leptons.
The dominant source of background originates from the QCD-induced production of W ± Z dibosons in association with two jets, W Z j j−QCD. The shapes of distributions of kinematic observables of this irreducible background are modelled by the S MC simulation. The normalisation of this background is, however, constrained by data in a dedicated control region. This region, referred to as W Z j j−QCD CR, is defined by selecting a sub-sample of W ± Z j j candidate events with m j j < 500 GeV and no reconstructed b-quarks.
The other main sources of irreducible background arise from Z Z and tt + V (where V = Z or W).These irreducible backgrounds are also modelled using MC simulations. Data in two additional dedicated control regions, referred to as Z Z-CR and b-CR, respectively, are used to constrain the normalisations of the Z Z j j−QCD and tt + V backgrounds. The control region Z Z-CR, enriched in Z Z events, is defined by applying the W ± Z j j event selection defined in Section 5, with the exception that instead of vetoing a fourth lepton it is required that events have at least a fourth lepton candidate with looser identification requirements. This region is dominated by Z Z j j−QCD events with a small contribution of Z Z j j−EW events. The control region b-CR, enriched in tt +V events, is defined by selecting W ± Z j j candidate events having at least one reconstructed b-jet. Remaining sources of irreducible background are Z Z j j−EW VVV and t Z j events. Their contributions in the control and signal regions are estimated from MC simulations.
The reducible backgrounds originate from Z + j, Zγ, tt, Wt and WW production processes. The reducible backgrounds are estimated using a data-driven method based on the inversion of a global matrix containing the efficiencies and the misidentification probabilities for prompt and fake leptons [10,52]. The method exploits the classification of the lepton as loose or tight candidates and the probability that a fake lepton is misidentified as a loose or tight lepton candidate. Tight leptons candidates are signal lepton candidates as defined in Section 5. Loose lepton candidates are leptons that do not meet the isolation and identification criteria of signal lepton candidates but satisfy only looser criteria. The misidentification probabilities for fake leptons are determined from data, using dedicated control samples enriched in non-prompt leptons from heavy-flavour jets and in misidentified leptons from photon conversions or charged hadrons in lightflavour jets. The lepton misidentification probabilities are applied to samples of W ± Z j j candidate events in data where at least one and up to three of the lepton candidates are loose. Then, using a matrix inversion, the number of events with at least one misidentified lepton, which represents the amount of reducible background in the selected W ± Z j j sample, is obtained.
The number of observed events together with the expected background contributions are summarised in Table 1 for the signal region and the three control regions. All sources of uncertainties, as described in Section 8, are included. The expected signal purity in the W ± Z j j signal region is about 13%, and 72% of the events arise from W Z j j−QCD production. Table 1: Expected and observed numbers of events in the W ± Z j j signal region and in the three control regions, before the fit. The expected number of W Z j j−EW events from S and the estimated number of background events from the other processes are shown. The sum of the backgrounds containing misidentified leptons is labelled 'Misid. leptons'. The contribution arising from interferences between W Z j j−QCD and W Z j j−EW processes is not included in the table. The total uncertainties are quoted.

Signal extraction procedure
Given the small contribution to the signal region of W Z j j−EW processes, a multivariate discriminant is used to separate the signal from the backgrounds. A boosted decision tree (BDT), as implemented in the TMVA package [53], is used to exploit the kinematic differences between the W Z j j−EW signal and the W Z j j−QCD and other backgrounds. The BDT is trained and optimised on simulated events to separate W Z j j−EW events from all background processes.
A total of 15 variables are combined into one discriminant, the BDT score output value in the range [−1, 1]. The variables can be classified into three categories: jet-kinematic variables, vector-bosons-kinematics variables, and variables related to both jets and leptons kinematics. The variables related to the kinematic properties of the two tagging jets are the invariant mass of the two jets, m j j , the transverse momenta of the jets, the difference in pseudorapidity and azimuthal angle between the two jets, ∆η j j and ∆φ j j , the rapidity of the leading jet and the jet multiplicity. Variables related to the kinematic properties of the vector bosons are the transverse momenta of the W and Z bosons, the pseudorapidity of the W boson, the absolute difference between the rapidities of the Z boson and the lepton from the decay of the W boson, |y Z − y ,W |, and the transverse mass of the W ± Z system m W Z T . The pseudorapidity of the W boson is reconstructed using an estimate of the longitudinal momentum of the neutrino obtained using the W mass constraint as detailed in Ref. [54]. The m W Z T observable is reconstructed following Ref. [10]. Variables that relate the kinematic properties of jets and leptons are the distance in the pseudorapidity-azimuth plane between the Z boson and the leading jet, ∆R( j 1 , Z), the event balance R hard p T , defined as the transverse component of the vector sum of the W Z bosons and tagging jets momenta, normalised to their scalar p T sum, and, finally the centrality of the W Z system relative to the tagging jets, defined as ζ lep. = min(∆η − , ∆η + ), . A larger set of discriminating observables was studied but only variables improving signal-to-background were retained. The good modelling by MC simulations of the distribution shapes and the correlations of all input variables to the BDT is verified in the W Z j j−QCD CR, as exemplified by the good description of the BDT score distribution of data in the W Z j j−QCD CR shown in Figure 1(c).
The distribution of the BDT score in the W ± Z j j signal region is used to extract the significance of the W Z j j−EW signal and to measure its fiducial cross-section via a maximum-likelihood fit. An extended likelihood is built from the product of four likelihoods corresponding to the BDT score distribution in the W ± Z j j SR, the m j j distribution in the W Z j j−QCD CR, the multiplicity of reconstructed b-quarks in the b-CR and the m j j distribution in the Z Z-CR. The inclusion of the three control regions in the fit allows the yields of the W Z j j−QCD, tt + V and Z Z j j−QCD backgrounds to be constrained by data. The shapes of these backgrounds are taken from MC predictions and can vary within the uncertainties affecting the measurement as described in Section 8. The normalisations of these backgrounds are introduced in the likelihood as parameters, labelled µ W Z j j−QCD , µ tt+V and µ Z Z j j−QCD for W Z j j−QCD, tt + V and Z Z j j−QCD backgrounds, respectively. They are treated as unconstrained nuisance parameters that are determined mainly by the data in the respective control region. The normalisation and shape of the other irreducible backgrounds are taken from MC simulations and are allowed to vary within their respective uncertainties. The distribution of the reducible background is estimated from data using the matrix method presented in Section 6 and is allowed to vary within its uncertainty.
The determination of the fiducial cross-section is carried out using the signal strength parameter µ W Z j j−EW : where N signal data is the signal yield extracted from data by the fit and N signal MC is the number of signal events predicted by the S MC simulation. The measured cross-section σ fid. W Z j j−EW is derived from the signal strength µ W Z j j−EW by multiplying it by the S MC cross-section prediction σ fid., MC W Z j j−EW in the fiducial region. The W Z j j−QCD contribution that is considered as background in the fit procedure does not contain interference between the W Z j j−QCD and W Z j j−EW processes. The measured cross-section σ fid. W Z j j−EW therefore formally corresponds to the cross-section of the electroweak production including interference effects.

Systematic uncertainties
Systematic uncertainties in the signal and control regions affecting the shape and normalisation of the BDT score, m j j and N b-jets distributions for the individual backgrounds, as well as the acceptance of the signal and the shape of its template are considered. In cases where variations as a function of the BDT score distribution are consistent with being due to statistical fluctuations, their effect is neglected.
Systematic uncertainties due to the theoretical modelling in the event generator used to evaluate the W Z j j−QCD and W Z j j−EW templates are considered. Uncertainties due to higher order QCD corrections are evaluated by varying the renormalisation and factorisation scales independently by factors of two and one-half, removing combinations where the variations differ by a factor of four. These uncertainties are of −20% to +30% on the W Z j j−QCD background normalisation and up to ±5% on the W Z j j−EW signal shape. The uncertainties due to the PDF and the α S value used in the PDF determination are evaluated using the PDF4LHC prescription [55]. They are of the order of 1% to 2% in shape. A global modelling uncertainty in the W Z j j−QCD background template that includes effects of the parton shower model is estimated by comparing predictions of the BDT score distribution in the signal region from the S and M G MC event generators. The difference between the predicted shapes of the BDT score distribution from the two generators is considered as an uncertainty. The resulting uncertainty ranges from 5% to 20% at medium and high values of the BDT score, respectively. Alternatively, using two MC samples with different parton shower models, P +P 8 and P +H , it was verified that for W Z j j−QCD events the variations of the BDT score shape due to different parton shower models are within the global modelling uncertainty defined above. A global modelling uncertainty in the W Z j j−EW signal template is also estimated by comparing predictions of the BDT score distribution in the signal region from the S and M G MC event generators. This modelling uncertainty affects the shape of the BDT score distribution by at most 14% at large values of the BDT score. The Sherpa W Z j j−EW sample used in this analysis was recently found to implement a colour flow computation in VBS-like processes that increases central parton emissions from the parton shower. It was verified that possible effects on kinematic distributions and especially on the BDT score distribution are covered by the modelling uncertainty used. The interference between electroweak-and QCD-induced processes is not included in the probability distribution functions of the fit but is considered as an uncertainty affecting only the shape of the W Z j j−EW MC template. The effect is determined using the M G MC generator, resulting for the signal region in shape-only uncertainties ranging from 10% to 5% at low and high values of the BDT score, respectively. The effect of interference on the shape of the W Z j j−EW MC template in the W ± Z j j-QCD CR is negligible and is therefore not included. . The uncertainty in the jet energy resolution [56] and in the suppression of jets originating from pile-up are also considered [45]. The uncertainties in the b-tagging efficiency and the mistag rate are also taken into account. The effect of jet uncertainties on the expected number of events ranges from 10% to 3% at low and high values of the BDT score, respectively.
The uncertainty in the E miss T measurement is estimated by propagating the uncertainties in the transverse momenta of hard physics objects and by applying momentum scale and resolution uncertainties to the track-based soft term [49,50].
The uncertainties due to lepton reconstruction, identification, isolation requirements and trigger efficiencies are estimated using tag-and-probe methods in Z → events [24,25]. Uncertainties in the lepton momentum scale and resolution are also assessed using Z → events [25, 26]. These uncertainties impact the expected number of events by 1.4% and 0.4% for electrons and muons, respectively, and are independent of the BDT score.
A 40% yield uncertainty is assigned to the reducible background estimate. This takes into account the limited number of events in the control regions as well as the differences in background composition between the control regions used to determine the lepton misidentification rate and the control regions used to estimate the yield in the signal region. The uncertainty due to irreducible background sources other than W Z j j−QCD is evaluated by propagating the uncertainty in their MC cross-sections. These are 20% for VVV [57], 15% for t Z j [10] and tt + V [58], and 25% for Z Z j j−QCD to account for the potentially large impact of scale variations.
The uncertainty in the combined 2015+2016 integrated luminosity is 2.1%. It is derived, following a methodology similar to that detailed in Ref. [59], and using the LUCID-2 detector for the baseline luminosity measurements [60], from a calibration of the luminosity scale using x-y beam-separation scans.
The effect of the systematic uncertainties on the final results after the maximum-likelihood fit is shown in Table 2 where the breakdown of the contributions to the uncertainties in the measured fiducial crosssection σ fid.
W Z j j−EW is presented. The individual sources of systematic uncertainty are combined into theory modelling and experimental categories. As shown in the table, the systematic uncertainties in the jet reconstruction and calibration play a dominant role, followed by the uncertainties in the modelling of the W Z j j−EW signal and of the W Z j j−QCD background. Systematic uncertainties in the missing transverse momentum computation arise directly from the momentum and energy calibration of jets, electrons and muons and are included in the respective lines of Table 2. Systematic uncertainties in the modelling of the other backgrounds are also considered.

Cross-section measurements
The signal strength µ W Z j j−EW and its uncertainty are determined with a profile-likelihood-ratio test statistic [61]. Systematic uncertainties in the input templates are treated as nuisance parameters with an assumed Gaussian distribution. The BDT score distribution in the QCD control region and in the signal region, with background normalisations, signal normalisation and nuisance parameters adjusted by the profile-likelihood fit are shown in Figure 1. The corresponding post-fit yields are detailed in Table 3. The table presents the integral of the BDT score distribution in the SR, but the uncertainty on the measured signal cross section is dominated by events at high BDT score. The signal strength is measured to be where the uncertainties correspond to statistical, experimental systematic, theory modelling and interference systematic, and luminosity uncertainties, respectively. It corresponds to the cross-section of electroweak W ± Z j j production, including interference effects between W Z j j−QCD and W Z j j−EW processes, in the fiducial phase space defined in Section 3 using dressed-level leptons.
The SM LO prediction from S for electroweak production without interference effects is where the effects of uncertainties in the PDF and the α S value used in the PDF determination, as well as the uncertainties due to the renormalisation and factorisation scales, are evaluated using the same procedure as the one described in Section 8.
A larger cross-section of σ fid., MadGraph W Z j j−EW = 0.366 ± 0.004 (stat.) fb is predicted by M G . These predictions are at LO only and include neither the effects of interference, estimated at LO to be 10%, nor the effects of NLO electroweak corrections as discussed in Ref.
From the number of observed events in the SR, the integrated cross-section of W ± Z j j production in the VBS fiducial phase space defined in Section 3, including W Z j j−EW and W Z j j−QCD contributions and their interference, is measured. It is calculated as where N data and N bkg are the number of observed events and the estimated number of background events in the SR, respectively, and L is the integrated luminosity. The factor C W Z j j , obtained from simulation, is the ratio of the number of selected signal events at detector level to the number of events at particle level in the fiducial phase space. This factor corrects for detector efficiencies and for QED final-state radiation effects. The contribution from τ-lepton decays, amounting to 4.7%, is removed from the cross-section definition by introducing the term in parentheses. This term is computed using simulation, where N τ is the number of selected events at detector level in which at least one of the bosons decays into a τ-lepton and N all is the number of selected W Z events with decays into any lepton. The C W Z j j factor calculated 0 500 1000150020002500  with S is 0.52 with a negligible statistical uncertainty. The theory modelling uncertainty in this factor is 8%, as estimated from the difference between the S and M G predictions.
Events in the SR are also used to measure the W ± Z j j differential production cross-section in the VBS fiducial phase space. The differential detector-level distributions are corrected for detector resolution using an iterative Bayesian unfolding method [63], as implemented in the RooUnfold toolkit [64]. Three iterations were used for the unfolding of each variable. The width of the bins in each distribution is chosen according to the experimental resolution and to the statistical significance of the expected number of events in that bin. The fraction of signal MC events reconstructed in the same bin as generated is always greater than 40% and around 70% on average.
For each distribution, simulated W ± Z j j events are used to obtain a response matrix that accounts for bin-to-bin migration effects between the reconstruction-level and particle-level distributions. The S MC samples for W Z j j−EW and W Z j j−QCD production are added together to model W ± Z j j production. To more closely model the data and to minimise unfolding uncertainties, their predicted cross-sections are rescaled by the respective signal strengths of 1.77 and 0.56 for the W Z j j−EW and W Z j j−QCD contributions, respectively, as measured in data by the maximum-likelihood fit.
Uncertainties in the unfolding due to imperfect modelling of the data by the MC simulation are evaluated using a data-driven method [65], where the MC differential distribution is corrected to match the data distribution and the resulting weighted MC distribution at reconstruction level is unfolded with the response matrix used in the data unfolding. The new unfolded distribution is compared with the weighted MC distribution at generator level and the difference is taken as the systematic uncertainty. The uncertainties obtained range from 0.1% to 25% depending on the resolution of the unfolded observables and on the quality of its description by S .
Measurements are performed as a function of three variables sensitive to anomalies in the quartic gauge coupling in W ± Z j j events [10], namely the scalar sum of the transverse momenta of the three charged leptons associated with the W and Z bosons p T , the difference in azimuthal angle ∆φ(W, Z) between the W and Z bosons' directions, and the transverse mass of the W ± Z system m W Z T , defined following Ref. [10]. These are presented in Figure 2.
Measurements are also performed as a function of variables related to the kinematics of jets. The exclusive multiplicity of jets, N jets , is shown in Figure 3. The absolute difference in rapidity between the two tagging jets ∆y j j , the invariant mass of the tagging jets m j j , the exclusive multiplicity N gap jets of jets with p T > 25 GeV in the gap in η between the two tagging jets, and the azimuthal angle between the two tagging jets ∆φ j j are shown in Figure 4.
Total uncertainties in the measurements are dominated by statistical uncertainties. The differential measurements are compared with the prediction from S , after having rescaled the separate W Z j j−QCD and W Z j j−EW components by the global µ W Z j j−QCD and µ W Z j j−EW parameters, respectively, obtained from the profile-likelihood fit to data. Interference effects between the W Z j j−QCD and W Z j j−EW processes are incorporated via the µ W Z j j−EW parameter as a change of the global normalisation of the S electroweak prediction. The measured W ± Z j j differential cross-section in the VBS fiducial phase space as a function of (a) p T , (b) ∆φ(W, Z) and (c) m W Z T . The inner and outer error bars on the data points represent the statistical and total uncertainties, respectively. The measurements are compared with the sum of the rescaled W Z j j−QCD and W Z j j−EW predictions from S (solid line). The W Z j j−EW and W Z j j−QCD contributions are also represented by dashed and dashed-dotted lines, respectively. In (a) and (c), the right y-axis refers to the last crosssection point, separated from the others by a vertical dashed line, as this last bin is integrated up to the maximum value reached in the phase space. The lower panels show the ratios of the data to the predictions from S .  Figure 3: The measured W ± Z j j differential cross-section in the VBS fiducial phase space as a function of the exclusive jet multiplicity of jets with p T > 40 GeV. The inner and outer error bars on the data points represent the statistical and total uncertainties, respectively. The measurements are compared with the sum of the scaled W Z j j−QCD and W Z j j−EW predictions from S (solid line). The W Z j j−EW and W Z j j−QCD contributions are also represented by dashed and dashed-dotted lines, respectively. The right y-axis refers to the last cross-section point, separated from the others by a vertical dashed line, as this last bin is integrated up to the maximum value reached in the phase space. The lower panel shows the ratio of the data to the prediction from S . (d) Figure 4: The measured W ± Z j j differential cross-section in the VBS fiducial phase space as a function of (a) the absolute difference in rapidity between the two tagging jets ∆y j j , (b) the invariant mass of the tagging jets m j j , (c) N gap jets the exclusive jet multiplicity of jets with p T > 25 GeV in the gap between the two tagging jets, and (d) the azimuthal angle between the two tagging jets ∆φ j j . The inner and outer error bars on the data points represent the statistical and total uncertainties, respectively. The measurements are compared with the sum of the rescaled W Z j j−QCD and W Z j j−EW predictions from S (solid line). The W Z j j−EW and W Z j j−QCD contributions are also represented by dashed and dashed-dotted lines, respectively. In (b), the right y-axis refers to the last cross-section point, separated from the others by a vertical dashed line, as this last bin is integrated up to the maximum value reached in the phase space. The lower panels show the ratios of the data to the predictions from S .

Conclusion
An observation of electroweak production of a diboson W ± Z system in association with two jets and measurements of its production cross-section in √ s = 13 TeV pp collisions at the LHC are presented. The data were collected with the ATLAS detector and correspond to an integrated luminosity of 36.1 fb −1 . The measurements use leptonic decays of the gauge bosons into electrons or muons and are performed in a fiducial phase space approximating the detector acceptance that increases the sensitivity to W ± Z j j electroweak production modes.
The electroweak production of W ± Z bosons in association with two jets is measured with observed and expected significances of 5.3 and 3.2 standard deviations, respectively. The measured fiducial crosssection for electroweak production including interference effects is It is found to be larger than the LO SM prediction of 0.32 ± 0.03 fb as calculated with the S MC event generator that includes neither interference effects, estimated at LO to be 10%, nor NLO electroweak corrections. Differential cross-sections of W ± Z j j production, including both the strong and electroweak processes, are also measured in the same fiducial phase space as a function of several kinematic observables.
(Taiwan), RAL (UK) and BNL (USA), the Tier-2 facilities worldwide and large non-WLCG resource providers. Major contributors of computing resources are listed in Ref. [66].      [61] G. The ATLAS Collaboration