Search for non-resonant Higgs boson pair production in the $bb\ell\nu\ell\nu$ final state with the ATLAS detector in $pp$ collisions at $\sqrt{s} = 13$ TeV

A search for non-resonant Higgs boson pair production, as predicted by the Standard Model, is presented, where one of the Higgs bosons decays via the $H\rightarrow bb$ channel and the other via one of the $H \rightarrow WW^*/ZZ^*/\tau\tau$ channels. The analysis selection requires events to have at least two $b$-tagged jets and exactly two leptons (electrons or muons) with opposite electric charge in the final state. Candidate events consistent with Higgs boson pair production are selected using a multi-class neural network discriminant. The analysis uses 139 fb$^{-1}$ of $pp$ collision data recorded at a centre-of-mass energy of 13 TeV by the ATLAS detector at the Large Hadron Collider. An observed (expected) upper limit of 1.2 ($0.9^{+0.4}_{-0.3}$) pb is set on the non-resonant Higgs boson pair production cross-section at 95% confidence level, which is equivalent to 40 ($29^{+14}_{-9}$) times the value predicted in the Standard Model.


Introduction
In 2012, the ATLAS and CMS Collaborations reported the observation of a new particle in the search for the Standard Model (SM) Higgs boson (H) [1,2]. So far, measurements of the spin and couplings of the new particle are consistent with those predicted by the Brout-Englert-Higgs (BEH) mechanism of the SM [3][4][5][6][7][8][9][10][11][12]. The SM predicts non-resonant production of Higgs boson pairs (HH) in proton-proton (pp) collisions, referred to as non-resonant HH production, with the dominant production modes at the LHC proceeding via the gluon-gluon fusion (ggF) process. The ggF process has two leading order contributions: the first corresponds to the so-called 'triangle diagram', which includes the Higgs boson self-coupling, and the second is the so-called 'box diagram', which includes a heavy-quark loop with two fermion-fermion-Higgs ( f f H) vertices. These two amplitudes interfere destructively, resulting in a low cross-section of only 31.05 ± 1.90 fb for the ggF HH production mode, computed at next-to-next-to-leading order (NNLO) and including finite top-quark mass effects [13][14][15][16][17][18][19][20]. Feynman diagrams illustrating these two contributions are shown in Figure 1. The measurement of non-resonant HH production at the LHC stands as an important test of the BEH mechanism. In many beyond-the-SM (BSM) theories, HH production can be enhanced by modifying the Higgs boson self-coupling, λ H H H , or the top-quark Yukawa coupling, y t , and/or by introducing new contact interactions between two top-quarks or gluons and two Higgs bosons or introducing production mechanisms via intermediate BSM particles [21][22][23]. The ATLAS and CMS Collaborations have performed searches for non-resonant HH production in a variety of final states at 13 TeV [24][25][26][27][28][29][30][31][32][33]. No significant excess of events beyond SM expectations is observed in these searches, with the ATLAS and CMS data-analyses setting observed (expected) limits on non-resonant HH production to be no larger than 6.9 (10.0) and 22.2 (12.8) times the predicted rate in the SM, respectively [34,35].
This Letter describes a search for non-resonant HH production in the bb ν ν final state, where refers to a lepton (either an electron or a muon), using 13 TeV pp collision data collected with the ATLAS detector during 2015-2018 and corresponding to a total integrated luminosity of 139 fb −1 . The analysis uses machine-learning techniques based on feedforward neural network architectures [36] to construct an event-level classifier trained to distinguish between the HH signal and SM backgrounds. Analyses searching for non-resonant HH production via similar decay channels were performed previously in the single-lepton final state by ATLAS in searches for HH → bbWW * [28] and in the dilepton channel by CMS in searches for HH → bbWW * /bbZ Z * [31].

ATLAS detector
The ATLAS detector [37-39] is a general-purpose particle detector with forward-backward symmetric cylindrical geometry.1 It includes an inner tracking detector (ID), immersed in an axial magnetic field, which provides precision tracking of charged particles over the range of |η| < 2.5. Calorimeter systems with either liquid argon or scintillator tiles as the active medium provide energy measurements over the range of |η| < 4.9. The muon spectrometer (MS) is positioned outside the calorimeters and includes three air-core toroidal magnets. The MS is composed of several types of muon detectors which provide trigger and high-precision tracking capabilities for |η| < 2.4 and |η| < 2.7, respectively. A hardware-based trigger followed by a software-based trigger reduce the recorded event rate to an average of 1 kHz [40].

Dataset and simulated events
The data used for this search were collected in pp collisions at the LHC with a centre-of-mass energy of 13 TeV. Only those data collected during stable LHC beam conditions and with all ATLAS detector subsystems fully operational are used, and correspond to an integrated luminosity of 139 fb −1 . The selection of candidate events with oppositely charged leptons is based on a combination of single-lepton and dilepton triggers. 2 The use of a given trigger depends on the flavour and the transverse momenta (p T ) of the two (p T -ordered) leptons in the event, and on the data-taking period. Single-lepton triggers with p T thresholds between 22 and 28 GeV are given priority over dilepton triggers. The criteria of the dilepton triggers are checked only if no single-lepton trigger criteria are met and have p T thresholds as low as 19 (10) GeV for the leading (subleading) lepton. At least one reconstructed lepton (or lepton pair) has to match a corresponding trigger object, in which case their offline p T must be higher than the trigger threshold by at least 2 GeV, in order to be on the efficiency plateau of the corresponding trigger. The signal processes with ggF-initiated non-resonant HH production in the bb ν ν final state were generated with an effective Lagrangian in the infinite top-quark mass approximation. The generated signal events were reweighted with form factors that take into account the finite mass of the top-quark [46,47]. 1 ATLAS uses a right-handed coordinate system with its origin at the nominal interaction point (IP) in the centre of the detector and the z-axis along the beam pipe. The x-axis points from the IP to the centre of the LHC ring, and the y-axis points upwards. Cylindrical coordinates (r, φ) are used in the transverse plane, φ being the azimuthal angle around the z-axis. The pseudorapidity is defined in terms of the polar angle θ as η = − ln tan(θ/2). The angular distance is measured in units of ∆R ≡ (∆η) 2 + (∆φ) 2 . 2 Distinct sets of single-lepton triggers are used for electrons and muons. Dilepton triggers require either two electrons, two muons, or one electron and one muon.
SM background processes were simulated using different MC event generators. The MC matrix element (ME) event generators and PDF sets, the parton showering (PS) and the underlying event (UE) modelling, UE tuned parameters (tune), and the accuracy of the theoretical cross-sections used to normalise the simulated processes are summarised in Table 1. Each SM background process is normalised to the best available respective theoretical cross-section. The mass of the Higgs boson was set to 125 GeV for all signal and background processes. The HH branching fractions (BF) predicted by the SM [13] are used for all Higgs boson decays. M S [48] was used to model top-quark spin correlations and E G [49] was used to model properties of band c-hadron decays for processes using P and for the signal processes.
SM top-quark pair production (tt) and the production of single top-quarks in association with W bosons (Wt) contribute with significant background contamination in the bb ν ν final state. At next-to-leading-order (NLO) accuracy, there exists non-trivial interference between these two processes that may be enhanced in phase-space regions wherein there are high fractions of Wt events [50]. Two schemes are typically used to remove the overlap between these two processes: the so-called diagram removal (DR) and diagram subtraction (DS) schemes [51]; the former is used in the present analysis to remove the overlapping events and the latter is used to evaluate the systematic uncertainty in corresponding background event yields. Because of these effects, the sum of the simulated tt and Wt processes is considered as a single background process and referred to as the 'Top' process in what follows. Table 1: List of the ME generators and PS/UE modelling algorithms used in the simulation. Alternative generators and PS/UE models, used to estimate systematic uncertainties, are shown in parentheses. The PDF sets, tunes, and the perturbative QCD highest-order accuracy (leading-order, LO; next-to-leading-order, NLO; next-to-next-to-leadingorder, NNLO; next-to-next-to-leading-logarithm, NNLL) used for the normalisation of the samples are also included. The top-quark mass is set to 172.5 GeV.

Event selection and object definitions
Selected events are required to have at least one pp interaction vertex reconstructed from at least two ID tracks with p T > 0.4 GeV. The primary vertex for each event is defined as the vertex with the highest (p T ) 2 of associated ID tracks [102]. Events that contain at least one jet arising from non-collision sources or detector noise are rejected by a set of quality criteria [103].
Loose and signal criteria are defined in order to select reconstructed lepton and jet candidates, where the latter is a subset of the former. Compared to the loose objects, the signal objects are required to satisfy tighter identification or quality criteria that are designed to suppress background contributions. Reconstructed loose (signal) electrons are required to satisfy the 'Loose' ('Tight') likelihood identification criteria [104]. Loose electrons are required to have p T > 10 GeV and to be within |η| < 2.47. In addition, signal electrons are required to be outside the range 1.37 < |η| < 1.52, which corresponds to the transition regions between the barrel and endcaps of the electromagnetic calorimeters. In order to reduce background contributions from jets misidentified as electrons, signal electrons are required to be isolated according to the 'Gradient' selection criteria [104]. Reconstructed loose and signal muon candidates are required to have p T > 10 GeV, to be within |η| < 2.4, and to satisfy the 'Medium' identification criteria [105]. Additionally, signal muons are required to be isolated according to the 'FixedCutLoose' selection criteria [105]. Signal electron (muon) candidates are required to originate from the primary vertex by demanding that the significance of the transverse impact parameter, defined as the absolute value of the track transverse impact parameter, d 0 , measured relative to the primary vertex, divided by its uncertainty, σ d 0 , satisfy |d 0 |/σ d 0 < 5 (3). The difference ∆z 0 between the value of the z coordinate of the point on the track at which d 0 is defined and the longitudinal position of the primary vertex is required to satisfy |∆z 0 × sin θ| < 0.5 mm, where θ is the polar angle of the track with respect to the z-axis.
Jets are reconstructed from topological clusters of energy deposits in the calorimeters [106] using the anti-k t algorithm [107,108] with a radius parameter of R = 0.4 and calibrated as described in Ref. [109]. Candidate loose jets are required to have p T > 20 GeV. Signal jets are required to have |η| < 2.8 and must satisfy pile-up suppression requirements based on the output of a multivariate classifier [110], which identifies jets consistent with a primary vertex in the region |η| < 2.4 and p T < 120 GeV. The MV2c10 multivariate algorithm [111] is used to identify jets containing b-hadrons (b-tagged jets). An MV2c10 working point with a b-tagging efficiency of 70%, estimated from simulated tt events [112], is used. The b-tagged jets must have p T > 20 GeV and |η| < 2.5. The momentum of b-tagged jets is adjusted using the muon-in-jet correction, as described in Ref. [6], by accounting for momentum losses due to muons originating from in-flight semileptonic b-hadron decays occurring within the b-tagged jet.
The missing transverse momentum p miss T , the magnitude of which is denoted by E miss T , is constructed from the negative vectorial sum of the transverse momenta of calibrated loose objects in the event. An additional term is included to account for the energy of ID tracks that are matched to the primary vertex in the event but not to any of the selected loose objects [113].
To avoid double-counting, loose objects are subject to the overlap removal procedure defined as follows. If a reconstructed electron and muon share a track in the ID, the electron is removed. However, if the muon sharing the track with the electron is calorimeter-tagged,3 then the muon is removed instead of the electron. If a jet and an electron are reconstructed within ∆R = 0.2 of each other, then the jet is removed. If a jet and a muon are within ∆R = 0.2 of each other, and the jet has less than three tracks or carries less than 50% of the muon p T , then the jet is removed; otherwise, the muon is removed. Electrons or muons separated from the remaining jets by ∆R < 0.4 are removed.
The analysis selects candidate events with exactly two oppositely charged signal leptons, electrons or muons, and at least two signal b-tagged jets. To enhance sensitivity to the signal process and to maximise rejection of the expected SM backgrounds, the analysis uses a multivariate approach to select signal events.

Analysis strategy
The analysis relies on the use of a multivariate discriminant designed to select candidate events consistent with non-resonant HH production. Section 5.1 describes the architecture and the training of the deep neural network (DNN) classifier from which the discriminant is constructed. Section 5.2 describes the signal region selection criteria. Section 5.3 describes the final background estimation procedure.

Deep learning approach to target H H
The discriminant uses the outputs of a DNN classifier that is built using the K library with T as a backend [114,115] and uses the library [116] to interface with the analysis software infrastructure of the ATLAS experiment. The sample of events used for training is composed of equal numbers of events from the signal and each of the dominant background processes: Top (as defined in Section 3), Z/γ * → (Z-), and Z/γ * → ττ (Z-ττ) production. The signal sample used in the training of the classifier contains only the HH → bbWW * component due to its larger BF relative to the HH → bbττ and HH → bbZ Z * components. However, the sum of all three signal components is evaluated as the signal when performing the statistical analysis. Additionally, all processes that make up the training sample (HH → bbWW * , Top, Z-, and Z-ττ) have the same weight during the training of the classifier. The training sample is composed of simulated candidate events with m > 20 GeV and having one or more b-tagged jets, where events with exactly one b-tagged jet are included to increase the number of events available for training. For the training events with exactly one b-tagged jet, each observable that requires at least two b-tagged jets is set to its mean value as computed with the full set of training events that contain at least two b-tagged jets. Observables that require two b-tagged jets are defined using the leading two b-tagged jets. The classifier contains two fully connected hidden layers each with 250 nodes. Rectified linear unit (ReLU) activations are used for each layer [117]. In order to improve the robustness of the training and to reduce effects due to overtraining, there is a dropout layer that randomly drops 50% of the nodes between the two fully connected layers during training [118]. The classifier produces four outputs that are passed through a softmax activation, constraining their sum to one [36]. The resulting four outputs, each constrained to values between 0 and 1, are referred to as p i (i ∈ {HH, Top, Z-, Z-ττ}). Values of p i nearer to 1 indicate that the event likely belongs to class i and values nearer to 0 indicate otherwise. The main discriminant in the analysis, d H H , is constructed from the four p i and is defined as The HH → bb ν ν signal events are characterised by two distinct 'Higgs hemispheres'. One hemisphere contains the two b-tagged jets from the H → bb decay and it is typically opposite in the transverse plane to the second hemisphere that contains the two leptons and E miss T from the H → WW * /Z Z * /ττ decay. The final-state objects in the SM backgrounds, the Top process in particular, are distributed more uniformly within the event and they typically do not exhibit the same opposite hemispheres topology as the HH signal. These Higgs hemispheres thus provide a topological criterion that distinguishes the signal from the background and motivates the choice of input observables that are provided to the classifier. Thirty-five such variables are provided as inputs to the classifier, ranging from momentum components of the visible final-state objects to observables using event-wide information, and are constructed using only calibrated final state objects that have well-understood uncertainties (Section 6). A complete list is provided in Table  2. The event-wide input observables are sensitive to the presence of Higgs hemispheres in the signal and are largely angular in nature or take advantage of the fact that the final state objects from each of the Higgs bosons in the signal tend to be near to each other. The observables H R T2 and m bb T2 are non-standard high-level observables that are not straightforward functions of the momenta of the final-state objects. By generally has values below the mass of the top-quark due to kinematic constraints while for the Z/γ * processes, which have little-to-no E miss T , m bb T2 is typically below 45 GeV. The use of dropout regularisation during the training of the classifier allows it to more effectively use the information contained in the full set of inputs presented in Table 2 by reducing its susceptibility to overtraining effects that may otherwise appear as a result of using such an extended input feature space in the case where no such regularisation is performed. To verify this, the performance of the classifier was checked using an independent sample of events not used in the training of the classifier and was found to be compatible to its performance when presented with those of the training sample.
Ratio of H T2 and scalar sum of the transverse momenta of the H decay products, are the transverse momenta of the leading {subleading} lepton (b-tagged jet)

Signal selection criteria
To define signal selection criteria, the analysis relies on the invariant mass of the two leptons, m , and the invariant mass of the two leading (p T -ordered) b-tagged jets, m bb . Due to spin-correlation effects present in the H → WW * → ν ν decay within the dominant HH → bbWW * signal process, the signal events exhibit values of m that are typically below 60 GeV. By selecting low values of m , the signal purity can therefore be enhanced while rejecting a large component of the SM Z boson and Top backgrounds. Additionally, m bb has a peak at the mass of the Higgs boson for the signal process and therefore provides an effective means to define selections in which the HH contribution is enhanced. The signal selection criteria therefore require m ∈ (20, 60) GeV and m bb ∈ (110, 140) GeV. The m > 20 GeV requirement is enforced in order to remove contamination from low-mass resonances and Z/γ * processes. The signal selection criteria are further broken down into same-flavour (SF), i.e. ee or µµ, or different-flavour (DF), i.e. eµ, regions. Separating by dilepton flavour enhances the separation power between the signal and Z/γ * background; the former has roughly equal probabilities for the SF and DF final states and the latter leads predominantly to SF final states.
In addition to the m and m bb requirements, the same-flavour and different-flavour signal regions, SR-SF and SR-DF, respectively, are defined by requiring high values of d H H and are presented in Table 3. The chosen threshold values of d H H > 5.45 (5.55) for SR-SF (SR-DF) are found to maximise the expected sensitivity to the non-resonant HH process. The predicted HH → bb ν ν signal yields in SR-SF and SR-DF are shown in Table 3, and are composed of 90% HH → bbWW * , 9% HH → bbττ, and 1% HH → bbZ Z * . The predominance of the HH → bbWW * process over the other two is a result of both its larger overall BF and of the classifier having been trained only on this component of the signal.

Background estimation
As mentioned in Section 5.1, the dominant backgrounds expected to contaminate the signal regions are the Top and Z/γ * processes, specifically Z/γ * production in association with jets originating from heavy-flavour hadrons (bb, bc, or cc), subsequently referred to as Z/γ * + HF. Subdominant SM processes contribute via tt production in association with an electroweak vector boson, single Higgs boson production (predominantly via the ttH mode), Z/γ * production in association with light-flavour jets, and electroweak diboson processes. There is additionally a minor contribution of background events from non-prompt leptons produced in semileptonic decays of heavy-flavour hadrons and from misidentified electron candidates arising from photon conversions and jets. This background is estimated using events with a same-charge lepton pair, following procedures described in Ref.
[121], after subtracting the prompt-lepton contribution. The rest of the SM background processes detailed in Table 1 are estimated primarily using simulation.
Dedicated control regions are defined to derive data-driven normalisation corrections for the dominant background processes: CR-Top for Top and CR-Z+HF for Z/γ * + HF. These normalisation corrections have a uniform prior and are checked in two validation regions, VR-1 and VR-2, enriched with events from the Top and Z/γ * + HF processes. The control and validation regions are defined in Table 3 and are kinematically close to the signal regions. CR-Top (CR-Z+HF) and VR-1 (VR-2) are defined by inverting the m bb (m ) requirements relative to those of the signal regions but retain a selection of the high d H H region similar to the signal regions. The d H H selections were relaxed to increase statistical power, independent checks showed that this did not have a significant impact on the post-fit normalisation corrections in Table  3.
VR-1 keeps only those events with m bb > 140 GeV, excluding the region m bb < 100 GeV which is included in CR-Top, due to significant contamination of Z/γ * + HF events. The correlations between the m and m bb observables and d H H after the preselection are observed to be small and do not prevent the use of the former two in the construction of the analysis regions defined in Table 3, as d H H is found to rely mainly on the information provided by the additional input observables listed in Table 2. This absence of strong correlation ensures that the measurements made in the tails of d H H in the control regions can be extrapolated to those in the signal regions.
The Top background in the signal regions is expected to be composed of approximately equal contributions from the tt and single-top-quark Wt process and therefore susceptible to the interference effects as described in Section 3. For this reason, CR-Top and the validation regions are defined so that they have predicted tt and Wt compositions similar to that of the signal regions. This ensures that the normalisation correction determined in the fit for the Top background results in an accurate estimate of the combined tt and Wt process in the signal regions, accounting for potential interference effects present in data but not necessarily modelled in MC simulation. Table 3 compares the observed and predicted event yields, where the background event yields obtained after background-only fits in the corresponding control regions are also shown. The post-fit normalisation correction factors for the Top and Z/γ * + HF background processes, respectively µ Top = 0.79 ± 0.10 and µ Z/γ * + HF = 1.36 ± 0.07, are also shown in Table 3. The uncertainties in µ Top and µ Z/γ * + HF take into account the statistical and systematic uncertainties due to the experimental sources, as described in Section 6.
Distributions of d H H in the control regions after performing background-only fits to data in the control regions and applying the Top and Z/γ * + HF normalisation corrections are shown in Figure 2. In the control and validation regions, good agreement between the data and SM prediction provided by the post-fit MC simulation is observed for the observables relevant to the analysis. Table 3: Analysis region and background estimation summary. Shown are the definitions of the control, validation, and signal regions used in the analysis as well as the predicted and observed event yields in each of these regions. The predicted yields are shown after background-only fits to data in the control regions. The Top and Z/γ * + HF post-fit normalisation factors, obtained from background-only fits in the corresponding control regions, are shown at the bottom of the table. Also shown is the predicted HH → bb ν ν signal yield in each of the regions, multiplied by a factor of 20. Of the HH yield in the signal regions, 90% comes from the HH → bbWW * process, 9% from the HH → bbττ process, and 1% from the HH → bbZ Z * process. The uncertainties in each yield and in the normalisation correction factors account for the statistical and systematic uncertainties described in Section 6, with those on the normalisation corrections due only to experimental sources.

Systematic uncertainties
The analysis evaluates several sources of systematic uncertainty for the signal and background processes, which are classified as either experimental (detector or luminosity related) or theoretical modelling uncertainties. Statistical uncertainties of the simulated event samples are also taken into account. The main uncertainty components are summarised in Table 4. MC modelling uncertainties in the Top and Z/γ * + HF background estimates are dominant, followed by statistical and detector uncertainties.
The normalisation corrections of the Top and Z/γ * + HF background processes are determined primarily by the data events in the control regions when performing the statistical analysis. These corrections take into account the statistical and systematic uncertainties due to the experimental sources, as described later  in this section. In addition, the systematic uncertainties in the theoretical modelling of these processes are applied as uncertainties in the corrected predictions in the signal regions using the following procedures. The uncertainties in the estimated Top background event yields due to parton shower modelling are assessed as the difference between the predictions of P -B showered with P or H , and those due to the choice of event generator are assessed by comparing the predictions of P -B or M G 5_aMC@NLO [122], both showered with P . The uncertainties due to missing higher-order corrections are estimated by changing the renormalisation and factorisation scales (µ and µ , respectively) up and down by a factor of two (8-points variation). The uncertainties due to the modelling of initial-and final-state radiation (ISR and FSR, respectively) in the generators used to simulate the Top background processes are evaluated using the method described in Ref. [122]. The Top background composition is varied within the uncertainties in the theoretical predictions for the tt and single-top-quark Wt cross-sections [65,68,123]. The uncertainty arising from the interference between the NLO predictions for tt and Wt processes is estimated by taking the difference between the predicted Top background yields obtained with the DR and DS schemes used for the NLO Wt calculation [122]. The uncertainties due to PDF variations are computed as the envelope of the central values of the nominal NNPDF3.0 PDF set and the CT14, MMHT14, and PDF4LHC15_30PDF PDF sets [124]. All uncertainties except those in the scale variations, cross-section, and interference are considered as fully correlated between the tt and Wt processes. The Z/γ * + HF modelling uncertainties are estimated using the nominal S 2.2.1 samples by considering different merging (CKKW-L) [125] and resummation scales. The uncertainties due to PDF variations and changes in µ and µ are calculated using the same procedures as for the Top backgrounds. An additional uncertainty in the Z/γ * process is computed by taking the difference between the nominal S 2.2.1 samples with samples generated using M G 5_aMC@NLO+P 8. The dominant uncertainties in the total background estimates in SR-SF are the Z/γ * + HF modelling uncertainties (8%), primarily that due to comparison of S 2.2.1 and M G 5_aMC@NLO, and the parton shower uncertainty affecting the Top background process (5%). The uncertainties in the background estimates in SR-DF are dominated by the uncertainty due to the parton shower affecting the Top background process (12%), the uncertainty in the Top normalisation correction µ Top (10%), the uncertainty due to the comparison between the generators used for the Top process (7.5%), and the uncertainty due to the modelling of ISR and FSR in the Top process (5%).
Systematic uncertainties in the signal acceptance due to varying µ and µ , as well as PDF-induced uncertainties, are evaluated using the same procedure as for the Top background process. The resulting scale (PDF) uncertainties are < 3% (< 1%) in both signal regions. The uncertainty due to the parton shower modelling is computed by comparing H 7 with P 8, and is found to be 8% (9%) in SR-SF (SR-DF). The uncertainty in the HH production cross-section, evaluated to be 5%, is included as an uncertainty in σ SM (gg → HH) when computing the upper limits on the cross-section ratio in Table  5 Table  4 and is dominated by the JER, with all other contributions found to be negligible. Table 4: Breakdown of the main uncertainty components in the background estimates in the two signal regions for the Top, Z/γ * + HF, and all other ("Other") backgrounds. The uncertainty components associated with the total background estimate in the signal regions (the sum of Top, Z/γ * + HF, and Other) is listed under "Total Bkg.". As in the upper half of Table 3, all uncertainties are shown "post-fit". The percentages show the size of the uncertainty relative to the expected background in each column and uncertainties can be correlated, not necessarily adding in quadrature to the total uncertainty in each column or across each row. Uncertainties in the post-fit normalisation factors, µ Top and µ Z/γ * + HF , are only applicable for the Top and Z/γ * + HF processes.

Results
In order to extract information about the HH → bb ν ν signal cross-section, a counting experiment is performed with a profile-likelihood fit [131] simultaneously across the CR-Top, CR-Z+HF, SR-SF, and SR-DF regions using the predicted and observed event counts in each region as inputs. The Top and Z/γ * + HF normalisation corrections are also extracted from this fit and are found to differ negligibly from those presented in Table 3. All sources of systematic and statistical uncertainty in the signal and background models are implemented as deviations from the nominal model, scaled by nuisance parameters that are profiled in the fit. The p-value corresponding to the background-only hypothesis, giving the probability that the data in the signal regions be at least as incompatible with the background-only hypothesis as that observed in SR-SF and SR-DF, is p 0 = 0.15 and corresponds to 1.05σ significance. Distributions of m bb , m , and d H H after performing background-only fits to data in the control regions and applying the Top and Z/γ * + HF normalisation corrections are shown in Figure 3. The signal selection criteria are imposed on all observables shown in Figure 3 apart from the one being plotted, except that the d H H requirement for the m bb and m distributions is relaxed to d H H > 5. No significant excess of events over the expected SM background is observed and upper limits are set on non-resonant Higgs boson pair production at 95% confidence level (CL) using the CL s method [132]. Table 5 presents these upper limits and comparisons with the SM prediction. The observed (expected) limit at 95% CL is 1.2 (0.9) pb, corresponding to 40 (29) times the SM prediction.

Conclusions
A search for non-resonant Higgs boson pair production, as predicted by the SM, is presented in the final state with at least two b-tagged jets and exactly two leptons with opposite electric charge, where one of the Higgs bosons decays to bb and the other decays to either WW * , Z Z * , or ττ. The analysis uses pp collision data recorded at √ s = 13 TeV by the ATLAS detector at the LHC, corresponding to an integrated luminosity of 139 fb −1 . The data are in agreement with the predictions for the SM background processes. An observed (expected) 95% CL upper limit is set on the cross-section for the production of Higgs boson pairs, corresponding to 40 (29) times the SM prediction. These limits are comparable to the previous leading searches for non-resonant Higgs boson pair production performed by the ATLAS and CMS experiments.