Observation of photon-induced W+W− production in pp collisions at √s = 13 TeV using the ATLAS detector

Introduction
The study of W -boson pair production from the interaction of incoming photons (γ γ → W W ) in proton-proton (pp) collisions offers a unique window to a wide range of physical phenomena.
In the Standard Model (SM), the γ γ → W W process proceeds through trilinear and quartic gauge-boson interactions. This process is unique in that, at leading order, it only involves diagrams with self-couplings of the electroweak gauge bosons, as shown in Fig. 1. Hence, a cross-section measurement directly tests the SU(2)×U(1) gauge structure of the SM. At the same time, as a process driven only by electroweak boson self-interactions, it is sensitive to anomalous gauge-boson interactions [1] as parameterised in effective field theory (EFT) with additional dimension-6 and dimension-8 operators [2,3]. Measurements of γ γ → W W can in future provide valuable input for the global EFT fits.
Fig. 1. The leading-order Feynman diagrams contributing to the γ γ → W W process are the t-channel diagram (left) proceeding via the exchange of a W boson between two γ W W vertices and a diagram with a quartic γ γ W W coupling (right). In addition, a u-channel diagram exists (not shown), which also proceeds via two γ W W vertices.
E-mail address: atlas.publications@cern.ch.
This letter presents a measurement in the W + W − → e ± νμ ∓ ν channel that results in the observation of photon-induced W W production. Previously, the ATLAS and CMS Collaborations found only evidence for γ γ → W W production with the Run-1 data, ATLAS by using 8 TeV pp collisions [4] and CMS by combining their 7 TeV and 8 TeV pp collision data [5,6].
The signal process proceeds through the pp(γ γ ) → p(*) W + W − p(*) reaction, where p(*) indicates that the final-state proton either stays intact or fragments after emitting a photon. Whilst the former occurs through coherent photon radiation off the whole proton without disintegration, for the latter at least one of the photons can be considered as being radiated off a parton in the proton. These contributions are classified as elastic, single-dissociative, and double-dissociative W W production. Elastic γ γ → W W production with leptonic decays of the W bosons results in a final state containing two charged leptons and no additional charged-particle activity. Even in the case of dissociative photon-induced production, the charged particles from the proton remnants often fall outside the acceptance of the tracking detector.
The suppressed activity in the central region of the detector in the γ γ → W W signal gives the means to control and significantly reduce background from quark- and gluon-induced W W production or top-quark production, where the leptonic final state is typically produced in association with a substantial amount of hadronic activity. The analysis therefore selects events that have no additional charged-particle tracks reconstructed in the vicinity of the selected interaction vertex. The modelling of the hadronic activity in quark- and gluon-induced processes, as well as uncorrelated activity from additional pp interactions, is constrained using same-flavour ee and μμ Drell-Yan, DY(→ ee/μμ), events in data, which significantly reduces the associated uncertainties. Background from other photon-induced processes, mainly dilepton production γ γ → ℓℓ, is reduced by selecting only different-flavour lepton pairs, eμ, leaving a smaller contribution from γ γ → τ τ production with leptonic τ decays. Since the contribution from the γ γ → τ τ process falls off rapidly with increasing transverse momentum of the dilepton system, p eμ T , it can be further suppressed by placing requirements on p eμ T . A fiducial cross section for the pp(γ γ ) → p(*) W + W − p(*) process through the decay channel W + W − → e ± νμ ∓ ν is measured in a fit to the number of events in several kinematic regions with different signal and background contributions.

ATLAS detector
The ATLAS detector [7] at the Large Hadron Collider (LHC) is a multipurpose detector with a forward-backward symmetric cylindrical geometry and nearly 4π coverage in solid angle.¹ It consists of an inner tracking detector surrounded by a thin superconducting solenoid providing a 2 T axial magnetic field, electromagnetic and hadron calorimeters, and a muon spectrometer.
The inner tracking detector (ID) covers the pseudorapidity range |η| < 2.5 and is composed of three subdetectors. The high-granularity silicon pixel detector covers the vertex region and typically provides four measurements per track, the first hit normally being in the insertable B-layer [8,9]. It is followed by the silicon microstrip tracker (SCT), which usually provides eight measurements per charged-particle track. These silicon detectors are complemented by the transition radiation tracker, which enables radially extended track reconstruction up to |η| = 2.0 and provides electron identification information. The resolution of the z-coordinate of tracks at the point of closest approach to the beam line is about 0.170 mm for tracks with p T = 500 MeV and improves with higher track momentum [10]. For tracks with p T < 1 GeV, the dominant contribution to the z-resolution is due to multiple scattering.
Lead/liquid-argon (LAr) sampling calorimeters provide electromagnetic (EM) energy measurements with high granularity. A steel/scintillator-tile hadron calorimeter covers the central pseudorapidity range (|η| < 1.7). The endcap and forward regions are instrumented with LAr calorimeters for EM and hadronic energy measurements up to |η| = 4.9. The muon spectrometer (MS) surrounds the calorimeters and is based on three large air-core toroidal superconducting magnets with eight coils each. The muon spectrometer includes a system of precision tracking chambers (|η| < 2.7) and fast detectors for triggering (|η| < 2.4). A two-level trigger system [11] selects the events used in the analysis.

Data and simulated event samples
The analysis uses proton-proton collision data recorded with the ATLAS detector during the Run-2 data-taking period (2015-2018) at √ s = 13 TeV with the number of interactions, μ int , per bunch crossing (also referred to as pile-up) ranging from about 10 to 60 with an average of 33.7 [12].
The size of the region where the collisions occur, the so-called beam spot, is a result of the operating parameters of the LHC. Of specific importance for this analysis is its width along the z-direction, which determines the density of pp interactions. The width is determined by fitting the distribution of the z positions of the reconstructed vertices to Gaussian functions using an unbinned likelihood fit. It varied between 30 and 50 mm during the Run-2 data-taking period [13]. The data correspond to an integrated luminosity of L = 139.0 ± 2.4 fb −1 after data quality requirements [14] have been applied. This value is derived from the calibration of the luminosity scale with the method explained in Ref.
¹ ATLAS uses a right-handed coordinate system with its origin at the nominal interaction point in the centre of the detector and the z-axis coinciding with the axis of the beam pipe. The x-axis points from the interaction point to the centre of the LHC ring, and the y-axis points upward. The pseudorapidity is defined in terms of the polar angle θ as η = − ln tan(θ/2), and φ is the azimuthal angle around the beam pipe relative to the x-axis. The angular distance is defined as ΔR = √((Δη)² + (Δφ)²).
Signal and background processes were modelled using Monte Carlo (MC) event generators to study kinematic distributions, to evaluate background contamination in the signal region and to interpret the results. To simulate the detector response, the generated events were passed through a detailed simulation of the ATLAS detector [16] based on Geant4 [17] or on a combination of Geant4 and a parameterised calorimeter simulation [18]. The present measurement relies only on tracking information from charged hadrons, muons and electrons, which is simulated by Geant4 in either case, as well as on the modelling of the calorimetric response of electrons, which can be reliably parameterised. Multiple pp interactions occurring in the same or adjacent bunch crossings are included in the simulation by overlaying several inelastic pp collisions matching the average number of interactions per bunch crossing. The inelastic pp collisions were generated with Pythia 8.186 [19] using a set of tuned parameters called the A3 tune [20] and the NNPDF2.3LO [21] set of parton distribution functions (PDF). All MC samples are corrected to the beam conditions of the data as described in Section 5.1. In all samples using Pythia8 or Herwig7 to simulate the parton showering, underlying event and hadronisation, the decays of bottom and charm hadrons were performed with EvtGen 1.2.0 [22].
The elastic component of the γ γ → W W signal process was modelled at leading order (LO) using Herwig 7.1.5 [23,24] interfaced with the BudnevQED photon flux [25] through ThePEG software [26]. This sample is used to model the photon-induced processes in the fiducial region of the measurement because it uses a photon flux that is differential in both x and virtuality Q 2 . It is corrected to match the cross section, including the dissociative as well as non-perturbative components, using a data-driven method described in Section 5.3. This data-driven approach is validated using elastic and dissociative γ γ → W W samples produced using MG5_aMC@NLO 2.6.7 [27] interfaced to Pythia 8.243. The default photon flux in MG5_aMC@NLO and the CT14QED [28] PDF were used to model the photon radiation from protons and quarks, respectively. The parameterised detector simulation was used in the generation of the MG5_aMC@NLO samples. These samples are used whenever regions with reconstructed track multiplicities larger than zero are studied.
The production of γ γ → ℓℓ, with ℓ = e, μ, τ , was modelled in the same way as for the γ γ → W W signal process. Additional generators were used to validate the modelling of the γ γ → ℓℓ dissociative events. The single-dissociative processes were modelled using LPAIR 4.0 [29]. Alternative γ γ → ℓℓ double-dissociative samples were produced with Pythia 8.240 using the NNPDF3.1NLOluxQED PDF set [30]. Diffractive QCD processes and γ γ → 4ℓ production were simulated using Pythia 8.244 and MG5_aMC@NLO 2.6.7 interfaced to Pythia 8.243 and studied using particle-level information only. The contribution of these processes was found to be negligible in the signal region of the measurement.
The dominant background from quark-induced W W production, also referred to as qq → W W , was modelled at next-to- [...] sensitive to initial- and final-state radiation, as well as further variations related to multiple parton interactions and colour reconnection, were produced to study the description of the parton showers and hadronisation. Herwig 7.1.6 was used as an alternative parton shower, using the H7UE tune [24] and the MMHT2014LO PDF set [39] for events generated with the Powheg-Box v2 generator. An alternative sample for quark-induced W W production was generated using the Sherpa [40,41] event generator in order to evaluate modelling uncertainties. The Sherpa 2.2.2 sample uses matrix elements at NLO accuracy in QCD for up to one additional parton and at LO accuracy for up to three additional parton emissions. The matrix element calculations were matched and merged with the Sherpa parton shower based on Catani-Seymour dipole factorisation [42,43] using the MEPS@NLO prescription [40,44-46]. The virtual QCD corrections were provided by the OpenLoops 1 library [47-49]. The sample was generated using the NNPDF3.0NNLO set [50], along with the dedicated set of tuned parton-shower parameters developed by the Sherpa authors.
DY production, pp → Z /γ * → ℓℓ with ℓ = e, μ, τ , was modelled using the same settings for Sherpa, Powheg+Pythia8 and Powheg+Herwig7 as for the quark-induced W W event generation described above. DY( Z /γ * → τ τ ) was modelled with Powheg interfaced to Pythia 8.186 using the NNPDF3.0NLO PDF set [50] and the AZNLO tune together with the CTEQ6L1 PDF set for parton showering and hadronisation.
The W Z and Z Z background processes were modelled at NLO using Sherpa as well as Powheg-Box v2 interfaced to Pythia 8.212 with the same settings as employed for the W W event generation. W γ production, gluon-induced W W production including resonant and non-resonant contributions and W W jj production in vector-boson scattering were simulated using the Sherpa 2.2.2 generator with the NNPDF3.0NNLO PDF set. These samples use matrix elements at NLO QCD accuracy for up to one additional parton and LO accuracy for up to three additional parton emissions for W γ and gluon-induced W W production and LO-accurate matrix elements for W W jj production in vector-boson scattering.
The tt̄ and W t processes were simulated with the Powheg-Box [31-33,51,52] v2 generator at NLO with the NNPDF3.0NLO PDF set, interfaced to Pythia 8.230 using the A14 tune [53] and the NNPDF2.3LO set of PDFs. For the W t process, the diagram removal scheme [54] was applied to remove interference and overlap with tt̄ production.

Event reconstruction and selection
Candidate events from γ γ → W W production are identified by the presence of an electron and a muon with high transverse momentum and the absence of additional reconstructed charged-particle tracks associated with the interaction vertex.
Tracks are reconstructed from position measurements (hits) in the ID caused by the passage of charged particles [55,56]. The track reconstruction consists of an iterative track-finding algorithm seeded by combinations of at least three silicon-detector hits followed by a combinatorial Kalman filter [57] to build track candidates based on hits compatible with the extrapolated trajectory. Ambiguities between the track candidates are then resolved and quality criteria are applied to suppress combinations of hits unlikely to originate from a single charged particle. At least one hit in the two innermost layers is required if the extrapolated track crosses the sensitive region of an active sensor module. The number of silicon hits in the pixel and SCT detectors must be larger than 9 for |η| ≤ 1.65 or larger than 11 for |η| > 1.65, with no more than two missing SCT hits on a track if the respective SCT modules are operational. Additionally, a selection is imposed on the transverse impact parameter, |d 0 | < 1 mm, to reject tracks from secondary interactions. Tracks are required to have p T > 500 MeV and be within |η| < 2.5. These selection criteria result in an efficiency of 75-80% depending on the track p T . The largest source of inefficiency is hadronic interactions with the detector material. In simulated events, reconstructed tracks can be classified as originating from the hard scatter or from additional pp collisions by matching the hits that contributed to the track fit to the energy deposited by the charged particle in the Geant4 simulation. The respective tracks are counted as n HS trk and n PU trk .
Electrons are reconstructed from energy clusters in the electromagnetic calorimeter that are matched to tracks reconstructed in the ID [58,59]. The best-matching track is selected using the track-cluster spatial distance and the number of hits in the silicon detectors as criteria [59]. Further tracks may be assigned to the electron candidate if they are likely to originate from interactions with detector material. The pseudorapidity of electrons is required to be within the range of |η| < 2.47, excluding the transition region between the barrel and endcaps in the LAr calorimeter (1.37 < |η| < 1.52). Electron candidates are required to have transverse momenta p T > 20 GeV.
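The track quality criteria listed above can be summarised as a single predicate. The following is a minimal sketch, not the ATLAS reconstruction software: the `Track` container and all of its field names are hypothetical, and the hit-count thresholds are taken literally from the text ("larger than 9" and "larger than 11").

```python
from dataclasses import dataclass


@dataclass
class Track:
    # Hypothetical container; field names are assumptions for illustration.
    pt_mev: float              # transverse momentum in MeV
    eta: float                 # pseudorapidity
    d0_mm: float               # transverse impact parameter in mm
    n_si_hits: int             # pixel + SCT hits
    n_missing_sct: int         # missing SCT hits on operational modules
    has_inner_hit: bool        # hit in one of the two innermost layers
    inner_hit_expected: bool = True  # track crosses an active sensor module


def passes_track_selection(t: Track) -> bool:
    """Apply the track criteria quoted in the text."""
    # Kinematic acceptance: pT > 500 MeV and |eta| < 2.5
    if t.pt_mev <= 500.0 or abs(t.eta) >= 2.5:
        return False
    # Transverse impact parameter: |d0| < 1 mm
    if abs(t.d0_mm) >= 1.0:
        return False
    # Silicon hits: > 9 for |eta| <= 1.65, > 11 otherwise
    min_hits = 9 if abs(t.eta) <= 1.65 else 11
    if t.n_si_hits <= min_hits:
        return False
    # No more than two missing SCT hits
    if t.n_missing_sct > 2:
        return False
    # Innermost-layer hit required only when one is expected
    if t.inner_hit_expected and not t.has_inner_hit:
        return False
    return True
```

Keeping the criteria in one function makes it straightforward to apply the same selection to the lepton-vertex track counting and to the pile-up sampling described later.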
Muons are built from tracks reconstructed using MS hits matched to ID tracks. A global fit using the hits from both subdetectors is performed [60]. Each muon candidate is matched uniquely to exactly one ID track and is required to satisfy |η| < 2.4 and p T > 20 GeV.
Identification and isolation criteria are applied to electron and muon candidates to suppress non-prompt leptons from hadron decays. Identification criteria are based on shower shapes and track parameters for the electrons, and on track parameters for the muons. The isolation criteria use information about ID tracks and calorimeter energy deposits in a fixed cone of ΔR = 0.2 around each lepton. Electrons must satisfy the 'medium' identification criteria as well as the loose isolation criteria described in Ref. [59], which have a combined efficiency of 75-85% depending on the electron p T . Muon candidates are required to satisfy the 'medium' identification and loose isolation criteria introduced in Ref. [60], which have an efficiency of about 95%. The significance of the transverse impact parameter, defined as the absolute value of d 0 divided by its uncertainty, σ d 0 , must satisfy |d 0 |/σ d 0 < 3 for muons and |d 0 |/σ d 0 < 5 for electrons. The decision on whether or not to record the event was made by single-electron or single-muon triggers with requirements on lepton identification and isolation similar to those applied offline. The transverse momentum thresholds for these triggers were 24 GeV for electrons [61] and 20 GeV for muons [62] in 2015, whilst during the 2016-2018 data-taking period the thresholds were both raised to 26 GeV and requirements on lepton identification and isolation were tightened. Complementary triggers with higher p T thresholds and no isolation or looser identification criteria were used to increase the trigger efficiency.
Events are required to contain exactly two leptons of opposite electric charge that satisfy the above criteria. One of the leptons must have transverse momentum exceeding 27 GeV and be matched to an object that provided one of the triggers used for the read-out and storage of the event. The invariant mass of the two selected leptons, m ℓℓ , must exceed 20 GeV. Both same-flavour (ee/μμ) and different-flavour (eμ) events are accepted, for auxiliary measurements and for the signal extraction, respectively.
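As an illustration of the dilepton requirements above, the sketch below computes the invariant mass in the massless-lepton approximation, m² = 2 p T1 p T2 (cosh Δη − cos Δφ), and applies the charge, trigger-matching and mass requirements. The dictionary keys (`pt`, `eta`, `phi`, `charge`, `trigger_matched`) are hypothetical names chosen for this example, and the leptons are assumed to have already passed the identification, isolation and p T > 20 GeV requirements.

```python
import math


def dilepton_mass(pt1, eta1, phi1, pt2, eta2, phi2):
    """Invariant mass in GeV, neglecting the lepton masses."""
    m2 = 2.0 * pt1 * pt2 * (math.cosh(eta1 - eta2) - math.cos(phi1 - phi2))
    return math.sqrt(max(m2, 0.0))  # guard against tiny negative rounding


def passes_dilepton_selection(leptons):
    """leptons: list of dicts with hypothetical keys
    pt (GeV), eta, phi, charge, trigger_matched."""
    # Exactly two selected leptons
    if len(leptons) != 2:
        return False
    a, b = leptons
    # Opposite electric charge
    if a["charge"] * b["charge"] >= 0:
        return False
    # One lepton with pT > 27 GeV that is matched to the trigger
    if not any(l["pt"] > 27.0 and l["trigger_matched"] for l in leptons):
        return False
    # Dilepton invariant mass above 20 GeV
    mll = dilepton_mass(a["pt"], a["eta"], a["phi"], b["pt"], b["eta"], b["phi"])
    return mll > 20.0
```

For two back-to-back leptons at η = 0 with p T = 30 GeV each, the formula gives m ℓℓ = 60 GeV.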
The interaction vertex is reconstructed from the two leptons in the event, ℓ1 and ℓ2, as the weighted average z-position of the tracks extrapolated to the beam line:
$$ z_{\mathrm{vtx}} = \frac{z_{\ell_1}\sin^2\theta_{\ell_1} + z_{\ell_2}\sin^2\theta_{\ell_2}}{\sin^2\theta_{\ell_1} + \sin^2\theta_{\ell_2}}, $$
where sin²θ approximately parameterises the resolution of the z-position [10]. This definition of the interaction vertex is not biased by the presence of additional tracks from hadronic activity in association with the dilepton pair production or by additional tracks from nearby pile-up interactions. It results in a 30% higher efficiency than a primary vertex selection based on the sum of squared track transverse momenta [63]. Each lepton is required to fulfil |(z − z vtx ) sin θ| < 0.5 mm. A window of Δz = ±1 mm around z vtx defines the region in which ID tracks are matched to the interaction vertex. The number of tracks in this window, excluding those used in the reconstruction of leptons, is counted as n trk . Signal γ γ → W W event candidates are selected using the exclusivity requirement that n trk = 0. Events with low track multiplicities, 1 ≤ n trk ≤ 4, are used to evaluate backgrounds. The modelling of n trk is therefore vital to the extraction of the γ γ → W W signal, and this is discussed further in the following section.
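The vertex definition and the track counting can be sketched in a few lines. This is an illustrative sketch under stated assumptions: the weights are taken to be sin²θ of each lepton, as the text indicates, the function names are hypothetical, the window boundary is treated as exclusive, and the list of track z-positions is assumed to already exclude the lepton tracks.

```python
import math


def dilepton_vertex_z(z1, theta1, z2, theta2):
    """sin^2(theta)-weighted average of the two lepton z-positions (mm)."""
    w1 = math.sin(theta1) ** 2
    w2 = math.sin(theta2) ** 2
    return (w1 * z1 + w2 * z2) / (w1 + w2)


def lepton_compatible(z_lep, theta_lep, z_vtx, max_dist_mm=0.5):
    """Check |(z - z_vtx) sin(theta)| < 0.5 mm for one lepton."""
    return abs((z_lep - z_vtx) * math.sin(theta_lep)) < max_dist_mm


def count_vertex_tracks(track_z, z_vtx, half_window_mm=1.0):
    """n_trk: number of non-lepton tracks within +-1 mm of the vertex."""
    return sum(1 for z in track_z if abs(z - z_vtx) < half_window_mm)


# Usage: a central lepton (theta = pi/2) at z = 0 mm and a forward
# lepton (theta = pi/6) at z = 1 mm.
z_vtx = dilepton_vertex_z(0.0, math.pi / 2, 1.0, math.pi / 6)
n_trk = count_vertex_tracks([0.1, 0.9, 1.5, -0.7], z_vtx)
is_exclusive = (n_trk == 0)
```

The central lepton carries four times the weight of the forward one in this example, which reflects its better z-resolution.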

Modelling of signal and backgrounds
Corrections are applied to the simulated signal and background event samples to adjust the lepton trigger, reconstruction, identification and isolation efficiencies, as well as the energy and momentum resolutions, to those observed in data. The muon momentum scale is corrected in the MC simulation, whilst the electron energy scale is corrected in data [59-62]. Accurate modelling of the transverse momenta of the bosons is important because of its correlation with the expected charged-particle multiplicity from hadronic activity. The p W W T distribution in the MC samples for quark-induced W W production is reweighted to the theoretical calculation at next-to-next-to-leading-order (NNLO) accuracy in perturbative quantum chromodynamics with resummation of soft gluon emissions up to next-to-next-to-next-to-leading-logarithm (N 3 LL) accuracy using MATRIX+RadISH [48,49,64-72]. A correction for the transverse momentum distribution of dilepton pairs from the DY process is derived from data using ee and μμ final states with an invariant mass within 15 GeV of the nominal Z boson mass corrected for background, and is applied to all DY samples as a function of the generator-level p Z T . Additional data-driven corrections are needed for this analysis to account for (i) mismodelling of the additional pp interactions produced in the same bunch crossing, (ii) mismodelling of the charged-particle multiplicity in the qq → W W background process, and (iii) second scatterings and the dissociative contribution to the γ γ → W W signal process.

Modelling of additional pp interactions
Tracks from nearby additional pp interactions can be matched to the interaction vertex and, thus, lower the efficiency of the exclusivity requirement. Their number depends on the density of additional pp interactions and the number of tracks originating from these interactions. Data-driven techniques are used to derive corrections to the simulated events to further improve their description of the data, targeting the density of pp interactions and the number of tracks per interaction separately.
The simulated events are reweighted such that the distribution of the average number of pp interactions per bunch crossing reproduces the one measured in the data. An ancillary data measurement is used to determine the correction for the number of tracks from additional pp interactions randomly matched to the interaction vertex, n PU trk . In same-flavour Z → ℓℓ events, this correction is obtained by counting the number of tracks satisfying the nominal selection criteria relative to a random position in z that is well separated from the interaction vertex, |z vtx − z| > 10 mm. Each event is sampled multiple times using non-overlapping regions in z. This procedure optimises the statistical power, but does not consider the actual distribution of z vtx along z. To correct for the resulting bias, n PU trk is extracted as a function of the z-coordinate and weighted with the normalised beam spot distribution.
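The sampling procedure above can be illustrated as follows. This is a simplified sketch, not the analysis code: the function names are hypothetical, the beam spot is assumed to be a single Gaussian of width `sigma_bs` centred at `z_mean`, and the caller is assumed to supply window centres spaced by at least 2 mm so that the ±1 mm windows do not overlap.

```python
import math
from collections import defaultdict


def sample_pileup_counts(track_z, z_vtx, z_centres,
                         half_window=1.0, min_sep=10.0):
    """Count tracks in windows far from the dilepton vertex.

    Returns a list of (z_centre, n_tracks) for every window centre with
    |z_vtx - z_centre| > min_sep (all distances in mm)."""
    samples = []
    for zc in z_centres:
        if abs(zc - z_vtx) > min_sep:
            n = sum(1 for z in track_z if abs(z - zc) < half_window)
            samples.append((zc, n))
    return samples


def beamspot_weighted_pmf(samples, sigma_bs, z_mean=0.0, n_max=10):
    """Probability of n pile-up tracks, weighting the per-z distributions
    with a normalised Gaussian beam spot. The last bin holds overflow."""
    counts_by_z = defaultdict(lambda: [0] * (n_max + 1))
    for zc, n in samples:
        counts_by_z[zc][min(n, n_max)] += 1
    weights = {zc: math.exp(-0.5 * ((zc - z_mean) / sigma_bs) ** 2)
               for zc in counts_by_z}
    norm = sum(weights.values())
    pmf = [0.0] * (n_max + 1)
    for zc, counts in counts_by_z.items():
        total = sum(counts)
        for n, c in enumerate(counts):
            pmf[n] += weights[zc] / norm * c / total
    return pmf


# Usage with one toy event: windows at 0 and 5 mm are too close to the
# vertex at z = 0 and are skipped.
samples = sample_pileup_counts([15.2, 15.9, 30.0], 0.0,
                               [-20.0, 0.0, 5.0, 16.0, 30.0])
pmf = beamspot_weighted_pmf(samples, sigma_bs=40.0)
```

In this sketch, `pmf[0]` plays the role of the probability that no pile-up track falls into the vertex window, which is the quantity that drives the efficiency of the exclusivity requirement.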
This method is tested using simulated events and found to reproduce the n PU trk distribution in data within 0.1-3.5% for low track multiplicities, with larger disagreement for larger n trk . Fig. 2 shows the probability distribution of n PU trk associated with z vtx , extracted in data and simulation before and after the corrections for the beam spot width. The bottom panel shows the ratio to data. The inverse ratio of the beam-spot-corrected simulation to data corresponds to the correction applied as a function of n PU trk in the simulation. The distributions of n PU trk in γ γ → ℓℓ and γ γ → W W MC events are shown after the beam spot and the pile-up corrections. Before any corrections, the disagreement can be up to 15% depending on the beam spot conditions in the simulation. After the σ BS correction, disagreements of about 10% persist for low track multiplicities because the σ BS correction only improves the modelling of the density of the pile-up vertices but not of their track multiplicity. This is corrected using the n PU trk correction. The full set of corrections is applied to all MC samples used in the analysis.
The presence of additional tracks from pile-up randomly leads to the rejection of signal events, and therefore the distribution of n PU trk can be used to extract the signal efficiency of the exclusivity requirement (n trk = 0). This exclusivity efficiency depends strongly on the number of interactions per bunch crossing and the general beam conditions. The average efficiency for the 2015-2018 dataset with an average μ int of 33.7 is 52.6%. It drops from 60% at μ int = 20 to about 30% at μ int = 60. When comparing the data-driven efficiency with that obtained directly from signal MC samples, the results agree to better than 0.2%. The full effect of the data-driven correction for tracks from additional pp interactions is assigned as a systematic uncertainty, resulting in 1% and 3% uncertainty in the efficiency to select events without any additional associated tracks (n trk = 0) for signal and background, respectively. The corresponding uncertainty for events with a low number of tracks associated with the vertex (1 ≤ n trk ≤ 4) is 2% for photon-induced processes and 10% for quark- and gluon-induced processes.

Modelling of the underlying event
For quark-induced diboson production, additional charged particles can be produced from initial-state radiation or secondary partonic scatters in the same pp collision, also called the underlying event. However, for low values of the number of charged particles, the n ch distribution was found to be not well modelled by many of the phenomenological models implemented in the generators [73-76]. The underlying event can be assumed to be similar for quark-induced production of different colourless final states if the transverse momenta of these final states are comparable [76]. Therefore, the charged-particle multiplicity in qq → W W events can be constrained using data measurements of DY production of ℓℓ pairs in pp collisions. Specifically, the charged-particle multiplicity is measured for Z → ℓℓ events in slices of p ℓℓ T . This two-dimensional measurement is then used to correct the DY and diboson simulation. The general validity of this approach has been tested using DY and diboson samples generated with Powheg+Pythia8, Sherpa and Powheg+Herwig7. The multiplicity spectra of charged particles are found to be very different in the different MC samples, yet relatively similar between the respective DY and diboson processes at a constant value of the boson or diboson p T , with the agreement being of the order of 10-20%.
The Z → ℓℓ events are selected using the criteria described in Section 4 with an additional requirement on the dilepton mass (70 GeV < m ℓℓ < 105 GeV) to suppress contributions from background processes. The contribution of pile-up tracks is estimated from data by sampling random z-positions well separated from the dilepton vertex as discussed in Section 5.1. The background at low track multiplicities is dominated by γ γ → ℓℓ events, which have a different p ℓℓ T dependence than DY events and amount to about 5% of the total events selected with 70 GeV < m ℓℓ < 105 GeV and n trk = 0, while their contribution is 0.5% or smaller for higher track multiplicities. The relative normalisations for the elastic, single-dissociative and double-dissociative γ γ → ℓℓ as well as the DY process are determined in a fit to the measured p ℓℓ T distribution in an m ℓℓ > 105 GeV sideband, requiring n trk = 0 and using the shapes from MC simulation. In this sideband, the γ γ → ℓℓ process contributes about 60% to the total event sample. The contribution from the γ γ → W W process with a same-flavour final state amounts to less than 1% of the γ γ → ℓℓ processes in this kinematic region and is neglected. The overall normalisations of the different γ γ → ℓℓ contributions relative to the prediction are compatible within the statistical uncertainty with those from earlier ATLAS studies [77]. After the γ γ → ℓℓ and pile-up contributions are subtracted as backgrounds, D'Agostini unfolding [78,79] is used to unfold the distribution of the reconstructed track multiplicity, n trk , to that of the number of charged particles, n ch , using four iterations.² The charged-particle multiplicity is extracted as a function of the p T of the dilepton system, which corresponds to the transverse momentum of the recoil, using 5-GeV-wide intervals of p ℓℓ T . The largest sources of uncertainty are the contributions from pile-up tracks and uncertainties in the distribution used as the prior, assessed by comparing Powheg+Pythia8 and Sherpa. Other uncertainties originate from the event selection and the γ γ → ℓℓ background subtraction, assessed by varying the kinematic selection and the normalisation of the photon-induced background within the uncertainties of the fit in the m ℓℓ sideband. Fig. 3 (left) compares the unfolded charged-particle multiplicity distribution for different MC models and data. For low values of n ch , the charged-particle multiplicity distribution is mismodelled by a factor of 2.5 in Powheg+Pythia8 and by a factor of 4 in Sherpa, whilst good agreement with the Powheg+Herwig7 model is found except at n ch = 0, where the Powheg+Herwig7 prediction exceeds the data yield by about 30%.
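The iterative unfolding step from n trk to n ch can be illustrated with a compact implementation of the D'Agostini (iterative Bayesian) update. This is a generic sketch of the technique rather than the code used in the analysis: the function name, the toy response matrix and the choice of priors are assumptions, and no regularisation or uncertainty propagation is included.

```python
def dagostini_unfold(measured, response, prior, n_iter=4):
    """Iterative Bayesian unfolding.

    measured[j]    : background-subtracted counts in reco bin j (n_trk)
    response[j][i] : P(reco bin j | true bin i), true bins being n_ch
    prior[i]       : initial guess for the true spectrum (any scale)
    Returns the estimated true counts after n_iter iterations."""
    n_reco, n_true = len(measured), len(prior)
    # Efficiency of each true bin: probability to be reconstructed at all
    eff = [sum(response[j][i] for j in range(n_reco)) for i in range(n_true)]
    norm = sum(prior)
    p = [x / norm for x in prior]
    estimate = list(prior)
    for _ in range(n_iter):
        # Expected reco-level shape for the current prior
        folded = [sum(response[j][k] * p[k] for k in range(n_true))
                  for j in range(n_reco)]
        estimate = []
        for i in range(n_true):
            # Bayes' theorem: P(true i | reco j), summed over reco bins
            s = sum(response[j][i] * p[i] / folded[j] * measured[j]
                    for j in range(n_reco) if folded[j] > 0.0)
            estimate.append(s / eff[i] if eff[i] > 0.0 else 0.0)
        total = sum(estimate)
        p = [e / total for e in estimate]  # updated prior for next pass
    return estimate


# Toy usage: two bins with known migrations.
response = [[0.8, 0.3],
            [0.2, 0.7]]
truth = [100.0, 50.0]
measured = [0.8 * 100.0 + 0.3 * 50.0, 0.2 * 100.0 + 0.7 * 50.0]
unfolded = dagostini_unfold(measured, response, prior=[1.0, 1.0], n_iter=4)
```

With a finite number of iterations the result retains some dependence on the prior, which is consistent with the prior being listed among the leading uncertainties above.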
The charged-particle multiplicity in simulated DY events is corrected using per-event weights determined as the ratio of the unfolded data to the unfolded MC simulation as a function of the charged-particle multiplicity, and of the particle-level p T of the decay products of the Z boson. The impact of the charged-particle multiplicity correction is shown in Fig. 3 (right) for DY events. The simulation is shown both before and after the correction for pile-up modelling and underlying-event modelling in Z → ℓℓ events satisfying 70 GeV < m ℓℓ < 105 GeV. The corrections bring the MC simulation into agreement with data within the systematic uncertainty of the charged-particle measurement. The correction for the underlying-event modelling is applied to W W , W Z and Z Z processes as a function of the charged-particle multiplicity, and of the particle-level p T of the decay products of the diboson system.
² Similarly to Ref. [80], charged particles are defined to be stable if they have a mean lifetime τ > 30 ps and satisfy p T > 500 MeV and |η| < 2.5.

Signal modelling
After the initial γ γ → W W process, the protons can undergo a second inelastic interaction. These additional rescatterings do not change the kinematics of the γ γ → W W process, but lead to the production of particles such that the cross section of γ γ → W W production without associated tracks is reduced. This effect is not included in the modelling of the signal. The probability that no such additional particles are produced is commonly referred to as the survival factor. In addition, when the exclusivity requirement is applied, the γ γ → W W signal is modelled by Herwig7, which includes only the elastic component. To obtain a better estimate of the expected signal yield including the dissociative components and to correct for effects from the rescattering of protons, a correction factor is obtained from a γ γ → ℓℓ control sample in data, following a procedure similar to that applied in Refs. [4,6] using same-flavour lepton final states. To enhance the purity in γ γ → ℓℓ production and to mimic the kinematic threshold of γ γ → W W production, the dilepton mass is required to be larger than 160 GeV. The exclusivity requirement of n trk = 0 is applied. In the region where the correction factor is extracted, the predicted event yield from the γ γ → W W process with same-flavour final states is approximately 1.5% of the γ γ → ℓℓ yield, so that the derived correction factor is essentially independent of the γ γ → W W signal process.
The background, dominated by DY production, is estimated using a data-driven technique. The shape of the m ℓℓ distribution for background events is estimated using events with n trk = 5, which is a compromise between small signal contamination and closeness to the signal region. This template is normalised to the n trk = 0 selection using a narrow window around the nominal Z boson mass (83.5 GeV < m ℓℓ < 98.5 GeV) where the contribution from photon-induced processes is small. The m ℓℓ lineshape in simulated DY events is found to be independent of n trk for low multiplicities.
When the exclusivity requirement of $n_{\mathrm{trk}} = 0$ is applied, the ratio of the yield from photon-induced processes in data to the MC prediction for the elastic processes is found to be 3.59 ± 0.15 (tot.).
This agrees with the expectation of 3.55 obtained using the MC prediction. It has been verified that the signal modelling correction does not vary as a function of $p_{\mathrm{T}}^{e\mu}$ within the boundaries used to extract the signal. Fig. 4 illustrates the extraction of the signal modelling correction from data. The signal modelling correction is only applicable to events with $n_{\mathrm{trk}} = 0$. The simulated Herwig7 events are used in conjunction with the signal modelling correction for predictions of photon-induced processes in events where the $n_{\mathrm{trk}} = 0$ requirement is applied, while the event samples from MG5_aMC@NLO+Pythia8 are used for predictions in regions with larger track multiplicities.

Fig. 4. The signal modelling correction is extracted as the ratio of the yield of γγ → ℓℓ and γγ → WW processes passing the exclusivity requirement of $n_{\mathrm{trk}} = 0$ to the yield of the simulated elastic process only. Shown are the data, where a requirement of $n_{\mathrm{trk}} = 0$ has been applied, and the background templates selected from data using $n_{\mathrm{trk}} = 2$ and $n_{\mathrm{trk}} = 5$. In addition, the γγ → ℓℓ and γγ → WW MC predictions are depicted, as well as the sum of the nominal background template ($n_{\mathrm{trk}} = 5$) and the γγ → ℓℓ and γγ → WW MC predictions scaled by the signal modelling correction. The normalisation region around the nominal Z boson mass is indicated with a vertical dashed line, as is the region where the signal modelling correction is extracted ($m_{\ell\ell}$ > 160 GeV). The excess in data relative to the elastic γγ → ℓℓ and γγ → WW prediction is attributed to the dissociative photon-induced processes and used to extract the signal modelling correction that is shown in the lower panel of the plot. The uncertainties shown are statistical only.
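The extraction described above (normalise a background template in the Z-peak window, subtract it above the mass threshold, divide by the elastic-only prediction) can be sketched as follows. This is a minimal illustration, not the analysis code: the function name, its arguments and all yields are hypothetical, and the sketch assumes a single integrated yield per region rather than a binned template.

```python
def signal_modelling_correction(data_hi, template_hi, data_z, template_z, elastic_mc_hi):
    """Ratio of background-subtracted data to the elastic-only MC yield.

    data_hi, template_hi : yields above the dilepton-mass threshold
                           (n_trk = 0 data and the background template)
    data_z, template_z   : yields in the Z-peak normalisation window
    elastic_mc_hi        : simulated elastic photon-induced yield above the threshold
    """
    if template_z <= 0 or elastic_mc_hi <= 0:
        raise ValueError("template and elastic MC yields must be positive")
    # Scale the template so that it matches the data in the Z-peak window,
    # where photon-induced processes are assumed to be negligible.
    scale = data_z / template_z
    background_hi = scale * template_hi
    # Excess over the background, relative to the elastic-only prediction.
    return (data_hi - background_hi) / elastic_mc_hi


# Illustrative yields only; they are NOT taken from the analysis.
correction = signal_modelling_correction(
    data_hi=500.0, template_hi=2000.0,
    data_z=1000.0, template_z=20000.0,
    elastic_mc_hi=100.0,
)
print(correction)
```

With these toy inputs the template is scaled by 0.05, so 100 of the 500 events above the threshold are attributed to background and the remaining 400 are compared with 100 simulated elastic events.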
Uncertainties are evaluated by increasing the mass window of the DY background normalisation region to 73.5 GeV < $m_{\ell\ell}$ < 108.5 GeV and by changing the number of tracks used in the selection of the template, using $n_{\mathrm{trk}} = 2$ instead of the nominal value. The resulting uncertainty in the signal modelling correction amounts to 4.2%. When the signal modelling correction is applied to γγ → WW, an additional transfer uncertainty is included to account for potential differences between γγ → ℓℓ and γγ → WW events due to the fact that rescattering effects are mass-dependent. It is calculated as the largest variation that arises from placing different lower bounds on the evaluation region; the lower bound on $m_{\ell\ell}$ was varied from 110 GeV to 400 GeV in intervals of 10 GeV. The resulting uncertainty amounts to 11%.
This uncertainty affects only the scaling of the γγ → WW process and thus the measured signal strength and any cross-section prediction derived using the signal correction factor, but cancels out in the measurement of the fiducial cross section.
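The "largest variation" prescription for the transfer uncertainty can be written compactly. The sketch below is only an illustration: the helper name and the scanned correction values are hypothetical, and it assumes the variation is quoted relative to the nominal correction.

```python
def transfer_uncertainty(nominal, scanned):
    """Largest relative deviation of the scanned corrections from the nominal one."""
    if nominal <= 0:
        raise ValueError("nominal correction must be positive")
    return max(abs(value - nominal) / nominal for value in scanned)


# Hypothetical corrections obtained with different lower bounds on the
# dilepton mass; these values are NOT results from the analysis.
scan = [3.40, 3.59, 3.80]
rel_unc = transfer_uncertainty(3.59, scan)
print(f"{rel_unc:.3f}")
```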

Event categories and background estimation
One signal region and three control regions, enriched in signal and background events respectively, are defined using the dilepton transverse momentum, $p_{\mathrm{T}}^{e\mu}$, and the number of additional tracks associated with the interaction vertex, $n_{\mathrm{trk}}$. The signal region is defined by selecting $p_{\mathrm{T}}^{e\mu} > 30$ GeV and $n_{\mathrm{trk}} = 0$. It has an expected purity of 57% and an expected background contamination from qq → WW production of 33%.
Additional kinematic regions with alternative requirements on $p_{\mathrm{T}}^{e\mu}$ and $n_{\mathrm{trk}}$ are used to control the modelling of background processes. The first control region is defined by $p_{\mathrm{T}}^{e\mu} < 30$ GeV and $1 \le n_{\mathrm{trk}} \le 4$ and helps to constrain the DY ($Z/\gamma^{*} \to \tau\tau$) normalisation, as this process contributes 75% of the selected events in this region. It also has non-negligible contributions from qq → WW events and non-prompt leptons. The second control region is defined by $p_{\mathrm{T}}^{e\mu} > 30$ GeV and $1 \le n_{\mathrm{trk}} \le 4$ and is designed to be enriched in qq → WW events, with an expected contribution of about 70% from that process and minor contributions from the DY process and non-prompt lepton events. An additional control region is selected with $p_{\mathrm{T}}^{e\mu} < 30$ GeV and $n_{\mathrm{trk}} = 0$. It provides additional control of the modelling of backgrounds specific to events with no tracks, but has a signal contamination of the order of 10%. The boundaries between these regions are chosen such that good signal-background separation is achieved. In addition, the regions used to control the normalisation of the backgrounds are defined to be topologically very similar to the signal region, which helps to minimise uncertainties in extrapolating the normalisation from the control regions to the signal region.
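The four regions defined above amount to a two-by-two split in the dilepton transverse momentum and the track multiplicity. A minimal sketch, with hypothetical region labels, is given below. The text does not state which side an event with exactly 30 GeV falls on; assigning it to the low-momentum side is an assumption of this sketch.

```python
def classify_region(pt_emu_gev, n_trk):
    """Return a region label, or None if the event is in none of the four regions."""
    # Assumption: an event with exactly 30 GeV is treated as low-pT.
    high_pt = pt_emu_gev > 30.0
    if n_trk == 0:
        return "SR" if high_pt else "CR_low_pt_0trk"
    if 1 <= n_trk <= 4:
        # High pT is enriched in qq -> WW, low pT in DY (Z/gamma* -> tautau).
        return "CR_WW" if high_pt else "CR_DY"
    return None


print(classify_region(45.0, 0))
print(classify_region(12.0, 3))
```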
Background events from non-prompt leptons contribute about 6% of the selected signal candidates in the signal region. The primary source of these backgrounds in dilepton events is W+jets production where one of the leptons is prompt and the other stems from light-hadron or heavy-flavour decays. Background events from non-prompt leptons are estimated from a control region where exactly one of the leptons must fail to satisfy some of the lepton identification criteria of the nominal event selection. All other kinematic selection criteria are the same as for the signal selection. The contribution from non-prompt leptons is then estimated by scaling the number of events in the control region by the ratio of the number of non-prompt leptons passing all identification requirements to those failing some of these requirements. This ratio is measured in data selected with one electron and one muon with the same electric charge, and requiring $1 \le n_{\mathrm{trk}} \le 4$.
Contributions from prompt leptons are subtracted using MC simulation. For the extrapolation to the event samples selected with $n_{\mathrm{trk}} = 0$, a dedicated uncertainty is assigned.
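The non-prompt estimate described above has the structure of a pass-to-fail ratio applied to a control-region yield. The following sketch illustrates that structure only: the function names and all yields are hypothetical, and subtracting the prompt MC contribution from every data yield (numerator, denominator and control region) is an assumption of the sketch.

```python
def measure_pass_fail_ratio(n_pass_data, n_pass_prompt_mc, n_fail_data, n_fail_prompt_mc):
    """Pass/fail ratio for non-prompt leptons, after subtracting prompt MC."""
    n_fail = n_fail_data - n_fail_prompt_mc
    if n_fail <= 0:
        raise ValueError("no non-prompt leptons in the failing sample")
    return (n_pass_data - n_pass_prompt_mc) / n_fail


def nonprompt_estimate(n_cr_data, n_cr_prompt_mc, ratio):
    """Scale the prompt-subtracted control-region yield by the pass/fail ratio."""
    return (n_cr_data - n_cr_prompt_mc) * ratio


# Illustrative yields only; they are NOT taken from the analysis.
ratio = measure_pass_fail_ratio(120.0, 20.0, 520.0, 20.0)
estimate = nonprompt_estimate(200.0, 50.0, ratio)
print(ratio, estimate)
```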

Systematic uncertainties
Uncertainties and their correlations are evaluated in each of the signal and control regions. The uncertainties in the measurement of tracks originate from uncertainties in the inner detector alignment, the reconstruction efficiency, and the probability to incorrectly reconstruct tracks by including hits from noise or from several tracks. The combined uncertainty amounts to 5-7% of the event yields for DY and qq → WW production, whilst for photon-induced processes these uncertainties are < 1% in the regions where these processes contribute significantly.
Systematic uncertainties in the event yields due to electron and muon reconstruction, including effects from the trigger and reconstruction efficiencies, energy/momentum scale and resolution, and pile-up modelling, amount to 0.5% in the signal region and up to 2%, depending on the process, in the control regions [59-62].
Table 1. Summary of the data event yields, and the predicted signal and background event yields in the signal region and control regions as obtained after the fit. The uncertainties shown include statistical and systematic components. Because the fit introduces correlations between systematic uncertainties, the uncertainty in the total expected yield is smaller than its components. The leftmost column of values corresponds to the signal region used to measure γγ → WW in proton-proton collisions. The numbers for qq → WW also contain a small contribution from gluon-induced WW and electroweak WWjj production. The event yields for other backgrounds include contributions from WZ and ZZ diboson production, top-quark production and other gluon-induced processes.

The uncertainty in the background from non-prompt leptons is dominated by the uncertainty in the measurement of the ratio of non-prompt leptons passing all identification requirements to those failing some, in particular the subtraction of contributions from genuine leptons in the numerator of that ratio. The resulting uncertainty on this background estimation ranges between 50% and 100% depending on the region. The statistical uncertainty in the control region for the estimation of background from misidentified leptons is also a significant source of uncertainty. The uncertainties in the correction of pile-up modelling and the underlying event as well as the uncertainty in the signal modelling correction are described in Section 5. The correction for the underlying-event modelling in the WW, WZ and ZZ processes is derived in bins of $p_{\mathrm{T}}^{\ell\ell}$, but applied as a function of diboson $p_{\mathrm{T}}$, utilising the fact that there are only relatively small differences in charged-particle multiplicity between the DY and diboson processes. Residual differences are evaluated at the particle level and considered as systematic uncertainties.

For the largest source of background, the quark-induced WW process, further studies are made. The predicted event yields are compared for Powheg+Pythia8 and variations of the Pythia8 parton-shower tunes, and for Powheg+Herwig7 and Sherpa, with each prediction using its dedicated underlying-event correction. The event yields agree well for $1 \le n_{\mathrm{trk}} \le 4$, but disagree in the signal region, $n_{\mathrm{trk}} = 0$. The background yield from the quark-induced WW process is estimated as the average of the highest and lowest value of the various predictions, that is, the midpoint of the most extreme predictions, as no preference for any model can be deduced from the data. The envelope of all predictions is taken as the upper and lower one-standard-deviation boundary, amounting to ±7% for events selected with $n_{\mathrm{trk}} = 0$ and to less than 1% for events selected with $1 \le n_{\mathrm{trk}} \le 4$. The uncertainties in the total quark-induced WW cross section and the shape of the $p_{\mathrm{T}}^{WW}$ distribution are taken from the MATRIX+RadISH prediction used to reweight the WW samples, amounting to 5-6%.
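The midpoint-and-envelope prescription used for the quark-induced WW yield can be illustrated in a few lines. The function name and the yields below are hypothetical; the numbers are chosen only so that the envelope comes out at ±7%.

```python
def midpoint_and_envelope(predictions):
    """Central value = midpoint of the extreme predictions; uncertainty = half the spread."""
    lowest, highest = min(predictions), max(predictions)
    central = 0.5 * (lowest + highest)
    half_width = 0.5 * (highest - lowest)
    return central, half_width, half_width / central


# Illustrative generator predictions; they are NOT taken from the analysis.
central, half_width, relative = midpoint_and_envelope([93.0, 98.5, 101.0, 107.0])
print(central, half_width, round(relative, 3))
```

Only the two extreme predictions enter the result, so adding further models inside the envelope leaves the central value and the uncertainty unchanged.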
Because of the specific event selection of the analysis, large uncertainties are applied to minor backgrounds, where the $n_{\mathrm{trk}}$ modelling cannot be easily studied in data: the uncertainty in the Wγ normalisation is taken to be ±100%, whereas uncertainties of ±30% are used for the normalisation of top-quark production and WWjj production through vector-boson scattering (VBS) as well as gluon-induced resonant and non-resonant WW production. The numbers are informed by the size of the underlying-event correction in DY and WW events and studies on events with forward jets outside the acceptance of the ID. For the smaller background contributions from WZ and ZZ production the uncertainty is assessed by comparing the event yields predicted by Powheg+Pythia8 with those predicted in Sherpa after applying the underlying-event correction described in Section 5.2.
The systematic uncertainty in the measured cross section also includes a contribution due to differences in reconstruction efficiency between elastic and dissociative photon-induced processes as well as an uncertainty due to missing spin correlations in Herwig7, which mainly affects the $p_{\mathrm{T}}^{e\mu}$ modelling. These uncertainties are evaluated separately by comparing the reconstruction efficiency of the elastic-only prediction with that including all production mechanisms and by comparing the reconstruction efficiency between Herwig7 and MG5_aMC@NLO+Pythia8. Their combined effect is ±2%. Uncertainties stemming from the signal modelling correction are applied to the signal prediction and are discussed in detail in Section 5.3.

Results
The γγ → WW signal in proton-proton collisions is extracted using a profile likelihood fit of the estimated signal and background event yields to data. The fit uses the integrated event yields in the four kinematic regions introduced in Section 6, and the ee + μμ events selected as described in Section 5.3. It maximises the product of Poisson probabilities to produce the observed number of data events, $N_{\mathrm{obs}}$, in each of these regions [81].
The normalisations of the backgrounds from DY and qq → WW processes are free parameters in the fit. The expected elastic γγ → ℓℓ and γγ → WW event yields for $n_{\mathrm{trk}} = 0$ are multiplied by the signal modelling correction discussed in Section 5.3, which is obtained within the fit, as described, so that the experimental correlations are preserved correctly. The event yield for the γγ → WW signal process is also multiplied by a signal strength that is a free parameter in the fit. Systematic uncertainties are included in the fit as nuisance parameters constrained by Gaussian functions. The fit can only constrain the sum of the backgrounds, since the background composition is similar in events selected with $n_{\mathrm{trk}} = 0$ and those selected with $1 \le n_{\mathrm{trk}} \le 4$. Overall, the uncertainty in the sum of their yields is dominated by the systematic uncertainties assigned to events selected with $n_{\mathrm{trk}} = 0$. In this fit, the background-only hypothesis is expected to be rejected with a significance of 6.7 standard deviations. Table 1 gives an overview of the number of data events compared to background and signal event yields in the different regions after the fit. The data yield in the signal region is 307, compared with 132 background events predicted by the best-fit result. The normalisations of the WW and the DY background are con- […] control and signal regions is at 30 GeV. The distributions in Fig. 5 include the fitted normalisations and nuisance parameters described above; the resulting predictions are in good agreement with the data. Fig. 6 shows the distribution of the number of reconstructed tracks for $p_{\mathrm{T}}^{e\mu} > 30$ GeV.
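The structure of the likelihood described in this section (a product of Poisson terms over counting regions, a free signal strength, and Gaussian-constrained nuisance parameters) can be sketched as below. This is a deliberately reduced toy: it uses a single nuisance parameter acting on the background, omits the free background normalisations and the signal modelling correction, and every name and number is hypothetical.

```python
import math


def poisson_log_prob(n_obs, expected):
    """log of the Poisson probability to observe n_obs given the expected yield."""
    if expected <= 0:
        raise ValueError("expected yield must be positive")
    return n_obs * math.log(expected) - expected - math.lgamma(n_obs + 1)


def negative_log_likelihood(mu, theta, regions):
    """Toy NLL: Poisson terms per region plus a unit-width Gaussian constraint on theta."""
    nll = 0.5 * theta * theta  # Gaussian constraint (constant term dropped)
    for region in regions:
        # theta shifts the background by its relative systematic uncertainty.
        background = region["background"] * (1.0 + region["bkg_syst"] * theta)
        expected = mu * region["signal"] + background
        nll -= poisson_log_prob(region["n_obs"], expected)
    return nll


# Toy regions; the yields are NOT taken from the analysis.
toy_regions = [
    {"n_obs": 30, "signal": 17.0, "background": 13.0, "bkg_syst": 0.10},
    {"n_obs": 100, "signal": 2.0, "background": 98.0, "bkg_syst": 0.05},
]
print(negative_log_likelihood(1.0, 0.0, toy_regions))
print(negative_log_likelihood(0.0, 0.0, toy_regions))
```

In a profile likelihood fit the nuisance parameters are re-minimised for each tested value of the signal strength; the sketch only evaluates the function at fixed parameter values.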
The fiducial phase space used for the cross-section measurement is defined to be close to the acceptance of the detector. The leptons must at particle level satisfy the pseudorapidity requirement |η| < 2.5. One of the leptons is required to have a […] Without requirements on the number of reconstructed tracks, the selection efficiency after reconstruction is 75% for elastic γγ → WW events in the fiducial region. The full selection efficiency after applying $n_{\mathrm{trk}} = 0$ is 39%. The predicted number of signal events includes a ∼5% contribution of leptons from $W \to \tau\nu_{\tau}$, $\tau \to \ell\nu_{\ell}\nu_{\tau}$, which is estimated using the MC simulation and which is removed from the measured fiducial cross section using this fractional contribution.

Table 2. The impact of different components of systematic uncertainty on the measured fiducial cross section, without taking into account correlations. The impact of each source of systematic uncertainty is computed by first performing the fit with the corresponding nuisance parameter fixed to one standard deviation up or down from the value obtained in the nominal fit, then these high and low variations are symmetrised. The impacts of several sources of systematic uncertainty are added in quadrature for each component.

The observed signal strength translates into a fiducial cross section of

$$\sigma_{\mathrm{meas}} = 3.13 \pm 0.31\,(\mathrm{stat.}) \pm 0.28\,(\mathrm{syst.})\ \mathrm{fb}.$$

The uncertainties correspond to the statistical and systematic uncertainties, respectively. Table 2 gives an overview of the sources of systematic uncertainties, which are discussed in Section 7, and presents their effect on the measured cross section. To evaluate the impact of one source of systematic uncertainty, the fit is performed with the corresponding nuisance parameter fixed one standard deviation up or down from the value obtained in the nominal fit, then these high and low variations are symmetrised.
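The impact procedure described for Table 2 (symmetrise the up and down variations, then add several sources in quadrature) can be sketched as follows. The helper names and the varied cross-section values are hypothetical. Symmetrising as the mean of the two absolute shifts is one common convention and is an assumption here, as is the quadrature sum of the quoted statistical and systematic uncertainties in the last line.

```python
import math


def symmetrised_impact(nominal, up, down):
    """Mean absolute shift of the result under the up and down variations."""
    return 0.5 * (abs(up - nominal) + abs(down - nominal))


def combine_in_quadrature(impacts):
    """Quadrature sum of the impacts grouped into one component."""
    return math.sqrt(sum(impact * impact for impact in impacts))


# Hypothetical fit results (in fb) with one nuisance parameter fixed up/down.
impact = symmetrised_impact(nominal=3.13, up=3.20, down=3.08)
component = combine_in_quadrature([0.03, 0.04])

# Quoted uncertainties on the measured cross section, combined in quadrature.
total = combine_in_quadrature([0.31, 0.28])
print(f"{impact:.2f} {component:.2f} {total:.2f}")
```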
The data measurement can be compared with two types of predictions. The first, used in the definition of the signal strength and the calculation of the expected significance, is based on the Herwig7 prediction for elastic γγ → WW events scaled by the data-driven signal modelling correction to include the dissociative processes and rescattering effects as described in Section 5.3. It is found to be

$$\sigma_{\mathrm{theo}} \times \left(3.59 \pm 0.15\,(\mathrm{exp.}) \pm 0.39\,(\mathrm{trans.})\right) = 2.34 \pm 0.27\ \mathrm{fb},$$

where the uncertainty contains all experimental uncertainties and receives an additional component due to the transfer from the γγ → ℓℓ to the γγ → WW process described above. The uncertainties in the theory prediction are negligible because the scale uncertainty in the calculation of elastic production based on a photon flux is small and partially cancels with the signal correction that is calculated with respect to the same photon flux compared to the data. A standalone theory prediction for the fiducial cross section is computed with MG5_aMC@NLO+Pythia8 using the appropriate elastic or inelastic MMHT2015qed PDF sets [82] for each of the contributions by applying the fiducial requirements to all photon-induced contributions, which yields 4.3 ± 1.0 (scale) ± 0.1 (PDF) fb. The scale uncertainty is determined by varying the factorisation scale by factors of 2 and 0.5 and symmetrising the effect. The contributions to this cross-section prediction from elastic and single-dissociative production are 16% and 81%, respectively. Double-dissociative production contributes only 3%. Using CT14qed [28] as the central PDF set yields a prediction which is 26% smaller and amounts to 3.2 fb.
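The numbers quoted for the two predictions can be cross-checked with simple arithmetic. The sketch below assumes that the experimental and transfer uncertainties are combined in quadrature, which reproduces the quoted ±0.27 fb. The elastic cross section it prints is only the value implied by dividing the quoted prediction by the correction; it is not a number stated in the text.

```python
import math

correction, exp_unc, trans_unc = 3.59, 0.15, 0.39  # signal modelling correction
prediction_fb = 2.34                                # corrected Herwig7 prediction

# Assumption: experimental and transfer uncertainties add in quadrature.
relative_unc = math.sqrt(exp_unc**2 + trans_unc**2) / correction
prediction_unc_fb = prediction_fb * relative_unc

# Elastic-only cross section implied by the quoted numbers (not stated in the text).
implied_elastic_fb = prediction_fb / correction

# Breakdown of the standalone 4.3 fb prediction using the quoted fractions.
standalone_fb = 4.3
fractions = {"elastic": 0.16, "single-dissociative": 0.81, "double-dissociative": 0.03}
breakdown_fb = {name: standalone_fb * frac for name, frac in fractions.items()}

# A prediction that is 26% smaller, as quoted for CT14qed.
ct14_fb = standalone_fb * (1.0 - 0.26)

print(f"{prediction_unc_fb:.2f} {implied_elastic_fb:.2f} {ct14_fb:.1f}")
print({name: round(value, 2) for name, value in breakdown_fb.items()})
```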
The MG5_aMC@NLO+Pythia8 prediction does not include rescattering effects that are expected to decrease the fiducial cross section. For elastic γγ → WW production, a survival factor of 0.65 was estimated in Ref. [83]. In Ref. [84] a survival factor of 0.82 was calculated in a two-channel eikonal model also accounting for the helicity structure of the hard scattering process. Multiplying the MG5_aMC@NLO+Pythia8 prediction by these survival factors results in theoretical predictions of 2.8 ± 0.8 fb and 3.5 ± 1.0 fb, respectively, with the total uncertainties calculated as the quadratic sum of scale and PDF uncertainties. These predictions are in agreement with the measurement.
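As a quick check of the central values, the survival factors scale the standalone prediction multiplicatively. The symbols $S$ and $\sigma_{\mathrm{MG5}}$ are notation introduced here for illustration only, and the check covers the central values, not the quoted uncertainties.

```latex
\begin{align*}
  \sigma_{\mathrm{pred}} &= S \times \sigma_{\mathrm{MG5}}, \qquad \sigma_{\mathrm{MG5}} = 4.3\ \mathrm{fb},\\
  S = 0.65:\quad \sigma_{\mathrm{pred}} &= 0.65 \times 4.3\ \mathrm{fb} \approx 2.8\ \mathrm{fb},\\
  S = 0.82:\quad \sigma_{\mathrm{pred}} &= 0.82 \times 4.3\ \mathrm{fb} \approx 3.5\ \mathrm{fb}.
\end{align*}
```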

Conclusion
The photon-induced production process, γγ → WW, was studied in proton-proton collisions at $\sqrt{s} = 13$ TeV recorded with the ATLAS detector at the LHC, corresponding to an integrated luminosity of 139 fb$^{-1}$. Events with leptonic W boson decays into $e^{\pm}\nu\mu^{\mp}\nu$ final states were selected by requiring that no tracks except those of the two charged leptons are associated with the production vertex. The background-only hypothesis is rejected with a significance of 8.4 standard deviations, whereas a significance of 6.7 standard deviations was expected. This measurement constitutes the observation of photon-induced WW production in pp collisions, a process for which only evidence was previously reported. The signal strength and the cross section for the sum of elastic and dissociative production mechanisms are measured. The cross section for the $pp(\gamma\gamma) \to p^{(*)}W^{+}W^{-}p^{(*)}$ process in the decay channel $W^{+}W^{-} \to e^{\pm}\nu\mu^{\mp}\nu$ in a fiducial phase space close to the experimental acceptance is measured to be 3.13 ± 0.31 (stat.) ± 0.28 (syst.) fb. This result is in agreement with the theoretical predictions and may serve as input into EFT interpretations.

Declaration of competing interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.