1 Introduction

Fig. 1
figure 1

a Standard Model production of \(\textrm{Z}\,(\rightarrow \ell \ell )\gamma \gamma \) at leading order. b, c Standard Model production of \(\ell \ell \gamma \gamma \) involving final-state radiation, which is not considered as signal in this paper. d Production of \(\ell \ell \gamma \gamma \) involving an anomalous quartic coupling between neutral EW gauge bosons

Processes involving the production of three electroweak (EW) gauge bosons from proton–proton collisions are typically rare. Some of these triboson processes are only just becoming accessible due to the unprecedented integrated luminosity provided during Run 2 of the CERN LHC. Measurements of such processes provide a direct probe of non-Abelian quartic gauge couplings, both those that are predicted by the Standard Model (SM) and those that could only be due to new physics. The production of a Z boson in association with two prompt photons provides an opportunity to test the electroweak sector of the SM and to constrain any potential new physics effects. Neutral quartic gauge couplings are not allowed in the SM, and hence the production of \(\textrm{Z}\,\gamma \gamma \) has no tree-level contribution involving quartic couplings. In this paper, the leptonic decay channels of the Z boson, i.e. \(\textrm{Z}\,\rightarrow \ell ^{+}\ell ^{-}\) where \(\ell = e,\mu \), are considered. Despite having lower branching fractions than the quark and neutrino decay channels, the leptonic channels benefit from having a cleaner final state and smaller backgrounds. The measurement of \(\textrm{Z}\,\gamma \gamma \) is also crucial for our understanding of the irreducible background to \(\textrm{Z}\,(\rightarrow \ell \ell )H(\rightarrow \gamma \gamma )\) production, and for searches for resonances in the \(\ell \ell \gamma \gamma \) final state.

The production of \(\ell \ell \gamma \gamma \) from proton–proton collisions proceeds at leading order by diagrams of the first three types given in Fig. 1. The production of \(\ell \ell \gamma \gamma \) via three on-shell bosons is shown in Fig. 1a, where both of the photons are produced via initial-state radiation (ISR). In Fig. 1b, c, one or both of the photons are produced via final-state radiation (FSR). The main sources of background to this signal arise from processes involving jets which are misidentified as photons. Previous studies of the \(\ell \ell \gamma \gamma \) final state with the ATLAS detector at \(\sqrt{s}=8~\text {TeV}\,\) [1], and the CMS detector at 8 \(\text {TeV}\)  [2] and 13 \(\text {TeV}\)  [3], were performed in phase spaces which included both the ISR and FSR production of photons. The measurement presented in this paper suppresses the FSR contribution, which allows a simpler interpretation of the measurements since the dominant signal contribution comes through the production of three on-shell bosons. The \(\ell \ell \gamma \gamma \) final state is also sensitive to new physics via anomalous quartic couplings, an example of which is shown in Fig. 1d.

The measurements use \(pp \rightarrow e^+e^-\gamma \gamma \,+ X\) and \(pp \rightarrow \mu ^+\mu ^-\gamma \gamma \,+ X\) events recorded by the ATLAS detector at \(\sqrt{s}=13~\text {TeV}\,\). The ATLAS detector is described in Sect. 2. The full Run 2 dataset corresponding to an integrated luminosity of 139 fb\(^{-1}\) is used. It is described in Sect. 3 along with simulated event samples. The event selection is given in Sect. 4. Background processes are estimated using a combination of data-driven techniques and simulation, which are described in Sect. 5. The systematic uncertainties are discussed in Sect. 6. The yields of signal events are corrected (unfolded) to a fiducial volume where the integrated (differential) cross-section is measured; the unfolding procedures and results are described in Sect. 7. Differential cross-sections are measured as a function of the transverse energy \(E_{\text {T}}\,^{\gamma 1}\) of the leading (highest \(p_{\text {T}}\) ) photon, the transverse energy \(E_{\text {T}}\,^{\gamma 2}\) of the subleading photon, the transverse momentum \(p_{\text {T}}\,^{\ell \ell }\) of the dilepton system, the transverse momentum \(p_{\text {T}}\,^{\ell \ell \gamma \gamma }\) of the four-body system, the invariant mass \(m_{\gamma \gamma }\) of the diphoton system, and the invariant mass \(m_{\ell \ell \gamma \gamma }\) of the four-body system. The \(E_{\text {T}}\,^{\gamma 1}\) , \(E_{\text {T}}\,^{\gamma 2}\) and \(p_{\text {T}}\,^{\ell \ell }\) distributions measure the transverse momentum of one of the three bosons in the event. The \(p_{\text {T}}\,^{\ell \ell \gamma \gamma }\) distribution is a measure of the hadrons recoiling against the \(\ell \ell \gamma \gamma \) system and is hence sensitive to QCD modelling. The \(m_{\gamma \gamma }\) distribution is useful for constraining backgrounds to \(\gamma \gamma \) resonances in the \(\ell \ell \gamma \gamma \) final state, and the \(m_{\ell \ell \gamma \gamma }\) spectrum describes the scale of the full four-body system. The data are compared with predictions from Monte Carlo (MC) event generators with matrix elements calculated to up to next-to-leading order (NLO) in perturbative QCD. The differential \(p_{\text {T}}\,^{\ell \ell }\) measurement is also used to constrain new physics effects arising through anomalous neutral quartic couplings. This is done via an effective field theory (EFT) approach, where limits are set on the coupling strengths of dimension-8 operators. This procedure and the measured limits are presented in Sect. 8. The conclusions are stated in Sect. 9.

2 The ATLAS detector

The ATLAS experiment [4] at the LHC is a multipurpose particle detector with a forward–backward symmetric cylindrical geometry and a near \(4\pi \) coverage in solid angle.Footnote 1 It consists of an inner tracking detector (ID) surrounded by a thin superconducting solenoid providing a \(2\,\hbox {T}\) axial magnetic field, electromagnetic and hadron calorimeters, and a muon spectrometer. The inner tracking detector covers the pseudorapidity range \(|\eta | < 2.5\). It consists of silicon pixel, silicon microstrip, and transition radiation tracking detectors. Lead/liquid-argon (LAr) sampling calorimeters provide electromagnetic (EM) energy measurements with high granularity. A steel/scintillator-tile hadron calorimeter covers the central pseudorapidity range (referred to as the barrel), covering \(|\eta | < 1.7\). The endcap and forward regions are instrumented with LAr calorimeters for both the EM and hadronic energy measurements up to \(|\eta | = 4.9\). The muon spectrometer surrounds the calorimeters and is based on three large superconducting air-core toroidal magnets with eight coils each. The field integral of the toroids ranges between 2.0 and 6.0 T m across most of the detector. The muon spectrometer (MS) includes a system of precision tracking chambers and fast detectors for triggering. A two-level trigger system is used to select events. The first-level trigger is implemented in hardware and uses a subset of the detector information to accept events at a rate below \({100}\,\hbox {kHz}\). This is followed by a software-based trigger that reduces the accepted event rate to \({1}\,\hbox {kHz}\) on average depending on the data-taking conditions. An extensive software suite [5] is used in data simulation, in the reconstruction and analysis of real and simulated data, in detector operations, and in the trigger and data acquisition systems of the experiment.

3 Data and simulated event samples

The data used in this analysis were collected in proton–proton collisions at \(\sqrt{s} = 13~\text {TeV}\,\) from 2015 to 2018. After applying criteria to ensure normal ATLAS detector operation [6], the total integrated luminosity useful for data analysis is 139 fb\(^{-1}\) . The uncertainty in the total Run 2 integrated luminosity is 1.7% [7], obtained using the LUCID-2 detector [8] for the primary luminosity measurements. The average number of inelastic \(pp\)interactions produced per bunch-crossing for the dataset considered is \(\langle \mu \rangle = 33.7\).

Simulated event samples are used to correct the background-subtracted data yield for detector effects and to estimate several background contributions. The simulated samples were produced with various MC event generators, processed through a full ATLAS detector simulation [9] based on Geant4  [10], and reconstructed using the same algorithms as used for data. All simulated samples were corrected with data-driven correction factors to account for differences in trigger, reconstruction, identification and isolation performance between data and simulation. Additional \(pp\)interactions (pile-up) occurring in the same or neighbouring bunch-crossings were modelled by overlaying each simulated event with minimum-bias events generated using Pythia 8.186 [11] with the A3 set of tuned parameters [12] and the NNPDF2.3lo [13] set of parton distribution functions (PDFs). The simulated events were then reweighted to reproduce the distribution of the number of \(pp\)interactions per bunch-crossing observed in the data.

Samples of simulated \(e^+e^-\gamma \gamma \) and \(\mu ^+\mu ^-\gamma \gamma \) events were generated using SHERPA 2.2.10 [14] at NLO accuracy in QCD for zero additional partons and LO accuracy for up to two additional partons. Matrix elements were matched and merged with the SHERPA parton shower [15] based on Catani–Seymour dipole factorisation [15, 16] using the MEPS@NLO prescription [17,18,19,20]. The NNPDF3.0nnlo [21] set of PDFs were used.

For studies of systematic uncertainties and cross-checks, an alternative signal sample is considered. It was produced with the MadGraph5_aMC@NLO 2.7.3 [22] generator with up to one additional final-state parton at NLO accuracy, using the NNPDF3.0nlo PDF set. Events were interfaced to Pythia 8.244 [23], via the FxFx merging procedure [24], for modelling of the parton shower, hadronisation and underlying event. Both the baseline and alternative signal samples utilise smooth-cone photon isolation [25], with the parameters \(\delta _{0}=0.1\), \(\epsilon =0.1\) and \(n=2\), to remove contributions from fragmentation photons.

The \(\textrm{Z}\,\gamma \text {\,+\,jets}\) (\(\textrm{Z}\,+ \text {jets}\) ) samples used in the estimation of misidentified-photon backgrounds were generated with SHERPA 2.2.4 (Powheg Box v1 [26,27,28,29]). Truth-level selections on the photon candidates and their parent particles are used to separate these two samples into three distinct misidentified-photon background components. The \(t{\bar{t}}\gamma \) sample used in the estimation of the background where the leptons originate from top quark decays was generated with MadGraph5_aMC@NLO 2.3.3. The backgrounds arising from electrons misidentified as photons were modelled with SHERPA 2.2.2 \((ZZ\rightarrow \ell \ell \ell \ell )\) and SHERPA 2.2.5 \((WZ\gamma \rightarrow \ell \nu \ell \ell \gamma ).\) The contribution from \(Z(\rightarrow \ell \ell )H(\rightarrow \gamma \gamma )\) was generated with Powheg Box v2. The single-photon and diphoton samples used in the estimation of the pile-up overlay backgrounds were modelled using SHERPA 2.2.2. Further details are given in Table 1, along with a summary of the signal samples.

Table 1 Summary of simulated MC event samples for the \(\ell \ell \gamma \gamma \) signal process and those used in the estimation of backgrounds. The third and fourth columns give the order in perturbative QCD and the PDF set used in the hard-scattering matrix element calculations. The rightmost column specifies the generator used to model parton showering, hadronisation, the underlying event and multiple parton interactions

4 Event reconstruction and selection

Events are selected in the electron and muon channels using unprescaled single-lepton triggers [31, 32] with the lowest \(p_{\text {T}}\) threshold available. From 2016 to 2018, this was 26 \(\text {GeV}\)  for both electrons and muons, and in 2015 it was 24 \(\text {GeV}\)  for electrons and 20 \(\text {GeV}\)  for muons. The low-\(p_{\text {T}}\) triggers are supplemented by higher ones with relaxed identification and isolation requirements which improve the overall trigger efficiency. Reconstructed tracks in the ID and clusters of energy deposits in the EM calorimeter are used as inputs in the reconstruction of electrons and photons [33]. Reconstructed track segments in the ID and MS are used as inputs in the reconstruction of muons [34].

Electron candidates are seeded by EM calorimeter energy clusters and must have \(p_{\text {T}}\,>20\) \(\text {GeV}\) , \(|\eta |<2.47\), and a matching ID track. The transition region between the barrel and endcap of the EM calorimeter \((1.37<|\eta |<1.52)\) is excluded. Electrons are identified using a likelihood discriminant formed from shower shape variables, track variables and a measure of how well the track matches the cluster. All electron candidates must satisfy the Medium identification working point [33]. To suppress the contribution from jets misidentified as electrons, the electron candidates must be isolated from other activity in the tracking and calorimeter systems. The calorimeter and tracking isolation variables are constructed, respectively, from the sums of cluster energies and track momenta falling within a cone of size \(\Delta R = 0.2\) around the electron, which are then required to satisfy the Loose working point described in Ref. [33].

Muon candidates are formed from tracks and must have \(p_{\text {T}}\,>20\) \(\text {GeV}\)  and \(|\eta |<2.5\). Identification requirements comprise selections on track quality and a measure of how well the ID track matches the MS track. Muon candidates must pass the Medium identification working point [34]. The muon candidates are also required to be isolated in the tracking and calorimeter systems using variables similar to those for electrons, in a variable-sized cone up to a maximum of \(\Delta R = 0.3\). The Loose isolation working point is used and is similar to that described in Ref. [34].

Reconstructed tracks matched to common points of origin along the beam axis serve as candidates for the location of proton–proton collisions (vertices). The vertex with the largest sum of the track \(p_{\textrm{T}}^{2}\) is chosen as the primary vertex (PV). Electrons and muons must be consistent with originating from the PV, requiring that their transverse impact parameter significance satisfies \(|d_{0}|/\sigma _{d_{0}} < 3\) (5) for muons (electrons) and the longitudinal distance \(z_0\) from the PV to the point where \(d_0\) is measured satisfies \(|z_{0} \sin (\theta )\,| < 0.5\) mm.

Photon candidates are seeded by EM calorimeter energy clusters which have \(p_{\text {T}}\,>20\) \(\text {GeV}\)  and \(|\eta |<2.37\). The transition region between the barrel and endcap of the EM calorimeter \((1.37<|\eta |<1.52)\) is excluded. Converted photon candidates are formed from clusters which are matched to a conversion vertex. The conversion vertices are formed from one or two tracks which are consistent with a massless particle converting within the ID volume. Unconverted photon candidates are formed from clusters which are not matched to an electron track or conversion vertex. Photon candidates are required to pass a number of selections on shower shape variables which correspond to the Loose identification working point [33].

Overlap removal requirements are applied to the preliminarily selected objects to prevent the same particle from being reconstructed as two separate physics objects. Photons are removed if they are within \(\Delta R=0.4\) of a selected electron or muon. Electrons are removed if they are within \(\Delta R=0.2\) of a selected muon.

Correction factors are applied to the selected objects to account for object trigger, reconstruction, identification and isolation efficiency differences between data and simulation.

Candidate events are considered further if they contain at least one opposite-sign same-flavour lepton pair and at least two photons. One of the leptons must be matched to the trigger object which fired the event, and the highest-\(p_{\text {T}}\) (leading) lepton must have \(p_{\text {T}}\,>30\) \(\text {GeV}\)  to be well above the trigger threshold. If an electron (muon) pair is selected, the leading electron (muon) must pass the Tight identification [33] (Tight isolation) requirement. The invariant mass of the dilepton pair must be above 40 \(\text {GeV}\)  in order to remove contributions from low-mass resonances.

The two highest-\(p_{\text {T}}\) photons in the event that pass the Tight identification and Loose isolation [33] requirements are selected. The two selected photons must be separated from each other by at least \(\Delta R=0.4\). Finally, the contribution from FSR photons is suppressed by requiring that the sum of the dilepton invariant mass and the smaller of the two three-body invariant masses, formed from the dilepton system and each of the two photons, is greater than twice the Z boson rest mass. The selection criterion (\(m_{\ell \ell }\) + min\((m_{\ell \ell \gamma _{1}},\,m_{\ell \ell \gamma _{2}}) > 2m_{\textrm{Z}\,}\) ) is motivated by the fact that, at the generator level, it provides absolute distinction between ISR and FSR events when the Z boson is on-shell. The effect of this selection at the detector level is illustrated in Fig. 2.

Fig. 2
figure 2

The predicted detector-level distribution in SHERPA 2.2.10 simulation of the dilepton invariant mass versus the smaller of the two three-body masses formed from the dilepton system and each of the two photons. The events are subject to the full set of signal region requirements, with the exception of the FSR removal selection (\(m_{\ell \ell }\) + min\((m_{\ell \ell \gamma _{1}},\,m_{\ell \ell \gamma _{2}}) > 2m_{\textrm{Z}\,}\) ). The cut value is indicated by the red line

Table 2 The observed data yield, background composition and estimated signal yield is given for each channel. The signal yield predictions from both of the MC event generators are also given. The statistical uncertainty of the signal predictions and all backgrounds besides the \(j\rightarrow \gamma \) background (second row) is due to the limited number of generated simulation events

5 Backgrounds

The dominant background contributions to \(\ell \ell \gamma \gamma \) arise from processes involving jets misidentified as photons (referred to as \(j\rightarrow \gamma \) backgrounds), and are estimated using a data-driven method. These backgrounds account for approximately 20% of the data yield in the signal region. A small contribution is expected from electrons misidentified as photons (referred to as \(e\rightarrow \gamma \) backgrounds), and is estimated from simulation. The remaining backgrounds, which are small, come from processes involving prompt photons, and are also estimated from simulation. The largest of the prompt-photon backgrounds arises from \(t\bar{t}\gamma \gamma \) events, which contribute approximately 5% of the events selected in the signal region.

The number of data events selected in each channel is given in Table 2 along with the estimated background yields. The data yield in the muon channel is higher than in the electron channel because the muons have a higher reconstruction efficiency than electrons, and also a larger detector acceptance. Table 2 also shows the extracted number of signal events in data (i.e. data minus background), which is compared with predictions from both signal event generators for each channel. The detector-level \(E_{\text {T}}\,^{\gamma 1}\) , \(E_{\text {T}}\,^{\gamma 2}\) and \(m_{\ell \ell \gamma \gamma }\) data distributions, in both the electron and muon channels, are compared with the signal-plus-background predictions in Fig. 3. The estimation of the different backgrounds is described in the following subsections.

5.1 \(j\rightarrow \gamma \) backgrounds

Processes involving jets misidentified as prompt photons populate the \(\ell \ell \gamma \gamma \) signal region. They typically involve light-hadron decays into a pair of photons within jets. In these processes, one or both of the photon candidates are misidentified jets; these are divided into \(\textrm{Z}\,\gamma j\) , \(\textrm{Z}\,j\gamma \) and \(\textrm{Z}\,jj\) categories, the first two according to whether the lower- or higher-\(p_{\text {T}}\) photon candidate is a misidentified jet. The probability of a jet being misidentified as a photon is poorly modelled in simulation, so a data-driven method is used. Such methods utilise jet-enriched control regions, defined by using photon candidates which fail either the photon identification or isolation selections, or both. A loosened data sample is used, where two photons are selected with a loose identification requirement and with no isolation requirement. Each of the two photons in an event can be assigned to one of four categories defined by the signal region’s photon identification and isolation requirements: pass identification and pass isolation (A), pass identification and fail isolation (B), fail identification and pass isolation (C), or fail identification and fail isolation (D). The signal region is hence denoted by AA, and the other 15 combinations define the control regions. In the following, the number of data (signal) events falling into each control region is denoted by \(N_{\textrm{data}}^{XY}\) \((N_{\textrm{signal}}^{XY})\) where \(X,Y = \text {A}, \text {B}, \text {C}, \text {D}\).

The number of events involving jets misidentified as photons in the signal region can be computed from the relevant yields in the control regions using a matrix method that has been employed in previous diphoton analyses [1, 35]. The method uses as inputs the prompt-photon isolation efficiencies \((\epsilon _{1}, \epsilon _{2}),\) which are the probabilities for Tight identified photons to be isolated, and the jet-to-photon fake rates \((f_{1}, f_{2}),\) which are the probabilities for photon candidates which fail the Tight identification selection to be isolated. The real photon efficiencies are measured in simulation and the jet-to-photon fake rates are calculated in data as

$$\begin{aligned}&f_{1}=\frac{(N^{\textrm{CA}}_{\textrm{data}}-N^{\textrm{CA}}_{\textrm{signal}})R}{(N^{\textrm{CA}}_{\textrm{data}}-N^{\textrm{CA}}_{\textrm{signal}})R +N^{\textrm{DA}}_{\textrm{data}}-N^{\textrm{DA}}_{\textrm{signal}}},\\&f_{2}=\frac{(N^{\textrm{AC}}_{\textrm{data}}-N^{\textrm{AC}}_{\textrm{signal}})R}{(N^{\textrm{AC}}_{\textrm{data}}-N^{\textrm{AC}}_{\textrm{signal}})R +N^{\textrm{AD}}_{\textrm{data}}-N^{\textrm{AD}}_{\textrm{signal}}}, \end{aligned}$$

where the indices 1 and 2 refer respectively to the leading and subleading photons and \(R = N_{\textrm{A}}N_{\textrm{D}}/N_{\textrm{B}}N_{\textrm{C}}\) is a correlation parameter which accounts for the bias due to requiring the photon candidate to fail the identification requirement. The values of \(\epsilon _{1}=94\pm 3\)% and \(\epsilon _{2}=91\pm 4\)% are used where the uncertainty is systematic and calculated by looking at the variation of the isolation efficiencies with \(E_{T}^{\gamma }\) in the signal simulations. The parameter R is determined from simulation for each of the photon candidates, and the combined average of \(R=1.18\pm 0.18\) is used for the calculated values of both fake rates. The systematic uncertainty on R is assigned such that it conservatively covers the values predicted for both fake photon candidates by the \(\textrm{Z}\,\gamma \text {\,+\,jets}\) and \(\textrm{Z}\,+ \text {jets}\) simulations. The signal leakage into the jet-enriched control regions is corrected for by subtracting from data the number of signal events predicted by the simulation to fall in the control region.

Fig. 3
figure 3

The detector-level distributions in data compared with signal-plus-background predictions in the electron and muon channels for a, b \(E_{\text {T}}\,^{\gamma 1}\) , c, d \(E_{\text {T}}\,^{\gamma 2}\) and e, f \(m_{\ell \ell \gamma \gamma }\) . The lower panels show the ratio of the data to the total prediction

The number of events from each process in the loosened sample (\(W_{Zxy}\) where \(x,y=\gamma ,j\)) can be mapped onto the signal region (AA) and the control regions AB, BA and BB by using the matrix

$$\begin{aligned}{} & {} \begin{pmatrix} N^{\textrm{AA}} \\ N^{\textrm{AB}} \\ N^{\textrm{BA}} \\ N^{\textrm{BB}} \end{pmatrix}= \begin{pmatrix} \epsilon _{1}\epsilon _{2} &{} \epsilon _{1}f_{2} &{} f_{1}\epsilon _{2} &{} f_{1}f_{2} \\ \epsilon _{1}(1-\epsilon _{2}) &{} \epsilon _{1}(1-f_{2}) &{} f_{1}(1-\epsilon _{2}) &{} f_{1}(1-f_{2}) \\ (1-\epsilon _{1})\epsilon _{2} &{} (1-\epsilon _{1})f_{2} &{} (1-f_{1})\epsilon _{2} &{} (1-f_{1})f_{2} \\ (1-\epsilon _{1})(1-\epsilon _{2}) &{} (1-\epsilon _{1})(1-f_{2}) &{} (1-f_{1})(1-\epsilon _{2}) &{} (1-f_{1})(1-f_{2}) \end{pmatrix} \begin{pmatrix} W_{Z\gamma \gamma } \\ W_{Z\gamma j} \\ W_{Zj\gamma } \\ W_{Zjj} \end{pmatrix}. \end{aligned}$$
(1)

The matrix can then be inverted to determine the unknown yields, \(W_{Zxy}\). The contributions of the four processes to the signal region are determined from the first row of the matrix in Eq. (1):

$$\begin{aligned} N^{\textrm{AA}}&= N_{Z\gamma \gamma } + N_{Z\gamma j} + N_{Zj\gamma } + N_{Zjj} \\&= W_{Z\gamma \gamma }\epsilon _{1}\epsilon _{2} + W_{Z\gamma j}\epsilon _{1}f_{2} + W_{Zj\gamma }f_{1}\epsilon _{2} + W_{Zjj}f_{1}f_{2}. \end{aligned}$$

The \(j\rightarrow \gamma \) background fractions are determined using the signal region events in the \(e^+e^-\gamma \gamma \) and \(\mu ^+\mu ^-\gamma \gamma \) channels combined. These fractions, and their statistical uncertainties, are found to be: \(8.2\pm 2.6\%\) for \(\textrm{Z}\,\gamma j\) , \(9.1\pm 2.3\%\) for \(\textrm{Z}\,j\gamma \) and \(2.8\pm 1.1\%\) for \(\textrm{Z}\,jj\) . The statistical uncertainties are derived using 1000 sets of ‘toy’ data. For each set, the data yield in each region is randomly drawn from a Poisson distribution with a mean value equal to the observed data yield in that region. Each set of toy data is propagated through the matrix inversion, and the standard deviation of the 1000 extracted background fractions is taken as the statistical uncertainty.

The uncertainties on the extracted yields related to the fixed parameters \(\epsilon _{1}\), \(\epsilon _{2}\) and R are investigated by varying these parameters within their systematic uncertainties. Two sources of systematic uncertainty related to the choice of jet-enriched control regions are considered. The dependence of the photon candidates on shower shape variables used in the photon identification definition is tested by varying the selections placed on these variables. An additional requirement is added to the nominal photon isolation requirement in order to reduce the amount of signal which leaks into regions \(\text {B}\) and \(\text {D}\). A systematic uncertainty for this effect is calculated from the difference in extracted yields when not including this requirement. The largest contribution to the total uncertainty on the \(j\rightarrow \gamma \) background yield comes from the data statistical uncertainty, but is similar in size to the total systematic uncertainty.

The method is validated on a ‘pseudo-dataset’ formed from the signal and \(j\rightarrow \gamma \) background simulation samples, where the fractions of the four contributing processes are known. The extracted total normalisation of the \(j\rightarrow \gamma \) background agrees with the expected value within one standard deviation.

There are insufficient events to perform the matrix method in each of the differential measurement bins, therefore the shapes of the \(j\rightarrow \gamma \) backgrounds are taken from simulation and normalised to the overall fractions found in data. The shapes are derived in a slightly loosened signal region where one of the two photons is allowed to fail either the Tight identification or Loose isolation requirements, in order to increase the number of events selected from simulation. The ability of the simulation to describe the shapes in data is checked in a jet-enriched control region. Two sources of uncertainty affecting these shape templates are considered. The first is related to differences between data and simulation, and is estimated by testing the ability of the simulation to model the data distributions in a jet-enriched control region. This uncertainty is typically below 5% of the measured cross-section per bin, but can be as large as 15% in the most poorly modelled regions. The second source of shape uncertainty is related to the choice of control region in which the shapes are derived, and is evaluated by reweighting the shapes to a harder \(p_{\text {T}}\) spectrum because the photons which fail the identification or isolation requirements are typically softer ones. The uncertainty is below 2% across all the differential measurement bins.

5.2 Other backgrounds

The second largest background contribution comes from \(t\bar{t}\gamma \gamma \) events where the top quarks decay leptonically. The normalisation factor for this background is determined in a control region with the same selection requirements as the signal region, except that an opposite-sign \(e\mu \) lepton pair is selected. The \(t\bar{t}\gamma \gamma \) process dominates in this region, but the contribution from \(j\rightarrow \gamma \) events is also considered using the matrix method described above. The ratio of data, with the \(j\rightarrow \gamma \) background subtracted, to the \(t\bar{t}\) simulation in the \(e\mu \) control region is used to define a normalisation factor which is applied to \(t\bar{t}\) simulation events entering the signal region. The considered sources of systematic uncertainty are the same as for the signal. The normalisation factor is \(0.81 \pm 0.17\), where the dominant uncertainty is due to the limited number of data events in the \(e\mu \) control region.

As there are no vertex requirements placed on photons, a source of background arises when two proton–proton interactions in the same bunch-crossing overlap to produce a combined \(\ell \ell \gamma \gamma \) system. A data-driven method, such as the one described in Ref. [36], is not possible due to the limited number of signal region events, so these backgrounds are estimated using simulation only. Two processes contribute at first order: \(Z\gamma \) + \(\gamma \) and Z + \(\gamma \gamma \). Random events from each sample are combined and subjected to the fiducial selection (described later in Sect. 7.1). The resulting particle-level distributions of the six kinematic variables listed in Sect. 1 are corrected to the detector level using bin-by-bin factors determined from the simulated signal events. A systematic uncertainty is assigned to account for the different \(p_{\text {T}}\) distributions of signal events and pile-up background events. It is estimated by recalculating the bin-by-bin factors after reweighting the signal simulation to the pile-up background photon \(p_{\text {T}}\) spectra. The uncertainties in the predicted cross-section of the single-photon [37] and diphoton [38] samples are significant and hence are also included as systematic uncertainties. The total systematic uncertainty is 35% (27%) for the \(Z\gamma \) + \(\gamma \) (Z + \(\gamma \gamma \)) pile-up background processes.

Another source of background is misidentification of electrons as photons. This \(e\rightarrow \gamma \) background is modelled by ZZ and \(WZ\gamma \) simulations. The modelling of electron-to-photon misidentification rates has been tested [39] and is found to disagree with data at a level of up to 50% in some regions. Therefore, a conservative systematic uncertainty of 50% is applied to the ZZ and \(WZ\gamma \) yields in the signal region.

The contribution from \(Z(\rightarrow \ell \ell )H(\rightarrow \gamma \gamma )\) is estimated directly from simulation. The contribution from \(Z(\rightarrow \tau ^{+}\tau ^{-})\gamma \gamma \) is estimated from simulation and is found to be negligible.

6 Systematic uncertainties

Systematic uncertainties in the measured cross-sections are related to the background estimation, the detector-to-fiducial acceptance correction factors (both inclusively and through the unfolding, as described in Sect. 7.1) and the integrated luminosity. The uncertainties in the backgrounds are discussed in Sect. 5. The correction factor and response matrix used for the unfolding are affected by the selection efficiency, and therefore variations of the different object reconstruction efficiencies are considered.

The performance of the electron and photon reconstruction, and their associated systematic uncertainties, are studied in Ref. [33]. For electrons, the reconstruction, identification and isolation efficiencies and their uncertainties are measured by applying tag-and-probe methods to events containing \(\textrm{Z}\,\rightarrow e^{+}e^{-}\) or \(J/\psi \rightarrow e^{+}e^{-}\) decays [40]. For photons, the corresponding efficiencies are measured using samples of \(\textrm{Z}\,\rightarrow \ell ^{+}\ell ^{-}\gamma \) decays, and an inclusive photon sample collected using single-photon triggers. The energy scale and resolution for electrons and photons, and their uncertainties, are obtained from a sample of \(\textrm{Z}\,\rightarrow e^{+}e^{-}\) events. For muons, the efficiencies, the momentum scale and resolution, and their uncertainties, are obtained using samples of \(\textrm{Z}\,\rightarrow \mu ^{+}\mu ^{-}\) or \(J/\psi \rightarrow \mu ^{+}\mu ^{-}\) decays [34].

The uncertainty due to the pile-up reweighting procedure discussed in Sect. 3 is estimated by varying the amount of pile-up in the signal simulation to cover the uncertainty in the ratio of the predicted and measured inelastic cross-sections [41].

The statistical uncertainty due to the limited number of generated signal events is considered.

Various sources of theoretical uncertainty are considered. The uncertainty due to the choice of PDF is estimated from the standard deviation of the mean of 100 variations of the nominal PDF set (NNPDF3.0nnlo). The renormalisation and factorisation scales are each varied by factors of 0.5 and 2.0, except for shifts in opposite directions, and the envelope of the effects of these scale variations is taken as an estimate of the uncertainty due to missing higher order corrections. The assumed value of the strong coupling constant, \(\alpha _{\text {s}}\,(m_Z) = 0.118\), is varied by ± 0.001 and the average effect is taken as the \(\alpha _{\text {s}}\) contribution to the uncertainty. The effect of these theoretical uncertainties is accounted for in the integrated fiducial cross-section measurements and in the predicted cross-sections from the MC event generators. For the differential cross-section measurements, the theoretical uncertainties are determined to be negligible with respect to the total uncertainty.

The systematic uncertainties in the integrated cross-section in the fiducial region are summarised in Table 3. The measurement in each channel is dominated by the data statistical uncertainty, and the largest systematic uncertainty comes from the \(j\rightarrow \gamma \) background estimation.

Table 3 Relative systematic uncertainties in the integrated fiducial cross-section measurements in each channel using 139 fb\(^{-1}\) of Run 2 data. Systematic uncertainty sources that contribute less than 0.1% are not shown. Systematic sources labelled with an asterisk are treated as correlated between the two channels

7 Cross-section determination

7.1 Fiducial volume definition

The measured yields for the signal process in data are corrected to a fiducial volume which accounts for detector inefficiency, geometry and resolution. The fiducial volume is defined using particle-level objects in simulation which have a proper decay length longer than 10 mm. To correct for bremsstrahlung, each particle-level lepton is ‘dressed’ by vectorially adding to its four-momentum the four-momenta of any nearby photons, except those from hadron decays, within a cone of size \(\Delta R=0.1\) around the lepton.

To minimise the model-dependence of the procedure to correct the data from detector level to particle level, also known as unfolding, the selection requirements placed on the particle-level objects are as close as possible to the detector-level selection outlined in Sect. 4. An exception is the calorimeter transition region, which is included in the selection of particle-level electrons and photons. The detector-to-fiducial-level correction procedure includes an extrapolation, over a few percent of the total phase space, which accounts for the loss of detector-level acceptance in this region. The particle-level photons must pass an isolation selection which requires \(E_{\text {T}}^{\text {iso}}\), defined as the summed transverse energy of all particles except muons, neutrinos and the photon itself within a cone of size \(\Delta R=0.2\) around the photon, to be less than 7% of the photon \(p_{\text {T}}\) . This value is chosen as it best replicates the performance of the Loose isolation working point used in the detector-level selection. The complete set of requirements is listed in Table 4.

Table 4 Definition of the \(\textrm{Z}\,(\rightarrow \ell \ell )\gamma \gamma \) fiducial phase space

7.2 Cross-section extraction

The integrated fiducial cross-section, \(\sigma _{\textrm{fid}}\), is calculated from the observed yield in data \((N_{\textrm{data}}),\) the expected background yield \((N_{\textrm{bkg}})\) and the total integrated luminosity (L)

$$\begin{aligned} \sigma _{\textrm{fid}} = \frac{N_{\textrm{data}}-N_{\textrm{bkg}}}{C\times L}. \end{aligned}$$

The correction factor (C) is defined as the ratio of the number of signal simulation events passing the detector-level selection to the number which pass the fiducial-level selection. The value of C is 0.286 ± 0.014 in the electron channel and 0.379 ± 0.017 in the muon channel where the uncertainties include the systematic sources discussed in Sect. 6. The dominant contributions come from the photon identification efficiency and pile-up reweighting systematic uncertainties.

For the differential cross-section measurements, the detector-to-fiducial-level correction is instead done via an iterative Bayesian unfolding procedure [42] which accounts for bin migrations. The procedure takes as input the background-subtracted data distributions and a response matrix produced from the nominal signal simulation. The bin migrations are typically below 5% but can be as large as 18% in the regions with the finest binning. After accounting for migrations the unfolded yields are corrected for out-of-fiducial events which corresponds to a reduction of 5–10% per bin. Two iterations are used which is chosen as the central values do not change significantly, less than 3%, with a larger number of iterations. The statistical uncertainties related to both the finite number of events in data and in the signal simulation are assessed using Poisson distributed ‘toys’. The systematic uncertainties related to the signal selection efficiency are propagated through the unfolding by constructing new response matrices for the upwards and downwards shifts of one standard deviation and taking the average effect as the uncertainty on the unfolded yield. The background systematic uncertainties are assessed by varying each background expectation up and down by one standard deviation and taking the average effect as the uncertainty on the unfolded yield. The reliability of the unfolding procedure is tested by unfolding the detector-level signal distribution from simulation, reweighted such that it better describes the data. The difference between the resulting unfolded distribution and the reweighted fiducial distribution is assigned as an uncertainty of the differential measurements. The uncertainty is negligible in most bins, up to 8% in one bin, but overall has a very small effect on the measurements.

The bins used for the differential distributions were chosen such that there are sufficient events in each bin to perform the unfolding. The uppermost bin edges were chosen using the expected distributions from the signal simulation to exclude regions which are not sensitive to \(\textrm{Z}\,\gamma \gamma \) production.

The integrated and differential cross-section measurements are performed separately in each channel. The results are combined into \(\textrm{Z}\,(\rightarrow \ell \ell )\gamma \gamma \) measurements via an averaging procedure which accounts for statistical and systematic uncertainties, and their correlations between the two channels. The technique uses a \(\chi ^{2}\) minimisation procedure which is documented in Ref. [43].

7.3 Results

The measured integrated cross-sections in each channel and the combined average are

$$\begin{aligned} \sigma _{\textrm{fid}}^{\textrm{Z}\,(\rightarrow ee)\gamma \gamma \,}{} & {} = 2.65 \pm 0.31 ({\textrm{stat}}) \pm 0.24 ({\textrm{syst}})\\{} & {} \quad \pm 0.05 ({\textrm{lumi}})\,\,{\textrm{fb}},\\ \sigma _{\textrm{fid}}^{\textrm{Z}\,(\rightarrow \mu \mu )\gamma \gamma \,}{} & {} = 2.29 \pm 0.25 ({\textrm{stat}}) \pm 0.21 ({\textrm{syst}})\\{} & {} \quad \pm 0.04 ({\textrm{lumi}})\,\,{\textrm{fb}},\\ \sigma _{\textrm{fid}}^{\textrm{Z}\,(\rightarrow \ell \ell )\gamma \gamma \,}{} & {} = 2.45 \pm 0.20 ({\textrm{stat}}) \pm 0.22 ({\textrm{syst}})\\{} & {} \quad \pm 0.04 ({\textrm{lumi}})\,\,{\textrm{fb}}. \end{aligned}$$

The integrated \(\textrm{Z}\,(\rightarrow \ell \ell )\gamma \gamma \) cross-section is measured with an overall precision of 12% and is compared with the MC event generator predictions in Fig. 4, where good agreement between data and both predictions is seen. The SHERPA prediction suffers from larger scale uncertainties due to the matrix element NLO accuracy being at the 0-jet level, whereas it includes 1-jet contributions in the MadGraph5_aMC@NLO prediction.

Fig. 4
figure 4

The \(\textrm{Z}\,(\rightarrow \ell \ell )\gamma \gamma \) integrated cross-section, measured in a fiducial region corresponding to the production of three on-shell electroweak bosons. The measurement is compared with both the signal event generator predictions

The measured \(\textrm{Z}\,(\rightarrow \ell \ell )\gamma \gamma \) differential cross-sections are compared with the predictions from SHERPA and MadGraph5_aMC@NLO in Figs. 5, 6 and 7.

The photon transverse energy (\(E_{\text {T}}\,^{\gamma 1}\) , \(E_{\text {T}}\,^{\gamma 2}\) ) distributions displayed in Fig. 5 are well described by the predictions. The \(p_{\text {T}}\,^{\ell \ell }\) distribution in Fig. 6a describes the transverse momentum of the \(\textrm{Z}\,\) boson, which typically recoils against the two photons. This distribution is therefore sculpted by the minimum transverse momentum selections imposed on the two photons, which results in the peak around 40 \(\text {GeV}\) . The \(p_{\text {T}}\,^{\ell \ell \gamma \gamma }\) distribution in Fig. 6b probes the QCD modelling of the transverse momentum of the \(\textrm{Z}\,\gamma \gamma \) system. The description by the MC event generator predictions is good across all measurement bins except for a discrepancy in the high-\(p_{\text {T}}\,^{\ell \ell \gamma \gamma }\) region. The \(m_{\gamma \gamma }\) distribution is important in the context of diphoton resonance searches in \(\ell \ell \gamma \gamma \) channels. The measured distribution is shown in Fig. 7a and the simulations provide a good description, particularly in the fourth bin, which is most relevant for \(Z(\rightarrow \ell \ell )H(\rightarrow \gamma \gamma )\) measurements. The \(m_{\ell \ell \gamma \gamma }\) distribution (Fig. 7b) provides a measure of the hard scale of the system, and is described well by the predictions, even for \(m_{\ell \ell \gamma \gamma }\,>500\) \(\text {GeV}\) .

Fig. 5
figure 5

The unfolded differential cross-sections as a function of a the leading photon transverse energy and b the subleading photon transverse energy are compared with NLO predictions from SHERPA and MadGraph5_aMC@NLO. The black uncertainty bar on the data represents the statistical uncertainty, whereas the slightly taller grey uncertainty band represents the total uncertainty. The uncertainty in the predictions includes both the statistical and theoretical uncertainties. The lower panels show the ratios of the predictions to data, as well as the fractional uncertainty of the data

Fig. 6
figure 6

The unfolded differential cross-sections as a function of a the dilepton transverse momentum and b the four-body transverse momentum are compared with NLO predictions from SHERPA and MadGraph5_aMC@NLO. The black uncertainty bar on the data represents the statistical uncertainty, whereas the slightly taller grey uncertainty band represents the total uncertainty. The uncertainty in the predictions includes both the statistical and theoretical uncertainties. The lower panels show the ratios of the predictions to data, as well as the fractional uncertainty of the data

Fig. 7
figure 7

The unfolded differential cross-sections as a function of a the diphoton invariant mass and b the four-body invariant mass are compared with NLO predictions from SHERPA and MadGraph5_aMC@NLO. The black uncertainty bar on the data represents the statistical uncertainty, whereas the slightly taller grey uncertainty band represents the total uncertainty. The uncertainty in the predictions includes both the statistical and theoretical uncertainties. The lower panels show the ratios of the predictions to data, as well as the fractional uncertainty of the data

8 EFT interpretation

The \(\ell \ell \gamma \gamma \) final state can probe the non-Abelian structure of the \({\textrm{SU}}(2)_{{\textrm{L}}} \times {\textrm{U}}(1)_{{\textrm{Y}}}\) symmetry of the Standard Model, which gives rise to gauge boson self-interactions. Modifications of the self-interactions arising through new physics (NP) are investigated using an effective field theory approach [44]. The SM Lagrangian \({\mathcal {L}}_{\textrm{SM}}\) is expanded with operators of dimension \(d > 4\), which are suppressed by the energy scale \(\Lambda \) of NP:

$$\begin{aligned} {\mathcal {L}}_{\textrm{EFT}} = {\mathcal {L}}_{\textrm{SM}} + \sum _{d>4}\sum _{i}\frac{f^{d}_{i}}{\Lambda ^{d-4}}{\mathcal {O}}^{d}_{i}. \end{aligned}$$

The dimensionless Wilson coefficient of operator \({\mathcal {O}}^{d}_{i}\), where i runs over all operators of dimension d, is given by \(f^{d}_{i}\). The production of \(\ell \ell \gamma \gamma \) events is altered by modifying the SM coupling between four gauge bosons. These so-called ‘anomalous quartic gauge couplings’ introduce new contributions via the SM-forbidden vertices between four neutral EW gauge bosons. The lowest-dimension operators which give rise to \(ZZ\gamma \gamma \), \(Z\gamma \gamma \gamma \), and \(\gamma \gamma \gamma \gamma \) interactions are of dimension 8.

Constraints are placed on the subset of dimension-8 operators from Ref. [45] which are constructed using only field strength tensors. They are typically referred to as transverse operators, in the following abbreviated by \({\mathcal {O}}^{8}_{T\!,j}\) with \(j \in \lbrace 0,1,2,5,6,7,8,9 \rbrace \), and can introduce the aforementioned interactions between neutral EW gauge bosons. The contributions were simulated in MadGraph5_aMC@NLO 2.8.1 at LO in QCD with the NNPDF3.0nlo PDF set, and Pythia 8.244 was used to perform parton showering. The full fiducial phase-space selection defined in Sect. 7.1 is applied. The transverse momentum of the dilepton system \(p_{\text {T}}\,^{\ell \ell }\) provides the highest sensitivity to NP effects in the fiducial volume. It is thus used to constrain the \({\mathcal {O}}^{8}_{T\!,j}\) coupling parameters defined by dividing the Wilson coefficients by the NP scale, \(f_{T\!,j}/\Lambda ^4\).

The measured differential cross-section is compared with the SHERPA 2.2.10 \(Z\gamma \gamma \) prediction and the EFT prediction of transverse operator \({\mathcal {O}}^{8}_{T\!,8}\) in Fig. 8. The SM \(Z\gamma \gamma \) contribution estimated by a MadGraph5_aMC@NLO simulation at LO is also shown and this can be compared with the NLO prediction to investigate the impact of NLO QCD corrections. A slightly softer \(p_{\text {T}}\,^{\ell \ell }\) spectrum is observed at LO. The predicted differential cross-section at LO is used in Sect. 8.1 in the estimation of NLO QCD corrections for the EFT prediction.

Limits are placed on the coupling parameters \(f_{T\!,j}/\Lambda ^4\) by constructing and scanning a profile likelihood ratio, taking as input the baseline SHERPA 2.2.10 \(Z\gamma \gamma \) production (expected limits) and Run 2 data (observed limits), the contributions of the transverse operators, and all sources of uncertainty. The limits are calculated for the combination of the electron and muon channels. In the fit, the likelihood function is represented by a multivariate Gaussian distribution, where theory uncertainties are modelled by additional Gaussian constraints. All experimental uncertainties are encoded in the covariance matrix accounting for \(p_{\text {T}}\,^{\ell \ell }\) bin correlations. A description of the experimental uncertainties is given in Sect. 6. The bin-to-bin correlation of the statistical uncertainty, which can be present after unfolding, is found to be negligible and is not considered further. The shift of each systematic uncertainty is applied consistently in the same direction for all \(p_{\text {T}}\,^{\ell \ell }\) bins; correlations between different sources of systematic uncertainty are not considered. Theory uncertainties, consisting of renormalisation and factorisation scale, PDF, and \(\alpha _{\text {s}}\,\) uncertainties, are included for SM \(Z\gamma \gamma \) production and all transverse operators. Gaussian constraints for the limited size of the SHERPA 2.2.10 \(Z\gamma \gamma \) sample and the EFT samples are also added. The largest experimental uncertainties stem from the limited size of the data sample and the estimation of the fake-photon contribution, reaching \(18\%\) and \(14\%\), respectively, in certain \(p_{\text {T}}\,^{\ell \ell }\) bins. The renormalisation and factorisation scales are the sources of the largest theory uncertainties, which reach values of \(23\%\) in the last \(p_{\text {T}}\,^{\ell \ell }\) bin. Limits at \(95\%\) confidence level are constructed from the profile likelihood ratio by applying Wilks’ theorem [46] and thus assuming that the test statistic is \(\chi ^{2}\)-distributed. The effect of one transverse operator at a time is studied while all other Wilson coefficients are set to zero.

8.1 Non-unitarised limits

The expected and observed limits are displayed in Fig. 9. Constraints arising from unitarity conservation are not considered. The observed limits are typically 11–12% less stringent than those expected. This is driven by the larger contribution of fake photons in data and the corresponding uncertainties in the fake-photon normalisation and shape.

Higher-order QCD corrections are not available for the EFT prediction. In order to study the impact of such corrections on the constraints that can be placed on couplings of dimension-8 operators, a test fit was performed assuming that the EFT scales similarly to the SM with respect to higher-order corrections. In this test, the MadGraph5_aMC@NLO 2.7.3 differential cross-section at NLO (see Table 1) was divided by the MadGraph5_aMC@NLO LO prediction displayed in Fig. 8 to obtain bin-wise correction factors. The parameter settings for the LO simulation were identical to those chosen for the generation of the EFT contributions, except that all Wilson coefficients were set to zero. The differential cross-sections predicted by \({\mathcal {O}}^{8}_{T\!,j}\) were then multiplied by the correction factors. The results of this study show that the expected and observed constraints on the coupling parameters \(f_{T\!,j}/\Lambda ^4\) are 13–15% more stringent. Although such higher-order QCD corrections result in a sizeable impact on the limits, the underlying assumption cannot be validated with the available theoretical calculations in the EFT formalism, therefore the nominal confidence intervals, as shown in Fig. 9, are calculated without the bin-wise correction factors.

The confidence intervals presented for the four transverse operators \({\mathcal {O}}^{8}_{T\!,1}\), \({\mathcal {O}}^{8}_{T\!,2}\), \({\mathcal {O}}^{8}_{T\!,6}\), \({\mathcal {O}}^{8}_{T\!,7}\) are the first published by the ATLAS experiment at a centre-of-mass energy of \(13~\text {TeV}\,\) and are up to two orders of magnitude more stringent than the limits extracted at \(8~\text {TeV}\,\). The non-unitarised limits derived in this analysis are less stringent than those published by CMS in their \(W^{\pm }\gamma \gamma \) and \(Z\gamma \gamma \) analysis at \(13~\text {TeV}\,\) [3], but differ by less than a factor of two. This difference is primarily driven by binning requirements on \(p_{\text {T}}\,^{\ell \ell }\) . The binning was optimised to have sufficient events in the unfolding procedure. This results in a reduced sensitivity to EFT effects, particularly in the final bin.

Fig. 8
figure 8

Comparison of the differential cross-section at particle level as a function of the dilepton transverse momentum between the observation in full Run 2 data, the NLO prediction from SHERPA, the LO prediction from MadGraph5_aMC@NLO, and the EFT prediction of one dimension-8 operator. The black uncertainty line on the data represents the statistical uncertainty, whereas the slightly taller grey uncertainty band represents the total uncertainty. The uncertainty in the SHERPA prediction includes both the statistical and theoretical uncertainties. The coupling parameter \(f_{T\!,8}/\Lambda ^4\) for the NP contribution of transverse operator \({\mathcal {O}}^{8}_{T\!,8}\) is set to \(1/{\textrm{TeV}}^4\). The NP contribution contains interference effects between the SM \(\textrm{Z}\,\gamma \gamma \) production and the contribution of transverse operator \({\mathcal {O}}^{8}_{T\!,8}\)

Fig. 9
figure 9

Expected and observed \(95\%\) confidence intervals for the coupling parameters \(f_{T\!,j}/\Lambda ^4\) of transverse dimension-8 operators \({\mathcal {O}}^{8}_{T\!,j}\). All parameter values outside of the shown range are excluded at the chosen confidence level. No unitarity constraints are applied. The numeric values of the observed confidence intervals are displayed at the right border of the figure

8.2 Unitarisation treatment

Limits for all transverse operators are also derived as functions of an energy scale cut-off which prevents the violation of unitarity at large energy scales. Various techniques make use of a truncation of the EFT contribution to restore unitarity. This analysis uses a method which is typically referred to as clipping [47]. Any EFT contribution is suppressed above an energy scale \(E_{\text {c}}\). The \(\ell \ell \gamma \gamma \) invariant mass is used to select various thresholds for the clipping energy. Technically, this is achieved by scanning the simulated events at parton level before the parton showering is performed and suppressing any EFT contribution of events in which \(m_{\ell \ell \gamma \gamma } > E_{\text {c}}\). The SM \(Z\gamma \gamma \) contribution is not truncated and is allowed to reach arbitrary energy scales. The evolution of the expected and observed confidence intervals as a function of \(E_{\text {c}}\) for the coupling parameter of transverse operator \({\mathcal {O}}^{8}_{T\!,8}\) is shown in Fig. 10. A similar evolution is observed for the remaining dimension-8 operators, where the confidence intervals become 4–5 more stringent between \(E_{\text {c}} = 1.1~\text {TeV}\,\) and \(E_{\text {c}} = \infty \).

Fig. 10
figure 10

Expected and observed unitarised 95% confidence intervals for the coupling parameter \(f_{T\!,8}/\Lambda ^4\) in the clipping energy range between 1.1 and \(5~\text {TeV}\,\). The non-unitarised limits \((E_{\text {c}}=\infty )\) are also displayed. All parameter values above the upper or below the lower observed and expected limits are excluded at the chosen confidence level

9 Conclusions

The production of a \(\textrm{Z}\,\) boson in association with two photons in a phase-space region dominated by the ISR production of photons is studied in proton–proton collisions. The measurements are performed using 139 fb\(^{-1}\) of 13 \(\text {TeV}\)  proton–proton collision data recorded by the ATLAS detector at the LHC. The electron and muon decay channels of the \(\textrm{Z}\,\) boson are used, and events where either photon is radiated from one of the final-state leptons are rejected.

The integrated \(\textrm{Z}\,(\rightarrow \ell ^{+}\ell ^{-})\gamma \gamma \) cross-section is measured with a precision of 12%, with approximately equal contributions from statistical and systematic uncertainties. The cross-section is also measured differentially for the first time, and is used to test Standard Model predictions at up to next-to-leading-order accuracy from SHERPA and MadGraph5_aMC@NLO. The description by the MC event generator predictions is good across all measurement bins.

The measurements are also used to set limits on the Wilson coefficients, divided by the new physics scale \(\Lambda \), of dimension-8 EFT operators. The constraints on four of the eight operators under consideration are tightened by up to two orders of magnitude with respect to previous ATLAS analyses using 8 \(\text {TeV}\)  data.