Study of the hard double-parton scattering contribution to inclusive four-lepton production in $pp$ collisions at $\sqrt{s}$ = 8 TeV with the ATLAS detector

The inclusive production of four isolated charged leptons in $pp$ collisions is analysed for the presence of hard double-parton scattering, using 20.2 fb$^{-1}$ of data recorded in the ATLAS detector at the LHC at centre-of-mass energy $\sqrt{s}$ = 8 TeV. In the four-lepton invariant-mass range of $80<m_{4\ell}<1000$ GeV, an artificial neural network is used to enhance the separation between single- and double-parton scattering based on the kinematics of the four leptons in the final state. An upper limit on the fraction of events originating from double-parton scattering is determined at 95% confidence level to be $f_{\mathrm{DPS}} = 0.042$, which results in an estimated lower limit on the effective cross section at 95% confidence level of $1.0$ mb.


Introduction
The parton-parton scattering at the origin of hard processes in $pp$ interactions is accompanied by proton-remnant fragments that contribute to the hadronic final state through the so-called underlying event. As first pointed out by Sjöstrand and van Zijl [1], one source of the underlying-event activity, particularly in the high-energy regime of the LHC, is multi-parton interactions (MPI): interactions of pairs of partons from the interacting protons which occur simultaneously with the hard process. In high-energy $pp$ interactions, where the density of low-$x$ partons is high, there is enough energy to produce hard multi-parton interactions. The simplest example is hard double-parton scattering (DPS), where two partons from each proton interact with each other leading to perturbative final states.
The interest in studying DPS is twofold. Firstly, the probability of occurrence of DPS and the potential correlations between the products of these two perturbative interactions provide valuable information about the dynamics of the partonic structure of the proton (see Ref. [2] and references therein). Secondly, DPS processes may also constitute a background to reactions proceeding through single-parton scattering (SPS). An example is the production of four charged leptons in the final state, addressed in this Letter. This reaction is dominated by the SPS production of two $Z^{(*)}$ bosons, followed by subsequent leptonic decays. The $Z^{(*)}$ notation indicates the production of on- or off-shell $Z$ bosons ($Z$ and $Z^{*}$), or the production of off-shell photons ($\gamma^{*}$). However, the four leptons could also be produced as the result of two Drell-Yan processes occurring simultaneously, potentially distorting the measurements of prompt-lepton production.
For a process $pp \to A + B + X$, the expected DPS cross section for producing states $A$ and $B$ in two independent scatterings, $\sigma^{AB}_{\mathrm{DPS}}$, may be estimated from the following formula [3][4][5] (see also Ref. [6] for a detailed derivation):

$$\sigma^{AB}_{\mathrm{DPS}} = \frac{k}{2}\,\frac{\sigma^{A}_{\mathrm{SPS}}\,\sigma^{B}_{\mathrm{SPS}}}{\sigma_{\mathrm{eff}}}, \qquad (1)$$

where $\sigma^{A(B)}_{\mathrm{SPS}}$ denotes the production cross section of state $A$($B$) in a single-parton scattering, the symmetry factor $k$ depends on whether the two scatterings lead to the same final state ($A = B$, $k = 1$) or different final states ($A \neq B$, $k = 2$), and $\sigma_{\mathrm{eff}}$ represents the effective transverse overlap area containing the interacting partons.
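As a numerical illustration of Eq. (1), the following minimal Python sketch evaluates the DPS cross section from two SPS cross sections and the effective cross section. The function name and the input values are hypothetical and serve only to show the role of the symmetry factor $k/2$; all cross sections are assumed to be expressed in the same unit.

```python
def dps_cross_section(sigma_a, sigma_b, sigma_eff, same_final_state):
    """Pocket formula: sigma_DPS = (k/2) * sigma_A * sigma_B / sigma_eff."""
    k = 1 if same_final_state else 2  # k = 1 for A = B, k = 2 for A != B
    return 0.5 * k * sigma_a * sigma_b / sigma_eff

# Hypothetical inputs, all in the same unit (for example fb)
different = dps_cross_section(2.0, 3.0, 6.0, same_final_state=False)
identical = dps_cross_section(2.0, 2.0, 4.0, same_final_state=True)
```

For identical final states the factor $k/2 = 1/2$ avoids counting the same pair of scatterings twice.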
Since double Drell-Yan production is driven by quark-antiquark annihilation, while most of the previously explored DPS processes are driven by gluon-gluon scattering, and the final state of four charged leptons constitutes the golden channel for the studies of Higgs boson properties, $H \to Z^{(*)}Z^{(*)} \to 4\ell$, a study of a possible DPS contribution to the production of four isolated charged leptons at $\sqrt{s}$ = 8 TeV is warranted. The analysis presented in this Letter closely follows a previous analysis of this final state [33], but extends it to consider DPS.

ATLAS detector
The ATLAS detector [37] is a multipurpose particle detector with a forward-backward symmetric cylindrical geometry and nearly full coverage in solid angle.$^{1}$ It consists of an inner tracking detector (ID) system surrounded by a superconducting solenoid, electromagnetic and hadronic calorimeters, and a muon spectrometer (MS) incorporating superconducting toroid magnets. During Run 1 of the LHC the ID consisted of a pixel detector closest to the beam-pipe, followed by a silicon strip detector and a transition radiation tracker. This ID system, operating in a 2 T axial magnetic field, provides the tracking of charged particles within the pseudorapidity range |η| < 2.5. The calorimeter system, which covers the range |η| < 4.9, includes in the barrel region a high-granularity lead/liquid-argon (LAr) barrel electromagnetic (EM) calorimeter (|η| < 1.5) and a steel/scintillator-tile hadronic calorimeter (|η| < 1.7). In the endcap (1.5 < |η| < 3.2) and forward (3.2 < |η| < 4.9) regions, the EM calorimeter and the hadronic calorimeter are made of LAr active layers with either copper or tungsten as the absorber material. The muon spectrometer constitutes the outermost detector and includes fast trigger chambers covering the region |η| < 2.4 and high-precision tracking chambers covering |η| < 2.7. A three-level trigger system [38] was used to select events to be recorded.

$^{1}$ ATLAS uses a right-handed coordinate system with its origin at the nominal interaction point (IP) in the centre of the detector and the $z$-axis along the beam pipe. The $x$-axis points from the IP to the centre of the LHC ring, and the $y$-axis points upwards. Cylindrical coordinates $(r, \phi)$ are used in the transverse plane, $\phi$ being the azimuthal angle around the $z$-axis. The pseudorapidity is defined in terms of the polar angle $\theta$ as $\eta = -\ln\tan(\theta/2)$. Angular distance is measured in units of $\Delta R \equiv \sqrt{(\Delta\eta)^2 + (\Delta\phi)^2}$.
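The coordinate definitions above can be sketched in a few lines of Python. The function names are illustrative and not part of any ATLAS software. The wrapping of the azimuthal difference into $[-\pi, \pi)$ is a common convention that is assumed here.

```python
import math

def pseudorapidity(theta):
    """eta = -ln tan(theta/2), with theta the polar angle in radians."""
    return -math.log(math.tan(theta / 2.0))

def delta_phi(phi1, phi2):
    """Azimuthal difference wrapped into [-pi, pi) (assumed convention)."""
    return (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi

def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance: sqrt(delta_eta^2 + delta_phi^2)."""
    return math.hypot(eta1 - eta2, delta_phi(phi1, phi2))
```

A particle emitted perpendicular to the beam axis ($\theta = \pi/2$) has $\eta = 0$.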

Monte Carlo event samples
In SPS events, the four-lepton events correspond to the production and subsequent decay of resonant $Z$ or Higgs bosons, or to the production of the continuum $Z^{(*)}Z^{(*)}$ system. In the case of DPS, the four leptons are decay products of two $Z^{(*)}$ bosons that are produced in two distinct parton-parton scatterings within the same $pp$ interaction.
The Monte Carlo samples are unchanged with respect to Ref. [33]. The SPS $q\bar{q} \to 4\ell$ process was simulated with the Powheg-Box (revision 2330) [34][35][36] Monte Carlo (MC) program, which is based on perturbative QCD calculations at NLO. The four-lepton production through the $qg$ initial state is included as part of the NLO contributions to the $q\bar{q}$ process. The parton distribution functions (PDFs) of the CT10NLO [39] set were used. The $gg \to 4\ell$ events corresponding to the continuum $Z^{(*)}Z^{(*)}$ production were generated with MCFM 6.1 [40] at leading order (LO) in QCD, using the CT10NNLO [41] set of PDFs, and the cross sections were corrected for higher-order effects using the ratio of NLO to LO cross sections (the so-called K-factors) [42]. The on-shell Higgs boson production was simulated with Powheg-Box at NLO QCD, using the CT10NLO PDFs, in the case of gluon-gluon fusion and vector-boson fusion, and with LO Pythia 8 [43] in the case of vector-boson associated production ($VH$) and top-pair associated production ($t\bar{t}H$). The event yield of on-shell Higgs bosons was normalised to the higher-order corrected cross section [44]. The events with off-shell Higgs boson production were simulated with the LO MadGraph 5.1.5.12 [45] generator via vector-boson fusion and vector-boson scattering processes, including their interference. For the LO Pythia 8 and MadGraph generators, the LO version of CTEQ6L1 PDFs [46] was used.
The MC generators listed above were interfaced to Pythia 8 for parton showering, except MadGraph, which was interfaced to Pythia 6 [47]. The underlying-event parameter values belong to the AU2 [48] tune.
The DPS events that contribute to the $4\ell$ production were simulated with Pythia 8.175 using the LO version of CTEQ6L1 PDFs.
The production of $Z$ + jets events, including the light- and heavy-flavour contributions, was simulated with Alpgen 2.1.4 [49], using the Perugia2011C [50] tune. The $Z\gamma$ production was modelled with Sherpa 1.4.5 [51]. Background $t\bar{t}$ events were generated with Powheg-Box using the Perugia2011C tune. The $ZH$ events, with subsequent decays $Z \to \ell\ell$ and $H \to VV^{*}$ (with two leptons and two neutrinos or two leptons and two jets in the final state), were generated with Pythia 8, using the AU2 tune. The $ZW$ and $tZ$ processes were simulated with Sherpa and MadGraph respectively, with the latter using the AUET2B tune [52]. The background contribution from $VVV$ and $t\bar{t}Z$ was modelled with MadGraph, using the AUET2B tune. The MC generators for background simulation used the LO version of the CTEQ6L1 PDF set, except Sherpa, which used the CT10 PDF set.
The largest contributions to the background, originating from $Z + b\bar{b}$ jets and $t\bar{t}$ production, were estimated in Ref. [33] from the respective MC samples normalised to the data in selected control regions. The remaining background contributions were directly extracted from the MC expectations.
Additional $pp$ interactions occurring in the same and neighbouring bunch crossings (pile-up) were also simulated, using the Pythia 8 MC generator, with the A2 [53] tune and MSTW 2008 LO [54] PDF set. The MC samples were reweighted to reproduce the distribution of the mean number of $pp$ interactions per bunch crossing observed in the data. The estimated number of events with two $Z^{(*)}$ bosons produced in the same bunch crossing with less than 1 cm separation along the beam axis is negligible compared to the DPS expectations.
Monte Carlo events were passed through the ATLAS detector simulation [55], which is based on the Geant 4 [56] framework, and which includes simulation of the trigger selection. The MC events were reconstructed and selected offline using the same software and selections as for the data.
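The pile-up reweighting mentioned above can be illustrated with a minimal Python sketch. It assumes that the mean number of interactions per bunch crossing is available as binned histograms for data and MC with identical binning. The function name and the histogram contents are hypothetical.

```python
def pileup_weights(data_hist, mc_hist):
    """Per-bin weights mapping the MC pile-up distribution onto the data one."""
    data_total = float(sum(data_hist))
    mc_total = float(sum(mc_hist))
    weights = []
    for n_data, n_mc in zip(data_hist, mc_hist):
        if n_mc == 0:
            weights.append(0.0)  # no MC events available to reweight in this bin
        else:
            # ratio of the normalised data and MC distributions
            weights.append((n_data / data_total) / (n_mc / mc_total))
    return weights

# Hypothetical histograms of the mean number of interactions per crossing
data_mu = [10, 30, 40, 20]
mc_mu = [25, 25, 25, 25]
weights = pileup_weights(data_mu, mc_mu)
```

Each MC event then receives the weight of the bin it falls into, so that the weighted MC distribution has the same shape as the data.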

Event selection
The dataset and the event selection are unchanged with respect to Ref. [33]. The updated luminosity of the analysed sample is 20.2 fb$^{-1}$. The events were selected online using single-lepton or dilepton triggers. The single-lepton trigger required the transverse energy of the electron candidate or the transverse momentum of the muon candidate to be above 24 GeV. The dielectron trigger had the same threshold of 12 GeV for both electron candidates. The dimuon trigger required either two muons with transverse momentum above 13 GeV or one above 18 GeV and the other above 8 GeV. An electron-muon trigger was also used with thresholds at 12 GeV for electrons and 8 GeV for muons.
The final sample consists of events with at least four leptons, where each lepton is either an electron or a muon. The four leptons are required to form two same-flavour (electrons or muons) opposite-charge (SFOC) lepton pairs. The pair with the invariant mass closer to the mass of the $Z$ boson is called the leading pair, and the other pair is the sub-leading one. The invariant mass of the leading pair is restricted to the range $50 < m_{\mathrm{leading}} < 120$ GeV, while for the sub-leading pair the mass requirement is $12 < m_{\mathrm{sub\text{-}leading}} < 120$ GeV. A $J/\psi$ veto is applied such that for any SFOC lepton combination the invariant mass of the dilepton, $m_{2\ell}$, must be greater than 5 GeV. Only events with the four-lepton invariant mass in the range $80 < m_{4\ell} < 1000$ GeV are selected. The transverse momentum of dileptons, $p_{\mathrm{T}}^{\ell^+\ell^-}$, is required to be above 2 GeV. Selected leptons, ordered in descending order of transverse momentum, are required to have transverse momenta $p_{\mathrm{T}}$ above 20, 15, 10 (8 if muon), and 7 (6 if muon) GeV. The leptons are selected within the pseudorapidity range |η| < 2.5 in the case of electrons and |η| < 2.7 in the case of muons. In order to have well-measured leptons, a lepton separation requirement is imposed, such that the distance between any two leptons in the η-φ space, $\Delta R$, is required to fulfil the condition $\Delta R > 0.1$ (0.2) for same-flavour (different-flavour) leptons. Each event is required to have the triggering lepton(s) matched to one or two of the selected leptons.
The data sample, after all selections, contains 476 events. The resulting data and MC distributions of the four-lepton invariant mass are shown in Figure 1. For completeness, the figure also includes the DPS contribution of 0.4 events predicted by the Pythia 8.175 simulation.
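Two of the requirements above, the ordered transverse-momentum thresholds and the labelling of the leading and sub-leading pairs, can be sketched as follows. This is a simplified illustration and not the analysis code: the function names are hypothetical, the nominal $Z$ boson mass value is an assumption, and the remaining requirements (pseudorapidity, $\Delta R$, $J/\psi$ veto, trigger matching) are omitted.

```python
Z_MASS = 91.1876  # GeV, assumed nominal Z boson mass

# (electron threshold, muon threshold) in GeV for the pT-ordered leptons
PT_THRESHOLDS = [(20.0, 20.0), (15.0, 15.0), (10.0, 8.0), (7.0, 6.0)]

def passes_pt_thresholds(leptons):
    """leptons: list of four (pt_in_GeV, is_muon) tuples."""
    ordered = sorted(leptons, key=lambda lep: lep[0], reverse=True)
    for (pt, is_muon), (el_cut, mu_cut) in zip(ordered, PT_THRESHOLDS):
        if pt <= (mu_cut if is_muon else el_cut):
            return False
    return True

def classify_pairs(m_pair_a, m_pair_b):
    """Return (m_leading, m_subleading, passes_mass_windows), masses in GeV."""
    # The leading pair is the one with invariant mass closer to the Z mass
    if abs(m_pair_a - Z_MASS) <= abs(m_pair_b - Z_MASS):
        leading, subleading = m_pair_a, m_pair_b
    else:
        leading, subleading = m_pair_b, m_pair_a
    ok = 50.0 < leading < 120.0 and 12.0 < subleading < 120.0
    return leading, subleading, ok
```

For example, pair masses of 30 GeV and 90 GeV give a leading pair at 90 GeV and a sub-leading pair at 30 GeV, both inside their mass windows.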

DPS signal extraction
The assumption that in DPS the two scatters are distinct implies that, in the DPS four-lepton final states, the two leptons of each dilepton will tend to be balanced in $p_{\mathrm{T}}$ and therefore back-to-back in the azimuthal angle $\phi$, due to the dominance of low-$p_{\mathrm{T}}$ $Z^{(*)}$ production. In the SPS case, the leading and sub-leading pairs are expected to balance each other in $p_{\mathrm{T}}$.
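Based on the experience gained in the study of four-jet final states [57], in order to distinguish between DPS events and SPS events, the distributions of the following kinematic variables of the four leptons are considered:

$$\Delta p_{\mathrm{T},ij} = \frac{|\vec{p}_{\mathrm{T},i} + \vec{p}_{\mathrm{T},j}|}{|\vec{p}_{\mathrm{T},i}| + |\vec{p}_{\mathrm{T},j}|}, \quad \Delta\phi_{ij} = |\phi_i - \phi_j|, \quad \Delta y_{ij} = |y_i - y_j|, \quad \Delta_{ijkl} = |\phi_{i+j} - \phi_{k+l}|, \qquad i, j, k, l = 1, 2, 3, 4;\; i \neq j \neq k \neq l. \qquad (2)$$

Here, $\vec{p}_{\mathrm{T},i}$ is the transverse momentum component of the $i$-th lepton ($i$ = 1, 2, 3, 4), and $\phi_i$ and $y_i$ are the azimuthal angle and the rapidity of the $i$-th lepton, respectively. The angle $\phi_{i+j}$ is the azimuthal angle of the momentum vector composed by the sum of momenta of leptons $i$ and $j$. Leptons 1 and 2 form the leading dilepton. The lepton ordering is chosen such that $p_{\mathrm{T},1} > p_{\mathrm{T},2}$ and $p_{\mathrm{T},3} > p_{\mathrm{T},4}$.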
The distributions of the variables $\Delta p_{\mathrm{T},12}$, $\Delta\phi_{13}$, $\Delta y_{13}$, and $\Delta_{1234}$ are presented in Figure 2(a)-(d). The distribution of $\Delta p_{\mathrm{T},12}$ peaks around 0.1 for simulated DPS events, while the simulated SPS events are more evenly distributed across the range [0,1]. This demonstrates that, as expected, two leptons coming from the same $Z$ candidate in DPS balance each other in $p_{\mathrm{T}}$, while in SPS the pairwise $p_{\mathrm{T}}$ balance is not dominant. This is again demonstrated in the $\Delta\phi_{13}$ distribution, where leptons 1 and 3 are decorrelated in $\Delta\phi$ for DPS, while for the SPS events these leading-$p_{\mathrm{T}}$ decay leptons tend to be back-to-back in $\phi$, because they originate from the two $Z$ bosons, which themselves are expected to be back-to-back in $\phi$. The $\Delta y_{13}$ distribution shows that leptons associated to different dileptons tend to be more separated in rapidity in DPS than in SPS. The back-to-back configuration of the two $Z$ candidates in the case of SPS, and their decorrelation in the case of DPS, is explicitly demonstrated in the distribution of the azimuthal angle between the two $Z$ candidates, $\Delta_{1234}$.
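The discriminating variables can be computed from the lepton kinematics as in the sketch below. The exact definitions are assumptions: the normalised form $|\vec{p}_{\mathrm{T},i} + \vec{p}_{\mathrm{T},j}| / (|\vec{p}_{\mathrm{T},i}| + |\vec{p}_{\mathrm{T},j}|)$ is chosen because it is bounded by [0, 1], as the quoted range suggests, and azimuthal differences are folded into $[0, \pi]$. All function and key names are hypothetical. Counting all distinct index combinations gives 6 + 6 + 6 + 3 = 21 quantities.

```python
import math
from itertools import combinations

def fold(dphi):
    """Fold an azimuthal difference into [0, pi]."""
    dphi = abs(dphi) % (2.0 * math.pi)
    return 2.0 * math.pi - dphi if dphi > math.pi else dphi

def pt_sum(a, b):
    """Vector sum of the transverse momenta of leptons a and b."""
    return (a[0] * math.cos(a[1]) + b[0] * math.cos(b[1]),
            a[0] * math.sin(a[1]) + b[0] * math.sin(b[1]))

def delta_pt(a, b):
    """Assumed form: |pT_a + pT_b| / (|pT_a| + |pT_b|), bounded by [0, 1]."""
    sx, sy = pt_sum(a, b)
    return math.hypot(sx, sy) / (a[0] + b[0])

def phi_sum(a, b):
    """Azimuthal angle of the summed transverse momentum of a and b."""
    sx, sy = pt_sum(a, b)
    return math.atan2(sy, sx)

def dps_variables(leptons):
    """leptons: four (pt, phi, y) tuples; leptons 1 and 2 form the leading pair."""
    out = {}
    for i, j in combinations(range(4), 2):
        a, b = leptons[i], leptons[j]
        tag = f"{i + 1}{j + 1}"
        out["dpt_" + tag] = delta_pt(a, b)
        out["dphi_" + tag] = fold(a[1] - b[1])
        out["dy_" + tag] = abs(a[2] - b[2])
    # the three ways of splitting four leptons into two pairs
    for (i, j), (k, l) in [((0, 1), (2, 3)), ((0, 2), (1, 3)), ((0, 3), (1, 2))]:
        out[f"d_{i + 1}{j + 1}{k + 1}{l + 1}"] = fold(
            phi_sum(leptons[i], leptons[j]) - phi_sum(leptons[k], leptons[l]))
    return out

# Hypothetical event: the leading pair is exactly balanced in pT
example = dps_variables([(40.0, 0.0, 0.1), (40.0, math.pi, -0.2),
                         (30.0, 1.0, 1.0), (25.0, -2.0, 0.5)])
```

In this example the two leptons of the leading pair are back-to-back with equal transverse momenta, so the pair-balance variable for leptons 1 and 2 vanishes, which is the DPS-like configuration described in the text.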

The difference between the topologies of SPS and DPS events is used to train an artificial neural network (ANN) to discriminate between the DPS and non-DPS classes, where the latter corresponds to SPS and background events.
The training is performed with the ANN available in the ROOT [58] implementation of a feed-forward multilayer perceptron. The Broyden-Fletcher-Goldfarb-Shanno supervised learning algorithm [59][60][61][62] is used in the training. The input layer contains 21 neurons, corresponding to the variables listed in Eq. (2), and the output layer consists of one neuron. As the result of optimising the convergence and the performance of the ANN, a configuration of 30 and 9 neurons is adopted for the first and second hidden layer, respectively. The output of the ANN, $\xi_{\mathrm{DPS}}$, is a number distributed between 0 and 1, which represents the likelihood for an event to belong to the DPS class.
The event weights are chosen such that during the training procedure the effective numbers of SPS $q\bar{q}$-initiated events, $gg$-initiated events and background $Z + b\bar{b}$ jets events are in the ratio 1 : 1 : 1. The SPS $gg$-initiated events tend to spill over into the DPS signal region, and a better separation between the SPS and DPS classes is achieved by increasing their weight in the minimisation of the error function. Similarly, the effective contribution of $Z + b\bar{b}$ jets events is increased for the ANN training to distinguish them better from the DPS ones, as the kinematics of the $Z + b\bar{b}$ jets background subprocess has features similar to DPS. The effective numbers of events for DPS and non-DPS events are equal. Each MC set is split randomly into two subsets having approximately the same number of events. One subset is used for the ANN training, while the other is used to validate the performance of the ANN and to determine the number of training epochs, so as to reach the best possible level of discrimination while preventing overtraining.
The trained ANN is applied to data events, and the resulting distribution of $\xi_{\mathrm{DPS}}$ is shown in Figure 3, together with the corresponding DPS, SPS and background MC distributions. The DPS MC events form a peak around $\xi_{\mathrm{DPS}} = 1$ and the SPS and background events form a peak at $\xi_{\mathrm{DPS}} = 0$, as expected. A similar peak at $\xi_{\mathrm{DPS}} = 0$ is observed in data events, with no indication of a substantial contribution of double-parton scattering at $\xi_{\mathrm{DPS}} = 1$.
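The weighting scheme described above can be sketched as follows. This is an illustration under simplifying assumptions: only the three non-DPS classes named in the text are considered, every event in a sample carries the same weight, and the overall normalisation is arbitrary. The function name and the event counts are hypothetical.

```python
def training_weights(n_qq, n_gg, n_zbb, n_dps):
    """Per-event weights giving effective non-DPS yields in the ratio 1 : 1 : 1
    and an effective DPS yield equal to the summed non-DPS yield."""
    target = float(n_qq)  # arbitrary common effective yield per non-DPS class
    weights = {
        "qq": target / n_qq,
        "gg": target / n_gg,
        "zbb": target / n_zbb,
    }
    # DPS class balanced against the sum of the three non-DPS classes
    weights["dps"] = 3.0 * target / n_dps
    return weights

# Hypothetical numbers of simulated events per sample
w = training_weights(n_qq=1000, n_gg=200, n_zbb=50, n_dps=500)
```

With these inputs each non-DPS class has an effective yield of 1000 events and the DPS class an effective yield of 3000 events.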
In order to quantify the level of the potential DPS contribution in the data, the variable $f_{\mathrm{DPS}}$ is introduced, defined as the ratio of the number of DPS events, $N_{\mathrm{DPS},4\ell}$, to the sum of the DPS and SPS ($N_{\mathrm{SPS},4\ell}$) events:

$$f_{\mathrm{DPS}} = \frac{N_{\mathrm{DPS},4\ell}}{N_{\mathrm{DPS},4\ell} + N_{\mathrm{SPS},4\ell}}. \qquad (3)$$

The MC template fit of the sum of the DPS, SPS and background contributions to the data yields $f_{\mathrm{DPS}} = -0.009 \pm 0.017$ with a $\chi^2$ per degree of freedom $\chi^2/\mathrm{dof} = 8.6/9$. Since the result is consistent with zero, an upper limit on $f_{\mathrm{DPS}}$ is extracted, as described in Section 7.1.
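The idea of a template fit for $f_{\mathrm{DPS}}$ can be illustrated with a simplified least-squares sketch. This is not the fit used in the analysis: it assumes unit-normalised DPS and SPS shapes, a fixed background, and a variance per bin approximated by the observed count. The function name and all template values are hypothetical. Because the model is linear in $f_{\mathrm{DPS}}$, the minimum has a closed form, and negative values are allowed.

```python
def fit_f_dps(data, dps_shape, sps_shape, background):
    """Least-squares estimate of f_DPS from binned templates.

    dps_shape and sps_shape are unit-normalised; background holds fixed
    expected yields per bin. Negative values of f are allowed.
    """
    n_signal = sum(data) - sum(background)  # DPS + SPS yield
    num = 0.0
    den = 0.0
    for d, dps, sps, bkg in zip(data, dps_shape, sps_shape, background):
        var = max(d, 1.0)            # Poisson variance approximated by the count
        a = bkg + n_signal * sps     # prediction for f = 0
        b = n_signal * (dps - sps)   # derivative of the prediction w.r.t. f
        num += b * (d - a) / var
        den += b * b / var
    return num / den

# Hypothetical templates in four bins of the ANN output
dps_shape = [0.05, 0.05, 0.10, 0.80]
sps_shape = [0.70, 0.20, 0.08, 0.02]
background = [20.0, 10.0, 5.0, 1.0]
# Pseudo-data built with a true fraction of 0.05 and 400 DPS + SPS events
data = [bkg + 400.0 * (0.05 * dps + 0.95 * sps)
        for dps, sps, bkg in zip(dps_shape, sps_shape, background)]
f_fit = fit_f_dps(data, dps_shape, sps_shape, background)
```

Since the pseudo-data are built without fluctuations, the fit returns the input fraction up to rounding.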
For the ANN performance to be robust and independent of the DPS model, it is best to have a DPS training sample with no inherent correlations between the initial partons or the final states. The DPS model in Pythia [63][64][65] used in the analysis contains some correlations between the initial-state partons, implied by conservation of flavour and by the proton momentum sum-rule, as well as correlations due to inherent primordial transverse momentum of the partons and interleaved initial-state radiation. These effects are expected to be weak in the phase space of the present analysis (low-momentum partons and large transverse momenta of the final-state leptons). No correlations are expected in the production of the Drell-Yan final states.
To test this assumption of a very weak correlation between two subscatterings in the Pythia DPS model, the MC training sample was compared with a sample of two randomly overlaid dilepton events, where any correlation is eliminated by construction. Such a sample was made by overlaying dilepton events selected in the data, with the selection driven by the four-lepton phase space. Each dilepton event was required to have two selected leptons forming an SFOC pair with transverse momenta $p_{\mathrm{T}}^{1,2} > 20, 15$ GeV to account for the trigger conditions under which the dilepton data were collected. The same single-lepton, double-electron and double-muon triggers were used as in the selection of the four-lepton sample. An event was rejected if there was a third lepton with $p_{\mathrm{T}} > 7$ GeV (6 GeV for muons). The pairs of events were chosen randomly and overlaid by adding the lepton four-vectors of one event to the other. The distance between the primary vertices along the $z$-axis for the two events was required to be smaller than 1 cm. After the overlay, the same four-lepton selection was applied as described in Section 4, but the trigger configuration of the available dilepton datasets required an increase in the lepton $p_{\mathrm{T}}$ thresholds. They were chosen to be 20, 20, 15, and 15 GeV for leptons ordered in descending order of $p_{\mathrm{T}}$. To have a valid comparison within the same phase space between the overlaid dileptons and the Pythia 8 sample, the same selection on lepton $p_{\mathrm{T}}$ was also applied to the latter. The distributions of discriminating variables were compared, as were the distributions of $\xi_{\mathrm{DPS}}$, obtained with the ANN trained on Pythia 8. Very good agreement between Pythia 8 and the overlaid data was observed, confirming the initial assumption of a very weak correlation between the two scatterings in the Pythia DPS model with no effect on the analysis.
The value of $f_{\mathrm{DPS}}$ is extracted using detector-level distributions. To test how well this result agrees with the parton-level value, $f^{\mathrm{parton}}_{\mathrm{DPS}}$, several pseudo-datasets were constructed by mixing DPS, SPS and background samples with a number of predefined parton-level values of $f^{\mathrm{parton}}_{\mathrm{DPS}}$ = 0.01, 0.03, 0.05, 0.1, and 0.3. The number of background events in all mixtures was the same as expected in the selected four-lepton data sample. The corresponding value of $f_{\mathrm{DPS}}$ at the detector level was then determined by fitting the detector-level distributions and compared with the input $f^{\mathrm{parton}}_{\mathrm{DPS}}$ value. It was found that the fitted value of $f_{\mathrm{DPS}}$ is systematically lower than $f^{\mathrm{parton}}_{\mathrm{DPS}}$ due to slightly different detector acceptances for DPS and SPS events. However, the two quantities agree within 2%.

Systematic uncertainties
The following sources of systematic uncertainty are considered:
• The experimental systematic uncertainty, which includes the uncertainties of the electron and muon energy scales, the uncertainty of the energy and momentum resolution, and of the trigger, reconstruction and identification efficiencies [66,67].
• The uncertainty due to the model choice for the SPS process, which is evaluated by considering the effect of the variation of the fractions of $q\bar{q}$- and $gg$-initiated subprocesses, which are modelled with different MC generators, as described in Section 3. For the determination of the range of variation, these fractions are fitted to the $m_{4\ell}$ distribution in the data, keeping the fraction of background events unchanged. The fraction values of $q\bar{q}$- and $gg$-initiated subprocesses were varied between the nominal values and the values obtained from the fit to the $m_{4\ell}$ distribution.
• The uncertainty in the background modelling, which is estimated by varying the contributions of various background subprocesses according to the uncertainty of their normalisations obtained in Ref. [33].
No uncertainty is assigned to the DPS model, since the kinematic distributions agree well between the Pythia 8 DPS model and the assumption of two independent interactions as represented by the overlaid dilepton data.
The combined effect of all systematic uncertainties, of which the variation of the $Z + b\bar{b}$ jets background is the dominant uncertainty, is about 20% of the statistical uncertainty on the fitted value of $f_{\mathrm{DPS}}$. The effect of systematic uncertainties is therefore neglected when setting the upper limit on $f_{\mathrm{DPS}}$.
The validity of neglecting the systematic uncertainties was also checked with pseudo-experiments: the contents of data bins were varied according to a Poisson distribution and those of MC profile histograms were varied according to the systematic uncertainty, sampling the variations according to a Gaussian distribution in the corresponding nuisance parameter, taking into account the correlation between the bins where appropriate. For each set of varied data and MC histograms, the fit of $f_{\mathrm{DPS}}$ was performed. The resulting distribution of $f_{\mathrm{DPS}}$ was compared with that obtained with systematic uncertainties neglected. The comparison showed no significant difference between the two distributions.

Upper limit on $f_{\mathrm{DPS}}$
The upper limit on $f_{\mathrm{DPS}}$ is determined using the distributions of the $\xi_{\mathrm{DPS}}$ variable in data, SPS, DPS, and background MC samples. The statistical method to interpret the data uses the test statistic for upper limits, $q_\mu$, based on the profile likelihood ratio as described in Ref. [68],

$$q_\mu = \begin{cases} -2\ln\lambda(\mu), & \hat{\mu} \le \mu, \\ 0, & \hat{\mu} > \mu. \end{cases} \qquad (4)$$

Here $\mu$ is the signal strength and $\lambda(\mu)$ is the profile likelihood ratio,

$$\lambda(\mu) = \frac{L(\mu, \hat{\hat{\theta}})}{L(\hat{\mu}, \hat{\theta})}, \qquad (5)$$

where $\theta$ is the number of non-DPS events and constitutes a nuisance parameter. The values $\hat{\mu}$ and $\hat{\theta}$ are maximum-likelihood estimators. The value of $\hat{\hat{\theta}}$ maximises $L$ for a given value of $\mu$. The parameter of interest, $\mu$, is defined to be equal to the $f_{\mathrm{DPS}}$ variable, $\mu = f_{\mathrm{DPS}}$. Thus $\mu = 0$ corresponds to no DPS contribution, while $\mu = 1$ means that the four-lepton sample consists exclusively of DPS events. The procedure is that the data distribution is fitted with the sum of background, SPS and DPS histograms using the maximum-likelihood method. The upper limit is extracted using the CL$_s$ method [69] from distributions of the test statistic for various hypothesised values of $\mu$. The test-statistic distribution is obtained from an ensemble of pseudo-experiments. The shape of the test-statistic distribution agrees with the asymptotic formulae of Ref. [68]. The value of the CL$_s$ upper limit on $f_{\mathrm{DPS}}$ found with this method at 95% confidence level (CL) is 0.042.

Lower limit on the effective cross section
The upper limit on $f_{\mathrm{DPS}}$ can be transformed into a lower limit on $\sigma_{\mathrm{eff}}$ by using Eq. (1). In order to perform this calculation, several inputs to the formula have to be determined.
The value of the symmetry factor $k/2$ in Eq. (1) is well defined for the case of 2e + 2µ or 2µ + 2e final states, $k/2 = 1$. For the 4e or 4µ final states, $k/2$ is well defined only in the case of completely overlapping ($k/2 = 1/2$) or fully exclusive ($k/2 = 1$) dilepton phase spaces. Therefore, the dilepton phase space is divided into 40 mutually exclusive regions. The boundaries of these regions are driven by the lepton-$p_{\mathrm{T}}$ thresholds and by the dilepton invariant-mass ranges for the leading and sub-leading lepton pairs. The product $\frac{k}{2}\sigma_A\sigma_B$ is determined by representing Eq. (1) as the sum over these phase-space regions. In order to determine the Drell-Yan cross section in each of the regions, the Powheg-Box MC simulation was used, based on NLO QCD calculations with the CT10 NLO set of PDFs. In the most populated region of $p_{\mathrm{T}} > 20$ GeV for each lepton and of $50 < m_{2\ell} < 120$ GeV, the calculated cross section is 0.55 nb for 2µ and 0.49 nb for 2e final states. A conservative uncertainty of ±15% is assigned to Drell-Yan cross sections. After summing the contributions from different dilepton phase-space regions, the result is $\frac{k}{2}\sigma_A\sigma_B = (13.9 \pm 0.1\,\text{(stat)} \pm 3.6\,\text{(syst)}) \cdot 10^{11}$ fb$^{2}$.
Here the systematic uncertainty is determined by propagating the assumed Drell-Yan cross-section uncertainty, assuming 100% correlation between various phase-space regions.
From the definition of $f_{\mathrm{DPS}}$, Eq. (1) may be rewritten to express $\sigma_{\mathrm{eff}}$ in terms of $f_{\mathrm{DPS}}$, and hence an approach similar to that used for the extraction of the upper limit on $f_{\mathrm{DPS}}$ can be applied to set the lower limit on $\sigma_{\mathrm{eff}}$. The lower limit on $\sigma_{\mathrm{eff}}$ at 95% CL is 1.0 mb, consistent with previously measured values of the effective cross section, as shown in Figure 4.
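The arithmetic behind turning an upper limit on the DPS fraction into a lower limit on the effective cross section can be sketched as below. The sketch assumes that the DPS cross section is bounded by $f_{\mathrm{DPS}}$ times an inclusive four-lepton cross section, which is a simplification of the procedure described in the text. The function name and the parameter `sigma_4l_fb` are hypothetical, and no value for that cross section is taken from this document.

```python
FB_PER_MB = 1.0e12  # 1 mb = 1e12 fb

def sigma_eff_lower_limit_mb(k_sigma_product_fb2, f_dps_upper, sigma_4l_fb):
    """Lower limit on sigma_eff in mb.

    k_sigma_product_fb2: (k/2) * sigma_A * sigma_B summed over regions, in fb^2
    f_dps_upper:         upper limit on the DPS fraction
    sigma_4l_fb:         assumed inclusive four-lepton cross section, in fb
    """
    # An upper limit on f_DPS bounds the DPS cross section from above
    sigma_dps_upper_fb = f_dps_upper * sigma_4l_fb
    # Eq. (1) inverted: a smaller sigma_DPS implies a larger sigma_eff
    return k_sigma_product_fb2 / sigma_dps_upper_fb / FB_PER_MB
```

The limit scales inversely with the upper limit on the DPS fraction, so a tighter bound on $f_{\mathrm{DPS}}$ translates directly into a stronger lower limit on $\sigma_{\mathrm{eff}}$.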

Summary
The production of four-lepton (electrons or muons) final states in $pp$ interactions at 8 TeV is analysed for the presence of double-parton scattering, using 20.2 fb$^{-1}$ of data recorded by the ATLAS experiment at the LHC. Leptons with transverse momentum above 20, 15, 10 (8 if muon), and 7 (6 if muon) GeV, sorted in descending order of $p_{\mathrm{T}}$, are selected in the pseudorapidity range |η| < 2.5 in the case of electrons and |η| < 2.7 in the case of muons. The four leptons form two same-flavour opposite-charge lepton pairs.
The dilepton invariant masses are required to be in the range $50 < m_{\mathrm{leading}} < 120$ GeV for the leading pair and $12 < m_{\mathrm{sub\text{-}leading}} < 120$ GeV for the sub-leading pair, where the leading pair is defined as the pair with invariant mass closer to the $Z$ boson mass. The transverse momentum $p_{\mathrm{T}}^{\ell^+\ell^-}$ of the dileptons is required to be above 2 GeV. The events in the four-lepton invariant-mass range of $80 < m_{4\ell} < 1000$ GeV are considered. An artificial neural network is used to discriminate between single- and double-parton scattering events. No signal of double-parton scattering is observed and an upper limit on the fraction of the DPS contribution to the inclusive four-lepton final state of 0.042 is obtained at 95% CL. This upper limit translates, for two independent subscatterings, into a lower limit of 1.0 mb on the effective cross section, consistent with previously measured values in different processes and at different centre-of-mass energies.

Figure 1: The distribution of the four-lepton invariant mass, $m_{4\ell}$. The data (black dots) are compared with the sum of signal and background MC expectations (filled coloured histograms). Also shown is the expected contribution of DPS from Pythia 8.

Figure 2: Distributions of the discriminating variables (a) $\Delta p_{\mathrm{T},12}$, (b) $\Delta\phi_{13}$, (c) $\Delta y_{13}$, and (d) $\Delta_{1234}$. The definition of variables is given in Eq. (2). Also plotted are the MC expectations for SPS and DPS, where the latter is normalised to the number of observed data events in order to make it clearly visible.

Figure 3: The distribution of the output variable of the artificial neural network, $\xi_{\mathrm{DPS}}$, shown separately for the data, SPS, background, and DPS distributions.

Figure 4: Summary of measurements and limits on the effective cross section, determined in different experiments [7-25], sorted chronologically. The measurements that were made by different experiments are denoted by different symbols and colours. The inner error bars represent statistical uncertainties and the outer error bars correspond to the total uncertainty. Dashed arrows indicate lower limits. Lines with arrows on both ends represent ranges of the effective cross-section values, determined within a single publication. In the case of the double $J/\psi$ measurement by LHCb, the dashed line denotes the upper and lower uncertainties. The AFS measurement [7], indicated with a dot, was published without uncertainties.