Study of the hard double-parton scattering contribution to inclusive four-lepton production in pp collisions at $\sqrt{s} = 8$ TeV with the ATLAS detector

The inclusive production of four isolated charged leptons in pp collisions is analysed for the presence of hard double-parton scattering, using 20.2 fb$^{-1}$ of data recorded in the ATLAS detector at the LHC at centre-of-mass energy $\sqrt{s} = 8$ TeV. In the four-lepton invariant-mass range of $80 < m_{4\ell} < 1000$ GeV, an artificial neural network is used to enhance the separation between single- and double-parton scattering based on the kinematics of the four leptons in the final state. An upper limit on the fraction of events originating from double-parton scattering is determined at 95% confidence level to be $f_{\mathrm{DPS}} = 0.042$, which results in an estimated lower limit on the effective cross section at 95% confidence level of 1.0 mb.


Introduction
The parton-parton scattering at the origin of hard processes in pp interactions is accompanied by proton-remnant fragments that contribute to the hadronic final state through the so-called underlying event. As first pointed out by Sjöstrand and van Zijl [1], one source of the underlying-event activity, particularly in the high-energy regime of the LHC, is multi-parton interactions (MPI): interactions of pairs of partons from the interacting protons which occur simultaneously with the hard process. In high-energy pp interactions, where the density of low-x partons is high, there is enough energy to produce hard multi-parton interactions. The simplest example is hard double-parton scattering (DPS), where two partons from each proton interact with each other leading to perturbative final states.
The interest in studying DPS is twofold. Firstly, the probability of occurrence of DPS and the potential correlations between the products of these two perturbative interactions provide valuable information about the dynamics of the partonic structure of the proton (see Ref. [2] and references therein). Secondly, DPS processes may also constitute a background to reactions proceeding through single-parton scattering (SPS). An example is the production of four charged leptons in the final state, addressed in this Letter. This reaction is dominated by the SPS production of two $Z^{(*)}$ bosons, followed by subsequent leptonic decays. The $Z^{(*)}$ notation indicates the production of on- or off-shell Z bosons ($Z$ and $Z^{*}$), or the production of off-shell photons ($\gamma^{*}$). However, the four leptons could also be produced as the result of two Drell-Yan processes occurring simultaneously, potentially distorting the measurements of prompt-lepton production.
For a process $pp \to A + B + X$, the expected DPS cross section for producing states $A$ and $B$ in two independent scatterings, $\sigma^{AB}_{\mathrm{DPS}}$, may be estimated from the following formula [3-5] (see also Ref. [6] for a detailed derivation):

$$\sigma^{AB}_{\mathrm{DPS}} = \frac{k}{2}\,\frac{\sigma_A\,\sigma_B}{\sigma_{\mathrm{eff}}}, \tag{1}$$

where $\sigma_A$ and $\sigma_B$ are the cross sections for producing $A$ and $B$ individually, $k/2$ is a symmetry factor and $\sigma_{\mathrm{eff}}$ is the effective cross section. For most of the existing measurements [7-21], $\sigma_{\mathrm{eff}}$ fluctuates around 15 mb. However, for the associated production of quarkonia $J/\psi\,J/\psi$ or $J/\psi\,\Upsilon$, $\sigma_{\mathrm{eff}}$ is systematically lower [22-25] than for all other investigated processes. This might indicate that $\sigma_{\mathrm{eff}}$ is not universal and that there are spatial fluctuations of the parton densities in the proton, which may favour certain final states over others [26,27]. The concept of geometric fluctuations in the spatial parton densities has also been invoked [28].

Since double Drell-Yan production is driven by quark-antiquark annihilation, while most of the previously explored DPS processes are driven by gluon-gluon scattering, and the final state of four charged leptons constitutes the golden channel for the studies of Higgs boson properties, $H \to Z^{(*)}Z^{(*)} \to 4\ell$, a study of a possible DPS contribution to the production of four isolated charged leptons at $\sqrt{s} = 8$ TeV is warranted. The analysis presented in this Letter closely follows a previous analysis of this final state [33], but extends it to consider DPS.
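The factorised estimate of the DPS cross section can be illustrated with a short numerical sketch. It assumes the commonly used form in which the DPS cross section is the product of the two single-scattering cross sections, multiplied by a symmetry factor k/2 and divided by the effective cross section. The function name and the input values in the usage note are hypothetical and are not taken from the analysis.

```python
FB_PER_MB = 1.0e12  # unit conversion: 1 mb = 10^12 fb


def dps_cross_section_fb(sigma_a_fb, sigma_b_fb, sigma_eff_mb, overlapping=False):
    """Estimate sigma_DPS = (k/2) * sigma_A * sigma_B / sigma_eff, in fb.

    Assumption: k/2 = 1/2 when A and B populate the same phase space
    (indistinguishable scatterings) and k/2 = 1 otherwise.
    """
    k_over_2 = 0.5 if overlapping else 1.0
    sigma_eff_fb = sigma_eff_mb * FB_PER_MB
    return k_over_2 * sigma_a_fb * sigma_b_fb / sigma_eff_fb
```

For instance, `dps_cross_section_fb(5.0e5, 5.0e5, 15.0)` evaluates the estimate for two hypothetical 0.5 nb processes with an effective cross section of 15 mb; the result is a small fraction of a femtobarn, which illustrates why the DPS contribution to four-lepton production is expected to be small.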

ATLAS detector
The ATLAS detector [37] is a multipurpose particle detector with a forward-backward symmetric cylindrical geometry and nearly full coverage in solid angle. It consists of an inner tracking detector (ID) system surrounded by a superconducting solenoid, electromagnetic and hadronic calorimeters, and a muon spectrometer (MS) incorporating superconducting toroid magnets. During Run 1 of the LHC the ID consisted of a pixel detector closest to the beam-pipe, followed by a silicon strip detector and a transition radiation tracker. This ID system, operating in a 2 T axial magnetic field, provides the track- [...]

[...] energy of the electron candidate or the transverse momentum of the muon candidate to be above 24 GeV. The dielectron trigger had the same threshold of 12 GeV for both electron candidates. The dimuon trigger required either two muons with transverse momentum above 13 GeV or one above 18 GeV and the other above 8 GeV. An electron-muon trigger was also used with thresholds at 12 GeV for electrons and 8 GeV for muons.
The final sample consists of events with at least four leptons, where each lepton is either an electron or a muon. The four leptons are required to form two same-flavour (electrons or muons) opposite-charge (SFOC) lepton pairs. The pair with the invariant mass closer to the mass of the Z boson is called the leading pair, and the other pair is the sub-leading one. The invariant mass of the leading pair is restricted to the range $50 < m_{\mathrm{leading}} < 120$ GeV, while for the sub-leading pair the mass requirement is $12 < m_{\mathrm{sub\text{-}leading}} < 120$ GeV. A $J/\psi$ veto is applied such that for any SFOC lepton combination the invariant mass of the dilepton, $m_{2\ell}$, must be greater than 5 GeV. Only events with the four-lepton invariant mass in the range $80 < m_{4\ell} < 1000$ GeV are selected.
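The pairing and mass requirements can be sketched as follows. This is a minimal illustration and not the analysis code: the lepton representation (a dictionary with hypothetical keys `flavour`, `charge` and `p4`), the massless-lepton approximation and the rule used to resolve the pairing ambiguity in 4e and 4μ events (take the pairing that contains the SFOC pair closest to the Z mass) are assumptions made here.

```python
import math
from itertools import combinations

Z_MASS = 91.1876  # GeV, nominal Z boson mass


def p4(pt, eta, phi):
    """Massless four-vector (E, px, py, pz) built from pT, eta and phi."""
    return (pt * math.cosh(eta), pt * math.cos(phi),
            pt * math.sin(phi), pt * math.sinh(eta))


def inv_mass(*vectors):
    """Invariant mass of the sum of the given four-vectors."""
    e, px, py, pz = (sum(v[i] for v in vectors) for i in range(4))
    return math.sqrt(max(e * e - px * px - py * py - pz * pz, 0.0))


def is_sfoc(a, b):
    """Same-flavour, opposite-charge test for two leptons."""
    return a["flavour"] == b["flavour"] and a["charge"] == -b["charge"]


def pair_leptons(leptons):
    """Return ((i, j), (k, l)): index pairs of the leading and sub-leading
    SFOC pairs among four leptons, or None if no valid pairing exists."""
    best = None
    for i, j in combinations(range(4), 2):
        k, l = [n for n in range(4) if n not in (i, j)]
        if not (is_sfoc(leptons[i], leptons[j]) and is_sfoc(leptons[k], leptons[l])):
            continue
        distance = abs(inv_mass(leptons[i]["p4"], leptons[j]["p4"]) - Z_MASS)
        if best is None or distance < best[0]:
            best = (distance, (i, j), (k, l))
    return None if best is None else (best[1], best[2])


def passes_mass_selection(leptons):
    """Apply the dilepton and four-lepton mass windows and the J/psi veto."""
    pairing = pair_leptons(leptons)
    if pairing is None:
        return False
    (i, j), (k, l) = pairing
    m_lead = inv_mass(leptons[i]["p4"], leptons[j]["p4"])
    m_sub = inv_mass(leptons[k]["p4"], leptons[l]["p4"])
    m_4l = inv_mass(*(lep["p4"] for lep in leptons))
    veto_ok = all(inv_mass(a["p4"], b["p4"]) > 5.0
                  for a, b in combinations(leptons, 2) if is_sfoc(a, b))
    return (50.0 < m_lead < 120.0 and 12.0 < m_sub < 120.0
            and 80.0 < m_4l < 1000.0 and veto_ok)
```

Each of the six possible lepton pairs is tried as the leading candidate together with its complement, so every valid pairing is visited with both role assignments and the pair closest to the Z mass ends up as the leading one.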
The transverse momentum of dileptons, $p_{\mathrm{T}}^{\ell^+\ell^-}$, is required to be above 2 GeV. Selected leptons, ordered in descending order of transverse momentum, are required to have transverse momenta $p_{\mathrm{T}}$ above 20, 15, 10 (8 if muon), and 7 (6 if muon) GeV. The leptons are selected within the pseudorapidity range $|\eta| < 2.5$ in the case of electrons and $|\eta| < 2.7$ in the case of muons. In order to have well-measured leptons, a lepton separation requirement is imposed, such that the distance between any two leptons in the η-φ space, $\Delta R$, is required to fulfil the condition $\Delta R > 0.1$ (0.2) for same-flavour (different-flavour) leptons. Each event is required to have the triggering lepton(s) matched to one or two of the selected leptons.
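The lepton-level requirements of this paragraph can be written compactly. The sketch below assumes that each lepton is a dictionary with hypothetical keys `flavour`, `pt`, `eta` and `phi`, and that exactly four selected leptons are passed in; the trigger matching and the dilepton transverse-momentum requirement are not included.

```python
import math
from itertools import combinations


def delta_r(eta1, phi1, eta2, phi2):
    """Distance in eta-phi space, with the azimuthal difference wrapped to [0, pi]."""
    dphi = abs(phi1 - phi2) % (2.0 * math.pi)
    if dphi > math.pi:
        dphi = 2.0 * math.pi - dphi
    return math.hypot(eta1 - eta2, dphi)


def passes_pt_thresholds(leptons):
    """pT above 20, 15, 10 (8 if muon) and 7 (6 if muon) GeV for pT-ordered leptons."""
    ordered = sorted(leptons, key=lambda lep: lep["pt"], reverse=True)
    thresholds = [
        20.0,
        15.0,
        8.0 if ordered[2]["flavour"] == "mu" else 10.0,
        6.0 if ordered[3]["flavour"] == "mu" else 7.0,
    ]
    return all(lep["pt"] > thr for lep, thr in zip(ordered, thresholds))


def passes_eta(lepton):
    """|eta| < 2.5 for electrons, |eta| < 2.7 for muons."""
    limit = 2.7 if lepton["flavour"] == "mu" else 2.5
    return abs(lepton["eta"]) < limit


def passes_separation(leptons):
    """Delta R > 0.1 for same-flavour pairs and > 0.2 for different-flavour pairs."""
    for a, b in combinations(leptons, 2):
        minimum = 0.1 if a["flavour"] == b["flavour"] else 0.2
        if delta_r(a["eta"], a["phi"], b["eta"], b["phi"]) <= minimum:
            return False
    return True
```

Wrapping the azimuthal difference matters for leptons close to φ = ±π: two leptons at φ = 3.1 and φ = −3.1 are separated by about 0.08 in azimuth, not by 6.2.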
The data sample, after all selections, contains 476 events. The resulting data and MC distributions of the four-lepton invariant mass are shown in Fig. 1. For completeness, the figure also includes the DPS contribution of 0.4 events predicted by the Pythia 8.175 simulation.

DPS signal extraction
The assumption that in DPS the two scatters are distinct implies that, in the DPS four-lepton final states, the two leptons of each dilepton will tend to be balanced in $p_{\mathrm{T}}$ and therefore back-to-back in the azimuthal angle φ, due to the dominance of low-$p_{\mathrm{T}}$ $Z^{(*)}$ production. In the SPS case, the leading and sub-leading pairs are expected to balance each other in $p_{\mathrm{T}}$.
Based on the experience gained in the study of four-jet final states [57], in order to distinguish between DPS events and SPS events, the distributions of the following kinematic variables of the four leptons are considered: [...]

This demonstrates that, as expected, two leptons coming from the same Z candidate in DPS balance each other in $p_{\mathrm{T}}$, while in SPS the pairwise $p_{\mathrm{T}}$ balance is not dominant. This is again demonstrated in the $\Delta\phi_{13}$ distribution, where leptons 1 and 3 are decorrelated in φ for DPS, while for the SPS events these leading-$p_{\mathrm{T}}$ decay leptons tend to be back-to-back in φ, because they originate from the two Z bosons, which themselves are expected to be back-to-back in φ. The $\Delta y_{13}$ distribution shows that leptons associated with different dileptons tend to be more separated in rapidity in DPS than in SPS. The back-to-back configuration of the two Z candidates in the case of SPS, and their decorrelation in the case of DPS, is explicitly demonstrated in the distribution of the azimuthal angle between the two Z candidates, $\Delta\phi_{1234}$.
The difference between the topologies of SPS and DPS events is used to train an artificial neural network (ANN) to discriminate between the DPS and non-DPS classes, where the latter corresponds to SPS and background events.
The training is performed with the ANN available in the ROOT [58] implementation of a feed-forward multilayer perceptron. The Broyden-Fletcher-Goldfarb-Shanno supervised learning algorithm [59-62] is used in the training. The input layer contains 21 neurons, corresponding to the variables listed in Eq. (2), and the output layer consists of one neuron. As the result of optimising the convergence and the performance of the ANN, a configuration of 30 and 9 neurons is adopted for the first and second hidden layer, respectively. The output of the ANN, $\xi_{\mathrm{DPS}}$, is a number distributed between 0 and 1, which represents the likelihood for an event to belong to the DPS class.
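To make the network topology concrete, the following sketch evaluates a feed-forward perceptron with 21 inputs, hidden layers of 30 and 9 neurons, and a single output. It is not the ROOT implementation: the sigmoid activation on every layer, the random weight initialisation and all names are assumptions chosen only to show the shape of the computation, and no training (BFGS or otherwise) is performed.

```python
import math
import random

LAYER_SIZES = [21, 30, 9, 1]  # inputs, two hidden layers, one output


def sigmoid(x):
    """Logistic function, mapping any real number to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def init_network(rng):
    """One weight list per neuron: bias first, then one weight per input."""
    return [
        [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_out)]
        for n_in, n_out in zip(LAYER_SIZES[:-1], LAYER_SIZES[1:])
    ]


def forward(network, inputs):
    """Propagate the 21 input variables through the network, layer by layer."""
    activations = list(inputs)
    for layer in network:
        activations = [
            sigmoid(neuron[0] + sum(w * a for w, a in zip(neuron[1:], activations)))
            for neuron in layer
        ]
    return activations[0]  # single output neuron, playing the role of xi_DPS
```

With the sigmoid on the output neuron, the returned value always lies strictly between 0 and 1, matching the range quoted for the discriminant.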
The event weights are chosen such that during the training procedure the effective numbers of SPS $q\bar{q}$-initiated events, $gg$-initiated events and background $Z + b\bar{b}$ jets events are in the ratio 1 : 1 : 1. The SPS $gg$-initiated events tend to spill over into the DPS signal region, and a better separation between the SPS and DPS classes is achieved by increasing their weight in the minimisation of the error function. Similarly, the effective contribution of $Z + b\bar{b}$ jets events is increased for the ANN training to distinguish them better from the DPS ones, as the kinematics of the $Z + b\bar{b}$ jets background subprocess has features similar to DPS.
The effective numbers of events for DPS and non-DPS events are equal. Each MC set is split randomly into two subsets having approximately the same number of events. One subset is used for the ANN training, while the other is used to validate the performance of the ANN and to determine the number of training epochs, so as [...]

The MC template fit of the sum of the DPS, SPS and background contributions to the data yields $f_{\mathrm{DPS}} = -0.009 \pm 0.017$ with a $\chi^2$ per degree of freedom $\chi^2/\mathrm{dof} = 8.6/9$. Since the result is consistent with zero, an upper limit on $f_{\mathrm{DPS}}$ is extracted, as described in Section 7.1.
For the ANN performance to be robust and independent of the DPS model, it is best to have a DPS training sample with no inherent correlations between the initial partons or the final states. The DPS model in Pythia [63-65] used in the analysis contains some correlations between the initial-state partons, implied by conservation of flavour and by the proton momentum sum-rule, as well as correlations due to inherent primordial transverse momentum of the partons and interleaved initial-state radiation. These effects are expected to be weak in the phase space of the present analysis (low-momentum partons and large transverse momenta of the final-state leptons). No correlations are expected in the production of the Drell-Yan final states.
To test this assumption of a very weak correlation between two subscatterings in the Pythia DPS model, the MC training sample was compared with a sample of two randomly overlaid dilepton events, where any correlation is eliminated by construction. Such a sample was made by overlaying dilepton events selected in the data, with the selection driven by the four-lepton phase space. Each dilepton event was required to have two selected leptons forming an SFOC pair with transverse momenta $p_{\mathrm{T}}^{1,2} > 20, 15$ GeV to account for the trigger conditions under which the dilepton data were collected. The same single-lepton, double-electron and double-muon triggers were used as in the selection of the four-lepton sample. An event was rejected if there was a third lepton with $p_{\mathrm{T}} > 7$ GeV (6 GeV for muons). The pairs of events were chosen randomly and overlaid by adding the lepton four-vectors of one event to the other. The distance between the primary vertices along the z-axis for the two events was required to be smaller than 1 cm. After the overlay, the same four-lepton selection was applied as described in Section 4, but the trigger configuration of the available dilepton datasets required an increase in the lepton $p_{\mathrm{T}}$ thresholds. They were chosen to be 20, 20, 15, and 15 GeV for leptons ordered in descending order of $p_{\mathrm{T}}$. To have a valid comparison within the same phase space between the overlaid dileptons and the Pythia 8 sample, the same selection on lepton $p_{\mathrm{T}}$ was also applied to the latter. The distributions of discriminating variables were compared, as were the distributions of $\xi_{\mathrm{DPS}}$, obtained with the ANN trained on Pythia 8. Very good agreement between Pythia 8 and the overlaid data was observed, confirming the initial assumption of a very weak correlation between the two scatterings in the Pythia DPS model with no effect on the analysis.
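The overlay procedure can be sketched in a few lines. The event representation (a dictionary with hypothetical keys `leptons` and `z_vtx_cm`) and the way random pairs are formed (shuffle, then take consecutive events) are assumptions of this illustration; the dilepton preselection and the subsequent four-lepton selection are not repeated here.

```python
import random


def overlay_dilepton_events(events, rng, max_dz_cm=1.0):
    """Build uncorrelated four-lepton candidates from random pairs of dilepton events.

    Pairs whose primary vertices are at least max_dz_cm apart along z are dropped.
    """
    shuffled = list(events)
    rng.shuffle(shuffled)
    overlaid = []
    for first, second in zip(shuffled[0::2], shuffled[1::2]):
        if abs(first["z_vtx_cm"] - second["z_vtx_cm"]) >= max_dz_cm:
            continue
        # Adding the leptons of one event to the other gives a four-lepton event
        # in which the two dileptons are independent by construction.
        overlaid.append({"leptons": first["leptons"] + second["leptons"]})
    return overlaid
```

Because each input event is used at most once, the overlaid events are statistically independent of one another, at the price of using only half as many candidates as there are dilepton events.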
The value of $f_{\mathrm{DPS}}$ is extracted using detector-level distributions. To test how well this result agrees with the parton-level value, [...] value. It was found that the fitted value of $f_{\mathrm{DPS}}$ is systematically lower than $f_{\mathrm{DPS}}^{\mathrm{parton}}$ due to slightly different detector acceptances for DPS and SPS events. However, the two quantities agree within 2%.

Systematic uncertainties
The following sources of systematic uncertainty are considered:
• The experimental systematic uncertainty, which includes the uncertainties of the electron and muon energy scales, the uncertainty of the energy and momentum resolution, and of the trigger, reconstruction and identification efficiencies [66,67].
• The uncertainty due to the model choice for the SPS process, which is evaluated by considering the effect of the variation of the fractions of $q\bar{q}$- and $gg$-initiated subprocesses, which are modelled with different MC generators, as described in Section 3. For the determination of the range of variation, these fractions are fitted to the $m_{4\ell}$ distribution in the data, keeping the fraction of background events unchanged. The fraction values of $q\bar{q}$- and $gg$-initiated subprocesses were varied between the nominal values and the values obtained from the fit to the $m_{4\ell}$ distribution.
• The uncertainty in the background modelling, which is estimated by varying the contributions of various background subprocesses according to the uncertainty of their normalisations obtained in Ref. [33].
No uncertainty is assigned to the DPS model, since the kinematic distributions agree well between the Pythia 8 DPS model and the assumption of two independent interactions as represented by the overlaid dilepton data.
The combined effect of all systematic uncertainties, of which the variation of the $Z + b\bar{b}$ jets background is the dominant uncertainty, is about 20% of the statistical uncertainty on the fitted value of $f_{\mathrm{DPS}}$. The effect of systematic uncertainties is therefore neglected when setting the upper limit on $f_{\mathrm{DPS}}$. The validity of neglecting the systematic uncertainties was also checked with pseudo-experiments: the contents of data bins were varied according to a Poisson distribution and those of MC profile histograms were varied according to the systematic uncertainty, sampling the variations according to a Gaussian distribution in the corresponding nuisance parameter, taking into account the correlation between the bins where appropriate. For each set of varied data and MC histograms, the fit of $f_{\mathrm{DPS}}$ was performed. The resulting distribution of $f_{\mathrm{DPS}}$ was compared with that obtained with systematic uncertainties neglected. The comparison showed no significant difference between the two distributions.
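The generation of one pseudo-experiment can be sketched as follows. This is an illustration under stated assumptions: a single nuisance parameter that shifts all MC bins coherently (fully correlated bins), a simple multiplication-based Poisson sampler that is adequate for moderate bin contents, and hypothetical function and argument names. The fit of $f_{\mathrm{DPS}}$ to each toy is not included.

```python
import math
import random


def poisson(rng, mean):
    """Draw a Poisson-distributed integer (multiplication method, moderate means)."""
    limit = math.exp(-mean)
    count = 0
    product = rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def make_pseudo_experiment(rng, data_bins, mc_bins, rel_syst):
    """Return one varied (data, MC) histogram pair.

    data_bins: observed or expected bin contents, fluctuated bin by bin (Poisson).
    mc_bins:   MC template contents, shifted by a single Gaussian nuisance
               parameter alpha scaled by the relative uncertainty of each bin.
    """
    toy_data = [poisson(rng, n) for n in data_bins]
    alpha = rng.gauss(0.0, 1.0)  # one unit-Gaussian draw shared by all bins
    toy_mc = [max(m * (1.0 + alpha * r), 0.0) for m, r in zip(mc_bins, rel_syst)]
    return toy_data, toy_mc
```

Repeating `make_pseudo_experiment` many times and fitting each toy gives a distribution of fitted values that can be compared with the one obtained when `rel_syst` is set to zero in every bin.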

Upper limit on f DPS
The upper limit on $f_{\mathrm{DPS}}$ is determined using the distributions of the $\xi_{\mathrm{DPS}}$ variable in data, SPS, DPS, and background MC samples.
The statistical method to interpret the data uses the test statistic for upper limits, $q_\mu$, based on the profile likelihood ratio as described in Ref. [68],

$$q_\mu = \begin{cases} -2\ln\lambda(\mu), & \hat{\mu} \le \mu, \\ 0, & \hat{\mu} > \mu. \end{cases}$$

Here $\mu$ is the signal strength and $\lambda(\mu)$ is the profile likelihood ratio,

$$\lambda(\mu) = \frac{L(\mu, \hat{\hat{\theta}})}{L(\hat{\mu}, \hat{\theta})},$$

where $\theta$ is the number of non-DPS events and constitutes a nuisance parameter. The values $\hat{\mu}$ and $\hat{\theta}$ are maximum-likelihood estimators. The value of $\hat{\hat{\theta}}$ maximises $L$ for a given value of $\mu$. The parameter of interest, $\mu$, is defined to be equal to the $f_{\mathrm{DPS}}$ variable, $\mu = f_{\mathrm{DPS}}$. Thus $\mu = 0$ corresponds to no DPS contribution, while $\mu = 1$ means that the four-lepton sample consists exclusively of DPS events. The procedure is that the data distribution is fitted with the sum of background, SPS and DPS histograms using the maximum-likelihood method. The upper limit is extracted using the CL$_s$ method [69] from distributions of the test statistic for various hypothesised values of $\mu$. The test-statistic distribution is obtained from an ensemble of pseudo-experiments. The shape of the test-statistic distribution agrees with the asymptotic formulae of Ref. [68]. The value of the CL$_s$ upper limit on $f_{\mathrm{DPS}}$ found with this method at 95% confidence level (CL) is 0.042.
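A toy version of the test statistic may help to fix ideas. The sketch assumes a binned Poisson likelihood with two unit-normalised templates, one for DPS and one for all non-DPS events combined, with strictly positive bin contents. For this model the nuisance parameter can be profiled analytically: at fixed μ the total expected yield equals the observed number of events, so the expectation in each bin is the observed total times μ·s + (1 − μ)·b. The scan of μ is restricted to the range from 0 to just below 1, which is a simplification of this sketch. Constant terms of the likelihood are dropped, and the CL$_s$ step and the pseudo-experiment ensemble are not included.

```python
import math


def profiled_nll(mu, data, dps_shape, non_dps_shape):
    """Negative log-likelihood at fixed mu, with the normalisation profiled out."""
    n_obs = sum(data)
    total = 0.0
    for n, s, b in zip(data, dps_shape, non_dps_shape):
        expected = n_obs * (mu * s + (1.0 - mu) * b)
        total += expected - n * math.log(expected)
    return total


def fit_mu(data, dps_shape, non_dps_shape, steps=1000):
    """Grid scan for the best-fit mu in [0, 1) (simplification: mu >= 0)."""
    grid = [i / steps for i in range(steps)]
    return min(grid, key=lambda mu: profiled_nll(mu, data, dps_shape, non_dps_shape))


def q_mu(mu, data, dps_shape, non_dps_shape):
    """Upper-limit test statistic: -2 ln(lambda(mu)) if mu_hat <= mu, else 0."""
    mu_hat = fit_mu(data, dps_shape, non_dps_shape)
    if mu_hat > mu:
        return 0.0
    return 2.0 * (profiled_nll(mu, data, dps_shape, non_dps_shape)
                  - profiled_nll(mu_hat, data, dps_shape, non_dps_shape))
```

Larger values of `q_mu` indicate a stronger incompatibility between the data and the hypothesised DPS fraction; one-sidedness is enforced by returning zero whenever the best-fit fraction exceeds the tested one.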

Lower limit on the effective cross section
The upper limit on $f_{\mathrm{DPS}}$ can be transformed into a lower limit on $\sigma_{\mathrm{eff}}$ by using Eq. (1). In order to perform this calculation, several inputs to the formula have to be determined. The fiducial cross section for inclusive four-lepton production [33] is $\sigma_{4\ell} = 32.0 \pm 1.6\ \mathrm{(stat.)} \pm 0.7\ \mathrm{(syst.)} \pm 0.9\ \mathrm{(lumi.)}$ fb.
The value of the symmetry factor $k/2$ in Eq. (1) is well defined for the case of 2e + 2μ or 2μ + 2e final states, $k/2 = 1$. For the 4e or 4μ final states, $k/2$ is well defined only in the case of completely overlapping ($k/2 = 1/2$) or fully exclusive ($k/2 = 1$) dilepton phase spaces. Therefore, the dilepton phase space is divided into 40 mutually exclusive regions. The boundaries of these regions are driven by the lepton-$p_{\mathrm{T}}$ thresholds and by the dilepton invariant-mass ranges for the leading and sub-leading lepton pairs. The product $\frac{k}{2}\sigma_A\sigma_B$ is determined by representing Eq. (1) as the sum over these phase-space regions. In order to determine the Drell-Yan cross section in each of the regions, the Powheg-Box MC simulation was used, based on NLO QCD calculations with the CT10 NLO set of PDFs. In the most populated region of $p_{\mathrm{T}} > 20$ GeV for each lepton and of $50 < m_{2\ell} < 120$ GeV, the calculated cross section is 0.55 nb for 2μ and 0.49 nb for 2e final states. A conservative uncertainty of ±15% is assigned to Drell-Yan cross sections. After summing the contributions from different dilepton phase-space regions, the result is $\frac{k}{2}\sigma_A\sigma_B = (13.9 \pm 0.1\ \mathrm{(stat)} \pm 3.6\ \mathrm{(syst)}) \times 10^{11}\ \mathrm{fb}^2$.
Here the systematic uncertainty is determined by propagating the assumed Drell-Yan cross-section uncertainty, assuming 100% correlation between various phase-space regions.
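From the definition of $f_{\mathrm{DPS}}$ as the fraction of the four-lepton cross section originating from DPS, Eq. (1) may be written as:

$$\sigma_{\mathrm{eff}} = \frac{1}{f_{\mathrm{DPS}}}\,\frac{k}{2}\,\frac{\sigma_A\,\sigma_B}{\sigma_{4\ell}},$$

and hence an approach similar to that used for the extraction of the upper limit on $f_{\mathrm{DPS}}$ can be applied to set the lower limit on $\sigma_{\mathrm{eff}}$. The lower limit on $\sigma_{\mathrm{eff}}$ at 95% CL is 1.0 mb, consistent with previously measured values of the effective cross section, as shown in Fig. 4.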
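As a rough cross-check of the quoted number, the central values given in the text can simply be inserted into the relation between the DPS fraction and the effective cross section. This is only a back-of-the-envelope sketch that ignores all uncertainties and assumes that the DPS fraction is the ratio of the DPS cross section to the measured four-lepton cross section; the published limit is obtained from the full statistical procedure, not from this arithmetic.

```python
FB_PER_MB = 1.0e12  # unit conversion: 1 mb = 10^12 fb


def sigma_eff_mb(k_half_sigma_a_sigma_b_fb2, f_dps, sigma_4l_fb):
    """sigma_eff = (k/2) sigma_A sigma_B / (f_DPS * sigma_4l), converted to mb."""
    return k_half_sigma_a_sigma_b_fb2 / (f_dps * sigma_4l_fb) / FB_PER_MB


# Central values quoted in the text: 13.9e11 fb^2, f_DPS < 0.042, sigma_4l = 32.0 fb.
limit_mb = sigma_eff_mb(13.9e11, 0.042, 32.0)  # about 1.03 mb
```

Since the effective cross section is inversely proportional to the DPS fraction, an upper limit on the fraction maps directly onto a lower limit on the effective cross section.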

Summary
The production of four-lepton (electrons or muons) final states in pp interactions at 8 TeV is analysed for the presence of double-parton scattering, using 20.2 fb$^{-1}$ of data recorded by the ATLAS experiment at the LHC. Leptons with transverse momentum above 20, 15, 10 (8 if muon), and 7 (6 if muon) GeV, sorted in descending order of $p_{\mathrm{T}}$, are selected in the pseudorapidity range $|\eta| < 2.5$ in the case of electrons and $|\eta| < 2.7$ in the case of muons. The four leptons form two same-flavour opposite-charge lepton pairs. The dilepton invariant masses are required to be in the range $50 < m_{\mathrm{leading}} < 120$ GeV for the leading pair and $12 < m_{\mathrm{sub\text{-}leading}} < 120$ GeV for the sub-leading pair, where the leading pair is defined as the pair with invariant mass closer to the Z boson mass. The transverse momentum $p_{\mathrm{T}}^{\ell^+\ell^-}$ of the dileptons is required to be above 2 GeV. The events in the four-lepton invariant-mass range of $80 < m_{4\ell} < 1000$ GeV are considered. An artificial neural network is used to discriminate between single- and double-parton scattering events. No signal of double-parton scattering is observed and an upper limit on the fraction of the DPS contribution to the inclusive four-lepton final state of 0.042 is obtained at 95% CL. This upper limit translates, for two independent subscatterings, into a lower limit of 1.0 mb on the effective cross section, consistent with previously measured values in different processes and at different centre-of-mass energies.