Measurement of subjet multiplicities in neutral current deep inelastic scattering at HERA and determination of $\alpha_s$

The subjet multiplicity has been measured in neutral current $e^+p$ interactions at $Q^2 > 125$ GeV$^2$ with the ZEUS detector at HERA using an integrated luminosity of 38.6 pb$^{-1}$. Jets were identified in the laboratory frame using the longitudinally invariant $k_T$ cluster algorithm. The number of jet-like substructures within jets, known as the subjet multiplicity, is defined as the number of clusters resolved in a jet by reapplying the jet algorithm at a smaller resolution scale $y_{\rm cut}$. Measurements of the mean subjet multiplicity, $\langle n_{\rm sbj}\rangle$, for jets with transverse energies $E_{T,\rm jet} > 15$ GeV are presented. Next-to-leading-order perturbative QCD calculations describe the measurements well. The value of $\alpha_s(M_Z)$, determined from $\langle n_{\rm sbj}\rangle$ at $y_{\rm cut} = 10^{-2}$ for jets with $25 < E_{T,\rm jet} < 71$ GeV, is $\alpha_s(M_Z) = 0.1187 \pm 0.0017$ (stat.) $^{+0.0024}_{-0.0009}$ (syst.) $^{+0.0093}_{-0.0076}$ (th.).


Introduction
Jet production in $e^+p$ neutral current (NC) deep inelastic scattering (DIS) provides a rich testing ground for perturbative QCD (pQCD) and allows a precise determination of the strong coupling constant, $\alpha_s$ [1,2,3,4,5]. In the analysis described here, a new method is used to extract $\alpha_s$ in DIS, which exploits the pQCD description of the internal structure of jets. The investigation of such structure also gives information on the transition from a parton produced in a hard subprocess to the experimentally observed jet of hadrons. The method uses measurements of the mean subjet multiplicity for an inclusive sample of jets, where the subjet multiplicity is defined as the number of clusters resolved in a jet by reapplying the jet algorithm at a smaller resolution scale $y_{\rm cut}$ [6,7]. At high transverse energy, $E_{T,\rm jet}$, and for values of $y_{\rm cut}$ that are not too low, fragmentation effects become small and the subjet multiplicity is calculable in pQCD. Furthermore, the pQCD calculations depend only weakly on the knowledge of the parton distribution functions (PDFs) of the proton, since the subjet multiplicity is determined by QCD radiation processes in the final state. In zeroth-order QCD a jet consists of only one parton and the subjet multiplicity is trivially equal to unity. The first non-trivial contribution to the subjet multiplicity is given by $\mathcal{O}(\alpha_s)$ processes in which, e.g., a quark radiates a gluon at a small angle. The deviation of the subjet multiplicity from unity is proportional to the rate of parton emission and thus to $\alpha_s$. The next-to-leading-order (NLO) QCD corrections are available, enabling $\alpha_s$ to be determined reliably. Measurements of subjet production have been made in $e^+e^-$ interactions [8], $p\bar{p}$ collisions [9] and NC DIS [10] and have been used to test the QCD predictions on coherence effects, differences between quarks and gluons and splitting of jets.
This paper presents measurements of the mean subjet multiplicity in NC DIS at $Q^2 > 125$ GeV$^2$, where $Q^2$ is the virtuality of the exchanged boson, for an inclusive sample of jets identified in the laboratory frame with the longitudinally invariant $k_T$ cluster algorithm [11,12]. The measurements are compared to NLO QCD predictions [13] and are used to extract $\alpha_s(M_Z)$.

Experimental conditions
The data sample was collected with the ZEUS detector at HERA and corresponds to an integrated luminosity of $38.6 \pm 0.6$ pb$^{-1}$. During 1996-97, HERA operated with protons of energy $E_p = 820$ GeV and positrons of energy $E_e = 27.5$ GeV. The ZEUS detector is described in detail elsewhere [14,15]. The main components used in the present analysis are the central tracking detector (CTD) [16], positioned in a 1.43 T solenoidal magnetic field, and the uranium-scintillator sampling calorimeter (CAL) [17]. The CTD was used to establish an interaction vertex with a typical resolution along (transverse to) the beam direction of 0.4 (0.1) cm.
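As a quick cross-check of the beam parameters quoted above, the following minimal sketch computes the $e^+p$ centre-of-mass energy for head-on beams, neglecting the particle masses. The function name is illustrative only; the result is close to 300 GeV.

```python
import math


def centre_of_mass_energy(e_lepton_gev: float, e_proton_gev: float) -> float:
    """Return sqrt(s) in GeV for head-on beams, neglecting particle masses."""
    return math.sqrt(4.0 * e_lepton_gev * e_proton_gev)


# Beam energies quoted in the text for the 1996-97 running period.
sqrt_s = centre_of_mass_energy(27.5, 820.0)
print(f"sqrt(s) = {sqrt_s:.1f} GeV")
```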
The CAL covers 99.7% of the total solid angle. It is divided into three parts with a corresponding division in the polar angle, $\theta$, as viewed from the nominal interaction point: forward (FCAL, $2.6^\circ < \theta < 36.7^\circ$), barrel (BCAL, $36.7^\circ < \theta < 129.1^\circ$), and rear (RCAL, $129.1^\circ < \theta < 176.2^\circ$). For normal incidence, the depth of the CAL is seven interaction lengths in FCAL, five in BCAL and four in RCAL. Each of the calorimeter parts is subdivided into towers which in turn are segmented longitudinally into one electromagnetic (EMC) and one (RCAL) or two (FCAL, BCAL) hadronic (HAC) sections. The FCAL and RCAL sections are further subdivided into cells with inner-face sizes of $5 \times 20$ cm$^2$ ($10 \times 20$ cm$^2$ in the RCAL) for the EMC and $20 \times 20$ cm$^2$ for the HAC sections. The BCAL EMC cells have a projective geometry as viewed from the nominal interaction point; each is 23.3 cm long in the azimuthal direction and has a width of 4.9 cm along the beam direction at its inner face, at a radius 123.

Data selection and jet reconstruction
A three-level trigger was used to select events online [15,18]. The NC DIS events were selected offline using criteria similar to those reported previously [3]. The main steps are outlined below.
The scattered-positron candidate was identified from the pattern of energy deposits in the CAL [19]. The energy ($E'_e$) and polar angle ($\theta_e$) of the positron candidate were also determined from the CAL measurements. The double angle method [20], which uses $\theta_e$ and an angle ($\gamma$) that corresponds, in the quark-parton model, to the direction of the scattered quark, was used to reconstruct $Q^2$ ($Q^2_{\rm DA}$). The angle $\gamma$ was reconstructed using the CAL measurements of the hadronic final state [20]. The following requirements were imposed on the data sample:
• a positron candidate of energy $E'_e > 10$ GeV. This cut ensured a high and well understood positron-finding efficiency and suppressed background from photoproduction, in which the scattered positron escapes in the rear beampipe;
• $y_e < 0.95$, where $y_e = 1 - E'_e (1 - \cos\theta_e)/(2E_e)$. This condition removed events in which fake positron candidates from photoproduction background were found in the FCAL;
• the energy not associated with the positron candidate within a cone of radius 0.7 units in the $\eta$-$\phi$ plane around the positron direction was required to be less than 10% of the positron energy. This condition removed photoproduction and DIS events in which part of a jet was incorrectly identified as the scattered positron;
• for positrons in the polar-angle range $30^\circ < \theta_e < 140^\circ$, the fraction of the positron energy within a cone of radius 0.3 units in the $\eta$-$\phi$ plane around the positron direction was required to be larger than 0.9; for $\theta_e < 30^\circ$, the cut was raised to 0.98. These requirements removed events in which a jet was incorrectly identified as the scattered positron;
• the vertex position along the beam axis, determined from the CTD tracks, was required to be in the range $-38 < Z < 32$ cm, symmetrical around the mean interaction point for this running period;
• $38 < (E - p_Z) < 65$ GeV, where $E$ is the total energy measured in the CAL, $E = \sum_i E_i$, and $p_Z$ is the $Z$ component of the vector $\mathbf{p} = \sum_i E_i \mathbf{r}_i$; in both cases the sum runs over all CAL cells, $E_i$ is the energy of the CAL cell $i$ and $\mathbf{r}_i$ is a unit vector along the line joining the reconstructed vertex to the geometric centre of the cell $i$. This cut removed events with large initial-state radiation and further reduced the background from photoproduction;
• … and $E_T$ is the total transverse energy in the CAL. This cut removed cosmic rays and beam-related background;
• events were rejected if a second positron candidate with energy above 10 GeV was found and the total energy in the CAL after subtracting that of the two positron candidates was below 4 GeV. This requirement removed elastic Compton-scattering events ($ep \to e\gamma p$);
• $Q^2_{\rm DA} > 125$ GeV$^2$.
The longitudinally invariant $k_T$ cluster algorithm [11] was used in the inclusive mode [12] to reconstruct jets in the hadronic final state both in data and in Monte Carlo (MC) simulated events (see Section 4). In data, the algorithm was applied in the laboratory frame to the energy deposits measured in the CAL cells after excluding those associated with the scattered-positron candidate. The jet search was performed in the $\eta$-$\phi$ plane. In the following discussion, $E_{T,i}$ denotes the transverse energy, $\eta_i$ the pseudorapidity and $\phi_i$ the azimuthal angle of object $i$. For each pair of objects (where the initial objects are the energy deposits in the CAL cells), the quantity
$$ d_{ij} = \min(E_{T,i}, E_{T,j})^2 \cdot \left[ (\eta_i - \eta_j)^2 + (\phi_i - \phi_j)^2 \right] \qquad (1) $$
was calculated. For each object, the quantity $d_i = (E_{T,i})^2$ was also calculated. If, of all the values $\{d_{ij}, d_i\}$, $d_{kl}$ was the smallest, then objects $k$ and $l$ were combined into a single new object. If, however, $d_k$ was the smallest, then object $k$ was considered a jet and was removed from the sample. The procedure was repeated until all objects were assigned to jets. The jet variables were defined according to the Snowmass convention [21]:
$$ E_{T,\rm jet} = \sum_i E_{T,i}, \qquad \eta_{\rm jet} = \frac{\sum_i E_{T,i}\,\eta_i}{E_{T,\rm jet}}, \qquad \phi_{\rm jet} = \frac{\sum_i E_{T,i}\,\phi_i}{E_{T,\rm jet}}. $$
This prescription was also used to determine the variables of the intermediate objects.
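The clustering procedure described above can be sketched in a few lines of Python. This is a minimal illustration, not the analysis code: it assumes the standard inclusive $k_T$ distance $d_{ij} = \min(E_{T,i}, E_{T,j})^2 [(\Delta\eta)^2 + (\Delta\phi)^2]$ together with $d_i = E_{T,i}^2$ and $E_T$-weighted (Snowmass) recombination. Objects are plain `(et, eta, phi)` tuples, the azimuthal wrap-around handling is an added assumption, and the input values are hypothetical.

```python
import math


def _dphi(phi1, phi2):
    """Azimuthal difference wrapped into [-pi, pi)."""
    return (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi


def d_ij(a, b):
    """Pair distance between two (et, eta, phi) objects."""
    return min(a[0], b[0]) ** 2 * ((a[1] - b[1]) ** 2 + _dphi(a[2], b[2]) ** 2)


def merge(a, b):
    """E_T-weighted recombination of two objects (Snowmass convention)."""
    et = a[0] + b[0]
    eta = (a[0] * a[1] + b[0] * b[1]) / et
    # Average phi relative to object a so that the 2*pi wrap-around is safe.
    phi = _dphi(a[2] + b[0] * _dphi(b[2], a[2]) / et, 0.0)
    return (et, eta, phi)


def inclusive_kt(objects):
    """Cluster (et, eta, phi) objects into jets, sorted by decreasing E_T."""
    objs, jets = list(objects), []
    while objs:
        # Start from the smallest single-object distance d_i = E_T,i^2.
        k = min(range(len(objs)), key=lambda i: objs[i][0] ** 2)
        best = (objs[k][0] ** 2, k, None)
        for i in range(len(objs)):
            for j in range(i + 1, len(objs)):
                d = d_ij(objs[i], objs[j])
                if d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        if j is None:
            jets.append(objs.pop(i))       # d_i smallest: object i is a jet
        else:
            merged = merge(objs[i], objs[j])
            del objs[j], objs[i]           # j > i, so delete j first
            objs.append(merged)            # d_ij smallest: combine the pair
    return sorted(jets, key=lambda o: -o[0])


# Hypothetical energy deposits: two nearby cells and one well separated cell.
cells = [(20.0, 0.0, 0.0), (5.0, 0.1, 0.1), (10.0, 1.5, 2.5)]
jets = inclusive_kt(cells)
print(jets)
```

The double loop scales quadratically per step, which is adequate for an illustration; production jet finders use faster nearest-neighbour strategies.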
Jet energies were corrected for all energy-loss effects, principally in inactive material, typically about one radiation length, in front of the CAL. The corrected jet variables were then used in applying additional cuts on the selected sample:
• events with at least one jet satisfying $E_{T,\rm jet} > 15$ GeV and $-1 < \eta_{\rm jet} < 2$ were selected;
• events were removed from the sample if the distance of any of the jets to the positron candidate in the $\eta$-$\phi$ plane, $d = \sqrt{(\eta_{\rm jet} - \eta_e)^2 + (\phi_{\rm jet} - \phi_e)^2}$, was smaller than one unit. This requirement removed photoproduction background.
With the above criteria, 37 933 one-jet, 821 two-jet and 25 three-jet events were identified.
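Two of the selection variables defined in this section, $y_e$ and $E - p_Z$, follow directly from the formulas in the text. The sketch below evaluates them for a toy event; the function names, the `(energy, polar angle)` cell representation and the toy numbers are assumptions made for illustration, and the isolation cuts are not modelled.

```python
import math

E_E_BEAM = 27.5  # positron beam energy in GeV, as quoted in the text


def y_electron(e_prime, theta_e):
    """y_e = 1 - E'_e (1 - cos(theta_e)) / (2 E_e), with theta_e in radians."""
    return 1.0 - e_prime * (1.0 - math.cos(theta_e)) / (2.0 * E_E_BEAM)


def e_minus_pz(cells):
    """E - p_Z summed over cells given as (energy, polar angle) pairs.

    The Z component of the unit vector r_i is cos(theta_i), with theta_i
    measured from the reconstructed vertex.
    """
    return sum(e * (1.0 - math.cos(theta)) for e, theta in cells)


def passes_basic_cuts(e_prime, theta_e, cells, z_vertex_cm, q2_da):
    """Subset of the offline requirements listed in the text."""
    return (
        e_prime > 10.0
        and y_electron(e_prime, theta_e) < 0.95
        and -38.0 < z_vertex_cm < 32.0
        and 38.0 < e_minus_pz(cells) < 65.0
        and q2_da > 125.0
    )


# Hypothetical event: a 25 GeV positron at 150 degrees plus two hadronic cells.
theta_e = math.radians(150.0)
toy_cells = [(25.0, theta_e), (60.0, 0.2), (10.0, 1.0)]
selected = passes_basic_cuts(25.0, theta_e, toy_cells, z_vertex_cm=3.0, q2_da=184.0)
print(y_electron(25.0, theta_e), e_minus_pz(toy_cells), selected)
```

For a fully contained event, $E - p_Z$ equals twice the positron beam energy, 55 GeV, which is why the accepted window is centred near that value.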

Definition of the subjet multiplicity
Subjets were resolved within a jet using all CAL cells associated with the jet and repeating the application of the $k_T$ cluster algorithm described above, until, for every pair of objects $i$ and $j$, the quantity $d_{ij}$ was greater than $d_{\rm cut} = y_{\rm cut} \cdot E_{T,\rm jet}^2$ [7]. All remaining objects were called subjets. The reconstruction of subjets within a jet was performed using the uncorrected cell and jet energies, since systematic effects largely cancel in the ratio $d_{ij}/E_{T,\rm jet}^2$, as seen in Eq. (1). The subjet structure depends upon the value chosen for the resolution parameter $y_{\rm cut}$. The mean subjet multiplicity, $\langle n_{\rm sbj}\rangle$, is defined as the average number of subjets contained in a jet at a given value of $y_{\rm cut}$:
$$ \langle n_{\rm sbj}(y_{\rm cut}) \rangle = \frac{1}{N_{\rm jets}} \sum_{i=1}^{N_{\rm jets}} n^i_{\rm sbj}(y_{\rm cut}), $$
where $n^i_{\rm sbj}(y_{\rm cut})$ is the number of subjets in jet $i$ and $N_{\rm jets}$ is the total number of jets in the sample. By definition, $\langle n_{\rm sbj}\rangle \geq 1$. The mean subjet multiplicity was measured for $y_{\rm cut}$ values in the range $5 \cdot 10^{-4}$ to 0.1.
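The subjet definition can be illustrated with a short sketch that re-clusters the constituents of one jet until every remaining pair has $d_{ij} > y_{\rm cut} E_{T,\rm jet}^2$ and then averages the counts over a jet sample. It assumes the standard $k_T$ pair distance and $E_T$-weighted recombination; constituents are hypothetical `(et, eta, phi)` tuples and all names are illustrative.

```python
import math


def _dphi(phi1, phi2):
    """Azimuthal difference wrapped into [-pi, pi)."""
    return (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi


def d_ij(a, b):
    """k_T pair distance between two (et, eta, phi) objects."""
    return min(a[0], b[0]) ** 2 * ((a[1] - b[1]) ** 2 + _dphi(a[2], b[2]) ** 2)


def merge(a, b):
    """E_T-weighted recombination of two objects."""
    et = a[0] + b[0]
    eta = (a[0] * a[1] + b[0] * b[1]) / et
    phi = _dphi(a[2] + b[0] * _dphi(b[2], a[2]) / et, 0.0)
    return (et, eta, phi)


def count_subjets(cells, y_cut):
    """Number of subjets resolved in one jet at resolution scale y_cut."""
    objs = list(cells)
    # E_T,jet is the scalar sum of the constituent transverse energies.
    d_cut = y_cut * sum(o[0] for o in objs) ** 2
    while len(objs) > 1:
        d, i, j = min(
            (d_ij(objs[i], objs[j]), i, j)
            for i in range(len(objs))
            for j in range(i + 1, len(objs))
        )
        if d > d_cut:          # every pair is now resolved: stop merging
            break
        merged = merge(objs[i], objs[j])
        del objs[j], objs[i]   # j > i, so delete j first
        objs.append(merged)
    return len(objs)


def mean_subjet_multiplicity(jets, y_cut):
    """Average number of subjets per jet over a sample of jets."""
    return sum(count_subjets(cells, y_cut) for cells in jets) / len(jets)


# Hypothetical sample: one jet with two constituents, one with a single cell.
sample = [[(20.0, 0.0, 0.0), (5.0, 0.1, 0.1)], [(30.0, 0.5, 1.0)]]
print(mean_subjet_multiplicity(sample, 1e-2), mean_subjet_multiplicity(sample, 5e-4))
```

In this toy sample the two-constituent jet is unresolved at $y_{\rm cut} = 10^{-2}$ and splits into two subjets at $y_{\rm cut} = 5 \cdot 10^{-4}$, reflecting the growth of the multiplicity as the resolution scale is lowered.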

Monte Carlo simulation
Samples of events were generated to determine the response of the detector to jets of hadrons and the correction factors necessary to obtain the hadron-level mean subjet multiplicities. The generated events were passed through the GEANT 3.13-based [22] ZEUS detector- and trigger-simulation programs [15]. They were reconstructed and analysed by the same program chain as the data.
Neutral current DIS events were generated using the LEPTO 6.5 program [23] interfaced to HERACLES 4.6.1 [24] via DJANGOH 1.1 [25]. The HERACLES program includes photon and Z exchanges and first-order electroweak radiative corrections. The QCD cascade was modelled with the colour-dipole model [26] by using the ARIADNE 4.08 program [27] and including the boson-gluon-fusion process. The colour-dipole model treats gluons emitted from quark-antiquark (diquark) pairs as radiation from a colour dipole between two partons. This results in partons that are not ordered in their transverse momenta. Samples of events were also generated using the model of LEPTO based on first-order QCD matrix elements plus parton showers (MEPS). For the generation of the samples with MEPS, the option for soft-colour interactions was switched off [28]. In both cases, fragmentation into hadrons was performed using the Lund [29] string model as implemented in JETSET 7.4 [30]. Events were also generated using the HERWIG 6.3 [31] program, in which the fragmentation into hadrons is simulated by a cluster model [32]. The CTEQ4D [33] proton PDFs were used for all simulations.
The MC events were analysed with the same selection cuts and jet-search methods as were used for the data. A good description of the measured distributions for the kinematic and jet variables was given by both ARIADNE and LEPTO-MEPS. The simulation based on HERWIG provided a poor description of the data at low values of $y_{\rm cut}$ ($y_{\rm cut} \lesssim 5 \cdot 10^{-3}$) and, for this reason, it was not used to correct the data. At relatively large values of $y_{\rm cut}$ ($y_{\rm cut} \gtrsim 3 \cdot 10^{-2}$), HERWIG gave a good description of the data. The identical jet algorithm was also applied to the hadrons (partons) to obtain predictions at the hadron (parton) level. The MC programs were used to estimate QED radiative effects, which were negligible for the measurements of $\langle n_{\rm sbj}\rangle$.

NLO QCD calculations
Experimental studies of QCD using jet production in NC DIS at HERA are often performed in the Breit frame [34]. The analysis of the subjet multiplicity presented here was instead performed in the laboratory frame, since calculations of the mean subjet multiplicity for jets defined in the Breit frame can, at present, only be performed to $\mathcal{O}(\alpha_s)$, precluding a reliable determination of $\alpha_s$. However, calculations of the mean subjet multiplicity can be performed up to $\mathcal{O}(\alpha_s^2)$ for jets defined in the laboratory frame. The perturbative QCD prediction for $\langle n_{\rm sbj}\rangle$ was calculated as the ratio of the cross section for subjet production to that for inclusive jet production ($\sigma_{\rm jet}$):
$$ \langle n_{\rm sbj}(y_{\rm cut}) \rangle = 1 + \frac{1}{\sigma_{\rm jet}} \sum_{j=2}^{\infty} (j-1) \cdot \sigma_{{\rm sbj},j}(y_{\rm cut}), \qquad (2) $$
where $\sigma_{{\rm sbj},j}(y_{\rm cut})$ is the cross section for producing jets with $j$ subjets at a resolution scale of $y_{\rm cut}$. The NLO QCD predictions for the mean subjet multiplicity were derived from Eq. (2) by computing the subjet cross section to $\mathcal{O}(\alpha_s^2)$ and the inclusive jet cross section to $\mathcal{O}(\alpha_s)$. As a result, the $\alpha_s$-dependence of the mean subjet multiplicity up to $\mathcal{O}(\alpha_s^2)$ is given by $\langle n_{\rm sbj}\rangle = 1 + C_1 \alpha_s + C_2 \alpha_s^2$, where $C_1$ and $C_2$ are quantities whose values depend on $y_{\rm cut}$ and the jet and kinematic variables.
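The following short expansion sketches why it suffices to know the subjet cross section to $\mathcal{O}(\alpha_s^2)$ and the inclusive jet cross section only to $\mathcal{O}(\alpha_s)$. The coefficients $A_0$, $A_1$, $B_1$ and $B_2$ are illustrative placeholders introduced here, not quantities defined in the analysis; $A_0 \neq 0$ because a single-parton jet already exists at zeroth order in the laboratory frame, and the weighted subjet cross section starts at $\mathcal{O}(\alpha_s)$ because at least one emission is needed to resolve a second subjet.

```latex
\begin{align}
\sigma_{\rm jet} &= A_0 + A_1\,\alpha_s + \mathcal{O}(\alpha_s^2), \\
\sum_{j\ge 2} (j-1)\,\sigma_{{\rm sbj},j} &= B_1\,\alpha_s + B_2\,\alpha_s^2 + \mathcal{O}(\alpha_s^3), \\
\langle n_{\rm sbj} \rangle
  &= 1 + \frac{B_1\,\alpha_s + B_2\,\alpha_s^2}{A_0 + A_1\,\alpha_s} \\
  &= 1 + \frac{B_1}{A_0}\,\alpha_s
     + \left( \frac{B_2}{A_0} - \frac{A_1 B_1}{A_0^2} \right) \alpha_s^2
     + \mathcal{O}(\alpha_s^3).
\end{align}
```

In this notation $C_1 = B_1/A_0$ and $C_2 = B_2/A_0 - A_1 B_1/A_0^2$; an $\mathcal{O}(\alpha_s^2)$ term in $\sigma_{\rm jet}$ would first contribute at $\mathcal{O}(\alpha_s^3)$.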
The measurements of the mean subjet multiplicity were performed in the kinematic region defined by $Q^2 > 125$ GeV$^2$ since, at lower values of $Q^2$, the sample of events with at least one jet with $E_{T,\rm jet} > 15$ GeV is dominated by dijet events. The calculation of the mean subjet multiplicity for dijet events can be performed only up to $\mathcal{O}(\alpha_s)$, which would severely restrict the accuracy of the predictions.
The measurements were compared with NLO QCD calculations using the program DISENT [13]. The calculations were performed in the $\overline{\rm MS}$ renormalisation and factorisation schemes using a generalised version [13] of the subtraction method [35]. The number of flavours was set to five and the renormalisation ($\mu_R$) and factorisation ($\mu_F$) scales were chosen to be $\mu_R = \mu_F = Q$. The strong coupling constant, $\alpha_s$, was calculated at two loops with $\Lambda^{(5)}_{\overline{\rm MS}} = 202$ MeV, corresponding to $\alpha_s(M_Z) = 0.116$. The calculations were performed using the CTEQ4M parameterisations of the proton PDFs. The jet algorithm described in Section 3 was also applied to the partons in the events generated by DISENT in order to compute the parton-level predictions for the mean subjet multiplicity. The results obtained with DISENT were cross-checked by using the program DISASTER++ [36]. The differences were smaller than 1% [37]. Although DISENT does not include $Z$ exchange, its effect in this analysis was negligible.
Since the measurements involve jets of hadrons, whereas the NLO QCD calculations refer to partons, the predictions were corrected to the hadron level using ARIADNE. The multiplicative correction factor, $C_{\rm had}$, was defined as the ratio of $\langle n_{\rm sbj}\rangle$ for jets of hadrons over that for jets of partons. The value of $C_{\rm had}$ increases as $y_{\rm cut}$ decreases due to the increasing importance of non-perturbative effects. The hadron-level prediction for $\langle n_{\rm sbj}\rangle$ approaches $\langle n^{\rm jet}_{\rm hadrons}\rangle$ as $y_{\rm cut}$ approaches 0, where $\langle n^{\rm jet}_{\rm hadrons}\rangle$ is the mean multiplicity of hadrons in a jet. However, the maximum number of partons that can be assigned to a jet in the NLO calculation is three, so the parton-level prediction for $\langle n_{\rm sbj}\rangle$ is restricted to $\langle n_{\rm sbj}\rangle \leq 3$. This fundamental problem was avoided by selecting high $E_{T,\rm jet}$ and a relatively high $y_{\rm cut}$ value, i.e. $E_{T,\rm jet} > 25$ GeV and $y_{\rm cut} \geq 10^{-2}$. In this region, the hadronisation correction is small and the measured $\langle n_{\rm sbj}\rangle$ is much smaller than three, so that a reliable comparison of data and NLO QCD can be made and $\alpha_s$ extracted.
The procedure for applying hadronisation corrections to the NLO QCD calculations was validated by verifying that the dependence of the mean subjet multiplicity on $y_{\rm cut}$ and $E_{T,\rm jet}$ predicted by NLO QCD was well reproduced by both ARIADNE and LEPTO-MEPS. The predictions based on HERWIG exhibited a different dependence both at low values of $y_{\rm cut}$ and at high $E_{T,\rm jet}$; for this reason, the HERWIG model was not used in the evaluation of the uncertainty on the hadronisation correction.
The following sources were considered in the evaluation of the uncertainty affecting the theoretical prediction of $\langle n_{\rm sbj}\rangle$:
• the uncertainty in the NLO QCD calculations due to terms beyond NLO, estimated by varying $\mu_R$ between $Q/2$ and $2Q$, was $\sim 3$% at $y_{\rm cut} = 10^{-2}$;
• the uncertainty in the NLO QCD calculations due to that in the hadronisation correction was estimated as half of the difference between the values of $C_{\rm had}$ obtained with LEPTO-MEPS and with ARIADNE. It was smaller than 1.5% at $y_{\rm cut} = 10^{-2}$ for $E_{T,\rm jet} > 25$ GeV;
• the uncertainty in the NLO QCD calculations due to the uncertainties in the proton PDFs was estimated by repeating the calculations using three additional sets of proton PDFs, MRST99, MRST99-g↑ and MRST99-g↓ [38]. The differences were negligible;
• the NLO QCD calculations were also carried out using $\mu_R = E_{T,\rm jet}$ and $\mu_F = Q$. The differences with respect to the default choice, $\mu_R = \mu_F = Q$, were smaller than 0.3% at $y_{\rm cut} = 10^{-2}$.
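The two dominant estimates in the list above reduce to simple arithmetic, sketched below. All inputs are hypothetical placeholders chosen only to show the bookkeeping, and expressing the scale variation as a shift relative to the central prediction is an assumption of this sketch.

```python
def hadronisation_uncertainty(c_had_ariadne, c_had_lepto):
    """Half of the difference between the two hadronisation corrections."""
    return 0.5 * abs(c_had_lepto - c_had_ariadne)


def scale_uncertainty(n_central, n_half_scale, n_double_scale):
    """Relative (up, down) shifts of the prediction under mu_R = Q/2 and 2Q."""
    shifts = [(n - n_central) / n_central for n in (n_half_scale, n_double_scale)]
    return max(0.0, *shifts), min(0.0, *shifts)


# Hypothetical values of C_had from the two models.
c_had_unc = hadronisation_uncertainty(1.12, 1.15)

# Hypothetical predictions at mu_R = Q, Q/2 and 2Q.
up, down = scale_uncertainty(1.30, 1.34, 1.27)
print(f"C_had: +/-{c_had_unc:.3f}, scale: +{up:.1%} {down:.1%}")
```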

Data corrections and systematic uncertainties
The raw distribution of $n_{\rm sbj}$ in the data is compared to the prediction of the ARIADNE simulation for several values of $y_{\rm cut}$ in Fig. 1. The simulation provides a satisfactory description of the data, thus validating the use of these MC samples to correct the measured mean subjet multiplicity to the hadron level. Figure 1 also shows that the fraction of jets in the data with more than three subjets at $y_{\rm cut} = 10^{-2}$ is small; this fraction becomes negligible for $E_{T,\rm jet} > 25$ GeV, thus allowing a meaningful comparison with the NLO QCD calculations. The mean subjet multiplicity corrected for detector effects was determined bin-by-bin as $\langle n_{\rm sbj}\rangle = K \cdot \langle n_{\rm sbj}\rangle_{\rm CAL}$, where the correction factor was defined as $K = \langle n_{\rm sbj}\rangle^{\rm MC}_{\rm had} / \langle n_{\rm sbj}\rangle^{\rm MC}_{\rm CAL}$ and was evaluated separately for each value of $y_{\rm cut}$ in each region of $E_{T,\rm jet}$; the subscript CAL (had) indicates that the mean subjet multiplicity was determined using the CAL cells (hadrons). The deviation of the correction factor $K$ from unity was less than 10% for $y_{\rm cut} \geq 10^{-2}$ and decreased as $y_{\rm cut}$ increased.
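The bin-by-bin correction amounts to one multiplication per bin, as in the minimal sketch below. The numerical inputs are hypothetical and do not correspond to measured or simulated values from the analysis.

```python
def correction_factor(n_mc_had, n_mc_cal):
    """K: ratio of hadron-level to detector-level mean multiplicity in the MC."""
    return n_mc_had / n_mc_cal


def correct_to_hadron_level(n_data_cal, n_mc_had, n_mc_cal):
    """Apply the bin-by-bin factor K to the detector-level measurement."""
    return correction_factor(n_mc_had, n_mc_cal) * n_data_cal


# Hypothetical inputs for one (y_cut, E_T,jet) bin.
k_factor = correction_factor(1.40, 1.50)
corrected = correct_to_hadron_level(1.47, 1.40, 1.50)
print(f"K = {k_factor:.3f}, corrected <n_sbj> = {corrected:.3f}")
```

Because $K$ is evaluated separately in each bin, any mismodelling of the multiplicity in the simulation enters only through the ratio of hadron-level to detector-level values, which is why the difference between ARIADNE and LEPTO-MEPS is taken as a systematic uncertainty.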
The following sources of systematic uncertainty on the measurement of $\langle n_{\rm sbj}\rangle$ were considered [37]:
• the differences in the results obtained by using either ARIADNE or LEPTO-MEPS to correct the data for detector effects. This uncertainty was typically smaller than 1%;
• the scattered-positron candidate identification. The analysis was repeated by using an alternative technique [39] to select the scattered-positron candidate, resulting in an uncertainty smaller than 0.5%;
• the 1% uncertainty in the absolute energy scale of the jets [40] resulted in an uncertainty smaller than 0.5%;
• the 1% uncertainty in the absolute energy scale of the positron candidate [41] resulted in a negligible uncertainty;
• the uncertainty in the simulation of the trigger and in the cuts used to select the data also resulted in a negligible uncertainty.

Measurement of the mean subjet multiplicity
The mean subjet multiplicity was measured for events with $Q^2 > 125$ GeV$^2$, including every jet of hadrons in the event with $E_{T,\rm jet} > 15$ GeV and $-1 < \eta_{\rm jet} < 2$, after correction for detector effects. It is shown as a function of $y_{\rm cut}$ in Fig. 2a) and as a function of $E_{T,\rm jet}$ at $y_{\rm cut} = 10^{-2}$ in Fig. 2b), and is presented in Tables 1 and 2, respectively. The measured mean subjet multiplicity decreases as $E_{T,\rm jet}$ increases. This result is in agreement with that of a previous publication [42], in which the internal structure of jets in NC DIS was studied using the jet shape and it was observed that the jets become narrower as $E_{T,\rm jet}$ increases. This tendency is also consistent with the transverse-energy dependence of the mean subjet multiplicity for jets identified in the Breit frame [10].
The measurements in Fig. 2 are compared with the predictions of ARIADNE and LEPTO-MEPS. The LEPTO-MEPS predictions overestimate the observed mean subjet multiplicity; ARIADNE overestimates the data at low $E_{T,\rm jet}$ and approaches the data at high $E_{T,\rm jet}$.
Calculations of $\langle n_{\rm sbj}\rangle$ in NLO QCD, corrected for hadronisation effects, using the sets of proton PDFs of the CTEQ4 "A-series" are compared to the data in Figs. 3 and 4. The hadronisation correction is small in the unshaded regions: as a function of $y_{\rm cut}$ and for jets with $E_{T,\rm jet} > 15$ GeV, $C_{\rm had}$ differs from unity by less than 25% for $y_{\rm cut} \geq 10^{-2}$ (see Fig. 3); as a function of $E_{T,\rm jet}$ at $y_{\rm cut} = 10^{-2}$, $C_{\rm had}$ differs from unity by less than 17% for $E_{T,\rm jet} > 25$ GeV (see Fig. 4). The measured $\langle n_{\rm sbj}\rangle$ as a function of $y_{\rm cut}$ is well described by the NLO QCD predictions. For very small $y_{\rm cut}$ values, the agreement is also good, although in that region fixed-order QCD calculations are affected by large uncertainties and a resummation of terms enhanced by $\ln y_{\rm cut}$ [7] would be required for a precise comparison with the data. At relatively large values of $y_{\rm cut}$, an NLO fixed-order calculation is expected [7] to be a good approximation to such a resummed calculation.
The sensitivity of the measurements to the value of $\alpha_s(M_Z)$ is illustrated in Fig. 4 by the comparison of the measured $\langle n_{\rm sbj}\rangle$ at $y_{\rm cut} = 10^{-2}$ as a function of $E_{T,\rm jet}$ with NLO QCD calculations for different values of $\alpha_s(M_Z)$. The overall description of the data by the NLO QCD calculations is good, so that the measurements can be used to make a determination of $\alpha_s$.

Determination of $\alpha_s$
The measurements of $\langle n_{\rm sbj}\rangle$ for $25 < E_{T,\rm jet} < 71$ GeV at $y_{\rm cut} = 10^{-2}$ were used to determine $\alpha_s(M_Z)$. The $y_{\rm cut}$ value and the lower $E_{T,\rm jet}$ limit were justified in Section 5; the value of $C_{\rm had}$ differs from unity by less than 17% and approaches unity as $E_{T,\rm jet}$ increases. The mean value of $Q^2$ was 1580 GeV$^2$. The following procedure was used:
• NLO QCD calculations of $\langle n_{\rm sbj}\rangle$ were performed for the five sets of the CTEQ4 "A-series". The value of $\alpha_s(M_Z)$ used in each partonic cross-section calculation was that associated with the corresponding set of PDFs;
• for each bin, $i$, in $E_{T,\rm jet}$, the NLO QCD calculations, corrected for hadronisation effects, were used to parameterise the $\alpha_s(M_Z)$ dependence of $\langle n_{\rm sbj}\rangle$ according to
$$ \left[ \langle n_{\rm sbj} \rangle (\alpha_s(M_Z)) \right]_i = 1 + C^i_1\, \alpha_s(M_Z) + C^i_2\, \alpha_s^2(M_Z). \qquad (3) $$
The coefficients $C^i_1$ and $C^i_2$ were determined by performing a $\chi^2$-fit of this form to the NLO QCD predictions. The NLO QCD calculations were performed with an accuracy such that the statistical uncertainties of these coefficients were negligible compared to any other uncertainty. This simple parameterisation gives a good description of the $\alpha_s(M_Z)$ dependence of $\langle n_{\rm sbj}\rangle$ over the entire range spanned by the CTEQ4 "A-series";
• the value of $\alpha_s(M_Z)$ was then determined by a $\chi^2$-fit of Eq. (3) to the measurements of $\langle n_{\rm sbj}\rangle$. The resulting fit described the data well, giving $\chi^2 = 2.7$ for four degrees of freedom.
This procedure correctly handles the complete $\alpha_s$-dependence of the NLO calculations (the explicit dependence coming from the partonic cross sections and the implicit one coming from the PDFs) in the fit, while preserving the correlation between $\alpha_s$ and the PDFs.
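The two-step fit can be sketched as follows: first a quadratic form $1 + C_1\alpha_s + C_2\alpha_s^2$ is fitted per bin to predictions evaluated on a grid of $\alpha_s(M_Z)$ values, then a single $\alpha_s(M_Z)$ is obtained by minimising $\chi^2$ against the data. Everything numerical below is hypothetical: the grid, the coefficients, the pseudo-data and their uncertainties are placeholders, the coefficient fit is unweighted, and the golden-section minimiser is a stand-in for whatever minimiser was actually used.

```python
import math


def model(alpha, c1, c2):
    """Parameterised prediction for the mean subjet multiplicity in one bin."""
    return 1.0 + c1 * alpha + c2 * alpha ** 2


def fit_coefficients(alphas, predictions):
    """Unweighted least-squares fit of (C1, C2) via the 2x2 normal equations."""
    s11 = sum(a ** 2 for a in alphas)
    s12 = sum(a ** 3 for a in alphas)
    s22 = sum(a ** 4 for a in alphas)
    t1 = sum(a * (n - 1.0) for a, n in zip(alphas, predictions))
    t2 = sum(a ** 2 * (n - 1.0) for a, n in zip(alphas, predictions))
    det = s11 * s22 - s12 ** 2
    return (t1 * s22 - t2 * s12) / det, (s11 * t2 - s12 * t1) / det


def chi2(alpha, data, errors, coeffs):
    """Chi-square of the data against the parameterised predictions."""
    return sum(
        ((n - model(alpha, c1, c2)) / err) ** 2
        for n, err, (c1, c2) in zip(data, errors, coeffs)
    )


def fit_alpha_s(data, errors, coeffs, lo=0.09, hi=0.15, tol=1e-10):
    """Golden-section search for the chi-square minimum in [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    while hi - lo > tol:
        x1 = hi - inv_phi * (hi - lo)
        x2 = lo + inv_phi * (hi - lo)
        if chi2(x1, data, errors, coeffs) < chi2(x2, data, errors, coeffs):
            hi = x2
        else:
            lo = x1
    return 0.5 * (lo + hi)


# Hypothetical grid of alpha_s(M_Z) values and per-bin "true" coefficients.
alpha_grid = [0.110, 0.113, 0.116, 0.119, 0.122]
true_coeffs = [(2.0, 5.0), (1.5, 4.0)]
nlo = [[model(a, c1, c2) for a in alpha_grid] for c1, c2 in true_coeffs]

coeffs = [fit_coefficients(alpha_grid, p) for p in nlo]
pseudo_data = [model(0.1187, c1, c2) for c1, c2 in true_coeffs]
alpha_fit = fit_alpha_s(pseudo_data, [0.01, 0.01], coeffs)
print(f"fitted alpha_s(M_Z) = {alpha_fit:.4f}")
```

Fitting the coefficients to predictions that each use the $\alpha_s(M_Z)$ value of their own PDF set is what carries the correlation between $\alpha_s$ and the PDFs into the final fit.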
The uncertainty on the extracted value of $\alpha_s(M_Z)$ due to the experimental systematic uncertainties was evaluated by repeating the analysis above for each systematic check. The largest contribution to the experimental uncertainty was that due to the simulation of the hadronic final state. A total systematic uncertainty on $\alpha_s(M_Z)$ of $\Delta\alpha_s(M_Z) = {}^{+0.0024}_{-0.0009}$ was obtained by adding in quadrature the individual contributions.
The theoretical uncertainties on $\alpha_s(M_Z)$ arising from terms beyond NLO and uncertainties in the hadronisation correction, evaluated as described in Section 5, were found to be $\Delta\alpha_s(M_Z) = {}^{+0.0089}_{-0.0071}$ and $\Delta\alpha_s(M_Z) = \pm 0.0028$, respectively. The total theoretical uncertainty was obtained by adding these uncertainties in quadrature. The results are presented in Table 3. In addition, as a cross check, the measurement was repeated using three of the MRST99 sets of proton PDFs: central, $\alpha_s$↑↑ and $\alpha_s$↓↓. The result agreed with that obtained by using CTEQ4 to better than 0.3%. The value of $\alpha_s$ is in agreement with the central result for variations in the choice of $y_{\rm cut}$ in the range $5 \cdot 10^{-3}$ to $3 \cdot 10^{-2}$.
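The quadrature sum of the two theoretical contributions quoted above can be checked directly; the short sketch below reproduces the total theoretical uncertainty of $^{+0.0093}_{-0.0076}$ from the individual values given in the text. Variable names are illustrative.

```python
import math

# Contributions to the theoretical uncertainty on alpha_s(M_Z) from the text.
scale_up, scale_down = 0.0089, 0.0071   # terms beyond NLO
hadronisation = 0.0028                  # hadronisation correction (symmetric)

# Add the upward and downward contributions in quadrature separately.
theory_up = math.hypot(scale_up, hadronisation)
theory_down = math.hypot(scale_down, hadronisation)
print(f"+{theory_up:.4f} -{theory_down:.4f}")  # prints "+0.0093 -0.0076"
```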
The value of $\alpha_s(M_Z)$ as determined from the measurements of $\langle n_{\rm sbj}\rangle$ for $25 < E_{T,\rm jet} < 71$ GeV at $y_{\rm cut} = 10^{-2}$ is $\alpha_s(M_Z) = 0.1187 \pm 0.0017$ (stat.) $^{+0.0024}_{-0.0009}$ (syst.) $^{+0.0093}_{-0.0076}$ (th.). This result is consistent with recent determinations by the H1 [5,43] and ZEUS [2,3,44] Collaborations and with the PDG value, $\alpha_s(M_Z) = 0.1172 \pm 0.0020$ [45]. This determination of $\alpha_s$ has experimental uncertainties as small as those based on the measurements of jet cross sections in DIS. However, the theoretical uncertainty is larger and dominated by terms beyond NLO. Further theoretical work on higher-order contributions would allow an improved measurement.

Summary
Measurements of the mean subjet multiplicity for jets produced in neutral current deep inelastic $e^+p$ scattering at a centre-of-mass energy of 300 GeV have been made using every jet of hadrons with $E_{T,\rm jet} > 15$ GeV and $-1 < \eta_{\rm jet} < 2$ identified with the longitudinally invariant $k_T$ cluster algorithm in the laboratory frame. The average number of subjets within a jet decreases as $E_{T,\rm jet}$ increases.
Next-to-leading-order QCD calculations reproduce the measured values well, demonstrating a good description of the internal structure of jets by QCD radiation. The mean subjet multiplicity of an inclusive sample of jets produced in NC DIS has the advantage of being mostly sensitive to final-state parton-radiation processes and of allowing an extraction of $\alpha_s$ with very little dependence on the proton parton distribution functions.
A QCD fit of the measurements of the mean subjet multiplicity for $25 < E_{T,\rm jet} < 71$ GeV at $y_{\rm cut} = 10^{-2}$ yields $\alpha_s(M_Z) = 0.1187 \pm 0.0017$ (stat.) $^{+0.0024}_{-0.0009}$ (syst.) $^{+0.0093}_{-0.0076}$ (th.).

Table 3: The $\alpha_s(M_Z)$ values as determined from the QCD fit to the measured $\langle n_{\rm sbj}\rangle$ at $y_{\rm cut} = 10^{-2}$ as a function of $E_{T,\rm jet}$, as well as that obtained by combining all regions. The statistical, systematic and theoretical uncertainties are shown separately.

Figure 4: a) The mean subjet multiplicity corrected to the hadron level, $\langle n_{\rm sbj}\rangle$, at $y_{\rm cut} = 10^{-2}$ as a function of $E_{T,\rm jet}$ for inclusive jet production in NC DIS with $Q^2 > 125$ GeV$^2$ and $-1 < \eta_{\rm jet} < 2$ (dots). The inner error bars show the statistical uncertainty. The outer error bars show the statistical and systematic uncertainties added in quadrature. b) The parton-to-hadron correction, $C_{\rm had}$, used to correct the QCD predictions and determined using ARIADNE (solid line) and LEPTO-MEPS (dashed line). c) The relative uncertainty on the NLO QCD calculation due to the variation of the renormalisation scale. Other details are as described in the caption to Fig. 3.