1 Introduction

The multiplicity of charged particles produced in high-energy pp collisions is one of the key observables to describe the global properties of the interactions and has been the subject of long standing experimental and theoretical investigations. The pp multiplicity distributions of primary-charged particles have been measured for five increasingly wider pseudorapidity ranges. A primary-charged particle is a charged particle with a mean proper lifetime \(\tau \) larger than 1 cm/c, which is either produced directly in the interaction, or from decays of particles with \(\tau \) smaller than 1 cm/c, excluding particles produced in interactions with material [1]. The results are determined using both the Silicon Pixel Detector (SPD) and the Forward Multiplicity Detector (FMD) in ALICE to widen the pseudorapidity coverage with respect to previous ALICE results [2,3,4], which made exclusive use of the SPD. The extension of the pseudorapidity coverage allows us to increase the high-multiplicity reach of the distributions by around 70–90\(\%\) with respect to the previous ALICE publication [4], exploring a wider phase space.

The multiplicity distribution of charged particles produced in high-energy pp collisions is sensitive to the number of interactions between quarks and gluons contained in the protons and to underlying mechanisms of particle production. At LHC energies, the particle production is dominated by soft QCD processes, which cannot be treated perturbatively and can only be modeled phenomenologically. On the other hand, as the colliding energy grows, the particle production receives increased contributions from hard scattering, which can be treated perturbatively.

We have compared directly our data to previous measurements from CMS [5]. ATLAS and LHCb use different \(p_{\text {T}}\) and \(\eta \) ranges [6, 7], making the direct comparison impossible. This manuscript presents also an overview of the parameters obtained when fitting multiplicity distributions with the sum of two Negative Binomial Distributions (NBDs). Additionally, the results have been compared to simulations regularly used at LHC [8,9,10,11] and calculations based on saturation density of gluons in the colliding hadrons [12, 13].

The manuscript is organized in the following way: Sect. 2 describes the detectors used to measure the charged-particle multiplicity distributions. Section 3 explains the analysis procedure in detail. The systematic uncertainties are described in Sect. 4 and the results along with comparisons to models are presented in Sect. 5, which contains also the analysis of the NBD fits. A brief summary and conclusions are finally given in Sect. 6.

2 Experimental setup

Full details of the sub-detectors are given elsewhere [14]. ALICE is designed to measure particles over a wide kinematic range \(-\,3.4<\eta <5.0\). Only the sub-detectors used in this analysis are described, namely the V0 scintillation counters, the SPD, and the FMD.

2.1 V0 detector

The V0 detector [15] is composed of two arrays of 32 scintillators positioned at 330 cm (V0-A) and −90 cm (V0-C) from the nominal interaction point (IP) along the beam axis. Each array has a ring structure segmented into 4 radial and 8 azimuthal sectors. The detector has full azimuthal coverage in the pseudorapidity ranges \(2.8< \eta < 5.1\) and \(-3.7< \eta < -1.7\). The signal amplitudes and times are recorded for each of the 64 scintillators. The V0 is appropriate for triggering, thanks to the good timing resolution of each scintillator (1 ns) along with its large acceptance for detecting charged particles. Different V0 trigger settings are used in the analysis.

2.2 Silicon pixel detector

The SPD are the two innermost cylindrical layers of the ALICE Inner Tracking System (ITS) [14] surrounding the beam line. The layers have full azimuthal coverage and radii of 3.9 and 7.6 cm with \(9.8\times 10^6\) silicon diodes, each of size \(50\times 425\) \({\upmu } \mathrm{m}^2\). The first layer of the SPD has the largest pseudorapidity coverage of the ITS (\(\vert \eta \vert <1.98\) for collisions at the nominal IP). Besides the readout of individual pixels with signals above a certain threshold, each SPD chip provides a fast signal every 100 ns, indicating a presence of fired pixels (FastOR), making it suitable for triggering. Charged particles can deposit energy in more than one pixel of the SPD. The offline reconstruction combines such neighboring signals into a a single cluster. The charged-particle multiplicity can then be estimated by counting the number of clusters detected in a SPD layer. This analysis uses only clusters from the inner layer of the SPD to provide the largest pseudorapidity coverage for particle detection. Alternatively, clusters from the two SPD layers together with the primary vertex can be combined to form tracklets [4], allowing to select primary particles with very high efficiency. The charged-particle multiplicity is then estimated by counting the number of tracklets.

2.3 Forward multiplicity detector

The purpose of the FMD is to extend, with high spatial resolution, the charged-particle detection acceptance beyond the reach of the SPD and central detectors in ALICE [16]. The FMD is a silicon strip detector and consists of three sub-detectors placed at 320 cm (FMD1), 79 cm (FMD2), and \(-69\) cm (FMD3) from the nominal IP along the beam pipe. FMD2 and FMD3 contain both inner and outer rings of silicon strips. FMD1 is located farther from the IP and has only one inner ring. Inner rings consist of 10 sensors, each with two azimuthal sectors and 512 strips with radii from 4.2 to 17.2 cm. Outer rings contain 20 sensors each again with two azimuthal sectors, but with 256 strips, with radii from 15.4 to 28.4 cm. Each ring (inner or outer), therefore, contains 10,240 strips giving in total 51,200 strips. The FMD has full azimuthal coverage in the pseudorapidity ranges \(-\,3.4<\eta <-\,1.7\) and \(1.7<\eta <5.0\).

The FMD records, for each strip, the energy deposited by charged particles traversing the detector. Various selection criteria, see [17, 18] for details, are applied to the energy measured in each strip to determine if the signal corresponds to a single particle traversing only this strip or also a neighboring strip. The number of particles traversing the FMD is determined taking into account only the signals which pass these selection criteria. The majority of the particles which reach the FMD, however, are secondary particles produced in interactions with the beam pipe, the material of the ITS, cables and support structures [17]. Therefore, a detailed Monte Carlo simulation is needed to determine the number of primary particles produced in the collision.

3 Analysis procedure

The multiplicity distribution of the primary particles is affected by many detector effects, such as dead detector regions and secondary particle production. These detector effects must be minimized and corrected for as they have increasing effects when determining accurately the probability of progressively higher-multiplicity events. The unfolding method is used to correct for the detector effects, as will be described in the following.

3.1 Event selection

Collisions at three different center of mass energies (0.9, 7, and 8 TeV) are analyzed. The data used for the analysis were collected at low beam currents and low pileup during three data taking periods: the 0.9 and 7 TeV samples were acquired in 2010, while the 8 TeV sample was collected in 2012. The last sample is the most affected by the pileup contamination. For this reason we selected few specific runs with low contamination from pileup events for this energy, and used data taken with interaction rate not exceeding 1 kHz. A pileup is defined as more than one collision occurring during the readout time of the detector (300 ns, for the SPD, and 2 \(\upmu \)s sampling time, for the FMD). Such events produce a bias towards larger multiplicity that enhance mostly the tail of the multiplicity distribution. Table 1 shows the number of selected events at each energy and the average number of interactions per bunch crossing, \(\langle \mu \rangle \), measured by the experiment [19]. This parameter is determined experimentally and for this measurements, in which \(\langle \mu \rangle \ll 1\), the average probability of having more than one interaction in a single bunch crossing, where at least one interaction occurs, is around \(1-2\)% (\(\langle \mu \rangle /2\)).

Table 1 Data samples used in this analysis. For each center-of-mass energy, the total number of selected minimum-bias (MB) events along with the average number of interactions per bunch crossing, \(\langle \mu \rangle \), are listed

Inelastic non-diffractive scatterings are the dominant processes in pp collisions, for which most of the hadrons are produced as a consequence of an exchange of color charge. On the contrary, diffractive events can be single-, double-, or central-diffractive. In Regge theory [20], diffraction occurs when the Pomeron interacts with the proton and produces a system of particles, called the diffractive system. The case in which only one of the protons dissociates is called single-diffractive.

The signals of the V0 and SPD are used to select events where at least one interaction occurred, which are triggered by requiring the detection of at least one particle in either the V0-A, V0-C, or SPD (\(\hbox {MB}_{\text {OR}}\)). Events are divided into three classes depending on further requirements. The first class includes all inelastic events (the INEL class), which is the same condition as used to select events where an interaction occurred (\(\hbox {MB}_{\text {OR}}\) trigger). The second class (the INEL > 0) requires the presence of at least one charged particle (tracklet) in the region \(\vert \eta \vert <1.0\) in addition to the INEL condition. This class has higher trigger efficiency and, therefore, reduced corrections relative to the INEL event class. The third class requires charged particles to be detected in both the V0-A and the V0-C (\(\hbox {MB}_{\text {AND}}\)). This class is used to remove the majority of the single-diffractive events and is, therefore, called the non-single-diffractive (NSD) event class.

To remove interactions of the beam with residual gas in the beam pipe, further selection criteria are applied to the event sample. Since these interactions can occur anywhere along the beam line, the most efficient way to reject them is to require that the interaction occurs close to the expected bunch-crossing position. The position of the collision along the beam pipe is determined from the vertex position reconstructed correlating SPD tracklets, with a precision of about 0.2 cm. Beam-gas interactions far from the IP are vetoed by the time difference in the V0-A and V0-C detectors. The vertex is required to be within 4 cm of the nominal IP position to reduce the contribution from beam-gas interactions and to remove acceptance gaps in the pseudorapidity coverage of the SPD and FMD, since the acceptance depends on the vertex position.

Even though runs with very low \(\langle \mu \rangle \) (average 0.04) were chosen, a residual background from pileup events remains. The majority of pileup events are identified and removed by searching for additional vertices in the same event. It is required that the uncertainty on the measurement of the longitudinal vertex position is less than 0.2 cm to have the most accurate determination of the vertex. Events with an additional vertex separated by more than 0.8 cm from the main one and containing at least three attached tracklets are tagged as pileup and removed from the analysis. Dedicated simulations show that the probability for the pileup event to pass this selection criteria is at most 10\(\%\) and the residual pileup does not exceed 10\(\%\) up to the highest multiplicities kept in this analysis. Therefore, the overall pileup contribution does not exceed 0.2\(\%\) for \(\langle \mu \rangle =0.04\) and, because it is covered by systematic uncertainties for all multiplicities, no correction is applied for this bias.

3.2 Unfolding

The FMD had nearly 100% azimuthal acceptance, but the SPD had a significant number of modules excluded from read-out that must be accounted for. On the other hand, interactions in detector material increase the detected number of charged particles, in particular in the FMD. A good understanding of the detector acceptance and of the number of secondary particles which hit the FMD and the SPD is crucial.

The main ingredients necessary to evaluate the primary multiplicity distributions are the raw (detected) multiplicity distributions and a matrix, which maps the measured multiplicity to the number of charged-primary particles distributions, called true. The raw multiplicity distributions are determined by counting the number of clusters in the SPD acceptance, the number of signals passing selection criteria in the FMD, or the average between the two if the acceptance of the SPD and FMD overlaps. The response of the detector is determined by the matrix \(R_{mt}\), which corresponds to the probability that an event with true multiplicity t and measured multiplicity m occurs. This matrix is obtained using PYTHIA ATLAS-CSC flat tune [21] simulations in which the generated particles are transported through the experimental setup using the GEANT3 [22] software package. The same reconstruction algorithm is used for simulations of real data. Experimental conditions and detector settings at the time of data-taking at a center-of-mass energy of \(\sqrt{s}=\) 0.9, 7, and 8 TeV are simulated when evaluating the response matrices. Figure 1 shows two different response matrices for different pseudorapidity ranges. The left panel of Fig. 1 shows the response matrix obtained for the \(\vert \eta \vert <2.0\). In this range, the unfolding increases the multiplicity on average because of the acceptance gaps in the SPD. When the extended pseudorapidity range, \(\vert \eta \vert <3.4\), is used, the number of detected counts exceeds on average the number of true counts as the secondary particles in the FMD dominate the bias. This is shown in the right panel of Fig. 1.

Fig. 1
figure 1

Response matrices obtained propagating Monte Carlo generated events, in this case with the PYTHIA ATLAS-CSC flat tune [21] for the non-single-diffractive event class selection. Left: Matrix including the overlap region between SPD and FMD. Right: Matrix for the region where the majority of the counts are from the FMD. The diagonal (generated=reconstructed) is plotted as a black dotted line

A method based on Bayes’ Theorem [23] is used to derive the final multiplicity distributions. Bayes’ Theorem states that the conditional probability \(\text {P}(A\vert B)\) (probability of A if B is true) can be written as

$$ \text {P}(A\vert B)=\frac{\text {P}(B\vert A)\text {P}(A)}{\text {P}(B)}, $$
(1)

in which \(\text {P}(A)\) and \(\text {P}(B)\) are the independent probabilities of A and B, and \(\text {P}(B\vert A)\) is the probability of B if A is true. A can be identified as a certain true multiplicity, while B is the measured multiplicity. The conditional probability \(\text {P}(B\vert A)\) is the response matrix of the detector, and can then be computed.

Equation 1 is restated as

$$ {\widetilde{R}}_{tm}=\frac{R_{mt}{\text {P}}_{t}}{\sum _{\text{t'}}R_{mt'}{\text {P}}_{t'}}, $$
(2)

where \(\text {P}_{t}\) is an a priori guess of the true distribution and \(\widetilde{R}_{tm}\) is the matrix of probabilities that allows one to compute the true multiplicity distribution from the measured one. The unfolded distribution, \(U_{t}\), is then obtained from

$$ U_{t}=\sum _{m}\widetilde{R}_{tm}M_{m}, $$
(3)

in which \(M_{m}\) is the measured distribution. The obtained \(U_{t}\) is used as a priori probability for the next iteration. The number of iterations is fixed to 10. This parameter has been chosen by examining the optimal performance obtained from simulation studies, performing closure tests using different number of iterations.

3.3 Event selection efficiency

The probability that an event is triggered depends on the multiplicity of charged particles. At high multiplicities, it is more probable that one of the trigger detectors is fired. At low multiplicities large trigger inefficiencies for finding events exist and must be corrected for. The event selection efficiency, \(\epsilon _{\text {TRIG}}\), is defined via simulations as

$$ \epsilon _{\text {TRIG}}=\frac{N_{\text {ev,reco}}(\text {TRIG}\, \,\vert \text {v}_{\text {z,reco}}\vert<4\,\mathrm{\text {cm}})}{N_{\text {ev,gen}}(\text {TRIG}\, \,\vert \text {v}_{\text {{z,gen}}}\vert <4\,\mathrm{\text {cm}})}, $$
(4)

where the numerator is the number of reconstructed events with the selected hardware trigger condition (MB\(_{\text {AND}}\) or MB\(_{\text {OR}}\)) and with the reconstructed vertex less than 4 cm from the nominal IP, in longitudinal direction. There is a dependence in the z vertex distribution and selecting \( \text {v}_{\text {z,reco}}\) introduces a bias in the efficiency. The effect is visible only for narrow vertex selections, and it is not relevant for \(\vert \text {v}_{\text {z}}\vert <4\,\mathrm{\text {cm}}\). The denominator is a similar quantity, but for the generated sample (inelastic or non-single-diffractive events). The unfolded distribution is corrected for the vertex and trigger inefficiency by dividing each multiplicity bin by its \(\epsilon _{\text {TRIG}}\) value.

The efficiencies used are shown in Fig. 2 for 0.9 and 7 TeV for the range \(\vert \eta \vert <3.0\). Both the INEL and NSD efficiencies are displayed. The points are obtained by averaging the efficiencies found with the PYTHIA Perugia 0 [8] and the PHOJET  [9] diffraction tuned event generators. Diffraction was accounted for using the Kaidalov–Poghosyan model [24] to tune the diffractive processes. The event generators are adjusted to reproduce the measured diffraction cross-sections and the shapes of the diffractive masses. The cross-section ratios are \(\sigma _{\text {SD}}/\sigma _{\text {INEL}}\backsim 0.20\) for upper diffractive mass limit of \(M_{\text {X}}<200\) GeV\(/c^{2}\), and \(\sigma _{\text {DD}}/\sigma _{\text {INEL}}\backsim 0.11\) for a pseudorapidity gap of \(\Delta \eta >3\), as measured at the LHC [25]. The uncertainties are estimated by evaluating the difference between the two event generators and are only relevant at low multiplicity. The efficiency of NSD trigger requiring signal in V0-A and V0-C detectors, on both sides of the IP, is lower at low multiplicities than that of INEL trigger, which requires response of at least one V0. For \(N_{\text {ch}}\gtrsim 20\) at the widest pseudorapidity ranges probed, both efficiencies reach 100\(\%\) and the corresponding systematic uncertainty becomes negligible.

Fig. 2
figure 2

Event selection efficiencies for 0.9 and 7 TeV for both INEL and NSD event samples as a function of the number of primary-charged particles for the \(\vert \eta \vert <3.0\) range

4 Systematic uncertainties

The steps involved in the analysis depend on the knowledge of the detector response to charged particles. The uncertainties in Table 2 are purely model dependent and related to how diffraction, and soft QCD in general, are processed in the two generators used to determine the efficiency uncertainty. The difference between PYTHIA Perugia 0 and PHOJET diffraction tuned generators, used to determine this uncertainty, is larger for small values of \(N_{\text {ch}}\). Therefore, the uncertainty mostly influences the first bins of the multiplicity distributions. Table 2 reports the values for charged-multiplicity of 0, 1, and 2. In general, the Lorentz boost of the diffracted system increases with increasing center-of-mass energies, and single and double diffraction contributions are smaller when going to higher energies. At wider pseudorapidity ranges there are higher chances of including diffractive events in the distribution. We observe that the uncertainty for NSD events at \(-\,3.4<\eta <+\,5.0\) is higher for lower energy in one multiplicity bin, where the description of diffraction differs the most among PYTHIA and PHOJET.

Table 2 Systematic uncertainties (in percent) for the efficiency correction, for the INEL, NSD and INEL\(>0\) event classes. Numbers are given for multiplicity 0, 1, and 2

Systematic effects from different sources related to run conditions could produce biases in the number of detected particles. To investigate such effects, the fluctuations in the results are examined for all three energies by splitting the data set into two separate samples with similar beam conditions, which are then unfolded with two different response matrices. The response matrices are calculated from simulations relative to the conditions of the runs that are used to unfold. The two resulting unfolded distributions are then averaged bin by bin. For \(\sqrt{s}=\) 0.9 and 8 TeV, the run-to-run fluctuations are found to be negligible up to the value of \(N_{\text {ch}}\) in which statistical uncertainties become large. For \(\sqrt{s}=\) 7 TeV, however, run-to-run fluctuations of around 10–15% in the low \(N_{\text {ch}}\) bins are found.

As discussed in Sect. 3.2, an accurate detector description is crucial in determining the number of particles created in interactions with detector material in order to retrieve the primary distribution. For the SPD, little material (besides the beam pipe) exists between the detector and the interaction point, whereas a significant amount of material is present between the FMD and the interaction point. An estimate of the amount of material versus \(\eta \) was done using special satellite collisions [18], which occur away from the nominal IP and thereby reduce the amount of traversed material. The result was that for \(-\,3.4<\eta <-\,1.7\) the material was underestimated in the simulations of the experiment by a maximum value of 14%. For \(1.7<\eta <5.0\), the material was estimated with \(\pm \,7\)% precision. It is possible to correct for the measurements of the first moment of the distribution (pseudorapidity density [18]). On the contrary, higher order effects are non-trivial, making it impossible to directly correct in this case for the amount of material. Instead, the entire raw distribution was unfolded with two response matrices, one which increases the material by 14% and the other which decreases the material by 7%. The difference between the results using the two different matrices determined the maximum systematic error contribution from the material budget.

The last uncertainty was determined by varying the selection criteria used to determine which signals correspond to single particles in the FMD (described in Sect. 2.3) by 5%. This value is the maximum variation of the fit parameters to the energy distributions [17] within each of the three data taking periods.

The systematic uncertainties from the three sources described along with the one from the efficiency correction are summed in quadrature (see Table 3). The methods to determine the systematic uncertainties from material budget give the largest possible variations. To convert the variations to a root-mean-square value, they have been divided by \(\sqrt{3}\) considering that the variation is flat and was taken from the mean bin-by-bin value as the reference, not the full spread. Three particular multiplicities are considered when reporting the systematic uncertainties in Table 3: the 0-bin, the mean value \(\langle m\rangle \), and the value in which the probability is very low (\(\text {P}(N_{\text {ch}})=10^{-4}\)). For \(\text {P}(N_{\text {ch}})<10^{-4}\) the uncertainties grow rapidly and the level of the systematic uncertainty depends strongly on the multiplicity. The \(\text {P}(N_{\text {ch}})\) region with lowest uncertainty is near the mean of the distribution.

Table 3 Total systematic uncertainties (in percent), for the INEL, NSD and INEL> 0 event classes. Numbers are given at multiplicity values of 0, the mean \(\langle m\rangle \), and when \(\text {P}(N_{\text {ch}})=10^{-4}\)

5 Results

The multiplicity distributions have been measured for the three event classes (INEL >0, INEL, and NSD) for pp collisions at \(\sqrt{s}=\) 0.9, 7, and 8 TeV. To extract the relative contributions from hard and soft processes, the distributions are fitted with double Negative Binomial Distributions. The results are also compared with the LHC measurements done by CMS and with distributions obtained from models including the IP-Glasma, which is based on the Color Glass Condensate (CGC) [26].

Fig. 3
figure 3

Charged-particle multiplicity distributions for NSD (top left), INEL (top right) and INEL >0 (bottom) pp collisions at \(\sqrt{s}=0.9\) TeV. The lines show fits to double NBDs. Ratios of the data to the fits are also presented. Combined systematic and statistical uncertainties are shown as bands

Fig. 4
figure 4

Charged-particle multiplicity distributions for NSD (top left), INEL (top right) and INEL >0 (bottom) pp collisions at \(\sqrt{s}=7\) TeV. The lines show fits to double NBDs. Ratios of the data to the fits are also presented. Combined systematic and statistical uncertainties are shown as bands

Fig. 5
figure 5

Charged-particle multiplicity distributions for NSD (top left), INEL (top right) and INEL >0 (bottom) pp collisions at \(\sqrt{s}=8\) TeV. The lines show fits to double NBDs. Ratios of the data to the fits are also presented. Combined systematic and statistical uncertainties are shown as bands

5.1 Multiplicity distributions

In Figs. 3,  4 and  5, the obtained multiplicity distributions for 0.9, 7, and 8 TeV with NSD, INEL and INEL >0 triggers are shown for five pseudorapidity ranges, \(\vert \eta \vert <2.0\), \(\vert \eta \vert <2.4\), \(\vert \eta \vert <3.0\), \(\vert \eta \vert <3.4\) and \(-3.4<\eta <+5.0\). The colored bands represent the combined systematic and statistical uncertainties. The distributions are scaled by powers of 10 for clarity. The lines show fits to double NBDs, explained in Sect. 5.2. Extending the pseudorapidity coverage with respect to the previous ALICE publications [2,3,4] allows to extend the high-multiplicity reach.

In Fig. 6 (left panel), the comparison to published CMS data [5] is shown for the NSD event class at \(\sqrt{s}=0.9\) TeV in three \(\eta \) ranges. Discrepancies are observed in the comparison with CMS, both in the first bins and, especially, in the tails. Differences in the first bins can be attributed presumably to the different models used to describe the diffraction masses. In particular, CMS is not using any diffraction tuned simulation. Moreover, in this paper, single-diffractive events include a cut on diffractive mass [24], which is different from what CMS has used. In the tails, the CMS data are systematically lower, due to normalization. This behavior was also observed when comparing CMS distributions to those obtained in a narrower pseudorapidity range where only SPD tracklets are used [4] as shown in the distribution for \(\vert \eta \vert <1.0\) in Fig. 6.

Fig. 6
figure 6

Comparison of the multiplicity distributions for NSD pp collisions at \(\sqrt{s}=0.9\) TeV (left) and 7 TeV (right) with CMS [5] measurements in the same pseudorapidity ranges and previous ALICE measurements [4]. Combined systematic and statistical uncertainties are shown as bands

In the right panel of Fig. 6, the comparison to CMS at 7 TeV [5] is shown. Good agreement with CMS is observed except in the very first bins presumably due to the different treatment of diffraction masses. The measurement reported in this manuscript, performed with the SPD clusters, agrees with the analysis performed on SPD tracklets [4] within systematic uncertainties.

Fig. 7
figure 7

Comparison of multiplicity distributions for INEL events to PYTHIA 6 Perugia 0, PYTHIA 8 Monash, PHOJET and EPOS LHC at 0.9 (left) and 7 TeV (right). Combined systematic and statistical uncertainties are shown as bands

In Fig. 7, comparisons with distributions obtained with the PYTHIA 6 Perugia 0 tune [8], PHOJET [9], PYTHIA 8 Monash tune [10] and EPOS LHC [11] models are shown for INEL events at 0.9 TeV (left plot) and 7 TeV (right plot). At 0.9 TeV, PHOJET, PYTHIA 8 Monash tune and EPOS LHC cannot reproduce the tails, and the lowest values of the multiplicity distributions, while PYTHIA 6 Perugia 0 tune does not reproduce the data at all. At 7 TeV, both PHOJET and PYTHIA 6 strongly underestimate the tails of the multiplicity distributions. PYTHIA 8, with the Monash tune that uses LHC data, reproduces the tails for the wider pseudorapidity range, but shows an enhancement in the peak region. EPOS LHC models the distributions well, both in the first bins, which are dominated by diffractive events, and in the tails.

The evolution of the multiplicity distributions with the center-of-mass energy \(\sqrt{s}\) can be studied using the KNO variable \(N_{\text {ch}}/\langle N_{\text {ch}}\rangle \) [27]. KNO scaling violation is observed if the tails of the distributions increase with increasing energy. The violation increases when going to larger pseudorapidity ranges. This behavior was already observed at central rapidities [4], and, therefore, it is not investigated any further.

The multiplicity distributions at 7 TeV are compared to those from the IP-Glasma model [12]. This model is based on the Color Glass Condensate (CGC) [26]. It has been shown that particle multiplicities are generated following an NBD within the CGC framework [28]. Moreover, the multiplicity distribution generated by the decay of the Glasma flux tubes [29] is a NBD with parameter k (see following Section) \(\propto Q_{\text {s}}^{2}S_{\perp }\), in which \(Q_{\text {s}}\) is the gluon saturation scale and \(S_{\perp }\) is the transverse overlap area of the collision [12]. The CGC based IP-Glasma model, therefore, has a built-in source of multiplicity fluctuations. In Fig. 8, the distribution for \(\vert \eta \vert <2.0\) is shown together with the IP-Glasma model distributions as a function of the KNO variable \(N_{\text {ch}}/\langle N_{\text {ch}}\rangle \). The IP-Glasma distribution, shown in green stars, generated with a fixed ratio between \(Q_{s}\) and density of color charge, thus introducing no fluctuations. The blue squares distribution is generated with fluctuations of the color charge density around the mean value following a Gaussian distribution with a width of \(\sigma =0.09\). The black diamonds distribution includes an additional source of fluctuations, dominantly of non-perturbative origin, from stochastic splitting of color dipoles that is not accounted for in the conventional frameworks of CGC [13]. In the IP-Glasma model shown, the evolution of color charges in the rapidity direction still needs to be implemented and in the present model the low multiplicity bins are not reproduced for the wide pseudorapidity range presented.

Fig. 8
figure 8

Charged-particle multiplicity distributions for pp collisions at \(\sqrt{s}=7\) TeV compared to distributions from the IP-Glasma model with the ratio between \(Q_{s}\) and the color charge density either fixed (green stars), allowed to fluctuate with a Gaussian (blue squares) [12] or with additional fluctuations of proton saturation scale (black diamonds) [13]

5.2 Double NBD fits

Already at 0.9 TeV for \(\vert \eta \vert <1.3\) in NSD events, ALICE observed that the distributions are not well described by a single NBD parameterization [2]. For this reason, a parameterization with the sum of two NBDs has been performed for the NSD, INEL and INEL \(>0\) event samples. The fits are plotted together with the results in Figs. 3, 4, 5, and parameters are given in Table 4 for the NSD event sample, in Table 5 for the INEL event sample and in Table 6 for the INEL >0 event sample.

Table 4 Double NBD fit parameters for multiplicity distributions, NSD events. The central values of the parameters are printed in bold with their relative uncertainty, while the top row shows the upper bound fit parameters, and the bottom row the lower bound ones
Table 5 Double NBD fit parameters for multiplicity distributions, INEL events. The central values of the parameters are printed in bold with their relative uncertainty, while the top row shows the upper bound fit parameters, and the bottom row the lower bound ones
Table 6 Double NBD fit parameters for multiplicity distributions, INEL >0 events. The central values of the parameters are printed in bold with their relative uncertainty, while the top row shows the upper bound fit parameters, and the bottom row the lower bound ones

The distributions have been fitted using the function

$$ \text {P}(n)=\lambda \left[ \alpha \text {P}_{\text {NBD}}(n,\langle n\rangle _{\text {1}},k_{\text {1}})+(1-\alpha )\text {P}_{\text {NBD}}(n,\langle n\rangle _{\text {2}},k_{\text {2}})\right] , $$
(5)

where

$$ \text {P}_{\text {NBD}}(n,\langle n\rangle ,k)=\frac{\Gamma (n+k)}{\Gamma (k)\Gamma (n+1)}\bigg (\frac{\langle n\rangle }{k+\langle n\rangle }\bigg )^{n}\bigg (\frac{k}{k+\langle n\rangle }\bigg )^{k}. $$
(6)

NBDs do not describe the bin with \(N_{ch}=0\) and the first bins for the wider pseudorapidities, in which there is a rise in the number of events measured due to diffractive events (bins removed from the fit). To account for this a normalization factor \(\lambda \) is introduced [4]. The systematic uncertainty for the efficiency correction produces a fully correlated shift of the distribution, and therefore is omitted in the fitted data sample. The other sources of uncertainty are kept in the data. From fitting with the above uncertainties included, one obtains the parameters in bold. The parameters of two extreme cases are printed as well in Tables 4, 5 and 6 to allow the reader to have an estimate on how much the fit parameters change due to the remaining correlations present. The upper and lower case correspond to fitting the distribution obtained adding and subtracting material in the detector. The behavior of the fit parameters is consistent with what was observed by CMS [5, 30]. The average multiplicity of the soft (first) component, \(\langle n\rangle _{\text {1}}\), increases with increasing energy and pseudorapidity range. The parameter \(k_{\text {1}}\) increases with increasing pseudorapidity range and decreases with center-of-mass energy. For the semi-hard (second) component, the \(\langle n\rangle _{\text {2}}\) parameter behaves similarly to \(\langle n\rangle _{\text {1}}\). Moreover, it is noted that \(\langle n\rangle _{\text {2}}\simeq 3\langle n\rangle _{\text {1}}\). The parameter \(\alpha \) reflects the fraction of the soft events. The percentage of soft events decreases with increasing pseudorapidity range and energy indicating that a higher percentage of semi-hard events is then present.

For KNO scaling to hold, the parameter k must be constant with energy. This means that KNO scaling is violated for all pseudorapidity ranges for both the soft and semi-hard components in the NSD, INEL and INEL >0 event samples. Three possible scenarios are proposed in the model by Ugoccioni and Giovannini [31] concerning KNO scaling in the semi-hard component. Parameter \(k_{\text {2}}\) can be constant, decrease linearly with increasing energy showing violation, or decrease with energy asymptotically to a constant value showing weak violation. The last scenario appears to be in agreement with the values obtained for \(k_{\text {2}}\), since the decrease from 0.9 to 7 TeV is very strong, while the values are compatible for 7 and 8 TeV. The analysis of 13 TeV data will reveal if the values will be compatible also in that case, or if it is only due to the vicinity of the 7 and 8 TeV energies.

6 Summary and conclusions

Data from the SPD and the FMD were used to access a uniquely wide pseudorapidity coverage at the LHC of more than eight units in pseudorapidity, from \(-3.4<\eta <5.0\). The charged-particle multiplicity distributions obtained from these data were presented for three pp collision energies, \(\sqrt{s}=0.9, 7\), and 8 TeV, and for three different event classes, INEL, INEL \(>0\), and NSD. The results shown extend the pseudorapidity coverage of the earlier results published by ALICE and CMS, and, consequently, the high-multiplicity reach. The extension of pseudorapidity coverage has higher systematic uncertainties due to the unknown fraction of the material budget in front of the FMD, estimated to be up to 14\(\%\).

The multiplicity distributions for 7 TeV collisions measured for NSD events are in agreement with those from CMS. For 0.9 TeV, the results shown are systematically lower in the low-multiplicity bins when compared to the CMS measurement. This is most probably due to a different estimation of the diffractive masses distribution used to tune the simulations for the efficiency correction between CMS and ALICE. At 0.9 TeV, PYTHIA, PHOJET and EPOS LHC cannot reproduce the multiplicity distributions, although PHOJET is closer to the results being tuned using LHC data. At 7 TeV, PYTHIA 6 and PHOJET strongly underestimate the fraction of high multiplicity events. PYTHIA 8 slightly underestimates the tails of the distributions, while EPOS LHC reproduces both the low and the high multiplicity events, showing better capabilities to model diffraction. The Color Glass Condensate based IP-Glasma model produces distributions, which underestimate the fraction of high multiplicity events, but introducing fluctuations in the saturation momentum allows to reproduce the measurements better. Double Negative Binomial distributions composed of a soft and hard component are fitted to the measured distributions. The fraction of soft events decreases with increasing pseudorapidity range and with the increasing collision energy. KNO scaling is violated for the three considered event classes, at all the collision energies probed. The 13 TeV data analysis will help delving into the description of the components of the multiplicity distributions.