Study of Monte Carlo event generators for proton-proton collisions at LHC energies in the forward region

In this paper we present a comparative study of the PYTHIA, EPOS, QGSJET and SIBYLL generators. The global event observables considered are the charged energy flow, charged-particle distributions, charged-hadron production ratios and $V^{0}$ ratios. The study is performed in the LHCb and TOTEM fiducial phase-spaces on minimum bias simulated data samples for \emph{pp} collisions at $\sqrt{s} = 7$ TeV, using reference measurements from the aforementioned experiments. In the majority of cases, the measurements are within a band defined by the most extreme predictions. The observed differences between the predictions and the measurements appear to be caused, for the most part, by extrapolation from the central pseudorapidity region ($|\eta| \leq$ 2.5), in which the generators were mainly tuned.


Introduction
One of the most important sources of information concerning elementary particle physics is the study of high energy cosmic rays. Until the advent of powerful particle accelerators in the 1950s, cosmic rays were the only source of high energy particles. The cosmic ray spectrum reaches energies of the order of $10^{20}$ eV [1], whilst the most powerful collider to date, the Large Hadron Collider, reaches energies of 13 TeV in the center of mass frame, or about $10^{17}$ eV fixed target equivalent. So, there are two independent sources of information for pp collisions at the same energy scale. Combining the two helps create a better picture of the phenomena that take place in such collisions. Although the cross-section of hard interactions is considerable at these energy scales, the soft interaction part is still large. As soft processes imply non-perturbative QCD, we rely on phenomenological models and effective theories for predictions. Hadronic interaction generators have been developed for the description of the physics at the aforementioned energy scales, with an emphasis on either cosmic rays or collider physics. In recent years, cosmic ray generators have been extensively tuned to collider physics measurements, especially in the context of the newly available data from the LHC. In this paper we compare the predictions obtained with the EPOS LHC [2], QGSJETII-04 [3] and SIBYLL 2.3 [4] generators included in the CRMC package [5], and with the widely used event generator for LHC physics, PYTHIA (versions 8.186 [6] and 8.219 [7]), for pp interactions at $\sqrt{s} = 7$ TeV, with measurements from the LHCb and TOTEM experiments. The generators studied are all tuned using various observables measured at LHC experiments. Predictions obtained with PYTHIA 8.186 using the non-LHC tune 2M are also shown for reference. Throughout this paper we refer to measurements and tunes performed in the "central" and "forward" regions, defined with respect to the pseudorapidity of the particles.
The central pseudorapidity region is defined as |η| ≤ 2.5, corresponding to the ATLAS, ALICE and CMS acceptances [8][9][10], and the forward pseudorapidity region as η ≥ 2.5, corresponding to the LHCb (2 ≤ η ≤ 5) and TOTEM (3.1 ≤ |η| ≤ 6.5) acceptances [11,12].

The Monte Carlo event generators

General description
The generators used for this study are PYTHIA, a collider physics generator, and EPOS, QGSJET and SIBYLL, which are cosmic ray collision generators. They can be split into three categories according to the models on which they are based. PYTHIA is a parton-based generator that simulates parton interactions and parton showers, with hadronization treated using the Lund string fragmentation model [13,14]. A second category comprises generators based on Regge theory, such as QGSJET and SIBYLL. These models treat soft and semihard interactions as Pomeron exchanges ("soft" and "semihard" Pomerons), but also mix perturbative methods into the treatment of hard interactions [14,15]. EPOS belongs to a distinct category in which the parton-based description is mixed with aspects of Regge theory [14]. The focus of the study is on minimum bias physics measurements, and the generators used, especially the cosmic ray ones, are developed for the description of such observables. The selection of these particular generators is justified by their varied usage and basic assumptions, while at the same time they share similarities and are tuned to LHC data, as will be discussed below.
PYTHIA is one of the most widely used Monte Carlo event generators for collider physics, with an emphasis on pp interactions. It is mainly based on Leading Order (LO) QCD, having implemented LO matrix elements and usually using LO PDF sets (NLO PDF sets are also available) [7,16,17]. The main event in a pp collision (internally called the "hard process") can be represented by a plethora of processes, such as elastic and diffractive processes (described using Pomerons) [7,13,18], soft and hard QCD processes, electroweak processes, top quark production etc. The generator also implements parton showers (Initial State Radiation, ISR, and Final State Radiation, FSR) in the Leading Log (LL) approximation, with matching and merging methods between them and the hard processes [7,16]. Given that the colliding hadrons have a complex partonic structure, other partonic interactions aside from the main event are expected. These are called multiparton interactions (MPI) and are usually soft in nature, but the momentum transfer can also reach the hard interaction energy scale. PYTHIA implements a description of both types and also of the beam remnants which form after the extraction of the MPI initiator partons [7]. The hadronization mechanism is based on the Lund string fragmentation model [7].
The Parton-Based Gribov-Regge Theory is an effective field theory using concepts from QCD in which the elementary interactions between the constituent partons of nucleons/nuclei proceed via exchanges of parameterised objects called Pomerons, which have the quantum numbers of the vacuum [19,20]. In this theory the elementary collisions are treated as a sum of soft, semihard and hard contributions. If one considers a cutoff value of the momentum transfer squared of $Q_0^2 \sim 1$ GeV$^2$, below which perturbative QCD calculations can no longer be done, then the soft contribution (non-perturbative) is represented by processes with $Q^2 < Q_0^2$ and the hard contribution (perturbative) by processes with $Q^2 > Q_0^2$. The processes in which sea partons with $x \ll 1$ (Bjorken $x$) are involved are called semihard and are represented by a parton ladder with soft Pomeron ends [19].
The generator EPOS is based on the effective theory described above [2]. EPOS is an acronym for Energy conserving quantum mechanical approach, based on Partons, parton ladders, strings, Off-shell remnants, and Splitting of parton ladders [21]. In EPOS the interaction of the two beam particles is described by means of Pomeron exchanges. As discussed above, these Pomerons can be soft, semihard or hard. A soft Pomeron can be viewed from a phenomenological standpoint as two parton ladders (or cut Pomeron) connected to the remnants by two color singlets (legs) from the parton sea [22]. A cut Pomeron can be viewed as two strings which fragment to create hadrons. The flavours of the string ends need to be compensated within the remnants. Thus, particle production in EPOS comes from two sources, namely cut Pomerons and the decay of remnants [22]. Through a recent development (from EPOS 1.99 onwards), EPOS is now a core-corona model. The core represents a region with a high density of string segments that is larger than some critical density for which the hadronization is treated collectively and the corona is the region with a lower density of string segments for which the hadronization is treated non-collectively. The strings from the core region form clusters which expand collectively. This expansion has two components, namely radial and longitudinal flow. Through this core-corona approach, EPOS takes into account effects not accounted for in other HEP models [2]. In EPOS, in the case of multiple scatterings (multi-Pomeron exchanges) the energy scales of the individual scatterings are taken into account when calculating the respective cross-sections, while in other models based on the Gribov-Regge Theory this is not the case. This leads to a consistent treatment of both exclusive particle production and cross-section calculation, taking energy conservation into account in both cases [19,22]. 
The multiplicity and inelastic cross section predictions of the model are directly influenced by energy momentum sharing and beam remnant treatment [22].
The elementary scatterings in QGSJET are also treated as Pomeron exchanges [15]. QGSJET is based on the Quark-Gluon string model, which is in turn based on the Gribov-Regge model [23]. In this model the Pomeron exchange can be viewed as an exchange of a non-perturbative gluon pair. Each of the colliding protons can be considered as a system of a quark and a diquark with opposite transverse momenta. The quark from the first proton exchanges a non-perturbative gluon with the diquark from the second proton and vice versa, thus creating two quark-gluon strings which decay according to fragmentation functions to create hadrons [24]. In a similar manner to EPOS, the soft (non-perturbative) and hard (perturbative) contributions are separated by a cutoff value of $Q_0^2$. In QGSJET a Pomeron is actually a sum of two contributions: a "soft" Pomeron contribution and a "semi-hard" Pomeron contribution. The soft part represents a purely non-perturbative parton cascade, while the "semi-hard" Pomeron can be viewed as two "soft" Pomerons connected by a parton ladder [25]. At very high energies, such as those at the LHC, and/or small impact parameters, the semi-hard contribution dominates and so it is crucial to take it into account [15,23]. In these high energy collisions large numbers of parton-parton interactions occur and the resulting cascades interact with one another (Pomeron-Pomeron interactions), so their evolution is no longer independent, but correlated. QGSJET-II takes into account these non-linear effects, which are computed with enhanced Pomeron diagrams [15,23].
SIBYLL is based on the dual parton model (DPM), using the minijet model for hard interactions and the Lund string fragmentation model for hadronization [26,27]. Similarly to both EPOS and QGSJET, soft and hard interactions are separated by a transverse momentum scale cutoff value. The soft interactions are treated using the DPM, in which the nucleon is treated as consisting of a quark and a diquark. Similarly to the Quark-Gluon string model described above, a quark (diquark) from the projectile combines with the diquark (quark) from the target to form two strings, which are fragmented separately using the Lund string fragmentation model. In SIBYLL 1.7 the cutoff value was set to $p_T^{\mathrm{min}} = \sqrt{5}$ GeV, but from version 2.1 onwards it was changed to a function of the collision energy, which for $\sqrt{s} = 7$ TeV returns $p_T^{\mathrm{min}} \approx 3.87$ GeV [26].
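As an illustration of such an energy-dependent cutoff, the sketch below evaluates a parametrisation of the form $p_T^{\mathrm{min}}(s) = p_T^0 + \Lambda \exp[c \sqrt{\ln(s/\mathrm{GeV}^2)}]$. The functional form and the parameter values ($p_T^0 = 1$ GeV, $\Lambda = 0.065$ GeV, $c = 0.9$) are not given in the text above; they are assumed here to be those of SIBYLL 2.1 and should be checked against [26]. With these assumptions the value quoted for $\sqrt{s} = 7$ TeV is recovered.

```python
import math


def pt_min(sqrt_s_gev, pt0=1.0, lam=0.065, c=0.9):
    """Energy-dependent minijet cutoff in GeV.

    sqrt_s_gev: centre-of-mass energy in GeV.
    pt0, lam, c: assumed parameter values (not taken from the text above).
    """
    s = sqrt_s_gev ** 2  # Mandelstam s in GeV^2
    return pt0 + lam * math.exp(c * math.sqrt(math.log(s)))


print(round(pt_min(7000.0), 2))  # 3.87
```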

Versions used in the study
The default tune for PYTHIA 8.186 is Tune 4C, with CTEQ6L1 as the default LO PDF set [7,28]. Tune 4C (default from version 8.150 onwards [29]) is obtained starting from Tune 2C, for which Tevatron data have been used, by varying MPI and colour reconnection parameters to fit the measurements of minimum bias (MB) and underlying event (UE) observables from the ALICE and ATLAS experiments at various collision energies (0.90, 2.36 and 7 TeV). The observables used are, for example, charged multiplicity and rapidity distributions, transverse momentum distributions, mean transverse momentum as a function of charged multiplicity, transverse momentum sum densities etc. Tune 2M is obtained in a similar manner to 2C, using measurements from the CDF experiment at the Tevatron, but uses the modified PDF set MRST LO** instead of the CTEQ6L1 LO PDF set [30]. From here on, PYTHIA 8.186 with Tune 2M will be referred to as PYTHIA 8.1 2M. PYTHIA 8.219 has the Monash 2013 tune as its default (with the NNPDF2.3 QCD+QED LO PDF set) [7,29]. The Monash 2013 tune has been created for a better description of minimum bias and underlying event observables. Observables similar to those of the previous tune have been used, with measurements from the ATLAS and CMS experiments, and the charged pseudorapidity distribution from TOTEM in the forward region. The flavour-selection parameters of the string fragmentation model have been re-tuned using a combination of data from the PDG and from the LEP experiments, resulting in an overall increase of about 10% in strangeness production and a similar decrease in the production of vector mesons. The kaon yields have clearly improved with respect to CMS measurements and those of hyperons are also slightly improved. The minimum bias charged multiplicity has also increased by about 10% in the forward region [31].
EPOS LHC's fundamental parameters are tuned to cross-section measurements from the TOTEM experiment at $\sqrt{s} = 7$ TeV, leading to a highly improved description of charged multiplicity (compared to EPOS 1.99). In EPOS LHC the radial flow calculations are corrected. This correction affects the high multiplicity region, again leading to a highly improved description of this observable in this particular region. In EPOS 1.99 the baryon-antibaryon pair and strangeness production were largely overestimated in high energy collisions. This issue was corrected in EPOS LHC and, by using the same string fragmentation parameters as for $e^+e^-$ collisions, kaon/pion and proton/pion ratio measurements from CMS at $\sqrt{s} = 7$ TeV are reasonably well described [2]. The statistical particle production mechanism from the core affects strangeness production by removing its suppression. This leads to a good description of strange baryon yield measurements from CMS at $\sqrt{s} = 7$ TeV, as shown in Figure 10 from [2]. The radial flow parameters are tuned using charged-particle transverse momentum distributions (for minimum bias pp collisions) obtained at the ATLAS experiment at $\sqrt{s} = 0.9$ and 7 TeV. This leads to a very good agreement with experimental transverse momentum distributions of identified particles [2].
QGSJETII-04 distinguishes itself from the previous version, QGSJETII-03, by taking into account all significant enhanced Pomeron diagram contributions, including Pomeron loops, and by its tuning to new LHC data [32]. As QGSJET is used for high energy cosmic ray studies, the current version of the generator has been tuned to LHC measurements of observables to which the extensive air shower (EAS) muon content is sensitive. Examples of such observables are charged particle multiplicities and densities, anti-proton and strange particle yields etc. QGSJETII-03 predicts a steeper increase in multiplicity in pseudorapidity plots from $\sqrt{s} = 0.9$ to 7 TeV than what is observed in ATLAS measurements at these collision energies. As a consequence, the $Q_0^2$ separation scale between soft and hard interactions has been increased from 2.5 GeV$^2$ to 3.0 GeV$^2$. For a better description of ALICE measurements of the antiproton transverse momentum spectrum at $\sqrt{s} = 0.9$ TeV, the anti-nucleon yield was slightly reduced and the hadronization parameters were modified so as to increase the average transverse momentum of the antinucleons. The strangeness production has been enhanced to better describe the $K^0_S$ and Λ rapidity distributions measured at CMS for $\sqrt{s} = 0.9$ TeV and 7 TeV pp collisions. Another major tuning is done using inelastic cross section measurements at $\sqrt{s} = 7$ TeV from the TOTEM experiment [33].
SIBYLL is a relatively simple model and emphasis is put on describing observables on which the evolution of extensive air showers depends, like energy flow and particle production in the forward region [34]. In SIBYLL 2.3 soft gluons can also be exchanged between sea quarks or between sea and valence quarks. A new feature in version 2.3 is the beam remnant treatment, which is similar to that of QGSJET. This new treatment allows the particle production in the forward region to be tuned without modifying the string fragmentation parameters. A major tuning procedure has been done for the description of leading particle measurements from the NA22 and NA49 experiments [4]. SIBYLL 2.3 has also been tuned using measurements from $\sqrt{s} = 7$ TeV pp collisions at LHC experiments, namely the inelastic cross section from TOTEM, and the average antiproton multiplicities and charged particle differential cross sections as a function of transverse momentum obtained at CMS. The SIBYLL 2.1 version was tuned using Tevatron data and it describes, for example, charged pseudorapidity density measurements reasonably well, even the ones from CMS at $\sqrt{s} = 7$ TeV, as one can see in Figure 4 from [35]. At the same time SIBYLL 2.1 overestimates the inelastic cross section measurements at high collision energies (beyond 1 TeV), leading to the tuning of version 2.3 with the $\sigma^{\mathrm{inel}}_{pp}$ measurements at $\sqrt{s} = 7$ TeV from TOTEM. The antiproton multiplicities measured in fixed target experiments at low collision energies seem to be reasonably well described by version 2.1, but the measurements obtained at the CMS experiment for various collision energies are largely underestimated. To correct this effect in SIBYLL 2.3, a different value of the quark/diquark production probability, $P_{q/qq}$, has been assigned for the fragmentation of minijets than for all the other fragmentation processes. The value of $P_{q/qq}$ in SIBYLL 2.1 was fixed to 0.04 for all processes.
SIBYLL 2.3 uses the same effective parton density function as the previous version, but the quark and gluon contributions are obtained from the same parametrizations used to calculate the minijet cross section. This leads to a steeper parton distribution function at low Bjorken $x$ which, combined with the correction of the definition of $p_T^{\mathrm{min}}$, leads in turn to a better description of the measurements of charged particle cross sections as a function of transverse momentum obtained at CMS in the 2 ≤ $p_T$ ≤ 5 GeV/c range. Also, a charm hadron production model was implemented in version 2.3 [35].

Data generation and analysis strategy
Samples of $10^6$ inelastic minimum bias pp events at $\sqrt{s} = 7$ TeV were generated for each generator. For all generators a stable particle definition of cτ ≥ 3 m was used, where τ is the mean proper lifetime of the particle species.
This study treats four distinct aspects: charged energy flow, charged-particle distributions, charged-hadron production ratios and $V^0$ ratios.
Charged energy flow is computed as the total energy of stable charged particles ($p$, $\bar{p}$, $K^\pm$, $\pi^\pm$, $\mu^\pm$ and $e^\pm$) in the interval 1.9 ≤ η ≤ 4.9 (10 bins of ∆η = 0.3), divided by the width of the pseudorapidity bin and normalised to the number of visible inelastic pp interactions $N_{\mathrm{int}}$, or:
$$\frac{1}{N_{\mathrm{int}}}\frac{dE}{d\eta} = \frac{1}{\Delta\eta}\left(\frac{1}{N_{\mathrm{int}}}\sum_{i=1}^{N_{\mathrm{part},\eta}} E_{i,\eta}\right),$$
where $N_{\mathrm{part},\eta}$ is the number of stable charged particles (as defined above) in a ∆η = 0.3 bin and $E_{i,\eta}$ is the energy of the particles from the respective bin (see [36]). There are four event classes considered for the charged energy flow: inclusive minimum bias events, hard scattering events, diffractive enriched events and non-diffractive enriched events. The inclusive minimum bias events are required to have at least one charged particle in the range 1.9 ≤ η ≤ 4.9. The hard scattering events require at least one charged particle with $p_T$ ≥ 3 GeV/c in the aforementioned range. Diffractive enriched events require that no particles are generated in the pseudorapidity range −3.5 < η < −1.5, and non-diffractive enriched events require at least one particle in this range. These event class definitions are compatible with the ones from [36], from which the LHCb reference measurements were taken.
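The definitions above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the analysis code used for this study: the event format (a list of `(eta, energy, pt, charge)` tuples of stable particles) and the function names are hypothetical, and the diffractive and non-diffractive enriched classes are assumed to be subsets of the inclusive minimum bias class.

```python
ETA_MIN, ETA_MAX, N_BINS = 1.9, 4.9, 10
D_ETA = (ETA_MAX - ETA_MIN) / N_BINS  # bin width, 0.3


def event_classes(event):
    """Return the set of event classes an event belongs to.

    event: list of (eta, energy, pt, charge) tuples (hypothetical format),
    with energies in GeV and pt in GeV/c.
    """
    charged = [p for p in event if p[3] != 0 and ETA_MIN <= p[0] <= ETA_MAX]
    if not charged:
        return set()  # not a visible event
    classes = {"inclusive"}
    if any(p[2] >= 3.0 for p in charged):
        classes.add("hard")
    # Any particle in the backward window -3.5 < eta < -1.5
    backward = any(-3.5 < p[0] < -1.5 for p in event)
    classes.add("non-diffractive enriched" if backward else "diffractive enriched")
    return classes


def charged_energy_flow(events, event_class="inclusive"):
    """Energy flow (1/N_int) dE/deta per eta bin for one event class."""
    sums = [0.0] * N_BINS
    n_int = 0
    for event in events:
        if event_class not in event_classes(event):
            continue
        n_int += 1
        for eta, energy, pt, charge in event:
            if charge == 0 or not (ETA_MIN <= eta <= ETA_MAX):
                continue
            # Clamp so that eta == ETA_MAX falls in the last bin
            idx = min(int((eta - ETA_MIN) / D_ETA), N_BINS - 1)
            sums[idx] += energy
    if n_int == 0:
        return sums
    return [s / (D_ETA * n_int) for s in sums]
```

For example, calling `charged_energy_flow(events, "hard")` on a list of generated events returns the ten bin values for the hard scattering class, normalised to the number of events passing that selection.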
The purity of the diffractive enriched and non-diffractive enriched event samples has been studied for both versions of PYTHIA (as the generator has readily accessible event type information) and is about 94% and 92%, respectively. In Figure 1, the transverse momentum scale distributions of the hardest parton collision from hard and soft (non-hard and non-diffractive) events, obtained with PYTHIA 8.186, are shown. As can be seen, the peaks are reasonably well separated, with µ ≈ 8.7 GeV/c, σ ≈ 4.5 GeV/c for hard events and µ ≈ 4.2 GeV/c, σ ≈ 3.2 GeV/c for soft events. The fraction of events that pass both the hard and diffractive enriched event class conditions is negligible.
The values of the number of visible events for the different event classes are given in Table 1.
The transverse momentum, pseudorapidity and multiplicity distributions of charged stable particles (p, π, K, e, µ) are presented in Figures 3-6. The distributions were scaled with the number of visible events from the sample. The visible events are required to contain a minimum of one charged particle satisfying the criteria listed below:
• Figure 3: 2 < η < 4.8, p ≥ 2 GeV/c and $p_T$ > 0.2 GeV/c [37].
The numbers of minimum bias and hard events with a minimum of one charged particle in the range 2 < η < 4.5 are given in Table 2.
For all of the distributions mentioned above, pull plots of $(x_{\mathrm{gen}} - x_{\mathrm{exp}})/\sigma_{\mathrm{exp}}$ have been drawn.
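A minimal sketch of the per-bin pull computation is given below. The function name and the list-based inputs are hypothetical; bins with a vanishing experimental uncertainty are assumed to be undefined and are returned as NaN.

```python
def pulls(x_gen, x_exp, sigma_exp):
    """Per-bin pull (x_gen - x_exp) / sigma_exp.

    x_gen: generator predictions, x_exp: measured values,
    sigma_exp: total experimental uncertainties (same binning).
    """
    return [
        (g - e) / s if s > 0 else float("nan")
        for g, e, s in zip(x_gen, x_exp, sigma_exp)
    ]


# A prediction 0.2 above and 0.2 below a measurement of 1.0 +- 0.1
# gives pulls of about +2 and -2.
print(pulls([1.2, 0.8], [1.0, 1.0], [0.1, 0.1]))
```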
A particle is defined as prompt if the sum of its ancestors' mean proper lifetimes is less than 10 ps, as in [37][38][39].
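The prompt criterion can be illustrated as follows. The function name and the input format (a list with the mean proper lifetime of each ancestor, in ps) are hypothetical, and the lifetimes used in the example are approximate values quoted only for illustration.

```python
PROMPT_MAX_PS = 10.0  # threshold on the summed ancestor lifetimes, in ps


def is_prompt(ancestor_lifetimes_ps):
    """True if the summed mean proper lifetimes of all ancestors is below 10 ps."""
    return sum(ancestor_lifetimes_ps) < PROMPT_MAX_PS


# A pion from a rho0 decay (lifetime of order 1e-12 ps) counts as prompt,
# while a pion from a K0_S decay (about 89.5 ps) does not.
print(is_prompt([4.5e-12]))  # True
print(is_prompt([89.5]))     # False
```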
The prompt charged-hadron production ratios as a function of pseudorapidity are shown in Figures 9-11 and are the following: $\bar{p}/p$, $\pi^-/\pi^+$, $K^-/K^+$, $(K^+ + K^-)/(\pi^+ + \pi^-)$, $(p + \bar{p})/(K^+ + K^-)$ and $(p + \bar{p})/(\pi^+ + \pi^-)$. These ratios are computed in the phase-space defined by 2.5 ≤ η ≤ 4.5 and p ≥ 5 GeV/c, in three transverse momentum intervals, namely $p_T$ < 0.8 GeV/c, 0.8 ≤ $p_T$ < 1.2 GeV/c and $p_T$ ≥ 1.2 GeV/c [40]. The prompt $V^0$ particle ratios $\bar{\Lambda}/\Lambda$ and $\bar{\Lambda}/K^0_S$ as a function of rapidity are shown in Figure 12. The ratios are computed in the phase-space defined by 2 ≤ y ≤ 4.5 and three $p_T$ intervals: 0.15 < $p_T$ < 0.65 GeV/c, 0.65 < $p_T$ < 1.00 GeV/c and 1.00 < $p_T$ < 2.50 GeV/c. Figures 13-14 show the prompt $V^0$ particle ratios as a function of rapidity and as a function of transverse momentum in the 2 ≤ y ≤ 4.5 rapidity interval and the full $p_T$ interval 0.15 < $p_T$ < 2.50 GeV/c [41].
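The binned charged-hadron ratios can be sketched as below. The particle format (`(pdg_id, eta, p, pt)` tuples of particles already selected as prompt), the function name and the choice of pseudorapidity bin edges are assumptions made for illustration; for simplicity the upper pseudorapidity edge is treated as exclusive. Species are identified by their PDG Monte Carlo codes (2212 for the proton, 211 for $\pi^+$, 321 for $K^+$, with negative values for the antiparticles).

```python
from bisect import bisect_right

P_MIN = 5.0            # minimum momentum in GeV/c
PT_EDGES = [0.8, 1.2]  # boundaries of the three pT intervals in GeV/c


def hadron_ratio(particles, num_ids, den_ids, eta_edges):
    """Ratio of particle counts per (pT interval, eta bin).

    particles: iterable of (pdg_id, eta, p, pt) for prompt particles.
    num_ids, den_ids: sets of PDG codes for numerator and denominator.
    Returns a 3 x n_eta nested list; None marks an empty denominator.
    """
    n_eta = len(eta_edges) - 1
    num = [[0] * n_eta for _ in range(3)]
    den = [[0] * n_eta for _ in range(3)]
    for pdg_id, eta, p, pt in particles:
        if p < P_MIN or not (eta_edges[0] <= eta < eta_edges[-1]):
            continue
        i_pt = bisect_right(PT_EDGES, pt)          # 0, 1 or 2
        i_eta = bisect_right(eta_edges, eta) - 1   # eta bin index
        if pdg_id in num_ids:
            num[i_pt][i_eta] += 1
        if pdg_id in den_ids:
            den[i_pt][i_eta] += 1
    return [
        [n / d if d else None for n, d in zip(n_row, d_row)]
        for n_row, d_row in zip(num, den)
    ]
```

For instance, the antiproton-to-proton ratio would be obtained with `hadron_ratio(particles, {-2212}, {2212}, [2.5, 3.0, 3.5, 4.0, 4.5])`.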
The statistical uncertainties of the MC predictions are negligible, reaching a maximum value of about 3 % in the least populated bins at the edges of the considered phase-space regions, while for the rest of the bins the uncertainties are of the order of 0.1 %.
The sources of the reference measurements used in the plots are given at the end of the captions.

Results and discussion
The charged energy flow for different event classes is presented in Figure 2. In Figures 1 and 2 from [36] one can find the predictions for older pre-LHC tuned versions of the generators used in this study.
The predictions of PYTHIA 6's versions [36] seem to be reasonably good in the central region (with the exception of diffractive events), but largely underestimate the measured values in the forward region in all cases. PYTHIA 8.135's predictions have a good description for the inclusive minimum bias, diffractive enriched and nondiffractive enriched event classes, but overestimate the measured values for the hard events. PYTHIA 8.1 2M exhibits a slight decrease in overall values relative to version 8.135 (which uses the older Tune 1 [29]) for the minimum bias, non-diffractive enriched and hard event classes. The description for the hard event class is improved, while for the other two event classes an underestimation trend is now observed. There is no major difference between the two versions for the diffractive event class.
With the exception of SIBYLL, a generator tuned to reproduce energy flow measurements, PYTHIA 8.186 seems to have the best description overall of the LHC-tuned generators. Its predictions for the diffractive enriched class are very similar to those of version 8.135, but for the rest of the event classes the predictions are further away from the measurements, exhibiting a constant overestimation trend. PYTHIA 8.219 has a good description of the charged energy flow for the diffractive enriched class, being similar to that of version 8.186. One can see that the predictions tend to have an increased overestimation in the forward region, but are similar to the ones of version 8.186 in the central region. This may be related to the increase of about 10% in the minimum bias charged multiplicity in the forward region implemented through the Monash 2013 tune [31]. EPOS 1.99's predictions [36] describe reasonably well the charged energy flow for the inclusive minimum bias, hard and non-diffractive enriched event classes, slightly overestimating the measurements in the last two bins, and underestimate the charged energy flow for diffractive processes in the forward region.
EPOS LHC's predictions are very similar to the ones of PYTHIA 8.219 for all event classes except the diffractive enriched class, where, similarly to the previous version, it underestimates the charged energy flow. As one can see in the restricted minimum bias plot, the apparent overestimation of the soft process component is similar to the one of PYTHIA 8.219. Compared to the previous version, we observe a worsening of the predictions (except for diffractive events). EPOS LHC shows an overall overestimation of the measurements with an increasing trend towards the forward pseudorapidity region.
The predictions of QGSJET01 and QGSJETII-03 from [36] are similar for the inclusive minimum bias class and they overestimate the charged energy flow. QGSJET01 has a better description of the diffractive and hard event classes in the central region, but tends to overestimate the measurements for the hard events and underestimate them for the diffractive ones in the forward region. The general trend of QGSJETII-03 is one of underestimation for the hard events. The prediction of QGSJETII-04 is similar to that of the previous versions for the inclusive minimum bias event class. The charged energy flow for hard events is underestimated more than in the case of QGSJETII-03. The diffractive component's description is similar to that of QGSJETII-03, but with a slightly larger underestimation trend. For the rest of the event classes the differences with respect to the measured LHCb charged energy flow are significant. Although the absolute values are clearly far from the experimental values, the shapes are well described. QGSJETII-04 is very similar to EPOS LHC and PYTHIA 8.219 in its description of the charged energy flow for the inclusive minimum bias and non-diffractive enriched event classes.
SIBYLL 2.1's prediction [36] describes the measurements for inclusive minimum bias events very well. It also has a reasonably good description of the diffractive events, the values being within the error bars, although an underestimation trend can be seen. The hard events component is well described in the central region, but it is overestimated in the forward region. SIBYLL 2.3 seems to have the best prediction for all event classes (on par with PYTHIA 8.186 for the diffractive enriched class). It can be seen that it has a slight underestimation trend in the forward region in the case of the inclusive minimum bias and non-diffractive enriched event classes.
As one can see in Table 1, PYTHIA 8.219 and EPOS have similar ratios of hard events, but the number of visible events and the ratio of diffractive events are smaller for EPOS. PYTHIA 8.186's ratio of hard events is larger than that of version 8.219, but the ratios of diffractive events are close, indicating that the mechanisms of diffractive processes are similar. QGSJETII-04's ratio of hard events is noticeably larger than the rest and the ratio of diffractive events is smaller, so the hard process component seems to be larger for this generator. Likewise, SIBYLL's hard process component is larger than those of PYTHIA and EPOS.
As one can see in the transverse momentum plot from Figure 3, the PYTHIA and QGSJET predictions are similar in shape. There is no major difference between PYTHIA's LHC-tuned versions. The PYTHIA and EPOS predictions are rather similar in the interval 0.5-1.5 GeV/c. QGSJET's prediction seems closest to the LHCb measurements, but for all generators there are visible differences in absolute scale, especially in the hard part of the spectrum. The SIBYLL-generated spectrum has a shape which approaches the experimental one, but the absolute values differ significantly. The shapes of the spectra generated with QGSJET, EPOS and both versions of PYTHIA are close to the experimental one.
In the pseudorapidity plot from Figure 3 one can see that all the predictions cluster together at low values, as the models were tuned using measurements from central LHC experiments. QGSJET, EPOS and PYTHIA 8.2 underestimate the measurements for values below η = 3.5 and overestimate them in the forward region (where they also remain clustered together). PYTHIA 8.1 also underestimates the measurements in the central region, but the prediction in the forward region seems to be reasonably good. SIBYLL largely underestimates the measurements across the whole range.
For the (probability density of the) multiplicity distribution from Figure 3, the closest prediction seems to be that of EPOS. All LHC-tuned generators reproduce the measurements well for this distribution, except SIBYLL, which deviates significantly. One can see that EPOS's prediction clusters together with the PYTHIA estimates in the medium-high multiplicity region. For values below $n_{\mathrm{ch}}$ = 10 EPOS seems to be better than PYTHIA. QGSJET's prediction is close to the ones of EPOS and PYTHIA, but the underestimation at low multiplicities in the interval $n_{\mathrm{ch}}$ = 10-20 is larger, with deviations from the measurements of about 3-5σ. SIBYLL's prediction very strongly favours low multiplicities, but gets closer to the measured values towards high multiplicities.
The pseudorapidity distribution from Figure 4 is best described by PYTHIA 8.186. PYTHIA 8.219's prediction is close, too. The EPOS and QGSJET estimates are a bit further away from the experimental values. SIBYLL's prediction is significantly different both in absolute value and in the shape of the distribution. With the exception of SIBYLL, the clustering of the predictions can be seen in the central pseudorapidity region, indicating that the tuning was done using similar measurements. The prediction of EPOS describes the measurements reasonably well in the central region (2 < η < 2.5), but it diverges upwards from the measured values in the forward region. This effect of overestimation in the forward region is similar to the one seen in Figure 3. QGSJET slightly underestimates the measurements in the central region, but gets closer in the forward region (overlapping with PYTHIA 8.219).
The multiplicity distribution is not perfectly described by any of the generators, but one can see that the predictions of EPOS and PYTHIA seem to get better at higher multiplicities, as we have also seen for the previous multiplicity distribution. The distributions generated with SIBYLL and QGSJET are significantly different from the experimental ones.
The pseudorapidity plot from Figure 5 shows a good agreement between the PYTHIA versions and the LHCb measurements. EPOS also has a good description of the measurements in the central region, but diverges upwards in the forward region. SIBYLL's prediction is similar to the one of QGSJET at low rapidity, but they diverge in the forward region and are both far from the experimental distribution. The discontinuity at η = 2.5 is due to the hard event selection criterion of a minimum of one particle with 2.5 ≤ η ≤ 4.5 and $p_T$ ≥ 1 GeV/c [38].
As in Figure 4, the multiplicity distribution is not well described by the generators with PYTHIA and EPOS being closest to the measurements.
As can be seen in Figure 6, the best predictions are the ones of QGSJET, EPOS and PYTHIA 8.219. All the generated shapes and spectrum slopes agree well with those of the experimental distribution.
In the pseudorapidity plots from Figures 4-6 it can be seen that the predictions of PYTHIA 8.1 2M largely underestimate the measurements. The differences between the predictions of PYTHIA with Tune 2M and the two LHC tunes are large in the central region and exhibit a converging trend towards higher pseudorapidity. The multiplicity distributions from Figures 3-6 are clearly not well reproduced by PYTHIA 8.1 2M, whose prediction favours very low multiplicities.

Table 2. Number of events with a minimum of $n_{\mathrm{ch}} \geq 1$ in 2 < η < 4.5, expressed as percentages of the total number of generated inelastic events, $N_{\mathrm{gen}} = 10^{6}$. Hard events require a minimum of one charged particle with $p_{T} \geq 1$ GeV/c in 2.5 < η < 4.5.

The ratios of hard events for PYTHIA and EPOS, given in Table 2, are close, suggesting a similarity between the descriptions of hard processes. SIBYLL's ratio is slightly higher than those of the previous generators. QGSJET's ratio of hard events is considerably higher than the ratios of the other generators, so again one can see that it favours the hard processes.
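The event classes quoted in the caption of Table 2 can be sketched as follows. This is a minimal illustration that applies only the two cuts stated in the caption; any further selection used in the study (for example on prompt particles or momentum) is not modelled. The `Particle` class, the function names and the toy events are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Particle:
    charge: int   # electric charge in units of e
    eta: float    # pseudorapidity
    pt: float     # transverse momentum in GeV/c


def is_inclusive(event):
    """At least one charged particle with 2 < eta < 4.5."""
    return any(p.charge != 0 and 2.0 < p.eta < 4.5 for p in event)


def is_hard(event):
    """At least one charged particle with pT >= 1 GeV/c and 2.5 < eta < 4.5."""
    return any(p.charge != 0 and 2.5 < p.eta < 4.5 and p.pt >= 1.0
               for p in event)


def class_percentages(events):
    """Return (inclusive %, hard %) relative to all generated events."""
    n_gen = len(events)
    if n_gen == 0:
        raise ValueError("no generated events")
    n_incl = sum(is_inclusive(e) for e in events)
    n_hard = sum(is_hard(e) for e in events)
    return 100.0 * n_incl / n_gen, 100.0 * n_hard / n_gen


# Toy sample of four events (illustrative only).
toy_events = [
    [Particle(+1, 3.0, 0.4)],                          # inclusive, not hard
    [Particle(-1, 3.2, 1.5), Particle(0, 3.0, 2.0)],   # inclusive and hard
    [Particle(+1, 1.0, 2.0)],                          # outside acceptance
    [Particle(0, 3.0, 3.0)],                           # neutral only
]
incl_pct, hard_pct = class_percentages(toy_events)
```

With these definitions every hard event is also counted as inclusive, because the hard pseudorapidity window lies inside the inclusive one.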
The plot for the $\bar{p}/p$ ratio is shown in Figure 9. All predictions show the same trend of an apparent decrease towards the beamline, and the ratio can be said to be reasonably well described. The $\pi^{-}/\pi^{+}$ ratio, shown in the same figure, is also well described by all generators, with the exception of QGSJET in the high $p_{T}$ region, where it seems to show a charge asymmetry between $\pi^{+}$ and $\pi^{-}$. All the predictions also seem to cluster together, again with the exception of QGSJET at high $p_{T}$. The $K^{-}/K^{+}$ ratio shown in Figure 10 is fairly well described by all generators.
The closest prediction for the $(K^{+} + K^{-})/(\pi^{+} + \pi^{-})$ ratio (shown in the same figure) seems to be that of SIBYLL, followed by that of EPOS; overall, however, all generators fail to describe this measurement. In the high $p_{T}$ range, QGSJET underestimates the measurements and has a pronounced ascending trend.
A clustering of the predictions is observed in the low $p_{T}$ plot for the $(p + \bar{p})/(\pi^{+} + \pi^{-})$ ratio (shown in Figure 11). Here, all the generators describe the measurements well. For the high $p_{T}$ range the closest predictions are those of EPOS and PYTHIA 8.1, while for the middle $p_{T}$ range no generator seems to describe the ratio correctly. In the high $p_{T}$ range the ratio is again underestimated by QGSJET, whose prediction again has an ascending trend, while SIBYLL largely overestimates the ratio.
The $(p + \bar{p})/(K^{+} + K^{-})$ ratio is shown in the same figure. The best prediction overall is that of EPOS LHC. SIBYLL and QGSJET describe this ratio well in the low $p_{T}$ range. In the middle $p_{T}$ range SIBYLL's prediction overlaps with that of EPOS LHC. In the high $p_{T}$ range PYTHIA 8.219 and QGSJET also give a reasonably good description, although QGSJET again exhibits an ascending trend. SIBYLL, together with PYTHIA 8.1, again largely overestimates the ratio in this range. The predictions of PYTHIA for the proton/kaon and kaon/pion ratios are clearly improved by the strangeness enhancement from the Monash 2013 tune.
Figure 8 shows the yields of protons and pions in the high $p_{T}$ region obtained with QGSJET. The pion yield clearly decreases more steeply towards high pseudorapidity than the proton yield. The yields of protons, pions and kaons in the same $p_{T}$ region for all generators are shown in Figure 7. The slope of the QGSJET proton yield distribution is the lowest among the generators, while that of its pion yield is the highest; the slope of its kaon yield lies between those of the other generators. Together with the ascending trend of the QGSJET predictions for the proton/pion, kaon/pion and proton/kaon ratios in the high $p_{T}$ range, which is seen neither in the data nor in the predictions of the other generators, these observations suggest that in QGSJET the proton multiplicity decreases too slowly and the pion multiplicity too fast towards high pseudorapidity.
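The link between the yield slopes and the ascending ratio can be made explicit with a small numerical sketch. It assumes, purely for illustration, that the slopes are compared on the logarithm of the yields, so that the slope of the log-ratio is the proton slope minus the pion slope. The yields below are invented numbers, not generator output.

```python
import math


def least_squares_slope(x, y):
    """Slope of the straight line fitted to (x, y) by ordinary least squares."""
    n = len(x)
    if n < 2 or n != len(y):
        raise ValueError("need at least two points and equal-length inputs")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


# Hypothetical pseudorapidity bin centres and yields (illustrative only).
eta = [2.75, 3.25, 3.75, 4.25]
proton_yield = [100.0, 90.0, 80.0, 70.0]     # falls slowly with eta
pion_yield = [1000.0, 800.0, 600.0, 400.0]   # falls faster with eta

log_p = [math.log(v) for v in proton_yield]
log_pi = [math.log(v) for v in pion_yield]
log_ratio = [a - b for a, b in zip(log_p, log_pi)]

slope_p = least_squares_slope(eta, log_p)
slope_pi = least_squares_slope(eta, log_pi)
slope_ratio = least_squares_slope(eta, log_ratio)
```

Both yields fall with pseudorapidity, but because the pion yield falls faster, the proton/pion ratio rises, which is the kind of ascending trend described above.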
As one can see in Figures 12-14, the $\bar{\Lambda}/\Lambda$ ratio is best described by EPOS LHC and PYTHIA 8.219, pointing to a good baryon number transport. Nonetheless, all predictions have more or less the same trend. The $\bar{\Lambda}/K^{0}_{S}$ ratio seems to be reasonably well described by QGSJET, while the other generators largely underestimate it.

Fig. 11. Prompt charged-hadron ratios as a function of pseudorapidity in the kinematic region of $2.5 \leq \eta \leq 4.5$ and $p \geq 5$ GeV/c in various $p_{T}$ intervals at $\sqrt{s} = 7$ TeV. The LHCb data vertical bars represent the combined statistical and systematic uncertainties [40].

Conclusions
The generators that have been studied are EPOS LHC, QGSJETII-04, SIBYLL 2.3 and versions 8.186 and 8.219 of PYTHIA. The observables on which the study was conducted were the charged energy flow, charged-particle multiplicities and densities, charged-hadron production ratios, $V^{0}$ ratios and other strange particle distributions. It is reasonably clear that no generator reproduces the data for all of the observables studied; rather, each generator describes well only a particular set of the observables or aspects of particle production. As a general trend, the predictions are better in the central region. The effect of tuning with data from the central-rapidity range of the general purpose LHC detectors is visible and clearly improves the estimations even for the forward region, though the extrapolation to higher rapidity still leads to clear disagreement with the experimental data.
It was observed that the charged energy flow, which can be regarded as a global event observable, is relatively well described by all the generators, at least in terms of shape. The best prediction overall for the charged energy flow is that of SIBYLL 2.3, a generator tuned specifically to reproduce this type of observable correctly. Among the other LHC-tuned generators, PYTHIA 8.186 gives the best description.
EPOS and PYTHIA, especially version 8.219, are very similar in their description of the observables. The similarity between the generators may arise from the partonic approach and similar perturbative calculations that they both use for hard parton collisions.
QGSJET is similar to EPOS in the description of some observables like the charged energy flow (except for the hard event class) and charged particle densities, but also shares some similarities with SIBYLL.
The multiplicity distributions are generally not well reproduced by the generators. Here EPOS and PYTHIA have the best predictions overall. Their predictions also seem to improve with increasing hardness of the processes, but they exhibit an effect similar to that of the other generators, i.e., favouring either very low or very high multiplicity events, albeit at a much lower level than, for example, SIBYLL, which has the most polarizing behaviour.
SIBYLL has a few notable successes in describing some particle ratios, and its predictions for the charged-particle pseudorapidity and transverse momentum distributions have a good shape.
The best baryon transport mechanism seems to be that of EPOS, followed by that of PYTHIA, while the $\bar{\Lambda}/K^{0}_{S}$ ratio is best described by QGSJET. Most of the observed differences seem to be an effect of extrapolation to the forward region, so the extrapolation uncertainties seem to be rather large. Nonetheless, in the majority of cases, the measurements fall within a band defined by the most extreme predictions.
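The statement that the measurements fall within a band defined by the most extreme predictions can be checked bin by bin. The following is a minimal sketch, assuming the band is the per-bin minimum and maximum over all generator predictions and ignoring the measurement uncertainties. The generator labels and all values are hypothetical placeholders.

```python
def envelope(predictions):
    """Per-bin (lower, upper) edges of the band spanned by all predictions."""
    per_bin = list(zip(*predictions.values()))
    lower = [min(values) for values in per_bin]
    upper = [max(values) for values in per_bin]
    return lower, upper


def fraction_inside(measured, predictions):
    """Fraction of bins in which the measurement lies inside the band."""
    lower, upper = envelope(predictions)
    if len(measured) != len(lower):
        raise ValueError("measurement and predictions must share the binning")
    inside = [lo <= m <= hi for m, lo, hi in zip(measured, lower, upper)]
    return sum(inside) / len(inside)


# Placeholder predictions for three generators in four bins (not real output).
toy_predictions = {
    "gen_A": [1.0, 2.0, 3.0, 4.0],
    "gen_B": [1.4, 2.5, 2.6, 4.2],
    "gen_C": [0.8, 2.2, 3.3, 4.1],
}
toy_measured = [1.1, 2.6, 3.1, 4.05]

band_lower, band_upper = envelope(toy_predictions)
covered = fraction_inside(toy_measured, toy_predictions)
```

In this toy case the second bin lies above all predictions, so three of the four bins are covered by the band.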
The relative contributions of particle production processes differ between the central and forward regions. In the central pseudorapidity region there is a significant contribution of hard parton-parton scatterings (with high squared momentum transfer), with which high multiplicity events and high $p_{T}$ jets are associated. In the forward region, on the other hand, the underlying event (multiparton interactions and beam remnants), as well as diffractive processes, have a considerable contribution. The event generators usually have different sets of parameters for each process and, as such, when tuning using measurements from one pseudorapidity region or the other, different parameters are constrained, so each tune is applicable for studies in its respective region. As shown in this paper, the predictions in the forward region are improved by the tuning of the generators using measurements from the central region, but it seems that a dedicated tuning procedure is still necessary. So, the utility of each tune is somewhat limited when extrapolating from the central to the forward region and vice versa. Ideally, measurements from both the forward and central regions should be used simultaneously when tuning a generator, but this is seldom done. In many cases there are intrinsic limitations of the generators, or of the models they are based on, which prevent a simultaneous tune in both regions and hence a more consistent overall description of the processes. Difficulties related to such a tuning procedure also arise from the different experimental conditions in each region.
As we have seen in this paper, it seems that the modelling of the soft processes is still open to improvement and that a forward tuning of generators is required to improve precision in this rapidity range. Hence, it may prove useful to take into account during the tuning process measurements from LHCb and TOTEM, which are LHC experiments in the forward region, where the soft process component is appreciably larger than in the central region, the baryon transport is different, and the multiparton collisions might give a different signal.