Core-corona effect in hadron collisions and muon production in air showers

It is very well known that the fraction of energy in a hadron collision going into electromagnetic particles (electrons and photons, including those from decays) has a large impact on the number of muons produced in air shower cascades. Recent measurements at the LHC confirm features that can be linked to a mixture of different underlying particle production mechanisms such as a collective statistical hadronization (core) in addition to the expected string fragmentation (corona). Since the two mechanisms have a different electromagnetic energy fraction, we present a possible connection between statistical hadronization in hadron collisions and muon production in air showers. Using a novel approach, we demonstrate that the core-corona effect as observed at the LHC could be part of the solution for the lack of muon production in simulations of high energy cosmic rays. To probe this hypothesis, we study hadronization in high energy hadron collisions using calorimetric information over a large range of pseudorapidity in combination with the multiplicity of central tracks. As an experimental observable, we propose the production of energy in electromagnetic particles versus hadrons, as a function of pseudorapidity and central charged particle multiplicity.


I. INTRODUCTION
Cosmic ray particles reach Earth from galactic and extragalactic sources with enormous energies and produce huge particle cascades in the atmosphere. The resulting extensive air showers are measured with the aim to unveil the astrophysical nature and origin of high energy cosmic rays. The Pierre Auger Observatory [1,2] and the Telescope Array [3] are the largest contemporary experiments targeting the most energetic cosmic rays with energies beyond 10 18 eV.
Of particular interest is the cosmic ray mass composition, which is expected to carry a unique imprint of the physics at the sources. The mass composition as a function of the cosmic ray energy E 0 is inferred from air shower observables, of which the most important ones are the depth of the shower maximum X max and the number of muons N µ [4]. The depth X max is the integrated matter density column that a shower traversed until the maximum number of charged particles in the shower is reached. The number of muons is obtained by counting muons when the shower arrives at ground. Experimentally the muon counting is limited to a radial range around the shower axis as well as to a minimal energy of muons.
To infer the cosmic ray mass composition from these observables, accurate predictions from air shower simulations are needed for cosmic rays with various primary * sebastian.baur@ulb.ac.be masses. However, the Pierre Auger Observatory [5,6] and the Telescope Array [7] observed that the measured number of muons in air showers drastically exceeds expectations from model predictions at shower energies around and above 10 19 eV. A recent summary of muon measurements [8] shows that a consistent muon excess is seen by the majority of cosmic ray experiments over a very wide energy range. The discrepancy between results based on X max and N µ is currently preventing an unambiguous interpretation of air shower data in terms of mass composition.
The amount of energy ending up in electromagnetic particles in hadron collisions where E em is the summed energy over all γ (mostly from π 0 decay) as well as e ± , and E had the summed energy of all hadrons, is one of the crucial parameters driving muon production in extensive air showers [9][10][11]. It is closely related to the way an excited partonic system hadronizes.
In hadronic interaction models used to simulate air showers, the hadronization is mainly done using a string fragmentation model which was successfully developed to describe the hadron production in e + -e − collisions, and low energy proton-proton collisions. In systems with higher energy densities, such as heavy ion collisions, a statistical hadronization of a fluid is expected where the production of heavy particles is favored, thus, reducing the fraction of π 0 compared to other types of particles. In the early 2000s "collective effects" have been observed in heavy ion collisions (often referred to as large systems) at RHIC [12][13][14][15]. Similar effects have been predicted [16][17][18][19][20][21] for proton-proton collisions (aka small systems) and were eventually discovered at the LHC [22] (see Refs. [23,24] for detailed reviews). While a fluid-like behavior (referred to as collective effects in the following) is confirmed in both large and small systems, their origin is still unclear. In large systems the existence of a quark-gluon-plasma (QGP) is commonly assumed as a phase of parton matter where confinement is no longer required [25][26][27]. This QGP will evolve according to the laws of hydrodynamics and eventually decay statistically. There are various expected consequences of such a scenario, such as longrange two-particle correlations, the so-called "ridge" phenomenon [22,28], jet quenching [29,30], or enhanced production of strange hadrons [31]. It was initially a surprise when such effects were also discovered in small systems. While it was argued that also in central collisions of small systems the energy densities may be high enough to allow for the formation of a QGP [16], other recent studies have shown that collective effects can be achieved by alternative mechanisms such as microscopic effects in string fragmentation [32] or QCD interference [33]. The possibility of collective effects in smaller systems opens the door to study the impact of a different hadronization scheme in high energy interactions also within air showers. Air shower cascades are driven by collisions of hadrons and light nuclei at ultra-high energies. We show that statistical hadronization in collisions of hadrons and nuclei can play a so far underestimated importance in the understanding of muon production in air showers [34,35].
The underlying mechanism responsible for the production of these effects is expected to produce characteristic observables in the final state of hadron collisions. We demonstrate how statistical hadronization affects the energy fraction contained in electromagnetic versus hadronic particles, R, and show how this has important possible implications for the muon production in cosmic ray air showers.
We further propose detailed measurements of R as a novel opportunity to study collective hadronization in small systems at the LHC. This may lead to a better understanding of the underlying nature of statistical hadronization since different theoretical approaches lead to predictions that may be distinguished based on measurements. In addition those measurements are able to constrain models for air shower simulations.

II. THE MUON PROBLEM AND THE R OBSERVABLE
The dominant mechanism for the production of muons in air showers is via the decay of light charged mesons. The vast majority of mesons are produced at the end of the hadron cascade after typically five to ten generations of hadronic interactions (depending on the energy and zenith angle of the cosmic ray). The energy carried by neutral pions, however, is directly fed to the electromagnetic shower component and is not available for further production of more mesons and subsequently muons. The energy carried by hadrons that are not neutral pions is, on the other hand, able to produce more hadrons and ultimately muons in following interactions and decays. Using a simple Heitler type toy-model [36] based on [37], the neutral pion fraction c = N π 0 /N mult , defined as the number of neutral pions N π 0 divided by the total number of final-state particles N mult in a collision, was found to have a strong impact on the muon number and in particular on the slope of the energy dependence of the muon production. Indeed in this model we get where E 0 is the energy of the primary cosmic ray particle and E dec is the typical energy at which mesons decay in the cascade. So the muon number N µ increases strongly with decreasing c, which is understandable since more hadrons is available to produce muons. A second quantity with a strong impact on the muon number was identified to be the hadron multiplicity N mult . The value of c is very important for the muon production. Unfortunately, it is difficult to measure both N π 0 and N mult experimentally (for example at the LHC) since neutral particles cannot be easily counted individually. In general, secondary particle identification is unavailable at large pseudorapidities η where the energy flow is large enough to become relevant for the air shower development. Hence, we propose a new observable which is sensitive to properties of the hadronization and which can be directly related to c: the ratio of the electromagnetic to the hadronic energy density R given by Here the energy densities dE/dη are obtained by summing the energy of all final-state particles except for neutrinos in bins of η and averaging over a large number of collisions.
The neutral pion fraction c can be easily related to the energy ratio R, since both are very similar kinematic aspects of final state distributions. If all particles have the same energy such as in the generalized Heitler model, then we have simply R = c/(1 − c). But R is experimentally much easier to measure, since, using a calorimeter, the signals deposited by electromagnetic particles and by hadrons are characteristically different. We compute a detailed conversion between R and c using standalone epos lhc [38] simulations of fixed energy proton-proton collisions at various center-of-mass energies, and found that for the relevant parameter range, a change of R by ∆R affects c by ∆c ≈ 0.8 · ∆R, where R is computed by integrating eq. (3) over all η. In section IV, we will study R for different models as a function of η, and at fixed η as a function of the charged particle density at central pseudorapidity dN ch /dη| η=0 , which is determined as the average multiplicity within |η| < 0.5.
The influence of various effective parameters q in interaction models (like R, c, or N mult ) on the main air shower observables was investigated in a previous study [9] in which the behavior of hadronic interaction models in air shower simulations was modified in an energy-dependent way during full air shower cascade simulations within CONEX [39].
The effective quantity q of the hadronic event generators inside the air shower cascade simulation is changed in an energy dependent way using the modification scale f q , and the energydependent factor representing the assumption that models are well constraint by accelerator data at lower energies (below E th ), where F (E lab ) = 0, while they become logarithmically unconstrained going to higher energies. The parameter E scale is the reference energy scale. We will use E scale = E CR LHC s LHC /(2m p ) ≈ 90 PeV, using an LHC center-of-mass energy of 13 TeV. Typical threshold values are E th s Tevatron /(2m p ) ≈ 1 PeV, using the center-ofmass energy of the Tevatron accelerator. However, in particular particle production, in the important forward phase space, may be largely unconstrained by both Tevatron and LHC data, allowing much lower values of E th to be explored. It is a key point of the application of eq. (5) inside CONEX that a significant fraction of the air shower cascade is consistently modified during the simulations.
We apply eqs. (4,5) to explore the correlated impact of q = R and q = N mult on X max and lnN µ in full air shower simulations. The resulting correlated effect is shown in Fig. 1 as demonstrated for air showers at E 0 = 10 19 eV using epos lhc in CONEX. Lines in this figure show all possible resulting mean values of X max and lnN µ for any mass composition of cosmic rays between pure proton (bottom right end of lines) and pure iron (top left end of lines). The resulting values of X max and lnN µ are located on a straight line because the mean values for both are linear functions of the mean-logarithmic mass of cosmic rays [40,41] given a fixed air shower energy. The lineshape is universal, but its location, and to a lesser degree the slope and length, depend on the hadronic interaction model. Current hadronic interaction models predict lines, which are too low compared to experimental data from air showers, as indicated by the vertical gap between the representative data point from the Pierre Auger Observatory [5] and the epos lhc line. This discrepancy is the expression of the muon problem outlined above.
When N mult is modified the simulated line shifts along itself: the multiplicity has a correlated effect on X max 700 750 800 850 and lnN µ that cannot close the gap to the data. However, modifications of R mainly affect the muon number and leave X max unchanged, creating vertical shifts and tilts of the line in the plot. Thus, within the assumptions outlined here, we find that a decrease of R by f q = −15% at the LHC energy of √ s = 13 TeV would be sufficient to make the simulations compatible with the air shower data at 10 19 eV. These results have been cross-checked with alternate interaction models in the air shower simulations. There is a very good qualitative agreement in all cases. Furthermore, in Ref. [8] it was established that the muon discrepancy in simulations increases smoothly with energy. Thus, the slope of the energy dependence introduced in eq. (2) is also affected, pointing to a too small value of β. This may be related to a too large π 0 production. We explore this energy dependence in more detail in the next section.

III. CORE-CORONA EFFECT AND MUON PROBLEM
The discussion in the previous section suggests that a change of R (or c, which is equivalent) is a potential way to reduce the discrepancy between measurements and air shower simulations. Nevertheless, R is quite well constrained by theory as well as laboratory measurements and, thus, can not be changed entirely arbitrarily as studied in the previous section II. In a naive model like Ref. [37] where only pions are considered as secondary particles, R = 0.5. In a more realistic approach based on string fragmentation we have R ≈ 0.41. But as shown in Ref. [31], particle ratios such as K/π, p/π or Λ/π change with increasing secondary particle density, saturating to the value given by a thermal/statistical model with a freezeout temperature of 156.5 MeV [42] yielding R ≈ 0.34. Such a behavior can be explained in terms of a core-corona picture [43]. This approach has been used in the framework of realistic simulations [44], but also in simple model calculations [45][46][47][48]. The basic idea is that some fraction of the volume of an event (or even a fraction of events) behaves as a quark gluon plasma and decays according to statistical hadronization (core), whereas the other part produces particles via string fragmentation (corona). The particle yield N i for particle species i is then a sum of two contributions where N core i represents statistical (grand canonical) particle production, and N corona i is the yield from string decay. Crucial is the core weight ω core . In order to explain LHC data [31] the weight ω core needs to increase monotonically with the multiplicity, starting from zero for low multiplicity pp scattering, up to 0.5 or more for very high multiplicity pp, reaching unity for central heavy ion collisions (PbPb).
In the following, we are going to employ a straightforward core-corona approach, based on eq. (6), for any hadronic interaction model in CONEX air shower simulations. The particle yield from the chosen interaction model is by definition considered to be the corona yield, whereas we use the standard statistical hadronization (also referred to as resonance gas) for the core part. So ω core = 0 would be the "normal" simulation with the default interaction model. Choosing ω core > 0 amounts to mixing the yields from the interaction model according to the core-corona superposition shown in eq. (6). The core will certainly help concerning the "muon problem", because statistical hadronization produces more heavy particles and less pions compared to string fragmentation, and therefore R is smaller [34,35].
Technically, we directly modify individual particle ratios of the secondary particle spectra dN i /dE j , for particle species i and energy bins dE j , of hadronic interactions with air nuclei used by CONEX for numerical air shower simulations based on cascade equations. Knowing the initial ratios π 0 /π ± , p/π ± , K ± /π ± , p/n, K 0 /K ± (taking into account strange baryon decays) from a corona type model and the value of the same ratios from the core model, we compute new spectra in which the particle yields include both, core and corona according to ω core . Since the hadronization mechanism can affect only newly produced particles the properties of the leading particle should be preserved. To achieve that, the new particle yields are computed for all secondaries, but excluding the one corresponding to the respective projectile type, i.e. protons in proton-air, kaons in kaon-air interactions, and so on. The yield of the projectile-type particles is determined subsequently by exploiting energy conservation in all energy bins dE j summed over all secondary particle species i: the sum i E j dN i /dE j must be conserved. Since at high x F = E j /E lab only the projectiletype particles will have dN i /dE j significantly different from zero (aka leading-particle effect), the resulting modified leading-particle type spectra at high x F follow the original distribution, and are only affected by the scaling procedure at lower values of x F . Together, this assures that energy conservation as well as the total multiplicity are not affected, but only the particle ratios. More details will be given in a future publication. We expect the core weight ω core to increase with energy in a logarithmic way. Thus, we use to model this (in analogy to eq. (4)), starting already at fixed-target energies, E th = 100 GeV. Different energy dependencies are explored by changing E scale from 100 GeV (corresponding to a step function), to 10 6 GeV, and 10 10 GeV. The f ω scale is varied from 0.25, 0.5, 0.75 to 1.0; in addition we enforce F (E lab ; E th , E scale ) ! = 1 for all E lab ≥ E scale . This yields the ω core energy dependencies as depicted in Fig. 2. All these scenarios have been used to simulate full air showers with CONEX, using cascade equations from the first interaction to the ground, for proton and iron primary particles at E 0 = 10 19 eV. In Fig. 3 the results are shown in the X max -lnN µ plane for two models epos lhc (left) and QGSJetII.04 [49,50] (right). These examples illustrate that it is well possible with modified hadronization in air shower cascades to describe the data of the Pierre Auger Observatory. As expected, more core-like contributions are needed compared to what is currently provided by the models. This means, QGP-like effects also in light colliding systems and starting in central collisions at much lower center-ofmass energies may play a decisive role.
Furthermore, from eq. (2) also a different energy evolution of the muon production follows. To study the effect of our core-corona model on the muon production as a function of the energy, we can compare the different scenarios with the compilation of data presented in Ref. [8] using the renormalized factor with N µ being any muon related experimental observable and ln N µ p and ln N µ Fe being the average of the logarithm of the same observable simulated with proton and iron primaries respectively for a given reference hadronic interaction model. This allows a direct comparison between different experiments for various types of muon observables. Considering the energy dependence of z, there is an implicit dependence on the cosmic-ray mass A, since ln A varies with energy. However, as expected from the Heitler model formula, and even more importantly, verified via explicit simulations, z and ln A are related as z = a + b ln A , and from z(pure Fe) = 1 and z(pure p) = 0 we simply get a = 0 and b = 1/ln56. This is very useful, since it means that the A-dependence of z (called z mass ) is given as and the expectation of ∆z = z − z mass is zero for the case of full consistency between all experimental observables and the simulations based on a valid reference model. This means, plotting ∆z for experimental data, we should get zero if the reference model were perfect, whereas ∆z > 0 implies a muon deficit in the simulations. In this way we can visualize the energy dependence of the muon excess, corrected for mass dependencies. More details and references are given in Ref. [8].
As pointed out in Ref. [8], for all models the data have a positive ∆z showing a significant logarithmic increase with the primary energy, indicating an increasing muon deficit in the simulations. In Fig. 4 the effect of the different energy evolution of ω core for epos lhc and QGSJetII.04 on ∆z are shown. Here the new simulations are treated like data and the z factor is calculated using the original (quoted) models as a reference such that the new ∆z can be compared to the data points directly. The positive ∆z of the lines indicate a larger muon production when ω core increases and the positive slopes mean that the slope of the muon production as a function of the primary energy is larger when ω core increases. By including a consistent core-like hadronization, we thus reproduce the energy evolution as found in the data. This is even possible for values ω core < 1.
The possibility to see the effect of a core hadronization (QGP or similar more exotic phenomena) on air shower physics have already been studied in the literature [51][52][53][54]. Changes in the muon production because of a change of R under either extreme or exotic assumptions (which were not yet observed at the LHC) are usually assumed. Furthermore, it was shown that the production of a core only in very central, high-density, collisions is not sufficient to significantly change the muon numbers in air shower simulations [55].
In contrast to the new results presented here, in those previous studies the core-like production does not cover sufficient phase space to change the muon production in air showers significantly. We demonstrate that core-like effects potentially starting at much smaller colliding systems, and at much lower center-of-mass energies as studied here, have an important impact on muon production in air showers. There are various indications at the LHC in pp and pA collisions that such a scenario is compatible with current data [22,31], or even suggested by it, at energy densities as reached by cosmic rays interacting with the atmosphere [35]. Studying LHC data at mid-rapdity it is found that for events with dN ch /dη |η|<0.5 ∼ 10 (corresponding to typical proton-air interactions) ω core is already ≈ 50-75%. Since our study is based on the simple assumption that the full phase space has a modified π 0 ratio, it remains crucial for cosmic ray physics to conduct further dedicated measurements at the LHC to better understand π 0 production relative to other particles. The phase space for the formation of core-like effects is potentially significantly larger than previously studied, and in particular may extend towards larger rapidities.

IV. TESTING CORE CONTRIBUTIONS VIA MEASUREMENTS OF R AT THE LHC
As previously outlined an enhanced contribution of core-like hadronization can help to explain the data of the Pierre Auger Observatory. In the following we discuss how this can be probed with accelerator data.
We mainly use epos lhc as the baseline model to test sensitivity towards a QGP-like state. As alternative model we use pythia8 [56,57], which provides entirely different (non-QGP-like) physics concepts for collectivity. epos lhc is a general purpose event generator widely used in high energy physics, and in particular also for heavy ion collisions. It includes the description of a QGP-like behavior in high energy collisions. pythia8, on the other hand, is the reference model in high energy physics for proton-proton interactions. Both models generate a distribution of colored strings from the collision of a projectile and a target. Despite a very different underlying approach for the string generation (pQCD factorization for pythia8 and parton-based Gribov-Regge theory [58] for epos lhc), the string distributions are not very different, because they are strongly constrained by the data on particle multiplicities. These strings can be hadronized directly in both generators using the Lund string model [59] in pythia8, or the area  law [58] in epos lhc -both cases are strongly constrained by LEP data. At low energy (≈10 to 100 GeV) this is sufficient to successfully describe proton-proton interactions with good accuracy. Nevertheless, it turns out that at the LHC additional physics mechanisms are needed to describe the observed particle correlations and abundances in the final state. In pythia8, a modified color reconnection approach [60,61] or a "string shoving" mechanism [32] have been proposed to introduce collective effects such as a modified hadronization or particle correlations, similar to those obtained from a QGP.
In epos lhc, on the other hand, the "core-corona" approach [44] is used as originally developed for heavy ion collisions. As already explained, the core amounts to areas with high string/energy densities, where strings are assumed to "melt" and produce matter that expands hydrodynamically and then decays statistically, whereas the corona represents particles from ordinary string fragmentation, which escape from the dense regions. While in epos3 [62] the hydrodynamic expansion is fully implemented and hadronization occurs on a freeze-out hypersurface, in epos lhc this expansion is mimicked by parameterizing the flow at hadronization. This has proven to describe various collective observables well [38]. Simulations of epos lhc are readily available via the crmc software [63]. On generator level, we study particles with a lifetime cτ > 1 cm, which is consistent with most experimental detector designs. In fact, in epos lhc final-state particles originate from three different production mechanisms: standard string fragmentation (corona), statistical decay of a fluid (core), and the decay of the beam remnants. While experimentally the origin of the production mechanism for a particle cannot be identified, individual production mechanisms can still be studied since they predominantly contribute to different regions of phase space. This is demonstrated in Fig. 5 (top), which shows the relative contribution of these mechanisms to the total energy density dE/dη for minimum bias proton-proton collisions at a center-of-mass energy of 13 TeV. Three regions can be identified: The energy density at central pseudorapidities, |η| < 5, is dominated by particles originating in the dense core of the interaction, at intermediate rapidities, 5 < |η| < 8, it is dominated by particles from string fragmentation, and at large rapidities, |η| > 8, by the fragmentation of beam remnants. Underlying differences in particle production, therefore, lead to varying observables as a function of pseudorapidity.
A corresponding effect is also observed as a function of the central charged particle multiplicity N ch . Final states with large particle multiplicity are known to be an effective trigger for pronounced statistical hadronization [22]. Therefore, at fixed pseudorapidity, the influence of the core increases as a function of particle multiplicity. This effect is expected to be most significant at |η| ≈ 0 since the relative contribution of the core is largest. This is illustrated in the middle panel of Fig. 5 for η = 0 and in the bottom panel for η = 6. It can be seen that the contribu-tion of the core to the energy density at η = 0 becomes dominant for pp collisions with more than ≈ 7 charged particles per unit of pseudorapidity, while at η = 6 this transition is shifted to a larger number.
Using epos lhc we find that the fraction of secondary pions in the dense core is reduced because many other more massive hadrons and resonances are produced. This leads to a lower ratio of the electromagnetic to hadronic energy density in particles produced from the core. Accordingly, this effect can be seen in the pseudorapidity-dependent ratio of the average electromagnetic to hadronic energy density R shown in top panel of Fig. 6. At |η| ≈ 0, the energy density is dominated by the core and therefore the value of R for epos lhc is as low as 0.34. As the contribution of the core to the total energy decreases with increasing pseudorapidity, also R increases and reaches a value of 0.4 at |η| ≈ 7 before it decreases rapidly due to the very low electromagnetic contribution in the beam remnants. In comparison, a flat ratio below |η| ≈ 7 is obtained when statistical hadronization is disabled in epos lhc (corona only). The data point shown in this figure at η ≈ 6 is derived from Ref. [64], where we have corrected the original values from detector level to generator level using the Rivet routines provided by the CMS Collaboration [65]. The shaded region corresponds to the systematic uncertainties of the measurement. These data are consistent with all models within the experimental uncertainty; there is a slight tension with the pythia8 simulations using the modified colour reconnection approach [66]. Such data with smaller uncertainties, and measured over a wide range of η have the potential to differentiate between some of the models. In particular, any slope observed in the region 0 < |η| 6 would be a clear hint for a transition of several distinct hadronization mechanisms (i.e. core-corona).
The ratio of the electromagnetic to hadronic energy density R at η = 0 is shown as a function of the central multiplicity dN ch /dη| η=0 in the middle panel of Fig. 6. It can be observed that R drops down to values of 0.3 when statistical hadronization is enabled in epos lhc while it reaches a constant plateau of 0.4 in the case of disabled statistical hadronization, which is similar to the pythia8 predictions. At η = 6, it can be seen in the bottom panel of Fig. 6 how the different model predictions compare to the available CMS data (also from Ref. [64]). However, these data are taken at η ∼ 6, were one can see from Fig. 6 (top) that the sensitivity to model differences is unfortunately close to minimal. It would be a great way to study hadronization in hadron collisions by measuring this at LHC in a much wider η region.
We compare the simulations obtained with epos lhc also to predictions by pythia8 in the standard minimum bias configuration as well as with modified QCD-based color reconnection parameters as presented in Ref. [61], and enabled string shoving mechanisms. For the latter we use the example parameters provided within pythia8 version 8.235. We are aware that these settings are un-    tuned and results should be treated with care but first observations, in particular about the characteristic shape of the distributions, can be made. It is interesting that we do not find a visible effect of the string shoving mechanism on the ratio of electromagnetic to hadronic energy, R, compared to the default string fragmentation (see Fig. 6). This is consistent with the predictions of epos lhc when statistical hadronization is disabled. These findings can be understood since in all these cases particle production is driven by QCD string fragmentation, which is well tuned to LEP data. Thus, what is found here is a characteristic feature of string fragmentation. If a microscopic collectivity model does not modify particle production by string fragmentation, as it is the case for string shoving, this has no impact on the observable R. With the modified color reconnection on the other hand, a reduced value of R is observed within |η| < 7 as well as a more prominent decrease of R at central rapidity as a function of dN ch /dη| η=0 . This is due to enhanced baryon production as explained in Ref. [61]. Still, the decrease in R is not as strong as in epos lhc. More importantly, all configurations of pythia8 exhibit a flat ratio as a function of pseudorapidity within |η| < 7. The value of R is a global feature of the hadronization and independent of rapidity. No transition from a statistical, to a string-dominated phase as in epos lhc is observed.
epos lhc was released after the first LHC data became available. At that time, only average values and the evolution of the mean transverse momentum as a function of the particle multiplicity were known precisely. The increase of multi-strange baryon production with particle multiplicity was a prediction of the model, but -as shown in Ref. [31] -was only qualitatively correct. Effectively, the core is formed in epos lhc only at larger multiplicities compared to what is necessary to reproduce the data. Thus, it is expected that the density needed to produce the core is currently overestimated and, as a consequence, the effect on muon production in air showers is significantly underestimated (not enough phase space for core hadronization). It would be useful to have precise data on R versus multiplicity to support (or reject) this hypothesis.
The study made with epos lhc and pythia8 is just an example of what can be observed. A different model may have a different behavior, but it has been clearly demonstrated that R is sensitive to the type of hadronization. As a consequence, the observation of variations of the ratio of electromagnetic to hadronic energies as function of pseudorapidity or particle multiplicity is a strong test also of the nature of collective effects in proton-proton collisions (or other systems). Different implementations of statistical hadronization, QGP-like or macroscopic, can be distinguished. The proposed measurements will provide new constraints on the extension of the phase space in which statistical hadronization occurs, complementary to established measurements.
In any case, corresponding precision measurements of R to 5% at the LHC seem feasible and could contribute significantly to a better understanding of muon production in air showers as described in section III in particular if the measurements could be done with a light-ion beam such as oxygen [67]. Despite the fact that calorimetric data are taken at various pseudorapidities (central and forward calorimeters), such ratios are not commonly published-with currently one notable exception [64]. The reverse argument also holds: in future huge-aperture air shower experiments, the tail of the ln N µ -distribution could be used to indirectly measure the slope of the energy distribution of neutral pions far beyond the reach of the LHC [10,11].

V. SUMMARY
We have demonstrated that the muon production in air shower significantly depends on the ratio R = E em /E had , where E em is the sum of energy in secondary γ (from π 0 ) and e ± while E had is the sum of energy in hadrons in individual hadron collisions. We also showed that R itself depends on the hadronization mechanism. Thus, a change or transition in these mechanisms can help to explain the discrepancy between the observed number of muons in air showers by the Pierre Auger Observatory and the predictions based on current hadronic models. Since at the LHC, even in proton-proton interactions, one observes a transition from a string-type to a statisticaltype hadronization at mid-rapidity, we used the particle ratios of the statistical model at all pseudorapidities to show that such hadronization scheme would in principle be sufficient to resolve the observed difference between simulations and cosmic ray data. Experimental measurements of R at the LHC are currently compatible with this possibility. On the other hand, extreme scenarios where full statistical hadronization is reached at low energies (E lab ∼ O(100 GeV)) are already excluded by the slope of the energy-dependence of air shower muon data.
Furthermore, we discuss potential measurements of R at LHC, e.g. with calorimeters, as a function of pseudorapidity η or central charged particle multiplicity N ch . We show that this observable can reveal properties of the nature of underlying fundamental particle production mechanisms. In particular we show that it provides a new handle to characterize mechanisms proposed for the explanation of statistical hadronization in protonproton collisions. It is potentially possible to distinguish between quark-gluon-plasma-like (QGP-like) effects, as first known from heavy ion collisions, from alternative, more microscopic effects that do not require the formation of a QGP, see e.g. Refs. [32,61].
Dedicated measurements at the LHC have now another opportunity to study collectivity in proton-proton collision using this observable. This will contribute to a better understanding of the mechanisms of hadronization in hadron collisions, and collectivity in proton-proton collisions or other light system. Measuring R at the LHC potentially has a significant impact on resolving the current mystery of muon production in cosmic ray induced extensive air showers. Thus, at last, one aspect to resolve the cosmic ray muon mystery is a better understanding of statistical hadronization in small collision systems.