Evidence for a mixed mass composition at the `ankle' in the cosmic-ray spectrum

We report a first measurement for ultra-high energy cosmic rays of the correlation between the depth of shower maximum and the signal in the water Cherenkov stations of air-showers registered simultaneously by the fluorescence and the surface detectors of the Pierre Auger Observatory. Such a correlation measurement is a unique feature of a hybrid air-shower observatory with sensitivity to both the electromagnetic and muonic components. It allows an accurate determination of the spread of primary masses in the cosmic-ray flux. Up till now, constraints on the spread of primary masses have been dominated by systematic uncertainties. The present correlation measurement is not affected by systematics in the measurement of the depth of shower maximum or the signal in the water Cherenkov stations. The analysis relies on general characteristics of air showers and is thus robust also with respect to uncertainties in hadronic event generators. The observed correlation in the energy range around the `ankle' at $\lg(E/{\rm eV})=18.5-19.0$ differs significantly from expectations for pure primary cosmic-ray compositions. A light composition made up of proton and helium only is equally inconsistent with observations. The data are explained well by a mixed composition including nuclei with mass $A>4$. Scenarios such as the proton dip model, with almost pure compositions, are thus disfavoured as the sole explanation of the ultrahigh-energy cosmic-ray flux at Earth.


Abstract
We report a first measurement for ultra-high energy cosmic rays of the correlation between the depth of shower maximum and the signal in the water Cherenkov stations of air-showers registered simultaneously by the fluorescence and the surface detectors of the Pierre Auger Observatory.Such a correlation measurement is a unique feature of a hybrid air-shower observatory with sensitivity to both the electromagnetic and muonic components.It allows an accurate determination of the spread of primary masses in the cosmic-ray flux.Up till now, constraints on the spread of primary masses have been dominated by systematic uncertainties.The present correlation measurement is not affected by systematics in the measurement of the depth of shower maximum or the signal in the water Cherenkov stations.The analysis relies on general characteristics of air showers and is thus robust also with respect to uncertainties in hadronic event generators.The observed correlation in the energy range around the 'ankle' at lg(E/eV) = 18.5 − 19.0 differs significantly from expectations for pure primary cosmic-ray compositions.A light composition made up of proton and helium only is equally inconsistent with observations.The data are explained well by a mixed composition including nuclei with mass A > 4. Scenarios such as the proton dip model, with almost pure compositions, are thus disfavoured as the sole explanation of the ultrahigh-energy cosmic-ray flux at Earth.Keywords: Pierre Auger Observatory, cosmic rays, mass composition, ankle

Introduction
An important quantity to characterize the composition of cosmic rays is the spread in the range of masses in the primary beam.In theoretical source models regarding protons as the dominant particle type, the composition is expected to be (almost) pure, while in other scenarios also allowing heavier nuclei to be accelerated, a mixed composition is predicted.For instance, in the 'dip' model [1,2], two observed features of the energy spectrum could be naturally understood as a signature of proton interactions during propagation (ankle at lg(E/eV) 18.7 from pair-production and flux suppression at lg(E/eV) 19.6 from photopion production).Therefore, the dip model predicts an almost pure cosmic-ray composition with small spread in primary masses.
In a recent publication, the distributions of depths of shower maximum X max (the atmospheric depth where the number of particles in the air shower reaches a maximum value) observed at the Pierre Auger Observatory were interpreted in terms of primary masses [3] based on current hadronic interaction models.The results suggest a mixed mass composition, but there are differences between the interaction models, and a clear rejection of the dip model is hindered due to the uncertainties in modeling hadronic interactions 1 .Specifically, around the ankle, a very light composition consisting of proton and helium nuclei only is favoured using QGSJetII-04 [5] and Sibyll 2.1 [6], while for EPOS-LHC [7], intermediate nuclei (of mass number A 14) contribute.The spread of masses in the primary beam near the ankle, estimated from the moments of the X max distributions measured at the Pierre Auger Observatory [8,9], depends as well on the details of the hadronic interactions and the results include the possibility of a pure mass composition.Observations of X max by the Telescope Array in the northern hemisphere were found compatible within uncertainties to both a pure proton composition [10] and to the data from the Auger Observatory [11].
In this report, by exploiting the correlation between two observables registered simultaneously with different detector systems, we present results on the spread of primary masses in the energy range lg(E/eV) = 18.5 − 19.0, i.e. around the ankle feature.These results are robust with respect to experimental systematic uncertainties and to the uncertainties in the description of hadronic interactions.

Method and observables
We follow [12] where it was proposed to exploit the correlation between X max and the number of muons N µ in air showers to determine whether the mass composition is pure or mixed.The measurement must be performed by two independent detector systems to avoid correlated detector systematics.For pure cosmic-ray mass compositions, correlation coefficients close to or larger than zero are found in simulations.In contrast, mixed mass compositions show a negative correlation, which can be understood as a general characteristic of air showers well reproduced within a semi-empirical model [13]: heavier primaries have on average a smaller X max (∆X max ∼ −∆ ln A) and larger N µ (N µ ∼ A 1−β , β 0.9 [14]), such that for mixtures of different primary masses, a negative correlation appears.This way, the correlation coefficient can be used to determine the spread σ(ln A) of primary masses, given by σ(ln A) = ln 2 A − ln A 2 where ln A = i f i ln A i and ln 2 A = i f i ln 2 A i with f i being the relative fraction of mass A i .In particular, a more negative correlation indicates a larger spread of primary masses.At the Pierre Auger Observatory, the fluorescence telescopes allow a direct measurement of X max and energy, and the surface array of water Cherenkov detectors provide a significant sensitivity to muons: for zenith angles between 20 and 60 degrees, muons contribute about 40% to 90% [15] of S(1000), the total signal at a core distance of 1000 m.Due to this unique feature the proposed method can be adapted via replacement of N µ by S(1000), which is a fundamental observable of the surface array.
Since S(1000) and X max of an air shower depend on its energy and, in case of S(1000), also on its zenith angle, S(1000) and X max are scaled to a reference energy and zenith angle.This way we avoid a decorrelation between the observables from combining different energies and zenith angles in the data set.S(1000) is scaled to 38 • and 10 EeV using the parameterisations from [16].X max is scaled to 10 EeV using an elongation rate d X max /d lg(E/eV) = 58 g cm −2 /decade, an average value with little variation between different primaries and interaction models [9].Here, these scaled quantities will be denoted as X * max and S * 38 .Thus, X * max and S * 38 are the values of X max and S(1000) one would have observed, had the shower arrived at 38 • and 10 EeV.It should be noted that the specific choice of the reference values is irrelevant, since a transformation to another reference value shifts the data set as a whole, leaving the correlation coefficient invariant.
As a measure of the correlation between X * max and S * 38 the ranking coefficient r G (X * max , S * 38 ) introduced by Gideon and Hollister [17] is taken.Conclusions are unchanged when using other definitions of correlation coefficients, including the coefficients of Pearson or Spearman, or other ones [18].As for any ranking coefficient, the r G value is invariant against any modifications leaving the ranks of events unchanged (in particular to systematic shifts in the observables).The main distinction from other ranking coefficients is that the values of ranks are not used directly to calculate r G .Rather the general statistical dependence between X * max and S * 38 is estimated by counting the difference in numbers of events with ranks deviating from the expectations for perfect correlation and anti-correlation.Thus, the contribution of each event is equal to 0 or 1, making r G less sensitive to a removal of individual events, as it will be discussed also below.
The dependence of the statistical uncertainty ∆r G on the number of events n in a set and on the r G value itself was determined by drawing random subsamples from large sets of simulated events with different compositions.The statistical uncertainty can be approximated by ∆r G 0.9/ √ n.For the event set used here ∆r G (data) = 0.024.

Data and simulations
The analysis is based on the same hybrid events as in [9] recorded by both the fluorescence and the surface detectors during the time period from 01.12.2004 until 31.12.2012.The data selection procedure, described in detail in [9], guarantees that only high-quality events are included in the analysis and that the mass composition of the selected sample is unbiased.The reliable reconstruction of S(1000) requires an additional application of the fiducial trigger cut (the station with the highest signal should have at least 5 active neighbour stations).This requirement does not introduce a mass composition bias since in the energy and zenith ranges considered the surface detector is fully efficient to hadronic primaries [19,20].Selecting energies of lg(E/eV) = 18.5−19.0and zenith angles <65 • , the final data set contains 1376 events.The resolution and systematic uncertainties are about 8% and 14% in primary energy [21], < 20 g cm −2 and 10 g cm −2 in X max [9], and < 12% and 5% [22] in S(1000), respectively.
The simulations were performed with CORSIKA [23], using EPOS-LHC, QGSJetII-04 or Sibyll 2.1 as the high-energy hadronic interaction model, and FLUKA [24] as the low-energy model.All events passed the full detector simulation and reconstruction [25] with the same cuts as applied to data.For each of the interaction models the shower library contains at least 10000 showers for proton primaries and 5000 − 10000 showers each for helium, oxygen and iron nuclei.

Results
The observed values of X * max vs. S * 38 are displayed in Fig. 1.As an illustration, proton and iron simulations for EPOS-LHC are shown as well, but one should keep in mind that in this analysis we do not aim at a direct comparison of data and simulations in terms of absolute values.In contrast to the correlation analysis such a comparison needs to account for systematics in both observables and suffers from larger uncertainties from modeling of hadronic interactions.In Table 1, the observed r G (X * max , S * 38 ) is given along with simulated r G values for pure compositions (σ(ln A) = 0) and for the maximum spread of masses 0.5 p − 0.5 Fe (σ(ln A) 2) for all three interaction models.For the data, a negative correlation of r G (X * max , S * 38 ) = −0.125 ± 0.024 (stat) is found.For proton simulations correlations are close to zero or positive in all models.Pure compositions of heavier primaries show even more positive correlations (r G ≥ 0.09) than for protons.Hence, observations cannot be reproduced by any pure composition of mass A ≥ 1, irrespective of the interaction model chosen.In the proton dip model, even small admixtures of heavier nuclei, such as a 15 − 20% helium fraction at the sources, were shown to upset the agreement of the pair-production dip of protons with the observed flux [1,2,26,27].The values of r G in simulations for a mixture at Earth of 0.8 p − 0.2 He are added in Table 1.They are essentially unaltered compared to the pure proton case and equally inconsistent to the observed correlation.
Further, the correlation is found to be non-negative r G (X * max , S * 38 ) 0 for all p − He mixtures.Thus, the presence of primary nuclei heavier than helium A > 4 is required to explain the data.
We also checked the case of O − Fe mixtures, i.e. a complete absence of light primaries.A minimum value of r G ≈ −0.04 is reached for mixtures produced with EPOS-LHC for fractions close to 0.5 O − 0.5 Fe.With smaller significance, light primaries therefore appear required as well to describe the /eV) E lg( 18  observed correlation.In Fig. 2 the dependence of the simulated correlation r G (X * max , S * 38 ) on the spread σ(ln A) is shown for EPOS-LHC and QGSJetII-04 (results for Sibyll 2.1 are almost identical to those of QGSJetII-04).A comparison with the data indicates a significant degree of mixing of primary masses.Specifically, σ(ln A) 1.35 ± 0.35, with values of σ(ln A) 1.1 − 1.6 being consistent with expectations from all three models.The fact that differences between models are moderate reflects the relative insensitivity of this analysis to details of the hadronic interactions.
In Fig. 3 the observed values of r G are presented in four individual energy bins.From simulations, only a minor change of r G with energy is expected for a constant composition.The data are consistent with a constant r G with χ 2 /dof 6.1/3 (P 11%).Allowing for an energy dependence, a straight-line fit gives a positive slope and χ 2 /dof 3.2/2 (P 20%).More data are needed to determine whether a trend towards larger r G (smaller σ(ln A)) with energy can be confirmed.

Cross-checks
Several cross-checks were performed.In all cases, the conclusions were found to be unchanged.The cross-checks included: (i) a division of the data set in terms of time periods, FD telescopes or zenith angle ranges; (ii) variations of the event selection criteria; (iii) variations of the scaling functions when transforming to the reference zenith angle and energy; (iv) adopting other methods to calculate the correlation coefficient [18]; and (v) studying the effect of possible 'outlier' events.Regarding (iv), the smallest difference between the data and pure compositions is found for EPOS-LHC protons and it is 5.2σ stat for r G (c.f.Table 1), and ≥ 7σ stat for Pearson and Spearman correlation coefficients.As an example of the last point (v), events were artificially removed from the data set so as to increase the resulting value of r G as much as possible, i.e., to bring it closer to the predictions for pure compositions.Removing 20 events in this way increased the value of r G by ∼ 0.01 only.For removals of sets of 100 arbitrary events, the maximum increase was ∼ 0.02.This robustness of r G against the influence of individual events and even sub-groups of events was a main reason for choosing it in this analysis.

Systematic uncertainties
Due to the analysis method and the choice of using a correlation coefficient, systematics are expected to play only a minor role (for the special case of hadronic uncertainties see below): separate systematics in the observables X max and S(1000) have no effect on r G , and the measurement of the two observables by independent detectors avoids correlated systematics.Even a correlated systematic leaves r G invariant as long as the ranks of the events are unchanged.Also if there were a more subtle issue affecting the ranks of the observed events that might have gone unnoticed so far and could require future correction (e.g.updated detector calibrations or atmospheric parameters affecting only part of the data), we note that this typically leads to a de-correlation of the uncorrected data set, i.e., to an underestimation of the present value of |r G |.Moreover, the main conclusion about the spread of primary masses results from the difference between data and simulations which remains robust for anything affecting the two in a similar way such as, for instance, during reconstruction.
As an illustration, new data sets were created from the observed one by artificially introducing energy and zenith angle dependent 'biases' in X * max (up to 10 g cm −2 ) and S * 38 (up to 10%) (it should be stressed that these are arbitrary modifications).The values of r G changed by 0.01, which is well below the statistical uncertainty.A value of 0.01 is taken as a conservative estimate of the systematic uncertainty.
The systematics in energy affect the energy bin that the observed spread is assigned to, which may be shifted by ±14%.The difference between simulation and data is left invariant since r G is practically constant with energy for a given composition.

Uncertainties in hadronic interactions
Current model predictions do not necessarily bracket the correct shower behaviour.In fact, measurements of the muon content from the Auger Observatory indicate a possible underestimation of muons in simulations [28,29].Therefore we studied whether adjustments of hadronic parameters in simulations could bring primary proton predictions into full agreement with the data.The focus is on protons since heavier nuclei, due to the superposition of several nucleons and the smaller energy per nucleon, would require even larger adjustments.
Firstly, the (outdated) pre-LHC versions of EPOS and QGSJetII were checked.Despite the updates, values of r G differ by less than 0.02 from the current versions.
Secondly, an ad-hoc scaling of shower muons was applied in simulations.Different approaches were tested: a constant increase of the muon number; a zenith-angle dependent increase; and an accompanying increase of the electromagnetic component as motivated from shower universality [30].For an effective muon scaling by a factor 1.3 as suggested by data [28,29] the simulated r G values were reduced by 0.03.While possibly slightly decreasing the difference with the data, such a shift is insufficient to match expectations for pure compositions with data.
Thirdly, following the approach described in [31] and using CONEX [32] with the 3D option for an approximate estimation of the ground signal, the effect on r G was studied when modifying some key hadronic parameters in the shower simulations.Increasing separately the cross-section, multiplicity, elasticity, and pion charge ratio by a factor growing linearly with lg E from 1.0 at 10 15 eV to 1.5 at 10 19 eV compared to the nominal values (f 19 = 1.5, cf.[31]), r G turned out to be essentially unaffected except for the modified cross-section where the value was decreased by ∆r G ≈ −0.06.Despite the large increase of the crosssection assumed, this shift is still insufficient to explain the observed correlation.Moreover, ∆r G shows in this case a strong dependence on zenith angle ( 0.0 for 0−45 • and −0.1 for 45−60 • ) making the predictions inconsistent with the data.It should be noted that any such modification is additionally constrained by other data of the Auger Observatory such as the observed X max distributions [9] and the proton-air cross-section at lg(E/eV) 18.25 [33,34].

Discussion
A negative correlation of r G (X * max , S * 38 ) = −0.125 ± 0.024 (stat) is observed.Simulations for any pure composition with EPOS-LHC, QGSJetII-04 and Sibyll 2.1 give r G ≥ 0.00 and are in conflict with the data.Equally, simulations for all proton -helium mixtures yield r G ≥ 0.00.The observations are naturally explained by a mixed composition including nuclei heavier than helium A > 4, with a spread of masses σ(ln A) 1.35 ± 0.35.
Increasing artificially the muon component or changing some key hadronic parameters in shower simulations leaves the findings essentially unchanged.Thus, even with regard to hadronic interaction uncertainties, a scenario of a pure composition is implausible as an explanation of our observations.Possible future attempts in that direction may require fairly exotic solutions.In any case, they are highly constrained by the observations presented here as well as by previous Auger results.
The minor dependence of the mass spread determined in this analysis from hadronic uncertainties allows one to test the self-consistency of hadronic inter-action models when deriving the composition from other methods or observables (e.g.[9,3,35,36]).As mentioned in the beginning, when interpreting the X max distributions alone in terms of fractions of nuclei [3], different results are found depending on the model: using QGSJetII-04 or Sibyll 2.1, one infers values of σ(ln A) ≈ 0.7 and would expect r G ≈ 0.08.This is at odds with the observed correlation and indicates shortcomings in these two models.Using EPOS-LHC, values of σ(ln A) ≈ 1.2 and r G ≈ −0.094 are obtained, in better agreement with the observed correlation.
The conclusion that the mass composition at the ankle is not pure but instead mixed has important consequences for theoretical source models.Proposals of almost pure compositions, such as the dip scenario, are disfavoured as the sole explanation of ultrahigh-energy cosmic rays.Along with the previous Auger results [3,8,9], our findings indicate that various nuclei, including masses A > 4, are accelerated to ultrahigh energies (> 10 18.5 eV) and are able to escape the source environment.

Figure 2 :
Figure 2: Dependence of the correlation coefficients r G on σ(ln A) for EPOS-LHC (left) and QGSJetII-04 (right).Each simulated point corresponds to a mixture with different fractions of protons, helium, oxygen and iron nuclei, the relative fractions changing in 0.1 steps (4 points for pure compositions are grouped at σ(ln A) = 0).Colors of the points indicate ln A of the corresponding simulated mixture.The shaded area shows the observed value for the data.Vertical dotted lines indicate the range of σ(ln A) in simulations compatible with the observed correlation in the data.

Figure 3 :
Figure 3: The correlation coefficients r G for data in the energy bins lg(E/eV) = 18.5 − 18.6; 18.6 − 18.7; 18.7 − 18.8; 18.8 − 19.0.Numbers of events in each bin are given next to the data points.The gray band shows the measured value for data in the whole range lg(E/eV) = 18.5 − 19.0.Predictions for the correlations r G in this range for pure proton and iron compositions, and for the extreme mix 0.5 p − 0.5 Fe from EPOS-LHC and QGSJetII-04 are shown as hatched bands (for Sibyll 2.1 values are similar to those of QGSJetII-04).The widths of the bands correspond to statistical errors.

Table 1 :
Observed r G (X * max , S * 38 ) with statistical uncertainty, and simulated r G (X * max , S * 38 ) for various compositions using different interaction models (statistical uncertainties are ≈0.01).