Interpreting the cosmic ray spectrum and composition measurements across the ankle and up to the highest energies with the data of the Pierre Auger Observatory

. In this work we present a combined ﬁt of the energy spectrum and mass composition data above 6 × 10 17 eV as measured by the Pierre Auger Collaboration and we show that our data can be reproduced by the superposition of two (or more) components, either Galactic or extragalactic. We use a simple astrophysical model with two extragalactic components, one dominant at high energy and the other at low energy, whose superposition generates the so-called “ankle” feature. We ﬁnd that the component dominating the low (high) energy region is required to be emitted from the sources with a very soft (hard) spectrum. In our best ﬁt scenario the sources contributing above the ankle emit a mixed and increasingly heavier mass composition, whereas a mix of protons and intermediate-mass nuclei is observed below the ankle, which reach the Earth almost una ﬀ ected by the propagation in the intergalactic space. The possible presence of a Galactic population providing the intermediate-mass contribution at low energy is also explored. In order to evaluate our capability to constrain astrophysical models, we ﬁnally discuss the impact on the ﬁt results of the main experimental systematic uncertainties and of the assumptions on the quantities a ﬀ ecting propagation, both in the atmosphere and in the intergalactic medium.


Introduction
The existence of ultra-high-energy cosmic rays (UHE-CRs), the ones reaching Earth with energies above ∼ 10 18 eV, was proven in the early 1960s, but their origin and the acceleration mechanism governing them are still unsettled. The large ground-based experiments built in the past few decades, like the Pierre Auger Observatory, are providing very high data quality and setting the basis for shedding light on such open questions. The energy spectrum of cosmic rays exhibits several features in the highest-energy region: a sharp hardening, known as the ankle, is observed at ∼10 18.7 eV; a new feature, dubbed the instep, at ∼10 19.1 eV, could be attributed to the interplay of light-to-intermediate nuclei [1]; finally, the flux suppression above ∼10 19.7 eV may be due to energy losses during the propagation of UHECRs [2,3], to the limited maximum energy the sources acceleration can reach, or possibly to a combination of the two effects. The mass composition [4,5] is estimated by the distributions of depth of maximum development of the showers X max : below the ankle it is given by a mix of protons and medium-mass nuclei, whereas above the ankle it gets increasingly heavier and less mixed, suggesting a superposition of alternating groups of elements with a steep rigidity-dependent cutoff and a progressively heavier mass (a similar sequence, known as Peters cycle [6], is associated to the knee of the cosmic-ray spectrum around 10 15.3 eV).
The quest for the origin of UHECRs is strongly related to the investigation of the energy region between the second knee and the ankle, where the transition from Galactic to extragalactic cosmic rays occurs. In the region around the ankle, the dominance of a light-to-intermediate Galactic contribution can be excluded by the low level of anisotropy in the distribution of arrival directions [7]. Consequently, the protons at and below the ankle must have an extragalactic origin and may originate either from the interactions of heavier nuclei in the source environment [8,9] or from another population of sources; as concerns the medium-mass nuclei observed around the ankle, they should be provided by an additional component, which can be either Galactic or extragalactic. Above 8 EeV, the observation of a dipole pointing away from the Galactic centre suggest an extragalactic origin for UHE-CRs [10][11][12].
In this analysis [13] we aim to constrain the physical parameters related to the energy spectrum and the mass composition of particles escaping the environments of extragalactic sources (after acceleration and in-source propagation). We simultaneously fit a simple astrophysical model to both the energy spectrum and the mass composition data measured at the Pierre Auger Observatory, considering energies from 10 17.8 eV. In a previous publication [14], a model consisting of one single population of extragalactic sources was fitted to the data above the ankle (E > 10 18.7 eV). Here, with the aim of interpreting the ankle feature as the superposition of different components, we assume the presence of one (or more) additional contribution(s) at low energies, either Galactic or extragalactic. Further details can be found in Ref. [13,15].

The astrophysical model
In our model each extragalactic component originates from a population of identical sources uniformly distributed in the comoving volume, except for a local overdensity for distances smaller than ∼ 30 Mpc, with a minimum source distance of 1 Mpc, which takes into account that the Milky Way belongs to a group of galaxies embedded on the Local Sheet. For each extragalactic population of sources the spectrum of particles escaping from the source environment is given by the superposition of the contributions of n ≤ 5 representative nuclear species A, chosen among 1 H, 4 He, 14 N, 28 Si, 56 Fe, ejected according to a power-law spectrum with a rigidity-dependent broken exponential cutoff. The generation rate Q A (E), defined as the number of nuclei with mass A ejected per unit of energy, volume and time, is then given by: where Z A is the atomic number of each species A, and Q 0A is the generation rate at a reference energy E 0 , set to an arbitrary value lower than the energy cutoff of protons.
The free parameters for each component are the spectral index γ, the rigidity cutoff R cut , and n partial normalisations Q 0A . To immediately get a more physically meaningful information about the mass composition at the sources, in our fit results the mass fractions are expressed in terms of fractions I A of the total source emissivity L 0 of each population at redshift z = 0, defined as starting from the fit energy threshold E min = 10 17.8 eV.
We also consider the possible presence of a Galactic component at Earth, which is modelled as a power law with γ Gal ∼ 3 modified by a simple exponential cutoff. 1 We performed the fit by making different assumptions on its mass composition to find the one which best describes our data. The normalisation J Gal 0 at E Gal 0 = 10 16.85 eV and the rigidity cutoff R Gal cut are free parameters of the fit. The energy spectrum and mass composition of the particles escaping from the environment of extragalactic sources are modified during their propagation in the intergalactic medium by the adiabatic energy losses and the interactions with background photons, from both the cosmic microwave background (CMB) and the extragalactic background light (EBL). We take into account these effects by using SimProp [16] simulations, where the uncertain quantities are treated with phenomenological models: we model photodisintegrations via the cross sections computed by Talys [17][18][19] with the settings described in Ref. [20], or the ones from the Puget, Stecker and Bredekamp (PSB) [21,22] model, and the EBL with the Gilmore [23] or Domínguez [24] model. The observed energy spectrum J obs (E ) is thus obtained by integrating the contributions of all the sources weighted by the redshift and modified by the effects of interactions with radiation photons.
Since a direct measurement of the mass composition is not possible on an event-by-event basis, we use the distribution of X max as an estimator of the mass distribution in each energy bin. Such a conversion depends on the choice of hadronic interaction model (HIM), which is thus another source of uncertainty. In this work, we use the HIMs Epos-LHC [25], QGSJet II-04 [26] and Sibyll 2.3d [27].

The fit procedure
We use a measurement of the energy spectrum in log 10 (E/eV) bins of 0.1 width from 10 17.8 eV to 10 20.2 eV, obtained with the data collected over 15 years with the Surface Detector Array of the Observatory [28]. As for the X max distributions, measured by the fluorescence telescopes, we consider log 10 (E/eV) bins of 0.1 from 10 17.8 eV to 10 19.6 eV and one additional larger bin containing all events with greater energies; each X max distribution is binned in intervals of 20 g cm −2 [29]. In the fit we minimise the deviance D = − ln(L/L sat ), a generalised χ 2 , where L is the likelihood of our model and L sat that of a model which perfectly describes the data. It consists of two terms, D J and D X max . The first is for the energy spectrum and is a product of Gaussian distributions. The latter is a product of multinomial distributions used for the fit of the X max distributions; they are modelled as Gumbel distribution functions [30], whose parameters depend on the hadronic interaction model.
The best-fit parameter values for each scenario are those with which the total deviance D = D J + D X max attains its minimum value D min , which we locate using the Minuit package [31]; the statistical uncertainties on the spectral parameters correspond to the half extent of the 1D profile in the parameter space where D ≤ D min + 1, as computed using the MINOS routine of Minuit; the uncertainties on the emissivity and on the mass fractions are computed with Monte Carlo simulations.
All the reference results are obtained by using Talys for the photodisintegration cross sections, the Gilmore model for the EBL spectrum and evolution, and Epos-LHC as the HIM. Other combinations of models will be discussed in Section 4.2. In order to focus on the simplest case, in this section we assume a flat cosmological evolution for the

The reference results
We consider two possible scenarios in the reference configuration. The best-fit parameter values thus obtained are listed in Table 1.
In Scenario 1 we assume an extragalactic population with a mixed mass composition dominating at high energies ("HE"), plus an additional extragalactic population of pure protons dominating at low energies ("LE"). The LE extragalactic component could e.g. arise from the photodisintegration of HE cosmic rays in the environment of their sources [9]. The heavier nuclei below the ankle are instead assumed to originate from a Galactic population. We found that a Galactic component at Earth of pure nitrogen, extending up to a relatively high energy Z R Gal cut ≈ 2×10 18 eV, provides the best fit to the data, whereas heavier compositions with no nitrogen result in deviances D 1000. Hence, here we only show the results obtained in the case of a composition of pure nitrogen.
Alternatively, in Scenario 2 we describe our data by assuming that the ankle feature is due to the superposition of two extragalactic components, one dominating at LE and the other at HE. We assume that the two components are both ejected according to energy spectra described by Eq. (1) but with different parameter values, since they are reasonably associated to two different populations of sources. We are here implicitly assuming that a possible Galactic contribution is subdominant in the considered energy range.
The generation rate at the sources estimated in Scenario 1 and in Scenario 2 is shown for each component and each ejected nuclear species in the left panel of Fig. 1 and Fig. 2, respectively. In the right panel of the same figures we report also the corresponding contributions at the top of the atmosphere grouped according to mass number. For Scenario 1 the first two moments of the X max distributions along with their best-fit function are shown in Fig. 3; very similar results are obtained in Scenario 2.
A common finding between the two proposed scenarios is that the HE component requires very hard spectra with low rigidity cutoffs and intermediate mass compositions, while the LE component requires much steeper spectra.
In order to reproduce the very pronounced spectral features of the measured energy spectrum and the rather narrow X max distributions, a negative spectral index is estimated for the HE component. However, it is important to stress that the UHECR spectrum escaping the source environment may differ from the accelerated one due to energy-dependent effects of interactions and diffusion in the source environments. Besides, also the effect of intergalactic magnetic field, here neglected, may cause a suppression of the flux at low energies [32], making the observed spectrum harder than the actual one escaping from the sources.
As concerns the LE component, if for example the assumption of identical sources is relaxed and different maximal energies are taken into account, the effective en-    ergy spectrum obtained by integrating over them would be steeper than the one of each individual source, as demonstrated in Ref. [33], and this could be a possible explanation for the very soft estimated energy spectrum.
In terms of mass composition at LE, we find that the data can be described by a mix of nitrogen and hydrogen in both scenarios, their relative contributions respectively decreasing and increasing with energy, which confirms what found in Ref. [34] about the needed mixture at the ankle. In the case of Scenario 1, the dominance of Galactic iron in the region below the ankle is disfavoured. Hence, models proposing a medium-mass contribution e.g. from explosions in the winds of Wolf-Rayet-like stars [35] would better describe our data than models foreseeing a dominance of Galactic iron in the region below the ankle, like the one originally proposed by Hillas [36].
The HE rigidity cutoff found as a result of the fit suggests that the maximum energy emitted at the sources is not high enough to entirely attribute the suppression at the highest energies to propagation effects. As concerns the LE component, the fit is degenerate with respect to R LE cut for values 10 19.5 V, thus fixing this parameter to any arbitrarily higher value provides the same best-fit results. Such a degeneracy can be explained by the fact that the estimated energy spectrum of this component is very steep, and hence it is rapidly suppressed even in the absence of an exponential cutoff, making the energy range where this component is the dominant one rather narrow and the fit insensitive to the details of its shape.
The two scenarios provide very similar results in terms of deviance values. The scenario with two extragalactic mixed components provides a better fit of the X max measurements but a worse description of the very pronounced features in the energy spectrum. One way in which a Galactic and an extragalactic below-ankle medium-mass composition would differ is in their distribution of arrival directions, which are not considered in this work, hence firm conclusions about a favoured scenario could be drawn only after a further investigation of the Galacticto-extragalactic transition region.

Effect of the systematic uncertainties
The effect of the systematic uncertainties on the fit results is here shown only for Scenario 2, since the two scenarios are nearly equivalent and the effects of different assumptions about the distribution and evolution of extragalactic sources and the propagation in intergalactic space are expected to be less noticeable in Scenario 1.

Experimental uncertainties
The energy scale and the X max scale are the most important sources of experimental systematic uncertainties. For the energy scale, an energy independent uncertainty ΔE/E = 14% is adopted in the whole considered energy region [28]. As concerns the systematic uncertainties on the measured X max values, they are asymmetric and slightly energy-dependent, ranging from 6 to 9 g/cm 2 [37].
Regarding the energy scale uncertainty, we follow the same approach used in Ref. [14] and we shift all the measured energies by one systematic standard deviation in each direction. As concerns the X max scale uncertainty, since the correlations between the lowest and the highest energy bins can go down to ∼ 0.6, we add two nuisance parameters, a and b, to the fit to allow different shifts in X max at different energies. The first two eigenvectors of the covariance matrix define two functions of energy, v 1 (E i ) and v 2 (E i ); all the X max distributions are shifted by a quantity a v 1 (E) + b v 2 (E), and an additional term given by a 2 + b 2 is added to the deviance. The parameter a shifts all the X max distributions in the same direction by an energydependent amount, whereas b has an opposite effect on the high-energy and the low-energy distributions.
The variations on the predicted fluxes at Earth are shown in the left panel of Fig. 4. The rather large uncertainty on the predicted total fluxes (brown band) is mainly due to the ±14% shifts in the energy scale, which significantly affects only the estimated source emissivities, whereas the X max uncertainties have a stroger impact on the predictions at Earth and on the deviance value .

Uncertainties from models
We explored effect of the propagation models on the fit results by repeating the fit considering different combinations of them from those used in the reference configuration. As regards the photodisintegration, we tested the PSB model, that neglects photodisintegration channels in which alpha particles rather than single nucleons are ejected. Besides, as concerns the EBL spectrum and evolution, we tested also the Domínguez model, which has a higher spectral energy density in the far infrared than the Gilmore one. Regarding the HIM, we verified that QGSJet II-04 cannot properly describe our data (D 1000 in all cases), and is thus excluded from this analysis. Instead of fixing a single HIM, we allow for the possibility to describe our data with an intermediate model between Epos-LHC and Sibyll 2.3d by introducing an additional nuisance parameter δ HIM , limited between 0 and 1. In this way each HIM-dependent Gumbel parameter is interpolated as alpha as α HIM = δ HIM α Epos-LHC + (1 − δ HIM ) α Sibyll 2.3d , so that δ HIM = 0 corresponds to "pure" Sibyll 2.3d and δ HIM = 1 to "pure" Epos-LHC. The effect of such variations on the predicted fluxes at Earth is shown in the right panel of Fig. 4.
Regardless of the propagation models configuration, our data appear to be better described by pure Epos-LHC or by intermediate models much closer to Epos-LHC than to Sibyll 2.3d: the HIM choice is then the dominant uncertainty among the ones from models in terms of predictions at Earth.
As concerns the propagation models effects, even if the impact on the deviance and on the predicted fluxes at the Earth is smaller, some changes in the best fit parameters at the sources are observed, which are in agreement with what is expected to compensate the differences in the propagation to produce similar fluxes at the Earth. Further details and the parameters values can be found in Ref. [13].  However, it is worth stressing that the impact of changing the propagation models on the deviance and on the predicted fluxes at Earth is encompassed by the effect of the experimental systematic uncertainties.

Effect of the cosmological evolution of sources
We repeated the fit considering, for each population of sources, three different models for the cosmological evolution of the source emissivity, parameterised as ∝ (1 + z) m , namely m = +5, +3 and −3, in addition to the no-evolution (m = 0) case considered so far. Each one of these choices can be associated to different classes of sources [13]. In this analysis we consider an evolution (1 + z) m with constant m in the entire considered redshift range (up to z = 10), but we have verified that other possibilities for the behaviour of the evolution at z > 1 have only a small impact on the LE component, not affecting our main conclusions, and a completely negligible effect on the HE component. This can be explained by the fact that practically all nuclei reaching us with energies ≥ 10 17.8 eV originate from z 3, and in particular those with E 10 18.4 eV from z ≤ 1.
Since the LE and HE populations might be accelerated in different classes of sources, they could have different source evolutions. Hence, we consider all sixteen possible pairs of evolutions among m ∈ {−3, 0, +3, +5}. Our results in terms of total deviance are summarised in Fig. 5.
A positive (negative) evolution means that particles were on average accelerated longer ago (more recently) than in the no-evolution scenario, and hence had the time to undergo more (fewer) interactions in intergalactic space. The effects are more noticeable for the HE population, as interactions are more frequent at high energies. In the case of a strong positive (m = 5) evolution of the HE component, corresponding to the last column in Fig. 5, its secondary flux at ankle energies exceeds the observed allparticle spectrum, so that no good fit of the data is possible (D ∼ 1000), regardless of the source evolution of the LE component. On the other hand, the scenarios with no evolution for the HE population appear to be favoured overall, though acceptable fits can also be found with a weak evolution (m = ±3).
Additional constraints on the cosmological evolution of sources may be obtained by considering the flux of the expected cosmogenic neutrinos associated to the best-fit results of each chosen source evolution scenario. In fact, cosmogenic neutrinos do not undergo any interactions during their propagation, except for adiabatic energy losses due to the expansion of the Universe and flavour oscillations, so they can reach us even from very high redshifts, providing complementary information to that available from UHECR measurements.
In general, the contribution of the HE population to the flux of expected neutrinos is negligible, regardless of its cosmological evolution: due to its rather low rigidity cutoff, the pion photoproduction interactions cannot occur on CMB photons, but only on the EBL ones, which contributes to the neutrino flux to a lesser extent because of the much greater interaction length. As a consequence,  Fig. 6. In the plots of Fig. 6 the black solid curves represent the fluxes corresponding (from the bottom to the top) to z max = 1, 2, 3, 4 and 5, assuming a constant m value in the whole redshift range; the dashed black curve shows the expected fluxes corresponding to z max = 3 with a flat source evolution above z = 1. The maximum rigidity of the LE component and its uncertainties have also a strong impact on the neutrino flux and in particular on the peak at around 10 9 GeV, corresponding to the UHECR interactions with CMB photons.
It is worth noting that future neutrino detectors will provide an improved sensitivity to cosmogenic neutrinos at energies above 10 8 GeV. As shown in Fig. 6, our predictions in the cases of positive source evolutions would be constrained by the most stringent future limits, provided by the next-generation detector upgrade of IceCube [38] and by planned detectors. We can conclude that in the near future some additional constraints on the source properties may be set, for example excluding some source evolutions for the LE component and/or limiting the possible values of its rigidity cutoff. Finally, another messenger of potential interest is the flux of gamma-ray cascades originated from photohadronic interactions of UHECRs in intergalactic space, which develop until all the secondaries have E 100 GeV. A rough estimate of the resulting gamma-ray fluxes at E 100 GeV can be obtained by applying the analytical approximation from Ref. [39] to electrons and photons produced in SimProp simulations. In this case, both the LE and the HE component have a non-negligible contribution, as the Lorentz factor threshold for electron-positron pair production by UHECRs on CMB photons is two orders of magnitude lower than for pion production. The results are compared to Fermi-LAT measurements [40] and the only scenarios in tension with the data would be the ones where the HE component has a strong positive evolution and z max 2, or the LE component has a strong positive evolution and z max 4, as shown for example in Fig. 7. However, as shown in Fig. 5 the former are already excluded by the deviance of our combined fit, and as shown in Fig. 6 the latter may also result in amounts of cosmogenic neutrinos within the reach of future planned detectors.

Conclusions
In this analysis we have performed a fit of the energy spectrum and mass composition data across the ankle as measured at the Pierre Auger Observatory. We considered the hypothesis of two extragalactic components, in presence or not of a secondary Galactic contribution, and we showed that they reasonably succeed to reproduce the ankle feature, whose sharpness would be hardly described by other scenarios. As concerns the newly observed feature Despite the fact that a definite conclusion on the presence of a subdominant Galactic flux cannot be reached, our results show that its end is compatible with the data only if it is composed by medium-mass nuclei. An extension of the combined fit to even lower energies will be more effective to investigate the transition from Galactic to extragalactic cosmic rays, increasing the lever-arm with the use of composition data from HEAT (High Elevation Auger Telescopes). Composition results below 10 17.8 eV have been already reported in a preliminary analysis [4]. An update of the X max analysis in the whole energy range is currently in progress and its results are expected to push remarkably the sensitivity of the combined fit studies in the transition region.
A central part of this analysis is the detailed treatment of the systematic uncertainties and the investigation of their effect on the fit results. The possible systematic uncertainties from both experimental and model sources do not spoil our main conclusions, even though they have a non-negligible impact on the deviance value and they affect the fit parameters.
Thanks to the inclusion in our fit of data from the below-ankle region, some constraints on the cosmological evolution of the two populations can be set: very strong evolutions would cause a flux of secondary particles at the ankle exceeding the observed spectrum, even in presence of a negligible contribution from the LE component in that region, and are thus excluded by our fit. Besides, for some of the considered scenarios the predictions of cosmogenic neutrino fluxes might reach the sensitivity range of the next-generation detectors and could be thus excluded in the near future.
Further insight on the possible sources of UHECRs can be gained by extending the combined fit to include the arrival directions information to the spectrum and composition data. The results of a preliminary analysis were shown in Ref. [41].
In this analysis, the mass composition data do not extend to energies where the suppression occurs, because of the limited duty cycle of the FD. The interpretation of the suppression in the flux as due to propagation effects or to the maximum energy reached in the sources can provide some constraints on the sources of UHECRs and their properties. In the near future, mass composition estimates will be obtained through X max and the muon content of showers by using machine learning techniques on SD data [42,43]. Furthermore, the Pierre Auger Observatory is currently undergoing an upgrade, AugerPrime [44,45], that includes the deployment of scintillators on top of the SD stations to better distinguish the muonic and electromagnetic content of the showers. This will improve the measurement of the mass composition at the highest energies, allowing us to investigate the possible presence of a sub-dominant light contribution and extend up to the suppression region to perform an analysis similar to the one presented here with much larger statistics.