A novel approach to quantifying the sensitivity of current and future cosmological datasets to the neutrino mass ordering through Bayesian hierarchical modeling

We present a novel approach to derive constraints on neutrino masses from cosmological data, while taking into account our ignorance of the neutrino mass ordering. We derive constraints from a combination of current and future cosmological datasets on the total neutrino mass $M_\nu$ and on the mass fractions carried by each of the mass eigenstates, after marginalizing over the (unknown) neutrino mass ordering, either normal (NH) or inverted (IH). The bounds take therefore into account the uncertainty related to our ignorance of the mass hierarchy. This novel approach is carried out in the framework of Bayesian analysis of a typical hierarchical problem. In this context, the choice of the neutrino mass ordering is modeled via the discrete hyperparameter $h_{type}$. The preference for either the NH or the IH scenarios is then encoded in the posterior distribution of $h_{type}$ itself. Current CMB measurements assign equal odds to the two hierarchies, and are thus unable to distinguish between them. However, after the addition of BAO measurements, a weak preference for NH appears, with odds of 4:3 from Planck temperature and large-scale polarization in combination with BAO (3:2 if small-scale polarization is also included). Forecasts suggest that the combination of upcoming CMB (COrE) and BAO surveys (DESI) may determine the neutrino mass hierarchy at a high statistical significance if the mass is very close to the minimal value allowed by oscillations, as for NH and $M_\nu=0.06$ eV there is a 9:1 preference of NH vs IH. On the contrary, if $M_\nu$ is of the order of 0.1 eV or larger, even future cosmological observations will be inconclusive. The unbiased limit on $M_\nu$ we obtain with this innovative statistical strategy is crucial for ongoing and planned neutrinoless double beta decay searches.


Introduction
According to the standard theory of neutrino oscillations (see e.g. [1] for an updated review and relevant references), the observed neutrino flavours ν α are a superposition of the massive eigenstates ν i : where the index α can be any of the three active neutrino flavours e, µ, τ , the index i = 1, 2, 3 runs over the three massive eigenstates and U is the Pontecorvo-Maki-Nakagawa-Sakata mixing matrix, containing the neutrino mixing angles as well as the CP violating phases (one Dirac phase, as well as two additional Majorana phases, that are non vanishing only if neutrinos are Majorana particles). Cosmological measurements of the cosmic microwave background (hereafter CMB) anisotropies and of the spatial distribution of galaxies provide the tightest bounds on the total neutrino mass, defined as the sum of the three neutrino mass eigenstates, i.e. M ν ≡ m ν,i ≡ m 1 + m 2 + m 3 . The most reliable bound that can be obtained combining Planck data with external datasets is M ν < 0.21 eV (at 95% CL 1 ) [2], from temperature plus large-scale polarization CMB anisotropies and baryon acoustic oscillation (BAO) data (see also Refs [3][4][5][6][7] for constraints obtained by the combination of additional datasets and/or in more extended cosmological scenarios). From neutrino oscillation data, we know that at least two out of the three mass eigenstates should be massive, as two different mass splittings are measured with percent accuracy by current experiments: the solar ∆m 2 21 ≡ m 2 2 − m 2 1 7.6 × 10 −5 eV 2 and the atmospheric |∆m 2 31 | ≡ |m 2 3 − m 2 1 | 2.5 × 10 −3 eV 2 mass gaps. Matter effects in the sun tell us that the mass eigenstate with the larger electron neutrino fraction has the smaller mass. We identify this state with "1" and the heavier state (with a smaller electron neutrino fraction) with "2". Therefore, the solar mass splitting is positive. However, current neutrino oscillation data are unable to determine the sign of the largest mass splitting, the atmospheric mass gap. Two possible scenarios therefore appear, corresponding to the two possible signs of ∆m 2 31 : the normal hierarchy (NH hereafter), in which the atmospheric gap is positive, and corresponds to m 1 < m 2 < m 3 , and the 1 We would like to warn the reader that the abbreviation "CL" is generally reserved for frequentist confidence level, whereas throughout this work we refer to bayesian credible intervals (see e.g. [19]). In doing so, we have decided to adopt the common behaviour of speaking of confidence intervals even in the bayesian framework.
inverted hierarchy (IH in what follows), in which the atmospheric gap is negative, and corresponds to m 3 < m 1 < m 2 2 . Assuming that the mass of the lightest mass eigenstate is zero, which equals to set to zero the mass of m 1 (m 3 ) in the NH (IH), it is possible to obtain a lower bound on the sum of neutrino masses of M ν = ∆m 2 21 + ∆m 2 31 0.06 eV (M ν = ∆m 2 31 + ∆m 2 31 + ∆m 2 21 0.1 eV) from neutrino oscillation measurements.
Neutrino mixing phenomena are sensitive to the neutrino mass splittings only, not to the individual neutrino masses nor to the overall mass scale. Cosmology provides one of the most suitable places where to test and extract the neutrino mass ordering [8][9][10][11][12], see also the recent work of Refs. [7,13]. Despite the fact that current bounds on the neutrino mass M ν show a dependence on how the mass is distributed among the three mass eigenstates [6], present cosmological measurements are not able to firmly single out nature's choice for the mass hierarchy. Consequently, in the absence of a robust measurement of the neutrino mass ordering, a desirable bound on M ν would be one which does not rely on any assumption (or, to be more precise: that relies on the less informative possible assumption) about the hierarchical distribution of the total mass among the three eigenstates. This kind of problem, where the distribution of the parameters of the model under scrutiny are themselves conditionally dependent on the so-called hyperparameters (namely, the bounds on M ν are extracted by assuming a specific mass splitting), is a typical example of a hierarchical model in statistical inference. In this work, we propose a novel method to get a hierarchy-independent bound on M ν , by means of a new discrete parameter, the hyperparameter, h type (that can in practice be identified with the sign of the atmospheric mass splitting), introduced in the standard Monte Carlo Markov chain (MCMC) analysis. This innovative strategy benefits from the fact that the sensitivity to the neutrino hierarchy is simply and unbiasedly extracted from the posterior probability distribution of h type . We shall add this parameter while analyzing current and future cosmological data, to illustrate the power of this technique.
We stress that our approach is different from the one, already found in the literature, in which a single, continuous parameter is used to parametrize both the sum of neutrino masses and the hierarchy [12]. The drawback of the approach proposed in Ref. [12], when used in an MCMC framework 3 , is that it is not possible to disentangle the prior assumptions on the individual masses and on the hierarchy; in particular, a flat (uninformative) prior on the hierarchy implies a non-flat prior on the mass. In our approach, we are free to specify noninformative priors for both the hierarchy and the mass of the lightest eigenstate.
Apart from cosmological probes, there also exist laboratory avenues which are sensitive to the absolute mass scale. In this context, neutrinoless double β decay (0ν2β) searches (see e.g. Refs. [14][15][16][17]) are intriguing, as a positive signal would guarantee that neutrinos have a non-zero Majorana mass [18]. Double beta decay is a rare spontaneous nuclear transition in which the charge of two isobaric nuclei changes by two units, emitting two electrons. The dominant mode of this decay also produces two electron antineutrinos, conserving lepton number and therefore, it is allowed in the standard model framework. Double β decay without antineutrino emission, violating lepton number by two units, is the neutrinoless double β decay. Planned 0ν2β experiments might have the required sensitivity to completely cover the region of the parameter space where a positive signal is expected in the case of IH distribution of the total neutrino mass. Robust limits on the total neutrino mass coming from cosmology can further reduce the allowed region of the parameter space where to look for 0ν2β events.
The paper is organized as follows: we describe our method and provide details of the parameterization we adopt in Sec. 2; we present and discuss the implications for present cosmological data, future CMB and BAO missions and neutrinoless double beta decay experiments in Secs. 3, 4 and 5 respectively. We conclude in Sec. 6.

Method and data
2.1. Statistical framework and choice of the relevant parameters The problem we deal with in this work is a typical example of a statistically hierarchical 4 model (see e.g. [19]). A key feature of hierarchical problems is that the parameters θ of the model introduced for constraining the observables through the data d are modeled conditionally on further parameters, the hyperparameters φ, which have themselves their own prior probability distribution p( φ). As a result, we can define a joint prior distribution so that the proper posterior distribution P ≡ p( θ, φ | d) of the parameters (both "normal" and "hyper") can be written, using Bayes' theorem, as is the likelihood function, in which we dropped the explicit dependence on φ, since the data depends on φ only through θ, and E ≡ L · Π d θd φ is the model evidence, or marginal likelihood. The latter does not depend on the parameters, and thus represents just a multiplicative constant as long as parameter estimation is concerned 5 .
In the case under investigation in this work, the model parameters are extracted conditionally on the choice of the neutrino mass hierarchy. This choice is modelled by introducing a discrete hyperparameter h type that can take two values, corresponding to NH and IH (i.e to sgn ∆m 2 31 = +1 or −1, respectively). Since little is known from current experiments about the preference for one of the two neutrino hierarchies, either normal or inverted, we assign equal a priori probability to the two possible outcomes that h type could take.
We therefore perform a MCMC analysis of an eight-dimensional parameter space. We consider the usual set of six cosmological parameters in the ΛCDM scenario, namely the baryon density Ω b h 2 , the cold dark matter density Ω c h 2 , the angular size of the sound horizon θ s , the reionization optical depth τ , the scalar spectral index n S and the amplitude ln[10 10 A s ] of the power spectrum of primordial scalar perturbations normalized at the pivot scale k 0 = 0.05 Mpc −1 . All these parameters are extracted from flat prior distributions.
The inclusion of massive neutrinos is perfomed in the following way: we assume three massive non-degenerate eigenstates sharing the same temperature T ν = (4/11) 1/3 T γ . We sample, again with a flat prior 6 , over the lightest eigenstate mass m light (thus corresponding to the seventh parameter of the model), which is equivalent to m 1 (m 3 ) in the NH (IH) scenario. The mass of the remaining two neutrino states is set by oscillation measurements through the solar and atmospheric mass gaps, i.e. the so-called squared mass differences, defined as ∆m 2 ij = m 2 i − m 2 j , with i, j = 1, 2, 3. While the solar mass splitting is by convention positive, i.e. ∆m 2 21 > 0, the sign of the atmospheric mass gap ∆m 2 31 , as previously stated, remains still unknown, and it depends on the hierarchical distribution of the total mass among the eigenstates, with ∆m 2 31 > 0(< 0) in the NH (IH) scenario. This is the reason why the lightest eigenstate corresponds to m 1 in the NH scenario, while it is mapped onto m 3 in the IH scenario. We use the latest best-fit values for the oscillation mass gaps [21,22].
As anticipated at the beginning of this section, the choice of the hierarchy is addressed via the discrete hyperparameter h type , which is the eight parameter of the model. At each step, we extract h type from {NH, IH}, assigning equal a priori probability to the two hierarchical scenarios (i.e., we use the discrete equivalent of a flat prior). In the formalism sketched at the beginning of the section, θ ≡ (Ω b h 2 , Ω c h 2 , θ s , τ, n s , ln[10 10 A s ], m light ), while φ = h type . The inclusion of the hyperparameter allows us to handle our ignorance about the true hierarchical distribution of the mass as a nuisance parameter, to be marginalized over. In this way, the posterior distribution of h type for a given datasets contains information about the preference for one of the two hierarchies from that dataset. This is easily done from the chains generated by the MCMC algorithm, by computing the marginalized probabilities P NH and P IH , defined as and similarly for P IH . This information is conveyed by reporting the "odds" for NH vs. IH, i.e. the ratio P NH : P IH 7 . To compute the cosmological constraints and the posterior probability distributions in this extended ΛCDM scenario, we make use of the latest version of the publicly available MCMC package cosmomc [23,24], exploiting the Gelman and Rubin statistics for monitoring the convergence of the generated chains [25]. We quote our results in terms of 95% credible intervals for the parameters. Given that we will be dealing with possibly multimodal distributions, the credible intervals can consist of the union of disjointed regions.

Cosmological Datasets
Current CMB and BAO measurements are considered. Results are presented separately for CMB temperature and low-multipole polarization data (the former ranging from multipoles = 2 up to = 29) from the Planck mission (TT + lowP), and for the addition, to the previous measurements, of the high multipole (i.e. small-scale) polarization and cross-correlation spectra, i.e. the full Planck data release (TT, TE, EE + lowP), see Refs. [2,26,27]. We remind the reader that since, as discussed in 7 We would like to report that Eq. 4 is equivalent to Eq. 2.1 of [7], as it should be from the application of the basic rules of probability (including Bayes theorem). The novelty of our approach is that we include the parameter describing the hierarchy directly in the Monte Carlo, together with m light . This means that we dont need to assume that the likelihood only depends on Mν . This is certainly a wellmotivated approximation for present data. However our method can also be applied in cases in which this approximation does not work anymore (like for future experiments, or when non-cosmological data are added to the analysis). Another advantage is that, in our approach, we can include a flat prior on m light . We also obtain, for free, limits on the other parameters that take into account the uncertainty on the hierarchy. So, even though the starting point of the two approaches (Eq. 4 of this work and Eq. 2.1 of [7]) is the same, the implementation is different, and, in our case, more general. Nevertheless, we would like to emphasize that, when the two approaches are expected to lead to the same results -as it is the case for present data that are only sensitive to the sum of the masses -, they actually do.
Refs. [2,27], small-scale polarization data could still be affected by low-level residual systematics, results obtained without using them should be regarded as more reliable. These measurements are analyzed by means of the publicly available Planck likelihood code [27], and foregrounds or extra nuisance parameters are carefully treated following the prescription detailed in Refs. [2,27].
We combine the two CMB datasets described above with geometrical information from galaxy clustering, i.e. via the BAO signature. All the BAO measurements we exploit here are expressed as de- representing a combination of the line-of-sight clustering information (as encoded by the Hubble parameter H) and the transverse clustering information (encoded in the angular diameter distance D A ) at the effective redshift z eff of the survey, and r s (z drag ) being the sound horizon at the drag epoch 8 . Concretely, we make use of the BAO results from the 6dF Galaxy Survey (6dFGS) [29] and from the BOSS DR11 LOWZ and CMASS samples [30], focusing at z eff = 0.106, and z eff = 0.32, 0.57, respectively. The combination of these measurements will be referred to as BAO. The addition of galaxy clustering measurements, apart from breaking pure geometrical degeneracies among different cosmological parameters (as, for instance, the one existing between the Hubble constant H 0 and the neutrino mass M ν [31]), also helps enormously in pining down the neutrino mass limits, as the free streaming nature of sub-eV neutrinos will leave a clear imprint in the galaxy power spectrum at scales in the regime of interest, see e.g. [6]. We also perform forecasts for future CMB and galaxy clustering data.
For the fiducial values of the six ΛCDM parameters, we use the mean values of the estimates reported in [32] for PlanckTT+SIMlow. Concerning CMB measurements, we consider a future COrE-like [33] satellite mission, generating mock lensed temperature and polarization data accordingly to Refs. [34,35]. We assume perfect foreground subtraction as well as precise control of systematics. The expected noise spectra (which relies on specific experimental setup, such as the sky fraction, the beam width, and the temperature and polarization sensitivities) has been verified against previous results in the literature [36,37], finding an excellent agreement. Future galaxy clustering data are added by means of the expected independent observations of the BAO signal along and across the line of sight from the Dark Energy Instrument (DESI) Experiment [38]. DESI observations will provide separate measurements of H(z)r s and D A (z)/r s at a number of redshifts. This survey is expected to cover 14000 squared degrees of the sky in the redshift range 0.15 < z < 1.85. We follow the DESI Conceptual Design Report, and also Ref. [39] for the percentual errors on both H(z)r s (z) and D A (z)/r s (z) expected from the three types of DESI tracers, (namely, Emission Line Galaxies, Luminous Red Galaxies and High Redshift Quasars) and assume an identical 0.4 correlation coefficient between the percentual errors on H(z)r s (z) and D A (z)/r s (z) (see e.g. [39,40]).
Furthermore, we include, in the future cosmological data, a 1%-measurement of the Hubble constant H 0 . This is implemented in the form of a gaussian prior on H 0 in the analysis with future cosmological data. However, given the fact that the COrE sensitivity is such that it will expectedly allow to determine the value of H 0 below that precision, such a prior on the Hubble constant does not play any crucial role.

Results from present cosmological measurements
In this section, we present the bounds on the neutrino mass parameters derived from different combinations of current cosmological probes, as well as discuss the sensitivity of the very same probes to the neutrino mass ordering. Table 1 shows the 95% CL constraints on the total neutrino mass, on the lightest neutrino mass, and on the individual neutrino masses associated to each neutrino mass eigenstate after marginalization over the h type parameter. For each parameter that appears in the table, with the exception of h type , we quote our results in the following way: if i) the 95% confidence interval includes one of the edges of the prior range for that parameter (this is the case for M ν and for the individual masses), or ii) the posterior probability distribution is bimodal (this is sometimes the case for M ν , see below), then we report the 95% confidence interval in the form [min, max]; iii) otherwise,  we report the 95% confidence interval in the form (mean ± uncertainty). In the last line of the table, we report the results for h type , in the form (odds for NH : odds IH). The odds shown in the second column of Table 1 show how CMB temperature data alone are not sensitive to the different mass parameterizations. This is due to the broad bound that the CMB temperature data set on the total neutrino mass, M ν < 0.740 eV at 95% CL. In fact, the potential for cosmological observations to discriminate between the two mass orderings mainly relies on the capability to push the upper limit on M ν close or even below 0.1 eV, the minimal value of the mass allowed by oscillation data in the case of IH. This is mainly due to the fact that the region M ν < 0.1 eV is only allowed in the case of normal hierarchy, so that tighter upper bounds on M ν end up favouring the NH scenario simply because of the larger volume in parameter space available to the model. Moreover, the region of masses with M ν 0.1 eV is the one where the mass patterns predicted by NH and IH, and the resulting cosmological perturbations, differ the most. The differences are however small, given the sensitivity of present, and possibly also future, experiments, and the dominant contribution to the constraining power still comes from the sheer amount of volume in parameter space available to the two models. On the opposite, when M ν 0.1 eV (i.e., M ν ∆m 2 21 , ∆m 2 31 ), we are in a situation in which both hierarchies effectively coincide with the degenerate scenario m 1 m 2 m 3 , and the differences in the evolution of perturbations induced by the mass ordering are too small to have any observable consequence, given current sensitivities. Given that the we find a bound from CMB temperature anisotropies and large-scale polarization M ν < 0.740 eV, most of the parameter space available, given this data, is in the "effectively degenerate" region, where the two hierarchies cannot be distinguished . Indeed, the posterior distributions for the mass fractions f ν,i ≡ m i /M ν for the case of CMB temperature data are clearly peaking on f ν,i = 1/3, as it would be in the fully degenerate scenario. Notice that the addition of smallscale CMB polarization measurements slightly improves the neutrino mass bound (M ν < 0.558 eV at 95% CL) but it does not change significantly the overall picture.
We note that the bounds we find seems to be larger, when a direct comparison is possible, than those found in Ref. [2] without the marginalization over the two possible mass orderings: compare, e.g., our bound M ν < 0.740 eV with M ν < 0.715 eV, the 95% upper bound from Planck TT+lowP in the context of the ΛCDM + M ν model, assuming three massive degenerate neutrinos [2]. The reason is the following. In the present analysis, a nonvanishing lower bound on the total neutrino mass, M ν,min = 0.058 eV is naturally imposed by taking into account neutrino oscillation measurements, while in Ref. [2] it is only assumed that M ν ≥ 0. As a consequence, in our case the 95% confidence regions for M ν are shifted, by definition, towards larger masses. The same care should be applied when comparing to similar constraints reported in the literature.
The inclusion of BAO measurements results in much tighter neutrino mass bounds than those obtained with CMB data only. The combination of the cosmological data then starts to be sensitive to the region of m light in which the NH and the IH scenarios correspond to different neutrino mass spectra. This can be clearly noticed from the bimodal distributions in both M ν and in the neutrino mass fractions depicted in Fig. 1, where two distinct peaks appear, each one being associated to the most probable value of the parameter for a given choice of the hierarchy. In fact, focusing on M ν , we find that the most probable value is M ν = 0.059 eV = M NH ν,min , but a second peak is also clearly visible in M ν = 0.098 eV = M IH ν,min . These peaks are associated to the (single) peak at m light = 0 in the posterior for the mass of the lightest neutrino, that gets mapped to two distinct values of M ν depending on the hierarchy.
By focusing on the probability odds of the hyperparameter h type in the two hierarchies, one might be able to assess whether current cosmological data favour one of the two hierarchical scenarios and to what extent. We can conclude that while there is still no compelling evidence for the cosmological data to prefer one of the two scenarios, the combination of CMB and BAO slightly favors the NH scheme (4:3 odds in favor of NH, without using small-scale polarization, or 3:2 if we use it). This is also confirmed by the inspection of e.g. the top upper panel of Fig. 1, where the combination of CMB and BAO data is able to unveil a bimodal posterior distribution of M ν .
Finally, we note that the mean values and errors of the standard ΛCDM parameters Ω c h 2 , H 0 and Ω m shown in Tab. 1 are indeed very close to those quoted in Ref. [2] for the corresponding data sets, and derived without taking into account the uncertainty on the hierarchy 9 . This is yet another reflection of the fact that in the most part of the high-probability region of the parameter space, the mass spectrum is effectively degenerate.
To conclude this section, current cosmological data are only mildly sensitive to the neutrino mass ordering, with a preference of 4:3 in favour of NH from the combination of Planck CMB data and BAO measurements.

Forecasts for future CMB and BAO surveys
In this section, we present forecasted constraints from future cosmological surveys on the neutrino mass parameters and discuss whether the improved sensitivity of the next-generation cosmological observatories will help unraveling the dilemma about the neutrino mass ordering.
We present the results for the forecasted COrElike [33] CMB mission and a DESI-like survey [38] in Tab. 2 and Fig. 2. The results are quoted following the same scheme adopted in Tab. 1 and detailed at the beginning of Sec. 3. We have considered three possible fiducial scenarios: two NH schemes, one with M ν = 0.06 eV and the other one with M ν = 0.1 eV, and one IH scenario, also with M ν = 0.1 eV. Notice that even for a future CMB mission as COrE it will be very difficult to extract with high statistical significance the neutrino mass hierarchy in any of the three fiducial scenarios explored here, even in the case of m light = 0 (M ν = 0.06 eV). Nevertheless, by a comparison between the results in Tabs. 1 and those in Tab. 2 one can learn that the expected sensitivity of COrE alone on the neutrino mass measurements is slightly better than current combined CMB and galaxy clustering searches. This accuracy could be reinforced, for instance, by improved measurements from low-redshift experiments, such as adding a prior on the Hubble constant which simulates a ∼ 1%-measurement of H 0 . However, the impact of such a constraint will be almost negligible, as the COrE mission alone reaches already that precision in H 0 . We therefore focus on additional information coming from galaxy surveys and add 9 The cosmological parameters constraints for the ΛCDM model and many extensions, from several combinations of Planck 2015 and external data, can be downloaded from the Planck Legacy Archive http://www.cosmos.esa.int/web/ planck/pla.    future forecasted measurements of the Hubble parameters and of the angular diameter distance from the DESI survey [38]. Adding BAO measurements improves considerably the results for the fiducial model with M ν = 0.06 eV; in this case, we find a 9 : 1 preference of NH versus IH. The great improvement due to the addition of DESI BAO data when M ν = 0.06 eV can be clearly visualized from the first two panels of Fig. 2: notice that the second peak at ∼ 0.1 eV in the M ν posterior is significantly reduced after the inclusion of BAO data.
On the contrary, for the M ν = 0.1 eV case, the situation is dramatically different. In fact, even if the addition of BAO measurements helps at pinpointing the value of sum of neutrino masses (as it can be seen by comparing, in Fig. 2, the width of the dashed black and blue curves with that of their solid counterparts), nevertheless the data still remain completely uninformative for what concerns the mass splitting, as it can be inferred by looking at the numbers reported in the last row of Tab. 2, second and third columns. This points to the fact that, as already explained in Sec. II in reference to current experiments, the capability of future CMB and BAO observations to discriminate the neutrino mass hierarchy mainly relies on volume effects, i.e., on the possibility of excluding M ν ≥ 0.1 eV with a high statistical significance; this is the case for the fiducial model with M ν ≥ 0.06 eV. When instead M ν = 0.1 eV (or larger), as in the other two fiducial models considered here, the two mass orderings should be disentangled through the effect of the individual neutrino masses on the evolution of cosmological perturbations. Our findings clearly indicate that this is beyond the reach of next-generation CMB and BAO experiments, even in the most optimistic case (for M ν > 0.1 eV, the differences between the two hierarchies are even smaller). A possible improvement could come from highly accurate measurements of the matter power spectrum [12]; see also Ref. [41] for an appraisal of future 21 cm facilities. Alternatively, one should combine results coming from cosmological analysis with constraints obtained in laboratory searches, as we shall see in the following section.

Implications for 0ν2β
In the following, we shall discuss the implications of our analyses for current and future neutrinoless double beta decay searches, as well as dis-cuss the possibility to gain more information by the combination of cosmological observations with 0ν2β experiments. The non-observation of neutrinoless double β decay processes provides at present bounds on the so-called effective Majorana mass of the electron neutrino where T 0ν 1/2 is the neutrinoless double β decay halflife, m e is the electron mass, G 0ν is a phase-space factor and M is the nuclear matrix element (NME), a crucial quantity whose uncertainties affect significantly the interpretation of current and future searches for 0ν2β decay events.
The effective Majorana mass is related to the neutrino mass eigenvalues as follows: where we have written the mixing matrix U of Eq.
(1) as the product of a matrix V , that contains the mixing angles and the Dirac phase, and a diagonal matrix that contains the Majorana phases φ k , see e.g. Sec. 14 of Ref. [1]. Since one of the phases can always be rotated away, we can assume that φ 1 = 0. While the elements of V are the same that enter the oscillation probabilities, the Majorana phases play no role in neutrino oscillation processes. However, they enter in the determination of the 0ν2β half-life, as it is clear from Eqs. (6)- (7), and are thus crucial for 0ν2β experiments. In fact, if the hierarchy is normal, the phases could also arrange to produce a vanishing Majorana mass, and thus no observable 0ν2β signal, even for non vanishing values of the individual masses.
In the previous sections, we have combined cosmological observations (that mainly constrain M ν ) and oscillation measurements (that probe the mass differences) to derive limits on the individual masses, taking into account our ignorance of the mass hierarchy. Since oscillation measurements also constrain the elements of the mixing matrix, we can exploit the same strategy to derive constraints on the Majorana mass [related to the other parameters of the MCMC analysis by Eq. (7)], provided that we also take into account our ignorance of the true values of the Majorana phases.
Following Ref. [42], we thus consider the two Majorana phases as extra parameters in the MCMC analysis, φ 2 and φ 3 , with flat priors in the range [0, 2π], as currently the values of these phases are totally unknown. We also do not consider here the possibility of future independent measurements of the phases. We then extend the analysis discussed in the previous sections for Planck TT,TE,EE+lowP+BAO, as an example of current data, and for COrE+DESI, as an example of future data, extracting the posterior distribution for the Majorana mass. In case of future data, we consider two fiducial models with NH mass ordering and either M ν = 0.06 eV or M ν = 0.1 eV. The 95% CL limit we find from current cosmological data, after marginalization over the hyperparameter h type , is m ββ < 0.056 eV. We illustrate in Fig. 3 the 68% and 95% CL probability contours in the M ν − m ββ plane. We also depict together the tightest bounds on neutrinoless double beta decay searches, coming from the KamLAND-Zen experiment, with 90% CL limits on m ββ < 61−165 meV 10 , the precise value depending crucially on the NMEs assumption [44]. A visual inspection of the two-dimensional posterior makes evident that the posterior is bimodal, as it can be seen more clearly in Fig. 4. These results show that there exist two separated regions of large probability, one preferring vanishingly small values of m ββ and extending down to M ν = 0.06 eV, the other peaking around m ββ = 0.04 eV and M ν = 0.1 eV. This is due to the fact that we are not assuming one of the two hierarchies, but we are instead marginalizing over the mass ordering. The two regions of large probability roughly trace the portions of parameter space that would be preferred assuming either the normal or inverted hierarchy. To be more precise, the preference for m ββ = 0 is given by models with normal hierarchy, while the region around m ββ = 0.04 eV is mostly due to models with inverted ordering (with some contribution from the tail of the posterior distribution of models with h type = NH).
Interestingly, current sensitivities to m ββ start to reach the allowed region by cosmological and oscil-lation measurements, and, consequently, if nature has chosen the inverted hierarchy, a positive signal from neutrinoless double beta decay searches could be imminent (providing of course neutrinos possess a Majorana character and barring highly exotic physical scenarios). At present, both cosmological and laboratory tests of m ββ provide very similar constraints on the Majorana mass (assuming the most favorable values for the NMEs).
We would like to emphasize again that the two large-probability regions in the {M ν , m ββ } plane are not drawn separately by assuming, in turn, each of the two hierarchies as the true one, a priori, as usually done in literature. On the contrary, the appearance of the two regions is a direct consequence of the hierarchical model built via the hyperparameter h type and of the corresponding marginalization. This also allows to assess the relative probability of m ββ lying in each of the two regions.
We have also performed a forecast to compute the expected sensitivities to m ββ from future cosmological data. Combining CORE and DESI, for a NH scenario, we obtain the 95% CL bounds of m ββ < 0.034 eV and 0.005 eV < m ββ < 0.053 eV, assuming that M ν = 0.06 eV and M ν = 0.1 eV, respectively. Notice that for the M ν = 0.1 eV case the expected limit on m ββ is very close to the current one, as for this particular scenario future cosmological measurements will most likely be unable to determine the neutrino mass ordering. Figure 5 depicts the two-dimensional contours in the M ν −m ββ plane for the two possible fiducial models above mentioned, together with the expected upper bounds (assuming m ββ = 0) from a future, nEXOlike [47] neutrinoless double beta decay experiment. In the case of M ν = 0.1 eV the prospects of observing a positive signal from a future 0ν2β decay are very good (provided neutrinos are Majorana and the mass mechanism is responsible for 0ν2β decay), despite the fact that the hierarchy can not be determined via cosmological measurements. In this situation the hierarchy could be extracted by neutrinoless double beta decay itself, since a positive signal characterized by m ββ 0.05 eV would suggest an IH scenario. Alternatively, a positive signal characterized by m ββ 0.02 eV plus the expected sensitivity of σ(m ββ ) ∼ 0.01 eV would point to a NH scenario and discard the IH scenario with high statistical significance.
If, on the other hand, M ν is closer to the minimal value allowed in the NH scenario, the sensitivity of future neutrinoless double beta decay searches may not be enough to detect the putative signal from Majorana neutrinos, due to the possible disruptive interference played by oscillation parameters in the definition of the Majorana mass. Nevertheless, in that case, future cosmological data can single out the neutrino mass ordering with high significance.

Conclusions
We have presented constraints on cosmological parameters in the context of a ΛCDM + M ν scenario, with M ν representing the sum of neutrino masses, assuming three massive non-degenerate eigenstates and properly taking into account the neutrino mass ordering. Indeed, the novelty of the study presented here relies on our treatment of the neutrino mass ordering, currently totally unknown. We implement the neutrino hierarchy ambiguity by means of a hyperparameter h type to be marginalized over. This approach allows us to (i) model the exact mass splittings without making use of approximations, and including the information from oscillation measurements; (ii) to quantitatively assess the preference for one of the two hierarchies in a straightforward fashion, without the need for computing the Bayesian evidence for performing model comparison, and (iii) to account for the incomplete knowledge of the neutrino hierarchy that could potentially affect the neutrino mass bounds. We have employed current cosmological data coming fom the Planck satellite measurements of the CMB anisotropies and a compilation of BAO measurements at different redshifts. We have also performed forecasts for future cosmological missions, such as the proposed CMB satellite mission COrE and the future galaxy survey DESI.
Focusing on current cosmological measurements, we have shown that CMB temperature and polarization data alone are not sensitive enough to discriminate between the two hierarchies. When BAO information is included, present cosmological probes start to be weakly sensitive to the mass ordering (3 : 2 or 4 : 3 odds in favor of NH, with or without CMB small-scale polarization, respectively), although compelling evidence for one of the two is still lacking. Marginalizing over the hierarchy parameter slightly worsens the neutrino mass limits, albeit galaxy clustering data in the form of BAO measurements lead to results very similar to those obtained in the absence of the hyperparameter h type .
Concerning future experiments, their combination turns out to be really powerful, as it will lead to a 9 : 1 preference for the normal hierarchy scenario versus the inverted hierarchy one, assuming a fiducial cosmology with a sum of neutrino masses M ν = 0.06 eV. However, for larger masses M ν = 0.1 eV distinguishing the hierarchy via cosmological measurements alone turns out to be an extremely difficult task. Adding other possible future improvements, as, for instance, a 1% prior in the value of the Hubble constant, will not change significantly the results, since the COrE mission is expected to provide a smaller uncertainty on H 0 . Additional constraining power might come from more precise measurements of the shape of the matter power spectrum, provided that systematics and uncertainties related to the exact modelling of the perturbation behaviour in the non-linear regime (where we expect neutrinos to leave their most peculiar signature on the matter power spectrum) are kept under control.
We have also studied the implications and the complementarity with neutrinoless double beta decay searches. Current limits from the KamLAND-Zen experiment are competitive and consistent with the tightest cosmological limit we find here on the effective Majorana mass, m ββ < 0.056 eV. These results imply that, if nature has chosen the inverted hierarchy scheme and the Majorana neutrino character (versus the normal hierarchy scenario and the Dirac nature), a positive signal from neutrino neutrinoless double beta decay searches, as well as a cosmological detection of the neutrino mass, could be imminent. Future prospects for neutrinoless double beta decay experiment are promising for a total neutrino mass M ν = 0.1 eV, regardless of the neutrino mass hierarchy. However, if the lightest neutrino mass eigenstate turns out to be zero, and the hierarchy normal, the detection of this putative signal cannot be guaranteed, even for an ultimate, highly sensitive neutrinoless double beta decay experiment. The good news is that if this is the case realized in nature, cosmology will be able to tell us about the neutrino mass hierarchy with compelling statistical significance.  about the outcome one could obtain, should a uniform prior probability distribution for log(m light ) be chosen 11 . The posterior probability for either NH or IH, as defined in Eq. (4), can be written as (in the formulas we focus on NH for the sake of conciseness) It is of course completely equivalent to write the integral in terms of M ν or m light ; however, the former choice allows for a simplification, since we know that, to a good approximation, the likelihood of cosmological data does not depend directly on the neutrino hierarchy. Moreover, the joint prior probablity is being the prior probability of M ν subjected to the choice of the hierarchy, and Π(h type = NH) the prior probability of the NH. Then we can recast Eq. A.1 as We always assume equal prior probability for the two hierarchies throughout this work, i.e. Π(h type = NH) = Π(h type = IH) = 0.5. It is straightforward to obtain the prior probability of M ν (conditioned by the hierarchy) that appears in the integral on the right-hand side of Eq.  . The contours are calculated by taking into account the uncertainty on the mass ordering, without fixing a priori the mass hierarchy. The two regions that can be inferred in both figures (even if not completely isolated) correspond to the two mass orderings. The horizontal bands correspond to the 90% upper bounds on m ββ obtained from KamLAND-Zen [44] (orange) and to those expected from a future nEXO-like experiment (green) [47] (assuming a vanishing Majorana mass), for different assumptions for the values of the nuclear matrix elements that enter into the calculation of m ββ . a slight preference for values of the sum of the neutrino masses close to the minimal mass allowed in each hierarchical scenario. Nevertheless, in the limit of very low masses, the distribution never diverges. On the other hand, a uniform sampling over log(m light ) is mapped into a divergent distribution of M ν for values of the total neutrino mass close to the minimal mass in each hierarchical scenario, with large masses being suppressed as 1/(M ν − M ν,min ). Only the lower cutoff that we have imposed on m light prevents the distribution to formally diverge for M ν → M ν,min .
Regardless of the choice of the sampling over m light , the distributions for the two hierarchies merge for large values of the masses, as they should in the degenerate regime.
The last piece of information we need in order to compute probability odds for the hierarchies is the likelihood for cosmological data. For the sake of semplicity, at first order we can approximate L( d |M ν ) as a Gaussian distribution in M ν , centered in the fiducial value for M ν (M ν ), with standard deviation given by the sensitivity of the dataset under scrutiny (σ Mν ). In this appendix, we present the results for the following cases: current data (Planck TT,TE,EE+lowP and BAO) withM ν = 0 eV and of the three neutrino eigenstates) is assumed, and are subsequently marginalized over. Schwetz et. al (including the authors of this work) have already replied with a Comment [46] to Ref. [45].
When integrating Eq.A.1 for the different combinations listed above, we find the probability odds of NH versus IH reported in Tab.A.3. The posterior probability distribution of M ν in the different scenarios are reported in Fig. A.7.
The results in the case of a uniform distribution of m light are perfectly in agreement with those discussed in the main text and obtained with the full Monte Carlo analysis, both in terms of probability odds and the overall shape of the posterior distribution of M ν (compare for example the black curves in Fig.A.7 with the black dashed curve in the top left panel in Fig.1 and the dashed curves in the top left panel of Fig.2). This is a good sanity check and reinforce our confidence in the results obtained in this simple toy model in the case of uniform sampling of log(m light ). In this case, we still find a preference for normal hierarchy for the combinations Planck TT,TE,EE+lowP+BAO and COrE+DESI with M ν = 0.06 eV, even though milder than in the case of uniform sampling over m light . Furthermore, in the case of COrE+DESI with M ν = 0.1 eV, the uniform sampling over log(m light ) results in a mild preference for inverted hierarchy.
In each case, Fig.A.7 clearly shows that the posterior distributions of M ν , marginalized over the hierarchy, are highly squeezed towards the minimal mass allowed in each hierarchical scenario when we sample uniformly over log(m light ), as expected given the shape of the prior distributions depicted in Fig.A.6.
We also checked that we are able to reproduce these results by performing a full Monte Carlo analysis. Even in the case in which we assume a fiducial model with M ν = 0.1 eV distributed according to the normal hierarchy scenario, the final result is a mild preference for inverted hierarchy.
As expected, we find that the results are strongly driven by the prior choice, given that we assumed that the likelihood for cosmological data is independent of the hierarchy and only depends on the sum of the neutrino masses (i.e. to the total energy density in neutrinos). We argue that this is a fair approximation, since even if there are well known physical effects induced by a different hierarchical distribution of the masses among the three eigenstates on the cosmological probes, these effects are well out of the sensitivity reach of current and future cosmological experiments. This means that the choice of the prior should be addressed and moti-  vated carefully and that the role of the prior should be always made clear when discussing final results. In our work, we decide to employ a uniform distribution for m light since we wanted to use a parameter which could fit different datasets and not just cosmology. In other words, while it is true that cosmological datasets are at first order sensitive to the sum of neutrino masses instead of the mass of the single eigenstates, the same is not true for laboratory experiments, such as kinematic measurements from beta decay, or searches for neutrinoless double beta decay. We wanted to be as general as possible and leave open the possibility to incorporate external datasets in our analysis.
Furthermore, as it is clear from Fig.A.6, the choice of a uniform distribution of m light reflects in an almost flat distribution of M ν . We think this is a good choice, as it allows us to implicitly employ a (nearly) uniform distribution over the parameter that cosmological data are more directly sensitive to, namely M ν . On the contrary, a uniform distribution of log(m light ) highly favours values of M ν closer to the minimal mass allowed in each hierarchical scenario. We argue that, in a situation when the likelihood is not informative enough, the choice of a uniform distribution for m light better represents our ignorance about the value of this parameter.