Tensions in cosmology: a discussion of statistical tools to determine inconsistencies

We present a comprehensive analysis of statistical tools for evaluating tensions in cosmological parameter estimates arising from distinct datasets. Focusing on the unresolved Hubble constant ($H_0$) tension, we explore the Pantheon Plus + SH0ES (PPS) compilation, which includes low-redshift Cepheid data from the SH0ES collaboration, and the Cosmic Chronometers (CC) dataset. Employing various tension metrics, we quantitatively assess the inconsistencies in parameter estimates, emphasizing the importance of capturing multidimensional tensions. Our results reveal substantial tension between the PPS and Planck 2018 datasets. We highlight the importance of adopting these metrics to enhance the precision of future cosmological analyses and to facilitate the resolution of existing tensions.


Introduction
In the past twenty years, the amount and precision of cosmological data have increased significantly. As a result, stringent constraints have been established on cosmological parameters such as the baryon and dark matter density parameters and the Hubble constant. The predictions of the standard cosmological model agree with the majority of the observations. However, there are tensions between the values of some cosmological parameters obtained with different datasets. These discrepancies remain an open question for cosmologists and are often used as motivation for studying alternative cosmological models. One of the most important unsolved issues is the so-called $H_0$ tension: the value of the current Hubble parameter $H_0$ obtained from Cosmic Microwave Background (CMB) data assuming the standard cosmological model (Planck Collaboration et al., 2020) does not agree with the one inferred in a model-independent way from type Ia supernovae and Cepheids (Riess et al., 2022). This issue has been extensively discussed in the literature, but there is no agreement within the community about the source of the discrepancy (Freedman, 2021; Riess and Breuval, 2023; Schöneberg et al., 2022; Vagnozzi, 2023). Another issue is the so-called $S_8$ tension, namely, the difference between the estimate of this parameter (defined as $S_8 = \sigma_8 (\Omega_m/0.3)^{0.5}$) from CMB and BAO data and the one obtained from weak lensing and galaxy clustering data (Heymans et al., 2021; Abbott et al., 2022). Hence, the ability to quantify differences in the estimation of cosmological parameters when different datasets are considered is essential for the success of present and future research in cosmology.
Email address: mleize@df.uba.ar
Given two posterior distributions obtained from two different datasets A and B, the most common method to study statistical tension is to evaluate the one-dimensional marginalized posterior distributions. A rule-of-thumb formula is introduced in Lemos et al. (2021) to measure the discrepancy, expressed as a number of standard deviations $\sigma$, between the estimates of a parameter $\theta$ derived from distinct datasets:
$$ N_\sigma = \frac{|\mu_A - \mu_B|}{\sqrt{\sigma_A^2 + \sigma_B^2}}, \tag{1} $$
where the means $\mu_{A/B}$ and the variances $\sigma^2_{A/B}$ are those of the posteriors obtained from each dataset. However, this method has some problems. For instance, marginalization can hide tensions that can only be seen in higher dimensions, because marginalizing over some of the parameters necessarily implies a loss of information. Moreover, the number of dimensions of the problem also affects the inferred tension: the significance of the discrepancy between the parameter estimates of the two experiments depends on the number of shared parameters effectively constrained by both. Therefore, a variety of methods have been developed to determine the consistency between the posterior distributions obtained from different datasets. For example, different tension metrics that quantify this problem in the whole parameter space at once have been studied in Raveri and Hu (2019) and Raveri and Doux (2021).
In this work, we focus on a particular type of metric, namely those based on the posterior distributions, to study tensions between different datasets. We analyze the tension between the recently released Pantheon Plus + SH0ES (PPS) compilation (Scolnic et al., 2022) and three different datasets: the CMB Planck 2018 release (Planck18), the most recent Baryonic Acoustic Oscillation (BAO) datasets from BOSS and eBOSS, and the Cosmic Chronometers data compilation (CC). Although the tension between the PPS and Planck18 datasets may be familiar to the reader, the tension between CC and PPS has not been studied in detail, despite being mentioned in Moresco (2023). Since CC data only contain information about the Hubble parameter, a non-trivial tension in the comparison between PPS and CC may point to a tension in the Hubble constant at the background level, providing new insights into the study of the $H_0$ tension.
This work is organized as follows: in Section 2 we describe the tension metrics that we apply to the different datasets, while in Section 3 we present the results of the statistical analyses that show inconsistencies in the obtained confidence intervals of some parameters. In Section 4 we show the amount of inconsistency obtained with the metrics described in Section 2 and compare it with the one inferred from the rule of thumb. We also discuss these results in light of what each metric is quantifying. Finally, in Section 5 we present the conclusions of our work.
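As a minimal sketch of the rule of thumb described above, the following Python function evaluates the one-dimensional discrepancy between two Gaussian estimates. The function name and the input numbers are illustrative assumptions; they are not taken from the tables of this work.

```python
import math

def rule_of_thumb(mu_a, sigma_a, mu_b, sigma_b):
    """1D tension in units of sigma between two Gaussian estimates."""
    return abs(mu_a - mu_b) / math.sqrt(sigma_a**2 + sigma_b**2)

# Hypothetical estimates of a single parameter from datasets A and B.
n_sigma = rule_of_thumb(67.4, 0.5, 73.0, 1.0)
print(f"1D tension: {n_sigma:.2f} sigma")
```

Because only the marginal mean and standard deviation of one parameter enter, any correlation with other parameters is ignored by construction.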

Tension metrics
In this section we describe the different metrics that have been introduced in Raveri and Hu (2019) and Raveri and Doux (2021). We recall the Bayes formula that defines the posterior distribution,
$$ P(\theta|D) = \frac{L(D|\theta)\,\Pi(\theta)}{\varepsilon(D)}, \tag{2} $$
where $L(D|\theta)$ is the likelihood of the data $D$ given the parameters $\theta$, $\Pi(\theta)$ is the prior distribution and $\varepsilon(D) = \int L(D|\theta)\,\Pi(\theta)\,d\theta$ is the Bayesian Evidence. We can divide the metrics into two types (Lemos et al., 2021): those based on the Bayesian Evidence $\varepsilon(D)$ and those based on the posterior distributions $P(\theta|D)$. In this work we focus on the latter group, which is implemented in the Tensiometer repository. All the results are presented in terms of an effective number of standard deviations, $N_\sigma$, which is defined by (Raveri and Hu, 2019)
$$ N_\sigma \equiv \sqrt{2}\,\mathrm{erf}^{-1}(P), \tag{3} $$
where $P$ is a probability that identifies agreement or disagreement between the considered datasets, while $N_\sigma$ is the number of standard deviations associated with a 1D Gaussian distribution with probability $P$. The interpretation of the probability $P$ varies with the chosen metric and will be clarified for each case in what follows.
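The conversion from a probability $P$ to an effective number of standard deviations can be sketched with the Python standard library. This assumes the standard two-sided Gaussian convention, in which $\sqrt{2}\,\mathrm{erf}^{-1}(P)$ equals the standard normal quantile at $(1+P)/2$. The function name is a placeholder, not part of any package mentioned in the text.

```python
import math
from statistics import NormalDist

def n_sigma_from_probability(p):
    """Effective number of sigma for a two-sided Gaussian probability p."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    # sqrt(2) * erfinv(p) is the standard normal quantile at (1 + p) / 2.
    quantile_arg = 0.5 * (1.0 + p)
    if quantile_arg >= 1.0:
        # P = 1 (to machine precision) maps to an unbounded N_sigma.
        return math.inf
    return NormalDist().inv_cdf(quantile_arg)

# Round trip: the probability enclosed within 2 sigma maps back to 2.
p_two_sigma = math.erf(2.0 / math.sqrt(2.0))
print(n_sigma_from_probability(p_two_sigma))
```

The explicit branch for $P = 1$ matters in practice: a metric that returns exactly $P = 1$ cannot be mapped to a finite $N_\sigma$.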

Gaussian metrics
Firstly, we describe tension metrics associated with quadratic estimators, which assume Gaussian posterior distributions. In Fig. 1 we show the area that corresponds to the probability to exceed (PTE) a certain observed value of the estimator $Q$, denoted $Q^*$. The probability assigned to estimate the tension is $P = 1 - \mathrm{PTE}$, and the equivalent $N_\sigma$ is obtained using eq. 3.

Parameter differences in standard form
This method consists in calculating the statistical difference between the parameters inferred from dataset A ($\theta_A$) and those inferred from dataset B ($\theta_B$), with the estimator
$$ Q_{\rm DM} = (\hat\theta_A - \hat\theta_B)^T\,(\hat C_A + \hat C_B)^{-1}\,(\hat\theta_A - \hat\theta_B), \tag{4} $$
where $\hat\theta_i$ and $\hat C_i$ are the mean and the covariance matrix in parameter space obtained from the statistical analysis with dataset $i$ (where $i$ corresponds to dataset A or B). If both posterior distributions are Gaussian, then $Q_{\rm DM}$ follows a $\chi^2_\nu$ distribution with $\nu = \mathrm{rank}[\hat C_A + \hat C_B]$ degrees of freedom. We stress that this metric measures the difference in the obtained confidence intervals of all parameters, while also including the effect of correlations between them. Therefore, it can be regarded as a reasonable generalization of the rule of thumb, with the additional benefit that it quantifies all inconsistencies jointly.
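A minimal sketch of this estimator for a two-parameter shared space (such as the $\Omega_m$-$H_0$ plane) is given below. It assumes Gaussian posteriors summarized by their means and covariances and uses the closed-form $\chi^2$ cumulative distribution for two degrees of freedom, $1 - e^{-Q/2}$. The function names and the input numbers are hypothetical.

```python
import math
from statistics import NormalDist

def q_dm_2d(mean_a, cov_a, mean_b, cov_b):
    """Q_DM = d^T (C_A + C_B)^-1 d for two parameters, d = mean_a - mean_b."""
    d0, d1 = mean_a[0] - mean_b[0], mean_a[1] - mean_b[1]
    # Elements of the summed (symmetric) 2x2 covariance matrix.
    s00 = cov_a[0][0] + cov_b[0][0]
    s01 = cov_a[0][1] + cov_b[0][1]
    s11 = cov_a[1][1] + cov_b[1][1]
    det = s00 * s11 - s01 * s01
    if det <= 0.0:
        raise ValueError("summed covariance must be positive definite")
    # Quadratic form using the closed-form inverse of a 2x2 matrix.
    return (s11 * d0 * d0 - 2.0 * s01 * d0 * d1 + s00 * d1 * d1) / det

def n_sigma_two_dof(q):
    """Convert Q to N_sigma assuming a chi-squared law with 2 dof."""
    p = 1.0 - math.exp(-0.5 * q)          # chi2 CDF for nu = 2
    x = 0.5 * (1.0 + p)
    return math.inf if x >= 1.0 else NormalDist().inv_cdf(x)

# Hypothetical (Omega_m, H0) summaries of two posteriors.
q = q_dm_2d((0.30, 70.0), [[1e-4, 0.0], [0.0, 1.00]],
            (0.30, 67.0), [[1e-4, 0.0], [0.0, 1.25]])
print(q, n_sigma_two_dof(q))
```

For very large $Q$ the probability rounds to one in double precision and the sketch returns infinity; a production implementation would work with the survival function instead.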

Parameter differences in updated form
First introduced in Raveri and Hu (2019), this method consists in calculating the statistical difference between the parameters inferred from dataset A ($\theta_A$) and those inferred from the joint datasets ($\theta_{AB}$), with the estimator
$$ Q_{\rm UDM} = (\hat\theta_A - \hat\theta_{AB})^T\,(\hat C_A - \hat C_{AB})^{-1}\,(\hat\theta_A - \hat\theta_{AB}), \tag{5} $$
where $\hat C_{AB}$ is the covariance matrix obtained from the statistical analysis with the joint datasets. If the inferred parameters for both datasets A and A+B are Gaussian distributed, $Q_{\rm UDM}$ has a $\chi^2_\nu$ distribution with $\nu = \mathrm{rank}[\hat C_A - \hat C_{AB}]$ degrees of freedom. As shown in Raveri and Hu (2019), in order to compute $Q_{\rm UDM}$ it is convenient to decompose the posterior distribution as a sum over the Karhunen-Loève (KL) modes of the covariances involved. With this approach, the $Q_{\rm UDM}$ estimator has a $\chi^2$ distribution with $\nu$ degrees of freedom, where $\nu$ is the number of KL eigenvalues that are taken into account.
We can point out two interesting features of this metric. Firstly, the directions of the parameter space that show significant tension can be identified a priori, which helps its physical interpretation. Secondly, non-Gaussianities are mitigated, since we can select the directions in parameter space that are best constrained by the two datasets. Finally, it is worth noting that this metric is asymmetric, since eq. 5 is not invariant under the exchange A ↔ B. We stress that this metric answers a different question than the other ones, namely, how the results of the statistical analysis using a given dataset are updated if a new dataset is added. For this reason, in our view it is not recommended to quantify the tension with the UDM metric if a strong tension has been detected with the other metrics, for example, DM.

Goodness-of-fit loss
This estimator measures the difference in the likelihood function evaluated at the maximum of the posterior distribution, considering the two datasets jointly and separately in the statistical analysis:
$$ Q_{\rm DMAP} = 2\ln L_A(\theta_p^{A}) + 2\ln L_B(\theta_p^{B}) - 2\ln L_{A+B}(\theta_p^{A+B}), \tag{6} $$
where $\theta_p^{A/B}$ are the maximum a posteriori (MAP) parameters for dataset A/B and $\theta_p^{A+B}$ are those of the joint analysis. Notice that this statistic does not depend explicitly on the covariance of the distribution and only compares the values of the likelihood at certain points. When the likelihood and the posterior are Gaussian, the estimator $Q_{\rm DMAP}$ follows a $\chi^2$ distribution with $\Delta\nu$ degrees of freedom, where
$$ \Delta\nu = \nu_A + \nu_B - \nu_{A+B}. \tag{7} $$
In this case, the number of degrees of freedom is defined as
$$ \nu = N - \mathrm{tr}\left[C_\Pi^{-1} C_p\right], \tag{8} $$
where $N$ is the nominal number of parameters and $C_\Pi$, $C_p$ are the covariance matrices of the prior and posterior distributions, respectively. We emphasize that this metric quantifies how well the theoretical predictions describe both datasets jointly, compared with how well they describe the two datasets separately. Therefore, it is a useful tool for assessing how the data fit the theoretical prediction, but it does not give a good estimate of the inconsistency between parameters.
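The goodness-of-fit loss can be illustrated with a one-parameter toy model. The sketch below assumes two independent datasets with Gaussian likelihoods and flat priors, so that the joint likelihood is the product of the individual ones and each MAP coincides with the peak of the corresponding likelihood. All names and numbers are hypothetical.

```python
def log_like(theta, mean, sigma):
    """Gaussian log-likelihood, up to an additive constant."""
    return -0.5 * ((theta - mean) / sigma) ** 2

def q_dmap_1d(mean_a, sigma_a, mean_b, sigma_b):
    """Goodness-of-fit loss for two independent 1D Gaussian likelihoods."""
    # With flat priors each MAP sits at the peak of its own likelihood.
    map_a, map_b = mean_a, mean_b
    # The joint MAP is the inverse-variance weighted mean.
    w_a, w_b = 1.0 / sigma_a**2, 1.0 / sigma_b**2
    map_ab = (w_a * mean_a + w_b * mean_b) / (w_a + w_b)
    separate = 2.0 * (log_like(map_a, mean_a, sigma_a)
                      + log_like(map_b, mean_b, sigma_b))
    joint = 2.0 * (log_like(map_ab, mean_a, sigma_a)
                   + log_like(map_ab, mean_b, sigma_b))
    # Normalization constants cancel because both terms use the same likelihoods.
    return separate - joint

q_dmap = q_dmap_1d(0.0, 1.0, 3.0, 2.0)
print(q_dmap)
```

In this idealized case the result reduces to the squared mean difference divided by the sum of the variances, which gives a quick check of the implementation.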

Non-Gaussian metrics: Exact Parameter Shift
This method is based on the computation of the parameter difference probability density $P(\Delta\theta)$, where $\Delta\theta = \theta_A - \theta_B$ is the difference between the parameters of the posteriors that correspond to datasets A and B. The general expression for two uncorrelated datasets is given by (Raveri and Hu, 2019; Raveri and Doux, 2021)
$$ P(\Delta\theta) = \int P_A(\theta)\,P_B(\theta - \Delta\theta)\,d\theta, \tag{9} $$
which is analogous to the expression of the cross-correlation function in signal processing. The statistical significance of the shift is calculated by integrating $P(\Delta\theta)$ over the region enclosed by the isocontour corresponding to no shift, $\Delta\theta = 0$:
$$ \Delta = \int_{P(\Delta\theta) > P(0)} P(\Delta\theta)\,d\Delta\theta. \tag{10} $$
Here the probability assigned to identify the tension is $P = \Delta$, and the equivalent $N_\sigma$ is calculated using eq. 3. Although the evaluation of the last equation may look straightforward, it is particularly difficult when the parameter space is high dimensional. Taking the difference between the samples of the two MCMC chains, as described in Raveri et al. (2020), we can generate the chain of the parameter difference, which is an estimate of the convolution integral in eq. 9. Finally, the integration in eq. 10 can be accomplished using Kernel Density Estimation (KDE) methods.
This method does not assume a particular form of the posterior distribution. It measures the discrepancy directly from the chains of the parameter difference defined above. Therefore, like $Q_{\rm DM}$, it quantifies the tension directly from the outputs of the MCMC process. However, when the isocontour of $P(\Delta\theta = 0)$ is far from the maximum of $P(\Delta\theta)$, the output of the metric is difficult to compute.
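The procedure described above can be sketched in one dimension with the standard library only. The example assumes two independent chains of equal length, a hand-written Gaussian kernel density estimate with a normal-reference bandwidth (an assumption of this sketch, not a choice documented in the text), and synthetic Gaussian chains in place of real MCMC output. It is not the Tensiometer implementation.

```python
import math
import random
from statistics import NormalDist

def kde_density(x, samples, bandwidth):
    """Gaussian kernel density estimate evaluated at x."""
    norm = len(samples) * bandwidth * math.sqrt(2.0 * math.pi)
    return sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples) / norm

def parameter_shift_1d(chain_a, chain_b):
    """Return (Delta, N_sigma) from two independent 1D chains."""
    # Pairing independent samples yields draws from P(dtheta), as in eq. 9.
    diff = [a - b for a, b in zip(chain_a, chain_b)]
    n = len(diff)
    mean = sum(diff) / n
    std = math.sqrt(sum((d - mean) ** 2 for d in diff) / (n - 1))
    bandwidth = 1.06 * std * n ** (-0.2)   # normal-reference rule (assumed)
    p_zero = kde_density(0.0, diff, bandwidth)
    # Eq. 10: probability mass where the density exceeds its value at no shift.
    delta = sum(kde_density(d, diff, bandwidth) > p_zero for d in diff) / n
    x = 0.5 * (1.0 + delta)
    n_sigma = math.inf if x >= 1.0 else NormalDist().inv_cdf(x)
    return delta, n_sigma

# Synthetic chains standing in for two posteriors of a single parameter.
rng = random.Random(0)
chain_a = [rng.gauss(70.0, 1.0) for _ in range(1500)]
chain_b = [rng.gauss(67.2, 1.0) for _ in range(1500)]
delta, n_sigma = parameter_shift_1d(chain_a, chain_b)
print(delta, n_sigma)
```

If no sample of the difference chain has a density below the estimated $P(0)$, the sketch returns $\Delta = 1$ and an infinite $N_\sigma$, which is the sample-based counterpart of a missing isocontour through $\Delta\theta = 0$.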
Next, we make some final observations about the metrics described above. As shown in eq. 4, the DM metric quantifies the difference between the parameters obtained from datasets A and B separately, weighted by the sum of their covariance matrices. Furthermore, as shown in eq. 6, the DMAP metric quantifies the difference between the fit with the joint datasets (A+B) and the fits using the datasets separately. Consequently, these two metrics quantify different aspects of the tension in the parameter space, and it is not trivial to compare them. On the other hand, $Q_{\rm UDM}$ evaluates how the results using one dataset are updated when another one is included. In short, in the case of Gaussian posteriors, $Q_{\rm DM}$ and Parameter Shift quantify the discrepancy between inferred parameters directly from the outputs of the inference process, and therefore should be considered the best metrics to quantify discrepancies between parameters.

Parameter inferences with different datasets
In this paper, we compute the estimators described in Section 2 for different combinations of four cosmological datasets. Firstly, we consider the recently released Pantheon Plus compilation of type Ia supernovae (SNIa). We note that this release includes the option of using low-redshift Cepheid data obtained by the SH0ES collaboration, which are crucial for the calibration of SNIa and therefore for the Hubble tension. Therefore, we name this dataset Pantheon Plus + SH0ES (PPS). The Pantheon Plus compilation consists of 1,701 SNIa in the redshift range $0.0012 < z < 2.26$, which are available in Scolnic et al. (2022). [...] Markov Chain Monte Carlo (MCMC) analysis with the latest release of the MontePython software (Brinckmann and Lesgourgues, 2019; Audren et al., 2013). Table 1 shows the priors on the shared parameter space that are considered in our analyses.
Note that the prior on the physical baryon density $\omega_b = \Omega_b h^2$ is only used in the comparison between Planck18 and BAO.
For the CC dataset, the matter density $\Omega_m$ and the Hubble parameter $H_0$ are the free parameters of the model. For PPS, the supernovae absolute magnitude $M_{\rm abs}$ has to be taken into account as a nuisance parameter. For BAO, we consider ($\Omega_m$, $H_0$, $\Omega_b$). Finally, when Planck18 data are taken into account, all parameters of the standard cosmological model, ΛCDM, ($\omega_b$, $\omega_{\rm cdm}$, $\theta_s$, $A_s$, $n_s$, $\tau_{\rm reio}$) are considered, in addition to the nuisance parameters of the Planck likelihood. In this case, $\Omega_m$ and $H_0$ are obtained as derived parameters of the analysis. Our analysis is focused on the comparison of different pairs of the mentioned datasets. Fig. 2 shows the results of the statistical analysis performed considering the Planck18/CC and Planck18/PPS datasets jointly and separately. It follows that for Planck18/CC (left) the datasets are in agreement, while for Planck18/PPS (right) the tension is considerably large. Besides, it is worth noting that the shared parameter space for all pairs of datasets is the $\Omega_m$-$H_0$ plane, except for the pair Planck18/BAO, for which the shared parameter space is ($\Omega_m$, $H_0$, $\omega_b$). The projections of the posterior distributions on the shared parameter subspace are shown in Figs. 3, 4, and 5. Finally, we point out that the posteriors are Gaussian distributions in all cases. This is because the priors are not informative.

Results
Here we present the results of our analysis. The pairs of datasets that have been compared are Planck18/CC, CC/PPS, PPS/BAO, Planck18/PPS and Planck18/BAO. Before presenting the results for each pair of datasets, we discuss the results in general for all the metrics.

Discussion for each metric

DM
Here we discuss the DM metric in view of all our results. According to eq. 4, the tension for this metric is driven by the difference between the means of the posterior distributions and by the inverse of the sum of the covariance matrices. Following this definition, we encounter two paradigmatic cases. Firstly, Planck18/CC presents the lowest tension of all the analyses (as shown in Fig. 3, the difference between the means of the posteriors is negligible). Secondly, the tension between Planck18 and PPS is the strongest one (the difference between the means is the largest, as shown in the right panel of Fig. 4). Besides, it is remarkable that the DM metric indicates a different level of tension than the sum of the 1D rule-of-thumb values in all analyses. This is reasonable, since the rule of thumb does not take into account the correlation between parameters. Finally, we encountered three cases with moderate tension: CC/PPS, Planck18/BAO and PPS/BAO (footnote 8). Let us briefly discuss the first two cases. Although in the 1D projection it seems that CC/PPS presents weaker tension than Planck18/BAO, the 2D projection indicates the opposite. The fact that the DM metric takes into account the covariance of the posterior distributions allows a better quantification of the tension between datasets in moderate cases.
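The role of correlations in the DM metric can be illustrated with a toy example in which each one-dimensional comparison looks harmless while the joint comparison does not. The sketch assumes two Gaussian posteriors with identical, strongly correlated covariances in two standardized (hypothetical) parameters; the numbers are not taken from any of the datasets analysed here.

```python
import math
from statistics import NormalDist

def rule_of_thumb(mu_a, var_a, mu_b, var_b):
    """1D tension from marginal means and variances."""
    return abs(mu_a - mu_b) / math.sqrt(var_a + var_b)

def q_dm_2d(mean_a, cov_a, mean_b, cov_b):
    """Two-parameter Q_DM with the closed-form 2x2 inverse."""
    d0, d1 = mean_a[0] - mean_b[0], mean_a[1] - mean_b[1]
    s00 = cov_a[0][0] + cov_b[0][0]
    s01 = cov_a[0][1] + cov_b[0][1]
    s11 = cov_a[1][1] + cov_b[1][1]
    det = s00 * s11 - s01 * s01
    return (s11 * d0 * d0 - 2.0 * s01 * d0 * d1 + s00 * d1 * d1) / det

# Both posteriors share a correlation of 0.9; the means are displaced
# across the degeneracy direction.
cov = [[1.0, 0.9], [0.9, 1.0]]
mean_a, mean_b = (0.5, -0.5), (-0.5, 0.5)

one_d = [rule_of_thumb(mean_a[i], cov[i][i], mean_b[i], cov[i][i]) for i in range(2)]
q = q_dm_2d(mean_a, cov, mean_b, cov)
p = 1.0 - math.exp(-0.5 * q)                      # chi2 CDF with 2 dof
n_sigma_2d = NormalDist().inv_cdf(0.5 * (1.0 + p))
print(one_d, q, n_sigma_2d)
```

In this construction each marginal comparison stays below one standard deviation, whereas the joint estimator corresponds to a tension well above two, because the shift lies along the direction that both posteriors constrain most tightly.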

UDM
According to eq. 5, the tension for this metric is driven by the difference between the means of the posterior distributions of one of the datasets and of the joint analysis, and by the inverse of the difference between their covariance matrices. As discussed before, this metric is not symmetric, and we expect different results depending on which dataset is updated with the other. In order to gain intuition about the UDM results, we analyze two relevant cases.
Firstly, we analyze the case of CC/PPS. When the CC results are updated with PPS, the obtained tension is $\sim 2.4\sigma$, while when PPS is updated with CC the tension is $\sim 4.8\sigma$. This can be explained as follows. According to eq. 5, the estimator may indicate higher tension if the difference in the means is large or if the difference $|C_{AB} - C_A|$ is small. In this case, the covariance matrices of the posteriors corresponding to PPS and PPS+CC are quite similar, which makes the estimator $Q^*_{\rm UDM}$ and the corresponding $N_\sigma$ higher.
Secondly, we analyze the case of Planck18/PPS. When the Planck18 results are updated with PPS, the equivalent $N_\sigma$ is $\sim 2.3\sigma$, while when PPS is updated with Planck18 we obtain $\sim 4.6\sigma$. Contrary to the case of PPS+CC, here the dominant effect on the metric is the difference between the means of the compared distributions, rather than the difference in the covariance matrices, as shown in the right panel of Fig. 4. We note that in this case the UDM metric has one effective degree of freedom. This is because the KL decomposition indicates that most of the variance is concentrated in the first eigenvalue.
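The two effects discussed above (a large shift of the mean versus a small change in the covariance) can be made concrete with a one-parameter version of the estimator. The sketch treats the mean shift and the variances as free inputs chosen for illustration; in a real update they are linked through the data, and the KL-mode selection is not modelled here.

```python
def q_udm_1d(mean_a, var_a, mean_ab, var_ab):
    """1D analogue of eq. 5: squared mean shift over the variance reduction."""
    if var_ab >= var_a:
        raise ValueError("the joint variance must be smaller than var_a")
    return (mean_a - mean_ab) ** 2 / (var_a - var_ab)

# Hypothetical numbers: identical mean shift, different variance updates.
q_large_update = q_udm_1d(70.0, 1.00, 69.0, 0.50)
q_small_update = q_udm_1d(70.0, 1.00, 69.0, 0.95)
print(q_large_update, q_small_update)
```

For a fixed shift of the mean, the estimator grows as the joint variance approaches the variance of the original dataset, which is the behaviour invoked to explain the CC/PPS result.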

DMAP
According to eq. 6, DMAP compares the likelihoods of the two datasets separately with that of the joint analysis, all of them evaluated at the maximum of their corresponding posterior distributions. If both datasets are independent, we can write eq. 6 as
$$ Q_{\rm DMAP} = 2\ln L_A(\theta_p^{A}) + 2\ln L_B(\theta_p^{B}) - 2\ln L_A(\theta_p^{A+B}) - 2\ln L_B(\theta_p^{A+B}). \tag{11} $$
This equation shows that the comparison is between the separate likelihoods evaluated at the maxima of their posteriors and the same likelihoods evaluated at the maximum of the joint posterior distribution. Assuming Gaussianity, when $\theta_p^{A+B}$ is far from $\theta_p^{A}$ or $\theta_p^{B}$, this results in a tension in the DMAP metric.
Footnote 8: This last case (PPS/BAO) shows weak to moderate tension.
Unlike DM, this metric does not take into account the covariance matrix of the distributions but includes information on the joint analysis. For this reason, DM and DMAP quantify the tension in different ways and their results cannot be compared directly. However, like DM, this metric shows that the strongest tension corresponds to Planck18/PPS and the weakest to Planck18/CC. Finally, since we are using flat uninformative priors, the effective number of degrees of freedom is equal to the dimension of the shared parameter space in all cases.
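The statement about the effective number of degrees of freedom can be checked with a small sketch. It assumes the definition $N_{\rm eff} = N - \mathrm{tr}[C_\Pi^{-1} C_p]$ of Raveri and Hu (2019) with $N$ the number of parameters, diagonal prior and posterior covariances, and hypothetical prior ranges and posterior widths (they are not the values of Table 1).

```python
def flat_prior_variance(low, high):
    """Variance of a uniform distribution on [low, high]."""
    return (high - low) ** 2 / 12.0

def effective_parameters(prior_vars, posterior_vars):
    """N_eff = N - tr(C_prior^-1 C_post) for diagonal covariances."""
    n = len(prior_vars)
    trace = sum(post / prior for prior, post in zip(prior_vars, posterior_vars))
    return n - trace

# Hypothetical wide flat priors on (Omega_m, H0) and narrow posteriors.
prior_vars = [flat_prior_variance(0.1, 0.9), flat_prior_variance(20.0, 100.0)]
posterior_vars = [0.02**2, 1.0**2]
n_eff = effective_parameters(prior_vars, posterior_vars)
print(n_eff)
```

When the posterior is much narrower than the prior, the trace term is negligible and the effective number of parameters approaches the nominal dimension, consistent with the statement above for uninformative priors.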
We also discuss two cases of moderate tension: CC/PPS and PPS/BAO. To facilitate the discussion, we assume that the posterior distribution is equal to the corresponding likelihood and that no relevant information is lost in the marginalization process. The left panel of Fig. 4 (CC/PPS) shows that the maximum of the joint posterior is located on the $2\sigma$ contour of the posteriors obtained with each dataset alone, while the left panel of Fig. 5 (PPS/BAO) shows a similar behaviour, but with the maximum of the joint posterior located on the $1\sigma$ contour of the posteriors obtained with each dataset alone. This is in agreement with the results shown in Tables 3 and 5.

Exact Parameter Shift
Here we discuss our results for Exact Parameter Shift. In Fig. 6, we show the distribution of $P(\Delta\theta)$ for three different levels of tension: the Planck18/CC comparison with negligible tension (left), the CC/PPS comparison with moderate tension (center) and the Planck18/PPS comparison with strong tension (right). Apart from the $1\sigma$ and $2\sigma$ contours, the last two cases show the isocontour that corresponds to $P(\Delta\theta = 0)$.
Besides, we expect that when the distributions are Gaussian, the results of this metric are similar to those of $Q_{\rm DM}$. Indeed, this is the case for CC/PPS, Planck18/BAO, Planck18/CC and PPS/BAO. Particularly relevant is the tension between Planck18 and PPS, for which Exact Parameter Shift gives an infinite tension. Although this result may be uncomfortable, it can be explained as follows: there is no isocontour of $P(\Delta\theta)$ that corresponds to $P(\Delta\theta = 0)$, as shown in Fig. 6 (right). According to eqs. 3 and 10, this implies that $\Delta = 1$, and so the probability translates into an infinite number of $\sigma$. We also checked that this effect is not due to a sampling problem.

Figure 6: Distribution of parameter differences defined in eq. 9, for the datasets Planck18/CC (left), CC/PPS (center) and Planck18/PPS (right). The first two contours correspond to the 65% and 95% confidence regions. In the first panel, the black contour represents the contour that corresponds to $P(\Delta\theta = 0)$. In the last two panels, this contour is outside the 95% confidence region.

Comparison between pairs of datasets
In what follows, we describe our results for the tension metrics when taking the different datasets in pairs.

Comparison between Planck18 and CC
The comparison between Planck18 and CC shows weak tension (see Table 2 and Fig. 3). As we discussed in Subsec. 4.1, in this comparison we expect maximum consistency between datasets. It is worth noting that for all metrics the tension is higher than the one reported by the rule of thumb. This example shows that, even for nearly consistent datasets, the inferred tension is higher when the whole shared parameter space is considered.

Comparison between PPS and CC
The comparison between PPS and CC is shown in Table 3 and Fig. 4 (left).

Comparison between Planck18 and BAO
The comparison between Planck18 and BAO is presented in Table 4 and Fig. 5 (right) and shows a weak to moderate tension. Unlike the previous analyses, in this case the shared parameter space has three dimensions: ($\Omega_m$, $H_0$, $\omega_b$). This implies that we cannot visualize the total likelihood but only the 2D projections. As expected, the DM metric and Exact Parameter Shift are in agreement, while DMAP shows a weaker tension than the other ones.

Comparison between PPS and BAO
The comparison between PPS and BAO is shown in Table 5 and Fig. 5 (left). In this case, all metrics indicate a moderate tension. The results for the DM metric and Exact Parameter Shift are similar, while the DMAP metric indicates a higher tension than the other ones.

Comparison between PPS and Planck18
Finally, the comparison between PPS and Planck18 is shown in Table 6 and Fig. 4 (right). As we discussed in Subsec. 4.1, this is the case that shows the strongest tension. For the DM metric, we obtain a tension of $\sim 6.4\sigma$, which is not far from the estimate obtained from the rule of thumb for $H_0$. The most relevant result of this analysis is the infinite number of standard deviations obtained with the Exact Parameter Shift metric, which has already been discussed in Subsec. 4.1.

Summary and conclusions
In this work, we discuss the importance of using tension metrics to determine tensions between different datasets. We show that the different metrics analyzed here can quantify the tension more precisely than the widely applied rule of thumb. Among the metrics used, three of them (i.e., DM, DMAP, Exact Parameter Shift) quantify the difference between the inferred parameters, while the other one (UDM) measures how much one dataset updates the results of another when added to the statistical analysis. We also discuss some implementation details; for example, some computational difficulties appeared when computing the Exact Parameter Shift while analyzing the tension between Planck18 and PPS.
Our results show two extreme cases: i) a very good agreement between CC and Planck18, and ii) a strong tension between PPS and Planck18. We also find three intermediate cases with moderate tension (BAO and Planck18; CC and PPS; BAO and PPS). These moderate tensions had not been pointed out or quantified before. Therefore, our analyses show that the tension metrics are excellent tools for quantifying moderate tension between distinct datasets. With the availability of new data in the near future, it is expected that the errors will be reduced and the posterior contours will be narrower, potentially increasing the tension between datasets.
Applying the tension metrics that we have analyzed here allows us to perform a detailed and precise analysis of the tension between different datasets, including the correlation between parameters. We emphasize that the tension metrics determine the tension between distinct datasets in the space of shared parameters, rather than the tension in a single parameter as in the case of the rule of thumb. We expect that the use of these metrics will become relevant for the analysis of future datasets. For example, some of these metrics have been used to discuss the recently released DESI results (DESI Collaboration et al., 2024). The increase in the precision of new results may lead to stronger tensions, which can be quantitatively described using tension metrics.
Finally, we discuss which metric is best suited to quantify the tension between datasets. To give a final answer, several aspects have to be taken into account. Firstly, as pointed out before, the UDM metric quantifies how a new dataset updates another one, and its results are not comparable with those of DM/Exact Parameter Shift or DMAP. Secondly, although the rest of the metrics do quantify the tension between independent datasets, DM/Exact Parameter Shift and DMAP quantify different aspects of the tension: while DMAP quantifies the difference between the likelihood evaluated at the MAP of the joint posterior and the likelihoods evaluated at the MAPs obtained with the datasets separately, DM quantifies the distance between the means of the independent posteriors weighted by their covariance matrices. Therefore, in our opinion, DM/Exact Parameter Shift answers accurately the question about the tension between independent datasets.

Figure 1: An example of the probability density distribution of a quadratic estimator. The area shown in red represents the probability to exceed, PTE.

Figure 2: Results of the statistical analysis for the ΛCDM model using the Planck 2018 likelihood with Cosmic Chronometers (left) and with Pantheon Plus + SH0ES (right). The darker and brighter regions correspond to the 65% and 95% confidence regions, respectively. The plots on the diagonal show the posterior probability density for each of the free parameters of the model.

Figure 4: Results of the statistical analysis for the ΛCDM model, showing the marginalized posteriors on the shared $\Omega_m$-$H_0$ plane of the parameter space. The darker and brighter regions correspond to the 65% and 95% confidence regions, respectively. The plots on the diagonal show the posterior probability density for each of the free parameters of the model.
For the comparison between PPS and CC (Table 3 and Fig. 4, left), as we discussed in Subsec. 4.1, DM and Exact Parameter Shift give matching results, as expected. Besides, the DMAP metric indicates a smaller tension. Finally, note that both DM and DMAP indicate moderate tension.

Table 1: Flat priors on the three cosmological parameters of the standard model used for all analyses in this work. $\Omega_m$ and $H_0$ are derived parameters in Planck18, while the same applies to $\omega_b$ in BAO.

Table 2: Results of the application of different metrics on the datasets of Planck18/CC. The results are presented in terms of the probability P for a $\chi^2_\nu$ distribution with $\nu$ degrees of freedom when it corresponds (or $1 - \Delta$ in the case of Exact Parameter Shift) and in all cases for the number of standard deviations $N_\sigma$.

Table 3: Results of the application of different metrics on the datasets of CC/PPS. The results are presented in terms of the probability PTE for a $\chi^2_\nu$ distribution with $\nu$ degrees of freedom when it corresponds (or $1 - \Delta$ in the case of Exact Parameter Shift) and in all cases for the number of standard deviations $N_\sigma$.

Table 4: Results of the application of different metrics on the datasets of Planck18/BAO. The results are presented in terms of the probability P for a $\chi^2_\nu$ distribution with $\nu$ degrees of freedom when it corresponds (or $1 - \Delta$ in the case of Exact Parameter Shift) and in all cases for the number of standard deviations $N_\sigma$.

Table 5: Results of the application of different metrics on the datasets of PPS/BAO. The results are presented in terms of the probability P for a $\chi^2_\nu$ distribution with $\nu$ degrees of freedom when it corresponds (or $1 - \Delta$ in the case of Exact Parameter Shift) and in all cases for the number of standard deviations $N_\sigma$.

Table 6: Results of the application of different metrics on the datasets of Planck18/PPS. The results are presented in terms of the probability P for a $\chi^2_\nu$ distribution with $\nu$ degrees of freedom when it corresponds (or $1 - \Delta$ in the case of Exact Parameter Shift) and in all cases for the number of standard deviations $N_\sigma$.