Properties of the reconciled distributions for Gaussian and count forecasts

Reconciliation enforces coherence among hierarchical forecasts, making them satisfy a set of linear constraints. While most works focus on the reconciliation of point forecasts, we consider probabilistic reconciliation and we analyze the properties of the distributions reconciled via conditioning. We provide a formal analysis of the variance of the reconciled distribution, treating separately the case of Gaussian forecasts and count forecasts. We also study the reconciled upper mean in the case of 1-level hierarchies, again treating Gaussian and count forecasts separately. We then show experiments on the reconciliation of intermittent time series related to counts of extreme market events. The experiments confirm our theoretical results and show that reconciliation largely improves the performance of probabilistic forecasting.


Introduction
Hierarchical forecasting requires the forecasts to be coherent, i.e., to satisfy a set of linear constraints determined by the structure of the hierarchy. The base forecasts, computed independently on each time series of the hierarchy, are generally incoherent; reconciliation adjusts them to enforce coherence.
Most reconciliation approaches reconcile the point forecasts (Hyndman et al., 2011; Wickramasuriya et al., 2019; Panagiotelis et al., 2021; Di Fonzo and Girolimetto, 2022). However, reconciled predictive distributions are required to support decision making (Kolassa, 2022). Panagiotelis et al. (2023) perform probabilistic reconciliation through projection, learning the parameters of the projection via stochastic gradient descent. Two limits of this approach are that it cannot reconcile count variables and that it prevents an analytical study of the reconciled distribution. These considerations also apply to other methods for probabilistic reconciliation (Jeon et al., 2019; Taieb et al., 2021; Rangapuram et al., 2021; Hollyman et al., 2022).
A different approach to probabilistic forecast reconciliation is reconciliation via conditioning. In the case of Gaussian base forecasts, it yields the reconciled distribution in closed form (Corani et al., 2020), with the same mean and variance as MinT (Wickramasuriya et al., 2019). Reconciliation via conditioning can also reconcile the distribution of count variables, adopting a sampling approach (Corani et al., 2023; Zambon et al., 2024).
In this paper, we study the properties of the reconciled distribution obtained via conditioning. In the Gaussian case we prove that, regardless of the amount of incoherence of the base forecasts, reconciliation decreases the variance of every variable of the hierarchy. In contrast, we show that in the discrete case reconciliation can increase or decrease the variance of the bottom variables, depending on the probability $p_c$ of the base forecasts being coherent.
We then analyze the reconciled mean, restricting our analysis to the case of 1-level hierarchies. In the Gaussian case, the reconciled upper mean is a combination of the bottom-up mean and the mean of the upper base forecast (Corani et al., 2020; Hollyman et al., 2021); we refer to this as the combination effect. In the case of count variables, however, we show that the reconciled mean of the upper time series can be lower than both the bottom-up and the base mean: we refer to this as the concordant-shift effect, as the reconciled means of all the time series are shifted towards zero. This happens when the base forecast distributions are right-skewed and reconciliation decreases the variance of the forecasts, shortening the right tails of the distributions and pulling the reconciled means of all time series towards zero. In other words, low-count forecasts across the hierarchy reinforce each other.
We present experiments on the reconciliation of intermittent time series referring to counts of extreme market events, interpreting them on the basis of our theoretical insights. We show that, on the intermittent time series of our case study, the concordant-shift effect on the mean is more common than the combination effect. We moreover report a major increase in the accuracy of the probabilistic forecasts after reconciliation, confirming the beneficial effect of reconciliation for forecasting intermittent time series (Athanasopoulos et al., 2017; Kourentzes and Athanasopoulos, 2021; Corani et al., 2023; Zambon et al., 2024).
The paper is organized as follows. In Section 2, we recall probabilistic reconciliation through conditioning, and we analyze the mean and variance of the reconciled distribution in the Gaussian and in the non-Gaussian case. In Section 3, we present our case study. Finally, conclusions are drawn in Section 4.

Probabilistic forecast reconciliation
Given a hierarchy, we denote by $b = [b_1, \ldots, b_{n_b}]^\top$ the vector of bottom variables and by $u = [u_1, \ldots, u_{n_u}]^\top$ the vector of upper variables. To keep the notation simple, we do not show the time index; it is thus understood that all forecasts refer to time $t+h$. We denote the vector of all the variables by $y = [u^\top, b^\top]^\top \in \mathbb{R}^n$, with $n = n_u + n_b$. The hierarchy may be expressed as a set of linear constraints:
$$y = \begin{bmatrix} u \\ b \end{bmatrix} = \begin{bmatrix} A \\ I \end{bmatrix} b = S b,$$
where $I \in \mathbb{R}^{n_b \times n_b}$ is the identity matrix, $S \in \mathbb{R}^{n \times n_b}$ is the summing matrix, and $A \in \mathbb{R}^{n_u \times n_b}$ is the aggregating matrix. The summing constraints can thus be written as $u = Ab$.
We assume the base forecasts to be in the form of predictive distributions. We denote by $\hat{\pi}$ the base forecast distribution for the entire hierarchy, and by $\hat{\pi}_U$ and $\hat{\pi}_B$ the base forecasts for the upper and the bottom variables. The aim of probabilistic reconciliation is to find a reconciled distribution $\tilde{\pi}$ that gives positive probability only to coherent points. To this end, we first obtain a reconciled bottom distribution $\tilde{\pi}_B$ from the base forecast distribution $\hat{\pi}$. Then, we obtain the reconciled distribution $\tilde{\pi}$ on the entire hierarchy as:
$$\tilde{\pi}(u, b) = \tilde{\pi}_B(b)\,\mathbb{1}\{u = Ab\},$$
so that the probability of any set of incoherent points is zero.
Probabilistic bottom-up. If we set $\tilde{\pi}_B = \hat{\pi}_B$, we have the probabilistic bottom-up, which ignores the base forecast distribution $\hat{\pi}_U$ of the upper variables of the hierarchy. The reconciled bottom-up distribution $\tilde{\pi}_{bu}$ is thus given by:
$$\tilde{\pi}_{bu}(u, b) = \hat{\pi}_B(b)\,\mathbb{1}\{u = Ab\}.$$
Reconciliation through conditioning. In this work, we apply reconciliation via conditioning. In order to take into account the base forecasts of all variables, we introduce the random vector $\hat{Y} = [\hat{U}^\top, \hat{B}^\top]^\top$, so that $\hat{\pi}_U$ and $\hat{\pi}_B$ are the distributions of $\hat{U}$ and $\hat{B}$. We then define the reconciled bottom distribution by conditioning on the hierarchy constraints.
For discrete distributions we have:
$$\tilde{\pi}_B(b) := \hat{\pi}\big(\hat{B} = b \mid \hat{U} = A\hat{B}\big) \propto \hat{\pi}(Ab, b).$$
The same formula, $\tilde{\pi}_B(b) \propto \hat{\pi}(Ab, b)$, also holds for continuous distributions; we refer to Zambon et al. (2024) for the derivation in the continuous case.

Gaussian reconciliation
In the Gaussian case, reconciliation via conditioning admits a closed-form solution (Corani et al., 2020); its reconciled mean and variance are numerically equivalent to those of MinT (Wickramasuriya et al., 2019), despite the different derivation.
In Appendix A we derive the reconciliation formulae in a novel way; they are equivalent to those of Corani et al. (2020) and Wickramasuriya et al. (2019), but more convenient for proving some properties. Let us assume the base forecasts for the entire hierarchy to be multivariate Gaussian:
$$\hat{Y} = \begin{bmatrix} \hat{U} \\ \hat{B} \end{bmatrix} \sim \mathcal{N}\big(\hat{\mu}_Y, \hat{\Sigma}_Y\big), \quad \text{where} \quad \hat{\mu}_Y = \begin{bmatrix} \hat{\mu}_U \\ \hat{\mu}_B \end{bmatrix}, \quad \hat{\Sigma}_Y = \begin{bmatrix} \hat{\Sigma}_U & \hat{\Sigma}_{UB} \\ \hat{\Sigma}_{BU} & \hat{\Sigma}_B \end{bmatrix}.$$
In Appendix A, we show that this is equivalent to the framework of Corani et al. (2020). The covariance matrix of the base forecasts $\hat{\Sigma}_Y$ can thus be computed in practice as the covariance matrix of the forecasting errors. Assuming $\hat{\Sigma}_Y$ to be positive definite, the reconciled bottom and upper distributions are multivariate Gaussian, $\tilde{B} \sim \mathcal{N}(\tilde{\mu}_B, \tilde{\Sigma}_B)$ and $\tilde{U} = A\tilde{B} \sim \mathcal{N}(A\tilde{\mu}_B, A\tilde{\Sigma}_B A^\top)$, obtained by conditioning on $Z := \hat{U} - A\hat{B} = 0$:
$$\tilde{\mu}_B = \hat{\mu}_B - \mathrm{Cov}(\hat{B}, Z)\,\mathrm{Var}(Z)^{-1}\big(\hat{\mu}_U - A\hat{\mu}_B\big), \qquad \tilde{\Sigma}_B = \hat{\Sigma}_B - \mathrm{Cov}(\hat{B}, Z)\,\mathrm{Var}(Z)^{-1}\,\mathrm{Cov}(Z, \hat{B}).$$
Proposition 1 (Reconciled Gaussian variance). In the Gaussian framework, the variance of each variable decreases after reconciliation. Indeed, for each $i = 1, \ldots, n_b$ and $j = 1, \ldots, n_u$:
$$\mathrm{Var}(\tilde{B}_i) \le \mathrm{Var}(\hat{B}_i), \qquad \mathrm{Var}(\tilde{U}_j) \le \mathrm{Var}(\hat{U}_j).$$
The proof is given in Appendix B. We remark that the variance of each variable decreases after reconciliation, regardless of the amount of incoherence. While we consider reconciliation via conditioning, Wickramasuriya (2021) provides a similar result for MinT reconciliation. In Wickramasuriya (2021), no Gaussian assumption is made; however, unbiasedness of the base forecasts is assumed, which we do not require.
Reconciled Gaussian mean. In order to study the reconciled mean, we need some restrictive assumptions (which, however, do apply to the case study of Sect. 3). In particular, assuming that:
• there is only one upper variable;
• there is no correlation between the bottom and the upper base forecasts;
the reconciled upper mean is a convex combination of $\hat{\mu}_U$ and $A\hat{\mu}_B$. Indeed, from (6), we have:
$$\tilde{\mu}_U = \frac{\hat{\sigma}^2_{bu}}{\hat{\sigma}^2_U + \hat{\sigma}^2_{bu}}\,\hat{\mu}_U + \frac{\hat{\sigma}^2_U}{\hat{\sigma}^2_U + \hat{\sigma}^2_{bu}}\,A\hat{\mu}_B,$$
where $\hat{\sigma}^2_U$ is the variance of the upper base forecast and $\hat{\sigma}^2_{bu} := A\hat{\Sigma}_B A^\top \ge 0$ is the variance of the probabilistic bottom-up, defined in (2). The reconciled mean is thus a combination of the base and the bottom-up mean, as already observed (Corani et al., 2020; Hollyman et al., 2021). We call this the combination effect.
We can draw an analogy with the Gaussian conjugate model in Bayesian statistics: the posterior variance is always smaller than the prior variance (Gelman, 2011), and the posterior expectation is a convex combination of the prior expectation and the sample mean (Gelman et al., 2013, Ch. 2.5).
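As a numerical illustration, the following sketch applies Gaussian conditioning to a minimal hierarchy $u = b_1 + b_2$; all means and variances are illustrative values, not taken from the paper.

```python
import numpy as np

# Minimal hierarchy: one upper variable u = b1 + b2.
A = np.array([[1.0, 1.0]])               # aggregating matrix (1 x 2)

# Illustrative Gaussian base forecasts (independent upper/bottom).
mu_B = np.array([2.0, 3.0])              # bottom means
Sigma_B = np.diag([1.0, 2.0])            # bottom covariance
mu_U = np.array([8.0])                   # upper mean (incoherent: bottom-up is 5)
Sigma_U = np.array([[4.0]])              # upper variance

# Reconciliation via conditioning on Z := U_hat - A B_hat = 0.
var_Z = Sigma_U + A @ Sigma_B @ A.T      # Var(Z), no upper/bottom correlation
cov_BZ = -Sigma_B @ A.T                  # Cov(B_hat, Z)
K = cov_BZ @ np.linalg.inv(var_Z)
mu_B_rec = mu_B - K @ (mu_U - A @ mu_B)  # reconciled bottom mean
Sigma_B_rec = Sigma_B - K @ cov_BZ.T     # reconciled bottom covariance

mu_U_rec = A @ mu_B_rec                  # reconciled upper mean
var_U_rec = A @ Sigma_B_rec @ A.T        # reconciled upper variance
print(mu_U_rec, var_U_rec)
```

In this toy setting the reconciled upper mean lands between the bottom-up mean (5) and the base mean (8), with weights inversely proportional to the respective variances, while every variance decreases, as stated by Proposition 1.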

Sampling from the reconciled distribution
In the non-Gaussian case, the reconciled distribution $\tilde{\pi}$ is in general not available in parametric form; hence, we need to draw samples from $\tilde{\pi}$. We follow the approach of Zambon et al. (2024), based on importance sampling (Elvira and Martino, 2021).
Denoting by $N$ the number of samples, the algorithm is as follows:
1. Draw $N$ samples $(b^{(i)})_{i=1,\ldots,N}$ from the bottom base forecast distribution $\hat{\pi}_B$.
2. Compute the unnormalized weights $(w^{(i)})_{i=1,\ldots,N}$ as $w^{(i)} = \hat{\pi}(Ab^{(i)}, b^{(i)}) / \hat{\pi}_B(b^{(i)})$. If we assume the bottom and the upper base forecasts to be independent, as we do in this paper, we have $w^{(i)} = \hat{\pi}_U(Ab^{(i)})$.
3. Compute the normalized weights $\tilde{w}^{(i)} = w^{(i)} / \sum_{j=1}^{N} w^{(j)}$.
4. Resample $(b^{(i)})_i$ with replacement, using the normalized weights as probabilities.
The output $(b^{(i)})_i$ is an unweighted sample from the reconciled distribution $\tilde{\pi}_B$. The algorithm assumes the base forecasts of the upper and the bottom variables to be conditionally independent, i.e., to be independent given the observations available up to time $t$.
Computationally, this algorithm is suitable for the reconciliation of small hierarchies such as those considered in this paper (a single upper variable). Larger hierarchies can be reconciled by an extension of this algorithm, called bottom-up importance sampling (BUIS; Zambon et al., 2024). The BUIS algorithm is implemented in the R package bayesRecon (Azzimonti et al., 2023).
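The algorithm above can be sketched in a few lines; this is an illustrative Python implementation (the paper relies on the R package bayesRecon), with made-up independent Poisson base forecasts on a minimal hierarchy.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
N = 100_000

# Made-up independent Poisson base forecasts on the minimal
# hierarchy u = b1 + b2 (small probability of coherence).
lam_b = np.array([0.5, 0.8])     # bottom rates
lam_u = 6.0                      # upper rate

# 1. Draw N samples from the bottom base forecasts.
b = rng.poisson(lam_b, size=(N, 2))

# 2. Unnormalized weights: under independence, w_i = pi_U_hat(A b_i).
w = stats.poisson.pmf(b.sum(axis=1), lam_u)

# 3. Normalize the weights.
w_norm = w / w.sum()

# 4. Resample with replacement to obtain an unweighted sample from
#    the reconciled bottom distribution.
b_rec = b[rng.choice(N, size=N, p=w_norm)]

# With such incoherent base forecasts, the reconciled bottom
# variances exceed the base variances (0.5 and 0.8).
print(b_rec.var(axis=0))
```

With these parameters the base forecasts are strongly incoherent (the upper rate is much larger than the bottom-up rate), so the reconciled bottom variances increase, anticipating the discussion of Section 2.3.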

Reconciled variance of discrete variables
In the case of discrete variables, the variance of the reconciled distribution of the bottom variables can be larger than the variance of the base distribution; this is a major difference with respect to the Gaussian reconciliation. In particular, this happens if the probability $p_c := \mathrm{Prob}(\hat{U} = A\hat{B})$ of the base forecasts being coherent is small.
Proposition 2. Let us assume $\hat{B}$ and $\hat{U}$ to be discrete random variables with $p_c > 0$. Then, for any $j = 1, \ldots, n_b$, the variance of the reconciled bottom variable $\tilde{B}_j$ satisfies the upper bound of Eq. (11), which increases as $p_c$ decreases.
The proof is in Appendix C. According to Eq. (11), a small $p_c$ can result in a large variance of the reconciled bottom distributions. We can draw yet another analogy with Bayesian statistics: in non-Gaussian models, the posterior variance can be larger than the prior variance if we condition on observations that conflict with the prior beliefs (Gelman et al., 2013, Ch. 2.2; Gelman, 2011).

Bernoulli base forecasts. Consider the minimal hierarchy of Fig. 1, with $u = b_1 + b_2$. We assume independent base forecasts: Bernoulli distributions with parameters $\hat{p}_1$ and $\hat{p}_2$ for the bottom variables, and a distribution $\hat{q}$ on $\{0, 1, 2\}$ for the upper variable. We set $\hat{p}_1 = 0.3$, $\hat{p}_2 = 0.2$, and $\hat{q} = [0.1, 0.2, 0.7]$, which induce large incoherence. The probability of coherence is $p_c = 0.17$, computed as:
$$p_c = \sum_{k=0}^{2} \hat{q}_k\, \mathrm{Prob}\big(\hat{B}_1 + \hat{B}_2 = k\big) = 0.1 \cdot 0.56 + 0.2 \cdot 0.38 + 0.7 \cdot 0.06 \approx 0.17.$$
In this case, the variance of all variables increases after reconciliation (Table 1).
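This example can be checked exactly by enumerating the four bottom configurations; a minimal sketch, assuming the minimal hierarchy $u = b_1 + b_2$ with independent Bernoulli bottom base forecasts:

```python
import itertools

# Exact reconciliation by enumeration on the minimal hierarchy u = b1 + b2.
p1, p2 = 0.3, 0.2                   # Bernoulli parameters of b1, b2
q = [0.1, 0.2, 0.7]                 # base pmf of u on {0, 1, 2}

# Reconciled bottom pmf: proportional to pi_B_hat(b) * pi_U_hat(b1 + b2).
weights = {}
for b1, b2 in itertools.product([0, 1], repeat=2):
    pi_B = (p1 if b1 else 1 - p1) * (p2 if b2 else 1 - p2)
    weights[(b1, b2)] = pi_B * q[b1 + b2]

p_c = sum(weights.values())         # probability of coherence
rec = {s: w / p_c for s, w in weights.items()}

def var(pmf):
    mean = sum(x * p for x, p in pmf.items())
    return sum((x - mean) ** 2 * p for x, p in pmf.items())

# Marginal reconciled pmfs of b1 and of the upper variable u.
rec_b1 = {0: rec[0, 0] + rec[0, 1], 1: rec[1, 0] + rec[1, 1]}
rec_u = {k: sum(p for (i, j), p in rec.items() if i + j == k) for k in range(3)}

print(round(p_c, 3))                         # 0.174
print(var(rec_b1) > p1 * (1 - p1))           # variance of b1 increases: True
print(var(rec_u) > var(dict(enumerate(q))))  # variance of u increases: True
```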
Poisson base forecasts. We now consider the case of independent Poisson base forecasts:
$$\hat{B}_1 \sim \mathrm{Poisson}(\hat{\lambda}_1), \quad \hat{B}_2 \sim \mathrm{Poisson}(\hat{\lambda}_2), \quad \hat{U} \sim \mathrm{Poisson}(\hat{\lambda}_u).$$
We obtain a small probability of coherence by setting $\hat{\lambda}_1 = 0.5$, $\hat{\lambda}_2 = 0.8$, and $\hat{\lambda}_u = 6.0$, which results in $p_c = 0.03$. We perform reconciliation via importance sampling (Section 2.2). The variance of the bottom variables, computed using the samples, increases after reconciliation (Table 1).

Reconciled mean in the non-Gaussian case
In some cases, the reconciled mean of the upper variable can be lower than both the mean of its base forecast and the bottom-up mean.We call this the "concordant-shift effect".This happens when reconciliation decreases the variance of base forecasts that are right-skewed, which is often the case with count variables: the right tail is shortened, lowering the expected values.
To show the concordant-shift effect on the minimal hierarchy (Fig. 1), we consider independent Poisson base forecasts with $\hat{\lambda}_1 = 0.5$, $\hat{\lambda}_2 = 0.8$, and $\hat{\lambda}_u = 1.5$. All base forecasts thus convey information of low counts. Reconciliation fuses the base forecasts, emphasizing the tendency towards 0: the mean of every variable decreases after reconciliation (Table 2). The reconciled distribution of the upper variable has a lower mean than both the base and the bottom-up distributions (Fig. 2a). In contrast, we can induce the combination effect (Fig. 2b) by setting $\hat{\lambda}_1 = 5$, $\hat{\lambda}_2 = 7$, and $\hat{\lambda}_u = 18$: in this case, the base forecasts are less skewed and $p_c$ is larger (Table 2).
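The concordant-shift example can be verified exactly by truncated enumeration, without any sampling; a minimal sketch:

```python
import numpy as np
from scipy import stats

# Exact check of the concordant-shift effect by truncated enumeration:
# independent Poisson base forecasts on the minimal hierarchy u = b1 + b2.
lam1, lam2, lam_u = 0.5, 0.8, 1.5
K = 30                                   # truncation; negligible mass beyond

k = np.arange(K)
p1 = stats.poisson.pmf(k, lam1)
p2 = stats.poisson.pmf(k, lam2)
s = k[:, None] + k[None, :]              # b1 + b2 on the grid

# Reconciled joint bottom pmf: proportional to p1(b1) p2(b2) pi_U_hat(b1+b2).
joint = np.outer(p1, p2) * stats.poisson.pmf(s, lam_u)
joint /= joint.sum()

mean_u_rec = (joint * s).sum()           # reconciled upper mean
print(mean_u_rec)                        # below both lam_u = 1.5 and bottom-up 1.3
```

The reconciled upper mean falls below both the base mean (1.5) and the bottom-up mean (1.3), which is exactly the concordant-shift effect described above.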

Case study: modeling extreme market events
Credit default swaps (CDS) are financial instruments that provide the buyer with insurance against the possible default of a given company (the "reference company"). The CDS price is a function of the probability of default that the market estimates for that company; a high value of the CDS spread thus corresponds to an increased risk of default.
Following Raunig and Scheicher (2011), an extreme market event takes place when the value of the CDS spread on a given day exceeds the 90th percentile of its distribution over the last trading year. In particular, we forecast the extreme market events for the companies of the Euro Stoxx 50 index (https://www.stoxx.com/) in the period 2005-2018, which includes 3508 trading days. As in Agosto et al. (2020), we consider the 29 companies included in the index having a regularly quoted CDS, and we divide them into five economic sectors: Financial (FIN), Information and Communication Technology (ICT), Manufacturing (MFG), Energy (ENG), and Trade (TRD). Following Agosto (2022), we start from the CDS spread time series retrieved from Bloomberg and we count the daily number of extreme events in each sector, obtaining five daily time series. The series include 3508 data points each; they have low counts and a high frequency of zeros (Table 3); they are all intermittent, with a large average inter-demand interval (ADI).

Base forecasts and hierarchical structure. We organize the time series into a hierarchy with 5 bottom and 1 upper time series: the five economic sectors constitute the bottom level, while their sum constitutes the top level (Fig. 3). We compute the base forecasts for the bottom time series (counts in the different sectors) using the model of Agosto (2022), whose predictive distribution is a multivariate negative binomial with a static vector of dispersion parameters and a time-varying vector of location parameters following a score-driven dynamics (Harvey, 2013). The model of Agosto (2022) extends to the multivariate case the modeling approach of Agosto et al.
(2016), who proposed a Poisson autoregressive model with exogenous covariates (PARX) to measure systemic risk in corporate default dynamics. According to Agosto (2022), the predictive distribution computed at time $t$ for the count of time $t+1$ in sector $i$ is a negative binomial with location parameter $\mu_{i,t+1}$ and dispersion parameter $\alpha_i \ge 0$, where $\mu_t$ is the $k \times 1$ vector of location parameters and $k$ is the number of sectors. The model assumes the following dynamics:
$$\mu_{t+1} = C + D\,\mu_t + E\,(y_t - \mu_t),$$
where $C$ is a $k \times 1$ vector, while $D$ and $E$ are $k \times k$ matrices; see Agosto (2022) for detailed properties and estimation details. Thus, the predicted event count in a given sector is a function of the past expectations ($\mu_t$) and forecast errors ($y_t - \mu_t$) in the same sector and in the other sectors. The base forecasts for the upper time series are computed by fitting a univariate version of the model (Blasques et al., 2023) on the aggregate count time series. The base forecasts for the bottom series measure financial risk at the sector level, accounting for the dependencies between sectors through a shock-propagation mechanism, besides an autoregressive component. The base upper forecasts express instead a measure of aggregate systemic financial risk.
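The location dynamics can be illustrated with a toy simulation; all parameter values below are made up (the paper estimates $C$, $D$, $E$ by maximum likelihood), and a Poisson observation model stands in for the negative binomial.

```python
import numpy as np

# Toy simulation of the location dynamics mu_{t+1} = C + D mu_t + E (y_t - mu_t).
# All parameter values are made up, for illustration only.
rng = np.random.default_rng(1)
k = 2                                        # two sectors, for illustration
C = np.array([0.05, 0.05])
D = np.array([[0.80, 0.05], [0.05, 0.80]])   # autoregressive component
E = np.array([[0.15, 0.05], [0.05, 0.15]])   # cross-sector shock propagation

T = 200
mu = np.full(k, 0.5)
mus = []
for _ in range(T):
    y = rng.poisson(mu)                      # observed counts (stand-in model)
    mu = C + D @ mu + E @ (y - mu)           # score-driven update
    mus.append(mu.copy())
mus = np.array(mus)
print(mus[-1])                               # locations remain positive and stable
```

The off-diagonal entries of $D$ and $E$ implement the shock-propagation mechanism: a surprise in one sector raises the predicted event count in the other.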

Experimental procedure
As in Agosto (2022), we estimate the model parameters through in-sample maximum likelihood. We then compute the 1-day-ahead base forecasts for time $t+1$ by conditioning the model on the counts observed up to time $t$.
We reconcile the in-sample base forecasts using importance sampling (Sect. 2.2). We perform 3508 reconciliations, drawing each time $N = 100{,}000$ samples from the reconciled distribution; each reconciliation is almost instantaneous (<0.1 s).
As recommended by Panagiotelis et al. (2023), we compare base and reconciled distributions through the energy score (ES; Székely and Rizzo, 2013):
$$\mathrm{ES}(P_t, y_t) = \mathbb{E}\,\|s_t - y_t\|^{\beta} - \frac{1}{2}\,\mathbb{E}\,\|s_t - s'_t\|^{\beta},$$
where $P_t$ is the forecast distribution on the whole hierarchy, $s_t, s'_t$ are a pair of independent random vectors distributed as $P_t$, and $y_t$ is the vector of the actual values of all the time series at time $t$. We compute the ES, with $\beta = 2$, using the sampling approach of Wickramasuriya (2023). We compute the ES of the joint predictive distribution of the upper and bottom time series; we thus have a single ES for the entire hierarchy.
Moreover, we evaluate the prediction intervals using the interval score (IS; Gneiting and Raftery, 2007):
$$\mathrm{IS}(l_t, u_t; y_t) = (u_t - l_t) + \frac{2}{\alpha}(l_t - y_t)\,\mathbb{1}\{y_t < l_t\} + \frac{2}{\alpha}(y_t - u_t)\,\mathbb{1}\{y_t > u_t\},$$
where $\alpha \in (0, 1)$, $l_t$ and $u_t$ are the lower and upper bounds of the $(1-\alpha) \times 100\%$ prediction interval, and $y_t$ is the actual value of the time series at time $t$. We use $\alpha = 0.1$, i.e., we score prediction intervals whose nominal coverage is 90%. Finally, we evaluate the point forecasts measuring the squared error (SE) and the absolute error (AE):
$$\mathrm{SE} = \big(y_t - \hat{y}_{t|t-1}\big)^2, \qquad \mathrm{AE} = \big|y_t - \hat{y}_{t|t-1}\big|,$$
where $\hat{y}_{t|t-1}$ denotes the optimal point forecast. The optimal point forecast depends on the error measure: it is the median of the predictive distribution for the AE, and the expected value for the SE (Kolassa, 2016, 2020).
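A sample-based computation of these scores can be sketched as follows; this is an illustrative implementation on toy forecast samples, not the paper's code (which uses the sampling approach of Wickramasuriya (2023) for the ES).

```python
import numpy as np

# Illustrative sample-based energy score and interval score.

def energy_score(samples, y, beta=2.0):
    """ES from N x d samples of the joint forecast; lower is better."""
    s1, s2 = samples[::2], samples[1::2]     # pairs of independent draws
    m = min(len(s1), len(s2))
    term1 = np.mean(np.linalg.norm(samples - y, axis=1) ** beta)
    term2 = np.mean(np.linalg.norm(s1[:m] - s2[:m], axis=1) ** beta)
    return term1 - 0.5 * term2

def interval_score(samples, y, alpha=0.1):
    """IS of the central (1-alpha) prediction interval; lower is better."""
    l, u = np.quantile(samples, [alpha / 2, 1 - alpha / 2])
    return (u - l) + (2 / alpha) * max(l - y, 0) + (2 / alpha) * max(y - u, 0)

rng = np.random.default_rng(0)
samples = rng.poisson(2.0, size=(10_000, 3)).astype(float)  # toy joint forecast
y = np.array([2.0, 1.0, 3.0])
print(energy_score(samples, y), interval_score(samples[:, 0], y[0]))
```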
Skill score. The skill score measures the improvement of the reconciled forecasts over the base forecasts. For example, the skill score for the AE is defined as:
$$\mathrm{SS}(\mathrm{AE}) = \frac{\mathrm{AE}_{\text{base}} - \mathrm{AE}_{\text{recon}}}{\big(\mathrm{AE}_{\text{base}} + \mathrm{AE}_{\text{recon}}\big)/2}.$$
For all indicators, a positive skill score implies an improvement of the reconciled forecasts over the base forecasts, and vice versa. The skill score defined in (15) is symmetric, allowing a fair comparison of base and reconciled forecasts.
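The symmetric skill score can be sketched as a one-liner; its bounds of $-2$ and $2$ are attained when one of the two errors is zero.

```python
def skill_score(base: float, recon: float) -> float:
    """Symmetric skill score; positive values favor the reconciled forecast."""
    return (base - recon) / ((base + recon) / 2)

print(skill_score(1.0, 0.0), skill_score(0.0, 1.0))  # bounds: 2.0 -2.0
```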
In Table 4 we report the skill scores averaged over the 3508 reconciliations. Reconciliation largely improves the ES (with an average skill score of 0.89) and the IS (average skill score ranging between 0.2 and 1.1, depending on the time series). The boxplot of the daily skill scores on the ES (Fig. 4) confirms the improvement due to reconciliation. As a further insight, reconciliation reduces the width of the prediction intervals by 15-50% (Table 5) without compromising their coverage (Table 6). We observe large skill scores (0.8-1) on the SE. On the other hand, the skill score on the AE is close to 0: indeed, the median of the predictive distribution is often already 0, and it does not change after reconciliation.
Hence, in our experiments, reconciliation yields a major improvement over the base forecasts, confirming the positive effect of probabilistic reconciliation (Corani et al., 2023;Zambon et al., 2024) and point forecast reconciliation (Kourentzes and Athanasopoulos, 2021) for intermittent time series.
Coherent vs optimal point forecast for the upper time series. Even if we have a reconciled distribution, only the SE-optimal point forecasts are coherent. The AE-optimal point forecasts, i.e., the medians of the reconciled distributions, are instead generally incoherent (Kolassa, 2022).
Hence, besides the optimal point forecasts, we evaluate coherent point forecasts: we take the sum of the medians of the reconciled bottom distributions and use it as the upper point forecast. We compute the mean absolute error for the upper time series obtained using the AE-optimal, the coherent, and the base point forecasts (Table 7). As expected, the AE worsens when imposing coherence rather than optimality; yet, it remains better than the AE of the base forecasts.

Analysis of the reconciled mean and variance
In most reconciliations (96%), we observe the concordant-shift effect, i.e., the reconciled upper mean is lower than both the bottom-up and the base upper mean. In Fig. 5a, we show an example. Reconciliation largely reduces the variance of "ALL" and "FIN", shortening the right tail of the distribution. Among the bottom time series, reconciliation mostly affects FIN, which is characterized by the largest counts and overdispersion. The predictive distributions of other time series, such as ICT (shown in the figure), are less affected by reconciliation, being characterized by lower counts and variability. The reduction of the variance shifts the expected values towards zero, since the base distributions have a positive skewness.
In Fig. 5b, we show an example of the combination effect. The reconciled distribution of "ALL" is a combination of its base forecast and the probabilistic bottom-up. Reconciliation decreases the expected values of the bottom time series, while increasing the expected value of the upper time series.
The variance of all the variables decreases in most cases (97%). However, in Fig. 6 we show an example in which the variance of the bottom variables increases after reconciliation, because of the large incoherence of the base forecasts. The forecast of the upper time series, which is characterized by large uncertainty, is instead sharply shifted towards smaller values.

Conclusions
We proved that reconciliation via conditioning decreases the variance of all variables in the Gaussian framework, while it can increase or decrease the variance of discrete variables, depending on the incoherence of the base forecasts. We also discussed two different effects that reconciliation can have on the upper mean. The empirical analysis of time series of extreme market events confirmed our theoretical insights. We leave as future research the study of the theoretical properties of the reconciliation of other continuous non-Gaussian distributions, including skewed ones.

Acknowledgements
Work partially funded by the Swiss National Science Foundation (grant 200021 212164/1), by the Hasler foundation (project 23057), and by the European Union's Horizon 2020 research and innovation programme (grant 101016233).

Figure 3: Hierarchical structure of the extreme events time series.

Figure 4: Boxplot of the skill scores on ES, over 3508 reconciliations.
Concordant-shift effect. The reduction of variance of the asymmetric distribution shortens the right tail and decreases the mean.

Figure 5: Concordant-shift vs combination effect. We plot the base and reconciled probability mass functions for "ALL", "FIN", and "ICT". For "ALL", we also show the bottom-up pmf. The vertical lines correspond to the means of the distributions.

Table 1: Reconciliations that increase the variance of the bottom variables (grey background).

Table 2: Mean before and after reconciliation; the base forecasts are Poisson.
Moreover, the skill score is bounded between −2 and 2. Di Fonzo and Girolimetto (2022) compare competing approaches by computing the geometric mean of the indicators. However, this is not suitable for count time series, where the SE and the AE are often zero.

Table 4: Average skill scores for the different time series. Positive values indicate an improvement of the reconciled forecasts over the base forecasts. The ES is computed with respect to the joint distribution on the entire hierarchy.

Table 7: Mean absolute error for the upper time series "ALL".