Skewness and kurtosis of multivariate Markov-switching processes

Exact formulae are provided for the calculation of multivariate skewness and kurtosis of Markov-switching Vector Auto-Regressive (MS VAR) processes as well as for the general class of MS state space (MS SS) models. The use of the higher-order moments in non-linear modelling is illustrated with two examples. A Matlab code that implements the results is available from the authors.


Introduction
Markov-switching models are now widespread in applied macroeconomics and finance. By extending linear specifications with a discrete latent process that controls parameter switches, MS models have gained the ability to fit time series subject to non-linearities. In macroeconomics, MS models were introduced by Hamilton (1989) with the aim of capturing the asymmetry of the business cycle. Kim and Nelson (1999a) and McConnell and Perez-Quiros (2000) extended the Hamilton model to account for the reduction in business cycle fluctuations known as the Great Moderation. Phillips (1991) applied the Hamilton model to a multi-country case. Ang and Bekaert (2002a) underlined the usefulness of a multivariate dimension when analyzing switches in the dynamics of the US, UK, and German short-term interest rates. Favero and Monacelli (2005) and Sims and Zha (2006) resorted to the MS VAR framework to detect shifts in US monetary and fiscal policy. Given the empirical evidence about the existence of policy regimes, the latest generation of dynamic stochastic general equilibrium models includes Markov-switching policy reaction functions (see Leeper, 2007, and Chung, 2004). In this context MS VAR models arise as the fundamental solution of the forward-looking structural equations (Zha, 2009, 2011). MS models have also been used intensively in empirical finance to reproduce the fat tails, leverage effects, volatility clustering, and time-varying correlations that characterize many financial return series. In this context too, switching regimes have been inserted into equilibrium models: Mark (1990, 1993), for instance, have added regimes to the conventional asset pricing model through switching processes for dividends and consumption. General discussions and additional references can be found in Krolzig (1997), Nelson (1999b), Frühwirth-Schnatter (2006), Timmermann (2011), and Guidolin (2012).
The statistical properties of MS VAR models have been analyzed, among others, by Yang (2000), Zakoian (2001, 2002), and Cavicchioli (2013, 2014). These studies focus on stationarity issues, on the first two unconditional moments, and on the determination of the number of regimes. Timmermann (2000) derives the first four moments for univariate MS models. In spite of their relevance, the higher-order moments are still unknown in the general multivariate case. We give closed-form formulae for multivariate MS VAR processes as well as for the general class of MS state space models (see Kim, 1993, 1994). In an independent research paper, Cavicchioli (2015) also derives closed-form expressions for the moments of MS VARMA models up to any order and proposes alternative measures of skewness and kurtosis.
The general MS VAR and MS state space models are presented in Section 2. We focus on models where the discrete latent variable takes a finite number of states with time-invariant transition probabilities. In Section 3, we derive formulae for the higher-order moments for both MS VAR and MS SS models. The use of the higher-order moments is illustrated in Section 4 with two examples. Section 5 concludes.

Model and assumptions
Let $(\Omega, \mathcal{F}, P)$ be a probability space on which a vector process $\{\varepsilon_t\}$ and a $K$-state Markov chain $\{S_t\}$ are defined at discrete time $t$. The first-order MS VAR process for the $n_x$-dimensional vector $\{x_t\}$ is generated by the stochastic difference equation (2.1). Specifications involving more lags can easily be cast into this formulation through the VAR(1) companion form. The following assumptions are supposed to hold:
(ii) The Markov chain $\{S_t\}$ is homogeneous, irreducible, aperiodic, and non-null persistent.
(iii) The processes $\{\varepsilon_t\}$ and $\{S_t\}$ are independent.
The variable $x_t$ may be unobserved. In this case it is typically related to a vector of $n_y$ observations $y_t$ through the measurement equation:
$$y_t = a_{S_t} + H_{S_t} x_t + \gamma_{S_t} u_t \qquad (2.2)$$
Equations (2.1)-(2.2) make up an MS state space model. The following additional assumptions are made:
(v) The $n_u$-dimensional process $\{u_t\}$ is such that $u_t \sim iiN(0, I_{n_u})$.
(vi) The process $\{u_t\}$ is independent of $\{\varepsilon_t\}$ and $\{S_t\}$.
(vii) The $n_y \times 1$ vector $a_{S_t}$, the $n_y \times n_x$ matrix $H_{S_t}$, and the $n_y \times n_u$ matrix $\gamma_{S_t}$ take $K$ different values depending on the realization of the discrete latent variable $S_t$.
Throughout the paper, we denote by $p_{jk}$ the conditional probability $p_{jk} = P(S_t = k \mid S_{t-1} = j)$ for $j, k = 1, \cdots, K$, by $\pi_k$ the marginal probability of state $k$, i.e. $\pi_k = P(S_t = k)$, and by $\pi$ the $K \times K$ matrix with $(\pi_1, \cdots, \pi_K)$ on the main diagonal and zeros elsewhere. We call $J_{n,k}$ the $n \times nK$ matrix $J_{n,k} = [0_{n \times n(k-1)}, I_n, 0_{n \times n(K-k)}]$, $k = 1, \cdots, K$; the $K$ matrices $J_{n,k}$ sum to $J_n = \sum_{k=1}^{K} J_{n,k}$. For any two $n \times 1$ vectors $A$ and $B$ we denote by $R_n$ the $n^2 \times n^2$ commutation matrix such that $A \otimes B = R_n (B \otimes A)$; its construction is described in Magnus and Neudecker (1999). For any integer $m$, we also define $P_m(\Phi)$, the $Kn_x^m \times Kn_x^m$ matrix given in (2.3). Finally, we denote by $\rho(M)$ the spectral radius of the matrix $M$.
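The notation above can be made concrete with a few helper functions. The following is a minimal sketch in Python with NumPy, not the authors' Matlab code; the function names are hypothetical. It assumes the transition matrix is stored with $p_{jk}$ in row $j$ and column $k$, and it builds the commutation matrix directly from its defining property $A \otimes B = R_n (B \otimes A)$.

```python
import numpy as np

def ergodic_probabilities(P):
    """Marginal state probabilities pi_k of a homogeneous ergodic chain.

    P[j, k] = p_jk = P(S_t = k | S_{t-1} = j), so rows sum to one.
    Solves pi' P = pi' together with sum(pi) = 1 by least squares.
    """
    K = P.shape[0]
    A = np.vstack([P.T - np.eye(K), np.ones((1, K))])
    b = np.zeros(K + 1)
    b[-1] = 1.0
    pi, *_ = np.linalg.lstsq(A, b, rcond=None)
    return pi

def selection_matrix(n, k, K):
    """J_{n,k} = [0, I_n, 0] of size n x nK, with k counted from 1."""
    J = np.zeros((n, n * K))
    J[:, n * (k - 1): n * k] = np.eye(n)
    return J

def commutation_matrix(n):
    """R_n such that kron(A, B) = R_n @ kron(B, A) for n x 1 vectors."""
    R = np.zeros((n * n, n * n))
    for i in range(n):
        for j in range(n):
            # (A kron B)[i*n + j] = A_i B_j = (B kron A)[j*n + i]
            R[i * n + j, j * n + i] = 1.0
    return R
```

For instance, `sum(selection_matrix(n, k, K) for k in range(1, K + 1))` returns $J_n$, the horizontal stack of $K$ identity matrices of order $n$.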

Multivariate measures of skewness and kurtosis
In macroeconomics and finance, non-linearities are typically analyzed through the pairwise measures of skewness and kurtosis in (3.1), where $w_{it}$ and $w_{jt}$ are two scalar elements of an $n$-dimensional random vector $w_t$ with finite moments, $V$ and $Cov$ stand for variance and covariance respectively, and $k$, $\ell$ are strictly positive integers whose sum $k + \ell = 3, 4$ gives the moment order. The case $i = j$ yields the univariate higher-order moments and $i \neq j$ the mixed moments. All moments given by (3.1) can be collected into the $n^3 \times 1$ and $n^4 \times 1$ vectors in (3.2), where $E$ is the expectation and $\Sigma_w$ is the variance-covariance matrix of $w_t$ with zeros outside the main diagonal. Mardia (1970) proposed alternative definitions of multivariate skewness and kurtosis, known as $\beta_{1,n}$ and $\beta_{2,n}$, which aggregate univariate and mixed moments. As shown in Kollo and Srivastava (2005) and Kollo (2008), Mardia's statistics can be easily retrieved from (3.2) when the standardization uses $\Lambda_w$, any symmetric square root of $V(w_t)$. Other measures of skewness and kurtosis can be found in the literature, for instance in Mori, Rohatgi, and Szekely (1993); they can be similarly calculated using formula (3.2). Without loss of generality we focus on the measures of skewness and kurtosis given in (3.2).
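As an illustration of how third and fourth moments can be stacked into $n^3 \times 1$ and $n^4 \times 1$ vectors, the sketch below computes sample analogues from a $T \times n$ data matrix. It is an assumption-laden illustration rather than the model-based formulae of the paper: the function names are hypothetical, the moments are sample averages of Kronecker powers of the standardized data, and Mardia's statistics are obtained through the standard identities $\beta_{1,n} = \lVert E(z \otimes z \otimes z) \rVert^2$ and $\beta_{2,n} = E[(z'z)^2]$, with $z$ standardized by a symmetric square root of the covariance matrix.

```python
import numpy as np

def standardize(w, full=False):
    """Center w (T x n) and scale it.

    full=False: divide each column by its standard deviation
                (diagonal standardization).
    full=True : multiply by the inverse symmetric square root of the
                sample covariance matrix.
    """
    c = w - w.mean(axis=0)
    V = c.T @ c / len(w)
    if not full:
        return c / np.sqrt(np.diag(V))
    lam, Q = np.linalg.eigh(V)
    return c @ (Q @ np.diag(lam ** -0.5) @ Q.T)

def kron_moments(z):
    """Sample means of z kron z kron z and z kron z kron z kron z."""
    T = len(z)
    sk = np.einsum('ti,tj,tk->ijk', z, z, z).reshape(-1) / T
    ku = np.einsum('ti,tj,tk,tl->ijkl', z, z, z, z).reshape(-1) / T
    return sk, ku

def mardia(w):
    """Mardia's skewness and kurtosis from the Kronecker moment vectors."""
    z = standardize(w, full=True)
    sk, ku = kron_moments(z)
    vec_I = np.eye(w.shape[1]).reshape(-1)
    beta1 = sk @ sk                        # squared norm of E(z kron z kron z)
    beta2 = np.kron(vec_I, vec_I) @ ku     # sum over i, j of E(z_i^2 z_j^2)
    return beta1, beta2
```

With the diagonal standardization, `kron_moments(standardize(w))[0].reshape(n, n, n)[0, 1, 1]` is the sample co-skewness relating the level of the first variable to the square of the second.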

Markov-switching vector autoregressive models
Given the MS VAR process $\{x_t\}$ in (2.1) with assumptions (i)-(iv), let us define $z_t$ the standardized variable. It is easily checked that $\{z_t\}$ also follows an MS VAR process. Theorem 1 below uses the auxiliary process $\{z_t\}$ to derive the skewness and kurtosis of the general MS VAR process (2.1).
Theorem 1 Suppose $\{x_t\}$ follows the process (2.1) and that $\rho(P_m(\Phi)) < 1$ for $m \le 4$. Then:
(I) the skewness of the vector $x_t$ is given in closed form; the expression involves a matrix $D$ and the $n_x^3 \times n_x^3$ matrix $A_3$ given in (3.5);
(II) the kurtosis of the vector $x_t$ is given in closed form; in the expression, $P_4(\Psi)$ is like in (2.3) and $M$ is the $Kn_x^4 \times 1$ vector $M = (\pi_1 m_1, \pi_2 m_2, \cdots, \pi_K m_K)'$ whose $n_x^4 \times 1$ elements are denoted $m_k$. The matrix $D$ is detailed in (I) above, the $n_x^4 \times n_x^4$ matrix $A_4$ and the $n_x^4$-dimensional vector $B$ are given in (3.7) and (3.8), and the $n_x^4 \times n_x^4$ matrix $\tilde{A}_4$ is given in (3.9).
Proof: See Supplementary material.
In the absence of the autoregressive lag, i.e. when $\Phi_{S_t} = 0$, model (2.1) is Gaussian conditionally on the concurrent state $S_t$, so the distribution of $x_t$ is a finite mixture of normal densities (see for example Fiorentini, Planas, and Rossi, 2014). In empirical finance the efficient market hypothesis provides a compelling argument for excluding autoregressive terms, so the finite mixture model has often been applied to the analysis of returns, for instance by Ang and Bekaert (2002b) and Taamouti (2012). In this case the expressions in Theorem 1 simplify. We now turn to the moments of MS SS processes.
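The finite-mixture case can be illustrated in the scalar setting $n_x = 1$. The sketch below uses the textbook central moments of a mixture of normals and is not the simplified statement of Theorem 1; the function name and the arguments `mu` (state-dependent means) and `sigma2` (state-dependent variances) are hypothetical, and `pi` holds the marginal state probabilities $\pi_k$.

```python
import numpy as np

def mixture_skew_kurt(pi, mu, sigma2):
    """Skewness and kurtosis of a scalar finite mixture of normals.

    pi     : marginal probabilities of the K components (sum to one)
    mu     : component means
    sigma2 : component variances
    """
    pi, mu, sigma2 = (np.asarray(v, dtype=float) for v in (pi, mu, sigma2))
    d = mu - pi @ mu                      # component means around the overall mean
    var = pi @ (d ** 2 + sigma2)
    m3 = pi @ (d ** 3 + 3 * d * sigma2)
    m4 = pi @ (d ** 4 + 6 * d ** 2 * sigma2 + 3 * sigma2 ** 2)
    return m3 / var ** 1.5, m4 / var ** 2
```

With a single component the function returns a skewness of 0 and a kurtosis of 3. Mixing components that share a mean but differ in variance leaves the skewness at 0 and raises the kurtosis above 3, while differences in means across unequally weighted components generate skewness.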

Markov-switching state space models
The first two unconditional moments of the vector $y_t$ in (2.1)-(2.2) are easily derived from the state-conditional moments $E(x_t|S_t)$ and $E(x_t \otimes x_t|S_t)$. As in the MS VAR case, we define $y_t^*$ the standardized variable. Again, this standardization simplifies the algebra since $Sk(y_t; \Sigma_y) = E(y_t^* \otimes y_t^* \otimes y_t^*)$. The variable $y_t^*$ follows a process whose coefficients $a_k^*$, $H_k^*$, and $\gamma_k^*$ are shown in (3.12), where $z_t$ and $\Sigma_x$ are defined in Section 3.1. Theorem 2 below provides the skewness and kurtosis of MS SS processes.
Theorem 2 Suppose $\{y_t\}$ evolves as in (2.1)-(2.2) and that $\rho(P_m(\Phi)) < 1$ for $m \le 4$. Then:
(I) the skewness of the vector $y_t$ is given in closed form; in the expression, $a_k^*$, $H_k^*$, and $\gamma_k^*$ are shown in (3.12), the $n_y^3 \times n_y^3$ matrix $A_3^*$ is like in (3.5) with dimension $n_y$ instead of $n_x$, $\Sigma_x = diag[V(x_t)]$, and $D$ is detailed in Theorem 1;
(II) the kurtosis of the vector $y_t$ is given in closed form; in the expression, $M$ is detailed in Theorem 1, the $n_y^4 \times n_y^4$ matrices $A_4^*$ and $\tilde{A}_4^*$ are like in (3.7) and (3.9) with dimension $n_y$ instead of $n_x$, and the $n_u^4 \times 1$ vector $B^*$ is like in (3.8) with dimension $n_u$ instead of $n_x$.
The proof is omitted as it closely follows that of Theorem 1 when $\Phi_{S_t} = 0$, since the measurement equation (2.2) does not involve autoregressive lags. It makes use of the higher-order moments of the state variable $x_t$, which are known. The two examples below illustrate the use of the higher-order moments in multivariate MS models.
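The step from state-conditional moments of $x_t$ to the first two unconditional moments of $y_t$ can be sketched numerically. The code below is an illustration under a stated assumption: it takes the measurement equation to have the form $y_t = a_{S_t} + H_{S_t} x_t + \gamma_{S_t} u_t$, as suggested by the dimensions in assumption (vii), with $u_t$ standard normal and independent of $x_t$ and $S_t$. The function name and argument layout are hypothetical.

```python
import numpy as np

def ss_first_two_moments(pi, a, H, gamma, Ex, Exx):
    """Mean and covariance of y_t from state-conditional moments of x_t.

    pi[k]    : marginal probability of state k
    a[k]     : (ny,) intercept,  H[k] : (ny, nx),  gamma[k] : (ny, nu)
    Ex[k]    : E(x_t | S_t = k), shape (nx,)
    Exx[k]   : E(x_t kron x_t | S_t = k), shape (nx * nx,)
    Assumes y_t = a_k + H_k x_t + gamma_k u_t with u_t ~ N(0, I).
    """
    ny = a[0].shape[0]
    Ey = np.zeros(ny)
    Eyy = np.zeros(ny * ny)
    for k, p in enumerate(pi):
        Hx = H[k] @ Ex[k]
        nu = gamma[k].shape[1]
        Ey += p * (a[k] + Hx)
        Eyy += p * (np.kron(a[k], a[k])
                    + np.kron(a[k], Hx) + np.kron(Hx, a[k])
                    + np.kron(H[k], H[k]) @ Exx[k]
                    # E(u kron u) = vec(I) because u_t is standard normal
                    + np.kron(gamma[k], gamma[k]) @ np.eye(nu).reshape(-1))
    V = Eyy.reshape(ny, ny) - np.outer(Ey, Ey)
    return Ey, V
```

Cross terms between $x_t$ and $u_t$ drop out because $u_t$ has zero mean and is independent of the state, which is why only the four blocks above appear in $E(y_t \otimes y_t)$.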

Examples
UK asset returns
Guidolin and Timmermann (GT, 2005) fit an MS VAR model to the UK stock and bond monthly excess returns for the period 1976-2 to 2000-12. They consider three regimes that affect the intercept, the autoregressive matrix, and the variance-covariance matrix of the shocks. The regimes are interpreted as bear, normal, and bull market periods. Table 1 shows the model-based skewness and kurtosis of the UK stock and bond excess returns as implied by the parameter values given in GT's Table 4. The co-skewness statistic reported in Table 1 relates the level of the first variable to the square of the second one, whereas the co-kurtosis relates the level of the first variable to the cube of the second variable. In order to gauge the model fit, the empirical counterparts are also displayed together with the 95% confidence intervals computed using the block bootstrap proposed by Politis and Romano (1994).
US business cycle
Chauvet (1998) and Kim and Nelson (1998) consider an MS dynamic factor model to extract a composite index of the US business cycle out of the growth rates of four US macroeconomic series, namely industrial production, non-farm payroll employment, personal income less transfer payments, and real manufacturing and trade sales. We refer to this dynamic factor specification as model $M_0$. Table 2 shows both empirical and model-based moments of the growth rates of Industrial Production and Employment. For the two variables the empirical skewness is negative, as is the co-skewness. The model adequately reproduces these features. The two series exhibit an excess kurtosis which is sizeable and significant. Model $M_0$ however implies almost zero excess kurtosis. Table 2 also shows the co-kurtosis statistic which relates the squares of the two variables. Its empirical value is equal to 3.57 with confidence interval (2.34, 5.21). Since the theoretical value under normality equals 1.37, this reveals the presence of excess co-movements in volatility between Employment and Industrial Production in the US. Model $M_0$ however does not capture this feature as it implies a co-kurtosis of 1.62, outside of the confidence interval. To catch this non-linearity, we allow for heteroskedasticity in the common shock $a_t$, which yields model $M_1$. The variance of $\sigma_{S_{2t}} a_t$ now switches between two regimes according to the two-state Markov variable $S_{2t}$, which is independent of $S_{1t}$. We estimate model $M_1$ by approximate maximum likelihood (Kim, 1994). The higher-order moments under $M_1$ are displayed in the last row of Table 2. Model $M_1$ yields third and fourth moments that lie inside the empirical 95% confidence intervals. Hence modelling co-movements in volatility improves the characterization of US Employment and Industrial Production compared to the original CPP specification.
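The empirical side of this moment-matching exercise can be sketched as follows. The code computes a sample co-moment $E(z_1^k z_2^\ell)$ of two standardized series and a percentile confidence interval from the stationary bootstrap of Politis and Romano (1994). The function names, the mean block length of 10, the number of replications, and the use of a percentile interval are assumptions made for illustration; the source does not report these settings. For reference, under bivariate normality with correlation $\rho$ the co-kurtosis $E(z_1^2 z_2^2)$ equals $1 + 2\rho^2$.

```python
import numpy as np

def co_moment(x, y, k, l):
    """Sample E[zx^k * zy^l] for the standardized series zx and zy."""
    zx = (x - x.mean()) / x.std()
    zy = (y - y.mean()) / y.std()
    return np.mean(zx ** k * zy ** l)

def stationary_bootstrap_indices(T, mean_block, rng):
    """Resampling indices: geometric block lengths, circular wrap-around."""
    idx = np.empty(T, dtype=int)
    idx[0] = rng.integers(T)
    for t in range(1, T):
        if rng.random() < 1.0 / mean_block:
            idx[t] = rng.integers(T)           # start a new block
        else:
            idx[t] = (idx[t - 1] + 1) % T      # continue the current block
    return idx

def bootstrap_ci(x, y, k, l, mean_block=10, n_boot=999, level=0.95, seed=0):
    """Percentile interval for the co-moment (assumed settings, see above)."""
    rng = np.random.default_rng(seed)
    T = len(x)
    stats = np.empty(n_boot)
    for b in range(n_boot):
        i = stationary_bootstrap_indices(T, mean_block, rng)
        stats[b] = co_moment(x[i], y[i], k, l)
    lo, hi = np.quantile(stats, [(1 - level) / 2, (1 + level) / 2])
    return lo, hi
```

A model-based moment falling outside the interval returned by `bootstrap_ci` signals a feature of the data that the specification fails to reproduce, which is the diagnostic use made of the co-kurtosis in the comparison of $M_0$ and $M_1$.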

Conclusion
We extend the early work by Timmermann (2000) on univariate MS models by deriving closed-form formulae for the multivariate skewness and kurtosis in both MS VAR and MS state space models. Besides enriching the model interpretation by summarizing non-linear features, these formulae provide a useful tool for diagnostic checking via moment-matching. A Matlab code that implements the results in the paper is available from the authors.