High-dimensional GARCH process segmentation with an application to Value-at-Risk

An assumption in modelling financial risk is that the underlying asset returns are stationary. However, there is now strong evidence that multivariate financial time series entail changes not only in their within-series dependence structure, but also in the correlations among them. For this reason, we propose a method for consistent detection of multiple change-points in a (possibly) high-dimensional panel of GARCH processes, where both the individual GARCH processes and their correlations are allowed to change. We prove its consistency in multiple change-point estimation, and demonstrate its good performance through an extensive simulation study and an application to the Value-at-Risk problem on a real dataset. Our methodology is implemented in the R package segMGarch, available from CRAN.


Introduction
The increased financial uncertainty during the recent economic crisis has confirmed the strong volatility linkage between asset markets. For example, there is now strong evidence that the economy and oil prices (Hamilton, 2003), foreign exchange rates (Baillie, 1991), equity markets (Baele, 2003) and crude oil and agricultural commodities (Du et al., 2011) are related, a phenomenon commonly referred to as volatility spillover. Such behavior is naturally expected since the rate of information flow influences the volatility of asset returns and, therefore, information from one asset market can be incorporated into another related market (Ross, 1989).
Motivated by such observations, joint modelling of several financial returns has attracted considerable attention in the literature, and many models have been proposed for multivariate GARCH processes including the vectorized multivariate GARCH models (Bollerslev et al., 1988), the Baba-Engle-Kraft-Kroner (BEKK) model (Engle and Kroner, 1995), the constant conditional correlation (CCC) model (Bollerslev, 1990), the dynamic conditional correlation (DCC) model (Engle, 2002), the generalized orthogonal GARCH models (Van der Weide, 2002), the full-factor multivariate GARCH models (Vrontos et al., 2003) and the conditionally uncorrelated components-based multivariate volatility processes (Fan et al., 2008); for a survey of recent approaches to multivariate GARCH modelling and inference, see Bauwens et al. (2006).
The assumption that the underlying dynamics remain unchanged is rather restrictive considering that the fundamentals driving an economy, or the asset markets in particular, exhibit sudden changes or regime switches. These change-points (a.k.a. structural breaks or breakpoints) in various economic and financial time series are well documented, see Hansen (2001) and Stock and Watson (2002) among many others. Pesaran and Timmermann (2007) examined eight major currency pairs and established that stationary volatility modelling induces an upward bias in the mean square forecast error when change-points exist in the exchange rate volatility. When structural breaks are not accounted for, exchange rate volatility appears close to integrated. Cappiello et al. (2006) documented a change-point in the correlation between international equity and bond returns after the introduction of the euro currency, supporting the need for a method that accounts for change-points in both the conditional volatility and correlation. Ewing and Malik (2013) found a significant transmission of volatility (spillover) between oil and gold when change-points in variance were considered, compared to a weaker transmission when change-points are ignored. Diebold and Inoue (2001) noted that stochastic regime switching may be confused with long memory and the presence of structural breaks may easily be regarded as fractional integration. In Mikosch and Stȃricȃ (2004), a theoretical basis is provided for accounting for the long-range dependence in volatility and the integrated GARCH effect in long log-return series, by assuming that the series undergoes changes in unconditional mean or variance.
Investigations into the tests for a single structural break in univariate, ARCH-type models have been made in Kokoszka and Leipus (2000), Kokoszka and Teyssière (2002), Lee et al. (2003), Berkes et al. (2004) and De Pooter and Van Dijk (2004). Fryzlewicz and Subba Rao (2014) developed a two-stage procedure termed BASTA (binary segmentation for transformed ARCH) for detecting multiple change-points in the parameters involved in modelling the conditional variance of univariate series. Andreou and Ghysels (2003) studied change-point detection in the co-movement of a pair of returns, by examining the product of returns standardized via data-driven volatility filters. For modelling multivariate, possibly high-dimensional data such as asset portfolios consisting of hundreds of different assets, the aforementioned change-point detection methods may be repeatedly applied to individual and pairs of component series. However, identification of the common change-points becomes challenging even for data of moderately high dimensions, due to the bias present in the change-points estimated from each application. Moreover, such an approach does not fully exploit the fact that a change-point may be shared cross-sectionally, and therefore loss of power in change-point detection cannot be ruled out.
In this paper, we propose to address the need for multiple change-point detection in multivariate, possibly high-dimensional GARCH processes where both the within-series and cross-correlation structure may undergo changes. Simultaneous segmentation of high-dimensional GARCH processes is achieved via a two-stage procedure in line with Fryzlewicz and Subba Rao (2014) and Andreou and Ghysels (2003). Firstly, the N GARCH processes are transformed into N(N+1)/2 sequences based on empirical residual series and their cross-products, whereby change-points become detectable in the means of the transformed series. The collection of transformed series serves as an input to the second stage, which adopts the Double CUSUM Binary Segmentation algorithm proposed in Cho (2016) for multiple change-point detection in generic additive, high-dimensional panel data. We show the consistency of the two-stage methodology in estimating both the total number and the locations of the multiple change-points.
The rest of the paper is organized as follows. In Section 2, we introduce a time-varying multivariate GARCH model which provides a framework for the theoretical treatment of our methodology. Section 3 is devoted to the description of the proposed two-stage methodology.
Its finite sample performance is investigated on sets of simulated data in Section 4. In Section 5, we apply our method to a real financial data set and validate its performance by showing the importance of knowing the change-points in a portfolio of assets in risk management. Section 6 concludes the paper. All the proofs are provided in the supplementary document.
2 A time-varying multivariate GARCH model

Then, Ω_i(t) and Θ_i(t) are piecewise constant in t and share B change-points 0 = η_0 < η_1 < … < η_B < η_{B+1} = T. In (1)-(2), we assume that the (unconditional) correlations across the components of r_t are attributed to the correlations of ε_t, and that the conditional variance within each component series is modelled separately as in standard univariate GARCH processes. Thus, the tv-MGARCH model reduces to the CCC model of Bollerslev (1990) over each stationary segment, where F_{t-1} denotes the σ-algebra σ(r_s, s ≤ t-1). We take this approach with the aim of specifying the parameters to which structural changes may be introduced, rather than in order to actually model the conditional correlation between pairs of component series; the latter can be accomplished once the structural change-points are estimated and (approximately) stationary segments are identified.
We note that the multivariate GARCH literature, as cited in the Introduction, considers relatively low-dimensional applications (N ≤ 8), whereas in this paper we consider GARCH modelling of both simulated and real datasets of higher dimensions (N ≥ 31).
(A3) For some ε ∈ (0, 1) and all T, we have …

Assumption (A1) indicates that the dimensionality can either be fixed or increase with T at a polynomial rate. Assumptions (A2)-(A3) guarantee that, between any two consecutive change-points, each r_{i,t} admits a well-defined solution a.s.
For a non-stationary stochastic process X_t, its strong-mixing rate is defined via the sequence of coefficients

α(k) = sup_t sup_{G ∈ σ(X_{t+k}, X_{t+k+1}, …), H ∈ σ(X_t, X_{t-1}, …)} |P(G ∩ H) - P(G)P(H)|.
For the BEKK multivariate GARCH models introduced in Engle and Kroner (1995), Theorem 2.4 of Boussama et al. (2011) shows that there exists a unique, strictly stationary solution which is geometrically β-mixing and hence also strong mixing, i.e., α(k) → 0 as k → ∞, with α(k) decaying like α^k for some α ∈ (0, 1). Adopting the theoretical tools developed in Fryzlewicz and Subba Rao (2011), where the mixing rate of univariate, time-varying ARCH processes was investigated, we establish that any pair of time-varying GARCH processes r_{ii′,t} = (r_{i,t}, r_{i′,t})ᵀ, 1 ≤ i < i′ ≤ N, is strong mixing at a geometric rate. In order to achieve this goal, the following Lipschitz-type condition is imposed on the joint density of (ε²_{i,t}, ε²_{i′,t})ᵀ. We adopt the notation ∨ and ∧ to denote a ∨ b = max(a, b) and a ∧ b = min(a, b).
(A4) The joint density of ε²_{i,t} and ε²_{i′,t}, denoted by f_{i,i′}(u, v), satisfies the following: for any a > 0, there exists a fixed K > 0, independent of a, such that …

Then the time-varying bivariate GARCH process r_{ii′,t}, i < i′, is strong mixing at a geometric rate, where M is a finite constant independent of t and k.
3 Two-stage change-point detection methodology

Stage 1: Transformation of GARCH processes
Empirical residuals have been widely adopted for detecting changes in the unconditional variance and/or parameters of univariate, conditionally heteroscedastic processes; see Kokoszka and Teyssière (2002), Lee et al. (2003), De Pooter and Van Dijk (2004) and Fryzlewicz and Subba Rao (2014). For stationary processes, the squared empirical residuals may be regarded as approximating a series of i.i.d. random variables. Even in the presence of change-points, such a transformation tends to reduce the autocorrelations in the transformed series to a considerable degree, while the change-points are reflected in their means.
For the segmentation of tv-MGARCH processes defined in Section 2, we propose to transform r_t into d = d_N = N(N+1)/2 series such that a change in any of Ω_i(t) or Θ_i(t), i = 1, …, N, is detectable as a change-point in the mean of at least one of the d transformed series. We first transform the N processes r_{i,t} individually using a function g₁: ℝ^{1+p+q} → ℝ, which takes (r_{i,t}, …, r_{i,t−p})ᵀ and (h_{i,t−1}, …, h_{i,t−q})ᵀ as inputs. We require that g₁ is bounded and Lipschitz continuous.
Motivated by the univariate methods based on empirical residuals, including BASTA-res (Fryzlewicz and Subba Rao, 2014), we select g₁ so that U_{i,t} takes the form of a squared empirical residual, i.e., r²_{i,t} standardised by an approximation h̃_{i,t} of the conditional variance. Following Fryzlewicz and Subba Rao (2014), we add εr²_{i,t} to h̃_{i,t}, with an arbitrarily small positive constant ε, in order to ensure the boundedness of U_{i,t}. When any parameter in Ω_i(t) undergoes a shift over time, this change is expected to be reflected as a shift in E(U_{i,t}); more specifically, E(U_{i,t}) = E(h_{i,t}/h̃_{i,t}) contains any change in Ω_i(t) as a change in its value. Choices of the parameters C_{i,j}, j = 0, …, p+q, and the treatment of the unobservable h_{i,t} are discussed in Section 3.3.
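As a concrete illustration of this transformation, the following sketch assumes a GARCH(1,1) fit with given parameter estimates; the function names and the exact placement of the ε-term are ours:

```python
import numpy as np

def garch11_filter(r, omega, alpha, beta):
    """Recursively compute GARCH(1,1) conditional variances h_t for a
    return series r; h_0 is initialised at the sample variance."""
    h = np.empty(len(r))
    h[0] = np.var(r)
    for t in range(1, len(r)):
        h[t] = omega + alpha * r[t - 1] ** 2 + beta * h[t - 1]
    return h

def transform_u(r, omega, alpha, beta, eps=1e-3):
    """Stage 1 transformation for a single series: squared returns
    standardised by the fitted variance, with an eps * r_t^2 term in
    the denominator so that U_t is bounded by 1/eps."""
    h = garch11_filter(r, omega, alpha, beta)
    return r ** 2 / (h + eps * r ** 2)
```

Under a correctly specified stationary GARCH(1,1), U_{i,t} behaves approximately as a scaled squared innovation, so a shift in any of the true parameters moves its mean.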
To capture any changes in the cross-sectional dependence structure of r_t or, more precisely, in any σ_{i,i′}(t) for i ≠ i′, we select a function g₂: ℝ^{2+2p+2q} → ℝ, where s_{i,i′} ∈ {1, −1} and, with a slight abuse of notation, we define a sign-preserving square root so that the sign of r_{i,t} is preserved as the sign of U^{1/2}_{i,t}. When g₁ meets the requirements in (A5), g₂ is also bounded and Lipschitz continuous in its squared arguments (see Appendix A.3 of the supplementary document). Similarly to (4), U_{ii′,t} is decomposed such that E(U_{ii′,t}) contains any change in σ_{i,i′}(t) as a change in its value.
By regarding U_{i,t} as empirical residuals obtained by applying volatility filters, Andreou and Ghysels (2003) proposed to search for structural changes in the co-movement of a pair of series through products of the form |U^{1/2}_{1,t} U^{1/2}_{2,t}|. Instead, we adopt U_{ii′,t}. Its formulation is motivated by an observation made in Cho and Fryzlewicz (2015): for given {a_t} and {b_t}, any changes in E(a²_t), E(b²_t) and E(a_t b_t) are detectable by jointly examining E(a²_t), E(b²_t) and E{(a_t − s_{a,b} b_t)²} for any s_{a,b} ∈ {1, −1}, regardless of its choice. We provide a guide on the empirical choice of s_{i,i′} in Section 3.3.
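The cross-sectional part of the transformation can be sketched as follows; this is our own simplified rendering of g₂, and the exact scaling in the paper may differ:

```python
import numpy as np

def transform_u_cross(u_i, r_i, u_j, r_j, s=1.0):
    """Combine the sign-preserving square roots of two transformed
    series U_i and U_j as (a_t - s * b_t)^2; changes in the
    cross-correlation of the pair shift the mean of the output."""
    a = np.sign(r_i) * np.sqrt(u_i)   # sign-preserving square root of U_i
    b = np.sign(r_j) * np.sqrt(u_j)
    return (a - s * b) ** 2
```

Following Section 3.3, s may in practice be set to the sign of the sample correlation between a and b over the segment under consideration.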
In addition, we can regard U_{ii′,t} on an equal footing with U_{i,t} and U_{i′,t}, which is essential in accomplishing the simultaneous segmentation of r_t according to both within-series and cross-sectional dependence structure. Under stationarity, the transformed series have constant means; upon further imposing Gaussianity on ε_{i,t}, all U_{i,t} and U_{ii′,t} follow a scaled χ²₁ distribution. Note that the same arguments do not apply to U^{1/2}_{i,t} U^{1/2}_{i′,t} or the other transformations studied in Andreou and Ghysels (2003).
In summary, we propose to detect any changes in Ω_i(t) and Θ_i(t), i = 1, …, N, simultaneously from the d-dimensional transformation of r_t given in (5), which serves as an input to the high-dimensional change-point detection algorithm we introduce in the next section.
Let {r^b_{i,t}}_{t=1}^T denote a stationary multivariate GARCH(p, q) process which follows (1)-(2) with the fixed parameters associated with the stationary segment [η_b + 1, η_{b+1}], and with the identical innovations ε_t having constant covariance matrix over that segment. Besides, denoting the index of the change-point strictly to the left of and nearest to t by v(t) = max{0 ≤ b ≤ B: η_b < t}, we define g̃_{i,t} = E(Ũ^{v(t)}_{i,t}) and g̃_{ii′,t} = E(Ũ^{v(t)}_{ii′,t}). Unlike E(U_{i,t}) or E(U_{ii′,t}), the thus-generated g̃_{i,t} and g̃_{ii′,t} are exactly constant between any two adjacent change-points.
Then, denoting each generic element of the input data in (5) by x_{j,t} (with j = (N − i/2)(i − 1) + i′ and U_{ii,t} = U_{i,t} for notational convenience), it admits the decomposition x_{j,t} = f_{j,t} + z_{j,t} in (6), where the f_{j,t} = g̃_{ii′,t} (with g̃_{ii,t} = g̃_{i,t}) are piecewise constant signals sharing the B change-points in Ω_i(t) and Θ_i(t), i = 1, …, N. In other words, at each η_b, there exists at least one j for which f_{j,t} undergoes a change. The remainder z_{j,t} does not satisfy E(z_{j,t}) = 0 exactly but, due to the mixing properties of U_{ii′,t} given in Proposition 1, its partial sums can be bounded by a term logarithmic in T.

Stage 2: Binary segmentation for tv-MGARCH processes
In this section, we introduce the second stage of the proposed methodology, which segments the d-dimensional transformation of r_t, using the notation given below the additive model in (6).

Double CUSUM binary segmentation
Cumulative sum (CUSUM) statistics have been widely adopted for change-point detection in both univariate and multivariate data. The series of CUSUM statistics computed on x_{j,t} over a generic segment [s, e], 1 ≤ s < e ≤ T, is defined as

X^j_{s,c,e} = √{(e − c)/((e − s + 1)(c − s + 1))} Σ_{t=s}^{c} x_{j,t} − √{(c − s + 1)/((e − s + 1)(e − c))} Σ_{t=c+1}^{e} x_{j,t},  c = s, …, e − 1.

Combined with the binary segmentation algorithm, the CUSUM statistic has frequently been adopted to segment univariate data; see Vostrikova (1981) and Venkatraman (1992) for the theoretical treatment of its application to the canonical additive model with independent noise, and Fryzlewicz and Subba Rao (2014) for its application to the segmentation of time-varying ARCH processes.
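In vectorised form, the CUSUM series over a segment can be computed as follows (helper name ours; 0-indexed, with both segment end points inclusive):

```python
import numpy as np

def cusum_series(x, s, e):
    """CUSUM statistics X_{s,c,e} for c = s, ..., e-1 on x[s..e]:
    a scaled difference between the left and right partial means,
    maximised in absolute value near a mean shift."""
    seg = x[s:e + 1].astype(float)
    n = len(seg)
    csum = np.cumsum(seg)
    k = np.arange(1, n)                     # size of the left part
    left = csum[:-1]
    right = csum[-1] - left
    return (np.sqrt((n - k) / (n * k)) * left
            - np.sqrt(k / (n * (n - k))) * right)
```

On a noiseless series with a single level shift, the maximiser of the absolute CUSUM coincides with the change-point.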
For change-point detection in multivariate data, we may apply change-point detection procedures for univariate data to each x_{j,t} separately, as suggested by Andreou and Ghysels (2003), and then prune down the estimated change-points cross-sectionally. However, identification of common change-points is a challenging task even for moderately large d. Moreover, such an approach does not exploit the cross-sectional structure of the change-points that may be shared by multiple coordinates. The importance of simultaneous change-point analysis in high-dimensional data has been recognised in recent years, and several methods have been proposed which are based on aggregating the high-dimensional CUSUM series X^j_{1,c,T}, typically by taking their point-wise maximum (referred to as MAX throughout the paper) or average (AVG); see Groen et al. (2013), Horváth and Hušková (2012) and Jirak (2015) for details.
We adopt the Double CUSUM Binary Segmentation (DCBS) algorithm proposed in Cho (2016). The Double CUSUM (DC) operator takes the CUSUM series X^j_{s,c,e} and returns a two-dimensional array of DC statistics D_{s,c,e}(n) in (7), for c = s, …, e − 1 and n = 1, …, d.
The resulting test statistic T_{s,e} is compared against a threshold π_{d,T} for determining the presence of a change-point over the interval [s, e]. If T_{s,e} > π_{d,T}, the location of the change-point is identified as the maximiser of the DC statistics over c. The Double CUSUM Binary Segmentation (DCBS) algorithm is formulated as below; the index l denotes the level (indicating the progression of the segmentation procedure) and m the location of the node at each level in the binary tree grown by the DCBS algorithm.

The Double CUSUM Binary Segmentation algorithm
Step 0 Set (l, m) = (1, 1), s_{l,m} = 1 and e_{l,m} = T.

Step 1 At the current level l, repeat the following for all m.

Step 1.1 Letting s = s_{l,m} and e = e_{l,m}, obtain the series of CUSUMs X^j_{s,c,e} for c ∈ [s, e) and j = 1, …, d, on which D_{s,c,e}(n) is computed over all c and n ∈ {1, …, d}.

Step 1.2 Obtain the test statistic T_{s,e} = max_{c ∈ [s,e)} max_{1 ≤ n ≤ d} D_{s,c,e}(n).

Step 1.3 If T_{s,e} ≤ π_{d,T}, quit searching for change-points on the interval [s, e]. On the other hand, if T_{s,e} > π_{d,T}, identify η̂ = arg max_{c ∈ [s,e)} max_{1 ≤ n ≤ d} D_{s,c,e}(n) and proceed to Step 1.4.

Step 2 Once [s_{l,m}, e_{l,m}] for all m are examined at level l, set l ← l + 1 and go to Step 1.
Step 1.3 furnishes a stopping rule for the DCBS algorithm; the search for further change-points terminates once T_{s,e} ≤ π_{d,T} on every segment [s, e] defined by two adjacent estimated change-points.
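The steps above can be sketched as below; note that the cross-sectional aggregation used here (scaled partial sums of the sorted |CUSUM| values) is a simplified stand-in for the exact Double CUSUM operator of Cho (2016), and the function names are ours:

```python
import numpy as np

def _cusum(x):
    """|CUSUM| values at every split point of a 1-d array x."""
    n = len(x)
    csum = np.cumsum(x.astype(float))
    k = np.arange(1, n)
    left, right = csum[:-1], csum[-1] - csum[:-1]
    return np.abs(np.sqrt((n - k) / (n * k)) * left
                  - np.sqrt(k / (n * (n - k))) * right)

def dcbs(X, threshold, s=0, e=None, found=None):
    """Binary segmentation sketch in the spirit of the DCBS algorithm.
    X is a d x T panel; on each segment [s, e] the |CUSUM|s of all d
    rows are sorted at every candidate split and their scaled partial
    sums aggregated over the cross-section; the segment is split where
    the aggregated statistic exceeds `threshold`, and the search
    recurses on the two sub-segments."""
    if e is None:
        e = X.shape[1] - 1
    if found is None:
        found = []
    if e - s < 2:
        return found
    d = X.shape[0]
    C = np.stack([_cusum(X[j, s:e + 1]) for j in range(d)])
    Csorted = np.sort(C, axis=0)[::-1]        # largest |CUSUM| first
    D = np.cumsum(Csorted, axis=0) / np.sqrt(np.arange(1, d + 1))[:, None]
    if D.max() > threshold:
        c = s + int(np.unravel_index(D.argmax(), D.shape)[1])
        found.append(c)
        dcbs(X, threshold, s, c, found)
        dcbs(X, threshold, c + 1, e, found)
    return sorted(found)
```

On a panel with a mean shift shared by several rows, a single call returns the estimated change-point locations.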

Theoretical properties of the DCBS algorithm
From the definition of g̃_{i,t} and g̃_{ii′,t}, the piecewise constant signals {f_{j,t}}_{t=1}^T, j = 1, …, d, contain the B change-points η_b, b = 1, …, B, shared by the vectors of piecewise constant parameters Ω_i(t) and Θ_i(t), i = 1, …, N. In other words, at each change-point η_b, there exists an index set Π_b ⊂ {1, …, d} of coordinates undergoing a change, with its cardinality satisfying m_b = |Π_b| ≥ 1. In order to establish the theoretical consistency of the DCBS algorithm, we impose the following conditions, chiefly on the quantities that control the detectability of each change-point η_b.
(B3) The number of change-points, B, is unknown and satisfies B ≤ log^ϑ T for some fixed ϑ > 0.

The condition in (B1) holds trivially as a consequence of (A5). In general, change-point detection becomes more challenging as the distance between two adjacent change-points decreases. This is reflected in (B2), which still allows the case where T^{-1}(η_{b+1} − η_b) → 0 as T → ∞. In conjunction with (B2), Assumption (B3) imposes a bound on the total number of change-points B, which is permitted to grow slowly with T at a logarithmic rate. Assumption (B4) dictates the minimum requirement on the cross-sectional size of the change for each change-point to be detected as well as located with accuracy. Note that it is not a trivial matter to gauge the magnitude of a jump in f_{j,t} = g̃_{ii′,t} brought in by the changes in Ω_i(t) or Θ_i(t), and hence (B4) is written directly in terms of Δ_{j,b} rather than in terms of the parameters involved in (1)-(2). A detailed discussion of the high-dimensional efficiency of the DC statistic, a tool studied in Aston and Kirch (2014) that allows the quantification and comparison of the power of different change-point tests, can be found in Cho (2016).
Let η̂_1 < … < η̂_{B̂} denote the change-points estimated by the DCBS algorithm with a test criterion π_{d,T}. Assume that (A1)-(A5) and (B1)-(B4) hold and that π_{d,T} satisfies C₁ d Δ^{-1}_{d,T} T^{5(1−γ)/2} log T < π_{d,T} < C₂ Δ_{d,T} T^{γ−1/2} for some constants C₁, C₂ > 0 and an exponent greater than 2ϑ, for ϑ in (B3). Then there exists c₀ > 0 such that the change-points are consistently estimated. From the condition imposed on the rate of Δ_{d,T} in (B4), it is easily seen that the localisation rate satisfies ε_T/T^γ → 0 as T → ∞. That is, in the re-scaled time interval [0, 1], the estimated change-points satisfy T^{-1}|η̂_b − η_b| ≤ T^{-γ}|η̂_b − η_b| → 0 for all b = 1, …, B. Defining optimality in change-point detection as when each true change-point and the corresponding estimated change-point are within a distance of O_p(1) of one another (see e.g. Korostelev (1987)), it is attained up to a logarithmic factor when the change-points are maximally spread, with γ = 1 and Δ̃_b ≍ 1, where a ≍ b indicates that a = O(b) and b = O(a).

Choice of parameters for transformation
While Theorem 1 holds for x_{j,t} resulting from any transformation g₁ meeting (A5), the empirical performance of the two-stage methodology is influenced by the choice of the coefficients C_{i,j}, j = 0, …, p+q. As in the references given at the beginning of Section 3.1, we may replace C_{i,j} with the maximum likelihood estimates (MLEs) of the GARCH parameters obtained under stationarity, say ω̂_i, α̂_{i,j}, 1 ≤ j ≤ p, and β̂_{i,k}, 1 ≤ k ≤ q.
The BASTA-res algorithm proposed by Fryzlewicz and Subba Rao (2014) is applied to a transformation of the univariate time-varying ARCH process which is formulated similarly to (3). They recommend the use of 'dampened' versions of the estimated GARCH parameters. In our setting, this leads to choosing C_{i,0} based on ω̂_i inflated by a within-series dampening parameter F_i ≥ 1. Empirically, the motivation behind the introduction of F_i is as follows. For r_{i,t} with time-varying parameters, we often observe that the α_{i,j} and β_{i,k} are over-estimated, such that Σ_{j=1}^p α̂_{i,j} + Σ_{k=1}^q β̂_{i,k} is close to, or even exceeds, one, while ω̂_i is under-estimated. Therefore, using the raw estimates in place of the C_{i,j} leads to U_{i,t} in which any change in E(U_{i,t}) induced by a change-point in Ω_i(t) tends to be small. Hence we adopt the dampening parameter F_i which, by construction, is bounded as F_i ∈ [1, 99] and brings ω̂_i and Σ_{j=1}^p α̂_{i,j} + Σ_{k=1}^q β̂_{i,k} to approximately the same scale.
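The exact expression for F_i is not reproduced in the text above; the following is one plausible form, stated purely as an assumption, that is consistent with the stated properties (F_i ∈ [1, 99], matching the scales of ω̂_i and Σα̂ + Σβ̂):

```python
def dampening_factor(omega_hat, alpha_hat, beta_hat):
    """One plausible (hypothetical) form of the dampening parameter
    F_i: scale omega_hat so that F_i * omega_hat is comparable to
    sum(alpha_hat) + sum(beta_hat), clipped to [1, 99]."""
    ratio = (sum(alpha_hat) + sum(beta_hat)) / max(omega_hat, 1e-12)
    return min(max(ratio, 1.0), 99.0)
```

For a typical near-integrated fit (small ω̂, α̂ + β̂ close to one), F_i is large, re-inflating the intercept before the transformation is applied.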
Also, g₁ involves the unobservable conditional variance h_{i,t}, which we propose to replace with its fitted counterpart computed recursively from the estimated parameters. Typically, the GARCH orders p and q are unknown. We propose to use (p, q) = (1, 1): the GARCH(1, 1) model is very simple yet is known to provide a good fit and accurate predictions for a wide range of datasets (see e.g. Hansen and Lunde (2005)). We also note that our framework allows additional flexibility through the time-varying parameters.
In simulation studies reported in Section 4, we study the effect of misspecifying the GARCH orders on change-point analysis, which shows that the choice of (p, q) = (1, 1) works well even when it under-specifies the GARCH orders.
Theoretically, both choices of s_{i,i′} ∈ {1, −1} within g₂ are permitted, as any changes in σ_{i,i′}(t) are detectable from E{(U^{1/2}_{i,t} − s_{i,i′} U^{1/2}_{i′,t})²} with either value of s_{i,i′}. However, the choice may influence the empirical performance, in the sense that a certain value results in a transformed series which exposes the structural changes better. Based on the observations made in Cho and Fryzlewicz (2015), we favour the choice s_{i,i′} = sign(corr̂_{s,e}(U^{1/2}_{i,t}, U^{1/2}_{i′,t})), where corr̂_{s,e}(·, ·) denotes the sample correlation computed over a segment [s, e].

Choice of threshold for DCBS algorithm
Theorem 1 provides a range for the rate of π_{d,T} that returns consistent estimation of multiple change-points. However, this theoretical range involves knowledge of quantities such as γ or Δ_{d,T} that is typically unavailable and, even when such knowledge is available, finite sample performance may be affected by the choice of the multiplicative constant applied to the given rate.
Instead, we propose a parametric resampling procedure that enables us to approximate the distribution of T s,e in the presence of no change-points. A similar approach was adopted in the bootstrap tests proposed by Kokoszka and Teyssière (2002) in the context of testing the presence of a change-point in the parameters of univariate GARCH models.
Bootstrap algorithm for threshold selection.
Step 1: Fit a GARCH(p, q) model to each {r_{i,t}}_{t=1}^T and estimate the p + q + 1 GARCH parameters for each i = 1, …, N. As suggested in Section 3.3, we use p = q = 1 in our numerical studies.
Step 4: From the ℓth bootstrap realisation {r^{(ℓ)}_{i,t}, 1 ≤ i ≤ N; 1 ≤ t ≤ T}, we obtain T^{(ℓ)}_{s,e} in the same manner as T_{s,e}, with [s, e] denoting an interval that is examined at some iteration of the DCBS algorithm. For a given significance level α ∈ [0, 1], our choice of π_{d,T} is the 100(1 − α)%-quantile of T^{(ℓ)}_{s,e}, ℓ = 1, …, R.
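The resampling scheme can be sketched as below; the intermediate steps (simulation of stationary GARCH(1,1) panels from the fitted parameters) are compressed into the inner loop, and all names are ours:

```python
import numpy as np

def bootstrap_threshold(test_stat_fn, params, T, R=100, alpha=0.05, seed=0):
    """Parametric bootstrap for the threshold pi_{d,T}: simulate R
    stationary GARCH(1,1) panels (no change-points) from the fitted
    per-series parameters, recompute the test statistic on each, and
    return its (1 - alpha)-quantile.  `test_stat_fn` maps an N x T
    panel to a scalar; `params` is a list of (omega, alpha, beta)."""
    rng = np.random.default_rng(seed)
    N = len(params)
    stats = np.empty(R)
    for rep in range(R):
        panel = np.empty((N, T))
        for i, (w, a, b) in enumerate(params):
            h = w / max(1.0 - a - b, 1e-6)   # start at unconditional variance
            for t in range(T):
                panel[i, t] = np.sqrt(h) * rng.standard_normal()
                h = w + a * panel[i, t] ** 2 + b * h
        stats[rep] = test_stat_fn(panel)
    return float(np.quantile(stats, 1.0 - alpha))
```

Any scalar statistic computable on a panel (e.g. the maximal DC statistic of Stage 2) can be plugged in as `test_stat_fn`.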

4 Simulation study

4.1 Models
We study the change-point detection performance of the two-stage methodology described in Section 3 on datasets simulated from the following models.

(M0) Stationary MGARCH(1, 1) processes. Let ω_i = ω + δ_{ω,i}, α_{i,1} = α + δ_{α,i} and β_{i,1} = β + δ_{β,i}, where the δ·,i ~ iid U(−Δ, Δ), for some small Δ > 0, are added to each GARCH parameter so that every r_{i,t} has a slightly different set of GARCH parameters. The innovations are generated from two different distributions, namely (i) ε_t ~ iid N(0, Σ) with σ_{i,i′} = ρ^{|i−i′|} and ρ = −0.75, and (ii) ε_{i,t} ~ iid t₁₀ for each i and t; in both cases T = 1000.
(M3) Misspecification of the orders p and q.
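The baseline design (M0) can be sketched as follows, with the baseline values of (ω, α, β), Δ and the dimensions chosen purely for illustration:

```python
import numpy as np

def simulate_m0(N=10, T=500, omega=0.1, alpha=0.1, beta=0.8,
                delta=0.02, rho=-0.75, seed=0):
    """Simulate the stationary panel (M0): each series is GARCH(1,1)
    with parameters jittered by Uniform(-delta, delta), and the
    innovations are jointly Gaussian with Sigma_{ii'} = rho^|i-i'|.
    Returns a T x N array of returns."""
    rng = np.random.default_rng(seed)
    w = omega + rng.uniform(-delta, delta, N)
    a = alpha + rng.uniform(-delta, delta, N)
    b = beta + rng.uniform(-delta, delta, N)
    idx = np.arange(N)
    Sigma = rho ** np.abs(idx[:, None] - idx[None, :])
    eps = rng.multivariate_normal(np.zeros(N), Sigma, size=T)
    r = np.empty((T, N))
    h = w / (1 - a - b)                 # start at unconditional variances
    for t in range(T):
        r[t] = np.sqrt(h) * eps[t]
        h = w + a * r[t] ** 2 + b * h
    return r
```

Replacing the Gaussian draws with scaled t₁₀ variates yields design (ii).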

Results
Conducting only the first iteration of the binary segmentation procedure in Stage 2 of the proposed methodology, we performed the at-most-one-change test on the data simulated from (M0)-(M1) over 100 realisations, which enables us to investigate its size and power behaviour, respectively, along with accuracy in change-point estimation; see Tables 1-5.
When the innovations are generated from a Gaussian distribution with cross-correlations, the test manages to keep the size below the significance level α = 0.05 when N = 50, while spurious false alarms are observed when N = 100 for (M0.2). We note that, although not directly comparable, Table 1 of Fryzlewicz and Subba Rao (2014) observed similar size behaviour from their BASTA-avg2, as well as from the change-point test of Andreou and Ghysels (2002), for univariate GARCH processes generated with the same GARCH parameters as in (M0.2). When the innovations are drawn as i.i.d. random variables from a t₁₀-distribution, the parameter configuration of (M0.2) brings in greater size distortion.
As expected, the test achieves higher power, and better change-point location accuracy, when the change-point is cross-sectionally 'dense' (i.e., when a larger proportion of the series undergo the change). For most GARCH parameter configurations, the test attains power above 0.9 even when the change-point is relatively sparse (proportion 0.25), with the exceptions of (M1.2), (M1.4) and (M1.8). Interestingly, the univariate change-point tests adopted in Table 1 of Fryzlewicz and Subba Rao (2014) for comparative simulation studies performed uniformly worse for (M1.1) and (M1.5) compared to the other models, for which our test performs well. We also note that even when the location of the change-point is skewed (η₁ = [0.9T]), our method generally attains high power and location accuracy provided the change-point is not too sparse, in most settings where it shows good performance when the change-point is centrally located (η₁ = [T/2]).

5 Real data analysis
In this section, we examine the effect of ignoring possible changes in the volatility of a multi-asset portfolio. We consider Value-at-Risk (VaR), a widely used measure of market risk embraced by financial institutions for regulatory or other internal purposes, which measures the extreme loss (change in value) of an asset or a portfolio of assets at a prescribed probability level during a given holding period. VaR has been criticized due to its unrealistic assumptions (linearity and normality), parameter sensitivity (to estimation and holding periods) and its inadequacy during crises, especially when changing correlation between assets is observed (Persaud, 2000; Danielsson, 2002). The last point is of particular interest since, compared to periods of market stability, correlations are expected to be significantly higher when markets are falling, as noted by Jäckel and Rebonato (2001). We propose to examine our method by means of a stressed VaR (sVaR), mainly to highlight the possible flaws of this regulatory proposal to monitor the risks of financial institutions.
The Basel Accord (1996 Amendment) requires the use of sVaR, which is based on a covariance matrix from a crisis period in the past. The accord does not specify the exact time period to be used but instead proposes the judgement-based and the formulaic approaches (European Banking Authority, 2012). The former relies on a high-level analysis of the risks related to the holding portfolio, while the latter is a more systematic, quantitative approach. Hence, in addition to a judgement-based approach, information on the stressed past periods provides more robust risk management, and our proposed methodology can serve as a formulaic approach.
We first introduce the VaR metric, which can be formulated as VaR_t(α) = F^{-1}(α | G_t), where F^{-1}(· | G_t) is the quantile function of the loss and profit distribution. This distribution changes over time due to market and portfolio specific conditions, which is embodied in G_t.
The accuracy of a VaR model is tested by examining the null hypothesis that the observed probability of violation over a specified horizon is equal to the nominal level; rejection signals concerns. Typically, a financial institution should backtest its VaR by means of statistical tests before reporting it to the interested parties, e.g., stakeholders and regulators. Kupiec (1995) proposes two methods to accomplish this, the Proportion of Failures (PoF) and the Time until First Failure (TFF). For a sample of T observations, the Kupiec test statistics take the form of likelihood ratios

LR_PoF = −2 log[(1 − p)^{T − x_f} p^{x_f} / {(1 − x_f/T)^{T − x_f} (x_f/T)^{x_f}}],
LR_TFF = −2 log[p(1 − p)^{t_f − 1} / {(1/t_f)(1 − 1/t_f)^{t_f − 1}}],

where x_f denotes the number of failures among the T observations, t_f the number of days until the first failure, and p the violation probability implied by the VaR level α. Under H₀, both LR_PoF and LR_TFF are asymptotically χ²₁-distributed, and their exceedance of the critical value implies that the VaR model is inadequate. We use these two backtests to examine the performance of our method in the context of stressed VaR.
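Both Kupiec statistics are straightforward to compute; in the sketch below, p denotes the violation probability implied by the chosen VaR level, and the function names are ours:

```python
import math

def _xlogy(x, y):
    """x * log(y) with the convention 0 * log(0) = 0."""
    return 0.0 if x == 0 else x * math.log(y)

def kupiec_pof(x_f, T, p):
    """Kupiec Proportion-of-Failures LR statistic for x_f violations
    in T observations against violation probability p; asymptotically
    chi-squared with 1 df under H0."""
    phat = x_f / T
    log_h0 = _xlogy(T - x_f, 1 - p) + _xlogy(x_f, p)
    log_h1 = _xlogy(T - x_f, 1 - phat) + _xlogy(x_f, phat)
    return -2.0 * (log_h0 - log_h1)

def kupiec_tff(t_f, p):
    """Kupiec Time-until-First-Failure LR statistic for a first
    violation on day t_f against violation probability p."""
    phat = 1.0 / t_f
    log_h0 = math.log(p) + _xlogy(t_f - 1, 1 - p)
    log_h1 = math.log(phat) + _xlogy(t_f - 1, 1 - phat)
    return -2.0 * (log_h0 - log_h1)
```

A value above the χ²₁ critical value 3.84 (at the 5% level) indicates an inadequate VaR model.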
We collect the daily US Treasury zero-coupon yield curves from the US Federal Reserve Board, which are based on a large set of outstanding bonds and notes with maturities from 1 to 30 years, as well as the daily prices of the S&P500 index; see Figure 1. We note that the unconditional correlation between the 30 time series ranges from 0.787 to 0.994. Thirty log-differenced Treasury price series, along with the log-returns of the S&P500 index, from 1 January 2000 to 31 December 2014, serve as an input to our methodology for change-point detection. For validation of our method, we also collect the same data from 1 January 2015 to 31 December 2016.
Our method, using the default parameters described in Section 3.3, gives three change-points as reported in Table 10. We then estimate a DCC model (Engle, 2002) on each segment defined by two consecutive dates, as well as the sample covariance Σ̂_b of the residuals ε̂_t from the fitted DCC models, 1 ≤ b ≤ 4, where b denotes the index of the stationary period (patch).
We now take a simple scenario: almost a year (250 days) after the end of the initial sample (i.e., on 1 January 2016), a financial institution wishes to report the sVaR of its equally weighted asset portfolio for the year ahead on a daily basis. At the first stage, it calculates the weighted average portfolio return vector r_t and its (sample) covariance matrix Σ̂₅ for Period 5, spanning 1 January 2015 to 31 December 2015, which would typically be employed in the standard VaR measure. We then validate our method by forecasting one-day-ahead sVaR using a 250-day rolling window sample. We denote the rolling returns by r_{t+ĩ} and the rolling (sample) covariance matrix by Σ̂^ĩ₅, for ĩ = 0, …, 250; obviously, when ĩ = 0, Σ̂⁰₅ := Σ̂₅.
The question we try to answer is: how will the portfolio sVaR behave using a covariance matrix from different patches in the past?
To calculate the sVaR, a transformation of the following form (Duffie and Pan, 1997) is applied:
$$r_{b,t+\tilde{i}} = L_b L_{\tilde{i}}^{-1} r_{t+\tilde{i}},$$
where $r_{b,t+\tilde{i}}$ is the return vector endowed with covariance matrix $\hat{\Sigma}_b$, $L_b$ is the Cholesky decomposition of $\hat{\Sigma}_b$ and $L_{\tilde{i}}$ the Cholesky decomposition of the rolling $\hat{\Sigma}_5^{\tilde{i}}$. At the confidence levels $\alpha = 0.95$ and $0.99$, we calculate the rolling empirical quantiles of $r_{b,t+\tilde{i}}$ in order to calculate the sVaR using the four periods identified by our method, see Figure 2 for $\alpha = 0.99$.
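Assuming both covariance matrices are positive definite, the rescaling of Duffie and Pan (1997) can be sketched as follows (function and variable names are illustrative):

```python
import numpy as np

def stress_transform(r, cov_rolling, cov_stress):
    """Rescale a return vector r whose covariance is cov_rolling so
    that it carries the covariance cov_stress of a stressed period:
        r_b = L_b @ inv(L_i) @ r,
    where L_b and L_i are the lower Cholesky factors of the stressed
    and rolling covariance matrices, respectively."""
    L_b = np.linalg.cholesky(cov_stress)
    L_i = np.linalg.cholesky(cov_rolling)
    # solve(L_i, r) applies inv(L_i) without forming the inverse
    return L_b @ np.linalg.solve(L_i, r)
```

The construction works because the linear map $A = L_b L_{\tilde{i}}^{-1}$ sends the rolling covariance to $A \hat{\Sigma}_5^{\tilde{i}} A^\top = L_b L_b^\top = \hat{\Sigma}_b$, so the transformed returns inherit the stressed period's covariance; the sVaR is then the empirical quantile of the portfolio returns built from the transformed vectors.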
The results in Table 10 indicate that the most stressed VaR (found by calculating the quantiles of $r_{b,t+\tilde{i}}$ for $\tilde{i} = 0$) was obtained from the period spanning 11 September 2008 to 26 March 2009, a rather expected outcome given the high volatility that the markets experienced after the bankruptcy of Lehman Brothers in September 2008. We note that this period spans less than 12 months, shorter than the maximum duration allowed by the Basel Accord, and is thus a valid choice for an sVaR. It also coincides with the Bank of England's view on the historical periods, per region, with the worst market moves (Bank of England, 2016). Finally, it is worth noting that Period 4 exhibits 'high stress' characteristics, albeit less severe than those from the aftermath of the Lehman Brothers collapse.
Having the stressed periods in hand, we turn to the one-day-ahead sVaR forecasting exercise. The PoF and TFF tests are used to statistically evaluate the sVaR outputs. For PoF, the null hypothesis over a 250-day period corresponds to 12 and 2 expected violations of the sVaR at the 95% and 99% levels, respectively. An sVaR model is deemed adequate if the null hypothesis cannot be rejected.
Similarly, for TFF, the null hypothesis corresponds to 20 and 125 days until the first violation occurs at the 95% and 99% levels, respectively. At the 99% level, Periods 3 and 4 yield statistically adequate sVaR measures, in contrast to Periods 1 and 2 (see also Figure 2), whereas, opting for the less 'conservative' 95% sVaR, Periods 2 and 4 are statistically adequate. Hence, the choice of the most stressed period boils down to the regulator, who will decide the sVaR level.
It should be mentioned that adopting the 'traffic light' approach of the Basel Accord (Basle Committee on Banking Supervision, 1996) reveals Period 3 as the most appropriate input to the sVaR measure at both the 95% and 99% levels. According to this approach, a 99% (95%) VaR model is in the green zone, and is therefore deemed adequate, when 0 to 4 (0 to 17) violations occur in a period of 250 days. In the PoF setting, by contrast, any number of violations smaller than expected is treated in the same way as any number larger than expected, since the null hypothesis of the PoF test is two-sided.
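The green-zone check implied by the quoted thresholds can be written down directly. A minimal sketch using only the figures stated above; since the yellow/red boundaries of the full traffic-light scheme are not given here, we only flag green versus not green:

```python
def in_green_zone(violations, alpha):
    """Basel 'traffic light' green-zone check over a 250-day window,
    using the thresholds quoted in the text: a 99% (95%) VaR model is
    in the green zone, i.e. deemed adequate, with at most 4 (17)
    violations."""
    limit = {0.99: 4, 0.95: 17}[alpha]
    return violations <= limit
```

For instance, the 2 violations expected under an adequate 99% model fall comfortably inside the green zone, while 5 or more violations push the model out of it.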

Conclusions
In this paper, we propose a methodology for detecting multiple change-points in both within-series and cross-correlation structures of multivariate volatility processes. The two-stage methodology first transforms the $N$-dimensional series so that the structural change-points are detectable as change-points in the means of the $N(N+1)/2$-dimensional transformed data, which serves as an input to the multiple change-point detection algorithm in the second stage.
We show the consistency of the methodology in terms of the total number and locations of estimated change-points, and propose a bootstrap procedure for threshold selection, which shows reasonably good performance in our simulation studies. Finally, we apply the proposed method to demonstrate its usage in identifying periods of 'stress' in a multivariate financial dataset consisting of thirty US Treasury zero-coupon yield curves and the S&P500 index. This exercise indicates the efficacy of our methodology in risk management tasks, such as when calculating the (stressed) Value-at-Risk of a portfolio of risky assets.