Dynamic factors in the presence of blocks

https://doi.org/10.1016/j.jeconom.2010.11.004

Abstract

Macroeconometric data often come in the form of large panels of time series, which themselves decompose into smaller but still quite large subpanels or blocks. We show how the dynamic factor analysis method proposed in Forni et al. (2000), combined with the identification method of Hallin and Liška (2007), allows for identifying and estimating joint and block-specific common factors. This leads to a more sophisticated analysis of the structures of dynamic interrelations within and between the blocks in such datasets, along with an informative decomposition of explained variances. The method is illustrated with an analysis of a dataset of Industrial Production Indices for France, Germany, and Italy.

Introduction

In many fields (macroeconometrics, finance, environmental sciences, chemometrics, ...), information comes in the form of a large number of observed time series or panel data. Panel data consist of series of observations (length T) made on n individuals or “cross-sectional items” that have been put together on purpose, mainly because they carry some information about some common feature or unobservable process of interest, or are expected to do so. This “commonness” is a distinctive feature of panel data: mutually independent cross-sectional items, in that respect, do not constitute a panel (or, at best, a degenerate one). Cross-sectional heterogeneity is another distinctive feature of panels: n (possibly non-independent) replications of the same time series would be another form of degeneracy. Moreover, the impact of item-specific or idiosyncratic effects, which play the role of a nuisance, very often dominates, quantitatively, that of the common features one is interested in.

Finally, all individuals in a panel are exposed to the influence of unobservable or unrecorded covariates, which create complex interdependencies, both in the cross-sectional and in the time dimension. These interdependencies cannot be modelled explicitly, as this would require questionable modelling assumptions and a prohibitive number of nuisance parameters. They may affect all (or almost all) items in the panel, in which case they are “common”; they also may be specific to a small number of items, hence “idiosyncratic”.

The idea of separating “common” and “idiosyncratic” effects is thus at the core of panel data analysis. The same idea is the cornerstone of factor analysis. It is thus hardly surprising to see a time series version of factor analysis emerging as a powerful tool in the context of panel data. This dynamic version of factor models, however, requires adequate definitions of “common” and “idiosyncratic”. These definitions should not simply allow for identifying the decomposition of the observations into a “common” component and an “idiosyncratic” one, but should also provide an adequate translation of the intuitive meanings of “common” and “idiosyncratic”.

Denote by $X_{it}$ the observation of item $i$ ($i = 1, \ldots, n$) at time $t$ ($t = 1, \ldots, T$); the factor model decomposition of this observation takes the form $X_{it} = \chi_{it} + \xi_{it}$, $i = 1, \ldots, n$, $t = 1, \ldots, T$, where the common component $\chi_{it}$ and the idiosyncratic one $\xi_{it}$ are mutually orthogonal (at all leads and lags) but unobservable. Some authors identify this decomposition by requiring the idiosyncratic components to be “small” or “negligible”, as in dimension reduction techniques. Some others require that the $n$ idiosyncratic processes be mutually orthogonal white noises. Such characterizations do not reflect the fundamental nature of factor models: idiosyncratic components indeed can be “large” and strongly autocorrelated, while white noise can be common. For instance, in a model of the form $X_{it} = \chi_t + \xi_{it}$, where $\chi_t$ is white noise and orthogonal to $\xi_{it} = \varepsilon_{it} + a_i \varepsilon_{i,t-1}$, with i.i.d. $\varepsilon_{it}$’s, the white noise component $\chi_t$, which is present in all cross-sectional items, very much qualifies as being “common”, while the cross-sectionally independent autocorrelated $\xi_{it}$’s, being item-specific, exhibit all the attributes one would like to see in an “idiosyncratic” component.
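The example closing the paragraph above can be checked numerically. The following is a minimal simulation sketch, not taken from the paper: the panel size, the sample length, the Gaussian shocks and the range of the coefficients $a_i$ are all assumptions chosen for illustration. It shows a common component with essentially no autocorrelation and idiosyncratic components that are clearly autocorrelated but nearly uncorrelated across items.

```python
import numpy as np

def simulate_panel(n=50, T=2000, seed=0):
    """Simulate X_it = chi_t + xi_it with xi_it = eps_it + a_i * eps_{i,t-1}."""
    rng = np.random.default_rng(seed)
    chi = rng.standard_normal(T)                # common white noise, shared by all items
    a = rng.uniform(0.5, 0.9, size=n)           # item-specific MA(1) coefficients (assumed range)
    eps = rng.standard_normal((n, T + 1))       # i.i.d. shocks, one extra column for the lag
    xi = eps[:, 1:] + a[:, None] * eps[:, :-1]  # idiosyncratic MA(1), independent across items
    return chi, xi, chi[None, :] + xi

def lag1_autocorr(x):
    """Sample lag-1 autocorrelation of a one-dimensional series."""
    x = x - x.mean()
    return float(np.dot(x[1:], x[:-1]) / np.dot(x, x))

def mean_abs_offdiag_corr(rows):
    """Average absolute cross-sectional correlation between distinct rows."""
    c = np.corrcoef(rows)
    mask = ~np.eye(c.shape[0], dtype=bool)
    return float(np.abs(c[mask]).mean())

chi, xi, X = simulate_panel()
common_ac = lag1_autocorr(chi)                            # close to zero: white noise
idio_ac = float(np.mean([lag1_autocorr(r) for r in xi]))  # clearly positive: MA(1)
idio_cross = mean_abs_offdiag_corr(xi)                    # close to zero: item-specific
panel_cross = mean_abs_offdiag_corr(X)                    # sizeable: induced by chi_t

print(f"lag-1 autocorrelation, common:        {common_ac:.3f}")
print(f"lag-1 autocorrelation, idiosyncratic: {idio_ac:.3f}")
print(f"mean |cross-correlation|, xi:         {idio_cross:.3f}")
print(f"mean |cross-correlation|, X:          {panel_cross:.3f}")
```

In this toy setting, neither “smallness” nor whiteness separates the two components; only the cross-sectional pervasiveness of $\chi_t$ does.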

A possible characterization of commonness/idiosyncrasy is obtained by requiring the common component to account for all cross-sectional correlations, leading to possibly autocorrelated but cross-sectionally orthogonal idiosyncratic components. This yields the so-called “exact factor models” considered, for instance, by Sargent and Sims (1977) and Geweke (1977). These exact models, however, are too restrictive in most real life applications, where it often happens that two (or a small number of) cross-sectional items, being neighbours in some broad sense, exhibit cross-sectional correlation also in variables that are orthogonal, at all leads and lags, to all other observations throughout the panel. A “weak” or “approximate factor model”, allowing for mildly cross-sectionally correlated idiosyncratic components, therefore also has been proposed (Chamberlain, 1983; Chamberlain and Rothschild, 1983), in which, however, the common and idiosyncratic components are only asymptotically (as $n \to \infty$) identified. Under its most general form, the characterization of idiosyncrasy, in this weak factor model, can be based on the behavior, as $n \to \infty$, of the eigenvalues of the spectral density matrices of the unobservable idiosyncratic components, but also (Forni and Lippi, 2001) on the asymptotic behavior of the eigenvalues of the spectral density matrices of the observations themselves: see Section 2 for details. This general characterization is the one we are adopting here.
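To give a feel for what such an eigenvalue-based characterization looks like, here is a small sketch for a toy model with one common white noise and item-specific MA(1) idiosyncratic components with coefficients $a_i$. The model, the unit variances, the range of the $a_i$ and the frequency $\theta = 0.5$ are assumptions made for illustration; the precise conditions used in the paper are those of its Section 2, which is not reproduced here. In this toy case the population spectral density matrix is $\frac{1}{2\pi}\left[\mathbf{1}\mathbf{1}' + \mathrm{diag}(1 + a_i^2 + 2 a_i \cos\theta)\right]$, so its largest eigenvalue grows with $n$ while the second one stays bounded.

```python
import numpy as np

def spectral_density(theta, a):
    """Population spectral density at frequency theta of the toy model
    X_it = chi_t + eps_it + a_i * eps_{i,t-1}, all shocks with unit variance."""
    n = len(a)
    common = np.ones((n, n))                               # rank one: chi_t loads on every item
    idio = np.diag(1.0 + a**2 + 2.0 * a * np.cos(theta))   # MA(1) spectra on the diagonal
    return (common + idio) / (2.0 * np.pi)

def top_two_eigenvalues(n, theta=0.5, seed=0):
    """Largest and second largest eigenvalues for a panel of size n."""
    rng = np.random.default_rng(seed)
    a = rng.uniform(0.5, 0.9, size=n)                      # assumed range of MA coefficients
    eig = np.linalg.eigvalsh(spectral_density(theta, a))   # returned in ascending order
    return float(eig[-1]), float(eig[-2])

growth = {n: top_two_eigenvalues(n) for n in (10, 100, 400)}
for n, (lam1, lam2) in growth.items():
    print(f"n = {n:4d}   first eigenvalue = {lam1:8.3f}   second eigenvalue = {lam2:6.3f}")
```

The diverging first eigenvalue reflects the single common shock; the bounded remainder reflects the idiosyncratic part.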

Finally, once the common and idiosyncratic components are identified, two types of factor models can be found in the literature, depending on the way factors are driving the common components. In static factor models, it is assumed that common components are of the form

$$\chi_{it} = \sum_{l=1}^{q} b_{il} f_{lt}, \quad i = 1, \ldots, n, \; t = 1, \ldots, T, \tag{1}$$

that is, the $\chi_{it}$’s are driven by $q$ factors $f_{1t}, \ldots, f_{qt}$ which are loaded instantaneously. This static approach is the one adopted by Chamberlain (1983), Chamberlain and Rothschild (1983), Stock and Watson (1989, 2002a, 2002b, 2005), Bai and Ng (2002, 2007), and a large number of applied studies. The so-called general dynamic model decomposes common components into

$$\chi_{it} = \sum_{l=1}^{q} b_{il}(L) u_{lt}, \quad i = 1, \ldots, n, \; t = 1, \ldots, T, \tag{2}$$

where $u_{1t}, \ldots, u_{qt}$, the unobservable common shocks, are loaded via one-sided linear filters $b_{il}(L)$. That “fully dynamic” approach (the terminology is not unified and the adjective “dynamic” is often used in an ambiguous way) goes back, under exact factor form, to Chamberlain (1983) and Chamberlain and Rothschild (1983), but was developed, mainly, by Forni et al. (2000, 2003, 2004, 2005, 2009), Forni and Lippi (2001) and Hallin and Liška (2007).
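The difference between instantaneous and filtered loading can be made concrete with a short sketch. The function names, the finite-order filters and the convention that pre-sample shocks are zero are assumptions made for illustration only; in the general dynamic model the filters $b_{il}(L)$ need not be of finite order.

```python
import numpy as np

def static_common(B, f):
    """Static loading: chi_it = sum_l b_il * f_lt, with B of shape (n, q) and f of shape (q, T)."""
    return B @ f

def dynamic_common(B_lags, u):
    """Dynamic loading: chi_it = sum_l b_il(L) u_lt, with b_il(L) = sum_k B_lags[k, i, l] L^k.

    B_lags has shape (K + 1, n, q), u has shape (q, T), and K < T is assumed.
    Shocks before the start of the sample are taken to be zero (an assumption).
    """
    n_lags, n, _ = B_lags.shape
    T = u.shape[1]
    chi = np.zeros((n, T))
    for k in range(n_lags):
        lagged = np.zeros_like(u)
        lagged[:, k:] = u[:, : T - k]     # L^k u_t = u_{t-k}
        chi += B_lags[k] @ lagged
    return chi

rng = np.random.default_rng(0)
n, q, T = 4, 2, 10
B0 = rng.standard_normal((n, q))          # lag-0 loadings
B1 = rng.standard_normal((n, q))          # lag-1 loadings
u = rng.standard_normal((q, T))           # common shocks

chi_static = static_common(B0, u)
chi_dynamic = dynamic_common(np.stack([B0, B1]), u)
print(np.allclose(dynamic_common(B0[None], u), chi_static))
print(np.allclose(chi_dynamic[:, 3], B0 @ u[:, 3] + B1 @ u[:, 2]))
```

With a single lag-0 coefficient matrix the dynamic form collapses to the static one, which is the sense in which instantaneous loading is a special case of filtered loading.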

The static model (1) clearly is a particular case of the general dynamic one (2). Its main advantage is simplicity. On the other hand, both models share the same assumption on the asymptotic behavior of spectral eigenvalues, the plausibility of which is confirmed by empirical evidence. But the static model (1) places an additional and rather severe restriction on the data-generating process, while the dynamic one (2), as shown by Forni and Lippi (2001), does not; we refer to Section 2 for details. Moreover, the synchronization of clocks and calendars across the panel is often quite approximate, so that the concept of “instantaneous loading” itself may be questionable.

Both the static and the general dynamic models are receiving increasing attention in finance and macroeconometric applications where information usually is scattered through a (very) large number n of interrelated time series (n values of the order of several hundreds, or even one thousand, are not uncommon). Classical multivariate time series techniques are totally helpless in the presence of such values of n, and factor model methods, to the best of our knowledge, are the only ones that can handle such datasets. In macroeconomics, factor models are used in business cycle analysis (Forni and Reichlin, 1998; Giannone et al., 2006), in the identification of economy-wide and global shocks, in the construction of indices and forecasts exploiting the information scattered in a huge number of interrelated series (Altissimo et al., 2001), in the monitoring of economic policy (Giannone et al., 2005), and in monetary policy applications (Bernanke and Boivin, 2003; Favero et al., 2005). In finance, factor models are at the heart of the extensions proposed by Chamberlain and Rothschild (1983) and Ingersoll (1984) of the classical arbitrage pricing theory; they also have been considered in performance evaluation and risk measurement (Chapters 5 and 6 of Campbell et al., 1997), and in the statistical analysis of the structure of stock returns (Yao, 2008).

In recent years, factor models also have generated a huge amount of applied work: see d’Agostino and Giannone (2005), Artis et al. (2005), Bruneau et al. (2007), Den Reijer (2005), Dreger and Schumacher (2004), Schumacher (2007), Nieuwenhuyzen (2004), Schneider and Spitzner (2004), Giannone and Matheson (2007), and Stock and Watson (2002b) for applications to data from UK, France, the Netherlands, Germany, Belgium, Austria, New Zealand, and the US, respectively; Altissimo et al. (2001), Angelini et al. (2001), Forni et al. (2003), and Marcellino et al. (2003) for the Euro area and Aiolfi et al. (2006) for South American data, to quote only a few. Dynamic factor models also have entered the practice of a number of economic and financial institutions, including several central banks and national statistical offices, which are using them in their current analysis and prediction of economic activity. A real time coincident indicator of the euro area business cycle (EuroCOIN), based on Forni et al. (2000), is published monthly by the London-based Centre for Economic Policy Research and the Banca d’Italia: see http://www.cepr.org/data/EuroCOIN/. A similar index, based on the same methods, is established for the US economy by the Federal Reserve Bank of Chicago.

Although heterogeneous, panel data very often are obtained by pooling together several “blocks” which themselves constitute “large” subpanels. In macroeconometrics, for instance, data typically are organized either by country or sectoral origin: the database which is used in the construction of EuroCOIN, the monthly indicator of the euro area business cycle published by CEPR, includes almost 1000 time series that cover six European countries and are organized into eleven subpanels including industrial production, producer prices, monetary aggregates, etc. Depending on the objectives of the analysis, such a panel could be divided into six blocks (one for each country), or into eleven blocks (one for each subpanel). When these blocks are large enough, several dynamic factor models can be considered and analyzed, allowing for a refined analysis of interblock relations. In the simple two-block case, “marginal common factors” can be defined for each block, and need not coincide with the “joint common factors” resulting from pooling the two blocks.

The objective of this paper is to provide a theoretical basis for that type of analysis. For simplicity, we start with the simple case of two blocks. We show (Section 2) how a factorization of the Hilbert space spanned by the n observed series leads to a decomposition of each of them into four mutually orthogonal (at all leads and lags) components: a strongly idiosyncratic one, a strongly common one, a weakly common one, and a weakly idiosyncratic one. In Sections 3 and 4, we show how projections onto appropriate subspaces provide consistent data-driven reconstructions of those various components. Section 5 is devoted to the general case of $K > 2$ blocks, allowing for various decompositions of each observation into mutually orthogonal (at all leads and lags) components. The tools we are using throughout are Brillinger’s theory of dynamic principal components, a key result (Proposition 2) by Forni et al. (2000), and the identification method developed by Hallin and Liška (2007). Proofs are concentrated in the Appendix.

The potential of the method is briefly illustrated, in Section 6, with a panel of Industrial Production Index data for France and Germany (K=2 blocks, q=3 factors), then France, Germany and Italy (K=3 blocks, q=4 factors). Simple as it is, the analysis of that dataset reveals some striking facts. For instance, both Germany and Italy exhibit a “national common factor” which is idiosyncratic to the other two countries, while France’s common factors are included in the space spanned by Germany’s. The (estimated) percentages of explained variation associated with the various components also are quite illuminating: Germany, with 25% of common variation, is the “most common” out of the three countries. But it also is, with only 6.4% of its total variation, the “least strongly common” one. France has the highest proportion (82.4%) of marginal idiosyncratic variation but also the highest proportions of strongly and weakly idiosyncratic variations (72.6% and 9.8%, respectively). We do not attempt here to provide an economic interpretation for such facts. Nor do we apply the method to a more sophisticated dataset. But we feel that the simple application we are proposing provides sufficient evidence of the potential power of the method, both from a structural and from a quantitative point of view.

The dynamic factor model in the presence of blocks

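We throughout assume that all stochastic variables in this paper belong to the Hilbert space $L^2(\Omega, \mathcal{F}, P)$, where $(\Omega, \mathcal{F}, P)$ is some given probability space. We will study two double-indexed sequences $Y := \{Y_{it};\ i \in \mathbb{N},\ t \in \mathbb{Z}\}$ and $Z := \{Z_{jt};\ j \in \mathbb{N},\ t \in \mathbb{Z}\}$, where $t$ stands for time and $i, j$ are cross-sectional indices, of observed random variables. Let $Y_{n_y} := \{Y_{n_y,t};\ t \in \mathbb{Z}\}$ and $Z_{n_z} := \{Z_{n_z,t};\ t \in \mathbb{Z}\}$ be the $n_y$- and $n_z$-dimensional subprocesses of $Y$ and $Z$, respectively, where $Y_{n_y,t} := (Y_{1t}, \ldots, Y_{n_y t})$ and $Z_{n_z,t} := (Z_{1t}, \ldots, Z_{n_z t})$, and write $X_{n,t} :=$ (Y1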

Identifying the factor structure; population results

Based on the $n$-dimensional vector process $X_{n,t} = (Y_{n_y,t}, Z_{n_z,t})$, we first asymptotically identify $\phi_{y;it}$, $\psi_{y;it}$, $\nu_{y;it}$, $\phi_{z;jt}$, $\psi_{z;jt}$ and $\nu_{z;jt}$ as $\min(n_y, n_z) \to \infty$. More precisely, we show that, under specified spectral structure, all those quantities can be consistently recovered from the finite-$n$ subpanels $\{X_{n,t}\}$ as $\min(n_y, n_z) \to \infty$.
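Since the recovery relies on the spectral structure of the panel, it may help to see how the main empirical ingredient, the eigenvalues of an estimated spectral density matrix (Brillinger's dynamic eigenvalues), can be computed. The sketch below is only an illustration under stated assumptions: the Bartlett lag window, the truncation parameter `M`, the frequency grid and the function names are choices made here, not the paper's estimator, and neither the Hallin–Liška criterion nor the projections onto the joint and marginal common spaces are implemented.

```python
import numpy as np

def autocovariance(X, k):
    """Sample lag-k autocovariance (1/T) sum_t x_t x_{t-k}' of a centred (n, T) panel."""
    T = X.shape[1]
    if k >= 0:
        return X[:, k:] @ X[:, : T - k].T / T
    return autocovariance(X, -k).T            # Gamma_{-k} is the transpose of Gamma_k

def spectral_density_estimate(X, M, thetas):
    """Lag-window estimate with Bartlett weights (assumed) at each frequency in thetas."""
    X = X - X.mean(axis=1, keepdims=True)
    n = X.shape[0]
    out = np.zeros((len(thetas), n, n), dtype=complex)
    for k in range(-M, M + 1):
        weight = 1.0 - abs(k) / (M + 1)       # triangular (Bartlett) window
        gamma_k = autocovariance(X, k)
        for h, theta in enumerate(thetas):
            out[h] += weight * gamma_k * np.exp(-1j * k * theta)
    return out / (2.0 * np.pi)

def dynamic_eigenvalues(S):
    """Eigenvalues of each Hermitian matrix in S, sorted in decreasing order."""
    return np.linalg.eigvalsh(S)[:, ::-1]

# Hypothetical one-shock panel: a common white noise plus independent white noises.
rng = np.random.default_rng(1)
n, T = 30, 1000
X = rng.standard_normal(T)[None, :] + rng.standard_normal((n, T))
thetas = np.linspace(0.0, np.pi, 5)
S = spectral_density_estimate(X, M=8, thetas=thetas)
lam = dynamic_eigenvalues(S)
print(lam[:, :3].round(2))                    # first three dynamic eigenvalues per frequency
```

On such a panel, one eigenvalue stands well apart from the others at every frequency; a gap of that kind is what identification methods for the number of common shocks aim to detect.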

Recovering the factor structure; estimation results

The previous section shows how all components of $Y_{it}$ and $Z_{jt}$ can be recovered asymptotically as $\min(n_y, n_z) \to \infty$, provided that the spectral density $\Sigma_n$ and the numbers $q$, $q_y$, and $q_z$ of factors are known. The estimates $\phi^n_{y;it}$, $\psi^n_{y;it}$ and $\nu^n_{y;it}$ all take the form of a filtered series of the observed process $X_{n,t}$. We have indeed ϕy;itn=H¯y;n,i(L)Vz;tnK¯ϕy;n,i(L)Xn,tψy;itn=χy;itnϕy;itn=[K¯χxy;n,i(L)K¯y;n,i(L)K¯ϕy;n,i(L)]Xn,tK¯ψy;n,i(L)Xn,t,andνy;itn=χxy;itnχy;itn=[K¯y;n,i(L)K¯χxy;n,i(L)K¯y;n

Dynamic factors in the presence of K blocks (K>2)

The ideas developed in the previous sections extend to the more general case of $K > 2$ blocks, with, however, rapidly increasing complexity. Each subset $\{k_1, \ldots, k_\ell\}$, $\ell = 0, 1, \ldots, K$, of $\{1, \ldots, K\}$ indeed characterizes a decomposition of $H$ into mutually orthogonal common and idiosyncratic subspaces, $H^{\chi}_{\{k_1, \ldots, k_\ell\}}$ and $H^{\xi}_{\{k_1, \ldots, k_\ell\}}$, say, leading to $2^K$ distinct implementations of the Hallin–Liška and Forni et al. procedures.
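The growth in complexity comes from the number of subsets of the $K$ block labels, the empty set included. A short enumeration sketch (the function name is hypothetical) makes the count explicit:

```python
from itertools import combinations

def block_subsets(K):
    """All subsets {k_1, ..., k_l}, l = 0, 1, ..., K, of the block labels {1, ..., K}."""
    blocks = range(1, K + 1)
    return [subset for size in range(K + 1) for subset in combinations(blocks, size)]

subsets = block_subsets(3)
for s in subsets:
    print(set(s) if s else "{}")
print(len(subsets))
```

For the three-country application there are thus eight candidate subpanels, and the count doubles with each additional block.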

Instead of $Y_{it}$ and $Z_{jt}$, denote all observations as $X_{k;it}$ ($i = 1, \ldots, n$; $k = 1, \ldots, K$), with the

Real data applications

We applied our method to a dataset of monthly Industrial Production Indexes for France, Germany, and Italy, observed from January 1995 through December 2006. All data were preadjusted by taking a log-difference transformation (T=143 throughout—one observation is lost due to differencing), then centered and normalized using their sample means and standard errors.
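The preadjustment described above can be sketched as follows. The array layout (one row per series, one column per month), the function name and the use of the sample standard deviation with `ddof=1` as the normalizing quantity are assumptions; the synthetic index levels only stand in for the actual data, which are not reproduced here.

```python
import numpy as np

def preprocess(levels):
    """Turn (n, T + 1) positive index levels into (n, T) standardized log-differences."""
    growth = np.diff(np.log(levels), axis=1)                  # log-difference: one month is lost
    centred = growth - growth.mean(axis=1, keepdims=True)     # centre each series
    return centred / growth.std(axis=1, ddof=1, keepdims=True)  # scale each series

# Synthetic stand-in: 5 series over 144 months (January 1995 through December 2006).
rng = np.random.default_rng(0)
levels = 100.0 * np.exp(np.cumsum(0.01 * rng.standard_normal((5, 144)), axis=1))
Z = preprocess(levels)
print(Z.shape)
```

The 144 monthly observations yield 143 growth rates, matching the value of T quoted above.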

In practice, some care has to be taken, however, due to the fact that, for finite n and T, the joint and marginal common spaces

Acknowledgements

Marc Hallin gratefully acknowledges the support of the Sonderforschungsbereich “Statistical modelling of nonlinear dynamic processes” (SFB 823) of the Deutsche Forschungsgemeinschaft and a Discovery Grant of the Australian Research Council. Part of this work was completed while visiting the Economics Department at the European University Institute in Florence under a Fernand Braudel Fellowship. The authors are grateful to Christine De Mol, Richard Spady, Marco Lippi and Mario Forni for many

References (44)

  • d’Agostino, A., Giannone, D., 2005. Comparing alternative predictors based on large-panel dynamic factor models. ECB...
  • Aiolfi, M., Catão, L., Timmerman, A., 2006. Common factors in Latin America’s business cycle. Working Paper 06/49....
  • Altissimo, F., Bassanetti, A., Cristadoro, R., Forni, M., Hallin, M., Lippi, M., Reichlin, L., 2001. A real time...
  • Angelini, E., Henry, J., Mestre, R., 2001. Diffusion index based inflation forecasts for the euro area. ECB Working...
  • Artis, M., et al., 2005. Factor forecasts for the UK. Journal of Forecasting.
  • Bai, J., et al., 2002. Determining the number of factors in approximate factor models. Econometrica.
  • Bai, J., et al., 2007. Determining the number of primitive shocks in factor models. Journal of Business and Economic Statistics.
  • Bernanke, B.S., et al., 2003. Monetary policy in a data rich environment. Journal of Monetary Economics.
  • Brillinger, D.R., 1981. Time Series: Data Analysis and Theory.
  • Bruneau, C., et al., 2007. Forecasting inflation using economic indicators: the case of France. Journal of Forecasting.
  • Campbell, J.Y., et al., 1997. The Econometrics of Financial Markets.
  • Chamberlain, G., 1983. Funds, factors, and diversification in arbitrage pricing models. Econometrica.
  • Chamberlain, G., et al., 1983. Arbitrage, factor structure and mean-variance analysis in large asset markets. Econometrica.
  • Den Reijer, A.H.J., 2005. Forecasting Dutch GDP using large scale factor models. DNB Working Paper 028. De...
  • Dreger, C., et al. Estimating large scale factor models for economic activity in Germany.
  • Favero, C., et al., 2005. Principal components at work: empirical analysis of monetary policy with large datasets. Journal of Applied Econometrics.
  • Forni, M., et al., 2001. The generalized factor model: representation theory. Econometric Theory.
  • Forni, M., et al., 1998. Let’s get real: a factor analytical approach to disaggregated business cycle dynamics. Review of Economic Studies.
  • Forni, M., et al., 2000. The generalized dynamic factor model: identification and estimation. Review of Economics and Statistics.
  • Forni, M., et al., 2003. Do financial variables help forecasting inflation and real activity in the euro area? Journal of Monetary Economics.
  • Forni, M., et al., 2004. The generalized dynamic factor model: consistency and rates. Journal of Econometrics.
  • Forni, M., et al., 2005. The generalized dynamic factor model: one-sided estimation and forecasting. Journal of the American Statistical Association.

1 Present address: Institute for Health and Consumer Protection (IHCP), European Commission Joint Research Centre, I-21027 Ispra (VA), Italy.