Testing for correlation between the regressors and factor loadings in heterogeneous panels with interactive effects

Kapetanios, George; Serlenga, Laura; Shin, Yongcheol

doi:10.1007/s00181-023-02390-1

Open access
Published: 11 May 2023

Volume 64, pages 2611–2659, (2023)

Empirical Economics

Abstract

A large literature on modelling cross-section dependence in panels has been developed through interactive effects. However, there are areas where research has not really caught on yet. One such area is the one concerned with whether the regressors are correlated with factor loadings or not. This is an important issue because if the regressors are uncorrelated with loadings, we can simply use the consistent two-way fixed effects (FE) estimator without employing any more sophisticated econometric methods such as the principal component (PC) or the common correlated effects estimators. We explore this issue, which has received surprisingly little attention and propose a Hausman-type test to address the matter. Further, we develop two nonparametric variance estimators for the FE and PC estimators as well as their difference, that are robust to the presence of heteroscedasticity, autocorrelation and slope heterogeneity. Under the null hypothesis of no correlation between the regressors and loadings the proposed test follows the $\chi ^{2}$ distribution asymptotically. Monte Carlo simulation results confirm satisfactory size and power performance of the test even in small samples. Finally, we provide extensive empirical evidence in favour of uncorrelated factor loadings. In this situation, the FE estimator would provide a simple and robust estimation strategy which is invariant to nontrivial computational issues associated with the PC estimator.

1 Introduction

Panel data models have been increasingly popular in applied economics and finance, due to their ability to model various sources of heterogeneity. A standard practice is to impose strong restrictions on error cross-section dependence (CSD). This takes the form of independence across individual units under the fixed effects model whilst a common time effect severely restricts the nature of CSD under the random effects specification. However, most recently, a large number of studies have developed econometric methodologies for modelling CSD, mainly through the structure of interactive effects (hereafter IE) introducing heterogeneous unobserved factors into the error components and allowing for a richer cross-sectional covariance structure.

In this framework, conventional wisdom has been that the standard two-way fixed effects (FE) estimator would be inconsistent, because it ignores the potential endogeneity arising from the correlation between the regressors and factors and/or factor loadings (e.g. Bai 2009). Hence, two leading approaches have been proposed in the literature; see Chudik and Pesaran (2015) for a survey. The first, based on principal component (PC) estimation, estimates the factors jointly and iteratively with the main slope parameters; see Bai (2009) and Moon and Weidner (2015), and Fernandez-Val and Weidner (2016) and Charbonneau (2017) for extensions. The second approach, advanced by Pesaran (2006), treats factors as nuisance terms and removes their effects by proxying them with the cross-section averages of the dependent and independent variables. This is referred to as the common correlated effects (CCE) estimator. A growing number of extensions have been developed by Kapetanios et al. (2011), Chudik and Pesaran (2015), Westerlund and Urbain (2015) and Petrova and Westerlund (2020).

In empirical work, the CCE estimator is used most often because it is easier to implement than the PC estimator. Indeed, a common practice is to apply the CCE estimator after detecting strong CSD with the Pesaran (2015) CD test; see Mastromarco et al. (2016), Holly et al. (2010) and Baltagi and Li (2014), among others.

This paper contributes to this literature by raising some important issues that are relevant for practitioners. We start by highlighting the simple fact that the FE estimator is not always inconsistent, even in the presence of IE. If the regressors are correlated with factors but uncorrelated with loadings, then the FE estimator is consistent, as noted earlier by Coakley et al. (2006), Sarafidis and Wansbeek (2012) and Westerlund (2019a). In such a case, we formally show that the FE estimator is consistent and asymptotically normally distributed. However, the variance estimator provided by standard FE estimation will be invalid because zero-mean IE remain in the error components. Hence, we provide two consistent nonparametric variance estimators that are robust to heteroscedastic and serially correlated disturbances as well as to slope heterogeneity. Via Monte Carlo studies, we find that the FE and CCE estimators display similar and satisfactory performance when the regressors are correlated with factors but uncorrelated with loadings. Further, the coverage rate of the FE estimator evaluated using the nonparametric variance estimators reaches the nominal 95%. The performance of both the CCE and FE estimators worsens significantly if the regressors are correlated with loadings, which is in line with Westerlund and Urbain (2013). As expected, the performance of the PC estimator is not unduly affected by the presence of correlation between the regressors and loadings.

Furthermore, we point out that, in the specification tests proposed in the literature for the presence of CSD or IE, e.g. Pesaran (2015), Sarafidis et al. (2009), Bai (2009) and Castagnetti et al. (2015), rejection of the null hypothesis does not always imply that the FE estimator is inconsistent under the alternative model with IE. For instance, Sarafidis et al. (2009) maintain the assumption that factor loadings (between the equations for the dependent variable and the regressors) are uncorrelated even under the alternative. More importantly, we show that the Hausman test of the null hypothesis of two-way additive fixed effects against the alternative hypothesis of IE, proposed by Bai (2009), is inconsistent against the alternative when the regressors are uncorrelated with loadings. This suggests that the absence of correlation between the regressors and loadings is an influential but under-appreciated feature of the panel data model with IE. For large T, in order to avoid any potential omitted variables bias, it is natural to allow the regressors to be correlated with unobserved factors. It nevertheless remains an important issue in practice to test whether the regressors are correlated with loadings.

Despite a growing number of studies on modelling CSD through IE, it is rather surprising that the literature has been silent on testing for correlation between the regressors and factor loadings in panels with IE. This is an important hypothesis to test because, if the loadings are uncorrelated with the regressors, we can just use the simple but consistent FE estimator. In what follows we develop a Hausman-type test of whether the regressors are correlated with loadings. Both the FE and PC estimators are consistent under the null hypothesis of uncorrelated factor loadings, whilst only the latter is consistent under the alternative hypothesis. Our proposed test differs from the Hausman test developed by Bai (2009), because our null hypothesis is subsumed under his alternative model with IE. As a result, the FE estimator is not necessarily more efficient than the PC estimator under the null hypothesis. Based on this idea, we develop two nonparametric variance estimators for the difference between the FE and PC estimators, which are shown to be robust to the presence of heteroscedasticity, autocorrelation and slope heterogeneity. We show that the proposed test statistic follows the $\chi ^{2}$ distribution asymptotically. Monte Carlo simulation results confirm that the size and power performance of the test are quite satisfactory even in small samples.

Finally, our most important contribution is the provision of extensive empirical evidence that regressors are uncorrelated with factor loadings in many panel datasets employed in the literature. We find that the null hypothesis of the regressors being uncorrelated with factor loadings is not rejected in thirteen out of fourteen datasets considered. Next, we find that Bai’s Hausman test rejects the null of the additive effects model against the alternative of IE only once, whilst the CD test by Pesaran (2015) strongly rejects the null of weak CSD for all the datasets. Such conflicting results provide additional support for our main finding that the regressors are indeed uncorrelated with factor loadings even in cross-sectionally correlated panels with IE, in which case we show that Bai’s Hausman test is inconsistent. Furthermore, the FE estimator is unaffected by two complications of PC estimation: an incorrectly selected number of unobserved factors, which would significantly affect the performance of PC estimators (Moon and Weidner 2015), and inconsistent initial estimates, which may not guarantee the convergence of the iterative PC estimator (Hsiao 2018). In this regard, FE estimation combined with nonparametric variance estimators provides a simple and robust approach, avoiding uncertainty in specifying and estimating nuisance parameters for potential efficiency gain. This suggests that the FE estimator can still be of considerable applicability in a wide variety of cross-sectionally correlated panel data, especially if the regressors are found to be uncorrelated with factor loadings, which can be easily verified by our proposed test.

The paper proceeds as follows. Section 2 describes the model and highlights that the FE estimator is still consistent in panels with IE, provided that the regressors are uncorrelated with factor loadings. Section 3 develops the Hausman-type test for correlated factor loadings. Section 4 employs a range of Monte Carlo simulations to investigate the finite sample performance of the alternative estimators and the proposed test statistic. Section 5 presents empirical evidence documenting that the null hypothesis of the regressors being uncorrelated with factor loadings is not rejected for thirteen out of fourteen datasets. Section 6 offers some concluding remarks. Mathematical proofs, the data descriptions and additional empirical results are relegated to the Appendices. Additional simulation results can be found in the Online Supplement.

2 The model

Consider the following heterogeneous panel data model with IE:

$$\begin{aligned} y_{it}=\varvec{\beta }_{i}^{\prime }\varvec{x}_{it}+\varvec{\gamma }_{i}^{\prime }\varvec{f}_{t}+\varepsilon _{it} \end{aligned}$$

(1)

where $y_{it}$ is the dependent variable of the ith cross-sectional unit in period t, $\varvec{x}_{it}$ is the $k\times 1$ vector of covariates with $\varvec{\beta }_{i}$ the $k\times 1$ vector of parameters, and $\varepsilon _{it}$’s are idiosyncratic errors. $\varvec{f}_{t}$ is an $r\times 1$ vector of unobserved common factors while $\varvec{\gamma }_{i}$ is an $r\times 1$ vector of random heterogeneous loadings.

We make the following assumptions:

Assumption A

(i) $\varepsilon _{it}$ is independently distributed across i with $E\left( \varepsilon _{it}\right) =0$, $E\left( \varepsilon _{it}^{2}\right) =\sigma _{\varepsilon _{i}}^{2}$ and $E\left( \varepsilon _{it}^{8+\delta }\right) <\infty $ for some $\delta >0$. Each $\varepsilon _{it}$ follows a linear process with absolutely summable autocovariances such that $\lim _{T\rightarrow \infty }T^{-1} {\textstyle \sum _{s=1}^{T}} {\textstyle \sum _{t=1}^{T}} \left| E\left( \varepsilon _{is}\varepsilon _{it}\right) \right| ^{1+\delta }<\infty $, $E\left| N^{-1/2} {\textstyle \sum _{i=1}^{N}} \left[ \varepsilon _{is}\varepsilon _{it}-E\left( \varepsilon _{is} \varepsilon _{it}\right) \right] \right| ^{4}<\infty $ for all t, s, and $\lim _{T,N\rightarrow \infty }T^{-2}N^{-1} {\textstyle \sum _{i=1}^{N}} {\textstyle \sum _{s=1}^{T}} {\textstyle \sum _{t=1}^{T}} {\textstyle \sum _{r=1}^{T}} {\textstyle \sum _{w=1}^{T}} \left| Cov\left( \varepsilon _{is}\varepsilon _{it},\varepsilon _{ir}\varepsilon _{iw}\right) \right| <\infty $. The largest eigenvalue of $E\left( \varvec{\varepsilon }_{i}\varvec{\varepsilon }_{i}^{\prime }\right) $ is bounded uniformly in every i and t, where $\varvec{\varepsilon }_{i}=\left( \varepsilon _{i1},\ldots ,\varepsilon _{iT}\right) ^{\prime }$. $\varepsilon _{it}$ is independent of $\varvec{x} _{js}$, $\varvec{\gamma }_{j}$ and $\varvec{f}_{s}$ for all i, j, s and t.

(ii) $\varvec{f}_{t}$ is covariance stationary with finite mean and variance, $\varvec{\Sigma }_{f}$ with $E\left( \left\| \varvec{f} _{t}\right\| ^{4}\right) <\infty $ where $\varvec{\Sigma }_{f}$ is an $r\times r$ positive definite matrix.

(iii) $\varvec{\gamma }_{i}$ is iid across i with finite mean and variance, $\varvec{\Sigma }_{\gamma }$, where $\varvec{\Sigma }_{\gamma }$ is an $r\times r$ positive definite matrix. $\varvec{\gamma }_{i}$ are independent of $\varepsilon _{jt}$ and $\varvec{f}_{t}$ for all i, j and t.

(iv) The $k\times 1$ vector of $\varvec{\beta }_{i}$ are generated as $\varvec{\beta }_{i}=\varvec{\beta }+\varvec{\eta }_{i}$. $\varvec{\eta }_{i}$ is independent across i with $E\left( \varvec{\eta }_{i}\right) =0$ and $E\left( \varvec{\eta } _{i}\varvec{\eta }_{i}^{\prime }\right) =\Omega _{\eta \eta ,i}$, where $\Omega _{\eta \eta ,i}$ is a positive definite matrix uniformly for every i. $E\left\| \varvec{\eta }_{i}\right\| ^{4}\le \Delta <\infty $ and $\left\| \varvec{\beta }\right\| <\infty $. $\varvec{\eta }_{i}$ is independent of $\varepsilon _{it}$ and $\varvec{\gamma }_{i}$.

Assumption A is standard in the literature, see Bai (2009), Karabiyik et al. (2017) and Cui et al. (2019) (CHNY, hereafter).

For consistent estimation of the parameters in (1), we need first to account for the unobserved factors, and then to estimate $\varvec{\beta }$ by applying panel estimators to (1) with defactored variables. On the basis of this idea, two popular approaches have been proposed. The common correlated effects (CCE) estimator advanced by Pesaran (2006) imposes that $\varvec{x}_{it}$ share the same factors, $\varvec{f}_{t}$:

$$\begin{aligned} \varvec{x}_{it}=\varvec{\Gamma }_{i}^{\prime }\varvec{f} _{t}+\varvec{v}_{it} \end{aligned}$$

(2)

where $\varvec{\Gamma }_{i}$ is an $r\times k$ matrix of random heterogeneous loadings and $\varvec{v}_{it}$ are idiosyncratic errors, and proposes to approximate $\varvec{f}_{t}$ by the cross-section averages of the dependent and independent variables. Next, Bai (2009) allows $\varvec{x}_{it}$ to be arbitrarily correlated with both $\varvec{\gamma }_{i}$ and $\varvec{f}_{t}$, and proposes the iterative principal component (PC) approach that estimates the factors jointly and iteratively with the slope parameters. The validity of the CCE approach depends crucially upon whether an appropriate rank condition, which has to be assumed, holds. Westerlund and Urbain (2015) argue that the issue of correctly selecting the number of factors, r, in PC estimation is essentially the same as the issue of satisfying the condition $r\le k+1$ in CCE estimation. Further, it is shown that both estimators involve bias terms, which do not disappear unless $N/T \rightarrow 0$. The finite sample performance of the two approaches has been intensively investigated. The earlier studies by Kapetanios and Pesaran (2005) and Chudik et al. (2011) provide Monte Carlo evidence in favour of the CCE estimator, which is partly due to the uncertainty associated with estimating the true number of unobserved factors in PC estimation. Further, Westerlund and Urbain (2015) show that the performance of the PC estimator is sensitive to the value of $\beta $. For $\beta =0$, the PC estimator outperforms CCE, while for $\beta \ne 0$, the CCE estimator tends to perform better.
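To make the CCE idea concrete, the following is a minimal sketch of a pooled CCE-type estimator: the factors are proxied by an intercept and the cross-section averages of $y_{it}$ and $\varvec{x}_{it}$, which are then projected out of each unit's data. The function name `cce_pooled`, the equal weighting of units and the simulated design (one factor, one regressor, loadings drawn independently) are illustrative assumptions and are not taken from Pesaran (2006) or from this paper.

```python
import numpy as np

def cce_pooled(y, X):
    """Pooled CCE-type slope estimate (illustrative sketch).

    y: (N, T) array, X: (N, T, k) array.
    The factors are proxied by [1, ybar_t, xbar_t] and partialled out.
    """
    N, T, k = X.shape
    # Cross-section averages used as factor proxies, shape (T, k + 2).
    H = np.column_stack([np.ones(T), y.mean(axis=0), X.mean(axis=0)])
    # Annihilator matrix M = I - H (H'H)^{-1} H' (computed via pseudo-inverse).
    M = np.eye(T) - H @ np.linalg.pinv(H)
    MX = np.einsum('ts,isk->itk', M, X)        # M X_i for every unit i
    A = np.einsum('itk,itl->kl', X, MX)        # sum_i X_i' M X_i
    b = np.einsum('itk,it->k', MX, y)          # sum_i X_i' M y_i (M is symmetric)
    return np.linalg.solve(A, b)

# Hypothetical data: x and y share one factor, loadings drawn independently.
rng = np.random.default_rng(0)
N, T, beta = 500, 50, 1.0
f = rng.normal(size=T)
gam = rng.normal(1.0, 1.0, size=N)             # loadings in the y equation
Gam = rng.normal(1.0, 1.0, size=N)             # loadings in the x equation
x = np.outer(Gam, f) + rng.normal(size=(N, T))
y = beta * x + np.outer(gam, f) + rng.normal(size=(N, T))
b_cce = cce_pooled(y, x[:, :, None])
```

In this design the number of factors ($r=1$) satisfies $r\le k+1$, so the cross-section averages can span the factor space.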

However, we find that the performance of the two-way fixed effects (FE) estimator has not been widely investigated. Exceptions include the studies by Coakley et al. (2006), Sarafidis and Wansbeek (2012) and Westerlund (2019a). This simply reflects the conventional view that the FE estimator would be inconsistent in the presence of IE, because it ignores the endogeneity stemming from the correlation between regressors and factors/loadings. We aim to challenge this maintained view. For large T, suppose that $\varvec{f}_{t}$ represents an unobserved common policy or globalisation trend, and $\varvec{\gamma }_{i}$ are the heterogeneous individual responses (parameters).

In practice, it is important to test whether $\varvec{x}_{it}$ are correlated with $\varvec{\gamma }_{i}$ or not. Formally, we set the null and alternative hypotheses as follows:

$$\begin{aligned}{} & {} H_{0}:\varvec{x}_{it}\text { uncorrelated with }\varvec{\gamma }_{i} \end{aligned}$$

(3)

$$\begin{aligned}{} & {} H_{1}:\varvec{x}_{it}\text { correlated with }\varvec{\gamma }_{i} \end{aligned}$$

(4)

Under Assumptions A(ii) and (iii), we can express $\varvec{\gamma }_{i}^{\prime }\varvec{f}_{t}$ in (1) by^{Footnote 1}

$$\begin{aligned} \varvec{\gamma }_{i}^{\prime }\varvec{f}_{t}=\mu +\alpha _{i} + \theta _{t} +\varvec{\mathring{\gamma }}_{i}^{\prime } \varvec{\dot{f}}_{t} \end{aligned}$$

(5)

where $\mu =-\varvec{\bar{\gamma }}^{\prime }\varvec{\bar{f}}$, $\alpha _{i}=\varvec{\gamma }_{i}^{\prime }\varvec{\bar{f}}$, $\theta _{t}=\varvec{\bar{\gamma }}^{\prime }\varvec{f}_{t}$, $\varvec{\mathring{\gamma }}_{i}=\varvec{\gamma }_{i}-\varvec{\bar{\gamma }}$ and $\varvec{\dot{f}}_{t} = \varvec{f}_{t}-\varvec{\bar{f}}$ with $\varvec{\bar{\gamma }}=N^{-1}\sum _{i=1}^{N}\varvec{\gamma }_{i}$ and $\varvec{\bar{f}} = T^{-1}\sum _{t=1}^{T}\varvec{f}_{t}$. Using (5) in (1), we have:

$$\begin{aligned} y_{it}=\varvec{\beta }_{i}^{\prime }\varvec{x}_{it}+\mu +\alpha _{i}+\theta _{t}+\varvec{\mathring{\gamma }}_{i}^{\prime }\varvec{\dot{f}}_{t}+\varepsilon _{it} \end{aligned}$$

(6)

This transformation clearly shows that the panel data model with nonzero-mean IE, $\varvec{\gamma }_{i}^{\prime }\varvec{f}_{t}$ in (1), can be equivalently expressed as the 2-way fixed effects panel data model with zero-mean IE, $\varvec{\mathring{\gamma }}_{i}^{\prime } \varvec{\dot{f}}_{t}$ in (6).^{Footnote 2} Next, we apply the 2-way within transformation to (6) to obtain^{Footnote 3}

$$\begin{aligned} \ddot{y}_{it}=\varvec{\beta }_i^{\prime } \varvec{\ddot{x}}_{it} +\ddot{u}_{it},\ \ddot{u}_{it}=\varvec{\mathring{\gamma }}_{i}^{\prime }\varvec{\dot{f}}_{t}+\ddot{\varepsilon }_{it} \end{aligned}$$

(7)

where $\ddot{y}_{it}=y_{it}-\bar{y}_{i.}-\bar{y}_{.t}+\bar{y}_{..}$ with $\bar{y}_{i.}=T^{-1}\sum _{t=1}^{T}y_{it}$, $\bar{y}_{.t}=N^{-1}\sum _{i=1}^{N}y_{it}$, $\bar{y}_{..}=\left( NT\right) ^{-1}\sum _{i=1}^{N}\sum _{t=1}^{T}y_{it}$, and similarly for $\varvec{\ddot{x}}_{it}$ and $\ddot{\varepsilon }_{it}$.
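As a small numerical illustration of the 2-way within transformation, the sketch below applies it to a simulated interactive-effects term $\varvec{\gamma }_{i}^{\prime }\varvec{f}_{t}$ and checks that what remains is exactly the zero-mean component $\varvec{\mathring{\gamma }}_{i}^{\prime }\varvec{\dot{f}}_{t}$ appearing in (7). The helper name `two_way_within` and the dimensions used are hypothetical choices made for this example.

```python
import numpy as np

def two_way_within(a):
    """Two-way within transformation of an (N, T) or (N, T, k) array:
    a_it - unit mean - time mean + grand mean."""
    a = np.asarray(a, dtype=float)
    unit_mean = a.mean(axis=1, keepdims=True)          # average over t
    time_mean = a.mean(axis=0, keepdims=True)          # average over i
    grand_mean = a.mean(axis=(0, 1), keepdims=True)    # average over i and t
    return a - unit_mean - time_mean + grand_mean

# Hypothetical loadings and factors with non-zero means.
rng = np.random.default_rng(0)
N, T, r = 6, 5, 2
gamma = rng.normal(1.0, 1.0, size=(N, r))
f = rng.normal(1.0, 1.0, size=(T, r))

ie = gamma @ f.T                                       # gamma_i' f_t, shape (N, T)
ie_within = two_way_within(ie)
# Demeaned loadings times demeaned factors.
ie_zero_mean = (gamma - gamma.mean(axis=0)) @ (f - f.mean(axis=0)).T
```

Here `ie_within` and `ie_zero_mean` coincide up to floating-point error, so the additive components are removed while the zero-mean interactive term stays in the transformed error.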

Under Assumption A and (3), it is easily seen by the independence of $\varvec{\gamma }_{i}$ from all other random quantities in the model and $E\left( \varvec{\mathring{\gamma }}_{i}\right) =E\left( \varvec{\gamma }_{i}-\varvec{\bar{\gamma }}\right) =0$ that $\varvec{\ddot{x}}_{it}$ is uncorrelated with the composite error, $\ddot{u}_{it}=\varvec{\mathring{\gamma }}_{i}^{\prime }\varvec{\dot{f}}_{t}+\ddot{\varepsilon }_{it}$ in (7), provided $\varvec{x}_{it}$ are strictly exogenous with respect to $\varepsilon _{it}$ because

$$\begin{aligned} E\left( \varvec{\ddot{x}}_{it}^{\prime }\varvec{\mathring{\gamma }}_{i}^{\prime }\varvec{\dot{f}}_{t}\right) =E\left\{ \varvec{\ddot{x}}_{it}^{\prime }\varvec{\dot{f}}_{t}^{\prime }E\left( \varvec{\mathring{\gamma }}_{i}|\varvec{\ddot{x}}_{it},\varvec{\dot{f}}_{t}\right) \right\} =0. \end{aligned}$$

(8)

See also Section 5 in Hsiao (2018). Therefore, under the null hypothesis, (3), we can apply two-way FE estimation to (1) and obtain a consistent estimator of $\varvec{\beta }$ from (7). Conversely, if $\varvec{x}_{it}$ and $\varvec{\gamma }_{i}$ are correlated, it is clear that $E\left( \ddot{u}_{it}\varvec{x}_{it}\right) \ne 0$ so that the FE estimator is inconsistent. Notice that the consistency of the FE estimator requires only that $\varvec{\gamma }_{i}$ be uncorrelated with $\varvec{x}_{it}$, a condition that is implicitly maintained in the CCE literature.^{Footnote 4} A further possibility that we do not entertain is that $\varvec{x}_{it}$ contains a different set of factors from that entering $y_{it}$ directly and that the two sets of factors are uncorrelated. Then, (8) may hold even if (3) does not, which points to the symmetry of the roles of loadings and factors in the IE setting. However, we view this setting as too unlikely to be of interest.
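The contrast between the null and the alternative can be illustrated with a small simulation. The sketch below uses a hypothetical single-factor, single-regressor design in which the regressor always loads on the factor. Under the null its loadings are drawn independently of $\gamma _{i}$, and under the alternative they are set equal to $\gamma _{i}$. The design, sample sizes and function names are assumptions made for this example and are not the Monte Carlo design of Section 4.

```python
import numpy as np

def fe_slope(y, x):
    """Two-way FE slope for a single regressor; y and x are (N, T) arrays."""
    def within(a):
        return a - a.mean(axis=1, keepdims=True) - a.mean(axis=0, keepdims=True) + a.mean()
    yd, xd = within(y), within(x)
    return float((xd * yd).sum() / (xd * xd).sum())

def simulate(correlated, N=2000, T=50, beta=1.0, seed=0):
    """Return the FE estimate for one simulated panel (illustrative design)."""
    rng = np.random.default_rng(seed)
    f = rng.normal(size=T)                       # common factor
    gam = rng.normal(1.0, 1.0, size=N)           # loadings in the y equation
    # Loadings in the x equation: equal to gam (alternative) or independent (null).
    Gam = gam.copy() if correlated else rng.normal(1.0, 1.0, size=N)
    x = np.outer(Gam, f) + rng.normal(size=(N, T))
    y = beta * x + np.outer(gam, f) + rng.normal(size=(N, T))
    return fe_slope(y, x)

b_null = simulate(correlated=False)   # x correlated with f, not with gamma_i
b_alt = simulate(correlated=True)     # x correlated with both f and gamma_i
```

In this design `b_null` should lie close to the true slope of one, whereas `b_alt` is pushed upwards because the transformed regressor is correlated with the remaining zero-mean IE.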

The two-way FE estimator of $\varvec{\beta }$ is given by

$$\begin{aligned} \hat{\varvec{\beta }}_{FE}=\left( \sum _{i=1}^{N}\varvec{\ddot{X}} _{i}^{\prime }\varvec{\ddot{X}}_{i}\right) ^{-1}\sum _{i=1}^{N} \varvec{\ddot{X}}_{i}^{\prime }\varvec{\ddot{y}}_{i} \end{aligned}$$

(9)

where $\varvec{\ddot{X}}_{i}=\left( \varvec{\ddot{x}}_{i1},\ldots ,\varvec{\ddot{x}}_{iT}\right) ^{\prime }$ and $\varvec{\ddot{y} }_{i}=\left( \ddot{y}_{i1},\ldots ,\ddot{y}_{iT}\right) ^{\prime }$. As $\ddot{u}_{it}$ in (7) still contains zero-mean IE, $\varvec{\mathring{\gamma }}_{i}^{\prime }\varvec{\dot{f}}_{t}$, the standard variance estimator for $\hat{\varvec{\beta }}_{FE}$ will be invalid. Thus, we propose two consistent variance estimators that are also robust to heteroscedasticity, serial correlation and slope heterogeneity. The first is the nonparametric variance estimator, similar to the one used to derive the variance of the CCE estimator in Pesaran (2006):

$$\begin{aligned}&\hat{\varvec{V}}^{NON}\left( \hat{\varvec{\beta }}_{FE}\right) \nonumber \\ {}&\quad =\left( \sum _{i=1}^{N}\varvec{\ddot{X}}_{i}^{\prime }\varvec{\ddot{X}}_{i}\right) ^{-1} \left( \sum _{i=1}^{N}\left( \varvec{\ddot{X}} _{i}^{\prime }\varvec{\ddot{X}}_{i}\right) \left( \hat{\varvec{\beta }}_{FE,i}-\varvec{\bar{\beta }}_{FE}\right) \left( \varvec{\hat{\beta }}_{FE,i}-\varvec{\bar{\beta }}_{FE}\right) ^{\prime }\left( \varvec{\ddot{X}}_{i}^{\prime }\varvec{\ddot{X}}_{i}\right) \right) \left( \sum _{i=1}^{N}\varvec{\ddot{X}}_{i}^{\prime }\varvec{\ddot{X} }_{i}\right) ^{-1} \end{aligned}$$

(10)

where $\hat{\varvec{\beta }}_{FE,i}=\left( \varvec{\ddot{X}} _{i}^{\prime }\varvec{\ddot{X}}_{i}\right) ^{-1}\varvec{\ddot{X}} _{i}^{\prime }\varvec{\ddot{y}}_{i}$ and $\varvec{\bar{\beta }} _{FE}=\frac{1}{N}\sum _{i=1}^{N}\hat{\varvec{\beta }}_{FE,i}$. Next, we consider the following heteroscedasticity and autocorrelation robust variance estimator (see CHNY):

$$\begin{aligned} \hat{\varvec{V}}^{HAC}\left( \hat{\varvec{\beta }}_{FE}\right) =\left( \sum _{i=1}^{N}\varvec{\ddot{X}}_{i}^{\prime }\varvec{\ddot{X} }_{i}\right) ^{-1}\left( \sum _{i=1}^{N}\varvec{\ddot{X}}_{i}^{\prime }\hat{\varvec{u}}_{FE,i}\hat{\varvec{u}}_{FE,i}^{\prime } \varvec{\ddot{X}}_{i}\right) \left( \sum _{i=1}^{N}\varvec{\ddot{X} }_{i}^{\prime }\varvec{\ddot{X}}_{i}\right) ^{-1} \end{aligned}$$

(11)

where $\hat{\varvec{u}}_{FE,i}=\varvec{\ddot{y}}_{i}-\varvec{\ddot{X}}_{i}\hat{\varvec{\beta }}_{FE}$.
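The estimators in (9), (10) and (11) can be computed in a few lines. The following is a minimal sketch that follows the three formulas literally. The function names, the array layout (`y` of shape `(N, T)`, `X` of shape `(N, T, k)`) and the simulated data are assumptions made for illustration, and the unit-by-unit regressions require $T-1\ge k$.

```python
import numpy as np

def within2(a):
    """Two-way within transformation along the first two axes."""
    return (a - a.mean(axis=1, keepdims=True) - a.mean(axis=0, keepdims=True)
            + a.mean(axis=(0, 1), keepdims=True))

def fe_robust(y, X):
    """Two-way FE estimate (9) with the variance estimators (10) and (11).

    y: (N, T), X: (N, T, k). Returns (beta_fe, V_non, V_hac).
    """
    yd, Xd = within2(y), within2(X)
    XtX = np.einsum('itk,itl->ikl', Xd, Xd)      # X_i'X_i for each i, (N, k, k)
    Xty = np.einsum('itk,it->ik', Xd, yd)        # X_i'y_i for each i, (N, k)
    A = XtX.sum(axis=0)
    A_inv = np.linalg.inv(A)
    beta_fe = A_inv @ Xty.sum(axis=0)            # equation (9)

    # Equation (10): deviations of unit-specific estimates from their mean.
    beta_i = np.linalg.solve(XtX, Xty[..., None])[..., 0]
    dev = beta_i - beta_i.mean(axis=0)
    w = np.einsum('ikl,il->ik', XtX, dev)        # (X_i'X_i)(beta_i - beta_bar)
    V_non = A_inv @ (w.T @ w) @ A_inv

    # Equation (11): outer products of X_i'u_i with pooled FE residuals.
    u = yd - Xd @ beta_fe
    s = np.einsum('itk,it->ik', Xd, u)           # X_i'u_i, (N, k)
    V_hac = A_inv @ (s.T @ s) @ A_inv
    return beta_fe, V_non, V_hac

# Hypothetical panel generated under the null (loadings drawn independently).
rng = np.random.default_rng(1)
N, T = 300, 30
beta = np.array([1.0, 2.0])
f = rng.normal(size=T)
gam = rng.normal(1.0, 1.0, size=N)
Gam = rng.normal(1.0, 1.0, size=(N, 2))
X = Gam[:, None, :] * f[None, :, None] + rng.normal(size=(N, T, 2))
y = X @ beta + gam[:, None] * f[None, :] + rng.normal(size=(N, T))
b, V_non, V_hac = fe_robust(y, X)
se_non, se_hac = np.sqrt(np.diag(V_non)), np.sqrt(np.diag(V_hac))
```

Both variance matrices share the same outer "bread" term and differ only in the middle term, which is built from unit-specific slope deviations in (10) and from residual cross-products in (11).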

We show that $\hat{\varvec{\beta }}_{FE}$ is consistent and follows the normal distribution asymptotically under the null, (3). The result holds for both homogeneous and heterogeneous $\varvec{\beta }$.

Theorem 1

Under Assumption A and under (3), as $N,T \rightarrow \infty $,

$$\begin{aligned} \sqrt{N}\left( \hat{\varvec{\beta }}_{FE}-\varvec{\beta }\right) \rightarrow _{d}N\left( 0_{k\times 1},\varvec{\Psi }_{FE}^{-1} \varvec{R}_{FE}\varvec{\Psi }_{FE}^{-1}\right) \end{aligned}$$

(12)

where $\varvec{\Psi }_{FE}=\lim _{N,T\rightarrow \infty }\frac{1}{N}\sum _{i=1}^{N}E\left( \frac{\varvec{\ddot{X}}_{i}^{\prime }\varvec{\ddot{X}}_{i}}{T}\right) $. Considering $\varvec{\beta }_{i}=\varvec{\beta }+\varvec{\eta }_{i}$, $\varvec{R}_{FE}=\varvec{R}_{1,FE} +\varvec{R}_{2,FE}$ where

$$\begin{aligned} \varvec{R}_{1,FE}= & {} \lim _{N,T\rightarrow \infty }\frac{1}{N}\sum _{i=1} ^{N}E\left( \frac{\varvec{\ddot{X}}_{i}^{\prime }\varvec{\dot{F}}}{T}\varvec{\mathring{\gamma }}_{i}\varvec{\mathring{\gamma }} _{i}^{\prime }\frac{\varvec{\dot{F}}^{\prime }\varvec{\ddot{X}}_{i}}{T}\right) \end{aligned}$$

(13)

$$\begin{aligned} \varvec{R}_{2,FE}= & {} \lim _{N,T\rightarrow \infty }\frac{1}{N}\sum _{i=1}^{N} E\left( \frac{\varvec{\ddot{X}}_{i}^{\prime }\varvec{\ddot{X}}_{i}}{T} \varvec{\eta }_{i}\varvec{\eta }_{i}^{\prime } \frac{\varvec{\ddot{X}}_{i}^{\prime } \varvec{\ddot{X}}_{i}}{T}\right) , \end{aligned}$$

(14)

and $\varvec{\dot{F}}=\left( \varvec{\dot{f}}_{1},\ldots ,\varvec{\dot{f}}_{T}\right) ^{\prime }$. Furthermore,

$$\begin{aligned}{} & {} \hat{\varvec{V}}^{NON}\left( \hat{\varvec{\beta }}_{FE}\right) ^{-1/2}\left( \hat{\varvec{\beta }}_{FE}-\varvec{\beta }\right) \rightarrow _{d}N\left( 0,\varvec{I}_{k}\right) \text { and } \hat{\varvec{V}}^{HAC}\left( \hat{\varvec{\beta }}_{FE}\right) ^{-1/2}\nonumber \\ {}{} & {} \quad \left( \hat{\varvec{\beta }}_{FE}-\varvec{\beta }\right) \rightarrow _{d}N\left( 0,\varvec{I}_{k}\right) . \end{aligned}$$

(15)

If $\varvec{\eta }_{i}=0$, $\forall i$, then (12) and (15) continue to hold with $\varvec{R}_{FE}=\varvec{R}_{1,FE}$.

3 The Hausman-type test

A number of specification tests have been proposed to test for the presence of CSD or multiplicative IE in panels. The most popular is the cross-section dependence (CD) test statistic proposed by Pesaran (2015), which is increasingly applied to the residuals of regression models as an ex-post diagnostic tool. However, the CD test fails to reject the null hypothesis of no error CSD when the factor loadings have zero means, implying that the CD test will display very poor power when it is applied to cross-sectionally demeaned data. Furthermore, the residual-based CD test has been shown to often reject the null hypothesis of no remaining CSD in the case of the CCE estimator (e.g. Mastromarco et al. 2016). Juodis and Reese (2018) show that the application of the CD test to regression residuals obtained from IE models introduces a bias term of order $\sqrt{T}$, leading to erroneous rejection of the null.^{Footnote 5} Sarafidis et al. (2009) propose an alternative testing procedure for the null hypothesis of homogeneous factor loadings against the alternative of heterogeneous loadings after estimating a linear dynamic panel data model by GMM. This approach is valid only when N is large relative to T, but it can be applied to testing for any remaining error CSD after including time dummies. However, they maintain the assumption that the loadings in the equations for y and $\varvec{x}$ are uncorrelated (see their Assumption 5(b)).

The PC estimator is consistent both under models with two-way additive (fixed) effects and under models with IE, but less efficient than the FE estimator under the null model with additive effects only. However, the FE estimator is inconsistent under the alternative model with IE. Following this idea, Bai (2009, Section 9) advances the following Hausman test for testing the null of additive effects, i.e. $\varvec{\gamma }_{i}^{\prime }\varvec{f}_{t} = \alpha _{i} +\theta _{t}$ against the alternative of IE^{Footnote 6}:

$$\begin{aligned} H_{B}=\left( \hat{\varvec{\beta }}_{FE}-\hat{\varvec{\beta }}_{PC_B}\right) ^{\prime } \left( \varvec{V}_{B}\right) ^{-1} \left( \hat{\varvec{\beta }}_{FE}-\hat{\varvec{\beta }}_{PC_B}\right) , \end{aligned}$$

(16)

where $\hat{\varvec{\beta }}_{PC_B}$ is the iterative PC estimator proposed by Bai (2009), $\varvec{V}_{B}=\widetilde{Var}\left( \hat{\varvec{\beta }}_{PC_B}\right) -\widetilde{Var}\left( \hat{\varvec{\beta }}_{FE}\right) $, $\widetilde{Var}\left( \hat{\varvec{\beta }}_{FE}\right) $ is the standard variance estimator provided by two-way FE estimation, and $\widetilde{Var}\left( \hat{\varvec{\beta }}_{PC_B}\right) $ is the analytic (sandwich-form) variance estimator, which accounts for unknown forms of heteroscedastic and autocorrelated errors. Bai (2009) derives that $H_{B}\rightarrow _{d}\chi _{k}^{2}$ under the null.^{Footnote 7} Westerlund (2019b) proposes an alternative Hausman test statistic, $H_{W}$, obtained by replacing the PC estimator with the CCE estimator.

The conventional wisdom is that if the null hypothesis of no error CSD is rejected, the FE estimator would be biased due to the potential endogeneity arising from the correlation between the regressors and unobserved factors and/or loadings.

In empirical applications we apply the CD test and Bai’s Hausman test to a number of datasets that have been employed in the literature, and find that the CD test strongly rejects the null hypothesis of weak error CSD while the $H_{B}$ test rarely rejects the null of additive effects. The results of the CD test confirm the presence of strong CSD, while those of the $H_{B}$ test indicate the absence of IE. Taken at face value, this suggests that the FE estimator is consistent (and potentially efficient). However, if the regressors are uncorrelated with loadings, the $H_{B}$ test is inconsistent against the alternative model.

The results of the Monte Carlo simulation (in Section S1 in the Online Supplement) clearly demonstrate the limitation of applying the $H_{B}$ test in practice because it cannot distinguish between panels with 2-way additive fixed effects and panels with IE where the regressors are uncorrelated with loadings.^{Footnote 8}

The above discussion suggests that the null hypothesis of the absence of correlation between the regressors and factor loadings emerges as an influential but underappreciated feature of the panel data model with IE.

We have shown that the presence of IE does not always imply that the FE estimator is inconsistent. In particular, the FE estimator is still consistent under the null (3), even though the regressors are correlated with factors. In this case we may prefer to use the simple FE estimator, which is unaffected by two complications of PC estimation: an incorrectly selected number of unobserved factors, which would significantly affect the performance of PC estimators (Moon and Weidner 2015), and inconsistent initial estimates, which may not guarantee the convergence of the iterative PC estimator (Hsiao 2018).

In this regard, it is surprising that the literature has been silent on testing whether the regressors are correlated with loadings in panels with IE. In a large T context, it is natural to allow $\varvec{x}_{it}$ to be correlated with $\varvec{f}_{t}$ to avoid any omitted variables bias. It nevertheless remains an important issue to test whether $\varvec{x}_{it}$ are correlated with $\varvec{\gamma }_{i}$. Given the pervasive evidence of cross-sectionally dependent errors in panels (Pesaran 2015), as our main contribution, we proceed to develop a novel Hausman-type test of the null hypothesis, (3). In the model (1), recall that the PC estimator is consistent under the null, (3), and under the alternative, (4), whereas the FE estimator is consistent only under the null, (3). Following this idea, we propose the Hausman-type test based on the difference between the FE and PC estimators as follows:

$$\begin{aligned} H=\left( \hat{\varvec{\beta }}_{FE}-\hat{\varvec{\beta }}_{PC}\right) ^{\prime }\varvec{V}^{-1}\left( \hat{\varvec{\beta }}_{FE} -\hat{\varvec{\beta }}_{PC}\right) \end{aligned}$$

(17)

where $\hat{\varvec{\beta }}_{PC}$ is the bias corrected PC estimator to be defined in (18) below, and $\varvec{V} = Var\left( \hat{\varvec{\beta }}_{FE} - \varvec{\hat{\beta }}_{PC}\right) = Var\left( \hat{\varvec{\beta }}_{FE}\right) + Var\left( \hat{\varvec{\beta }}_{PC}\right) - Cov\left( \hat{\varvec{\beta }}_{FE},\hat{\varvec{\beta }}_{PC}\right) - Cov\left( \hat{\varvec{\beta }}_{PC}, \hat{\varvec{\beta }}_{FE}\right) $. Notice that the FE estimator is not necessarily more efficient than the PC estimator under the null, which implies that

$$\begin{aligned} Var\left( \hat{\varvec{\beta }}_{FE}-\hat{\varvec{\beta }}_{PC}\right) \not =Var\left( \hat{\varvec{\beta }}_{FE}\right) -Var\left( \hat{\varvec{\beta }}_{PC}\right) \end{aligned}$$

in contrast to the well-established finding in Hausman (1978). Hence, our proposed test is not exactly the Hausman test. We interpret the Hausman-type test in (17) as a test for the null hypothesis, (3) in heterogeneous panels with IE, (1).
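As a brief worked illustration for the scalar case $k=1$ (a sketch, not part of the formal derivation), the variance of the difference expands as

$$\begin{aligned} Var\left( \hat{\beta }_{FE}-\hat{\beta }_{PC}\right) =Var\left( \hat{\beta }_{FE}\right) +Var\left( \hat{\beta }_{PC}\right) -2Cov\left( \hat{\beta }_{FE},\hat{\beta }_{PC}\right) . \end{aligned}$$

In the classical setting of Hausman (1978), one of the two estimators, say $\hat{\beta }_{E}$ (hypothetical notation used only here), is efficient under the null, so that its covariance with any other consistent estimator $\hat{\beta }_{I}$ equals its own variance:

$$\begin{aligned} Cov\left( \hat{\beta }_{E},\hat{\beta }_{I}\right) =Var\left( \hat{\beta }_{E}\right) \quad \Rightarrow \quad Var\left( \hat{\beta }_{I}-\hat{\beta }_{E}\right) =Var\left( \hat{\beta }_{I}\right) -Var\left( \hat{\beta }_{E}\right) . \end{aligned}$$

Since neither the FE nor the PC estimator is assumed to be efficient here, this simplification is unavailable, and the covariance term has to be estimated explicitly.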

Before developing the asymptotic theory for the Hausman-type statistic, we describe the asymptotic distribution of the bias-corrected PC estimator given by

$$\begin{aligned} \hat{\varvec{\beta }}_{PC} = \varvec{\tilde{\beta }}_{PC} - \frac{1}{N}\hat{\varvec{B}}_{NT} - \frac{1}{T}\hat{\varvec{C}}_{NT} \end{aligned}$$

(18)

where the $\varvec{\tilde{\beta }}_{PC}$ is the PC estimator obtained by iteratively solving the set of nonlinear equations:

$$\begin{aligned} \varvec{\tilde{\beta }}_{PC}&=\left( \sum _{i=1}^{N}\varvec{X}_{i}^{\prime }\varvec{M}_{\hat{F}}\varvec{X}_{i}\right) ^{-1} \sum _{i=1}^{N}\varvec{X}_{i}^{\prime }\varvec{M}_{\hat{F}}\varvec{y}_{i}\text { and }\\&\left[ \frac{1}{NT}\sum _{i=1}^{N}\left( \varvec{y}_{i}-\varvec{X}_{i}\varvec{\tilde{\beta }}_{PC}\right) \left( \varvec{y}_{i}-\varvec{X}_{i}\varvec{\tilde{\beta }}_{PC}\right) ^{\prime }\right] \hat{\varvec{F}}=\hat{\varvec{F}}\varvec{V}_{NT} \end{aligned}$$

where $\varvec{M}_{\hat{F}}=\varvec{I}_{T}-\hat{\varvec{F}}\left( \hat{\varvec{F}}^{\prime }\hat{\varvec{F}}\right) ^{-1} \hat{\varvec{F}}^{\prime }$, $\varvec{V}_{NT}$ is the diagonal matrix that consists of the r largest eigenvalues of the above matrix in the brackets arranged in a decreasing order, $\hat{\varvec{F}}$ is $\sqrt{T}$ times the corresponding eigenvectors, and $\frac{1}{N}\hat{\varvec{B}}_{NT}$ and $\frac{1}{T}\hat{\varvec{C}}_{NT}$ are the bias correction terms derived in CHNY (see Appendix 9 for details).

Next, similar to the nonparametric and HAC variance estimators developed for the FE estimator, we propose two versions of the robust variance estimator for the PC estimator as follows^{Footnote 9}

$$\begin{aligned}&\hat{\varvec{V}}^{NON}\left( \hat{\varvec{\beta }}_{PC}\right) \nonumber \\&\quad =\left( \sum _{i=1}^{N}\varvec{X}_{i}^{\prime } \varvec{M}_{\hat{F}}\varvec{X}_{i}\right) ^{-1}\left( \sum _{i=1}^{N}\left( \varvec{X}_{i}^{\prime }\varvec{M}_{\hat{F} }\varvec{X}_{i}\right) \left( \varvec{\tilde{\beta }}_{PC,i} - \varvec{\tilde{\beta }}_{PC}\right) \left( \varvec{\tilde{\beta }}_{PC,i} - \varvec{\tilde{\beta }}_{PC}\right) ^{\prime }\left( \varvec{X}_{i}^{\prime }\varvec{M}_{\hat{F}}\varvec{X}_{i}\right) \right) \nonumber \\&\qquad \left( \sum _{i=1}^{N} \varvec{X}_{i}^{\prime }\varvec{M}_{\hat{F}} \varvec{X}_{i}\right) ^{-1} \end{aligned}$$

(19)

where $\varvec{\tilde{\beta }}_{PC,i}=\left( \varvec{X}_{i}^{\prime } \varvec{M}_{\hat{F}}\varvec{X}_{i}\right) ^{-1} \varvec{X}_{i}^{\prime } \varvec{M}_{\hat{F}}\varvec{y}_{i}$, and

$$\begin{aligned} \hat{\varvec{V}}^{HAC}\left( \hat{\varvec{\beta }}_{PC}\right) =\left( \sum _{i=1}^{N}\varvec{X}_{i}^{\prime }\varvec{M}_{\hat{F}} \varvec{X}_{i}\right) ^{-1} \left( \sum _{i=1}^{N}\hat{\varvec{X}}_{i}^{\prime } \hat{\varvec{u}}_{PC,i}\hat{\varvec{u}}_{PC,i}^{\prime } \hat{\varvec{X}}_{i}\right) \left( \sum _{i=1}^{N}\varvec{X}_{i}^{\prime } \varvec{M}_{\hat{F}}\varvec{X}_{i}\right) ^{-1} \end{aligned}$$

(20)

where $\hat{\varvec{u}}_{PC,i} = \varvec{y}_{i} - \hat{\varvec{X}}_{i} \hat{\varvec{\beta }}_{PC}$.
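To make the sandwich form of the nonparametric estimator in (19) concrete, the following is a minimal sketch for a single regressor ($k=1$). It assumes that the per-unit quantities $\varvec{X}_{i}^{\prime }\varvec{M}_{\hat{F}}\varvec{X}_{i}$ and the unit-specific slopes $\varvec{\tilde{\beta }}_{PC,i}$ have already been computed; the function and argument names are hypothetical and chosen only for illustration.

```python
def pooled_and_nonparametric_variance(weights, unit_slopes):
    """Scalar-regressor (k = 1) version of the sandwich form in (19).

    weights[i]     : x_i' M x_i for unit i (assumed precomputed, positive)
    unit_slopes[i] : unit-specific slope estimate for unit i (assumed precomputed)
    Returns (pooled_slope, variance_estimate).
    """
    total_weight = sum(weights)
    # Pooled slope: since x_i' M y_i = weights[i] * unit_slopes[i], the pooled
    # estimator is the weighted average of the unit-specific slopes.
    pooled = sum(w * b for w, b in zip(weights, unit_slopes)) / total_weight
    # Middle term of the sandwich: sum_i w_i (b_i - pooled)^2 w_i.
    meat = sum((w * (b - pooled)) ** 2 for w, b in zip(weights, unit_slopes))
    # Outer terms: the inverse of the summed weights on both sides.
    variance = meat / total_weight ** 2
    return pooled, variance
```

The same function could be applied to the FE estimator by passing the FE counterparts of the weights and unit-specific slopes, since the two nonparametric estimators share this structure.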

We provide the asymptotic distribution of the $\hat{\varvec{\beta }}_{PC}$ estimator in Theorem 2.

Theorem 2

Suppose that Assumption A holds. Consider $\varvec{\beta }_{i} = \varvec{\beta } + \varvec{\eta }_{i}$. Then, as $N,T \rightarrow \infty $,

$$\begin{aligned} \sqrt{N}\left( \hat{\varvec{\beta }}_{PC}-\varvec{\beta }\right) \rightarrow _{d}N\left( 0_{k\times 1},\varvec{\Psi }_{PC}^{-1} \varvec{R}_{1,PC}\varvec{\Psi }_{PC}^{-1}\right) \end{aligned}$$

(21)

where $\varvec{\Psi }_{PC}=\lim _{N,T\rightarrow \infty }\frac{1}{N}\sum _{i=1}^{N}E\left( \frac{\varvec{V}_{i}^{\prime }\varvec{V}_{i}}{T}\right) $ with $\varvec{V}_{i}=(\varvec{v}_{i1},\ldots ,\varvec{v}_{iT})^{\prime }$ defined in (33) in Appendix 7, and

$$\begin{aligned} \varvec{R}_{1,PC}=\lim _{N,T\rightarrow \infty }N^{-1}\sum _{i=1}^{N}E\left( \frac{\varvec{V}_{i}^{\prime }\varvec{V}_{i}}{T}\varvec{\eta } _{i}\varvec{\eta }_{i}^{\prime }\frac{\varvec{V}_{i}^{\prime }\varvec{V}_{i}}{T}\right) \end{aligned}$$

(22)

Furthermore,

$$\begin{aligned}{} & {} \hat{\varvec{V}}^{NON}\left( \hat{\varvec{\beta }}_{PC}\right) ^{-1/2}\left( \hat{\varvec{\beta }}_{PC}-\varvec{\beta }\right) \rightarrow _{d}N\left( 0,\varvec{I}_{k}\right) \text { and }\nonumber \\{} & {} \quad \hat{\varvec{V}}^{HAC}\left( \hat{\varvec{\beta }}_{PC}\right) ^{-1/2}\left( \hat{\varvec{\beta }}_{PC}-\varvec{\beta }\right) \rightarrow _{d}N\left( 0,\varvec{I}_{k}\right) . \end{aligned}$$

(23)

It is worth noting that, in the homogeneous case with $\varvec{\beta }_{i}=\varvec{\beta }$ for all i, the FE estimator is $\sqrt{N}$-consistent, whereas the PC estimator can achieve a faster rate of convergence because it asymptotically removes the effect of the unobserved factors completely. Further, the CCE estimator shares the rate of convergence of the FE estimator if the rank condition in Pesaran (2006) does not hold. Such a condition cannot be ascertained but needs to be assumed, in which case the FE and CCE estimators have comparable theoretical properties. Nevertheless, the superiority of the PC estimator does not necessarily extend to its small sample properties, as we examine in the Monte Carlo study below.

Having established that the two versions of the robust estimator can consistently standardise the estimator, we propose to estimate $Cov\left( \hat{\varvec{\beta }}_{FE}, \hat{\varvec{\beta }}_{PC}\right) $ by^{Footnote 10}

$$\begin{aligned}&\hat{\varvec{C}}^{NON}\left( \hat{\varvec{\beta }}_{FE} ,\hat{\varvec{\beta }}_{PC}\right) \\&\quad =\left( \sum _{i=1}^{N}\varvec{\ddot{X}}_{i}^{\prime }\varvec{\ddot{X}}_{i}\right) ^{-1}\left( \sum _{i=1}^{N}\left( \varvec{\ddot{X}}_{i}^{\prime }\varvec{\ddot{X}}_{i}\right) \left( \hat{\varvec{\beta }}_{FE,i}-\hat{\varvec{\beta }}_{FE}\right) \left( \varvec{\tilde{\beta }}_{PC,i}-\varvec{\tilde{\beta }}_{PC}\right) ^{\prime }\left( \varvec{X}_{i}^{\prime }\varvec{M}_{\hat{F} }\varvec{X}_{i}\right) \right) \\&\qquad \left( \sum _{i=1}^{N}\varvec{X} _{i}^{\prime }\varvec{M}_{\hat{F}}\varvec{X}_{i}\right) ^{-1}\\&\qquad \hat{\varvec{C}}^{HAC}\left( \hat{\varvec{\beta }}_{FE} ,\hat{\varvec{\beta }}_{PC}\right) =\left( \sum _{i=1}^{N} \varvec{\ddot{X}}_{i}^{\prime }\varvec{\ddot{X}}_{i}\right) ^{-1}\left( \sum _{i=1}^{N}\varvec{\ddot{X}}_{i}^{\prime }\varvec{\hat{u}}_{FE,i}\hat{\varvec{u}}_{PC,i}^{\prime }\hat{\varvec{X}} _{i}\right) \left( \sum _{i=1}^{N}\varvec{X}_{i}^{\prime }\varvec{M} _{\hat{F}}\varvec{X}_{i}\right) ^{-1}. \end{aligned}$$

Accordingly, we define two operating versions of the Hausman-type statistic by

$$\begin{aligned} H^{NON}= & {} \left( \hat{\varvec{\beta }}_{FE}-\hat{\varvec{\beta }} _{PC}\right) ^{\prime }\left( \hat{\varvec{V}}^{NON}\right) ^{-1}\left( \hat{\varvec{\beta }}_{FE}-\hat{\varvec{\beta }}_{PC}\right) \end{aligned}$$

(24)

$$\begin{aligned} H^{HAC}= & {} \left( \hat{\varvec{\beta }}_{FE}-\hat{\varvec{\beta }} _{PC}\right) ^{\prime }\left( \hat{\varvec{V}}^{HAC}\right) ^{-1}\left( \hat{\varvec{\beta }}_{FE}-\hat{\varvec{\beta }}_{PC}\right) \end{aligned}$$

(25)

where

$$\begin{aligned} \hat{\varvec{V}}^{NON}= & {} \hat{\varvec{V}}^{NON}\left( \varvec{\hat{\beta }}_{FE}\right) +\hat{\varvec{V}}^{NON}\left( \varvec{\hat{\beta }}_{PC}\right) -2\hat{\varvec{C}}^{NON}\left( \varvec{\hat{\beta }}_{FE},\hat{\varvec{\beta }}_{PC}\right) \end{aligned}$$

(26)

$$\begin{aligned} \hat{\varvec{V}}^{HAC}= & {} \hat{\varvec{V}}^{HAC}\left( \varvec{\hat{\beta }}_{FE}\right) +\hat{\varvec{V}}^{HAC}\left( \varvec{\hat{\beta }}_{PC}\right) -2\hat{\varvec{C}}^{HAC}\left( \varvec{\hat{\beta }}_{FE},\hat{\varvec{\beta }}_{PC}\right) \end{aligned}$$

(27)

We provide the main result in the following Theorem.

Theorem 3

Under Assumption A, as $N,T \rightarrow \infty $,

$$\begin{aligned} H^{j}\rightarrow _{d}\chi _{k}^{2}\ \mathrm{for\ }j=NON,HAC \end{aligned}$$

$H^{j}$ follows the $\chi _{k}^{2}$ distribution even though the rate of convergence of the PC estimator is $\sqrt{NT}$ while the FE estimator is $\sqrt{N}$-consistent. This follows from the use of the robust covariance estimators that properly normalise the test statistic as shown in Appendix 7.
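As an illustration of how (24)–(27) combine in practice, the sketch below computes the statistic and its asymptotic p-value for a single regressor ($k=1$). It assumes the two slope estimates and their estimated variances and covariance are already available (from either the NON or the HAC versions); the function name and arguments are hypothetical.

```python
import math


def hausman_type_statistic(b_fe, b_pc, var_fe, var_pc, cov_fe_pc):
    """Scalar (k = 1) version of (24)-(27); returns (H, p_value)."""
    # Variance of the difference, as in (26) and (27).
    var_diff = var_fe + var_pc - 2.0 * cov_fe_pc
    if var_diff <= 0.0:
        raise ValueError("estimated variance of the difference is not positive")
    h = (b_fe - b_pc) ** 2 / var_diff
    # For one degree of freedom, P(chi2_1 > h) = erfc(sqrt(h / 2)).
    p_value = math.erfc(math.sqrt(h / 2.0))
    return h, p_value
```

For $k>1$ the same steps apply with matrix inversion in place of the scalar division and the $\chi _{k}^{2}$ distribution in place of $\chi _{1}^{2}$.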

Next, notice that our proposed test, (17), is fundamentally different from Bai’s Hausman test, (16), because our null hypothesis, (3), is subsumed under his alternative model with IE, as is clearly demonstrated in (6). Furthermore, Bai’s test will be consistent only if the regressors are correlated with both factors and loadings. Importantly, Bai’s Hausman test will be inconsistent if $\varvec{x}_{it}$ are uncorrelated with $\varvec{\gamma }_{i}$ in (1), mainly because the FE estimator is still consistent under (3). This suggests that non-rejection of the null by Bai’s test is not informative, because it cannot distinguish between the panel data model with the 2-way additive fixed effects only and the model with IE where the regressors are uncorrelated with loadings. See the Online Supplement for the simulation evidence. In the empirical applications below we find that Bai’s Hausman test rarely rejects the null of the additive effects model against the alternative of IE, even though the CD test strongly rejects the null of weak CSD for all the datasets. Such conflicting results may suggest that the regressors are indeed uncorrelated with factor loadings even in panels with IE, which supports the usefulness of our proposed test.

4 Monte Carlo simulations

4.1 Review of previous studies

Westerlund and Urbain (2013) find that the CCE estimator does not perform well in the presence of correlated factor loadings, especially if the full rank condition is not satisfied. Karabiyik et al. (2017) discuss the role of the rank condition in the CCE estimation, and show that the second moment matrix of the estimated factors becomes asymptotically singular if the number of factors is strictly less than the number of dependent and independent variables, invalidating the key arguments commonly applied to establish the asymptotic theory. Westerlund and Urbain (2015) provide a formal comparison between the CCE and PC estimators by employing the same data generating process (DGP)^{Footnote 11} and show that the two estimators are asymptotically equivalent only if $N/T \rightarrow 0$ whereas their asymptotic distributions are no longer equivalent if $N/T\rightarrow \tau >0$, especially in terms of asymptotic biases.

Though a number of papers have examined the small sample performance of the CCE and PC estimators, we find that only two studies, Sarafidis and Wansbeek (2012) and Westerlund (2019a), have explicitly analysed the performance of the FE estimator in the presence of CSD. Assuming homogeneous parameters with $N=100$ and $T=50$, Sarafidis and Wansbeek (2012) compare the performance of the FE, CCE and PC estimators. If the factor loadings in the equations for y and $\varvec{x}$ are uncorrelated and the rank condition is satisfied, they find that all three estimators perform well in terms of bias and RMSE. If the factor loadings are correlated, however, the FE estimator is severely biased. The CCE estimator is substantially biased if the rank condition is violated. As expected, the performance of the PC estimator is not significantly affected by the presence of correlated factor loadings.

Recently, Westerlund (2019a) shows that the FE estimator can be consistent even in the presence of IE, because both FE and CCE estimators belong to a class of estimators that satisfy a zero sum restriction. But, he maintains the assumption that factor loadings are uncorrelated in which case he demonstrates that the performance of the FE and CCE estimators is satisfactory.

4.2 Monte Carlo design

We generate the data as follows:

$$\begin{aligned} y_{it}= & {} \beta _{i}x_{it}+\gamma _{1i}f_{1t}+\gamma _{2i}f_{2t}+\varepsilon _{it}, \end{aligned}$$

(28)

$$\begin{aligned} x_{it}= & {} \Gamma _{1i}f_{1t}+\Gamma _{2i}f_{2t}+u_{it}, \end{aligned}$$

(29)

where $\left( f_{1t},f_{2t},\varepsilon _{it},u_{it}\right) ^{\prime }$ are drawn from the multivariate normal distribution with zero means and covariance matrix, $\varvec{\Sigma }_{i} = diag \left( \sigma _{f1}^{2},\sigma _{f2}^{2},\sigma _{\varepsilon _{i}}^{2},\sigma _{u_{i}}^{2}\right) $ = $\varvec{I}_4$. We follow Pesaran (2006) and Westerlund and Urbain (2013), and generate the factor loadings, $\left( \gamma _{1i},\gamma _{2i}\right) $ and $\left( \Gamma _{1i},\Gamma _{2i}\right) $ as follows:

Experiment 1 with uncorrelated factor loadings and the full rank in which case $\gamma _{1i}\sim iidN(1,1)$, $\gamma _{2i}\sim iidN(0,1)$, $\Gamma _{1i}\sim iidN(0,1)$, $\Gamma _{2i}\sim iidN(1,1)$ such that $E\left( \begin{array}{cc} \gamma _{1i} &{}\quad \gamma _{2i}\\ \Gamma _{1i} &{}\quad \Gamma _{2i} \end{array} \right) =\left( \begin{array}{cc} 1 &{}\quad 0\\ 0 &{}\quad 1 \end{array} \right) .$
Experiment 2 with uncorrelated factor loadings and the rank deficiency in which case $\gamma _{1i}\sim iidN(1,1)$, $\gamma _{2i} \sim iidN(0,1)$, $\Gamma _{1i} \sim iidN(1,1)$, $\Gamma _{2i}\sim iidN(0,1)$, such that $E\left( \begin{array}{cc} \gamma _{1i} &{}\quad \gamma _{2i}\\ \Gamma _{1i} &{}\quad \Gamma _{2i} \end{array} \right) =\left( \begin{array}{cc} 1 &{}\quad 0\\ 1 &{}\quad 0 \end{array}\right) .$
Experiment 3 with correlated factor loadings and the full rank in which case $\gamma _{1i}=\gamma _{1}+\upsilon _{1i}$, $\gamma _{2i}=\gamma _{2} +\upsilon _{2i}$, $\Gamma _{1i}=\Gamma _{1}+\upsilon _{1i}$, and $\Gamma _{2i}=\Gamma _{2}+\upsilon _{2i}$ with $\gamma _{1}=1$, $\gamma _{2}=0$, $\Gamma _{1} =2$, $\Gamma _{2}=1$ and $\left( \upsilon _{1i},\upsilon _{2i}\right) \sim iidN(0,I_{2})$, such that $E\left( \begin{array}{cc} \gamma _{1i} &{}\quad \gamma _{2i}\\ \Gamma _{1i} &{}\quad \Gamma _{2i} \end{array} \right) =\left( \begin{array}{cc} 1 &{}\quad 0\\ 2 &{}\quad 1 \end{array} \right) .$
Experiment 4 with correlated factor loadings and the rank deficiency in which case $\gamma _{1i}\sim iidN(1,1)$, $\gamma _{2i}\sim iidN(0,1)$, $\gamma _{1i}=\Gamma _{1i}$ and $\gamma _{2i}=\Gamma _{2i}$ such that $E\left( \begin{array}{cc} \gamma _{1i} &{}\quad \gamma _{2i}\\ \Gamma _{1i} &{}\quad \Gamma _{2i} \end{array} \right) =\left( \begin{array}{cc} 1 &{}\quad 0\\ 1 &{}\quad 0 \end{array} \right) $.

We specify the main slope parameter as $\beta _{i} = 1 + \eta _{i}$, $\eta _{i} \sim iidN \left( 0,0.04\right) $, and consider the following combinations of $\left( N,T\right) = 20,30,50,100,200$, setting the number of replications at $R=1,000$.^{Footnote 12}
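The design in (28)–(29) and the four experiments can be sketched in a few lines of standard-library Python. This is an illustrative reading of the design rather than the code used for the reported results: it assumes that the second argument of $N(\cdot ,\cdot )$ is a variance (so $\eta _{i}$ has standard deviation 0.2), and the function names `draw_loadings` and `simulate_panel` are hypothetical.

```python
import random


def draw_loadings(experiment, rng):
    """Return (gamma_1i, gamma_2i, Gamma_1i, Gamma_2i) for one unit."""
    if experiment == 1:  # uncorrelated loadings, full rank
        return rng.gauss(1, 1), rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(1, 1)
    if experiment == 2:  # uncorrelated loadings, rank deficiency
        return rng.gauss(1, 1), rng.gauss(0, 1), rng.gauss(1, 1), rng.gauss(0, 1)
    if experiment == 3:  # loadings share the shocks (v1, v2), full rank
        v1, v2 = rng.gauss(0, 1), rng.gauss(0, 1)
        return 1 + v1, 0 + v2, 2 + v1, 1 + v2
    if experiment == 4:  # identical loadings in both equations, rank deficiency
        g1, g2 = rng.gauss(1, 1), rng.gauss(0, 1)
        return g1, g2, g1, g2
    raise ValueError("experiment must be 1, 2, 3 or 4")


def simulate_panel(n_units, n_periods, experiment, seed=0):
    """Return (y, x) as lists of n_units lists, each of length n_periods."""
    rng = random.Random(seed)
    # Common factors, shared by all units.
    f1 = [rng.gauss(0, 1) for _ in range(n_periods)]
    f2 = [rng.gauss(0, 1) for _ in range(n_periods)]
    y, x = [], []
    for _ in range(n_units):
        g1, g2, big_g1, big_g2 = draw_loadings(experiment, rng)
        beta_i = 1 + rng.gauss(0, 0.2)  # standard deviation 0.2, variance 0.04
        x_i, y_i = [], []
        for t in range(n_periods):
            x_it = big_g1 * f1[t] + big_g2 * f2[t] + rng.gauss(0, 1)  # (29)
            y_it = beta_i * x_it + g1 * f1[t] + g2 * f2[t] + rng.gauss(0, 1)  # (28)
            x_i.append(x_it)
            y_i.append(y_it)
        x.append(x_i)
        y.append(y_i)
    return y, x
```

For example, `simulate_panel(20, 30, experiment=3)` would return one replication for $N=20$, $T=30$ with correlated loadings and full rank.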

4.3 The small sample performance of FE, CCE and PC estimators

We examine the finite sample performance of the following estimators: the two-way fixed effect (FE) estimator, $\hat{\beta }_{FE}$, the CCE estimator by Pesaran (2006), $\hat{\beta }_{CCE}$, and the bias-corrected PC estimator proposed by CHNY, $\hat{\beta }_{PC}$. We consider both pooled and mean group estimators except for $\hat{\beta }_{PC}$ (see Appendix 7 for details). Notice that consistency of the PC estimator depends crucially upon correctly selecting the number of unobserved factors (Moon and Weidner 2015). To address the uncertainty associated with the selection criteria, we initially consider two information criteria proposed by Bai and Ng (2002), denoted $IC_{p1}$ and $AIC_{1}$. Overall, we find that the PC estimator using $IC_{p1}$ outperforms that with $AIC_{1}$, and we only report the results based on $IC_{p1}$.

We report the following summary statistics:

Bias: $\hat{\beta }_{R}-\beta _{0}$, where $\beta _{0}$ is a true parameter value and $\hat{\beta }_{R}=R^{-1}\sum _{r=1}^{R}\hat{\beta }_{r}$ is the mean coefficient across R replications.
RMSE: the root mean square error estimated by $\sqrt{R^{-1} \sum _{r=1}^{R} \left( \hat{\beta }_{r}-\beta _{0}\right) ^{2}}.$
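The two summary statistics above can be computed as in the following minimal sketch, where `estimates` is a hypothetical list holding the slope estimate from each of the R replications.

```python
import math


def bias_and_rmse(estimates, true_value):
    """Return (bias, rmse) of the estimates across replications."""
    r = len(estimates)
    # Bias: mean estimate across replications minus the true value.
    bias = sum(estimates) / r - true_value
    # RMSE: square root of the mean squared deviation from the true value.
    rmse = math.sqrt(sum((b - true_value) ** 2 for b in estimates) / r)
    return bias, rmse
```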

Table 1 shows the simulation results for Experiment 1 with the full rank and uncorrelated factor loadings. The biases of all estimators are mostly negligible even in small samples with the FE performing slightly worse than other estimators when $N=20$. The results for RMSEs display qualitatively similar patterns. RMSEs of CCE and PC estimators are lower than those of the FE and decline as N or T grows. On the other hand, the RMSE of the FE estimator improves only with N. Finally, biases and RMSEs of the pooled and mean group estimators display almost identical patterns. The relative performance of FE, CCE and PC estimators is generally in line with the simulation results reported in Chudik et al. (2011), Sarafidis and Wansbeek (2012) and CHNY.

The important exception is the poor performance of the PC estimator using $AIC_{1}$.^{Footnote 13} In this case the biases are substantial in small samples and decline only if both N and T become large. Further, the RMSEs are much larger than those of the other estimators and decrease only if N and T are large. This demonstrates how strongly the estimated number of factors influences the PC estimator. Given that the performance of information criteria is highly variable, this is a problematic issue for PC estimators, in which case the FE estimator can serve as an operational alternative.

Table 1 Simulation results for Experiment 1 with uncorrelated loadings and the full rank

Table 2 presents simulation results for Experiment 2 where factor loadings are uncorrelated but the rank condition is violated. The performance of the CCE estimator deteriorates slightly: both its bias and its RMSE are higher than in the case with the full rank. Its performance improves slowly and with N only, suggesting that the rank deficiency may slow down its convergence. On the other hand, the bias and the RMSE of the PC and FE estimators do not appear to be affected by the rank deficiency. Finally, we find that the mean group estimator performs slightly better than the pooled estimator in small samples.

Table 3 shows the results for Experiment 3 with correlated loadings and full rank. Now, only the FE estimator is severely biased. The biases of the CCE estimator are not negligible for small N, but its performance improves sharply with N, a finding consistent with Westerlund and Urbain (2013), who note that ‘the problem with correlated loadings goes away if the rank condition is satisfied’. The overall performance of the PC estimator is qualitatively similar to the previous cases, confirming that it remains consistent as both N and T grow.

Table 4 presents the simulation results for Experiment 4 with correlated loadings and the rank deficiency. Both CCE and FE estimators are severely biased, confirming our theoretical prediction that both estimators are inconsistent in the presence of correlated factor loadings, as also discussed in Sarafidis and Wansbeek (2012) and Westerlund and Urbain (2013). On the other hand, the performance of the PC estimator is qualitatively similar to that presented in Table 2.

Overall, our results show that, when the factor loadings are uncorrelated, all the estimators show a similar and satisfactory performance, suggesting that the FE estimator can produce reliable results even in the presence of IE. When factor loadings are correlated, however, the FE estimator becomes severely biased and the performance of the CCE estimator tends to worsen. The performance of the CCE estimator improves with N only under the full rank condition. The performance of the bias-corrected PC estimator is qualitatively similar across all four experiments.

Table 2 Simulation results for Experiment 2 with uncorrelated loadings and the rank deficiency

Table 3 Simulation results for Experiment 3 with correlated loadings and the full rank

Table 4 Simulation results for Experiment 4 with correlated loadings and the rank deficiency

Table 5 Size and power of the $H^{\textrm{NON}}$ statistic and coverage rates at 95% level for heterogeneous $\beta $s, $\beta _{i}=1+\eta _{i}$, $\eta _{i}\sim iidN(0,0.04)$ and no serial correlation

4.4 The performance of the Hausman-type test statistic

We examine the small sample performance of the H test statistics under the above four experiments, considering the following combinations of $\left( N,T\right) =50,100,150,200,500$. To construct the H statistic, we take the difference between the FE estimator, $\hat{\varvec{\beta }}_{FE}$, and the bias-corrected PC estimator, $\hat{\varvec{\beta }}_{PC}$, standardised by each of the two versions of the robust variance estimator, denoted NON and HAC.^{Footnote 14} We examine the size and power of the H statistic, and we also report the coverage rates for the three estimators. We consider slope heterogeneity of the form $\beta _{i} = \beta + \eta _{i}$, $\eta _{i} \sim N(0,0.04)$, and serially correlated errors given by

$$\begin{aligned} \varepsilon _{it} = \rho _{\varepsilon } \varepsilon _{i,t-1} + v_{\varepsilon _{it}} \text { and } u_{it} = \rho _{u} u_{i,t-1} + v_{uit} \text { with } \rho _{\varepsilon } =\rho _{u}=0 \text { or } 0.5, \end{aligned}$$

where $\left( v_{\varepsilon _{it}}, v_{uit} \right) ^{\prime }$ are drawn from the bivariate normal distribution with zero means and covariance matrix, $ diag \left( \sigma _{v_{\varepsilon i}}^{2}, \sigma _{v_{u i}}^{2}\right) $ = $\varvec{I}_2$. Hence, we examine the following two cases:

Case 1: Heterogeneous $\beta $s and no serial correlation; see Tables 5 and 6.

Case 2: Heterogeneous $\beta $s and serial correlation; see Tables 7 and 8.
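The serially correlated errors in Case 2 can be generated as in the sketch below. The initialisation of the AR(1) recursion is not specified above, so the burn-in period used here is an assumption made for illustration, as are the function and argument names.

```python
import random


def ar1_series(n_periods, rho, rng, burn_in=50):
    """Generate e_t = rho * e_{t-1} + v_t with v_t ~ N(0, 1).

    The first burn_in draws are discarded (an assumed initialisation,
    starting the recursion from zero).
    """
    value = 0.0
    series = []
    for t in range(burn_in + n_periods):
        value = rho * value + rng.gauss(0, 1)
        if t >= burn_in:
            series.append(value)
    return series
```

Setting `rho=0.0` reproduces the serially uncorrelated errors of Case 1, while `rho=0.5` corresponds to Case 2; the same function could be used for both $\varepsilon _{it}$ and $u_{it}$ with independent draws.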

Overall, the performance of the H statistics reported in Tables 5, 6, 7 and 8 is satisfactory and qualitatively similar in terms of empirical size and power. This confirms that all the estimators are consistent under the null with and without serial correlation. Furthermore, the satisfactory coverage rates of the three estimators demonstrate that both the nonparametric and the HAC variance estimators are also robust to serial correlation.

Table 6 Size and power of the $H^{HAC}$ statistic and coverage rates at 95% level for heterogeneous $\beta $s, $\beta _{i}=1+\eta _{i}$, $\eta _{i}\sim iidN(0,0.04)$ and no serial correlation

In Experiments 1 and 2, the sizes of both $H^{NON}$ and $H^{HAC}$ tests approach the nominal level (0.05) in most cases as the sample size rises. The power of the H test is always one under Experiments 3 and 4. In particular, when the regressors are uncorrelated with factor loadings, $\hat{\varvec{\beta }}_{FE}$ is shown to be consistent and its coverage rate reaches the nominal 95% in Experiments 1 and 2, irrespective of the rank condition. In Experiments 3 and 4, when loadings are correlated with the regressor, however, $\hat{\varvec{\beta }}_{FE}$ is significantly biased and displays a zero coverage rate. The coverage rates of the bias-corrected PC estimator tend to 95% under all four experiments.^{Footnote 15}
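For reference, empirical size, power and coverage rates of the kind discussed above can be computed as in the following sketch. It assumes a single regressor, so that the 5% critical value of $\chi _{1}^{2}$ (approximately 3.841) and the normal quantile 1.96 apply; the function names and inputs are hypothetical.

```python
def rejection_frequency(h_stats, critical_value=3.841):
    """Share of replications in which H exceeds the critical value.

    Under the null this is the empirical size; under the alternative
    it is the empirical power.
    """
    return sum(h > critical_value for h in h_stats) / len(h_stats)


def coverage_rate(estimates, std_errors, true_value, z=1.96):
    """Share of replications whose 95% interval contains the true value."""
    covered = sum(
        abs(b - true_value) <= z * se for b, se in zip(estimates, std_errors)
    )
    return covered / len(estimates)
```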

We have also considered the cases with homogeneous $\beta $’s and obtained qualitatively similar results, which are reported in the Online Supplement.

4.5 The pretest estimator

The estimated number of factors can influence the performance of the PC estimator considerably, and this issue needs to be handled carefully. The previous literature has not provided clear evidence on what is the best course of action to choose the number of factors. In this regard, we propose a pretest estimator which is constructed as follows. The pretest estimator, denoted $\hat{\beta }_{pretest}$, selects either the FE or the PC estimator depending on the Hausman-type test results. To be more specific, we first evaluate the $H^{NON}$ and $H^{HAC}$ statistics. If the null hypothesis, (3) is not rejected, then we select $\hat{\beta }_{pretest} =\hat{\beta }_{FE}$ while, if the null is rejected, we set $\hat{\beta }_{pretest} =\hat{\beta }_{PC}$.
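The selection rule just described can be written as a short sketch. The 5% significance level, and hence the $\chi _{1}^{2}$ critical value of roughly 3.841 for a single regressor, is an assumption made for illustration, and the function name is hypothetical.

```python
def pretest_estimate(b_fe, b_pc, h_stat, critical_value=3.841):
    """Return (estimate, label): FE if the null is not rejected, PC otherwise."""
    if h_stat > critical_value:
        # Null of uncorrelated loadings rejected: fall back on the PC estimator.
        return b_pc, "PC"
    # Null not rejected: keep the simpler FE estimator.
    return b_fe, "FE"
```

With `h_stat` computed from either $H^{NON}$ or $H^{HAC}$, a call such as `pretest_estimate(0.98, 1.01, 0.7)` would keep the FE estimate.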

In the Online Supplement we have examined the finite sample performance of this pretest estimator under the same four experiments considered above. Its overall performance is satisfactory in terms of bias and RMSE, irrespective of whether factor loadings are correlated or not. This suggests that such an estimator has considerable potential as it alleviates the issue of selecting the number of factors, especially in the case where the regressors are found to be uncorrelated with factor loadings in practice.

5 Empirical applications

We investigate the empirical relevance of the null hypothesis of no correlation between the regressors and factor loadings by applying our proposed statistic $H^{HAC}$, defined in (25), to fourteen datasets.^{Footnote 16} The details of the data and the empirical specifications are provided in Appendix 8.

Table 7 Size and power of the $H^{NON}$ statistic and coverage rates at 95% level for heterogeneous $\beta $s, $\beta _{i}=1+\eta _{i}$, $\eta _{i}\sim iidN(0,0.04)$ and serial correlation, $\varepsilon _{it}=\rho _{\varepsilon }\varepsilon _{i,t-1}+v_{\varepsilon it}$, $u_{it}=\rho _{u}u_{i,t-1}+v_{uit}$, $\rho _{\varepsilon }=\rho _{u}=0.5$

The Cobb–Douglas production function. The first application comprises five different cases: the OECD members ($N=26$, $T=41$, Mastromarco et al. 2016), the 20 Italian regions ($N=20$, $T=21$), the 48 U.S. States ($N=48$, $T=17$) and the aggregate sectorial data for manufacturing from developed and developing countries ($N=25$, $T=25$). Following the economic growth literature, we estimate the Cobb–Douglas production function by the FE and PC estimators and then apply our proposed Hausman-type test. For the OECD, the output is measured by the per capita GDP while the regressor is the capital-labour ratio. For the Italian regions, output is the per capita value added, while for the U.S. application the output is the per capita gross State product, with the same regressor. In the fourth application, the output is measured as the aggregated manufacturing sector value-added of OECD countries, see Eberhardt and Teal (2019). In the fifth application, the production function is augmented by the R&D stock expenditure, and the output is the aggregate sectorial value added for manufacturing, see Eberhardt et al. (2013).

The gravity model of bilateral trade flows. Next, we consider the estimation of a gravity model of the bilateral trade flows for the EU14 countries, counting $N=91$ pairs from 1960 to 2008 ($T=49$). Here, we follow Serlenga and Shin (2007) and estimate the gravity panel data regression, in which the bilateral trade flow is set as a function of GDP, countries’ similarity, relative factor endowment, the real exchange rate as well as the trade union and common currency dummies.

The gasoline demand function. This application aims at estimating the price and income elasticity of gasoline demand. In particular, we focus on estimating the demand function for gasoline using the data from Liu (2014), which contains quarterly data for the 50 States in the U.S. over the period 1994–2008 ($N=50$, $T=60$).

Housing prices. We estimate the income elasticity of real housing prices from 1975 to 2010. We consider two datasets: the first, from Holly et al. (2010), covers the 49 U.S. States ($N=49$, $T=36$), while the second covers the 384 Metropolitan Statistical Areas ($N=384$, $T=36$) obtained from Baltagi and Li (2014).

Technological spillovers on productivity. We consider two applications. First, we estimate the effects of domestic and foreign R&D on TFP controlling for the human capital. We use a balanced panel of 24 OECD countries over the period 1971–2004 ($N=24$ and $T=34$), see Coe et al. (2009) and Ertur and Musolesi (2017). In the second application we explore the channels through which technological investments affect the productivity performance of industrialised economies by estimating the productivity effects of R&D and Information and Communication Technologies (ICT), controlling for the inputs accumulation as labour and (non-ICT) capital for OECD industries. We use a balanced panel of 49 high-tech industries over the period 1977–2006 ($N=53$ and $T=30$) from Pieri et al. (2018).

Health care expenditure and income. We estimate the relationship between healthcare expenditure and income after controlling for public expenditure over total health expenditure. We consider a panel of 167 countries covering the period 1995–2012 ($N=167$ and $T=18$), see Baltagi et al. (2017).

Demographic and business cycle volatility. We estimate the impact of the age composition of the labor force on business cycle volatility. We employ a balanced panel dataset for 51 countries over the period 1957–2000 ($N=51$ and $T=44$) provided by Everaert and Vierke (2016).

Carbon emissions and trade. We explore the nexus between carbon emissions and trade using a balanced panel of 32 OECD countries over the period 1990–2013 ($N=32$ and $T=24$), see Liddle (2018).

In Table 9, we present the estimation and test results. First of all, the $H^{HAC}$ test results provide surprisingly convincing evidence that the null hypothesis of the regressors being uncorrelated with factor loadings is not rejected (even at 1% significance level) in thirteen out of fourteen datasets considered.^{Footnote 17} We also report the results for the CD test proposed by Pesaran (2015), which tests the null of no (weak) CSD against the alternative of strong CSD, and for the Hausman tests proposed by Bai (2009), $H_{B}$ in (16), and by Westerlund (2019b), $H_{W}$, which test the null of additive effects against the alternative of IE. The CD test strongly rejects the null hypothesis for all the datasets, whilst the $H_{B}$ test rejects the null hypothesis of the additive-effects model only once, at the 10% significance level, and the $H_{W}$ test rejects it three times. These test results are rather in conflict, since the former suggests the presence of CSD while the latter suggest no IE. As highlighted in Sect. 2, however, the rejection by the CD test does not always imply that the FE estimator is biased in panels with IE. Further, in Sect. 3, we show that the $H_{B}$ test has no power against the alternative model with IE, especially if the regressors are uncorrelated with factor loadings. Indeed, such conflicting results can provide support for our main test results that the regressors are indeed uncorrelated with factor loadings in the panels with IE.

Table 8 Size and power of the $H^{HAC}$ statistic and coverage rates at 95% level for heterogeneous $\beta $s, $\beta _{i}=1+\eta _{i}$, $\eta _{i}\sim iidN(0,0.04)$ and serial correlation, $\varepsilon _{it}=\rho _{\varepsilon }\varepsilon _{i,t-1}+v_{\varepsilon it}$, $u_{it}=\rho _{u}u_{i,t-1}+v_{uit}$, $\rho _{\varepsilon }=\rho _{u}=0.5$

Next, we turn to the slope estimates provided by the FE and PC estimators, and find that they are mostly significant. Their magnitudes and signs are relatively similar to each other, and consistent with theoretical predictions. The only exception is reported in the gravity model of international trade.^{Footnote 18}

Combining the above test and estimation results, we conclude that, in practice, the regressors are uncorrelated with factor loadings in many cross-sectionally correlated panels with IE. In this situation, FE estimation produces a consistent estimator. We emphasise that the FE estimator is unaffected by two complications that can significantly impair the performance of PC estimators: an incorrectly selected number of unobserved factors (Moon and Weidner 2015), and inconsistent initial estimates, under which the convergence of the iterative PC estimator is not guaranteed (Hsiao 2018). This suggests that the FE estimator remains widely applicable to cross-sectionally correlated panel data with IE, especially if the regressors are found to be uncorrelated with factor loadings, which can be easily verified by our proposed test.
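The two complications mentioned above can be seen in a stylised version of the iterative PC idea: alternate between extracting factors from the current residuals and re-estimating the slope after projecting the factors out. The sketch below is a simplified illustration for a single regressor, without bias correction. The function name `iterative_pc`, its arguments and the `(N, T)` array layout are assumptions, and it is not the implementation used in the paper.

```python
import numpy as np

def iterative_pc(y, x, r, beta0=0.0, max_iter=500, tol=1e-8):
    """Stylised iterative PC estimate of a common slope (single regressor).

    y, x  : (N, T) arrays for a balanced panel (hypothetical layout).
    r     : number of factors assumed by the user.
    beta0 : starting value for the slope.
    """
    t = y.shape[1]
    beta = float(beta0)
    for _ in range(max_iter):
        w = y - beta * x                       # residuals given the current slope
        # eigh returns eigenvalues in ascending order, so the last r
        # eigenvectors of the T x T matrix W'W span the estimated factor space.
        _, vecs = np.linalg.eigh(w.T @ w)
        f_hat = vecs[:, -r:]                   # (T, r), orthonormal columns
        m_f = np.eye(t) - f_hat @ f_hat.T      # projection off the factor space
        xm = x @ m_f                           # row i holds (M_F x_i)'
        beta_new = float((xm * y).sum() / (xm * x).sum())
        if abs(beta_new - beta) < tol:
            beta = beta_new
            break
        beta = beta_new
    return beta, f_hat
```

Both `r` and `beta0` must be supplied by the user, and the loop has no guarantee of converging to the right point from a poor starting value. The two-way FE estimator requires neither input.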

6 Conclusions

A large strand of the literature on panel data has focused on analysing CSD, based on the error components model with IE, which is implicitly understood to bias the conventional two-way FE estimator, due to the potential endogeneity arising from the correlation between regressors and factors/loadings. Two main approaches have been advocated to deal with this issue: the CCE estimator by Pesaran (2006) and the PC estimator by Bai (2009).

Table 9 Empirical applications to fourteen different datasets


In this paper we have shown that the panel data model with IE can be encompassed by the standard two-way error components model if the regressors are correlated with factors but uncorrelated with the loadings. This makes the absence of correlation between the regressors and factor loadings an influential but under-appreciated feature of the panel data model with IE. We propose a Hausman-type test, which follows the $\chi ^{2}$ distribution asymptotically under the null hypothesis. Monte Carlo simulation results confirm that the size and power of the proposed test are quite satisfactory even in small samples.
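A small simulation illustrates the encompassing result. The sketch below generates a panel with one factor in which the regressor always loads on the factor, and compares the two-way FE estimate when the loadings in the error are independent of, or correlated with, the loadings in the regressor. The design (one factor, one regressor, unit variances, true slope of 1) and the names `two_way_demean` and `simulate_fe` are illustrative assumptions, not the Monte Carlo design of Sect. 4.

```python
import numpy as np

def two_way_demean(a):
    """Two-way within transformation of an (N, T) array."""
    return a - a.mean(axis=1, keepdims=True) - a.mean(axis=0, keepdims=True) + a.mean()

def simulate_fe(n=400, t=50, beta=1.0, correlated=False, seed=0):
    """Two-way FE slope estimate in a one-factor panel with interactive effects."""
    rng = np.random.default_rng(seed)
    f = rng.normal(1.0, 1.0, size=t)            # common factor with nonzero mean
    big_gamma = rng.normal(1.0, 1.0, size=n)    # loadings in the regressor equation
    noise = rng.normal(0.0, 1.0, size=n)
    # Loadings in the error: tied to big_gamma only in the correlated case.
    gamma = big_gamma + 0.5 * noise if correlated else 1.0 + noise
    x = np.outer(big_gamma, f) + rng.normal(size=(n, t))
    y = beta * x + np.outer(gamma, f) + rng.normal(size=(n, t))
    xd, yd = two_way_demean(x), two_way_demean(y)
    return float((xd * yd).sum() / (xd * xd).sum())
```

In this design `simulate_fe(correlated=False)` should lie close to the true slope of 1, whereas `simulate_fe(correlated=True)` should be clearly biased upwards, even though the regressor is correlated with the factor in both cases.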

Finally, we apply the proposed tests to a number of existing panel datasets, and find strong evidence in favour of the regressors being uncorrelated with factor loadings in thirteen of fourteen datasets. In this situation, the FE estimator provides a simple and robust estimation strategy in practice by avoiding nontrivial computational issues associated with the PC estimator, the performance of which relies crucially upon applying complex bias corrections and using reliable information criteria that correctly select the number of unobserved factors.

We conclude by noting a couple of avenues for future research. A natural but challenging extension is to develop an LM-type test that does not require computing the PC estimator at all. It would also be worthwhile to develop a Hausman-type test for the dynamic heterogeneous panel data model with IE.

Notes

Hsiao (2018) argues that the assumption of zero mean for $\varvec{\gamma }_{i}$ or $\varvec{f}_{t}$ often used as normalization conditions, is not innocuous. With the mean zero assumption for $\varvec{\gamma }_{i}$, the cross-sectional mean equation of (1)
$$\begin{aligned} \bar{y}_{t}=\varvec{\beta }^{\prime }\varvec{\bar{x}}_{t} + \bar{\varepsilon }_{t},\quad t=1,\ldots ,T \end{aligned}$$
no longer involves $\varvec{\gamma }_{i}^{\prime }\varvec{f}_{t}$, where $\bar{y}_{t}=N^{-1}\sum _{i=1}^{N}y_{it}$, and similarly for $\varvec{\bar{x}}_{t}$ and $\bar{\varepsilon }_{t}$. Thus the least squares regression of $\bar{y}_{t}$ on $\varvec{\bar{x}}_{t}$ is consistent and asymptotically normally distributed as $T\rightarrow \infty $. Similarly, under the mean zero assumption for $\varvec{f}_{t}$, the individual time series mean equation
$$\begin{aligned} \bar{y}_{i} = \varvec{\bar{x}}_{i}^{\prime }\varvec{\beta } + \bar{\varepsilon }_{i},\quad i=1,\ldots ,N, \end{aligned}$$
does not involve $\varvec{\gamma }_{i}^{\prime }\varvec{f}_{t}$, where $\bar{y}_{i}=T^{-1}\sum _{t=1}^{T}y_{it}$, and similarly for $\varvec{\bar{x}}_{i}$ and $\bar{\varepsilon }_{i}$. The least squares regression of $\bar{y}_{i}$ on $\varvec{\bar{x}}_{i}$ can yield a consistent and asymptotically normally distributed estimator of $\varvec{\beta }$ if N is large. In this regard, we consider the general case with $E\left( \varvec{\gamma }_{i}\right) \not =0$ and $E\left( \varvec{f}_{t}\right) \not =0$.
This may suggest that the additive case with $\varvec{\gamma }_{i} = \left( \alpha _{i},1\right) ^{\prime }$ and $\varvec{f}_{t} =\left( 1,\theta _{t}+\mu \right) ^{\prime }$ may not always be the special case of the interactive effects, $\varvec{\gamma }_{i}^{\prime }\varvec{f}_{t}$, as argued by Bai (2009), because we have both 2-way effects, $\alpha _{i} +\theta _{t}$ and zero-mean interactive effects, $\varvec{\mathring{\gamma }}_{i}^{\prime }\varvec{\dot{f}}_{t}$ in (6).
More precisely, we obtain: $\ddot{y}_{it} = \varvec{\beta }_{i}'\varvec{\ddot{x}}_{it} + \ddot{u}_{it} + \ddot{\zeta }_{it}$ where $\ddot{\zeta }_{it} = (1 - \frac{1}{N}) \sum _{i=1}^{N} \varvec{\eta }_{i}'(\varvec{{x}}_{it}- \varvec{\bar{x}}_{i.})$. Under the assumption that $\varvec{\eta }_{i}$ and $\varvec{x}_{it}$ are uncorrelated, it follows that $\ddot{\zeta }_{it} \rightarrow _{p} 0$. When evaluating $\ddot{\zeta }_{it}$ numerically for the different combinations of (N, T), say $(N,T) = \{(25,25), (50,50), (100,100)\}$, we find that its value is almost zero in all cases.
Pesaran (2006) implicitly assumes that the factor loadings $\varvec{\gamma }_{i}$ in (1) and $\varvec{\Gamma }_{i}$ in (2), are uncorrelated. Bai (2009) discusses this implication in detail, and shows via simulations that the CCE estimator is biased when $\varvec{x}_{it}$ is correlated with both $\varvec{\lambda }_{i}$ and $\varvec{f}_{t}$. See also Westerlund and Urbain (2013) for the simulation evidence showing that the CCE estimator performs poorly when factor loadings are correlated. Remark 2 in Westerlund and Urbain (2013) also questions the uncorrelated factor loadings assumption by arguing that a common shock that has a positive effect on savings, should have negative effects on interest rates. However, their discussion relates to the sign of the average effect of common shocks or the sign of the cross-section mean of loadings. Since the independence assumption does not restrict the sign of these means, the relevance of such a relaxation would be somewhat questionable.
Nonetheless, the CD test may be used as a model-selection tool, with a reduction in the absolute value of the CD test statistic typically being interpreted as an indication of an improved model specification.
Focusing on the special cases, Castagnetti et al. (2015) propose two tests for the null of no factor structure: one for the null that factor loadings are cross sectionally homogeneous, and another for the null that common factors are homogeneous over time. Using extremes of the estimated loadings and common factors, they show that their statistics follow an asymptotic Gumbel distribution under the null. Furthermore, they show that the average-type statistics diverge under the null while the Hausman test is inconsistent.
See Sections 3.2 and 9 in Bai (2009) for details.
We investigate this issue by examining the finite sample performance of the CD and the $H_{B}$ tests for the heterogeneous panel data with the multiplicative IE. We allow the regressors to be uncorrelated with loadings under Experiment 1 while they are correlated under Experiment 2. In both experiments we maintain that the regressors are correlated with factors. As expected, the CD test strongly rejects the null of weak residual CSD for all the data generating processes (DGPs), correctly indicating the presence of IE under both experiments. However, we find that the $H_{B}$ test is consistent only under Experiment 2. Surprisingly, the $H_{B}$ test does not display any power against the DGP under Experiment 1, where its rejection probability is close to, and sometimes lower than, the nominal size, especially in the presence of serially correlated errors. This is mainly because the FE estimator is still consistent under Experiment 1.
The asymptotic validity of these estimators is verified in Theorem 2. Through the stochastic simulations (see Sect. 4), we find that the finite sample performance of both estimators is satisfactory and close to each other. Given the popularity of the HAC variance estimator, we propose to apply it in practice (see Sect. 5).
Following Bai (2009), we have also employed the analytic (sandwich-form) variance estimator of $\varvec{V}$, taking into account an unknown form of heteroscedastic and autocorrelated errors. After conducting preliminary simulations, we find that the two robust estimators perform more satisfactorily.
The DGP and the estimators are not identical to those proposed by Pesaran (2006) and Bai (2009).
We have considered the cases with homogeneous $\beta $’s and obtained qualitatively similar results, which are reported in the Online Supplement. We have also explored the experiments under stronger parameter heterogeneity by generating $\eta _{ik} \sim iidN(0,0.25)$ as well as $\eta _{ik} \sim iidN(0,1)$, and still obtained qualitatively similar results (unreported to save space). Finally, we obtained qualitatively similar results for serially correlated factors. These results are available upon request.
For a complete comparison we report the simulation results based on $AIC_{1}$ in the Online Supplement.
In what follows, we apply the bias corrected PC estimators using $IC_{p1}$. We have also investigated the performance of the H statistics using the uncorrected PC estimators, and obtained qualitatively similar results.
We have also examined the coverage rates for estimators using the analytic variance estimator derived in Bai (2009, Section 9). We find that, when errors are serially correlated and conditionally heteroscedastic and/or $\beta $s are heterogeneous, coverage rates of the PC estimators are mostly well below the nominal level. Similar findings are reported in Chudik et al. (2011) and Sarafidis and Wansbeek (2012). This demonstrates the importance of using robust variance estimators for reliable inference.
The evidence provided by $H_{NON}$ is qualitatively similar, as shown in Appendix 10.
The results of $H_{NON}$ are shown in Appendix 10 and Table 10.
Notice that FE estimation tends to produce a substantially large coefficient on GDP, which has been widely reported in the literature.

References

Bai J (2009) Panel data models with interactive fixed effects. Econometrica 77(4):1229–1279
Bai J, Ng S (2002) Determining the number of factors in approximate factor models. Econometrica 70(1):191–221
Baltagi BH, Li J (2014) Further evidence on the spatio-temporal model of house prices in the United States. J Appl Econom 29(3):515–522
Baltagi BH, Lagravinese R, Moscone F, Tosetti E (2017) Health care expenditure and income: a global perspective. Health Econ 26(7):863–874
Castagnetti C, Rossi E, Trapani L (2015) Testing for no factor structures: on the use of Hausman-type statistics. Econ Lett 130:66–68
Charbonneau KB (2017) Multiple fixed effects in binary response panel data models. Econom J 20(3):S1–S13
Chudik A, Pesaran MH (2015) Common correlated effects estimation of heterogeneous dynamic panel data models with weakly exogenous regressors. J Econom 188(2):393–420
Chudik A, Pesaran MH, Tosetti E (2011) Weak and strong cross section dependence and estimation of large panels. Econom J 14(1):C45–C90
Coakley J, Fuertes AM, Smith R (2006) Unobserved heterogeneity in panel time series models. Comput Stat Data Anal 50(1):2361–2380
Coe DT, Helpman E, Hoffmaister AW (2009) International R&D spillovers and institutions. Eur Econ Rev 53(7):723–741
Cui G, Hayakawa K, Nagata S, Yamagata T (2019) A robust approach to heteroskedasticity, error serial correlation and slope heterogeneity for large linear panel data models with interactive effects. ISER DP No. 1037
Davidson J (1994) Stochastic limit theory. Oxford University Press, Oxford
Eberhardt M, Teal F (2019) The magnitude of the task ahead: macro implications of heterogeneous technology. Rev Income Wealth 66:334–360
Eberhardt M, Helmers C, Strauss H (2013) Do spillovers matter when estimating private returns to R&D? Rev Econ Stat 95(2):436–448
Ertur C, Musolesi A (2017) Weak and strong cross-sectional dependence: a panel data analysis of international technology diffusion. J Appl Econom 32(3):477–503
Everaert G, Vierke H (2016) Demographics and business cycle volatility: A spurious relationship? J Appl Econom 31(7):1467–1477
Fernandez-Val I, Weidner M (2016) Individual and time effects in nonlinear panel models with large N, T. J Econom 192(1):291–312
Hausman JA (1978) Specification tests in econometrics. Econometrica 46(6):1251–1271
Holly S, Pesaran MH, Yamagata T (2010) A spatio-temporal model of house prices in the USA. J Econom 158(1):160–173
Hsiao C (2018) Panel models with interactive effects. J Econom 206(2):645–673
Jaimovich N, Siu HE (2009) The young, the old, and the restless: demographics and business cycle volatility. Am Econ Rev 99(3):804–26
Juodis A, Reese S (2018) The incidental parameters problem in testing for remaining cross-section correlation. Papers arXiv:1810.03715
Kapetanios G, Pesaran MH (2005) Alternative approaches to estimation and inference in large multifactor panels: small sample results with an application to modelling of asset returns. CESifo working paper series 1416, CESifo Group Munich
Kapetanios G, Pesaran MH, Yamagata T (2011) Panels with non-stationary multifactor error structures. J Econom 160(2):326–348
Karabiyik H, Reese S, Westerlund J (2017) On the role of the rank condition in CCE estimation of factor-augmented panel regressions. J Econom 197(1):60–64
Liddle B (2018) Consumption-based accounting and the trade-carbon emissions nexus. Energy Econ 69:71–78
Liu W (2014) Modeling gasoline demand in the United States: a flexible semiparametric approach. Energy Econ 45:244–253
Mastromarco C, Serlenga L, Shin Y (2016) Modelling technical efficiency in cross sectionally dependent stochastic frontier panels. J Appl Econom 31(1):281–297
Moon HR, Weidner M (2015) Linear regression for panel with unknown number of factors as interactive fixed effects. Econometrica 83(4):1543–1579
Munnell A (1990) How does public infrastructure affect regional economic performance? Conf Ser [Proc] 34:69–112
Pesaran MH (2006) Estimation and inference in large heterogeneous panels with a multifactor error structure. Econometrica 74(4):967–1012
Pesaran MH (2015) Testing weak cross-sectional dependence in large panels. Econom Rev 34(6):1089–1117
Petrova Y, Westerlund J (2020) Fixed effects demeaning in the presence of interactive effects in treatment effects regressions and elsewhere. J Appl Econom 35:960–964
Pieri F, Vecchi M, Venturini F (2018) Modelling the joint impact of R&D and ICT on productivity: a frontier analysis approach. Res Policy 47(9):1842–1852
Sarafidis V, Wansbeek T (2012) Cross-sectional dependence in panel data analysis. Econom Rev 31(5):483–531
Sarafidis V, Yamagata T, Robertson D (2009) A test of cross section dependence for a linear dynamic panel model with regressors. J Econom 148(2):149–161
Serlenga L, Shin Y (2007) Gravity models of intra-EU trade: application of the CCEP-HT estimation in heterogeneous panels with unobserved common time-specific factors. J Appl Econom 22(2):361–381
Westerlund J (2019a) On estimation and inference in heterogeneous panel regressions with interactive effects. J Time Ser Anal 40(5):852–857
Westerlund J (2019b) Testing additive versus interactive effects in fixed-T panels. Econ Lett 174(C):5–8
Westerlund J, Urbain J-P (2013) On the estimation and inference in factor-augmented panel regressions with correlated loadings. Econ Lett 119(3):247–250
Westerlund J, Urbain J-P (2015) Cross-sectional averages versus principal components. J Econom 185(2):372–377


Author information

Authors and Affiliations

King’s College London, London, UK
George Kapetanios
University of Bari, Bari, Italy
Laura Serlenga
University of York, York, UK
Yongcheol Shin
European Commission, JRC, Ispra, Italy
Laura Serlenga


Corresponding author

Correspondence to Yongcheol Shin.

Ethics declarations

Conflict of interest

The authors declare no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

We are most grateful for the insightful comments by the Co-Editors and two anonymous referees, as well as to Jia Chen, In Choi, Matthew Greenwood-Nimmo, Namhyun Kim, Young Hoon Lee, Rui Lin, Patrick Saart, Ron Smith, Weining Wang, Joakim Westerlund, Takashi Yamagata, Chaowen Zheng, and seminar participants at King’s College London, University of Exeter and University of York for their helpful comments. A previous version of this paper was circulated as “Testing for Correlated Factor Loadings in Panels with Interactive Effects.” Shin acknowledges partial financial support from the ESRC (Grant Reference: ES/T01573X/1). The usual disclaimer applies.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file 1 (pdf 344 KB)

Appendices

Appendix: Proofs

1.1 Preliminary lemmas

We first provide two lemmas that extend the law of large numbers and the central limit theorem to martingale difference sequences in panel data. We define the concept of spatial martingale difference arrays as follows. Let $W_{i,T}$, for $i=1,\ldots ,N$ and $T=1,\ldots ,$ be arrays of matrices of random variables. Define the $\sigma $-field generated by $W_{j,T}$ for $j=1,\ldots ,N$, $j\ne i$, as $\mathcal {F}_{-i}$. Then, $W_{i,T}$ is a spatial martingale difference array if $E\left( W_{i,T}|\mathcal {F}_{-i}\right) =0$, $i=1,\ldots ,N$. It is clear that the resulting sequence is a martingale difference array sequence for any ordering of the random matrices $W_{i,T}$.

Lemma 4

Let $W_{i,T}$ and $\mu _{i,T}$ for $i=1,\ldots ,N$ and $T=1,\ldots ,$ be arrays of matrices of random variables and constants such that $W_{i,T} -\mu _{i,T}$ is a spatial martingale difference array where $\sup _{i,T}E\left\| W_{i,T}\right\| ^{1+\delta }<\infty $ for some $\delta >0.$ Then, as $(N,T)\rightarrow _{j}\infty $,

$$\begin{aligned} N^{-1}\sum _{i=1}^{N}\left( W_{i,T}-\mu _{i,T}\right) \rightarrow _{p}0. \end{aligned}$$

Proof

By Theorem 12.11 of Davidson (1994), if $\sup _{i,T}E\left\| W_{i,T}\right\| ^{1+\delta }<\infty $, then

$$\begin{aligned} \lim _{M\rightarrow \infty }\sup _{i,T}E\left( \left\| W_{i,T}-\mu _{i,T}\right\| I_{\left\{ \left\| W_{i,T}-\mu _{i,T}\right\| >M\right\} }\right) =0\text {,} \end{aligned}$$

which is a generalisation of uniform integrability to arrays. Then, the result follows immediately by Corollary 19.9 of Davidson (1994). $\square $

Lemma 5

Let $w_{i,T}$ and $\mu _{i,T}$, for $i=1,\ldots ,N$ and $T=1,\ldots ,$ be arrays of vectors of random variables and constants such that $w_{i,T} -\mu _{i,T}$ is a spatial martingale difference array where $E\left[ \left( w_{i,T}-\mu _{i,T}\right) \left( w_{i,T}-\mu _{i,T}\right) ^{\prime }\right] =\Sigma _{i,T}$, and $\sup _{i,T}E\left\| w_{i,T}\right\| ^{2+\delta }<\infty $ for some $\delta >0.$ Assume that $\Sigma =\lim _{N,T\rightarrow \infty }N^{-1}\sum _{i=1}^{N}\Sigma _{i,T}$ is positive definite and $\sup _{N,T} N^{-1}\sum _{i=1}^{N}\Sigma _{i,T}<\infty $. Then, as $(N,T)\rightarrow _{j} \infty $,

$$\begin{aligned} N^{-1/2}\sum _{i=1}^{N}\left( w_{i,T}-\mu _{i,T}\right) \rightarrow _{d} N(0,\Sigma ). \end{aligned}$$

(30)

Proof

By Theorem 12.11 of Davidson (1994), if $\sup _{i,T}E\left\| w_{i,T}\right\| ^{2+\delta }<\infty $, we obtain the uniform integrability condition,

$$\begin{aligned} \lim _{M\rightarrow \infty }\sup _{i,T}E\left( \left\| w_{i,T}-\mu _{i,T}\right\| I_{\left\{ \left\| w_{i,T}-\mu _{i,T}\right\| >M\right\} }\right) =0. \end{aligned}$$

Together with $\sup _{N,T}N^{-1}\sum _{i=1}^{N}\Sigma _{i,T}<\infty $, this implies that the Lindeberg condition holds by Theorem 23.18 of Davidson (1994). Then, by Theorem 23.16 of Davidson (1994), it follows that

$$\begin{aligned} \max _{i,T}N^{-1/2}\left( w_{i,T}-\mu _{i,T}\right) \rightarrow _{p}0. \end{aligned}$$

(31)

Together with $\sup _{i,T}E\left\| w_{i,T}\right\| ^{2+\delta }<\infty $, (31) implies (30) by Theorem 24.3 of Davidson (1994). $\square $

1.2 Proof of Theorem 1

Considering $\varvec{\beta }_{i}=\varvec{\beta }+\varvec{\eta }_{i}$, we have

$$\begin{aligned} \hat{\varvec{\beta }}_{FE}-\varvec{\beta }&=\left( \sum _{i=1} ^{N}\varvec{\ddot{X}}_{i}^{\prime }\varvec{\ddot{X}}_{i}\right) ^{-1}\sum _{i=1}^{N}\varvec{\ddot{X}}_{i}^{\prime }\left( \varvec{\ddot{X}}_{i}\varvec{\eta }_{i}+\varvec{\dot{F}\mathring{\gamma }} _{i}+\varvec{\ddot{\varepsilon }}_{i}\right) \nonumber \\&=\left( \sum _{i=1}^{N}\varvec{\ddot{X}}_{i}^{\prime }\varvec{\ddot{X}}_{i}\right) ^{-1}\sum _{i=1}^{N}\varvec{\ddot{X}}_{i}^{\prime }\left( \varvec{\ddot{X}}_{i}\varvec{\eta }_{i}+\varvec{\dot{F} \mathring{\gamma }}_{i}+\varvec{\varepsilon }_{i}\right) +o_{p}\left( 1\right) . \end{aligned}$$

(32)

Using Lemma 4, it is easily seen that as $(N,T)\rightarrow _{j}\infty ,$

$$\begin{aligned} \frac{1}{N}\sum _{i=1}^{N}\frac{\varvec{\ddot{X}}_{i}^{\prime }\varvec{\ddot{X}}_{i}}{T}\rightarrow _{p}\lim _{N,T\rightarrow \infty } \frac{1}{N}\sum _{i=1}^{N}E\left( \frac{\varvec{\ddot{X}}_{i}^{\prime }\varvec{\ddot{X}}_{i}}{T}\right) =\varvec{\Psi }_{FE} \end{aligned}$$

Next, by the independence of $\varvec{\gamma }_{i}$ and $\varvec{\eta }_{i}$ from each other and from $\varvec{X}_{i}$ and $\varvec{F}$ across i, and using the fact that $E\left( \varvec{\mathring{\gamma }} _{i}\right) =E(\varvec{\eta }_{i})=0$, it follows that $\varvec{\ddot{X}}_{i}^{\prime }\left( \varvec{\dot{F}\mathring{\gamma }}_{i} +\varvec{\ddot{\varepsilon }}_{i}\right) $ is not only a spatial martingale difference but also a martingale difference sequence for any ordering across i. To see this, for any ordering over i, we have:

$$\begin{aligned}&E\left[ \varvec{\ddot{X}}_{i}^{\prime }\left( \varvec{\ddot{X} }_{i}\varvec{\eta }_{i}+\varvec{\dot{F}\mathring{\gamma }} _{i}+\varvec{\varepsilon }_{i}\right) |\varvec{\ddot{X}} _{j},\varvec{\dot{F}},\varvec{\mathring{\gamma }}_{j},\varvec{\eta }_{j}\right] \\&\quad =E\left[ \varvec{\ddot{X}}_{i}^{\prime }\varvec{\ddot{X}} _{i}\varvec{\eta }_{i}|\varvec{\ddot{X}}_{j},\varvec{\dot{F} },\varvec{\mathring{\gamma }}_{j},\varvec{\eta }_{j}\right] +E\left[ \varvec{\ddot{X}}_{i}^{\prime }\varvec{\dot{F}\mathring{\gamma }} _{i}|\varvec{\ddot{X}}_{j},\varvec{\dot{F}},\varvec{\mathring{\gamma }}_{j},\varvec{\eta }_{j}\right] +E\left[ \varvec{\ddot{X} }_{i}^{\prime }\varvec{\varepsilon }_{i}|\varvec{\ddot{X}} _{j},\varvec{\dot{F}},\varvec{\mathring{\gamma }}_{j},\varvec{\eta }_{j}\right] \\&\quad =E\left[ \varvec{\ddot{X}}_{i}^{\prime }\varvec{\ddot{X}} _{i}|\varvec{\ddot{X}}_{j},\varvec{\dot{F}},\varvec{\mathring{\gamma }}_{j},\varvec{\eta }_{j}\right] E\left[ \varvec{\eta } _{i}|\varvec{\ddot{X}}_{j},\varvec{\dot{F}},\varvec{\mathring{\gamma }}_{j},\varvec{\eta }_{j}\right] \\&\qquad +E\left[ \varvec{\varvec{\ddot{X}}_{i}^{\prime }\varvec{\dot{F}} }|\varvec{\ddot{X}}_{j},\varvec{\dot{F}},\varvec{\mathring{\gamma }}_{j},\varvec{\eta }_{j}\right] E\left[ \varvec{\mathring{\gamma } }_{i}|\varvec{\ddot{X}}_{j},\varvec{\dot{F}},\varvec{\mathring{\gamma }}_{j},\varvec{\eta }_{j}\right] \\&\qquad +E\left[ \varvec{\dot{X}}_{i}|\varvec{\ddot{X}}_{j} ,\varvec{\dot{F}},\varvec{\mathring{\gamma }}_{j},\varvec{\eta } _{j}\right] E\left[ \varvec{\varepsilon }_{i}|\varvec{\ddot{X}} _{j},\varvec{\dot{F}},\varvec{\mathring{\gamma }}_{j},\varvec{\eta }_{j}\right] \text { for }j\ne i \end{aligned}$$

Since

$$\begin{aligned} E\left[ \varvec{\eta }_{i}|\varvec{\ddot{X}}_{j},\varvec{\dot{F} },\varvec{\mathring{\gamma }}_{j},\varvec{\eta }_{j}\right] =E\left[ \varvec{\mathring{\gamma }}_{i}|\varvec{\ddot{X}}_{j},\varvec{\dot{F}},\varvec{\mathring{\gamma }}_{j},\varvec{\eta }_{j}\right] =E\left[ \varvec{\varepsilon }_{i}|\varvec{\ddot{X}}_{j},\varvec{\dot{F}},\varvec{\mathring{\gamma }}_{j},\varvec{\eta } _{j}\right] =0\text { for }j\ne i \end{aligned}$$

hence

$$\begin{aligned} E\left[ \varvec{\ddot{X}}_{i}^{\prime }\left( \varvec{\ddot{X}} _{i}\varvec{\eta }_{i}+\varvec{\dot{F}\mathring{\gamma }}_{i} +\varvec{\varepsilon }_{i}\right) |\varvec{\ddot{X}}_{j},\varvec{\dot{F}},\varvec{\mathring{\gamma }}_{j},\varvec{\eta } _{j}\right] =0\text { for }j\ne i \end{aligned}$$

which proves the martingale difference property. Notice that we repeatedly use the fact that the product of a stochastic process with a second process that is independent over its index, as well as of the first process, is a martingale difference process. Next, we note that $\left\{ \frac{\varvec{\ddot{X} }_{i}^{\prime }\varvec{\varepsilon }_{i}}{\sqrt{T}}\right\} _{i=1}^{N}$, $\left\{ \frac{\varvec{\ddot{X}}_{i}^{\prime }\varvec{\dot{F} \mathring{\gamma }}_{i}}{T}\right\} _{i=1}^{N}$ and $\left\{ \frac{\varvec{\ddot{X}}_{i}^{\prime }\varvec{\ddot{X}}_{i}\varvec{\eta }_{i}}{T}\right\} _{i=1}^{N}$ are spatial martingale difference series. Notice that $\left( \sum _{i=1}^{N}\varvec{\ddot{X}}_{i}^{\prime }\varvec{\ddot{X}}_{i}\right) ^{-1}\sum _{i=1}^{N}\varvec{\ddot{X} }_{i}^{\prime }\varvec{\varepsilon }_{i}=O_{p}\left( \frac{1}{\sqrt{NT} }\right) $ is of smaller order in probability than the other two terms on the RHS of (32). Therefore, it follows that as $(N,T)\rightarrow _{j}\infty ,$

$$\begin{aligned} \frac{1}{\sqrt{N}}\sum _{i=1}^{N}\frac{\varvec{\ddot{X}}_{i}^{\prime }\varvec{\dot{F}\mathring{\gamma }}_{i}}{T}&\rightarrow _{d}N\left( 0,\varvec{R}_{1,FE}\right) ,\\ \frac{1}{\sqrt{N}}\sum _{i=1}^{N}\frac{\varvec{\ddot{X}}_{i}^{\prime }\left( \varvec{\ddot{X}}_{i}\varvec{\eta }_{i}+\varvec{\dot{F}\mathring{\gamma }}_{i}\right) }{T}&\rightarrow _{d}N\left( 0,\varvec{R}_{1,FE}+\varvec{R}_{2,FE}\right) , \end{aligned}$$

where $\varvec{R}_{1,FE}$ and $\varvec{R}_{2,FE}$ are defined in (13) and (14). This proves (12) in Theorem 1. We will prove (15) in the proof of Theorem 3.

The results for the case of $\varvec{\eta }_{i}=0$, $\forall i$, follow straightforwardly from the above analysis.

1.3 Proof of Theorem 2

Let $R_{NT}$ denote terms of the lower order of probability than the leading terms. By Theorem 6 in CHNY, considering $\varvec{\beta }_{i} =\varvec{\beta }+\varvec{\eta }_{i}$, we have:

$$\begin{aligned} \hat{\varvec{\beta }}_{PC}-\varvec{\beta }=\left( \sum _{i=1}^{N}\varvec{V}_{i}^{\prime }\varvec{V}_{i}\right) ^{-1}\sum _{i=1}^{N}\varvec{V}_{i}^{\prime }\varvec{V}_{i}\varvec{\eta }_{i}+R_{NT}. \end{aligned}$$

CHNY assume that $\varvec{x}_{it}$ follows a linear factor structure, (2), which can be expressed as

$$\begin{aligned} \varvec{X}_{i}=\varvec{F\Gamma }_{i}+\varvec{V}_{i}, \end{aligned}$$

(33)

where $\varvec{X}_{i}=(\varvec{x}_{i1},\ldots ,\varvec{x} _{iT})^{\prime }$, $\varvec{F}=(\varvec{f}_{1},\ldots ,\varvec{f} _{T})^{\prime }$ and $\varvec{V}_{i}=(\varvec{v}_{i1},\ldots ,\varvec{v}_{iT})^{\prime }$. See also Assumptions B1-B5 in CHNY.

Note that $\sum _{i=1}^{N}\varvec{V}_{i}^{\prime }\varvec{V} _{i}\varvec{\eta }_{i}=O_{p}\left( \sqrt{N}T\right) $ and $\sum _{i=1} ^{N}\varvec{V}_{i}^{\prime }\varvec{\varepsilon }_{i}=O_{p}\left( \sqrt{NT}\right) $. Using Lemmas 4 and 5, it follows that as $(N,T)\rightarrow _{j}\infty ,$

$$\begin{aligned} \frac{1}{N}\sum _{i=1}^{N}\frac{\varvec{V}_{i}^{\prime }\varvec{V}_{i}}{T}&\rightarrow _{p}\lim _{N,T\rightarrow \infty }\frac{1}{N}\sum _{i=1}^{N}E\left( \frac{\varvec{V}_{i}^{\prime }\varvec{V}_{i}}{T}\right) =\varvec{\Psi }_{PC},\\ \frac{1}{\sqrt{N}}\sum _{i=1}^{N}\frac{\varvec{V}_{i}^{\prime }\varvec{V}_{i}\varvec{\eta }_{i}}{T}&\rightarrow _{d}N\left( 0,\varvec{R}_{1,PC}\right) , \end{aligned}$$

where $\varvec{R}_{1,PC}$ is defined in (22). This proves (21). (23) follows by the proof of Theorem 3.

1.4 Proof of Theorem 3

Given Theorems 1 and 2, it suffices to derive the equivalence and consistency of the two robust covariance estimators for $\hat{\varvec{\beta }}_{FE}$ and $\hat{\varvec{\beta }}_{PC}$, which are given by (10), (11), (19) and (20), respectively. Rewrite them compactly as

$$\begin{aligned}{} & {} \varvec{V}^{NON}\left( \hat{\varvec{\beta }}_{FE}\right) =\hat{\varvec{\Psi }}_{FE}^{-1}\hat{\varvec{R}}_{FE}^{NON} \hat{\varvec{\Psi }}_{FE}^{-1};\ \ \varvec{V}^{HAC}\left( \hat{\varvec{\beta } }_{FE}\right) =\hat{\varvec{\Psi }}_{FE}^{-1}\hat{\varvec{R}}_{FE} ^{HAC}\hat{\varvec{\Psi }}_{FE}^{-1}\\{} & {} \varvec{V}^{NON}\left( \hat{\varvec{\beta }}_{PC}\right) =\hat{\varvec{\Psi }}_{PC}^{-1}\hat{\varvec{R}}_{PC}^{NON}\hat{\varvec{\Psi }}_{PC}^{-1};\ \ \varvec{V}^{HAC}\left( \hat{\varvec{\beta } }_{PC}\right) =\hat{\varvec{\Psi }}_{PC}^{-1}\hat{\varvec{R}}_{PC} ^{HAC}\hat{\varvec{\Psi }}_{PC}^{-1} \end{aligned}$$

where $\hat{\varvec{X}}_{i}=\varvec{M}_{\hat{F}}\varvec{X}_{i},$

$$\begin{aligned} \hat{\varvec{\Psi }}_{FE}&=\sum _{i=1}^{N}\varvec{\ddot{X}}_{i}^{\prime }\varvec{\ddot{X}}_{i},\ \hat{\varvec{\Psi }}_{PC}=\sum _{i=1} ^{N}\hat{\varvec{X}}_{i}^{\prime }\hat{\varvec{X}}_{i}=\sum _{i=1} ^{N}\varvec{X}_{i}^{\prime }\varvec{M}_{\hat{F}}\varvec{X} _{i},\text { }\\ \hat{\varvec{R}}_{FE}^{NON}&=\sum _{i}^{N}\left( \varvec{\ddot{X} }_{i}^{\prime }\varvec{\ddot{X}}_{i}\right) \left( \varvec{\hat{\beta }}_{FE,i}-\hat{\varvec{\beta }}_{FE}\right) \left( \varvec{\hat{\beta }}_{FE,i}-\hat{\varvec{\beta }}_{FE}\right) ^{\prime }\left( \varvec{\ddot{X}}_{i}^{\prime }\varvec{\ddot{X}}_{i}\right) ,\\ \hat{\varvec{R}}_{PC}^{NON}&=\sum _{i}^{N}\varvec{X}_{i}^{\prime }\varvec{M}_{\hat{F}}\varvec{X}_{i}\left( \varvec{\tilde{\beta } }_{PC,i}-\varvec{\tilde{\beta }}_{PC}\right) \left( \varvec{\tilde{\beta }}_{PC,i}-\varvec{\tilde{\beta }}_{PC}\right) ^{\prime }\varvec{X}_{i}^{\prime }\varvec{M}_{\hat{F}}\varvec{X}_{i},\\ \hat{\varvec{R}}_{FE}^{HAC}&=\sum _{i}^{N}\varvec{\ddot{X}} _{i}^{\prime }\hat{\varvec{u}}_{FE,i}\hat{\varvec{u}}_{FE,i}^{\prime }\varvec{\ddot{X}}_{i},\ \hat{\varvec{R}}_{PC}^{HAC}=\sum _{i} ^{N}\hat{\varvec{X}}_{i}^{\prime }\hat{\varvec{u}}_{PC,i} \hat{\varvec{u}}_{PC,i}^{\prime }\hat{\varvec{X}}_{i}. \end{aligned}$$

Finally, we define:

$$\begin{aligned} \hat{\varvec{C}}^{NON}\left( \hat{\varvec{\beta }}_{FE},\hat{\varvec{\beta }}_{PC}\right) =\hat{\varvec{\Psi }}_{FE} ^{-1}\hat{\varvec{R}}_{FE,PC}^{NON}\hat{\varvec{\Psi }}_{PC}^{-1};\text { }\hat{\varvec{C}}^{HAC}\left( \hat{\varvec{\beta }}_{FE},\hat{\varvec{\beta }}_{PC}\right) =\hat{\varvec{\Psi }}_{FE} ^{-1}\hat{\varvec{R}}_{FE,PC}^{HAC}\hat{\varvec{\Psi }}_{PC}^{-1} \end{aligned}$$

where

$$\begin{aligned} \hat{\varvec{R}}_{FE,PC}^{NON}&=\sum _{i}^{N}\varvec{\ddot{X}} _{i}^{\prime }\varvec{\ddot{X}}_{i}\left( \hat{\varvec{\beta }} _{FE,i}-\hat{\varvec{\beta }}_{FE}\right) \left( \varvec{\tilde{\beta }}_{PC,i}-\varvec{\tilde{\beta }}_{PC}\right) ^{\prime }\hat{\varvec{X}}_{i}^{\prime }\hat{\varvec{X}}_{i},\\ \hat{\varvec{R}}_{FE,PC}^{HAC}&=\sum _{i}^{N}\varvec{\ddot{X}} _{i}^{\prime }\hat{\varvec{u}}_{FE,i}\hat{\varvec{u}}_{PC,i}^{\prime }\hat{\varvec{X}}_{i} \end{aligned}$$

To establish that the two covariance estimators are (asymptotically) equivalent, we need to show:

$$\begin{aligned} \hat{\varvec{R}}_{FE}^{NON}= & {} \hat{\varvec{R}}_{FE}^{HAC}+R_{NT} \end{aligned}$$

(34)

$$\begin{aligned} \hat{\varvec{R}}_{PC}^{NON}= & {} \hat{\varvec{R}}_{PC}^{HAC}+R_{NT} \end{aligned}$$

(35)

$$\begin{aligned} \hat{\varvec{R}}_{FE,PC}^{NON}= & {} \hat{\varvec{R}}_{FE,PC}^{HAC}+R_{NT} \end{aligned}$$

(36)

where $R_{NT}$ denotes terms of lower order in probability than the leading terms. We focus on the PC estimator in (35). First, consider $\hat{\varvec{R}}_{PC}^{HAC}$ and notice that

$$\begin{aligned} \hat{\varvec{X}}_{i}^{\prime }\hat{\varvec{u}}_{PC,i}=\varvec{\hat{X}}_{i}^{\prime }\left( \hat{\varvec{u}}_{PC,i}+\hat{\varvec{X}} _{i}\left( \hat{\varvec{\beta }}_{PC}-\varvec{\beta }\right) \right) . \end{aligned}$$

By Theorem 11 in CHNY, it follows that as $\left( N,T\right) \rightarrow \infty $ and $\frac{T}{N}\rightarrow c\in (0,\Delta ]$ with $\Delta <\infty $,

$$\begin{aligned} \sum _{i=1}^{N}\hat{\varvec{X}}_{i}^{^{\prime }}\hat{\varvec{u}} _{PC,i}\hat{\varvec{u}}_{PC,i}^{\prime }\hat{\varvec{X}}_{i}=\sum _{i=1}^{N}\varvec{V}_{i}^{\prime }\varvec{u}_{PC,i}\varvec{u} _{PC,i}^{\prime }\varvec{V}_{i}+R_{NT} \end{aligned}$$

where $\varvec{u}_{PC,i}=\varvec{X}_{i}\varvec{\eta } _{i}+\varvec{\varepsilon }_{i}$. Then, we have

$$\begin{aligned} \hat{\varvec{R}}_{PC}^{HAC}=\sum _{i=1}^{N}\varvec{V}_{i}^{\prime }\varvec{V}_{i}\varvec{\eta }_{i}\varvec{\eta }_{i}^{\prime }\varvec{V}_{i}^{\prime }\varvec{V}_{i}+\sum _{i=1}^{N}\varvec{V} _{i}^{\prime }\varvec{\varepsilon }_{i}\varvec{\varepsilon }_{i}^{\prime }\varvec{V}_{i}+R_{NT}. \end{aligned}$$

(37)

Next, it is easily seen that

$$\begin{aligned} \varvec{\tilde{\beta }}_{PC,i}-\varvec{\beta }= & {} \left( \varvec{\hat{X}}_{i}^{\prime }\hat{\varvec{X}}_{i}\right) ^{-1}\hat{\varvec{X} }_{i}^{\prime }\varvec{\varepsilon }_{i}+\varvec{\eta }_{i}\\ \varvec{\tilde{\beta }}_{PC}-\varvec{\beta }= & {} \frac{1}{N}\sum _{i=1} ^{N}\left[ \left( \hat{\varvec{X}}_{i}^{\prime }\hat{\varvec{X}} _{i}\right) ^{-1}\hat{\varvec{X}}_{i}^{\prime }\varvec{\varepsilon }_{i}+\varvec{\eta }_{i}\right] \end{aligned}$$

Then,

$$\begin{aligned} \hat{\varvec{X}}_{i}^{\prime }\hat{\varvec{X}}_{i}\left( \varvec{\tilde{\beta }}_{PC,i}-\varvec{\tilde{\beta }}_{PC}\right)&=\hat{\varvec{X}}_{i}^{\prime }\hat{\varvec{X}}_{i}\left( \varvec{\tilde{\beta }}_{PC,i}-\varvec{\beta }_{i}+\varvec{\beta } _{i}-\varvec{\beta }+\varvec{\beta }-\varvec{\tilde{\beta }} _{PC}\right) \nonumber \\&=\hat{\varvec{X}}_{i}^{\prime }\hat{\varvec{X}}_{i}\left( \varvec{\tilde{\beta }}_{PC,i}-\varvec{\beta }_{i}\right) +\hat{\varvec{X}}_{i}^{\prime }\hat{\varvec{X}}_{i}\varvec{\eta }_{i}+\hat{\varvec{X}}_{i}^{\prime }\hat{\varvec{X}}_{i}\left( \varvec{\beta }-\varvec{\tilde{\beta }}_{PC}\right) \nonumber \\&=\hat{\varvec{X}}_{i}^{\prime }\varvec{\varepsilon }_{i} +\hat{\varvec{X}}_{i}^{\prime }\hat{\varvec{X}}_{i}\varvec{\eta }_{i}+\hat{\varvec{X}}_{i}^{\prime }\hat{\varvec{X}}_{i}\left( \varvec{\beta }-\varvec{\tilde{\beta }}_{PC}\right) \end{aligned}$$

(38)
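The last equality in (38) can be spelled out in one step. This is a short sketch, assuming the heterogeneous slopes are written as $\varvec{\beta }_{i}=\varvec{\beta }+\varvec{\eta }_{i}$ (which is how $\varvec{\eta }_{i}$ enters the expressions above) and that $\hat{\varvec{X}}_{i}^{\prime }\hat{\varvec{X}}_{i}$ is invertible:

$$\begin{aligned} \varvec{\tilde{\beta }}_{PC,i}-\varvec{\beta }_{i}&=\left( \varvec{\tilde{\beta }}_{PC,i}-\varvec{\beta }\right) -\varvec{\eta }_{i}=\left( \hat{\varvec{X}}_{i}^{\prime }\hat{\varvec{X}}_{i}\right) ^{-1}\hat{\varvec{X}}_{i}^{\prime }\varvec{\varepsilon }_{i},\\ \hat{\varvec{X}}_{i}^{\prime }\hat{\varvec{X}}_{i}\left( \varvec{\tilde{\beta }}_{PC,i}-\varvec{\beta }_{i}\right)&=\hat{\varvec{X}}_{i}^{\prime }\varvec{\varepsilon }_{i}. \end{aligned}$$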

By Theorem 11 in CHNY, we then obtain:

$$\begin{aligned} \hat{\varvec{R}}_{PC}^{NON}=\sum _{i=1}^{N}\varvec{V}_{i}^{\prime }\varvec{V}_{i}\varvec{\eta }_{i}\varvec{\eta }_{i}^{\prime }\varvec{V}_{i}^{\prime }\varvec{V}_{i}+\sum _{i=1}^{N}\varvec{V} _{i}^{\prime }\varvec{\varepsilon }_{i}\varvec{\varepsilon }_{i}^{\prime }\varvec{V}_{i}+R_{NT}. \end{aligned}$$

(39)

This proves (35).

Noticing that both $\frac{\varvec{V}_{i}^{\prime }\varvec{\varepsilon }_{i}\varvec{\varepsilon }_{i}^{\prime }\varvec{V}_{i}}{T}-E\left( \frac{\varvec{V}_{i}^{\prime }\varvec{\varepsilon }_{i} \varvec{\varepsilon }_{i}^{\prime }\varvec{V}_{i}}{T}\right) $ and $\frac{\varvec{V}_{i}^{\prime }\varvec{V}_{i}}{T}\varvec{\eta } _{i}\varvec{\eta }_{i}^{\prime }\frac{\varvec{V}_{i}^{\prime }\varvec{V}_{i}}{T}-E\left( \frac{\varvec{V}_{i}^{\prime }\varvec{V}_{i}}{T}\varvec{\eta }_{i}\varvec{\eta }_{i}^{\prime }\frac{\varvec{V}_{i}^{\prime }\varvec{V}_{i}}{T}\right) $ are iid and martingale difference processes over $i$, we have by Lemma 4:

$$\begin{aligned}{} & {} \frac{1}{N}\sum _{i=1}^{N}\frac{\varvec{V}_{i}^{\prime }\varvec{V}_{i} }{T}\varvec{\eta }_{i}\varvec{\eta }_{i}^{\prime }\frac{\varvec{V} _{i}^{\prime }\varvec{V}_{i}}{T}\rightarrow _{p}\lim _{N,T\rightarrow \infty }\frac{1}{N}\sum _{i=1}^{N}E\left( \frac{\varvec{V}_{i}^{\prime }\varvec{V}_{i}}{T}\varvec{\eta }_{i}\varvec{\eta }_{i}^{\prime }\frac{\varvec{V}_{i}^{\prime }\varvec{V}_{i}}{T}\right) \\{} & {} \frac{1}{N}\sum _{i=1}^{N}\frac{\varvec{V}_{i}^{\prime } \varvec{\varepsilon }_{i}\varvec{\varepsilon }_{i}^{\prime }\varvec{V}_{i}}{T}\rightarrow _{p}\lim _{N,T\rightarrow \infty }\frac{1}{N}\sum _{i=1}^{N}E\left( \frac{\varvec{V}_{i}^{\prime } \varvec{\varepsilon }_{i}\varvec{\varepsilon }_{i}^{\prime }\varvec{V}_{i}}{T}\right) \end{aligned}$$

This provides consistency of both variance estimators.

Along similar lines to (38), it is straightforward to prove (34) and (36) because

$$\begin{aligned} \left( \varvec{\ddot{X}}_{i}^{\prime }\varvec{\ddot{X}}_{i}\right) \left( \hat{\varvec{\beta }}_{FE,i}-\hat{\varvec{\beta }}_{FE}\right) =\varvec{\ddot{X}}_{i}^{\prime }\varvec{\varepsilon }_{i} +\varvec{\ddot{X}}_{i}^{\prime }\varvec{\ddot{X}}_{i}\varvec{\eta }_{i}+\varvec{\ddot{X}}_{i}^{\prime }\varvec{\dot{F}\mathring{\gamma } }_{i}+\varvec{\ddot{X}}_{i}^{\prime }\varvec{\ddot{X}}_{i}\left( \varvec{\beta }-\hat{\varvec{\beta }}_{FE}\right) , \end{aligned}$$

and the term $\varvec{\ddot{X}}_{i}^{\prime }\varvec{\dot{F} \mathring{\gamma }}_{i}$ can be analysed similarly to $\varvec{\ddot{X} }_{i}^{\prime }\varvec{\ddot{X}}_{i}\varvec{\eta }_{i}$. Using these results, it readily follows that

$$\begin{aligned}{} & {} \left( \hat{\varvec{\beta }}_{FE}-\hat{\varvec{\beta }}_{PC}\right) ^{\prime }\left( \hat{\varvec{V}}^{NON}\right) ^{-1}\left( \hat{\varvec{\beta }}_{FE}-\hat{\varvec{\beta }}_{PC}\right) \sim \chi _{k}^{2}\\{} & {} \left( \hat{\varvec{\beta }}_{FE}-\hat{\varvec{\beta }}_{PC}\right) ^{\prime }\left( \hat{\varvec{V}}^{HAC}\right) ^{-1}\left( \hat{\varvec{\beta }}_{FE}-\hat{\varvec{\beta }}_{PC}\right) \sim \chi _{k}^{2} \end{aligned}$$

where $\hat{\varvec{V}}^{NON}$ and $\hat{\varvec{V}}^{HAC}$ are defined in (26) and (27), respectively.
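As an illustration of how the two quadratic forms above would be evaluated in practice, here is a minimal sketch. The names `hausman_test` and `chi2_sf` are hypothetical. The matrix `V` stands for $\hat{\varvec{V}}^{NON}$ or $\hat{\varvec{V}}^{HAC}$ from (26) and (27), which are not reproduced in this appendix, so it is simply taken as an input. The chi-squared tail probability is hand-rolled from the standard closed-form series for integer degrees of freedom to keep the example dependency-light.

```python
import math
import numpy as np

def chi2_sf(x, k):
    """P(chi2_k > x) for integer k >= 1, via the closed-form series."""
    if x <= 0:
        return 1.0
    h = x / 2.0
    if k % 2 == 0:
        # even k: exp(-h) * sum_{j=0}^{k/2-1} h^j / j!
        term, total = 1.0, 1.0
        for j in range(1, k // 2):
            term *= h / j
            total += term
        return math.exp(-h) * total
    # odd k: start at k = 1 (erfc) and add h^{j-1/2} e^{-h} / Gamma(j+1/2)
    total = math.erfc(math.sqrt(h))
    term = math.sqrt(h) * math.exp(-h) / math.gamma(1.5)
    for j in range(1, (k - 1) // 2 + 1):
        total += term
        term *= h / (j + 0.5)
    return total

def hausman_test(beta_fe, beta_pc, V):
    """Return (statistic, asymptotic p-value) with k = len(beta) degrees of freedom."""
    diff = np.asarray(beta_fe, dtype=float) - np.asarray(beta_pc, dtype=float)
    stat = float(diff @ np.linalg.solve(V, diff))   # diff' V^{-1} diff
    return stat, chi2_sf(stat, diff.size)
```

A small p-value would point against the null of no correlation between the regressors and the loadings. `np.linalg.solve` is used instead of an explicit inverse, which is the usual choice for numerical stability.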

Appendix: The data and empirical specifications

We describe the empirical specifications and the data in detail. For the production function, we estimate the following panel data regression:

$$\begin{aligned} \ln \left( \frac{Y}{L}\right) _{it}=\beta \ln \left( \frac{K}{L}\right) _{it}+e_{it},\ e_{it}=\varvec{\gamma }_{i}^{\prime }\varvec{f} _{t}+\varepsilon _{it} \end{aligned}$$

(40)

The first group consists of 26 OECD countries: Australia, Austria, Belgium, Canada, Chile, Denmark, Finland, France, Germany, Greece, Hong Kong, Ireland, Israel, Italy, Japan, Korea, Mexico, the Netherlands, New Zealand, Norway, Portugal, Spain, Sweden, Turkey, the U.K. and the U.S. The data are collected from PWT 7.0 and cover the period 1970–2010. Y is GDP measured in millions of U.S. dollars at 2005 prices, K is capital measured in millions of U.S. dollars, constructed using the perpetual inventory method (PIM), and L is labour measured as total employment in thousands.

The second group contains the EU27 countries: Austria, Belgium, Bulgaria, Cyprus, Czech Republic, Denmark, Estonia, Finland, France, Germany, Greece, Hungary, Ireland, Italy, Latvia, Lithuania, Luxembourg, Malta, the Netherlands, Poland, Portugal, Romania, Slovakia, Slovenia, Spain, Sweden and the U.K. The data are extracted from PWT 9.0 over the period 1990–2015, and the definitions of the variables Y, K and L are the same as above.

The third group includes 20 Italian regions over the period 1995–2016: Piemonte, Valle d’Aosta, Liguria, Lombardia, Trentino Alto Adige, Veneto, Friuli-Venezia Giulia, Emilia-Romagna, Toscana, Umbria, Marche, Lazio, Abruzzo, Molise, Campania, Puglia, Basilicata, Calabria, Sicilia and Sardegna. Owing to data availability, we construct Y as value added measured in millions of euros at 2010 prices, L as total employment in thousands, and K as gross fixed capital formation in millions of euros. The data, gathered from ISTAT, covers the period 1995 to 2000.

The fourth dataset, taken from Munnell (1990), comprises the 48 U.S. states and covers the period 1970–1986. Y is the per capita gross state product, K is the private capital computed by apportioning Bureau of Economic Analysis (BEA) national stock estimates, and L is the number of employers in thousands in non-agricultural payrolls.

The fifth application employs aggregate sectoral data for manufacturing from developed and developing countries for the period 1970–2002, collected from UNIDO by Eberhardt and Teal (2019). We extract a balanced panel of 25 countries with 25 time periods from 1970 to 1995, covering Australia, Belgium, Brazil, Colombia, Cyprus, Ecuador, Egypt, Spain, Finland, Fiji, France, Hungary, Indonesia, India, Italy, Korea, Malta, Norway, Panama, Philippines, Poland, Portugal, Singapore, the USA and Zimbabwe.

The production function is augmented with R&D in the sixth application. From the data provided by Eberhardt et al. (2013), we extract a balanced panel of 82 country-industry units representing manufacturing industries across ten OECD economies (Denmark, Finland, Germany, Italy, Japan, Netherlands, Portugal, Sweden, the United Kingdom, and the US) from 1980 to 2005. We consider an augmented Cobb-Douglas production function:

$$\begin{aligned} \ln Y_{it}=\beta _{l}\ln L_{it}+\beta _{k}\ln K_{it}+\beta _{rd}\ln RD_{it}+e_{it},\ e_{it}=\varvec{\gamma }_{i}^{\prime }\varvec{f} _{t}+\varepsilon _{it} \end{aligned}$$

(41)

where Y is measured as deflated value added, L is the total number of hours worked by persons engaged, K is total tangible assets by book value and RD is the R&D stock expenditure. See Eberhardt et al. (2013) for details.

Next, we consider the gravity model specifications for the bilateral trade flows given by

$$\begin{aligned} \ln \left( trade_{it}\right){} & {} =\beta _{gdp}\ln \left( gdp_{it}\right) +\beta _{rer}\ln \left( rer_{it}\right) +\beta _{sim}\ln \left( sim_{it} \right) +\beta _{rlf}\ln \left( rlf_{it}\right) \nonumber \\{} & {} \quad +\beta _{cee}cee_{it}+\beta _{euro}euro_{it}+e_{it},e_{it}=\varvec{\gamma }_{i}^{\prime }\varvec{f}_{t}+\varepsilon _{it} \end{aligned}$$

(42)

Here, $trade_{it}$ is the sum of bilateral import flows ($import_{odt}$) and export flows ($export_{odt}$) measured in millions of U.S. dollars at 2000 prices, with $o$ and $d$ denoting the origin and the destination country; $gdp_{it}$ is the sum of $gdp_{ot}$ and $gdp_{dt}$, both measured as gross domestic product at 2000 dollar prices; $rer_{it} =ner_{odt}\times xpi_{US}$ is the real exchange rate measured at 2000 dollar prices, where $ner_{odt}$ is the bilateral nominal exchange rate normalised in terms of the U.S. dollar; $sim_{it}$ is a measure of similarity in size constructed as $sim_{it}=\left[ 1-\left( \frac{gdp_{ot}}{gdp_{ot}+gdp_{dt} }\right) ^{2}-\left( \frac{gdp_{dt}}{gdp_{ot}+gdp_{dt}}\right) ^{2}\right] $; and $rlf_{it}=\left| pgdp_{ot}-pgdp_{dt}\right| $ measures the countries’ difference in relative factor endowments, where $pgdp$ is per capita GDP. The dummies $cee_{it}$ and $euro_{it}$ equal one when the countries of origin and destination both belong to the European Economic Community and share the euro as a common currency, respectively. The data are collected from the IMF Direction of Trade Statistics and cover the period 1960–2008. We consider a sample of 91 country-pairs amongst the EU14 member countries (Austria, Belgium-Luxembourg, Denmark, Finland, France, Germany, Greece, Ireland, Italy, Netherlands, Portugal, Spain, Sweden and the U.K.).
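The two constructed regressors in the gravity specification, the similarity index and the relative factor endowment gap, follow directly from the formulas in the text. A minimal sketch, with hypothetical function names:

```python
def similarity_index(gdp_o, gdp_d):
    """sim = 1 - share_o^2 - share_d^2, where shares are of combined GDP."""
    total = gdp_o + gdp_d
    return 1.0 - (gdp_o / total) ** 2 - (gdp_d / total) ** 2

def relative_factor_endowment(pgdp_o, pgdp_d):
    """rlf = |pgdp_o - pgdp_d|, the absolute gap in per capita GDP."""
    return abs(pgdp_o - pgdp_d)
```

The index reaches its maximum of 0.5 when the two economies are of equal size and approaches zero as one partner dominates, so its logarithm, as used in (42), is always negative.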

Further, we estimate the gasoline demand function by

$$\begin{aligned} \ln \left( q_{it}\right) =\beta _{p}\ln \left( p_{it}\right) +\beta _{inc} \ln \left( inc_{it}\right) +e_{it},\ e_{it}=\alpha _{i}+\varvec{\gamma }_{i}^{\prime }\varvec{f}_{t}+\varepsilon _{it} \end{aligned}$$

(43)

where gasoline consumption, $q_{it}$, is approximated as monthly sales volumes of motor gasoline, per capita per day; $p_{it}$ is the after-tax gasoline price, computed by adding the state/federal tax rates to the motor gasoline sales to end user price; and $inc_{it}$ represents the quarterly personal disposable income. Prices, income, and tax rates are converted to constant 2005 dollars using GDP implicit price deflators. The source of data is Liu (2014).

Finally, we estimate the income elasticity of real house price using the specification:

$$\begin{aligned} \ln \left( p_{it}\right) =\beta _{inc}\ln \left( inc_{it}\right) +e_{it},\ e_{it}=\varvec{\gamma }_{i}^{\prime }\varvec{f}_{t} +\varepsilon _{it}, \end{aligned}$$

(44)

where $p_{it}$ is the housing price index and $inc_{it}$ is the real per capita income. We consider two annual datasets. The first sample, from Holly et al. (2010), consists of panel data for 48 U.S. States (excluding Alaska and Hawaii) plus the District of Columbia ($N=49$) over the period 1975–2010. The second sample, taken from Baltagi and Li (2014), contains panel data for 384 Metropolitan Statistical Areas over the period 1975–2010.

Technological knowledge spillovers

We estimate the effect of technological knowledge spillovers on total factor productivity (TFP). Following Coe et al. (2009), we estimate the effect of domestic and foreign R&D capital stocks on TFP, controlling for the impact of human capital, using the following regression:

$$\begin{aligned} tfp_{it}=\beta _{sd}s_{it}^{d}+\beta _{sf}s_{it}^{f}+\beta _{hc}hc_{it}+e_{it},\ e_{it}=\varvec{\gamma }_{i}^{\prime }\varvec{f} _{t}+\varepsilon _{it} \end{aligned}$$

where $tfp_{it}$ is the TFP, $s_{it}^{d}$ is the domestic R&D capital stock, $s_{it}^{f}$ is the foreign R&D capital stock and $hc_{it}$ is human capital. TFP is defined as the log of output minus a weighted average of labor and capital inputs; domestic and foreign R&D capital stocks are measured in U.S. dollars at 2000 prices and PPP exchange rates. We use a balanced panel of 24 OECD countries (Australia, Austria, Belgium, Canada, Denmark, Finland, France, Germany, Greece, Iceland, Ireland, Israel, Italy, Japan, Korea, the Netherlands, New Zealand, Norway, Portugal, Spain, Sweden, Switzerland, the U.K. and the U.S.) observed over the period 1971–2004. The measures of TFP and R&D capital stock come from Coe et al. (2009) (source: OECD Economic Outlook), while the average number of years of schooling used to measure human capital is taken from Ertur and Musolesi (2017).

Next, we explore the channels through which technological investment affects the productivity performance of industrialized economies using industry-level data extracted from the EU KLEMS database and the OECD ANBERD database. The sample includes fourteen OECD countries (Austria, Belgium, Denmark, Germany, Spain, Finland, France, Ireland, Italy, Japan, Netherlands, Sweden, the UK and the US). In particular, we estimate the productivity effects of R&D and ICT, identifying three channels of transmission (input accumulation, technological change and spillovers), by the following specification:

$$\begin{aligned} \ln (va_{it})= & {} \beta _{l}\ln (l_{it})+\beta _{nitc}\ln (nonITC_{it})+\beta _{itc}\ln (ITC_{it})+\beta _{rd}\ln (rd_{it})\\ {}{} & {} +e_{it},\ e_{it}=\varvec{\gamma }_{i}^{\prime }\varvec{f}_{t}+\varepsilon _{it} \end{aligned}$$

where $va_{it}$ is value added; $l_{it}$ is the labour input measured as the number of hours worked; $rd_{it}$ is the R&D input measured as the cumulative value of industry research expenses; ICT ($ITC_{it}$) and non-ICT ($nonITC_{it}$) assets are built from annual investment flows by means of the perpetual inventory method, adopting an asset-specific rate of geometric depreciation. We extract a balanced panel of 49 high-tech industries, from 1977 to 2006, from the dataset provided by Pieri et al. (2018).
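The perpetual inventory method with geometric depreciation mentioned above can be sketched as a simple recursion. The function name, the initial stock `k0` and the timing convention (investment added after depreciating the previous stock) are assumptions made for illustration; the source does not state which initialisation or depreciation rates were used.

```python
def perpetual_inventory(investment, delta, k0=0.0):
    """Capital stocks from K_t = (1 - delta) * K_{t-1} + I_t.

    investment : sequence of annual investment flows I_1, ..., I_T
    delta      : geometric depreciation rate (asset-specific in the text)
    k0         : assumed initial stock K_0
    """
    stocks = []
    k = k0
    for inv in investment:
        k = (1.0 - delta) * k + inv   # depreciate old stock, add new flow
        stocks.append(k)
    return stocks
```

For example, `perpetual_inventory([4.0, 4.0], 0.5)` gives the stocks 4.0 and then 6.0.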

Health care expenditure and income

We estimate the relationship between healthcare expenditure and income using data on 167 countries over the period 1995–2012 by

$$\begin{aligned} \ln (h_{it})=\beta _{gdp}\ln (gdp_{it})+\beta _{pe}\ln (pe_{it})+e_{it},\ e_{it}=\varvec{\gamma }_{i}^{\prime }\varvec{f} _{t}+\varepsilon _{it} \end{aligned}$$

where $h_{it}$ is per-capita health spending, $gdp_{it}$ is per capita GDP and $pe_{it}$ is the public-health expenditure rate. The data on per-capita health expenditure and per-capita GDP are expressed at constant 2005 PPP prices. The public-health expenditure rate is defined as the percentage of public expenditure over total health expenditure. The source is the World Bank and the dataset is taken from Baltagi et al. (2017).

Demographic and business cycle volatility

We estimate the impact of the age composition of the labor force on business cycle volatility. Jaimovich and Siu (2009) argue that a significant fraction of the run-up of US volatility in the mid-1960s and of the marked decline since the mid-1980s, known as the Great Moderation, is accounted for by long swings in the age composition of the US population induced by the baby boom and subsequent baby bust. Jaimovich and Siu (2009) define the volatile-age labor force share, $s_{it}$, as the fraction of the 15- to 64-year-old labor force accounted for by those aged 15–29 and 60–64, and link it to the time-varying standard deviation of output, $\sigma _{it}$, in the following benchmark regression:

$$\begin{aligned} \sigma _{it}=\beta _{s}s_{it}+\varepsilon _{it}. \end{aligned}$$

They show that shifts in the volatile-age share variable $s_{it}$ have a large and significant effect on cyclical volatility in the G7 countries from 1963 to 1999. By using a different dataset and allowing for cross-sectional dependence, Everaert and Vierke (2016) find a much lower effect. Here, we use the balanced panel dataset for 51 countries over the period 1957–2000 taken from Everaert and Vierke (2016), where the GDP data (taken from the Penn World Table) are used to calculate output volatility, defined as the 9-year rolling standard deviation of logged annual GDP. The demographic data are taken from the United Nations World Population Prospects.
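The volatility measure described above can be sketched for a single country as follows. The function name is hypothetical, and two details are assumptions not stated in the text: the window is taken as trailing (ending at year $t$) rather than centred, and the sample standard deviation (divisor $n-1$) is used.

```python
import math
import statistics

def rolling_log_volatility(gdp, window=9):
    """Rolling standard deviation of log GDP over `window` consecutive years.

    gdp : sequence of positive annual GDP levels for one country
    Returns one value per complete window, so len(gdp) - window + 1 values.
    """
    logs = [math.log(g) for g in gdp]
    return [
        statistics.stdev(logs[t - window + 1 : t + 1])   # trailing window
        for t in range(window - 1, len(logs))
    ]
```

With a 9-year window, the first eight years of each series produce no value, which is one reason the usable sample starts later than the raw GDP data.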

Trade and carbon emissions

We explore the nexus between carbon emissions and trade in 32 OECD countries (Australia, Austria, Belgium, Canada, Chile, Czech Republic, Denmark, Finland, France, Germany, Greece, Hungary, Iceland, Ireland, Israel, Italy, Japan, Korea, Luxembourg, Mexico, the Netherlands, New Zealand, Norway, Poland, Portugal, Slovak Republic, Spain, Sweden, Switzerland, Turkey, the UK and the US). Following Liddle (2018), we estimate the following regression:

$$\begin{aligned} co_{it}=\beta _{gdp}gdp_{it}+\beta _{tsh}tsh_{it}+\beta _{ish}ish_{it}+\beta _{fsh}fsh_{it}+e_{it},\ e_{it}=\varvec{\gamma }_{i}^{\prime }\varvec{ f}_{t}+\varepsilon _{it} \end{aligned}$$

where $co_{it}$ is the consumption-based per capita carbon emissions, $gdp_{it}$ the real GDP per capita, $tsh_{it}$ the trade (import plus export) expressed as a percentage of the GDP, $ish_{it}$ the industry value added expressed as a percentage of the GDP and $fsh_{it}$ the fossil fuel energy consumption as a share of total energy consumption. Data sources are the World Bank’s World Development Indicators and the Global Carbon Project. We extract a balanced panel of 32 OECD countries from 1990 to 2013, from the dataset in Liddle (2018).

Appendix: The bias corrected PC estimator

The bias corrected estimator proposed by Cui et al. (2019) is given by

$$\begin{aligned} \hat{\varvec{\beta }}_{PC}=\varvec{\tilde{\beta }}_{PC}-\frac{1}{N}\hat{\varvec{B}}_{NT}-\frac{1}{T}\hat{\varvec{C}}_{NT} \end{aligned}$$

where the estimator for ($\varvec{\beta }_{PC}$,$\varvec{F}$) denoted as ($\varvec{\tilde{\beta }}_{PC}$,$\hat{\varvec{F}}$) is the solution of the set of nonlinear equations

$$\begin{aligned}{} & {} \varvec{\tilde{\beta }}_{PC}=\left( \sum _{i=1}^{N}\varvec{X} _{i}^{\prime }\varvec{M}_{\hat{F}}\varvec{X}_{i}\right) ^{-1} \sum _{i=1}^{N}\varvec{X}_{i}^{\prime }\varvec{M}_{\hat{F} }\varvec{y}_{i}\text { and } \\ {}{} & {} \quad \left[ \frac{1}{NT}\sum _{i=1}^{N}\left( \varvec{y}_{i}-\varvec{X}_{i}\varvec{\tilde{\beta }}_{PC}\right) \left( \varvec{y}_{i}-\varvec{X}_{i}\varvec{\tilde{\beta }} _{PC}\right) ^{\prime }\right] \hat{\varvec{F}}=\hat{\varvec{F}}\varvec{V}_{NT} \end{aligned}$$

where $\varvec{M}_{\hat{F}}=\varvec{I}_{T}-\hat{\varvec{F}}\left( \hat{\varvec{F}}^{\prime }\hat{\varvec{F}}\right) ^{-1} \hat{\varvec{F}}^{\prime }$, $\varvec{V}_{NT}$ is the diagonal matrix that consists of the $r$ largest eigenvalues of the above matrix in the brackets, arranged in decreasing order, and $\hat{\varvec{F}}$ is $\sqrt{T}$ times the corresponding eigenvectors. The bias correction term is given by

$$\begin{aligned} {\hat{\varvec{B}}}_{NT}=-\left( \frac{1}{NT}\sum _{i=1}^{N}{\hat{\varvec{Z}}}_{i}{\varvec{M}}_{\hat{F}}{\hat{\varvec{Z}}}_{i}^{\prime }\right) ^{-1}\frac{1}{NT^{2}}\sum _{i=1}^{N}\sum _{t=1}^{T}{\hat{\varvec{Z}}} _{i}{\hat{\varvec{F}}}{\hat{\varvec{\Upsilon }}}_{\gamma }^{-1}{\hat{\gamma }}_{i}{\hat{u}}_{it}^{2} \end{aligned}$$

where ${\hat{\varvec{Z}}}_{i}={\varvec{X}}_{i}-\frac{1}{N}\sum _{j=1} ^{N}{\hat{a}}_{ij}{\varvec{X}}_{j}$ with ${\hat{a}}_{ij}={\hat{\gamma }} _{i}^{\prime }{\hat{\varvec{\Upsilon }}}_{\gamma }^{-1}{\hat{\gamma }}_{j}$, ${\hat{\varvec{\Upsilon }}}_{\gamma }=({\hat{\Gamma }}^{\prime }{\hat{\Gamma }}/N)$, ${\hat{\Gamma }}=\left( {\hat{\varvec{\gamma }}}_{1},{\hat{\varvec{\gamma }} }_{2},\ldots {\hat{\varvec{\gamma }}}_{N}\right) ^{\prime }$ and

$$\begin{aligned} {\hat{\varvec{C}}}_{NT}=-\left( \frac{1}{NT}\sum _{i=1}^{N}{\varvec{\hat{Z}}}_{i}{\varvec{M}}_{\hat{F}}{\hat{\varvec{Z}}}_{i}^{\prime }\right) ^{-1}\frac{1}{NT}\sum _{i=1}^{N}{\hat{\varvec{X}}}_{i}{\varvec{M}}_{{\hat{F}}}{\hat{\varvec{\Omega }}{\hat{\varvec{F}}}{\hat{\varvec{\Upsilon }}}}_{\gamma }^{-1}{\hat{\gamma }}_{i} \end{aligned}$$

where ${\hat{\varvec{\Omega }}}=diag\left( \frac{1}{N}\sum _{j=1}^{N}{\hat{u}}_{j1}^{2},\ldots ,\frac{1}{N}\sum _{j=1}^{N}{\hat{u}}_{jT}^{2}\right) .$
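The pair of nonlinear equations that defines $(\varvec{\tilde{\beta }}_{PC},\hat{\varvec{F}})$ earlier in this appendix is commonly solved by alternating between the two equations until the slope estimate stops changing. The sketch below illustrates that idea only. The function name, the pooled OLS starting value, the tolerance and the iteration cap are all assumptions; it does not include the bias correction terms and is not claimed to be the authors' implementation.

```python
import numpy as np

def pc_estimator(y, X, r, tol=1e-8, max_iter=500):
    """Alternating solution for the uncorrected PC estimator (illustrative).

    y : (N, T) outcomes, X : (N, T, k) regressors, r : number of factors.
    Returns (beta, F) with F of shape (T, r) normalised so that F'F / T = I_r.
    """
    N, T, k = X.shape
    # Assumed starting value: pooled OLS ignoring the factor structure.
    beta = np.linalg.lstsq(X.reshape(N * T, k), y.reshape(N * T), rcond=None)[0]
    F = np.zeros((T, r))
    for _ in range(max_iter):
        W = y - X @ beta                      # (N, T) residuals y_i - X_i beta
        S = W.T @ W / (N * T)                 # T x T matrix in the brackets
        _, eigvec = np.linalg.eigh(S)         # eigenvalues in ascending order
        F = np.sqrt(T) * eigvec[:, ::-1][:, :r]   # r leading eigenvectors
        M = np.eye(T) - F @ F.T / T           # M_F, using F'F = T * I_r
        A = np.einsum('itk,ts,isl->kl', X, M, X)  # sum_i X_i' M_F X_i
        b = np.einsum('itk,ts,is->k', X, M, y)    # sum_i X_i' M_F y_i
        beta_new = np.linalg.solve(A, b)
        converged = np.max(np.abs(beta_new - beta)) < tol
        beta = beta_new
        if converged:
            break
    return beta, F
```

Each pass lowers the sum of squared residuals after projecting out the estimated factors, but convergence to the global solution is not guaranteed in general, which is one of the computational issues with the PC estimator that the abstract alludes to.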

Appendix: Further empirical results

See Table 10.

Table 10 Empirical applications to fourteen different datasets


Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.


About this article

Cite this article

Kapetanios, G., Serlenga, L. & Shin, Y. Testing for correlation between the regressors and factor loadings in heterogeneous panels with interactive effects. Empir Econ 64, 2611–2659 (2023). https://doi.org/10.1007/s00181-023-02390-1


Received: 02 February 2023
Accepted: 10 February 2023
Published: 11 May 2023
Issue Date: June 2023
DOI: https://doi.org/10.1007/s00181-023-02390-1
