Efficient estimation of distributed lag model in presence of heteroscedasticity of unknown form: A Monte Carlo evidence

Abstract: In the presence of heteroscedasticity, the ordinary least-squares (OLS) estimator is no longer efficient when the popular Almon technique is used for a finite distributed lag model (DLM). The available literature proposes a few adaptive estimators that are more efficient than the OLS estimator when there is heteroscedasticity of unknown form. This study suggests a similar adaptation combined with the Almon technique in order to obtain a more efficient estimator of the vector of lag coefficients in the DLM. The performance of the proposed estimator is evaluated through Monte Carlo simulations. The simulation results show that the proposed estimator performs attractively in terms of efficiency.

ABOUT THE AUTHOR Muhammad Aslam is a tenured associate professor at the Department of Statistics, Bahauddin Zakariya University, Multan, Pakistan. He has more than 20 years of teaching experience at postgraduate level. His main area of interest is inference for linear regression models, including the distributed lag model (DLM) and the panel data model (PDM), with issues of heteroscedasticity and multicollinearity. He has conducted a number of studies in this area while leading a research team comprising his research students and a few colleagues. Recently, this team has developed three R packages, namely "mctest," "lmridge," and "liureg," which are available on CRAN. These are comprehensive packages for detection of multicollinearity and for estimation of the ridge and Liu regression models with different choices of penalties. This paper is part of the PhD research project of Mr. Abdul Majid (the principal author) under the supervision of Dr. Aslam. It suggests an efficient estimation method for the DLM in the presence of heteroscedasticity.

PUBLIC INTEREST STATEMENT
A distributed lag model (DLM) is a model for time series data in which a regression equation is used to predict current values of a dependent variable based on both the current values of an explanatory variable and the lagged (past period) values of this explanatory variable. More precisely, a DLM is a dynamic model in which the effect of a regressor on the regressand occurs over time rather than all at once. Non-constant variances of the error term (i.e., heteroscedasticity) in such a model may result in inefficient estimation of the unknown coefficients of the DLM. This paper addresses this issue and suggests a more efficient estimation method.

Introduction
The distributed lag model (DLM) includes lagged values of independent variables as explanatory variables. The use of such a model is motivated by the fact that the effect of a change in an independent variable is not always completely exhausted within one time period but is "distributed" over several, and perhaps many, future periods. The DLM has gained considerable importance because of its frequent use in econometrics and statistics. However, estimation of the DLM has remained a core issue for many researchers because of two problems that are almost certain to arise when the ordinary least-squares (OLS) method is applied directly to this model. Firstly, if the number of lags is large but the sample size is small, it may not be possible to draw inferences about the parameters because the degrees of freedom are inadequate for the traditional tests of significance. Secondly, in most economic time series, successive (lagged) values tend to be highly correlated. This causes multicollinearity, which inflates the variances of the estimates and leads to imprecise estimation of the parameters (Baltagi, 2011; Gujarati, 2003; Maddala, 1977). To overcome these problems, various methods have been proposed for estimation of the DLM (Almon, 1965; Koyck, 1954; Shiller, 1973). All these methods are based on some prior knowledge about the behaviour of the parameters, and this prior knowledge includes non-stochastic and stochastic smoothness priors (Gujarati, 2003; Vinod & Ullah, 1981). Among the available methods, the polynomial distributed lag (PDL) technique proposed by Almon (1965) has gained much popularity. In Almon's technique, the DLM is transformed into a model which has fewer explanatory variables, and thus the severity of collinearity is reduced considerably. This technique not only reduces the effect of multicollinearity but also reduces the number of parameters to be estimated in the DLM (Maddala, 1977). The Almon procedure is based on the assumption that the parameters in the DLM lie on a polynomial of suitable degree.
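As a minimal sketch of the polynomial restriction just described, the Python snippet below builds the matrix that maps r + 1 polynomial coefficients to s + 1 lag weights. The function names and the numerical values of the polynomial coefficients are hypothetical and serve only to illustrate how the number of free parameters is reduced.

```python
def almon_matrix(s, r):
    """Return the (s+1) x (r+1) matrix whose row i is (1, i, i^2, ..., i^r)."""
    return [[i ** j for j in range(r + 1)] for i in range(s + 1)]


def lag_weights(alpha, s):
    """Map polynomial coefficients alpha to the s+1 lag weights beta_i."""
    A = almon_matrix(s, len(alpha) - 1)
    return [sum(a_ij * a_j for a_ij, a_j in zip(row, alpha)) for row in A]


s, r = 10, 3                      # illustrative lag length and polynomial degree
alpha = [0.5, 0.3, -0.04, 0.001]  # hypothetical polynomial coefficients
beta = lag_weights(alpha, s)
print(len(alpha), "free parameters generate", len(beta), "lag weights")
```

With these illustrative settings, 4 coefficients determine all 11 lag weights, which is the source of both the reduced parameter count and the reduced collinearity.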
Application of the OLS, combined with the Almon technique, requires that the DLM meets all the assumptions about the error term that are made for a classical linear regression model. One of the important assumptions is homoscedasticity, which states that the error variances are constant across all observations. In practice, this assumption is frequently violated, and the random errors are then said to be heteroscedastic. In the presence of heteroscedasticity, the usual OLS estimator (OLSE) remains unbiased, consistent and asymptotically normal but becomes inefficient, and its usual covariance matrix estimator becomes biased and inconsistent.
If the form of heteroscedasticity is known, the method of weighted least-squares (WLS) is the best choice to get efficient estimates of the parameters, but the form of heteroscedasticity is seldom known. In such a situation, the estimated weighted least-squares (EWLS) estimator can be used, which can provide more efficient estimates than the OLSE (Fuller & Rao, 1978; Pasha, 1982). Besides this, some adaptive estimators have been proposed which are more efficient than the OLSE (Carroll, 1982; Carroll & Ruppert, 1982; Robinson, 1987). A vast literature (Ahmed, Aslam, & Pasha, 2011; Aslam, 2006, 2014; Aslam, Riaz, & Altaf, 2013) is available to justify the use of such adaptive estimators as a substitute for the OLSE in the presence of heteroscedasticity of unknown form. However, estimation of the DLM by the Almon technique in the presence of heteroscedastic errors has received little attention from researchers. This paper addresses this issue and proposes the use of an adaptive estimator combined with the Almon technique as a replacement for the OLSE for efficient estimation of the DLM in the presence of heteroscedasticity of unknown form. This paper is organized as follows. Section 2 describes the DLM with heteroscedastic errors and the adaptive estimation. Section 3 briefly describes the Monte Carlo scheme and data generating process along with the results. Section 4 concludes the paper.

The distributed lag model and adaptive estimation
The general form of a finite DLM is
$$y_t = \sum_{i=0}^{s} \beta_i x_{t-i} + u_t, \qquad t = s+1, \ldots, T. \tag{1}$$
In matrix notation, the model in Equation (1) can be written as follows:
$$y = X\beta + u. \tag{2}$$
Clearly, $y$ is a $(T-s) \times 1$ vector of observations on the response variable, $X$ is the $(T-s) \times (s+1)$ matrix of current and lagged values of the explanatory variable, $\beta$ is an $(s+1) \times 1$ vector of unknown lag coefficients or lag weights, and $u$ is a vector of random errors with mean zero and variances $\sigma_t^2$.
Direct application of the OLS to estimate the DLM given in Equation (1) may have some serious problems, as discussed earlier. For estimation of the DLM, Almon (1965) proposed a technique which is widely used by practitioners. Almon assumed that the coefficients $\beta_i$ could be well approximated by a polynomial in $i$ of degree $r$, which is less than $s$ (the lag length), i.e.,
$$\beta_i = \alpha_0 + \alpha_1 i + \alpha_2 i^2 + \cdots + \alpha_r i^r, \qquad i = 0, 1, \ldots, s. \tag{3}$$
Substituting Equation (3) in Equation (1), one can estimate $\alpha$ by the usual OLS procedure and then, using Equation (3), one can get the estimates of $\beta_i$. In matrix notation, Equation (3) can be written as
$$\beta = A\alpha, \tag{4}$$
where $\beta$ is as mentioned before and
$$A = \begin{pmatrix} 1 & 0 & 0 & \cdots & 0 \\ 1 & 1 & 1 & \cdots & 1 \\ 1 & 2 & 2^2 & \cdots & 2^r \\ \vdots & \vdots & \vdots & & \vdots \\ 1 & s & s^2 & \cdots & s^r \end{pmatrix}, \qquad \alpha = (\alpha_0, \alpha_1, \ldots, \alpha_r)'$$
are an $(s+1) \times (r+1)$ matrix and an $(r+1) \times 1$ vector, respectively. By substituting Equation (4) in Equation (2), one can obtain
$$y = XA\alpha + u = Z\alpha + u. \tag{5}$$
Here, $Z = XA$. The OLS estimator of $\alpha$ is as follows:
$$\hat{\alpha}_{OLS} = (Z'Z)^{-1} Z'y.$$
The parameter vector $\beta$ can be estimated as follows:
$$\hat{\beta} = A\hat{\alpha}_{OLS},$$
which is the Almon estimator of $\beta$. If the error term is homoscedastic, it is the best linear unbiased estimator (BLUE) of $\beta$ (Judge, Griffiths, Hill, Lütkepohl, & Lee, 1980). A major advantage of the Almon technique is that if a distributed lag is assumed to lie on a polynomial of a specified degree, then the distributed lag can be estimated by standard linear regression methods (Fair & Jaffee, 1971). Moreover, the Almon technique reduces the effect of multicollinearity (Fair & Jaffee, 1971; Judge et al., 1980; Maddala, 1977) because there are fewer explanatory variables in the transformed model than in the original DLM.
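The Almon OLS steps described above (build the lagged design matrix X, form Z = XA, regress y on Z, then map the polynomial coefficients back to lag weights) can be sketched in plain Python as follows. This is only an illustration under stated assumptions: the helper names, the small Gaussian-elimination solver, and the noise-free toy data with s = 4 and r = 2 are all hypothetical and are not taken from the paper.

```python
import random


def almon_matrix(s, r):
    """(s+1) x (r+1) matrix with row i equal to (1, i, ..., i^r)."""
    return [[i ** j for j in range(r + 1)] for i in range(s + 1)]


def matmul(P, Q):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*Q)] for row in P]


def solve(M, b):
    """Solve M x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [list(M[i]) + [b[i]] for i in range(n)]
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(aug[i][k]))
        aug[k], aug[p] = aug[p], aug[k]
        for i in range(k + 1, n):
            f = aug[i][k] / aug[k][k]
            for j in range(k, n + 1):
                aug[i][j] -= f * aug[k][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (aug[i][n] - sum(aug[i][j] * x[j] for j in range(i + 1, n))) / aug[i][i]
    return x


def lag_matrix(x, s):
    """Rows (x_t, x_{t-1}, ..., x_{t-s}) for t = s+1, ..., T."""
    return [[x[t - i] for i in range(s + 1)] for t in range(s, len(x))]


def almon_ols(x, y, s, r):
    """Return (alpha_hat, beta_hat) with alpha_hat = (Z'Z)^{-1} Z'y and beta_hat = A alpha_hat."""
    A = almon_matrix(s, r)
    Z = matmul(lag_matrix(x, s), A)
    Zt = [list(col) for col in zip(*Z)]
    ZtZ = matmul(Zt, Z)
    Zty = [sum(z * y_t for z, y_t in zip(row, y)) for row in Zt]
    alpha_hat = solve(ZtZ, Zty)
    beta_hat = [sum(a * c for a, c in zip(row, alpha_hat)) for row in A]
    return alpha_hat, beta_hat


# Toy check on noise-free data (hypothetical values): the lag weights are recovered.
random.seed(1)
s, r, T = 4, 2, 40
x = [random.gauss(0.0, 1.0) for _ in range(T)]
alpha_true = [1.0, -0.4, 0.05]
beta_true = [sum(a * c for a, c in zip(row, alpha_true)) for row in almon_matrix(s, r)]
y = [sum(b * v for b, v in zip(beta_true, row)) for row in lag_matrix(x, s)]
alpha_hat, beta_hat = almon_ols(x, y, s, r)
print([round(b, 6) for b in beta_hat])
```

Because y is generated without noise and the true weights lie exactly on a quadratic, the estimate coincides with the true lag weights up to rounding; with noisy data the same function returns the Almon OLS estimate.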
In the presence of heteroscedasticity, $\hat{\alpha}_{OLS}$ does not remain efficient and consequently $\hat{\beta}$ also becomes inefficient. Thus, an efficient estimator of $\alpha$ is desired, which will in turn give an efficient estimator of $\beta$. The WLS estimator is one possible solution to estimate model (5):
$$\hat{\alpha}_{WLS} = (Z'WZ)^{-1} Z'Wy, \qquad W = \mathrm{diag}(1/\sigma_t^2).$$
However, in practical situations, the $\sigma_t^2$ are usually not known and their estimates are used. The resulting EWLS estimator of the parameter vector $\alpha$ is
$$\hat{\alpha}_{EWLS} = (Z'\hat{W}Z)^{-1} Z'\hat{W}y,$$
where $\hat{W}$ is a nonsingular diagonal matrix containing the estimated weights used for the variance-stabilizing transformation. Now the concern is to construct $\hat{W}$, and for this purpose an adaptive estimator proposed by Carroll (1982) can be used. Carroll (1982) assumed the error variance to be a smooth function of the mean values:
$$\sigma_t^2 = g(\tau_t),$$
where $g$ is unknown and $\tau_t = Z_t'\alpha$, with $Z_t'$ denoting the $t$-th row of $Z$. The estimate of $\tau_t$, in the context of model (5), is given as follows:
$$\hat{\tau}_t = Z_t'\hat{\alpha}_{OLS}.$$
Carroll (1982) presented the kernel estimator of $\sigma_t^2$ in the form of the Nadaraya-Watson (Nadaraya, 1964; Watson, 1964) estimator as
$$\hat{\sigma}_t^2 = \frac{\sum_{j} K\left(\frac{\hat{\tau}_t - \hat{\tau}_j}{h}\right) \hat{u}_j^2}{\sum_{j} K\left(\frac{\hat{\tau}_t - \hat{\tau}_j}{h}\right)},$$
where $K(\cdot)$ is the kernel function with smoothing parameter $h$ and $\hat{u}_j$ are the OLS residuals of model (5).
The estimated weight matrix is then $\hat{W} = \mathrm{diag}(1/\hat{\sigma}_t^2)$, and the adaptive weighted least-squares (AWLS) estimator of $\alpha$ can be defined as follows:
$$\hat{\alpha}_{AWLS} = (Z'\hat{W}Z)^{-1} Z'\hat{W}y.$$
Carroll (1982) proved that $\hat{\alpha}_{AWLS}$ has the same asymptotic properties as the WLS estimator $\hat{\alpha}_{WLS}$ based on the true weights. It has a normal limiting distribution with mean $\alpha$ and covariance $(Z'WZ)^{-1}$. Furthermore, such an adaptive estimator is more efficient than the OLSE in the presence of heteroscedasticity of unknown form (for more details, see Carroll, 1982, pp. 1226-1227).
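A kernel-smoothed variance estimate of the Nadaraya-Watson type can be sketched as below. The sketch assumes a Gaussian kernel and smooths the squared OLS residuals over the fitted values; the function names, the toy fitted values and residuals, and the bandwidth h = 0.5 are hypothetical illustrations rather than quantities from the study.

```python
import math


def gaussian_kernel(z):
    """Standard normal density, used here as the kernel K."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def nw_variances(tau_hat, resid, h):
    """Nadaraya-Watson estimate of each error variance.

    For every t, returns the kernel-weighted average of the squared
    residuals, with weights K((tau_hat[t] - tau_hat[j]) / h).
    """
    estimates = []
    for tau_t in tau_hat:
        w = [gaussian_kernel((tau_t - tau_j) / h) for tau_j in tau_hat]
        estimates.append(sum(w_j * u * u for w_j, u in zip(w, resid)) / sum(w))
    return estimates


# Hypothetical fitted values and OLS residuals.
tau_hat = [0.0, 0.5, 1.0, 1.5, 2.0]
resid = [0.1, -0.2, 0.4, -0.8, 1.6]
sigma2_hat = nw_variances(tau_hat, resid, h=0.5)
weights = [1.0 / v for v in sigma2_hat]  # diagonal of the estimated weight matrix
print([round(v, 4) for v in sigma2_hat])
```

Each estimate is a weighted average of squared residuals, so it is always positive and the implied weights are well defined.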
The use of such an efficient estimator combined with the Almon technique may increase the efficiency of the Almon estimator when a DLM is plagued with heteroscedasticity of unknown form. Therefore, an efficient estimator of the parameter vector $\beta$ based on the adaptive estimator $\hat{\alpha}_{AWLS}$ can be given as follows:
$$\hat{\beta}_{AWLS} = A\hat{\alpha}_{AWLS}.$$
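The final step, weighted least squares on the transformed model followed by the mapping back to lag weights, might look like the following sketch. It takes the variance estimates as given (for instance, from a kernel smoother); the function names, the small solver, and the toy numbers with s = 2 and r = 1 are hypothetical.

```python
def solve(M, b):
    """Solve M x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [list(M[i]) + [b[i]] for i in range(n)]
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(aug[i][k]))
        aug[k], aug[p] = aug[p], aug[k]
        for i in range(k + 1, n):
            f = aug[i][k] / aug[k][k]
            for j in range(k, n + 1):
                aug[i][j] -= f * aug[k][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (aug[i][n] - sum(aug[i][j] * x[j] for j in range(i + 1, n))) / aug[i][i]
    return x


def awls_almon(Z, y, A, sigma2_hat):
    """Weighted LS of y on Z with weights 1/sigma2_hat, then beta = A alpha."""
    w = [1.0 / v for v in sigma2_hat]
    k = len(Z[0])
    ZtWZ = [[sum(w_t * row[a] * row[b] for w_t, row in zip(w, Z)) for b in range(k)]
            for a in range(k)]
    ZtWy = [sum(w_t * row[a] * y_t for w_t, row, y_t in zip(w, Z, y)) for a in range(k)]
    alpha_hat = solve(ZtWZ, ZtWy)
    beta_hat = [sum(c * a for c, a in zip(row, alpha_hat)) for row in A]
    return alpha_hat, beta_hat


# Toy example (hypothetical numbers): s = 2 lags, linear polynomial (r = 1).
A = [[1, 0], [1, 1], [1, 2]]
X = [[1.0, 0.5, -0.2], [0.3, 1.0, 0.5], [-0.7, 0.3, 1.0],
     [1.2, -0.7, 0.3], [0.4, 1.2, -0.7], [-0.9, 0.4, 1.2]]
Z = [[sum(x_i * A[i][j] for i, x_i in enumerate(row)) for j in range(2)] for row in X]
alpha_true = [0.6, -0.2]
y = [sum(z * a for z, a in zip(row, alpha_true)) for row in Z]  # noise-free response
sigma2_hat = [1.0, 4.0, 9.0, 1.0, 4.0, 9.0]                     # assumed variance estimates
alpha_hat, beta_hat = awls_almon(Z, y, A, sigma2_hat)
print([round(b, 6) for b in beta_hat])
```

With noise-free data any positive weights reproduce the true coefficients; the weights matter only once the errors are heteroscedastic, where observations with larger estimated variance are down-weighted.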

Numerical evaluation
Following Frost (1975) and Güler, Gültay, and Kaçiranlar (2017), we used the following model to generate observations:
$$y_t = \sum_{i=0}^{s} \beta_i x_{t-i} + u_t, \tag{6}$$
$$x_1 = v_1, \tag{7a}$$
together with Equation (7b) for the subsequent values of $x_t$, where $u_t = \sigma_t \varepsilon_t$ and the $\varepsilon_t$ are independent normal variates with mean 0 and standard deviation 1. Stated differently, $u_t = \sigma_t \varepsilon_t$ is normally distributed with mean 0 and standard deviation $\sigma_t$. Following the work of Alkhamisi, Khalaf, and Shukur (2006), Månsson, Shukur, and Kibria (2010) and Aslam (2014), two other probability distributions are used for the error term, namely Student's t(6) and F(4, 16). Following Frost (1975), the lag length s = 10 and the degree of polynomial r = 3 are used and the values of $\beta_0$ to $\beta_{10}$ are set to be 0. The variate $v_t$ is independent normal with mean 0 and standard deviation 1. For each replication, the data are generated by drawing values of $v_t$ for $t = 1, 2, \ldots, 40$ and $u_t$ for $t = 11, 12, \ldots, 40$. The values of $x_t$ are generated using Equations (7a) and (7b) and the values of $y_t$ are generated by Equation (6). Each replication allows a regression with T = 30 observations (t = 11 to 40) to be run with lags up to 10 periods. The observations of $x_t$ (T = 30) thus generated are kept fixed in the simulation study. However, these observations are replicated two, four and eight times to get larger samples with T = 60, 120 and 240 observations, respectively. This replication was done so that the degree of heteroscedasticity remains the same for all sample sizes (see Aslam, 2014; Cribari-Neto, 2004, for such replications to get large samples).
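The construction of heteroscedastic errors as a scale factor times a base error can be sketched as follows for the three error distributions mentioned. This is an illustration under assumptions: the t(6) and F(4, 16) draws are standardized here to mean 0 and variance 1 (an assumption for the t case), the chi-square variates are built from gamma draws, and the function names and the scale values are hypothetical.

```python
import math
import random


def standardized_error(dist, rng):
    """Draw one error with mean 0 and variance 1 from the named distribution."""
    if dist == "normal":
        return rng.gauss(0.0, 1.0)
    if dist == "t6":
        chi2 = rng.gammavariate(3.0, 2.0)        # chi-square with 6 d.f.
        t = rng.gauss(0.0, 1.0) / math.sqrt(chi2 / 6.0)
        return t / math.sqrt(6.0 / 4.0)          # Var[t(6)] = 6 / (6 - 2)
    if dist == "f4_16":
        num = rng.gammavariate(2.0, 2.0) / 4.0   # chi-square(4) / 4
        den = rng.gammavariate(8.0, 2.0) / 16.0  # chi-square(16) / 16
        mean = 16.0 / 14.0                       # d2 / (d2 - 2)
        var = 2.0 * 16.0 ** 2 * (4 + 16 - 2) / (4 * 14.0 ** 2 * 12.0)
        return (num / den - mean) / math.sqrt(var)
    raise ValueError("unknown distribution: " + dist)


def heteroscedastic_errors(sigma, dist, rng):
    """Return u_t = sigma_t * eps_t for each supplied standard deviation sigma_t."""
    return [s_t * standardized_error(dist, rng) for s_t in sigma]


rng = random.Random(2024)
sigma = [0.5, 1.0, 2.0, 4.0]  # hypothetical error standard deviations
for dist in ("normal", "t6", "f4_16"):
    print(dist, [round(u, 3) for u in heteroscedastic_errors(sigma, dist, rng)])
```

Standardizing the base errors keeps the scale factors interpretable as standard deviations whichever distribution is used.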
The expected correlation between successive values of $x_t$ is indicated by λ. Following Güler et al. (2017), we used five different values of λ: 0.5, 0.8, 0.9, 0.95, and 0.99. The degree of multicollinearity is expected to increase as λ increases, so this setup allows us to investigate the behaviour of the estimators under an increasing degree of multicollinearity.
The values of $x_t$ are generated at the beginning of the simulations and kept fixed, while the values of $y_t$ are regenerated in each replication. After constructing the matrix X, Z is obtained as Z = XA.
Following Cribari-Neto and da Silva (2011) and Aslam (2014), the error variances $\sigma_t^2$ are generated through a skedastic function governed by a parameter γ. The degree of heteroscedasticity is measured by $\delta = \max(\sigma_t^2) / \min(\sigma_t^2)$. It is a useful measure to represent the strength of heteroscedasticity from mild to severe. In each case, the value of γ is chosen in such a way that δ ≈ 4, 36 and 100 in order to represent mild, moderate and severe heteroscedasticity, respectively. Obviously, for γ = 0, δ = 1 indicates homoscedasticity in the error terms.
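To illustrate how a target degree of heteroscedasticity can be reached, the sketch below computes δ and calibrates γ. It assumes an exponential skedastic function, sigma_t^2 = exp(γ x_t), purely for illustration; this functional form, the function names, and the regressor values are assumptions and may differ from the function actually used in the study.

```python
import math


def heteroscedasticity_degree(sigma2):
    """delta = max(sigma_t^2) / min(sigma_t^2); delta = 1 means homoscedasticity."""
    return max(sigma2) / min(sigma2)


def gamma_for_target(x, delta_target):
    """Solve exp(gamma * (max(x) - min(x))) = delta_target for gamma.

    ASSUMPTION: sigma_t^2 = exp(gamma * x_t), used here only for illustration.
    """
    return math.log(delta_target) / (max(x) - min(x))


x = [-1.3, 0.2, 0.9, -0.4, 1.7, 0.5]  # hypothetical regressor values
for target in (1, 4, 36, 100):
    gamma = gamma_for_target(x, target)
    sigma2 = [math.exp(gamma * x_t) for x_t in x]
    print(target, round(gamma, 4), round(heteroscedasticity_degree(sigma2), 4))
```

Under this assumed form, γ = 0 gives δ = 1, and δ depends on the regressor only through its range, which is why replicating the same regressor values leaves δ unchanged.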
Following Ahmed et al. (2011) and Aslam et al. (2013), we used the normal kernel for the estimation of the error variances. As noted in Roy (2002), in the nonparametric literature the choice of the kernel function is not crucial as long as it satisfies certain regularity conditions and the sample size is not small (see also Delgado, 1992). The normal kernel is
$$K(z) = \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{z^2}{2}\right).$$
According to Silverman (1998), the optimal choice for the smoothing parameter $h$ in the case of the normal kernel is $h = c S_x n^{-0.2}$ with $c = 1.06$, where $S_x$ is the sample standard deviation of the regressor $x$ and $n$ is the sample size. The number of Monte Carlo replications is set to 5000. All the computations are performed through programming routines developed in the R language (R 3.2.4).
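The rule-of-thumb bandwidth quoted above is straightforward to compute. The following sketch uses Python's standard library; the function name and the sample values are hypothetical.

```python
import statistics


def silverman_bandwidth(values, c=1.06):
    """Rule-of-thumb bandwidth h = c * S * n^(-0.2) for a normal kernel.

    S is the sample standard deviation (n - 1 divisor) of the supplied values.
    """
    n = len(values)
    return c * statistics.stdev(values) * n ** (-0.2)


x = [1.0, 2.0, 3.0, 4.0, 5.0]  # hypothetical regressor values
h = silverman_bandwidth(x)
print(round(h, 4))
```

The bandwidth shrinks slowly with the sample size (at rate n^(-1/5)), so larger samples are smoothed over narrower neighbourhoods.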
To evaluate the performance of the conventional OLSE and our proposed AWLSE when the DLM is plagued with heteroscedasticity of unknown form, we use the mean squared error (MSE) criterion and favour the estimator which yields the lower MSE. A large literature (Alheety & Kibria, 2009; Ahmed et al., 2011; Aslam, 2014; Aslam et al., 2013; Gibbons, 1981; Kibria, 2003; Lawless & Wang, 1976; Özbay, Kaçıranlar, & Dawoud, 2017) is available to justify the use of the MSE to evaluate the performance of estimators.
For any particular estimator $\hat{\beta}$ of $\beta$, the MSE is given by
$$\mathrm{MSE}(\hat{\beta}) = E\left[(\hat{\beta} - \beta)'(\hat{\beta} - \beta)\right].$$
However, for evaluation purposes, the estimated MSE (EMSE) for any estimator $\hat{\beta}$ of $\beta$ can be given by
$$\mathrm{EMSE}(\hat{\beta}) = \frac{1}{M} \sum_{i=1}^{M} (\hat{\beta}_{(i)} - \beta)'(\hat{\beta}_{(i)} - \beta),$$
where $M$ is the number of repetitions in a simulation and $\hat{\beta}_{(i)}$ is the estimated value of $\beta$ in the $i$th repetition. The relative efficiency of the AWLSE with respect to the OLSE is assessed by comparing the EMSE values of the two estimators.
The simulated EMSE results for the OLSE and AWLSE are presented in Figures 1 and 2 when the probability distribution of the error term is N(0, 1). Figure 1 shows the EMSE against sample size for the case when λ = 0.50. In Figure 1, panels (a), (b), (c) and (d) correspond to homoscedasticity and mild, moderate and severe heteroscedasticity, respectively. Figure 1 indicates that the EMSE of the OLSE and that of the AWLSE remain almost identical for all sample sizes when the error term is homoscedastic (δ = 1). For the case of mild heteroscedasticity (δ = 4), the EMSE of the AWLSE remains slightly lower than that of the OLSE. However, the EMSE of both estimators tends to increase with the degree of heteroscedasticity. For the cases of moderate and severe heteroscedasticity (i.e., δ = 36 and 100), the EMSE of the AWLSE remains smaller than that of the OLSE for all sample sizes, which indicates the superiority of the AWLSE over the OLSE in terms of efficiency. Furthermore, the EMSE of both estimators decreases considerably as the sample size increases. Almost the same behaviour of the estimators is observed for the other collinearity levels.
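A minimal sketch of the EMSE computation over Monte Carlo replicates is given below. The function name and the toy replicate values are hypothetical, and the plain EMSE ratio printed at the end is only one common convention for comparing two estimators, shown here as an assumption rather than as the paper's own efficiency formula.

```python
def emse(estimates, beta):
    """Average squared Euclidean distance between each replicate estimate and beta."""
    total = 0.0
    for beta_hat in estimates:
        total += sum((b_hat - b) ** 2 for b_hat, b in zip(beta_hat, beta))
    return total / len(estimates)


# Hypothetical replicate estimates (M = 2) of a two-element coefficient vector.
beta = [1.0, 1.0]
ols_draws = [[1.0, 2.0], [3.0, 4.0]]
awls_draws = [[1.0, 1.5], [2.0, 2.0]]

emse_ols = emse(ols_draws, beta)
emse_awls = emse(awls_draws, beta)
ratio = emse_ols / emse_awls  # assumed convention: values above 1 favour the second estimator
print(emse_ols, emse_awls, round(ratio, 3))
```

In a real study the two lists would each hold one estimated coefficient vector per replication, computed from the same simulated data sets.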
The results for the case of severe heteroscedasticity are presented in Figure 2 for discussion purposes. In Figure 2, panels (a), (b), (c) and (d) correspond to λ = 0.80, 0.90, 0.95 and 0.99, respectively. A dramatic increase in the EMSE of both estimators is observed as the collinearity level increases. However, the AWLSE yields a lower EMSE at all collinearity levels. Moreover, the difference between the EMSE values of the two estimators shrinks as the collinearity level increases.
Tables 1 and 2 present the EMSE of the OLSE and AWLSE when the probability distribution of the error term is t(6) and F(4, 16). The performance of the AWLSE in these cases is almost the same as in the case of normal errors. The efficiency of the AWLSE improves further when the error term is not only heteroscedastic but also non-normal, especially when the probability distribution of the error term is standardized F(4, 16). For example, when T = 30 and λ = 0.50, the gain in efficiency from using the AWLSE is 70.78%, 74.29% and 103.18% for errors distributed as N(0, 1), t(6) and F(4, 16), respectively, if the heteroscedasticity is moderate (δ = 36), while the corresponding gains in efficiency are 106.75%, 110.60% and 161.12% for severe heteroscedasticity (δ = 100). This shows that the AWLSE performs better not only under a normal error distribution but also under non-normal error distributions.
Moreover, it is observed that the level of collinearity and the degree of heteroscedasticity have an adverse effect on the EMSE of both estimators (the OLSE and AWLSE): the EMSE of both estimators tends to increase with the level of collinearity and the degree of heteroscedasticity. On the other hand, the EMSE decreases as the sample size grows. The probability distribution of the errors has an almost similar effect on the EMSE of both estimators, but the gain in efficiency of the AWLSE is the largest when the errors are F(4, 16).

Conclusion
This study focuses on estimating the DLM when it is plagued with heteroscedasticity of unknown form. One of the most popular techniques for estimating the parameters of the DLM is the one proposed by Almon (1965). In the Almon technique, the DLM is transformed under the assumption that the lag coefficients lie on a polynomial of some suitable degree, and then the OLS is applied to the transformed model. If the error term is homoscedastic, there is no hindrance in applying the OLS to the transformed model, but in the presence of heteroscedastic errors, the OLSE for the transformed model may become seriously inefficient. Based on the work of Carroll (1982), we have proposed an adaptive estimator combined with the Almon technique, which is more efficient than the conventional Almon estimator based on the OLS. A Monte Carlo experiment is conducted and the AWLSE is compared with the conventional OLSE for the Almon technique, using the MSE criterion. The estimators are evaluated by varying the degree of heteroscedasticity, the degree of collinearity, the sample size and the probability distribution of the error term. The simulation results reveal that the proposed estimator (AWLSE) is an attractive choice, being more efficient than the conventional Almon OLSE in the presence of heteroscedasticity of unknown form.