Macroeconomic Forecasts in Models with Bayesian Averaging of Classical Estimates

The aim of this paper is to construct a forecasting model for predicting basic macroeconomic variables, namely the GDP growth rate, the unemployment rate, and consumer price inflation. Bayesian Averaging of Classical Estimates (BACE) is employed to select the best set of regressors. The models are atheoretical (i.e. they do not reflect causal relationships postulated by macroeconomic theory), and the role of regressors is played by indicators based on business and consumer tendency surveys. Additionally, the survey-based indicators are included with a lag that makes it possible to forecast the variables of interest (GDP, unemployment, and inflation) for the four forthcoming quarters without any additional assumptions concerning the values of predictor variables in the forecast period. Bayesian Averaging of Classical Estimates is a method that allows for a full and controlled overview of all econometric models which can be obtained from a particular set of regressors. In this paper the authors describe the method of generating a family of econometric models and the procedure for selecting a final forecasting model. The procedure is verified by means of out-of-sample forecasts of the main economic variables for the four quarters of 2011. The accuracy of the forecasts implies that there is still a need to search for new solutions in atheoretical modelling.


Introduction
The main aim of this paper is to construct a model that would accurately forecast basic macroeconomic indicators, i.e. the GDP growth rate (GDP), consumer price inflation (CPI), and the unemployment rate (UNE) measured in line with the methodology of the Polish Labour Force Survey (BAEL). The model presented in this paper is by definition atheoretical, i.e. its specification is not intended to reflect the causal relations derived from economic theory. Additionally, it is based on the results of business and consumer tendency surveys.
The events of the crisis period proved that Diebold's concern about macroeconomic forecasting was justified.
Dynamic simultaneous equations models, general equilibrium models, dynamic stochastic general equilibrium models and vector autoregressive (VAR) models were predominantly unable to generate accurate short-term macroeconomic predictions in that period.
The concept of atheoretical model construction was proposed by Sargent and Sims (1977) and Sims (1980). The idea was either acknowledged or criticised for being "atheoretical macroeconomics" (Cooley & LeRoy, 1985). Nevertheless, it was later recognised with the Nobel Memorial Prize in Economic Sciences.
Among the first scholars to employ atheoretical modelling augmented with business and consumer survey data in macroeconomic forecasting were Hansson, Jansson, and Lof (2003). Data from business and consumer tendency surveys were employed in Polish forecasting models and discussed, among others, in papers by Białowolski et al. (2007) and Białowolski, Kuszewski & Witkowski (2010). A model based on survey data draws on the opinions and forecasts of thousands of respondents answering questionnaires concerning the economic situation. Following Anuszewska (2010), it is assumed that respondents in such surveys share and shape common knowledge of economic processes.
The earlier paper by Białowolski, Kuszewski & Witkowski (2010) proposed specifications of 6 families of forecasting models for the aforementioned variables: GDP, CPI, and UNE. The order in which regressors were examined differed across specifications, and the parameters were estimated with supervised stepwise regression, which made it possible to estimate the models following the general-to-specific rule in order to avoid collinearity. In this paper the authors investigate whether a different approach (i.e. BACE) leads to better out-of-sample forecasts.

Bayesian Averaging of Classical Estimates
The algorithm of Bayesian Averaging of Classical Estimates (BACE) was developed by Sala-i-Martin, Doppelhofer, and Miller (2004) and later elaborated by Stadelmann (2010). The general principle of the method can be summarized as follows: let $Y$ be the dependent variable (regressand) and let $X = \{X_1, X_2, \ldots, X_s\}$ be an $s$-element set of potential predictor variables.
Assuming that the independent variables $X_i$, $i = 1, \ldots, s$, are not collinear, it is possible to form $2^s - 1$ non-empty subsets of the set $X$, and a regression model can be estimated for each of these subsets. Each coefficient in the final model ($\beta_i$) obtained with the BACE method is a weighted average of the parameter estimates for the predictor variable $X_i$ in all regressions estimated with a classical estimator (the authors provide estimates obtained with ordinary least squares). The weights employed in the calculation of the averages are proportional to the "quality" of particular regressions, as determined by the values of the Schwarz criterion. In contrast to classical Bayesian methods, the role of prior assumptions in the model is minimal.
As long as we treat particular potential regressors as independent, the BACE approach does not require assuming a prior distribution for all parameters; it only requires specifying the expected number of predictor variables in the "correct" model.
The aim of the BACE algorithm is to estimate the coefficient $\beta_i$ in a regression of a given variable $Y$ on a particular $X_i$ without limiting the analysis to a single subset of $X$. Moreover, BACE estimation makes it possible to check the significance of a particular $X_i$ without limiting the models taken into consideration to merely one chosen subset of $X$. Since it is possible to form $2^s$ subsets of $X$ (including the empty one), it is also possible to form and estimate $2^s$ different models using one dependent variable $Y$. The models are denoted $M_j$, where $j = 1, \ldots, 2^s$. Let the number of independent variables in model $M_j$ be equal to $k_j$, and let $\bar{s}$ stand for the expected number of independent variables in the "correct" model. In the original BACE approach, each potential independent variable $X_i$ is considered separately, regardless of the remaining potential independent variables, as it is assumed that the probability of including a variable in the "correct" model does not vary among the variables. This leads to the assumption that the prior probability of including a variable $X_i$ in a model equals $\bar{s}/s$ for each $i = 1, \ldots, s$. Therefore, the prior probability of model $M_j$, $P(M_j)$, may be written with the binomial distribution:
$$P(M_j) = \left(\frac{\bar{s}}{s}\right)^{k_j} \left(1 - \frac{\bar{s}}{s}\right)^{s - k_j}$$
where the prior probability for each model with a predefined number of independent variables is identical. The posterior probability of model $M_j$ may be estimated with the use of the Bayes formula:
$$P(M_j \mid D) = \frac{P(M_j)\, P(D \mid M_j)}{\sum_{l=1}^{2^s} P(M_l)\, P(D \mid M_l)}$$
where $D$ is an $n$-element dataset (i.e. the values of the regressand and the regressors), and the probability of generating $D$ by model $M_j$ is:
$$P(D \mid M_j) \propto n^{-k_j/2}\, SSE_j^{-n/2}$$
where $SSE_j$ is the sum of squared residuals of model $M_j$. Combining the two expressions gives the weights based on the Schwarz criterion:
$$P(M_j \mid D) = \frac{P(M_j)\, n^{-k_j/2}\, SSE_j^{-n/2}}{\sum_{l=1}^{2^s} P(M_l)\, n^{-k_l/2}\, SSE_l^{-n/2}} \qquad (5)$$
The coefficient of $X_i$ in the final model is the weighted average
$$E(\beta_i \mid D) = \sum_{j=1}^{2^s} P(M_j \mid D)\, \hat{\beta}_{i,j} \qquad (6)$$
where $\hat{\beta}_{i,j}$ is the value of the parameter estimate (for a given variable $X_i$) which has been calculated with ordinary least squares in model $M_j$.
Moreover, since formula (5) may be employed for the purposes of establishing the level of "correctness" of a given model, the posterior probability that variable $X_i$ is included in the "correct" model may be expressed by:
$$P(\beta_i \neq 0 \mid D) = \sum_{j=1}^{2^s} I(\beta_{i,j} \neq 0)\, P(M_j \mid D) \qquad (8)$$
where $P(M_j \mid D)$ is the posterior probability of model $M_j$ from formula (5) and $I(\beta_{i,j} \neq 0)$ is an indicator of the presence of variable $X_i$ in $M_j$: it takes the value 1 if the variable $X_i$ is included in the model $M_j$ and 0 otherwise.
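The averaging scheme described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `ols` and `bace`, the simulated data, and the choice of three candidate regressors are assumptions made purely for illustration. Every model contains a constant, the weights follow the Schwarz-type expression with a binomial prior, and `expected_size` (the prior expected number of regressors) must be smaller than the number of candidates.

```python
import itertools
import math
import random


def ols(columns, y):
    """OLS of y on a constant plus the given columns; returns (coefficients, SSE)."""
    rows = [[1.0] + [col[t] for col in columns] for t in range(len(y))]
    p = len(rows[0])
    # Normal equations (X'X) b = X'y stored as an augmented matrix.
    a = [[sum(r[i] * r[j] for r in rows) for j in range(p)]
         + [sum(r[i] * yt for r, yt in zip(rows, y))] for i in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(p):
        pivot = max(range(c, p), key=lambda r: abs(a[r][c]))
        a[c], a[pivot] = a[pivot], a[c]
        for r in range(p):
            if r != c:
                f = a[r][c] / a[c][c]
                a[r] = [x - f * z for x, z in zip(a[r], a[c])]
    b = [a[i][p] / a[i][i] for i in range(p)]
    sse = sum((yt - sum(bi * xi for bi, xi in zip(b, r))) ** 2
              for r, yt in zip(rows, y))
    return b, sse


def bace(X, y, expected_size):
    """X is a list of s candidate columns. Returns the posterior mean of each
    coefficient and the posterior inclusion probability of each variable."""
    s, n = len(X), len(y)
    p_incl = expected_size / s  # prior inclusion probability of one variable
    log_w, fits = [], []
    for k in range(s + 1):  # k = 0 is the constant-only model
        for subset in itertools.combinations(range(s), k):
            b, sse = ols([X[i] for i in subset], y)
            log_prior = k * math.log(p_incl) + (s - k) * math.log(1 - p_incl)
            log_w.append(log_prior - 0.5 * k * math.log(n) - 0.5 * n * math.log(sse))
            fits.append((subset, b))
    top = max(log_w)  # subtract the maximum before exponentiating for stability
    w = [math.exp(v - top) for v in log_w]
    total = sum(w)
    mean, incl = [0.0] * s, [0.0] * s
    for (subset, b), wj in zip(fits, w):
        for pos, i in enumerate(subset):
            mean[i] += wj / total * b[pos + 1]  # b[0] is the constant
            incl[i] += wj / total
    return mean, incl


# Simulated data (illustrative only): y depends on the first candidate alone.
rng = random.Random(0)
n_obs = 40
X = [[rng.gauss(0, 1) for _ in range(n_obs)] for _ in range(3)]
y = [2.0 * X[0][t] + rng.gauss(0, 0.5) for t in range(n_obs)]
mean, incl = bace(X, y, expected_size=1)
```

With these simulated data the first candidate should receive an inclusion probability close to one and an averaged coefficient near 2, while the two irrelevant candidates are penalised by the Schwarz term.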
It is noteworthy that, assuming independence and exogeneity of particular regressors, the only parameter which requires a prior assumption is $\bar{s}$, the expected model size. As a rule, $\bar{s}$ is given a value that reflects the expected number of predictor variables in the "correct" model. However, it is difficult to talk about the "correct" model since the considerations in this paper are atheoretical in nature.
Application of the BACE algorithm is time-consuming as it requires a considerable number of models (precisely $2^s$) to be estimated. In the estimations carried out by Sala-i-Martin, Doppelhofer, and Miller (2004), the number of variables amounted to 60, which made estimation of all possible models infeasible.
The common practice in such situations is to estimate only a part of all possible equations on the basis of sets of regressors randomly selected from the initial set.
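To give a sense of scale, the sketch below counts the models and draws a random sample of regressor subsets. The sampling rule used here (each variable enters independently with the prior inclusion probability) is an assumption made for illustration; the text does not specify how the random subsets are drawn, and the names `n_models` and `sample_subsets` are hypothetical.

```python
import random


def n_models(s):
    """Number of distinct regressor subsets (including the empty one)."""
    return 2 ** s


def sample_subsets(s, expected_size, draws, seed=0):
    """Draw regressor subsets at random instead of enumerating all of them.
    Assumption: each variable enters independently with probability expected_size / s."""
    rng = random.Random(seed)
    p = expected_size / s
    seen = set()
    for _ in range(draws):
        seen.add(tuple(i for i in range(s) if rng.random() < p))
    return sorted(seen)


count_22 = n_models(22)   # 4194304 models for the 22 indicators used in this paper
count_60 = n_models(60)   # 1152921504606846976, far beyond full enumeration
subsets = sample_subsets(s=60, expected_size=6, draws=1000)
```

With 22 candidates full enumeration is still practical, whereas with 60 candidates only a sampled part of the model space can be estimated.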
In order to employ the BACE algorithm, a set of potential regressors and the regressand must be predefined. Furthermore, the regressors selected for the final model are not actually orthogonal, which may result in collinearity. Consequently, if we treat BACE only as an algorithm for pre-selecting variables for the final model, which is the most frequent approach, it is necessary to check for excessive collinearity among the regressors and, where needed, to modify the list of selected variables.
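One simple way to screen pre-selected regressors for excessive collinearity is to inspect pairwise correlations, as in the sketch below. This is an illustrative assumption rather than the check used in the paper: the source does not state which collinearity measure was applied, and the 0.9 threshold, the function names, and the toy series are all hypothetical.

```python
import math


def correlation(a, b):
    """Pearson correlation of two equally long series."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (z - mb) for x, z in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((z - mb) ** 2 for z in b)
    return cov / math.sqrt(va * vb)


def collinear_pairs(columns, names, threshold=0.9):
    """Return the pairs of regressors whose absolute correlation exceeds the threshold."""
    flagged = []
    for i in range(len(columns)):
        for j in range(i + 1, len(columns)):
            r = correlation(columns[i], columns[j])
            if abs(r) > threshold:
                flagged.append((names[i], names[j], round(r, 3)))
    return flagged


# Toy series: b is almost exactly 2 * a, c is only moderately related to both.
a = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
b = [2.1, 3.9, 6.2, 8.0, 9.9, 12.1]
c = [3.0, -1.0, 2.0, 0.0, 1.0, -2.0]
pairs = collinear_pairs([a, b, c], ["a", "b", "c"])
```

Here only the pair (a, b) is flagged, so one of the two would be a candidate for removal from the final specification.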
Consequently, the sequence of activities presented below was followed:
1. It was assumed that the aim of this paper is to construct models in which the current value of each forecast variable is explained by its own one-quarter lag and by lagged survey-based indicators. For GDP, GDP_t and GDP_{t-1} denote the current and the one-quarter lagged GDP growth rate, expressed on a year-on-year basis.
The remaining part of this paper is devoted to the theoretical grounds for selecting the preliminary set of predictors, followed by a presentation and discussion of the results.

Indicators from economic tendency surveys in modelling the GDP, inflation and unemployment
Economic tendency surveys and the indicators derived from them are intended to provide leading information about the state of the economy. Various papers have focused on the applicability of qualitative indicators from economic tendency surveys to forecasts of the main economic indicators (e.g. Białowolski, Kuszewski & Witkowski, 2010). Owing to the wide variety of questions in business and consumer tendency questionnaires, the data may serve as a source of leading information on production in the manufacturing sector, construction output, turnover, prices, investments, and changes in employment (with respect to the corporate sector), while the analysis of the household sector provides information on demand, including the propensity to buy durable goods (OECD, 2003; European Commission, 2006; Bieć, 1996). Unlike indicators with a clear tendency to lead, the indicator concerning inventory tends to lag rather than lead the cycle (Zarnowitz, 1992). Nevertheless, the indicators concerning inventory are very reliable and are recommended by the European Commission for creating a synthetic index of the economic situation in the manufacturing industry (European Commission, 2006). Furthermore, since this study is also devoted to forecasting inflation and unemployment, the forecast series of price changes (q5f) and of expected employment in companies (q6f) from the survey on the economic situation in industry are also taken into account.

The study made use of the following questions asked in the State of the Households Survey: predictions of the general economic situation (hhs_q4), unemployment change forecasts (hhs_q7), inflation expectations (hhs_q5), and forecasts of durable goods purchases (hhs_q9). The first three are directly related to the areas of interest in this forecast (GDP, unemployment, and inflation). As a matter of fact, purchases of durable goods are very sensitive to the expected variation in the economic situation. They are also the most volatile element of household consumption; therefore, the forecast of household behaviour in this respect is an important factor influencing forecasts of the general economic situation.

Estimation of the econometric models
In order to employ the BACE algorithm in the study, it is necessary to assume $\bar{s}$, the number of predictor variables in the "correct" model. The value of this number is assumed arbitrarily, which is especially typical of atheoretical models. Based on the previous research results by Białowolski, Kuszewski & Witkowski (2010), it has been assumed that $\bar{s} = 6$; however, it is noteworthy that the final number of predictor variables in particular models may differ from the prior assumption. The influence of a change in the value of $\bar{s}$ on the set of predictor variables has also been tested. The research leads to the conclusion that assuming $\bar{s} = 6$ is adequate. Bigger models often require an excessive number of variables to be eliminated due to their collinearity. Furthermore, the values of the probabilities calculated in line with (8) change proportionally to $\bar{s}$, but the ordering of the values (8) across the indicators of the economic situation is preserved.
The initial set of indicators of the economic situation under discussion comprises 22 variables. If we assume that $\bar{s} = 6$, then in principle the variables whose probability (8) exceeds 6/22 ≈ 0.27 should be included in the set. Table 1 contains the expected values of parameters (6), the Bayesian probabilities of inclusion (8), and the weighted empirical significance levels for the t-test. This paper presents only the results of estimation of models employing the three-period (m=3) lagged indicators of the economic situation. The results for models using one-period lagged indicators (m=1) are not discussed because the forecasts obtained from these models proved highly unsatisfactory.
Table 1. The results of the predictor selection process with application of the BACE algorithm in models with three-period lagged business and consumer sentiment indicators (m=3)
Note: Y(t-1) denotes the one-quarter lagged dependent variable, i.e. GDP, UNE or CPI respectively; the parameter values are estimated in line with formula (6) and P(X) in line with formula (8); t(X) is an average of the empirical significance level of a given variable with weights estimated in line with formula (5).
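The inclusion rule described above compares each posterior inclusion probability with the prior one, which is 6/22 for an expected model size of 6 and 22 candidates. A minimal sketch follows; the variable names `x1` to `x22` and all probability values are made up for illustration and are not results from this paper, and `select_regressors` is a hypothetical helper.

```python
def select_regressors(inclusion_prob, expected_size):
    """Keep the variables whose posterior inclusion probability exceeds the
    prior inclusion probability expected_size / number_of_candidates."""
    threshold = expected_size / len(inclusion_prob)
    keep = [name for name, p in inclusion_prob.items() if p > threshold]
    return threshold, keep


# Hypothetical posterior inclusion probabilities for 22 candidate indicators.
probs = {f"x{i}": 0.05 for i in range(1, 23)}
probs.update({"x3": 0.91, "x8": 0.40, "x15": 0.26})

threshold, keep = select_regressors(probs, expected_size=6)
```

In this made-up example `x15` falls just below the threshold of about 0.27 and is left out, which mirrors the borderline case discussed in the text where such a variable may still be added by judgement.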
The BACE algorithm assumes independence of regressors. In fact, it would be hard to find an area in macroeconomics where there is no relationship between the variables considered as regressors. Therefore, the set of regressors finally included in the models under consideration was slightly modified by excluding the variables responsible for an increase in collinearity. Additionally, in one case, the set of variables was expanded by one variable whose Bayesian probability of inclusion was slightly below 0.27. If the strict rules governing the BACE algorithm had been maintained in this case, the equation under consideration would contain one regressor only.
The model describing changes in consumer price inflation takes into account, among the regressors, the balance of answers to questions asked in the households' survey regarding the unemployment forecast (hhs_q7), purchases of durable goods (hhs_q9), and expectations concerning the inventory of manufacturing companies (ind_qf4).
However, no variables regarding the perception of processes on the consumer goods market are included.

Forecasts and conclusions
The lags of predictor variables in the econometric models were selected so as to enable forecasting GDP, UNE, and CPI four quarters ahead without additional assumptions concerning the values of predictor variables in the forecast period.
As already mentioned, the last observation in the time series of regressands was reported for the fourth quarter of 2010, and the last observation in the time series of data acquired from the economic tendency surveys was reported for the first quarter of 2011.
Since regressands lagged by one period are included as predictor variables, it is necessary to perform the forecast for four successive quarters in a stepwise manner.
The values of the regressors (variables derived from tendency surveys performed in the past) are known for the whole forecast period. The predicted values of GDP, UNE, and CPI for a given quarter become the values of predictor variables in the consecutive quarter.
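The stepwise procedure described above can be sketched as a simple recursion. The linear form with a constant, one autoregressive term, and lagged indicators is an assumption made for illustration, as are the function name `iterate_forecast` and all coefficient and indicator values below; none of them are estimates from this paper.

```python
def iterate_forecast(last_y, indicator_rows, intercept, ar_coef, betas):
    """Forecast one quarter at a time, feeding each forecast back in as the
    lagged dependent variable for the next quarter.

    indicator_rows holds, for each forecast quarter, the already known
    (lagged) survey indicator values used in that quarter's equation.
    """
    forecasts = []
    prev = last_y
    for row in indicator_rows:
        prev = intercept + ar_coef * prev + sum(b * x for b, x in zip(betas, row))
        forecasts.append(prev)
    return forecasts


# Hypothetical numbers: one indicator, constant over the four forecast quarters.
forecasts = iterate_forecast(
    last_y=2.0,
    indicator_rows=[[2.0], [2.0], [2.0], [2.0]],
    intercept=1.0,
    ar_coef=0.5,
    betas=[0.5],
)
# forecasts == [3.0, 3.5, 3.75, 3.875]
```

Because every step reuses the previous forecast rather than an observed value, any error made in an early quarter is carried into all later quarters.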
Such a procedure has a weakness, since it leads to an accumulation of forecast errors. Table 3 presents the results of the forecasting process for the four quarters of 2011.
Further research in the field of forecasting models of basic macroeconomic variables employing survey data will encompass the inclusion of dynamic factor models in the set of models under analysis (Boivin & Ng, 2005; Forni & Reichlin, 1998; Stock & Watson, 2002).
Furthermore, the possibilities offered by forecasting models whose parameters are estimated on the basis of time series with seasonal effects removed should be considered further, especially the methods of deseasonalization that take different phases of the economic situation into account (Canova & Ghysels, 1994).