INFLATION FORECASTING IN THE WESTERN BALKANS AND EU: A COMPARISON OF HOLT-WINTERS, ARIMA AND NNAR MODELS

The purpose of this paper is to compare the accuracy of three types of models, Autoregressive Integrated Moving Average (ARIMA) models, Holt-Winters models and Neural Network Auto-Regressive (NNAR) models, in forecasting the Harmonized Index of Consumer Prices (HICP) for the countries of the European Union and the Western Balkans (Montenegro, Serbia and North Macedonia). The models are compared based on the values of ME, RMSE, MAE, MPE, MAPE, MASE and Theil's U for the out-of-sample forecast. The key finding of this paper is that NNAR models give the most accurate forecast for the Western Balkans countries, while the ARIMA model gives the most accurate forecast of twelve-month inflation in EU countries. The Holt-Winters (additive and multiplicative) method proved to be the second best method for both groups of countries. The obtained results correspond to the fact that the European Union has been implementing a policy of strict inflation targeting for a long time, so the ARIMA models give the most accurate forecast of future inflation values. In the countries of the Western Balkans the targeting policy is not implemented in the same way and the NNAR models are better for inflation forecasting.


Introduction
Price stability is one of the goals of all countries, especially the countries of the European Union and its potential members, as documented through the political agendas of the European Union and the Maastricht convergence criteria (Golinelli and Orsi, 2002). Predicting the future value of inflation is of particular importance for all countries, whether or not they have clearly declared inflation targeting policies. Historically, the European Union has successfully pursued a policy of targeting inflation and has not had high inflationary developments in the past, while the countries of the Western Balkans aspiring to become members of the European Union had high inflation during the late twentieth century caused by numerous factors. The countries of the Western Balkans have diverse foreign exchange systems: Montenegro is a dollarized (euroized) economy, while Serbia and North Macedonia have their own currencies. The Harmonized Index of Consumer Prices (HICP) monitored for the European Union comprises a number of countries and their diverse foreign exchange systems.
The paper examines the applicability of Auto-Regressive Integrated Moving Average (ARIMA), Holt-Winters and Neural Network Auto-Regressive (NNAR) models to inflation forecasting for three countries of the Western Balkans (Montenegro, Serbia, North Macedonia) and the full member countries of the European Union over the observed 2010-2020 period. The models are compared in two stages: first, models are compared within their own class, and then the best models from each class are compared with each other.
There are many models that can be used for inflation modeling and forecasting. In this paper the possibility of modeling inflation with univariate models, and the accuracy of the resulting forecasts, are tested. The forecast of future values is based only on historical data on inflation. Models of this type often give a faster and more accurate forecast than more complex factor models. This is a consequence of the unpredictability of numerous factors and the impossibility of measuring and evaluating them accurately, as well as of their erroneous specification in the models of traditional econometric analysis. An additional limitation of factor models is the need to predict the values of all determinants of the observed series, which further complicates the work and leads to inaccuracies in forecasting.
The paper is organized as follows. A literature review is presented in the next section. The third section reviews the basic methodological bases of development and specifications of econometric models. The fourth section presents the empirical analyses and comparison of estimated models. Finally, conclusions are presented in the fifth section.

Literature review
Today, in modern monetary theory and central banking practice, price stability is usually associated with moderate price growth (Ascari and Sbordone, 2014). The level and degree of change in inflation have always been an interesting research topic for many researchers. Researchers' interest in the field reflected the current level of development of the empirical apparatus for forecasting time series. Meyler, Kenny and Quinn (1998) forecast inflation in Ireland by comparing the ARIMA models obtained using the Box-Jenkins methodology and objective penalty function methods. Pufnik and Kunovac (2006) give a forecast of short-term inflation based on the ARIMA model by observing the consumer price index (CPI) in Croatia. Other papers also deal with inflation forecasting using the ARIMA model, most frequently applying the Box-Jenkins methodology in model evaluation (Alnaa and Ahiakpor, 2011; Okafor and Shaibu, 2013). A comparison of the power of the Holt-Winters and ARIMA models to predict inflation is presented by Omane-Adjepong, Oduro and Oduro (2013).
In the literature special attention is given to comparisons of more advanced prognostic models, such as comparisons of ARMA, ARIMA and GARCH models (Nyoni, 2018), a comparison of VAR and ARIMA models in HICP prognosis in Austria (Fritzer, Moser and Scharler, 2002) and a comparison of ARIMA, VAR and ECM models (Uko and Nkoro, 2012). Suhartono (2005) compares the prognostic performances of the Neural Networks, ARIMA and ARIMAX models in inflation forecasting in Indonesia, and concludes that the Neural Networks model gives a more accurate inflation forecast than traditional econometric time series models. Sari, Mahmudy and Wibawa (2016) give an inflation forecast using the Backpropagation Neural Network method. McNelis and McAdam (2004) apply linear and neural network-based "thick" models for inflation forecasting in the USA, Japan and the euro area. Hubrich (2005) studies inflation in the European Union measured by the change in HICP and investigates whether the accuracy of forecasts of aggregate euro area inflation can be improved by aggregating forecasts of sub-indices of the HICP as opposed to forecasting the aggregate HICP directly.

ARIMA models
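ARIMA models are particularly suitable for short-term forecasts, and the model evaluation methodology is the result of the work of Box and Jenkins (1976). A seasonal ARIMA(p,d,q)(P,D,Q)_{12} model for monthly data with all orders equal to one can be written as

$$(1 - \phi_1 B)(1 - \Phi_1 B^{12})(1 - B)(1 - B^{12})\, y_t = (1 + \theta_1 B)(1 + \Theta_1 B^{12})\, \varepsilon_t$$

where $B$ is the backshift operator ($B y_t = y_{t-1}$), $y_t$ is the observed series and $\varepsilon_t$ is a white noise error term. In seasonal ARIMA models p represents the number of autoregressive elements, d is the order of differencing of the series, q is the number of moving average elements, while P is the number of seasonal autoregressive elements, D is the number of seasonal differences and Q is the number of seasonal moving average elements.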
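The differencing part of a seasonal ARIMA model, the operators (1 − B) and (1 − B^12), can be illustrated with a minimal Python sketch. The function names and the synthetic monthly series below are hypothetical and serve only as an illustration; they are not taken from the analysis in this paper.

```python
def difference(series, lag=1):
    """Apply the operator (1 - B^lag): return series[t] - series[t - lag]."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


def seasonal_and_regular_difference(series, season=12):
    """Apply (1 - B)(1 - B^season), i.e. d = 1 and D = 1."""
    return difference(difference(series, lag=season), lag=1)


# Hypothetical monthly series: linear trend plus a fixed 12-month pattern.
pattern = [0.5, 0.2, -0.1, -0.3, 0.0, 0.4, 0.6, 0.1, -0.2, -0.4, -0.5, -0.3]
y = [100.0 + 0.2 * t + pattern[t % 12] for t in range(36)]

w = seasonal_and_regular_difference(y)
# 12 observations are lost to the seasonal difference and 1 to the regular one.
print(len(w))
# The trend and the fixed seasonal pattern are removed (up to rounding error).
print(max(abs(v) for v in w) < 1e-9)
```

For a real HICP series the differenced values would not vanish; the remaining variation is what the autoregressive and moving average terms are fitted to.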

Holt-Winters models
Holt (2004) and Winters (1960) have developed a method for forecasting time series that can successfully capture the level, trend and seasonality in the series. When forecasting future values of a time series, values that are closer to the current one are more important than previous values that are further away. The method can be used for short, medium and long term forecasts. There are two types of methods in relation to the nature of time series, and these are additive and multiplicative models. The additive method is used when seasonal fluctuations are at approximately the same level during a time series, while the multiplicative method is used when seasonal variations change in proportion to the time series level.
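Equations for the additive method, in their standard form:

$$L_t = \alpha (y_t - S_{t-s}) + (1 - \alpha)(L_{t-1} + b_{t-1})$$
$$b_t = \beta (L_t - L_{t-1}) + (1 - \beta)\, b_{t-1}$$
$$S_t = \gamma (y_t - L_t) + (1 - \gamma)\, S_{t-s}$$
$$F_{t+h} = L_t + h\, b_t + S_{t-s+h}$$

Equations for the multiplicative method, in their standard form:

$$L_t = \alpha \frac{y_t}{S_{t-s}} + (1 - \alpha)(L_{t-1} + b_{t-1})$$
$$b_t = \beta (L_t - L_{t-1}) + (1 - \beta)\, b_{t-1}$$
$$S_t = \gamma \frac{y_t}{L_t} + (1 - \gamma)\, S_{t-s}$$
$$F_{t+h} = (L_t + h\, b_t)\, S_{t-s+h}$$

where $y_t$ is the observed series, s is the length of the seasonal cycle, $L_t$ gives the level of the series, $b_t$ represents the trend, $S_t$ is the seasonal component, $0 \le \alpha \le 1$, $0 \le \beta \le 1$, $0 \le \gamma \le 1$ and $F_{t+h}$ presents the forecast for h periods ahead (for h ≤ s, so that the seasonal estimate $S_{t-s+h}$ is available).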
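As a rough illustration, the additive Holt-Winters recursions (level, trend and seasonal updates followed by the h-step forecast) can be sketched in Python. The initialisation from the first two seasonal cycles, the function name and the quarterly example series are assumptions made for this sketch; statistical packages typically use different starting values and estimate α, β and γ by minimizing the in-sample error.

```python
def holt_winters_additive(y, s, alpha, beta, gamma, h):
    """Additive Holt-Winters smoothing; returns h forecasts beyond the end of y."""
    if len(y) < 2 * s:
        raise ValueError("need at least two full seasonal cycles")

    # Assumed initialisation: means of the first two cycles give level and trend,
    # deviations of the first cycle from its mean give the seasonal terms.
    first_mean = sum(y[:s]) / s
    second_mean = sum(y[s:2 * s]) / s
    level = first_mean
    trend = (second_mean - first_mean) / s
    seasonal = [y[i] - first_mean for i in range(s)]

    for t in range(s, len(y)):
        prev_level = level
        s_old = seasonal[t - s]  # seasonal estimate from one cycle ago
        level = alpha * (y[t] - s_old) + (1 - alpha) * (prev_level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
        seasonal.append(gamma * (y[t] - level) + (1 - gamma) * s_old)

    n = len(y)
    # Forecast k steps ahead: last level, k times the last trend, and the most
    # recent seasonal estimate for the corresponding season.
    return [level + k * trend + seasonal[n - s + (k - 1) % s]
            for k in range(1, h + 1)]


# Hypothetical quarterly series with a linear trend and a fixed seasonal pattern.
pattern = [1.0, -0.5, -1.5, 1.0]
y = [10.0 + 0.25 * t + pattern[t % 4] for t in range(16)]
forecasts = holt_winters_additive(y, s=4, alpha=0.3, beta=0.1, gamma=0.2, h=4)
print([round(f, 2) for f in forecasts])
```

The multiplicative variant follows the same structure, with the subtractions of the seasonal term replaced by divisions and the final addition replaced by a multiplication.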

Neural Network Auto-Regression models
NNAR models are a more recent approach that allows complex relationships between inputs and outputs to be modeled. In an NNAR model, lagged values of the observed time series serve as inputs for forecasting future values. Due to the seasonality of the observed HICP time series, the NNAR(p, P, k)_m model will be used. Figure no. 1 represents an NNAR model with one hidden layer and k hidden neurons, also known as a multilayer feed-forward network, where each layer of nodes receives inputs from the previous layer and sends its outputs on to the next one (Hyndman and Athanasopoulos, 2018). The input of each node is obtained as a linear combination of the outputs of the previous layer. The linear combination function for node j is formulated as:

$$z_j = b_j + \sum_{i=1}^{n} w_{i,j}\, x_i$$

where $x_i$ are the n inputs of the node, $w_{i,j}$ are the weights and $b_j$ is the bias term. The result is then modified by a non-linear function, such as the sigmoid $s(z) = 1 / (1 + e^{-z})$, and sent to the next layer. This aims to reduce the effect of extreme input values and thus to make the network more resistant to outliers. The notation p represents the number of lagged autoregressive components of the model, while the notation P represents the number of lagged seasonal autoregressive components with seasonal period m. The notation k represents the number of nodes in the hidden layer.
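The forward pass of such a network can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the weights below are arbitrary placeholders rather than fitted values, the output node is assumed to be linear, and the function names are hypothetical.

```python
import math


def sigmoid(z):
    """Logistic activation used in the hidden layer."""
    return 1.0 / (1.0 + math.exp(-z))


def nnar_inputs(series, p, P, m):
    """Lagged inputs for the next forecast: lags 1..p and seasonal lags m..P*m."""
    lags = list(range(1, p + 1))
    lags += [m * i for i in range(1, P + 1) if m * i > p]  # avoid duplicate lags
    n = len(series)
    return [series[n - lag] for lag in lags]


def nnar_one_step(inputs, hidden_weights, hidden_biases, output_weights, output_bias):
    """One-step forecast; hidden_weights[j][i] is the weight of input i in node j."""
    hidden = []
    for w_j, b_j in zip(hidden_weights, hidden_biases):
        z_j = b_j + sum(w * x for w, x in zip(w_j, inputs))  # linear combination
        hidden.append(sigmoid(z_j))                          # non-linear transform
    # Assumed linear output node combining the k hidden nodes.
    return output_bias + sum(v * a for v, a in zip(output_weights, hidden))


# Hypothetical NNAR(1,1,2)_12 set-up: two inputs (lags 1 and 12), two hidden nodes.
series = [100.0 + 0.1 * t for t in range(24)]
x = nnar_inputs(series, p=1, P=1, m=12)
forecast = nnar_one_step(
    x,
    hidden_weights=[[0.01, -0.02], [-0.03, 0.02]],  # placeholder weights
    hidden_biases=[0.1, -0.1],
    output_weights=[50.0, 60.0],
    output_bias=45.0,
)
print(x, forecast)
```

In practice the weights are estimated from the training data, usually from several random starting points whose forecasts are averaged.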

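Comparison of forecast accuracy
The comparison of the evaluated models for each of the observed time series will be performed on the basis of seven criteria:

Mean error: $ME = \mathrm{mean}(e_t)$
Root mean square error: $RMSE = \sqrt{\mathrm{mean}(e_t^2)}$
Mean absolute error: $MAE = \mathrm{mean}(|e_t|)$
Mean percentage error: $MPE = \mathrm{mean}(p_t)$
Mean absolute percentage error: $MAPE = \mathrm{mean}(|p_t|)$
Mean absolute scaled error: $MASE = \mathrm{mean}(|q_t|)$
Theil's U statistic: $U = \sqrt{\dfrac{\sum_{t=1}^{n-1} \left( \frac{\hat{y}_{t+1} - y_{t+1}}{y_t} \right)^2}{\sum_{t=1}^{n-1} \left( \frac{y_{t+1} - y_t}{y_t} \right)^2}}$

where $e_t = y_t - \hat{y}_t$ represents the forecast error, i.e. the difference between the actual and the predicted value of the time series, $p_t = 100\, e_t / y_t$ is the percentage error and $q_t$ is the forecast error scaled by the in-sample mean absolute error of a naive benchmark forecast. Smaller values of the forecast accuracy statistics (in absolute terms for the signed measures ME and MPE) correspond to a better forecast model. In theory there are no exact limits for the values of forecast statistics that separate good from bad models. Therefore, we take the criterion that the best model has lower values of all statistics compared to competitors. The best model minimizes all ME, RMSE, MAE, MPE, MAPE, MASE and Theil's U values.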
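The seven criteria can be computed with a short Python sketch. It assumes commonly used definitions: the forecast error is the actual minus the predicted value, MASE is scaled by the in-sample mean absolute error of a naive forecast with lag m, and Theil's U compares the forecasts with a no-change forecast. The function name and the small example series are hypothetical.

```python
import math


def forecast_accuracy(actual, forecast, train, m=1):
    """Out-of-sample accuracy measures; m is the lag of the naive benchmark."""
    n = len(actual)
    e = [a - f for a, f in zip(actual, forecast)]       # forecast errors
    p = [100.0 * ei / a for ei, a in zip(e, actual)]    # percentage errors
    # In-sample mean absolute error of the naive forecast y[t] = y[t - m].
    scale = sum(abs(train[t] - train[t - m])
                for t in range(m, len(train))) / (len(train) - m)
    q = [ei / scale for ei in e]                        # scaled errors

    def mean(values):
        return sum(values) / len(values)

    # Theil's U: relative forecast errors against relative no-change errors.
    num = sum(((forecast[t + 1] - actual[t + 1]) / actual[t]) ** 2
              for t in range(n - 1))
    den = sum(((actual[t + 1] - actual[t]) / actual[t]) ** 2
              for t in range(n - 1))

    return {
        "ME": mean(e),
        "RMSE": math.sqrt(mean([ei ** 2 for ei in e])),
        "MAE": mean([abs(ei) for ei in e]),
        "MPE": mean(p),
        "MAPE": mean([abs(pi) for pi in p]),
        "MASE": mean([abs(qi) for qi in q]),
        "TheilU": math.sqrt(num / den),
    }


# Hypothetical index values, used only to show the call.
train = [90.0, 92.0, 94.0, 96.0, 98.0]
actual = [100.0, 102.0, 104.0]
forecast = [101.0, 101.0, 105.0]
for name, value in forecast_accuracy(actual, forecast, train).items():
    print(name, round(value, 4))
```

Theil's U is undefined when the actual values do not change over the test period, because the denominator is then zero; a value below one indicates that the model beats the no-change forecast.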

The database for this research consists of time series of the Harmonized Index of Consumer Prices (HICP). The data are monthly and cover the period from January 2010 to March 2020. The R software package was used for the analysis.

Results and discussion
Based on the values of ADF and PP test statistics and the calculated p values for all four data series, it can be concluded that the null hypothesis cannot be rejected and that the series have at least one unit root, i.e. that they are not stationary. We reach the same conclusion on the basis of KPSS test statistics and p values, where we reject the null hypothesis that the series is stationary and accept the alternative that the series have at least one unit root. In order to obtain a stationary time series that can be used in further analysis, the first difference of all series was determined. The results of ADF and PP tests showed that the first differences of all series are stationary, and that as such they can be used in the further Box-Jenkins procedure. The results of the applied KPSS test in the case of Serbia contradict the results of the other unit root tests. In order to meet the requirements of all three unit root tests, this time series needs to be differenced once again. Simultaneous use of several tests leads to better results because it offsets the shortcomings of the individual tests. In this research the strictest criterion is used, i.e. the conditions of all three unit root tests must be fulfilled.
The ACF and PACF given in figure no. 4 are used in determining the order of the AR and MA models for each of the observed series. The results of the unit root tests of the first difference data are given in table no. 2. The correlograms of all series clearly show the presence of seasonal components, which will influence the further evaluation of seasonal ARIMA models, i.e. SARIMA models. Depending on the observed series, SARIMA models will be of different order, with or without seasonal differencing of the series. The appropriate SARIMA model is evaluated for each of the observed series, and information criteria are used as the criterion in selecting the optimal model. The best model for each of the

Table no. 3: Identification and evaluation of the model in the case of Montenegro
In the case of data for Serbia, we previously concluded that the first difference of the series is not stationary, so the second difference of the original data series was determined and its stationarity examined. Based on the values of the statistics of all unit root tests, it is confirmed that the second difference of the series is stationary. (

. Identification and evaluation of the model in the case of European Union
After selecting the best models for each country separately by minimizing the values of the information criteria, it is necessary to examine the existence of autocorrelation in the residuals and the normality of their distribution. Based on the value of the Ljung-Box statistic and the corresponding p value, at the 5% significance level the null hypothesis that all autocorrelation coefficients are equal to zero cannot be rejected, even up to the 22nd lag. The residual distributions from all the described models can be approximated by the normal distribution. Hence, the models meet both required characteristics of model validity (absence of autocorrelation and normality of residual distribution), and can be used in further analysis as a benchmark in model comparison (Table no. 8). Models that meet all the required properties can be used to predict future time series values. Figure no. 5 gives 12-month forecasts of all four selected ARIMA models with 80% and 95% confidence intervals. Forecasts of the future value of the logarithm of the Harmonized Index of Consumer Prices represent the basis for calculating the forecast evaluation statistics in comparison with the actually realized values. The predicted values will be used not only for comparison with the actual values but also for comparison with the predicted values obtained by the Holt-Winters method and by neural networks. A comparison of prognostic methods will be given separately for each of the observed countries.
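The Ljung-Box statistic used in this residual check can be computed directly from the sample autocorrelations, as the following minimal Python sketch shows. The function names and the example residuals are hypothetical; the p value, which requires the chi-square distribution with degrees of freedom reduced by the number of estimated ARMA parameters, is left out of the sketch.

```python
def autocorrelation(x, k):
    """Sample autocorrelation of x at lag k."""
    n = len(x)
    mean = sum(x) / n
    denominator = sum((v - mean) ** 2 for v in x)
    numerator = sum((x[t] - mean) * (x[t - k] - mean) for t in range(k, n))
    return numerator / denominator


def ljung_box_q(residuals, max_lag):
    """Ljung-Box statistic Q = n (n + 2) * sum_{k=1..max_lag} r_k^2 / (n - k)."""
    n = len(residuals)
    return n * (n + 2) * sum(
        autocorrelation(residuals, k) ** 2 / (n - k)
        for k in range(1, max_lag + 1)
    )


# Hypothetical residuals with strong negative autocorrelation at lag 1.
residuals = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
print(ljung_box_q(residuals, max_lag=1))
```

A large value of Q relative to the chi-square critical value indicates remaining autocorrelation in the residuals, in which case the model would need to be respecified.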

Holt-Winters forecasting
The type of time series is of special importance for the forecast of future values of the observed time series when using the Holt-Winters method. Forecasts for both types of methods (additive and multiplicative) are given in the analysis and the choice of the better method was made on the basis of forecast statistics. Data for all countries were observed separately, and the selection of the best forecast model was made based on minimizing the forecast error. (Table no.

prognostic performance. The previously calculated forecast statistics and the choice of the better forecast method are also clearly supported by figure no. 6, because the values forecast by the selected method are closer to the actual observed data that serve as test data.

Neural Network Auto-Regression forecasting
Using artificial neural networks, i.e. a special type of Neural Network Auto-Regression (NNAR) model, the model was estimated, and then a forecast was made for the out-of-sample periods. NNAR(1,1,2)[12] models were estimated for all observed time series, with one autoregressive element, one seasonal autoregressive element of order 12 and two neurons in the hidden layer. The estimated models are presented in table no. 10.

Table no. 10. Selected NNAR models
Data from January 2010 to March 2019 are used to estimate the model, while data for the next 12 months, until March 2020, are used to test the predictive power of the model. For the one-period-ahead out-of-sample forecast, all available sample data are used; for the two-period-ahead forecast, the first forecast value is used in addition to the sample data. The twelve-month forecast was formed in the same way. The distribution of the residuals of the evaluated models can be approximated by the normal distribution, so the models can be
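The split into estimation and test data and the iterated multi-step forecast described here can be sketched as follows. The one-step model is a hypothetical stand-in (any fitted model that maps a history to the next value could be plugged in), and the function names and the synthetic series are assumptions made for the illustration.

```python
def split_train_test(series, test_size=12):
    """Hold out the last test_size observations for out-of-sample evaluation."""
    return series[:-test_size], series[-test_size:]


def recursive_forecast(history, one_step_model, horizon):
    """Iterated forecast: each forecast is appended to the history and reused."""
    extended = list(history)
    forecasts = []
    for _ in range(horizon):
        next_value = one_step_model(extended)
        forecasts.append(next_value)
        extended.append(next_value)  # forecast becomes an input for the next step
    return forecasts


def toy_model(history):
    """Hypothetical one-step model: average of the last value and the value 12 months ago."""
    return 0.5 * history[-1] + 0.5 * history[-12]


# January 2010 to March 2020 gives 123 monthly observations (synthetic values here).
series = [100.0 + 0.1 * t for t in range(123)]
train, test = split_train_test(series, test_size=12)
print(len(train), len(test))

forecasts = recursive_forecast(train, toy_model, horizon=len(test))
print(len(forecasts))
```

Because later forecasts depend on earlier ones, errors can accumulate over the horizon, which is one reason the accuracy statistics are computed on the held-out twelve months rather than on the estimation sample.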

Conclusions
Inflation targeting is one of the key goals of all European Union countries, as well as of the countries aspiring to become future members of the EU. It is of particular importance to all Western Balkan countries that experienced severe hyper-inflation at the end of the twentieth century. Having in mind the convergence criteria that the EU has put before its future member states, the dynamics of inflation measured by the change of the Harmonized Index of Consumer Prices (HICP) and its forecast is a very important topic.
The purpose of this paper was to estimate adequate models for inflation forecasting and to compare their forecasting performances for the case of the three Western Balkans countries (Montenegro, Serbia, North Macedonia) and EU countries. This analysis has been carried out considering three methodologies: ARIMA, Holt-Winters and NNAR models. The models for forecasting monthly inflation were compared at several levels. When comparing the models from the ARIMA class, the AICs information criterion was used. The Holt-Winters method comparison was performed based on the forecast error. An automatic evaluation procedure was used to evaluate the NNAR model. After selecting the best model from each of the model classes, a final comparison of the prognostic performances of the models was made on the basis of the forecast errors. For all analyzed countries of the Western Balkans, the NNAR models give the best results in forecasting inflation. In the case of the European Union the evaluated ARIMA model gave the best results. The Holt-Winters (multiplicative or additive) method is the second best in forecasting for all analyzed countries.
The results of the research represent a framework for further analysis and do not provide final solutions to this problem in the observed countries. It is interesting to compare the possibilities of forecasting inflation for countries that have different legacies that act through psychological factors, even though these factors are not included in the analysis empirically. The models do not take into account other factors that determine inflation, and therefore forecast future values only on the basis of previous values of the observed phenomenon. Although the models give a very accurate forecast, for long-term forecasts they may show certain shortcomings due to their univariate nature. In a world that is characterized by numerous, rapid and sudden changes and a large number of factors and influences, even the forecasts of the best model that can be evaluated can only give a rough picture of the always uncertain and challenging future.