Ensemble forecasting method

The purpose of this article is to analyze time series using aggregate forecasting methods. Forecasting time series is an important scientific and technical task relevant to various sectors of the economy and production. The main objective of any forecasting process is the ability to assess trends in the changes of a particular factor. To predict the behaviour of future processes, a qualitative analysis of the data underlying the forecast is required. It is also important to choose an appropriate forecasting method. The forecast obtained may not always be correct, so it is crucial to determine how accurately it is built. In the present paper, the characteristics of time series are described. The main mathematical methods of time series forecasting are analyzed and their classification is presented. Criteria for the accuracy and reliability of forecasting models are defined; forecast accuracy is an important criterion for evaluating a forecasting method. The aggregate ensemble forecast has proven to be an effective means of enhancing forecast accuracy. The article provides an example of constructing an aggregate ensemble forecast: the types of nonlinear models are described, several methods are selected for constructing the aggregate ensemble forecast, the practical application of exponential smoothing is considered, and the MA (moving average) and AR (autoregressive) models are constructed.


Introduction
Forecasting financial time series holds a special place in modern economics. A forecast is important for many decisions, such as buying or selling assets in the financial market, investing capital, and identifying cost-effective options in long-term, medium-term, or short-term planning. Various forecasting methods have been developed to improve forecast accuracy. For example, combining forecasting techniques, an approach also called ensemble forecasting, provides an effective way of improving the accuracy of forecasting.

Main types of models
There are the following types of dynamic models:
• distributed lag models, i.e. models in which only independent (explanatory) variables are lagged. Models with lags have the following form (1):

y_t = a + b_0·x_t + b_1·x_{t−1} + … + b_l·x_{t−l} + ε_t;

• autoregressive models, i.e. models in which lagged values of the dependent variable appear among the regressors. Autoregressive models have the following form (2):

y_t = a + c_1·y_{t−1} + … + c_p·y_{t−p} + ε_t.

An important step in modelling is the evaluation of model performance [1,2]. Mathematical statistics offers a number of estimation methods:
1. The proportion of variance explained (coefficient of determination). The coefficient of determination shows the proportion of the variance of the dependent variable explained by the model in question; it takes values from 0 to 1.
2. A method based on the maximum likelihood function.
3. A graph of observed and predicted values. The most commonly used graph is a scatterplot of observed against predicted values: if the points lie along a straight line, the model is well constructed.
4. Normal and half-normal residual plots. These graphs show how close the error distribution is to the normal distribution; the closer it is, the better the model is constructed.
5. The covariance matrix of parameter estimates. A strong correlation between the estimates can indicate redundant model parameters.
Nonlinear models have the following advantages: flexibility and consistency of analysis. However, these models are not free from shortcomings, such as the ambiguity of the estimated coefficients and the complexity of their estimation [3,4,5,6].
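As a quick illustration of item 1, the coefficient of determination can be computed in a few lines. The helper below is an illustrative sketch, not code from the article:

```python
# Illustrative helper: coefficient of determination (R^2).
def r_squared(actual, predicted):
    """Share of the variance of `actual` explained by `predicted`."""
    mean = sum(actual) / len(actual)
    ss_total = sum((y - mean) ** 2 for y in actual)
    ss_resid = sum((y - f) ** 2 for y, f in zip(actual, predicted))
    return 1 - ss_resid / ss_total

print(r_squared([1, 2, 3, 4], [1, 2, 3, 4]))          # a perfect fit gives 1.0
print(r_squared([1, 2, 3, 4], [2.5, 2.5, 2.5, 2.5]))  # predicting the mean gives 0.0
```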
The most popular nonlinear models are those of the ARMA (autoregressive moving average) class (3), which generalizes two simpler models: AR (the autoregressive model) and MA (the moving average model):

y_t = c + φ_1·y_{t−1} + … + φ_p·y_{t−p} + ε_t + θ_1·ε_{t−1} + … + θ_q·ε_{t−q}.

The parameter «p» is the order of the autoregressive part, and the parameter «q» is the order of the moving average.
Keeping in mind the problem of time series stationarity, integration should be added to the model to address it. As a result, we have ARIMA (4), the autoregressive integrated moving average model, which applies the ARMA structure to the differenced series:

Δ^d·y_t = c + φ_1·Δ^d·y_{t−1} + … + φ_p·Δ^d·y_{t−p} + ε_t + θ_1·ε_{t−1} + … + θ_q·ε_{t−q}, where Δ·y_t = y_t − y_{t−1}.

The model is an extended version of ARMA for nonstationary series that can be made stationary by differencing the time series. In this model, the added parameter «d» indicates the degree of differencing.
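The differencing behind the parameter «d» can be sketched in a few lines of pure Python (an illustrative helper, not from the article):

```python
# Illustrative sketch: d-fold differencing, the "I" step of ARIMA.
def difference(series, d=1):
    """Apply first differences d times; removes polynomial trends of degree d."""
    for _ in range(d):
        series = [b - a for a, b in zip(series, series[1:])]
    return series

print(difference([1, 3, 5, 7, 9], d=1))  # a linear trend becomes constant: [2, 2, 2, 2]
print(difference([1, 3, 5, 7, 9], d=2))  # and vanishes after two differences: [0, 0, 0]
```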
Financial time series are characterized by seasonality. Seasonality describes the periodic fluctuations observed in a time series. The seasonally adjusted model is referred to as SARIMA or ARIMA (P,D,Q)s. The parameter «P» is the order of the seasonal component SAR(P), «D» is the integration order of the seasonal component, «Q» is the order of the seasonal component SMA(Q), and «s» represents the dimension of seasonality (week, month, etc.).
In ARIMA modelling, «d» cannot be fractional, but there are fractionally integrated models (5) in which the coefficient «d» can take fractional values. The fractional value is given meaning by expanding the differencing operator in a power series:

(1 − L)^d = 1 − d·L + (d·(d − 1)/2!)·L² − (d·(d − 1)·(d − 2)/3!)·L³ + …

This model is named ARFIMA (the autoregressive fractionally integrated moving average model).
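The coefficients of this power series satisfy a simple recurrence, w_0 = 1 and w_k = w_{k−1}·(k − 1 − d)/k. The sketch below is illustrative; the function name is an assumption, not from the article:

```python
# Illustrative sketch: weights of (1 - L)^d for fractional d,
# via the recurrence w_0 = 1, w_k = w_{k-1} * (k - 1 - d) / k.
def frac_diff_weights(d, n_terms):
    w = [1.0]
    for k in range(1, n_terms):
        w.append(w[-1] * (k - 1 - d) / k)
    return w

# For integer d = 1 the weights collapse to ordinary first differencing:
print(frac_diff_weights(1.0, 4))  # [1.0, -1.0, -0.0, -0.0]
# For fractional d the weights decay slowly instead of truncating:
print(frac_diff_weights(0.4, 4))
```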

Forecast quality evaluation
The accuracy indices of the forecasting model determine the value of the resulting error [7,8,9]. To confirm the quality and suitability of the model used, one needs to analyze the system of indicators that reflect both the adequacy of the model and its accuracy. The accuracy of the forecast is determined by the amount of error in the forecast. Forecast error is a value defined as the difference between the actual and predicted values of the indicator. As the error value decreases, the accuracy of the forecast increases accordingly.
The absolute error of the time series forecast is an analytical indicator that makes it possible to quantify the forecast error. The absolute error is calculated using the following formula (6):

e_t = y_t − ŷ_t,

where ŷ_t is the predicted value of the indicator and y_t is the actual value. The mean absolute error is calculated using the following formula (7):

MAE = (1/n)·Σ|y_t − ŷ_t|.

The forecast error is identical in dimension to the predicted indicator and directly depends on the measurement scale of the time series levels. Thus, the forecast accuracy increases as this value decreases. For processes that are described by a large number of indicators, a summarizing indicator of accuracy needs to be computed.
To assess the accuracy of the forecast, the mean squared error (MSE) is also used; it emphasizes large errors by squaring each error (8):

MSE = (1/n)·Σ(y_t − ŷ_t)².
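These accuracy measures are straightforward to compute; the following pure-Python sketch is illustrative, with invented names and sample data:

```python
# Illustrative sketch of the accuracy measures: absolute errors and MSE.
def absolute_errors(actual, predicted):
    return [abs(y - f) for y, f in zip(actual, predicted)]

def mse(actual, predicted):
    return sum((y - f) ** 2 for y, f in zip(actual, predicted)) / len(actual)

actual, forecast = [10, 12, 11], [9, 13, 11]
print(absolute_errors(actual, forecast))  # [1, 1, 0]
print(mse(actual, forecast))              # (1 + 1 + 0) / 3
```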

Ensemble forecast
To build an ensemble forecast, we consider combining forecasts that were developed using three forecasting methods [10,11,12].

1. The method of exponential smoothing (9):

ŷ_t = α·y_{t−1} + (1 − α)·ŷ_{t−1},

where t − 1 is the period preceding the forecast period, ŷ_t is the predicted indicator, y_{t−1} is the actual value of the studied indicator for the period preceding the forecast period, and α is the smoothing coefficient. A formally correct procedure for selecting α does not exist, so the coefficient will be determined based on the length of the smoothing interval. In that case α is computed using the formula α = 2/(n + 1), where n is the number of observations within the smoothing interval. One of the main advantages of this method is the ability to solve long-term forecasting problems [13,14].

2. The moving average (MA) model (10):

ŷ_t = m_{t−2} + (1/n)·(y_{t−1} − y_{t−2}),

where ŷ_t is the predicted indicator, m_{t−2} is the moving average two periods before the forecast period, n is the number of observations within the smoothing interval, y_{t−1} is the actual value of the examined indicator for the period preceding the forecast period, and y_{t−2} is the actual value two periods before the forecast period. The main advantages of this method are its simplicity and clarity.

3. The autoregressive model AR(p). In this model, the values of the time series depend linearly on their past values. The AR(p) model is well suited for making a forecast one period ahead, so the order of the model is set to p = 1. The predicted value depends on the previous values (11):

y_t = c + φ_1·y_{t−1} + … + φ_p·y_{t−p} + ε_t,

where φ_1, …, φ_p are the autoregressive coefficients (model parameters), c is a constant, and ε_t is white noise (random error). The main advantages of this model are the simplicity and transparency of modelling; it is widely used in time series forecasting problems [15,16].
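The three methods can be sketched in a few lines of pure Python. This is an illustrative simplification, not the article's calculations: the moving-average variant below uses the plain mean of the last n values as the one-step-ahead forecast, the AR(1) parameters are fitted by ordinary least squares, and all names and sample data are invented:

```python
# Illustrative sketches of the three forecasting methods combined below.

def exp_smooth_forecast(series, n):
    """Exponential smoothing with alpha = 2 / (n + 1);
    returns the one-step-ahead forecast."""
    alpha = 2 / (n + 1)
    s = series[0]
    for y in series[1:]:
        s = alpha * y + (1 - alpha) * s
    return s

def ma_forecast(series, n):
    """Simplified moving-average forecast: mean of the last n observations."""
    window = series[-n:]
    return sum(window) / len(window)

def ar1_forecast(series):
    """AR(1), y_t = c + phi * y_{t-1} + e_t, fitted by least squares
    on (y_{t-1}, y_t) pairs; returns the one-step-ahead forecast."""
    x, y = series[:-1], series[1:]
    m = len(x)
    mx, my = sum(x) / m, sum(y) / m
    var = sum((a - mx) ** 2 for a in x)
    phi = sum((a - mx) * (b - my) for a, b in zip(x, y)) / var
    c = my - phi * mx
    return c + phi * series[-1]

rate = [70.0, 71.5, 70.8, 72.3, 71.9, 73.0]  # invented sample series
print(exp_smooth_forecast(rate, n=3))        # alpha = 2 / 4 = 0.5
print(ma_forecast(rate, n=2))                # mean of the last two values
print(ar1_forecast(rate))
```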
To make a forecast, we take the exchange rate of the EUR/RUB currency pair over the period 06.09.2019–06.04.2020. The time interval is 1 week. As the training sample for generating the weighted moving average, the period 07.09.2019–14.10.2019 was selected. As the test sample, the period 14.10.2019–06.04.2020 was selected.
All calculations are performed using MS Excel [17,18]. First, we make a forecast for the period of the test sample 07.10.2019–06.04.2020 using the three methods selected and described earlier. These methods provide the basis for constructing the ensemble forecast.
To build the first forecast «F1» we use exponential smoothing. The graph with the original data and the F1 values is shown in figure 1.
Let us denote the forecast made using the moving average method by «F2». Using the MS Excel tools for statistical analysis, we make a forecast for one period ahead based on the moving average method.
The graph with the original data and the F2 forecast values is shown in figure 2. The forecast generated using the method of autoregression is denoted as «F3». The graph with the original data and the F3 forecast values is shown in figure 3. The ensemble forecast is built as the arithmetic mean of the individual forecasts using a formula of the form (12):

F = (F_1 + F_2 + … + F_i)/i,

where i is the number of methods combined. Accordingly, using equation (12), we calculate aggregate forecasts consisting of the individual forecasts F1-F2, F1-F3, F2-F3, F1-F2-F3, denoted F12, F13, F23, F123. Figure 4 shows the graph of forecasts F1, F2, F3, F12, F13, F23, F123.
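The pointwise averaging in formula (12) can be sketched as follows (an illustrative snippet; the forecast values are invented, only the F12/F123 naming mirrors the article):

```python
# Illustrative sketch of formula (12): the aggregate forecast is the
# pointwise arithmetic mean of the individual forecasts.
def ensemble(*forecasts):
    return [sum(vals) / len(vals) for vals in zip(*forecasts)]

f1 = [70.0, 71.0]
f2 = [71.0, 72.0]
f3 = [72.0, 73.0]
print(ensemble(f1, f2))      # F12  -> [70.5, 71.5]
print(ensemble(f1, f2, f3))  # F123 -> [71.0, 72.0]
```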

Conclusion
Thus, we have considered methods for analyzing financial time series. Optimal forecasting models were selected and the details of their construction were studied. Methods and algorithms for estimating the parameters of all the models under consideration were defined. An aggregate ensemble forecast was constructed. Combining forecasting methods to improve forecast accuracy yielded good results.