Time series forecasting using singular spectrum analysis, fuzzy systems and neural networks

Hybrid methodologies have become popular in many fields of research as they allow researchers to explore various methods, understand their strengths and weaknesses, and combine them into new frameworks. Combining different methods into a hybrid methodology thus makes it possible to overcome the shortcomings of each individual method. This paper presents the methodology for two hybrid methods that can be used for time series forecasting. The first combines singular spectrum analysis with the linear recurrent formula (SSA-LRF) and neural networks (NN), while the second combines SSA-LRF with weighted fuzzy time series (WFTS). Some of the highlights of the proposed methodologies are:
• The two hybrid methods proposed here are applicable to load data series and other time series data.
• The two hybrid methods handle both the deterministic and the nonlinear stochastic patterns in the data.
• The two hybrid methods show a significant improvement over the single methods used separately and over other hybrid methods.

Sub-step 2.2: Design the architecture of the network by determining the number of input units and the number of nodes in the hidden layer, and by specifying the activation functions used in the hidden and output layers. For univariate time series forecasting, we considered one output unit. The activation function in the hidden layer can be chosen according to the range of the data. In this case, we modeled the residuals obtained from the SSA-LRF model, which fluctuate around zero. Since the data take both negative and positive values, the tansig (hyperbolic tangent sigmoid) function is the more appropriate choice for the hidden layer, with the purelin (linear) function for the output layer.
Sub-step 2.3: Obtain the weights connecting each input node to each hidden node, and those connecting each hidden node to each output node, using the Levenberg-Marquardt based back-propagation algorithm.
The fuzzy time series model is represented as the fuzzy relationships among observations [13][14][15]. A fuzzy time series $F(t)$ can be considered as a linguistic variable on a time series $Y(t)$, $t \in T$, where $T$ is a set of time points. In this case, $F(t)$ is a collection of fuzzy sets $\mu_i(t)$ ($i = 1, 2, \ldots$), which are regarded as the possible linguistic values of $F(t)$. If $F(t)$ is determined only by $F(t-1)$, written $F(t-1) \rightarrow F(t)$, this kind of fuzzy time series is named a first-order fuzzy time series and the relation is called a fuzzy logical relationship (FLR). Let the linguistic value of $F(t-1)$ be $A_i$ and the linguistic value of $F(t)$ be $A_j$, where $A_i$ and $A_j$ are the fuzzy sets for the observations at time $t-1$ and at time $t$, respectively. The FLR between the two fuzzy sets can be denoted by $A_i \rightarrow A_j$ [13][14][15].
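As a minimal illustration of first-order FLRs, the sketch below lists the relationships A_i → A_j implied by a sequence of fuzzified observations and counts how often each one recurs. The function name `first_order_flrs` and the toy label sequence are hypothetical and serve only as an example; each label i stands for the fuzzy set A_i.

```python
from collections import Counter

def first_order_flrs(labels):
    """Return the first-order FLRs A_i -> A_j as (i, j) pairs.

    labels[t] is the index of the fuzzy set assigned to the observation
    at time t, so each pair links time t - 1 to time t.
    """
    return [(labels[t - 1], labels[t]) for t in range(1, len(labels))]

# Hypothetical fuzzified series: A_1, A_2, A_2, A_3, A_2, A_1, A_2
labels = [1, 2, 2, 3, 2, 1, 2]
flrs = first_order_flrs(labels)

# Recurrence counts; weighted fuzzy time series variants typically use
# repeated FLRs, while an unweighted grouping keeps each FLR once.
flr_counts = Counter(flrs)
```

Here the relationship A_1 → A_2 occurs twice, which is the kind of recurrence that weighted schemes can exploit.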
The steps of the SSA-LRF-Fuzzy model presented in this paper are shown in Fig. 2, and the details of each step are described as follows [12]:
Step 1: Analogous to the first step of the SSA-LRF-NN procedure.
Step 2: Combination of the SSA-LRF and the fuzzy time series.
Sub-step 2.1: Obtain the residuals of the SSA-LRF as described in Step 1.
Sub-step 2.2: Set the universe of discourse, $U$, and split it into several intervals of equal length. Let $D_{\min}$ and $D_{\max}$ be the minimum and maximum residuals obtained in Sub-step 2.1, respectively. The universe of discourse can be defined as
$$U = [D_{\min} - D_1,\; D_{\max} + D_2],$$
where $D_1$ and $D_2$ are proper positive numbers. If $U$ is partitioned into $n$ equal intervals $u_1, u_2, \ldots, u_n$, the length of each interval, $l$, can be defined as
$$l = \frac{(D_{\max} + D_2) - (D_{\min} - D_1)}{n}.$$
Sub-step 2.3: Define the fuzzy sets $A_1, A_2, \ldots, A_n$ on $U$ following [13][14][15], in the form given in [8], where the maximum membership value of $A_i$ occurs in the interval $u_i$.
Sub-step 2.4: Fuzzify the residuals obtained in Sub-step 2.1 using the fuzzy sets defined in Sub-step 2.3.
Sub-step 2.5: Obtain the fuzzy logical relationships (FLRs) according to one of the following: (1) Chen's method [8], (2) Yu's method [9], (3) Cheng's method [10], (4) Lee's method [11].
Sub-step 2.6: Calculate the forecast values for the residuals based on the FLRs, according to the method chosen in Sub-step 2.5.
Sub-step 2.7: Calculate the final forecast values by adding the forecast values obtained by the SSA-LRF (Step 1) and the forecast values obtained by the fuzzy model (Sub-step 2.6).
Further details about the results of the application to electricity load forecasting, which validate the usefulness of the proposed methods, can be found in [12]. In [12], all the results obtained from the two proposed hybrid methodologies, SSA-LRF-NN and SSA-LRF-WFTS, were compared with the results from the standard SSA-LRF framework. In order to provide a more objective evaluation of the forecasting accuracy of the two proposed methods, we provide comparisons with standard methods for time series forecasting and with two other hybrid methods, the ARIMA-NN [16] and the TLSNN (two-level seasonal neural network) model [17].
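Returning to Sub-steps 2.2 to 2.7 of the SSA-LRF-Fuzzy procedure, the following sketch shows one possible way to carry them out on a toy residual series. It rests on several assumptions that are not taken from the text: the universe of discourse has the usual form U = [D_min − D_1, D_max + D_2]; each residual is assigned to the fuzzy set A_i whose interval u_i contains it; and the forecast uses an unweighted average of interval midpoints in the style of Chen's method. The weighted variants would replace that average with a weighted one. All function names, the residual values and the SSA-LRF forecast of 100.0 are hypothetical.

```python
def make_intervals(d_min, d_max, d1, d2, n):
    """Split U = [d_min - d1, d_max + d2] into n equal-length intervals."""
    lower, upper = d_min - d1, d_max + d2
    length = (upper - lower) / n
    return [(lower + k * length, lower + (k + 1) * length) for k in range(n)]

def fuzzify(value, intervals):
    """Return the 1-based index i of the interval u_i containing value.

    Values at or beyond the upper bound of U are assigned to the last interval.
    """
    for i, (_, hi) in enumerate(intervals, start=1):
        if value < hi:
            return i
    return len(intervals)

def forecast_residual(residuals, intervals):
    """One-step residual forecast from first-order FLR groups (unweighted)."""
    labels = [fuzzify(r, intervals) for r in residuals]
    groups = {}
    for lhs, rhs in zip(labels[:-1], labels[1:]):
        groups.setdefault(lhs, set()).add(rhs)
    last = labels[-1]
    # If the last state has no known successor, fall back to its own interval.
    targets = groups.get(last, {last})
    midpoints = [sum(intervals[j - 1]) / 2 for j in targets]
    return sum(midpoints) / len(midpoints)

# Hypothetical SSA-LRF residuals and partition settings (D_1 = 1, D_2 = 0, n = 4)
residuals = [-3.0, 1.0, 4.0, -1.0, 2.0, -2.0, 3.0]
intervals = make_intervals(min(residuals), max(residuals), d1=1.0, d2=0.0, n=4)
labels = [fuzzify(r, intervals) for r in residuals]
residual_forecast = forecast_residual(residuals, intervals)

# Sub-step 2.7: add the fuzzy residual forecast to the SSA-LRF forecast.
ssa_lrf_forecast = 100.0  # hypothetical value from Step 1
final_forecast = ssa_lrf_forecast + residual_forecast
```

The choice of n, D_1 and D_2 changes the partition and therefore the forecast, so in practice these settings would need to be tuned for the data at hand.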
In the ARIMA-NN model [16], ARIMA is used to handle the linear relationship in the data, while the NN captures the nonlinear pattern. In this study, the parameters of the ARIMA model were estimated with the auto.arima function of the R package forecast [18], while the parameters of the NN were estimated with the nnetar function from the same package. As stated in [19], NNs work well not only for nonlinear relationships but also for linear relationships in time series data. Recently, [17] proposed the TLSNN model, which consists of two parts that estimate the deterministic and the stochastic components. The deterministic component, including the trend and the oscillatory component, is estimated by SSA, while the stochastic component, consisting of the residuals of the deterministic model, is modeled by an NN.
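Both comparison hybrids follow the same two-stage pattern as the proposed methods: a first model captures the linear or deterministic structure, and a second model is fitted to its residuals. In notation introduced here for illustration only (the symbols $L_t$, $N_t$, $e_t$, $f$ and $p$ are not taken from the cited works), this additive scheme can be sketched as

$$
\begin{aligned}
y_t &= L_t + N_t, \\
e_t &= y_t - \hat{L}_t, \\
\hat{N}_t &= f(e_{t-1}, e_{t-2}, \ldots, e_{t-p}), \\
\hat{y}_t &= \hat{L}_t + \hat{N}_t,
\end{aligned}
$$

where $\hat{L}_t$ is the fit of the first-stage model (ARIMA or SSA), $e_t$ are its residuals, $f$ is the function learned by the NN from $p$ lagged residuals, and $\hat{y}_t$ is the combined forecast.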
The other methods used for comparison in this paper are the two-level seasonal autoregressive (TLSAR) model [20], the double seasonal Holt-Winters (DSHW) model [21], and the TBATS (Trigonometric, Box-Cox transform, ARMA errors, Trend, and Seasonal components) model [22]. The success of TLSAR, TBATS, DSHW, and TLSNN in modeling electricity load time series in Indonesia can be seen in [12, 23]. The results of the comparisons between the two proposed hybrid algorithms and the competing models, based on the root mean square error (RMSE) and on the mean absolute percentage error (MAPE), can be seen in Tables 1-5. Further details can be found in [12]. Table 5 does not include results for the DSHW model because this model, proposed by [21], is intended to handle double seasonal patterns, and the weekly gasoline data do not show two seasonal patterns. It should be noted that a model that performs better in one case may not necessarily do so in another. Further, [12] did not take into account the effect of holidays or special days as in [20].
The p-values obtained from the two-sided Diebold-Mariano test between methods, for the five time series discussed in this study, are given in Table 6. We considered a significance level of 0.05: p-values above that threshold indicate no significant difference between the forecasts obtained by method A and method B. When the p-values are lower than 0.05, one-sided hypothesis tests are conducted to determine which forecast method provides more accurate results. The results of the one-sided Diebold-Mariano tests [37] are also included in Table 6 as superscripts next to the p-values. All calculations were done using the dm.test function from the R package forecast.
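To make the evaluation criteria concrete, the sketch below computes RMSE and MAPE and a simplified two-sided Diebold-Mariano statistic for one-step-ahead forecasts. It is an illustration under stated assumptions, not a reimplementation of dm.test: it uses squared-error loss, only the lag-0 variance of the loss differential, and a plain normal approximation, so its p-values need not match those of the R function. The function names and the toy data are hypothetical.

```python
import math

def rmse(actual, forecast):
    """Root mean square error."""
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual))

def mape(actual, forecast):
    """Mean absolute percentage error, in percent (actual values must be nonzero)."""
    return 100.0 * sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / len(actual)

def dm_test(actual, forecast_a, forecast_b):
    """Simplified two-sided Diebold-Mariano test for one-step forecasts.

    Loss differential d_t = e_A(t)^2 - e_B(t)^2; a negative statistic
    means method A has the smaller average squared error.
    """
    d = [(y - fa) ** 2 - (y - fb) ** 2
         for y, fa, fb in zip(actual, forecast_a, forecast_b)]
    n = len(d)
    d_bar = sum(d) / n
    var_d = sum((x - d_bar) ** 2 for x in d) / n  # lag-0 autocovariance only
    stat = d_bar / math.sqrt(var_d / n)
    p_value = math.erfc(abs(stat) / math.sqrt(2))  # 2 * (1 - Phi(|stat|))
    return stat, p_value

# Hypothetical observations and forecasts from two methods A and B
actual = [100.0, 110.0, 120.0, 130.0, 140.0]
forecast_a = [101.0, 109.0, 121.0, 129.0, 141.0]
forecast_b = [103.0, 106.0, 125.0, 128.0, 143.0]

rmse_a, rmse_b = rmse(actual, forecast_a), rmse(actual, forecast_b)
mape_a = mape(actual, forecast_a)
stat, p_value = dm_test(actual, forecast_a, forecast_b)
```

With so few observations the normal approximation is crude; the example only shows how the sign of the statistic indicates which method is more accurate once the two-sided test rejects equality.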

The results presented in Tables 1-6 and in Figs. 3-7 give a clear picture of the overall better forecasting performance of the two methods presented in this paper, when compared with standard individual and hybrid methods available in the literature. Moreover, since the two proposed hybrid approaches, SSA-LRF-NN and SSA-LRF-WFTS, outperform the SSA-LRF, they are also expected to outperform other methods that SSA-LRF outperforms, e.g. [23][24].
The forecasting results of the SSA-LRF-WFTS hybrid model may be further improved by applying higher-order fuzzy time series, as discussed in [26][27][28]. The performance of the fuzzy model is influenced by the selection of the universe of discourse, the interval length, the FLRs, and the defuzzification. Related literature can be found in [29][30][31]. When the data are contaminated with outlying observations, the SSA part of the model can be further improved by considering a robust SSA algorithm [25, 32]. A more parsimonious adaptation of the recurrent forecast algorithm [33] or of the vector forecast algorithm [34] can also be considered to improve the forecasting ability of the SSA part of the model. Based on the M4 competition, the hybrid statistical and machine learning approach produces more accurate forecasts and more precise prediction intervals than the combination of statistical approaches [35]. However, it should be noted that more complex models do not guarantee more accurate forecasts than simpler models [36]. Therefore, caution should be taken when selecting the forecasting algorithm, depending on the kind of data and the parsimony required for the model.

Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.