Abstract

Here is proposed a novel method for decomposing a nonstationary time series in components of low and high frequency. The method is based on Multilevel Singular Value Decomposition (MSVD) of a Hankel matrix. The decomposition is used to improve the forecasting accuracy of Multiple Input Multiple Output (MIMO) linear and nonlinear models. Three time series coming from traffic accidents domain are used. They represent the number of persons with injuries in traffic accidents of Santiago, Chile. The data were continuously collected by the Chilean Police and were weekly sampled from 2000:1 to 2014:12. The performance of MSVD is compared with the decomposition in components of low and high frequency of a commonly accepted method based on Stationary Wavelet Transform (SWT). SWT in conjunction with the Autoregressive model (SWT + MIMO-AR) and SWT in conjunction with an Autoregressive Neural Network (SWT + MIMO-ANN) were evaluated. The empirical results have shown that the best accuracy was achieved by the forecasting model based on the proposed decomposition method MSVD, in comparison with the forecasting models based on SWT.

1. Introduction

Time series forecasting has reached high significance in planning and management for government institutions, industries, and business. Unfortunately the forecasting implementations are limited due to the data complexity. Environmental conditions, economy variables, risk situations, among others, are originated in highly dynamic systems; consequently their analysis becomes complex and inaccurate results have been often obtained.

From the literature review, several linear and nonlinear models have been proposed. One popular linear model is Autoregressive Integrated Moving Average (ARIMA), which was introduced by Box et al. [1] and widely applied to nonstationary time series such as, electricity consumption [2], rainfall [3], solar radiation [4], and tourists arrivals [5]. Empirical results have shown varied accuracy levels by testing different parameter configuration; therefore the performance of ARIMA is dependent on an effective selection of parameters. Even more ARIMA models are limited to deal with constant variance processes and normally distributed residuals which are rarely satisfied in real life signals.

On the other hand, the Artificial Neural Networks (ANNs) are nonparametric models that have been implemented for modeling nonstationary time series. The nonlinear features of the ANNs sometimes explain the nonlinear relationships among the explaining variables and observed phenomena. By instance, Li and Shi (2010) applied three typical ANN techniques for one-step ahead wind speed forecasting by using different datasets of two representative north American sites [6], by implementation of Feed Forward Back-Propagation (FFBP), Radial Basis Function (RBF), and Adaptive Linear Element (ADALINE). After multiple tests the FFBP model was considered the best model for one site, while the RBF model was the best for other sites; therefore the research concludes that it is not recommended to employ only one type of ANN model in wind speed forecasting. Other representative example is the prices variation range; Laboissiere et al. [7] modeled stock prices of power distribution companies through an ANN based on Levenberg-Marquardt (LM). Different Multilayer Perceptron (MLP) topologies were evaluated iteratively with opening and closing prices and other correlated variables as input, different number of hidden neurons and one output until finding the best configuration for short-term horizon. In general, an ANN implementation implies taking some decisions after several tests, such as, network topology, signal propagation method, activation function, weights updating, hidden levels, and numbers of nodes. Sometimes higher accuracy can be reached by an ANN, but this leads a computational complexity increasing [8, 9].

A novel solution is the hybrid models which are based on the combination of techniques. Preprocessing methods combined with conventional linear and nonlinear models are implemented to improve the forecast. The Wavelet Decomposition (WD) was originated in 1984 with the discovery of Grossman and Morlet in the quantum physics context [10]. The conjunction Wavelet Decomposition and artificial intelligence can improve the efficiency of pure models in many areas such as, hydrology [11, 12], transportation systems [13], and public health [14].

In this work is proposed a new decomposition method based on Multilevel Singular Value Decomposition (MSVD) for extraction components of low and high frequency of a nonstationary time series in order to improve the accuracy of a linear and a nonlinear forecasting model. A Multiple Input and Multiple Output Autoregressive (MIMO-AR) model is implemented based on MSVD. Three relevant traffic accidents’ time series of Santiago, Chile, are used to evaluate the forecast performance. The MSVD + MIMO-AR joint model is validated through comparisons with respect to the performance of Stationary Wavelet Decomposition combined with MIMO-AR (SWT + MIMO-AR) and SWT combined with an Autoregressive Neural Network based on Levenberg-Marquardt (SWT + MIMO-ANN).

Related works about traffic accidents forecasting are scarce; most applications make classification with multivariate methods [1518]. There have been found some forecast applications related to transportation areas, such as, traveling time [19], traffic flow parameters as volume, travel speeds and occupancies [20, 21], the market demand after transportation disruptions [22], and freight transportation demand [23, 24].

This paper is organized as follows. Section 2 describes the proposed methodology based on MSVD + MIMO-AR. Section 3 describes SWT + MIMO-AR and SWT + MIMO-ANN. Section 4 presents the efficiency metrics. Section 5 specifies the study case. Section 6 shows the empirical research results. Finally Section 7 concludes the paper.

2. Forecasting Methodology Based on MSVD and MIMO-AR

The proposed forecasting methodology is described in two stages; the first stage is presented by Multilevel Singular Value Decomposition to decompose a time series into two components of low and high frequency, whereas the second stage performs the prediction through the MIMO-AR model. The MIMO-AR inputs are the lagged values of the components extracted, and the outputs are the prediction for multiple horizon.

2.1. Multilevel Singular Value Decomposition

MSVD is a method inspired in the pyramidal process implemented in multiresolution analysis of Mallat Algorithm [25] which was defined for wavelet representation. In this method is proposed the multilevel decomposition of a Hankel matrix, at difference of standard HSVD [26]. MSVD implements iterative embedding and pyramidal decomposition with a fixed window length ; therefore two components are obtained at each decomposition level.

MSVD algorithm is summarized as the pseudocode shown in Algorithm 1. The input algorithm is the observed time series of length , and at the end, two additive and intrinsic components are obtained as outputs, and , which represent the Low Frequency and the High Frequency Component, respectively, each one of length . MSVD is performed in three steps: embedding through a Hankel matrix of dimension as (2), decomposition in orthogonal matrices of eigenvectors and and singular values , and finally extraction from elementary matrices and .

Normalize time series max(abs())
Set the counter , and signal to be decomposed
while ( )
   
   Embed the signal hankel
   Decompose the matrix
   Compute elementary matrix 1:
   Compute elementary matrix 2:
   Extract the low frequency signal of level ,
    
   Extract the high frequency signal of level ,
    
   Update the decomposition signal
end while
Get the low frequency component
Get the high frequency component

MSVD is processed iteratively until the optimum decomposition level , when the Singular Spectrum Rate reaches the asymptotic point. Equations (1a) and (1b) describe the computation of , which is based on the relative energy of the singular values (for decomposition levels).

2.2. Multiple Input Multiple Output Autoregressive Prediction

The AR model is implemented to forecast the time series by using the MIMO strategy. MIMO is used for overcoming the error accumulation problem that is observed in the recursive strategy and the direct strategy and for preserving the random relationships between predicted values [27]. MIMO-AR computes the output for the forecast horizon in a single simulation with unique model, which returns a vector rather than a scalar value as follows:where is the forecast horizon and is the matrix of outputs. Each column will contain the prediction related to a specific horizon.

The following equation defines MIMO in matrix form: where is a matrix of linear coefficients and is the Autoregressive transposed matrix created from the components and that were extracted by means of MSVD. Matrix is dimension, where is the number of samples. The matrix of coefficients is computed with the Least Square Method (LSM) as below:where is the Moore-Penrose pseudoinverse matrix of .

3. Stationary Wavelet Transform Combined with MIMO-AR and Stationary Wavelet Transform Combined with MIMO-ANN

3.1. Stationary Wavelet Transform

Stationary Wavelet Transform (SWT) is improved version of Discrete Wavelet Transform. SWT is also known in the literature as Dyadic Wavelet Transform, Maximal Overlap Transform, Undecimated Discrete Wavelet Transform, and Redundant Wavelet Transform. The implementation of SWT is defined in the algorithm of Shensa [28]. SWT implements filtering but the downsampling procedure is omitted and the filters are upsampled [29, 30].

In SWT the length of the observed signal must be an integer multiple of , where is the scale number. The signal is separated in approximation coefficients and detail coefficients at different scales; this hierarchical process is called multiresolution decomposition [25].

The observed signal (which was named in previous section) is decomposed in approximation and detail coefficients through a bank of low pass filters and a bank of high pass filter one to each level as the scheme of Figure 1. Each level filter is upsampled version of the previous one. The components obtained after the decomposition are never decimated; therefore they have the same length as the observed signal.

At first decomposition level the observed signal is convoluted with the first low pass filter to obtain the first approximation coefficients and with the first high pass filter to obtain the first detail coefficients . The process is defined as follows:The process follows iteratively, for and it is given asInverse Stationary Wavelet Transform (iSWT) performs the reconstruction. The implementation of iSWT consists in applying the operations that were performed in SWT but in inverse order and based on equivalent filters of reconstruction. SWT obtains subbands of frequency; the last approximation coefficient reconstructed gives rise to the component of low frequency , whereas all reconstructed detail coefficients are added to obtain the component of high frequency .

3.2. Forecast Based on Components Extracted by SWT

The forecasting is implemented via both a linear and nonlinear models to evaluate the performance of the decomposition obtained through SWT.

The MIMO-AR model has the same structure that was used in the previous section. Therefore the nonlinear forecast based on an Artificial Neural Network is described here.

A sigmoid Multilayer Perceptron (MLP) of three layers [31] is implemented. The ANN is denoted with . The inputs are the lagged terms contained in the regressor matrix . The hidden layer has nodes and the output has nodes, which are the forecast horizon. The prediction at each forecast horizon via the ANN is expressed with,where , is the weight of the connection between th hidden node and th output, and is th hidden level output, while is the weight of the connection between th input node and th hidden node, and represents th lagged vector.

The sigmoid transfer function is denoted with the following:Parameters and are updated with the application of the learning algorithm, in this case, Levenberg-Marquardt [32, 33].

4. Efficiency Metrics

The performance of the models MSVD + MIMO-AR, SWT + MIMO-AR, and SWT + MIMO-ANN is evaluated with three efficiency criteria. The normalized Root Mean Square Error (), the modified Nash-Suctliffe Efficiency () [34], and the modified Index of Agreement () [35]. Metrics and at difference of are not based on square differences; in place they are based on the Sum of Absolute Error () and the Sum of Absolute Deviation (); the formulas are shown below:where is th observed value, is th predicted value, is the mean of all , and is the testing sample size.

Scores and overcome the possible oversensitivity to extreme values induced by metrics based on square computation and increase the possible sensitivity for lower values.Metrics and are monotonically and functionally related, but the use of balances the number of deviations evaluated within the numerator and within the denominator of the factional part.

5. Case Study

The Chilean Police and the National Traffic Safety Commission (CONASET) are the official institutions that register the data of traffic accidents in Chile [36]. The data are continuously collected, and in this study they have been sampled with a fixed interval of 7 days (one week). Three discrete time series of injured people in traffic accidents in Santiago from 2000:1 to 2014:12 due to different causes are used. CONASET has defined one hundred causes of traffic accidents; in this study case the series of persons with injuries I-G1 and I-G2 group include 20 causes which are related to improper behavior of drivers, passengers, and pedestrians, with an incidence rate of 75%, whereas the series I-G3 groups include the remaining causes (not related to improper behavior) with an incidence rate of 25%. Table 1 presents the series of persons with injuries in traffic accidents and the causes related to the incidents.

Figures 2(a), 3(a), and 4(a) show the observed time series, whereas Figures 2(b), 3(b), and 4(b) show the Fourier Power Spectrum (FPS) of I-G1, I-G2, and I-G3, respectively. High data variability is observed during the analyzed period; by instance I-G1 presents upward trend between weeks 1 and 280 which is followed by downward trend until week 348; that behavior continues in the remaining observed period. On the other hand, I-G2 presents downward trend from week 232 until the end, and I-G3 presents upward trend from week 491 until end. The FPS analysis shows the signal spectrum and the red-noise spectrum; a signal spectrum peak is significative when its value is higher than the red-noise spectrum [37]. Both series, I-G1 and I-G2, present the highest peak at week 26 at 98% of confidence level, whereas I-G3 shows the highest peak at week 17 at 73% of confidence level. The order of each AR model was selected with the number of weeks for which the highest power spectrum was found; consequently for I-G1 and I-G2, and for I-G3.

6. Empirical Research Result

The empirical results obtained by the application of MSVD + MIMO-AR, SWT + MIMO-AR, and SWT + MIMO-ANN are presented in this section in two stages decomposition and prediction.

6.1. Decomposition Based on MSVD and SWT

MSVD implements an iterative process which finishes when reaches the asymptotic value. The Singular Spectrum Rate for each decomposition level is illustrated in Figure 5; the asymptotic value is reached when , and it was observed in repetition 16. Therefore the iterative process finishes when iteration 16 was performed; this condition is used with all-time series.

SWT decomposition is implemented trough Daubechies of order 2 (Db2) (due to the inaccurate results that were obtained with the other types of wavelet functions, they are not presented). Three decomposition levels () were selected according to the period fluctuation between 8 and 16 weeks.

Figures 6, 7, and 8 show the components of low frequency and high frequency obtained with MSVD and SWT for I-G1, I-G2, and I-G3, respectively. The components extracted by both MSVD and SWT show long-memory periodicity features, whereas the components show short-term periodic fluctuations.

6.2. Prediction via MIMO-AR and MIMO-ANN Models

The MIMO strategy is implemented to predict the number of injured people in traffic accidents for multiple horizon through the Autoregressive model and through an Artificial Neural Network. For both linear and nonlinear models the spectral analysis developed by means of FPS informs about the order of the models; it was shown in Figures 2(b), 3(b), and 4(b). The inputs are the lagged values of and the lagged values of , and the outputs are the number of injured people for the next weeks. The components and were extracted previously via MSVD and SWT.

Before prediction, each data set of low and high frequency has been divided into two subsets, training and testing. The training subset () involves of the samples, and consequently the testing subset () involves the remaining .

The MIMO-ANN structure denoted with () was implemented with lagged values (set previously with the FPS information), , where is the training subset size, and forecast horizon.

The prediction performance is evaluated with the efficiency metrics , , and , which are presented in Tables 2, 3, and 4 for I-G1, I-G2, and I-G3, respectively. The ANN results correspond to 500 epochs and 10 runs. The results obtained through the nonlinear model SWT + MIMO-ANN are inferior with respect to the linear models MSVD + MIMO-AR and SWT + MIMO-AR; therefore the rest of comparisons are performed between the linear models. The SWT + MIMO-AR results for 14 weeks’ ahead prediction of all-time series are not presented due to the poor results that were obtained, as those results that were obtained with SWT + MIMO-ANN for forecast horizon higher than 8 weeks.

From Table 2 and Figure 9, MSVD + MIMO-AR obtains the best accuracy in the forecast of I-G1. A significative gain was observed in each forecasting horizon of the model based on MSVD with respect to the models based on SWT. The mean gain for 1 to 13 weeks of MSVD + MIMO-AR over SWT + MIMO-AR is 17.7% in and 8.1% in .

From Table 3 and Figure 10, MSVD + MIMO-AR obtains the best accuracy in the forecasting of I-G2. A significative gain of MSVD + MIMO-AR was observed in each forecasting horizon with respect to the models based on SWT. The mean gain for 1 to 13 weeks of MSVD + MIMO-AR over SWT + MIMO-AR is 20.6% in and 9.3% in .

From previous analysis that was made for I-G1 and I-G2, from Table 4 and Figure 11, the I-G3 forecast based on MSVD + MIMO-AR is also more accurate than the prediction obtained with the models based on SWT. The mean gain for 1 to 13 weeks of MSVD + MIMO-AR over SWT + MIMO-AR is 20.9% in and 9.4% in .

The I-G1 Prediction via MSVD + MIMO-AR for 14 weeks’ ahead prediction is shown in Figures 12(a) and 12(b); from figures good fit is observed between actual and estimated values. Metrics computation gives of 2.9%, of 83.3%, and of 91.6%. The prediction of the same series via SWT + MIMO-AR for 13 weeks’ ahead prediction is shown in Figure 13; lower accuracy is observed with of 10.1%, of 43.7%, and of 71.8%.

The I-G2 Prediction via MSVD + MIMO-AR for 14 weeks’ ahead prediction is shown in Figures 14(a) and 14(b); from figures good fit is observed with of 5.8%, of 81.9%, and of 90.9%. The prediction of the same series via SWT + MIMO-AR for 13 weeks’ ahead prediction is shown in Figure 15, lower accuracy is observed with of 19.6%, of 36.4%, and of 68.2%.

The I-G3 Prediction via MSVD + MIMO-AR for 14 weeks’ ahead prediction is shown in Figures 16(a) and 16(b); from figures good fit is observed with of 3.8%, of 81.4%, and of 90.7%. The prediction of the same series via SWT + MIMO-AR for 13 weeks’ ahead prediction is shown in Figure 17, lower accuracy is observed with of 13.0%, of 35.0%, and of 67.5%.

7. Conclusions

In this paper was presented a new decomposition method for extracting components of low and high frequency of a nonstationary time series. The method was called MSVD due to the use of Multilevel Singular Value Decomposition of a Hankel matrix. MSVD was evaluated for multistep ahead forecasting based on the Autoregressive model and the MIMO strategy.

The forecasting model MSVD + MIMO-AR was compared with a linear and a nonlinear forecast model based on the commonly decomposition technique Stationary Wavelet Transform. The empirical application was developed through three time series coming from traffic accidents domain. All experiments have shown the superior accuracy of MSVD + MIMO-AR with respect to SWT + MIMO-AR and SWT + MIMO-ANN. MSVD + MIMO-AR in comparison with the second best model SWT + MIMO-AR, achieving mean -gain of 19.8% and mean -gain of 8.9% for 13 weeks’ ahead forecasting of persons with injuries in traffic accidents in Santiago, Chile. It was also observed that an ANN with Levenberg-Marquardt based on SWT decays significantly from 9 weeks’ ahead forecasting.

Furthermore, the implementation of MSVD presents simplicity with respect to other techniques based on singular values by the use of a fixed window length in the embedding step, and although the algorithm is iterative, the stopping condition is guaranteed by the convergence of the Singular Spectrum Rate parameter. MSVD also presents more simplicity with respect to SWT; this is because SWT requires taking some decisions to select the wavelet mother function.

Future implementations will consider new application areas to support planning and management tasks of public and private institutions.

Competing Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This work was supported in part by Grant Project VRIEA-PUCV DI-Regular 039344/2016.