Short-Term Traffic Flow Forecasting Method Based on LSSVM Model Optimized by GA-PSO Hybrid Algorithm

Short-term traffic flow forecasting is one of the key issues in the field of dynamic traffic control and management. Because of the uncertainty and nonlinearity of traffic flow, short-term traffic flow forecasting remains a challenging task. In order to improve the accuracy of short-term traffic flow forecasting, a method based on an LSSVM model optimized by a GA-PSO hybrid algorithm is put forward. First, the LSSVM model is constructed with a combined kernel function. Then the GA-PSO hybrid optimization algorithm is designed to optimize the kernel function parameters efficiently and effectively. Finally, case validation is carried out using inductive loop data collected from the north-south viaduct in Shanghai. The experimental results demonstrate that the proposed GA-PSO-LSSVM model is superior to the comparative methods.


Introduction
Real-time and accurate traffic flow forecasting information can provide theoretical and data support for the advanced traffic management system (ATMS) and the advanced traffic information service system (ATIS). Because of its importance in both theoretical and empirical aspects of intelligent transportation systems (ITS), short-term traffic flow forecasting has generated great interest among researchers. With the development of traffic surveillance systems, more and more real-time traffic data become available every couple of minutes or seconds. Short-term traffic flow forecasting generally means that the observation period is less than 15 minutes. Traffic flow forecasting, especially short-term traffic flow forecasting, has been recognized as a critical need for ITS. In the past decades, numerous studies have addressed traffic flow forecasting. The forecasting methods in the literature can be broadly divided into parametric methods and nonparametric methods [1]. The parametric methods mainly include the autoregressive integrated moving average (ARIMA) model [2][3][4][5], the time series model [6][7][8][9], the Kalman filtering model [10][11][12][13], and the parametric regressive model [14][15][16]. This kind of method can achieve a better forecasting effect if the traffic flow data varies temporally. However, these methods often assume a number of harsh conditions, such as the normality of residuals and a predefined model structure, which are seldom satisfied due to the stochastic and nonlinear characteristics of traffic flow. To overcome the limitations of parametric models, many studies have used nonparametric methods, such as the nonparametric regressive model [17][18][19], the spectral analysis model [20,21], artificial neural network (ANN) models [22][23][24][25], and support vector machine (SVM) models [26][27][28][29].
In particular, the SVM model, which has strong generalization ability and attains a global minimum on the sample data, has gained special attention in recent years. This paper builds its short-term traffic flow forecasting model on the SVM model because of its ability to deal with dynamic, nonlinear, and complex traffic flow time series.
Nevertheless, besides their advantages, SVM based forecasting models have some shortcomings. One is the choice of the kernel function. Traditionally, a single kernel function is selected, and the selection generally depends on experience. In view of this problem, we construct a combined kernel function to overcome the limitation of a single kernel function. In addition, determining the parameters of the SVM model remains a difficult yet important challenge. At present, the commonly used parameter optimization methods mainly include the cross validation method [30] and the grid search method [31]. However, these methods easily fall into local optima and require a large amount of calculation. In order to obtain rational parameters, intelligent optimization algorithms have been pursued by many researchers. Particle swarm optimization (PSO) and the genetic algorithm (GA) are the most popular intelligent optimization algorithms. GA [32] is a heuristic method based on Darwin's theory of biological evolution, which searches in parallel from a population of points. Therefore, it has the ability to avoid being trapped in a local optimal solution. PSO [33,34] is a swarm intelligence optimization algorithm derived from the study of bird predation behavior. Compared with GA, the PSO algorithm has a simpler structure because it has no selection, crossover, or mutation operations. However, because each particle in PSO evolves by comparing its own position with the surrounding positions and the current optimal position in the swarm, the convergence of the PSO algorithm slows in the later calculation stage, and the algorithm easily falls into a local optimum. Comparatively speaking, because of crossover, mutation, and other evolutionary patterns, GA can improve the diversity of solutions. But GA often produces a large number of redundant iterations once the calculation has progressed to a certain extent, which reduces the computational efficiency.
Taking into account the above reasons and with the goal of improving the accuracy of short-term traffic flow forecasting, we put forward a short-term traffic flow forecasting method based on LSSVM model optimized by GA-PSO hybrid algorithm. The remainder of this paper is structured as follows: in section "Modeling of LSSVM Model", the principle of LSSVM model and the construction of combined kernel function are presented. In section "GA-PSO Hybrid Optimization Algorithm Design", the process of GA-PSO hybrid optimization algorithm is described. In section "Experiment Setup and Case Study", empirical analysis is carried out, and the forecasting results of different approaches are presented and discussed. In section "Discussion and Conclusions", a brief review and future research are presented.

Modeling of LSSVM Model
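2.1. The Principle of LSSVM Model. LSSVM is an improved algorithm based on SVM. By introducing equality constraints and a least squares loss function, the optimization problem is converted into a system of linear equations, and the complexity of the algorithm is reduced by avoiding the quadratic programming problem. Regression forecasting based on LSSVM can be described as follows.

Consider a given training data set $D = \{(x_i, y_i)\}$, $i = 1, 2, \cdots, N$. The relationship between $x_i$ and $y_i$ is usually nonlinear, so $x_i$ is mapped into a high-dimensional feature space. The regression problem of LSSVM is defined as

$$\min_{w, b, \xi} J(w, \xi) = \frac{1}{2} w^{T} w + \frac{C}{2} \sum_{i=1}^{N} \xi_i^{2}$$

subject to

$$y_i = w^{T} \varphi(x_i) + b + \xi_i, \quad i = 1, 2, \cdots, N,$$

where $w$ is the weight vector, $C$ is the penalty factor, $\xi_i$ is the approximation error, $\varphi(\cdot)$ is the nonlinear mapping function, and $b$ is the offset. To solve the optimization problem, the Lagrange function can be introduced as follows:

$$L(w, b, \xi, \alpha) = \frac{1}{2} w^{T} w + \frac{C}{2} \sum_{i=1}^{N} \xi_i^{2} - \sum_{i=1}^{N} \alpha_i \left[ w^{T} \varphi(x_i) + b + \xi_i - y_i \right],$$

where $\alpha_i$ is the Lagrange multiplier. According to the Karush-Kuhn-Tucker (KKT) conditions, the following formulas can be obtained by taking partial derivatives with respect to $w$, $b$, $\xi_i$, and $\alpha_i$:

$$\frac{\partial L}{\partial w} = 0 \Rightarrow w = \sum_{i=1}^{N} \alpha_i \varphi(x_i), \qquad \frac{\partial L}{\partial b} = 0 \Rightarrow \sum_{i=1}^{N} \alpha_i = 0,$$

$$\frac{\partial L}{\partial \xi_i} = 0 \Rightarrow \alpha_i = C \xi_i, \qquad \frac{\partial L}{\partial \alpha_i} = 0 \Rightarrow w^{T} \varphi(x_i) + b + \xi_i - y_i = 0.$$

By eliminating $w$ and $\xi_i$, the equations can be written as

$$\begin{bmatrix} 0 & \mathbf{1}^{T} \\ \mathbf{1} & \Omega + I/C \end{bmatrix} \begin{bmatrix} b \\ \alpha \end{bmatrix} = \begin{bmatrix} 0 \\ y \end{bmatrix},$$

where $y = [y_1, y_2, \cdots, y_N]^{T}$, $\mathbf{1} = [1, 1, \cdots, 1]^{T}$, $\alpha = [\alpha_1, \alpha_2, \cdots, \alpha_N]^{T}$, and $\Omega$ is the kernel matrix with $\Omega_{ij} = \varphi(x_i)^{T} \varphi(x_j) = K(x_i, x_j)$, $i, j = 1, 2, \cdots, N$. Considering $A = \Omega + I/C$, the expressions of $b$ and $\alpha$ can be written as

$$b = \frac{\mathbf{1}^{T} A^{-1} y}{\mathbf{1}^{T} A^{-1} \mathbf{1}}, \qquad \alpha = A^{-1} (y - b \mathbf{1}).$$

Therefore, the regression model of LSSVM can be obtained as

$$f(x) = \sum_{i=1}^{N} \alpha_i K(x, x_i) + b,$$

where $K(x, x_i)$ is the kernel function, which satisfies the Mercer condition.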
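As an illustration of how the linear system above is solved in practice, the following Python sketch computes the offset and the Lagrange multipliers from the closed-form expressions b = (1ᵀA⁻¹y)/(1ᵀA⁻¹1) and α = A⁻¹(y − b1) with A = Ω + I/C. It is a minimal sketch and not the authors' implementation: the function names `rbf_kernel`, `lssvm_fit`, and `lssvm_predict`, the default Gaussian kernel, and the default values of `C` and `sigma` are all assumptions made for the example.

```python
import numpy as np


def rbf_kernel(X1, X2, sigma=1.0):
    """Gaussian kernel matrix between the rows of X1 and X2 (illustrative default)."""
    sq_dist = (np.sum(X1 ** 2, axis=1)[:, None]
               + np.sum(X2 ** 2, axis=1)[None, :]
               - 2.0 * X1 @ X2.T)
    return np.exp(-np.maximum(sq_dist, 0.0) / (2.0 * sigma ** 2))


def lssvm_fit(X, y, C=10.0, kernel=rbf_kernel):
    """Solve the LSSVM linear system for the multipliers alpha and the offset b."""
    n = X.shape[0]
    A = kernel(X, X) + np.eye(n) / C          # A = Omega + I / C
    ones = np.ones(n)
    A_inv_y = np.linalg.solve(A, y)           # A^{-1} y
    A_inv_1 = np.linalg.solve(A, ones)        # A^{-1} 1
    b = (ones @ A_inv_y) / (ones @ A_inv_1)
    alpha = A_inv_y - b * A_inv_1             # alpha = A^{-1} (y - b 1)
    return alpha, b


def lssvm_predict(X_train, alpha, b, X_new, kernel=rbf_kernel):
    """Evaluate f(x) = sum_i alpha_i K(x, x_i) + b for each row of X_new."""
    return kernel(X_new, X_train) @ alpha + b
```

Solving two linear systems with `numpy.linalg.solve` avoids forming the explicit inverse of A, which is usually preferable numerically. Any kernel function with the same call signature can be passed in place of `rbf_kernel`.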
2.2. The Construction of Combined Kernel Function. The SVM model is built on the principle of structural risk minimization, and its core idea is the introduction of kernel functions. SVM models with different kernel functions can have different learning and generalization abilities. Therefore, how to select an appropriate kernel function is a major problem in the field of short-term traffic flow forecasting.

At present, the commonly used kernel functions can be roughly divided into two categories: local kernel functions and global kernel functions. The Gaussian kernel function is a typical local kernel function, which has strong learning ability and weak generalization ability. The polynomial kernel function is a typical global kernel function, which has strong generalization ability and weak learning ability. Therefore, taking into account the advantages of the Gaussian kernel function and the polynomial kernel function, this paper constructs a new combined kernel function. The combined kernel function has not only the local learning ability of the Gaussian kernel function but also the strong generalization ability of the polynomial kernel function. The form of the combined kernel function is as follows:

$$K(x, x_i) = \lambda \exp\left( -\frac{\| x - x_i \|^{2}}{2 \sigma^{2}} \right) + (1 - \lambda) \left( x^{T} x_i + 1 \right)^{d},$$

where $\lambda$ is the weight coefficient, $0 \le \lambda \le 1$, $\sigma$ is the kernel width of the Gaussian kernel function, and $d$ is the order of the polynomial kernel function.

When $\lambda$ approaches 0, the combined kernel function approximates the polynomial kernel function. Although it fits the sample data far away from the test point well, the fitting effect near the test point is poor. When $\lambda$ approaches 1, the combined kernel function is close to the Gaussian kernel function, whose global generalization ability is weak. In short, different kernel functions have different advantages, and if the weight coefficient is chosen inappropriately, the performance of the combined kernel function may be lower than that of a single kernel function. Therefore, a proper weight coefficient is of great importance for the combined kernel function.
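The behavior of the weight coefficient at its two limits can be checked with a short Python sketch. It assumes the common form of a weighted sum, with weight `lam` on the Gaussian kernel exp(−‖x − z‖²/(2σ²)) and weight 1 − `lam` on the polynomial kernel (xᵀz + 1)ᵈ. The scaling by 2σ², the offset of 1, and the name `combined_kernel` are assumptions made for illustration.

```python
import numpy as np


def combined_kernel(X1, X2, lam=0.5, sigma=1.0, d=2):
    """Weighted sum of a Gaussian (local) and a polynomial (global) kernel.

    lam   -- weight coefficient in [0, 1]; lam = 1 gives the pure Gaussian kernel
    sigma -- kernel width of the Gaussian kernel
    d     -- order of the polynomial kernel
    """
    X1 = np.atleast_2d(np.asarray(X1, dtype=float))
    X2 = np.atleast_2d(np.asarray(X2, dtype=float))
    inner = X1 @ X2.T
    sq_dist = (np.sum(X1 ** 2, axis=1)[:, None]
               + np.sum(X2 ** 2, axis=1)[None, :]
               - 2.0 * inner)
    gaussian = np.exp(-np.maximum(sq_dist, 0.0) / (2.0 * sigma ** 2))
    polynomial = (inner + 1.0) ** d
    return lam * gaussian + (1.0 - lam) * polynomial
```

With `lam=0` the function returns the polynomial kernel alone, and with `lam=1` the Gaussian kernel alone, so the three values `lam`, `sigma`, and `d` are the kernel parameters an optimizer would need to tune together with the penalty factor.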

GA-PSO Hybrid Optimization Algorithm Design
The construction of the combined kernel function increases the number of parameters that need to be optimized. This paper designs a new GA-PSO hybrid optimization algorithm to obtain the optimal parameters of the LSSVM model. The main idea of the GA-PSO hybrid optimization algorithm is as follows. First, the PSO algorithm is carried out, and the optimal M particles are retained. Then, popsize − M individuals are obtained by copying operations based on the position values of the M particles, and the crossover and mutation operations of GA are carried out. Finally, the M particles retained by PSO and the popsize − M individuals obtained by GA form a new particle population, which enters the next generation of evolutionary computing. Figure 1 gives the GA-PSO hybrid optimization schematic. The main differences between this new algorithm and the traditional hybrid algorithm are as follows:

(1) The combination of GA and PSO is hierarchical. First, all individuals of the population perform PSO evolutionary operations; then the optimal M particles are selected to perform genetic evolution.

(2) Two-way information transmission is completed during the mixing process. The initial population of GA is generated from the optimal individuals in PSO; after the genetic operation, the individuals obtained by GA are combined with the retained particles to form the new PSO population.

The steps of the algorithm are as follows.

Step 1. Initialize the parameters: the number of particles popsize, the number of particles retained after PSO evolution M, the PSO weight factors $c_1$ and $c_2$, the crossover probability $P_c$, the mutation probability $P_m$, the maximum velocity of a particle $V_{\max}$, the maximum evolutionary generations $k_{\max}$, and the total evolutionary generations of the hybrid algorithm maxgen.

Step 2. Generate the initial popsize particles in the feasible domain and calculate the fitness function value. The fitness function is defined as the mean absolute percentage error of the fivefold validation method on the training data set.

Step 3. If $k \le k_{\max}$, then implement Step 7; else implement Step 9.

Step 4. Update the position and velocity of the particles.

Step 5. Sort the popsize particles by the value of the fitness function, and select the M particles with the smallest fitness values.

Step 6. According to the positions of the retained M particles, generate the popsize − M GA individuals by copying operations.

Step 7. Carry out crossover and mutation operations with probabilities $P_c$ and $P_m$.

Step 8. Form a new population of popsize particles by combining the popsize − M individuals with the M retained particles.

Step 9. Output the optimal fitness function value and the parameter optimization results.

The process of the GA-PSO hybrid optimization algorithm is shown in Figure 2.
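The hierarchical structure described above (a PSO update for the whole population, retention of the best M particles, and a GA stage that refills the remaining popsize − M slots) can be sketched in Python as follows. This is an illustrative sketch under several assumptions that the text does not specify: the inertia weight `w`, the velocity limit of 20% of each parameter range, arithmetic crossover, uniform-reset mutation, zero initial velocity for GA individuals, a single loop over `max_gen` generations, and all default values. The name `ga_pso_minimize` is hypothetical.

```python
import numpy as np


def ga_pso_minimize(fitness, bounds, pop_size=20, keep_ratio=0.4,
                    c1=1.5, c2=1.5, w=0.7, p_c=0.8, p_m=0.1,
                    max_gen=50, seed=0):
    """Minimize `fitness` over box `bounds`; requires pop_size >= 2."""
    rng = np.random.default_rng(seed)
    lo, hi = np.asarray(bounds, dtype=float).T
    dim = lo.size
    v_max = 0.2 * (hi - lo)                    # assumed velocity limit
    # M retained particles; at least one GA individual is always generated.
    m = int(np.clip(round(keep_ratio * pop_size), 1, pop_size - 1))

    pos = rng.uniform(lo, hi, size=(pop_size, dim))
    vel = np.zeros((pop_size, dim))
    fit = np.array([fitness(p) for p in pos])
    pbest, pbest_fit = pos.copy(), fit.copy()
    g = int(np.argmin(fit))
    gbest, gbest_fit = pos[g].copy(), float(fit[g])

    for _ in range(max_gen):
        # PSO stage: update velocity and position of every particle.
        r1 = rng.random((pop_size, dim))
        r2 = rng.random((pop_size, dim))
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        vel = np.clip(vel, -v_max, v_max)
        pos = np.clip(pos + vel, lo, hi)
        fit = np.array([fitness(p) for p in pos])

        improved = fit < pbest_fit
        pbest[improved] = pos[improved]
        pbest_fit[improved] = fit[improved]
        g = int(np.argmin(pbest_fit))
        if pbest_fit[g] < gbest_fit:
            gbest, gbest_fit = pbest[g].copy(), float(pbest_fit[g])

        # Sort so that the M particles with the smallest fitness come first.
        order = np.argsort(fit)
        pos, vel, fit = pos[order], vel[order], fit[order]
        pbest, pbest_fit = pbest[order], pbest_fit[order]

        # GA stage: copy retained particles, then crossover and mutation.
        n_ga = pop_size - m
        children = pos[rng.integers(0, m, size=n_ga)].copy()
        for i in range(0, n_ga - 1, 2):
            if rng.random() < p_c:             # arithmetic crossover of a pair
                a = rng.random()
                ci = a * children[i] + (1.0 - a) * children[i + 1]
                cj = (1.0 - a) * children[i] + a * children[i + 1]
                children[i], children[i + 1] = ci, cj
        mutate = rng.random(children.shape) < p_m   # uniform-reset mutation
        children[mutate] = rng.uniform(lo, hi, size=children.shape)[mutate]
        child_fit = np.array([fitness(c) for c in children])

        # Merge: M retained particles plus pop_size - M GA individuals.
        pos[m:] = children
        fit[m:] = child_fit
        vel[m:] = 0.0
        pbest[m:] = children
        pbest_fit[m:] = child_fit
        c = int(np.argmin(child_fit))
        if child_fit[c] < gbest_fit:
            gbest, gbest_fit = children[c].copy(), float(child_fit[c])

    return gbest, gbest_fit
```

In the setting of this paper, `fitness` would be the cross-validated forecasting error of an LSSVM model built from a candidate parameter vector, and `bounds` would hold the search range of each kernel parameter and of the penalty factor.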

Experiment Setup and Case Study
4.1. Data Description. An arterial segment of the north-south viaduct expressway from the Gonghe Road interchange to the Yanan East Road interchange in Shanghai, China, is selected as the experimental section. The experimental area is shown in Figure 3, with four lanes in each direction. Figure 4 gives the layout of the detectors for the four lanes in one direction.

4.2. Data Analysis. The determination of the input data has a direct impact on short-term traffic flow forecasting. Traditional traffic flow forecasting methods mainly focus on the temporal correlation of traffic flow data and ignore the spatial correlation, which is a limitation. Through the analysis of a large amount of traffic flow data, it is found that traffic flow data has strong temporal and spatial correlation. Figure 5 gives the traffic flow data of the same detector for five consecutive Mondays. Figure 6 gives the traffic flow data of different detection sections in the same lane. Figure 7 gives the traffic flow data of different lanes at the same detection cross section. As we can see, traffic flow data has strong spatial and temporal correlation, and these spatiotemporal correlation characteristics provide effective data support for short-term traffic flow forecasting.
Based on the above analysis (Figures 5, 6, and 7), this paper makes full use of the multimodal spatiotemporal correlation information to determine the input variables of the forecasting model. Taking detector NBDX16(2) as an example, where NBDX denotes the main line of the east side, the input variables are listed in Table 1.
Table 1
Number  Input variable
1       Traffic flow data collected from the upstream detection section of the same lane, NBDX15(2)
2       Traffic flow data collected from the downstream detection section of the same lane, NBDX17(2)
3       Traffic flow data collected by the adjacent lane of the same detection section, NBDX16(1)
4       Traffic flow data collected by the adjacent lane of the same detection section, NBDX16(3)
5       Traffic data collected by the same detector at the same time a week ago
6       Traffic data collected by the same detector at the same time two weeks ago

4.3. Comparison of GA-PSO Algorithm Performance. In order to compare the effect of parameter optimization, the GA algorithm, the PSO algorithm, and the traditional GA-PSO algorithm [35] are used for comparative analysis. The K-fold cross validation method is used to prevent overfitting and underfitting. The training data set is randomly divided into K subsets. The LSSVM model is built using K − 1 subsets as the training set, and the performance of the parameters is checked on the remaining subset. In this paper, the fivefold cross validation method is used. The parameters of each optimization algorithm are shown in Table 2. Figure 8 gives the convergence process of the different algorithms. As we can see from Figure 8, the fitness curve of the proposed GA-PSO hybrid optimization algorithm is clearly better than those of the three other algorithms, and its convergence is faster. The ideal effect is basically achieved at around 20 iterations. In summary, the parameter optimization effect of the proposed GA-PSO hybrid optimization algorithm is better than that of the GA algorithm, the PSO algorithm, and the traditional GA-PSO algorithm.
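The cross-validated fitness used in this comparison (the mean absolute percentage error averaged over the K held-out subsets) can be sketched in Python as follows. The sketch is model-agnostic: `fit` and `predict` are hypothetical user-supplied callables standing in for LSSVM training and forecasting, and the name `kfold_mape` is an assumption made for this example.

```python
import numpy as np


def kfold_mape(fit, predict, X, y, k=5, seed=0):
    """Average MAPE (in percent) over k randomly assigned folds.

    fit(X_train, y_train) -> model        (hypothetical training callable)
    predict(model, X_val) -> predictions  (hypothetical forecasting callable)
    Targets in y must be nonzero, since MAPE divides by the actual value.
    """
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)
    scores = []
    for i in range(k):
        val = folds[i]
        # The remaining k - 1 subsets form the training set.
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        model = fit(X[train], y[train])
        pred = predict(model, X[val])
        scores.append(100.0 * np.mean(np.abs((y[val] - pred) / y[val])))
    return float(np.mean(scores))
```

A parameter optimizer would call this function once per candidate parameter set, with `fit` closing over the candidate values, and would treat the returned percentage as the fitness to be minimized.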
In order to further verify the superiority of the proposed algorithm, a comparative analysis was carried out in terms of the fitness value, the average number of generations to convergence, and the average computation time. The experimental environment is as follows: the processor is an Intel(R) Core(TM) i5-2450M CPU at 2.50 GHz, the memory capacity is 4 GB, and the operating system is Windows 7. Table 3 gives the comparison results of the different algorithms.
4.4. Influence Analysis of the Number of Retained Particles. In the GA-PSO hybrid algorithm proposed in the third section, there is an important parameter M, which represents the number of particles retained after the PSO optimization and also the size of the original population for the GA operations. In Section 4.3, we set M = 0.4 × popsize. In order to analyze the influence of M on the performance of the algorithm, M = r × popsize is adopted, where r is set to 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, and 0.9, respectively. Twenty random trials were carried out, and the average numbers of generations to convergence are compared. The result is shown in Figure 9. From Figure 9, it can be seen that the value of r has no significant influence on the average number of generations to convergence, which remains relatively stable as r increases.
4.5. Performance Evaluation Index. In order to evaluate the efficiency of the proposed approach, three different statistical indices are utilized to measure the forecasting accuracy. These indices are the mean absolute error (MAE), the mean absolute percent error (MAPE), and the equal coefficient (EC). The equations of these indices are as follows:

$$\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| y_i - \hat{y}_i \right|,$$

$$\mathrm{MAPE} = \frac{1}{n} \sum_{i=1}^{n} \left| \frac{y_i - \hat{y}_i}{y_i} \right| \times 100\%,$$

$$\mathrm{EC} = 1 - \frac{\sqrt{\sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^{2}}}{\sqrt{\sum_{i=1}^{n} y_i^{2}} + \sqrt{\sum_{i=1}^{n} \hat{y}_i^{2}}},$$

where $y_i$ denotes the actual value for the $i$th time interval, $\hat{y}_i$ denotes the predicted value for the $i$th time interval, and $n$ is the total number of time intervals.

4.6. Model Performance and Analysis. In order to evaluate the forecasting performance of the proposed approach, the training dataset consists of the traffic data collected on April 28th, May 5th, May 12th, and May 19th, and the test dataset consists of the traffic data collected on May 26th. Figure 10 presents the forecasting results for east mainline detector NBDX16(2). Figure 11 presents the forecasting results for west mainline detector NBXX10(1). The green line stands for the forecasting results, and the blue line stands for the original traffic flow data.
As shown in Figures 10 and 11, the forecasting results of the proposed approach track the actual data closely, which indicates that the proposed approach is able to predict short-term traffic flow data with small errors in most situations.
To further demonstrate the superiority of the proposed approach, the GA-LSSVM model and the PSO-LSSVM model are compared using the same dataset. Figure 12 presents the forecasting results of each approach for east mainline detector NBDX11(1), and Figure 13 presents the forecasting results of each approach for west mainline detector NBXX15(2). As shown in Figures 12 and 13, the GA-PSO-LSSVM model successfully captures the changing tendency of the traffic flow data and has the best fitting performance compared with the other approaches, which shows that the proposed approach can accurately forecast short-term traffic flow data and outperforms the other two approaches. Table 4 presents the evaluation results of the different methods. The lowest forecasting errors are achieved by the proposed method. The proposed method has strong generalization ability because it achieves good forecasting performance on both the east mainline and the west mainline. Overall, the proposed approach works well for short-term traffic flow forecasting and achieves satisfactory forecasting results.
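For readers who wish to reproduce evaluation results of the kind reported in Table 4, the three indices can be computed with the Python sketch below. MAE and MAPE follow their standard definitions. The EC form used here, one minus the root sum of squared errors divided by the sum of the root sums of squares of the actual and predicted series, is the form commonly used in the traffic forecasting literature and is an assumption about this paper's exact definition. The function names are illustrative.

```python
import numpy as np


def mae(actual, predicted):
    """Mean absolute error."""
    actual, predicted = np.asarray(actual, float), np.asarray(predicted, float)
    return float(np.mean(np.abs(actual - predicted)))


def mape(actual, predicted):
    """Mean absolute percent error, in percent; actual values must be nonzero."""
    actual, predicted = np.asarray(actual, float), np.asarray(predicted, float)
    return float(100.0 * np.mean(np.abs((actual - predicted) / actual)))


def equal_coefficient(actual, predicted):
    """Equal coefficient (assumed form); values closer to 1 indicate a better fit."""
    actual, predicted = np.asarray(actual, float), np.asarray(predicted, float)
    numerator = np.sqrt(np.sum((actual - predicted) ** 2))
    denominator = np.sqrt(np.sum(actual ** 2)) + np.sqrt(np.sum(predicted ** 2))
    return float(1.0 - numerator / denominator)
```

For example, with actual flows of 100, 200, and 400 vehicles and predictions of 110, 190, and 400, the absolute errors are 10, 10, and 0, so `mae` returns about 6.67 and `mape` returns 5.0.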

Discussion and Conclusions
In this paper, we propose a short-term traffic flow forecasting method based on an LSSVM model optimized by a GA-PSO hybrid algorithm. The main contribution of this paper is a new approach to building a combined kernel function for the LSSVM based short-term traffic flow forecasting model and to optimizing the kernel function parameters efficiently and effectively. The method has been validated using traffic flow data collected from the north-south viaduct expressway in Shanghai. The validation results indicate that the GA-PSO-LSSVM model has good potential for further development and is suitable for short-term traffic flow forecasting.
Further improvement in the accuracy of short-term traffic flow forecasting could be made by considering more influencing factors, such as morning and evening peaks, off-peak periods, adverse weather, and traffic accidents. In addition, it would be of interest to test the model on traffic data collected at different time intervals.

Data Availability
The urban expressway traffic data used to support the findings of this study are included within the article.

Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this paper.
Discrete Dynamics in Nature and Society 9