Short-Term Passenger Flow Prediction With Decomposition in Urban Railway Systems

Accurate prediction of short-term passenger flow is vital for real-time operations control and management. Identifying passenger demand patterns and selecting appropriate methods are promising ways to improve prediction accuracy. This paper proposes a hybrid prediction model with time series decomposition and explores its performance for different types of passenger flows with varied characteristics in urban railway systems. The Seasonal and Trend decomposition using Loess (STL) is used to decompose the passenger flow into seasonal, trend, and residual time series, representing the constant, long-term fluctuating, and stochastic passenger demand patterns. The approximate entropy (ApEn) is utilized to quantify the predictability of each component. After that, the Holt-Winters (HW) method is developed to predict the seasonal and trend components, which both have high predictability. The long short-term memory network (LSTM NN) is proposed to predict the residual, which has low predictability. The outputs from these models are combined to predict the short-term passenger flow. We assess the model performance with entry, exit, and transfer passenger flows at target stations in the Shanghai Metro, which are selected via the agglomerative hierarchical clustering (AHC) algorithm. Compared with three representative models, the results show that 1) the decomposition-based hybrid model performs well for both one-step and multi-step predictions in terms of accuracy and robustness; and 2) ApEn is useful for choosing appropriate prediction models.


I. INTRODUCTION
Short-term demand prediction facilitates advanced online applications in public transport, including predictive operations control, service, and customized information provision to passengers [1].
Numerous short-term prediction models have been proposed in transportation for different applications, such as travel demand [2], traffic flow [3], travel time [4], and crowding [5]. Regardless of the varied application scenarios, short-term prediction in transportation is essentially a time series forecasting problem with exogenous explanatory variables (e.g., weather, land use). In this context, the effectiveness of predictions depends on the match between time series patterns and model assumptions (the capability to capture those patterns). The match (i.e., prediction performance) can be improved from the perspectives of inputs (time series or variable transformations), functions (statistical models and machine learning models), or outputs (correcting or combining results). Depending on the features of the studied data, different prediction methods may need to be developed for the best performance. This paper focuses on station-based short-term passenger flow prediction in urban railway systems.
From the prediction-inputs perspective, many studies developed novel methods by adding, constructing, or transforming input variables or time series data of demand. For example, Pereira et al. [6] combined contextual information about special events with entry/exit flow data to predict public transport arrivals; Ni et al. [7] adopted the rates of social media posts and passenger flow information to estimate passenger demand. Several studies developed novel statistical models to construct or transform the time series of demand data.
For example, Wei and Chen [8] used empirical mode decomposition (EMD) to obtain the intrinsic mode function (IMF) components of passenger flow data and identified the useful IMFs as inputs for a back-propagation neural network (BPN); Sun et al. [9] extracted the sub-sequences of demand series using wavelet transformation and then applied the support vector machine (SVM) to forecast the extracted subsequences, with the final demand predictions reconstructed via the inverse wavelet transformation. Ma et al. [2] constructed weekly, daily, and hourly pattern time series and developed an interactive multiple model-based pattern hybrid (IMMPH) approach to dynamically select the best model (matching the time series pattern) to predict passenger demand. Recently, deep learning-based prediction frameworks have gained wide application. In this area, the stacked autoencoder (SAE) technique is frequently used to extract spatiotemporal features [3], [10]. For example, Lv et al. [3] devised an SAE model to extract spatiotemporal features of traffic flow, and a logistic regression function is utilized to combine the extracted features to predict the traffic demand. Meanwhile, the convolutional neural network (CNN) has been widely applied to capture spatiotemporal dependencies of multiple time series [11], [12]. For instance, Ma et al. [13] proposed a parallel architecture that includes a CNN and bi-directional long short-term memory networks (BLSTM) to extract spatiotemporal features of passenger demand in large-scale metro networks; its performance is better than statistical models (e.g., ARIMA) and deep learning models (e.g., CNN-BLSTM).
The prediction functions include: a) statistical models, such as ARIMA [14], state-space models [15], and Bayesian networks [16]; b) machine learning models, such as SVM [17], Kalman filtering [18], and shallow neural networks [19]; and c) deep learning models. Reviews of prediction methods can be found in [20], [21]. Deep learning models have gained increasing interest for their extraordinary mapping ability on big data. For example, Hao et al. [22] constructed a sequence-to-sequence model embedded with the attention mechanism to predict alighting passengers in a large-scale metro system; Polson and Sokolov [23] developed a deep neural network (DNN) to predict traffic flow under abnormal conditions; Tsai et al. [24] combined the simulated annealing (SA) algorithm and a DNN to predict bus passenger demand; Zhang et al. [25] utilized a spatial-temporal graph inception residual network to predict network-based traffic flow. Deep belief networks (DBN) [26], LSTM NN [27], and radial basis function networks (RBFNN) [28] have also been reported in the literature. Although deep learning is popular for prediction, many studies have provided evidence supporting simple statistical or machine learning models [29], [30]. Developing appropriate functions to match the time series patterns is the key to effective predictions.
From the perspective of prediction outputs, many methods were proposed to combine sub-model outputs by taking advantage of real-time observations. Hybrid models yield better predictions than individual models by selecting the model predictions that best fit the real-time observations [20]. Diao et al. [30] combined the outputs of an efficient tracking model and a novel Gaussian process model to predict real-time passenger demand, and the ensemble prediction improved the accuracy by 20%-50% on average. Ding et al. [31] integrated the outputs of ARIMA and generalized autoregressive conditional heteroskedasticity (GARCH) to estimate the volatility of passenger demand. Also, Gu et al. [32] used a Bayesian method to combine the outputs of three sub-predictors, including the gated recurrent unit neural network (GRUNN), ARIMA, and the radial basis function neural network (RBFNN). Noursalehi et al. [33] combined the predictions of a state-space model and a dynamic factor model, which shows promising performance for abnormal passenger flow prediction.
Time series data exhibit significant characteristics. Therefore, identifying these characteristics (through transformation or decomposition) has been proven to improve forecasting accuracy in economics [34], [35], atmospheric science [35], [36], and transportation [8], [9], [35]. For example, considering the periodicity of time series data, Chen et al. [35] proposed a Periodicity-based Parallel Time Series Prediction (PPTSP) algorithm for large-scale time series prediction. Compressing the data and extracting features accordingly, a multi-layer time series periodic pattern recognition (MTSPPR) algorithm using Fourier Spectrum Analysis (FSA) is developed to identify the potential multi-layer periodicity, and a Periodicity-based Time Series Prediction (PTSP) algorithm is proposed to make predictions. On various time series data (e.g., meteorological, traffic flow, and stock price datasets), results showed that the PPTSP algorithm significantly outperformed other algorithms in prediction accuracy and performance. Similarly, Yang et al. [37] analyzed the influence of periodic components on short-term speed prediction. Using the periodic component of speed data as the model input, the results validated that combining the periodic component with statistical models (e.g., ARIMA) or machine learning models (e.g., SVM) could improve multi-step-ahead prediction accuracy. However, to the best of our knowledge, a similar idea has only been pursued via the EMD [8] and wavelet [9] techniques in short-term metro passenger flow prediction. Unfortunately, the EMD and wavelet decompositions can hardly represent passenger demand patterns in urban railway systems, which makes it difficult for prediction models to match these low-interpretability patterns. Besides, the predictability of passenger demand patterns should be a prerequisite for choosing suitable models, which was neglected in previous studies [8], [9].
Therefore, this paper bridges the knowledge gap by exploring the potential of the decomposition-based prediction idea for urban railway demand and deriving insights on decomposition methods, one- and multi-step predictions, and station/passenger type selection.
In this paper, the Seasonal and Trend decomposition using Loess (STL) [38], the Holt-Winters (HW) method [40], [41], and LSTM NN [44] are combined (i.e., STL-HW-LSTM) to predict station-based short-term metro passenger flow under one- and multi-step-ahead prediction scenarios. Preprocessing the passenger flow sequences with STL, we obtain the seasonal, trend, and residual series [38], which represent the constant, long-term fluctuating, and stochastic passenger demand patterns in the context of short-term metro passenger flow prediction. Then, approximate entropy (ApEn) is employed to quantify the predictability of each subsequence. After that, the HW method, which is effective in predicting time series with high regularity, is employed to predict the components with high predictability (i.e., constant and long-term fluctuating passenger demand). LSTM NN, which shows strong mapping ability in time series prediction, is utilized to predict the component with low predictability (i.e., stochastic passenger demand). Combining the predictions from the HW method and LSTM NN via STL and ApEn, the proposed model provides more reliable and robust predictions under both one- and multi-step-ahead prediction scenarios.
The main contributions of this work are three-fold:
• The STL decomposition-based hybrid model outperforms other representative models for both one-step and multi-step predictions in terms of accuracy and robustness.
• ApEn is adopted to evaluate the predictability of passenger flow time series and lays the basis for choosing suitable prediction models.
• To assess the impact of target stations and passenger flow types on the model prediction performance, we employ the AHC algorithm to select target stations and consider the entry flow, exit flow, and transfer flow.

The rest of the paper is organized as follows. Section 2 describes STL, ApEn, the HW method, LSTM NN, and the proposed model. Section 3 demonstrates the data and settings for the case study. Section 4 presents the experiment results and findings. Section 5 concludes the paper.

II. METHODOLOGY
Fig. 1 shows the proposed hybrid STL-HW-LSTM method for station-based short-term urban rail passenger flow prediction. The inputs are passenger demand time series of stations (e.g., extracted from smart card data), and the outputs are passenger demand predictions for future time intervals (e.g., 15 minutes ahead). The proposed model consists of four stages:
• Stage 1: time series decomposition. STL decomposes the passenger flow time series into seasonal, trend, and residual time series, which represent stable, long-term fluctuating, and stochastic passenger demand, respectively.
• Stage 2: predictability quantification. ApEn is utilized to evaluate the predictability of the stable, long-term fluctuating, and stochastic passenger demand, laying the basis for choosing a suitable model for each subsequence.
• Stage 3: time series predictions. It selects and builds corresponding functions to predict the seasonal, trend, and residual time series. Particularly, HW models are developed to predict the seasonal and trend time series given their high regularity. An LSTM NN model is used to predict the residuals, which show high irregularity and nonlinear correlation structures over time.
• Stage 4: prediction combination. It combines the predictions of the seasonal, trend, and residual models to predict the passenger demand in future steps.
A. SEASONAL AND TREND DECOMPOSITION USING LOESS (STL)
STL was initially introduced by Cleveland et al. [38]. It decomposes a time series by applying a series of smoothing operations based on locally weighted regression (loess). Given a time series X_t, STL decomposes X_t into three additive components: seasonality S_t, trend T_t, and remainder R_t:

X_t = S_t + T_t + R_t

where t represents the time interval (e.g., 7:00-7:15). STL is robust to outliers, which is beneficial in real-world applications where missing and noisy observations are inevitable. In urban rail systems, travel characteristics (e.g., departure and arrival times, origins, and destinations) vary among passengers, which results in both stable and stochastic passenger demand patterns. Therefore, STL, which separates the seasonal and random components effectively, is adopted instead of other decomposition methods. The prediction task is then transformed into the prediction of the different passenger demand patterns.

STL consists of inner and outer recursive procedures. The inner loop extracts the seasonal and trend parts, and the outer loop calculates the residual. Taking the (k+1)-th inner loop as an example, the series S_t^(k+1) and T_t^(k+1) are updated as follows.
Step 1: Detrending. Compute the detrended series X_t^detrend = X_t − T_t^(k).
Step 2: Cycle-subseries smoothing. Calculate the temporary seasonal series C_t^(k+1) by smoothing each cycle-subseries of X_t^detrend using the loess smoother. For a monthly series over a year, the January values form the first cycle-subseries, the February values the second, and so forth.
Steps 3-6: Seasonal and trend extraction. A low-pass filter is applied to C_t^(k+1) to obtain L_t^(k+1); the seasonal component is updated as S_t^(k+1) = C_t^(k+1) − L_t^(k+1); the series is then deseasonalized as X_t − S_t^(k+1); and the trend T_t^(k+1) is obtained by smoothing the deseasonalized series with the loess smoother.
In the outer loop, the residual is calculated as R_t = X_t − S_t − T_t from the original passenger flow time series X_t. A robustness weight is defined for each data point to measure the reliability of the corresponding R_t. The weight is used in the next inner iteration (i.e., the loess functions of Steps 2 and 6) to weaken the effects of outliers. Readers can refer to [38] for the form of the loess function and the selection of other parameters.
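To make the additive decomposition X_t = S_t + T_t + R_t concrete, the following is a minimal Python sketch. It deliberately replaces STL's loess smoothers with a centered moving average and phase-wise means (a real application would use a full STL implementation, e.g., statsmodels' STL); the function name and the odd-period assumption are ours for illustration:

```python
def decompose_additive(x, p):
    """Toy additive decomposition x_t = s_t + t_t + r_t for a series x
    with (odd) seasonal period p. A centered moving average stands in
    for STL's loess trend smoother; seasonal indices are phase-wise means."""
    n = len(x)
    half = p // 2
    # trend: centered moving average (window shrinks at the edges)
    trend = [sum(x[max(0, t - half):t + half + 1]) /
             len(x[max(0, t - half):t + half + 1]) for t in range(n)]
    detrended = [xi - ti for xi, ti in zip(x, trend)]
    # seasonal: mean of the detrended values sharing the same phase of the cycle
    phase_mean = [sum(detrended[i::p]) / len(detrended[i::p])
                  for i in range(p)]
    seasonal = [phase_mean[t % p] for t in range(n)]
    # residual: whatever the seasonal and trend parts do not explain
    residual = [xi - ti - si for xi, ti, si in zip(x, trend, seasonal)]
    return seasonal, trend, residual
```

By construction the three returned series add back to the original, mirroring the additive decomposition above; unlike STL, this sketch has no robustness weighting or inner/outer iterations.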

B. APPROXIMATE ENTROPY (APEN)
Given the passenger demand time series obtained from STL, choosing a suitable prediction model is a key issue. Therefore, we first evaluate the regularity and predictability of each series and then select the appropriate models accordingly. The approximate entropy (ApEn) technique was developed by Pincus [39] to quantify the regularity and predictability of a time series. Exact entropy calculation requires vast amounts of data and is greatly influenced by system noise [39], so it is not practical for experimental data. ApEn handles these limitations by modifying an exact regularity statistic. The steps are:
Step 1: For a time series X_t = (x_1, x_2, ..., x_N) of length N, fix an integer m and a positive real number r, which represent the length of compared runs of X_t and a filtering level, respectively.
Step 2: Form a sequence of vectors U_1, U_2, ..., U_{N−m+1} in the m-dimensional space, where U_i = (x_i, x_{i+1}, ..., x_{i+m−1}).
Step 3: For each i (1 ≤ i ≤ N − m + 1), use the sequence U_1, U_2, ..., U_{N−m+1} to construct

C_i^m(r) = (number of j such that d[U_i, U_j] ≤ r) / (N − m + 1)

where the distance d[U_i, U_j] = max_a |x_{i+a−1} − x_{j+a−1}| is taken over the m scalar components of the vectors, and define

Φ^m(r) = (N − m + 1)^(−1) Σ_{i=1}^{N−m+1} ln C_i^m(r).

Step 4: Define ApEn as

ApEn(m, r, N) = Φ^m(r) − Φ^{m+1}(r).

ApEn reflects the likelihood that similar patterns of observations will not be followed by additional similar observations. A time series containing many repetitive patterns has a relatively small ApEn and hence higher predictability.
The ApEn provides a way to quantify the predictability of the decomposed time series for selecting appropriate prediction models, especially when the predictability of the decomposed time series is ambiguous (e.g., for EMD and wavelet decompositions).

C. HOLT-WINTERS (HW)
Passenger flow time series exhibit strong seasonal patterns. The HW method, proposed by Holt [40] and Winters [41], is adopted to forecast the seasonal and trend components. The approach belongs to a class of exponential smoothing techniques that predict the time series X_t by decomposing each observation x_t ∈ X_t into level l_t, trend b_t, and seasonal s_t terms. The HW method has solid theoretical merit regarding prediction accuracy [43].

With seasonal period p, the additive HW is defined as:

l_t = α(x_t − s_{t−p}) + (1 − α)(l_{t−1} + b_{t−1})
b_t = β(l_t − l_{t−1}) + (1 − β)b_{t−1}
s_t = γ(x_t − l_{t−1} − b_{t−1}) + (1 − γ)s_{t−p}
x̂_{t+h} = l_t + h·b_t + s_{t+h−p}

The multiplicative HW is given by:

l_t = α(x_t / s_{t−p}) + (1 − α)(l_{t−1} + b_{t−1})
b_t = β(l_t − l_{t−1}) + (1 − β)b_{t−1}
s_t = γ(x_t / (l_{t−1} + b_{t−1})) + (1 − γ)s_{t−p}
x̂_{t+h} = (l_t + h·b_t)·s_{t+h−p}

where h represents the forecasting horizon (i.e., the forecasting step, with h ≤ p in the forms above), and α, β, and γ are the smoothing parameters.
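The additive recursions can be sketched directly in Python. This is a minimal illustration with a simple first-two-cycles initialization, not the maximum-likelihood fitting used in the experiments; the function name is ours:

```python
def hw_additive_forecast(x, p, alpha, beta, gamma, h):
    """Additive Holt-Winters over series x with seasonal period p.
    Returns the h-step-ahead forecasts (intended for h <= p).
    Level/trend/season are initialized naively from the first cycles."""
    level = sum(x[:p]) / p
    trend = (sum(x[p:2 * p]) - sum(x[:p])) / (p * p)
    season = [x[i] - level for i in range(p)]

    for t in range(p, len(x)):
        prev_level, prev_trend = level, trend
        # the three smoothing recursions for l_t, b_t, s_t
        level = alpha * (x[t] - season[t % p]) + (1 - alpha) * (prev_level + prev_trend)
        trend = beta * (level - prev_level) + (1 - beta) * prev_trend
        season[t % p] = (gamma * (x[t] - prev_level - prev_trend)
                         + (1 - gamma) * season[t % p])

    n = len(x)
    # forecast: level plus extrapolated trend plus the matching seasonal index
    return [level + (k + 1) * trend + season[(n + k) % p] for k in range(h)]
```

On a purely seasonal series the recursions settle to a fixed point and the forecasts simply replay the seasonal cycle, which is the behavior exploited for the high-regularity components.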

D. LONG SHORT-TERM MEMORY NETWORKS (LSTM NN)
LSTM NN solves the vanishing gradient problem that occurs in traditional recurrent neural networks (RNN) [44]. Fig. 2 shows the basic structure of the LSTM NN. A standard LSTM NN is composed of one input layer, one recurrent hidden layer, and one output layer. The core of the hidden layer is a memory block, consisting of a memory cell and input, output, and forget gates. The cell transports values over arbitrary time intervals and memorizes the temporal state. The gates serve purposes similar to the ''conventional'' artificial neurons in feedforward NNs. Through the multiplicative gates, the memory cells can store and access information over long time periods. Given the historical time series x = (x_1, x_2, ..., x_T), the target time series y = (y_1, y_2, ..., y_T), and the hidden output h = (h_1, h_2, ..., h_T), the prediction process is iteratively carried out by equations (14)-(20):

i_t = σ(W_xi x_t + W_hi h_{t−1} + W_ci c_{t−1} + b_i) (14)
f_t = σ(W_xf x_t + W_hf h_{t−1} + W_cf c_{t−1} + b_f) (15)
c_t = f_t ⊙ c_{t−1} + i_t ⊙ tanh(W_xc x_t + W_hc h_{t−1} + b_c) (16)
o_t = σ(W_xo x_t + W_ho h_{t−1} + W_co c_t + b_o) (17)
h_t = o_t ⊙ tanh(c_t) (18)
y_t = W_hy h_t + b_y (19)

where T is the prediction period, t is the time interval, W is a weight matrix (e.g., W_xi is the weight matrix between input x and input gate i), and b is a bias vector (e.g., b_i is the bias vector of the input gate). i, f, o, and c represent the input gate, forget gate, output gate, and cell activation vectors, respectively. ⊙ denotes element-wise multiplication, and σ(·) is the standard logistic sigmoid function:

σ(z) = 1 / (1 + e^{−z}) (20)
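As a didactic illustration of the gate equations (14)-(18), a single-unit (scalar) forward step with peephole connections can be written directly; this is a sketch of the recurrences, not the trained networks used in the paper, and the weight-naming convention is ours:

```python
from math import exp, tanh

def sigmoid(z):
    """Standard logistic sigmoid, Eq. (20)."""
    return 1.0 / (1.0 + exp(-z))

def lstm_step(x, h_prev, c_prev, w):
    """One forward step of a scalar LSTM cell with peephole connections.
    w maps names like 'xi' to the scalar playing the role of W_xi,
    and 'bi' to the bias b_i."""
    i = sigmoid(w['xi'] * x + w['hi'] * h_prev + w['ci'] * c_prev + w['bi'])
    f = sigmoid(w['xf'] * x + w['hf'] * h_prev + w['cf'] * c_prev + w['bf'])
    c = f * c_prev + i * tanh(w['xc'] * x + w['hc'] * h_prev + w['bc'])
    o = sigmoid(w['xo'] * x + w['ho'] * h_prev + w['co'] * c + w['bo'])
    h = o * tanh(c)
    return h, c
```

Iterating `lstm_step` over a sequence reproduces the recurrence in Eqs. (14)-(18); in practice the scalars become matrices and the weights are learned by backpropagation through time.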

III. EXPERIMENT SETUP
A. DATA
We use AFC data from the Shanghai Metro covering all stations over three months, from July 1 to September 30, 2016. Line 1 (Fig. 3) is one of the busiest lines in Shanghai and is selected for the case study. Table 1 lists the AFC data fields used in this paper. As suggested in [45], 15 minutes is a suitable threshold for short-term metro passenger flow prediction; therefore, we aggregate the station entry/exit flows into 15-minute intervals. The prediction period is the common operation time 6:00-23:00, which yields 68 data points per day. The data from the last week are used for testing, and the remainder for training. The mean absolute percentage error (MAPE) and root mean squared error (RMSE) are used as performance metrics: the former measures mean prediction accuracy, while the latter reflects forecasting stability.

The distribution of metro station entry/exit flows over time may vary among stations. To evaluate the model performance, we employ the AHC algorithm to cluster stations based on entry/exit demand patterns. Fig. 4 shows an example of the clustering results for Metro Line 1; detailed information on the AHC algorithm can be found in [33]. Without loss of generality, entry flows at Xin-Zhuang (XZ), exit flows at Cao-Bao-Lu (CBL), and transfer flows at Ren-Min-Guang-Chang (RMGC) (from Line 1 to Line 2 at Lu-Jia-Zui) are used to validate the prediction performance of the proposed model and state-of-the-art models, including STL-HW [42], ARIMA-BPNN [46], and a naive decomposition (ND) based LSTM NN model.
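The two metrics can be computed as follows (a minimal Python sketch; the function names are ours):

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    return 100.0 / len(actual) * sum(
        abs((a - p) / a) for a, p in zip(actual, predicted))

def rmse(actual, predicted):
    """Root mean squared error, in the units of the flow itself."""
    return (sum((a - p) ** 2
                for a, p in zip(actual, predicted)) / len(actual)) ** 0.5
```

Note that MAPE is undefined when an actual flow is zero, which is one reason the analysis is restricted to the 6:00-23:00 operation period where flows are positive.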
The multi-step-ahead prediction is implemented as a step-by-step prediction up to four steps ahead (i.e., 15, 30, 45, and 60 min); that is, the predicted flow at the current time step is used to predict the flow at the next time step.

B. MODEL CONFIGURATION
1) STL-HW-LSTM
The STL parameters are selected following [38]. After exploring various combinations, the following STL parameters are used: n_p = 476, n_i = 2, n_o = 10, n_l = 476, n_s = 477, and n_t = 717.
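The step-by-step multi-step scheme can be sketched as follows (a minimal Python illustration; `predict_next` stands in for any fitted one-step model, and the four-observation window is an assumption made here to match the input setting of the models):

```python
def recursive_forecast(history, predict_next, steps=4):
    """Step-by-step multi-step prediction: each one-step prediction is
    fed back as an input for the following step. predict_next maps the
    last four observations to the next value."""
    window = list(history[-4:])
    forecasts = []
    for _ in range(steps):
        nxt = predict_next(window)
        forecasts.append(nxt)
        window = window[1:] + [nxt]  # slide the window over the prediction
    return forecasts
```

Because each step consumes the previous prediction rather than an observation, errors can compound with the horizon, which is why the multi-step results below emphasize robustness as well as accuracy.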
The length of the compared run of the time series X_t and the filtering level (i.e., m and r) are prerequisites for applying ApEn. Yentes et al. [47] conducted a comprehensive study on the appropriate use of ApEn with short data: when the length of a dataset is greater than 200 (the case in this paper), m = 2 is suitable and several r values should be examined. Testing the empirical values of r (i.e., 0.1, 0.2, and 0.3), we verified that 0.2 was optimal in our experiments.
The additive HW is used to model the seasonal time series. We use maximum likelihood estimation [47] to calculate the smoothing parameters for level (α), trend (β), and seasonality (γ). Table 2 shows the resulting parameters.
In the STL-HW model, the HW method is also built to predict the residuals, while in STL-HW-LSTM the LSTM NN predicts them. Following [46], [49], [50], the time series over four adjacent time intervals is used to predict the passenger demand in the next time slot; hence, the model input is (x_{t−4}, x_{t−3}, x_{t−2}, x_{t−1}), and the output is x_t.
We employ grid search with cross-validation to select the optimal parameters of the LSTM NN. The ratio of the validation dataset is set to 0.2. The number of folds is usually set empirically between 3 and 10 given the massive computation; we set k = 5. The hidden layer size ranges from 1 to 6 [3], [10]; the number of hidden units l_N ∈ {8, 12, 16, 20, 24}; the activation function of each layer is ''tanh''; the batch size l_S ∈ {1, 2, 3, 4}; the learning rate ∈ {0.01, 0.05, 0.001, 0.005, 0.0001}; and a dense layer is appended as the output layer. Table 3 shows the resulting optimal parameter values for the LSTM NN.
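The selection procedure can be sketched schematically as follows. This is an illustration only: `score` stands in for training an LSTM NN and returning its validation error, and the contiguous folds are a simplification of cross-validation for time series data:

```python
from itertools import product

def grid_search_cv(series, score, grid, k=5):
    """Schematic grid search with k-fold validation: every parameter
    combination is scored on k contiguous folds, and the combination
    with the lowest mean validation error wins.
    score(train, val, params) is a stand-in for model fitting."""
    fold = len(series) // k
    best_params, best_err = None, float('inf')
    for values in product(*grid.values()):
        params = dict(zip(grid.keys(), values))
        errs = [score(series[:i * fold] + series[(i + 1) * fold:],
                      series[i * fold:(i + 1) * fold], params)
                for i in range(k)]
        mean_err = sum(errs) / k
        if mean_err < best_err:
            best_err, best_params = mean_err, params
    return best_params, best_err
```

In a production setting one would use forward-chaining (expanding-window) splits rather than plain k-fold splits, so that validation data never precede the training data in time.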

2) ARIMA-BPNN
In ARIMA-BPNN [46], we use an ARIMA model to extract the linear part of the passenger flow series and build a back-propagation neural network to fit the residuals. Modelling ARIMA consists of three steps: sequence identification, parameter estimation, and model verification [51]. Table 4 shows the estimated parameters of the ARIMA models. For the BPNN, the numbers of neurons in the input, hidden, and output layers are 4, 4, and 1, respectively; this 4×4×1 structure is also used in [46], [49], [50].

3) ND-LSTM
The naive decomposition derives the residual by subtracting the average passenger flow from the observed flow for the same time period of a day. The residual passenger flow is:

r_t = x_t − x̄_t

where x̄_t is the average passenger flow at time period t over the same weekdays or weekends; x̄_t is constant in the studied dataset. The LSTM NN is used to predict the residual time series. The optimal parameters of the LSTM NN are determined as in STL-HW-LSTM and are the same as those shown in Table 3.
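The naive-decomposition residual can be sketched as follows (a minimal Python illustration; the data layout, a list of equal-length per-day series, is an assumption made for the example):

```python
def nd_residuals(daily_flows, t):
    """Naive-decomposition residuals at time-of-day slot t: observed
    flow minus the average flow in that slot across comparable days
    (same weekdays or weekends). daily_flows is a list of equal-length
    per-day series."""
    avg = sum(day[t] for day in daily_flows) / len(daily_flows)
    return [day[t] - avg for day in daily_flows]
```

Unlike STL, this subtraction removes only a constant time-of-day profile, so any day-to-day trend remains mixed into the residual series handed to the LSTM NN.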

IV. RESULTS AND DISCUSSION
A. ENTRY FLOW PREDICTION AT XZ STATION
Fig. 6 shows the STL decomposition of the entry flow at XZ station from September 23 to 30, 2016. The seasonal component is a smoothed result of the passenger flow time series; thus, the outliers are removed from the raw data. The trend passenger demand is almost stable, and the stochastic passenger demand exhibits weak regularity. Table 5 shows the ApEn value of each passenger flow time series. The ApEn value of the trend component is the smallest, and thus its predictability is the highest. Therefore, it is reasonable to use the HW method to predict the seasonal and trend components and the LSTM NN to predict the residual component.

Fig. 7 shows the multi-step-ahead (i.e., h) prediction deviations of entry flows. Table 6 shows that the STL-HW-LSTM, STL-HW, and ND-LSTM models significantly outperform the ARIMA-BPNN model, especially for multi-step predictions. The STL-HW-LSTM model performs best in all scenarios in both MAPE and RMSE; its MAPE is less than 8% for up to four-step-ahead predictions.

B. EXIT FLOW PREDICTION AT CBL STATION
Table 8 shows that the STL-HW-LSTM model outperforms the other models in multi-step predictions. The MAPEs of all models in Table 8 are higher than those in Table 6, which is attributed to the diversification of land use (e.g., commercial, residential, and medical uses) around CBL station.

C. TRANSFER PASSENGER FLOW PREDICTION AT RMGC STATION
We estimate the transfer passenger flow under the assumption that passengers prefer the shortest path. Fig. 9 shows the prediction results of the transfer passenger flow, and Table 9 presents the ApEn of each transfer passenger demand pattern at RMGC station. Table 10 shows that the STL-HW-LSTM consistently outperforms the other models (decreasing the MAPE by more than 2%). Notably, the prediction performance of the STL-HW-LSTM is stable (the variations of MAPE and RMSE are small) and evidently better than the other models in multi-step predictions. All models' performance in predicting transfer flows degrades significantly compared with entry/exit flows, because the predictability of transfer flow is reduced by operational factors.

D. DISCUSSION
The case of transfer passenger flow prediction is analyzed because it best illustrates the predictive ability of the proposed model. Fig. 10 shows the prediction of each component, and Table 11 summarizes the corresponding forecasting errors. With the HW method, the MAPE and RMSE values of the seasonal and trend components in Table 11 are small. However, the method cannot adequately predict the residuals with high noise, and the corresponding MAPE is at least 47.57%. In contrast, the LSTM NN reduces the MAPE of the residuals by at least 2.47% compared with the HW method. Therefore, it is safe to conclude that the proposed model outperforms the STL-HW.
For the ND-LSTM, simply averaging the passenger flow time series cannot separate the trend and residual components, so the LSTM NN cannot accurately model the trend component. Hence, the proposed model also performs better than the ND-LSTM. Note that most ND-based residuals in Fig. 10 lie between 0 and 10, so a minor prediction error can produce a very large MAPE (Table 11).
Also, ARIMA cannot adequately capture the linear part of the passenger flow sequence, so the corresponding residuals contain both linear and nonlinear components, which negatively influences the BPNN performance. These reasons partly explain why ARIMA-BPNN performs the worst among the four models.
Additionally, the forecasting errors of the seasonal and trend components barely change as the forecasting step increases, because the seasonal and trend passenger demand remains constant over a short time span (e.g., three months). Therefore, for the STL-HW-based models (i.e., STL-HW-LSTM and STL-HW), we only need to focus on the stochastic passenger demand prediction, and the prediction accuracy can thus be significantly improved (Tables 6, 8, and 10). The experimental results strongly support that STL is promising for multi-step predictions by providing the regular and random passenger demands separately to the forecasting models. By calculating the predictability of each decomposed passenger flow time series, we show that ApEn is an efficient tool for selecting suitable prediction models. Combining the prediction results of HW and LSTM NN merges the predictive advantages of the two models. Therefore, the proposed model has great potential to provide more robust and reliable forecasts.

V. CONCLUSION
This paper proposes an STL-based short-term metro passenger flow prediction model. It decomposes the original passenger flow into constant, long-term fluctuating, and stochastic time series. We use ApEn to evaluate the predictability of each time series, which lays the basis for choosing suitable prediction models. We build HW models to predict the constant and long-term fluctuating time series (high regularity) and develop an LSTM NN model to predict the stochastic time series (irregularity).
Case studies on a heavily used urban railway system validated the superior accuracy of the proposed STL-HW-LSTM model in predicting entry/exit/transfer flows, compared with STL-HW [42], ARIMA-BPNN [46], and a (derivation) decomposition-based LSTM NN. Also, the proposed STL-HW-LSTM is significantly more robust (by about 2% in MAPE) in multi-step prediction than the peer models.
YANGYANG ZHAO received the B.E. and M.E. degrees in transportation engineering from Chang'an University, Xi'an, Shaanxi, China, in 2014 and 2017, respectively. He is currently pursuing the Ph.D. degree in transportation planning and management with the School of Transportation and Logistics, Southwest Jiaotong University, Chengdu, Sichuan, China. His research interests include metro operation and management, machine learning, and data mining.

ZHENLIANG (MIKE) MA is currently an Assistant Professor with the Institute of Transport Studies, Monash University, Australia. He focuses on inference, prediction, and design through the integration of novel data sources into mathematical learning models. His applications include public transport and shared mobility-on-demand services. His research interests include the intersection of optimization, machine learning, and simulation.

YI YANG received the B.E. degree in transportation and the M.E. degree in traffic and transportation engineering from Central South University, Changsha, China, in 2014 and 2017, respectively. He is currently pursuing the Ph.D. degree in transportation planning and management with the School of Transportation and Logistics, Southwest Jiaotong University, Chengdu, Sichuan, China. His research interests include public transportation and urban transit system operation and management.
WENHUA JIANG received the B.E. degree in traffic engineering from the Suzhou University of Science and Technology, Suzhou, China, and the M.E. degree in transportation engineering from Tongji University, Shanghai, China. She is currently pursuing the Ph.D. degree with Monash University, Melbourne, VIC, Australia. Her research interests include spatio-temporal data processing, machine learning, and urban transit system operation.
XINGUO JIANG received the B.E. and M.E. degrees in transportation engineering from Tongji University, Shanghai, China, and the Ph.D. degree in transportation engineering from Michigan State University. He was with International Traffic Consulting Firm and City of Fort Myers, Florida, as a Chief Traffic Engineer. He is currently a Professor with the School of Transportation and Logistics, Southwest Jiaotong University, Chengdu, Sichuan, China. His research interests include traffic planning and management, public transportation, and traffic safety. He received the Honorable Title of Sichuan 100 Plan, in 2012.