A Deep Learning Model with Conv-LSTM Networks for Subway Passenger Congestion Delay Prediction



Introduction
With the rapid development of the national economy and the continuous rise in urbanization, both the number of passenger trips and the number of construction projects in urban rail transit are increasing rapidly. By the end of 2019, 40 cities in mainland China had opened urban rail transit, with annual ridership reaching 23.71 billion trips. Moreover, 65 cities have had their urban rail transit plans approved, so China's urban rail transit is in a period of rapid development and construction.
Although the subway offers large passenger capacity, an imbalance between traffic supply and demand often occurs in peak periods [1]. Because the carrying capacity of subway carriages and platforms is limited, some passengers have to wait for the next train or even later ones. This phenomenon of passengers being left behind on the platform prolongs their waiting time and delays their overall travel time. In this case, it is very important for passengers and operation departments to accurately grasp the internal operating status of the subway network. Subway congestion usually refers to crowding in carriages. When passengers cannot board a crowded subway car, comfort decreases and travel time increases [2]. The part of the increase in travel time that is due to congestion is called congestion delay, which can be used as an indicator to reflect the current state of passenger flow at a station in real time. The realization of this idea benefits from the large-scale application of automatic fare collection (AFC) system data.
The AFC data record the card number, time, and location of each passenger trip. Therefore, it is possible to compute the travel time and delays of passengers between origin-destination (OD) pairs in the subway system in real time through big data mining technology [3]. This paper studies the travel time delay of subway passengers and judges the congestion state of a station by analyzing the average travel time delay of passengers waiting at the station. To allow subway operators and passengers to effectively grasp the future operating status of stations, we use advanced deep learning methods to predict station congestion. The main contributions of this paper are as follows: (1) Based on the calculation of passenger travel time using AFC data, we use the idea of control variables to eliminate interference factors and use the difference between the real travel time in the peak period and the normal travel time in the off-peak period to evaluate passenger congestion delay. (2) The congestion delay of subway passenger flow in the whole network is represented by an image and a time series. The image contains the spatial propagation of congestion delay between adjacent stations, and the time series contains the temporal dependence of subway station congestion delay. (3) We extend the traditional fully connected long short-term memory (FC-LSTM) network idea to the convolutional long short-term memory (Conv-LSTM) network, which has a convolution structure in both input-to-state and state-to-state transitions and can effectively capture the spatiotemporal correlations of congestion delay. (4) The congestion delay of the whole Chongqing Metro network is calculated and predicted, and the effectiveness of the method is verified with operation data. This differs from traditional passenger flow forecasting research, which is often limited to station-level or route-level forecasting.
The rest of this paper is organized as follows. Section 2 reviews past works and existing methods in the fields of congestion delay calculation and forecasting. Section 3 introduces how to use AFC data for congestion delay calculation. The Conv-LSTM structure used in this paper to predict congestion delay is described in Section 4. Section 5 analyzes the distribution of congestion delays in the Chongqing Rail Transit network. Section 6 briefly summarizes the work of this paper and puts forward the outlook.

Related Work
In the research field of the subway congestion delay problem, there is no complete and effective calculation and prediction method. However, in recent years, big data processing technology and artificial intelligence have developed rapidly, which provides us with new ideas and methods to study the subway congestion delay problem.
In the research field of subway passenger congestion delay, the existing literature mainly focuses on the evaluation and optimization of passenger travel time or waiting time. As early as 2009, Vansteenwegen [4] proposed a linear programming method to optimize the Belgian railway's train timetable and found that the general waiting cost could be reduced by 40%. With the extensive use of the AFC system and the continuous development of big data technology in recent years, researchers began to use AFC data to study the waiting time and congestion problems of passengers and achieved many results. Yong-Sheng and En-Jian [5] used a new estimation model based on the Bayesian inference formula to evaluate the travel time distribution of subway passengers and, using AFC data, showed that the walking, waiting, transfer, and in-vehicle travel times of subway passengers follow a truncated normal distribution. Ingvardson et al. [6] proposed a mixed distribution composed of a uniform distribution and a beta distribution to estimate the waiting time of passengers. Then, smart card data were used to verify that this method can improve the estimation of waiting time in the public transport model. Some scholars use mathematical programming to optimize train headway and use AFC data to verify the effectiveness of their models. Liu et al. [7] optimized the departure interval of a subway transfer station by combining simulated annealing and parallel computing and used AFC data to verify that the model can effectively reduce the waiting time of passengers. Yin et al. [8] proposed an integrated approach for the train scheduling problem on a bidirectional urban subway line to minimize the operational costs and passenger waiting time. The effectiveness of the method was verified with operation data of the Beijing Subway. Luo et al. [9] proposed a hybrid method, which combines the static traffic assignment model with the agent-based dynamic traffic simulation model to estimate the frequent congestion in the subway system.
In recent years, machine learning has made great progress in various practical applications [10], and deep learning methods have achieved some success in predicting passenger flow and traffic congestion. Yang et al. [11] proposed an improved long-term feature model based on the long short-term memory neural network (ELF-LSTM). It makes full use of the advantages of the long short-term memory (LSTM) neural network in processing time series and overcomes the limitation that LSTM cannot fully learn long-term temporal dependence because of time lag. Huaizhong et al. [12] used deep learning to predict the passenger flow of a single subway station by considering weather, holidays, ground transportation, and other factors. Wang et al. [13] proposed a deep learning method with an error-feedback recurrent convolutional neural network (eRCNN) structure for continuous traffic speed prediction. They took the Beijing ring road as an example to demonstrate the feasibility of the model in identifying congestion sources. Chen et al. [14] proposed a hybrid algorithm that combines the additive mode of seasonal-trend decomposition based on loess with the LSTM neural network (STL-LSTM) to mitigate the influence of irregular fluctuations and improve the performance of short-term subway ridership prediction. Ai et al. [15] used Conv-LSTM to solve the problem of airport delay prediction in a network structure and verified the effectiveness of the model. Zheng et al. [16] developed an attention-based Conv-LSTM module to extract spatial and short-term temporal features, which can efficiently capture the complex nonlinearity of traffic flow. Sudatta et al. [17] defined the vehicle congestion fraction in a block and used the LSTM neural network to predict the congestion of the street network. Moreover, some scholars use parallel structure models to predict congestion. Ma et al. [18] proposed a parallel structure composed of a convolutional neural network (CNN) and a bidirectional long short-term memory network (BLSTM) to predict subway passenger flow. However, the calculation and analysis process of the parallel architecture is complex. Lin et al. [19] took the event detection system as the research object and proposed an event detection framework based on generative adversarial networks (GANs) to solve the problem of insufficient event samples. Li et al. [20] expanded the sample size and balanced the datasets by using a generative adversarial network (GAN) and then extracted the temporal and spatial correlations of traffic flow and detected incidents by using a temporal and spatially stacked autoencoder (TSSAE). In short-term passenger demand forecasting, Ke et al. [21] proposed the fusion convolutional long short-term memory network (FCL-Net) to address spatial dependencies, temporal dependencies, and exogenous dependencies within one end-to-end learning architecture.
In conclusion, the existing results show that the inbound passenger flow at subway stations is generally influenced by the travel habits of passengers and the weather, so good prediction results can be obtained by using LSTM and its improved variants [22]. However, we found that the congestion of a subway station is related not only to the passenger flow in and out of the station but also to the congestion of the adjacent stations. Specifically, after passengers are stranded on the platform of one station because the carriages are full, passengers will still be stranded at the next station unless a large number of passengers disembark there and free up room in the carriages. This forces us to extract spatial features effectively while also considering time series data. Therefore, we make a fundamental adjustment to the traditional LSTM approach and adopt Conv-LSTM to extract the spatial and temporal characteristics of passenger flow congestion, so as to predict station passenger flow congestion with better accuracy.

Congestion Delay Calculation
Passenger flow congestion means that the movement of passengers is limited by other passengers and the state of the environment, which increases travel costs (travel time, physical exertion). Passenger congestion in a station arises when the limited space (station space, residual train capacity) and equipment capacity cannot meet the needs of passengers, so that congestion gradually forms. Compared with other periods, the passenger volume in the peak period is significantly higher, and a large number of passengers gather in a local space within a short time, which easily leads to passenger congestion. Without early warning and effective management, this brings security risks to subway operation. However, there are many reasons for passenger delays. For example, passenger flow congestion, train delays, signal failures, and other objective factors all increase passenger travel time. When the passenger travel time exceeds a certain threshold, the trip differs from the usual one, and a delay is likely. Because of passenger flow congestion during the peak period, passengers are hindered by objective factors such as other passengers or control measures, resulting in extra time loss during travel. The delay is therefore expressed as the difference between the real travel time and the normal travel time.
Among these, the increase in travel time caused by passenger congestion is called congestion delay, which is the main research object of this paper. Congestion delay is mainly composed of walking delay and waiting delay. (1) The main reasons for walking delay include slow movement caused by passenger flow congestion, queuing caused by equipment capacity limitations, and increased walking distance caused by adjustments to passenger flow organization in the station. (2) The main reason for waiting delay is that passengers cannot board the train in time because of the high load factor. Therefore, this paper adopts the idea of control variables, selects specific dates to eliminate the interference of other factors (train delays and signal failures), and focuses on the impact of passenger flow congestion on travel time delay.
For passengers who need to transfer during their trip, AFC data reveal only where they enter and leave the system and cannot determine where they transfer. Moreover, when the travel time of such passengers increases because of passenger congestion, we cannot judge whether the increase occurs at the origin station or at the transfer station. Therefore, when we evaluate the degree of station congestion, we take nontransfer passengers as the research object. If these passengers experience a congestion delay, it can be inferred that the other passengers entering the station during the same period face the same congestion.

Off-Peak Period for Normal Travel Time.
We assume that, during the off-peak period, passengers will not be left behind on the platform because of congestion in the carriages or on the platforms. In this case, the waiting time of passengers approximately follows the uniform distribution $U(0, H_{\text{off}})$, where $H_{\text{off}}$ is the departure interval in the off-peak period and therefore also the maximum waiting time of passengers.
Since the passengers' walking speed is approximately normally distributed [5], we assume that the inbound walking time $t_{\text{enter}}$ and the outbound walking time $t_{\text{exit}}$ of passengers obey the normal distribution $N(\mu, \sigma^2)$.
Some researchers consider that, for the same station, the path for entering and leaving the platform is the same, so they set the walking times to enter and leave the station to the same value. However, our investigation found that some stations have different routes for passengers entering and leaving the platform, that passengers move in different directions on the stairs when entering and leaving the platform, and that the capacity of the stairs also differs. Therefore, we calculate and analyze the walking times to enter and leave the platform separately. For station p, the walking times of passengers entering and leaving the platform are denoted by $t^{p}_{\text{enter}}$ and $t^{p}_{\text{exit}}$.

Journal of Advanced Transportation 3
Taking the nontransfer route (p∼q) as the research object, the overall travel time can be given by

$$t^{pq} = t^{p}_{\text{enter}} + t^{p}_{\text{wait}} + t^{pq}_{\text{vehicle}} + t^{q}_{\text{exit}}, \quad (1)$$

where $t^{pq}$ denotes the overall travel time of passengers from station p to station q, $t^{p}_{\text{wait}}$ denotes the waiting time of passengers at station p, and $t^{pq}_{\text{vehicle}}$ denotes the total on-board time of passengers from station p to station q. When the train runs on time according to the timetable, $t^{pq}_{\text{vehicle}}$ is a fixed value.
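According to the independence of each travel time element, and under the distributional assumptions above (uniform waiting time, normal walking times, fixed on-board time), the mean and variance of travel time can be given by

$$E(t^{pq}) = \mu^{p}_{\text{enter}} + \frac{H_{\text{off}}}{2} + t^{pq}_{\text{vehicle}} + \mu^{q}_{\text{exit}}, \quad (2)$$

$$D(t^{pq}) = \left(\sigma^{p}_{\text{enter}}\right)^{2} + \frac{H_{\text{off}}^{2}}{12} + \left(\sigma^{q}_{\text{exit}}\right)^{2}, \quad (3)$$

where $E(t^{pq})$ and $D(t^{pq})$ denote the mean and variance of the travel time of the route (p∼q), and $\mu$ and $\sigma$ denote the mean and standard deviation of the corresponding walking time.
The time range can be divided into k periods $x_1, x_2, \ldots, x_k$. For passengers entering the station in period $x_m$, the overall travel time is given by equation (4). For line l with n stations, the average walking times $\mu_{\text{enter}}$ and $\mu_{\text{exit}}$ of each station during the off-peak period can be obtained by introducing equation (2) into equation (4). For the same station at the same time point, the waiting situation of passengers differs between the up and down directions. As shown in Figure 1, this paper calculates the platform congestion delay time in the up and down directions of each station separately. For line l with n stations, the number of platforms is 2n.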
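As a numerical illustration of the off-peak decomposition, the following minimal sketch combines a waiting time that is uniform on [0, H] with independent, normally distributed walking times and a fixed on-board time. The function name and the sample values (in seconds) are hypothetical and are not taken from the Chongqing data.

```python
def off_peak_travel_time_stats(mu_enter, sigma_enter, mu_exit, sigma_exit,
                               headway, vehicle_time):
    """Mean and variance of the off-peak travel time on a nontransfer route.

    Assumes independent components: normal walking times, a waiting time
    uniform on [0, headway], and a fixed on-board time.
    """
    # A uniform U(0, H) variable has mean H/2 and variance H^2/12.
    wait_mean = headway / 2.0
    wait_var = headway ** 2 / 12.0
    # Means add; variances add because the components are independent.
    mean = mu_enter + wait_mean + vehicle_time + mu_exit
    var = sigma_enter ** 2 + wait_var + sigma_exit ** 2
    return mean, var


# Hypothetical values: 60 s to walk in, 50 s to walk out,
# a 300 s off-peak headway, and 600 s on board.
mean, var = off_peak_travel_time_stats(60.0, 10.0, 50.0, 8.0, 300.0, 600.0)
print(mean, var)
```

The fixed on-board time contributes to the mean but not to the variance, which is why only the walking and waiting terms appear in `var`.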

Peak Period for Congestion
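Taking the nontransfer route (p∼q) as the research object, the overall travel time can be given by

$$t^{pq} = t^{p}_{\text{enter}} + t^{p}_{\text{wait}} + t^{pq}_{\text{vehicle}} + t^{q}_{\text{exit}} + \Delta t^{p}, \quad (5)$$

where $\Delta t^{p}$ denotes the congestion delay time of passengers in the up or down direction of station p.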
According to the independence of each travel time element, the mean and variance of the travel time can again be derived term by term. Using the walking times $\mu^{p}_{\text{enter}}$ and $\mu^{q}_{\text{exit}}$ obtained in the previous section, the congestion delay time of passengers at station p can then be obtained from equation (6).
The time range of the station congestion study is divided into k periods $x_1, x_2, \ldots, x_k$. For passengers entering the station in period $x_m$, the total delay time is denoted by $\Delta t^{x_m}_{p}$. If passengers arrive at the platform uniformly, the average waiting time equals half of the departure interval. In other words, even if there is no congestion at the platform, half of the passengers wait longer than half of the departure interval. Therefore, for a specific passenger, $\Delta t^{x_m}_{p} > 0$ does not by itself establish that the passenger has suffered a congestion delay. To avoid miscounting such passengers, the full departure interval is used as the maximum waiting time of a passenger without congestion delay, and a passenger whose delay exceeds the full departure interval is deemed to have been delayed. The number of delayed passengers is obtained by counting the passengers who satisfy this condition, where a denotes the index of a passenger, Pt denotes the set of peak periods of rail transit, and $A_{x_m,pq}$ denotes the set of all passengers traveling along the route (p∼q) during period $x_m$.

For station p, the proportion of passengers with congestion delay in period $x_m$ is the ratio of the number of delayed passengers to the number of all passengers departing from station p in that period.
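The counting rule for delayed passengers can be sketched as follows. This is a minimal illustration and not the paper's exact formula: it assumes that a passenger's delay is the observed travel time minus a normal (off-peak mean) travel time for the same route, and that the passenger counts as delayed when this excess is larger than one full departure interval. The function name and the sample values are hypothetical.

```python
def congestion_delay_rate(observed_times, normal_time, headway):
    """Share of passengers whose extra travel time exceeds one full headway.

    observed_times: travel times (s) of passengers on one route in one period.
    normal_time:    assumed off-peak mean travel time (s) for the same route.
    headway:        departure interval (s) used as the delay threshold.
    """
    if not observed_times:
        return 0.0
    # A passenger is counted as delayed only if the excess over the normal
    # travel time is larger than a full departure interval.
    delayed = sum(1 for t in observed_times if t - normal_time > headway)
    return delayed / len(observed_times)


# Hypothetical period with four passengers on one route.
times = [850.0, 900.0, 1250.0, 1400.0]
rate = congestion_delay_rate(times, normal_time=860.0, headway=300.0)
print(rate)
```

Using the full headway as the threshold is deliberately conservative: a passenger who simply arrived just after a train departed is not counted as delayed.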

Deep Learning Forecasting
Passenger congestion delay has complex characteristics in the spatial and temporal dimensions. The passenger congestion delay of a station at a certain time can be explained from two aspects. From the perspective of the temporal dimension, the passenger congestion delay of the next period can be regarded as the continuation of the passenger congestion delay of the previous period. From the perspective of the spatial dimension, the passenger congestion delay of a station is affected by the congestion delay of adjacent stations, which exhibit a certain spatial correlation. Therefore, we apply Conv-LSTM to handle the spatial dependencies, temporal dependencies, and network topology properties of the subway passengers' congestion delay. In this section, we briefly review the traditional FC-LSTM structure and then explain the deep learning architecture and advantages of Conv-LSTM.

FC-LSTM.
LSTM is a special form of the RNN structure, which is mainly used to solve the problems of vanishing and exploding gradients when training on long sequences. In most RNNs, the hidden layer function H is an elementwise application of the sigmoid function. However, the LSTM architecture uses specially constructed memory cells to store information, which is better at discovering and utilizing long-term dependence in the data. In short, LSTM performs better on long sequences than an ordinary RNN. The main innovation of LSTM is that its memory cell acts as an accumulator of state information. Whenever there is a new input, if the input gate $i_t$ is activated, the input information is accumulated into the cell. Besides, if the forget gate $f_t$ is activated, the past cell state $c_{t-1}$ may be "forgotten" in the process. Whether the latest cell output is propagated to the final state is further controlled by the output gate $o_t$. Using memory cells and gates to control the information flow ensures that the gradient is trapped in the cell and prevented from vanishing too quickly. FC-LSTM adds "peephole connections" to the traditional LSTM structure, allowing the gate layers to see the cell state. The inner structure of an FC-LSTM layer is shown in Figure 2. FC-LSTM can be regarded as a multivariate version of LSTM in which the input, output, and states are 1D vectors. In this paper, we follow the FC-LSTM formulation in [23], which is expressed as follows:

$$i_t = \sigma(W_{xi} x_t + W_{hi} h_{t-1} + W_{ci} \odot c_{t-1} + b_i),$$
$$f_t = \sigma(W_{xf} x_t + W_{hf} h_{t-1} + W_{cf} \odot c_{t-1} + b_f),$$
$$c_t = f_t \odot c_{t-1} + i_t \odot \tanh(W_{xc} x_t + W_{hc} h_{t-1} + b_c),$$
$$o_t = \sigma(W_{xo} x_t + W_{ho} h_{t-1} + W_{co} \odot c_t + b_o),$$
$$h_t = o_t \odot \tanh(c_t),$$

where $i_t$, $f_t$, and $o_t$ denote the input gate, forget gate, and output gate. $x_t$ represents a one-dimensional vector or scalar, and $h_t$ can be given a different dimension. The weight parameter matrices $W_{xi} \sim W_{co}$ conduct linear transformations between the vectors, and $b_i \sim b_o$ are the intercept parameters. The operator "⊙" is the Hadamard product; $\sigma$ and $\tanh$ are the two nonlinear activation functions given by

$$\sigma(x) = \frac{1}{1 + e^{-x}}, \qquad \tanh(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}}.$$

Because the internal gates of FC-LSTM are computed by fully connected, feedforward-style transformations, this structure handles time-series data well but introduces redundancy for spatial data. The reason is that spatial data have strong local characteristics, which FC-LSTM cannot describe.
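To make the gate interactions concrete, here is a minimal NumPy sketch of a single FC-LSTM step with peephole connections. It is an illustration only: the parameter dictionary, the helper names, the layer sizes, and the random initialization are assumptions made for the example and are not the configuration used in this paper.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def fc_lstm_step(x_t, h_prev, c_prev, p):
    """One FC-LSTM step; p maps names such as 'W_xi' to weight arrays."""
    # Input and forget gates peek at the previous cell state (peepholes).
    i_t = sigmoid(p["W_xi"] @ x_t + p["W_hi"] @ h_prev + p["w_ci"] * c_prev + p["b_i"])
    f_t = sigmoid(p["W_xf"] @ x_t + p["W_hf"] @ h_prev + p["w_cf"] * c_prev + p["b_f"])
    # The cell accumulates new information and keeps part of the old state.
    c_t = f_t * c_prev + i_t * np.tanh(p["W_xc"] @ x_t + p["W_hc"] @ h_prev + p["b_c"])
    # The output gate peeks at the updated cell state.
    o_t = sigmoid(p["W_xo"] @ x_t + p["W_ho"] @ h_prev + p["w_co"] * c_t + p["b_o"])
    h_t = o_t * np.tanh(c_t)
    return h_t, c_t


def init_params(n_in, n_hidden, seed=0):
    """Small random weights (hypothetical initialization for the demo)."""
    rng = np.random.default_rng(seed)
    p = {}
    for g in "ifco":
        p[f"W_x{g}"] = rng.normal(0.0, 0.1, (n_hidden, n_in))
        p[f"W_h{g}"] = rng.normal(0.0, 0.1, (n_hidden, n_hidden))
        p[f"b_{g}"] = np.zeros(n_hidden)
    for g in "ifo":
        # Peephole weights act elementwise, so they are stored as vectors.
        p[f"w_c{g}"] = rng.normal(0.0, 0.1, n_hidden)
    return p


params = init_params(n_in=3, n_hidden=4)
h, c = np.zeros(4), np.zeros(4)
for x in np.ones((5, 3)):  # a toy sequence of five 3-dimensional inputs
    h, c = fc_lstm_step(x, h, c, params)
print(h.shape, c.shape)
```

Every input-to-state and state-to-state term here is a dense matrix product, which is exactly what discards the local spatial structure discussed above.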

Conv-LSTM.
To better capture the spatiotemporal relationships, we extend the traditional FC-LSTM idea to Conv-LSTM. The method is to replace the feedforward calculation in the input-to-state and state-to-state transitions of FC-LSTM with convolution [24]. By stacking multiple Conv-LSTM layers to form an encoding-forecasting structure, we can establish an end-to-end training model for short-term subway congestion delay prediction. Conv-LSTM can overcome the shortcomings of the traditional LSTM network in handling spatial dependence. Compared with traditional LSTM, Conv-LSTM transforms all inputs, outputs, hidden states, and gates from a two-dimensional vector to a three-dimensional tensor. The comparison between Conv-LSTM and FC-LSTM is shown in Figure 3.
As defined here, the grid of the subway congestion delay system in a spatial region is composed of P rows and Q columns. Each cell with a subway station in the grid has Z measurements that vary with time. Therefore, the information at any time can be represented by a tensor $X \in R^{Z \times P \times Q}$, where R is the observed feature domain. Conv-LSTM determines the future state of a cell in the grid from the inputs and past states of its local neighbors. The key expressions are as follows:

$$i_t = \sigma(W_{xi} * X_t + W_{hi} * H_{t-1} + W_{ci} \odot C_{t-1} + b_i),$$
$$f_t = \sigma(W_{xf} * X_t + W_{hf} * H_{t-1} + W_{cf} \odot C_{t-1} + b_f),$$
$$C_t = f_t \odot C_{t-1} + i_t \odot \tanh(W_{xc} * X_t + W_{hc} * H_{t-1} + b_c),$$
$$o_t = \sigma(W_{xo} * X_t + W_{ho} * H_{t-1} + W_{co} \odot C_t + b_o),$$
$$H_t = o_t \odot \tanh(C_t).$$

All the inputs $X_t$, cell outputs $C_t$, hidden states $H_t$, and gates $i_t$, $f_t$, $o_t$ of the Conv-LSTM are 3D tensors whose last two dimensions are rows and columns. The operator * denotes convolution, and ⊙ is the Hadamard product, so each weight matrix becomes a convolution filter.
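A single Conv-LSTM step can be sketched in plain NumPy as follows. This is a minimal illustration under stated assumptions, not the trained model of this paper: the hand-written `conv2d_same` helper, the 3×3 kernels, the toy 8×8 grid, and the random initialization are all hypothetical choices. As in most deep learning libraries, the "convolution" is implemented as a zero-padded cross-correlation.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def conv2d_same(x, w):
    """Zero-padded cross-correlation.

    x: (C_in, P, Q) input; w: (C_out, C_in, k, k) filters with odd k.
    Returns an array of shape (C_out, P, Q).
    """
    c_out, _, k, _ = w.shape
    pad = k // 2
    xp = np.pad(x, ((0, 0), (pad, pad), (pad, pad)))
    _, P, Q = x.shape
    out = np.zeros((c_out, P, Q))
    for di in range(k):
        for dj in range(k):
            patch = xp[:, di:di + P, dj:dj + Q]  # shifted view, (C_in, P, Q)
            out += np.einsum("oc,cpq->opq", w[:, :, di, dj], patch)
    return out


def conv_lstm_step(X_t, H_prev, C_prev, p):
    """One Conv-LSTM step; convolutions replace the dense products."""
    conv = conv2d_same
    i_t = sigmoid(conv(X_t, p["W_xi"]) + conv(H_prev, p["W_hi"]) + p["W_ci"] * C_prev + p["b_i"])
    f_t = sigmoid(conv(X_t, p["W_xf"]) + conv(H_prev, p["W_hf"]) + p["W_cf"] * C_prev + p["b_f"])
    C_t = f_t * C_prev + i_t * np.tanh(conv(X_t, p["W_xc"]) + conv(H_prev, p["W_hc"]) + p["b_c"])
    o_t = sigmoid(conv(X_t, p["W_xo"]) + conv(H_prev, p["W_ho"]) + p["W_co"] * C_t + p["b_o"])
    H_t = o_t * np.tanh(C_t)
    return H_t, C_t


def init_params(z, hidden, P, Q, k=3, seed=0):
    """Small random filters (hypothetical initialization for the demo)."""
    rng = np.random.default_rng(seed)
    p = {}
    for g in "ifco":
        p[f"W_x{g}"] = rng.normal(0.0, 0.1, (hidden, z, k, k))
        p[f"W_h{g}"] = rng.normal(0.0, 0.1, (hidden, hidden, k, k))
        p[f"b_{g}"] = np.zeros((hidden, 1, 1))
    for g in "ifo":
        p[f"W_c{g}"] = rng.normal(0.0, 0.1, (hidden, P, Q))  # Hadamard peepholes
    return p


Z, P, Q, HID = 3, 8, 8, 4  # toy sizes, much smaller than a real network grid
params = init_params(Z, HID, P, Q)
H, C = np.zeros((HID, P, Q)), np.zeros((HID, P, Q))
for X in np.random.default_rng(1).random((5, Z, P, Q)):
    H, C = conv_lstm_step(X, H, C, params)
print(H.shape, C.shape)
```

Because every gate is computed from a small neighborhood of each cell, the state of a station's cell is updated from the cells around it, which is how the delay of adjacent stations enters the prediction.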
In this part, we take Conv-LSTM as a model for handling feature vectors on a 2D grid. We can predict the characteristics of a central grid cell from the characteristics of the surrounding cells. Therefore, we can make a short-term prediction of the subway congestion delay system under the spatiotemporal variables. The training steps of Conv-LSTM are as follows (Algorithm 1).

Datasets.
This paper takes the Chongqing subway network as an example to verify the model. In the Chongqing subway system, passengers tap their smart cards at the automatic fare collection gates of each subway station. The AFC system records the entrance and exit information of each passenger (e.g., transaction time and station ID). An example of card data is shown in Table 1. This study selects 40 working days of operation data from Chongqing Metro in September and October 2018. Firstly, the subway passenger delay rate and congestion delay index are calculated by the method in Section 3. Then, the RMSE, MAE, and R² values of the prediction results are calculated to evaluate the ability and effectiveness of the Conv-LSTM model.
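The step from raw AFC transactions to travel times can be sketched as follows. The record layout is an assumption made for illustration: the field names (`card_id`, `entry_station`, `entry_time`, `exit_station`, `exit_time`), the timestamp format, and the three sample records are hypothetical and do not reproduce the actual card data.

```python
from datetime import datetime

TIME_FORMAT = "%Y-%m-%d %H:%M:%S"  # assumed timestamp layout

# Made-up records, one per completed trip.
records = [
    {"card_id": "A001", "entry_station": 12, "entry_time": "2018-09-03 07:42:10",
     "exit_station": 18, "exit_time": "2018-09-03 08:05:40"},
    {"card_id": "A002", "entry_station": 12, "entry_time": "2018-09-03 07:50:00",
     "exit_station": 18, "exit_time": "2018-09-03 08:15:00"},
    {"card_id": "A003", "entry_station": 12, "entry_time": "2018-09-03 08:00:00",
     "exit_station": 20, "exit_time": "2018-09-03 08:30:00"},
]


def travel_time_seconds(record):
    """Gate-to-gate travel time of one trip in seconds."""
    t_in = datetime.strptime(record["entry_time"], TIME_FORMAT)
    t_out = datetime.strptime(record["exit_time"], TIME_FORMAT)
    return (t_out - t_in).total_seconds()


def mean_travel_time_by_od(trips):
    """Average travel time for every (origin, destination) station pair."""
    totals = {}
    for r in trips:
        key = (r["entry_station"], r["exit_station"])
        total, count = totals.get(key, (0.0, 0))
        totals[key] = (total + travel_time_seconds(r), count + 1)
    return {key: total / count for key, (total, count) in totals.items()}


od_means = mean_travel_time_by_od(records)
print(od_means)
```

In practice the trips would first be filtered to nontransfer routes and grouped by entry period before averaging.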

Results.
We need to divide the subway network diagram into many small cells and ensure that each cell contains at most one subway station. Therefore, we set both the number of rows and the number of columns of the grid to 64. Besides, the dataset is divided into two parts: the first part is the training data (35 days), and the second part is the test data (5 days). We test Conv-LSTM models with different numbers of layers to determine the best structure. The future delay rate is predicted from historical observations such as the number of passengers entering or leaving the station, the delay rate, and the average delay time.
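The gridding and the 35/5 day split can be sketched as follows. This is a minimal illustration: the station coordinates, the bounding box, the feature values, and the function names are hypothetical, and the sketch simply raises an error if two stations land in the same cell instead of choosing a finer grid automatically.

```python
import numpy as np

GRID = 64  # rows and columns of the raster, as stated in the text


def rasterize(stations, values, bounds, grid=GRID):
    """Place per-station feature vectors on a (Z, grid, grid) tensor.

    stations: {station_id: (lon, lat)}
    values:   {station_id: sequence of Z features}
    bounds:   (lon_min, lon_max, lat_min, lat_max) of the study area
    """
    lon_min, lon_max, lat_min, lat_max = bounds
    z = len(next(iter(values.values())))
    frame = np.zeros((z, grid, grid))
    occupied = set()
    for sid, (lon, lat) in stations.items():
        col = min(int((lon - lon_min) / (lon_max - lon_min) * grid), grid - 1)
        row = min(int((lat_max - lat) / (lat_max - lat_min) * grid), grid - 1)
        if (row, col) in occupied:
            raise ValueError("two stations share a cell; use a finer grid")
        occupied.add((row, col))
        frame[:, row, col] = values[sid]
    return frame


def split_by_day(daily_frames, n_train=35):
    """Chronological split: first n_train days for training, rest for testing."""
    return daily_frames[:n_train], daily_frames[n_train:]


# Hypothetical stations with (inbound count, delay rate, mean delay in s).
stations = {"S1": (106.50, 29.56), "S2": (106.57, 29.61)}
values = {"S1": [120.0, 0.3, 45.0], "S2": [80.0, 0.1, 20.0]}
frame = rasterize(stations, values, bounds=(106.3, 106.8, 29.3, 29.8))

daily = [frame] * 40  # stand-in for 40 working days of frames
train, test = split_by_day(daily)
print(frame.shape, len(train), len(test))
```

Splitting chronologically rather than at random keeps the test days strictly after the training days, which matches how a forecast would be used.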
In this paper, root mean square error (RMSE), mean absolute error (MAE), and coefficient of determination (R²) are used to verify the prediction accuracy of the model:

$$\text{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i - y_i')^{2}}, \qquad \text{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| y_i - y_i' \right|, \qquad R^{2} = 1 - \frac{\sum_{i=1}^{n} (y_i - y_i')^{2}}{\sum_{i=1}^{n} (y_i - \bar{y})^{2}},$$

where $y_i$ denotes the ith actual value, $y_i'$ denotes the ith predicted value, $\bar{y}$ denotes the mean of all $y_i$, and n is the size of the test set.
Table 2 shows the comparison of the prediction results between the proposed model and benchmark models. The results show that the Conv-LSTM network is superior to the benchmark models in all three indexes of prediction performance. Comparing the benchmark models, we find that the machine learning methods have better prediction performance than the traditional time series model. The CNN and Conv-LSTM networks have obvious advantages in capturing spatial relevance, which confirms the importance of considering spatial correlation in the prediction of subway congestion delay. Among them, Conv-LSTM achieves the best predictive performance measured by RMSE (0.0331), which is 6.5% lower than that of the CNN (0.0354). Conv-LSTM performs better in combining spatial features with time series features, and its convolution layers realize the state-to-state transition, so it captures the spatial correlation better in the encoding network.
Figure 4 shows the actual delay rate of the Chongqing subway network, in which the red bars represent the upward direction, the blue bars represent the downward direction, and the height of each bar represents the size of the congestion delay rate. Figure 5 shows the forecast of the congestion delay rate of the Chongqing subway network. The darker the color, the higher the delay rate of the station. These two visualizations reflect the good prediction performance of Conv-LSTM. From an intuitive comparison between the actual delay and the predicted delay, we find that the model can effectively capture the spatiotemporal characteristics of each node and make an effective prediction.
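The three error metrics can be computed with a few lines of NumPy. The following minimal sketch uses their standard definitions; the four sample delay rates are made up for illustration, and only the two RMSE values in the last line (0.0331 and 0.0354) come from the comparison reported above.

```python
import numpy as np


def rmse(y, y_hat):
    """Root mean square error."""
    return float(np.sqrt(np.mean((y - y_hat) ** 2)))


def mae(y, y_hat):
    """Mean absolute error."""
    return float(np.mean(np.abs(y - y_hat)))


def r2(y, y_hat):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - np.mean(y)) ** 2)
    return float(1.0 - ss_res / ss_tot)


# Made-up actual and predicted delay rates.
y = np.array([0.10, 0.20, 0.30, 0.40])
y_hat = np.array([0.12, 0.18, 0.33, 0.37])
print(rmse(y, y_hat), mae(y, y_hat), r2(y, y_hat))

# Relative RMSE reduction of Conv-LSTM (0.0331) over the CNN (0.0354).
reduction = (0.0354 - 0.0331) / 0.0354
print(round(reduction, 3))
```

The last line reproduces the reported relative improvement of about 6.5%.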
Besides, we found some patterns in the training and prediction of the model. First, the stations with the highest congestion delay are mainly concentrated at the intersection of line 1 and line 3 and in the surrounding areas. This is mainly because line 3, the longest straddle-type monorail transit line in the world, has limited capacity. Moreover, line 3 passes through several important areas and transfer stations in Chongqing, attracting a large number of passengers. During the peak period, passengers may even have to wait for five trains before they can board. Secondly, the peak of congestion in the morning occurs between 7:30 and 8:30, and that in the evening occurs between 17:30 and 18:30, which is consistent with the commuting patterns of passengers. Thirdly, we also find that congestion mainly occurs in areas within the inner ring, while congestion outside the inner ring is relatively unlikely. This is related to the layout of the subway network shaped by the special terrain of Chongqing. There are relatively few routes to the central city, so passengers tend to gather in the urban area, which leads to congestion. Combining the passenger congestion delay distribution with visualization helps the subway operation department to detect and forecast station congestion and provides a more reasonable basis for subsequent work plan arrangement and even subway network planning. At the same time, it can also provide a reference for passenger travel route planning.

Conclusion
Based on an analysis of the reasons for subway travel time delay, this paper uses the idea of control variables to propose a calculation method for passenger congestion delay at the subway network level. Because passenger flow congestion propagates between stations, the congestion of a station is related not only to its own historical congestion but also to the congestion of adjacent stations. Therefore, combining the temporal and spatial characteristics of passenger congestion, we use Conv-LSTM, an improved deep learning method based on CNN and FC-LSTM, to make a short-term prediction of subway station congestion delay. Conv-LSTM not only retains the advantages of FC-LSTM but is also suitable for spatiotemporal data because of its convolution structure. We use a variety of benchmark models to evaluate the performance of the proposed model. The test results show that Conv-LSTM performs satisfactorily on the passenger congestion delay prediction problem for subway stations.
In this paper, an end-to-end deep learning structure based on spatiotemporal variables is used to realize the short-term prediction of the passenger congestion delay distribution, which makes it possible to grasp the congestion situation in the subway network in real time. On the one hand, it can help the operation management department to develop better management and planning schemes. On the other hand, it can help passengers grasp the congestion situation of subway stations and make better travel plans and choices. However, this paper also has shortcomings. For example, transfer passengers face two or more waiting periods, and we cannot accurately determine the specific time and place of their congestion delay. In future work, we will discuss how to judge and calculate the congestion delay of transfer passengers and add it to the prediction model.

Data Availability
Access to data is restricted. The survey data source is subject to confidentiality restrictions.

Conflicts of Interest
The authors declare that they have no conflicts of interest.