Dynamic Origin-Destination Matrix Estimation Based on Urban Rail Transit AFC Data: Deep Optimization Framework with Forward Passing and Backpropagation Techniques

Existing dynamic OD estimation methods for urban rail transit networks still need improvement in capturing the time-dependent characteristics of the system and in estimation accuracy. This study focuses on predicting the dynamic OD demand over a future time period for an urban rail transit system. We propose a nonlinear programming model to predict the dynamic OD matrix based on historic automatic fare collection (AFC) data. This model assigns the passenger flow to a hierarchical flow network, which can be calibrated by backpropagation of the first-order gradients and reassignment of the passenger flow with the updated weights between different layers. The proposed model can predict the time-varying OD matrix, the number of passengers departing at each time, and the travel time spent by passengers; the results are shown in the case study. The results indicate that the proposed model can obtain a relatively accurate estimate. The proposed model can integrate more traffic characteristics than traditional methods and provides an effective and hierarchical passenger flow estimation framework. This study can provide rich passenger demand information for advanced transit planning and management applications, for instance, passenger flow control, adaptive travel demand management, and real-time train scheduling.


Introduction
Origin-destination (OD) matrix estimation is an important part of passenger flow prediction in an urban rail transit system because it provides the basic data for passenger flow assignment. Most existing approaches were developed for road traffic systems, such as freeways, highways, and road networks, in which link (section) flows can usually be obtained from detectors. However, estimating dynamic traffic demand for large-scale and complex rail transit networks without observed link flows, such as the Beijing and Shanghai subway systems in China, remains a critical and challenging problem that has attracted significant attention from transport operation researchers and managers [1].
Many approaches have been proposed over the years for distributing trips among origins and destinations. The gravity model is a typical traditional model that can predict the OD distribution of traffic zones [2]. However, conventional approaches based on economic, population, and spatial relationship data are time-consuming and expensive.
Some estimation approaches have become popular for estimating or updating the OD matrix from traffic counts. The literature in this field is extensive. Existing works are based on entropy maximization [3], the maximum likelihood approach [4], the generalized least squares estimator [5,6], the Bayesian inference approach [7,8], and other approaches. However, because the assignment matrix is complex to calculate, most models based on it are limited to simple networks such as intersections, interchanges, and freeways [9]. These conventional methods for estimating OD trip matrices from link traffic counts assume that route choice proportions are given constants, but this assumption does not hold in a network with realistic congestion levels [10]. A bilevel programming approach has been used for the estimation of the OD matrix in congested networks [11,12]. This approach combines the generalized least squares estimation model and the network equilibrium model into one process. However, the bilevel approach has difficulty finding an optimal solution because its formulation is nonconvex and nondifferentiable. Sherali et al. [13] constructed a linear programming model with a user-equilibrium solution for synthesizing OD tables from traffic volume counts. Later, Toledo and Kolechkina [14] presented methods based on linear approximations of the assignment matrix in the optimization iterations. Fujita et al. [15] proposed an OD modification approach formulated as a static user-equilibrium assignment with elastic demand, based on the residual demand at the end of each period. Applying the model to a large-scale road network demonstrates that it efficiently improves estimation accuracy because the 24-hour time coefficients of survey data are slightly biased and may be modified properly.
Unlike the gravity model, these approaches are based on traffic count data, which can be detected on links by vehicle identification or locating technologies, such as GPS floating cars, automatic license plate recognition (ALPR), and radio frequency identification (RFID). Tang et al. [16] proposed a method based on entropy-maximizing theory to model the OD distribution in Harbin city using large-scale taxi GPS trajectories. The results demonstrate that the entropy-maximizing model is superior to the gravity model, which validates the feasibility of estimating the OD distribution from taxi GPS data in an urban system. Rao et al. [17] formulated a particle filter model for vehicle trajectory reconstruction based on ALPR data; the OD patterns are estimated by adding up the path flows obtained by dividing the reconstructed complete trajectories. Liu et al. [18] predicted the OD matrix based on historical ALPR data. Guo et al. [19] developed an optimization model based on the least squares method. The model estimates the dynamic OD matrix by integrating the preliminary OD matrix, the dynamic assignment matrix derived from RFID data, and the link flow detected by inductive loop detectors. However, route choice and travel time delay remain difficult to handle. Establishing the dynamic flow equations is therefore the first challenge in estimating dynamic passenger flow demand for rail transit systems, given the lack of observed information and the complex structure of the network.
In recent decades, the application of neural network models has opened a new direction for this problem. A neural network operates as a black-box, model-free, and adaptive tool for capturing and learning significant structures in data [20]. Gong [21] used the Hopfield neural network (HNN) model to estimate the urban origin-destination (OD) distribution matrix from the link volumes of the transportation network so as to improve solution speed and precision. Yang et al. [22] proposed a dynamic model based on backpropagation (BP) learning for estimating OD flows from road entering and exiting counts. The OD flows in each short time interval are estimated by minimizing the squared errors between the predicted and observed exiting counts. Li et al. [23] proposed a dynamic radial basis function neural network to forecast outbound passenger volumes and improve passenger flow control; prediction accuracy was improved by adding passenger flow control coefficients to the model. However, current perceptron neural networks may not perform well on all problems for reasons such as model nontransferability, insufficient ability to generalize, and reliance on activation functions [24].
Subsequently, the computational graph was proposed as a description language to represent mathematical expressions. It is important to understand how the underlying computational graph of a deep learning network, combined with the BP algorithm, can be used to describe the forward propagation and backward feedback processes between different levels of transportation planning and decision making [25,26]. Wu et al. [27] proposed a multilayered hierarchical flow network representation to structurally model different levels of travel demand for road networks, including trip generation, OD matrices, path and link flows, and individual behavior parameters. However, the travel times were assumed to be observed in their study. In other words, their model was constructed on a static network rather than a time-dependent dynamic network. This paper addresses that limitation.
In this paper, we aim to predict the dynamic OD demand for a future time period based on historical observations. Most research papers assume that the OD matrix can be predicted from historical data [18,28]. We use historic AFC data to train a time-dependent hierarchical flow network. Then, we use the trained network to predict the future OD demand with current real-time AFC data as input; the predicted demand is the basic data for formulating operational and organizational strategies. In addition, the programming model proposed in this paper can estimate the hierarchical travel decision process of passengers in an urban rail transit system, including the departure time choice at the origin, the path choice, and the corresponding arrival time at the destination. Therefore, the proposed method achieves a combined prediction of the dynamic OD matrix, departure time choice, route choice, and travel time.
A nonlinear programming model is proposed in this paper to conduct real-time OD matrix estimation for an urban rail transit system based on historic automatic fare collection (AFC) data. Forward passing in the hierarchical flow network of urban rail transit sequentially assigns passengers to candidate stations, paths, and different travel time intervals. The network can be calibrated by backpropagation of the first-order gradients and reassignment of the passenger flow with the updated weights between different layers under the deep optimization framework. This model can determine the time-varying OD matrix, the number of passengers departing at each time, and the travel time spent by passengers; the results are shown in the case study. Finally, a comparative analysis with artificial neural networks is conducted to illustrate the effectiveness and efficiency of the proposed model. The potential contributions are as follows:
(1) A modeling framework using the multilayer hierarchical flow network is applied to describe the passenger transit process in an urban rail system. Based on the flow-oriented prediction formulation, this deep learning modeling approach can simultaneously estimate different levels of unobserved or partially observed passenger flow variables. This model is applicable to the estimation of the OD matrix of passenger flow with AFC data, unlike traditional estimation methods based on traffic counts.
(2) This modeling paradigm enables us to capture the mathematical structure inside the OD matrix estimation problem by representing and decomposing complex composite functions through a graph of current states and numerical gradients. This model is constructed from the passengers' trip process, unlike the black-box ANN model. Therefore, the computational graph can express more traffic characteristics than the ANN and provides an effective and hierarchical passenger flow estimation.
(3) The layered framework provides a flexible mechanism for further expansion. In particular, the framework can easily add a new hierarchical structure to achieve OD estimation when other sensor data sources can be obtained.
The remainder of the paper is organized as follows. The next section presents the mathematical formulation of the time-dependent hierarchical flow network estimation model. In the following section, we present the solution framework for implementing forward and backward propagation. In Section 4, we describe a numerical experiment based on the Beijing Subway and compare the results with the ANN method. The conclusion is given in the last section.

Problem Statement and Notation.
This paper aims to design a time-dependent hierarchical flow network (TDHFN) model from historic AFC records. The model is based on a passenger assignment network that accounts for time variations.
There is an abundance of historic AFC records that can be used to train an optimal model. An urban rail transit network consists of a set of stations $N$; [...] can be obtained from historic AFC data. The path set $P = \{1, 2, \ldots, p, \ldots\}$ is given information consisting of the alternative routes for each OD. Therefore, there are 5 layers in the passenger assignment network: origin station, departure time, destination station, paths, and travel time. Passengers are assigned from the origin station to different departure time intervals, then to different destination stations, then to different paths, and finally to different travel times.
In the passenger assignment network design problem, the following inputs should be given: (1) AFC records of how many passengers enter at each station, depart at each time interval, exit at each station, and arrive at each time interval and (2) the supply network of the paths of each OD with minimum travel time and maximum travel time.
From the perspective of system-optimal passenger assignment, we can obtain (1) the number of passengers departing at each time; (2) the number of passengers arriving at each station; (3) the number of passengers arriving at each time; and (4) the number of passengers choosing each path.
There is an important assumption in this model: the same departure time interval at different origin stations is given different departure time interval indices, and the same holds for the destination station indices and travel time indices. This ensures that each path for the different destination stations in the network belongs to a different OD.
A multilayer TDHFN is adopted to describe the OD matrix estimation problem for urban rail passenger flow. The notation used in this paper is shown in Table 1.

Physical Description.
Consider a simple physical urban rail network with four nodes, as shown in Figure 1. Node 1 is the origin station where passengers enter (tap-in) the urban rail system. Nodes 2 and 4 are the destination stations where the passengers exit (tap-out) the system. Node 3 is the transfer station. Four paths belong to two different ODs in this network. We consider a time-space passenger network based on the simple physical network of Figure 1, as shown in Figure 2. The numbering follows an important principle. With different departure times but equal travel times for the same OD, the destination station, path, and travel time values are numbered with different indices. Additionally, when different OD pairs have the same departure times and equal travel times, the path and travel time values are numbered with different indices. Similarly, when different OD pairs have the same departure times and equal travel times but different paths, the travel time values are numbered with different indices. This principle ensures that the proposed model is time-dependent. In other words, passengers departing from the origin station at different times may choose different paths and experience different travel times. In a static network, by contrast, passengers are often considered to be homogeneous, as in the research of Wu et al. [27]. In this paper, the time-dependent numbering principle makes it possible to consider passenger heterogeneity, which is more practical. Finally, a simple example of the time-dependent numbering principle is shown in Figure 3, which is based on Figures 1 and 2. The indices are shown above the bold horizontal lines. The numbering of all the stations, departure times, paths, and travel times, together with the determination of the connections between the decision variables of each level, is the basis of the model.
This numbering is an important and complex process, especially in a large-scale, complicated urban transport network such as the Beijing Subway.
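The numbering principle can be illustrated with a short sketch. It is only one possible reading of the rule, written under stated assumptions: the function name `build_indices`, the input containers, the split of the four paths into two per OD, and the travel time values are all hypothetical and are not taken from the paper or from Figure 3.

```python
from itertools import count


def build_indices(od_paths, departure_intervals, travel_times):
    """Number every node of the time-expanded network with a unique index.

    od_paths: {(origin, destination): [path, ...]}
    departure_intervals: [interval, ...]
    travel_times: {(origin, destination, path): [travel_time, ...]}
    All three containers are hypothetical inputs chosen for this sketch.
    """
    t_idx, j_idx, p_idx, tau_idx = {}, {}, {}, {}
    next_t, next_j, next_p, next_tau = count(1), count(1), count(1), count(1)
    for (i, j), paths in od_paths.items():
        for t in departure_intervals:
            # A departure interval node is shared by all ODs leaving origin i.
            if (i, t) not in t_idx:
                t_idx[(i, t)] = next(next_t)
            # Same destination, different departure time -> different index.
            j_idx[(i, t, j)] = next(next_j)
            for p in paths:
                p_idx[(i, t, j, p)] = next(next_p)
                for tau in travel_times[(i, j, p)]:
                    # Equal travel times on different paths get different indices.
                    tau_idx[(i, t, j, p, tau)] = next(next_tau)
    return t_idx, j_idx, p_idx, tau_idx


# Toy network loosely following Figure 1: origin 1, destinations 2 and 4.
# The split of the four paths and the travel times (in minutes) are assumed.
od_paths = {(1, 2): ["a", "b"], (1, 4): ["c", "d"]}
travel_times = {(1, 2, "a"): [10, 20], (1, 2, "b"): [20],
                (1, 4, "c"): [20], (1, 4, "d"): [20, 30]}
t_idx, j_idx, p_idx, tau_idx = build_indices(od_paths, ["7:00", "7:10"], travel_times)
print(len(t_idx), len(j_idx), len(p_idx), len(tau_idx))  # 2 4 8 12
```

Because every travel time node carries its own index, a single index is enough to recover the full tuple of origin, departure time, destination, path, and travel time.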

Mathematical Description.
A TDHFN representation is used as a high-level modeling abstraction to formulate the OD matrix estimation problem. Let a TDHFN $G = G(V, E)$ be the collection of all the elements of the traffic demand variables in different layers, where each layer controls a subset of the demand variables and receives network flows.
Table 1: Sets, indices, variables, vectors, and parameters.
Definitions: (1) The first layer is the origin station layer, containing each origin station with the index $i$ corresponding to the number of passengers $x_i$ entering (tap-in) the system at origin station $i$.
Equation (1) describes the process of trip production from the origin station layer to the departure time layer. Equation (2) maps the flow from the departure time layer to the destination station layer. Equation (3) maps the flow from an OD pair to the candidate routes. Equation (4) aggregates the path flows to the travel time flows.

Model and Solution
We propose a nonlinear programming model with linear constraints for the studied passenger assignment problem. Forward passing in the TDHFN sequentially assigns passengers to candidate stations, paths, and different travel time windows. The network can be improved by backward propagation of the first-order gradients and reassignment of the passenger flow with the updated weights between different layers under the deep optimization framework.

Optimization Model.
We propose a nonlinear programming model with linear constraints for the OD matrix estimation problem.
Then, the optimization model is reformulated in the TDHFN for the urban rail system.

Constraints for Passenger Assignment.
Assume that the total number of passengers entering the urban rail system at station $i$ is $x_i$ and that passengers may depart from station $i$ in each time interval $t$. Equation (5) formulates the assignment of the passengers in the urban rail system to each departure time interval $t$. Equation (6) assigns the passenger flow $h_t$ in departure time interval $t$ to destination station $j$ as flow $h_j$. Equation (7) assigns the passenger flow $h_j$ at destination station $j$ to path $p$ as $h_p$. Equation (8) assigns the passenger flow $h_p$ on path $p$ to the travel time $\tau$ as $y_\tau$.
Assigning the departure time intervals: equation (5).
Assigning the destination stations: equation (6).
Assigning the paths: equation (7).
Assigning the travel times: equation (8).
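The four assignment steps can be sketched as a forward pass over a tree-shaped network. This is a minimal illustration, assuming that each node splits its flow among its children with proportions that sum to 1; the function `forward_pass`, the node names, and all numbers are hypothetical and are not the paper's equations (5)-(8).

```python
def forward_pass(x, layers):
    """Push entry flows through the layers of a tree-shaped flow network.

    x: {origin: passengers entering}
    layers: list of {child: (parent, weight)}; the weights of the children
            of one parent are proportions that sum to 1 (an assumption).
    Returns the list of flow dictionaries, one per layer.
    """
    flows = [dict(x)]
    for layer in layers:
        parent_flow = flows[-1]
        flows.append({child: weight * parent_flow[parent]
                      for child, (parent, weight) in layer.items()})
    return flows


# Hypothetical numbers: 100 passengers enter at origin 1.
x = {1: 100.0}
layers = [
    # Departure time layer
    {"t1": (1, 0.25), "t2": (1, 0.75)},
    # Destination station layer
    {"t1j2": ("t1", 0.5), "t1j4": ("t1", 0.5),
     "t2j2": ("t2", 0.5), "t2j4": ("t2", 0.5)},
    # Path layer
    {"t1j2p1": ("t1j2", 1.0), "t1j4p1": ("t1j4", 1.0),
     "t2j2p1": ("t2j2", 0.5), "t2j2p2": ("t2j2", 0.5),
     "t2j4p1": ("t2j4", 1.0)},
    # Travel time layer (labels in minutes)
    {"t1j2p1@20": ("t1j2p1", 1.0), "t1j4p1@20": ("t1j4p1", 1.0),
     "t2j2p1@20": ("t2j2p1", 0.5), "t2j2p1@30": ("t2j2p1", 0.5),
     "t2j2p2@30": ("t2j2p2", 1.0), "t2j4p1@20": ("t2j4p1", 1.0)},
]
flows = forward_pass(x, layers)
names = ["origin", "departure time", "destination", "path", "travel time"]
for name, flow in zip(names, flows):
    print(name, flow)
```

With proportions that sum to 1 under every parent, the total flow is the same in every layer, which is the kind of conservation the flow equilibrium constraints are meant to guarantee.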

Constraints for Flow Equilibrium.
The passenger flow equilibrium constraints are shown in equations (9)-(12).

Objective Function.
The objective function is shown in equation (13).

BP of Gradient.
The Lagrangian function is given in equation (14). The gradient of each level based on the KKT conditions is shown in equations (15)-(18). The Lagrangian multipliers $\lambda$ are known as the adjoint variables. To compute the gradient, we simply read it from the condition $\nabla L = 0$.
Journal of Advanced Transportation

Reformulation in the Deep Optimization Framework.
We extend the TDHFN as a computational graph to express the passenger flow assignment process of an urban rail transit system. In the TDHFN, we implement forward passing and backward propagation (BP) to update the estimation variables to approximate the objective functional relationship expressed by (13). As BP is an essential part of the procedure, we use the term BP algorithm to represent the overall procedure throughout this paper. The model is divided into five layers. The first layer is the input layer, which represents the passenger flow entering the urban rail system by tapping in at the origin station; the second layer is the first hidden layer, which represents the passenger flow departing at a certain time; the third layer is the second hidden layer, which represents the passenger flow exiting the system by tapping out at the destination station; the fourth layer is the third hidden layer, which represents the passenger flow choosing a certain path; the fifth layer is the output layer, which represents the passenger flow arriving at a certain time. The propagation process of the passenger flow in the network is shown in Figure 4.
Through the connection relationships between neurons and the weights of each layer, the passenger volumes of each OD within various time periods can be predicted. At this point, the output layer $y_\tau$ represents $y(i, j, t, p, \tau)$. To make the problem convenient to solve, we proposed a numbering principle (shown in Section 2.2) so that a unique $\tau$ can represent $(i, j, t, p, \tau)$.
We can calculate many complex marginal values (update values of weights) using the chain rule in calculus, where $\omega$ is a vector of partial derivatives. The marginal values are obtained by calculating a gradient product for each operation in the computational graph. Update formulas for the other weights are obtained in the same way. Table 2 shows the solution algorithm for determining the estimation results, including the following three main parts.

Forward Passing.
The forward passing step sequentially implements trip generation, trip distribution estimation, and a route-based passenger flow assignment, which can be viewed as the 3-step approach (Step 2.1 to Step 2.3) used in traffic planning.

Backward Propagation.
The backpropagation step implements feedback control on the forward passing process in reverse order. First-order partial derivatives, or "loss errors," from the different layers are aggregated to calculate the marginal gradients (as shown in Step 2.4).
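The chain rule behind this step can be shown on a single route through the network. The sketch below assumes a squared-error loss on one output flow and a flow that is the product of the entry volume and the weights along the route; both are simplifying assumptions, and the names `route_flow` and `route_gradients` are hypothetical. A finite-difference check is included to confirm the analytic gradients.

```python
def route_flow(x, weights):
    """Output flow of one route: entry volume times every weight on the route."""
    y = x
    for w in weights:
        y *= w
    return y


def route_gradients(x, weights, y_obs):
    """Gradient of (y - y_obs)^2 with respect to each weight, by the chain rule."""
    y = route_flow(x, weights)
    dloss_dy = 2.0 * (y - y_obs)          # derivative of the loss w.r.t. the output
    grads = []
    for k in range(len(weights)):
        # dy/dw_k is the entry volume times all other weights on the route.
        dy_dwk = x
        for m, w in enumerate(weights):
            if m != k:
                dy_dwk *= w
        grads.append(dloss_dy * dy_dwk)
    return grads


# Hypothetical route: departure time, destination, path, and travel time weights.
x, weights, y_obs = 100.0, [0.25, 0.5, 0.5, 1.0], 10.0
grads = route_gradients(x, weights, y_obs)

# Central finite-difference check of every analytic gradient.
eps = 1e-6
for k, g in enumerate(grads):
    up, down = list(weights), list(weights)
    up[k] += eps
    down[k] -= eps
    numeric = ((route_flow(x, up) - y_obs) ** 2
               - (route_flow(x, down) - y_obs) ** 2) / (2 * eps)
    print(k, g, numeric)
```

In a full network, the same product of local derivatives is accumulated over all output nodes that lie below a given weight.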

Update.
Update values of variables using gradient descent (as shown in Step 2.5).
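The three parts (forward passing, backward propagation, update) can be combined into one loop, sketched below on a two-layer toy network. This is not the algorithm of Table 2: the squared-error loss, the clipping and renormalization of the weights after each gradient step (used here as a simple stand-in for the flow equilibrium constraints), and all names and numbers are assumptions made for illustration.

```python
def forward(x, layers):
    """Forward passing: child flow = weight * parent flow."""
    flows = [dict(x)]
    for layer in layers:
        prev = flows[-1]
        flows.append({c: w * prev[p] for c, (p, w) in layer.items()})
    return flows


def loss(flows, observed):
    """Sum of squared errors on the output layer."""
    return sum((f - observed[n]) ** 2 for n, f in flows[-1].items())


def backward(flows, layers, observed):
    """Backward propagation: returns d(loss)/d(weight) for every layer."""
    delta = {n: 2.0 * (f - observed[n]) for n, f in flows[-1].items()}
    grads = []
    for k in reversed(range(len(layers))):
        layer, parent_flow = layers[k], flows[k]
        grads.insert(0, {c: delta[c] * parent_flow[p] for c, (p, _) in layer.items()})
        parent_delta = dict.fromkeys(parent_flow, 0.0)
        for c, (p, w) in layer.items():
            parent_delta[p] += w * delta[c]   # aggregate the children's errors
        delta = parent_delta
    return grads


def update(layers, grads, lr):
    """Gradient step, then rescale so the weights under each parent sum to 1."""
    new_layers = []
    for layer, grad in zip(layers, grads):
        stepped = {c: (p, max(w - lr * grad[c], 1e-9)) for c, (p, w) in layer.items()}
        totals = {}
        for p, w in stepped.values():
            totals[p] = totals.get(p, 0.0) + w
        new_layers.append({c: (p, w / totals[p]) for c, (p, w) in stepped.items()})
    return new_layers


# Hypothetical data: 100 passengers, two departure intervals, two destinations.
x = {1: 100.0}
layers = [
    {"t1": (1, 0.5), "t2": (1, 0.5)},
    {"t1j2": ("t1", 0.5), "t1j4": ("t1", 0.5),
     "t2j2": ("t2", 0.5), "t2j4": ("t2", 0.5)},
]
observed = {"t1j2": 10.0, "t1j4": 20.0, "t2j2": 30.0, "t2j4": 40.0}

initial_loss = loss(forward(x, layers), observed)
for _ in range(2000):
    flows = forward(x, layers)
    grads = backward(flows, layers, observed)
    layers = update(layers, grads, lr=1e-5)
final_loss = loss(forward(x, layers), observed)
print(initial_loss, final_loss)
```

Rescaling after every step keeps the total flow conserved between layers, at the cost of not being an exact projection; a Lagrangian treatment of the constraints, as in the paper, is the more principled alternative.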

Parameter Settings.
A partial network of the Beijing Subway system is adopted to verify the proposed predictive model. This portion of the network contains 12 lines (including 6 two-direction lines) and 43 stations, as shown in Figure 5. The study period runs from 7 am to 9 am, which is the morning peak period of the Beijing metro. The AFC records collected from Sep. 3rd to 7th (Monday to Friday) in 2018 are used to train the model, and the data of Sep. 10th (Monday) are used for testing. The time interval is set to 10 min. Accordingly, the passenger flow for each station in the morning peak period is divided into 12 groups.
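The division of the morning peak into 12 groups can be sketched as a simple binning of tap-in times. The record layout (a station name and a tap-in timestamp) and the function names are hypothetical; only the 7 am to 9 am window and the 10 min interval come from the text.

```python
from collections import Counter
from datetime import datetime

WINDOW_START_MIN = 7 * 60   # 07:00
WINDOW_END_MIN = 9 * 60     # 09:00
INTERVAL_MIN = 10


def interval_index(ts):
    """Return the 1-based departure interval of a tap-in time, or None if outside the window."""
    minutes = ts.hour * 60 + ts.minute
    if not WINDOW_START_MIN <= minutes < WINDOW_END_MIN:
        return None
    return (minutes - WINDOW_START_MIN) // INTERVAL_MIN + 1


def count_entries(records):
    """records: iterable of (station, tap_in datetime) -> Counter keyed by (station, interval)."""
    counts = Counter()
    for station, ts in records:
        idx = interval_index(ts)
        if idx is not None:
            counts[(station, idx)] += 1
    return counts


# Hypothetical tap-in records on the first training day.
records = [
    ("Xizhimen", datetime(2018, 9, 3, 7, 0)),
    ("Xizhimen", datetime(2018, 9, 3, 7, 9)),
    ("Xizhimen", datetime(2018, 9, 3, 8, 59)),
    ("Xidan", datetime(2018, 9, 3, 9, 0)),    # outside the study window
]
print(count_entries(records))
```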
In this paper, we mainly focus on the OD passenger flow, not the section passenger flow in the subway network. Moreover, route congestion is mainly reflected by the passengers' travel time, so the passenger flow state of the subway sections is not considered. Therefore, we only use the AFC records whose origin and destination stations both belong to the partial network of the Beijing Subway shown in Figure 5.
In this paper, the travel time is defined as the time between a passenger entering (tap-in) and exiting (tap-out) the station. To facilitate the data statistics, the travel time in this [...] In particular, the travel time of more than 90% of the passengers in both of the ODs ranges from 10 to 20 min. In contrast, the proportions of passengers with travel times longer than 30 min are less than 1% for the two ODs. Because the frequencies of some travel times are relatively small, when constructing the travel time index set $\Gamma$, the travel times whose frequency is less than a specific threshold (e.g., 5%) can be eliminated to reduce the network size. For instance, for the OD from Xizhimen to Xidan, as shown in Figure 6(b), only one index, which points to the travel time of 20 min, is included in the set $\Gamma$. The threshold can be adjusted: a threshold smaller than 5% can be chosen if a finer resolution is needed. The difference in travel time on each path is due to the path's congestion and the individual characteristics of passengers. If a logit model is used to describe the choice probability and behaviors of passengers, the path choice probability is only related to the path cost, which cannot reflect differences in path congestion and individual passenger characteristics. Therefore, we infer the possible path of each passenger in reverse from the real travel time data in the AFC records and the travel time distribution of each path.
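The frequency threshold used to build the travel time index set can be sketched as follows. The 10 min bin width, the labelling of each bin by its upper edge, the function name `build_travel_time_set`, and the sample data are assumptions; only the 5% threshold is taken from the text.

```python
from collections import Counter


def build_travel_time_set(travel_times_min, bin_width=10, threshold=0.05):
    """Return the sorted bin labels whose share of trips reaches the threshold.

    Each trip is assigned to a bin labelled by its upper edge, so a 15 min
    trip falls into the 20 min bin. This labelling is an assumption.
    """
    bins = Counter()
    for minutes in travel_times_min:
        label = -(-minutes // bin_width) * bin_width  # round up to the bin edge
        bins[label] += 1
    total = sum(bins.values())
    return sorted(label for label, n in bins.items() if n / total >= threshold)


# Hypothetical sample for one OD: 92 trips of 15 min, 7 of 25 min, 1 of 35 min.
sample = [15] * 92 + [25] * 7 + [35]
print(build_travel_time_set(sample))                  # [20, 30]
print(build_travel_time_set(sample, threshold=0.08))  # [20]
```

Lowering the threshold keeps more travel time nodes and therefore enlarges the output layer, which is the trade-off between resolution and network size described above.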

Result Analysis.
We implement the TDHFN using Python 3.6.1, and a part of the Beijing Subway is selected to examine the applicability and computational efficiency of the proposed model. The computational environment is an Intel(R) Core(TM) i5-4590 CPU at 3.30 GHz with 8.00 GB RAM and a 64-bit OS. In addition to TensorFlow, other off-the-shelf software tools, such as Theano, can be used to construct a computational graph-based model.
Extracted from the AFC data, the origin layer has 43 nodes, the departure time layer has 516 nodes, the destination layer has 21,672 nodes, the path layer has 45,732 nodes, and the travel time layer has 39,396 nodes. In this experiment, we set the maximum number of iterations to 10000 and the initial learning rate to 0.00001. The iteration curve of the case study is presented in Figure 7, which shows that the loss function converges by the 9000th iteration.
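The first layer sizes can be cross-checked with a little arithmetic. The sketch assumes that each origin station is expanded into 12 departure intervals and that every other station is a candidate destination; the second assumption is not stated in the text but is consistent with the reported node counts. The path and travel time layer sizes depend on the path set and the travel time threshold and cannot be reproduced this way.

```python
N_STATIONS = 43
N_INTERVALS = 12  # 7 am to 9 am in 10 min steps

departure_nodes = N_STATIONS * N_INTERVALS
# Assumption: each (origin, interval) node links to every other station.
destination_nodes = departure_nodes * (N_STATIONS - 1)

print(departure_nodes)    # 516
print(destination_nodes)  # 21672
```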
To compare the estimated OD passenger flows with the actual passenger flows, we can apply goodness-of-fit measures such as the mean absolute percentage error (MAPE), the mean square error (MSE), the root mean square error (RMSE), the root mean square normalized error (RMSN) [29], and R-squared. Because we evaluate time-dependent prediction errors in this article, observed OD passenger flows equal to zero cannot be avoided. Therefore, MAPE is not applicable because its divisor would be zero. The RMSE and RMSN measures can be adopted because their divisors are not zero in this study. However, the value of RMSE depends on the magnitude of the variables. Therefore, we also adopt the RMSN to compare the accuracy of different variables. The classical RMSE function is presented in equation (22). $\mathrm{RMSE}_i$ in equation (23) is the measure over the output nodes of the network whose origin station index is $i$. $\mathrm{RMSE}_{ij}$ in equation (24) is the measure over the output nodes whose origin station layer index is $i$ and whose destination station index is $j$. Moreover, $\mathrm{RMSE}_{ijt}$ in equation (25) is the measure over the output nodes whose origin station layer index is $i$, destination station layer index is $j$, and departure time layer index is $t$. In the same vein, the functions of RMSN, $\mathrm{RMSN}_i$, $\mathrm{RMSN}_{ij}$, and $\mathrm{RMSN}_{ijt}$ are reported in equations (26)-(29). Table 3 shows the results of $\mathrm{RMSE}_i$ and $\mathrm{RMSN}_i$ for each station. Several observations can be made:

(1) On the whole, the $\mathrm{RMSE}_i$ and $\mathrm{RMSN}_i$ values of all the stations are relatively low. The average RMSN is below 3%, which indicates that the proposed TDHFN can provide an effective estimation of passenger flow for urban rail systems.
(2) The $\mathrm{RMSN}_i$ values of some stations are relatively poor, such as those of Tiananmendong, Tiananmenxi, Dongsi, and Ciqikou. A possible reason is that these stations are mainly located near famous scenic spots and shopping areas rather than places where residents live or work. Thus, with morning peak data on working days, the characteristics of the passenger flow at these types of stations cannot be fully captured. In the future, all-day data can be collected to improve the estimation.
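The RMSE and RMSN measures used in this analysis can be sketched as below. The RMSN form shown, the RMSE divided by the mean observed value, is the definition commonly used in the traffic estimation literature and is assumed here to match equation (26); the per-origin, per-OD, and per-interval variants apply the same functions to the corresponding subset of output nodes. The sample values are hypothetical.

```python
import math


def rmse(pred, obs):
    """Root mean square error over paired predictions and observations."""
    n = len(obs)
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / n)


def rmsn(pred, obs):
    """Root mean square normalized error: RMSE divided by the mean observation."""
    n = len(obs)
    return math.sqrt(n * sum((p - o) ** 2 for p, o in zip(pred, obs))) / sum(obs)


# Hypothetical flows for three output nodes; one observed flow is zero,
# which would make MAPE undefined but leaves RMSE and RMSN well defined.
pred = [12.0, 18.0, 0.0]
obs = [10.0, 20.0, 0.0]
print(round(rmse(pred, obs), 4))  # 1.633
print(round(rmsn(pred, obs), 4))  # 0.1633
```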
To explore the estimation results among the passenger ODs, a 3-dimensional surface map of the $\mathrm{RMSN}_{ij}$ matrix is shown in Figure 8(a), where the indices of the origin and destination stations are the x-axis and y-axis, respectively, and the $\mathrm{RMSN}_{ij}$ value is the z-axis. In addition, the contour lines of the $\mathrm{RMSN}_{ij}$ matrix from a 2-dimensional perspective are given in Figure 8(b). Note that the station indices in Figure 8 are the same as the indices presented in Table 3.
Furthermore, we produce a 3-dimensional surface map and a contour graph, as shown in Figure 9, for the specific origin station Chongwenmen. In Figure 9, the departure time, destination station, and $\mathrm{RMSN}_{ij}$ values are the x-axis, y-axis, and z-axis, respectively. The definitions of the departure time indices in Figure 9 are given in Table 4.
From the contour graph in Figure 8, we can see that most of the $\mathrm{RMSN}_{ij}$ values are relatively small. This result indicates that the TDHFN is effective in estimating the OD matrix of urban rail transit passenger flow. However, one point drawn in dark red represents the OD from Tiananmendong to Beijingzhan. The passenger flow between Tiananmendong and Beijingzhan is quite small during the morning peak, which results in a relatively large error.
Most of the points in Figure 9 are drawn with cool colors, which further validates the effectiveness of the proposed method in estimating the time-dependent OD passenger flow matrix. A few points are marked with warm colors; their destination stations include Hepingmen, Beijingzhan, etc. In terms of the time dimension, these data points are mainly concentrated between 7:50 and 8:10.
In addition to the time-dependent OD estimations, the time-dependent travel times for passengers can also be obtained with the TDHFN method. The results for passengers from Chongwenmen to Changchunjie are illustrated in Figure 10.

Comparative Analysis.
The estimation results of the TDHFN are compared with the results of an artificial neural network (ANN). For a detailed introduction to the ANN method, we refer to Remya and Mathew [20] and Mozolin et al. [24]. The input features selected in this paper are obtained from AFC data and the urban rail network topology, including the daily average passenger flow of the origin station, the daily average passenger flow of the destination station, the number of alternative paths, the average travel time, the distance (replaced by the number of sections), the departure time, and the average number of transfers. After training and tuning, we obtained a well-trained ANN model. The network has three layers: the input layer, one hidden layer, and the output layer. The activation functions are ReLU and Sigmoid, and the hidden layer has 5 nodes.
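The shape of the baseline network can be sketched as a forward pass with 7 input features, 5 hidden nodes, and one output. The placement of ReLU on the hidden layer and Sigmoid on the output, the scaling of inputs and target to [0, 1], the random initial weights, and all function names are assumptions of this sketch; the text does not specify them, and the training step is omitted.

```python
import math
import random


def relu(v):
    return max(0.0, v)


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def init_layer(n_in, n_out, rng):
    """Return (weights, biases) with small random weights and zero biases."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def ann_forward(features, hidden, output):
    """One forward pass: ReLU hidden layer, Sigmoid output node (assumed layout)."""
    (w1, b1), (w2, b2) = hidden, output
    h = [relu(sum(w * f for w, f in zip(row, features)) + b)
         for row, b in zip(w1, b1)]
    y = sigmoid(sum(w * v for w, v in zip(w2[0], h)) + b2[0])
    return h, y


rng = random.Random(0)
hidden = init_layer(7, 5, rng)   # 7 input features, 5 hidden nodes
output = init_layer(5, 1, rng)   # single output node

# Hypothetical feature vector in the order listed above, each scaled to [0, 1].
features = [0.6, 0.4, 0.25, 0.3, 0.2, 0.5, 0.1]
h, y = ann_forward(features, hidden, output)
print(len(h), y)
```

Unlike the TDHFN layers, the hidden nodes here have no physical interpretation, which is the contrast drawn in the comparison.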
The comparison results are illustrated in Figure 11 and Table 5, which show that the results of the model proposed in this paper are significantly better than those of the ANN. However, it should be noted that the source of the input data for the ANN is the same as that of the TDHFN model. The performance of the ANN method could be improved if additional data were collected, such as commuter numbers, commuter properties, and land types. In practice, however, such a detailed and comprehensive collection is difficult. The difference between the ANN and the computational graph algorithm is that the former is a black-box model in which the number of neurons, the activation functions, and the number of layers are not fixed in advance, so the method often requires repeated experiments and adjustments to find the optimal model. In the TDHFN, by contrast, the number of neurons, the form of the activation function, and the number of layers are determined values with practical physical significance. Only the weight matrix of each layer is unknown and needs to be determined through learning. Therefore, the computational graph can express more traffic characteristics than the ANN and provides an effective and hierarchical passenger flow estimation.
Finally, the dynamic OD matrix estimation of passenger flow is shown in Figure 12, which shows how the passenger flow of each OD changes over different periods. The dynamic OD matrix estimate can provide basic data for the passenger flow control strategy of urban rail transit.

Conclusions
This study proposed a time-dependent hierarchical flow network for urban rail transit passengers. The OD passenger flow matrix at each time in the subway network can be obtained by inputting the incoming passenger volume of each station during the morning peak to the model. This model can be improved by backpropagation of the first-order gradients and reassignment of the passenger flow with the updated weights between different layers under the deep optimization framework. The result analysis indicates that the TDHFN can provide abundant and hierarchical passenger flow estimation results. A comparative analysis shows that the proposed model can obtain relatively accurate passenger flow estimation results.
Existing dynamic OD estimation methods for urban rail network passenger flow still need improvement in timeliness and accuracy. The most important contribution of this paper is a multilayer hierarchical flow network that applies deep learning research to urban rail.
This method can solve the dynamic OD matrix estimation problem. The flow-oriented prediction formulation can simultaneously estimate different levels of unobserved or partially observed passenger flow variables. Furthermore, when more data sources are available, the method can be expanded hierarchically, which makes it more flexible. To build a theoretically sound modeling framework, this paper traces back to the fundamentals, or low-level representation, of deep learning networks and constructs a transportation-focused computational graph as a structured modeling language. This modeling paradigm enables us to capture the mathematical structure inside the OD matrix estimation problem by representing and decomposing complex composite functions through a graph of current states and numerical gradients. However, the model proposed in this study does not apply to all stations. The model performs better for stations located mainly in areas where residents live or work. By only using the data of the morning peaks over a few working days, we cannot fully determine the characteristics of passenger flow through training. In the future, more comprehensive data should be collected, such as GPS trajectory data [16], land-use data, or point of interest (POI) features [30]. Tang et al. [31] used POI data to uncover the characteristics of travel patterns in the metro network from temporal and spatial dimensions. Based on their study, the stations can be clustered by node significance on the metro network or by the POI features of stations. Thus, the applicability of this model may be improved.

Data Availability
The numerical data used to support the findings of this study are available from the corresponding author upon request.

Disclosure
The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

Conflicts of Interest
The authors declare that they have no conflicts of interest.