Optimizing Train Routing Problem in a Multi-station High-speed Railway Hub by a Lagrangian Relaxation Approach

As the intersection of multiple high-speed railway lines, the multi-station high-speed railway hub is the key to improve the transport efficiency of the high-speed railway network. This paper focuses on the optimization of the multi-station high-speed railway hub and models it as a train routing problem (TRP). Considering the capacity of railway infrastructures and the demand of passengers, a mixed integer linear programming model is proposed to minimize the total cost of train routes and passenger routes. The optimized train routes include the macroscopic routes between stations and the microscopic track allocation inside stations and Electric Multiple Units (EMUs) depots. A Lagrangian relaxation (LR) approach is developed to dualize the hard constraints and decompose the origin model into train and passenger subproblems, then a shortest path algorithm is designed to solve the subproblems independently. Numerical experiments based on an illustrative railway hub network and a real-world network are implemented to demonstrate the effectiveness of the model and algorithm. The solution results prove that the LR approach can obtain high-quality solutions within an acceptable computational time. Compared with the existing fixed scheme, the optimization scheme can reduce the total cost by 37.18% and utilize the railway lines and tracks more reasonably.


I. INTRODUCTION
As a key component of the transport system, high-speed railways provide an efficient, comfortable, punctual, and safety service for transport passengers, and the high-speed railway network has been gradually formed worldwide. In the railway network, the distribution of railway lines is closely related to the regional development, so the metropolises usually build multiple stations to connect various railway lines, thus forming the multi-station highspeed railway hubs [1]- [3]. A multi-station high-speed railway hub refers to a railway hub composed of multiple high-speed railway lines and passenger stations, such as Berlin railway hub in Germany, Paris railway hub in France and Zhengzhou railway hub in China. As the intersection of multiple railway lines, the multi-station hub needs to deal with trains from different directions, so its efficiency will greatly affect the operation of the railway network [4], [5].
Due to this strategic significance, optimizing the multistation railway hub plays an important role in the high-speed railway optimization, which becomes the main objective of this paper.
At present, the stations in the multi-station railway hub are independent of each other and only deal with trains from the fixed connecting directions [6]. Although this fixed scheme is clear and easy to formulate, it may lead to the imbalance of operation among different stations. For example, the stations on trunk lines are usually in busy and lack of capacity, while the stations on branch lines are idle and waste their capacity. In view of this weakness, some connecting railway lines are constructed between adjacent stations to transform some trains from busy stations to idle stations. However, the transformation of trains are only regarded as an temporary adjustment in the operational level [7] and not included the VOLUME XX, 2017 1 tactical plan. In order to make full use of these connecting lines in the tactical level, we attempt to regard all the stations as a unified system and propose a flexible scheme to cooperate the operation of stations. Specifically, in the flexible scheme, the connecting directions of all the stations are flexible, which indicates each station can handle trains from all directions. And this scheme can make overall use of all stations and it is beneficial to play the scale effect of the multi-station hub.
Based on the flexible scheme, how to route trains from all directions becomes the core problem of the multi-station hub optimization. Train routing problem (TRP) is a typical problem in the railway optimization and it has been well studied by many researchers [1], [5], [7]- [13]. According to the research scopes, the relevant researches include the macroscopic TRP and the microscopic TRP. The macroscopic TRP aims to optimize the train routes between stations in the railway corridors and networks, including the stop patterns and frequencies [4], [5], [14], [15]. In contrast, the microscopic TRP focuses on the train routes inside the individual station, including the track allocation and conflict resolution [16]- [19] . Different from previous researches, the TRP in the multi-station hub should optimize the macroscopic and the microscopic train routes simultaneously. On one hand, the macroscopic train routes between stations and connecting directions should be optimized to minimize the train running cost in the hub. On the other hand, the microscopic train routes inside stations also should be optimized to ensure the reasonable track allocation.
During the process of optimization, many kinds of railway infrastructure capacity will be considered as the constraints [20]. The capacity refers to the maximum number of trains that a railway infrastructure can deal with in a certain time period, such as line capacity and track capacity [21], [22]. Moreover, since we focus on the high-speed railway hub, the capacity of Electric Multiple Units (EMUs) depot is also vital for the optimization. The EMUs depot indicates the yard where the storage and maintenance services of EMUs are carried out [23]- [25], and the storage and maintenance capacity of tracks should be introduced into the formulation.
Furthermore, many recent studies have paid more attention to the passenger demand to improve the service quality of transportation [26]- [30] . As the main service target of highspeed railway hub is the urban residents, the passenger demand should be considered into the optimization. Thus, we attempt to introduce traffic zones to represent the demand of residents in different regions of the city.
Based on the above background, how to integrate the macroscopic TRP and the microscopic TRP considering the capacity of EMUs depots becomes the main challenge of this paper. Thus, we focus on the TRP in the multi-station highspeed railway hub and the main contributions are summarized as follows:1) A mixed integer linear programming model is proposed to minimize the train and passenger cost. The optimized train routes include the macroscopic routes between stations and the microscopic track allocation inside stations and EMUs depots; 2) A Lagrangian relaxation approach is developed to solve the model. We dualize the hard constraints into the objective function and decompose the model into train and passenger subproblems, which can be solved by a shortest path algorithm; 3) Numerical experiments based on an illustrative railway hub network and Zhengzhou high-speed railway hub network are implemented to demonstrate the effectiveness of the model and algorithm.
The remainder of this paper is organized as follows. Section 2 reviews the relevant literatures. Section 3 presents the problem description and Section 4 formulates the mixed integer linear programming model. The details of the LR approach are given in Section 5. Section 6 provides some numerical experiments to verify the model and algorithm. Conclusions and further work are summarized in Section 7.

II. LITERATURE REVIEW
The core problem of the multi-station high-speed railway hub optimization is a TRP combining the macroscopic train routes and the microscopic track allocation. In past decades, TRP has always been a research hotspot in the field of railway optimization, and we summarize relevant literatures as the macroscopic TRP and the microscopic TRP. Moreover, since the EMUs depot is a vital factor of the high-speed railway hub, we also conclude some related researches.

A. THE MACROSCOPIC TRP
The macroscopic TRP focuses on the train routes in the railway networks and corridors, which usually includes the stop patterns and frequencies of trains from the origins and the destinations. Szpigel [31] firstly studied the TRP in singletrack railway lines, they regarded the TRP as the job shop scheduling problem and proposed models to minimize the travel time for each train. Carey and Lockwood [11] developed a mixed integer programming model to optimize the train routes on a double-track railway line at different speeds. Lee and Chen [32] proposed a four-step heuristic algorithm to assign the trains to railway lines and generated feasible train routes according a threshold accepting rule. Jia et al. [2] constructed a bi-level programming model to optimize the train routes of express freight trains on a highspeed railway corridor. A heuristic approach combining the genetic algorithm (GA) and Frank-Wolfe algorithm was developed to solve the bi-level model.
Due to the tight relations between the train routes and timetables, many researchers usually integrated the TRP with the train timetabling problem (TTP) to acquire more reasonable train plans. Yang et al. [14] focused on the integration of the TRP and the TTP on a single-track highspeed railway corridor and proposed a multi-objective mixed integer programming model to minimize the total delay between the origin the actual timetable. Khan and Zhou [15] considered stochastic disturbances into the train timetables and developed a heuristic algorithm to decompose the formulation into a series of train subproblems. Zhou and Teng VOLUME XX, 2017 1 [33] developed an integrated model based on an extended space-time-speed network to optimize the train routes and timetables simultaneously. In this formulation, energy consumption of trains with different speeds were regarded as the part of train traveling arcs, and a Lagrangian relaxation solution with a dynamic programming algorithm was designed to solve the model. C. Zhang et al. [34] proposed a joint optimization of the TRP, TTP and maintenance planning on a double-track high-speed railway corridor, and a heuristic algorithm based on Lagrangian relaxation and dynamic constraint generation was developed. Tan et al. [9] proposed a multi-objective programming model to insert extra train routes into the existing timetable and developed a genetic algorithm to acquire high-quality solutions in the real railway network. Gao and Niu [35] focused on the TRP and TTP of multiple type trains with different speed on a high-speed railway corridor. They relaxed the time windows to generate more flexible train routes and timetables, and an improved algorithm based on the alternating direction method of multipliers (ADMM) is introduced for solving the formulation.
In general, since the scale of the macroscopic TRP usually covers a series of stations, the detailed layout of each individual station is ignored. As for the multi-station railway hub, not only the macroscopic train routes between adjacent stations, but also the detailed track allocation of each station should be optimized. Therefore, the existing macroscopic researches cannot be applied to the multi-station hub directly and we need to combine them with the microscopic researches.

B. THE MICROSCOPIC TRP
The microscopic TRP optimizes train routs inside the individual station, including the inbound route, outbound route and track allocation where the train enters and leaves. Compared with the macroscopic TRP, the microscopic TRP is more detailed and can obtain more accurate train routes and timetables. Lusby et al. [7] made an explicit literature review on the classical formulations and solutions of the microscopic TRP in previous studies. The most common approach is conflict graph methodology, which can be further divided into the node packing problem [36] and the graph coloring problem [37]. But when applied to the large-scale problem, the formulation based on the conflict graph would become complicated and cannot get high-quality solutions in an acceptable time. Therefore, some other methods are proposed to improve the applicability of the model. Carey and Carville [38] developed a heuristic method based on the manual operation of railway operators to route trains at busy complex stations. Rodriguez [39] proposed a constraint programming model for routing trains through railway stations. D'Ariano et al. [13] considered disturbances in train timetable and proposed an alternative graph model to resolve conflicts between the microscopic train routes inside stations. Pellegrini [40] also considered disturbances in complex stations and represented the capacities of infrastructure with fine granularity. A solution framework called RECIFE-MILP was developed to obtain feasible solutions quickly. Meng and Zhou [12] developed an integer programming model using space-time network and cumulative flow variables to minimize the total deviation time of trains on an N-track network. A solution framework based on Lagrangian relaxation was designed to decompose the model into train least-cost subproblems, which can be solved by the dynamic programming algorithm. Qi et al. [41] integrated the layout of stations into train timetables and proposed a bi-level programming model to minimize the total travel time of all trains, where the upper level determined whether to design new siding tracks and the lower level routed trains. Z. Zhang et al. [42] considered the braking failure of trains and proposed a dynamic using strategy of successive routes.
In recent years, some researchers introduced the maintenance planning problem (MPP) into the microscopic TRP and TTP, which can make the train track allocation and timetable more practical for the railway dispatchers. Luan et al. [19] focused on the integration of the TRP, TPP and MPP on a railway hub. By introducing cumulative flow variables and representing the maintenance task as virtual trains, they proposed a mixed integer linear programming model based on a space-time network. A Lagrangian relaxation approach was developed to decompose the origin formulation. Y. Zhang et al. [17] also proposed a mixed integer linear programming model to minimize the total train travel time and maintenance cost on a railway line consisting of multiple stations, and further added speed restriction constraints into the model. A heuristic iteration algorithm was designed to decompose the model into train subproblems. Q. Zhang et al. [8] predefined the track maintenance task inside stations firstly, and then proposed a binary integer programming model to minimize the number of train cancellations and the weighted train travel times on several adjacent stations. An ADMM-based approach was developed to solve the model.
To sum up, the researches on microscopic TRP further detail the layout of stations and obtain the track allocation of trains inside stations. However, the previous researches just focused on a single station or several adjacent stations of a single railway line, as for the multi-station hub with multiple stations and railway lines, the current methods are not fully applicable. Furthermore, with the increasing importance of service quality, more attention should be paid for the passenger demand in the railway optimization. By referring to the methods of recent researches [29], [43]- [45], we attempt to represent the passenger demand of different regions in the city as several independent traffic zones.

C. THE EMUS DEPOT
The EMUs depot is the infrastructure where the EMUs are stored and maintained, which is an essential for the high-speed railway hub and usually located next to the station. Recently, some researches have begun to introduce the storage and maintenance capacity of the EMUs depot into the optimization of train plans [25]. This article has been accepted for publication in IEEE Access. This is the author's version which has not been fully edited and content may change prior to final publication.  proposed a model to optimize the track utilization of the storage, maintenance and washing area in the EMUs depot. Shi and Li [24] further studied the shunting operation plan among various areas in the EMUs depot. Two integer linear programming models were formulated to minimize the total delay time of trains and a variable neighborhood search algorithm was designed to solve the models. Wang [23] extended the problem scale to the multiple depots in the highspeed railway networks. The sharing strategy and EMUs' running data among different depots were introduced into the optimization of operation schemes. Li et al. [3] integrated the maintenance planning of EMUs into the microscopic TRP and proposed two related integrated programming models to optimize the circulation plans of EMUs in a high-speed railway network. A two-stage algorithm based on column generation was developed to solve the models. As for the multi-station high-speed railway hub studied in this paper, the EMUs depot is an important part of the hub and it has a great effect on the train operation scheme. Thus, we should refer the previous researches and introduce the capacity of the EMUs depot into the optimization. In summary, although many existing literatures have paid much attention on TRP, there is few studies focusing on the multi-station high-speed railway hub. Due to the special characteristics of the multi-station hub, the methods of previous researches are not fully applicable. Specifically, the optimization of the multi-station high-speed railway hub should not only focus on the macroscopic train routes among stations in the hub, but also detail the microscopic track allocation inside each individual station. Besides, the EMUs depot capacity should be considered, which is also significant to obtain more practical optimization scheme. Therefore, we need to integrate the macroscopic TRP, the microscopic TRP and the capacity of the EMUs depot in the optimization of the multi-station high-speed railway hub. The methods of the previous researches and this paper are listed in Table 1.

III. PROBLEM DESCRIPTION
In this section, the optimization of the multi-station high-speed railway hub is defined as a TRP under the capacity of railway infrastructures, and the detailed train routes include the macroscopic routes between stations and the microscopic track allocation inside stations and EMUs depots. Based on the above modeling ideas and actual railway structures, we construct a railway hub network ( , ) G V E = to represent the multi-station high-speed railway hub, in which V refers to the set of nodes and E refers to the set of arcs. The layout of the railway hub network is shown in Figure 1  nodes to express the depots, but regard them as a part of the stations.
2) Set of arcs The set of arcs L refers to the double-track high-speed railway lines, indexed by ( , ) ij, includes: (a) the arcs between connecting directions and stations; (b) the arcs between stations.

3) Set of tracks
The set of operational tracks G refers to the operational tracks for receiving and departing trains in the station, indexed by g , and a GG =  , a G represents the operational tracks of the station a .
Furthermore, as mentioned above, the track of the EMUs depot is also regarded as a part of the station. The set of storage tracks a MM =  indicates the storage tracks for the train in the depot of stations, indexed by m , and the set of maintenance tracks a NN =  indicates the maintenance tracks for the train maintenance in the depot, indexed by n .

4) Set of traffic zones
The set of traffic zones P indicates the passenger aggregation nodes in different regions, indexed by p . These traffic zones can qualify the passenger demand of certain areas and usually used in the urban transportation researches [47], [48].

5) Set of trains
The set of trains K refers to all EMUs operated at the stations of the railway hub, indexed by k , includes: (a) set of Notably, the departure and passing trains 13 k K K  U just require the basic operations at the operational tracks gG  . But the arrival trains 2 kK  will be stored at the storage tracks mM  in the depot, and part of them will also be maintained at the maintenance tracks nN  .
Based on the flexible scheme, all trains can be operated at any station of the railway hub, so we set up virtual origins and destinations, which can be shown in Figure 2. For the departure trains, by taking train 1 k as an example, since it may

6) Capacity of arcs and tracks
The capacity of arc refers to the maximum number of trains can pass through a railway line within a certain time period. Specially, the capacity of virtual arcs is set to 1. The capacity of track refers to the maximum number of trains can be operated at the track in a time period, includes the operation capacity of operational track, the storage capacity of storage track and the maintenance capacity of maintenance track.
Based on the high-speed railway hub network p to station a , , p P a A  . Considering all kinds of capacity, we aim to minimize the total cost of all trains and passengers, which consists of the train running cost on railway lines, train operation cost on operational tracks, train storage cost on storage tracks, train maintenance cost on maintenance tracks and passenger travel cost from traffic zones to stations. The related notations are listed in Table 2.

IV. MODEL FORMATION
Based on the multi-station high-speed railway hub network, we propose a mixed integer linear programming model by using the multi-commodity flow modelling method, which aims to minimize the total cost of trains and passengers.

A. ASSUMPTIONS
Without loss of generality, some principal assumptions are set up for the formulation: This article has been accepted for publication in IEEE Access. This is the author's version which has not been fully edited and content may change prior to final publication. 1) Some information of the trains is pre-determined. Specifically, for the departure trains, the destinations are given and the origins are flexible. For the arrival trains, the origins are given and the destinations are flexible. For the passing trains, both the origins and the destinations are given.
2) The non-stopping trains are not considered, which indicates the trains do not stop at any station and pass through the hub directly. This is because these trains are not involved in the track allocation and their occupation of railway lines is relatively small.
3) All the trains only stop at any station in the railway hub once. Besides, all the arrival trains will be stored and part of them need maintenance.
4) The cost of tracks is diversified, which is mainly based on the positions from the entrance of the yard. The cost rises as the distance from the entrance increases.
5) The route of the trains entering and leaving the EMUs depot is not included, which is relatively short for the entire route and can be ignored. 6) Only the passengers departing from the traffic zones of the hub are considered, while the passengers arriving at or passing through the hub are not included.
7) The building scale of each station is enough to accommodate passengers boarding at this station.
8) The capacity of other infrastructures in the high-speed railway hub is sufficient.

B. DECISION VARIABLES
In our formulation, we optimize the train and passenger routes simultaneously. The train routes include the macroscopic routes between stations and the microscopic track allocation inside stations and depots, while the passenger routes refer to the travel routes from the origin traffic zones to the boarding stations. Thus, we define five types of decision variables to represent the detailed routes and they are listed in Table 3.
Integer, representing the number of passengers from traffic zone p to connecting direction b and boarding at station a , ,,

C. FORMULATION
By using the multi-commodity flow modelling method, a mixed integer linear programming model M1 is proposed to minimize the total cost of trains and passengers, the objective function is represented as follows: In (1), the objective function includes the train cost and passenger cost. The train cost indicates the sum of train running cost, operation cost, storage cost and maintenance cost. The train running cost equals the length of train routes multiplied by train running cost per kilometer 1 w , and the other cost refers to the specific cost on each kind of track, which can be represented by 2 3 4 , and g m n w w w . The passenger cost equals the length of travel routes multiplied by passenger travel cost per kilometer 5 w .
Then, the constraints of the model are represented as follows: ,, 1, 0 , and , , 1, (2) is the flow balance constraint for nodes in the railway network, which is the basic constraint of the multicommodity flow model. 1 , , Constraints (3) Constraint (6) ensures each arrival train can only be stored once at any storage track of any depot. Similarly, constraint (7) indicates the arrival train can only be maintained once at maintenance track if it needs maintenance.
, , , , Constraints (8)-(10) indicate the station where the train is operated, stored and maintained must be on the train route.
, , This article has been accepted for publication in IEEE Access. This is the author's version which has not been fully edited and content may change prior to final publication.
Constraints (14)- (16) are the track capacity constraints. Constraint (14) refers to the operation capacity constraint, which ensures the number of trains operated at operational track g is less than the capacity of track g . Similarly, constraint (15) refers to the storage capacity and constraint (16) refers to the maintenance capacity.
, , pab pb aA Constraint (17) indicates the distribution of the passengers, which ensures the number of passengers boarding at all stations is equal to the passenger demand from traffic zone p to connecting direction b . 13 , , Constraint (18)  x y z u r N +  (19) Constraint (19) indicates the domains of the decision variables.

V. SOLUTION METHOD
In previous researches, Lagrangian relaxation (LR) has been widely used in the railway optimization and proved to be efficient for the TRP. The basic principle of LR is to dualize the hard constraints of the model and move them into the objective function by multiplying non-negative Lagrange multipliers, then the dualized model can be decomposed into a series of independent subproblems which can be solved efficiently [4], [12], [33], [34].
According to the characteristics of model, we attempt to develop an LR approach to solve the model M1. In the approach, we first dualize the hard constraints of model M1 and penalize them into the objective function. Then, we further decompose the dualized model into two series of train and passenger subproblems and apply the shortest path algorithm to solving these subproblems. Notably, sometimes LR may fall into the symmetry issues [4], [8], which will limit the algorithm into a loop and affect the improvement of the solution quality, so we try to diversify the subproblems by updating the cost of arcs and tracks.

A. THE SOLUTION FRAMEWORK OF LR
After analyzing the constraints of model M1, it can be conducted the capacity constraints (13)(14)(15)(16) lead to a large number of train combinations, and the coupling constraint (18) integrates the train and passenger variables. Therefore, we regard these constraints as the hard constraints and dualize them into the objective function with five sets of non-negative Lagrange multipliers   17), (19). For the sake of simplification, the objective function of the dualized model M2 can be rewritten as follows:  Since the constant term of (21) is unconcerned with the decision variables, it can be ignored and the dualized model M2 can be decomposed into two series of independent train and passenger subproblems, which are represented as model M3 and M4 as follows: Subject to constraints (2)-(12), (19).  (23) Subject to constraints (17), (19). This article has been accepted for publication in IEEE Access. This is the author's version which has not been fully edited and The train subproblem M3 aims to minimize the total cost of arcs and tracks for each independent train, which is a shortest path problem with specific constraints and can be solved by a shortest path algorithm based on the Floyd algorithm. At the meanwhile, the passenger subproblem M4 aims to minimize the total cost of passengers. After solving all the train and passenger subproblems, we substitute the train and passenger solutions into (20) to generate the lower bound of model M1.
As for the upper bound, we schedule the train and passenger routes according to the train profits by using a priority scheduling algorithm.
In the iterative part of the algorithm, we use the most common sub-gradient method to update the Lagrange multipliers iteratively, while the step size  is simultaneously updated with iteration. Denote q as the iteration number and the related formulas are expressed as (24)-(29): ( 1/ ( 1) q  =+ (29) The termination conditions of the algorithm include the following three rules: (1) the gap between best upper bound and best lower bound is within an acceptable range; (2) the values of best upper bound and best lower bound remain unchanged in consecutive iterations; (3) the value of iteration q exceeds the maximum max q .
Above all, the solution framework of LR is summarized in Algorithm 1 and the flowchart is shown in Figure 3. Step 2: Solve the dualized model M2 and update the best lower bound Step 2.1: Decompose the dualized model M2 into two series of subproblems, including train subproblems M3 and passenger subproblems M4.
Step 2.2: Apply the shortest path algorithm to the subproblems to generate train routes and passenger routes, then calculate the current lower bound

Step 4: Update the Lagrange multipliers
Update the iteration 1 qq =+; Using the sub-gradient method to update the multipliers by (24)-(28); Update the step size  by (29).

Step 5: Termination conditions test
As long as one of the three termination rules is satisfied, stop the algorithm and output the best upper bound * UB , the best lower bound * LB , the current gap and best feasible solution; Otherwise, loop back to Step 2.

B. THE SHORTEST PATH ALGORITHM
The subproblem M3 of train k can be regarded as a shortest path problem from the origin k o to the destination k d with operation, storage and maintenance constraints. Since there are multiple pairs of origins and destinations in the train subproblems, we choose the Floyd algorithm rather than the Dijkstra algorithm to calculate the train shortest routes [12]. Then, as for the constraints of tracks, we add the minimum track cost of each station into the cost of shortest train routes passing through the station, and select the train route with the least cost of arcs and tracks. Furthermore, since the cost of arcs and tracks is the same for all trains in each iteration, some trains will concentrate on the same route, which is also called the symmetry issues of LR [4], [8]. The symmetry issues may make LR fall into local optimal solution and is not conducive to improve the solution quality. Therefore, we try to break the issues by diversifying the cost. Specifically, we update the occupation of arcs and tracks after solving each train subproblem, once an arc or track is full of capacity, a penalty cost  will be added into the cost of the arc or track to avoid more occupation of other trains.
As for the passenger subproblem M4, we compare the travel cost from the origin traffic zone to different stations and select the least cost travel routes.
The solution framework of the shortest path algorithm is summarized in Algorithm 2 as follows: Algorithm 2. The solution framework of the shortest path algorithm.
Step 1: Apply the Floyd algorithm to generate the shortest routes between nodes Step 1.1: Construct the adjacency matrix between any two nodes , ij in V .
According to (22) Step 1.2: Update the shortest routes and previous nodes.

C. THE PRIORITY SCHEDULING ALGORITHM
As for the feasible solutions, referring to the methods of previous researches [12], [29], we also develop a heuristic algorithm based on the priority of trains. In the priority scheduling algorithm, we first sort the trains according to the Lagrangian profit in (30), which represents the violation of dualized constraints. Afterwards, we schedule the trains successively, once an arc or a track is full of capacity, it cannot be occupied by the following trains. Finally, we compute the passenger carrying capacity of each station and generate the optimized passenger routes.
The solution framework of the priority scheduling algorithm is summarized in Algorithm 3 as follows: Algorithm 3. The solution framework of the priority scheduling algorithm.
Step 1: Train priority ranking Rank the trains according to the decrease of the train Lagrangian profits, which refers to the violation of dualized constraints (13)(14)(15)(16) and (18).
Step 2: Generate the feasible train routes Step 2.1: Choose the train k with the highest priority and obtain its route through using the shortest path algorithm (Step 2 of Algorithm 2). If there is no feasible route for train k , terminate the algorithm and output no feasible solutions.
Step 2.2: Update the occupation of arcs and tracks, once an arc or a track is full of capacity, it cannot be occupied by the following trains.
Step 2.3: If all trains have been scheduled, calculate the current passenger carrying capacity of each station for each connecting direction. Otherwise, move to Step 2.1.

Step 3: Generate the feasible passenger routes
Considering the passenger carrying capacity of stations, assign passengers to the nearest station.

VI. NUMERCIAL EXPERIMENTS
In this section, we design two numerical experiments to demonstrate the effectiveness of the model and algorithm. Firstly, an illustrative railway hub network is constructed and we compare the solutions obtained by the LR approach with the optimized results obtained by the commercial solver GUROBI. Secondly, we further apply the model and algorithm to a real-world network based on Zhengzhou highspeed railway hub in China, then evaluate the validity of the optimized scheme.

1) DESCRIPTION OF THE ILLUSTRATIVE NETWORK AND RELATED PARAMETERS
The illustrative multi-station high-speed railway hub network is shown in Figure 4, which consists of two passenger stations with the corresponding EMUs depots, four connecting directions and two traffic zones. Specifically, each station has two operational tracks, two storage tracks and two maintenance tracks, the capacity and cost of tracks is listed in Table 4. In Table 5, the capacity and lengths of arcs between nodes are listed. Except the double-track railway liens, the travel routes between traffic zones and stations are also included. The numbers of passengers from the origin traffic zones to the target directions are listed in Table 6. In order to make the effect of train cost and passenger cost on the objective function in an order of magnitude, the values of other basic parameters are set as follows: per train running cost on the railway lines 1 w is set to 1 RMB/(train · km), per passenger travel cost 5 w is set to 25 RMB/km and passenger carrying capacity per train is set to 100 persons/train.   g1  a1  10  10  G  g2  a1  10  5  G  g3  a2  10  10  G  g4  a2  10  5  G  m1  a1  6  10  M  m2  a1  6  5  M  m3  a2  6  10  M  m4  a2  6  5  M  n1  a1  4  10  N  n2  a1  4  5  N  n3  a2  4 a1  80  5  b3  a2  80  5  b4  a2  80  5  a1  a2  80  5  a1  b1  80  5  a1  b2  80  5  a2  b3  80  5  a2  b4  80  5  a2  a1  80  5  p1 a1  )  p1  b1  30  p1  b2  30  p1  b3  30  p1  b4  30  p2  b1  30  p2  b2  30  p2  b3  30  p2 b4 30

2) RESULTS OF THE LR APPROACH
In order to demonstrate the effectiveness of the LR approach, we design four cases with different number of trains based on the illustrative network, the details of these cases are listed in Table 7. Furthermore, for testing the solution quality of the results, we regard results obtained by a commercial solver GUROBI as benchmarks, which has been proved to be efficient in many optimization problems [17], [20]. The solution results of the LR approach are summarized in Table 8, where the best lower bounds (LB), the best upper bounds (UB), the gaps between the best upper and lower bounds (GAP1), the gaps between the best upper bounds and the optimal values (GAP2) and the computational times (CTs) are reported. The optimal values obtained by GUROBI (OP) are also included as benchmarks. Besides, in order to describe the iterative process of the algorithm, the convergence curves of LR are drawn in Figure 5. Both the LR approach and GUROBI are implemented in Python on the PyCharm platform, and all the cases are carried out on a Windows desktop computer with i7-7700@2.8Ghz CPU and 16GB RAM.  According to the results in Table 8 and Figure 5, the following conclusions can be drawn: 1) The LR approach can obtain solutions with good quality. In Table 8, it can be seen that the GAP2 of the four cases are all 0%, which also means the best feasible solutions generated by LR are same to the optimal solutions generated by GUROBI. Furthermore, the average GAP1 of the four cases is 6.87%, the minimum is 0% and the maximum is 14.41%. These indicators are all in an acceptable range and prove the good convergence of the algorithm.
2) The LR approach has a good performance in solving efficiency. First, the computational times of LR in the four cases is all short, the minimum is just 0.13s and the maximum is 53.00s. Second, as shown in Figure 5, it can be seen that LR can obtain the best lower bounds quickly with several iterations. Besides, the numbers of iterations where the best This article has been accepted for publication in IEEE Access. This is the author's version which has not been fully edited and content may change prior to final publication. upper bounds are generated are marked as red circles in Figure  5, the minimum is 1, the maximum is 64 and the average is 32, which indicate the LR can generate good feasible solutions in short iterations.

Zheng-Wan Yard
This article has been accepted for publication in IEEE Access. This is the author's version which has not been fully edited and content may change prior to final publication.   Table 10 indicates the capacity and cost of tracks. Then, the numbers of trains between origins and destinations are shown in Table 11, the origins of the departure trains and the destinations of the arrival trains are represented as Z. The numbers of passengers from traffic zones to directions are shown in Table 12, and the values of other basic parameters are listed in Table 13.

2) OPTIMIZED RESULTS
After substituting the real data into the proposed model, we implement the LR approach to obtain solutions by Python on PyCharm. Specifically, the objective value is 8151020 kRMB, in which the train running cost on railway lines is 3194320 kRMB, the train operation cost on tracks is 69980 kRMB, the train storage cost is 26720 kRMB, the train maintenance cost is 30000 kRMB and the passenger cost is 403000 kRMB. Due to the large number of trains, it is troublesome to list all detailed train routes, so we take 74 trains of connecting direction b4 as examples and represent their routes in Table 14.
2) The utilization of arcs is relatively balanced in the whole network. In Figure 7, the average capacity utilization rate of arcs is 18.64% and 91.30% of the arcs with the rates less than 60%, which proves the railway lines still have enough capacity for the future trains. Notably, the utilization rate of arc (a2, b7) reaches 71.67%, so more parrel railway lines need to be built between Xu-Lan yard and Xi'an.
3) The utilization of tracks in stations and depots is also reasonable. In Figure 8, the average capacity utilization rate of the operational tracks is 52.12% and the gaps among different stations are relatively small, which illustrates operation capacity of stations is sufficient and the division of labor is fair. Besides, the average utilization rate of maintenance tracks is 14.30%, which means the maintenance capacity of the hub is even affluent for more trains. However, the average utilization rate of storage tracks is 77.53% and many tracks are even full of capacity, so we need to construct more storage tracks in the depots, especially for the Zhengzhou East station.

3%)
This article has been accepted for publication in IEEE Access. This is the author's version which has not been fully edited and content may change prior to final publication.  Table 15, it can be seen the passengers of different traffic zones can board at the nearest stations, which indicates the optimization scheme can meet the travel demand of residents in different regions of Zhengzhou.

3) DISCUSSION
At present, the fixed scheme is widely used in the multi-station railway hub, in which the stations of the hub only deal with trains of fixed connecting directions. Due to the simplicity and clarity, this scheme is convenient for the railway operators, but it may limit the overall capacity of the hub and cannot make full use of all stations. Thus, we compare the optimized train routes with that of the fixed scheme to demonstrate the effectiveness of the optimization scheme.
The results of the optimization scheme and the fixed scheme are presented in Table 16, including the total cost, train cost, passenger cost, average capacity utilization rates of arcs and operational tracks. The capacity utilization rates of the storage tracks and maintenance tracks in different depots are shown in Table 17. Moreover, for the sake of visibility, we represent the utilization of arcs of the fixed scheme in Figure  9 and express the utilization of operational tracks in Figure 10. The distributions of the arc and operational track capacity utilization rates are represented as pie charts in Figure 11. From the comparison results in Table 16, 17 and Figure 9 -11, the following observations can be obtained: 1) In Table 16, the total cost of the optimization scheme is 37.18% lower than the cost of the fixed scheme. Specifically, the train cost decreases by 3.66% and the passenger cost decreases by 58.28%. These results illustrate the optimization scheme can effectively optimize the train routes and facilitate the passengers.  2) Compared with the fixed scheme, the utilization of arcs in the optimization scheme is more balanced. Although the average utilization rates of arcs are almost similar, the variance of the optimization scheme is 0.0255, which is lower than 0.0348 of the fixed scheme. Furthermore, in Figure 9, it can be seen there are more arcs with high utilization rates in the fixed scheme. For examples, the utilization rate of the arc (a3, b3) is 64%, which is difficult to operate new trains, but the tense situation is mitigated in the optimization scheme.
3) In the fixed scheme, the gaps between the operational track utilization rates of different stations are obvious and the variance reaches. In Figure 10, Jing-Guang yard and Xu-Lan This article has been accepted for publication in IEEE Access. This is the author's version which has not been fully edited and content may change prior to final publication. yard of Zhengzhou East station are much busier than the other stations, which will limit the introduction of more new trains and waste the capacity of other idle stations. On the contrary, the optimization scheme has a more balanced utilization of operational tracks, which is more conducive to playing the scale effect of the multi-station hub.  Table 17, it can be seen the storage capacity of the fixed scheme is insufficient, such as the utilization rates of Jing-Guang yard even exceeds 100%. While in the optimization scheme, the trains to be stored in the depots are assigned more reasonably. Besides, since the capacity of maintenance is sufficient, the utilization rates of the two schemes are similar.

VII. COLUSIONS AND FURTHER WORKS
In this paper, we focus on the train routing problem in a multistation high-speed railway hub. Considering the capacity of railway lines, stations, EMUs depots and the demand of passengers, a mixed integer linear programming model is proposed to minimize the train cost and passenger cost. In the process of optimization, both the macroscopic train routes between stations and the microscopic track allocation inside stations and EMUs depots are optimized simultaneously. According to the characteristics of the model, an Lagrangian relaxation approach with the shortest path algorithm is developed to dualize the hard constraints and decompose the model into two series of subproblems. The numerical experiments verify the effectiveness of the model and algorithm. In the illustrative network, the LR approach can obtain optimal solutions within an acceptable time. In the realworld network, the LR approach still can find high-quality solutions and generate a feasible scheme. Compared with the existing fixed scheme, the total cost of the optimization scheme is decreased by 37.18% and the utilization of arcs and tracks is more reasonable.
For the sake of simplicity, we just regard the EMUs depot as a part of the station and ignore the routes between the stations and the EMUs depot, which are usually the throat of the depot optimization. Thus, we will attempt to introduce these routes into the TRP of the high-speed railway hub in our future work. Furthermore, although we diversify the train subproblems to break the symmetry issues in LR, the convergence of the algorithm still can be further accelerated, and we can use some techniques to improve the lower bounds.