A Sparse Optimization Approach for Energy-Efficient Timetabling in Metro Railway Systems

In this paper, we propose a sparse optimization approach to maximize the utilization of regenerative energy produced by braking trains for energy-efficient timetabling in metro railway systems. By introducing the cardinality function and the square of the Euclidean norm function as the objective function, the resulting sparse optimization model can characterize the utilization of the regenerative energy appropriately. A two-stage alternating directionmethod ofmultipliers is designed to efficiently solve the convex relaxation counterpart of the original NP-hard problem and then to produce an energy-efficient timetable of trains. The resulting approach is applied to Beijing Metro Yizhuang Line with different instances of service for case study. Comparison with the existing two-step linear program approach is also conducted which illustrates the effectiveness of our proposed sparse optimization model in terms of the energy saving rate and the efficiency of our numerical optimization algorithm in terms of computational time.


Introduction
With the rapid increase in industry and population in cities, the metro railway system has been playing a more and more important role in urban economic and social development for metropolis due to its high loading capacity and low pollution.Despite the inherent efficiency, the energy consumption of metro railway systems is of a huge magnitude and the efficient energy management for such systems is then of global importance.It is known that the energy consumption of trains is largely influenced by the driving strategy and the arrivals and departures in the timetable, which inspires the extensive study on energy-efficient train control and energy-efficient train timetabling.A careful treatment on such topics can be found in the nice survey paper by Scheepmaker et al. in [1], with focuses on reviewing the mathematical problem formulations and solution approaches in the literature.Related work on energy optimization for rail traffic systems such as tramway systems has also been well studied; see, e.g., [2,3].
Among all the involved energy saving strategies in metro railway systems, an energy-efficient timetable with regenerative braking which schedules the arrival time and the departure time of each train to and from the platforms it visits for the minimization of energy consumption and the maximization of produced regenerative energy utilization, has gained considerable academical attention in recent years.In 2012, Peña-Alcaraz et al. [4] proposed a large-scale mixed integer programming model to maximize the utilization of regenerative braking energy by means of maximizing the total duration of all possible synchronization processes of acceleration and regenerative braking between all possible train pairs.The simulation model is applied to line three of the Madrid metro system and the energy savings of the optimized timetable are reported to be about 7% on average.By merely choosing those train pairs suitable for regenerating energy transfer other than all possible train pairs in [4], Das Gupta et al. developed a more tractable mixed integer programming model in [5].In 2013, Yang et al. [6] presented a cooperative integer programming model to maximize the time overlaps of nearby accelerating and braking trains and applied a genetic algorithm to approximately solve the resulting optimization model.For the case study on Beijing Metro Yizhuang Line, they reported 22.1% and 15.2% increase of effective time overlaps during peak hours and off-peak hours, respectively.A two-objective timetable optimization model for maximizing the regenerative braking energy with consideration of the passenger waiting time was developed by Yang et al. in [7] where the genetic algorithm was again adopted to solve the optimization model.An integrated energy-efficient operation model was proposed by Li & Lo in [8] with the speed profile part to minimize the net energy consumption and the timetabling part with headway constraints to synchronize the accelerating and regenerative braking for the reuse of regenerative energy.About 25% energy savings were reported for Beijing Metro Yizhuang Line with the headway between trains being 90 seconds.Li & Lo further developed an integrated energyefficient timetable and speed profile optimization model to determine the cycle time, the headway time, and the speed profiles for a metro line to minimize the energy consumption [9].The resulting mathematical model is a strictly convex quadratic programming problem with simple bound constraints and an analytical approach based on the first-order optimality condition (i.e., the KKT condition).An integrated timetable and speed profile optimization model with Taylor approximation were also considered by Yang et al. in [10] where the active set method was employed to handle the resulting strictly convex quadratic programming with linear constraints.Under the assumption that trains are operating in the optimal speed profiles, Yang et al. [11] proposed a mixed integer programming model aiming at synchronizing the arrivals and departures of the trains such that regenerative braking energy can be used as much as possible during the whole day service.A two-step linear programming model with the first step to minimize the total energy consumed by all trains and the second step to maximize the utilization of regenerative energy produced by braking trains for a whole day service in metro railway networks was proposed by Das Gupta et al. in [12].To ensure the transfer of maximum possible regenerative energy from braking trains to accelerating trains, a time overlap vector was introduced which requires to be as close as possible to the zero vector.To effectively quantify such a requirement, the ℓ 0norm and the ℓ 1 -norm were employed in [12].By adopting the ℓ 1 -norm heuristic for the original ℓ 0 -norm (i.e., the number of nonzero components in a vector), together with epigraph approach that equivalently transforms the convex ℓ 1 -norm minimization to a linear programming, the resulting relaxation problem in the second step was efficiently solved by Gurobi Optimizer.Numerical experiments on different instances of service PES2-SFM2 of line 8 of Shanghai Metro network spanning a full service period of one day were conducted and a significant reduction in effective energy consumption with the worst case being 19.27% was reported by applying the proposed approach.
The original ℓ 0 -norm minimization problem proposed in [12] is well-known to be NP-hard due to the discontinuity and nonconvexity of the ℓ 0 -norm.Such a type of optimization problems is coined by sparse optimization in the optimization community and has a wide range of applications in compressed sensing, signal processing, statistical regression, financial portfolio, and so on.Various tractable relaxation strategies are proposed to replace the ℓ 0 -norm, such as the popular convex surrogate ℓ 1 -norm [13], and some weighted ℓ 1 -norm variants [14].Numerical algorithms are also developed for the resulting relaxation models, such as the greedy-type methods [15], the first-order optimization methods [16], and second-order methods for some special structured models [17].
Inspired by the development of sparse optimization methods, in this paper, a sparse optimization model with the ℓ 0 -norm and the squared Euclidean norm as the objective function is built to maximize the utilization of the regenerative braking energy.Motivated by the adaptive LASSO model in the high-dimensional statistical regression [14], we propose a weighted ℓ 1 -norm relaxation scheme.A two-stage alternating direction method of multipliers (TSADMM) is then proposed with the first stage for calculating an appropriate weight vector and an initial point for the iterative algorithm in the second stage.To illustrate the effectiveness of our new mathematical model, the linear programming model in [12], a quadratic model merely considering the squared Euclidean norm, and our weighted-ℓ 1 companied with the squared Euclidean norm are applied to Beijing Metro Yizhuang Line for comparison.The numerical results on energy savings show that our model is the best among them.To demonstrate the efficiency of our proposed TSADMM, the academical version of CVX (http://cvxr.com/cvx/) is called to solve our model as well.The numerical results in computation time show that our algorithm outperforms that used in CVX.
The rest of this paper is organized as follows.In Section 2, the sparse optimization model for energy-efficient timetabling in metro railway systems is built.In Section 3, the weighted ℓ 1 norm relaxation model is proposed and a two-stage alternating direction method of multipliers (TSADMM) is designed to get an approximate optimal solution.Theoretical global convergence is established as well.By applying our proposed approach to Beijing Metro Yizhuang Line, we demonstrate the effectiveness of our proposed model and the efficiency of our proposed TSADMM by the numerical results in Section 4. Conclusions and future research are given in Section 5.For reference, the appendix provides the proof of the global convergence theorem for our algorithm.

Model Formulation
2.1.Notation.The notation that will be used throughout the paper is listed in Table 1 for the convenience of the subsequent model formulation.

Constraints.
The set of constraints in the metro train networks including constraints of the trip time, the dwell time, the headway time, the total travel time, and the domain of event times will be described in this subsection, following from [12].
(i) Trip time constraint: it includes the trip time constraints with tracks and with crossing-overs.For the former one, when a train  ∈ T has a trip from platform  ∈ N  to platform  ∈ N  along the track (, ) ∈ K  , the departure time    and the arrival The set of all platforms K The set of all tracks K The set of tracks visited by a train  N The set of platforms visited by a train t K The set of all arcs associated with trip time constraints   (⋅) The indicator function with respect to the set S Π  (⋅) The projection operator onto the set S sign(⋅) The sign function ∘ The componentwise product, also known as the Hadamard product    The transpose of the matrix  V() The column vector generated by stacking all columns of the matrix  time    are bounded by    and  t  , respectively.The corresponding constraints are For the latter one, one knows that a crossing-over connecting two train-lines is a special track, where a train-line is a path with some nonopposite platforms and tracks.The crossing-over is to connect the terminal platform of a train-line with the first platform of another.A train  turns around by a crossing-over and departs from the first platform to travel through the train-line after it arrives at the final platform at another train-line, and then the same train is labelled as   in order to treat it as two different trains [20].Let  represent the set of all crossing-overs.Consider a crossing-over (, ) ∈ , we denote the set of all train pairs associated with a same physical train through the crossing-over (, ) by O  .The trip time to travelling through the crossing-over (, ) ∈  is the time from the departure from the platform  (train is labelled as ) to the arrival at the platform  (train is labelled as   ), which has to stay within a window [    ,     ].And the corresponding constraints can be described as below: (ii) Dwell time constraint: when a train  ∈ T arrives at a platform  ∈ N  , it should dwell there in a time interval denoted by [   , ] to make sure that passengers can get off and get on the train before it departures from the platform .The dwell time constraint can be written as follows: (iii) Connection constraint: an interchange station used by connecting trains is to make passengers transfer between trains on different lines.Any platform pair (, ) ∈ S consists of two platforms at a same interchange station.For a connection train pair (,   ) ∈ C  , we assume that the train  arrives at platform  and another train   departs from platform .The difference of both times can be in a window [    ,     ] to maintain that passengers can get off from the former train and get on the latter.The constraint of connection can be described as follows: (iv) Headway time constraint: similar to the trip time constraints, the headway time constraints includes those with tracks and with crossing-overs.In any subway system, a minimum interval between the departures and arrivals of sequenced trains on the same track is maintained, called headway time.To meet the needs of the passengers, railway networks in many cities designate an upper bound between the departures and arrivals of consecutive trains, in order to ensure passengers do not wait for a long time before the arrival of next train.Define H  as the set of pairs of successive trains that pass by track (, ) ∈ K.
Consider (,   ) ∈ H  , and let [ℎ    , ℎ ] be the time intervals that maintained the departures and arrivals of consecutive trains  and   from and to platform  and platform , respectively.For any (, ) ∈ K between platform  and platform , headway time constraint with tracks can be described as follows: In the other case, consider two consecutive trains  1 and  2 that go through a crossing-over (, ) ∈  after departing from the final platform  of a train-line.
They will be labelled as   1 and   2 when they arrive at the first platform  of some other train-line.Let Õ be the set of all such two-train pairs (( 1 ,   1 ), ( 2 ,   2 )).There is an interval time window [ℎ ] between the departures of trains  1 and  2 from platform  and a window [ℎ ] between the arrivals of trains   1 and   2 at platform .We can then write the headway constraints with crossing-overs as follows: Besides, the headway time is related to the passenger demand and the number of trains.Let   be the number of trains that are in service per hour,   be the train capacity,  be the train utility rate, and  be the passenger demand.Then,  =   ×  ×   [8].Let ℎ be the headway time.Since ℎ = 3600/  , we have Here we assume that  and  are constant while the passenger demand varies.In other words, the headway time of trains changes at different periods in a day.
(v) Total travel time constraint: when a train  ∈ T has a trip, traversing all platforms and tracks in chronological order, it can have the total travel time between   and   , to meet the needs of passengers in subway systems.Total travel time constraints can be written as follows: (vi) Domain of event times: we set zero second as the time of the departure of the first train from a platform in a day when the railway is in service.By adding the maximum possible values of all trip times and dwell times, we can set an upper bound for arrival of the last train at the final platform.And we denote the upper bound by   ∈  ++ .So, we can write the domain of the event time constraints as follows:  where   : R ++ → R ++ with argument (   −    ) being the energy consumption function that is unknown in advance and some best possible affine approximate function in the sense of least-squares by using the measured energy in practice is then utilized.By denoting  fl vec(  ) with the approximate counterpart of (10) can be formulated by the following linear programming: where  = 2|T||N|, K is a subset of {(, ) | ,  = 1, 2, . . ., } that collects all those indices according to constraints ( 1)-( 8), and K  is a subset of K that collects all those indices according to constraints (1)- (2).By utilizing its optimal solution , we can get    's and    's such that For all other constraints, lower and upper bounds are allowed to vary as described in (3)-(9).

Maximizing the Energy Regeneration:
A Sparse Optimization Model.We adopt the speed profile by maximum accelerating, speed holding, maximum braking strategy from [18] with a multi-phase-speed-limit section, as described in Figures 1 and 2.
Let  1 ,  2 , and  3 be the time of maximum accelerating, speed hold, and maximum braking phases, and   and   are the accelerating rate of the accelerating and braking phase.It is known that, with regenerative braking, kinetic energy can be converted into electricity which can be fed back to the power supply system to be used by other nearby trains in the same electric substation.Denote the conversion factor from electricity to kinetic energy as   and the conversion factor from kinetic energy to regenerative electricity as   and the resistance rate at the speed holding phase as .For any two adjacent stations, we adopt the formula from [18] to calculate the energy consumption for accelerating and the regenerative energy produced during braking, denoted by   and   , respectively, in the way as below: Here   and   are both for the unit mass.A simple heuristic method is applied to approximate the power graph as shown in Figure 3, where  and c are the width of the rectangles for accelerating and braking and ℎ and h are heights of the rectangles.Similar to those in [12], the midpoint of the width in the rectangle is called the regenerative or consumptive alignment point, where Δ   represents the distance between    and the regenerative alignment point and ∇   represents the distance between    and the consumption alignment point.
Here, we align the distance between regenerative and consumption alignment points to maximum the utilization.Before introducing such a distance, we recall the suitable train pairs as introduced in [12].Assume that platform pairs are opposite to each other at the same electrical substation.Let P be the set of all platform pairs.For any pair (, ) ∈ P and the set T  ⊆ T contains all trains that arrives at, dwells, departs from platform .When the train  ∈ T  departs from the platform , it is a best choice to find a train  →  at the platform  which are producing the regenerative braking energy using the following definition.
where  is an empirical parameter determined by the timetable designer and is much smaller than the time horizon of the entire timetable [12].Similarly, when a train  ∈ T  arrives at platform , it produces the regenerative energy.
To find a suitable train ←   departing from platform  for transferring the regenerative braking energy, we recall the following definition from [12].
Definition 2 (see [12]).Consider any (, ) ∈ P. ) . ( The components of  represent difference of the time between the alignment points of accelerating and braking.Once the running time is fixed, Δ   and ∇   are two constants.The ultimate energy consumption can be approximately calculated as follows (see Figure 4): where train  is accelerating and train t is braking.

Model Relaxation and Solution Algorithm
3.1.Relaxation Model.Problem ( 21) is generally NP-hard due to the combinatorial property of the involved ℓ 0 -norm [21].By adopting the well-known convex surrogate ℓ 1 -norm initiated in compressed sensing [13,22], we get the following ℓ 1 + ℓ Inspired by Oracle property beneficial from the adaptive lasso strategy proposed by Zou et al. [14], we will further utilize a reweighted ℓ 1 -norm term instead of the ℓ 1 -norm in (24) and get the following relaxation counterpart: where  is some appropriate positive column vector to encourage a low sparsity of .
It is worth mentioning that we can utilize the above ADMM method to solve an ℓ 1 problem which is exactly (26) by taking  =  and  2 = 0 to a relatively low accuracy.The resulting numerical solution ( x, ŷ, ẑ) can not only be served as the initial point, but also produce the weight vector  in the way that  = (1/(| ŷ |  + )) with  ∈ (0, +∞) and  > 0 sufficiently small for a low sparsity promotion when solving (26).The two-stage ADMM algorithmic framework then can be summarized in Algorithm 1.

Numerical Experiments
Numerical experiments on Beijing Metro Yizhuang Line will be conducted in this section to evaluate our proposed model and to illustrate the performance our designed TSADMM algorithm.The procedure of the numerical study is shown in Figure 6.All the computational results are obtained by running Matlab (version 2016b) on a windows desktop (Intel(R) Core(TM) i7-6700 CPU @ 3.40GHz 3.40GHz RAM 16.0 G).
Beijing Metro Yizhuang Line, opened on December in 2010, is one of the 20 lines in Beijing Metro Networks.It has two train lines with 14 stations as shown in Figure 5.The total length of Yizhuang Line is 23.23 km, with the average distance of two station being 1.8 km, the minimum distance being 1 km, and the maximum being 2.6 km.In the section, we calculate the energy conversion when a train has a trip under unit mass.

Inputs and Algorithm
Initialization.For (26),  ∈ R  in the objective function is a 0-1 vector defined in Section 2.4.And the weight vector is denoted by  = (1/(|ŷ|  + )), where ŷ is an approximate solution generated by Stage I in Algorithm 1.Here we set  = 0.5 and  = 10 −20 .In the equality constraint  −  −  = 0,  ∈ R × with  = 2 ,  = |N| = 28,  the number of trains in one day service,  ∈ R  with components (Δ   + ∇ t  ) (,,, t) sorting in the appropriate way, where train  is accelerating and t is braking, and platforms  and  are opposite to each other, satisfying the equation  +  = 29. −  = 0 in (26) is an equality constraint about trip times. ∈ R × is the coefficient matrix in (26),  = ( − 1).The components of  ∈ R  are the trip times. ∈ R × in  −  +  = 0 is the coefficient matrix of inequality constraints, and  = 2(2(−1)+ ++).The elements in the vector  ∈ R  are upper and lower bounds of dwell time constraints, headway time constraints, total travel time constraints, and domain of event times constraints.For testing purpose, elements of  are given in the following way.The bounds of dwell time constraints are given in Table 2. Let  be the passenger demand for one-day service (16 hours) and   ∈ R 16 be the distribution of  for each hour which is presented in Table 3.Let   be the capacity of a train, which takes the value 1440, and  ∈ R 16 be the train utility rate for different hours as stated in Table 3 as well.By utilizing formula (7), we can get the headway time for each different period of time (per hour).Specifically, denote the headway time vector by ℎ ∈ R 16 .Then ℎ  = (3600×1440×  )/(×   ) for all  = 1, 2, . . ., 16. Set the windows for the headway time constraints to be ℎ  ± 8 for all possible train pairs in the  th hour.The total travel time constraints are related Set initial point x 0 , y 0 , z 0 , u 0 ,  0 , s 0 Stage I: Calculate w, (  x,  y,  z) x * to the dwell time and trip time.The bounds of trip times are calculated by Table 4 as reference.For each train , the bounds of trip time are more than ∑ 5 =1 (  /V  ) for all sections (),  = 1, 2, . . ., 27,  =  + 1.The bounds of trip time constraints are presented in Table 5.The following formulae are adopted for calculating the windows of total travel time constraints, where  1 = 1.1 and  2 = 0.85.For the domain of event times constraints, we set   = 16 × 3600.
The initial point ( 0 , V 0 ,  0 ) in Algorithm 1 is chosen to be 0. To calculate the energy consumption and regenerative energy, we set the maximum acceleration   = 0.5/ 2 in accelerating,   = −0.8/ 2 for braking, the resistance acceleration  = −0.05/ 2 during the speed holding phase, the conversion factor from electricity to kinetic energy   = 0.7, and the ratio of kinetic energy converting to regenerative energy   = 0.5 as used in [18].For testing purpose, we approximately calculate the energy consumption for accelerating and the regenerative energy produced during braking in the following way.As presented in Section 2.4, the maximum accelerating, speed holding, maximum braking strategy from [18] is adopted and the parameters  1 ,  2 , and  3 are set in Table 6.The parameters  and c in Figure 3 which represent Table 4: Speed limits of every section on Beijing Metro Yizhuang Line following from [18].
For a given tolerance  2 > 0, we will stop the tested algorithms when both   and   are less than  2 .The algorithms will also be terminated when they reach a given maximum number of iterations  2 .Here we set  2 = 10 −3 and  2 = 10 4 .For Stage I in Algorithm 1, it will be stopped when both   and   are less than  1 or the maximum iteration reaches  1 .Here we set  1 = 10 −1 and  1 = 10 3 .

Experiment Results
. To evaluate our model with ℓ 0 -norm plus ℓ 2 2 -norm as the object, the other two possible models are computed for comparison.The first model is the ℓ 0 -norm plus  where only the squared Euclidean norm is employed to measure the vanishing of .Since we focus on the reduction of the effective energy consumption, we define the energy saving rate  as follows: Energy savings by using these three different models are listed in Table 7 with different choices of .
As shown in Table 7, the energy savings by using our model are better than those by the other two in all the testing instances with at least 2.58 percentage higher in the energy saving rate .
Other choices for the initial trip times from [6,[9][10][11]19] with  = 368 are also considered as inputs.The comparison results in terms of the energy saving rate  are stated in Table 8.
As one can see in Table 8, the utilization of regenerative energy of our approach is the best among these three models  as well with the energy saving rate at least 3.60 percent higher than the other two.These evidently show the effectiveness of our proposed model.
In order to demonstrate the efficiency of our proposed TSADMM algorithm, we call the academical CVX to solve ours model for comparison.The time comparisons for all the aforementioned instances are presented in Tables 9 and 10, respectively.
As can be seen in Tables 9 and 10, our proposed TSADMM outperforms the CVX solver for the model (26) in terms of computation time.Particularly, for the cases of  equaling 368 and 449 in Table 9, the computation time for TSADMM is less than one-third of those for CVX.Besides the theoretical global convergence of our TSADMM, the numerical convergence behaviors in terms of the infeasibility measure for all the above 14 instances are presented in Figures 7 and 8.One can see that for each instance, the infeasibility decreases rapidly as the number of iterations increases and it meets the required accuracy 10 −3 within less than 4000 iterations.

Conclusion
In this paper we have proposed a sparse optimization model using ℓ 0 -norm and squared ℓ 2 -norm as objective function so as to maximize the utilization of regenerative energy in metro railway system.To overcome the NP-hardness resulting from the ℓ 0 -norm, the weighted ℓ 1 -norm by invoking the adaptive lasso strategy in statistical regression has been introduced to make the relaxation counterpart computationally tractable and numerically effective.A two-stage alternating direction method of multipliers (TSADMM) has been designed to efficiently solve the proposed mathematical model and the global convergence of the algorithm has been established as well.Numerical experiments have been conducted on Beijing Metro Yizhuang Line for case study.The comparison results with the linear programming proposed by Das Gupta et al. and a quadratic programming with least-squares loss function have illustrated the effectiveness of our proposed model in terms of the energy saving rate, and the timing comparison results with the CVX toolbox have demonstrated the efficiency of our proposed TSADMM algorithm.Since the weighted ℓ 1 -norm is a convex approximation of the ℓ 0 -norm, the optimal solution to the (26) is a proximal optimal solution to the original sparse optimization model.How to design some more effective nonconvex relaxation scheme to get a better proximal optimal solution so as to get a better energy saving rate will be one of our future work.Meanwhile, the delay perturbations of the metro trains and the randomness of passenger demands have not been considered in our static timetabling.How to use advanced optimization techniques for handling such cases will also be our future research issue.
Natural Science Foundation of China [Grants nos.11771038 and 11431002].

Figure 3 :
Figure 3: Use rectangles instead of energy consumption and regenerative energy.

Figure 4 :
Figure 4: Three cases for net energy consumption function.

Figure 6 :
Figure 6: The procedure of the numerical experiments.

Figure 7 :
Figure 7: The numerical convergence behavior of TSADMM with different choices of .

Table 1 :
A list of notations.The connection window between the train  at the platform  and the train   at platform [12]every train  ∈ T  , the train ←   ∈ T  is called the temporally closest train to the left of  if   ) to be zero or as close to zero as possible.Following from[12], we define an auxiliary variable  ∈ R  with  = 2 ,  = |N| and  = |T| by ); Step 2.3 Set  =  + 1.If some criterion is satisfied to the accuracy  2 , then set ( * ,  * ,  * ) = (  ,   ,   ) and stop.Otherwise, go to Step 2.2.

Table 6 :
Parameters for calculating   and   .width of the rectangles for   and   (in percentage of the trip time) are also given in Table6for approximately calculating the energy consumption for accelerating   and the regenerative energy produced during braking   . the

Table 9 :
Time comparison between CVX and TSADMM with different choices of .