A Self-Organizing Policy for Vehicle Dispatching in Public Transit Systems with Multiple Lines

In this paper, we propose and analyze an online, decentralized policy for dispatching vehicles in a multi-line public transit system. In the policy, vehicles arriving at a terminal station are assigned to the lines starting at the station in a round-robin fashion. Departure times are selected to minimize deviations from a certain target headway. We prove that this policy is self-organizing: given that there is a sufficient number of available vehicles, a timetable spontaneously emerges that meets the target headway of every line. Moreover, in case one of the vehicles breaks down, the remaining vehicles automatically redistribute over the network to re-establish such a timetable. We present both theoretical and numerical results on the time until a stable state is reached and on how quickly the system recovers after the breakdown of a vehicle. These promising results suggest that our self-organizing policy could be useful in situations where centralized dispatching is impractical or simply impossible due to an abundance of disruptions or the absence of information systems.


Introduction
Self-organizing strategies are a promising concept to increase the resilience of urban public transit systems.
In such a strategy, the concept of a schedule or timetable is abandoned. Instead, departure times and/or destinations of vehicles are determined locally at stations according to an easy-to-implement policy. In the absence of perturbations, an adequate self-organizing policy causes the system to converge to some preferable state, typically a periodic repetition of services with constant headways (the time between consecutive services). As a result, the impact of disruptions always dies out spontaneously, without intervention by a central controller. For the problem of dispatching vehicles in a multi-line system, we propose and theoretically analyze an easy-to-implement decentralized dispatching policy.
In our policy, every terminal station maintains a cyclic ordering of its outgoing lines and keeps track of the most recent departure times of these lines. Vehicles arriving at the station are assigned to the outgoing lines in round-robin fashion, according to the cyclic ordering. The departure times of vehicles are chosen such that deviations from the target headway are minimized.
Our main contribution is that we prove that our policy is self-organizing, leading to emergent behavior.
Once converged, the decentralized policy matches the performance that can be achieved under centralized control. As long as the number of vehicles is large enough to perform a schedule meeting the target headways, our policy guarantees convergence to such a schedule. This result holds regardless of the initial locations of the vehicles. As a consequence, even when one of the vehicles breaks down or a bus returns to the depot at the end of the driver's shift, the remaining vehicles spontaneously redistribute over the network to again meet the target headway of all lines. In numerical experiments we illustrate that this happens rapidly, such that the impact of a disruption is quickly absorbed. In case the number of vehicles is not sufficient to meet the target headways using a centralized approach, we prove that our policy keeps the headways, on average, as small as possible given the number of available vehicles. Finally, we also derive upper bounds on the largest headway that can occur and the stabilization time.
The remainder of this paper is structured as follows. In Section 2, we describe the problem setting and explain the policy. In Section 3, we discuss related literature. In Section 4, we theoretically analyze the performance of the policy. In Section 5, we discuss the results of a series of experiments that illustrate the practical performance of the policy. Finally, we conclude the paper in Section 6.

Problem Setting and Notation
We represent the public transit system by a directed network $G = (S, L)$, where $S$ is the set of terminal stations and $L$ the set of lines. The intermediate stations are not relevant for our policy and are therefore not included in the network. We assume that the network is symmetric, such that for every line $(s \to s') \in L$, the reverse line $(s' \to s)$ is also an element of $L$. Furthermore, we assume that $G$ is connected (otherwise the connected components can be considered separately). Every line $l \in L$ is characterized by a travel time denoted by $t_l$. We allow for asymmetric travel times, so the travel times of a line and its reverse line are not required to be equal. Every line has the same target headway, which we denote as $H$. In other words, the goal is to operate each line every $H$ time units. We assume that all travel times and $H$ are integers. We let $\delta^+(s)$ and $\delta^-(s)$ denote the sets of lines originating and terminating at $s \in S$, respectively. We assume there is a fixed number of vehicles available in the system, which we denote as $n$. At the moment of initialization, all vehicles are at stations. Vehicles are allowed to switch between lines at the terminal stations, but are not allowed to deadhead (drive without passengers). Therefore, after a vehicle performs line $(s \to s')$, the next line the vehicle is assigned to must be an element of $\delta^+(s')$. To meet all the target headways, one needs at least $n^*$ vehicles, with $n^* = \sum_{l \in L} t_l / H$.
In general, $n^*$ may be fractional, so it can be rounded up to the next integer to obtain a stronger bound.
Furthermore, this bound on the number of required vehicles does not depend on whether the system is operated using a centralized or a decentralized approach. Although operators will naturally choose a target headway that is feasible given their fleet size, we consider both the case where $n \ge n^*$ and the case where $n < n^*$, as the latter may be relevant when there is a breakdown of a vehicle or travel times are longer than anticipated.
A visual illustration of the problem setting is provided in Figure 1, depicting a network of four stations, four lines and four vehicles. In this example, $n^* = 3.8$, so at least four vehicles are necessary to meet the target headways.

Policy Definition
We now propose a policy for dispatching vehicles at a terminal station. The policy determines the next line and the next departure time of a vehicle arriving at a terminal station. In the policy, the lines starting at a terminal station are selected in round-robin fashion, according to a fixed (but arbitrary) cyclic order.
Departure times are based on the previous departure times of the lines, which are assumed to be known at the station. The departure time is taken to be the maximum of the target departure time, which is equal to the sum of the previous departure time and the target headway, and the current time (as it is not possible to depart in the past). Note that any minimum required time between services can be incorporated in the definition of the travel times, so we assume without loss of generality that an arriving vehicle can depart immediately.
As an example, suppose a vehicle arrives at station $s_2$ from Figure 1 at 9:10. Table 1a displays the relevant information at $s_2$ at this time, indicating which line should be performed next, the previous departure times of the lines starting at $s_2$ and the target departure times. Our policy assigns the arriving vehicle to line $(s_2 \to s_4)$, as it is indicated that this line should be performed next. Naturally, this is also the line whose previous departure lies furthest in the past. As the target departure time has already passed, the departure time is set at 9:10. Table 1b shows the updated information after the departure. Note that the target departure time of line $(s_2 \to s_4)$ is now equal to 9:40, as it is only based on the most recent departure time. Suppose that the next arrival occurs at time 9:15. The policy assigns the arriving vehicle to line $(s_2 \to s_1)$. As the target departure time of line $(s_2 \to s_1)$ is 9:20, the policy instructs the vehicle to wait for 5 minutes and depart exactly at 9:20.
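The dispatching rule described above can be sketched in a few lines of Python. This is only an illustrative sketch: the class name `Station`, the string labels of the lines and the representation of times as minutes after midnight are assumptions made here, not part of the paper. The example also assumes that only the two lines mentioned in the text start at $s_2$, that the target headway is 30 minutes (as implied by the update from 9:10 to 9:40), and that the target departure time of $(s_2 \to s_4)$ was 9:00, a hypothetical value since Table 1a is not reproduced here.

```python
from dataclasses import dataclass, field


@dataclass
class Station:
    lines: list                 # outgoing lines in their fixed cyclic order
    headway: int                # common target headway H (minutes)
    target: dict = field(default_factory=dict)  # target departure time per line
    next_index: int = 0         # position of the next line in the cyclic order

    def __post_init__(self):
        # At initialization, all target departure times are 0 unless given.
        for line in self.lines:
            self.target.setdefault(line, 0)

    def dispatch(self, t_now):
        """Assign an arriving vehicle to a line and return (line, departure time)."""
        line = self.lines[self.next_index]
        # Depart at the target time, or immediately if the target has passed.
        departure = max(self.target[line], t_now)
        # The new target is based only on the most recent departure.
        self.target[line] = departure + self.headway
        # Advance the round-robin pointer.
        self.next_index = (self.next_index + 1) % len(self.lines)
        return line, departure


# Worked example from the text (times in minutes after midnight).
s2 = Station(
    lines=["s2->s4", "s2->s1"],
    headway=30,
    target={"s2->s4": 9 * 60, "s2->s1": 9 * 60 + 20},  # 9:00 is an assumed value
)
print(s2.dispatch(9 * 60 + 10))  # arrival at 9:10
print(s2.dispatch(9 * 60 + 15))  # arrival at 9:15
```

Under these assumptions, the first call assigns the vehicle to `s2->s4` with an immediate departure at 9:10, and the second call assigns the vehicle to `s2->s1` and holds it until 9:20, in line with the example.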
For a formal definition of the policy, let us (arbitrarily) order the lines starting at station $s \in S$ as $l^s_1, l^s_2, \ldots, l^s_{|\delta^+(s)|}$, representing the cyclic order in which the lines from this station are performed. Let $l^s_{next} \in \delta^+(s)$ denote the next line to be performed from station $s$ and let $\tau_l$ denote the current target departure time of line $l$ (at initialization, all target departure times are 0 and $l^s_{next} = l^s_1$). Suppose at time $t_{now}$, a vehicle arrives at station $s$ and $l^s_{next} = l^s_i$. Our policy assigns the arriving vehicle to line $l^s_i$ and schedules it at time $t = \max\{\tau_{l^s_i}, t_{now}\}$. Next, the policy updates the target departure time of the selected line: $\tau_{l^s_i} \leftarrow t + H$. Finally, the policy updates $l^s_{next}$ according to the order of the lines: $l^s_{next} \leftarrow l^s_{i+1}$ if $i < |\delta^+(s)|$, and $l^s_{next} \leftarrow l^s_1$ otherwise.


Related Literature

Self-Organizing Approaches in Public Transit
Bartholdi and Eisenstein (2012) were the first to introduce the concept of self-organization or self-coordination in the field of public transit scheduling. In their approach, a vehicle is delayed at a control point by a time proportional to the headway to the trailing vehicle. The authors prove that for the case with a single circular line, this policy ensures that all headways self-equalize over time, regardless of the initial locations of the vehicles. This approach has been extended by Liang et al. (2016) and Zhang and Lo (2018), who consider both the backward headway and the forward headway when computing how long a vehicle should be delayed, resulting in a faster convergence rate. Zhang and Lo (2018) also provide theoretical evidence that the headway variation remains limited under stochastic travel times. However, only single-line systems are considered in these papers.

Multi-Line Control
For multi-line systems, the approach of Argote-Cabanero, Daganzo, and Lynn (2015) is closest to our work.
In this study, the authors propose an adaptive control rule for holding, accelerating and decelerating vehicles with the aim to adhere to the schedule as well as possible. However, the possibility to dynamically switch lines after a vehicle reaches a terminal station is not considered. Furthermore, this approach requires the specification of a target schedule and a number of functions and parameters. In contrast, our approach does explicitly allow vehicles to change lines in order to better spread vehicles over the network and only requires the specification of a target headway, making the policy easier to implement. Other papers focusing on multi-line systems, such as Hernández, Muñoz, Giesen, and Delgado (2015) and Petit, Lei, and Ouyang (2019), consider centralized optimization based approaches to reduce bus bunching, as opposed to applying a local decision rule.

Rotor-Router Systems
Our policy can be viewed as a generalization of the rotor-router model, which was originally introduced in Priezzhev, Dhar, Dhar, and Krishnamurthy (1996) as the deterministic counterpart of a random walk on a graph. In the random walk on a graph, one or more agents move over a graph at discrete and synchronous steps. The next edge to be traversed by an agent is selected randomly from the set of incident edges of the current node where the agent is located (Lovász, 1993). In the rotor-router model, a node does not send agents visiting it to a random neighbour, but instead selects the incident edges in round-robin fashion. That is, every node in the graph maintains a cyclic ordering of its incident edges and has a pointer indicating the next edge to be traversed by an entering agent. Whenever an agent enters a node, the pointer is advanced to the next edge in the cyclic ordering.
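The rotor-router mechanism described above can be illustrated with a minimal single-agent sketch. The function name `rotor_router_walk` and the adjacency-list representation are assumptions made for illustration; the order of the neighbours in each list plays the role of the cyclic ordering.

```python
def rotor_router_walk(adjacency, start, steps):
    """Move one agent for `steps` steps and return the traversed directed edges.

    adjacency: {node: list of neighbours in the node's fixed cyclic order}
    """
    pointer = {node: 0 for node in adjacency}  # next edge to use at every node
    traversed = []
    node = start
    for _ in range(steps):
        neighbour = adjacency[node][pointer[node]]
        # Advance the pointer to the next edge in the cyclic ordering.
        pointer[node] = (pointer[node] + 1) % len(adjacency[node])
        traversed.append((node, neighbour))
        node = neighbour
    return traversed


# Path graph a - b - c, agent starting at a.
path = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
for edge in rotor_router_walk(path, "a", 8):
    print(edge)
```

On this small path graph, the agent first returns to `a` once and afterwards repeats a closed tour that uses every edge exactly once in each direction.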
Our policy assigns arriving vehicles at a station to lines in a round-robin fashion, similar to the rotor-router system. On the other hand, in the rotor-router model it takes one time step to traverse an edge, whereas in our case a line can have any positive integer-valued travel time. Moreover, our policy sometimes instructs a vehicle to be held at a station to meet the target headway, whereas agents in the rotor-router model move in every time step. However, we will see that some of the results for the rotor-router model also hold for our policy.
For the case where there is only one agent, Priezzhev et al. (1996) prove for the rotor-router mechanism that after a sufficiently long time, the agent gets locked into a cycle where every edge is traversed exactly once in both directions. Yanovski, Wagner, and Bruckstein (2003) and Bampas et al. (2009) show that the lock-in time is bounded by $2mD$, where $m$ is the number of edges in the graph and $D$ the diameter of the graph. The ability of the system to recover from, for example, edge deletions (corresponding to the removal of lines in a public transit network) is investigated in Bampas et al. (2017). For the case with multiple agents, Wagner, Lindenbaum, and Bruckstein (1999) prove that the difference in the number of traversals of two edges cannot grow unbounded. Yanovski et al. (2003) present a stronger bound for the maximum difference between the number of traversals of two edges and also prove that a rotor-router system with multiple agents converges to a periodic motion. Chalopin et al. (2015) provide a further analysis of the limit behavior of the multi-agent rotor-router system, and show that, unlike the case with one agent, the duration of the periodic motion (so the time until the system returns to the same state) can be superpolynomial in the number of edges. Finally, Dereniowski, Kosowski, Pająk, and Uznański (2016) prove that the time it takes until all edges are traversed with $k$ agents is at least $\log(k)$ times shorter than with one agent.

Theoretical Analysis
In this section, we analyze the emerging behavior of the system in case all vehicles are scheduled according to the proposed policy. First, we investigate whether the policy services all lines in a fair or balanced manner, as preferably each line should have approximately the same number of departures. Secondly, we analyze the long run behavior of the system and investigate whether the target headway of every line is met. Thirdly, we provide worst case results on the maximum headway that can occur under the policy and the time it can take before the system reaches a stable state. We conclude this section with a discussion regarding the performance of the policy in case there is no common target headway of all lines.

Balanced Services
We first analyze the extent to which our policy leads to a balanced service of all lines. Ideally, at any point in time, every line should have approximately the same number of departures. Formally, let $f(s \to s')$ denote the number of departures of line $(s \to s') \in L$ up to time $t$. The lemmas and theorems that we prove hold for any $t$. Hence, for readability, we omit the index $t$. In this section, the goal is to show that the difference between $f(s_1 \to s_1')$ and $f(s_2 \to s_2')$ is bounded for two arbitrary lines $(s_1 \to s_1'), (s_2 \to s_2') \in L$. As the policy serves lines in a round-robin fashion, for two lines $(s \to s')$ and $(s \to s'')$ originating at the same station, it holds that $|f(s \to s') - f(s \to s'')| \le 1$. The first part of our analysis only depends on this property of our policy. Because this property is shared with the rotor-router system, we apply a similar analysis as presented by Yanovski et al. (2003).
Let $S_1, S_2$ be a partition of the set of all stations. Then, we define $f(S_1 \to S_2)$ as the number of times lines starting in $S_1$ and ending in $S_2$ have been performed up to some time $t$ (again we omit the index $t$).
We also refer to $f(S_1 \to S_2)$ as the flow from $S_1$ to $S_2$. As none of the $n$ vehicles can cross from $S_1$ to $S_2$ twice without crossing back from $S_2$ to $S_1$ in between, we can make the following observation (first made by Wagner et al. (1999)).
Observation 1. For every partition $S_1, S_2$ of the set of stations, it holds that $|f(S_1 \to S_2) - f(S_2 \to S_1)| \le n$.
Using Observation 1, it is possible to prove Lemma 2, which gives an upper bound on the difference between the number of times a line and its reverse line are performed.
Lemma 2. For every line $(s \to s') \in L$, it holds that $|f(s \to s') - f(s' \to s)| \le n$.
Proof. We define $f(s) := \min_{(s \to s') \in \delta^+(s)} f(s \to s')$ for a station $s \in S$, denoting the minimum number of departures for any line leaving $s$. As the lines are always served in a round-robin fashion, we have that $f(s) \le f(s \to s') \le f(s) + 1$ for every line $(s \to s') \in \delta^+(s)$. Now, suppose that the lemma is not true, such that there exists a pair of opposite lines $(s \to s')$ and We have that $s \in S_1$ and $s' \in S_2$. Let the number of lines crossing from $S_1$ to $S_2$ be $m$. As the flow from $s$ to $s'$ is $j$ and the flow over all other lines from $S_1$ to $S_2$ is at least $j - n$, we find that Similarly, the flow from $S_2$ to $S_1$ is at most Therefore, it follows that As this contradicts Observation 1, the assumption that the lemma is not true must be wrong.
We now present a theorem that bounds the difference between $f(s_1 \to s_1')$ and $f(s_2 \to s_2')$ for two arbitrary lines, which implies that the number of times two lines are performed cannot differ too much, at any point in time. To do so, let $\mathrm{dist}(s_1, s_2)$ denote the number of lines in the shortest path from $s_1$ to $s_2$ (shortest in terms of number of lines).
Proof. First, we prove a bound on the difference between the number of times two consecutive lines are performed. Thereafter, we consider a shortest path between s 1 and s 2 and iteratively apply this bound to prove the theorem.
By Lemma 2 it holds that $f(s \to s') \le f(s' \to s) + n$ and by definition of the policy it holds that Next, let $s_1 = s_a, s_b, \ldots, s_p = s_2$ denote a shortest path from $s_1$ to $s_2$. It holds that .
The theorem follows by symmetry.
The desirable property of the bound proven in Theorem 3 is that it does not depend on t. Therefore, it holds that the difference between the number of times two lines are performed cannot grow unbounded. In what follows, we use this observation to characterize the long run behavior of the system.

Limit Behavior
In this section, we analyze the emerging properties of the system in the long run. The first result states that after some time it is guaranteed that the system enters a periodic motion (i.e. starts to cycle) and that every line is performed the same number of times in one cycle. The proof is an adaptation of the proof by Yanovski et al. (2003) of the same property for the rotor-router system. This implies that the time since the latest departure of a line cannot be unbounded. It then follows that the number of possible states is finite. Furthermore, since the policy is deterministic, it must be that the system returns to the same state, at which point the system starts to cycle. Therefore, the system enters a periodic motion.
To prove the second part of the lemma, note that if the number of times two lines are performed during a cycle would be different, over time the difference would grow without bound, contradicting Theorem 3.
It follows from the proof of Lemma 4 that we can represent the state of the system at time $t$ using some state vector $V_t$. Following Chalopin et al. (2015), we call a state $V_t$ stable if there exists $t' > t$ such that $V_{t'} = V_t$. Equivalently, we say that the system has stabilized once it has entered the periodic motion. By Lemma 4, $V_t$ will always be stable for large enough $t$. The stabilization time, denoted as $T_{stable}$, is the smallest value such that $V_{T_{stable}}$ is stable. Furthermore, the periodicity, denoted as $T_{period}$, is the smallest value such that $V_{T_{stable} + T_{period}} = V_{T_{stable}}$. In the remainder of this subsection, we analyze the properties of the system once it reaches a stable state in more detail. In Section 4.3, we analyze how large the stabilization time can be in the worst case.
To provide more insight into the emerging behavior of the system, we first consider the number of idle vehicles at a station $s$ at time $t$, which we denote by $i_s(t)$. This brings us to the following lemma:
Lemma 5. For the number of idle vehicles at a station, it holds that $i_s(t) \ge i_s(t + H)$.
Proof. Assume $i_s(t) < i_s(t + H)$. Then, it must hold that $dep_s(t, t + H) < arr_s(t, t + H)$.
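The definitions of the stabilization time and the periodicity can be illustrated with a generic sketch for any deterministic system that evolves in discrete time steps. The function name `stabilization_and_period`, the hashable state and the transition function `step` are assumptions for illustration only; the encoding of the actual state vector of the transit system is not specified here.

```python
def stabilization_and_period(initial_state, step):
    """Return (T_stable, T_period) for a deterministic system on a finite state space.

    initial_state: hashable state at time 0
    step:          function mapping the state at time t to the state at time t + 1
    """
    first_seen = {}          # state -> first time at which it occurred
    state, t = initial_state, 0
    while state not in first_seen:
        first_seen[state] = t
        state = step(state)
        t += 1
    # The first repeated state is the earliest state on the cycle.
    t_stable = first_seen[state]
    t_period = t - t_stable
    return t_stable, t_period


# Toy system: 0 -> 1 -> 2 -> 3 -> 4 -> 2 -> 3 -> 4 -> ...
def toy_step(x):
    return x + 1 if x < 4 else 2


print(stabilization_and_period(0, toy_step))
```

In the toy system, states 0 and 1 are transient and the states 2, 3 and 4 form the periodic motion, so the sketch reports a stabilization time of 2 and a periodicity of 3.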
By Lemma 5, the number of idle vehicles cannot increase over time. Therefore, it holds that the utilization cannot decrease over time. The next lemma formalizes this statement.
Proof. Observe that $\gamma_s(a, b) = \int_a^b i_s(t)\,dt$. By applying Lemma 5 we find Therefore, it holds that For the second part of the lemma, note that by the periodicity of the system, we have that for $t' > T_{stable}$, $\mathrm{util}(t', t' + H) = \mathrm{util}(t' + iT_{period}, t' + iT_{period} + H)$ for any $i \in \mathbb{Z}^+$. Since $\mathrm{util}(t, t + H) \le \mathrm{util}(t + H, t + 2H)$, it follows for $t' > T_{stable}$ that $\mathrm{util}(t', t' + H) = \mathrm{util}(t' + iH, t' + (i + 1)H)$ for any $i \in \mathbb{Z}^+$.
From the above lemma, it follows that once the system reaches a stable state, the utilization is constant over consecutive intervals of duration $H$, even though $T_{period}$, the duration of the periodic motion, can in general be strictly larger than $H$. We formally define this limit value of the utilization as $u := \mathrm{util}(T_{stable}, T_{stable} + H)$, to which we refer as the stable utilization. Note that the policy ensures that all lines are performed at least once in one cycle of the periodic motion, such that $u > 0$.
Before we state the main theorem, recall that $n^*$ is a lower bound on the number of vehicles required to meet the target headways. Theorem 7 shows that the behavior of the system depends on whether $n < n^*$ or $n \ge n^*$.
Theorem 7. If $n < n^*$, then the stable utilization $u$ equals 1 and the average headway of all lines during the periodic motion equals $\frac{n^*}{n} H$. Otherwise, the headways of all lines during the periodic motion equal $H$ and the stable utilization equals $\frac{n^*}{n}$.
Proof. As we have shown that the utilization converges to a certain stable utilization 0 < u ≤ 1, we can distinguish the following two cases: Case I: 0 < u < 1. This implies that there exists at least one station where there is strictly positive idle time during the periodic motion. It must be that the lines originating from this station are performed once per H time units, as a vehicle only waits if it is in time to meet the target headway of its next line.
As all lines are performed the same number of times in the periodic motion by Lemma 4, it follows that $T_{period} = H$ and every line is performed exactly once per $H$ time units in both directions. By definition, this implies that $n \ge n^*$. As every line is operated once per $H$ time units, the utilization converges to $u = \sum_{l \in L} t_l / (nH) = \frac{n^*}{n}$.
Case II: $u = 1$. As every line can be operated at most once every $H$ time units, this implies that $n \le n^*$.
Moreover, it follows from Lemma 4 that the system enters a periodic motion in which all vehicles have no idle time and all lines are performed the same number of times. Let $g$ denote the number of times each line is performed during a cycle. As all vehicles are running all the time, it holds that $n T_{period} = g \sum_{l \in L} t_l$. Consequently, the average headway of all lines, which we denote as $\bar{H}$, equals $\bar{H} = T_{period} / g = \frac{n^*}{n} H$.
The above theorem provides a concise characterization of the behavior of the system under the proposed policy. The result can be seen to be optimal in some sense. If the number of vehicles is large enough to meet the target headways using centralized control, our decentralized policy is also able to meet the target headways. In case there are not sufficient vehicles to meet the target headways, every vehicle is used all the time and every line has the same average headway, equal to the smallest headway possible under centralized scheduling. Moreover, if n ≥ n * + 1, there is some slack in the system, such that if a vehicle breaks down, the headways of all lines again converge to H. In the other case, there is no slack in the system and every breakdown of a vehicle leads to an increase in headways, and therefore to a reduction in passenger service.
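The two regimes of Theorem 7 can be evaluated directly from the travel times, the target headway and the fleet size. The sketch below is illustrative only: the function name `stable_regime` is an assumption, and the travel times in the example are hypothetical values chosen such that $n^* = 3.8$ for $H = 30$; they are not taken from Figure 1.

```python
from fractions import Fraction


def stable_regime(travel_times, headway, n):
    """Return (stable utilization, average headway) as predicted by Theorem 7."""
    # Lower bound on the number of vehicles: n* = (sum of travel times) / H.
    n_star = Fraction(sum(travel_times), headway)
    if n < n_star:
        # Too few vehicles: all vehicles run all the time, headways stretch.
        return Fraction(1), n_star / n * headway
    # Enough vehicles: every line meets the target headway H.
    return n_star / n, Fraction(headway)


hypothetical_travel_times = [20, 25, 30, 39]  # sums to 114, so n* = 3.8 for H = 30
for fleet_size in (3, 4, 5):
    utilization, avg_headway = stable_regime(hypothetical_travel_times, 30, fleet_size)
    print(fleet_size, float(utilization), float(avg_headway))
```

With three vehicles the predicted average headway is 38 time units instead of 30, whereas with four or five vehicles the target headway is met and the utilization drops below 1.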

Worst Case Analysis
In this part, we provide worst case results of the headway deviation in case n < n * and on the time it takes to reach a stable state.

Worst Case Headway Deviation
In contrast with the results Bartholdi and Eisenstein (2012) and Zhang and Lo (2018) obtained for the single-line case, in case $n < n^*$ (so there are not enough vehicles available), our policy leads to convergence of the average headways of all lines, but not necessarily to convergence of the headways themselves. A natural question to ask is how large the maximum headway can become in the worst case. Theorem 8 shows that the headway cannot be larger than $H + (n^* - n)H$, such that the excess headway is never larger than $(n^* - n)H$.
Theorem 8. If $n < n^*$, once the system is in a stable state, all headways are at most $H + (n^* - n)H$.
Proof. For this proof it is convenient to think of every line $l$ as having length $t_l$ and to think of every vehicle as a snake having length $H$ and moving 1 unit distance per unit time (such that it takes $t_l$ time units to traverse a line). As $n < n^*$, it holds according to Theorem 7 that the system converges to a periodic motion where the snakes are constantly moving. Furthermore, the policy ensures that consecutive departures of the same line are always separated by at least $H$ time units, such that two snakes, despite having length $H$, cannot occupy the same part of a line. Therefore, once the system has stabilized, the snakes cover a part of the network of length $nH$. The part of the network that is not covered by any of the snakes then has length $\sum_{l \in L} t_l - nH = n^*H - nH = (n^* - n)H$. Thus, whenever a snake starts traversing a line, the distance between the front of the snake and the tail of the preceding snake on the line is at most $(n^* - n)H$. As the length of every snake is $H$, it follows that the distance between the fronts of two vehicles is, at any time, at most $H + (n^* - n)H$. Hence, the time between two consecutive departures of the same line is at most $H + (n^* - n)H$.
This theorem has a nice interpretation, as it shows that if there is only a small shortage of vehicles, the headways cannot become very large. As long as the discrepancy between $n$ and $n^*$ is not too big, the target headways are met reasonably well. For example, if $n = n^* - 1$, the maximum headway that can occur is $2H$.
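The bound of Theorem 8 can be computed with the snake argument in mind: the uncovered part of the network has length equal to the total travel time minus $nH$. The function name `max_headway_bound` and the example travel times are hypothetical and serve only as an illustration.

```python
def max_headway_bound(travel_times, headway, n):
    """Upper bound H + (n* - n) H on any headway in a stable state (n < n*)."""
    total_length = sum(travel_times)        # equals n* H
    if n * headway >= total_length:
        raise ValueError("Theorem 8 only covers the case n < n*")
    uncovered = total_length - n * headway  # equals (n* - n) H
    return headway + uncovered


# Hypothetical network with n* = 3.8 and three vehicles.
print(max_headway_bound([20, 25, 30, 39], 30, 3))
# Hypothetical network with n* = 4 and three vehicles, so n = n* - 1.
print(max_headway_bound([30, 30, 30, 30], 30, 3))
```

In the second call the shortage is exactly one vehicle, so the bound equals twice the target headway.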

Stabilization Time
In this part, we derive worst case bounds on the stabilization time $T_{stable}$. We investigate the time until stabilization for two special cases. In both cases, the stabilization time depends on the unweighted diameter of the network, which is denoted by $D$ and represents the maximum number of lines one is required to traverse when traveling from one station to another. First, we analyze the case where $n = n^* = 1$, so a single vehicle suffices for meeting the target headway.
Proof. As there is only one vehicle, the system is stabilized if the vehicle continuously performs an Euler tour every $H$ time units. Bampas et al. (2009) show that for the rotor-router model with a single agent, an Euler tour is established in "phases" and that in the worst case, $D$ phases are required. This result directly extends to our setting. In every phase, the vehicle performs a tour starting and ending at $s_0$, the initial location of the vehicle. A phase ends when all the lines originating at $s_0$ have been performed during that phase. Furthermore, in the worst case, the cyclic order of the outgoing lines at every $s$ is such that in phase $i$, station $s$ is visited if and only if $\mathrm{dist}(s_0, s) \le i$. Therefore, after phase $D$ the vehicle will have entered the periodic motion and continuously perform an Euler tour. Furthermore, since $n^* = 1$, every closed tour over the network takes at most $H$ time units, which implies that the duration of every phase is $H$. It follows that the system stabilizes in the worst case at time $DH$.
We can observe that the total charge over the network equals zero: According to the policy, the number of vehicles leaving station $s$ in iteration $m + 1$ equals $\min\{i_s(m), \deg(s)\}$.
As the number of vehicles entering station s in an iteration is at most deg(s), it follows that the charge of a neutral or positively charged station can never increase. Hence, a station can change from being positively charged to neutral, but not vice versa. On the other hand, until the system stabilizes, it is possible that negatively charged stations become neutral and vice versa.
We define the potential function $\Phi(m) = \sum_{s \in S : C_s(m) > 0} C_s(m)$, equal to the sum of the positive charges.
As the charge of positively charged stations can only decrease and neutral stations cannot become positively charged, it follows directly that $\Phi(m) \ge \Phi(m + 1)$. If $m \ge T_{stable}/H$, it holds that $\Phi(m) = 0$. In order to bound $T_{stable}$, we use the following result for the rotor-router system.
Lemma 10. For a rotor-router system with $k > 1$ agents, the cover time (the time until all edges have been visited at least once) on a graph with diameter $D$ and $m$ edges is $O(mD / \log k)$. If there is only 1 agent, the cover time is $O(mD)$.
Proof. See Dereniowski et al. (2016).
Suppose that $\Phi(m) = f > 0$. Then, there are $f$ anti-vehicles in the network after iteration $m$. We are interested in how long it takes until one of the anti-vehicles arrives at a positively charged station, as such an event reduces the potential function. As in any iteration the anti-vehicles traverse the lines that are not traversed by regular vehicles, it can be seen that the anti-vehicles move according to the same policy as the regular vehicles, but with the cyclic order of the lines reversed. Moreover, since anti-vehicles move in every iteration (otherwise the number of anti-vehicles at a station would be larger than the degree), this system is equivalent to a rotor-router system. Therefore, the number of iterations until one of the anti-vehicles hits a positively charged station is at most the cover time of a rotor-router system with $f$ agents.
Applying Lemma 10 and using that the potential can decrease at most $n - 1 = |L| - 1$ times and that every iteration takes $H$ time units, it follows that
The results in Theorems 9 and 11 illustrate that the stabilization time depends on the diameter and number of lines of the network and the target headway. For high-frequency networks that are highly connected, for example urban transit systems, stabilization occurs rapidly. For large elongated networks operated at lower frequencies, for example inter-regional transit systems, stabilization is established more slowly.

Different Target Headways
We conclude the theoretical analysis with the observation that it is unlikely that the presented results can be extended to settings where there is no common target headway, but lines may have different target headways. In particular, note that the behavior of the system is different depending on whether there are enough vehicles or not. As such, by using simulation, it is possible to find out how many vehicles are required to meet the target headway for every line. However, Van Lieshout (2019)  On the other hand, this insight does not mean that our policy cannot be applied in systems where the target headway of lines is different. A first possibility is to decompose the network into sub-networks where there is a common target headway. Secondly, one could still choose to apply the policy and accept that there are no theoretical performance guarantees. To do so, the policy needs to be slightly generalized. Whenever there is a departure of line l at time t , the target departure time should now be updated according to the formula τ l ← t + h l , where h l denotes the target headway of line l. We assess the performance of this policy numerically in the next section.
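The generalized policy with line-specific target headways can be explored with a small event-driven simulation. The following is a minimal sketch under stated assumptions, not the simulation used for the experiments in this paper: the function name `simulate`, the dictionary-based network representation and the use of the insertion order as cyclic order are all choices made for illustration. Setting all headways to the same value recovers the original policy.

```python
import heapq
from collections import defaultdict


def simulate(travel_time, headway, start_stations, horizon):
    """Simulate the round-robin dispatching policy with per-line headways.

    travel_time:    {(origin, destination): integer travel time t_l}
    headway:        {(origin, destination): integer target headway h_l}
    start_stations: initial station of every vehicle (one entry per vehicle)
    horizon:        departures scheduled after this time are discarded
    Returns {line: list of departure times in increasing order}.
    """
    # Cyclic order of the outgoing lines of each station (insertion order).
    outgoing = defaultdict(list)
    for line in travel_time:
        outgoing[line[0]].append(line)

    pointer = {s: 0 for s in outgoing}          # next line per station
    target = {line: 0 for line in travel_time}  # target departure times, initially 0
    departures = {line: [] for line in travel_time}

    # Priority queue of vehicle arrivals: (time, tie-breaker, station).
    events = [(0, k, s) for k, s in enumerate(start_stations)]
    heapq.heapify(events)
    counter = len(events)

    while events:
        t_now, _, station = heapq.heappop(events)
        line = outgoing[station][pointer[station]]
        pointer[station] = (pointer[station] + 1) % len(outgoing[station])
        t_dep = max(target[line], t_now)
        if t_dep > horizon:
            continue  # the vehicle leaves the simulation at the horizon
        target[line] = t_dep + headway[line]    # update with the line's own headway
        departures[line].append(t_dep)
        heapq.heappush(events, (t_dep + travel_time[line], counter, line[1]))
        counter += 1
    return departures


# Two stations, one vehicle, common headway: the vehicle shuttles back and forth.
lines = {("a", "b"): 5, ("b", "a"): 5}
print(simulate(lines, {line: 10 for line in lines}, ["a"], horizon=30))
```

Because arrivals are processed in chronological order and each decision depends only on the state of the station where the vehicle arrives, a single priority queue of arrival events suffices. Observed headways can be obtained afterwards by taking differences of consecutive departure times per line.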

Numerical Experiments
In this section, we describe the results of a series of experiments that illustrate the practical performance of the proposed policy. First, we analyze how the time it takes to reach a stable state grows if the size of the network increases. Next, we investigate how long it takes to re-stabilize after one of the vehicles breaks down.

Stabilization Time
We assess the time to reach a stable state for four types of network topologies: path, ring, star and fully connected. The differences between these types of networks are illustrated in Figure

Re-Stabilizing after a Vehicle Breakdown
To get a better sense of the performance in practice, we perform a third experiment, where we start in a stable state (i.e., a feasible timetable). Then, we let one of the vehicles break down and analyze how long it takes to re-stabilize. We refer to the number of vehicles in the system in excess of the minimum number required to reach a stable state as the buffer. Provided that there is a buffer of at least one vehicle, we know that the system will always bounce back to a stable state after the breakdown. We perform this experiment on a star network with five lines with a target headway of 15 minutes and travel times uniformly drawn between 10 and 30 minutes. We start this experiment from a random stable state, which is obtained by letting the system converge to a stable state from a random starting configuration.
In Figures 5a-d, the results of this experiment are visualized for different sizes of the buffer, with ten randomly generated networks for each buffer size. The horizontal axis depicts the time since the breakdown and the vertical axis the current maximum headway in the network. As expected, the maximum headway in the system can be quite large right after the vehicle breakdown. However, the impact of the breakdown dies out rather quickly. Even with only a single vehicle as a buffer, the maximum headway in the system reduces to less than 20 minutes within the first hour. On the other hand, there can be a long tail-off effect: for some of the replications, it takes a long time before all headways have fully converged to 15 minutes. In practice, however, passengers will hardly notice the difference between a headway of 15 minutes and a slightly larger headway. We also observe that the maximum headway converges to 15 minutes much faster when there is a larger buffer in the number of vehicles.
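The quantity on the vertical axis can be approximated from a log of departure times. The following Python sketch is one possible way to do so. The text does not state the exact definition used for the figures, so taking the most recent headway of each line up to the current time is an assumption, and the function names and the sample log are made up for illustration.

```python
def headways(departures):
    """Times between consecutive departures of one line (sorted input assumed)."""
    return [b - a for a, b in zip(departures, departures[1:])]


def max_headway(departures_by_line, now):
    """Largest most-recent headway over all lines, using departures up to `now`."""
    worst = 0.0
    for times in departures_by_line.values():
        seen = [t for t in times if t <= now]  # ignore departures in the future
        gaps = headways(seen)
        if gaps:
            worst = max(worst, gaps[-1])       # assumption: only the latest gap counts
    return worst


# Made-up departure log in minutes; line "l1" has one delayed departure.
log = {"l1": [0, 15, 30, 52], "l2": [5, 20, 35, 50]}
print(max_headway(log, now=40), max_headway(log, now=60))
```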

Different Target Headways
In the next experiment, we test the performance of the policy in case the lines in the system do not all have the same target headway. We conduct this experiment on a star network with three lines, with target headways of 10, 15 and 20 minutes, respectively. The travel time for each line is uniformly drawn between 10 and 30 minutes.
In Figures 6a-6b, the headways of the three lines are plotted over time for a randomly generated instance. Only the headways of the lines from the central station of the star network to the outer stations are included here. If the number of vehicles is equal to the minimum number required to meet the target headways, we observe that the system does not converge to a stable state where the target headways are always met. Instead, the system converges to a periodic motion with slight deviations from the target headways. In this periodic motion, the headways of the line with a target headway of 10 minutes vary between 10 and 13 minutes, and those of the line with a target headway of 20 minutes vary between 20 and 23 minutes. On the other hand, if there is one more vehicle in the system, the headways do all converge to the target headways. This indicates that, despite the absence of theoretical guarantees, the policy still performs well, but that a larger number of vehicles may be required to ensure that the target headways are met at all times.

Stochastic Travel Times
In a final experiment, we test the performance of the policy in case the travel times are not fixed, as we assumed in the theoretical analysis, but stochastic. We perform this experiment on a star network with five lines with a target headway of 15 minutes. The nominal travel time for each line is uniformly drawn between 10 and 30 minutes. The realized travel time is equal to the sum of the nominal travel time and a disturbance term. As it is likely that there is correlation in the duration of subsequent trips, we generate the disturbances $\varepsilon_l$ for each line according to an autoregressive model:
We set $\rho = 0.8$ and $\sigma_l = \frac{1}{4} t_l$. In Figure 7, the empirical cumulative distribution function of the headway is visualized for a randomly generated network. This function is presented for different buffer sizes, which are computed based on the nominal travel times. As expected, the headways are not always equal to the target headway of 15 minutes, as there are constant disturbances keeping the system away from a stable state. However, the headways are reasonably close to the target headway. Even without any buffer, over 50 percent of the headways are equal to the target headway and 80 percent of the headways are shorter than 20 minutes. When the buffer is larger, these numbers rapidly increase. This suggests that the policy performs reasonably well if the travel times are stochastic, but that a larger number of vehicles is required to obtain a (very) high service level.

Conclusion
We proposed a self-organizing policy for dispatching vehicles in multi-line public transit systems. Theoretical and numerical analyses illustrate that our policy performs well. In idealized conditions and provided that a sufficient number of vehicles is available, it is guaranteed that the system converges to a stable state where the target headway of each line is met. In case travel times are not fixed but stochastic, or in case lines have different target headways, the deviations from the target headways are small, especially if there is some reserve capacity in the number of vehicles. Furthermore, our policy causes the system to quickly recover after disruptions, such as the breakdown of a vehicle.
Our promising theoretical and numerical results show that the potential of self-organizing strategies extends to multi-line public transit networks. Urban high-frequency networks in particular seem well suited for our approach, as convergence is established more rapidly if the target headway and the size of the network are smaller. Compared to schedule-based approaches, the self-organizing approach is much easier to implement, as it does not require constructing a schedule, monitoring adherence to the schedule and rescheduling after disruptions. Only the target headway needs to be set, which should be feasible with respect to the travel times of all lines and the number of available vehicles.
For further research, it would be interesting to investigate if the policy can be generalized to ensure that headways are always self-equalizing, even if $n < n^*$. Likely, this would require that the target headway is no longer exogenous, but emerges spontaneously due to the dynamics of the system. It is an open question whether this can be achieved by a simple decentralized policy, without coordination or communication between different parts of the network.