UvA-DARE (Digital Academic Repository) A framework for efficient dynamic routing under stochastically varying conditions



Introduction
To reduce congestion in road networks, a wide variety of measures can be thought of. Perhaps the most basic measure is to increase the network's capacity: the Dutch government, for example, increased the total road length from 130,446 km in 2001 to 141,361 km in 2020 [14]. Alternatively, one may attempt to better exploit the existing resources; examples are the use of reversible lanes [55] and the transit-lane experiments in various US states [12,15]. Despite such measures, recurrent congestion (i.e., congestion during peak hours) and non-recurrent congestion (i.e., congestion due to incidents) remain a major concern. Consequences include substantial travel-time delays, increased economic costs, and negative environmental effects.
So as to reduce the effects of recurrent congestion, historical data is used to incorporate periodically occurring events (rush hours, weekly patterns, etc.) into routing algorithms. Regarding non-recurrent congestion, an important role is played by Intelligent Traffic Systems (ITS) capable of providing travelers with real-time information. The availability of data on virtually every conceivable network characteristic, combined with instant communication to individual travelers, offers the potential for routing with minimal delay and minimal congestion. A complication, however, lies in the unpredictability of traffic surges and incidents, and in addition in the computational effort needed to process all available information in real time. Our work provides an approach that handles both these challenges: providing fast and close-to-optimal routing, using a probabilistic framework that is well suited for modeling uncertainties that affect congestion.

Routing framework
In this paper we study routing by casting it as a dynamic stochastic shortest path problem. Our objective is to minimize the expected travel time of a vehicle pertaining to a given origin-destination pair, assuming that the vehicle is allowed to adapt the chosen route while driving. A key feature of our setup is that we model the travel-time dynamics using a suitably chosen stochastic process. Specifically, for each road of the network, we let the velocity of a vehicle on that road be determined by the state of a Markovian background process, describing incidents, weather conditions, etc.
Routing decisions in the proposed dynamic model can be expressed in terms of a semi-Markov decision process (SMDP) [13,47,51]. The state of the SMDP consists of the location of the vehicle, which, in a practical context, can be traced by GPS, and the state of the background process, being provided by an ITS. The route can be adapted at each intersection along the traveled path. Based on the state of the SMDP at an intersection, a decision is taken that informs the driver which arc to travel next. Clearly, the use of a Markovian background process facilitates the modeling of non-recurrent, random events. Perhaps surprisingly, however, we will argue that it also allows us to incorporate the deterministic patterns corresponding to recurrent, (near-)deterministic events. Importantly, our framework allows for correlation between the travel times on the edges and is therefore well capable of modeling the spillback effect: an incident at an arc causes a drop in speed levels at upstream arcs.
In the context of our model, an optimal routing policy can be evaluated by the use of dynamic programming (DP), as it can be shown that such a policy satisfies the Bellman optimality equations. DP does, however, suffer from the curse of dimensionality, making it prohibitively slow for networks of practical relevance. To cope with this, we have developed the edsger algorithm, as well as a more efficient variant of it; these are Dijkstra-like shortest path algorithms, but with the additional feature that the route can be adapted along the way, based on the current state of the background process. We furthermore present speed-up techniques that greatly reduce the size of the underlying network model and its corresponding state space. These techniques can therefore be used as preprocessing steps to make the routing algorithms substantially more efficient.
To make our approach operational, the parameters pertaining to the background process must be estimated. Deterministic patterns and corresponding transitions in the background process can be identified from historical data. In addition, we need to estimate the frequency and consequences of non-recurrent events, such as the frequency of incidents or the drop in speed level under bad weather conditions. Examples of institutions that collect data on traffic flows and traffic speeds are the National Data Warehouse for Traffic Information (NDW) in The Netherlands and the Mobilitäts Daten Marktplatz (MDM) in Germany; in addition, various US states have such an infrastructure (see for example MITS, the Michigan Intelligent Transportation Systems). This paper has the following two main contributions:
• First, we present the Markovian background process and corresponding SMDP framework in detail. To our knowledge, this is the first study that uses a continuous-time Markovian background process to develop an adaptive routing algorithm. It has important advantages over existing approaches: in a single framework it offers the potential of incorporating both recurrent and non-recurrent events, and it allows for correlation between speeds on different edges. In addition, our setup guarantees continuity in travel times (travel times are a continuous function of the departure epoch), implying the so-called FIFO property (the arrival epoch is an increasing function of the departure epoch).
• Second, after having presented a way to evaluate the optimal policy, which suffers from the curse of dimensionality, we propose efficient alternatives. Through a series of numerical examples, we show that in particular the edsger algorithm is fast, in the sense that it can be performed in real time, while providing suboptimal routes in only a very small subset of all possible scenarios. The examples all concern specific parts of the Dutch road network, with instances that are based on the data provided by NDW, to make sure they are representative of real traffic scenarios.

Literature overview
Routing algorithms have been examined extensively in the literature. In a deterministic graph with positive arc costs satisfying the FIFO property, the shortest path can be obtained by Dijkstra's algorithm, developed by Edsger W. Dijkstra in the late 1950s [18,32]. Throughout the years, many variations of and speed-up techniques for Dijkstra's algorithm have been presented, examples of which are the A*-algorithm and the bidirectional Dijkstra algorithm [24,35]. Bellman [5] and Ford [21] constructed a shortest path algorithm that allows for both positive and negative arc costs.
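To fix ideas, Dijkstra's algorithm for the deterministic case can be sketched in a few lines; the graph encoding and node names below are illustrative, not taken from the paper.

```python
import heapq

def dijkstra(graph, source, target):
    """Shortest-path length in a graph given as {node: [(neighbor, cost), ...]}.
    Returns float('inf') if the target is unreachable."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    done = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue
        done.add(u)
        if u == target:
            return d
        for v, c in graph.get(u, []):
            nd = d + c
            if nd < dist.get(v, float('inf')):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return float('inf')
```

With nonnegative arc costs this returns the minimal travel cost; the A*-algorithm mentioned above additionally adds a lower-bound heuristic to the priority used in the heap.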
The search for optimal routes becomes considerably more complicated as soon as randomness is introduced. The stochastic shortest path problem is an extension of the deterministic shortest path problem in which uncertainty in travel times is taken into account. Several variations of this problem exist, each with its own objective. Two well-studied objectives are the minimization of the expected travel time and the maximization of the on-time arrival probability. In the present article we focus on the first objective; for studies on the latter we refer to [19,22,40,42] and references therein. In case of a minimum expected travel-time objective, Dijkstra's algorithm can still be applied if there is neither correlation (incidents on specific arcs do not influence travel times on other arcs) nor time-dependency (the travel times do not depend on the hour of the day) [36,38,39]. Recurrent events are well described by time-dependent velocities (think of predictable, recurring events such as rush hours), but Hall [30] showed that classic shortest path algorithms like Dijkstra's algorithm fail in stochastic time-dependent networks. Hall additionally argued that it is suboptimal to solely consider static routing, and that it is advantageous to allow travelers to adapt the chosen route while traveling. These findings resulted in studies on optimal adaptive routing algorithms in stochastic time-dependent networks. Examples of such algorithms are presented by Fu and Rilett [23] and Miller-Hooks and Mahmassani [37]. These algorithms do not, however, take ITS information into account and are therefore limited in their capability to incorporate non-recurrent congestion.
Algorithms that do take ITS information into account can be divided into two categories. The first category consists of algorithms that assume that ITS provides information on all realized travel times in the network. Without attempting to provide an exhaustive overview, we refer in this context to the algorithms presented in [2,16,17,25,26,43]. The second category consists of algorithms that assume that ITS provides information on the state of a background process in the network. Psaraftis and Tsitsiklis [46] were among the first to work in this setting, presenting a framework in which the travel time on an arc depends on an environmental variable that becomes known once a vehicle arrives at the attached node. Their setup was extended by Kim et al. [33], who work with a Markov process whose state directly determines the travel time. Through ITS, the state of the Markov process of every arc is known while traveling through the network, so that the optimal routing problem can be phrased in terms of a Markov decision process (MDP). The MDP framework is further explored in [28,49,53], under the assumption of independence of the Markov processes on the arcs, thus neglecting spatial correlation. These studies furthermore assume that the travel time on a link is known once the intersection at the head of the link is reached, whereas in reality the conditions on an arc may still change while traveling on that arc. Other drawbacks are (i) the fact that the realized travel times can only take values from a previously chosen discrete collection of values, and (ii) the fact that additional assumptions are needed to guarantee the FIFO property. In the MDP framework an optimal policy can be characterized by the Bellman optimality equations and can therefore be evaluated by performing a DP recursion [3,4,6,7,47,48,51]. Since DP suffers from the curse of dimensionality, it may lead to excessive computational costs. Possible solutions are dimensionality reduction, as described in e.g. [52] and references therein, and approximative procedures such as approximate dynamic programming (ADP) [8,44]. In the context of routing, dimensionality reduction is employed in [28,34], while ADP is extensively studied in [49].

Contributions
Above, we stated our two main contributions, which we now describe in greater detail. Regarding the first contribution, the specific Markov-modulated background process that we advocate in this paper, describing the speeds at which vehicles can drive on the arcs, overcomes the above-mentioned drawbacks of earlier approaches.
Although the approaches of Ferris and Ruszcyński [20] and Karoufeh and Gautam [31] also work with continuous-time Markovian processes, there are important differences with our proposal. In the setup of [20], the travel times are directly modeled relying on a continuous-time Markov process. In [31], a continuous-time Markov background process is considered to model the velocities, but it only describes the dynamics on a single arc and does not consider routing. With our continuous-time Markovian background process, applying to the network as a whole, we overcome the drawbacks of the discrete MDP framework mentioned above, as the continuity of the background process implies continuity in arrival times and guarantees the FIFO property. In addition, our setup allows for correlation and is capable of incorporating both recurrent and non-recurrent congestion. As mentioned above, our second contribution concerns the development of efficient and close-to-optimal routing algorithms. Because the evaluation, through DP, of the optimal policy is prohibitively slow due to dimensionality issues, we propose the highly efficient edsger algorithm. Our numerical examples demonstrate that our approach is well capable of incorporating both recurrent and non-recurrent congestion. Notably, it clearly outperforms deterministic Dijkstra-type algorithms in which the per-arc travel times are replaced by their expected values. Comparison of the edsger algorithm to DP shows that edsger is orders of magnitude faster. Indeed, the computational costs of DP grow exponentially with the network size, while edsger provides essentially real-time response, performing just slightly below optimal.
The background process and SMDP framework are presented in Section 2. Section 3 considers routing in this framework, describing the optimal routing algorithm, heuristic algorithms, and speed-up techniques. Numerical examples that show the usefulness of the model and assess the efficiency of the routing algorithms are presented in Section 4. Section 5 contains concluding remarks.

Markovian Velocity Model
The first part of this section provides a motivational example for the proposed framework and the various routing algorithms developed for it. For illustration we consider a relatively small network and describe the structure of a typical background process corresponding to this network. We briefly assess the expected travel time and run-time of the routing algorithms, so as to provide an indication of the gain that can be achieved by our proposed routing scheme. The second part of the section gives a detailed description of the mathematical framework for the Markov model that describes the attainable velocities of the vehicles, and points out how, in the context of this model, routing can be interpreted as an SMDP.

Motivational example
Consider the road network in Figure 1b, depicting the roads that can be used to travel from Almere (A) to Dronten (D), two cities in the Netherlands. As in the rest of this paper, the goal is to minimize the expected travel time. Figure 1a shows the corresponding routing graph, in which each node represents an intersection in Figure 1b and each link represents the road between two of these intersections. The Dutch government has imposed maximum velocities on the 16 roads (8 bidirectional arcs, that is) in the network of Figure 1. However, due to e.g. bad weather conditions and traffic incidents, vehicles may not always be able to drive at these maximum velocities.
We model the variability in velocities by a Markovian background process, which records the events that affect these speeds. In this example the only events discussed are incidents and rain showers, but the setup can be extended to include various other sorts of events in an obvious manner. To model traffic accidents we let {X_i(t), t ≥ 0} be a Markov process that denotes whether arc i is blocked by an incident at time t. Specifically, we set X_i(t) = 1 if the arc is not blocked, and 2 otherwise. Furthermore, we let {Y(t), t ≥ 0} be a Markov process that describes whether it rains at time t: Y(t) = 1 indicates it is dry, whereas Y(t) = 2 indicates it rains. The Markovian background process can then be written as the vector B(t) := (Y(t), X_1(t), ..., X_16(t)), having a state space of size 2^17. The state of B(t) determines the velocities on each of the arcs: we let v_i(s) be the velocity on arc i when B(t) = s.
In the sequel we impose the natural assumption that the processes X_1(t), ..., X_16(t) are independent, which effectively means that the occurrence of an incident on a specific link has no impact on the occurrence of incidents on other links. This assumption does not imply that there is no correlation between travel times on the arcs, as the velocity on an arc depends on the state of the full vector B(t).
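The independence assumption can be illustrated by simulation: each incident process X_i(t) can be simulated separately as a two-state continuous-time Markov chain, and the joint process is obtained by running the chains side by side. A sketch; function names, seeds and parameter values are ours, chosen to mirror the example.

```python
import random

def simulate_two_state_ctmc(alpha, beta, horizon, seed=0):
    """Simulate one incident process: state 1 = arc clear, state 2 = incident.
    alpha is the incident rate (1 -> 2), beta the clearance rate (2 -> 1).
    Returns the list of (jump time, new state) pairs on [0, horizon]."""
    rng = random.Random(seed)
    t, state, events = 0.0, 1, []
    while True:
        rate = alpha if state == 1 else beta
        t += rng.expovariate(rate)  # exponential holding time in current state
        if t > horizon:
            return events
        state = 3 - state  # flip between states 1 and 2
        events.append((t, state))

# Sixteen independent arc processes, as in the Almere-Dronten example:
arc_histories = [simulate_two_state_ctmc(0.1, 2.0, 24.0, seed=i) for i in range(16)]
```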
We let the transition rate matrix for X_i(t), for i = 1, ..., 16, be written as

    Q_i = ( -α_i    α_i
             β_i   -β_i ),

with α_i the incident rate and β_i the clearance rate of an incident at arc i. The 2^16 × 2^16-dimensional transition rate matrix of the vector process (X_1(t), ..., X_16(t)) is now given by

    Q = Q_16 ⊕ Q_15 ⊕ · · · ⊕ Q_1,

where the operation '⊕' denotes the Kronecker sum [41]: with '⊗' denoting the direct product and I_dim(C) the identity matrix with the same dimensions as the matrix C, it is defined as

    A ⊕ B = A ⊗ I_dim(B) + I_dim(A) ⊗ B    (1)

for two square matrices A and B. For example, the Kronecker sum of the two matrices Q_2 and Q_1 is given by

    Q_2 ⊕ Q_1 = ( -α_1-α_2    α_1         α_2        0
                   β_1       -β_1-α_2     0          α_2
                   β_2        0          -α_1-β_2    α_1
                   0          β_2         β_1       -β_1-β_2 ),

which can be interpreted as the transition rate matrix of (X_1(t), X_2(t)) living on the state space {(1,1), (2,1), (1,2), (2,2)}. Regarding Y(t), we denote the transition rate from 1 to 2 by λ and from 2 to 1 by µ. This means that 1/λ represents the mean time between showers, and 1/µ the mean shower duration. Combining the above, the Q-matrix for the full vector B(t) can be written as

    Q_B = ( -λ    λ
             µ   -µ ) ⊕ Q_16 ⊕ · · · ⊕ Q_1.

Remark 1. For this example we have assumed that X_i(t) is independent of Y(t), which informally means that rain does not affect the occurrence of incidents. This may sound unnatural, but, importantly, our model can be adapted in a straightforward manner to make sure that weather conditions do affect the rate of incidents. The general setup will be introduced in full detail in the next subsection. ♦

In the model introduced above, the travel time on any arc is completely determined by the proposed velocity dynamics. This means that for any given arc, if the state of the background process when leaving the origin node is given, the expected travel time to the destination node can be computed (jointly with the state of the background process when arriving at the destination node). This in principle allows the computation of the optimal routing policy. Below we will argue that this optimal policy can be found relying on dynamic programming (DP) methods. DP-based methods, however, are typically prohibitively computationally demanding when the underlying state space is large (the curse of dimensionality). Focusing on our specific routing objective, it is clear that such computational issues will arise, since, as we saw above, the number of background states is large even in a small network.
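The Kronecker-sum construction of the joint generator can be sketched directly in NumPy; the helper names (kron_sum, incident_generator) and the chosen rates are illustrative, not the paper's.

```python
import numpy as np

def kron_sum(A, B):
    """Kronecker sum A ⊕ B = A ⊗ I + I ⊗ B for square matrices A, B."""
    return np.kron(A, np.eye(B.shape[0])) + np.kron(np.eye(A.shape[0]), B)

def incident_generator(alpha, beta):
    """Two-state generator: state 1 = no incident, state 2 = incident."""
    return np.array([[-alpha, alpha], [beta, -beta]])

# Joint generator of (X1(t), X2(t)) for two independent arcs; the argument
# order matches the Q2 ⊕ Q1 example, with the state of X1 varying fastest.
Q1 = incident_generator(0.1, 2.0)
Q2 = incident_generator(0.1, 2.0)
Q = kron_sum(Q2, Q1)  # 4 x 4, states ordered (1,1), (2,1), (1,2), (2,2)
```

Every row of a generator sums to zero, a property preserved by the Kronecker sum; repeating the construction for 16 arcs yields the 2^16-dimensional generator, which illustrates how quickly the state space grows.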
The edsger algorithm, which will be presented in Section 3, succeeds in overcoming the high computational complexity of DP-based methods. The idea behind edsger is inspired by the A*-algorithm, a speed-up version of Dijkstra's algorithm. We assess edsger on two criteria:
• distance-to-optimality, i.e., the difference between the value of the objective function and the optimal value. As the results of edsger deviate only marginally from the optimal value, edsger is close-to-optimal.
• efficiency, i.e., the run-time of the algorithm. edsger is highly efficient, as it is sufficiently fast to be used in real time.
To provide an impression of the achievable performance of edsger, we conducted an experiment on the network presented in Figure 1. We define the 17-dimensional background process B(t) as above. We chose α_i = 0.1 h⁻¹ and β_i = 2 h⁻¹ for i = 1, ..., 16, and λ = µ = 0.25 h⁻¹, meaning that the expected time between two incidents is 10 hours and the expected clearance time of an incident 30 minutes, whereas both the expected duration of a rain shower and the expected duration of a dry period equal 4 hours. These parameter values were chosen merely for illustrative purposes; in all later experiments we base ourselves on historical data. In case it does not rain and there is no incident on the arc, we let the vehicle speed be 120 km/h if there is no incident on the directly adjacent arcs, and 100 km/h otherwise. In case it does not rain and there is an incident on the arc, we let the vehicle speed be 50 km/h if there is no incident on the directly adjacent arcs, and lower otherwise. The results pertaining to this example are shown in Table 1. The performance of the edsger algorithm is compared with a deterministic Dijkstra-type algorithm that assumes that a vehicle can always drive at the maximum speed level and thus does not take any stochasticity into account ('DS', an abbreviation of 'Deterministic Static'). We in addition implemented a competitive DP algorithm for determining the policy that minimizes the expected travel time ('DP').
• The first column shows the expected travel time from A to D under the different policies, averaged (evenly) over all possible initial background states.
• The second column provides a weighted average of the expectations corresponding to all possible initial background states; for a given initial background state, the weight is chosen equal to its limiting probability.
• The last column contains the run-times of the three algorithms.
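The limiting probabilities used as weights in the second column solve pi Q = 0 with pi summing to one; a minimal sketch for the two-state incident process (rates as in the example above, function name ours):

```python
import numpy as np

def stationary(Q):
    """Stationary distribution pi of a CTMC with generator Q: pi Q = 0, sum(pi) = 1."""
    n = Q.shape[0]
    A = np.vstack([Q.T, np.ones(n)])          # append the normalization constraint
    b = np.concatenate([np.zeros(n), [1.0]])
    pi, *_ = np.linalg.lstsq(A, b, rcond=None)
    return pi

# Incident process with alpha = 0.1 and beta = 2 (per hour):
pi = stationary(np.array([[-0.1, 0.1], [2.0, -2.0]]))
# For a two-state chain, pi = (beta, alpha) / (alpha + beta).
```

Because the component processes are independent, the limiting probability of a joint background state is the product of the per-component limiting probabilities.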
A first conclusion is that the objective function achieved by edsger is close to its minimal value (as provided by DP). Secondly, even in this small network, the run-time of edsger is significantly lower than that of the competitive DP algorithm. Numerical experiments later in this paper will show that the run-time of the competitive DP algorithm grows exponentially with the network size, whereas the run-time of the edsger algorithm remains essentially real-time. We additionally conclude that ignoring the stochasticity, as is done by DS, leads to a fast but far from optimal algorithm; note in particular that the corresponding objective function is substantially higher than its minimal value (as provided by DP).
Mathematical framework
After the motivating example of the previous subsection, we now formally introduce the general mathematical framework that we will be working with. As before, we consider the objective of minimizing the expected travel time between a given origin-destination (OD) pair in the road network. Let G = (N, A) be a graph representing the road network, with N the set of nodes in G and A the set of directed arcs in G. The set N represents the intersections in the road network, whereas the set A represents the roads that connect these intersections: an arc kℓ ∈ A exists only if there is a (direct) road in the network between the intersections labeled k and ℓ. Let d_kℓ be the length of arc kℓ.
A realistic feature of our setup is that when an arc is traveled by a vehicle, the speed of this vehicle is not necessarily constant. Indeed, it may vary between finitely many values, related to e.g. the occurrence of incidents and weather conditions. To deal with these changing velocities, we introduce a Markovian background process on the arcs in A, of which the background process discussed in Section 2.1 is a special case. As we will argue below, the use of this stochastic process allows us to incorporate random effects (corresponding to non-recurrent congestion, that is) as well as (near-)deterministic patterns in the attainable speeds on the arcs (corresponding to recurrent congestion). We assume that the state of the background process and the corresponding traffic velocities are available while traveling. Indeed, travelers are able to adapt the chosen route while driving, based on the information available. In the sequel we phrase this dynamic stochastic shortest path problem with information on velocity levels in terms of a finite semi-Markov decision model.

Background process
Before presenting the full Markov-modulated environmental process, for expositional reasons we first show examples of background processes on small networks that fit into our framework, and gradually extend this setup. First consider the single-arc network in Figure 2. Define {X_kℓ(t), t ≥ 0} as the continuous-time Markovian background process on this arc, taking values in {1, 2}, similar to the modeling of incidents in the motivational example of the preceding subsection. The process X_kℓ(t) provides information on possible events on kℓ and determines the velocity of a vehicle traveling on kℓ: the speed at time t equals v_kℓ(i) if X_kℓ(t) = i, for i = 1, 2. For technical reasons it is throughout assumed that all these velocities are strictly positive, but in practical terms this is not a restriction, as they are allowed to have arbitrarily small values. The speed at which vehicles can travel on this arc is now completely described through the background process X_kℓ(t) and the corresponding velocities v_kℓ(i).
In the above example there are two possible speeds, but note that in practice one could work with more than two levels; one could e.g. think of the period between the clearance of an incident and the time at which the free-flow speed can be attained again. These dynamics can be incorporated by allowing the state space of the process X_kℓ(t) to consist of more than two states. In general we denote this state space by S_kℓ = {1, ..., n_kℓ}. When X_kℓ(t) = s ∈ S_kℓ, the speed at which vehicles are moving on arc kℓ is v_kℓ(s). We use the notation Q_kℓ to denote the corresponding transition rate matrix, of dimension n_kℓ × n_kℓ.

We now extend the network with an arc ℓm (Figure 3), and let B(t) = (X_kℓ(t), X_ℓm(t)) be the Markovian background process on this two-arc network. Here {X_kℓ(t), t ≥ 0} and {X_ℓm(t), t ≥ 0} are independent Markovian background processes on the arcs kℓ and ℓm, respectively. As in the single-arc case we denote the state space of X_kℓ(t) by S_kℓ = {1, ..., n_kℓ} and the transition rate matrix by Q_kℓ; we define S_ℓm and Q_ℓm analogously for the process X_ℓm(t). With X_kℓ(t) and X_ℓm(t) being independent, the transition rate matrix Q of B(t) is of the form

    Q = Q_kℓ ⊕ Q_ℓm,

where '⊕' is the Kronecker sum that was defined in (1). If B(t) is in state s ∈ I := S_kℓ × S_ℓm, we let the speed at which vehicles are moving be v_kℓ(s) on arc kℓ and v_ℓm(s) on arc ℓm. Importantly, this means that the velocity on each of the two arcs is allowed to depend on both X_kℓ(t) and X_ℓm(t). Hence, this way of modeling defines an implicit dependence between the velocities on the two arcs.
The presented setup can easily be extended by adding more arcs to the network. Consider the network G = (N, A) and label the directed arcs in A by k_1ℓ_1, k_2ℓ_2, ..., k_nℓ_n, with n := |A|. The Markovian background process can now be written as

    B(t) = (X_{k_1ℓ_1}(t), ..., X_{k_nℓ_n}(t)).

Here {X_{k_jℓ_j}(t), t ≥ 0}, with j ∈ {1, ..., n}, is a continuous-time Markov process with state space S_{k_jℓ_j} = {1, ..., n_{k_jℓ_j}} and transition rate matrix Q_{k_jℓ_j}. Dependence between the arcs can again be realized by letting the speed on each of the arcs depend on the state of the full vector B(t): when B(t) is in state s ∈ I := S_{k_1ℓ_1} × · · · × S_{k_nℓ_n}, the speed at which vehicles are moving on arc k_jℓ_j is v_{k_jℓ_j}(s). The dynamics of the vehicles moving on the network are now completely defined through the background process B(t) and the corresponding velocity levels. For instance, if B(u) = s, then the travel time on arc k_iℓ_i when leaving node k_i at time u is the value τ that solves

    ∫_u^{u+τ} v_{k_iℓ_i}(B(t)) dt = d_{k_iℓ_i},

where we use the fact that the travel distance can be computed as an integral over the travel speed. For transparency of notation we abbreviate this quantity to τ_{k_iℓ_i}(u), the travel time for arc k_iℓ_i when leaving node k_i at time u.
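Once a realization of the background process, and hence a piecewise-constant speed profile, is fixed, the defining relation (distance equals the integral of speed over the traversal interval) can be solved for the travel time numerically. A sketch under that assumption; the function and argument names are illustrative.

```python
def travel_time(distance, speed_of, jump_times, depart=0.0):
    """Solve the relation  integral of v(t) dt over the traversal = distance
    for the traversal time, where speed_of(t) returns the (piecewise-constant)
    speed in force at time t and jump_times lists the background-jump epochs
    at which the speed may change."""
    t, left = depart, distance
    for nxt in sorted(e for e in jump_times if e > depart) + [float('inf')]:
        v = speed_of(t)
        if left <= v * (nxt - t):       # arc completed before the next jump
            return t + left / v - depart
        left -= v * (nxt - t)           # distance covered up to the jump
        t = nxt
```

For example, covering 50 km at 100 km/h with a drop to 50 km/h after a quarter hour takes 0.25 h for the first 25 km and 0.5 h for the rest.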
There is global correlation in the above setup, due to the fact that the speed at which a vehicle on a given arc can drive depends on the full background process B(t), containing information on the congestion levels of all arcs. Previous research has shown that correlation due to non-recurrent congestion is primarily local; the congestion level of an arc mainly affects the attainable velocities at upstream arcs that are within a certain distance of the incident arc [29,45]. We model this property through the following assumption.
Assumption 1 (local r-correlation). Denote by A^r_kℓ the set of arcs that are at most r arcs away from arc kℓ (including kℓ itself). We assume the velocity on arc kℓ to depend only on the congestion levels of the arcs at most r arcs away:

    v_kℓ(s) = v_kℓ(s')  for all s, s' ∈ I that agree on the components corresponding to the arcs in A^r_kℓ.

Before continuing to the SMDP routing framework induced by our proposed model, the next remark discusses a possible extension of our model, which was already noted in the motivational example.
Remark 2. Suppose there is a Markovian process Y(t) that prompts global correlation in the network, e.g. weather conditions that affect the velocities on all arcs in the network. This can be incorporated into our framework by expanding the process B(t) with the process Y(t). We let B(t) = (Y(t), X_{k_1ℓ_1}(t), ..., X_{k_nℓ_n}(t)) be such that, given the state of Y(t), the future evolutions of X_{k_iℓ_i}(t) and X_{k_jℓ_j}(t) are independent for i ≠ j. The transition rate matrix for the case Y(t) = s is then given by

    Q^s = Q^s_{k_1ℓ_1} ⊕ · · · ⊕ Q^s_{k_nℓ_n},    (3)

with Q^s_{k_jℓ_j} the transition rate matrix for the congestion levels at arc k_jℓ_j given Y(t) = s. In this way our model allows for global dependence when information on processes that affect velocities in the whole network is available. In the remainder of this paper we will mainly focus on the case without global correlation, but, by using transition matrices of the type (3), the results of this paper can be extended to the case of a known global dependence process. ♦

Remark 3. By the use of phase-type distributions, the proposed Markov model for congestion is not limited to exponentially distributed times between states of congestion [1]. Importantly, phase-type distributions can model random quantities that are less variable than the exponential distribution (e.g. by using Erlang distributions) as well as random quantities that are more variable than the exponential distribution (e.g. by using hyperexponential distributions). In particular, for highly predictable events (think of recurring events, such as rush hours), Erlang distributions with a high number of phases, thus leading to a low variance, are well suited. ♦
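The variance reduction mentioned in Remark 3 can be made concrete with a small calculation: an Erlang distribution with n phases and fixed mean has variance mean²/n, so increasing the number of phases makes the modeled duration nearly deterministic. A minimal sketch (function name ours):

```python
def erlang_moments(n_phases, total_mean):
    """Mean and variance of an Erlang(n_phases) distribution with the given
    total mean; each phase is exponential with rate n_phases / total_mean."""
    rate = n_phases / total_mean
    return n_phases / rate, n_phases / rate ** 2

# One phase recovers the exponential distribution; sixteen phases shrink the
# variance 16-fold, suiting nearly deterministic events such as rush hours.
```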

Semi-Markov Decision Process
The objective of this subsection is to phrase our optimization problem in terms of an SMDP. Above we introduced our stochastic process modeling the velocity dynamics. Our setup is somewhat in the spirit of the one used in the work of Psaraftis and Tsitsiklis [46], Kim, Lewis and White [33,34] and Sever et al. [49], who use an MDP to capture the stochasticity in travel times. Importantly, as opposed to such earlier MDP-based approaches, we impose a continuous-time, rather than discrete-time, stochastic process on the arc speeds. Although this comes at the expense of additional computational complexity, it has several advantages:
• When a vehicle travels an arc, changing conditions on this arc are taken into account. In the conventional MDP framework it is assumed that if a vehicle is at an intersection, travel times on the attached arcs are known and choices are based on these known travel times. Conditions on these attached arcs may, however, change while driving, affecting the arrival time.
• Travel times are a continuous function of the departure epoch. Therefore the consistency, or FIFO, property [54] is naturally satisfied (see Appendix A for the proof).
• In both our framework and the MDP setting, the Markov processes on the links are assumed independent. In the MDP framework this implies independence between the travel times on the links. By imposing a background process affecting the arc velocities, we instead provide the opportunity to incorporate dependence between the travel times on the links.
Routing in our framework can be analyzed using a finite semi-Markov decision process (SMDP). The state of the SMDP consists of the location of the vehicle combined with the state of the background process. In a practical context, the first can be tracked by GPS, whereas the latter can be provided by an ITS. Decision epochs are the arrival times at the intersections, in the sense that at these epochs one has the opportunity to adapt the route. Denote at decision epoch i the state of the SMDP by (K_i, B(t_i)), with K_i ∈ N the label of the current intersection and B(t_i) the state of the background process at the current time t_i. The following (reasonable) assumptions are made:
• Waiting at a node is not allowed. Note that this assumption does not affect an optimal policy, as the FIFO property implies that waiting is never advantageous.

• Information on the congestion levels on all arcs is available at all times.
• There is at least one path connecting every node with the destination node. If this assumption is not fulfilled, we have a non-connected graph for which we can only construct routing policies for OD-pairs within connected components.
At decision epoch i, the goal is to choose a neighbor k_{i+1} of k_i such that the expected travel time from k_i to the destination k* is minimized. A policy matrix assigns a successor node to each node and each current state of the background process (i.e., each pair (k, s) with k ∈ N, s ∈ I). Given destination k*, the expected travel time under a policy π and initial state (K_0, B(t_0)) is given by

J_π(K_0, B(t_0)) = E[ Σ_{i=0}^{I−1} τ^{B(t_i)}_{K_i, π(K_i, B(t_i))} ],    (4)

in which I denotes the number of decision epochs until the destination k* is reached, t_i the time of decision epoch i, and π(K_i, B(t_i)) the action under policy π given the pair (K_i, B(t_i)). The optimal policy is now a policy π* such that

J_{π*}(k, s) = min_{π ∈ Π} J_π(k, s) for all (k, s) ∈ N × I,

with Π denoting the set of admissible policies (i.e., policies for which the probability of reaching the destination is one for all initial states). For every non-admissible policy π we set J_π(k, s) := ∞ for all (k, s) ∈ N × I.
We conclude this section by deriving expressions for both the expected travel time and the transition probabilities pertaining to a single arc. The resulting quantities are building blocks of the routing policies that will be discussed in the next section. To this end, define, for d ∈ [0, d_{kℓ}] and s, s′ ∈ I, the expected travel time φ_s(d | k, ℓ) on arc (k, ℓ) after distance d, given initial background state s. In addition, we will work with the Laplace–Stieltjes transform of the travel time on edge (k, ℓ), intersected with the event that upon completion the background state is s′. The objects introduced above can be evaluated by conditioning on a possible jump of the background process in a small time interval. As shown in the following theorem, this leads to a system of linear differential equations, whose solution can be given in terms of matrix exponentials. A proof is given in Appendix B.
Theorem 1. Given a graph G = (N, A) with a pair of nodes k, ℓ ∈ N and a distance d ≥ 0, it holds that

Φ′(d | k, ℓ) = V_{kℓ}^{−1} Q Φ(d | k, ℓ) + V_{kℓ}^{−1} 1,   Φ(0 | k, ℓ) = 0,

with Φ(d | k, ℓ) = {φ_s(d | k, ℓ)}_{s∈I} and 1 a |I|-dimensional column vector of ones. A solution for this system of linear differential equations can be written as

( Φ(d | k, ℓ) ; 1 ) = exp( d [ V_{kℓ}^{−1}Q , V_{kℓ}^{−1}1 ; 0^⊤ , 0 ] ) ( 0 ; 1 ),    (5)

with 0 an |I|-dimensional column vector of zeroes.
Due to our requirement that all the v_{kℓ}(s) are positive, the matrix V_{kℓ}^{−1} is well-defined. With the above result the expected travel time on an arc can be directly computed using the expression for Φ(d | k, ℓ). An expression for the transition probabilities can be found by setting α equal to 0 in (6):

Corollary 1. For a pair of nodes k, ℓ ∈ N and a pair of background states s, s′ ∈ I,

P(B(τ^s_{kℓ}) = s′) = [ exp( d_{kℓ} V_{kℓ}^{−1} Q ) ]_{s, s′}.    (7)

Remark 4. The upper left |I| × |I|-block of the matrix exponential in (5) is the matrix exponential of d V_{kℓ}^{−1} Q. Thus, to compute both the expectation E[τ^s_{kℓ}] and the transition probabilities P(B(τ^s_{kℓ}) = s′), computation of the matrix exponential given in (5) suffices. ♦
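The block matrix exponential of Theorem 1 and Remark 4 translates directly into code. The sketch below (the helper name `arc_stats` is hypothetical) computes, for a single arc, both the expected travel time per initial background state and the background-state transition probabilities, given the arc's generator Q, speed vector v, and length d.

```python
import numpy as np
from scipy.linalg import expm

def arc_stats(Q, v, d):
    """Per-arc quantities via the block matrix exponential of Theorem 1.
    Q: |I| x |I| generator of the background process on this arc;
    v: per-state speeds (all positive); d: arc length.
    Returns (P, Etau): P[s, s'] = P(B(tau^s) = s'), Etau[s] = E[tau^s]."""
    n = len(v)
    Vinv = np.diag(1.0 / np.asarray(v, dtype=float))
    M = np.zeros((n + 1, n + 1))
    M[:n, :n] = Vinv @ Q            # upper-left block: V^{-1} Q
    M[:n, n] = Vinv @ np.ones(n)    # upper-right block: V^{-1} 1
    E = expm(d * M)
    return E[:n, :n], E[:n, n]

# two-state arc of 60 km: speeds 100 and 10 km/h (illustrative numbers)
Q = np.array([[-1.0, 1.0], [2.0, -2.0]])
P, Etau = arc_stats(Q, [100.0, 10.0], 60.0)
```

Since V^{-1}Q is again a generator, the returned matrix P is stochastic, and the expected travel times lie between d divided by the largest and smallest speed.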

Dynamic Routing Algorithms
Recall that our objective is to minimize the expected travel time for a given OD-pair in a road network in which a vehicle may experience changing velocities, and in which one knows the current state of the driving background process B(t). After having provided background on the optimal policy resulting from DP (Section 3.1), we present two Dijkstra-like shortest path algorithms that aim to output a (near-)optimal routing policy. These algorithms are dynamic shortest path algorithms, in that at every new intersection a shortest path algorithm is run again. The analysis of Section 3.2 reveals that the first of these two algorithms, which we have called edsger, suffers from the curse of dimensionality. The second presented algorithm, edsger*, as introduced in Section 3.3, overcomes this drawback and offers real-time response. Useful implementation details, for these two algorithms as well as for the DP-based approach, are provided in Section 3.4. We conclude with Section 3.5, which presents speed-up techniques that reduce the state space and/or network size, to be used to further decrease the computational costs.
3.1. Optimal Policy by Dynamic Programming. In this subsection we present an algorithm that outputs an optimal routing policy. As we will see in our numerical examples, this method is prohibitively slow; in this paper we mainly use it as a benchmark for our routing algorithms edsger and edsger*. We will argue below that optimal policies can be characterized as solutions of a set of Bellman optimality equations [3,4,6,7,47,48,51]. DP methods, the standard algorithms that solve these optimality equations, have high computational costs, so that they are not particularly suitable for real-time routing purposes. Details on the implementation of the DP algorithm are provided in Section 3.4.
We proceed by arguing that algorithms outputting an optimal routing policy suffer from the curse of dimensionality. In the case of an MDP, Güner et al. [28], Kim et al. [33,34] and Sever et al. [49] have characterized an optimal policy as satisfying the Bellman optimality equations. Importantly, this property carries over to our framework, as can be seen as follows. First note that we have a destination node k* which is cost-free and absorbing, i.e., k* = π(k*, s) and J_π(k*, s) = 0 for all s ∈ I. Furthermore, denoting by k_1 := π(k_0, B(t_0)) the next node under an admissible policy π given the state (k_0, B(t_0)), with k_0 ∈ N \ {k*} and t_0 ∈ R, (4) implies

J_π(k_0, s) = E[τ^s_{k_0 k_1}] + Σ_{s′∈I} P(B(τ^s_{k_0 k_1}) = s′) J_π(k_1, s′).

Thus, any admissible policy satisfies a set of Bellman equations. Denote by NB(k) the set of neighbors of a node k ∈ N, i.e., ℓ ∈ NB(k) only if kℓ ∈ A (note that k ∉ NB(k), as it is not allowed to wait at a node). An optimal policy π* can then be given through the Bellman optimality equations:

J(k, s) = min_{ℓ ∈ NB(k)} ( E[τ^s_{kℓ}] + Σ_{s′∈I} P(B(τ^s_{kℓ}) = s′) J(ℓ, s′) ).    (8)

The expected travel times and transition probabilities in (8) can be computed relying on the expressions derived in the previous section (in particular Equations (5) and (7)).
Algorithms that solve the Bellman optimality equations have received substantial attention in the literature.
Examples of frequently used solution methods are dynamic programming (DP) methods (e.g., value iteration and policy iteration) and linear programming. It should be realized that the size of our state space may explode: even in a simple setup in which every link has only two possible background states, a network with |A| links leads to a state space of size 2^|A|. We will therefore employ value iteration (VI) to solve the optimality equations [51]. For large state spaces the VI algorithm does, however, suffer from high complexity:
• in every iteration of the algorithm the right-hand side of (8) is computed for every possible B(t_0) ∈ I, so that the complexity of the value iteration algorithm contains a term |I|^2;
• computation of the expectations and the transition probabilities, which requires the evaluation of a matrix exponential as in (7), becomes intractable when the transition rate matrix Q is large;
• a large policy matrix leads to high memory costs.
DP thus suffers from the curse of dimensionality: expanding the network leads to an exponential increase in the size of the state space and a corresponding exponential increase in computational costs.
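The value-iteration scheme for the Bellman equations (8) can be sketched compactly. In the sketch below, the interface `arc_stats(k, l)`, returning the transition-probability matrix and the per-state expected travel times for arc (k, l), is a hypothetical abstraction; this is an illustration of the scheme, not the paper's implementation.

```python
import numpy as np

def value_iteration(nodes, dest, neighbors, arc_stats, n_states,
                    tol=1e-9, max_iter=1000):
    """Value iteration for J(k, s) = min over l in NB(k) of
    E[tau_kl^s] + sum_{s'} P(s -> s') J(l, s'); cf. equation (8).
    J[k] is a vector indexed by the background state."""
    J = {k: np.zeros(n_states) for k in nodes}
    for _ in range(max_iter):
        delta = 0.0
        for k in nodes:
            if k == dest:
                continue  # destination is cost-free and absorbing
            new = np.full(n_states, np.inf)
            for l in neighbors[k]:
                P, Etau = arc_stats(k, l)
                new = np.minimum(new, Etau + P @ J[l])
            delta = max(delta, np.max(np.abs(new - J[k])))
            J[k] = new
        if delta < tol:
            break
    return J

# toy instance with a single background state: two routes from 0 to 2
def stats(k, l):
    cost = {(0, 1): 1.0, (1, 2): 1.0, (0, 2): 3.0}[(k, l)]
    return np.array([[1.0]]), np.array([cost])

J = value_iteration([0, 1, 2], 2, {0: [1, 2], 1: [2], 2: []}, stats, 1)
```

With one background state the scheme collapses to ordinary shortest paths, which makes it easy to sanity-check.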
Based on the above, it can be concluded that VI has limited potential in practical settings. We therefore introduce two dynamically-used Dijkstra-like shortest path algorithms, edsger and edsger*; here 'dynamically-used' means that the shortest path algorithm is run at every decision epoch along the way. edsger provides near-optimal solutions but, similar to VI, suffers from the curse of dimensionality. edsger* is an adaptation of edsger that uses the local-r-correlation assumption to overcome the curse of dimensionality. Importantly, the algorithm offers real-time response, while still providing near-optimal solutions.

3.2. edsger algorithm. The edsger algorithm (where edsger is an abbreviation of 'Expected Delay in Stochastic Graphs with Efficient Routing') can be seen as a dynamically-used stochastic version of the A*-algorithm. More concretely, the method works as follows:
• At every decision epoch (every arrival at a node, that is) we use a stochastic shortest path algorithm to identify a path with lowest expected cost, from the current node to the destination, provided that one is not allowed to change the route along the way. The resulting policy π_E is to travel to the next node along this path. Thus, given that a vehicle is at node k_0 ∈ N at time t_0, edsger uses a shortest path algorithm with input (k_0, B(t_0)) to output a path from k_0 to k*. Then, denoting by k_1 the first node in this path, we have that π_E(k_0, B(t_0)) = k_1.
• This procedure is then 'dynamically-used' in the sense that it is repeated upon arrival at k_1, so as to identify k_2 := π_E(k_1, B(t_1)). Given that the state of the background process may have changed while traveling from k_0 to k_1, this next node may differ from the one that was selected at k_0. The procedure thus exploits the information currently available.
• One proceeds along these lines until the destination node k* has been reached.
A first important remark is that this procedure does not necessarily minimize the expected delay, i.e., the resulting route does not always coincide with the one generated by the DP-based algorithm discussed earlier. We note that one reason for this potential loss is the fact that edsger uses a stochastic A*-like shortest path algorithm. In a network with stochastic travel times the notion that a subpath of a shortest path is itself a shortest path, a property on which Dijkstra's algorithm and A* are built, does not generally hold, so that the output is not guaranteed to be the shortest path in expectation. An instance (in our context) of this finding, which is often attributed to Hall [30], is given in the following example.
Example. A vehicle wants to travel from k_0 to k* in the network of Figure 4, in which each arc is 60 km long. A state of the background process B(t) is a 3-tuple consisting of the states of (k_0k_1)_1 (the upper link between k_0 and k_1), (k_0k_1)_2 (the lower link between k_0 and k_1) and k_1k*, in that order. A closer look at the velocities reveals that the speed on link (k_0k_1)_1 is always 60 km/h, whereas the speeds on links (k_0k_1)_2 and k_1k* can take two and three values, respectively. For instance, the speed on link (k_0k_1)_2 is 100 km/h if this link is in state 1 and 10 km/h if this link is in state 2 (irrespective of the states of the other links). We consider the situation that the initial state of the background process is B(t_0) = (1, 1, 1). Using Theorem 1 we find the expected costs of traveling to k_1 via the upper and the lower link; the upper link, with its fixed speed, takes exactly 1 h and has the lower expected travel time, so edsger would advise to travel via (k_0k_1)_1. Comparison of the expected travel times of the two paths between k_0 and k* does however reveal that this is not optimal, as these are 12.21 h and 11.58 h for traveling via (k_0k_1)_1 and (k_0k_1)_2, respectively. Thus, even though (k_0k_1)_1 is on the optimal path to k_1, it is not on the expected shortest path to k*. A reason for this phenomenon lies in the fact that edsger only takes the expectation into account, while variability plays a role here as well, as this variability directly relates to different conditions on future links. To see this, first note that traveling to k_1 via the upper link will always take 1 hour. Upon arrival in k_1 there is a high probability that the background process of k_1k* has transitioned to a state with reduced speed. If link (k_0k_1)_2 spends most time in state 1, traveling to k_1 via the lower link can be done within 1 hour. In that case there is still a high probability that link k_1k* can (partly) be traveled at speed 100 km/h. Traveling via (k_0k_1)_2 therefore yields a lower expected travel time than traveling via (k_0k_1)_1, despite the fact that the expected travel time to k_1 is lower via the upper link. ♦

Figure 4. Example in which edsger is not optimal, as the optimal subpath is not in the optimal path.
A second reason for the potential loss in optimality lies in the fact that edsger does not exploit its dynamical use to the fullest extent: since in the DP approach one knows that one is allowed to change the path along the way, it potentially leads to lower cost than the dynamically-used version of edsger. This effect is highlighted in the example provided below.
Example. The objective is to minimize the expected travel time from k_0 to k* in the network of Figure 5 with B(t_0) = (1, 1, 1, 1). Every arc has two states, and a vehicle can drive 100 km/h on an arc if the state of the arc is 1 and 80 km/h otherwise. The transitions from state 1 to state 2 have rate 0.1 and the transitions from state 2 to state 1 have rate 1, for all arcs. We first note that the expected travel times on all paths are equal, so that edsger cannot distinguish between these paths. However, edsger does not take into account that, in case k_1 is chosen as the next node, updated information about the speeds on the two optional paths from k_1 to k* can be used to pick the optimal path given this updated information. VI does use this information and as such picks the route via k_1, yielding a 37-second improvement in expected travel time. Note that, for a travel distance of 100 km, this is only a very small improvement. ♦

Whereas the above instance shows that in principle we can construct cases in which DP leads to lower expected delay than edsger, the experiments of Section 4 show that the difference is typically tiny (if there is a difference at all). Importantly, the modest loss of optimality that edsger experiences is compensated by a potentially significant improvement in terms of computational cost: the experiments show that the computational effort of edsger can be substantially lower than that of the VI algorithm. This is because, in contrast to VI, edsger does not necessarily compute the expectation and transition probabilities on all arcs in the network.
We now provide a detailed description of the shortest path algorithm that is used within edsger. As discussed above, edsger uses this stochastic shortest path algorithm to output a path from the current node to the destination. The first node in this path defines the next action, i.e., the policy of edsger is to travel to this node.
The shortest path algorithm is a stochastic version of the A*-algorithm. It assigns a label to every node in the graph and updates these labels iteratively. The initial labels are D_{k_0} = 0 for the source k_0 and D_k = ∞ for k ∈ N \ {k_0}. A set V is used to store nodes with labels that are no longer altered by subsequent steps of the algorithm, and is initialized by V := ∅. In every iteration of the algorithm a new 'current node' c is chosen according to

c := argmin_{k ∈ N \ V} ( D_k + lb_k ).    (9)

Here lb_k is a lower bound on the travel time from node k to the destination k*. Since the maximum speeds and the distances of all arcs are known, lower bounds on the arc travel times can be computed in an elementary way. Applying Dijkstra's algorithm on a graph with these lower bounds yields lb_k for k ∈ N.
With the current node c chosen according to (9), the earlier derived expressions (5) and (7) are used to compute

d_k := D_c + p_c^⊤ Φ_ck,    (10)
p_k^⊤ := p_c^⊤ P_ck,    (11)

for all k ∈ NB(c), with P_ck := (P(B(τ^s_{ck}) = s′))_{s,s′∈I}. In the expressions (10) and (11) we let p^s_k denote the probability that upon arrival in k ∈ N the background state is s ∈ I, and p_k = (p^s_k)_{s∈I} is the vector of these probabilities. In addition, Φ_ck is the |I|-dimensional vector with entries E[τ^s_{ck}] = φ_s(d_{ck} | c, k) for s ∈ I. Note that, as D_c can be seen as an estimate of the expected travel time from k_0 to c, d_k is an estimate of the expected travel time from k_0 to k. The labels of the neighbors k ∈ NB(c) \ V are now updated, i.e., if d_k < D_k we set D_k = d_k and store the vector p_k for this node (replacing a previously stored value if present). We also store the proposed path to k, which is the stored path to c with k itself appended. After performing this updating step, c is added to V and a new current node c is chosen, again according to (9).
The described procedure is repeated until the destination node k* is set as the current node c. The path the algorithm outputs is the stored path for k*. In the implementation of the algorithm a heap H can be used to reduce the complexity, similar to the use of the heap in Dijkstra's algorithm [27]. Pseudocode is provided in Algorithm 1. Despite the fact that, in contrast to the VI algorithm, edsger does not necessarily compute the expectation and transition probabilities for all arcs in the network, the algorithm still suffers from the curse of dimensionality.
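The label-setting loop above can be sketched in Python as follows. The interfaces `arc_stats(c, k)` (returning the transition-probability matrix and the expected arc travel times per background state) and the lower-bound table `lb` are hypothetical abstractions; this is an illustration of the search, not the paper's Algorithm 1.

```python
import heapq
import numpy as np

def edsger_search(source, dest, neighbors, arc_stats, p0, lb):
    """A*-like label setting with expected-cost labels D and arrival-state
    distributions p, as in the updating rules for d_k and p_k.
    p0: distribution of the background state at departure (one-hot if known)."""
    D = {source: 0.0}
    p = {source: np.asarray(p0, dtype=float)}
    path = {source: [source]}
    visited = set()
    heap = [(lb[source], source)]          # priority D_k + lb_k
    while heap:
        _, c = heapq.heappop(heap)
        if c in visited:
            continue                        # stale heap entry
        visited.add(c)
        if c == dest:
            return path[c], D[c]
        for k in neighbors[c]:
            if k in visited:
                continue
            P, Etau = arc_stats(c, k)
            d_k = D[c] + p[c] @ Etau        # estimated expected time to k
            if d_k < D.get(k, np.inf):
                D[k] = d_k
                p[k] = p[c] @ P             # propagated state distribution
                path[k] = path[c] + [k]
                heapq.heappush(heap, (d_k + lb[k], k))
    return None, np.inf
```

Stale heap entries are skipped on pop, the usual trick for decrease-key with Python's `heapq`.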
In this respect, note that the cost of evaluating the matrix exponential in (5) contributes substantially to the cost of edsger. Moreover, in case of a large state space, the dimension of Q will make this evaluation intractable. To overcome this drawback, we introduce the edsger* algorithm. edsger* can be regarded as an improved version of edsger, as it exploits the local-correlation structure of the background process to reduce the computational costs of edsger drastically.
3.3. edsger* algorithm. Both the VI algorithm and the edsger algorithm do not use the assumption of local-r-correlation (Assumption 1), which states that the velocity on an arc in A only depends on the congestion levels of the arcs at most r arcs away. The main idea behind the edsger* algorithm is to exploit this assumption, so as to overcome the curse of dimensionality. Numerical examples in the next section will show that edsger* is indeed highly efficient and offers real-time response, while still providing near-optimal results.
Similar to edsger, at any decision epoch edsger* uses a shortest path algorithm that aims to output a minimal path from the current node to the destination. The next arc in this path is traveled, and upon arrival in a new node the shortest path algorithm is called again. We denote the resulting routing policy by π_{E*}. The shortest path algorithm used in edsger* (Algorithm 2 below) is very similar to the algorithm used in edsger, but edsger* exploits Assumption 1, in combination with a specific approximation, to drastically speed up the computation of d_k and p_k.
The main idea behind the edsger* algorithm is that, under the assumption of local-r-correlation, the velocity levels on an arc kℓ only depend on the state of the background processes on the arcs in A^r_{kℓ}. This implies that the expectation of the travel time can be computed using just these processes. As above, we write A = {k_1ℓ_1, . . ., k_nℓ_n}. Denote by Q^r_{kℓ} the transition rate matrix of the process B^r_{kℓ}(t) = (X_{k_iℓ_i}(t), k_iℓ_i ∈ A^r_{kℓ}) with state space I^r_{kℓ}. The dynamics on arc kℓ can be completely described by the process B^r_{kℓ}(t), as the velocity levels on arc kℓ depend only on the state of this process. Thus, the velocity on arc kℓ at time t, defined as v_{kℓ}((X_{k_iℓ_i}(t))_{k_iℓ_i ∈ A}), only depends on the values of the entries k_iℓ_i ∈ A^r_{kℓ}. We can therefore write the velocity on arc kℓ at time t as v_{kℓ}((X_{k_iℓ_i}(t))_{k_iℓ_i ∈ A^r_{kℓ}}). Denote the truncation of s = (s_{k_iℓ_i})_{k_iℓ_i ∈ A} ∈ I to the entries corresponding to links in A^r_{kℓ} by s^r_{kℓ}, so that we can write v_{kℓ}(s^r_{kℓ}) for the velocity level on arc kℓ whenever s ∈ I.
Here V^r_{kℓ} is a diagonal matrix with entries v_{kℓ}(s^r_{kℓ}) for s^r_{kℓ} ∈ I^r_{kℓ}.
Proof. The claim follows from Theorem 1, local-r-correlation (Assumption 1), and the independence of the Markov processes on the arcs.
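Because the per-arc Markov processes are independent, the generator of the joint local process B^r_{kℓ}(t), and hence the matrix Q^r_{kℓ}, can be assembled as the Kronecker sum of the per-arc generators. A minimal sketch (the helper name is hypothetical):

```python
from functools import reduce
import numpy as np

def kron_sum(generators):
    """Generator of independent Markov chains run jointly:
    Q1 (+) Q2 (+) ... , where A (+) B = A kron I + I kron B.
    Used here to build the local matrix Q^r_kl from per-arc generators."""
    def ksum(A, B):
        return np.kron(A, np.eye(B.shape[0])) + np.kron(np.eye(A.shape[0]), B)
    return reduce(ksum, generators)

# two independent two-state arcs (illustrative rates)
Q1 = np.array([[-1.0, 1.0], [2.0, -2.0]])
Q2 = np.array([[-0.5, 0.5], [1.0, -1.0]])
Qjoint = kron_sum([Q1, Q2])
```

Each summand has zero row sums, so the result is again a valid generator; its dimension is the product of the per-arc state-space sizes, which is exactly why restricting to A^r_{kℓ} pays off.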
Particularly when r is relatively small, this way of computing the expected per-edge travel times yields significant computational savings, due to the fact that the dimension of Q^r_{kℓ} no longer grows exponentially with the number of arcs. This does however not directly yield a solution to the curse of dimensionality, as the computation of the transition probabilities still involves the matrix Q. We therefore propose to use the following (typically highly accurate) approximation for the transition probabilities. Recall that the transition probabilities of a Markov process B(t) with transition rate matrix A after a time t ∈ R can be expressed in terms of the matrix exponential e^{tA}. Note furthermore that we have a good estimate, D_c, for the travel time from k_0 to c ∈ N. Based on these two observations we approximate p^s_c, the probability that B^r_{ck}(t) = s upon arrival in c given that B(0) = s′, by [e^{D_c Q^r_{ck}}]_{(s′)^r_{ck}, s}. This results in the following expression for d_k:

d_k := D_c + Σ_{s ∈ I^r_{ck}} [e^{D_c Q^r_{ck}}]_{(s′)^r_{ck}, s} E[τ^s_{ck}],    (13)

cf. (10). The corresponding pseudocode is given in Algorithm 2.
Algorithm 2. Result: a path from k_0 to k*, given B(0) = s.

3.4. Implementation details. We now discuss several implementation details for the VI, edsger, and edsger* algorithms. We will in particular show how to rewrite specific computations in a form that allows the application of efficient numerical routines. Moreover, we recommend the use of efficient data structures. We start by discussing the use of these techniques for the VI algorithm and continue with the edsger and edsger* algorithms.
• The value iteration algorithm. As discussed in Section 3.1, this DP algorithm solves the Bellman optimality equations and outputs a routing policy that minimizes our objective function. It is an iterative procedure that assigns values Ĵ_0(k, s) to all (k, s) ∈ N × I and updates these values such that they converge to J_{π*}, the set of optimal values. Concretely, the algorithm sets Ĵ_0(k*, s) = 0 and Ĵ_0(k, s) = ∞ for k ∈ N, k ≠ k*, s ∈ I, and iteratively updates these values according to the scheme

Ĵ_i(k, s) = min_{ℓ ∈ NB(k)} ( E[τ^s_{kℓ}] + Σ_{s′∈I} P(B(τ^s_{kℓ}) = s′) Ĵ_{i−1}(ℓ, s′) ),    (14)

cf. the scheme (8). Convergence of the iterative procedure has been widely studied in the literature; we refer to e.g. [9] and [11] and references therein. Specifically, it has been proven that Ĵ_i(k, s) converges to J_{π*}(k, s) for all k ∈ N, s ∈ I.
The expectations and transition probabilities in the updating step (14) can be computed by applying Theorem 1 and Corollary 1. As was already noted in Remark 4, the computation of the (|I| + 1) × (|I| + 1)-dimensional matrix exponential in (15) suffices. Note that its upper left |I| × |I|-block and upper right |I| × 1-block correspond to (P(B(τ^s_{kℓ}) = s′))_{(s,s′)∈I×I} and (E[τ^s_{kℓ}])_{s∈I}, respectively. This means that, writing p^s_{kℓ} = (P(B(τ^s_{kℓ}) = s′))_{s′∈I} and Ĵ_{i−1}(ℓ) = (Ĵ_{i−1}(ℓ, s′))_{s′∈I}, the iterative step (14) can be written in the vector form (16). Importantly, it now suffices to compute the product of the matrix exponential (15) and the vector (Ĵ_{i−1}(ℓ), 1) to evaluate the objective function in (16). For such a product, also known as the action of a matrix exponential, most scientific computing environments include compiled functions, examples being MatrixExp[m, v] in Mathematica and scipy.sparse.linalg.expm_multiply( ) in Python; these are typically considerably faster than first computing the matrix exponential and then the matrix-vector product.
Besides the evaluation of the matrix exponential, the computational costs of DP are strongly affected by the data structure used for the Q-matrix. Implementing the Q-matrix in a sparse format significantly decreases the costs of constructing and storing this matrix. Sparsity additionally reduces the costs of updating the values Ĵ_i(k, s), since compiled matrix-exponential routines are generally faster when the input is a sparse matrix.
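The two implementation points, acting with the exponential instead of forming it, and storing Q sparsely, can be illustrated together with SciPy. The sketch below builds a random generator-like matrix (zero row sums, illustrative sizes and density) and checks that the action `expm_multiply` agrees with the dense computation.

```python
import numpy as np
from scipy.linalg import expm
from scipy.sparse import csr_matrix
from scipy.sparse.linalg import expm_multiply

rng = np.random.default_rng(0)
n = 100
# sparse generator: a few positive off-diagonal rates per row,
# diagonal chosen so that every row sums to zero (as for a Q-matrix)
A = rng.random((n, n)) * (rng.random((n, n)) < 0.05)
np.fill_diagonal(A, 0.0)
np.fill_diagonal(A, -A.sum(axis=1))
v = rng.random(n)

direct = expm(A) @ v                      # dense: forms the full exponential
action = expm_multiply(csr_matrix(A), v)  # computes e^A v without forming e^A
```

For large sparse Q-matrices the action route avoids the dense O(n^3) exponential altogether, which is precisely the speed-up exploited above.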
• The edsger and edsger* algorithms. Also in the implementation of edsger and edsger*, a significant speed-up can be achieved by (i) working with the action of the matrix exponential and (ii) exploiting the sparsity of the Q-matrix. Here we focus on the former, i.e., the speed-up due to the use of the action of the matrix exponential, as the motivation for using a sparse Q-matrix is identical to that in the value iteration case.
Note that edsger uses a matrix exponential in the computations of d_k and p_k (see (10) and (11)). If we let P_ck be the matrix (P(B(τ^s_{ck}) = s′))_{(s,s′)∈I×I}, then (11) reads p_k^⊤ = p_c^⊤ P_ck. Recall that for a vector p and square matrix A for which p^⊤ e^A exists, we have that (p^⊤ e^A)^⊤ = (e^A)^⊤ p = e^{A^⊤} p, so that p_k can again be computed as the action of a matrix exponential on a vector. The same procedure can be followed to compute (13) in the implementation of edsger*.
3.5. Decreasing the state space. edsger* yields significant computational savings compared to DP (using value iteration) and the edsger algorithm. However, further substantial reductions of the computational costs are possible. This is achieved by performing preprocessing steps that reduce the network size and the state space of the background process. These are general speed-up techniques and can also be applied in the context of value iteration or edsger. Three specific ideas will be discussed in greater detail. The first one uses Yen's algorithm [56,57] to reduce the size of the network, by deleting arcs that cannot be on shortest paths. The other proposed ideas relate to decreasing the state space of the background process: the first considers hitting probabilities and excludes background states for which the hitting probability is below a given threshold, whereas the second uses historical data to exclude background states that are (extremely) rare.
The first proposed speed-up technique decreases the network size by deleting certain nodes and arcs. It is motivated by the fact that not all arcs in the network will be considered by travelers. The network in Figure 6 depicts the shortest route (in distance) from Almere to Dronten in the Dutch road system. In case there is congestion on this route, a traveler might wish to take a different route. One can argue, however, that the conditions in the network will never be such that the traveler wishes to use an arc around distant cities such as, in our example, Eindhoven. As a consequence, the roads around the city of Eindhoven can be omitted when considering the routes from Almere to Dronten. Concretely, let G′ denote the subnetwork consisting of the arcs that appear in the m shortest paths produced by Yen's algorithm; we propose to reduce the network size by only considering G′ in the routing problem. Note that the value of m should not be too large, as this would not yield computational savings, but also not too small, as this might eliminate arcs that are on the optimal route.
Güner et al. [28] propose to reduce the network G to a network similar to G′, and compute a routing policy for this reduced network. We however propose an additional step in the construction of the reduced network, in which we add edges to G′ that offer alternative routes in case these do not exist in G′. An example is provided in Figure 7. Applying Yen's m-shortest path algorithm with m = 5 in the graph of Figure 7a yields Figure 7b, which shows that arc k_1k* is present in each of the 5 shortest paths outputted by Yen's algorithm. This is undesirable, since this arc can be congested during the period of travel, meaning a traveler might prefer arc k_0k*. After an application of Yen's algorithm, k_0k* is however no longer in the considered network and therefore not in the routing policy. A real-world example is shown in Figure 7c, where a vehicle wishes to travel from Groningen to Schagen. Application of Yen's m-shortest path algorithm may lead to m paths that all contain the Afsluitdijk dam (black marker). A few times a year the Afsluitdijk is closed for a few hours due to a major accident, and in that case there are no alternative routes in G′.
To avoid a situation as described above, in which there is no alternative to a heavily congested arc, a second step is introduced, in which arcs that offer such alternative routes are added to G′. To this end, it is first checked which arcs appear in all of the m shortest paths. Yen's algorithm is then repeatedly used to find l shortest paths in the network from which one of these arcs is deleted each time. The resulting sets of l shortest paths are added to G′, and this network is used to construct a routing policy.
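This two-step reduction can be sketched with NetworkX, whose `shortest_simple_paths` generator enumerates paths in order of increasing weight (a Yen-style enumeration). The helper name and parameters below are hypothetical, and the sketch illustrates the construction rather than reproducing the paper's procedure.

```python
from itertools import islice
import networkx as nx

def reduced_network(G, source, target, m=5, l=2, weight="length"):
    """Step 1: keep the arcs of the m shortest paths (Yen-style).
    Step 2: for every arc shared by all m paths, remove it and add the
    l shortest alternative paths, so a closure of that arc never leaves
    the reduced network without a route."""
    paths = list(islice(
        nx.shortest_simple_paths(G, source, target, weight=weight), m))
    kept = {(u, v) for p in paths for u, v in zip(p, p[1:])}
    # arcs present in every one of the m shortest paths
    common = set(zip(paths[0], paths[0][1:]))
    for p in paths[1:]:
        common &= set(zip(p, p[1:]))
    for u, v in common:
        H = G.copy()
        H.remove_edge(u, v)
        try:
            for p in islice(
                    nx.shortest_simple_paths(H, source, target, weight=weight), l):
                kept |= set(zip(p, p[1:]))
        except nx.NetworkXNoPath:
            pass  # no alternative exists without this arc
    return G.edge_subgraph(kept).copy()

# toy graph: two parallel routes 0-1-3 and 0-2-3, plus a long direct arc 0-3
G = nx.DiGraph()
G.add_edge(0, 1, length=1.0); G.add_edge(1, 3, length=1.0)
G.add_edge(0, 2, length=1.0); G.add_edge(2, 3, length=1.5)
G.add_edge(0, 3, length=5.0)
R = reduced_network(G, 0, 3, m=2)
```

In the toy instance the two cheapest routes share no arc, so step 2 adds nothing and only the long direct arc is pruned.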
To analyze the dynamics on an arc kℓ ∈ A′, we only need information on the state of the process B^r_{kℓ}(t). Decreasing the network size as described above can therefore yield a significant reduction in the size of the state space as well: for kℓ ∈ A \ (∪_{ij ∈ A′} A^r_{ij}), the corresponding Markov processes X_{kℓ}(t) give no information on the dynamics on G′ and can therefore be omitted in any further analysis.
3.5.2. Use of bounds on hitting probabilities. One technique for reducing the state space of the background process is based on hitting probabilities. The idea is to only include, for a small number ε, background states s ∈ I for which P(B(t) = s) > ε for some t ≤ T, with T > 0 denoting the time of arrival at destination k*. Since this time horizon is evidently not known in advance, one could work with an M > 0 that serves as a crude upper bound on T. Given an initial background state s′ ∈ I, an upper bound for the hitting probabilities is then given by the product of the per-arc probabilities f_{s′_i s_i}(M) of reaching state s_i from s′_i within time M, where independence of the per-arc processes is used:

P(∃ t ≤ M : B(t) = s | B(0) = s′) ≤ Π_{i=1}^{|A|} f_{s′_i s_i}(M).    (17)

In case S_{k_iℓ_i} = {1, 2} for some i ∈ {1, . . ., |A|} with transition rates λ_i, µ_i, we have

f_12(M) ≤ P(Y_{λ_i} ≤ M) = 1 − e^{−λ_i M},

where Y_{λ_i} ∼ Exp(λ_i). We equivalently have that f_21(M) ≤ 1 − e^{−µ_i M}, and by definition f_11(M), f_22(M) ≤ 1. Whenever S_{k_iℓ_i} = {1, . . ., n_{k_iℓ_i}} with n_{k_iℓ_i} > 2, we can rely on standard results on sums of independent exponentially distributed random variables [10] in order to find an upper bound on the probabilities in (17). An example is given in Appendix C, in which it is shown how to use these results in case the Markov process on an arc is a birth-death process. Now, if (an upper bound of) the derived bound from (17) is smaller than some predefined ε > 0, we do not consider the corresponding background state s ∈ I in our further analysis.
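For two-state arcs the bound is elementary to evaluate. The sketch below (a hypothetical helper, with arc states coded 0/1 rather than 1/2) multiplies the per-arc bounds and discards joint background states whose bound falls below ε.

```python
from itertools import product
import numpy as np

def prune_states(rates, s0, M, eps):
    """Keep only joint background states whose hitting-probability bound
    exceeds eps within horizon M; rates[i] = (lambda_i, mu_i) for arc i,
    s0[i] in {0, 1} the arc's initial state. Independence across arcs
    lets the per-arc bounds be multiplied, cf. (17)."""
    bounds = []
    for (lam, mu), s in zip(rates, s0):
        b = np.ones(2)                    # staying reachable: bound 1
        r = lam if s == 0 else mu         # rate of leaving the initial state
        b[1 - s] = 1.0 - np.exp(-r * M)   # bound on reaching the other state
        bounds.append(b)
    kept = []
    for state in product((0, 1), repeat=len(rates)):
        if np.prod([bounds[i][si] for i, si in enumerate(state)]) > eps:
            kept.append(state)
    return kept

# three arcs that rarely congest (rate 0.001), horizon M = 1, eps = 0.01
kept = prune_states([(0.001, 1.0)] * 3, (0, 0, 0), 1.0, 0.01)
```

With these illustrative rates, any state requiring at least one congestion event has bound below ε, so only the all-uncongested state survives.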

3.5.3. Use of historical data.
A technique to further reduce the background state space is directly based on historical data of the process B(t) on G. When such data is available, it can be analyzed to determine which states s ∈ I occur most commonly in reality and which states have never occurred during the period over which the historical data was recorded. For example, in a network with 50 arcs in which each arc has two states (congested and uncongested), the situation in which all arcs are congested can theoretically occur, but will not be observed in real road networks, as will be confirmed by the historical data. Hence a strategy could be to eliminate all states s ∈ I that have never been attained by the process B(t), so as to reduce the state space of this background process.
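Operationally this filter is a one-liner over the historical record; a minimal sketch with a hypothetical helper name, where a `min_count` threshold additionally drops states seen only a handful of times:

```python
from collections import Counter

def observed_states(history, min_count=1):
    """Reduce the background state space to states actually attained in the
    historical record of B(t); `history` is an iterable of state tuples."""
    counts = Counter(map(tuple, history))
    return {s for s, c in counts.items() if c >= min_count}

history = [(0, 0), (0, 1), (0, 0)]
common = observed_states(history, min_count=2)
```

Raising `min_count` above 1 trades a smaller state space against the risk of discarding rare but possible states, the same trade-off as choosing ε in the hitting-probability bound.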
The three techniques discussed in this section can be used as preprocessing steps in dynamic routing. In case of edsger and edsger*, they can be applied prior to every call of the shortest path algorithm. An outline of the resulting procedure is given in Algorithm 3.
Algorithm 3. Result: travel policy from k_0 to k*. Initialization: network G = (N, A), B(0) = s, Node = k_0.

Numerical Experiments

This section presents a series of numerical experiments that demonstrate that the edsger* algorithm performs near-optimally with high efficiency. That is, corresponding to the earlier introduced notions of distance-to-optimality and efficiency, edsger* outputs a value close to the minimally achievable value while essentially being real-time. To substantiate this claim, we have considered a broad range of traffic scenarios and networks of various dimensions.

Figure 8. Amsterdam highway system
More specifically, we first consider a small network and show that value iteration (VI), edsger, and edsger* outperform two deterministic algorithms (in terms of distance-to-optimality), in case of a simple as well as a more sophisticated background process. Second, we increase the size of the network to show that the run-time of VI and edsger grows exponentially in the network size, whereas the run-time of edsger* is substantially less affected. Importantly, edsger* still yields close-to-optimal results in these larger networks. Lastly, we consider routing in a network of realistic size and show that edsger* remains highly efficient and nearly optimal.
The first two experiments consider routing on the highway system around Amsterdam. They compare the distance-to-optimality of VI, edsger and edsger* with that of two deterministic algorithms. The two deterministic algorithms, used as benchmarks, both employ the A*-algorithm. The first deterministic algorithm, as before referred to as 'Deterministic Static' (abbreviated to DS), applies the A*-algorithm to the network with maximum velocities. As the maximum speeds are fixed, the A*-algorithm is executed once to determine the complete travel path, which is followed until the destination is reached. The second deterministic algorithm, which we call 'Deterministic Dynamic' (abbreviated to DD), does use the available information on the velocities in the network. At every intersection the algorithm calls the A*-algorithm on a network with the current velocities to determine the next arc to travel.
Experiment 1 uses a Markovian background process in which every arc has just two states: congested or uncongested. It is shown that VI, edsger and edsger* outperform the deterministic algorithms in terms of distance-to-optimality in this simple setup. Experiment 2 considers a more detailed background process in which the Markov process of each arc can attain three possible states (so as to more accurately model the speeds the vehicles can drive at). The background process in addition includes a global event with Erlang-distributed holding time (which could represent a reasonably predictable change in the global circumstances, e.g. rush hour; recall Remark 3). Graphical and numerical summaries show that the difference in distance-to-optimality between our algorithms and the deterministic algorithms is even more significant than in Experiment 1. Comparing efficiency shows that the deterministic algorithms and edsger* can be executed in real time, contrary to VI and edsger, whose computational costs suffer from the size of the state space.
In Experiment 3 we evaluate the computational costs of the various routing algorithms as a function of the network size. We do so by working with an elementary network model, which is then extended with additional arcs to assess the impact of the network size on the algorithms' speeds. In addition, we quantify the effect of the dimension of the background process. The final experiment, Experiment 4, considers routing on the entire Dutch highway network and highlights the influence of various model parameters on the performance of the algorithms. Moreover, the experiment considers the speed-up techniques introduced in Section 3.5. For the experiments we implemented the networks and routing algorithms in Wolfram Mathematica 12.0 on an Intel® Core™ i7-8665U 1.90GHz computer.
Experiment 1. To get a first impression of the performance of the various algorithms, we start by considering a relatively small network. Our findings reveal that in this network, using a background process in which every link has just two states, both edsger and edsger* efficiently yield close-to-optimal results.
Consider the network of the Amsterdam highway system (Figure 8), with 9 intersections and 24 links between these intersections (i.e., 12 bidirectional arcs). We pick the following framework:
• The background process of every arc contains just two states, uncongested (corresponding to state 1) and congested (corresponding to state 2). Transition rates are tuned with NDW data.
• The state of an arc only affects the speeds on the directly attached arcs, i.e., there is local-1-correlation.
• There is no process Y that induces global correlation.
• In case there is no incident on the arc, we let the vehicle speed be 100 km/h if there is no incident on the directly adjacent arcs, and 80 km/h otherwise. In case there is an incident on the arc, we let the vehicle speed be 40 km/h if there is no incident on the directly adjacent arcs, and 20 km/h otherwise.
• In the network there is a maximum of three incidents simultaneously, to bound the size of the state space and guarantee tractability of VI.
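The speed rules in the list above can be encoded directly; a Python sketch for illustration (the function name is ours):

```python
# Speed rules of Experiment 1: the speed on an arc depends on whether the
# arc itself and any directly adjacent arc carry an incident.
def speed_kmh(incident_on_arc, incident_on_adjacent_arc):
    if not incident_on_arc:
        return 100 if not incident_on_adjacent_arc else 80
    return 40 if not incident_on_adjacent_arc else 20
```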
Consider a traveler interested in minimizing the total expected travel time from node 1 to node 8. We compare the DP policy with the routing policies under edsger, edsger* and the two deterministic algorithms DS and DD. The DP policy is derived from the VI-algorithm, using the implementation guidelines described in Section 3.4. Implementation of the two deterministic policies follows from a standard implementation of the A*-algorithm. The policies of edsger and edsger* can be found by storing the output of their shortest path algorithms for every initial state (k_0, s) ∈ N × I. The example below, intended to demonstrate the principles underlying edsger in a concrete setting, shows that the policy of edsger in node 1, with as background state that every arc except the arc from node 2 to node 1 is uncongested, is to travel to node 3.
Example. We use edsger to route from node 1 to node 8 in Figure 8, when upon departure the only congested arc is the arc from node 2 to node 1. As a first step, Dijkstra is used on a network with maximum speeds, to determine the lower bounds lb_k on the travel time from node k ∈ N to node 8, which in this case are given by
lb_1 = 0.18, lb_2 = 0.23, lb_3 = 0.13, lb_4 = 0.16, lb_5 = 0.09, lb_6 = 0.05, lb_7 = 0.09, lb_8 = 0, lb_9 = 0.13.
Now the shortest path algorithm within edsger is used to determine the next node to travel to. Node 1 is set as current node, and we initialize the labels via (10) and (11) for k = 2, 3, 9, the neighbors of node 1. The new current node is then the minimizer of the key D_k + lb_k over the updated labels; in this instance the algorithm outputs node 3 as the next node to travel to.
Figure 9 shows the expected travel time (in minutes) of the derived policies in five initial background states. The first set of five bars corresponds to a state of complete non-congestion, i.e., all of the arcs are uncongested upon departure. The expected travel times under the different policies are in this case similar. This is not surprising, as the influence of unpredictable events is minimal: the distance between node 1 and node 8 is small, and therefore so is the probability of an incident occurring on the shortest path. The second set of bars shows the largest difference in expected travel time between VI and DS. This difference is significant, the expectation under the deterministic policy being more than 1.5 times as large as the expectation under VI. The third, fourth and fifth sets of bars show the largest difference between the expected travel time under VI and the expected travel time under DD, edsger and edsger*, respectively. Note that these differences are small, so these policies are close-to-optimal.
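The first step of the example, computing the lower bounds lb_k, amounts to a single Dijkstra run from the destination over reversed arcs on the maximum-speed network. A Python sketch; the toy data below is illustrative only and does not reproduce the Amsterdam lb values quoted above:

```python
import heapq

def lower_bounds(arcs, max_speed, dest):
    """arcs: {(u, v): length_km}; returns lb[k] in hours for all k that can reach dest."""
    # Build the reversed adjacency with travel times at maximum speed.
    reverse = {}
    for (u, v), length in arcs.items():
        reverse.setdefault(v, []).append((u, length / max_speed))
    # Dijkstra from the destination over the reversed arcs.
    lb = {dest: 0.0}
    heap = [(0.0, dest)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > lb.get(node, float("inf")):
            continue  # stale heap entry
        for pred, t in reverse.get(node, []):
            if d + t < lb.get(pred, float("inf")):
                lb[pred] = d + t
                heapq.heappush(heap, (lb[pred], pred))
    return lb
```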
The fact that DD, edsger and edsger* yield close-to-optimal results in the framework of this experiment can also be seen in Table 2:
• The first column shows the expected travel time from node 1 to node 8 under the different policies, averaged (evenly) over all possible initial background states.
• The second column shows the expected travel time from node 1 to node 8, averaged (evenly) over all initial background states in which one incident has occurred. Columns three and four show the same for two and three incidents, respectively.
• The fifth column provides a weighted average (WA) of the expectations corresponding to all possible initial background states; for a given initial background state, the weight equals its limiting probability.
• The last column contains the run-time of the algorithms (in sec.).
The averages and weighted averages of edsger and edsger* are close to the results of VI. DD performs relatively well, whereas the values under DS are noticeably suboptimal. A comparison of run-times shows that the computational costs of VI are an order of magnitude larger than those of the other algorithms. In Experiment 3 we will investigate the efficiency of the algorithms in greater detail, and show that the difference in computational costs becomes more substantial when increasing the maximum number of incidents in the network.
We conclude that in this simple framework, in which every arc can only have two states, edsger and edsger* yield nearly optimal results, while being roughly one order of magnitude faster than VI. It was also shown that edsger and edsger* perform better than the two deterministic algorithms DS and DD. Especially in case there are incidents in the network, the difference in distance-to-optimality is significant. As the occurrence of incidents is relatively rare, the distance-to-optimality of the four algorithms is similar if there are no incidents in the network upon departure. The next experiment shows that the gap in distance-to-optimality typically grows when adding more detail to the model.
Experiment 2. In this experiment we consider a more involved background process, and show that edsger and edsger* still yield nearly optimal results. The two algorithms outperform the deterministic algorithms (in terms of distance-to-optimality), and are at the same time considerably faster than VI. The experiment, moreover, demonstrates the comprehensiveness of our model, by illustrating the possibility of (i) adding recurrent events and (ii) working with more velocities per arc.
We again consider the network of the Amsterdam highway system (Figure 8), but extend the background process of Experiment 1. Concretely, we have three speeds per arc (rather than two), and include a global event. We include the extra speed level and global event in the following way:
• Every arc has three states: uncongested (state 1), congested (state 2) and recovery (state 3). A link can transition from an uncongested to a congested state, but not vice versa: from a congested state a link must first enter the recovery state before it returns to the uncongested state. The recovery state represents the time between the clearance of an incident and the time the traffic conditions return to the free-flow speed.
• The state of an arc only affects the speeds on the attached arcs, i.e., there is local-1-correlation.
• There is a recurrent event that induces global correlation, to be interpreted as a rush hour. This event affects the speeds on the roads on the inner circle of the highway system, i.e., all arcs between nodes 1, 3, 5, 7 and 9. We will denote these arcs as category I arcs, whereas we will refer to the other arcs as category II. As a rush hour is a recurring event, the time until its onset has a relatively low variance. For that reason we chose not to model this time by an exponential distribution but rather by an Erlang distribution (as pointed out in Remark 3). In our experiments we took four phases; only in the last phase, which we identify as the start of the rush hour, are the speeds on the arcs in category I affected.
• Speed levels for the different scenarios can be found in Table 3.
• In the network there is again a maximum of three non-uncongested links simultaneously, to bound the size of the state space and guarantee tractability of VI.
Table 3. Speed levels on the arcs in km/h. The rows specify the category and state of the arc; the columns specify whether there is rush hour or not, and distinguish between the number of attached arcs that are not in state 1.
Table 4 shows the run-time (in seconds) of the different algorithms. Note that the computational costs of VI and edsger are substantially larger than those of edsger* and the two deterministic algorithms. Thus, summarizing the results of the first two experiments, edsger* has the best performance in terms of distance-to-optimality and run-time combined. More precisely, edsger* outperforms the deterministic algorithms in terms of distance-to-optimality, and outperforms VI and edsger in terms of efficiency. The next experiment will show that the savings in run-time become even more pronounced when increasing the network size and/or the maximum number of incidents.

Experiment 3. We consider elementary networks to assess the sensitivity of the run-time as a function of the network size, and in addition as a function of the maximum number of incidents. In the previous examples, in which the network size and maximum number of incidents were small, we already noted that the computational costs of both VI and edsger are considerably higher than those of edsger*. To directly demonstrate the computational savings when using edsger*, we use networks of the type depicted in Figures 12a and 12b. These networks show a 2 × 2- and 3 × 3-structure, but extending the model in the obvious way produces an n × n-structure for any n ≥ 2. We increase the value of n, so as to evaluate the impact on the computational costs for VI, edsger, and edsger*. For simplicity, the conditions on the network are set identical to those of Experiment 1. This includes the condition that there can only be three incidents in the network simultaneously. However, below we also assess how increasing this bound on the number of incidents affects the run-time of the algorithms. Figure 13a shows that, contrary to the run-time of edsger*, the run-times of both VI and edsger grow exponentially with the network size (note the logarithmic scale). This exponential increase can be explained by the fact that the size of the background process, and thus the size of Q, grows exponentially with the size of the network. Since edsger* only uses the part of the state space that corresponds to the states on the directly attached links, its computational costs are hardly affected by the network size. We conclude that in larger networks VI and edsger become intractable, whereas edsger* still offers real-time response.
Importantly, the substantial reduction of the run-time when using edsger* (instead of VI, that is) does not come at the cost of a significant increase of the objective function: Figure 13b shows that edsger* yields close-to-optimal results. The figure shows the average and weighted average expected travel time (in minutes), with weights again chosen as the stationary probabilities, for different network sizes. Observe that DD also yields close-to-optimal results in this framework. This can, however, be explained by the fact that the considered instance is relatively simple; adding more detail to the framework, as we did in Experiment 2, again leads to a more pronounced suboptimality of this algorithm. We observe the same behavior (in terms of computational costs and the value of the objective function) when, instead of the network size, the bound on the maximum number of incidents is increased (see Figure 14). That is, considering the 3 × 3-network displayed in Figure 12b, the run-time of VI and edsger grows exponentially with the maximum number of incidents. Moreover, the run-time of edsger* is hardly affected by the increase of the maximum number of incidents, while the algorithm still performs nearly optimally. The substantial difference in run-time is again due to the rapid growth of the dimension of Q as a function of the number of incidents. In contrast, the matrices Q_k^r, as used by edsger*, involve significantly fewer arcs, and are therefore less affected by the number of incidents. Note that, in case of local-1-correlation, the computational costs of edsger* will, for example, no longer increase once the number of incidents exceeds the maximum degree of the graph.
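The growth of the state space can be quantified directly. Under the two-state-per-arc model of Experiment 1, with A arcs and at most m simultaneous incidents, the background process has Σ_{j=0}^{m} C(A, j) states; this count drives the run-time of VI and edsger, while the local variant only tracks the arcs attached to the current node, whose number is bounded by the maximum degree regardless of network size. A Python sketch:

```python
import math

def num_background_states(num_arcs, max_incidents):
    """Number of background states: subsets of at most max_incidents congested arcs."""
    return sum(math.comb(num_arcs, j) for j in range(max_incidents + 1))

# Global state space for the 24-link Amsterdam network vs. the local state
# space of a node of degree 4 (illustrative degree value).
global_states = num_background_states(24, 3)
local_states = num_background_states(4, 3)
```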
From Figures 13 and 14 we conclude that VI and edsger are inefficient, making these algorithms unsuitable for practical purposes. The results of edsger* are considerably more promising: its run-time is hardly affected by the network size or the maximum number of incidents, while the algorithm performs close-to-optimally in terms of cost. This is why in our last experiment we investigate the tractability of edsger* in networks of realistic size. The experiment also considers the influence of the model parameters and demonstrates the use of the speed-up techniques introduced in Section 3.5.
Experiment 4. To confirm the feasibility of edsger* in large networks, the final experiment considers routing in a network of realistic size: the Dutch highway network (Figure 15), with 93 intersections and 262 directed roads between these intersections. Besides confirming the feasibility of edsger*, the experiment has the following three objectives:
• Show that edsger* still yields accurate results in terms of expected travel time, supporting our conclusions of the experiments above. We will omit VI and edsger in our analysis, as these algorithms are intractable in large networks, as demonstrated in Experiment 3. Instead, we use simulations to compare the travel time under edsger*, DD and DS with the travel time under an optimal path;
• Assess the effect of the parameters on the distance-to-optimality of edsger*. Concretely, we compare the results of edsger* with the results of DD and DS for different parameter values;
• Demonstrate how the speed-up techniques described in Section 3.5 can be used in this network, and how the different techniques affect the distance-to-optimality and the performance of the algorithms.
To show that the travel time under edsger* is indeed close-to-optimal, we simulate the travel time for three OD-pairs. The considered OD-pairs, denoted by 'Short', 'Medium' and 'Long', differ notably in length, with distances 22.5, 98.5 and 191.7 km, respectively. For simplicity we again use the same background process as in Experiment 1. We simulate realizations of this background process and determine for every realization the travel time of edsger*, as well as the travel times of DS and DD. These travel times are compared with the optimal travel time, i.e., the minimal achievable travel time under the given realization of the background process. To measure the difference in travel time, we compute the average percentage loss in travel time:
Average [ (travel time algorithm − optimal travel time) / optimal travel time ] × 100%.
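Written out as code, the measure reads as follows (a Python sketch; the function name is ours):

```python
def average_percentage_loss(algorithm_times, optimal_times):
    """Average relative excess of the algorithm's travel time over the
    per-realization optimal travel time, in percent."""
    losses = [
        (alg - opt) / opt * 100.0
        for alg, opt in zip(algorithm_times, optimal_times)
    ]
    return sum(losses) / len(losses)
```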
Note that a high value of this measure can also arise for an algorithm that is optimal in terms of expected travel time, e.g. VI. This is due to the difference between (i) the notion of being (close-to-)optimal in expectation and (ii) the notion of being optimal for a given realization. Using this measure we can therefore only compare algorithms, not assess individual values.
Table 5 shows the results of the simulations. We first note that edsger* is indeed feasible, as the displayed run-times t (in sec.) are sufficiently low. The columns denoted by U% and W% show the average percentage loss in travel time for a given OD-pair, with initial states chosen uniformly and weighted (with the corresponding limiting probabilities), respectively. Table 5 shows that edsger* is close-to-optimal in this large network. If the initial states are drawn according to the limiting probability (W%), the average percentage loss in travel time is below 1%. In case the initial state is chosen uniformly, the drawn initial states typically contain many congested links. Even in this more extreme setting, the high loss in travel time for edsger* is, as argued above, due to the difference between the notion of being (close-to-)optimal in expectation and the notion of being optimal for a given realization. The loss under VI would likely be of the same order as that of edsger*. Computing the loss of both algorithms in the network of Experiment 1 with α = β = 1, for example, results in losses of 7.00% (VI) and 7.27% (edsger*).

As stated above, the third objective of this experiment is to demonstrate the effect of the speed-up techniques described in Section 3.5 on the distance-to-optimality and the efficiency of the algorithms. The outcome of this experiment, in which we study the same routes as above (i.e., 'Short', 'Medium', 'Long'), is shown in Table 8. We use Yen's n shortest path algorithm to decrease the network size, and distinguish in Table 8 between the different values of n, namely n = 1, 5, 10, 25. As argued in Section 3.5, this decrease in network size directly implies a decrease in the state space, as we no longer track the states of the arcs not contained in the reduced network.
The derived bound on the hitting probabilities in (17) gives that, for the OD-pair 'Short', the occurrence of four accidents while driving has probability smaller than ε = 0.0005. Background states that contain four additional incidents with respect to the initial state are therefore deleted. A similar argument can be used to reduce the size of the state space for the OD-pairs 'Medium' and 'Long'. The table reveals that the number of nodes and arcs in the network increases with n, and consequently so does the run-time t of the algorithms. A moderate value of n is therefore preferred. This is also justified by the distance-to-optimality of the algorithms for moderate values of n. That is, the table shows that for smaller values of n the algorithms still yield close-to-optimal results, with the average percentage loss below 2% in all cases. Note that the travel times for n = 10 and n = 25 are even similar to those for n = 5. This again advocates the choice of a small n, e.g. between n = 1 and n = 5, as this reduces the run-time while the distance-to-optimality of the algorithms is not affected.
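The resulting state deletion can be sketched as follows, representing a background state by the set of its congested arcs (a representation of ours, for illustration): once the bound shows that accumulating four additional incidents during the trip is sufficiently unlikely, only states with at most three extra incidents relative to the initial state are kept.

```python
def prune_states(states, initial_state, max_extra):
    """Keep only background states with at most max_extra incidents
    beyond those already present in the initial state."""
    return [s for s in states if len(s - initial_state) <= max_extra]
```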

Concluding Remarks
In this paper we developed a new mechanism for describing the evolution of the velocities in a road traffic network, capable of modeling both recurrent and non-recurrent congestion. For this flexible velocity model we developed a routing algorithm that aims at minimizing the expected travel time. Extensive experiments showed that it outperforms competing models, in that it provides near-optimal results but at the same time offers real-time response.
Regarding the velocity model, we advocate the use of a continuous-time Markovian background process to describe the speed driven by vehicles on the arcs in the network. We have developed various ways to deal with the potentially high dimension of this process. Future research could concern further operationalizing this model, with a specific focus on tuning the model's parameters using measurement data. Concretely, we have argued that the proposed background process is able to incorporate the influence of random events, as well as the influence of (nearly) deterministic patterns, on the vehicle speed. Insight into the occurrence of these events, and their effect on the velocities, can be provided relying on Intelligent Transportation Systems (ITS). However, it is not a priori evident how to map the information provided by a typical ITS on the parameters of our Markov model. A future study could explore such calibration issues.
Regarding the routing algorithms, we have evaluated these on the basis of (i) distance-to-optimality (i.e., minimizing the expected travel time between source and destination) and (ii) efficiency (i.e., run-time). Numerical experiments justify the advice to use edsger* as routing algorithm, as edsger* is close-to-optimal and runs in real time. In the implementation of edsger*, as well as the implementation of the other presented algorithms, the guidelines described in Section 3.4 were used. There may be opportunities to further speed up the algorithm. For instance, parallel computing could potentially be used to speed up VI, edsger, and edsger*. Since the costs of computing the per-arc expected travel times are significantly lower under edsger* than under edsger and VI, edsger* will outperform the competing algorithms also when parallel computing is applied.
Numerical experiments were conducted to show that edsger* efficiently yields close-to-optimal results under a broad range of realistic traffic scenarios. The model parameters have been calibrated from historical data on vehicle speeds and flows, collected by the National Data Warehouse for Traffic Information (NDW) in the Netherlands. Future research will focus on developing a more formal statistical procedure to estimate these parameters, performing a detailed analysis of the occurrence and consequences of incidents; cf. the earlier study by Snelder [50].

Conditioning on a possible jump of the background process and letting ∆ ↓ 0, where the limit follows from L'Hôpital's rule in combination with Leibniz' integral rule, yields the desired system of linear differential equations. The same steps can be performed to derive the second system of linear differential equations. This completes the proof.

Appendix C. Upper bounds on hitting probabilities
We will show how results on the sum of independent exponentially distributed random variables can be used to derive an upper bound on the hitting probability in case B_k(t) is a birth-death process. Figure 16 shows the outline of such a process with n_k states. By the structure of a birth-death process, the relevant hitting times can be expressed in terms of sums of independent exponentially distributed random variables. With S_N the sum of N independent exponentially distributed random variables with rates λ_i (where λ_i ≠ λ_j for i ≠ j) and density functions f_i(z) := λ_i e^(−λ_i z), we have [10]
f_{S_N}(z) = Σ_{i=1}^{N} ( Π_{j=1, j≠i}^{N} λ_j / (λ_j − λ_i) ) λ_i e^(−λ_i z).
These results can now directly be used to find an upper bound on the probabilities in (20).
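The quoted density (the hypoexponential density) can be checked numerically: for N = 2 it must reduce to the classical closed form λ1 λ2 / (λ2 − λ1) · (e^(−λ1 z) − e^(−λ2 z)). A Python sketch:

```python
import math

def hypoexponential_density(rates, z):
    """Density of the sum of independent exponentials with pairwise
    distinct rates, via the partial-fraction coefficients."""
    total = 0.0
    for i, lam_i in enumerate(rates):
        coeff = 1.0
        for j, lam_j in enumerate(rates):
            if j != i:
                coeff *= lam_j / (lam_j - lam_i)
        total += coeff * lam_i * math.exp(-lam_i * z)
    return total

lam1, lam2, z = 1.0, 2.0, 0.7
closed_form = lam1 * lam2 / (lam2 - lam1) * (math.exp(-lam1 * z) - math.exp(-lam2 * z))
```

Note also that the density vanishes at z = 0 for N ≥ 2, as a sum of two or more exponentials cannot be arbitrarily small with positive density.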

Figure 5. Example in which edsger is not optimal, as it does not fully exploit the possibility to change the chosen route.

Figure 9. Highlight of a few background states.

Figure 10. The four Erlang phases of the rush hour.

Figure 11. Highlight of a few background states.

Figure 12. Examples of networks of size n × n. Here 'S' is the source, and 'D' the destination.

Figure 13. Effect of increasing the network size.

(a) Computational costs with a maximum of n incidents. (b) Performance of the algorithms in case of n incidents.

Figure 14. Effect of increasing the number of incidents.

Table 1. Results example Almere-Dronten.

Result: travel policy from k_0 to k′.
Initialization: network G = (N, A), B(0) = s, Node = k_0.
while H nonempty do
  1. Extract tuple (D_k + lb_k, p_k, k, path) with minimum first entry (key) from H;
  2. If k = k′ quit and return path. Else if k ∈ V go back to step 1. Else continue;
  3. for neighbor k′ in NB(k) \ V do
     a. Compute d_k′ and p_k′ with (5) and (7);
     b. If d_k′ < D_k′ set D_k′ = d_k′ and insert (D_k′ + lb_k′, p_k′, k′, path + {k′}) in H;

Step 1: Preliminary node deletion;
  a. Derive m shortest paths on G to form G′;
  b. Identify arcs {k_1^1, . . ., k_n^n} that are on all shortest paths;
  c. Derive l shortest paths on G \ k_i^i for i = 1, . . ., n and add to G′;
  d. Reduce the state space of the background process by looking at the deleted arcs;
Step 2: Preliminary background state deletion;
  a. Delete states with the hitting probability constraint;
  b. Delete states by historical data analysis;
Step 3: Use shortest path algorithm;
  a. Determine path from k_0 to k′;
  b. Travel to the outputted next node. Stop if this is k′. Otherwise return to Step 1 with the state upon arrival as initial state;
Algorithm 3: Outline of Dynamic Routing with Preprocessing steps.

Table 4. Average (A) and weighted average (WA) expected travel time of the algorithms in the four different initial phases of the rush hour. Run-times (in sec.) reported as well.

Table 5. Average percentage loss in travel time (W%) and run-time of the algorithms (t); background process as in Experiment 1.