DIRECT OBSERVATION OF REROUTING PHENOMENA IN TRAFFIC NETWORKS

In this paper we show how available datasets can be used to estimate rerouting phenomena in traffic networks. We show how a set of paths observed during unexpected events can be examined to understand rerouting. We use the information comply model (ICM) [1] and propose a method for its estimation. We propose a likelihood formula and show how the theoretical and observed rerouting probabilities can be obtained. We conclude with an illustrative example showing how a single observed path can be processed and what information it provides. Contrary to the parallel paper [2], where rerouting phenomena are estimated using real traffic flow measurements from Warsaw, here we use only synthetic data. The paper is organized as follows. First we elaborate on rerouting phenomena and define the traffic network; then we summarize the literature on rerouting phenomena. We follow with a concise definition of dynamic traffic assignment, needed to introduce the ICM model in the subsequent section. Based on that introduction we define the observations and propose an estimation method based on them, followed by an illustrative example. The paper closes with conclusions and directions for future work.


Introduction
By rerouting we mean changing the currently chosen path in a road network after either receiving some information or observing the consequences of an unexpected traffic event. We broadly define an unexpected event as any relevant traffic information that is not known in advance by at least some percentage of drivers and that implies changes in the perception of the supply side. This can include an incident, a road closure, a sport event, a demonstration, planned road works, etc. Rerouting originates in a rational process in which some reason leads an individual to alter his or her plans. In the case of rerouting, it is either observation of, or information about, an unexpected situation in the traffic network that leads to changing the path to avoid negative consequences. The process is latent, as it takes place in the mind of every individual and cannot be observed; what can be observed are a) the reasons for rerouting and b) the rerouting itself (changing the path). We follow the definition of rerouting phenomena from the information comply model (ICM) [1]. ICM can be seen as an extension of dynamic traffic assignment covering rerouting phenomena. It models the phenomenon by calculating the probability of rerouting $\alpha$ for a given place and time in the network, subject to the current situation and information. ICM is parameterized by a) how information reaches drivers, b) how they react, and c) how they choose their new paths, all of which influence the resulting probability. Such a detailed representation yields realistic results, as it covers the cognitive process of rerouting; unfortunately, it is hard to estimate and validate.
Fortunately, thanks to recent developments, revealed preference (RP) data on how individuals travel through the network in time have become more available. As we show in the parallel article [2], rerouting can also (to some extent) be observed indirectly by looking at the flows in the network. Here we focus on how rerouting behavior can be estimated from a set of paths observed during unexpected events. In this paper we follow the classical definition of a road network represented by an oriented graph $G(N, A)$, where $N$ is the set of nodes and $A \subseteq N \times N$ is the set of arcs. Each arc $a \in A$ is described by a vector of characteristics that represent its performance (travel time $t_a(\tau)$ and cost $c_a(\tau)$). The initial node of the generic arc $a \in A$ is referred to as its tail and denoted $a^- \in N$, while the final node is referred to as its head and denoted $a^+ \in N$.
The set of arcs with tail at the generic node $i \in N$ is referred to as the forward star and denoted $i^+ = \{a \in A : a^- = i\}$. Symmetrically, the set of arcs entering node $i \in N$ is referred to as the backward star and denoted $i^- = \{a \in A : a^+ = i\}$. We define a generic path $k$ in the network $G$ as an ordered set of arcs $a \in A$ and nodes $i \in N$, reached at times $t_a^k$. A generic path $k$ connects an origin node $o \in N$ with a destination node $d \in N$.
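The network definitions above can be sketched in a few lines of code. This is only an illustration: the toy arc list and the names `arcs`, `build_stars`, `forward_star` and `backward_star` are hypothetical and not part of the original formulation.

```python
from collections import defaultdict

# Hypothetical toy network: each arc is a (tail, head) pair of node labels.
arcs = [("o", "a"), ("o", "b"), ("a", "d"), ("b", "d"), ("a", "b")]

def build_stars(arcs):
    """Return the forward and backward stars as dicts mapping node -> list of arcs."""
    forward = defaultdict(list)   # i+ : arcs whose tail is node i
    backward = defaultdict(list)  # i- : arcs whose head is node i
    for tail, head in arcs:
        forward[tail].append((tail, head))
        backward[head].append((tail, head))
    return forward, backward

forward_star, backward_star = build_stars(arcs)
```

Storing both stars makes it cheap to enumerate the routing alternatives available at a node (forward star) and the arcs feeding it (backward star).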

Literature review
A valuable starting point for defining rerouting phenomena is the comprehensive literature review on route-choice models in [3], where general definitions of routing behavior are summarized. A comprehensive definition of the rerouting problem can be found in [4], which clearly highlights the dynamic nature of rerouting phenomena and places them within traffic assignment. We follow here the classic definition of traffic assignment (TA) in terms of econometric equilibrium [5]; the ICM model used in this paper follows the fundamental traffic assignment assumptions, most importantly the expected-utility concept [6] as an explanation of routing behavior. As rerouting is strictly dynamic (it takes place in space and time), we consider the dynamic version of user equilibrium, DUE [7][8]. Specifically, we work with the stochastic macroscopic DTA model proposed in [9], with sequential route choices [10] on the demand side and the General Link Transmission Model [11] on the supply side. The most up-to-date implementation of this framework can be found in [12].
[13] extended the typical day-to-day TA model and allowed individuals to reroute en route. In their concept rerouting is driven by the delay experienced while traversing the network. They model rerouting through agent-based simulation, where agents update every day their knowledge of how a perturbation on a single arc influences the remainder of the network; this knowledge is then stored in correlation matrices. It is assumed that experiencing delay while travelling can lead to the assumption that downstream arcs are also perturbed, which can, in turn, lead to rerouting. While Snowden assumed that rerouting is driven only by experience and actual observation, [14] assumed that users have access to perfect knowledge of the actual state of the whole network. They assume that rerouting takes place only if the possible gain from changing the route is large, namely if the difference, both relative and absolute, is greater than the so-called 'indifference band'. Mahmassani proposes the term 'schedule-day' for what we call here a typical day.
Both Snowden and Mahmassani implicitly follow one of the most valuable concepts for rerouting, namely the hybrid model formulated by [15]. The hybrid model originates from a strong distinction between pre-trip and en-route route choice and from the observation that rerouting takes place as a mixture of them, namely 'hybrid routing': following the route chosen pre-trip until there is a good reason to deviate from it and follow a new route to the destination. The hybrid model addresses rerouting with a sequential procedure executed at each decision point using the utility of rerouting. The utility of rerouting is a function of travel times and of an elasticity parameter (showing how easily drivers deviate from pre-trip choices, handled in ICM within the compliance model). The utility of rerouting is calculated strictly subject to the destination. The final result of the hybrid model is the new, recalculated route-choice probability at each node, which is analogous to the probability of rerouting proposed in ICM. Furthermore, [15] provide valuable considerations about information and define it as a function which transforms actual costs into the perceived costs used by individuals for routing.
[4] defines the rerouting problem while analyzing within-day re-planning of agents' activity plans due to events, which can include e.g. altering the destination, a later departure time, or even cancellation of non-obligatory trips. Within-day re-planning is based on a strong distinction between the iterative process adequate for recurrent conditions and the single-shot simulation for special cases (e.g. re-planning, rerouting, evacuation), explaining why classic TA methods fail in modelling the reaction of drivers to exceptional events. The agent structure proposed by Dobler uses the BDI structure (beliefs, desires, intentions, as defined in [16]), which is useful for defining rerouting: beliefs are the perception of the network states, desires are to avoid negative impact, and intentions are to reroute or not. This BDI structure was the origin of the ICM structure of information, observation and compliance models. Dobler proposes the Rayleigh distribution [17] to define the information process.
A notable stream of papers by Gao, Frejinger and Ben-Akiva addressed rerouting phenomena through adaptive routing policies. [18] opposed routing policies to adaptive route choices and defined a routing policy as the mapping of the network 'support points' (realizations of travel times) to routing decisions. The realization of a policy is a single path which can be adapted to actual network conditions (the general concept is similar to the hyperpath [19] and the routing strategy [20]). A routing policy is built on the assumption that the probability of each 'support point' of the network is known. When a driver observes the actual state of the network (a realization of travel times), he or she updates the probabilities of travel times on downstream arcs, so that if new paths become optimal, rerouting is observed. It is assumed that drivers use their experience to update policies, while here we focus on unexpected, non-recurrent events.
The information spread model of ICM originated from evacuation models [21], where information about an event spreads in time, reaching more and more people. Recent data from Twitter [22] have led to a much better understanding of information spread processes. Numerous studies were conducted using Twitter data (e.g. [23][24][25]) and provided valuable information on two aspects: the dynamics and the range of the information spread process. The dynamics are observable through the 'tweets' posted after emergency situations: earthquakes, hurricanes, riots, etc.
[26] studied how fast information dissipates through the communication network of Twitter. They investigated the Twitter traffic related to false news (e.g. "Rioters released wild animals from London Zoo") and showed how society reacts to, believes and denies hoaxes. The analysis was time dependent, so that the speed of information spread on Twitter is observable. This is probably the clearest and most precise evidence on how the process of informing evolves in time. Several examples analyzed by Procter showed similar properties: the shape of the information spread curves resembles what researchers usually adopt for information spread ([21], [27], [28]), a Rayleigh-like distribution (or any of similar shape). Twitter research revealed another important phenomenon of information: virality. [29] showed that information in the communication network is either completely negligible and forgotten very fast or, quite the opposite, spreads like a virus through the communication network, reposted forward with exponential probability [25]. The observations on virality led us to parameterize both the speed and the range of information spread with the significance of the information (measured by the total delay it causes).
Finally, we refer to stated- and revealed-preference studies of rerouting. [30] conducted a survey on the impact of VMS and radio broadcasts on route choice: 60% of respondents claimed that their route choice is influenced by radio broadcasts, and 40% by VMS. [31] obtained much less optimistic results with revealed-preference data from floating mobile data: an analysis of how VMS information affects route choices showed ~30% compliance with the exact guidance provided through VMS. Thanks to recent solutions, detailed data on paths in traffic networks are available, and working with path data in various studies becomes more accessible. Paths observed through GPS tracks are increasingly used in research, e.g. to obtain OD matrices [32], to detect modes [33] and trip purposes [34], and to define trip generation [35].

Dynamic Traffic Assignment
Before we can introduce the ICM model we need to provide a brief introduction to DTA. In general, DTA determines the traffic flows on the network that satisfy the demand [36]. This is done through an assignment method, typically following the 'user-equilibrium' concept of balancing the travel costs of all drivers [5]. In the dynamic context, 'user equilibrium' becomes a dynamic user equilibrium (DUE), defined as a traffic pattern at which no driver finds it convenient to (unilaterally) change his/her route and departure time (see e.g. [37]). DUE is obtained through an iterative process where at each iteration route choices are adjusted based on the outcomes of decisions made in previous iterations. The process converges to a fixed point, where demand and supply are stabilized [38]. The results of DTA are the network performances (i.e. temporal profiles of travel costs and times) and the demand pattern. The demand pattern of DTA is either a set of OD paths defined with a specific temporal profile of flows or, alternatively [39], a set of local routing decisions (arc conditional probabilities) defined for each node, which coupled with the origin demand become equivalent to explicit paths.
The outcomes of DTA are the travel costs and flows of the network obtained through equilibrium for the typical case; the ICM rerouting model provides the same outcomes, yet for the atypical situation of an unexpected event.

Observations
In the remainder of the paper we look at the set of paths $K = \{k\}$ observed during an unexpected event. It can be obtained from a long-term study of sample individuals recurrently travelling through the traffic network. If during this long-term study it is possible to identify an atypical day resulting from some unexpected event, we have satisfactory input for the proposed estimation procedure. For further analysis only the subset of paths which could possibly be affected by this event is needed. For those paths for which it is possible, let us determine the typical realization $\hat{k}$, i.e. the most probable path of this individual connecting the same origin and destination. Unfortunately, $\hat{k}$ can be easily defined only for typical, recurrent trips and for long-term studies. Using the results of DTA, for each arc $a$ of path $k$ we can define the probability of choosing it while being at its tail node $a^-$ at time $\tau$, subject to the travel destination $d$, denoted $p_a^d(\tau)$. By typical costs (denoted throughout the paper with the superscript ^) we mean conditions observed during a typical day (when no unexpected events are present), which coincide with the costs and times of dynamic user equilibrium and are, at the same time, the conditions expected by individuals to occur when making route choices. Actual conditions (denoted throughout the paper with the superscript ~) are in turn those observed as a consequence of an unexpected event. Actual travel times and costs will be used by an individual to choose a new path to avoid the consequences of the unexpected event.
For an explicit path $k$ we can define its probability using the arc conditional probabilities from the implicit RCM [9], as the product of the arc conditional probabilities along the path, as defined in (1).
In these formulas, $a<r$ denotes the subpath between the origin $o$ and the rerouting point $r$, and $a \ge r$ the subpath between the rerouting point $r$ and the destination $d$. We define $r$ as the point for which $p_k^r$ is maximized, as in (3). In other words, $r$ is the point at which path $k$ has the highest probability (2), i.e. $r$ is chosen so that path $k$ is most consistent with the typical arc conditional probabilities $\hat{p}$ before $r$ and the actual ones $\tilde{p}$ from $r$ onward. To improve the procedure we add two criteria for the rerouting point $r$. Condition (4) guarantees that at the rerouting point the actual arc conditional probability differs from the typical one; otherwise $p_k^r$ could be equal for a number of points before the actual rerouting point. Condition (5) guarantees that the ICM model (further elaborated below) evaluated at $r$ yields a positive rerouting probability; this way we guarantee general consistency between the observed point $r$ and the results of the ICM model. Unfortunately, (5) leads to circularity, as $\alpha$ is a result of the ICM model calibrated with data from the paths $K$, including the observed rerouting point of each path calculated with (3). However, condition (5) is intended only to eliminate big errors (e.g. a rerouting point placed before the event took place). Alternatively, for the paths for which we know their typical representation $\hat{k}$, we can obtain $r$ as the last node of the overlapping part of $k$ and $\hat{k}$. In this case $r$ can be seen as the point at which the individual first acts atypically (i.e. diverges from his or her typical path $\hat{k}$). Using (3) we extend the data from $K$ by including the rerouting point $r$ and the rerouting time $t_r^k$ for each observed path $k$. This way we can say that the realized path $k$ consists of two parts: before and after the rerouting decision (if such a decision was made). Let us define the observed rerouting probability $\alpha$ for each node $i$ of the path with formula (6). Mind that (6) defines $\alpha$ for the nodes up to the rerouting point $r$, but not for the subpath $a>r$ after the rerouting point, because the decision process is already over at $r$ and $\alpha$ has no particular meaning beyond it.
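The search for the rerouting point can be sketched as follows. This is a minimal illustration under one assumed reading of formulas (1) to (5): the path probability is the product of typical arc conditional probabilities before $r$ and actual ones from $r$ onward, and $r$ is the candidate maximizing that product. The function names, the index convention (arc index `r` stands for the rerouting point at the tail of arc `r`, and `r == len(path)` means no rerouting) and the sample numbers are hypothetical.

```python
import math

def path_probability_with_rerouting(p_typical, p_actual, r):
    """Probability of the observed path if the rerouting decision was taken at arc index r.

    Arcs before r use typical probabilities, arcs from r onward use actual ones.
    r == len(p_typical) corresponds to no rerouting (all arcs typical).
    """
    return math.prod(p_typical[:r]) * math.prod(p_actual[r:])

def find_rerouting_point(p_typical, p_actual, alpha_model=None):
    """Return (r, probability) maximizing the path probability over admissible r."""
    n = len(p_typical)
    best_r = n
    best_p = path_probability_with_rerouting(p_typical, p_actual, n)
    for r in range(n):
        if p_typical[r] == p_actual[r]:
            continue  # assumed form of condition (4): probabilities must differ at r
        if alpha_model is not None and alpha_model[r] <= 0.0:
            continue  # assumed form of condition (5): model must allow rerouting at r
        p = path_probability_with_rerouting(p_typical, p_actual, r)
        if p > best_p:
            best_r, best_p = r, p
    return best_r, best_p

# Hypothetical arc conditional probabilities along one observed path.
p_hat = [0.9, 0.8, 0.1, 0.2]    # typical
p_tilde = [0.9, 0.8, 0.7, 0.6]  # actual
r, p_r = find_rerouting_point(p_hat, p_tilde)
```

In this toy case the observed path becomes much more likely once the actual probabilities are applied from the third arc onward, so the rerouting point is placed at index 2.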
We further define the probability of realization of path $k$ using (6), as the joint probability that the rerouting decision was not taken before the rerouting point $r$ and was taken at $r$. Of course, for each observed path $k$ the probability of observing this path equals 1, as computed in (7).
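As a small sketch of the two definitions above, assuming that the observed rerouting probability is an indicator (0 before the rerouting point, 1 at it) and that the realization probability is the joint product described in the text; the function names are hypothetical.

```python
def observed_alpha(n_nodes, r):
    """Observed rerouting indicator for nodes 0..r: 0 before r and 1 at r."""
    if not 0 <= r < n_nodes:
        raise ValueError("r must index a node of the path")
    return [0.0] * r + [1.0]

def realization_probability(alpha, r):
    """Joint probability of not rerouting at nodes 0..r-1 and rerouting at node r."""
    prob = alpha[r]
    for a in alpha[:r]:
        prob *= 1.0 - a
    return prob

alpha_obs = observed_alpha(n_nodes=5, r=3)
prob_obs = realization_probability(alpha_obs, r=3)
```

With the indicator values the product is 1 for every observed path, which matches the statement that an observed path is realized with probability 1.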

ICM model
ICM links three submodels: an information model, an observation model, and a compliance model telling whether an individual reroutes to avoid the negative impact of the event, each parameterized to fit actual observed behavior. For each node $i$ in the network we can compute the result of the ICM model using input from DTA, most importantly typical and actual travel times and costs. ICM computes $\alpha$ as shown in formula (8), which links the three submodels of ICM defined through formulas (9) to (11). The submodels (9) to (11) are defined using the following terms: $M(\tau)$, the global impact of the event, calculated as the total network delay at time $\tau$; and $\Delta t_i(\tau)$, the cumulated delay on the subpath from the origin to node $i$. ICM is parameterized through the set of parameters $a = \{a_1, \dots, a_4\}$ plus the parameters of the logit model, with the following meaning: $a_1, a_2$ control the sensitivity of information spread to the total severity of the event, where $a_1$ alters the total probability of receiving information and $a_2$ the pace at which information spreads; $a_3, a_4$ control the probability of guessing the event from the total delay.
The proposed structure of the ICM model is, at this stage, only a hypothesis about the actual rerouting behavior of individuals and should be further validated against observations. However, based on the literature review we can say which parts are better supported than others. The Rayleigh shape of the information spreading process is supported by a number of Twitter-based observations (most notably [26]). Further scaling of the Rayleigh distribution with $M(\tau)$ seems reasonable in light of the virality analyses of information spread. On the contrary, the functional form of the observation model is not justified, although some ATIS [40] and day-to-day analyses [41] suggest that an exponential smoothing rule based on cost difference is appropriate. The proposed binomial logit is a recognized solution for modelling discrete choices, while the difference between costs, represented here by $\Delta w$, can resemble the possible gains and losses, which links to the acknowledged prospect theory describing the decision-making process.
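To make the qualitative structure tangible, the sketch below combines the three ingredients named in the text: a Rayleigh-shaped information spread scaled by event severity, an exponential observation rule based on delay, and a binomial logit for compliance. It is not formulas (8) to (11) themselves: every functional form, the way the submodels are combined, and the names `information_spread`, `observation`, `compliance`, `rerouting_probability` and `beta` are assumptions made for illustration only.

```python
import math

def information_spread(elapsed, severity, a1, a2):
    """Share of informed drivers: Rayleigh CDF scaled by event severity (assumed form)."""
    if elapsed <= 0.0 or severity <= 0.0:
        return 0.0
    reach = 1.0 - math.exp(-a1 * severity)  # a1: total share that can be reached
    sigma = 1.0 / (a2 * severity)           # a2: pace, faster for more severe events
    return reach * (1.0 - math.exp(-elapsed ** 2 / (2.0 * sigma ** 2)))

def observation(delay, a3, a4):
    """Probability of guessing the event from experienced delay (assumed form)."""
    return a3 * (1.0 - math.exp(-a4 * max(delay, 0.0)))

def compliance(delta_w, beta):
    """Binomial logit on the utility gain of rerouting."""
    return 1.0 / (1.0 + math.exp(-beta * delta_w))

def rerouting_probability(elapsed, severity, delay, delta_w, a, beta):
    """Assumed combination: aware (informed or guessed) and willing to reroute."""
    a1, a2, a3, a4 = a
    if delta_w <= 0.0:
        return 0.0  # no utility of rerouting, hence no rerouting
    iota = information_spread(elapsed, severity, a1, a2)
    obs = observation(delay, a3, a4)
    aware = 1.0 - (1.0 - iota) * (1.0 - obs)
    return aware * compliance(delta_w, beta)
```

The only property this sketch is meant to preserve is the one stated in the text: the rerouting probability is driven by information, observation and the utility of rerouting, and is zero when rerouting brings no gain.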

Calibrating ICM using observed paths
We propose the following method to estimate the ICM model based on the observations of the paths $K$. We define the estimation process with the theoretical values being the probabilities of rerouting $\alpha$ given by the ICM model for each path element, and the observed values being those defined in (6). In general, the fit between the two can be expressed as in (16), where $x$ is any distance measure. Yet for estimating models with categorical outcomes (in our case a dichotomous yes/no model) we follow most researchers and propose the maximum log-likelihood formulation shown in (17). The log-likelihood $L$ of (17) is used as the objective function of the optimization problem. The parameters are estimated by finding the maximum of $L$ using a numerical method or software package, e.g. [42].
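A minimal sketch of an element-wise log-likelihood for a dichotomous outcome is given below. It assumes that (17) has the standard binary form, summing $\tilde{y} \ln \alpha + (1 - \tilde{y}) \ln (1 - \alpha)$ over path elements, with $\tilde{y}$ the observed indicator; the function name and the clipping constant `EPS` are illustrative choices.

```python
import math

EPS = 1e-10  # floor that keeps log() finite when the model returns exactly 0 or 1

def log_likelihood_elements(alpha_model, alpha_observed):
    """Binary log-likelihood summed over path elements (assumed form of the objective)."""
    total = 0.0
    for a_mod, a_obs in zip(alpha_model, alpha_observed):
        a_mod = min(max(a_mod, EPS), 1.0 - EPS)
        total += a_obs * math.log(a_mod) + (1.0 - a_obs) * math.log(1.0 - a_mod)
    return total

# Hypothetical model output for three elements; rerouting observed at the last one.
value = log_likelihood_elements([0.0, 0.0, 0.5], [0.0, 0.0, 1.0])
```

A function of this kind can be passed, with its sign flipped, to any general-purpose numerical minimizer to search for the model parameters.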
Mind that this formulation of the likelihood sums over all elements of the paths $k \in K$ up to their rerouting points. Most of the observations will be of not rerouting (i.e. having a zero value of the observed $\alpha$), as there are just $|K|$ rerouting points and many more path elements. As a result the optimization will be driven by the second term of the likelihood, and the best fit will be obtained with ICM producing zero probability. Therefore a reformulation of (17) is advocated, also because likelihood estimation assumes that observations are independent. Formulation (17) uses each element of the subpath $a \le r$ as an independent observation, while it is more adequate to treat each path $k$ as an independent observation. Therefore we propose an alternative formulation of $L$ using the path realization probability defined in (7). For this we need the equivalent of this probability from the ICM model, which is defined in (18). Based on the above we can redefine $L$ as in (20). This formulation of the log-likelihood function reduces the number of observations available for calibration, yet it is more consistent with the actual correlations between the observations, which are not captured in (17). For a single observed path $k$ the value of $L$ computed with both formulations is the same; however, formulation (20) yields a different structure of the optimization program, presumably more consistent with the structure of the problem. Now the log-likelihood maximization can be seen as maximizing the probability of realization of the observed paths produced by ICM through (20). The sample can be extended to cover also paths for which rerouting was not observed (for which $r$ computed with (3) is placed at the destination) by adopting a corresponding assumption in the ICM model.
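The path-level formulation can be sketched as below, assuming that the model probability of realizing a path is the product of not rerouting before $r$ and rerouting at $r$, and that the path-level log-likelihood is the sum of the logs of these probabilities. The names and the sample values are hypothetical.

```python
import math

EPS = 1e-10  # floor keeping log() finite

def path_probability(alpha_model, r):
    """Model probability of not rerouting at nodes 0..r-1 and rerouting at node r."""
    prob = alpha_model[r]
    for a in alpha_model[:r]:
        prob *= 1.0 - a
    return prob

def log_likelihood_paths(sample):
    """Sum of log path probabilities; sample is a list of (alpha_model, r) pairs."""
    return sum(math.log(max(path_probability(alpha, r), EPS)) for alpha, r in sample)

# Hypothetical single path with rerouting observed at node index 3.
alpha = [0.1, 0.0, 0.2, 0.5]
L_path = log_likelihood_paths([(alpha, 3)])

# Element-wise sum over the same path, for comparison.
L_elements = sum(math.log(1.0 - a) for a in alpha[:3]) + math.log(alpha[3])
```

For this single path the two values agree, in line with the remark in the text that both formulations give the same $L$ for one observed path.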

Illustrative example
In this section we provide a synthetic example with a single path $k$ observed during an unexpected event.
The path consists of 22 decision points $i$ reached at respective times $t_i^k$. Using DTA we can obtain the arc conditional probabilities, both typical $\hat{p}_a^d(t_i^k)$ and actual $\tilde{p}_a^d(t_i^k)$, of each arc $a$ of the path and apply them through (3) to determine the rerouting point. In our case the rerouting point was placed at the 19th node. The whole example is depicted in Fig. 1. For consecutive decision points we can see the values of the theoretical and observed probability of rerouting $\alpha$, as well as the explanatory variables of ICM: $\Delta t$ (delay), $\Delta p$ and $\Delta w$ (utility of rerouting), $\iota$ (information spread), and $M$ (significance of the event). Fig. 1 shows that the information spread process increases along the CDF of the Rayleigh distribution according to (9), while the severity of the event $M$ reaches its maximum at time $t_{13}^k$ and then decreases. The individual experienced atypical delay only at the 14th and 15th nodes, but it did not make him or her reroute yet. $\alpha$ is positive only if there is utility of rerouting, resulting from positive values of $\Delta p$ or $\Delta w$. The theoretical rerouting probability is positive only at the 8th, 13th, 15th and 19th decision points, with various combinations of $\Delta p$ and $\Delta w$. The subpath from the 20th to the 22nd element (the subpath after the rerouting decision was made) is not used, as it does not bring information.

Fig. 1. Illustrative example: observation of a single path k
For numerical reasons we cannot estimate a model which produces null $\alpha$, as log(0) does not exist; therefore we replace it with a very small number, e.g. $10^{-10}$. The theoretical probability of realizing this path computed with (18), being the probability of not rerouting before the 19th decision point and rerouting at the 19th decision point, is 0.606. The log-likelihood of the proposed model calculated with (17) and with (20) coincides and equals -0.4995. Model estimation for this single path does not make sense, as a single path cannot make a sample; however, the applied procedure resulted in $-L \approx 0$ by modifying the information spread model $\iota$ so that it is zero before $t_{19}^k$ and quickly grows to 1 at $t_{19}^k$. This shows on the one hand that the model can be estimated, but on the other hand that a much bigger sample is needed. It also shows that additional boundary conditions for the parameters should be carefully designed to fit reality and to preserve the cognitive process captured by the ICM model and revealed by other researchers.
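The two numerical points above can be checked with a short sketch. It assumes that the reported log-likelihood uses the natural logarithm of the path realization probability; the helper `safe_log` and the constant name are illustrative.

```python
import math

ALPHA_FLOOR = 1e-10  # replacement for a null probability, as suggested in the text

def safe_log(p, floor=ALPHA_FLOOR):
    """Natural logarithm with the argument floored so that the result stays finite."""
    return math.log(max(p, floor))

# Cross-check of the reported figures under the natural-log assumption.
reported_log_likelihood = -0.4995
implied_path_probability = math.exp(reported_log_likelihood)
```

Under this assumption the implied path probability is about 0.6068, which agrees with the reported 0.606 up to rounding, while `safe_log(0.0)` returns a large negative but finite penalty instead of failing.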

Perceived costs
Mind that ICM hides a circularity, as an individual makes a decision based on $\alpha$, which is calculated using actual costs, while in reality either a) instantaneous actual, b) free-flow or c) maximal expected travel times may be utilized. Such a hypothesis can be tested by applying any formula generating perceived costs $c$ as a function of observed, typical and instantaneous costs. Such a formula can be validated by maximizing the consistency of route choices using typical route-choice estimation methods (e.g. [43]) coupled with placing the rerouting point $r$ with formula (3).
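One simple candidate for such a formula is a convex combination of the three cost components. The sketch below is only an example of the kind of function that could be tested; the linear form, the function name `perceived_cost` and the weights are assumptions, not a proposal from the source.

```python
def perceived_cost(observed, typical, instantaneous, weights):
    """Perceived cost as a convex combination of three cost components (assumed form)."""
    if min(weights) < 0.0 or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to 1")
    w_obs, w_typ, w_inst = weights
    return w_obs * observed + w_typ * typical + w_inst * instantaneous

# Hypothetical costs in minutes and hypothetical weights.
c = perceived_cost(observed=10.0, typical=20.0, instantaneous=30.0, weights=(0.5, 0.3, 0.2))
```

Setting one weight to 1 recovers a pure hypothesis (for instance, drivers relying only on instantaneous costs), so competing hypotheses can be compared by estimating the weights.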

Conclusion
The paper shows how datasets available nowadays can be used to estimate a rerouting model and to understand rerouting behavior. Thanks to the proposed method, statistical tests (e.g. log-likelihood ratio tests) will make it possible to understand the ICM parameters; most importantly, we can estimate the significance of information and observation, and we can identify the points at which rerouting takes place. The method can be applied to determine the magnitude of rerouting phenomena and their impact on traffic patterns. It can then be seen whether the phenomenon is significant enough and how important it is to cover it in traffic management and ITS solutions. This paper should be further extended with a real dataset of observed paths $K$ for a city with an established real-time DTA model available.