Route choice behaviour and travel information in a congested network: Static and dynamic recursive models

Abstract Travel information has the potential to influence travellers' choices, in order to steer travellers to less congested routes and alleviate congestion. This paper investigates, on the one hand, how travel information affects route choice behaviour, and on the other hand, the impact of the travel time representation on the interpretation of parameter estimates and prediction accuracy. To this end, we estimate recursive models using data from an innovative data collection effort consisting of route choice observation data from GPS trackers, travel diaries and link travel times on the overall network. Though such combined data sets exist, these have not yet been used to investigate route choice behaviour. A dynamic network in which travel times change over time has been used for the estimation of both recursive logit and nested models. Prediction and estimation results are compared to those obtained for a static network. The interpretation of parameter estimates and prediction accuracy differ substantially between dynamic and static networks as well as between models with correlated and uncorrelated utilities. Contrary to the static results, for the dynamic network, where travel times are modelled more accurately, travel information does not have a significant impact on route choice behaviour. However, having travel information increases travel comfort, as interviews with participants have shown.


Introduction
Congestion in transportation networks leads to travel time uncertainty, and more specifically to delays, and negative environmental impacts. It is a result of traveller behaviour such as route and departure time choices. Travel information has the potential to influence travellers' choices, in order to steer travellers to less congested routes and alleviate congestion (Arnott et al., 1991;Balakrishna et al., 2013;Ben-Elia and Shiftan, 2010;Bogers, 2009;Levinson, 2003;Van Essen et al., 2018). Moreover, provision of travel information contributes to a more efficient use of the available road capacity. It is known that different sources of travel information affect route choice behaviour to a different extent (De Palma et al., 2012). When analyzing route choice behaviour under different information provision scenarios, it is necessary to include travel time as an explanatory factor. The effect the travel time representation can have on the modeling results has not been studied before. The question to be addressed in this paper is therefore the following: What is the impact of the travel time representation on the interpretation of model parameters and model
• We present arc-additive utility specifications that link information provision to travel time and we also model other intrinsically non-additive attributes such as habitual route. The estimation results show that the behavioural interpretation of parameter estimates is substantially different for static and dynamic models as well as between models with or without correlated utilities.
• We present an extensive cross-validation study of prediction performance showing that static and dynamic formulations achieve similar performance on our data. Consistent with findings in the literature, we find that modelling correlation is essential for the prediction performance.
The remainder of the paper is structured as follows. In Section 2 we overview work related to, on the one hand, travel information and, on the other hand, route choice analysis. In Section 3, we describe the data collection effort followed by two sections with descriptive analyses of the data: Section 4 is focused on the trip and network travel time data that are core to route choice modelling, while Section 5 delineates a descriptive analysis of the combined data sources (network, trips and travel diaries) with a particular focus on information provision. After these data focused sections, we describe the route choice models in Section 6 focusing on the difference between static and dynamic formulations. We present estimation and prediction results in Section 7 and, finally, Section 8 concludes the presented work.

Travel information
Both the provision and the quality of travel information have considerably improved in the past three decades. However, the primary aim of providing travel information remains the same: better management of traffic flow, to enhance driving operations and to improve travellers' safety (Adler and Blue, 1998). The process through which travellers acquire and use travel information to assist their route choice decisions, however, is not fully understood (Ben-Elia and Avineri, 2015b;Ben-Elia et al., 2008;Chorus et al., 2007;Farag and Lyons, 2012;Kattan et al., 2011). Understanding how travellers react to different sources, types and timing of travel information requires further investigation. This is especially relevant for situations in which multiple sources of travel information are available and travellers have to choose which source(s) to use. In this paper we investigate different sources of travel information and their impact on route choice behaviour. We therefore dedicate the following section to an overview on travel information and travellers' use thereof.

Overview of types of travel information
The available travel information varies in the extent to which the information is OD-specific. While non-OD specific public travel information (e.g., provided by radio or television) describes traffic conditions in the network in general, OD-specific travel information deals with the current travel information on the roads of interest for the traveller, i.e., the links the traveller might consider to use. In between, we have semi-OD-specific information (e.g., variable message signs), showing the traffic conditions on a stretch of major roads (Parvaneh et al., 2012), but this information only partly covers the routes between the specific OD-pair of interest for the traveller. The information specification is very likely to influence travellers' behaviour, due to the provided level of detail, but also as part of a strategic behaviour.
Furthermore, the travel information can be prescriptive or descriptive. While prescriptive information offers one recommendation only (e.g., the fastest route), descriptive information provides several options among which the decision maker has to choose (e.g., alternative routes). When prescriptive travel information is provided, travellers have to decide whether to comply with the recommendation or hold to their planned choice.
The timing of travel information provision also influences travellers' behaviour. Here, we distinguish pre-trip information given before the trip and en-route information provided during the trip. Pre-trip information does not only give travellers the opportunity to plan their route (e.g., fastest or less congested route), but also to optimise their departure time. While pre-trip information gives travellers enough time to process the information, en-route information acts more on the tactical level as the time to process is more limited. The latter is particularly true when dealing with OD-specific travel information about the current traffic situation. The mental workload and stress consequently increase as a decision to divert (or not) may result in much longer travel times. Travellers confronted with too much information may become oversaturated and show some difficulty to process it (Payne et al., 1993;Tianliang et al., 2013).

Use of travel information
Multiple studies have been performed to investigate the role of travel information (Ben-Elia and Avineri, 2015a; Lindsey et al., 2014;Molin and Timmermans, 2006;Polydoropoulou and Ben-Akiva, 1999). Initial studies about travel information traditionally focused on requirements for successful utilisation of ATIS tools (Bifulco et al., 2014;Crosby et al., 1993;Koutsopoulos and Xu, 1993;Ng et al., 1995;Sun et al., 2014), discussing whether the travel information should be static (e.g., location of points of interest) or dynamic (e.g., travel information), qualitative (e.g., road congested) or quantitative (e.g., amount of minutes of the delay) and prescriptive or descriptive. In addition, the level of reliability of the travel information was a point of concern. The frequently applied assumption that travellers immediately react to perfect travel information appears to be unrealistic (Ben-Akiva et al., 1991;Mahmassani and Jayakrishman, 1991;Polak and Jones, 1993). This is especially true for situations where multiple sources of travel information are available and travellers have to choose which source to use, which is the case in our study.
Travel information is used in different ways. Some drivers may reduce their level of anxiety just by being aware that travel information exists; others may review the information on a regular basis, but make only limited use of route guidance and no use of the traffic situation information; still others may accept and comply with the advice without question (Schofer et al., 1993).
Literature shows that both the traffic situation and characteristics of the traveller influence the required types and quality level of travel information (De Palma et al., 2012). Vaughn et al. (1993) show that less experienced drivers (in terms of travelling frequency) tend to comply more with travel information than more experienced drivers. As experience increases, travellers are more reluctant to make use of travel information and tend to prefer routes with lower average travel times but greater travel time variance. The degree of familiarity with the road network influences the use of travel information (Bonsall, 1996), while this use of travel information in turn improves the efficiency in travellers' route choice and knowledge about network conditions (Adler and Kalsher, 1994). However, additional information appears only useful if a limited number of travellers possess such information (Knorr et al., 2014). Other relevant socio-economic factors appeared to be gender, age, and whether travellers are full-time workers (Wang et al., 2017;Knorr et al., 2014). In addition, the type and accuracy of travel information affect traveller behaviour. Investigating the accuracy of information shows that decreasing accuracy shifts choices mainly from the riskier to the reliable route but also to the useless alternative (Bogers, 2009;Ben-Elia et al., 2013). Prescriptive information appears to have a larger behavioural impact than descriptive information (Ben-Elia et al., 2013). Recently, Zhang et al. (2018) investigated the effect of friends' information, and show that more social interactions in an online travel community do not necessarily lead to better route choices. Finally, trip contexts play a role in the acquisition process (Joh et al., 2011).
Many studies have been performed to investigate how the use of travel information affects route choice behaviour (Yu et al., 2019;Ramos et al., 2018), and to what extent routes change. According to Mahmassani and Liu (1999), commuters are willing to change their route choice under well-developed route guidance systems. Commuters' route switching decisions are based on the expectation that the total travel time will decrease by a certain threshold (indifference band), which varies with the remaining travel time to the destination, subject to a minimum absolute decrease of about 1 min (Mahmassani and Liu, 1999). Commuters receiving pre-trip information are more likely to switch routes than those who do not (Mahmassani and Liu, 1997). Another option to avoid overcrowded routes is to change departure time. Caplice and Mahmassani (1992) and Mahmassani and Liu (1999) have shown that the decisions to change route and departure time are tightly linked. However, there is no consensus on whether a route change (Neuherz et al., 2000;Petrella and Lappin, 2004) or a departure time change (Jou and Mahmassani, 1996) is the most likely adaptation due to information provision (Chorus et al., 2006;Yu et al., 2019). Chorus et al. (2013) investigated the joint effect of information acquisition and its use on travel choices, implying a single underlying system of preferences and beliefs.

Recursive route choice models
Random utility discrete choice models are often used to analyse route choices in transport networks. In this context there are two key challenges that the vast majority of studies in the literature focus on, namely, how to define choice sets of paths and how to model correlated utilities (see, e.g., Prato, 2009, for a survey). Recursive route choice models overcome these two challenges (see Zimmermann and Frejinger, 2019, for a tutorial). While a sampling of alternatives approach is computationally appealing and allows consistent estimation of path-based route choice models (Frejinger et al., 2009;Guevara and Ben-Akiva, 2013b,a;Lai and Bierlaire, 2015), it is unclear how to use the resulting models for prediction. Since we are interested in both estimation and prediction results, we only consider recursive models in this study. The recursive logit model (Fosgerau et al., 2013) is based on the dynamic discrete choice framework where the choice of a path is modelled as a sequence of arc choices. At each choice stage, given a state, a traveller chooses an action that maximizes the sum of an instantaneous utility associated with this action and the expected maximum utility from the next state until the destination (also known as the value function). In a static network, the state can be defined by an arc and the actions by the outgoing arcs. The size of the state space is in this case the number of arcs in the network. A key assumption in Fosgerau et al. (2013) is that attributes are deterministic. This means that the next state is given by the action and the only source of stochasticity is the choice model. We make the same assumption in this paper.
While Fosgerau et al. (2013) mention that the model can be used in dynamic networks as long as attributes are deterministic, there have only been applications to static road networks (e.g., Mai et al., 2015;Mai, 2016;Zimmermann et al., 2017), most of which focus on different ways to model correlated utilities. Nevertheless, Oyama and Hato (2017) deal with a dynamic setting using data from Japan where a gridlock situation occurred. They divide the period of interest into one-hour intervals and consider a static network representation per interval. In this study we investigate two different static network representations where one, similarly to Oyama and Hato (2017), is defined over specific time intervals. In our case, these intervals are peak-hour periods.
We are only aware of two works dealing with dynamic recursive choice model formulations (Zimmermann et al., 2018;Västberg et al., 2019), both focused on activity-travel planning. They define a state as location, time-of-day and information on the previous action. In turn, an action is the joint choice of destination, mode of transport and activity. Different from our work, they do not focus on route choice in a transport network. Therefore, the time between locations is not a result of explicit choices in a transport network; rather, it is given by a random variable with known distribution (Västberg et al., 2019). Hence, they focus on challenges that are different from ours, for example, how to make the estimation computationally tractable in this formulation comprising a large number of feasible actions at each stage and probabilistic state transitions. In summary, there is an extensive literature on modelling route choice behaviour with random utility models but, to the best of our knowledge, the impact of the travel time representation on parameter estimates and the prediction results has been completely overlooked. We aim to address this gap.

Data collection
The type of analysis of route choice behaviour we focus on in this study requires data on chosen paths as well as characteristics of the paths and the conditions under which the route choice took place. One of these external conditions is the availability of travel information, with different sources and timing (pre-trip and en-route) of information provision. Most studies that investigate travel information use data from stated preference (SP) experiments. Revealed preference (RP) experiments are conducted in real settings, so one can investigate what travellers actually do, instead of discussing hypothetical scenarios. We have performed a long-term RP experiment, in which we deal with the challenges related to this kind of experiment, namely relating past choices to a network in terms of alternatives and traffic conditions, finding significant variation of travel times across at least two alternatives and observing multiple choices from the same participant (Axhausen et al., 2002). In the following, we briefly introduce the data collection. For more details, we refer to de Moraes Ramos (2015).
The RP experiment took place in the Netherlands over a period of nine weeks, from May 9th to July 12th 2011. Thirty-two participants were selected among the staff and students at Delft University of Technology and they all commuted between The Hague and Delft. Among the participants, 44% were women and 56% men. Their age ranged from 23 to 60 years old among which 41% were between 35 and 45 years old. Their commuting frequencies varied between 2 and 5 days per week, but the majority, 61%, commuted 5 days per week.
These participants share the characteristics of drivers with the most frequently observed trips: they are commuters, driving in the Randstad (one of the most congested areas in the Netherlands), having a trip of about 16 km where 60% of the commuters in The Netherlands live 19 km from work. Moreover, the considered network contains alternative paths that are of comparable lengths but with distinct characteristics in terms of type of road and travel time variability.
The participants had two different information treatments during the data collection period, referred to as reference (the control group) and information. The usual free/public sources such as radio, television, websites (e.g., ANWB route planner, Google maps) and VMS were available to all participants. During the last six weeks 80% of the participants were equipped with a TomTom of type VIA Live 120 Europe (TomTom, 2016) and hence had access to OD-specific real-time traffic information (none of the participants had access to such a device during the first three weeks). We refer to those having a TomTom device as informed travellers and to the others as regular travellers. We made a distinction between information sources that are considered to be accessible only pre-trip (websites and television), only en-route (VMS) and both pre-trip and en-route (radio and TomTom).
TomTom travel information consisted of a recommendation regarding the fastest route between an origin and a destination, estimated trip duration, arrival time and delay with respect to free-flow travel time. This information is based on an indicated departure time. The TomTom devices were all set to indicate the fastest route and warn the traveller in case a new route became the fastest. The participants were told not to change the settings of the device.
The use of travel information was neither imposed on nor recommended to the participants. Instead, at the beginning of the experiment, the participants were instructed to behave as during one of their usual commuting trips.
All participants kept a GPS device of the type BT-Q1000XT (Qstarz, 2016) in their car. The horizontal precision is around 1 meter and on average 8 satellites are used to define travellers' coordinates. The GPS was set to log automatically every 5 s whenever movement was detected, so positions were recorded at 5 s intervals while the participants performed their commuting trips by car.
In addition, the participants were asked to fill in a travel diary after each trip which contained questions related to their reactions to travel information, perceptions about the trip just made and expectations about the next trip.
The network data, corresponding to link travel times at 1 min intervals, were stored by TomTom during the whole study period. Data were collected not only on chosen paths but also on alternative paths in a congested network. They were retrieved and processed after the data collection period. The following section provides further details.

Network and trip data
In this section we describe the collected data sources and associated data processing: network and GPS data as well as travel diaries. We consider the main roads in the Delft-The Hague area, which can be represented by a static network composed of 520 links and 200 nodes, shown in Fig. 1. There are a number of static attributes in this network, such as the presence of traffic lights, speed limits and location of VMSs. During certain periods, the roads in this area are congested, so travel times are not static. We define dynamic travel times by using data from TomTom that was collected at the same time as the participants made their route choices. Collecting data on route choice and the network state simultaneously allows us to compare static and dynamic network representations.
In the TomTom database, travel times are estimated at one-minute intervals based on observations (distinct from our path observations) and are stored at a path level. We transform the path travel times into link travel times using a conversion factor inferred from the data (see de Moraes Ramos, 2015, for details). For the links off the main roads that were not present in the path travel time database we assumed free-flow travel times.
In this study we consider the trips for which the respondents filled in a travel diary. Combining this information with the GPS data makes it possible to construct the trajectories and match them to the network without ambiguity. We discard any trips where the GPS signal was interrupted at any time between the stated origin and destination. The resulting sample consists of 897 trajectories among which 374 refer to the reference treatment and 523 to the information treatment.
The observed trips took place during morning and afternoon peak hours defined from 7 to 10 AM and 4 to 7 PM. We built time-expanded network representations for these time periods and the days on which the trips took place. The shortest link travel time is 10 s, so we choose to divide the time periods into 10 s time intervals. This results in dynamic network representations of approximately 600,000 links and we need 563 such networks to cover all the observations. In this context, it is important to analyse whether travel times for the observed trajectories are different when computed based on the dynamic and static networks. The graph in Fig. 2 shows that there are indeed important differences. It displays the percentage difference between static and dynamic travel times. For example, a value of −25 means that the travel time of the observed trip is 25% shorter in the dynamic network than in the static network. Assuming the dynamic network to be more accurate, the graph shows that the static network underestimates the travel time for certain trips, while it overestimates others. Moreover, the error can be quite large, indicating an important variability of travel time, even when building static networks for specific time periods.
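The order of magnitude of the time-expanded network can be checked with simple arithmetic. The sketch below assumes that each peak period lasts three hours and that every static link is replicated once per 10 s interval; the function name and the sign convention of the percentage difference are our own assumptions, inferred from the description of Fig. 2.

```python
# Back-of-the-envelope check of the time-expanded network size.
PEAK_HOURS = 3            # each peak period lasts three hours (7-10 AM or 4-7 PM)
INTERVAL_SECONDS = 10     # interval length, equal to the shortest link travel time
STATIC_LINKS = 520        # links in the static network

intervals_per_period = PEAK_HOURS * 3600 // INTERVAL_SECONDS
expanded_links = STATIC_LINKS * intervals_per_period


def percent_difference(dynamic_time, static_time):
    """Difference of the dynamic travel time relative to the static one, in percent.

    Assumed sign convention: a negative value means that the trip is shorter
    in the dynamic network than in the static network.
    """
    return 100.0 * (dynamic_time - static_time) / static_time


print(intervals_per_period)            # 1080
print(expanded_links)                  # 561600
print(percent_difference(15.0, 20.0))  # -25.0
```

Under these assumptions a peak period contains 1080 intervals and 561,600 time-expanded links, which is in line with the approximately 600,000 links reported above.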

Descriptive analysis of the combined data sources
In the previous two sections we presented the different data sources at our disposal. In this section we present a descriptive analysis with a particular focus on the information sources that were used before and during the trips. The results are reported in Table 1 and are mostly based on the travel diaries and the interviews.
G. de Moraes Ramos, et al. Transportation Research Part C 114 (2020) 681-693
First of all, we see that informed travellers tend to consult travel information more often than travellers who do not have an OD-specific travel device ("Informed" column under information treatment "I"). Informed travellers use en-route information more often than pre-trip information, while the opposite occurs for regular travellers. The higher use of en-route information in relation to pre-trip information for informed travellers is contrary to what has been shown by Abdel-Aty et al. (1997) and Jou (2001). However, those studies were based on SP data, which might have led to an overestimation of the use of travel information by the participants. In our research, informed travellers seem to use travel information not as a tool to plan their departure times, but to identify the best route when on their way (just entered their car).
Secondly, we discuss the different sources of travel information. Among the public sources of travel information, radio appears to be the preferred source of pre-trip information even when compared to websites on the internet. For en-route information, however, radio is the preferred source only in the reference treatment, while VMS is preferred in the information treatment. The preference for radio might be explained by the fact that travellers listen to the radio during their trips, and thus end up listening to the traffic information. Though VMS were noted not to be easily understood and sometimes covered by trees, they were used by the informed travellers. When OD-specific real-time travel information is available, it turns out to be the main used source of travel information, irrespective of the timing of information provision (pre-trip or en-route). In this case, travellers ignore radio information: even when travellers may be listening to the radio, they do not seem to register/process that travel information may have been provided.
Thirdly, compliance of travellers is analysed (row in the middle of Table 1). Here, compliance is defined as following an advice, irrespective of whether travellers had already decided to choose a specific route. Contrary to the findings related to the consultation of information, among the informed travellers compliance with pre-trip information is higher than with en-route information. This may be caused by the fact that changing route while already being on the way frequently results in taking a local road, which in general is not preferred and associated with long travel times, or due to habit, as travellers may prefer to adhere to habitual routes due to the familiarity with the environment. The compliance rate appears to be higher when the source of the travel information is more detailed. Moreover, compliance rates are different for informed and regular travellers, as informed travellers are more willing to comply with travel information than regular travellers. Note that 77% of the informed travellers reported that the route suggested by TomTom differed from their planned route less than 30% of the time. This relatively small difference may be the reason for the large compliance rate amongst informed travellers. However, compliance rates do not only depend on the information itself, but also on the expectations of travellers as well as their willingness to change routes: GPS tracks showed that even when provided with information, most travellers follow their habitual routes and they do not explore different routes. We now turn our attention to the quantitative analysis of route choice behaviour. For this purpose we first present the models in the following section, followed by estimation and prediction results in Section 7.

Let $G = (V, A)$ denote a static network composed of nodes $V$ and links $A$, i.e., the links represent physical road connections and the nodes intersections. The set of outgoing links from the sink node of a link $k \in A$ is denoted $A(k)$. We consider a time-expanded version of $G$, denoted $\tilde{G}$, over time intervals $t = 1, \ldots, T$. There are hence $|A|T$ links in the time-expanded network so that link travel times can vary over time intervals. Static link attributes are simply the same for a given link over all $t = 1, \ldots, T$. A state is given by a link in $\tilde{G}$, that is, a location and time pair $(k, t)$ where $k \in A$ and $t$ is the time interval at the sink node. At each state, a traveller chooses an outgoing link $a \in A(k)$ and the next state $(a, t')$ is deterministically given, as the link travel time $t(a \mid (k,t))$ is deterministic and hence $t' = t + t(a \mid (k,t))$. We assume that $t' > t$, i.e., a traveller cannot stay in the same time interval after taking an action. This is easily ensured by fixing the length of the time interval to be the minimum link travel time. We add an absorbing state $(d, T)$ to represent the destination $d$. This absorbing state can be reached from any state $(k, t)$ for which $d$ is a successor of $k$, i.e., $d \in A(k)$. Fig. 3 depicts an example of a dynamic network (b) and its corresponding static representation (a). For the sake of illustration we draw a graph of the dynamic network that represents the alternatives over time. Time is represented horizontally and space vertically. We note that this is a partial representation of the full dynamic network as we do not represent all the arcs at all time periods. The arc identifications (in brackets) are consistent across the two subfigures. Moreover, the arc travel times in the static network are averages of those in the dynamic network. After detailing the recursive model formulations, we further analyse the path choice probabilities in these two networks.
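The construction of the deterministic state transitions can be sketched in a few lines of Python. This is a minimal illustration under our own assumptions rather than the implementation used in the study: the function `expand_network`, the three-link network and its travel times are hypothetical, travel times are expressed in whole intervals, and the absorbing destination state is left out for brevity.

```python
def expand_network(out_links, travel_time, num_intervals):
    """Build the deterministic transitions of a time-expanded network.

    out_links: dict mapping each link k to the links leaving its sink node.
    travel_time: function (a, t) -> travel time of link a, in whole intervals,
                 when it is entered at interval t.
    Returns a dict mapping each state (k, t) to its next states (a, t_next).
    """
    transitions = {}
    for k in out_links:
        for t in range(1, num_intervals + 1):
            successors = []
            for a in out_links[k]:
                t_next = t + travel_time(a, t)
                if t_next <= t:
                    raise ValueError("link travel times must be at least one interval")
                if t_next <= num_intervals:  # drop moves that leave the horizon
                    successors.append((a, t_next))
            transitions[(k, t)] = successors
    return transitions


# Hypothetical three-link network: after link "o" the traveller picks "a" or "b".
out_links = {"o": ["a", "b"], "a": [], "b": []}


def travel_time(link, t):
    # Link "a" is congested from interval 3 onwards; "b" always takes 2 intervals.
    if link == "a":
        return 1 if t < 3 else 3
    return 2


transitions = expand_network(out_links, travel_time, num_intervals=6)
print(transitions[("o", 1)])  # [('a', 2), ('b', 3)]
print(transitions[("o", 3)])  # [('a', 6), ('b', 5)]
```

Because every move strictly increases the time index, the resulting graph has no cycles, whatever the structure of the static network.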
The recursive logit model of Fosgerau et al. (2013) can be applied to a time-expanded network. The only difference is that a state is defined by $(k, t)$ instead of simply $k$. It is important to underline that the network structure is different, as can clearly be seen in the previous example. A static network can have cycles while the time-expanded network does not. For the sake of clarity, we present the key equations of the recursive logit model with the proposed state definition. We associate an instantaneous utility $u(a \mid (k,t))$ with each action $a \in A(k)$,
$$u(a \mid (k,t)) = v(a \mid (k,t); \beta) + \mu\,\varepsilon(a \mid (k,t)), \tag{2}$$
where $v(a \mid (k,t); \beta)$ is a deterministic utility, $\beta$ is a vector of parameters to be estimated and $\mu$ is the scale parameter of the random terms $\varepsilon(a \mid (k,t))$. Let $V^d((a,t'))$ denote the expected maximum utility from a state $(a, t')$ to the destination $d$, referred to as the value function. It is given by the Bellman equation and is defined for all states $(k, t)$:
$$V^d((k,t)) = E\left[\max_{a \in A(k)} \left( v(a \mid (k,t); \beta) + V^d((a,t')) + \mu\,\varepsilon(a \mid (k,t)) \right)\right].$$
In the case of i.i.d. extreme value distributed random terms, this corresponds to the logsum. Following Fosgerau et al. (2013), the value functions can be computed by solving a system of linear equations of size $|A|T + 1$ (number of links in the time-expanded network plus the absorbing state $(d, T)$). Moreover, if there are no individual or origin-destination specific attributes in the instantaneous utilities, it is possible to use a decomposition method (Mai et al., 2018) that allows solving one system of linear equations for all destinations. This method cannot be used if the utilities contain a so-called link size attribute (Fosgerau et al., 2013), which is an attribute designed to relax the independence from irrelevant alternatives (IIA) property of the logit model, similar to a path size attribute (Ben-Akiva and Bierlaire, 1999). Mai et al. (2015) introduced the nested recursive logit model that also relaxes the IIA property and allows for utilities to be correlated. In this case the instantaneous utilities are
$$u(a \mid (k,t)) = v(a \mid (k,t); \beta) + \mu_{(k,t)}\,\varepsilon(a \mid (k,t)),$$
which differ from (2) because the scale parameters $\mu_{(k,t)}$ are state specific. In this case, the value functions are more costly to solve because they correspond to a system of non-linear equations. Moreover, the decomposition method cannot be used.
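For intuition, the logsum value functions of the recursive logit model can be computed on a small acyclic example. The sketch below does not solve the system of linear equations mentioned above; since a time-expanded network has no cycles, it uses plain backward induction instead. The function names, the toy network and the utility specification are hypothetical, and the destination is represented by several time-stamped absorbing states rather than by a single one.

```python
import math


def value_functions(transitions, utility, dest_states, mu=1.0):
    """Logsum value functions on an acyclic time-expanded network.

    transitions: dict state -> list of next states; states are (link, interval).
    utility: function (state, next_state) -> deterministic utility of the move.
    dest_states: set of absorbing states, whose value is fixed to 0.
    States from which the destination cannot be reached get -inf.
    """
    values = {}
    all_states = set(transitions) | {s for nxt in transitions.values() for s in nxt}
    # Every move strictly increases the interval, so visiting states by
    # decreasing interval guarantees that successors are evaluated first.
    for state in sorted(all_states, key=lambda s: -s[1]):
        if state in dest_states:
            values[state] = 0.0
            continue
        terms = [
            (utility(state, nxt) + values[nxt]) / mu
            for nxt in transitions.get(state, [])
            if values[nxt] > -math.inf
        ]
        if not terms:
            values[state] = -math.inf
            continue
        shift = max(terms)  # log-sum-exp shift for numerical stability
        values[state] = mu * (shift + math.log(sum(math.exp(x - shift) for x in terms)))
    return values


def link_probabilities(state, transitions, utility, values, mu=1.0):
    """Logit probability of each move out of `state`, given the value functions."""
    return {
        nxt: math.exp((utility(state, nxt) + values[nxt] - values[state]) / mu)
        for nxt in transitions[state]
        if values[nxt] > -math.inf
    }


# Hypothetical example: from link "o" the traveller takes "a" or "b", then reaches "d".
transitions = {
    ("o", 0): [("a", 2), ("b", 3)],
    ("a", 2): [("d", 3)],
    ("b", 3): [("d", 4)],
}
BETA = -1.0  # assumed travel time parameter


def utility(state, nxt):
    # Utility is BETA times the number of intervals spent on the move.
    return BETA * (nxt[1] - state[1])


values = value_functions(transitions, utility, dest_states={("d", 3), ("d", 4)})
probs = link_probabilities(("o", 0), transitions, utility, values)
print(probs)
```

In this toy case the route via "a" takes three intervals and the route via "b" four, so the probability of choosing "a" equals $1/(1 + e^{-1})$, as in a multinomial logit over the two paths.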
Both the recursive logit and nested recursive logit models can be estimated with the nested fixed point algorithm (Rust, 1987), which combines an outer non-linear optimization algorithm searching over the parameter space with an inner algorithm solving the value functions. The log-likelihood function is defined over path observations that in the dynamic case correspond to a path in the time-expanded network, i.e., a sequence of states $\sigma = ((k_0,t_0), (k_1,t_1), \ldots, (k_I,t_I))$. We note that in both dynamic and static cases, the paths are composed of the same sequence of physical arcs; only the travel time attributes change. The probability of a path is the product of the probabilities of choosing each link in the path, $P(\sigma) = \prod_{i=0}^{I-1} P((k_{i+1},t_{i+1}) \mid (k_i,t_i))$. We now go back to the illustrative example in Fig. 3 and analyse the choice probabilities of the three path alternatives. We assume that the deterministic utilities are only composed of a travel time attribute, that is, $v(a \mid (k,t); \beta) = \beta\,TT_a(t)$. Table 2 reports the travel times for each of the paths along with the probabilities for two different values of $\beta$. We have constructed this example so that two observations emerge. First, for a given value of $\beta$ (here −1) and the same path, the choice probabilities are different for the static and dynamic networks. Second, we can change the value of $\beta$ (here fixing it to −2) so that the probabilities of the static network are equal to those of the dynamic network with $\beta = -1$ (see the rightmost column of the table).
While these two observations may appear obvious, they are important to keep in mind when analysing the estimation and prediction results in the following section. Indeed, maximum likelihood estimation (MLE) seeks parameter values that maximize the probability of observed trips. If the attribute values are different for static and dynamic networks, the MLE may still succeed in finding parameter values such that the likelihood of the sample remains similar. Of course, the associated parameter estimates will be different. If the goal is prediction, as is often the case in machine learning, then this issue is less of a concern. However, if the goal is to analyse the parameter values (e.g., in our case the perception of travel time information), or to use the resulting utility function as part of another optimisation model (see, e.g., Gilbert et al., 2015, for an example of bilevel optimization), then it is important that the parameters have a correct interpretation. In this case the most accurate measurements of the travel times should be used, as the potential error incurred when averaging the attributes might be absorbed by the parameters.
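The two observations can be reproduced with a small numerical sketch. The travel times below are hypothetical and are not those of Table 2; they are chosen so that the static times are exactly half of the dynamic ones. The sketch assumes three non-overlapping paths and a unit scale, in which case the recursive logit path probabilities reduce to a logit over the path travel times.

```python
import math

def path_probabilities(travel_times, beta):
    """Logit path probabilities with utility beta * travel time (scale 1)."""
    exp_utilities = [math.exp(beta * tt) for tt in travel_times]
    total = sum(exp_utilities)
    return [e / total for e in exp_utilities]

# Hypothetical path travel times for three alternatives.
dynamic_tt = [2.0, 4.0, 6.0]  # time-dependent travel times
static_tt = [1.0, 2.0, 3.0]   # averaged travel times on the static network

p_dynamic = path_probabilities(dynamic_tt, beta=-1.0)
p_static = path_probabilities(static_tt, beta=-1.0)

# Changing beta on the static network reproduces the dynamic probabilities:
# the parameter absorbs the difference in the attribute values.
p_static_rescaled = path_probabilities(static_tt, beta=-2.0)
```

With the same $\beta$ the two networks give different probabilities, while doubling the magnitude of $\beta$ on the static network recovers the dynamic ones, so equal fit does not imply equal parameter values.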

Route choice modelling results
This section is devoted to an analysis of estimation and prediction results. The purpose of the analysis is to assess the impact of the representation of travel time on the results. For this purpose we compare results for different static representations of the network to those of a dynamic representation. Moreover, it is well known that capturing the correlation among utilities can impact the results. To assess this dimension we compare results from recursive logit and nested recursive logit models. In Section 7.1 we focus on the interpretation of the parameters and in-sample fit. Section 7.2 deals with prediction performance assessed in a cross-validation study.

Estimation results
We start by describing the part of the utility specification that remains the same across the different models. We experimented with a number of specifications before settling on this one, which we judged to be the best in terms of in-sample and out-of-sample fit. Recall that the utilities for recursive models need to be additive over arcs. This means that certain attributes that are inherently non-additive need to be transformed. In our application, access to travel time information and habitual routes are such attributes. The dynamic linear-in-parameters arc utility is
$$v_n(a \mid (k,t); \beta) = \beta_{LC}\,LC_a + \beta_{TT}^{inf}\,TT_a(t)\,I_n + \beta_{TT}^{reg}\,TT_a(t)\,(1 - I_n) + \beta_{Hab}\,Hab_{a|k,n} + \beta_{VMS}\,VMS_a, \qquad (4)$$
where
• $LC_a$ is a link constant equal to one for each arc $a$ and designed to penalize crossings;
• $TT_a(t)$ is the travel time on arc $a$ at time $t$, and $I_n$ equals one if traveller $n$ has access to travel time information and zero otherwise;
• $Hab_{a|k,n}$ is an attribute that equals one if both arcs $a$ and $k$ are on the habitual route of traveller $n$ and zero otherwise, and is designed to favour this route if it is known;
• $VMS_a$ equals one if arc $a$ has a variable message sign and zero otherwise.
It is clear from (4) that the utility is additive over arcs, so the corresponding path utility of observation $n$ is simply the sum of the corresponding arc utilities. The information access is captured by estimating two separate travel time coefficients, one for informed travellers and one for regular ones. The habitual route attribute is tricky to capture in an arc-additive form. This requires either augmenting the state space or, as we do here, capturing this attribute in a partial way. In our case, the habitual route attribute captures, for a given path, the number of arcs on the habitual route, but an arc is only counted if the traveller was previously (in state $k$) on an arc on the habitual route. In addition to these explanatory variables, we add a link size (LS) attribute designed to deterministically correct the utilities for correlation, similar to a path size attribute in a path-based model (Fosgerau et al., 2013).
In (4), the only attribute that changes over time $t$ is the travel time $TT_a(t)$. Therefore, the static deterministic utility functions are the same, with the only exception that this attribute is averaged over time. We consider the following two ways to compute those averages:
• Average peak (AVG Peak): Arc travel times are averaged either over the morning or afternoon peak hours and over all days of the experiment.
• Average day and peak (AVG Day & Peak): Arc travel times are averaged over the morning or afternoon peak hours for each date of the experiment. That is, for each observation there is a corresponding static network defined for its particular date and time period.
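The two averaging schemes can be sketched as a simple grouping operation. The record layout (arc, day, peak period, travel time), the labels and the values below are hypothetical and only serve to show the difference between the two static representations.

```python
from collections import defaultdict
from statistics import mean

def average_travel_times(records, by_day):
    """Average arc travel times per peak period.

    records: iterable of (arc, day, period, travel_time) tuples
    by_day=False -> AVG Peak: one average per (arc, period) over all days
    by_day=True  -> AVG Day & Peak: one average per (arc, day, period)
    """
    groups = defaultdict(list)
    for arc, day, period, travel_time in records:
        key = (arc, day, period) if by_day else (arc, period)
        groups[key].append(travel_time)
    return {key: mean(values) for key, values in groups.items()}

# Hypothetical time-dependent measurements for a single arc.
records = [
    ("arc1", "day1", "AM", 10.0),
    ("arc1", "day1", "AM", 14.0),
    ("arc1", "day2", "AM", 20.0),
    ("arc1", "day2", "PM", 8.0),
]
avg_peak = average_travel_times(records, by_day=False)
avg_day_peak = average_travel_times(records, by_day=True)
```

In the second scheme each observation is matched to the static network of its own date and peak period, so day-to-day variation is kept while within-period variation is averaged out.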
We report in Tables 3 and 4 the estimation results for the dynamic and the static models, respectively. We note that there are additional parameters associated with the nested recursive models, which are part of the function giving the state-specific scale parameters in (3). Following Mai et al. (2015), this function depends on the travel time and on $OL_k$, the number of outgoing links from the sink node of $k$. In the static case, the travel time in this expression is the corresponding average one.
A number of findings emerge from these tables. First, in all models, whether static or dynamic, the parameter estimates associated with explanatory variables have the same sign and are in line with expectations. That is, crossings (link constant) and travel time have a negative impact on utility, while habitual route and variable message sign have a positive impact. The parameter estimates of the scale function have different signs between the static and dynamic models, which is expected as the correlation structure is different. The values of these parameters are not interpretable as their role is to model the size of the positive scale parameters $\mu_{(k,t)}$. In terms of in-sample fit, as expected, the nested models are better than the recursive models. As opposed to findings in other studies, the link size attribute does not lead to a significant improvement in in-sample fit (likelihood ratio test at 0.05 significance level).
We now turn our attention to the behavioural interpretation of parameter estimates. In order to compare results across different models and travel time definitions, we report a sample of parameter ratios in Table 5. The ratios for the same network and model type are stable (i.e., comparing RL to RL-LS and NRL to NRL-LS for the dynamic network, and comparing RL-AVG Peak to RL-AVG Day & Peak and NRL-AVG Peak to NRL-AVG Day & Peak for the static network). However, when comparing the recursive logit to the nested recursive logit for the dynamic network, the ratios are less stable, for example, 1.30 compared to 1.14 for RL and NRL (TT informed/TT regular). Moreover, the ratios are very different when making this same comparison across models for the static network, for example, 2.34 compared to 1.37 for RL-AVG Peak and NRL-AVG Peak (TT informed/TT regular).
These results are consistent with other studies that empirically show that modelling correlated utilities is important for the interpretation of the parameter estimates.
The results also show that the ratios are not stable across the different networks, even when comparing the same model type. For example, if we compare the ratios for NRL dynamic to NRL AVG Peak and NRL AVG Day & Peak, they are very different for the three different ratios: some are larger and some are smaller. We recall that the travel times in our network vary substantially over time (see Fig. 2). As illustrated with an example in Section 6, the parameters in the static network may absorb the errors introduced when averaging the travel time. This is a possible explanation for these large differences in the ratios across networks. We note that the same observation holds for parameter estimates that are not related to travel time (see the ratio between habitual route and VMS parameter estimates in the last column).
In light of these results, the behavioural interpretation of parameter estimates should be performed on the NRL dynamic model (third column in Table 3), as it is the simplest model (compared to NRL-LS dynamic) with the best in-sample fit based on the most accurate measurements of the travel time attribute. The results show that there is only a small difference in the perception of travel time between informed and regular travellers (−0.33 and −0.29, a difference that is not significant). From interviews, we found that travellers appreciate having information about a route, but rather than leading them to adjust their route, it increases their comfort of 'knowing potential delays'. This is a possible explanation for this result. It is important to note that had we analyzed other models, the interpretation of these parameters would have been different and the informed travellers would have come out as significantly more sensitive to travel time than regular ones. Finally, we note from the NRL dynamic results that travellers favour habitual routes and avoid crossings. Moreover, they favour the arcs with variable message signs. This can mean that they favour those because of the information they receive, or because the signs are strategically placed.
Before analysing the prediction results in the following section, we make a few remarks regarding computing times. For all the models, the main computational cost is associated with the value functions. These are required to compute arc choice probabilities. In the case of prediction, the value functions need to be computed each time the attributes of the network change. In the case of the nested fixed point (NFXP) estimator, they are part of an inner loop and are hence solved a very large number of times. For this application, the RL value functions take on average 0.2 s to solve per destination.
The destination-specific NRL value functions take on average approximately 20 times longer to compute (4 s). There are several reasons for this substantial increase. First, the NRL value functions are solutions to non-linear systems of equations instead of linear systems (RL case) and are solved by a value iteration method. Second, the decomposition method proposed by Mai et al. (2018), which allows the value functions to be solved once for all destinations (at each evaluation of the log-likelihood function), cannot be used for the NRL model. The estimation time depends on the sample size and the difficulty of the non-linear optimization problem. To give orders of magnitude for our application, the dynamic NRL models took approximately 250 h to estimate while the static RL model took less than an hour.
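To make the difference with the linear RL case concrete, the following sketch applies a plain value iteration to the nested recursive logit recursion $V^d(s) = \mu_s \ln \sum_{a} \exp\big((v(a \mid s) + V^d(a))/\mu_s\big)$ of Mai et al. (2015). The toy network, the utilities, the scale values and the stopping rule are hypothetical, and the sketch assumes that every non-destination state has at least one outgoing arc. It is meant as an illustration of the technique, not as the implementation used for the results above.

```python
import math

def nrl_value_iteration(utilities, scales, dest, tol=1e-10, max_iter=1000):
    """Value iteration for nested recursive logit value functions.

    utilities: dict {state: {next_state: deterministic utility v}}
    scales: dict {state: positive scale parameter mu_state}
    dest: absorbing destination state, with value function 0
    """
    states = set(utilities) | {dest}
    V = {s: 0.0 for s in states}
    for _ in range(max_iter):
        new_V = {}
        for s in states:
            if s == dest:
                new_V[s] = 0.0
                continue
            mu = scales[s]
            # State-specific logsum: the scale appears inside and outside
            # the logarithm, which makes the system non-linear.
            new_V[s] = mu * math.log(sum(
                math.exp((v + V[s2]) / mu)
                for s2, v in utilities[s].items()
            ))
        if max(abs(new_V[s] - V[s]) for s in states) < tol:
            return new_V
        V = new_V
    raise RuntimeError("value iteration did not converge")

# Hypothetical time-expanded network: states are (link, time) pairs.
dest = ("d", "T")
utilities = {
    ("o", 0): {("a", 1): -1.0, ("b", 2): -2.0},
    ("a", 1): {dest: -1.0},
    ("b", 2): {dest: -0.5},
}
nested_scales = {("o", 0): 0.5, ("a", 1): 1.0, ("b", 2): 1.0}
unit_scales = {s: 1.0 for s in utilities}

V_nested = nrl_value_iteration(utilities, nested_scales, dest)
V_unit = nrl_value_iteration(utilities, unit_scales, dest)
```

When all scales are equal to one, the recursion collapses to the recursive logit logsum, which is why the RL model can be handled with a linear system while the NRL model cannot.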

Prediction performance
We present a cross-validation study where we repeatedly and randomly divide the full set of observations into two parts: one for estimation (450 observations) and one for validation (113 observations). We use the predicted log-likelihood as loss function and average it over validation sets, so lower values mean better out-of-sample fit. Fig. 4 shows the average predicted log-likelihood values over the validation sets. There are 20 validation sets in total and, as expected, the average predicted log-likelihood values converge as the number of validation sets increases. The results indicate that the two dynamic NRL models have the best out-of-sample fit. Unlike the findings in previous studies, and in accordance with the in-sample fit results, the link size attribute does not contribute to a major improvement of the results. This may be due to the cycle-free structure of the dynamic network, which is very different from the static network structure. This is good news since the models with link size are far more costly to estimate and to apply because the utilities and the value functions are origin-destination specific.
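The cross-validation procedure can be sketched as follows. The sketch assumes that the loss is the negative mean predicted log-likelihood on each validation set, which is one convention consistent with lower values meaning better fit; the sign convention, the toy route-share model and the synthetic observations are assumptions made for illustration. Only the split sizes (450 and 113) and the number of validation sets (20) are taken from the text.

```python
import math
import random

def cross_validate(observations, estimate, log_prob,
                   n_estimation, n_splits, seed=0):
    """Repeated random splits; returns the running average of the loss."""
    rng = random.Random(seed)
    losses, running_average = [], []
    for _ in range(n_splits):
        shuffled = observations[:]
        rng.shuffle(shuffled)
        estimation_set = shuffled[:n_estimation]
        validation_set = shuffled[n_estimation:]
        params = estimate(estimation_set)
        # Assumed loss: negative mean predicted log-likelihood.
        loss = -sum(log_prob(params, obs) for obs in validation_set)
        loss /= len(validation_set)
        losses.append(loss)
        running_average.append(sum(losses) / len(losses))
    return running_average

def estimate_shares(sample, labels=("r1", "r2", "r3")):
    """Toy stand-in for model estimation: smoothed route shares."""
    counts = {label: 1.0 for label in labels}  # add-one smoothing
    for obs in sample:
        counts[obs] += 1.0
    total = sum(counts.values())
    return {label: c / total for label, c in counts.items()}

def share_log_prob(shares, obs):
    """Log-probability of an observed route under the toy model."""
    return math.log(shares[obs])

# Synthetic sample of 563 observed routes (hypothetical data).
observations = ["r1"] * 300 + ["r2"] * 200 + ["r3"] * 63
avg_loss = cross_validate(observations, estimate_shares, share_log_prob,
                          n_estimation=450, n_splits=20)
```

Plotting `avg_loss` against the number of validation sets gives the kind of convergence curve described above; in an actual application, `estimate` and `log_prob` would be replaced by the estimation and path probability routines of the route choice model.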
Two key findings emerge from Fig. 4. First, modeling correlation is key to prediction performance. Independently of the network, the NRL models outperform the RL ones. Second, while the dynamic NRL model has the best performance, the static counterparts are quite close. A possible explanation for this was given in the illustrative example of Section 6. Indeed, the maximum likelihood estimation finds parameters such that the probability of observed trajectories is maximized. We have seen in the previous section that this can lead to very different parameter estimates. The prediction results here show that the predictions can still be almost as good as those obtained with a dynamic representation. In the case of our application, this leads to the conclusion that if prediction is the sole interest, then a static representation may be best when weighing in computational requirements. Indeed, static models need less data and are faster both to estimate and to use for prediction.

Conclusion
This paper presented an innovative data collection effort whose strength lies in the joint collection of network data (link travel times), route choice observations (GPS traces) and travel diaries providing information on the use of travel information. The paper presented a descriptive analysis of the combined data source as well as an extensive route choice analysis where we compared results from dynamic and static models. On the one hand, we argued that it is important to use the most accurate measures of travel times when analyzing parameter estimates. Accordingly, if the travel times vary over time, then a dynamic network representation should be used. Otherwise, the parameter estimates might absorb the errors in the static travel time approximation and lead to erroneous behavioural interpretations. This argument was illustrated with an example and supported by the empirical results. The latter also show the importance of modelling correlation for the interpretation of parameter estimates, in line with findings in the literature. On the other hand, also in line with the literature, the results showed that modelling correlation is crucial for prediction performance. For a given model type, a static model achieves almost as good performance as a dynamic model. Therefore, for this data set, if the sole purpose is prediction and these predictions need to be obtained at high speed, then one can favour a static model. Regarding travel information, the results showed that there was no significant difference in the perception of travel time between informed and uninformed travellers when correlation and network travel time were correctly modelled.
The results presented in this paper highlight the importance of collecting information on network traffic conditions in addition to observed trajectories when analysing route choice behaviour. While the observations we made regarding differences between models defined over static and dynamic networks are general, the actual results are specific to our dataset. Therefore, we cannot generalize the conclusion that one can favour a static model over a dynamic one for the purpose of prediction. We encourage more data collection efforts with a setup similar to ours to allow a thorough comparison between different network representations.