Urban link travel time estimation using traffic states-based data fusion

Estimated travel time is a key input for many intelligent transport systems (ITS) applications and traffic management functions. Numerous studies show that fusing data from different sources, such as global positioning system (GPS), Bluetooth, mobile phone network (MPN), and inductive loop detector (ILD), can result in more accurate travel time estimation. However, to date, there has been little research investigating the contribution of individual data sources to the quality of the final estimate or how this varies according to source-specific data quality under different traffic states. Here, three different data sources, namely bus-based GPS (bGPS) data, ILD data, and MPN data, of varying quality are combined using three different data fusion techniques of varying complexity. In order to quantify the accuracy of travel time estimation, travel times calculated using automatic number plate recognition (ANPR) data are used as the ‘ground truth’. The final results indicate that fusing multiple data sources together does not necessarily enhance the accuracy of travel time estimation. The results also show that even in dense urban areas, bGPS data, when combined with ILD data, can provide reasonable travel time estimates of the general traffic stream under different traffic states.


Introduction
Travel time estimation is the basis for many intelligent transport systems (ITS) applications and traffic management functions [1]. Therefore, the accurate and reliable estimation of travel time is an area of active research [2]. In recent years, the availability of a range of new data sources has led to the application of sensor fusion techniques which aim to improve the quality of travel time estimates by combining data from multiple sensor sources. However, to date, there has been little research investigating the contribution of individual sources to the quality of the final estimate or how this varies according to the characteristics and quality of the specific sensor sources under different traffic states.
Understanding the trade-offs that exist between source-specific data characteristics and fusion complexity is of considerable practical importance, especially as such sources proliferate and practitioners must make tough decisions regarding how to spend limited budgets on procuring and analysing data. This paper aims to explore the effect on the final travel time estimation accuracy of combining data sources with different characteristics and quality using a range of different sensor fusion approaches, of varying complexity. The focus of the work is on urban road networks, using data from London as a case study.
The rest of this paper is organised as follows. After a background section where existing data sources and data fusion techniques for travel time estimation on urban roads are reviewed, an expectation maximisation (EM) algorithm is introduced to cluster the different traffic states, i.e. congested state and uncongested state. In addition, artificial neural networks (ANNs) and the weighted mean approach (WMA), selected as representatives of widely used machine learning and statistical techniques, respectively, and a hybrid method, the combination of ANNs and WMA, are presented. The fourth section describes the four data sources used in the study, namely mobile phone network (MPN), bus-based global positioning system (bGPS), inductive loop detector (ILD), and automatic number plate recognition (ANPR), where travel times estimated using ANPR data are considered as the 'ground truth'. The accuracy of data fusion estimation is measured separately under different traffic-state regimes in the fifth section, followed by a discussion on the data source inputs and the merits and limitations of data fusion techniques.

Background
A number of data sources have been used for travel time estimation, such as the probe vehicle location data extracted from global navigation satellite systems [2], moving car observer data [3], mobile phone data [4], ANPR system data and flow and occupancy data from ILD [5]. Recently, some researchers have attempted to explore the application of bGPS data for travel time estimation [6]. bGPS data have the advantages of good spatial coverage and low unit cost, and are hence potential inputs for travel time estimation. However, every data source has inherent biases and limitations, such as low polling frequency for ILD, the low penetration and map-matching accuracy of probe vehicle techniques, and the false displacement problem for the MPN. As with other single data sources, bGPS data have some drawbacks, such as the small sample size and biased data; buses sometimes use exclusive bus lanes and travel faster than general traffic, but incur additional delays at bus stops. One key issue of the use of bGPS data for travel time estimation is that bGPS data represent the specific subpopulation, i.e. buses, of the general traffic. These source-specific errors can have an impact on the accuracy of travel time estimation. In order to overcome these drawbacks, one feasible solution is to fuse bus data with data from other sensors.
Multi-sensor data fusion techniques combine data from heterogeneous sensors, aiming to compensate for the shortcomings of individual sensor sources and therefore to increase the confidence, robustness, and spatial coverage of the inputs to estimation [7]. Combining multiple sensor data has several potential advantages over using a single source of data. First, different types of sensors confirming the same output can increase confidence and reduce ambiguity. Second, the same traffic states are recorded by different sensors in the form of different variables, and these independent observations can enhance the reliability of measurements. Furthermore, mutual complementarity can be achieved by fusing multiple data sources with different spatial and temporal coverage, which increases the robustness as well as the spatial and temporal range of travel time estimation [8]. For example, MPN has a high market penetration and thus has the advantage of wide spatial coverage. This wide spatial coverage can help to estimate the travel time on corridors without sensors such as ILD. Conversely, ILD can provide accurate traffic flow information to compensate for the low accuracy of travel time estimates based on MPN data. Some research studies have already shown that fusing two data sources by multi-sensor fusion techniques can produce more accurate travel time estimates [9]. In addition, the accuracy of estimation models with different input structures varies under normal and abnormal traffic conditions [10]. However, to date, there has been little research investigating the contribution of individual sources to the quality of the final estimate or how this varies according to source-specific data quality under different traffic states. In fact, it is reasonable to assume that the accuracy of individual data sources will influence the accuracy of the fused travel time estimates.
In this context, this paper focuses on the question: is more always better? If a number of data sources are available, should one use all available data sources or should one be selective? Moreover, how should the data inputs be selected under different traffic states?

Methodology
Three different data sources (MPN, bGPS, and ILD) of varying characteristics and quality are combined using three different data fusion techniques of varying complexity under different traffic states. All data sources and fusion methods are representative of current practice. To determine the performance of data fusion under different traffic-state regimes, traffic data need to be classified into congested and uncongested traffic states. The EM algorithm, which has been shown to be a satisfactorily general and transferable probabilistic traffic state classifier [11], is used to cluster the congested and uncongested regimes. In addition to a detailed formulation of the proposed EM algorithm, this section also introduces the data fusion approaches used in this study.

Expectation maximisation
The traffic state is related to the relationship between flow and occupancy [11], which is summarised for the ith observation by a variable $\alpha_i$ computed from $q_i$ and $o_i$, where $q_i$ and $o_i$ are the flow and occupancy of the ith observation measured by ILD, respectively. The traffic state of the ith observation is represented by $Z_i$, with $Z_i \in \{0, 1\}$.
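Two assumptions are made in this paper: firstly, the traffic states are clustered into two separable regimes, either uncongested ($Z_i = 0$) or congested ($Z_i = 1$); secondly, the probability distribution of $\alpha$ within each traffic state is Gaussian, with

$$p(\alpha \mid Z = 0) = N(\mu_0, \sigma_0^2) \qquad (2)$$

$$p(\alpha \mid Z = 1) = N(\mu_1, \sigma_1^2) \qquad (3)$$

where $N(\mu, \sigma^2)$ is the Gaussian distribution with mean $\mu$ and variance $\sigma^2$, and $p(\alpha \mid Z)$ is the probability density function of $\alpha$ given a traffic state. So

$$p(\alpha) = \gamma_0 N(\mu_0, \sigma_0^2) + \gamma_1 N(\mu_1, \sigma_1^2) \qquad (4)$$

where $\gamma_0 = P(Z_i = 0)$ and $\gamma_1 = P(Z_i = 1)$ are the mixture factors, with the constraint $\gamma_0 + \gamma_1 = 1$. Equation (4) is a typical Gaussian mixture model (GMM) with unknown parameters $\Theta = (\gamma_0, \gamma_1, \mu_0, \sigma_0^2, \mu_1, \sigma_1^2)$. So, the probabilistic model is defined as:

$$p(\alpha \mid \Theta) = \sum_{k=0}^{1} \gamma_k \, p_k(\alpha \mid \theta_k) \qquad (5)$$

Each $p_k$ is a Gaussian distribution function parameterised by $\theta_k$, where $\theta_k = (\mu_k, \sigma_k^2)$ and $k \in \{0, 1\}$. With the Gaussian distributions of these two traffic states, the problem becomes that of finding the probability $P(Z_i = 1 \mid \alpha = \alpha_i)$, and from Bayes' theorem:

$$P(Z_i = 1 \mid \alpha = \alpha_i) = \frac{p(\alpha_i \mid Z_i = 1) \, P(Z_i = 1)}{p(\alpha_i)} \qquad (6)$$

With (5), (6) becomes:

$$P(Z_i = 1 \mid \alpha = \alpha_i) = \frac{\gamma_1 \, p_1(\alpha_i \mid \theta_1)}{\sum_{k=0}^{1} \gamma_k \, p_k(\alpha_i \mid \theta_k)} \qquad (7)$$

In (7), only the parameters $\Theta$ of the GMM are unknown, and these parameters can be calculated by maximum likelihood estimation with

$$\mathcal{L}(\Theta) = \prod_{i} p(\alpha_i \mid \Theta) \qquad (8)$$

According to maximum likelihood estimation theory, the parameters are the ones that maximise $\mathcal{L}$, i.e.

$$\hat{\Theta} = \arg\max_{\Theta} \mathcal{L}(\Theta) \qquad (9)$$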
So, the problem is reduced to finding the parameters that statistically cluster the two traffic states. The EM algorithm is used to find maximum likelihood estimates of parameters in statistical models. It iterates between an expectation (E) step, which computes the expectation of the log-likelihood given the current parameter estimates, and a maximisation (M) step, which finds the parameters that maximise the expected log-likelihood obtained in the E step. The E step and M step are formulated as shown in Fig. 1.
Then, $\gamma_k$, $\mu_k$, and $\sigma_k$ are the estimates of the parameters $\Theta$ of the assumed two-state Gaussian mixture distribution. They perform both the expectation step and the maximisation step simultaneously. To deal with corrupted ILD data and extreme traffic conditions, the error handling module suggested by Han et al. is used in this model [11]: if the value of the ith observation $\alpha_i$ is smaller than $\mu_0 - 3\sigma_0$ (low occupancy but high flow), then the observation is assigned to state 0 (uncongested).
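The E and M steps for a two-state Gaussian mixture can be illustrated with a minimal sketch. This is not the formulation of Fig. 1, which is not reproduced here, but the textbook EM update for a one-dimensional GMM. The function names, the median-based initialisation, the 0.5 posterior threshold, and the convention that state 0 is the component with the lower mean are all assumptions made for illustration.

```python
import numpy as np


def gaussian_pdf(x, mu, var):
    """Density of N(mu, var) evaluated at x."""
    return np.exp(-0.5 * (x - mu) ** 2 / var) / np.sqrt(2.0 * np.pi * var)


def posterior(alpha, gamma, mu, var):
    """Return P(Z_i = k | alpha_i) for k = 0, 1 and the mixture density."""
    weighted = gamma * gaussian_pdf(alpha[:, None], mu, var)
    total = weighted.sum(axis=1, keepdims=True)
    return weighted / total, total


def fit_two_state_gmm(alpha, n_iter=500, tol=1e-9):
    """Estimate (gamma, mu, var) of a two-component 1-D GMM by EM."""
    alpha = np.asarray(alpha, dtype=float)
    # Assumed initialisation: split the sample at its median.
    lower = alpha <= np.median(alpha)
    mu = np.array([alpha[lower].mean(), alpha[~lower].mean()])
    var = np.full(2, alpha.var())
    gamma = np.array([0.5, 0.5])
    prev_ll = -np.inf
    for _ in range(n_iter):
        # E step: posterior state probabilities under current parameters.
        resp, total = posterior(alpha, gamma, mu, var)
        # M step: update mixture factors, means and variances.
        nk = resp.sum(axis=0)
        gamma = nk / alpha.size
        mu = (resp * alpha[:, None]).sum(axis=0) / nk
        var = (resp * (alpha[:, None] - mu) ** 2).sum(axis=0) / nk
        # Stop when the log-likelihood no longer changes.
        ll = float(np.log(total).sum())
        if abs(ll - prev_ll) < tol:
            break
        prev_ll = ll
    return gamma, mu, var


def classify_states(alpha, gamma, mu, var, threshold=0.5):
    """Label each observation 0 (uncongested) or 1 (congested)."""
    alpha = np.asarray(alpha, dtype=float)
    resp, _ = posterior(alpha, gamma, mu, var)
    z = (resp[:, 1] > threshold).astype(int)
    # Error handling rule from the text: far below state 0 means uncongested.
    z[alpha < mu[0] - 3.0 * np.sqrt(var[0])] = 0
    return z
```

In practice, `fit_two_state_gmm` would be applied to the $\alpha_i$ values of one link, and `classify_states` would then label each 5 min observation.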

Artificial neural networks
ANNs are a family of machine learning methods inspired by the structure of biological neural networks, generally presented as a system of connected neurons arranged in multiple layers of processing units. They have the advantage of being able to deal with complex linear and non-linear problems in which the precise interrelationships among elements are not well understood or defined. ANN techniques have been widely used in the sensor fusion literature, both within transport and more widely [12].
Various ANN topologies have been applied to estimate travel time, such as fuzzy neural networks, probabilistic networks, feedforward networks, recurrent neural networks (RNNs), and counter propagation neural networks [13]. Among these topologies, RNN models are dynamic networks with internal feedback that enables the learning of complex temporal patterns. RNNs have been shown to be well suited to the analysis of time series data with seasonal or temporal patterns, and a number of researchers have used RNN techniques for travel time estimation [12,14,15]. RNNs are thus selected as a representative of the wider class of machine learning techniques used for sensor fusion in the context of travel time estimation. The parameters used in the RNNs, such as the hidden layer size, the input delay, and the feedback delay, are optimised for each set of data inputs.

Weighted mean approach
The WMA is a simple and widely used statistical technique for sensor fusion in which specific weights can be assigned to the various data sources [16]. These weights are calculated to reflect the reliability of each data source, so that more reliable sources have more influence on the final fused estimate. In this paper, three weighting schemes are compared. The mean absolute percentage error (MAPE) and mean square error (MSE) are commonly used metrics to quantify estimation accuracy. Thus, two weight schemes are calculated from the inverse of MAPE and the inverse of MSE, respectively. The third weighting scheme is that suggested by Choi and Chung [9], which incorporates sample size information, given in (12). The three weight schemes are summarised mathematically below.

Weight scheme 1 (w1): the inverse of MAPE

$$W_j = \frac{1/\mathrm{MAPE}_j}{\sum_{m=1}^{N} 1/\mathrm{MAPE}_m} \qquad (10)$$

Weight scheme 2 (w2): the inverse of the MSE

$$W_j = \frac{1/\mathrm{MSE}_j}{\sum_{m=1}^{N} 1/\mathrm{MSE}_m} \qquad (11)$$

Weight scheme 3 (w3): sample size divided by the square of the standard deviation

$$W_j = \frac{n_j / s_j^2}{\sum_{m=1}^{N} n_m / s_m^2} \qquad (12)$$

where $s_j$ is the standard deviation and $n_j$ the sample size of the jth source. The sample standard deviation $s_j$ can be calculated by

$$s_j = \sqrt{\frac{1}{n - 1} \sum_{i=1}^{n} (x_i - \bar{x})^2} \qquad (13)$$

where $n$ is the sample size; $x_i$ the ith sample value; $\bar{x}$ the mean of the sample values; $N$ the number of data sources; $W_j$ the weight of the jth data source, and $\sum_{j=1}^{N} W_j = 1$.
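As an illustration, the three weight schemes and the resulting weighted mean can be sketched as follows. The function names and the array layout (one row per data source) are assumptions made for this example; the normalisation follows from the constraint that the weights sum to one.

```python
import numpy as np


def normalise(raw):
    """Scale non-negative raw weights so that they sum to one."""
    raw = np.asarray(raw, dtype=float)
    return raw / raw.sum()


def inverse_mape_weights(observed, estimates):
    """Scheme w1. estimates has shape (n_sources, n_samples)."""
    observed = np.asarray(observed, dtype=float)
    estimates = np.asarray(estimates, dtype=float)
    mape = np.mean(np.abs((estimates - observed) / observed), axis=1)
    return normalise(1.0 / mape)


def inverse_mse_weights(observed, estimates):
    """Scheme w2. estimates has shape (n_sources, n_samples)."""
    observed = np.asarray(observed, dtype=float)
    estimates = np.asarray(estimates, dtype=float)
    mse = np.mean((estimates - observed) ** 2, axis=1)
    return normalise(1.0 / mse)


def sample_size_weights(sample_sizes, sample_stds):
    """Scheme w3: sample size divided by the squared standard deviation."""
    n = np.asarray(sample_sizes, dtype=float)
    s = np.asarray(sample_stds, dtype=float)
    return normalise(n / s ** 2)


def fuse(weights, estimates):
    """Weighted mean of the per-source travel time estimates."""
    return np.asarray(weights, dtype=float) @ np.asarray(estimates, dtype=float)
```

For example, with observed travel times of 100 s and 200 s, one source reporting 110 s and 220 s (10% error) and another reporting 80 s and 160 s (20% error), scheme w1 assigns weights of 2/3 and 1/3, and the fused estimates are 100 s and 200 s.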

Hybrid method
In addition to using ANNs and WMA individually, a hybrid method based on combining these two approaches is also used. WMA is constrained to fusing independent variables of the same type as the dependent variable. For example, given data from bGPS, MPN, and ILD, only the bGPS and MPN data can be used to estimate travel time by WMA, because both data sets provide a travel time variable, whereas ILD data, which consist of the traffic flow variable, cannot be used directly without first converting traffic flow into travel time. ANNs are effective machine learning tools for establishing the relationship between different independent variables and dependent variables.
With the contribution of ANNs, all traffic variables can first be converted into travel time. WMA then assigns larger weights to more accurate data sources and smaller weights to less accurate ones. Fusing the outputs of ANNs by WMA tends to be more powerful than either method alone. The general process can be summarised in three phases: firstly, sampling and cleaning the raw traffic data sources; secondly, estimating travel time using ANNs; finally, fusing the estimated travel times output by the ANNs using WMA.
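The second and third phases can be sketched as follows. Because the ANN configuration is not specified in enough detail to reproduce here, a straight-line least-squares fit stands in for the ANN; it is not the RNN used in this paper. The function names, the use of one stage-one model per data source, and the choice of the inverse-MAPE weight scheme are assumptions made for illustration.

```python
import numpy as np


def fit_stage_one(x_train, y_train):
    """Stand-in for the ANN: straight-line fit from one source to travel time."""
    slope, intercept = np.polyfit(x_train, y_train, deg=1)
    return lambda x: slope * np.asarray(x, dtype=float) + intercept


def hybrid_estimate(sources_train, y_train, sources_test):
    """Fuse per-source travel time estimates with inverse-MAPE weights.

    sources_train and sources_test are lists of 1-D arrays, one per source.
    Returns the fused test estimates and the weights used.
    """
    y_train = np.asarray(y_train, dtype=float)
    # Phase 2: convert every source variable into a travel time estimate.
    models = [fit_stage_one(x, y_train) for x in sources_train]
    train_pred = np.array([m(x) for m, x in zip(models, sources_train)])
    # Phase 3: weights from the training error of each stage-one model.
    mape = np.mean(np.abs((train_pred - y_train) / y_train), axis=1)
    weights = (1.0 / mape) / np.sum(1.0 / mape)
    test_pred = np.array([m(x) for m, x in zip(models, sources_test)])
    return weights @ test_pred, weights
```

The design point this illustrates is that the first stage removes the WMA restriction to travel time inputs: a flow series can enter the fusion once a model has mapped it to travel time.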

Quantification of estimation accuracy
The accuracy of the estimation is measured using four metrics: mean percentage error (MPE), MAPE, root mean squared error (RMSE), and root mean squared percentage error (RMSPE).
The MPE is the average of the percentage errors and indicates whether the estimation is biased. MAPE measures the average magnitude of the percentage errors by taking their absolute values, without considering their direction. Apart from MPE and MAPE, which weight all individual differences equally, a more common measure is RMSE. Since the errors are squared before averaging, RMSE assigns a relatively high weight to large errors. Compared to the MAPE, RMSE amplifies large errors. RMSPE has the same properties as the RMSE, but is expressed as a percentage [17]. Every performance criterion has its advantages and limitations, and using all the criteria results in a more comprehensive evaluation of accuracy. In the definitions of these four metrics, $N$ is the total number of time intervals, $x_n$ the nth observed travel time during the evaluation time period, and $\hat{x}_n$ the nth estimated travel time during the evaluation time period.
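The four metrics can be computed as in the following sketch. The function name is hypothetical, and the sign convention for MPE (estimated minus observed, so that a positive value indicates overestimation) is an assumption; the other three metrics do not depend on the sign.

```python
import numpy as np


def accuracy_metrics(observed, estimated):
    """Return MPE, MAPE and RMSPE in per cent, and RMSE in the input units."""
    x = np.asarray(observed, dtype=float)
    x_hat = np.asarray(estimated, dtype=float)
    err = x_hat - x        # assumed sign: positive means overestimation
    rel = err / x          # relative error of each interval
    return {
        "MPE": 100.0 * rel.mean(),
        "MAPE": 100.0 * np.abs(rel).mean(),
        "RMSE": float(np.sqrt(np.mean(err ** 2))),
        "RMSPE": 100.0 * float(np.sqrt(np.mean(rel ** 2))),
    }
```

With observed travel times of 100 s and 200 s and estimates of 110 s and 180 s, the relative errors are +10% and -10%, so the MPE is 0 while the MAPE and RMSPE are both 10%. This shows why MPE alone cannot describe accuracy.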

ANPR data
Link travel time data used as the 'ground truth' in this study are based on ANPR camera data, which are obtained from Transport for London's (TfL) London Congestion Analysis Project (LCAP). A pair of ANPR cameras located at the start and the end of the link records the vehicle registration number and the time stamp of passing vehicles, while an external system calculates the travel time from the corresponding arrival and departure times. The ANPR data are cleaned by TfL using the overtaking rule method [18].

ILD data
ILDs are widely used for providing inputs to the SCOOT traffic control system [19]. They report vehicle presence or absence (0/1 values) sampled at 4 Hz at a fixed location. Traffic variables such as flow and occupancy can be calculated from the reported data.
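One way to derive flow and occupancy from the 4 Hz presence samples is sketched below. This is an illustrative calculation, not the processing used by SCOOT or TfL: occupancy is taken as the fraction of samples in which a vehicle is present, and flow as the number of 0 to 1 transitions, each counted as one vehicle arrival. The function name and the default 300 s interval are assumptions.

```python
import numpy as np

SAMPLE_RATE_HZ = 4  # sampling rate stated in the text


def flow_and_occupancy(presence, interval_s=300):
    """Aggregate 0/1 presence samples into per-interval flow and occupancy.

    Returns (flow, occupancy): vehicles counted per interval and the
    fraction of each interval during which the detector was occupied.
    """
    presence = np.asarray(presence, dtype=int)
    per_interval = SAMPLE_RATE_HZ * interval_s
    n_intervals = presence.size // per_interval
    # Drop any incomplete interval at the end of the record.
    trimmed = presence[: n_intervals * per_interval]
    # A vehicle arrival is a transition from 0 (absent) to 1 (present).
    arrivals = np.diff(trimmed, prepend=0) == 1
    occupancy = trimmed.reshape(n_intervals, per_interval).mean(axis=1)
    flow = arrivals.reshape(n_intervals, per_interval).sum(axis=1)
    return flow, occupancy
```

A vehicle that straddles an interval boundary is counted in the interval in which it arrives, while its occupancy is shared between the two intervals.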
There are 20 ILDs on the LCAP link 2509 and 21 ILDs on the LCAP link 2511. In order to guarantee the initial quality of the data, the daily statistics algorithm (DSA) test was applied to examine the working state of the detectors. The DSA test was first introduced to detect errors in single-loop detectors [20]. It was subsequently shown to be an effective method for cleaning ILD data, and the DSA with the same failure criteria as used in Robinson's paper was applied to verify the working state of the ILDs [21]. According to the results of the DSA cleaning, one detector on LCAP 2511 failed to meet the requirements of the DSA test and was therefore excluded from the analysis. ILD data in 5 min intervals from 7:00 to 19 […]

[…] is better than that of LCAP 2511. These differences are statistically significant at the 1% level using a suitable non-parametric (Friedman) test. A number of reasons may account for these differences, such as low sampling rates, spatial imprecision in positioning leading to errors in map matching, and errors in modal discrimination.

bGPS data
The bGPS data are extracted from TfL's GPS-enhanced automated vehicle location system, which is used for bus fleet management, traveller information provision, and bus priority at traffic signals.
The iBus system operates on over 8500 buses, providing real-time GPS location and signage throughout London [22]. bGPS data are extracted in 5 min intervals from 7:00 to 19:55 every weekday from 17 to 27 February 2015; thus, we have 156 samples per day and 1404 samples in total, the same as for the ILD and ANPR data.
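The sample counts follow directly from the sampling scheme, as the short check below shows. The helper names are hypothetical, and the count assumes that every weekday in the period is included, with no days removed for holidays or missing data.

```python
from datetime import date, datetime, timedelta


def count_weekdays(start, end):
    """Number of Monday-to-Friday dates from start to end, inclusive."""
    n_days = (end - start).days + 1
    return sum((start + timedelta(days=d)).weekday() < 5 for d in range(n_days))


def intervals_per_day(first, last, step_minutes):
    """Number of interval start times from first to last, inclusive."""
    t0 = datetime.strptime(first, "%H:%M")
    t1 = datetime.strptime(last, "%H:%M")
    return int((t1 - t0) / timedelta(minutes=step_minutes)) + 1


days = count_weekdays(date(2015, 2, 17), date(2015, 2, 27))  # 9 weekdays
per_day = intervals_per_day("07:00", "19:55", 5)             # 156 intervals
total = days * per_day                                       # 1404 samples
```

The period from Tuesday 17 to Friday 27 February 2015 contains 9 weekdays, and 7:00 to 19:55 spans 156 five-minute intervals, giving 9 × 156 = 1404 samples.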
The iBus system provides bGPS data with traffic variables such as the average bus speed on the route for traffic control and management. In addition, the iBus system provides information on the location of buses from which estimates of journey duration can be derived. The link travel time is estimated by adding up the travel time of segmented links, in a manner similar to the pre-processing of the MPN data. As one would expect, bus travel time is also significantly different at the 1% level using a suitable nonparametric (Friedman) test, and generally larger than the travel time of the general traffic stream as measured by ANPR.
Comparing the MPN and bGPS data, it is clear that both are substantially different from ANPR, reflecting both differences in the source technologies and differences between the behaviour of the general traffic stream and the subpopulations comprising bus and mobile phone users. Interestingly, there is a strong correlation between MPN and bGPS data on both links, which may reflect the fact that many of the mobile phones contributing to the MPN data source are in fact carried by bus passengers. It is also notable that the differences between the MPN and bGPS data and the ANPR ground truth data differ significantly between the two links. On LCAP 2509, travel times from MPN and bGPS generally overestimate the ANPR-based travel time, whereas, on LCAP 2511, the tendency is to underestimate the ANPR-based travel time.

Experimental results and analysis
This section presents the results of probabilistic classifier EM algorithms and the data fusion analysis using ANNs, WMA, and hybrid method. We divide the data sample into the training data set and test data set, accounting for 70 and 30%, respectively. The training data set is used to train ANNs or calculate the weights in WMA, while the test data set is applied to quantify the out-of-sample accuracy of travel time estimation of the calibrated method. The evaluated performance is based on out-of-sample data sets.

Probabilistic traffic states identification
The real traffic flow and occupancy data from detectors at 5 min intervals over 9 days on two LCAP links, i.e. LCAP 2509 and LCAP 2511, are used to identify the traffic states, i.e. the congested traffic state and the uncongested traffic state. Examples of flow-occupancy scatter plots for the two LCAP links are shown in Fig. 3. The ideal relationship between flow and occupancy is that the lower regime represents samples from uncongested states, while the upper regime represents samples from congested states [23].
From the flow-occupancy scatter plot, it can be seen that the EM algorithm effectively identifies the congested and uncongested traffic states. The results presented in the remainder of this section are based on the traffic states identified by the EM algorithm.

Data fusion using ANNs
The framework shown in Fig. 4 is used to apply the ANNs in this research.
For a single data source, the traffic flow from ILD data and the travel times from MPN data and bGPS data are each used as the input to estimate travel time by the ANNs method. Then different combinations of two and three data sources are fused together to estimate travel time. The final accuracy is quantified by MPE, MAPE, RMSE, and RMSPE.
The travel time results presented are from ANNs whose parameters were optimised by trial and error. Overall, the RNN gives accurate travel time estimates based on different data inputs on both LCAP 2509 and LCAP 2511. The precise accuracy is given in Table 1.
As can be seen in Table 1, in general, the ANNs provide accurate travel time estimates in terms of MAPE and RMSE. According to the MAPE criterion, the accuracy obtained using one data input exceeds that obtained using two or three data inputs. One possible reason is that the patterns of travel time from MPN and bGPS, as well as the variation of flows from detectors, are quite different and, at the same time, have large errors compared to the 'ground truth' from ANPR. Fusing them together using ANNs with a feedback loop may add extra noise to the estimated results. ANNs give more accurate travel time estimates in the uncongested traffic context.

Data fusion using WMA
To implement the WMA, the inputs must be the same traffic variable as the 'ground truth'. Travel time is available from the 'ground truth' ANPR data, the bGPS data, and the MPN data, whereas the ILD data provide traffic flow. Thus, only bGPS and MPN data are fused, using the three weight schemes. The framework is shown in Fig. 5.
The bGPS and MPN data are fused with different weights. The results are shown in Table 2. As can be seen in Table 2, the performance of the different weight schemes varies and depends on the statistical information in both the training and test data sets. In addition, fusing two different data sources by WMA does not necessarily produce better results than using a single data input. The performance of the ANNs is also clearly better than that of the WMA. The WMA probably fails to produce accurate travel time estimates because of dependencies between the bGPS and MPN data; these dependencies and correlations lead to similar traffic patterns in the two sources. Performance on LCAP 2511 is better than on LCAP 2509, as in the results from ANNs data fusion, which indicates that the quality of the data inputs has an impact on the performance of data fusion. The WMA estimation becomes less accurate during the congested traffic states.

Data fusion using hybrid method
The hybrid method is the combination of ANNs and WMA. This method estimates the travel time by ANNs, and then fuses these ANNs outputs by the WMA model to produce the final travel time estimates. The framework is shown in Fig. 6. The weights of the WMA are calculated according to the MAPE, MSE, and $s^2$ from the training data set of the ANNs. The results from LCAP 2509 and LCAP 2511 are shown in Tables 3-5, and some example plots are shown in Fig. 7.
Among the three weight schemes of the hybrid method, the inverse of MAPE gives the best estimates, and the main reason may be the consistent performance between the training data set and the test data set.
It is obvious that all of the estimates from the hybrid method are more accurate than those of ANNs and WMA. A possible reason is that the hybrid method gives more weight to the accurate data and less weight to the inaccurate data. Combining WMA and ANNs can take advantage of both methods to compensate for the distortion caused by time lag and therefore reduce the spacing error from ILD and MPN data. In addition, the underestimated and overestimated parts of the ANNs outputs can be offset by WMA to achieve more accurate results than those obtained using either ANNs or WMA alone. Data fusion techniques are therefore vital for accurate travel time estimation, and a more advanced data fusion technique can help to improve the estimation accuracy and reliability.
In contrast with WMA and ANNs, for which a single input is better than fusing two or three inputs, using the hybrid method to fuse multiple data sources can improve the accuracy of the estimates compared to using only one data input. Specifically, fusing bGPS data with ILD data can reduce the MAPE by one-third relative to using bGPS or ILD data alone, on LCAP 2509 as well as LCAP 2511, under both congested and uncongested traffic states.
The combination of bGPS and ILD data as inputs gives the most accurate result, superior to that obtained with three data sources. This can be ascribed to the correlations among different data sources; for example, the correlation between bGPS and MPN data can reduce the accuracy of travel time estimates. So, fusing more data does not necessarily improve the accuracy of travel time estimation. The data inputs for LCAP 2511 are of better quality than those for LCAP 2509, which leads to more accurate estimates for LCAP 2511 than for LCAP 2509. In addition, the estimates for uncongested traffic states are more accurate than those for congested traffic states for all three fusion methods. This indicates that the quality of the data inputs and the traffic state influence the outputs no matter which method is selected.

Conclusion
The results presented in this paper show that the final accuracy of travel time estimation depends on the reliability of individual data sources, the characteristics of the sensor fusion techniques used, and also the underlying traffic states. The hybrid method outperforms WMA and ANNs in fusing multiple data sources, and produces more accurate travel times. However, fusing more data sources does not necessarily improve the quality of the final estimation. The results show that fusing highly correlated data sources can lead to a worse result. The results also show that although bGPS data are inherently based on just a subpopulation of the general traffic stream, with markedly different behaviour to that of the general stream, reasonable estimates of general traffic stream travel time can be obtained when bGPS data are combined with ILD data from the general traffic stream.

Acknowledgments
This work was partially supported by the UK Engineering and Physical Sciences Research Council under awards EP/F005156/1 and EP/I038837/1. The data used in this paper were kindly provided by Transport for London (TfL). The authors are particularly grateful to Andy Emmons and Ashley Turner from TfL for their assistance and support. However, the analysis, results and conclusions presented in this paper are those of the authors alone and do not necessarily reflect the views or policy of TfL.