Car Travel Time Estimation near a Bus Stop with Non-motorized Vehicles

A real-time system for estimating vehicle travel time and traffic flow is an essential part of Intelligent Transportation Systems. In many Chinese cities, the interactions among buses, bicycles and cars complicate travel time prediction and traffic safety management. The aim of this paper is to develop a new model to estimate car travel time near bus stops in developing countries using data mining techniques and survival analysis methods. Travel time data under mixed traffic conditions were collected by video camera. Four influential factors, namely car volume, non-motorized vehicle volume, bus departure volume and free ratio of bus stop, were chosen using data mining techniques. A proportional hazard-based duration model is proposed to analyze the factors related to car travel time. The results indicate that mixed traffic flow significantly affects car travel time. In addition, the factors modify the travel time distribution to different degrees, and the model can be used to estimate travel time under assumed conditions. The findings are intended to help improve the planning and design of facilities that carry mixed traffic flow.


Introduction
In recent decades, many studies have been conducted on traffic safety on highways and at urban intersections [1][2][3]. As a developing country, China has its own traffic characteristics. A mix of non-motorized and motorized vehicles is an important traffic type in China. Surveys show that non-motorized vehicles, mainly bicycles, are among the most widely used modes of daily travel in China [4]. Although some researchers have focused on car-bicycle conflicts and injury accidents involving cyclists [5][6][7], little has been published on the conflicts among mixed traffic streams near bus stops. Meanwhile, with the development of new technology, Advanced Traffic Information Systems are widely used in traffic management and control. A real-time system for estimating vehicle travel time and traffic flow is an essential part of Advanced Traffic Information Systems. At many Chinese bus stops, the interactions among buses, bicycles and cars complicate travel time estimation and traffic safety management. Therefore, it is necessary to develop an approach for estimating travel time near bus stops with mixed traffic flow.
Typically, there are three types of bus stops in urban areas: curbside stops, bus bays, and bus boarders [8]. Curbside stops are the most common type in many Chinese cities. Figure 1 shows the mixed traffic streams at a typical curbside stop. The urban roadway has two lanes, a non-motorized lane and a motorized lane, and carries three types of traffic stream: bicycles, buses and cars. Bus stops are usually located on the non-motorized lane. When a bus dwells at the curbside stop, bicycles move into the motorized lane to go around the stopped bus. Thus, the presence of a stopped bus creates a temporary conflict between bicycles and cars, increasing vehicle travel time and the risk of traffic injury. Similar phenomena may be found in other Asian developing countries, for example India, Malaysia, Vietnam, and Cambodia.
Many studies have examined the effects of bus stops on traffic behavior. For example, Fitzpatrick and Nowlin used computer simulation to determine how bus stop design influences traffic operations around a bus stop [9]. Levinson and Jacques used field studies and simulation analyses to validate, update, and extend existing bus stop and berth capacity procedures [10-13]. Yang et al. established two models for car capacity near curbside stops with bicycles, based on gap acceptance theory and the additive-conflict-flows procedure, respectively [14][15]. They analyzed the effects of bus stops on vehicle speed and capacity under mixed traffic conditions; however, they did not study the effects of bus stops on vehicle travel time.
In this paper, we propose a hazard-based duration approach to analyze the car travel time distribution near bus stops under mixed traffic conditions. Hazard-based duration models have been used extensively in biometrics and reliability engineering for decades [16]. Duration models can be used to determine causality in duration data, and they are also useful tools in the field of transportation [17][18][19]. These models describe the duration of a certain state and how various factors affect that duration, which is why the duration method is chosen to analyze the travel time distribution under mixed traffic conditions. The empirical data are modeled by a proportional hazard function. The factors considered as influence variables include car volume, non-motorized vehicle volume, bus departure volume, and free ratio of bus stop. With these methods, the distribution of car travel time under various conditions is calculated and the influence of the selected variables is quantified.

Hazard-based duration model
Let $T$ be a nonnegative random variable representing the car travel time in a test road section. Let $f(t)$ denote the probability density function of $T$ and let the cumulative distribution function be
$$F(t) = P(T < t). \tag{1}$$
Let $S(t)$ denote the probability that the travel duration does not end prior to $t$:
$$S(t) = P(T \ge t) = 1 - F(t). \tag{2}$$
$S(t)$ is usually called the "survivor probability" or "endurance probability" in the duration literature. In this paper, we define $S(t)$ as the continuance probability in order to represent the probability that a car travels longer than $t$ (the travel duration still "continues" at $t$).
In the hazard-based duration approach, $T$ can be characterized by a hazard function, $\lambda(t)$. It represents the instantaneous probability that the travel duration will end in an infinitesimally small time period, $\Delta t$, after time $t$, given that the duration has not ended until time $t$. The mathematical definition of the hazard function in terms of probabilities is
$$\lambda(t) = \lim_{\Delta t \to 0} \frac{P(t \le T < t + \Delta t \mid T \ge t)}{\Delta t}. \tag{3}$$
The hazard function gives the conditional failure rate. In this study, the conditional failure rate is the conditional pass rate at which cars pass through a bus stop. That is, the hazard function is the instantaneous rate at which a car passes through a bus stop in an infinitesimally small time period, $\Delta t$, after time $t$, given that the car has not passed the stop until time $t$.
The value of the hazard function is called the hazard rate, or simply the hazard. Specifically, $f(t)$, $F(t)$, $S(t)$ and $\lambda(t)$ give mathematically equivalent specifications of the distribution of $T$. So the hazard function can also be defined in terms of $S(t)$:
$$\lambda(t) = \frac{f(t)}{S(t)} = -\frac{d \ln S(t)}{dt}. \tag{4}$$
Integrating Eq. (4) from zero to $t$ and using $S(0) = 1$ gives
$$S(t) = \exp\left(-\int_0^t \lambda(u)\,du\right). \tag{5}$$
Note that car travel time near a bus stop is influenced by various factors. A primary objective of this paper is to accommodate the effects of these influential factors. The influential factors can be defined as a vector of explanatory variables, $\mathbf{x}$. The proportional hazard (PH) form is then introduced, which specifies the effects of the explanatory variables to be multiplicative on a hazard function:
$$\lambda(t \mid \mathbf{x}) = \lambda_0(t)\, g(\mathbf{x}, \boldsymbol{\alpha}), \tag{6}$$
where $\lambda_0(t)$ is the baseline hazard function representing the hazard when the effects of the explanatory variables are neglected [i.e. $g(\mathbf{x}, \boldsymbol{\alpha}) = 1$], $g(\cdot)$ is a known function representing the effects of the explanatory variables, and $\boldsymbol{\alpha}$ is a vector of parameters for $\mathbf{x}$. In this paper, a typical specification with $g(\mathbf{x}, \boldsymbol{\alpha}) = \exp(\boldsymbol{\alpha}\mathbf{x})$, which was proposed by Cox [20], is used. This specification is convenient since it guarantees the positivity of the hazard function without placing constraints on the signs of the elements of $\boldsymbol{\alpha}$. The Cox proportional hazard model is
$$\lambda(t \mid \mathbf{x}) = \lambda_0(t) \exp(\boldsymbol{\alpha}\mathbf{x}). \tag{7}$$
The continuance probability function combining Eq. (5) and Eq. (7) can be written as
$$S(t \mid \mathbf{x}) = [S_0(t)]^{\exp(\boldsymbol{\alpha}\mathbf{x})}, \tag{8}$$
where $S_0(t)$ is the baseline continuance probability function.
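The link between the hazard, the continuance probability, and the proportional hazard scaling can be checked numerically. The Python sketch below integrates a hazard function to obtain the continuance probability and then raises the baseline curve to the power exp(alpha x). The Weibull baseline hazard, its parameters, and the coefficient and covariate values are hypothetical choices made only for illustration; the paper does not specify a parametric baseline.

```python
import numpy as np

def continuance_from_hazard(hazard, t_grid):
    """S(t) = exp(-integral of the hazard from 0 to t), using the trapezoidal rule."""
    lam = hazard(t_grid)
    increments = 0.5 * (lam[1:] + lam[:-1]) * np.diff(t_grid)
    cumulative_hazard = np.concatenate(([0.0], np.cumsum(increments)))
    return np.exp(-cumulative_hazard)

def ph_continuance(s0, alpha, x):
    """Proportional hazard scaling: S(t | x) = S0(t) ** exp(alpha . x)."""
    return s0 ** np.exp(np.dot(alpha, x))

# Hypothetical Weibull baseline hazard (illustrative only, not from the paper).
SHAPE, SCALE = 3.0, 16.0

def baseline_hazard(t):
    return (SHAPE / SCALE) * (t / SCALE) ** (SHAPE - 1.0)

t = np.linspace(0.0, 30.0, 3001)
s0 = continuance_from_hazard(baseline_hazard, t)

# Hypothetical coefficient and covariate: a negative alpha * x lowers the hazard,
# so the continuance probability is at least as large at every t.
s_slow = ph_continuance(s0, alpha=[-0.04], x=[10.0])
```

For this assumed baseline the numerical curve can be compared with the closed form exp(-(t/SCALE)^SHAPE), which is a convenient sanity check on the integration step.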

Model estimation
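The model in Eq. (8) has two components, $\boldsymbol{\alpha}$ and $\lambda_0(t)$. Cox [20] introduced an ingenious way of estimating $\boldsymbol{\alpha}$, now known as the partial likelihood method. Because of its simplicity and usefulness, the methodology related to this approach is adopted. Suppose that a random sample consists of $k$ distinct observed duration data, $t_{(1)} < t_{(2)} < \cdots < t_{(k)}$. Let $R(t_{(i)})$ denote the risk set at time $t_{(i)}$; it consists of all individuals whose durations are at least $t_{(i)}$. The log partial likelihood function for estimating $\boldsymbol{\alpha}$ is
$$\ln L(\boldsymbol{\alpha}) = \sum_{i=1}^{k} \left[ \boldsymbol{\alpha}\mathbf{x}_{(i)} - \ln \sum_{j \in R(t_{(i)})} \exp(\boldsymbol{\alpha}\mathbf{x}_j) \right], \tag{9}$$
where $\mathbf{x}_{(i)}$ is the vector of explanatory variables of the individual whose duration is $t_{(i)}$. For the estimation of $\lambda_0(t)$, see Ref. [18]. The overall goodness-of-fit of the model estimation is determined by the likelihood ratio (LR) statistic, which is specified as
$$LR = -2\,[L_0(\boldsymbol{\alpha}) - L(\boldsymbol{\alpha})], \tag{10}$$
where $L_0(\boldsymbol{\alpha})$ is the log-likelihood for the null model with all the regression coefficients set to zero and $L(\boldsymbol{\alpha})$ is the log-likelihood at convergence with k regression coefficients.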
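As a rough illustration of the partial likelihood idea, the Python sketch below evaluates the log partial likelihood for distinct, uncensored durations and computes the LR statistic against the null model. It is a minimal sketch rather than the estimation procedure used in the paper: the toy data are made up, ties and censoring are not handled, and the coefficient is located by a coarse grid search instead of a proper optimizer.

```python
import numpy as np

def log_partial_likelihood(alpha, times, X):
    """Cox log partial likelihood for distinct, uncensored durations."""
    alpha = np.asarray(alpha, dtype=float)
    times = np.asarray(times, dtype=float)
    X = np.asarray(X, dtype=float)
    eta = X @ alpha                      # linear predictor alpha . x_j
    total = 0.0
    for i in range(len(times)):
        at_risk = times >= times[i]      # risk set: durations at least t_(i)
        total += eta[i] - np.log(np.sum(np.exp(eta[at_risk])))
    return total

def lr_statistic(loglik_null, loglik_model):
    """Likelihood ratio statistic: -2 [L_null - L_model]."""
    return -2.0 * (loglik_null - loglik_model)

# Made-up toy data: six travel times (s) and one covariate (car volume).
times = np.array([12.1, 13.4, 14.0, 15.2, 16.8, 18.3])
X = np.array([[8.0], [14.0], [10.0], [12.0], [18.0], [15.0]])

# Coarse grid search over a single coefficient (illustration only).
grid = np.linspace(-1.0, 1.0, 201)
logliks = np.array([log_partial_likelihood([a], times, X) for a in grid])
alpha_hat = grid[np.argmax(logliks)]

ll_null = log_partial_likelihood([0.0], times, X)
lr = lr_statistic(ll_null, logliks.max())
```

With all coefficients at zero, each term reduces to minus the log of the risk set size, so the null log partial likelihood for six distinct durations is -ln(6!), which gives a quick check of the implementation.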

Site survey design
Car travel time is the actually observed time for a car to pass a bus stop under mixed traffic conditions. To record the travel time durations, two video cameras were placed at the upstream section and the downstream section of the bus stop, respectively. Every car passing the curbside stop was treated as a data collection unit. The travel time duration was measured from the time a car arrived at the upstream section to the time it left the downstream section. The study also explored the effects of mixed traffic flow on travel time. The mixed traffic characteristics were therefore extracted from the videos at 1 min intervals, such as the traffic volumes of the various streams and the average dwell time of the bus stream in each minute. The car travel time of each selected sample was then matched with the mixed traffic characteristics of the interval to which the sample belonged. The observed roadway near the stop contains a non-motorized lane and a motorized lane. The site survey was conducted at a selected curbside stop on Xueyuan Road in Beijing, China. Data were collected in January 2008. The survey periods included peak and off-peak hours. The test section was 67.5 m long and 7.5 m wide. In addition, the test section was not influenced by traffic control. The survey was conducted in good weather. In total, 531 travel time samples were observed.
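The matching of each travel time sample to the traffic characteristics of its 1 min interval can be sketched as follows. This is a hypothetical illustration: the record layout and names are invented here, and assigning a sample to the interval that contains its upstream arrival time is an assumption, since the text does not state which timestamp decides the interval.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class MinuteFlow:
    car_volume: int        # cars counted in the interval (Xc)
    nonmotor_volume: int   # non-motorized vehicles counted in the interval (Xn)
    bus_departures: int    # buses departing in the interval (Xb)
    free_ratio: float      # free ratio of the bus stop in the interval (Xf)

def match_samples(upstream_s, downstream_s, flows, interval_s=60.0):
    """Return (travel_time, MinuteFlow) pairs, one per observed car."""
    samples = []
    for t_in, t_out in zip(upstream_s, downstream_s):
        # Assumption: the interval is chosen by the upstream arrival time.
        index = int(t_in // interval_s)
        samples.append((t_out - t_in, flows[index]))
    return samples

# Made-up values for two consecutive minutes and two cars.
flows = [MinuteFlow(10, 18, 1, 0.6), MinuteFlow(14, 25, 2, 0.3)]
samples = match_samples([5.0, 61.0], [19.5, 78.0], flows)
```

Each resulting pair holds one duration and the covariates of its interval, which is the shape of data a duration model needs.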

Variable specifications
The explanatory variables for inclusion in the model were chosen using data mining techniques, on the basis of previous research and intuitive arguments regarding the effect of mixed traffic flow. In urban traffic, travel time is determined by driving behavior, traffic conditions, geometric design, and so on. This paper is concerned with the characteristics of travel time under the influence of mixed traffic flow near bus stops. Therefore, the variables used in the duration model should be related to mixed traffic flow.
Considering the feasibility of data acquisition, four explanatory variables are chosen: car volume (Xc), non-motorized vehicle volume (Xn), bus departure volume (Xb), and free ratio of bus stop (Xf). The free ratio of bus stop reflects the probability that no bus is at the stop in each minute. Although the volume of arriving buses has an impact on car travel time, it is significantly correlated with the volume of departing buses. Additionally, a departing bus may conflict with a passing car, so the departing bus has a greater impact on car travel time than the arriving bus. Thus, the volume of departing buses rather than the volume of arriving buses is chosen. Analogously, the dwell time of stopped buses is significantly correlated with the free ratio of bus stop. Because the latter has a more direct influence on car travel time than the former, the dwell time of stopped buses is not chosen.
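One possible way to compute the free ratio of bus stop from bus dwell records is sketched below in Python. It interprets the free ratio as the share of each 1 min interval during which no bus occupies the stop, which is an assumption about how the variable was measured; the function name and the (arrive, depart) record format are hypothetical.

```python
def free_ratio(dwells, minute_index, interval_s=60.0):
    """Share of one interval during which no bus occupies the stop."""
    start = interval_s * minute_index
    end = start + interval_s
    # Clip each (arrive, depart) dwell to the interval and sort by arrival.
    clipped = sorted((max(a, start), min(d, end))
                     for a, d in dwells if a < end and d > start)
    occupied = 0.0
    covered_until = start
    for a, d in clipped:
        # Count only the part of this dwell not already covered by an earlier bus.
        if d > covered_until:
            occupied += d - max(a, covered_until)
            covered_until = d
    return 1.0 - occupied / interval_s

# Made-up dwell records in seconds: two overlapping buses, then a third one.
dwells = [(10.0, 25.0), (20.0, 40.0), (55.0, 70.0)]
ratios = [free_ratio(dwells, m) for m in range(3)]
```

Overlapping dwells are merged so that a stop occupied by two buses at once is not counted twice.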

Estimated results
The estimation of the duration model for car travel time is shown in Table 1. The LR statistic of the estimated model clearly indicates the overall goodness-of-fit (the LR statistic is 5,471.4, which is greater than the chi-squared statistic with 4 degrees of freedom at any reasonable level of significance). The statistical significance of each variable also confirms that the variables play a significant role in the duration of the travel time: all of the included variables are statistically significant at the 0.02 level.
Figure 2 shows the continuance probability estimated by the proposed model together with the observed continuance probability. The estimated curve is monotonically decreasing. The median of the distribution is 14.4 s, indicating that half of the observed vehicles can pass the test section within 14.4 s. The continuance probability falls to 25% at 16.7 s, indicating that about 25% of the observed vehicles cannot pass the test section within 16.7 s. The observed results show the same overall shape as the estimated results, although the observed continuance probability is slightly larger than the estimated one. This discrepancy arises because the two continuance probabilities come from different sources. In Figure 2, the observed distribution is obtained from the cumulative distribution of observed travel times, whereas the estimated distribution is calculated by the model with all variables at their average values. Therefore, the observed results reflect the travel time under the specific conditions of each individual sample, while the estimated results reflect the average condition over all samples and influential factors. The estimated continuance probability shown in Figure 2 thus describes the travel time of a vehicle with an average value for every variable. Any change in the traffic conditions could influence the car travel time.
The effects of the variables are discussed in the next section.
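Reading the median and the 25% point off a continuance probability curve amounts to inverting the curve. The Python sketch below does this by linear interpolation. The Weibull curve is an illustrative stand-in that is simply forced through the two reported points (continuance probability 0.50 at 14.4 s and 0.25 at 16.7 s); it is not the estimated curve of Figure 2, and the function name is hypothetical.

```python
import numpy as np

def time_at_continuance(t_grid, s_values, p):
    """Time at which a strictly decreasing S(t) reaches p (linear interpolation)."""
    # np.interp expects increasing x-coordinates, so reverse both arrays.
    return float(np.interp(p, s_values[::-1], t_grid[::-1]))

# Illustrative Weibull curve forced through the two reported points:
# S(14.4) = 0.50 and S(16.7) = 0.25. It is not the paper's estimated curve.
shape = np.log(2.0) / np.log(16.7 / 14.4)
scale = 14.4 / np.log(2.0) ** (1.0 / shape)

t = np.linspace(0.0, 30.0, 3001)
s = np.exp(-(t / scale) ** shape)

median_time = time_at_continuance(t, s, 0.50)   # half of the cars have passed
upper_time = time_at_continuance(t, s, 0.25)    # a quarter of the cars remain
```

The same helper can be applied to any tabulated continuance curve, provided the curve is strictly decreasing over the grid.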

Effects of Explanatory Variables
According to Eq. (7), the effects of the explanatory variables can be interpreted from the signs of the coefficients in a rather straightforward fashion. If a coefficient is negative, an increase in the corresponding variable decreases the hazard rate or, equivalently, increases the car travel duration. With regard to the magnitude of the variable effects, when a variable changes by one unit, the hazard changes by $[\exp(\alpha_i) - 1] \times 100\%$, where $\alpha_i$ is the coefficient of the ith variable. As shown in Table 1, the variable Xf has a positive effect on the hazard, so a higher free ratio shortens the travel duration. The other variables have a negative effect on the hazard, so increasing them increases the car travel time.
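The percentage interpretation of a coefficient can be made concrete with a few lines of Python. The coefficient values used below are hypothetical and are not the Table 1 estimates; the `delta` argument, which extends the one-unit rule to a change of several units, is an addition for illustration.

```python
import math

def hazard_change_percent(coefficient, delta=1.0):
    """Percent change in the hazard when a variable rises by `delta` units."""
    return (math.exp(coefficient * delta) - 1.0) * 100.0

# Hypothetical coefficients, NOT the Table 1 estimates.
change_negative = hazard_change_percent(-0.04)        # hazard falls by about 3.9%
change_positive = hazard_change_percent(0.01)         # hazard rises by about 1.0%
change_ten_units = hazard_change_percent(-0.04, 10)   # effect of a 10-unit increase
```

Because the effect is multiplicative, a change of several units compounds: ten units at -0.04 per unit lowers the hazard by about 33%, not 40%.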
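To assess the effects of the included explanatory variables on the travel duration, a hazard ratio (HR) function can be obtained by dividing both sides of Eq. (7) by the baseline hazard:
$$HR = \frac{\lambda(t \mid \mathbf{x}_i)}{\lambda_0(t)} = \exp\left(\sum_r \alpha_r x_{ri}\right), \tag{11}$$
where $x_{ri}$ is the rth variable for the ith vehicle and $\alpha_r$ is the corresponding coefficient. The HR represents the multiple relating the hazard under the effects of the variables to the hazard when all variables are ignored ($\mathbf{x} = \mathbf{0}$). Standardizing the variables in the denominator of the left side of Eq. (11) about their means yields
$$RHR = \frac{\lambda(t \mid \mathbf{x}_i)}{\lambda(t \mid \bar{\mathbf{x}})} = \exp\left(\sum_r \alpha_r (x_{ri} - \bar{x}_r)\right), \tag{12}$$
where $\lambda(t \mid \bar{\mathbf{x}})$ is the hazard with the average variables and $\bar{x}_r$ is the average of the rth variable over all samples.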
Eq. (12) is the relative hazard ratio (RHR, also called the relative hazard index). It represents the ratio of the hazard for a vehicle with a given set of variables to the hazard for a vehicle that has an average value for every variable. If the RHR is greater than one, the variables increase the hazard and the condition is favorable; that is, the travel time in such a favorable condition is shorter than the average level of the survey sample. Conversely, unfavorable conditions correspond to a low hazard, so vehicles in an unfavorable condition have longer travel times than those in a favorable condition. To quantify the effects of the mixed traffic flow, we assume that one variable is in a favorable or unfavorable condition while the other variables take their average values. The HR or RHR for each variable can then be calculated. The RHR describes the multiple of the hazard when the observed vehicles are in a favorable or unfavorable condition compared with the average condition. The HR describes the multiple of the hazard when the observed vehicles are in a favorable condition compared with an unfavorable condition. From the RHRs and HRs in specific conditions, the influence of mixed traffic flow on the travel time can be analyzed quantitatively. The specific conditions and the corresponding HRs and RHRs are shown in Table 2.
To illustrate the effects of the selected variables, the RHRs for the four variables (Xc, Xn, Xb and Xf) are shown in Figure 3. As shown in Figure 3(a), the variable Xc (car volume) has a negative effect on the hazard. The reason is that increasing car volume raises the probability of car queues and thus delays the cars. The RHR is 2.25 for the specific conditions. It means that the car pass rate with 10 cars is 2.25 times that with 30 cars, when the other variables take their average values. In the field survey, the average travel time of vehicles with a car volume below 11.84 is 13.85 s, while the average travel time with a car volume above 11.84 is 15.67 s.
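The HR and RHR calculations can be sketched in Python as follows. The car volume coefficient below is back-calculated from the reported ratio of 2.25 between 10 and 30 cars, under the assumption that this ratio equals exp(alpha_c (10 - 30)); it is an implied value for illustration, not the Table 1 estimate. The average car volume used for standardization is a hypothetical placeholder, and the function names are invented here.

```python
import numpy as np

def hazard_ratio(alpha, x):
    """Hazard relative to the baseline (all variables zero): exp(alpha . x)."""
    return float(np.exp(np.dot(alpha, x)))

def relative_hazard_ratio(alpha, x, x_mean):
    """Hazard relative to a vehicle with average variables: exp(alpha . (x - mean))."""
    return float(np.exp(np.dot(alpha, np.asarray(x) - np.asarray(x_mean))))

# Coefficient implied by the reported ratio of 2.25 between 10 and 30 cars,
# assuming the ratio equals exp(alpha_c * (10 - 30)). Not the Table 1 value.
alpha_c = np.log(2.25) / (10.0 - 30.0)

x_mean = [12.0]   # hypothetical average car volume, used only for standardization
ratio = (relative_hazard_ratio([alpha_c], [10.0], x_mean)
         / relative_hazard_ratio([alpha_c], [30.0], x_mean))
```

The ratio between two conditions does not depend on the value chosen for the mean, because the standardization term cancels when one RHR is divided by another.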
The variable Xn (non-motorized vehicle volume) has a negative effect on the hazard, as shown in Figure 3(b). When a bus dwells at the stop, non-motorized vehicles may move into the motorized lane. Thus, an increasing volume of non-motorized vehicles raises the probability of conflict between cars and non-motorized vehicles, which in turn increases car travel time. The RHR is 1.82 for the specific conditions listed in Table 2.
The volume of bus departures also has a negative effect on the hazard. The reason is that a car-bus conflict takes place when a bus departs from the stop into the motorized lane, and this conflict delays the cars. The RHR is 2.36 for the specific conditions. It means that the car pass rate with only one departing bus is 2.36 times that with 3 departing buses [see Figure 3(c)].
The free ratio of bus stop (Xf) has the opposite effect: an increasing free ratio increases the hazard, or equivalently decreases the continuance probability [see Figure 3(d)]. That is, the probability that a vehicle takes longer than time t to traverse the bus stop decreases. This can be explained by the effect of stopped buses. If no bus dwells at the stop, there is no conflict between cars and non-motorized vehicles, and the non-motorized vehicles have no impact on the car travel time. If one or more buses dwell at the stop, the conflict between cars and non-motorized vehicles may increase the car travel time. Thus, an increasing free ratio of bus stop reduces the conflict between cars and non-motorized vehicles and shortens the car travel time. From the results in Table 2, the hazard at a high free ratio of bus stop (90%) is 2.32 times that at a low free ratio (10%). According to the observed data, the average travel time of vehicles with a free ratio below 50% is 17.35 s, while the average travel time with a free ratio above 50% is 12.57 s. The observed results therefore support the estimated results.

Model Application
The estimated model can be used to predict the distribution of car travel time under different conditions. By setting one variable to an assumed value while the other variables are held at their mean values, a new distribution of the continuance probability can be calculated. In this paper, the non-motorized vehicle volume is taken as an example to present the model application. If the non-motorized vehicle volume ranges from 5 to 35 veh/min, a new distribution of the continuance probability is obtained for each volume, as shown in the corresponding figure.
Travel time is typical duration data of the kind that duration models are designed for. Most importantly, the hazard-based duration methodology can capture the effects of mixed traffic flow near bus stops. Therefore, the hazard-based duration approach could be helpful for travel time estimation near bus stops with mixed traffic flow. Before the model is applied to other sites, it should be re-estimated using field data from those sites. Additionally, the explanatory variables can be chosen flexibly according to the research aim and the local traffic conditions.
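The application step can be sketched in Python as follows. Under the proportional hazard form, when only one variable departs from its mean, the continuance probability equals the average-condition curve raised to the power of the relative hazard ratio. Every number below (the average-condition curve, the coefficient, and the mean volume) is a hypothetical placeholder rather than an estimate from the paper, and the function names are invented for illustration.

```python
import numpy as np

def continuance_under_condition(s_mean, alpha_r, x_r, x_r_mean):
    """S(t | x) = S_mean(t) ** RHR when only variable r departs from its mean."""
    rhr = np.exp(alpha_r * (x_r - x_r_mean))
    return s_mean ** rhr

def median_time(t_grid, s_values):
    """First grid time at which the continuance probability is at most 0.5."""
    return float(t_grid[np.argmax(s_values <= 0.5)])

# All numbers below are hypothetical placeholders, not estimates from the paper.
t = np.linspace(0.0, 40.0, 4001)
s_mean = np.exp(-(t / 15.6) ** 4.7)   # assumed continuance curve at average conditions
ALPHA_N, XN_MEAN = -0.03, 20.0        # assumed coefficient and mean of Xn (veh/min)

volumes = [5, 15, 25, 35]
medians = [median_time(t, continuance_under_condition(s_mean, ALPHA_N, v, XN_MEAN))
           for v in volumes]
```

With a negative coefficient, the median travel time in this sketch grows as the non-motorized vehicle volume increases, which matches the direction of the effect described for Xn.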

Conclusions
A real-time system is necessary for Advanced Traffic Management Systems. This paper applies a hazard-based duration model to estimate car travel time near bus stops under mixed traffic conditions. The conflicts among buses, bicycles and cars are discussed. Travel time studies were conducted at the tested bus stop with non-motorized vehicles, and the mixed traffic flow and its effects on car travel time were recorded. The methodology uses a proportional hazard framework that allows different individuals to have different travel times according to the traffic conditions.
The paper provides several important insights into the determinants of the car travel time distribution near bus stops in developing countries. Firstly, the results indicate that car travel time near a bus stop is affected by various related factors. Such influence is reflected in the distribution of car travel time: any change in an influential factor could change the travel time distribution. Secondly, the free ratio of bus stop has a negative effect on car travel time, while the car volume, the non-motorized vehicle volume and the bus departure volume have positive effects on car travel time. Additionally, from the methodological standpoint, this study provides empirical evidence that the hazard-based duration approach is appropriate for travel time analysis under mixed traffic conditions. The distribution of travel time estimated by the model gives a quantitative analysis of the influence of mixed traffic flow. Finally, the influential factors related to mixed traffic characteristics should be given full consideration in the planning and design of bus stops.
In terms of future work, research with more datasets is required. Other influential factors should also be considered, such as passenger loads and the type of bus stop. In addition, it is necessary to study the vehicle speed distribution near a bus stop with mixed traffic flow. It is hoped that these findings will contribute to a better understanding of bus stops and help in the planning and design of traffic facilities in developing countries.