ESTIMATING THE PARAMETERS OF COWAN’S M3 HEADWAY DISTRIBUTION FOR ROUNDABOUT CAPACITY ANALYSES

Capacity models based on gap acceptance theory are widely used in the capacity analysis of unsignalised intersections and roundabouts. These models are based on the statistical distribution of headways in the major stream. In this field, Cowan's M3 distribution is usually recognized as the most adequate, but the estimation of its parameters is not trivial. In this paper, the main estimation methods are reviewed and a new method (Simultaneous Numerical Estimation, SNE) is proposed. The SNE method was used to develop a calibrated relation between parameters, using field data collected on Portuguese roads and roundabouts. The new formula was found to improve capacity estimates in both one-lane and two-lane roundabouts. The paper also addresses the importance of each input variable and parameter in the resulting capacity model through a sensitivity analysis.


Introduction
The vast majority of intersections in Portugal are unsignalised. As in many other countries, the use of roundabouts has spread throughout the country, particularly to solve intersection safety and/or capacity problems (Antov et al. 2009; Mauro, Branco 2010). Developing accurate capacity models is therefore very important.
Modelling approaches used in capacity analysis are classified in two main groups: empirical (regression analysis) and stochastic (gap acceptance theory). Current Portuguese practice relies mainly on gap acceptance models to analyze conventional at-grade intersections and on regression models to analyze roundabouts. This paper addresses the application of gap acceptance methods in roundabout capacity analysis. Within this approach each roundabout entry is treated as a normal T-intersection, which allows the direct application of general capacity formulas.
Gap acceptance theory has two basic elements: the distribution of headways between priority vehicles and the usefulness of those headways to the entering vehicles. The term "gap" is used here for historical reasons; the correct term would be "headway", which stands for the time interval between the fronts of two successive vehicles.
The usefulness of headways is evaluated by a simple linear equation that relates the critical headway ($t_c$) with the follow-up time ($t_f$) and returns the number of waiting vehicles that can enter the intersection during each interval. The critical headway is the most important term in this relation. It cannot be directly observed, but there are efficient methods that allow its estimation from field data (Brilon et al. 1999).
The distribution of headways in the major stream is described by a statistical function. The Semi-Poisson, Hyper-Erlang and Double Displaced Negative Exponential distributions describe the availability of headways in the priority streams realistically, but they are difficult to use (Luttinen 1999; Zhang et al. 2007). Instead, simpler distributions are normally adopted. For instance, the negative exponential distribution is the basis of the capacity formula present in the Highway Capacity Manual: 2010 [Transportation Research Board] and is described by the following cumulative distribution function:

$$F(t) = 1 - e^{-a t}, \quad t \ge 0, \tag{1}$$

where a is a parameter that should be set equal to the average flow q in vps (vehicles per second). This model has two major limitations: it generates unrealistically short headways and does not describe platoons; consequently, it only deals realistically with very low traffic flows (q < 100 vph (vehicles per hour)) (Luttinen 1999).
An alternative to the simple exponential model is the family of Cowan's distributions (Cowan 1975) and, specifically, his M3 model. In this model, the headway distribution is described as a mixture of the headways of follower and free vehicles. It is assumed that the smaller headways of vehicles driving in platoons are represented by a single headway Δ (minimum headway), while free vehicles follow a shifted exponential distribution. Cowan's M3 cumulative distribution is given as:

$$F(t) = \begin{cases} 1 - f e^{-\lambda (t - \Delta)}, & t \ge \Delta \\ 0, & t < \Delta \end{cases} \tag{2}$$

where f is a parameter that represents the proportion of free vehicles and λ is a scale parameter. The simple exponential model is a particular case (M1) of Cowan's M3 distribution, obtained when f = 1 and Δ = 0. The estimation of the three parameters (f, λ and Δ) is not trivial and the existing estimation methods yield different sets of values, which may affect the accuracy of the capacity estimates.
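As a minimal sketch of the two headway models discussed above, the following Python functions evaluate the M3 cumulative distribution and its exponential (M1) special case. The function and argument names are illustrative assumptions, and the M3 form used is the usual one, F(t) = 1 − f·exp(−λ(t − Δ)) for t ≥ Δ and 0 otherwise.

```python
import math


def cowan_m3_cdf(t, f, lam, delta):
    """Cowan's M3 cumulative distribution for a headway t (seconds).

    f     -- proportion of free vehicles (0..1)
    lam   -- scale parameter (1/s)
    delta -- minimum headway of vehicles in platoons (s)
    """
    if t < delta:
        return 0.0
    return 1.0 - f * math.exp(-lam * (t - delta))


def exponential_cdf(t, q):
    """Negative exponential (M1) model: M3 with f = 1, delta = 0, lam = q."""
    return cowan_m3_cdf(t, f=1.0, lam=q, delta=0.0)
```

At t = Δ the M3 function jumps to 1 − f, which is the share of vehicles assumed to travel in platoons at exactly the minimum headway.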
Therefore, the main objective of this research is to develop a simple procedure allowing the use of locally calibrated parameters in capacity formulas. In the following sections, the existing estimation methods and a proposed one (Simultaneous Numerical Estimation, SNE) are presented and used to estimate local parameters for a large set of observations collected at Portuguese roundabouts and suburban roads.

Cowan's M3 parameter estimation
The objective of a generic estimation procedure is to find the model parameters that provide the best fit between the estimated (theoretical) cumulative distribution function F(t) and the empirical distribution function H(t), constructed with field data. This is also valid when fitting the M3 distribution, but two adjustments must be introduced: first, the mean headway from the fitted distribution should equal the observed mean, thus resulting in equal flows; second, since waiting drivers will reject very small headways, it is preferable to have an accurate representation of large headways and exclude the smaller headways from the evaluation. As a consequence, the variance of the residuals was selected to quantify the fit quality:

$$s_R^2 = \frac{1}{n_\xi} \sum_{t_i > \xi} \left[ F(t_i) - H(t_i) \right]^2, \tag{3}$$

where $n_\xi$ is the number of headways in the sample that are larger than an exponential tail threshold value ξ (it is assumed that for t > ξ the headways follow the exponential distribution). This threshold is usually set to values between 3 and 4 s (Hagring 1996; Troutbeck 1997). Luttinen (1999), using Monte Carlo simulation, found that the optimal value for ξ is 3.5 to 4.0 s. In this study, ξ was set to 3.5 s. There are two main approaches for estimating the distribution parameters: the method of moments and the maximum likelihood/least squares method.
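The fit statistic described above can be sketched in Python as follows. This is an illustration under stated assumptions rather than the authors' implementation: the empirical distribution is taken as H(t_i) = i/n for the i-th smallest headway, the squared residuals are averaged over the n_ξ headways larger than ξ, and all function names are hypothetical.

```python
import math


def m3_cdf(t, f, lam, delta):
    """Cowan's M3 cumulative distribution F(t)."""
    return 0.0 if t < delta else 1.0 - f * math.exp(-lam * (t - delta))


def empirical_cdf(headways):
    """Return (t_i, H(t_i)) pairs with H(t_i) = i / n for the sorted sample."""
    ts = sorted(headways)
    n = len(ts)
    return [(t, (i + 1) / n) for i, t in enumerate(ts)]


def residual_variance(headways, f, lam, delta, xi=3.5):
    """Mean squared difference between F(t) and H(t) for headways t > xi."""
    tail = [(t, h) for t, h in empirical_cdf(headways) if t > xi]
    if not tail:
        raise ValueError("no headways above the threshold xi")
    return sum((m3_cdf(t, f, lam, delta) - h) ** 2 for t, h in tail) / len(tail)
```

Restricting the sum to t > ξ keeps the short platoon headways, which entering drivers reject anyway, from influencing the fit.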

Method of moments (MM)
This method constructs estimators of the parameters by matching the sample moments (mean headway $\bar{t}$ and variance $s^2$) with the corresponding distribution moments. The mean and variance of the M3 distribution are given by the following equations (Luttinen 1999):

$$E(T) = \Delta + \frac{f}{\lambda}, \tag{4}$$

$$\operatorname{Var}(T) = \frac{f (2 - f)}{\lambda^2}. \tag{5}$$

The resulting estimators for f and λ are given by

$$\hat{f} = \frac{2 (\bar{t} - \Delta)^2}{s^2 + (\bar{t} - \Delta)^2}, \tag{6}$$

$$\hat{\lambda} = \frac{\hat{f}}{\bar{t} - \Delta} = \frac{\hat{f} q}{1 - \Delta q}, \tag{7}$$

where $q = 1 / \bar{t}$ is the average flow.
With two equations and three unknowns, the system is indeterminate. To overcome this, there are three main approaches. In the first (MM1), the minimum headway Δ is fixed (usually 1.8-2.0 s). With the second (MM2), Δ is chosen iteratively in order to minimize the variance of residuals. In the last, a third moment equation is included, equating the sample skewness to the estimated distribution skewness (Hagring 1996; Luttinen 1999). Due to the large variance in the sample skewness, this approach is not robust (Luttinen 1999) and will not be considered in the following sections.
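A minimal Python sketch of the first approach (fixed Δ) is given below. It assumes the standard M3 moments, mean Δ + f/λ and variance f(2 − f)/λ², solved for f and λ; the function name and the default Δ = 2 s are illustrative choices, not values prescribed by the method.

```python
import statistics


def mm_estimate(headways, delta=2.0):
    """Method-of-moments estimates (f, lam) of Cowan's M3 for a fixed delta."""
    mean = statistics.fmean(headways)
    var = statistics.variance(headways)  # sample variance s^2
    m = mean - delta                     # mean headway in excess of delta
    if m <= 0.0:
        raise ValueError("delta must be smaller than the mean headway")
    f = 2.0 * m * m / (var + m * m)      # from matching mean and variance
    lam = f / m                          # keeps the fitted mean equal to the sample mean
    return f, lam
```

Note that the estimate of f exceeds 1 whenever the sample variance is smaller than (mean − Δ)², so in practice the result may need to be checked against the admissible range. The second approach would wrap this function in a search over Δ that minimizes the variance of residuals.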

Maximum likelihood / least squares method (ML)
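The first step of this method is to obtain the maximum likelihood (ML) estimator for the scale parameter λ using the exponential tail of the distribution:

$$\hat{\lambda} = \frac{1}{\bar{t}_\xi - \xi}, \tag{8}$$

where $\bar{t}_\xi$ is the average headway for t > ξ.
The second step is to obtain estimates for f and Δ so that the distribution for large headways (t > ξ) does not change and the estimated distribution has the same flow as the sample. For this it is first necessary to calculate an auxiliary parameter $\gamma = f e^{\lambda \Delta}$, so that $F(t) = 1 - \gamma e^{-\lambda t}$ for t > ξ, by minimizing the sum of squares of the residuals between the functions F(t) and H(t):

$$\hat{\gamma} = \frac{\sum_{t_i > \xi} \left[ 1 - H(t_i) \right] e^{-\hat{\lambda} t_i}}{\sum_{t_i > \xi} e^{-2 \hat{\lambda} t_i}} \tag{9}$$

and, second, to obtain an estimate for f by solving numerically the following equation (the Newton-Raphson method was used in this task):

$$\hat{f} - \ln \hat{f} = \hat{\lambda} \bar{t} - \ln \hat{\gamma}, \tag{10}$$

which has a solution in the range 0 < f ≤ 1 only if

$$\hat{\lambda} \bar{t} - \ln \hat{\gamma} \ge 1. \tag{11}$$

Finally, the estimate for Δ is obtained from λ and f, in order to make the mean of the estimated distribution equal to the mean of the sample, using Eq (7).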
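The two steps above can be sketched in Python as follows. This is one possible reading, not a definitive implementation: it assumes the auxiliary parameter is γ = f·exp(λΔ), so that the tail is F(t) = 1 − γ·exp(−λt), and that equating the sample and distribution means leads to f − ln f = λ·t̄ − ln γ. The function names and the choice H(t_i) = i/n are also assumptions.

```python
import math


def solve_f(c, tol=1e-12, max_iter=200):
    """Newton-Raphson root of f - ln(f) = c on (0, 1]; requires c >= 1."""
    f = math.exp(-c)  # starts to the left of the root, where the residual is positive
    for _ in range(max_iter):
        g = f - math.log(f) - c
        if abs(g) < tol:
            break
        f -= g / (1.0 - 1.0 / f)
    return f


def ml_ls_estimate(headways, xi=3.5):
    """Return (f, lam, delta) from the two-step ML / least-squares procedure."""
    ts = sorted(headways)
    n = len(ts)
    mean = sum(ts) / n
    tail = [(t, (i + 1) / n) for i, t in enumerate(ts) if t > xi]
    if not tail:
        raise ValueError("no headways above the threshold xi")
    # Step 1: ML estimate of lam from the exponential tail.
    tail_mean = sum(t for t, _ in tail) / len(tail)
    lam = 1.0 / (tail_mean - xi)
    # Step 2a: least-squares gamma for F(t) = 1 - gamma * exp(-lam * t), t > xi.
    num = sum((1.0 - h) * math.exp(-lam * t) for t, h in tail)
    den = sum(math.exp(-2.0 * lam * t) for t, _ in tail)
    gamma = num / den
    if gamma <= 0.0:
        raise ValueError("tail too short to estimate gamma")
    # Step 2b: f from the mean-matching condition; no root in (0, 1] if c < 1.
    c = lam * mean - math.log(gamma)
    if c < 1.0:
        raise ValueError("no admissible solution for f")
    f = solve_f(c)
    delta = mean - f / lam  # fitted mean equals the sample mean
    return f, lam, delta
```

The `ValueError` raised when c < 1 corresponds to the cases in which the method has no solution for the given sample.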

Simultaneous numerical estimation (SNE)
The method of moments and the maximum likelihood / least squares method are computationally very efficient, producing good estimates from direct calculations. However, for this research, processing speed is not a key issue, so an alternative method is proposed. It makes use of a non-linear optimization tool to find the parameters Δ and f that minimize the variance of residuals between the functions F(t) and H(t), for t > ξ, while simultaneously taking Eq (7) into account to ensure equal means in the sample and in the distribution. This method was implemented using the Solver tool in Excel, assuming positive values for the three parameters and the range [0, 1] of possible values for f.
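The idea behind the SNE method can be sketched without a spreadsheet solver. In the Python example below a coarse grid search stands in for the Excel Solver, which is a deliberate simplification; the grid steps, the function name and the choice H(t_i) = i/n are assumptions. For each candidate pair (Δ, f), λ is fixed by the equal-means condition λ = f/(t̄ − Δ), and the pair with the smallest variance of residuals over t > ξ is kept.

```python
import math


def sne_estimate(headways, xi=3.5, d_step=0.05, f_step=0.01):
    """Grid-search sketch of SNE; returns (variance, f, lam, delta)."""
    ts = sorted(headways)
    n = len(ts)
    mean = sum(ts) / n
    tail = [(t, (i + 1) / n) for i, t in enumerate(ts) if t > xi]
    if not tail:
        raise ValueError("no headways above the threshold xi")
    d_max = min(mean, xi)                  # keep delta below the mean and below xi
    n_d = max(int(d_max / d_step), 1)
    n_f = int(round(1.0 / f_step))
    best = None
    for i in range(n_d):                   # delta = 0, d_step, ... < d_max
        delta = i * d_step
        for j in range(1, n_f + 1):        # f = f_step, ..., 1
            f = j * f_step
            lam = f / (mean - delta)       # equal means in sample and distribution
            var = sum((1.0 - f * math.exp(-lam * (t - delta)) - h) ** 2
                      for t, h in tail) / len(tail)
            if best is None or var < best[0]:
                best = (var, f, lam, delta)
    return best
```

A gradient-based or simplex optimizer would replace the two loops in a production setting; the grid is used here only because it is easy to verify by inspection.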

Description of the data set
The data used for the estimates were collected in individual lanes of suburban roads (four unidirectional sections), in a one-lane roundabout, and in three two-lane roundabouts (six sections) located in Coimbra and Guimarães, Portugal. The roundabout data were extracted from video recordings using special purpose software, while the road data were collected using a microwave traffic detector. The sample is composed of 16 535 vehicle passages, corresponding to 28.8 observation hours. The sample was split per site and per lane, resulting in 164 sets of approximately 100 headways each. The average flow in these sets varies from 130 to 1250 vph per lane.

Estimation procedure
The four methods described above (MM1, MM2, ML and SNE) were used to estimate the local parameters for each of the data sets. In 8 of the 164 cases the maximum likelihood method failed because the condition expressed by Eq (11) was violated. For comparison purposes, these sets were removed from the sample. The results of the estimation are presented in Table 1, aggregated by site and lane. The following conclusions are drawn: a) allowing Δ to change improves the accuracy of the method of moments (as would be expected); b) in most cases, the maximum likelihood method performs better than the method of moments; c) the SNE method was consistently the most accurate. Fig. 1 shows the empirical distribution function (EDF) and the estimated functions using the SNE, ML and MM1 methods for one of the sets.

Calibration of a general bunching model
The main objective of this step was to obtain a calibrated expression for the proportion of free vehicles f, dependent on the traffic flow q and on the minimum headway Δ (assumed constant). Several authors have followed different methodologies to obtain this relation. The best known is Tanner's linear model (Tanner 1962), where traffic flow is considered as departures from an M/D/1 queuing system. Table 2 lists these models, which are classified as linear, bi-linear and exponential. A proposed bi-linear model is also presented; it depends on a parameter A and assumes free flow (f = 1) in the range [0, A/Δ] and saturation flow (f = 0) for q = 1/Δ.
If A = 0 the formula returns Tanner's linear model; if A = 1, it gives f = 1 (only free vehicles) for all flows. In Fig. 2 these models are compared using Δ = 2 s, unless specific values are recommended by the authors.
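The proposed bi-linear relation can be written as a short Python function. The piecewise form below (f = 1 up to q = A/Δ, then a straight line reaching f = 0 at q = 1/Δ) is an assumption inferred from the limiting cases stated above; the function name and defaults are illustrative.

```python
def free_vehicle_proportion(q, delta=2.0, a=0.356):
    """Proportion of free vehicles f for a lane flow q (veh/s).

    Assumed bi-linear form: f = 1 while delta * q <= a, then a linear
    decrease to f = 0 at the saturation flow q = 1 / delta.
    """
    if q < 0.0 or q * delta > 1.0:
        raise ValueError("flow must lie between 0 and 1/delta")
    if q * delta <= a:
        return 1.0
    return (1.0 - delta * q) / (1.0 - a)
```

With a = 0 the function reduces to Tanner's f = 1 − Δq, and with a = 1 it returns 1 for every admissible flow. For a lane flow of 750 vph (about 0.208 veh/s) and the default values it returns approximately 0.906.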
Regarding the proposed bi-linear model, the A parameter was estimated in the following steps: 1) the parameters λ, f and Δ obtained using the SNE method for each site and lane were considered true values; 2) for each site, the capacity of a standard entry of a one-lane roundabout was calculated using Cowan's M3 capacity formula (Plank 1982):

$$C = \frac{q f e^{-\lambda (t_c - \Delta)}}{1 - e^{-\lambda t_f}}; \tag{12}$$

3) finally, A was changed iteratively in order to minimize the squared differences between the reference capacities and the estimates resulting from the proposed bi-linear model, where Δ was set equal to 2 s (Fig. 3). The optimal value was A = 0.356. Table 3 lists the variance of residuals when different bunching models are used to estimate roundabout capacities.
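The calibration loop in step 3 can be illustrated with a small Python sketch. Everything specific here is an assumption made for the example: the single-stream capacity is taken as C = q·f·exp(−λ(t_c − Δ)) / (1 − exp(−λ·t_f)), the bunching model is the bi-linear form with f = 1 up to q = A/Δ, the "standard entry" uses hypothetical values t_c = 3.3 s and t_f = 2.1 s, and the reference capacities are supplied by the caller.

```python
import math


def m3_capacity(q, a, tc=3.3, tf=2.1, delta=2.0):
    """Entry capacity (veh/s) opposed by one major stream of flow q (veh/s)."""
    if q <= 0.0:
        return 1.0 / tf
    f = 1.0 if q * delta <= a else (1.0 - delta * q) / (1.0 - a)
    lam = f * q / (1.0 - delta * q)  # scale parameter from the equal-means condition
    return q * f * math.exp(-lam * (tc - delta)) / -math.expm1(-lam * tf)


def calibrate_a(reference, step=0.001):
    """Return the A in [0, 1) that minimizes the sum of squared capacity errors.

    reference -- list of (q, capacity) pairs in veh/s, e.g. capacities
                 computed from site-specific M3 parameters.
    """
    best_a, best_sse = 0.0, float("inf")
    for i in range(int(round(1.0 / step))):
        a = i * step
        sse = sum((m3_capacity(q, a) - c) ** 2 for q, c in reference)
        if sse < best_sse:
            best_a, best_sse = a, sse
    return best_a
```

The reference set must include flows above A/Δ; below that limit the model returns f = 1 for any A and the data carry no information about the parameter.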

Validation
The proposed bunching relationship was validated by comparing the capacity estimates from the resulting capacity model (Cowan M3) with the observed capacities. The generic capacity formula for a minor stream crossing or merging with independent major streams, each having a Cowan's M3 headway distribution, is (Hagring 1998):

$$C_k = \frac{\Lambda_k \, e^{-\Lambda_k (t_c - \Delta)}}{1 - e^{-\Lambda_k t_f}} \prod_{i \in I_k} \frac{f_i q_i}{\lambda_i}, \qquad \Lambda_k = \sum_{i \in I_k} \lambda_i, \tag{13}$$

where k is the minor stream index; $I_k$ is the set of major streams i conflicting with the minor stream k; and the scale parameters $\lambda_i$ are given by Eq (7). The proportions of free vehicles $f_i$ are calculated with the new bunching model using Δ = 2 s and A = 0.356. Eq (12) is a particular case of this formula, obtained when only one opposing lane is considered.
For comparison purposes, the estimates were also calculated using the "traditional" model (Tanner, with f = 1, Δ = 0 and λ = q), where arrivals are super-imposed in a single lane and are assumed exponentially distributed.
In this paper only two cases are presented (the results are similar in the remaining sites): P. Rainha Santa -a single lane road entry into a one-lane roundabout; Nelas (west entry) -left lane entry into a two-lane roundabout. The data from this last site was not used in the calibration of the bunching model, thus providing a true independent validation.
In order to minimize the impact of quantification errors in the remaining variables, no unnecessary simplifications were introduced in the general capacity model and special care was taken to estimate the remaining parameters as accurately as possible. Consequently, the estimation of the critical headways and follow-up times was based on field data using the Siegloch method, recognised as the one most closely related to gap acceptance theory (Brilon et al. 1999). Since this method applies only to saturated conditions, a 4-second threshold for the minor vehicles' move-up time (the time the next vehicle takes to move into entry position) was used to test the existence of a queue (Rodegerdts et al. 2007). For each headway in the major stream the number of vehicles that entered the roundabout was recorded and the result was plotted in a graph (Figs 4 and 5). To describe the data, a linear regression function with parameters $t_0$ (intercept) and $t_f$ (slope) was used. The follow-up time is given by the slope ($t_f$) while the critical headway is given by the expression $t_c = t_0 + t_f / 2$. It is useful to calculate the average headway for each number of entries before starting the regression; otherwise, the more numerous observations would govern the whole result.
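The regression step of the Siegloch method can be sketched in Python as follows. The sketch assumes that the headway is regressed on the number of entering vehicles (t = t_0 + t_f·n) and that t_c = t_0 + t_f/2, the usual form of the method; the function name and input layout are hypothetical.

```python
def siegloch(observations):
    """Estimate (tc, tf, t0) from (headway_s, n_entries) pairs under queueing.

    Headways are first averaged for each number of entries, so that the
    most frequent entry counts do not dominate the regression.
    """
    groups = {}
    for headway, n_entries in observations:
        groups.setdefault(n_entries, []).append(headway)
    if len(groups) < 2:
        raise ValueError("need at least two distinct entry counts")
    xs = sorted(groups)
    ys = [sum(groups[n]) / len(groups[n]) for n in xs]
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    tf = sxy / sxx              # slope: follow-up time
    t0 = y_mean - tf * x_mean   # intercept
    tc = t0 + tf / 2.0          # critical headway
    return tc, tf, t0
```

Only headways observed while a queue was present at the entry should be passed to the function, in line with the move-up time test described above.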
To clarify the calculation procedure, the left entry to the Nelas roundabout is considered. From the Siegloch method, the critical headway and the follow-up time are $t_c$ = 3.14 s and $t_f$ = 1.94 s. For a total opposing flow (inner, $q_{M1}$, plus outer, $q_{M2}$) equal to 1000 vph, assume that 75% of the traffic uses the inner lane (this makes the example more generic; the observed proportion was 53%). So, $q_{M1}$ = 0.208 vps and $q_{M2}$ = 0.069 vps. The new bunching model with Δ = 2 s and A = 0.356 gives $f_1$ = 0.906 and $f_2$ = 1. Eq (7) gives $\lambda_1$ = 0.323 and $\lambda_2$ = 0.081. Finally, the left entry capacity is obtained by substituting these values in Eq (13).
In Figs 6 and 7 the capacity estimates are compared with the observed 1 min entry flows. Special markers distinguish the periods during which all entry vehicles respected the move-up time threshold and, as such, are clearly in a saturated condition.
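The worked example can be reproduced with a short Python sketch. It rests on assumptions that should be checked against Hagring (1998): the multi-stream capacity is taken as C = Λ·exp(−Λ(t_c − Δ)) / (1 − exp(−Λ·t_f)) · Π(1 − Δ·q_i) with Λ = Σλ_i, where 1 − Δ·q_i equals f_i·q_i/λ_i under the equal-means condition, and the bunching model is the bi-linear form with f = 1 up to q = A/Δ. Function names are illustrative.

```python
import math


def free_proportion(q, delta=2.0, a=0.356):
    """Assumed bi-linear bunching model for a lane flow q (veh/s)."""
    if q * delta <= a:
        return 1.0
    return (1.0 - delta * q) / (1.0 - a)


def entry_capacity(major_flows, tc, tf, delta=2.0, a=0.356):
    """Capacity (veh/s) of an entry opposed by independent M3 major streams.

    major_flows -- flows per circulatory lane in veh/s, each below 1/delta.
    """
    lam_sum = 0.0
    product = 1.0
    for q in major_flows:
        f = free_proportion(q, delta, a)
        lam_sum += f * q / (1.0 - delta * q)  # scale parameter of this stream
        product *= 1.0 - delta * q            # equals f * q / lam for this stream
    if lam_sum == 0.0:
        return 1.0 / tf                       # no opposing traffic
    return (lam_sum * product * math.exp(-lam_sum * (tc - delta))
            / -math.expm1(-lam_sum * tf))


# Nelas left entry: 1000 vph opposing, 75% assumed in the inner lane.
q_total = 1000.0 / 3600.0
nelas = entry_capacity([0.75 * q_total, 0.25 * q_total], tc=3.14, tf=1.94)
print(round(nelas * 3600.0), "vph")
```

Passing a single flow reduces the function to the one-lane formula, which is a convenient consistency check.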
In the two cases the estimates from Cowan's M3 model with calibrated parameters provided a better fit than the simple Tanner's model, which tends to overestimate capacity (Hagring et al. 2003).

Sensitivity analyses of the capacity model
After calibration of the bunching model, the objective of this section is to assess the effect that imprecision or errors in the parameters may have on the capacity estimates, and consequently to identify the need for more accurate models. This analysis was made by first computing reference capacity values for a set of known parameters and input variables, and then comparing those capacities to the ones that result when controlled variations, representing quantification errors, are introduced in the parameters, one at a time.
The reference capacity was computed for an entry lane into a two-lane roundabout, in which both major streams have the same minimum headway Δ = 2 s and the total flow in the major streams varies from 0 to 1800 vph. It was assumed that the proportion of free vehicles f is given by the calibrated bi-linear model. The critical headway and follow-up time were set to the mean values of the complete data set ($t_c$ = 3.3 s, $t_f$ = 2.1 s).
To assess the importance of the discrepancies in the capacity estimates, the GEH statistic was chosen:

$$GEH = \sqrt{\frac{2 (C_M - C_R)^2}{C_M + C_R}}, \tag{14}$$

where $C_R$ and $C_M$ are the reference and estimated capacities, respectively. The GEH statistic is widely used in traffic engineering and traffic modelling due to its self-scaling property, which allows the use of a single acceptance threshold in the comparison of a wide range of traffic volumes. In traffic assignment models, and according to the Highways Agency:1996 [Design Manual for Roads and Bridges, 12-2, United Kingdom], an hourly volume estimate is usually considered good if the GEH statistic is less than 5. For the first four analyses it was considered that the total opposing traffic was concentrated in a single lane. The effect of the traffic distribution in the circulatory lanes is discussed in the last analysis.
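For reference, the GEH statistic in its usual form, the square root of 2(C_M − C_R)²/(C_M + C_R) with both values in vehicles per hour, can be computed with the short Python function below. The function name is illustrative.

```python
import math


def geh(c_ref, c_est):
    """GEH statistic between a reference and an estimated hourly volume (vph)."""
    total = c_ref + c_est
    if total == 0.0:
        return 0.0  # both volumes are zero: no discrepancy
    return math.sqrt(2.0 * (c_est - c_ref) ** 2 / total)
```

For instance, an estimate of 599 vph against a reference of 568 vph gives a GEH of about 1.3, well below the threshold of 5.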

Parameter A of the proposed bi-linear model
The sensitivity of the capacity formula to this parameter (assuming that the correct value is 0.356) is plotted in Fig. 8.
If the opposing flows are low to moderate (less than 800 vph) the capacity model is quite robust to errors in the parameter A. The sensitivity is greatest in the range of high conflicting flows, particularly for excess errors, but A values in the range 0.05 to 0.5 will return accurate estimates. For example, setting A = 0.1 when $q_M$ = 1100 vph will return a capacity of 599 vph, instead of the reference value of 568 vph. Here, it should be noted that although the relative difference is considerable (about 5.5%), the absolute difference of only 31 vph is acceptable in most traffic engineering applications. It is also interesting to note that for extremely high opposing flows the model becomes unresponsive to changes in the A parameter, as the capacity tends to zero.

Minimum headway (Δ)
As seen in Fig. 9, the capacity model becomes progressively more sensitive to variations in Δ as the conflicting flow increases, but it is quite robust for $q_M$ values below 500 vph. Usually, $q_M$ values do not exceed 1200-1300 vph (per lane). Some authors set Δ = 1.8 s, which is well within the acceptable range.

Critical headway
Fig. 10 shows the influence of the critical headway ($t_c$) on the capacity. Except when the conflicting flow is very low, this parameter has a major influence on the estimates. Considering that the true value is 3.3 s, acceptable errors would be of ± 0.3 s.

Follow-up time
The sensitivity of the model to the follow-up time ($t_f$) is illustrated in Fig. 11. Accurate estimates of this parameter are relevant when the conflicting flow is low to moderate: the follow-up time describes the number of vehicles that can enter the intersection using a large headway and, as the probability of large headways in the major stream decreases with the traffic flow, so does the influence of the follow-up time. Considering a reference value $t_f$ = 2.1 s, only small errors (approximately ± 0.3 s) would be acceptable in the range of low conflicting flows.

Traffic distribution in the circulatory lanes
Fig. 12 indicates the errors that result if traffic is assumed to be equally distributed between the two circulatory lanes, when the real usage of the inner and outer lanes is p and 1 − p, respectively. As the capacity formula returns higher capacities when traffic is equally distributed (blocking times are shared by two major vehicles), the errors increase when the lane split tends to 0/1. It should be noted that ignoring this split will not seriously affect the capacity estimates unless the traffic distribution is extremely asymmetric.

Discussion
The errors in the parameters involved in the general capacity formula may be considered as belonging to two categories: specification errors or quantification errors (Vasconcelos et al. 2009). Specification errors occur during the model development stage, while quantification errors occur during the practical application, by end-users. Errors in the A and Δ parameters fit in the first category, given that end-users are not expected to change them, while errors in the parameters $t_c$, $t_f$ and $q_M$ fit in the latter category. The above analyses indicate that for usual traffic states ($q_M$ < 1200 vph per lane) the effect of specification errors is not very significant: if different field data were used to derive the A parameter, it is unlikely that the difference would have a major effect on the capacity estimates. The same is not true of the input parameters. Small errors in $t_c$ and $t_f$ will seriously affect capacity estimates. This is particularly relevant for the critical headway, for three reasons: first, the parameter has a major effect over almost the entire application range; second, it is very dependent on the site's geometric characteristics; third, its field estimation is relatively complex and requires a large number of observations. Finally, the effect of traffic distribution among multiple major lanes is relatively weak when compared with $t_c$ and $t_f$ but, given that in many cases it is easily measured or estimated, its effect should not be disregarded.

Conclusions
A new method (SNE) was used to estimate the parameters of the Cowan M3 distribution and compared against the method of moments and the maximum likelihood/least squares method. SNE provided more accurate estimates and was selected to obtain local parameters for a large number of traffic states and to calibrate a new bi-linear bunching model for roundabouts.
When the new bunching model is used with the general capacity formula, it provides accurate estimates in both one-lane and two-lane roundabouts. However, for a full specification of the headway parameters in the opposing lanes, it is necessary to quantify the traffic split among opposing lanes.
A sensitivity analysis was performed to investigate the influence of each parameter on the estimates of a generic capacity model. It was found that the effect of errors or imprecision in the derived bunching formula is relatively modest when compared with errors that are usually the end-user's responsibility, namely those in the estimation of critical headways and follow-up times.