Engineer’s estimate reliability and statistical characteristics of bids

The objective of this report is to provide a methodology for examining bids and evaluating the performance of engineer’s estimates in capturing the true cost of projects. This study reviews cost development for transportation projects along with two sources of uncertainty in a cost estimate: modeling errors and inherent variability. The sample projects are highway maintenance projects with similar scope of work, size, and schedule. Statistical analysis of engineer’s estimates and bids examines the suitability of statistical models for the sample projects. Further, the variation of engineering cost estimates from inception to implementation is presented and discussed for selected projects. Moreover, the applicability of extreme value theory is assessed for the available data. The results indicate that the performance of the engineer’s estimate is best evaluated against the trimmed average of bids, excluding discordant bids.
Subjects: Engineering Economics; Engineering Project Management; Statistics for Business, Finance & Economics; Transportation Engineering


Introduction
Strategic planning and programming of infrastructure projects require reliable cost estimating methods to secure appropriate funding. The expected accuracy of cost estimates varies throughout the life cycle of a project from inception to implementation. Thus, cost estimating is a constant challenge for long-term planning.
ABOUT THE AUTHORS
Fariborz M. Tehrani is an assistant professor, professional civil engineer (PE), sustainability professional (ENV SP), and project management professional (PMP) with academic and professional backgrounds in engineering design, management, and education. His research and practice experiences focus on resilient and sustainable infrastructures. This work presents research on resilient project delivery methods which emphasize the probabilistic approach to resource management.

PUBLIC INTEREST STATEMENT
This paper highlights sources of uncertainty in estimating the true cost of selected highway maintenance projects. The presented methodology traces these sources through the preparation of the engineer's estimate at various stages of project development. Further, statistical analysis is used to show the reliability of the engineer's estimate with respect to bid values, and particularly the lowest bid amount.
Sources of uncertainty in a project's cost can be categorized into inherent variability and prediction error. The latter, also known as modeling uncertainty, can be reduced through improvements in prediction models, inclusion of more data, and enhancement of statistical analysis. An unclear scope of work and undefined means and methods of construction are known examples of errors that can be reduced through detailed design of projects (Bajari, McMillan, & Tadelis, 2003). Project studies aim to clarify project deliverables and recognize all principal factors that may affect the final cost of the project, e.g. site conditions, economic trends, and environmental issues. Prediction errors beyond certain levels of risk or confidence, e.g. the occurrence of a major disaster during the lifetime of a routine maintenance project, cannot be taken into account in cost estimates. Such risks are often transferred by means of insurance, indemnification, force majeure clauses, etc.
Some uncertainties pertain to the nature of the project and remain nearly constant throughout the project development cycle. As a result, the main objective in studying inherent variability is to acknowledge it and incorporate its effect in project estimates rather than to reduce it. These uncertainties are caused by random variables which, in practical terms, cannot be controlled. For instance, minor variations in site conditions can be neither avoided nor wholly anticipated. The same is true for fluctuations in construction material costs. Cost estimates may also include strategies to cover minor and limited effects of unknown random phenomena. The most common strategy for covering these unexpected costs is the inclusion of contingencies in the cost estimate. Appropriate contingencies protect owners from minor unpredictable events that may increase the cost and cause overruns. However, contingencies should not be used to cover inaccurate estimates or poorly defined projects. The contingency amount is a function of the uncertainty in the project. It can be evaluated and represented as a percentage of contract cost, either itemized or total, or defined through probabilistic models, e.g. PERT or Monte Carlo (Ergin, 2005).
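As a rough illustration of the probabilistic option, the following sketch sizes a contingency by Monte Carlo simulation. The line items, their triangular (low, most likely, high) cost ranges, and the 80th-percentile target are hypothetical assumptions chosen for the example; none of them come from the sample projects in this study.

```python
import random

def simulate_contingency(items, base_estimate, percentile=0.80, trials=20000, seed=1):
    """Return the contingency, as a fraction of base_estimate, needed to
    cover the chosen percentile of the simulated total cost."""
    rng = random.Random(seed)
    totals = []
    for _ in range(trials):
        # random.triangular takes its arguments as (low, high, mode)
        totals.append(sum(rng.triangular(low, high, mode) for low, mode, high in items))
    totals.sort()
    target = totals[int(percentile * (trials - 1))]
    return (target - base_estimate) / base_estimate

# Hypothetical line items: (low, most likely, high) cost in dollars
items = [(90_000, 100_000, 130_000), (45_000, 50_000, 70_000), (18_000, 20_000, 30_000)]
base = sum(mode for _, mode, _ in items)  # base estimate = sum of most likely costs
contingency = simulate_contingency(items, base)
print(f"Contingency at the 80th percentile: {contingency:.1%}")
```

The contingency here reflects only the spread assigned to each item; it does not compensate for a poorly defined scope, which is consistent with the caution above.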
Although the engineer's estimate at the end of the design phase is the most detailed estimate accompanying plans and specifications, it may not represent the final, real cost of the project. Rather, the real cost of the project develops through the construction phase. Engineer's estimates, along with contractors' estimates, known as bids, are at best expert opinions on the final cost of a project. In this arena, the bid process is also a factor in long-term planning of projects.
Competitive bidding is the most common method of procurement in the US public sector. Transportation departments award most projects to the lowest bidders as unit price contracts to comply with government statutes, e.g. Federal Acquisition Regulation System (Schexnayder, Weber, & Fiori, 2003). Regardless of the appropriateness or fairness of this method, a bid may not necessarily represent the bidder's opinion on the contract cost. Strategic manipulation of bids and the behavior of bidders in response to a competitive bidding environment are the main sources of bid variation (Gaver & Zimmerman, 1977). Therefore, bid analysis is often necessary to assure the reliability of bids, particularly the lowest bid, in comparison to the engineer's estimate. Furthermore, some agencies implement alternative awarding processes to select the lowest reliable bid, as opposed to the lowest bid. Examples of these alternative procedures include selecting the average bid (e.g. in Italy and Taiwan), the average bid within the middle 80% (e.g. in Peru), or the lowest bid after excluding discordant bids (Salem Hiyassat, 2001).

Bidding and the extreme value theory
Consider the random variable B as a representation of rational bids from m bidders on a project with a known distribution (Rothkopf, 1969). Suppose that bidders provide positive cost estimates representing their true estimate of the project cost as an independent random variable. Assume that the distribution of X is normal, as in Equation (1). This assumption is not a general necessity, but it agrees with observations made in this study; other distribution functions could be analyzed likewise. Furthermore, assume that the probability of negative values under this distribution is negligible, so that the log-normal distribution function is not required.
Assume that there are n realizations of the sample random variable X, where n represents the number of projects with identical bidder distributions, and let Y denote their minimum value. Then Y is also a random variable, and its distribution function can be derived from that of the initial random variable X. A similar formulation could be provided for the maximum value of X. Figure 1 shows the normal distribution and the extreme value distributions for a given number of projects.
As n grows larger, the distribution of the extreme value Y converges to a particular functional form, the Weibull distribution, which does not depend on the initial distribution. Rather, the tail behavior of the initial distribution, an exponential function, defines the extreme value distribution. The distribution of smallest values is provided by Rothkopf (1969) in terms of two parameters, u and a, which are the characteristic smallest value of X (the mode of Y) and an inverse measure of the dispersion of Y, respectively. The other moments of the smallest values follow from these two parameters and Euler's constant, g = 0.577216. Figure 2 shows the distribution of extreme values so defined.
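For reference, the textbook relations that match this verbal description are sketched below. They are offered as an assumption about the form of the expressions, not as a reproduction of the numbered equations in the source. Here F_X is the distribution function of X, and \gamma is Euler's constant (written g in the text). A closed form with a mode u and an inverse dispersion a is the one usually tabulated as the Type I (Gumbel) asymptote for smallest values.

```latex
\begin{align*}
Y &= \min(X_1, X_2, \ldots, X_n) \\
F_Y(y) &= 1 - \left[1 - F_X(y)\right]^{n} \\
F_Y(y) &\approx 1 - \exp\!\left[-e^{a (y - u)}\right] \qquad \text{for large } n \\
E[Y] &= u - \frac{\gamma}{a}, \qquad \sigma_Y = \frac{\pi}{a\sqrt{6}}, \qquad \gamma \approx 0.577216
\end{align*}
```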

Selected engineer's estimates and bids
Projects are unique endeavors, characterized by specific scope, schedule, and cost. Furthermore, the engineer's estimate and the bid values for a project are subject to additional economy-driven parameters of the market at the time of delivery. Therefore, statistical data-sets should be carefully selected based on the common characteristics of projects. A large number of sample projects with different scopes or advertising times may not provide more reliable outcomes, regardless of the resulting statistical confidence levels. The main data-set in this research contains 22 projects with identical scope within a single program. This data-set is used to show the development of the cost estimate throughout the life cycle of the project design. These projects were developed and delivered over a four-year period. Because of the wide range of their advertising dates, their bid values may not represent the same market characteristics. To compare bid values, a subset of the above-referenced data-set is considered, as shown in Table 1. Sample projects in this subset were advertised within a four-month time frame. All projects received three to eight bids and were awarded to the lowest bidder. The standard deviation of the sample is 33%, and the chi-square test result confirms that the engineer's estimate distribution over the sample projects follows a normal distribution at the 99.44% confidence level. The skew is as small as −0.05 and the median is 1.01.

Development of cost estimate
Figure 3 shows the development of the cost estimate during different stages of project development. Each line in this graph reflects real data for a sample project in the main data-set introduced in the previous section. Thus, this graph contains more projects than the subset listed in Table 1. All projects have the same scope of work, but were advertised in earlier years. All values are normalized to the final cost of the project. The lower and upper bound curves in this graph show the level of accuracy in cost estimation at four specific milestones. Table 2 summarizes these observations together with expected values reported by AbouRizk, Babey, and Karumanasseri (2002) and Rothwell (2004). The cost estimate at project initiation varies within ±80% of the final cost. The cost estimate at this stage is often based on the average unit cost of major construction activities obtained from historical records. The main sources of inaccuracy at this stage are uncertainties in the definition of project scope, method, and schedule. Specifically, project initiation takes place nearly three years before construction begins. Although some adjustments could be made to account for inflation, alternative construction methods and materials, and other minor items, unjustified contingencies should not be included in the estimate. Such high contingencies do not improve the accuracy and would only shift the estimate to higher amounts, causing unnecessary blockage of funds.
The major qualitative improvement in the cost estimate occurs in the design stage, where the engineer's estimate varies within ±40% of the final cost. Preparation of plans and specifications provides the opportunity for the engineer to narrow down the scope and method of the work. Construction of the sample projects takes place, on average, between 4 and 8 months after the engineer prepares the estimate. Therefore, the engineer's estimate should also include adjustments for inflation and unforeseen conditions within the scope of the project.
Award allocation is the assumed cost of project for contracting purposes. Bidders provide their bids in the advertising period, from two to three months before beginning construction. Further increase or decrease in the project cost is often due to minor changes in the quantities of the work or additional minor tasks which are necessary to perform the project.

Actual cost of the project
Although the contract cost is derived from the lowest bid, this number may not represent the true cost of the project under the general conditions assumed in the engineer's estimate. Bidders consider the availability of their resources at the time and place of construction when preparing their bids. For instance, preoccupation with other projects, lack of experience in the specific environment or location, overheads, and size of the business tend to raise the bid amount. Conversely, a desire to expand business, or access to resources already mobilized in the vicinity of the project or within a certain time frame, are usual causes of lower bids. Moreover, minor errors might also change the bid amount without justifying bid withdrawal by the bidder or bid rejection by the owner. In this respect, the lowest bid might simply come from the bidder who has specific advantages in accessing required resources, has a specific desire to get the project, or has provided an erroneous bid. Statistical analysis of bids is the tool to search for the true cost of the project within bid results, by recognizing non-project-related parameters.
Table 3 summarizes the results of a sample study of various hypotheses on the best estimate of project cost. Three error measures are provided for each hypothesis. The error is defined as the difference between a reference value, the engineer's estimate in this study, and the proposed cost estimate. The weighted average of these errors accounts for the cost of each project and the dollar amount of the difference. Therefore, this measure is the most useful value for evaluating the entire program, considering both over- and under-estimated bids. The un-weighted average of errors represents the simple average over individual projects. Further, the least square method represents the probable error in the program, considering the independence of projects and the orthogonal nature of error values for different projects. This measure is simply a test to validate the consistency of results.
The low bid concept is evaluated through three different measures. The lowest bid column, representing the commonly accepted rule, reveals the large gap between the engineer's estimate and the lowest bid. In other words, taking the lowest bid as the true cost of the project is the poorest hypothesis among the provided methods. The second lowest bid is a more reliable value, especially when the lowest bid is suspected to represent bidders with the special characteristics discussed earlier. The substantially lower error measures in Table 3 indicate this point. The average of the three lowest bids is another approach to filter out special cases in bids. However, this method is not helpful in projects with fewer than four bids. The results of this method are closer to those of the second-lowest-bid approach.
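The three error measures described in this section can be sketched as follows. The dollar amounts are hypothetical, and the third measure is written as a root-sum-of-squares of the dollar errors divided by the program total, which is only one possible reading of the least square measure; the source does not give the exact expression.

```python
from math import sqrt

def error_measures(reference, proposed):
    """Compare proposed cost estimates against reference values (same order).

    Returns (weighted, unweighted, root_sum_square), each as a fraction.
    """
    diffs = [r - p for r, p in zip(reference, proposed)]  # dollar error per project
    total = sum(reference)
    weighted = sum(diffs) / total  # program-level error, weighted by dollar amount
    unweighted = sum(d / r for d, r in zip(diffs, reference)) / len(reference)
    # Assumed reading: project errors combined as independent (orthogonal) terms
    root_sum_square = sqrt(sum(d * d for d in diffs)) / total
    return weighted, unweighted, root_sum_square

# Hypothetical engineer's estimates and lowest bids, in dollars
estimates = [1_000_000, 500_000, 250_000]
low_bids = [800_000, 450_000, 275_000]
weighted, unweighted, rss = error_measures(estimates, low_bids)
print(f"weighted {weighted:.3f}, unweighted {unweighted:.3f}, root-sum-square {rss:.3f}")
```

In the weighted and un-weighted averages, over- and under-estimates partly cancel, whereas the root-sum-square measure does not let them offset each other. That difference is why the third measure can serve as a consistency check.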
The average bid is a well-known approach to estimating the real cost of the project. This method provides the lowest least-square-root-error value in the table of results for a small number of projects. However, it does not perform well in the presence of very low or very high bids. Therefore, another approach removes the highest and lowest bids before taking the average. In an alternative approach, all low and high bids outside the middle 68% of the population (average ± one standard deviation) are removed before evaluating the average bid amount. The results indicate that trimming extreme values reduces the error. For comparison, the error values for the median bid, at the 50th percentile, are also provided. Figure 4 shows the distribution of bids for the sample projects, representing the ratio of each bid to the average bid of the same project. This distribution is close to the normal distribution with 99% confidence.
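The candidate estimates of project cost discussed in this section can be computed from a single project's bids as sketched below. The bid values are hypothetical, the dictionary keys are illustrative names, and the use of the population standard deviation for the one-standard-deviation trim is an assumption, since the source does not state which form was used.

```python
from statistics import mean, median, pstdev

def candidate_costs(bids):
    """Return candidate 'true cost' values derived from one project's bids."""
    b = sorted(bids)
    avg = mean(b)
    sd = pstdev(b)  # assumption: population standard deviation
    within_one_sd = [x for x in b if abs(x - avg) <= sd]
    return {
        "lowest": b[0],
        "second_lowest": b[1] if len(b) > 1 else None,
        # Not meaningful with fewer than four bids, as noted in the text
        "avg_three_lowest": mean(b[:3]) if len(b) >= 4 else None,
        "average": avg,
        "avg_without_extremes": mean(b[1:-1]) if len(b) > 2 else None,
        "avg_within_one_sd": mean(within_one_sd) if within_one_sd else None,
        "median": median(b),
    }

# Hypothetical bids for one project, in dollars
bids = [800_000, 950_000, 1_000_000, 1_050_000, 1_200_000]
costs = candidate_costs(bids)
for name, value in costs.items():
    print(f"{name}: {value}")
```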

Statistical characteristics of bids
Further analysis reveals that bids are distributed with an average standard deviation of 11.6%. Considering each bid as an expert opinion on the project cost, and disregarding the engineer's estimate, the variation of the cost estimate is practically higher than 8% at the 99% confidence level. In this distribution, the lowest bid falls between the 0.004 and 4.237 percentiles of bids, with an average at the 1.34 percentile of sample bids for the first data-set. The average ratio of the lowest bid to the engineer's estimate is 84%. Figure 5 shows the distribution of these ratios for the sample projects. The confidence level for a normal distribution of the ratio of lowest bid to engineer's estimate is only 90.2%, which is much less than that of the bid distribution in Figure 4.
Considering instead the ratio of the lowest bid to the average bid, an alternative distribution is obtained, as shown in Figure 6. This distribution has an average of 86%, with a 99.9% confidence level for a normal distribution. The median is the same as the average, at 86%. The standard deviation is as low as 0.048 and the skew is 0.33. Using Equations (5) and (6), the distribution of extreme values can be obtained as shown in Figure 7. This graph indicates that the lowest bid would fall within 69-96% of the average bid at the 95% confidence level. Further, assuming no discrepancy between the average bid and the engineer's estimate, the expected value of the lowest bid is 18% below the engineer's estimate, or 82% of the engineer's estimate. This number is very close to the average ratio of the lowest bid to the engineer's estimate, 84%, evaluated from the distribution shown in Figure 5.
For a large number of projects, the distribution would converge to the graphs shown in Figure 8. This graph shows how the distributions of the smallest and largest values move apart for a large number of bids. Assuming a 95% confidence level, the standard deviation of bids would be 9.6%. The engineer's estimate at this standard deviation would be within ±20% of the average bid at the same confidence level.
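One way to turn a mean and a standard deviation into a smallest-value distribution is a method-of-moments fit of the Type I (Gumbel) form, using the mean and dispersion relations u − g/a and π/(a√6). The sketch below does this with the mean (0.86) and standard deviation (0.048) quoted above for the ratio of lowest bid to average bid. The fitting procedure behind Figure 7 is not recoverable from the text, so this is an assumed procedure and its output should not be expected to reproduce the reported range.

```python
from math import exp, log, pi, sqrt

EULER_GAMMA = 0.577216  # Euler's constant, written g in the text

def gumbel_min_params(mean, std):
    """Method-of-moments fit: mean = u - g/a and std = pi / (a * sqrt(6))."""
    a = pi / (std * sqrt(6))
    u = mean + EULER_GAMMA / a
    return u, a

def gumbel_min_cdf(y, u, a):
    """Probability that the smallest value is at or below y."""
    return 1.0 - exp(-exp(a * (y - u)))

def gumbel_min_quantile(p, u, a):
    """Inverse of gumbel_min_cdf for 0 < p < 1."""
    return u + log(-log(1.0 - p)) / a

u, a = gumbel_min_params(0.86, 0.048)
lower = gumbel_min_quantile(0.025, u, a)
upper = gumbel_min_quantile(0.975, u, a)
print(f"u = {u:.3f}, a = {a:.1f}, central 95% range = {lower:.3f} to {upper:.3f}")
```

The resulting range is asymmetric about the mean, with a longer tail toward low ratios, which is the qualitative behavior expected of a smallest-value distribution.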
Suppose that the average bid equals the engineer's estimate, i.e. a mean ratio of 1.0. Then the rational lowest bid ratio can be derived for different values of n, as shown in Table 4. Figure 9 shows the ratio of bids to the engineer's estimate. The shape of this distribution is an important tool for understanding the behavior of bidders. The confidence level for normality of this distribution is only 80%, which would reject the normal hypothesis. Further, the skew of the distribution is 0.4, which would confirm the visual observation that the distribution is skewed toward values smaller than the engineer's estimate. Nearly 64% of bids are below the engineer's estimate in the sample projects.
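The dependence of the expected lowest ratio on n can be illustrated numerically for a normal parent distribution. The sketch below integrates the density of the minimum of n independent normal values. The default mean of 1.0 follows the supposition above, and the default standard deviation of 0.116 is the average bid dispersion reported earlier. Treating these as the inputs is an assumption, so the printed values are not claimed to match Table 4.

```python
from statistics import NormalDist

def expected_lowest_ratio(n, mean=1.0, std=0.116, steps=4000):
    """Expected value of the smallest of n independent normal ratios.

    Integrates y * n * f(y) * (1 - F(y))**(n - 1) with the trapezoidal
    rule over mean +/- 8 standard deviations.
    """
    dist = NormalDist(mean, std)
    lo, hi = mean - 8 * std, mean + 8 * std
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        y = lo + i * h
        g = y * n * dist.pdf(y) * (1.0 - dist.cdf(y)) ** (n - 1)
        total += g if 0 < i < steps else 0.5 * g  # half weight at both ends
    return total * h

for n in (3, 5, 8):
    print(n, round(expected_lowest_ratio(n), 3))
```

The expected lowest ratio decreases as n grows, so a wider gap between the lowest bid and the engineer's estimate is expected for larger n even when every value is a rational, unbiased estimate.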

Conclusions
Development of cost estimates in various stages of projects shows that an unjustified increase of costs through large contingencies does not improve the accuracy of estimates. Distinguishing between errors caused by inaccurate predictions and variations caused by inherent parameters is important for the appropriate use of contingency. A summary of observed variations in the cost estimates of sample projects indicates that the accuracy of cost estimates improves from 80% deviation to 40% deviation as the project progresses from programming to design. The lowest bid (award amount) may also deviate from the final cost by 20%. These deviations are not symmetric; that is, cost under-estimation is more frequent and more considerable than cost over-estimation throughout project development. This trend changes slightly after the award, as over-estimation was observed to be higher than under-estimation for the sample projects.
Statistical analysis of bid results for sample projects indicates that the normal distribution is appropriate to model the ratio of bids to the average bid. This confirms the perception of bid values as expert opinions about the cost of a project. Further, the performance of the engineer's estimate should be evaluated based on the trimmed average of bids, excluding discordant and extremely low or high bids, rather than the lowest bid. Thus, the accuracy of the engineer's estimate may not directly correlate with the reliability of the budget, as the budget is ultimately compared with the award amount or the lowest bid.
Applying extreme value theory reveals that the distribution of lowest bids is independent of the initial distribution of bids. The expected range of lowest bids can be derived for a desired level of confidence using the proposed analytical methods. This conclusion directly relates to budget estimation for a multi-project program, where the total budget is the sum of the lowest bids for each project.
The validity of the provided results is limited to the assumptions made for rational competitive bidding. Errors caused by deficient design, bid manipulation, change orders, and similar sources were neither investigated nor considered in the analysis. Further, numerical analyses were performed on a limited number of projects with specific scope and size. Moreover, market parameters were intentionally filtered out by careful selection of projects advertised in a short time period. Regardless, the presented methodology can be applied to a larger pool of samples to provide appropriate guidelines on the accuracy and performance of the engineer's estimate and the reliability of bids in various situations.