Numerical anchors and their strong effects on software development effort estimates
Introduction
Imagine that a software developer is in a meeting about a new, large project, and that a client asks whether the developer thinks it will take less than 20 h to complete the new project. Although this number is absurdly low, will it affect the final estimate that the software developer produces? Previous research, see Section 1.1, suggests that after receiving this irrelevant question, the software developer will actually give a lower estimate of how long it will take to complete the project than he/she otherwise would have. Numerical anchors of this kind can strongly influence estimates of software development effort, but are there some contexts in which anchors have stronger effects than others? In this paper, we investigate whether the numerical precision of the anchor and the credibility of the source of the anchor can moderate the strength of the anchoring effect in a software development effort estimation context.
Estimates of software project cost and effort are necessary for several purposes, e.g., planning, budgeting and bidding. The consequences of highly inaccurate estimates can be severe. If an effort or cost estimate is too low, the provider may choose to produce the product with lower than desired quality to avoid financial losses, the delivery may be delayed with the consequence that the client loses market opportunities, or the profitability of the project can become negative, i.e., the client would not have started the project if presented with a realistic estimate. Too high an effort or cost estimate may result in inefficient resource use and lost business opportunities, e.g., providers may lose bidding rounds due to the price being too high.
By far the most common way to estimate the effort and cost of software systems, in spite of years of research on formal software effort estimation models, is to ask developers with experience in the field to give their best judgement of the most likely effort needed to develop the system (Jørgensen, 2004). Unfortunately, human experts are not always as good at estimating as one could hope: estimates of cost and effort in software projects are often inaccurate, with an average overrun of about 30% (Halkjelsvik and Jørgensen, 2012). One reason for this inaccuracy is that human judgements are frequently based on heuristics, i.e., rules of thumb or mental strategies that satisfice rather than optimize (Kahneman, 2003). When there is a good match between the context and the heuristics, the use of heuristics will frequently produce accurate predictions. However, the use of heuristics can also lead to biased judgement and poor predictions (Kahneman et al., 1982). Hence, although expert judgement-based effort estimates may be reasonably accurate in some contexts, there are also contexts in which reliance on judgemental heuristics leads to highly inaccurate estimates.
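The overrun figure above can be made concrete with a small sketch: overrun is the actual effort's deviation from the estimate, relative to the estimate. The function name and the example numbers below are our own illustration, not taken from the cited studies:

```python
def overrun_pct(estimated_hours, actual_hours):
    """Percentage effort overrun relative to the original estimate.

    Positive values mean the work took longer than estimated;
    negative values mean the estimate was too high.
    """
    return 100 * (actual_hours - estimated_hours) / estimated_hours

# A project estimated at 1000 work-hours that ends up needing 1300
# work-hours has a 30% overrun, in line with the average reported above.
print(overrun_pct(1000, 1300))  # 30.0
```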
The anchoring effect is one of the best-documented findings in the heuristics and biases approach (Klein et al., 2014). Anchoring occurs when judgements are influenced by an initially presented value (the anchor value). An example could be a client or a manager with unrealistically low cost expectations asking a developer whether he/she thinks a task will take more than 3 days. The developer's estimate will then tend to be closer to the anchor value than it would have been had the anchor not been presented. In this example, anchoring could therefore lead to too low estimates, with potential negative consequences such as delays or budget overruns. An anchor tends to influence the subsequent judgement even when participants are explicitly informed that the presented value is not relevant to the judgement in question. Researchers have established the anchoring effect using many different kinds of target judgements and many different kinds of anchors (Furnham and Boo, 2011), including important real-world judgements, such as criminal trial judges’ sentencing decisions (Englich and Mussweiler, 2001) and real estate agents’ estimates of the value of a property (Northcraft and Neale, 1987).
Several studies have documented the relevance of the anchoring effect in effort estimation. For instance, Jørgensen and Sjøberg (2004) gave computer science students and software professionals information about customer expectations – they told one group that the client believed that 50 h and another group that the client believed that 1000 h would be a reasonable estimate for the total cost of a software project. Even though the participants were informed that the client knew very little about the time needed and that they should not consider this information as relevant, the anchors strongly influenced both students and professionals, with estimates in the “1000 h”-group much higher than the estimates of the “50 h”-group. Follow-up questions revealed that the participants were not aware of or strongly underestimated this influence (Jørgensen and Sjøberg, 2004).
Other studies have found similar results with students estimating tasks such as answering a set of questions about different items in a commercial catalogue (König, 2005) or building a miniature plastic castle (Thomas and Handley, 2008), as well as for students and professionals estimating software tasks (Aranda and Easterbrook, 2005), even with extreme anchor values (Jørgensen and Grimstad, 2008), and in a field context (Jørgensen and Grimstad, 2011). Non-numerical anchors can also influence effort estimates: describing a project as a “minor extension”, which may lead the developers to anchor their estimates in smaller tasks, led to a lower effort estimate than when the same project was described as developing “new functionality” (Jørgensen and Grimstad, 2008, 2012). Studies also show that it is quite hard to remove the effects of anchors. For instance, when an anchoring value is followed by another anchor at the other extreme of the scale (for instance, a low anchor followed by a high anchor), it is the first anchor that exerts the strongest influence on the final effort estimate (Jørgensen and Løhre, 2012). The study also showed that an instruction to forget the anchor does not decrease the anchoring effect – in fact, if anything, the effect on the effort estimate is slightly stronger after such an instruction.
Overall, previous studies demonstrate that the presentation of an anchor value influences estimates of project or task effort and that the anchoring effect is important to keep in mind for professionals involved in estimation work. This also means that it is important to identify the factors or situations in which the anchoring effect is amplified or attenuated. Software developers could use knowledge of such factors to take extra precautions against specific situations or as a guide to when anchoring effects are less likely to pose a serious risk to the realism of the project's effort estimate.
In this paper, the attitude change theory of anchoring (Wegener et al., 2001) inspired us to view anchoring as a communication process. While traditional studies of anchoring take great care to emphasize to the participants that the anchor is not relevant to the judgement in question (for instance by generating the anchor using a “wheel of fortune”), in a natural communication process numerical values that could influence effort estimates are often introduced as a more or less relevant part of a conversation. In such cases, the participants can allot some informational relevance to the anchor values. For instance, a project manager might ask whether a project will take more than a certain number of hours, just to gain a rough idea of the scope of the project. With this kind of approach, one can hypothesize that subtle changes in different parts of the communication process can influence the anchoring effect. Here, we focus on two somewhat related factors of particular interest regarding anchoring in software development effort estimation, namely how a difference in the numerical preciseness of the anchor and/or the perceived credibility (expertise) of the person providing the anchor affects the strength of the anchoring effect.
Studies within the attitude change approach have found that anchors from highly credible (expert) sources lead to stronger anchoring effects than anchors from less credible (non-expert) sources (Wegener et al., 2009; cited in Wegener et al., 2010). Another study within the same approach found that more extreme (and hence less credible or plausible) anchors can have less influence on judgements than more moderate anchors (Wegener et al., 2001). A study suggesting that less reliable sources lead to the removal of the framing effect also supports the moderating effect of credibility (Jørgensen, 2013). Together, these studies indicate that both the source of the anchor value and the exact value of the anchor can be important moderators of the anchoring effect.
Relatedly, a recent study in a price negotiation context found that high preciseness of the initial offer indicated more expertise on the anchor provider side and led to stronger anchoring effects than when the initial offer was a round number (Mason et al., 2013). Another similar study (Zhang and Schwarz, 2013), using a more traditional anchoring procedure, also found an increased anchoring effect with more precise numbers, but only when the number was pragmatically relevant. Mason et al. (2013) argue that people see a precise number as indicating a greater level of knowledge and therefore consider it to be more informative of the true value. This explanation again points to a possible influence of the perceived expertise or credibility of the source of the anchor.
In software effort estimation contexts, this could mean that exposure to round anchor values or anchor values from sources without expertise would introduce less bias than exposure to more precise anchors or anchors from competent or relevant sources (provided that the anchors were equally off the mark). Although previous studies of anchoring in effort estimation suggest that informing participants about the low competence of the source of the anchoring value does not eliminate the anchoring effect (Aranda and Easterbrook, 2005; Jørgensen and Sjøberg, 2004), anchor values from competent sources have not been compared directly with anchor values from non-competent sources. Therefore, we do not know whether the effect of anchors on effort estimates is reduced when the anchor stems from a less competent source.
In the current experiments, we hypothesize that we will find a strong effect of initially presented numerical values on software project effort estimates, replicating previous studies. In addition, we examine two related questions not addressed in previous papers on anchoring in the domain of effort estimation:
Q1: If the anchor value is imprecise (round), does it reduce the anchoring effect? For example, does the question of whether the project will take more than 1000 h influence an effort estimate less than the question of whether it will take more than 998 work-hours?
Q2: If it is clear that the source of the anchor is less credible, does it reduce the anchoring effect? Does it, for example, make a difference whether a clerk without software development experience or the project manager with technical competence asks whether a task will take more than 10 work-hours?
We hypothesize that the answer to both of these questions is yes, based on previous research from other domains showing stronger anchoring effects with more precise anchors (Janiszewski and Uy, 2008; Loschelder et al., 2014; Mason et al., 2013; Zhang and Schwarz, 2013) and weaker anchoring effects with less credible sources (Wegener et al., 2009).
The current experiments introduce a new way of varying the precision of an anchor, by comparing traditional single anchor values with interval anchors (“How likely is it that the task will take between 900 and 1100 h?”). Intervals are highly relevant to software effort estimation, as it is common practice to describe the uncertainty of an estimate by using an interval (Connolly and Dean, 1997, Jørgensen et al., 2004). In the early phases of a project, when there is a great deal of uncertainty about how the project will turn out, it might be more common to suggest possible ranges for the most likely effort than to suggest single point estimates. It will therefore be interesting to see whether such interval anchors lead to weaker (or stronger) anchoring effects than more precise single anchors.
Section snippets
The study design
We invited 423 software professionals from Romania, Ukraine, Argentina and Poland to participate in a set of three experiments. All the participants were required to have good English skills, so that they could properly read and understand specifications written in English. All the participants received a normal hourly wage for their estimation work. Some of the invited participants gave estimates that indicated that they had misunderstood the instructions; for instance, in cases in which the
Design
We gave the participants a description of a web-based application for visualizing information about the amount of software development offshoring in a country on a world map (Project A). We assigned the participants randomly to one out of five groups:
- The control group, which we simply asked to estimate the most likely, minimum and maximum number of work-hours needed to develop and test a system meeting the requirements.
- The precise single anchor group, which we asked how likely they thought it
Design
The rationale of Experiment 2 was similar to that of the first experiment, and we again investigated the potential effect of numerical precision on anchoring. However, in contrast to Experiment 1, here we used low rather than high anchor values. In addition, we expected the anchor values in this experiment to be less extreme than those employed in the first experiment. In Experiment 1, the anchor values were clustered around 1000 h, which turned out to be 25 times higher than the median
Design
In Experiment 3, we focused on the credibility of the source of the anchor as a potential moderator of the anchoring effect. We described to the participants a web-based library system that displays information about scientific publications that should be stored in an SQL database (Project C). After reading the specifications, we asked the participants in the control group to estimate the most likely, minimum and maximum effort. There were three anchoring groups, all receiving a low anchor of
Summary of findings
The aim of these experiments was to investigate whether the preciseness of the anchor and/or the perceived credibility of the source of the anchor moderates the anchoring effect in software project effort estimation. Finding moderators for the anchoring effect could be helpful for people involved in estimation by providing them with knowledge of situations in which they should be more or less concerned about biased project effort estimates due to anchoring. In our studies, we varied the
Acknowledgement
The authors would like to thank Karl Halvor Teigen for helpful comments on an earlier draft of this paper.
Erik Løhre is a post-doc fellow at Simula Research Laboratory in Oslo, Norway. He received his PhD in psychology from the Department of Psychology at the University of Oslo in 2015. His research focuses on human judgement, including judgement-based software development effort estimation.
References (31)
- Furnham, A., Boo, H.C., 2011. A literature review of the anchoring effect. J. Socio-Econ.
- Jørgensen, M., 2004. A review of studies on expert estimation of software development effort. J. Syst. Softw.
- Jørgensen, M., Sjøberg, D.I.K., 2004. The impact of customer expectation on software development effort estimates. Int. J. Project Manage.
- Jørgensen, M., et al., 2004. Better sure than safe? Over-confidence in judgement based software development effort prediction intervals. J. Syst. Softw.
- Mason, M.F., et al., 2013. Precise offers are potent anchors: conciliatory counteroffers and attributions of knowledge in negotiations. J. Exp. Soc. Psychol.
- Mussweiler, T., Strack, F., 2001. The semantics of anchoring. Organ. Behav. Hum. Decis. Process.
- Northcraft, G.B., Neale, M.A., 1987. Experts, amateurs, and real estate: an anchoring-and-adjustment perspective on property pricing decisions. Organ. Behav. Hum. Decis. Process.
- Thomas, K.E., Handley, S.J., 2008. Anchoring in time estimation. Acta Psychol. (Amst).
- Wegener, D.T., et al., 2010. Elaboration and numerical anchoring: implications of attitude theories for consumer judgment and decision making. J. Consum. Psychol.
- Wegener, D.T., et al., 2001. Implications of attitude change theories for numerical anchoring: anchor plausibility and the limits of anchor effectiveness. J. Exp. Soc. Psychol.
- Zhang, Y.C., Schwarz, N., 2013. The power of precise numbers: a conversational logic analysis. J. Exp. Soc. Psychol.
- Aranda, J., Easterbrook, S., 2005. Anchoring and adjustment in software estimation. Softw. Eng. Notes.
- Connolly, T., Dean, D., 1997. Decomposed versus holistic estimates of effort required for software writing tasks. Manage. Sci.
- Cumming, G., 2014. The new statistics: why and how. Psychol. Sci.
Magne Jørgensen received the Diplom-Ingenieur degree in Wirtschaftswissenschaften from the University of Karlsruhe, Germany, in 1988 and the Dr. Scient. degree in informatics from the University of Oslo, Norway, in 1994. He has about 10 years of industry experience as a software developer, project leader and manager. He is now a professor of software engineering at the University of Oslo and a member of the software engineering research group at Simula Research Laboratory in Oslo, Norway. Magne Jørgensen has supported software project estimation improvement work in several software companies.