Individual Differences in Anchoring Effect: Evidence for the Role of Insufficient Adjustment

Although the anchoring effect is one of the most reliable results of experimental psychology, researchers have only recently begun to examine the role of individual differences in susceptibility to this cognitive bias. However, the first correlational studies yielded inconsistent results, failing to identify any predictors with a systematic effect on anchored decisions. The present research seeks to remedy the methodological shortcomings of previous research by employing a modified within-subject anchoring procedure. Results confirmed the robustness of the phenomenon in the extended paradigm and replicated previous findings on anchor direction and distance as significant experimental factors in the size of the anchoring effect. The obtained measures of individual differences in susceptibility to anchoring were fairly reliable but shared only a small portion of variability with intelligence, cognitive reflection, and basic personality traits. However, in a group of more reflective subjects, a substantial negative correlation between intelligence and anchoring was detected. This finding indicates that, at least for some subjects, the effortful cognitive process of adjustment plays a role in the emergence of the anchoring effect, in line with the expectations of dual-process theories of human reasoning.

tive reflection on anchoring effects might interact. More precisely, one could expect that individual differences in intelligence play a role in the emergence of the anchoring effect only when Type 2 processes are initiated, i.e. when subjects' reflectivity is high. Conversely, intelligence and anchoring should not be related if subjects are more impulsive, i.e. prone to rely on intuitive Type 1 processes.
Finally, it seemed practically worthwhile to empirically examine the correlations that anchoring might have with personality traits. Previous studies on the relation between the anchoring effect and the Big Five personality traits have yielded mixed results. For example, McElroy and Dowd (2007) detected a modest positive correlation between susceptibility to anchoring and openness to experience, while Eroglu and Croxton (2010) failed to replicate this finding but found significant, although relatively small, associations with agreeableness, conscientiousness, and introversion. In contrast, Caputo (2014) reported that anchoring was negatively related to agreeableness and openness. Other studies reported sporadic and practically negligible correlations between the Big Five personality traits and susceptibility to anchoring (Furnham et al., 2012; Jasper & Ortner, 2014).

Research Aims and Hypotheses
From the experimental perspective, the research aimed to examine how both the direction and the relative distance of the anchor contribute to the size of the anchoring effect. A non-linear relationship between anchor distance and the size of the anchoring effect (Mussweiler & Strack, 2001; Wegener et al., 2001), as well as a stronger effect of positively directed anchors (Hardt & Pohl, 2003; Jacowitz & Kahneman, 1995; Jasper & Christman, 2005), were expected. In order to ensure that the relative distance of the anchor was approximately the same for all participants, the standard paradigm was extended by introducing a pretest session (see Method section). This procedural intervention also made it possible to examine whether reliable individual differences in susceptibility to the anchoring effect could be collected.
The main correlational aim of the research was to determine whether traditional psychometric constructs can predict the anchoring effect, with the expectation that the relation between anchoring and intelligence is moderated by cognitive reflection (Evans & Stanovich, 2013; Kahneman, 2011; Stanovich, 2009). On the other hand, considering the inconsistent findings of previous studies, it was difficult to formulate strong expectations regarding the relation between the anchoring effect and the Big Five personality traits.

Method

Participants
A total of 236 special-education undergraduate students (214 females; age M = 19.83, SD = 1.31) participated in this study, which was part of a wider research project on cognitive biases (see Teovanović, 2013; Teovanović, Knežević, & Stankov, 2015), in return for partial course credit. Participants gave their informed consent before taking part in the study.

Anchoring Experiment
Material

A set of 24 general knowledge questions, all requiring numerical answers, was used in the anchoring experiment. Relatively difficult questions, covering various topics, were chosen with the intention of inducing conditions of uncertainty. Participants were not expected to know the precise answers but to provide approximate estimates of the target quantities.

Design
The anchoring experiment employed a 2 × 4 within-subject design, with anchor direction and anchor distance as repeated factors. Anchor direction had two levels: positive, for anchors placed above, and negative, for anchors placed below the values of the initial estimates. Four levels of relative anchor distance were set by adding or subtracting a particular proportion (20%, 40%, 60%, or 80%) of the initial estimate's value. Three different general knowledge questions were used in each of the eight experimental conditions (see the first two rows in Table 1), resulting in a total of 24 questions.

Procedure
The two-session anchoring experiment was administered via a computer program used for the presentation of instructions and questions, the recording of answers, and the calculation of individual anchor values. In the first session, questions were presented in a randomized order, one at a time, and participants were instructed to state their answers using a numeric keypad. Answers were recorded as initial estimates (E1) and used in the subsequent session as the basis for setting anchors (A) at the various levels of the distance and direction factors.
As previously stated, for each participant (p) on each presented question (q), individual anchor values (A_pq) were calculated by multiplying initial estimates (E1_pq) by predetermined multipliers (which ranged between 0.2 and 1.8 across questions) and rounding to whole numbers. At the beginning of the second session, which followed immediately, an instruction was presented on the screen: "In the previous session, you answered a set of questions. In the following one, you will be asked to answer the same questions again. You are allowed, but not obliged, to change your mind and provide amended answers." The same 24 questions were presented in a new random order, but this time the standard anchoring paradigm was applied, i.e. for each question participants were administered a comparative task and a final estimation task. More precisely, the participant was first required to indicate whether her/his final response would be higher or lower than the value of the specific anchor (A_pq). After that, the participant stated the final response (E2_pq) using the numeric keypad.
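As a minimal sketch of the anchor construction described above (the function name is illustrative; the original computer program is not available), the multiplier of 0.2 to 1.8 corresponds to subtracting or adding 20% to 80% of the initial estimate:

```python
def anchor_value(initial_estimate: float, distance: float, direction: int) -> int:
    """Compute an individual anchor A_pq from an initial estimate E1_pq.

    distance: relative anchor distance as a fraction (0.2, 0.4, 0.6, or 0.8)
    direction: +1 for a positive anchor (above E1), -1 for a negative one (below E1)
    """
    multiplier = 1 + direction * distance  # yields values between 0.2 and 1.8
    return round(initial_estimate * multiplier)  # rounded, as in the procedure

# e.g. an initial estimate of 50 with a positive 40% anchor gives 70,
# and with a negative 40% anchor gives 30
```

This makes the design's key property explicit: each anchor is defined relative to the participant's own initial estimate, so relative anchor distance is held constant across participants.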

Measure
Measures of the anchoring effect (AE) were calculated for each participant on each question by using the following formula: AE_pq = ((E2_pq − E1_pq) / E1_pq) × 100.
As an index of the relative amendment in estimation after the anchor was introduced, AE expresses the difference between the initial and final estimate as a percentage of the initial estimate's value (Note i). Since relative anchor distance was also expressed as a percentage of the initial estimate, the two measures could be directly compared.
For all questions, a zero value of AE indicated no difference between the initial and final estimate, i.e. the absence of the anchoring effect.
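The AE measure can be sketched as follows (illustrative function name; zero initial estimates are excluded, as noted in Note i):

```python
def anchoring_effect(e1: float, e2: float) -> float:
    """AE_pq = (E2_pq - E1_pq) / E1_pq * 100, the relative amendment in percent.

    e1: initial, anchor-free estimate (E1_pq); e2: final estimate (E2_pq).
    AE is undefined when the initial estimate is zero.
    """
    if e1 == 0:
        raise ValueError("AE cannot be calculated for a zero initial estimate")
    return (e2 - e1) / e1 * 100

# a final estimate of 60 after an initial estimate of 50 is a 20% amendment:
# anchoring_effect(50, 60) -> 20.0
```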

Data Trimming
In order to control for the unwanted effect of outlying AE values, a trimming procedure was applied. The absence of the anchoring effect and the maximum of the anchoring effect determined the lower and upper bounds of the acceptable range for AE measures. However, two types of data departure were both expected and observed. The first concerns under-anchoring, which was registered when AE was lower than zero for positive anchors, i.e. higher than zero for negative anchors. Similarly, over-anchoring was indicated when the values of final estimates were smaller than the values of negatively directed anchors, i.e. higher than the values of positively directed anchors. Measures that fell outside the acceptable range were fenced to its boundaries. The corresponding second estimates (E2) were also amended so that the applied formula yields the trimmed AE measure. The described procedure was applied to 17.5% of all AE and E2 measures (see the last two rows of Table A1 in the Appendix) (Note ii).
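On one reading of this rule (a simplified sketch; names are illustrative), the acceptable AE range for a positive anchor runs from 0 (no anchoring) to the anchor's relative distance in percent (full anchoring, i.e. a final estimate equal to the anchor), and symmetrically for negative anchors:

```python
def trim_ae(ae: float, distance_pct: float, direction: int) -> float:
    """Fence an AE value to the acceptable range for its experimental condition.

    distance_pct: relative anchor distance in percent (20, 40, 60, or 80)
    direction: +1 for a positive anchor, -1 for a negative one
    Under-anchoring (movement away from the anchor past E1) is fenced to 0;
    over-anchoring (movement past the anchor itself) is fenced to the anchor.
    """
    lo, hi = (0.0, distance_pct) if direction > 0 else (-distance_pct, 0.0)
    return min(max(ae, lo), hi)

# under-anchoring on a positive 40% anchor: trim_ae(-5.0, 40, +1) -> 0.0
# over-anchoring on a negative 60% anchor: trim_ae(-75.0, 60, -1) -> -60.0
```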

Individual Differences Measures
One week prior to the anchoring experiment, several cognitive ability tests and one personality questionnaire were computer-administered to the same group of participants. Personal identification numbers were used to match participants' data.

Raven's Matrices
Raven's Matrices (RM; Raven, Court, & Raven, 1979). For each of the 18 tasks, participants were asked to identify the missing symbol that logically completes a 3 × 3 matrix by choosing from among five options. Participants were allowed six minutes to complete the test. Previous studies that used this instrument reported good metric properties (see, e.g., Pallier et al., 2002; Teovanović et al., 2015).

Vocabulary Test
Vocabulary Test (Knežević & Opačić, 2011) consists of 56 items of increasing difficulty. Subjects were required to define the words (e.g. "vignette", "credo", "isle") by choosing the answer from among six options. No time limit for the completion of this test was imposed. On average, the participants completed this test in 13.11 minutes (SD = 2.09).

Cognitive Reflection Test
The Cognitive Reflection Test (CRT) was devised as a measure of "the ability or disposition to resist reporting the response that first comes to mind" (Frederick, 2005, p. 36). It consists of only three questions, each of which triggers most participants to answer immediately and incorrectly. As previously noted, the CRT was used to capture individual differences in the propensity to engage in Type 2 processing.

Experimental Findings
Initial estimates were exactly correct in only 141 (2.5%) trials. When correctness was defined more loosely to encompass deviations of up to five units (e.g. for "How many African countries are there in the UN?", all values between 48 and 58 were counted as correct), the number of correct responses rose to only 440 (7.8%), indicating that the questions used in this study were relatively difficult, as intended.
Descriptive statistics for the initial and final estimates on each question, as well as the results of difference tests between them and the associated indices of anchoring effect sizes, are presented in Table 1. Final estimates for each quantity differed markedly from the initial ones (ps < .05), and these differences were highly significant (ps < .001) for the vast majority of questions. Equally important, all of them were in the predicted direction: final estimates were higher than initial estimates for positive anchors and lower for negative anchors. Overall, 55.4% of initial estimates were amended toward the anchors. Moreover, over-anchoring was observed far more frequently than under-anchoring, χ²(1) = 102.56, p < .001. This pattern of results confirms the experimental robustness of the anchoring effect.
While standard indices of effect size are directly conditioned by response variability across participants, AE measures relate initial to final estimates for each participant and are hence not dependent on differences between participants. As presented in the penultimate column of Table 1, final estimates were on average from 9% to 41% higher than initial estimates for positive anchors, and from 4.5% to 33.4% lower for negative anchors.
To enable their comparison and further aggregation, AE values for negative anchors were converted to absolute values. For each of the eight experimental conditions, anchoring effect measures were calculated as the average AE score on the three corresponding questions (see Figure 1). Anchoring effect measures were even more strongly influenced by relative anchor distance. There was a general increase of the anchoring effect over the first three anchor distance levels. An almost linear relationship was observed, with AE measures falling approximately halfway between the initial estimates and the anchors. However, the increase leveled off between the 60% and 80% distances, for both positive and negative anchors. Differences between mean AEs for the last two distances were not significant (ps > .10) (Note iii). This pattern of results indicates that further increases in anchor distance would probably not be followed by a significant increase in anchoring effect size.

Susceptibility to Anchoring Effect
Considerable variability of AE measures was observed across participants on each question (see the last column of Table 1). Moreover, participants who were more susceptible to the anchoring effect on one item were also more prone to amend their estimates toward anchors on other items. Internal consistency of individual differences was acceptable (α = .71). The average correlation between AE measures for the 24 items was relatively small (r = .11), but it was notably higher after aggregation into the eight experimental conditions (r = .27). The first principal component accounted for 36.7% of their total variance (λ = 2.94). All eight AE scores loaded highly on it (rs ranged from .46 to .73), and component scores were approximately normally distributed (KS Z = 0.54, p = .59).
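The reliability and principal-component analyses reported here follow standard formulas, which can be sketched as follows (illustrative code on a participants × items score matrix; the study's data are not reproduced):

```python
import numpy as np

def cronbach_alpha(scores: np.ndarray) -> float:
    """Cronbach's alpha for a participants x items matrix of AE scores."""
    n_items = scores.shape[1]
    item_vars = scores.var(axis=0, ddof=1).sum()      # sum of item variances
    total_var = scores.sum(axis=1).var(ddof=1)        # variance of sum scores
    return n_items / (n_items - 1) * (1 - item_vars / total_var)

def first_principal_component(scores: np.ndarray):
    """Eigenvalue (lambda) and loadings of the first PC of the item correlations."""
    corr = np.corrcoef(scores, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)           # ascending eigenvalues
    return eigvals[-1], eigvecs[:, -1]
```

With eight condition scores, a first eigenvalue of λ = 2.94 corresponds to the reported 2.94 / 8 = 36.7% of total variance.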

Predictors of Susceptibility to Anchoring Effect
Cognitive and personality measures were collected with the aim of exploring their capacity for predicting individual differences in the anchoring effect. Results presented in Table 2 suggest fair levels of reliability for all of the measures (αs ≥ .70), except for the CRT (α = .40). The latter is not surprising, considering that the CRT consists of only three items. Performance on the CRT was also very poor: as many as 199 (84.3%) participants failed to give a single correct answer. Hence, the highly reflective group in this study consisted of only 37 participants who scored at least one point on the CRT.
A measure of overall susceptibility to the anchoring effect was regressed on the set of potential predictors (Note iv), and the results are displayed in the last three columns of Table 2. Zero-order correlations were relatively small, and only the trait of openness correlated significantly with the anchoring effect (r = .24, p < .001). Openness was also the only measure that significantly contributed to the prediction model (β = 0.27, p < .001). In total, a relatively small portion of the anchoring effect's variance was accounted for by the predictors (F(8, 227) = 2.65, p = .008, R² = 5.3%). Cognitive measures showed no direct correlation with anchoring. However, it was hypothesized that the processes captured by AE measures may differ with respect to the degree of subjects' reflectivity, i.e. that cognitive reflection might moderate the relationship between cognitive abilities and the anchoring effect. Results confirmed this expectation. The standardized interaction term, entered into a multiple regression analysis along with standardized CRT and RM scores, was highly significant (β = -.24, p = .004). For the group of participants who performed poorly on the CRT (N_CRT− = 199), the correlation between RM scores and anchoring effect measures was not significant (r = .11, p = .13), while for the group of participants with at least one correct CRT answer (N_CRT+ = 37), RM performance significantly correlated with anchoring (r = -.51, p = .001) (Note v). Furthermore, results of separate multiple regression analyses for the two groups, presented in Table 3, indicate a stronger effect of the predictor set in the highly reflective group (R² = 32.0%) than in the low reflective group (R² = 5.3%), even after controlling for the difference in group size (Fisher's z = 2.21, p = .027).
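The moderation test described above amounts to an ordinary least-squares regression with a standardized product term. A minimal illustration (synthetic data; variable names are hypothetical and do not reproduce the study's dataset):

```python
import numpy as np

def standardize(x: np.ndarray) -> np.ndarray:
    return (x - x.mean()) / x.std(ddof=1)

def moderation_betas(ae: np.ndarray, rm: np.ndarray, crt: np.ndarray) -> np.ndarray:
    """Regress AE on standardized RM, CRT, and their product (plus intercept).

    Returns coefficients for [intercept, RM, CRT, RM x CRT]. A significant
    negative interaction coefficient would indicate that higher reflectivity
    strengthens the negative intelligence-anchoring relation.
    """
    z_rm, z_crt = standardize(rm), standardize(crt)
    X = np.column_stack([np.ones_like(ae), z_rm, z_crt, z_rm * z_crt])
    betas, *_ = np.linalg.lstsq(X, ae, rcond=None)
    return betas
```

In the study, the significance of the interaction term (β = -.24, p = .004) justified the follow-up simple-slopes comparison of the CRT− and CRT+ subgroups.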

Discussion
The anchoring effect is a well-documented phenomenon. Yet, studies that examine covariates of susceptibility to this cognitive bias are relatively recent and have produced mixed findings (Bergman et al., 2010; Caputo, 2014; Eroglu & Croxton, 2010; Furnham et al., 2012; Jasper & Christman, 2005; Jasper & Ortner, 2014; McElroy & Dowd, 2007; Oechssler et al., 2009; Stanovich & West, 2008; Welsh et al., 2014). The relative inconsistency of reported results could be partly due to the questionable psychometric properties of the instruments used, but also to the absence of a uniform procedure for reliable measurement of individual differences in susceptibility to the anchoring effect. With respect to the latter, at least two approaches can be distinguished. In the first, predictors of anchoring were examined using the interaction test in a 2 × 2 ANOVA, with anchor condition (high/low) and the dichotomized psychometric construct of interest as between-subject factors. In this way, McElroy and Dowd (2007) showed that participants high on openness to experience provided higher estimates for high anchors and lower estimates for low anchors than participants low on openness. A similar procedure was used in several subsequent studies (Furnham et al., 2012; Oechssler et al., 2009; Stanovich & West, 2008).
Drawbacks of this approach lie in the inevitable arbitrariness of the dichotomization criteria and the practical inability to simultaneously examine the effects of several predictors. As an alternative, Welsh et al. (2014) proposed a multi-item measure of anchoring, expressed as the correlation between experimentally provided anchors and a participant's numerical estimates across a number of trials on a specially constructed card game. In this way, several predictors could be examined simultaneously. However, the authors did not examine the reliability of these measures, and the question of the generalizability of findings beyond the gambling setting can also be raised.
The present study was designed with the intention of overcoming some of these shortcomings. The introduction of a pretest session, in which participants stated their anchor-free estimates before the standard paradigm was applied, ensured that anchor direction and relative extremity were the same for all participants. This also allowed reliable measurement of individual differences in susceptibility to anchoring.

The average size of the anchoring effect in the present study was medium. Participants amended an initial estimate of an uncertain quantity toward the anchor in more than half of the cases, on average by slightly more than one-fifth of its value. In that sense, the experimental reliability of the anchoring effect was confirmed yet again, this time using a within-subject design. It seems plausible to suppose that the anchoring effect would be even stronger if participants had not previously been asked to express their anchor-free estimates, i.e. that the reported results can be seen as conservative indications of the anchoring effect size.
Anchors showed an asymmetrical effect. Positively directed anchors produced larger effects than equally distant negative anchors, which is in line with previous findings (Hardt & Pohl, 2003; Jacowitz & Kahneman, 1995; Jasper & Christman, 2005). More distant anchors led to a larger anchoring effect, but only up to a point. For the most extreme anchors, no significant difference in effect was observed in comparison to the nearest anchors. This indicates that the maximum of the anchoring effect was registered and that more distant anchors would yield the same (Mussweiler & Strack, 2001), if not a weaker (Wegener et al., 2001), effect.
Considering the correlational aspects, results indicate fair internal consistency of the anchoring effect measures, which rules out low reliability as an alternative explanation for the negligible correlations. The anchoring effect was directly associated only with the trait of openness. As suggested by McElroy and Dowd (2007), this can be viewed as a consequence of an enhanced sensitivity to external information, a characteristic common to both phenomena.
In other words, participants prone to take into account alternative points of view in general also show increased readiness to amend their answers toward externally suggested solutions on estimation tasks. Other personality traits were not correlated with anchoring, although some previous studies suggested that possibility (e.g. Caputo, 2014;Eroglu & Croxton, 2010).
Findings on the relations between cognitive variables and AE measures are of particular theoretical importance, since they could shed some light on the discussion about the psychological mechanisms that underlie anchoring effects. Kahneman (2011) placed the previously established distinction between two competing theoretical accounts of the anchoring effect into the context of dual-process theories of human reasoning. Adjustment was explicitly described as a deliberate Type 2 process that is typically carried out in multiple steps. After estimating whether the correct value is higher or lower than the presented anchor, people adjust from the anchor by generating an initial value. Afterwards, they evaluate whether this value is a reasonable answer (in which case they terminate adjustment and provide an estimate) or whether it requires additional modification (in which case they readjust the estimate further away from the anchor value). The anchoring effect is hypothesized to arise partially because the capability to perform these processes of evaluation and adjustment is limited by the capacities of working memory. Since individual differences in these capacities can be approximated by intelligence tests (e.g. Evans & Stanovich, 2013; Stanovich & West, 2008), one would expect to observe a negative correlation between measures of intelligence and anchoring. In that sense, the finding that AE measures were not directly related to cognitive measures, which is in line with previously reported results (Bergman et al., 2010; Furnham et al., 2012; Jasper & Ortner, 2014; Oechssler et al., 2009; Stanovich & West, 2008; Welsh et al., 2014), could imply that adjustment, as an effortful cognitive process, is not involved in the emergence of the anchoring effect at all. Nevertheless, the relation between intelligence and anchoring depended on cognitive reflection. In other words, the readiness to engage in more demanding Type 2 processes moderated the relation between cognitive capacities and susceptibility to the anchoring effect.
For the vast majority of participants, who were prone to economize mentally by relying on intuitive Type 1 processes, differences in intelligence indeed were not related to anchoring. It can be hypothesized that the final estimates of these subjects did not result from serial adjustments but were outcomes of Type 1 processing, i.e. the relatively automatic activation of information consistent with the presented anchor. Observed variability among these subjects might be due to variations in dispositional sensitivity to external information, or to individual differences in distortions of the scale on which answers are provided (e.g. Frederick & Mochon, 2012), but is unlikely to be due to capacities to repeatedly reconsider a readjusted anchor as a possible answer to the given question. However, for more reflective subjects, a negative correlation between intelligence and anchoring was observed: participants with higher capacities to carry out Type 2 processes adjusted further away from anchors. In other words, differences among more reflective subjects, who were prone to engage in processes of reevaluation and readjustment before stating their final estimates, were at least to some extent due to their capacities to perform these processes. This indicates that (insufficient) adjustment, at least for some subjects, plays a role in the emergence of the anchoring effect.
This finding, though, should be taken with caution, considering the low reliability of the three-item CRT and the relatively small size of the highly reflective subsample. Future replications are needed (Note vii). More direct replications would perform a similarly designed study using psychometrically improved instruments, such as the seven-item CRT (Toplak, West, & Stanovich, 2014), while others could examine whether comparable results would be obtained if cognitive reflection were measured in other ways, for example with typical-performance measures such as need for cognition or actively open-minded thinking (Note viii), as suggested by Stanovich (2009). Finally, studies that combine the experimental and differential perspectives could also serve the same purpose. For example, a study could explore whether subjects with larger cognitive capacities benefit more from interventions previously shown to reduce the anchoring effect, such as the consider-the-opposite strategy (Adame, 2016; Mussweiler, Strack, & Pfeiffer, 2000), forewarnings (Epley & Gilovich, 2005, Study 2), or accuracy motivation (Epley & Gilovich, 2005, Study 1; Simmons et al., 2010).
The present study explored some of the potential benefits of a novel within-subject procedure that allows simultaneous manipulation of the experimental factors of anchor distance and direction and reliable measurement of individual differences in susceptibility to the anchoring effect. Although these first results are promising, future studies are needed, especially theoretically driven ones aimed at examining the cognitive processes involved in the anchoring effect.

Notes
i) AE measures could not be calculated when the denominator, i.e. the initial estimate, had a zero value, which was the case in 92 trials (1.63%). The distribution of missing AE data by question is displayed in the third row of Table A1 in the Appendix.
ii) Although a markedly smaller percentage of measures was amended when more classical trimming procedures were applied (0.9% following Tabachnick & Fidell, 2007; 2.1% following the procedure proposed by Tukey, 1977), the pattern of results presented in this paper remained the same.
iii) Accordingly, results of polynomial contrast tests showed that linear and quadratic components were significant for both positive ( Table A2 in the Appendix.
v) Two reviewers expressed concern regarding the poor performance on the CRT and, consequently, the small size of the highly reflective subsample. It should be noted that poor CRT performance was not a consequence of deficient instructions, nor an indication that the task was not understood. Participants did understand the task, considering the high rate of predictably irrational / intuitive responses (for example, to the question "If it takes 5 machines 5 minutes to make 5 widgets, how long would it take 100 machines to make 100 widgets?", as many as 78.1% of participants provided the wrong intuitive answer "100 minutes"; a similar pattern was observed for the other two questions). Poor performance was expected, since it was among the first reported findings on the CRT (see, e.g., Table 1 on page 29 in Frederick, 2005). Considering this, it seemed reasonable to set the cut-off score at one out of three. Moreover, when the analysis was performed on the very small subsample of subjects who had at least two correct answers (n = 9), a similar pattern of correlations was observed (for RM, r = -.36; for Vocabulary, r = -.57).
vi) A multilevel analysis was run on the data gathered in this study, with the aim of examining whether the different parameters of individual anchor distance functions (intercept, linear term, and quadratic term, as indicators of general susceptibility to anchoring, sensitivity to changes in anchor distance, and resistance to anchor extremity, respectively) are related to personality traits and cognitive variables. While results regarding general susceptibility to the anchoring effect were similar to those reported here, the low reliability of individual differences in the two other parameters prevented any firm conclusion about their covariates.
vii) Considering the limited size of the highly reflective group (N_CRT+ = 37), the estimated correlation between Raven's Matrices and anchoring (r = -.51) was not highly precise (95% CI [.22, .72]). In the light of previous findings that indicate low to medium correlations between anchoring and cognitive measures, it is far more plausible to suppose that the effect size in the population would be closer to the lower boundary of the 95% confidence interval. Correlational studies seeking to confirm that a relationship between intelligence and anchoring exists in a highly reflective group (e.g. ρ_CRT+ = .30) but not in a low reflective group (ρ_CRT− = .00) would have a power of only 50.1% if the subsamples remained the same in size. However, under the same conditions, if the effect size in the population were somewhat higher (e.g. ρ_CRT+ = .40), the power would be 73.4%. In any case, future studies are recommended to use larger samples, particularly a larger subsample of highly reflective participants. I would like to thank an anonymous reviewer for the encouragement to elaborate on this point.
viii) Previous studies on thinking styles as potential predictors of the anchoring effect have yielded inconsistent results (see, e.g., Cheek & Norem, 2017; Epley & Gilovich, 2006; Jasper & Christman, 2005).