Overriding First Impressions: Evidence for a Reference-Dependent and Attentionally-Weighted Multi-Stage Process of Value-Based Decision-Making

While previous work has shown that value and attention jointly modulate value-based decisions (Krajbich, Armel, & Rangel, 2010), it is still debated whether attention amplifies value effects (Smith & Krajbich, 2019) or provides a boost to the attended item independent of its value (Cavanagh, Wiecki, Kochar, & Frank, 2014). Here, we independently vary value and visual attention by alternating options on the screen while manipulating presentation duration. Across two studies, we show that the value of the first attended item biases choices in a time-varying manner. We further find that effects of relative presentation duration are value-dependent and specific to the subsequently presented item, which has a stronger impact on choice as relative attention to it increases, overwriting the first-item bias. We show that these effects can be captured by a modified attentionally-weighted multi-stage drift diffusion model (aDDM; Krajbich et al., 2010) that processes the first item in a reference-dependent manner (relative to the average expected value of previous choice sets). Our results demonstrate that decisions are disproportionately shaped by the reference-dependent value of the first seen item, and that when tested independently, attention amplifies value rather than boosting attended options.


Introduction
Previous research has shown that visual attention affects value-based decision-making at the behavioral (Cavanagh et al., 2014; Krajbich et al., 2010; Smith & Krajbich, 2019) and neural level (Lim, O'Doherty, & Rangel, 2011). However, whether these effects are additive or multiplicative with value is a matter of debate (Cavanagh et al., 2014; Krajbich et al., 2010). An obstacle to resolving this controversy is that in typical paradigms of value-based decision-making, visual attention and value are not independent. For example, participants typically fixate more on the item that is ultimately chosen. This effect can be explained either by an effect of fixation duration on choice likelihood, or conversely by an effect of value on attention, as people tend to seek more information in favor of their current choice tendency rather than against it (Hunt, Rutledge, Malalasekera, Kennerley, & Dolan, 2016). Standard value-based choice paradigms therefore have multiple limitations. First, they cannot dissociate effects of attention on value from effects of value on attention. Second, they cannot examine effects of fixation order, an aspect that is to date underexplored but could reveal novel insights into the dynamics of value-based decisions.
Here we test the hypothesis that the first item people see disproportionately shapes the decision-making process, because at the time of fixation on the first item less (or no) information about competing items is available to allow for value comparison. We hypothesize that at this stage, the value of the first item is processed in a reference-dependent manner, with approach induced for values above the expected value and avoidance induced for values below it. When the participant subsequently fixates the alternative option, the decision-making process is already biased by the value of the first fixated item, and this bias needs to be overcome by increased sampling of the alternative item. This hypothesis predicts that on average we should observe a bias towards the first fixated item that decreases over time and with increased time spent evaluating the alternative item.
To test this hypothesis, we used a paradigm that externally manipulates visual attention to the items (Armel, Beaumel, & Rangel, 2008), controlling for the value-attention confounds outlined above, while otherwise imposing minimal constraints on the decision-making process. In two studies, we tested the effect of presentation order on choice, and how it varies with overall presentation duration (Study 1) and with relative presentation duration (Study 2). In line with our hypothesis, we show that participants are on average biased towards the first item and that this bias decreases for slower decisions. Relative attention modulates the impact of the second item's value, with decreased choice probability for below-average values and increased choice probability for above-average values. We show that these findings can be accounted for by a multi-stage drift diffusion model that is both attentionally-weighted and reference-dependent.
This work is licensed under the Creative Commons Attribution 3.0 Unported License. To view a copy of this license, visit http://creativecommons.org/licenses/by/3.0

Method
Participants (Study 1: N = 28, Study 2: N = 30) made hypothetical choices between pairs of previously rated consumer items. Items were presented sequentially with the corresponding response hand color-coded. Participants were free to choose either item while the items alternated on the screen. After making their choice, participants were presented with the item they chose for one second and allowed to reverse their choice within 750 ms of feedback presentation (7.8%). Our primary dependent variable was the likelihood of choosing the first-seen item on initial choices. Choices were constructed to vary in the overall value (OV) and relative value (RV) of the two items in each choice (Shenhav, Dean Wolf, & Karmarkar, 2018).
In Study 1, we also varied the frequency at which items alternated to test the effects of overall presentation duration on choice. In the high-frequency blocks, the presented item changed every 200 ms, and in the low-frequency blocks, it changed every 800 ms. The order of frequency blocks was counterbalanced across participants. In Study 2, we varied the presentation duration of the items relative to one another to test the effects of relative presentation duration on choice. For each trial, one of the two items was designated as the long-duration item and the other as the short-duration item. These durations varied across presentations of an item, with each long and short duration randomly sampled from distributions with M = 500 ms and SD = 100 ms, and with M = 200 ms and SD = 50 ms, respectively.
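The two presentation schedules described above can be illustrated with a minimal sketch. It is not the original experiment code: the function names, the use of normal distributions for the Study 2 durations (the text reports only M and SD), and the 1 ms floor on sampled durations are assumptions made for illustration.

```python
import random

def study1_schedule(high_frequency, n_presentations):
    """Fixed alternation: 200 ms (high frequency) or 800 ms (low frequency)."""
    duration_ms = 200.0 if high_frequency else 800.0
    return [("first" if k % 2 == 0 else "second", duration_ms)
            for k in range(n_presentations)]

def study2_schedule(first_is_long, n_presentations, rng):
    """Alternating items; one item is the long-duration item for the whole trial."""
    schedule = []
    for k in range(n_presentations):
        item = "first" if k % 2 == 0 else "second"
        is_long = (item == "first") == first_is_long
        # Means and SDs are from the text; the normal shape is an assumption.
        mean, sd = (500.0, 100.0) if is_long else (200.0, 50.0)
        # Assumption: floor at 1 ms so a sampled duration is never non-positive.
        duration_ms = max(rng.gauss(mean, sd), 1.0)
        schedule.append((item, duration_ms))
    return schedule

if __name__ == "__main__":
    rng = random.Random(1)
    for item, duration_ms in study2_schedule(True, 6, rng):
        print(f"{item:>6}: {duration_ms:6.1f} ms")
```

Because each presentation draws a fresh duration, the long-duration item is on screen longer on average, but not necessarily on every single presentation.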
Prior to rating and choice, participants were familiarized with all items, each presented once in isolation. Upon a second presentation of each item, they indicated which items they could not recognize without a label; these items were excluded from choice sets. Across participants, the final number of choices that could be constructed after exclusion of these items and based on rating distributions varied between 100 and 240 (Median = 228, M = 223, SD = 25).

Study 1
As expected, participants were more likely to choose the first item the higher its value was relative to the second item, b = .26, p < .001. In line with our hypothesis, we also found an independent effect of initial item presentation such that, overall, participants were more likely to choose the first seen item (Fig. 1A), b = .13, p < .001. As predicted, this first-item bias decreased as participants took longer to make their choice, b = -.15, p < .001 (Fig. 1B). Our account would further predict that the longer the first item is seen before the second item is presented, the more its value should bias the decision. In support of our account, overall presentation duration (OPD) affected choice, such that participants showed a reduced first-item bias during fast alternations compared to slow alternations, b = -.12, p = .031 (Fig. 1C).

Study 2
As in Study 1, participants in Study 2 were biased towards choosing the first presented item, b = .11, p < .001, and this bias was reduced in slower choices, b = -.10, p < .001. Importantly, Study 2 enabled us to further examine the influence of relative item presentation duration (RPD) on choice. If attention merely boosted the attended item (Cavanagh et al., 2014), we should observe a main effect of RPD. Conversely, a multiplicative account of value and attention (Krajbich et al., 2010; Smith & Krajbich, 2019) predicts an interaction of value and RPD. There was no main effect of RPD on choice, b = .09, p = .576. Instead, RPD significantly interacted with the overall (average) value of the items on choice, b = .166, p < .001 (Fig. 2A).
When overall values were low, the less presented item was more likely to be chosen. As overall value increased, this effect reversed and the more presented item was more likely to be chosen. Our findings are therefore consistent with an aDDM (Smith & Krajbich, 2019), but they also reveal a previously unreported effect: low-value items appear to have been perceived as aversive, leading participants to accumulate evidence against them while they were attended. This finding is consistent with the possibility that participants process the first item with reference to the average of the entire item set, in line with recent observations that values are reference-dependent (Khaw, Glimcher, & Louie, 2017; Shenhav et al., 2018). The effect of the items' relative value was not modulated by relative presentation duration (p = .378), nor was there a main effect of overall value on choice (p = .841). To test whether the observed interaction was symmetrically caused by both items or specific to one of them, we analyzed the effects of both item values and their interaction with relative presentation duration while controlling for RTs. We expected that RPD should have a stronger effect on the processing of the second item's value, given the first-item bias. Indeed, the slope of the second item value on choice was significantly modulated by RPD, b = .16, p = .009. The longer the second item was presented relative to the first one, the stronger was the effect of its value on choice. The effect of the first presented item's value did not vary as a function of RPD, b < .01, p = .882 (Fig. 2B).

The first item value bias can be captured by a reference-dependent, attentionally-weighted MSDDM
To understand mechanistically how the first item bias and the differential presentation duration effects arise, we used a variant of the Multi-Stage DDM (Srivastava, Feng, Cohen, Leonard, & Shenhav, 2017). This model builds on a typical (single-stage) DDM (Ratcliff & McKoon, 2008), which assumes a process of noisy evidence accumulation to one of two symmetric decision thresholds (-a or a) that terminates when either threshold is reached. The boundaries were defined as the corresponding responses with the lower and upper boundary defined as left and right hand responses, respectively. The rate of evidence accumulation (drift rate), denoted by v, is driven by the value of the items, the current fixation and a weighting factor Q that controls the contribution of the unfixated item to the diffusion process. For each time-point t, v is defined as:

$$v(t) = \left(\mathrm{Value}_{\mathrm{onscreen}}(t) + Q \cdot \mathrm{Value}_{\mathrm{offscreen}}(t)\right) \cdot \text{scaling factor}$$

Item values were signed according to the corresponding response, with negative sign for left hand responses and positive sign for right hand responses.
Note that at the first presentation of the first item, the value of the second item is unknown and therefore cannot be factored into the diffusion process, so its value is set to zero. However, to test whether the value of the first item is entirely processed as positive evidence (low value being weak and high value being strong evidence in favor of the option), or whether its value is reference-dependent (with below-reference values constituting evidence against it), we compared two models. Model 1 assumes that on the first presentation drift rate scales with the first item's value. Model 2 assumes that drift depends on the item's value relative to an implicit reference (Khaw et al., 2017; Shenhav et al., 2018), which would be the expected value given all the items. We therefore set the drift during the first presentation to the value of the first item relative to the center of the rating scale (5.5). This center value approximates what would be normatively used as the expected value of the second item based on experience over the course of the session (Khaw et al., 2017). For the present simulations, we set Q to 0.8, the scaling factor to 0.1 and a to 1.6.
We found that both models showed the predicted interaction of OV with RPD (Fig. 3A, B), but Model 1 more dominantly showed a main effect of RPD, such that choice probabilities did not reverse for longer RPD as value decreased. In contrast, Model 2 captured the observed multiplicative effects of OV and presentation duration. Similarly, when testing for the asymmetric effects of presentation duration on the items, we found that while the second item value was more strongly modulated by presentation duration in both models, the dominant effect in Model 1 was RPD, and choice probabilities for the second item did not show the value-dependent reversal. In contrast, in Model 2, we found the observed reversal of choice probabilities (Fig. 3C, D). Thus, overall, Model 2 with reference-dependent coding of first-item value offered a better description of the data.
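To make the difference between the two models concrete, the following is a minimal simulation sketch, not the authors' implementation. Q, the scaling factor, a, and the reference of 5.5 are taken from the text. Everything else is an assumption made for illustration: drift is signed so that positive evidence favors the first item (the text signs values by response hand), "relative to the center" is implemented as a subtraction, time is in seconds, the noise standard deviation is 1, and the step size, time limit, and function names are hypothetical.

```python
import math
import random

A = 1.6          # decision threshold; boundaries at +A and -A (from the text)
Q = 0.8          # weight on the off-screen item (from the text)
SCALE = 0.1      # drift scaling factor (from the text)
REFERENCE = 5.5  # center of the rating scale (from the text)

def drift(v_first, v_second, on_screen, second_seen, reference_dependent):
    """Drift rate; positive values favor the first item (assumed convention)."""
    if not second_seen:
        # First presentation: the second item is unknown, so its value is zero.
        # Model 2 codes the first item relative to the reference (assumed: minus).
        base = v_first - REFERENCE if reference_dependent else v_first
        return base * SCALE
    if on_screen == "first":
        return (v_first - Q * v_second) * SCALE
    return (-v_second + Q * v_first) * SCALE

def simulate_trial(v_first, v_second, dur_first, dur_second,
                   reference_dependent, rng,
                   dt=0.001, sigma=1.0, max_time=10.0):
    """Simulate one trial; return (chose_first, rt), chose_first None on timeout.

    dur_first and dur_second are presentation durations in seconds (assumed
    fixed within a trial here for simplicity).
    """
    x, t = 0.0, 0.0
    on_screen, second_seen = "first", False
    time_left = dur_first
    while t < max_time:
        v = drift(v_first, v_second, on_screen, second_seen, reference_dependent)
        # Euler-Maruyama step of the diffusion process.
        x += v * dt + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        t += dt
        if x >= A:
            return True, t
        if x <= -A:
            return False, t
        time_left -= dt
        if time_left <= 0:
            # Alternate the item on screen; the second item is now known.
            if on_screen == "first":
                on_screen, second_seen, time_left = "second", True, dur_second
            else:
                on_screen, time_left = "first", dur_first
    return None, t

if __name__ == "__main__":
    rng = random.Random(0)
    for label, ref_dep in (("Model 1", False), ("Model 2", True)):
        trials = [simulate_trial(3.0, 3.0, 0.2, 0.5, ref_dep, rng)[0]
                  for _ in range(200)]
        decided = [c for c in trials if c is not None]
        print(label, "P(first) =", sum(decided) / max(len(decided), 1))
```

In this sketch, a low-valued first item (for example a rating of 3) yields positive first-stage drift under Model 1 but negative first-stage drift under Model 2. Averaging `simulate_trial` over many trials, item values, and duration ratios gives choice proportions that can be compared qualitatively between the two models.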

Conclusion
We demonstrated that, in the absence of control over their visual attention, individuals are biased by the value of the first seen item. This bias decreases as choice times increase and more evidence regarding the second item can be accumulated. We further show that, in line with an aDDM account (Krajbich et al., 2010; Smith & Krajbich, 2019), attention amplifies value rather than increasing choice probability of the attended item per se (Cavanagh et al., 2014), and that this effect is specific to the second item, overwriting the initial first-item bias. Finally, we provide a mechanistic account of the observed effects using a version of the MSDDM that takes attention and the available information over time into account. Importantly, only a model with reference-dependent value coding could capture the observed effects. This work bridges research into reference-dependent valuation across trials (Khaw et al., 2017) and attentionally-weighted choice dynamics within a trial (Krajbich et al., 2010), and provides a starting point for developing a better understanding of the dynamics underlying value-based decisions.