Learned value and object perception: Accelerated perception or biased decisions?

Rajsic, Jason; Perera, Harendri; Pratt, Jay

doi:10.3758/s13414-016-1242-0

Published: 28 November 2016

Volume 79, pages 603–613, (2017)

Attention, Perception, & Psychophysics

Abstract

Learned value is known to bias visual search toward valued stimuli. However, some uncertainty exists regarding the stage of visual processing that is modulated by learned value. Here, we directly tested the effect of learned value on preattentive processing using temporal order judgments. Across four experiments, we imbued some stimuli with high value and some with low value, using a nonmonetary reward task. In Experiment 1, we replicated the value-driven distraction effect, validating our nonmonetary reward task. Experiment 2 showed that high-value stimuli, but not low-value stimuli, exhibit a prior-entry effect. Experiment 3, which reversed the temporal order judgment task (i.e., reporting which stimulus came second), showed no prior-entry effect, indicating that although a response bias may be present for high-value stimuli, they are still reported as appearing earlier. However, Experiment 4, using a simultaneity judgment task, showed no shift in temporal perception. Overall, our results support the conclusion that learned value biases perceptual decisions about valued stimuli without speeding preattentive stimulus processing.

At any given moment, we can only attend to a small subset of the total information in the visual environment. During each moment, a number of cognitive processes collectively determine what information will be attended and what information will fall out of further processing. For the most part, different states of attention have been considered to be due to either bottom-up processes (driven by causes external to the individual) or top-down processes (driven by the goals of the observer). However, recent research has highlighted the contribution of sources of selection that are internal to the observer yet not determined by his or her current goals (Awh, Belopolsky, & Theeuwes, 2012). The learned value of stimuli is one such source of attentional bias (e.g., Anderson, Laurent, & Yantis, 2011a). These value-driven attention biases can occur even when the value-laden features of stimuli are task-irrelevant (e.g., Anderson et al., 2011a; Raymond & O’Brien, 2009). Although such results have reliably been observed in laboratory experiments, the particular stage, or stages, of perceptual processing affected by learned value is not yet understood. In this article, we assess the ability of learned value to affect visual priority in a task that does not require selective processing. First, however, we review what is known about the ways that learned value biases perceptual processing.

To study the effect of learned value on visual selection, studies have employed a two-phase structure, wherein different stimuli are repeatedly paired with different amounts of reward in a learning phase, and then attentional biases to these stimuli are compared in a test phase in which the reward contingency is removed (Anderson, 2014; Anderson, Laurent, & Yantis, 2011a, 2011b; MacLean & Giesbrecht, 2015; Miranda & Palmer, 2014; Raymond & O’Brien, 2009; Sali, Anderson, & Yantis, 2014). For example, Anderson, Laurent, and Yantis (2011b) trained participants to search for oriented bars within green or red circles among other colored distractor circles. For each participant, one target color had a high probability of producing a high reward, and the other target color had a high probability of producing a low reward. After participants had practiced this task, the reward contingencies were removed, and instead the participants searched for an oriented bar within a unique, diamond shape among distractor circles (similar to the added-singleton paradigm pioneered by Theeuwes, 1992). Critically, one of these circles on each trial would be colored in either red or green, and both of these singleton distractors led to slowed search times. Importantly, singletons in the color that had received high reward in the learning phase produced greater interference, indicating that the learned value of stimuli produces an attentional bias over and above that of perceptual salience.

As was recently noted by Müller, Rothermund, and Wentura (2016), the majority of studies on reward and attention have relied on search tasks to assess the prioritization of rewarded stimuli, and it is therefore unclear which stages of visual processing are affected by reward. These authors argued that reward effects in search may be due to delayed disengagement, as opposed to a preattentive boost for visual features with learned value. To support this argument, they reported data from a modified dot-probe task. After imbuing visual objects with value in a speeded-discrimination task, previously rewarded objects’ ability to orient attention when acting as exogenous cues was compared to that of neutral objects, as well as to that of objects associated with losses. Whereas rewarded objects led to a larger cue validity effect, comparison with neutral cues showed that the rewarded objects led to slower disengagement (i.e., a larger difference between neutral and invalidly cued response times [RTs]), but not to speeded orienting (i.e., no difference between neutral and validly cued RTs). Müller et al. argued that delayed disengagement from rewarded stimuli could explain the attentional biases measured in search tasks, which are assessed by a slowed RT when an object with learned value appears as a distractor.

Using a different paradigm, Hickey, Chelazzi, and Theeuwes (2011) argued instead that reward is able to affect the early stages of target detection and localization, and that this target enhancement mechanism is distinct from a distractor suppression mechanism that operates on a later stage of selection. Although this finding is based on the results of tasks in which the effect of rewards on intertrial priming, and not learned value, has been measured, their conclusion is consistent with a recent electrophysiological and behavioral study showing that reward history influences the early stages of visual attention selection, by altering the P1 amplitude (MacLean & Giesbrecht, 2015; see Hickey, Chelazzi, & Theeuwes, 2010, for a similar result using immediate reward), and also affects attentional capture, as indicated by the N2PC component (Qi, Zeng, Ding, & Li, 2013). Given that these studies involved associating learned values with stimuli, these results are inconsistent with Müller et al.’s (2016) conclusion that rewards solely produce delayed disengagement. Similarly, the suggestion that learned value solely delays disengagement is inconsistent with measures of oculomotor capture (Anderson & Yantis, 2012; Hickey & van Zoest, 2012; Theeuwes & Belopolsky, 2012). Instead, such results point to an effect of learned value that is preattentive, in the sense that it does not require first focusing attention on a particular object to be measured. Behavioral evidence of a preattentive locus of reward has also come from Kiss, Driver, and Eimer (2009), who showed that pop-out was enhanced for targets that often deliver higher rewards (see also Lee & Shomstein, 2014); however, Kristjánsson, Sigurjónsdóttir, and Driver (2010) subsequently showed that this pop-out advantage rapidly reverses following a change in the stimulus–reward contingencies, leaving uncertainty regarding whether learned value, as opposed to expected reward, operates at an early stage. 
What has been missing is a direct, behavioral demonstration that stimuli with imbued learned value are prioritized for perception.

Our goal in this study was to directly test the claim that learned value can enhance the preattentive processing of visual information. To do so, we employed judgments of stimulus onset (temporal order judgments [TOJs] and simultaneity judgments [SJs]), which are used to measure visual prior entry. Prior entry refers to the accelerated conscious perception of some stimuli at the expense of others, leading to earlier conscious perception of these stimuli (Scharlau, 2007; Spence & Parise, 2010). Prior entry is found to occur when attention is exogenously oriented to the location of an upcoming stimulus (Born, Kerzel, & Pratt, 2015; Hikosaka, Miyauchi, & Shimojo, 1993; Schneider & Bavelier, 2003; Shore, Spence, & Klein, 2001; Stelmach & Herdman, 1991). Although event-related potentials (ERPs) measured alongside TOJs do not always demonstrate accelerated processing (i.e., reduced peak latencies of the early components of the visual evoked potential), increases in the amplitudes of early components (e.g., the P1, N1, and P2) are reliably observed, indicating that behavioral prior-entry effects correspond to changes in early visual processing (McDonald, Teder-Sälejärvi, Di Russo, & Hillyard, 2005; Vibell, Klinge, Zampini, Spence, & Nobre, 2007). Importantly, these tasks can be used as “cueless” tasks that measure the attentional biases that are intrinsic to stimuli, such as the speeded processing found for low-spatial-frequency patches (West, Anderson, Bedwell, & Pratt, 2010), emotional faces (West et al., 2010; West, Anderson, & Pratt, 2009), and near surfaces (West, Pratt, & Peterson, 2013). Furthermore, the tasks do not require selective processing—in fact, both stimuli must be registered to make a response—and so provide an index of visual priority when all of the information is equally relevant. 
Thus, TOJs and SJs provide a window into the perceptual biases that may exist for stimuli with learned values before focal attention is engaged, since it is difficult to envision a mechanism by which delayed disengagement alone could affect the relative perceived onsets of stimuli.

In the present study, we used a learned-value paradigm modeled after Anderson, Laurent, and Yantis’s (2011b) study, with one major exception: Instead of monetary value, we assigned value using a point system. For Experiment 1, our goal was to replicate Anderson et al.’s (2011b) results, especially given that our point rewards did not map onto any monetary value. To do this, we followed their modified value-learning task with an additional singleton visual search task to establish that the value training was successful. We showed that when the additional singleton feature was associated with learned value, it slowed down visual search proportional to the size of its associated value. In Experiment 2, participants completed the same value-learning task as in Experiment 1, but were then tested using a novel TOJ paradigm to assess whether the learned value would modify visual priority. Experiments 3 and 4 measured the perception of temporal onset for rewarded stimuli using a reversed TOJ and an SJ task, to distinguish between three accounts of changes in perceptual judgments: true prior entry, response biases, and decision biases. To preview our results, we observed that although learned value biases temporal onset responses, such that highly valued stimuli are reported to be perceived earlier, they do not bias perception when simultaneity, and not order, is measured. This supports the proposal that learned value has effects on visual processing beyond delayed disengagement—specifically, in biasing perceptual decisions.

Experiment 1

As we noted above, the main purpose of this experiment was to verify that rewarding participants with points rather than money would result in typical value-learning effects.

Method

Participants

Twenty-two undergraduate psychology students naïve to the experiment were recruited from the University of Toronto. Each participant reported normal or corrected-to-normal visual acuity and color vision. Participants gave written informed consent for the experiment and were provided with course credit for participating. All experimental procedures were approved by the University of Toronto’s Office of Research Ethics and were in accordance with the Declaration of Helsinki.

Apparatus

The experiment was conducted using a Windows-run PC with a 19-in. CRT screen (1,024 × 768 resolution with 85-Hz refresh rate) in a quiet and dimly lit room. Participants sat and viewed the monitor from a distance of 50 cm, with their chin rested on a chinrest throughout the experiment. The experiment was run in MATLAB (The MathWorks, Natick, MA) using the Psychophysics Toolbox. Participants entered responses by using a standard keyboard.

Stimuli and procedure

Participants were tested in a dimly lit room for a single 1-h session. Prior to the experiment, participants were presented with the instructions in a PowerPoint presentation that included images of the visual stimuli used in the experiment alongside the written instructions. Participants were told to place their chin on the chinrest and to make fast and accurate responses on each trial of the experiment.

Each phase of the experiment began with an instruction screen that reiterated the instructions that had been orally provided to the participants. The stimuli for both phases were presented against a uniform gray background with a white fixation cross, 0.4° in size, centered on the screen.

Training phase

The training phase of Experiment 1 was used to imbue the stimuli with learned value by repeatedly pairing them with different rewards. See Fig. 1 for examples of the stimuli and the trial sequence. The trials in the training phase were made up of displays composed of four Landolt Cs, 1.5° in radius, drawn in four different colors, appearing at random positions, all centered 6.4° from fixation. Of these Landolt Cs, three with their gaps (0.36° in size) at top or bottom were the distractor stimuli, and one with its gap on the right or left was the target stimulus. The possible colors of each distractor stimulus were yellow (RGB: 192, 192, 0), blue (RGB: 0, 192, 192), and orange (RGB: 255, 128, 0); depending on the trial, the target stimulus could be either red (RGB: 255, 0, 0) or green (RGB: 0, 255, 0). The search display was presented until participants had made their response. Participants had to identify whether the gap on the colored circle was on the left or the right by pressing the left or the right arrow key, respectively. A feedback display followed the response to inform the participant of how many points he or she had earned for the completed trial; this feedback was presented in the center of the screen in white Arial font that varied in size depending on the reward magnitude. High rewards (200 points) were shown in large text (48 point, approximately 1.8° in height), whereas low rewards (20 points) were shown in smaller text (16 point, approximately 0.6° in height). The total number of points was presented for 1 s and was added to a running tally that was continuously visible at the top of the screen.

Correct responses were followed by visual feedback indicating the total number of points earned during the training phase. High-reward targets were followed by 200 points (high-reward) feedback on 80% of the trials, and low-reward feedback on 20% of the trials. Low-reward targets were followed by 20 points (low-reward) feedback on 80% of the trials, and high-reward feedback on 20% of the trials. The high-reward and low-reward target colors were randomly assigned as red or green for each participant.
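The probabilistic feedback schedule described above can be illustrated with a minimal Python sketch. This is not the code used in the experiment, which was run in MATLAB with the Psychophysics Toolbox; the names `draw_points`, `expected_points`, and `P_EXPECTED` are hypothetical. The sketch covers correct responses only, because the points given after incorrect training responses are not specified here.

```python
import random

HIGH_POINTS = 200   # high-reward feedback, as stated in the text
LOW_POINTS = 20     # low-reward feedback, as stated in the text
P_EXPECTED = 0.8    # probability that feedback matches the target's reward category


def draw_points(target_is_high_reward, rng):
    """Sample the points shown after a correct response to one target.

    `rng` is a random.Random instance, so a session can be reproduced.
    """
    matches = rng.random() < P_EXPECTED
    if target_is_high_reward:
        return HIGH_POINTS if matches else LOW_POINTS
    return LOW_POINTS if matches else HIGH_POINTS


def expected_points(target_is_high_reward):
    """Long-run average points per correct response for one target color."""
    if target_is_high_reward:
        return P_EXPECTED * HIGH_POINTS + (1 - P_EXPECTED) * LOW_POINTS
    return P_EXPECTED * LOW_POINTS + (1 - P_EXPECTED) * HIGH_POINTS


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed, for illustration only
    print([draw_points(True, rng) for _ in range(10)])
    print(expected_points(True), expected_points(False))
```

Under this schedule, the high-reward color yields about 164 points per correct response on average (0.8 × 200 + 0.2 × 20), and the low-reward color about 56 (0.8 × 20 + 0.2 × 200), so the two colors differ in expected value even though both occasionally produce either outcome.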

The training phase consisted of a variable number of trials grouped into 12 blocks. Prior to the start of the training phase, practice trials were provided. These trials were identical to actual trials, except that all visual stimuli were presented in white and the points earned on each trial were equal to 0 or to 10 points, for incorrect or correct trials, respectively. The practice phase ceased when participants had collected 100 points, that is, once they had correctly completed ten trials. Between each block and after completion of the training phase, participants were given a short break. Each block was terminated after the participant had accumulated 2,500 points.

Test phase

For our test phase, we used an additional singleton task (Theeuwes, 1992). During this task, eight stimuli appeared on a search display, where each search stimulus was placed, evenly spaced, around the circumference of an imaginary circle, radius 6.4°, centered on fixation. Seven of these stimuli were Landolt Cs, 1.5° in radius, and the eighth stimulus was a Landolt square outline, 3.0° in width and height. Each Landolt had a 0.36° gap on either the left or the right side (forward facing or reverse). The target was defined as the square outline. Depending on the trial type, either all stimuli were colored white or all stimuli were colored white except one (the additional singleton), which was drawn in either the high-reward-associated or the low-reward-associated color. No feedback or points were provided following each test trial.
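The geometry of the test display can be sketched as follows. This is an illustrative Python sketch only: the function name `array_positions` is hypothetical, coordinates are expressed in degrees of visual angle relative to fixation, and the starting angle of the array is an assumption, since it is not stated in the text.

```python
import math

N_POSITIONS = 8          # number of search items, as stated in the text
ECCENTRICITY_DEG = 6.4   # radius of the imaginary circle, in degrees of visual angle


def array_positions(n=N_POSITIONS, radius=ECCENTRICITY_DEG, start_angle_deg=0.0):
    """Return n (x, y) centers evenly spaced on a circle around fixation.

    start_angle_deg is an assumed parameter; 0 places the first item
    directly to the right of fixation.
    """
    positions = []
    for k in range(n):
        theta = math.radians(start_angle_deg + k * 360.0 / n)
        positions.append((radius * math.cos(theta), radius * math.sin(theta)))
    return positions


if __name__ == "__main__":
    for x, y in array_positions():
        print(f"{x:6.2f}, {y:6.2f}")
```

With eight items at 6.4° eccentricity, neighboring centers are separated by 2 × 6.4° × sin(22.5°), or roughly 4.9°, which is larger than the 3.0° extent of each item.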

The search display was presented until the participant had made a response. Each participant had to identify on which side, left or right, the gap was located on the square target, by pressing the “z” or the “m” key, respectively. The RT was measured from the onset of the visual stimuli to the response made by each participant.

The test phase of the experiment included 320 trials that were divided into eight blocks. Once again, practice trials were provided before this phase began. In total, RTs were compared in four conditions related to the additional singleton: no color, distractor color, high-value color, and low-value color. The high-value and low-value colors refer to the same colors used for the high-reward and low-reward targets in the training phase of the experiment. The target and additional singletons were equally likely to appear in each of the eight positions of the search array throughout the experiment. The additional singleton always appeared as a distractor. The search display stayed on the screen until the participant had made a response, and then the next search display was presented.

Results and discussion

Correct RTs in the acquisition (i.e., training) phase were analyzed by dividing the phase into first and last halves, each containing high- and low-reward-associated targets. Trials were trimmed within subjects by removing trials with RTs outside two standard deviations of a participant’s mean RT. A Block × Reward analysis of variance (ANOVA) revealed a main effect of block, F(1, 21) = 27.10, p < .001, η_p² = .56, but no main effect of reward, F(1, 21) = 0.84, p = .37, η_p² = .04, and no interaction, F(1, 21) = 1.03, p = .32, η_p² = .05, although RTs were numerically faster for high-reward, M = 519 ms, SE = 5 ms, than for low-reward, M = 531 ms, SE = 4 ms, targets in the last half of the training phase. Thus, we did not find reliable evidence of a difference in RTs between high- and low-reward targets in our training phase.
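The within-subject trimming rule mentioned above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the analysis code used for the reported results: the function name `trim_rts` is hypothetical, the sample standard deviation (n − 1 denominator) is assumed, and trimming is applied in a single pass over all of one participant's correct RTs rather than separately per condition.

```python
from statistics import mean, stdev


def trim_rts(rts, n_sd=2.0):
    """Keep RTs within n_sd standard deviations of this participant's mean RT.

    Assumptions: sample SD (n - 1), one pass, all trials pooled.
    """
    if len(rts) < 2:
        return list(rts)  # SD is undefined for fewer than two trials
    m = mean(rts)
    sd = stdev(rts)
    return [rt for rt in rts if abs(rt - m) <= n_sd * sd]


if __name__ == "__main__":
    # Made-up RTs in ms for one hypothetical participant; 1500 is an outlier.
    example = [500, 510, 520, 530, 540, 505, 515, 525, 535, 1500]
    print(trim_rts(example))
```

Because the cutoff is computed from each participant's own mean and standard deviation, the same absolute RT can be retained for a slow, variable participant and removed for a fast, consistent one.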

In the test phase, the correct RTs and accuracy were M = 535 ms, SE = 15 ms, and M = 97.2%, SE = 0.6%, respectively. To determine whether the learned value from the training phase affected the allocation of attention in the test phase, the average correct RTs for the additional-singleton effects in the test phase were analyzed using a one-way, repeated measures ANOVA with Singleton Condition (low value, high value, and no singleton) as a factor. The averaged correct RT in each condition is shown in Fig. 2. A main effect of singleton type was present, F(2, 42) = 8.80, p = .001, η_p² = .30. Follow-up contrasts revealed that low-value singletons slowed search times relative to no-singleton trials, F(1, 21) = 4.82, p = .04, η_p² = .19, and, critically, that high-value singletons slowed search times even further, relative to low-value singletons, F(1, 21) = 6.15, p = .02, η_p² = .23. No differences in accuracy were observed by singleton condition, F(2, 42) = 0.99, p = .38, η_p² = .05. This demonstrates that, in a task that used points in lieu of monetary reward, learned value led to stable changes in attentional priority, such that stimuli associated with more reward produced increased distraction in a subsequent task.
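For readers who wish to check the reported effect sizes, partial eta squared can be recovered from an F ratio and its degrees of freedom using the standard identity: partial eta squared = (F × df_effect) / (F × df_effect + df_error). The short Python sketch below applies this identity to the values reported for the test phase; the function name `partial_eta_squared` is hypothetical.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared implied by an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


if __name__ == "__main__":
    # (label, F, df_effect, df_error) as reported for the test phase
    reported = [
        ("singleton main effect", 8.80, 2, 42),
        ("low value vs. no singleton", 4.82, 1, 21),
        ("high value vs. low value", 6.15, 1, 21),
        ("accuracy", 0.99, 2, 42),
    ]
    for label, f_value, df1, df2 in reported:
        print(f"{label}: {partial_eta_squared(f_value, df1, df2):.2f}")
```

Rounded to two decimals, these reproduce the reported values of .30, .19, .23, and .05.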

Experiment 2

Given that we were able to show a learned-value effect on the allocation of attention in our version of the task used by Anderson et al. (2011b), we substituted a TOJ task in the test phase to measure whether learned value affected the speed with which the stimuli were processed. If learned value does increase preattentive visual priority, we expected that the stimuli associated with higher value should be perceived earlier than the stimuli with lower value.