The role of reward and reward uncertainty in episodic memory. Journal of Memory and Language

Declarative memory has been found to be sensitive to reward-related changes in the environment. The reward signal can be broken down into information regarding the expected value of the reward, reward uncertainty and the prediction error. Research has established that high as opposed to low reward values enhance declarative memory. Research in neuroscience suggests that high uncertainty activates the reward system, which could lead to enhanced learning and memory. Here we present the results of four behavioural experiments that examined the role of reward uncertainty in memory, independently from any other theoretically motivated reward-related effects. Participants completed motivated word learning tasks in which we varied the level of reward uncertainty and magnitude. Rewards were dependent upon memory performance in a delayed recognition test. Overall the results suggest that reward uncertainty does not affect episodic memory. Instead, only reward outcome appears to play a major role in modulating episodic memory. © 2017 The Authors. Published by Elsevier Inc. This is an open access article under the CC BY license (http://

The neuroscience of reward processing has guided research on the relationship between reward and memory (Adcock et al., 2006; Shohamy & Adcock, 2010; Wittmann, Dolan, & Düzel, 2011). Single-cell neurophysiology in non-human primates and imaging work in humans strongly suggest that the dopaminergic reward system responds to different components of reward: expected value; outcome or prediction error; and uncertainty of reward (Cromwell & Schultz, 2003; Fiorillo, Tobler, & Schultz, 2003; Hollerman & Schultz, 1998; Schultz, 1998, 2002; Schultz et al., 2008; Tobler, Fiorillo, & Schultz, 2005). The aim of this paper is to examine which aspects of the reward signal promote memory performance in motivated learning. In particular, the key question examined here is whether uncertainty about reward has effects on episodic memory. We also assess more generally the role of these different reward components in episodic memory. Across the four experiments presented in this paper, we isolate and assess the contributions of different aspects of reward to episodic memory encoding. The factors of interest are listed in Table 1. As we review in the sections below, these reward components were selected based on previous demonstrations that they are signalled in reward-related brain areas (Cromwell & Schultz, 2003; Fiorillo et al., 2003; Liu, Hairston, Schrier, & Fan, 2011; Preuschoff, Bossaerts, & Quartz, 2006; Schultz, 2010) and/or have been shown to affect reward-related learning (Adcock et al., 2006; Bunzeck, Dayan, Dolan, & Duzel, 2010; Mather & Schoeke, 2011; Wittmann et al., 2011).

Dopamine signalling of reward cues, outcomes and uncertainty
Evidence from neuroscience (both single cell recordings in nonhuman primates and neuroimaging in humans) suggests that the reward system, comprising areas such as the ventral tegmental area (VTA), the ventral striatum, the frontal cortex and amygdala, shows several changes in activity in response to rewards and reward-predicting cues (Cromwell & Schultz, 2003; Fiorillo et al., 2003; Paton, Belova, Morrison, & Salzman, 2006; Schultz, 1998, 2002). Dopaminergic neurons in the midbrain exhibit two patterns of firing. The first, known as phasic bursting, consists of transient responses to reward cues and outcomes. One view is that this phasic response encodes the reward prediction error: if the reward is smaller than expected the neurons respond below their baseline firing rate, and if it is larger than expected the neurons fire above their baseline rate (Fiorillo et al., 2003; Glimcher, 2011; Hollerman & Schultz, 1998; Schultz, 1998, 2010; Tobler et al., 2005). The second type of signal, tonic firing, refers to sustained activity in response to anticipation and expectancy. This tonic firing has been linked to reward uncertainty (Hsu, Krajbich, Zhao, & Camerer, 2009; Liu et al., 2011; Preuschoff et al., 2006; Preuschoff, Quartz, & Bossaerts, 2008; Tobler et al., 2005; Tobler, O'Doherty, Dolan, & Schultz, 2007). Uncertainty refers to the predictability of the outcome of an event. Whereas expected value refers to a combination of reward magnitude and probability, uncertainty refers to the spread of the reward probability distribution irrespective of the magnitude (Tobler et al., 2007). In the case where there are two possible outcomes (e.g. reward vs. no reward), uncertainty follows an inverted U-shaped function of probability of reward, so that it is maximal at p = 0.5. A common measure of uncertainty is entropy. Entropy is calculated as minus the weighted sum of the logarithm of the probabilities of each possible outcome. Unlike variance, it is not dependent on the reward magnitude (Preuschoff et al., 2006). An additional information theoretic term we will examine is surprisal. Surprisal refers to the information gained from an event when it occurs (i.e., the reduction in uncertainty) and is larger for less probable events: less probable events are more surprising when they do occur. Surprisal differs from signed prediction error in that a surprisingly good and a surprisingly bad outcome will generate the same surprisal value, but will be associated with different prediction errors (positive vs. negative).
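As a rough illustration of the two information-theoretic quantities defined above, the following sketch computes the entropy of a two-outcome reward cue and the surprisal of a single outcome. The function names and the use of base-2 logarithms (bits) are assumptions made for illustration; the text does not specify a logarithm base.

```python
import math

def entropy(probabilities):
    """Shannon entropy in bits: minus the probability-weighted sum of log2(p)."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

def surprisal(p):
    """Information gained (in bits) when an outcome of probability p occurs."""
    return -math.log2(p)

# Entropy of a reward/no-reward cue as a function of reward probability.
# It is symmetric around p = 0.5, where it peaks (the inverted U).
for p_reward in (0.1, 0.25, 0.5, 0.75, 0.9):
    h = entropy([p_reward, 1 - p_reward])
    print(f"p(reward) = {p_reward:.2f}  entropy = {h:.3f} bits")

# Surprisal depends only on how probable the outcome was, not on whether
# the outcome was good or bad: rarer outcomes carry more information.
print(surprisal(0.25), surprisal(0.75))
```

Because neither function takes the reward magnitude as an argument, a 0/20p cue and a 0/200p cue with the same probabilities would receive the same entropy, which is the sense in which entropy differs from variance.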
While much of the work on reward uncertainty coding has been conducted with non-human animals, separate responses to value and uncertainty have also been observed in humans using fMRI (D'Ardenne, McClure, Nystrom, & Cohen, 2008; Glimcher, 2011; Hsu et al., 2009; Liu et al., 2011; Ludvig, Sutton, & Kehoe, 2008; Preuschoff et al., 2006, 2008; Schultz et al., 2008; Tobler et al., 2005, 2007). Using a monetary gambling task, Preuschoff et al. (2006) found evidence of neural encoding of expected value and uncertainty in regions including the midbrain and ventral striatum. In this study, and as similarly observed in other studies, the authors found both linear and quadratic components to the reward signal (Cooper & Knutson, 2008; Dreher, Kohn, & Berman, 2006; Rolls, McCabe, & Redoute, 2008). In summary, there is compelling evidence indicating that expected value and uncertainty are represented by temporally distinct signals in the brain.
As we review next, there is both neurobiological and behavioural evidence that these reward signals linked to reward cues and outcomes are associated with enhanced memory consolidation (Lisman & Grace, 2005;Lisman, Grace, & Duzel, 2011;Shohamy & Adcock, 2010). However, there are no studies to date that directly examine the role of reward uncertainty in memory.

Reward-related memory enhancements
Reward-related enhancements in memory have also been found for items where memory is incidental. Under incidental learning conditions, the rewards are not contingent upon memory but instead rewards or reward cues are presented in close temporal proximity to memory targets (Murayama & Kitagami, 2014; Murayama & Kuhbandner, 2011; Wittmann et al., 2005). These reward-related enhancements are only seen for items tested after a delay (24 h) (Murayama & Kuhbandner, 2011; Wittmann et al., 2011). This type of learning is thought to be supported by the functional links between the reward circuitry in the brain and the hippocampus (Lisman & Grace, 2005) and emerging evidence suggests that dopaminergic activity modulates hippocampal encoding (Shohamy & Adcock, 2010). Although studies have focused on the potential role of dopamine, it is likely that other neurotransmitters such as acetylcholine and noradrenaline are co-released with dopamine and play a critical role in reward processing and memory consolidation (Clewett & Mather, 2014; Mather, Clewett, Sakaki, & Harley, 2015; Murty, Labar, & Adcock, 2012; Preuschoff, 't Hart, & Einhauser, 2011; Shaikh & Coulthard, 2013; Takeuchi et al., 2016).
The incidental learning literature has investigated, to a greater degree than the motivated learning literature, which aspects of the reward signal may be critical to the reward-related memory enhancement. A key question has been whether the fidelity of the reward memory enhancement is sufficient to reflect small changes in magnitude. Wittmann et al. (2011) found that recognition memory for items showed a non-linear effect of reward on memory performance, with significant differences in memory performance only between cases where reward was delivered and where it was not, regardless of the reward value. The focus has now shifted to the relationship between reward cue and reward outcome (Bunzeck et al., 2010; Mason, Ludwig, & Farrell, 2016; Mather & Schoeke, 2011). Mather and Schoeke (2011) propose that the critical factor is the reward outcome relative to expectation as opposed to the absolute amount of reward received on each trial. In their study participants were presented with a reward cue indicating one of three trial types (monetary loss and no outcome trial). Participants had to respond as quickly as possible to a picture target, after which the reward outcome was revealed. The reward outcome could either be congruent or incongruent with the reward cue, meaning that trials could be classified as either rewarded or loss avoided (regardless of actual reward outcome). Recognition memory performance for the target pictures was significantly better for trials resulting in a "hit" outcome, which includes trials where the reward value may have been 0. Similarly, in our recent direct replication of findings by Bunzeck et al. (2010) we found evidence that memory performance was primarily influenced by the relative reward magnitude, albeit in the opposite direction to the original finding. The consistent evidence of an effect of reward outcome on memory under incidental learning conditions demonstrates that these reward-memory effects cannot be explained by strategic encoding alone.
Murayama and Kitagami (2014) conducted an incidental learning task to provide direct evidence that reward-related memory enhancements are not simply a result of increased engagement or attention. The to-be-remembered items were presented prior to the reward cue and therefore the reward cue was essentially irrelevant to the memory stimuli. The results showed that items that were followed by a reward cue were better remembered than items that were followed by a neutral cue. These findings suggest a post-presentation and retroactive effect of monetary incentives on memory encoding which indicates that the effects may rely on mechanisms of dopaminergic consolidation.
There also exist many situations where the individual has expectations that some information will be more relevant, or that there are reasons to actively prioritise the encoding of particular information at the expense of other information. In educational settings the contingency between performance and reward is often explicit, either through classroom rewards such as points or through the desire to achieve good grades. This type of learning is often referred to as motivated or intentional learning as participants are explicitly informed that each reward is only earned if the associated item is successfully remembered in a later memory test (Adcock et al., 2006; Castel, 2007; Spaniol et al., 2013). Findings in the value-directed learning literature that use a linear range of point values demonstrate better memory in an immediate memory test for the higher compared to lower value items (Ariel & Castel, 2014; Castel, 2007; Castel et al., 2002; Cohen, Rissman, Suthana, Castel, & Knowlton, 2014; Friedman & Castel, 2011; Madan et al., 2012). Adcock et al. (2006) conducted the first study to directly examine the link between reward anticipation and motivated learning. In this study the reward was presented prior to the stimulus. Participants were asked to remember pictures in exchange for a high reward ($5) or a low reward ($1), which they received upon successful recognition of the item. The authors found that the expectation of receiving a reward increased memory for high reward items. It therefore appears that the reward-related memory enhancement occurs when people directly prioritise rewarded items as well as when they incidentally associate items with rewards. Using fMRI, the authors found that higher activity in reward-related areas at encoding predicted superior memory performance. It is thought that dopaminergic consolidation processes support reward-related motivated learning.
However, it is likely that in motivated learning, strategic learning also plays a critical role. Spaniol et al. (2013) conducted a motivated learning study to address the contributions of strategic value-based learning and dopaminergic consolidation to reward-related enhancements in memory. In their task, a monetary incentive was presented before the memoranda and the participant's goal was to remember as many items as possible in order to maximise earnings. The presentation of each memorandum was followed by a simple distractor task which served to reduce the chance of strategic rehearsal of the individual memory items. Spaniol et al. (2013) included a within-subject manipulation of test interval (immediate vs. 24 h delay). The authors found that higher reward increased memory performance only after a delay, which suggests a greater role for consolidation processes compared to selective rehearsal or increased attention, both of which should be evident at immediate test. It should be noted that the task used by Spaniol et al. (2013) differed from those used in value-directed learning studies in that it was fast paced, leaving little time for effortful encoding, and there was a large number of trials. Therefore, it is likely that both value-directed learning and dopaminergic consolidation contribute to explaining the reward-related enhancements in motivated learning.

A possible role for reward uncertainty
The preceding discussion highlights the extensive investigation of the role of reward in episodic memory. However as noted earlier, the reward system also signals the uncertainty of reward delivery following a reward cue. There is a notable absence of studies examining the influence of reward uncertainty on memory. Several studies in the neuroscience and education literature have looked at the potential influence of uncertainty on learning, but they confound reward uncertainty with other aspects of the reward environment, such as expected value. For example, Ozcelik et al. (2013) tested participants' general knowledge in a game-like learning environment in which correct answers earned points and incorrect answers lost points. A dice was rolled to indicate the number of points available in each trial. There were two conditions: a certain condition, under which a dice was rolled but the points available remained the same; and an uncertain condition, in which a dice was rolled but any value was possible. This design means that expected value changes in the uncertain condition but not in the certain condition and therefore the two variables are confounded. The results indicated better performance in the uncertain condition and increased self-reported motivation among players in the uncertain condition. However, it is not possible to conclude from these results whether it is uncertainty of outcome, or reward outcome value that is actually driving the increased learning, as the design of the experiment confounded expected value of reward and uncertainty.
Although they did not explicitly examine uncertainty in their study, Howard-Jones et al. (2011) also noted the potential role of uncertainty in driving learning in educational settings. Indeed, there has been an increasing interest in game-based learning. The gamification of the learning environment aims to motivate and engage students by involving elements typically found in video-games. The success of attempts to introduce gaming techniques in classroom education is still debated (de Freitas & Maharg, 2011;Perrotta, Featherstone, Aston, & Houghton, 2013). The idea that chance-based games promote learning due to increased reward activity in the brain suggests that a demonstrated role of uncertainty in memory consolidation would have potential application in educational settings (Howard-Jones et al., 2014).

The current study
The primary goal of the study presented here was to test behaviourally if there is a memory-enhancing effect of reward uncertainty. The information provided by reward uncertainty is conceptually different from that provided by prediction error and therefore the two signals may differentially affect memory. This distinction is consistent with findings that uncertainty of reward is signalled by the more tonic activity change (as opposed to phasic bursting) of dopamine neurons (Fiorillo et al., 2003).
Alternatively, if the reward memory enhancement is driven purely by the signal associated with reward value then there should be no effect of uncertainty on memory. The effective reward value signal might be the expected value, the actual reward outcome or the prediction error. Prediction errors are an integral part of reinforcement learning models (Sutton & Barto, 1998). In such models, prediction errors are used to update the current belief about the value of different actions in order to maximise future rewards. It has been suggested that neurons in the dopaminergic system encode the prediction error term of these models (Schultz, 1998). Previous studies, particularly in incidental memory, have not clearly distinguished between the effects of reward anticipation and outcome (Bialleck et al., 2011;Mather & Schoeke, 2011;Wittmann et al., 2005). Memory enhancement could be attributed to either reward anticipation or a post-encoding enhancement of items after reward delivery (Murayama & Kitagami, 2014).
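The role of the prediction error in such models can be sketched with a simple delta-rule update. This is a generic textbook form rather than a model fitted in this paper; the function name, the learning rate `alpha` and the example reward sequence are all hypothetical.

```python
def update_value(value, reward, alpha=0.1):
    """One delta-rule step: move the value estimate towards the observed reward."""
    prediction_error = reward - value   # positive if the outcome beat expectation
    new_value = value + alpha * prediction_error
    return new_value, prediction_error

# Hypothetical sequence of outcomes following an uncertain 0/20p cue
value = 0.0
for reward in [20, 0, 20, 20, 0]:
    value, delta = update_value(value, reward, alpha=0.1)
    print(f"reward = {reward:2d}  prediction error = {delta:+.2f}  value = {value:.2f}")
```

The sign of the prediction error distinguishes better-than-expected from worse-than-expected outcomes, which is the property attributed to phasic dopamine responses in the account cited above.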
Finally, consideration of behavioural theories suggests that uncertainty might even have a negative effect on memory. There are many demonstrations of individuals' tendency to be risk averse, meaning that safer gambles are preferred to riskier gambles, and this is observed even when riskier gambles have a higher expected value (Kahneman & Tversky, 1979). Psychological theories of utility, such as prospect theory (Kahneman & Tversky, 1979), typically incorporate risk aversion via a concave utility function, which down-weights the high value outcomes that would be more likely under the riskier option. It may be that memory encoding is driven not just by value (e.g., Castel, 2007) but by expected utility incorporating a concave utility function, in which case uncertainty will devalue items and make them less memorable.
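To see why a concave utility function would devalue uncertain rewards, consider a minimal sketch using a power utility function. The exponent of 0.5 is an arbitrary illustrative choice; it is not a parameter estimated in this paper or taken from prospect theory, and the function names are hypothetical.

```python
def utility(amount, exponent=0.5):
    """Concave power utility for non-negative amounts (exponent < 1)."""
    return amount ** exponent

def expected_utility(outcomes, probabilities, exponent=0.5):
    """Probability-weighted sum of the utilities of the possible outcomes."""
    return sum(p * utility(x, exponent) for x, p in zip(outcomes, probabilities))

certain = expected_utility([10], [1.0])              # a certain 10p
uncertain = expected_utility([0, 20], [0.5, 0.5])    # 0p or 20p with equal odds

print(f"certain 10p:     {certain:.3f}")
print(f"uncertain 0/20p: {uncertain:.3f}")
```

Both options have the same expected value, yet under any exponent below 1 the uncertain option has the lower expected utility, which is the pattern that would make uncertain items less memorable if encoding tracked expected utility.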

Experiment 1
The aim of Experiment 1 was to use a simple verbal memory task to determine if uncertainty of reward enhances memory when controlling for expected value of reward. Participants studied a list of words in expectation of a delayed recognition memory test (see Fig. 1). Each word was preceded by one of three reward cues: no reward (0p), a certain reward (10p), or an uncertain reward (0/20p), for which the two outcomes were equally probable (p = 0.5). The actual reward outcome was not shown. One notable feature of this design is that the uncertain 0/20p cue has an expected value of 10p. This allows us to compare certain and uncertain rewards with the same expected value.
A first concern was whether we would replicate the findings that memory performance is enhanced by an associated reward cue (Adcock et al., 2006). Our second, more critical interest was in comparing memory performance under conditions of certainty and uncertainty, to determine if reward uncertainty also has an effect on memory performance. In this experiment uncertainty persists for the duration of encoding and memory recognition, so it is never resolved until the very end of the second session. Even then, no individual rewards are tied to individual items.

Participants
A total of 30 participants took part in our study (age range 18-36 years, mean 22.73 years, SD = 4.27; 9 males and 21 females). All participants received a minimum £5 for their time. The rest of their earnings were related to performance in the memory task, on which they could earn up to an additional £6.40. All participants were fluent English speakers and gave informed written consent prior to the study, which was approved by University of Bristol Ethics Committee.

Materials
A total of 204 words were selected from the pool of 400 words used in Oberauer, Lewandowsky, Farrell, Jarrold, and Greaves (2012). All words were concrete nouns, and were chosen to refer to common objects that are larger or smaller than a soccer ball, with the pool consisting of 102 objects rated as larger and 102 rated as smaller.

Procedure
Participants were required to attend two experimental sessions approximately twenty-four hours apart. During the first session participants were told that they would see a series of words and that their memory for those words would be tested in the next day's session. Participants were not given details regarding the type of memory test. Each word would be presented with a monetary reward, such that participants would earn the reward upon correct recognition of the word. There were three reward cues: a certain 0p (no reward); a certain 10p; or uncertain (0/20p with equal probability), where the actual outcome was determined pseudo-randomly by the computer. In the latter condition the participants were told that the cue indicated that they would win either 0p or 20p for a correct recognition. The certain 10p and uncertain 0/20p conditions are equated for expected value, but differ in uncertainty. In each trial participants were presented with a reward cue for 1500 ms, followed by the target word (4000 ms). To ensure that all of the words presented were encoded, including those that the cue indicated would not be rewarded, participants were required to indicate whether the object was smaller or larger than a soccer ball.
Participants used the left and right arrow keys (with their index and middle fingers of their dominant hand) to input their response. The word then changed from black to blue to show that their response had been registered, but remained on the screen to control the word presentation time. There was an intertrial interval of 1500 ms between each trial during which a fixation cross was displayed in the centre of the screen. Participants completed a block of 12 practice trials before starting the main part of the experiment. The learning phase was then run as three blocks of 34 trials, with optional breaks between blocks. There were an equal number of each type of reward, randomly intermixed across the 102 trials.
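The structure of the study list can be sketched as follows. This only illustrates the counts reported above (34 trials per cue type, 102 trials run as three blocks of 34); the cue labels, the placeholder words, the seed, and the use of a single shuffle across all trials are assumptions, and the code is not the software used to run the experiment.

```python
import random

CUES = ["0p", "10p", "0/20p"]   # hypothetical labels for the three reward cues
TRIALS_PER_CUE = 34
BLOCK_SIZE = 34

def build_study_list(words, seed=None):
    """Randomly pair words with cues, then split the trials into blocks."""
    if len(words) != len(CUES) * TRIALS_PER_CUE:
        raise ValueError("expected 102 study words")
    rng = random.Random(seed)
    shuffled_words = rng.sample(words, len(words))   # random word order
    cues = [cue for cue in CUES for _ in range(TRIALS_PER_CUE)]
    rng.shuffle(cues)                                # intermix the cue types
    trials = list(zip(shuffled_words, cues))
    return [trials[i:i + BLOCK_SIZE] for i in range(0, len(trials), BLOCK_SIZE)]

words = [f"word{i:03d}" for i in range(102)]   # placeholder stimuli
blocks = build_study_list(words, seed=1)
print(len(blocks), [len(block) for block in blocks])
```

With a single shuffle the cue types are balanced across the whole list but not necessarily within each block; whether the original experiment balanced cues within blocks is not stated.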
During the second session on the following day, participants completed a recognition test following the "remember/know" procedure (Tulving, 1985), which is often used in reward-related memory studies (Bunzeck et al., 2010; Spaniol et al., 2013; Wittmann, Bunzeck, Dolan, & Düzel, 2007; Wittmann et al., 2005). Using the left/right arrow keys participants were first required to make an old/new judgment. After "new" judgments participants were asked to rate how confident they were about this decision by deciding if the word was "certainly new" or "guess". After "old" decisions participants were asked to indicate if they were able to recollect something specific about seeing the word during the study phase ("remember"), or if they simply felt the word was "familiar". Alternatively, if they were unsure that they had in fact seen the word they could select "guess". Participants were told that they would earn the reward associated with the word, but if they classified a new word as old they would lose 7 pence from their total earnings. The participants had up to 4 s to make each of the two judgments and a response terminated each trial. There was an inter-trial interval of 500 ms. Their total earnings were revealed at the end of the test phase.
There is a continuing debate in recognition memory regarding single- or dual-process models. Dual-process models (Wixted & Stretch, 2004; Yonelinas, 2002) propose that recognition memory comprises both recollection and familiarity. In contrast, single-process models such as retrieving-effectively-from-memory (REM) (Shiffrin & Steyvers, 1997) propose that items are stored against a single continuum of familiarity or strength of evidence (Dunn, 2004). Dunn (2004) demonstrated that "remember/know" data can be explained by a single process, as both options reflect confidence in the choice as opposed to two distinct processes. Proponents of dual-process models argue that a distinct recollection component must be included in recognition memory models in order to explain the findings in the recognition data (Diana, Reder, Arndt, & Park, 2006). However, there is evidence to suggest that both types of model account for the existing data; for example, see work by Malmberg, Zeelenberg, and Shiffrin (2004) on word-frequency effects in recognition memory. We had no prior beliefs concerning differential effects of uncertainty on recollection vs. familiarity, but in line with other studies in the literature we collected recollection and familiarity data, as these are sometimes analysed separately. The confirmatory analysis focused on correct recognition and, given that there was no overall effect of uncertainty, we did not feel justified in examining recollection and familiarity separately. All our data are available on the Open Science Framework at https://osf.io/bn5e8/.

Data analysis
For each of our experiments we tested 30 participants, with the exception of Experiment 3 where we tested 50 participants. We are using Bayesian statistics as our inferential framework, which allows us to competitively test models and explicitly calculate a strength of evidence for these models that is not systematically biased by the sample size.
For all experiments a mixed-effects logistic regression was conducted on individual trials. Firstly, this allowed us to accommodate individual differences, at least in overall performance levels (by way of a random subject factor). Secondly, this approach allowed us to test the role of a range of reward-related factors that were not necessarily represented by single conditions in each experiment. Our dependent variable was therefore each participant's "old/new" response to each of the studied words shown in the experiment. The modelling of individual trials becomes particularly important when different trial types (e.g. low and high probability rewards in Experiment 4) occur with different frequencies. In this case, an analysis of "by-participant summary statistics", such as the hit rate, using standard statistical approaches would give equal weight to hit rates that were measured with very different numbers of trials. We will plot the proportion of items correctly recognised based on reward cue and outcome in each experiment so as to graphically represent the data. In addition, to illustrate the goodness of fit, for each experiment we plot the predictions of the best model (the model with the lowest BIC) alongside the data.
An additional and important benefit of our approach is that we are able to make inferences in favour of the null. A Bayesian model selection approach was used to assess the unique contribution of predictors. For all experiments, models were fit using the "glmer" function in the "lme4" package in "R" (Bates, Maechler, Bolker, & Walker, 2015). The Bayesian Information Criterion (BIC) provided by the "glmer" function can be converted to an approximation of a Bayes Factor (with uninformative priors) according to the following rule: $\mathrm{BF}_{M_1 M_2} = \exp\left(-0.5 \times (\mathrm{BIC}_{M_1} - \mathrm{BIC}_{M_2})\right)$ (Raftery, 1995; Wagenmakers, 2007). The unit information prior that the BIC assumes is objective in that the researcher does not specify their own prior. The BIC's assumed prior is relatively uninformed, and tends to be conservative (i.e., it can favour the null hypothesis more than under an informed prior (Weakliem, 1999)).
For our model comparisons we first selected the model with the lowest BIC value and we then compared each of the other models to this model. The subscript of the Bayes Factor indicates the direction of the model comparison: the first element of the subscript represents the numerator; the second element represents the denominator. A Bayes Factor greater than 1 indicates that the model in the numerator is better supported by the data than the model in the denominator. A Bayes Factor less than 1 indicates that the model in the denominator is better supported than the model in the numerator. The parameter estimates and confidence intervals for all the models we ran are reported for each experiment in the appendix. Table 1 lists the factors that were tested in each experiment; the design of the experiments meant that not all factors could be uniquely tested in all experiments. The BIC values for each model in each experiment are presented in the subsequent tables, along with a third column which compares the best model to each other model (denoted as M). For example, if the EV model has the lowest BIC value it is selected as the best model and a BF will be given providing a strength of evidence in favour of the EV model compared to each other model (BF EV M).
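The comparison procedure described above can be sketched in a few lines. The BIC values below are hypothetical and are not taken from the tables in this paper; the function name is likewise an illustrative assumption. The conversion uses the BIC approximation to the Bayes Factor given in the text.

```python
import math

def compare_to_best(bic_by_model):
    """Return the lowest-BIC model and the BF in its favour against each other model."""
    best = min(bic_by_model, key=bic_by_model.get)
    bayes_factors = {
        name: math.exp(-0.5 * (bic_by_model[best] - bic))   # BF_best,M
        for name, bic in bic_by_model.items()
        if name != best
    }
    return best, bayes_factors

# Hypothetical BIC values, for illustration only
bics = {
    "baseline": 3920.0,
    "EV": 3900.0,
    "uncertainty": 3918.0,
    "EV + uncertainty": 3907.0,
}
best, bfs = compare_to_best(bics)
print("best model:", best)
for name, bf in bfs.items():
    print(f"BF_{best},{name} = {bf:.1f}")
```

Because the best model always sits in the numerator, every Bayes Factor produced this way is at least 1, and larger values indicate stronger evidence for the best model over the alternative.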
The value of the BF indicates the relative evidence, provided by the data, in favour of one statistical model over another. Although this evidence is continuous, several authors have suggested heuristics to interpret this evidence. Here it is useful to loosely follow the interpretation scheme of Jeffreys (1961), who suggested that odds greater than 3 can be interpreted as some evidence, odds greater than 10 as strong evidence, and odds greater than 30 as very strong evidence for a particular hypothesis compared to an alternative (see also Wagenmakers, 2007). Fig. 2 shows the recognition memory rate for each of the reward conditions in the experiment. The false alarm rate along with those from the other three experiments can be seen in Table 2. The summary statistics indicate that memory performance was greater in both of the rewarded conditions, certain and uncertain, compared to no reward.

Results
A mixed-effects logistic regression was run, with the outcome variable being recognition accuracy of each of the old words (correct/incorrect). The baseline model was a model containing only subject as a random effect. The predictors entered into the model were expected value (0 or 10) and reward uncertainty, coded as a binary variable indicating the presence or absence of uncertainty (1 or 0, respectively). First each of the predictors was entered individually into a model, along with subjects as a random effect.
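The coding of the two predictors can be sketched as a simple lookup from reward cue to design variables. The cue labels, field names and function name are hypothetical; only the coded values (expected value of 0 or 10, uncertainty of 0 or 1) come from the description above.

```python
# Map each reward cue to the predictors entered into the regression.
# Labels and field names are illustrative assumptions.
CUE_CODING = {
    "0p":    {"expected_value": 0,  "uncertainty": 0},
    "10p":   {"expected_value": 10, "uncertainty": 0},
    "0/20p": {"expected_value": 10, "uncertainty": 1},
}

def code_trial(cue, recognised):
    """Return one row of trial-level data for the logistic regression."""
    row = dict(CUE_CODING[cue])
    row["correct"] = int(recognised)   # 1 = studied word called "old", 0 = missed
    return row

print(code_trial("0/20p", True))
```

The certain 10p and uncertain 0/20p cues share the same expected value and differ only on the uncertainty predictor, which is what allows the two factors to be separated in this design.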
The results from the model comparisons are shown in Table 3. The first column depicts the predictors in each of the models that were tested and the second column provides a BIC value for each of these models. In the final column strength of evidence (as a Bayes Factor) is provided for each of the models compared to the best model. The BIC values for this experiment indicated that expected value alone was the best predictor of recognition memory performance. Therefore in column three of the table the BFs indicate how much evidence there is in favour of this EV model compared to each of the other models. In this case there is strong evidence that EV alone is a better predictor of memory performance compared to EV and reward uncertainty, or reward uncertainty alone.

Discussion
The results from Experiment 1 demonstrate a strong reward-associated memory enhancement. Both uncertain and certain rewards improved memory performance for items associated with reward. There was, however, evidence against an effect of uncertainty on memory performance. The model containing uncertainty was only marginally better than the baseline model. Expected value alone was a better predictor of memory performance than a model containing both uncertainty and expected value. One explanation for the results is that both the certain reward (10p) and the uncertain reward (0/20p) were being processed only in terms of expected value. This would suggest either that the uncertainty signal does not in fact contribute to enhanced memory performance and expected value alone can explain the memory advantage, or that the design of the experiment did not lead to an uncertainty signal. The majority of reward studies, both those looking at reward anticipation (Knutson, Adams, Fong, & Hommer, 2001) and those looking at the relationship between reward and memory performance (Adcock et al., 2006; Wittmann et al., 2005, 2011), have included a reward cue and reward outcome, thus creating an anticipatory period for reward. Dopaminergic neurons show a tonic response to reward uncertainty between reward cue and reward outcome and therefore this anticipatory period may be critical for encoding reward uncertainty (Fiorillo et al., 2003; Schultz et al., 2008). However, in our experiment the outcome was never revealed. On the one hand, this manipulation may ensure that the uncertainty signal persists for a sufficiently long period to influence the system. On the other hand, it may be that the reward outcome needs to be revealed for the uncertainty signal to emerge. Experiment 2 addresses this concern.
The results indicate a clear effect of reward on memory performance and there was strong evidence against the role of uncertain rewards. In Experiment 2 our aim was to address the concern that in Experiment 1 we may not have induced a reward uncertainty signal given that we did not reveal the reward outcome.

Experiment 2
The dopamine signal associated with uncertainty is thought to emerge as a slow, sustained ramping between the reward cue and delivery of the reward (Schultz et al., 2008). In our first experiment uncertain rewards were cued but the reward outcome was never revealed. To address this issue, in the second experiment participants were informed of the outcome of the uncertain cues following the presentation of each word.

Participants
A total of 30 participants took part in our study (age range 18-44 years, mean 24.73 years, SD 5.72; 12 males and 18 females). All participants received a minimum £5 for their time. The rest of their earnings were related to performance in the memory task. They could earn a total of £11.40. All participants were fluent English speakers and gave informed written consent prior to the study, which was approved by University of Bristol Ethics Committee.

Procedure
The procedure used in Experiment 1 was modified to include reward outcomes. For all trials, the reward cue was presented before the word (as in Experiment 1). For all cues (certain and uncertain) the reward outcome was presented after the word for 1500 ms (see Fig. 3). For certain trials the reward value was repeated and for uncertain trials (0/20p) each possible outcome (0p or 20p) was presented an equal number of times across the experiment. The rewards could then be obtained if the word was successfully recognised the next day. A minor amendment was made to the recognition test procedure so that participants had 5 s to make each of the recognition test choices. This was done to ensure that the recognition test was as similar as possible to others in the literature (e.g. Bunzeck et al., 2010). Otherwise the procedure was identical to that reported in Experiment 1.

Results
The recognition memory rates (proportion correct) for each of the reward conditions are shown in Fig. 4. Memory performance was higher in both the certain and uncertain conditions compared to the no reward condition. In addition, the recognition hit rates in the uncertain condition were greater than those in the certain condition. As can be seen in Fig. 4, enhanced memory performance seems to be particularly associated with the uncertain 20p reward.
The delivery of outcomes in Experiment 2 allowed us to test the following predictors: expected value, prediction error, outcome and uncertainty. It should be noted that expected value, prediction error and outcome are linearly dependent as prediction error is equal to the reward outcome minus expected value.
We ran models consisting of all possible combinations of the factors (models containing all three of the variables expected value, outcome and prediction error could not be run due to the linear dependence of factors).
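The linear dependence noted above can be made concrete with a short worked example. It assumes the cue values described in the Procedure (a certain 10p cue and an uncertain 0/20p cue with equally frequent outcomes); the notation is ours and is given only as an illustration.

```latex
\begin{align*}
\text{PE} &= \text{Outcome} - \text{EV} \\
\text{Uncertain cue (0/20p):}\quad \text{EV} &= 0.5 \times 0 + 0.5 \times 20 = 10\text{p} \\
\text{Outcome} = 20\text{p}:\quad \text{PE} &= 20 - 10 = +10\text{p} \\
\text{Outcome} = 0\text{p}:\quad \text{PE} &= 0 - 10 = -10\text{p} \\
\text{Certain cue (10p):}\quad \text{PE} &= 10 - 10 = 0\text{p}
\end{align*}
```

Because any one of the three quantities is fully determined by the other two, a model containing all three has a redundant predictor, which is why such models could not be fitted.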
The model with the lowest BIC is a model containing reward outcome and uncertainty. This model was then compared to all other models (see Table 4). There is little evidence that this model is better than a model containing outcome alone. Therefore, we have ambiguous evidence regarding the role of uncertainty. These models reflect the finding that there was greater recognition performance in the uncertain condition but that this enhancement was linked to the reward outcome and was largely driven by higher recognition rates associated with the uncertain 20p outcome.

Discussion
Experiment 2 replicated the finding from Experiment 1 that pairing a word with a higher reward improves memory performance. We found the strongest evidence in favour of a model that contained reward uncertainty and reward outcome, but the evidence for uncertainty alone was ambiguous: the Bayes Factors showed approximately equal evidence for a model including both uncertainty and reward outcome, and a model containing only outcome as a predictor. These results do not allow us to draw any conclusion, positive or negative, about the effect of uncertainty on episodic memory. The model comparisons do, however, clearly point to a role of outcome such that words associated with higher outcomes were better remembered.
Rather than run additional participants on this design to gather more discriminating evidence on the role of uncertainty, we chose to run an additional experiment with a large number of participants that also utilised a more balanced factorial design.

Experiment 3
The main purpose of Experiment 3 was to provide more diagnostic evidence regarding the roles of reward uncertainty and reward outcome. To more carefully pick apart the contributions (or lack thereof) of expected value, reward outcome, prediction error and uncertainty, a factorial crossing of reward cue (uncertain vs. certain) and reward outcome (0, 10 and 20 pence) was used. In addition, we ran 50 participants on this design; given all predictors were varied within-subjects, and the use of a mixed-effects analysis, a clear result on the role of uncertainty was anticipated.

Methods

Participants
A total of 50 participants took part (age range 18-36 years, mean = 21.24 years, SD 2.81; 20 males and 30 females). Participants were recruited for paid participation via adverts on the University of Bristol School of Experimental Psychology web page. Five additional participants were tested for one session, but did not complete the entire study, either due to an error with the computer hardware or a failure to attend the second session. All participants were fluent English speakers and gave informed written consent prior to the study, which was approved by University of Bristol Ethics Committee.

Procedure
The procedure of the experiment followed that of Experiment 2. The design of this experiment was factorial, with the main factors being reward value (0p, 10p or 20p) and reward uncertainty (certain or uncertain, with each outcome occurring an equal number of times). The recognition test followed the same procedure outlined in Experiments 1 and 2.
The total number of trials and the timings were kept the same. Reward cues were either certain or uncertain (0/10/20 pence). Across the experiment there were 34 trials for each reward outcome, 17 signalled by a certain reward cue and 17 (for each value) by the uncertain reward cue. All participants received a minimum £3 for their time. The rest of their earnings were related to performance in the memory task. Participants could earn a maximum total of £13.20.

Results
The proportion of items correctly recognised in each of the conditions in the experiment are plotted in Fig. 5.
We ran a series of all possible models to test the contribution of the following factors: expected value, prediction error, reward outcome and reward uncertainty. Our results suggest that the best model of the data contains only reward outcome as a predictor. We compared the evidence in favour of the reward outcome model to each of the other possible models and the results can be seen in Table 5. The model comparisons indicate that none of the other models were competitive. Fig. 5 visually suggests a potential role for uncertainty at least for the 0p and 20p outcomes. Our model comparisons illustrate that there is some evidence against a model containing both reward outcome and uncertainty. Accordingly, the evidence suggests that the visually apparent difference is more likely due to chance than a non-trivial effect.

Discussion
In Experiment 2 we were unable to provide clear evidence for or against reward uncertainty. The design of Experiment 3 allowed a more powerful examination of the effects of uncertainty across three different reward values. The results from Experiment 3 provide evidence against an additional effect of uncertainty. Instead, the findings suggest that reward outcome is the strongest single predictor of memory performance.

Experiment 4
Experiments 1-3 all found evidence in favour of expected value or, when available, reward outcome. In Experiments 1 and 3 we found evidence against reward uncertainty. In Experiment 2 we were not able to distinguish between the roles of uncertainty and outcome. To lend some additional confidence to the emerging conclusion that uncertainty does not affect recognition memory, a fourth experiment was conducted in which uncertainty was varied over a greater range (rather than just the comparison between no and maximal uncertainty). Experiment 4 also aimed to generalise the findings of Experiments 1-3 to a different paradigm that includes a variation of reward probabilities rather than magnitude of reward. This paradigm is closer to that used in human imaging studies that have shown separable midbrain responses to value and uncertainty (Preuschoff et al., 2006;Schultz et al., 2008;Tobler et al., 2007).
Recording of brain activity in primates and humans suggests that the activity of a number of brain regions varies as a function of different aspects of uncertainty (Hsu et al., 2009; Preuschoff et al., 2006, 2008; Schultz et al., 2008; Tobler et al., 2005; Tom, Fox, Trepel, & Poldrack, 2007). Particularly relevant here is the finding that the variance of the probability distribution across different reward values relates to changes in activity in posterior parietal cortex (PPC) (Mohr, Biele, & Heekeren, 2010; Symmonds, Wright, Bach, & Dolan, 2011), while varying the probability of a fixed reward produces uncertainty-related activity in areas such as ventral striatum (Preuschoff et al., 2006). Symmonds et al. (2011) suggested that uncertainty-related changes in activity in the PPC might be related to another function of PPC, namely the representation of magnitude more generally. Experiments 1-3 here used a binary manipulation of the uncertainty about the size of the reward, and this may not have been effective in producing an uncertainty-related response sufficient to produce a concomitant effect on memory performance.
Given that reward-related activity in striatum is predictive of episodic memory performance (Adcock et al., 2006; Bunzeck et al., 2010; Wittmann et al., 2005), and that varying probability of reward is known to produce uncertainty-related activation in striatum (Liu et al., 2011; Preuschoff et al., 2006; Schultz et al., 2008; Tobler et al., 2005, 2007), in Experiment 4 the probability of obtaining a reward of fixed magnitude was varied. Reward probability is related to both expected value and uncertainty, and the two factors can be disentangled by parametrically varying reward probability (Preuschoff et al., 2006; Schultz et al., 2008; Tobler et al., 2007). Reward value was fixed at 20 pence and for each trial the probability of reward was visually signalled to participants; accordingly, the design is similar to other studies finding uncertainty-related activity in striatum (Preuschoff et al., 2006; Schultz et al., 2008). In order to ensure continuity with our previous experiments, we kept the timings and other aspects of the experiment as similar as possible to those in Experiments 1-3.
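To illustrate how varying probability separates the two factors, the sketch below tabulates expected value and one common index of uncertainty, the variance of the outcome, for a fixed 20p reward. The use of variance is an assumption made for illustration (the measure entered into the models may differ, for example standard deviation or entropy), the function names are ours, and the probability levels are those reported in the Procedure.

```python
REWARD_PENCE = 20                        # fixed reward magnitude
PROBABILITIES = [0.1, 0.3, 0.5, 0.7, 0.9]


def expected_value(p: float, magnitude: float = REWARD_PENCE) -> float:
    """Expected value of a reward of fixed magnitude delivered with probability p."""
    return p * magnitude


def outcome_variance(p: float, magnitude: float = REWARD_PENCE) -> float:
    """Variance of the outcome (assumed uncertainty index): p(1 - p) * magnitude^2."""
    return p * (1.0 - p) * magnitude ** 2


for p in PROBABILITIES:
    print(f"p={p:.1f}  EV={expected_value(p):5.1f}p  variance={outcome_variance(p):6.1f}")
```

Expected value rises linearly with probability, whereas the variance is symmetric and peaks at p = 0.5, so the two predictors are not collinear across the five probability levels.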

Participants
A total of 31 participants took part in our study (age range 18-36 years, mean 22.73 years, SD = 4.27; 9 males and 21 females). All participants received a minimum £3 for their time. The rest of their earnings were related to performance in the memory task. They could earn a total of £11. All participants were fluent English speakers and gave informed written consent prior to the study, which was approved by University of Bristol Ethics Committee. During one testing session the network crashed, so data from this participant was incomplete. One participant did not attend the second experimental session and so their data could not be used.

Procedure
In this experiment participants were told that for each word they could earn a reward with a given probability. The reward was fixed to 20 pence, and the probability of reward varied from 0.1 to 0.9 in increments of 0.2. There were an equal number of trials for each reward probability; however, the outcomes were calculated pseudo-randomly for each participant. This means that for each probability and reward outcome there will necessarily be a different number of trials per condition. The probability of earning a reward was illustrated by a rectangle filled with a green bar that increased in size the higher the probability of the reward. Before starting the experiment participants were told which size bar mapped to each probability. During the trial a yellow ball was dropped randomly onto the rectangle. This was done by randomly sampling from the height and width of the green area and placing the ball at that point. The outcome on each trial was determined randomly with the predicted probability. If the ball landed in the green area this indicated that a reward could be earned for correct memory performance (see Fig. 6).
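The consequence of drawing outcomes pseudo-randomly can be sketched as follows. This is a minimal illustration, not the experiment code: it assumes 20 trials per probability level (inferred from 100 trials spread equally over five levels) and an independent draw on every trial, and all names are hypothetical.

```python
import random
from collections import Counter

PROBABILITIES = [0.1, 0.3, 0.5, 0.7, 0.9]
TRIALS_PER_PROBABILITY = 20  # assumed: 100 trials / 5 probability levels


def simulate_outcomes(seed: int) -> list:
    """Return a shuffled list of (probability, rewarded) pairs for one participant."""
    rng = random.Random(seed)
    trials = []
    for p in PROBABILITIES:
        for _ in range(TRIALS_PER_PROBABILITY):
            rewarded = rng.random() < p  # independent draw with probability p
            trials.append((p, rewarded))
    rng.shuffle(trials)
    return trials


def cell_counts(trials: list) -> Counter:
    """Count trials in each probability-by-outcome cell."""
    return Counter(trials)


counts = cell_counts(simulate_outcomes(seed=1))
for p in PROBABILITIES:
    print(p, "rewarded:", counts[(p, True)], "unrewarded:", counts[(p, False)])
```

Each probability level always contributes the same number of trials, but the split between rewarded and unrewarded trials varies by participant, and cells such as a reward at p = 0.1 contain very few trials.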
In each trial participants were presented with a green bar indicating the probability of earning a reward (1500 ms) followed by the target word (4000 ms). To ensure that the words presented were encoded, participants were required to make a size judgment about the word. They were asked to judge if the object was smaller or larger than a football. They used the left and right arrow keys (and their right index and middle fingers) to input their response and the word changed from black to white to show that their response had been registered. The green bar indicating probability of reward was shown again (500 ms) before a yellow ball dropped into the rectangle and either landed in the grey area (no reward) or in the green area (reward) (1500 ms).

Fig. 5. Recognition memory performance in Experiment 3 as a function of the certainty of the cue (certain vs. uncertain) and the reward outcome. Error bars show within-subject SEM, calculated using the method in Morey (2008). The symbols illustrate the predicted values for the best fitting model (Outcome).

Table 5
Experiment 3: The first column lists each of the models we tested; the best model, with the lowest BIC value, is highlighted in bold. Each of the models (M) was compared to the best model and the third column shows the BF comparisons.
There was an inter-stimulus interval of 1500 ms between each trial during which a fixation cross was displayed in the centre of the screen. Participants completed a block of 12 practice trials before starting the learning phase of the experiment. The experimental session was then run as three blocks, with a total of 100 trials and a 15 s break between each block. The recognition test followed the same procedure as in the previous experiments.
Fig. 7 shows memory accuracy as a function of reward probability and reward outcome (note that different numbers of data points contribute to the different cells). Memory performance was higher for outcomes resulting in a reward compared to no reward. There also appears to be some effect of unexpected outcomes on memory performance. Memory performance is higher when the reward outcome is unexpected: a reward when the probability is low (p = 0.1) and the absence of a reward when the probability is high (p = 0.9). Again, note that the number of trials where this occurs for each participant is very low. This effect will be tested in our modelling by including a predictor "surprisal", as described below.

Results
The factors of interest for the logistic regression were expected reward value, reward outcome, prediction error and uncertainty of reward. In addition, the design of Experiment 4 allowed us to examine an additional predictor: surprisal. Surprisal can be calculated as $-\log_2(P)$, where $P$ is the prior probability of the event occurring. Surprisal is particularly relevant here as it represents an interaction between reward outcome and expected value. That is, when the outcome is positive, but has a low prior probability (e.g. 0.1), surprisal is high ($-\log_2(0.1) = 3.32$). When the outcome is positive and has a high prior probability (e.g. 0.9), surprisal is low ($-\log_2(0.9) = 0.15$). However, when the prior probability is high (e.g. 0.9), the probability of not getting a reward is low and surprisal is high again ($-\log_2(1 - 0.9) = 3.32$). As seen in Fig. 7, memory performance appears to be relatively enhanced for those points representing low-probability outcomes. The inclusion of surprisal in our model allows for a statistical evaluation of this pattern. Table 6 gives the modelling results for Experiment 4. A model containing only reward outcome as a predictor was most favoured by the data. There was strong evidence against adding the factor surprisal, $BF_{O,\,O\&S} = 19.81$, and very strong evidence against all other models. Most relevant here is that the model including only outcome was strongly supported over the model that contained uncertainty as an additional predictor (Uncertainty + Outcome), $BF_{O,\,O\&Un} = 44.08$.
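The surprisal values quoted above can be reproduced with a few lines of code. The sketch follows the definition in the text (the negative base-2 logarithm of the prior probability of the observed event); the function name and argument names are ours.

```python
import math


def surprisal(p_reward: float, rewarded: bool) -> float:
    """Surprisal in bits of the observed outcome, given the cued reward probability."""
    # Probability of the event that actually occurred on this trial.
    p_event = p_reward if rewarded else 1.0 - p_reward
    return -math.log2(p_event)


print(round(surprisal(0.1, rewarded=True), 2))   # 3.32: unlikely reward
print(round(surprisal(0.9, rewarded=True), 2))   # 0.15: likely reward
print(round(surprisal(0.9, rewarded=False), 2))  # 3.32: unlikely omission
```

Surprisal is therefore high both for a reward under a low cued probability and for an omission under a high cued probability, which is the pattern the predictor was intended to capture.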

Discussion
Our results indicate that reward outcome is the best single predictor of memory performance. Fig. 7 suggested that memory may be improved for rewarded items when the reward probability is low, and non-rewarded items when the reward probability is high. To allow the model to capture this possible effect, we added a new predictor of surprisal to our models. We found some evidence against an effect of surprisal, meaning that memory was not particularly good for unexpected outcomes. Our regression analysis therefore suggests that the apparent boost in memory performance for unexpected outcomes is either very small and not worth the additional model complexity of an extra parameter or is due to noise as these means are, by definition, based on fewer data points.

General discussion
The experiments presented in this paper aimed to identify which properties of reward contribute to the reward-related enhancements observed in motivated learning (Adcock et al., 2006; Bunzeck et al., 2010; Castel, 2007; Mather & Schoeke, 2011; Murayama & Kuhbandner, 2011; Murayama & Kitagami, 2014; Wittmann et al., 2007, 2005). In particular, we focused on the previously unaddressed role of reward uncertainty. We designed this series of experiments to disentangle the role of some of the properties of the reward signal: expected value, linked to phasic dopamine, and reward uncertainty, linked to tonic dopamine (Schultz et al., 2008). We were also able to assess the contribution of other factors including prediction error, reward outcome and surprisal.
Across all of our experiments participants were required to learn a series of words, in exchange for monetary incentives. In Experiment 1 we used certain and uncertain reward cues. There was no additional effect of uncertainty on memory performance, beyond expected value. A potential issue with the design of this experiment was that we did not reveal a reward outcome, meaning that the uncertain cue may not have been treated as such by the participants. Participants may have interpreted the uncertain reward cue as equivalent to the certain 10p cue, or may have mentally simulated an outcome. In order to address this issue, we introduced outcomes to the reward cues to the design for both Experiments 2 and 3. In Experiment 4 we manipulated uncertainty and expected value through a variation of reward probability, keeping reward magnitude constant. Across all these experiments we consistently found evidence in favour of reward outcome. In Experiments 1, 3 and 4 we found evidence against an effect of reward uncertainty. Experiment 2 returned ambiguous information on the role of uncertainty: the best fitting model included both uncertainty and outcome as predictors, but this model was essentially indistinguishable in penalised fit from a model including only outcome as a predictor. In aggregate, the results point against an effect of uncertainty on episodic memory.
Fig. 6. Trial sequence for study phase in Experiment 4. The bar initially represents the probability that the upcoming word will be rewarded at recognition. The yellow ball drops in the grey area to indicate that there was no reward on the current trial.

One caveat is that model selection based on the BIC punishes models that are more complex (i.e. have more parameters) (Rouder, Morey, Verhagen, Province, & Wagenmakers, 2016). If the effect is very small, a very large amount of data will be needed to find evidence for that effect. This is because the alternative model is more complex and flexible, and incurs a penalty for this additional complexity. If the effect is so small that a null model provides a reasonable fit to the data, a very large amount of data will be needed to overcome the penalty associated with the more complex model. This means that the most generous conclusion for the reward uncertainty effect here is that the effect is very small, and substantially more data would be needed to detect the effect (Wagenmakers, 2007). The parameter estimates for each of the models run across all four experiments can be seen in the appendix. It is worth noting that across the four experiments, when looking at the best-fitting model that includes uncertainty, the parameter estimates for uncertainty are just as often negative as they are positive.
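The complexity penalty discussed in this caveat can be seen directly in the standard definition of the BIC, where $\hat{L}$ is the maximised likelihood, $k$ the number of free parameters and $n$ the number of observations. The symbols are ours, and how $n$ is counted in a mixed-effects analysis is an assumption we leave open.

```latex
\begin{align*}
\mathrm{BIC} &= -2 \ln \hat{L} + k \ln n \\
\mathrm{BIC}_1 - \mathrm{BIC}_0 &= -2\left(\ln \hat{L}_1 - \ln \hat{L}_0\right) + (k_1 - k_0) \ln n
\end{align*}
```

A model with one additional predictor such as uncertainty ($k_1 - k_0 = 1$) is therefore preferred only if it improves the fit by $2(\ln \hat{L}_1 - \ln \hat{L}_0) > \ln n$, which a very small effect achieves only with a very large amount of data.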
We had prior theoretical reasons to expect a relationship between reward uncertainty at encoding and subsequent memory performance. Dopaminergic neurons in the midbrain are thought to exhibit two patterns of firing. The first, known as the phasic bursts, are transient responses to reward cues and outcomes. The second signal, tonic firing, refers to sustained activity in response to anticipation and expectancy. This tonic firing has been linked to encoding of reward uncertainty (Hsu et al., 2009; Liu et al., 2011; Preuschoff et al., 2006, 2008; Tobler et al., 2005, 2007). Evidence from imaging studies has shown that dopaminoceptive areas respond to reward uncertainty (Preuschoff et al., 2006; Tobler et al., 2007), and that this is connected to hippocampal activity (Shohamy & Adcock, 2010). We found no behavioural evidence for such a relationship. Instead, our results are consistent with a model in which the phasic bursts of activity are the primary predictors of memory performance. In particular we find that reward outcome or, equivalently, a combination of expected value and prediction error, is the best predictor of memory.
One reason for specifically looking at motivated learning is the potential applied benefits of reward in educational settings. Research has found that game-like learning environments that contain reward uncertainty lead to better learning than those where rewards are fixed. However, previous studies did not dissociate uncertainty from other reward factors such as expected value (Howard-Jones et al., 2011; Ozcelik et al., 2013). It is possible that reward uncertainty makes the environment as a whole more engaging. Most studies in the reward learning literature deliver rewards with a probability of about 80% in order to keep participants engaged (Wittmann et al., 2011). However, our results suggest that reward uncertainty does not influence memory on an item by item basis, and suggest that any apparent classroom benefits of reward uncertainty may require alternative explanations (Devonshire et al., 2014). However, our study is the first in this area to examine the independent role of reward uncertainty. Although we find no evidence to support its role in promoting memory, this does not rule out conducting controlled experiments in a classroom setting as the uncertainty linked to the reward environment (and therefore acting over longer timescales) may be more strongly represented in the classroom environment than in a lab-based setting.
One obvious explanation for the lack of effect of uncertainty is that our manipulation may not have induced a state of uncertainty at the time of encoding. Across the four experiments we tested two types of uncertainty. In Experiment 1 we looked at uncertainty that persisted for the duration of encoding and recall and in Experiments 2-4 uncertainty was resolved after encoding but before recall (this could have led to post-encoding and rehearsal effects of reward outcome which we will discuss later). Given that our experimental design did not include an independent measure of uncertainty, we cannot rule this explanation out. However, there are good reasons to assume that varying levels of uncertainty were induced in our paradigms. Neuroimaging studies, on which our paradigms were based, have reported uncertainty-related activation in dopaminoceptive areas of the midbrain (Cooper & Knutson, 2008; Preuschoff et al., 2006, 2008; Tobler et al., 2007). Our experiments use a very similar manipulation to the above studies. In all of our studies, as is common in the literature, the initial reward cue indicated expected value of reward and the degree of uncertainty associated with that reward. In Experiments 1-3 reward uncertainty was maximal, whereas in Experiment 4 the level of reward uncertainty varied across trials as a function of reward probability in a similar manner to the task used by Preuschoff et al. (2006). Although the reward outcomes in our experiments were anticipatory (i.e. delivered the next day and dependent upon correct recognition performance), there is neuroimaging evidence suggesting that midbrain dopaminergic neurons show a robust signal to anticipatory rewards and this signal is linked to memory performance (Adcock et al., 2006).
While there are some methodological differences regarding how the probability of reward was conveyed in our studies and those looking at reward-related uncertainty, these are relatively small and have to do with the nature of the cues and communication of reward outcomes. The stimulus differences should be considered relatively incidental and if the neural uncertainty signal depends on such incidental features of the design, it would suggest that the neural response to uncertainty is rather fragile. It should, however, be noted that we manipulated uncertainty within the context of small financial gains in motivated learning; therefore our results can only be interpreted within this context. One could argue that uncertainty may differentially affect incidental and motivated learning and that there could be an effect of reward uncertainty on memory when the conditions of learning do not allow for strategic learning to take place. In this vein, it is important to highlight that the mechanisms affecting motivated and incidental learning are likely to be different as motivated learning involves the complex interplay between strategic memory encoding and dopaminergic consolidation. Recent research has begun to manipulate the degree to which participants can apply effective encoding strategies in order to dissociate the two learning mechanisms (Cohen, Rissman, Hovhannisyan, Castel, & Knowlton, 2017; Spaniol et al., 2013). There is also the possibility that people could show sensitivity to reward uncertainty in the domain of gains and losses (Tversky & Kahneman, 1981). Recent research in the decision-making literature has also demonstrated that people show better memory for the extreme outcome in both risky gains and risky losses (Madan, Ludvig, & Spetch, 2014).
So although we do not see evidence of an effect of reward uncertainty on memory with respect to small gains, it is possible that we would see an interaction between uncertainty and outcome with respect to losses or larger gambles.
In addition to dopaminergic consolidation effects on memory, strategic value-directed learning processes contribute to better encoding of and memory for items with higher outcomes. Individuals may allocate more time and resources at encoding to items linked to a higher monetary value (Ariel & Castel, 2014; Castel et al., 2002; Loftus & Wickens, 1970). The results from our experiments concerning reward outcome could be explained by strategic influences of value on memory, either in the form of enhanced processing of high-value items (Castel, Murayama, Friedman, McGillivray, & Link, 2013; Cohen et al., 2014; Cohen, Rissman, Suthana, Castel, & Knowlton, 2015) or directed forgetting of low-value items (Fawcett & Taylor, 2008; Friedman & Castel, 2011; Hayes, Kelly, & Smith, 2013; Lehman & Malmberg, 2009; Wylie, Foxe, & Taylor, 2008). These effects may occur alongside dopaminergic consolidation and the two processes could serve to strengthen each other. For example, the consolidation process itself could be strengthened by strategic influences, and these strategic influences (i.e. signalling an item as "high-value") may be mediated by an enhanced dopamine response. Friedman and Castel (2011) found that participants were able to predict accurately which items they would remember and forget, and this was directly linked to the item's value. It has also been suggested that memory selectivity could occur due to differences in semantic processing of high and low value words (Cohen et al., 2014, 2015). This is supported by recent findings from fMRI studies showing differences in activity in the fronto-temporal network, associated with semantic processing, during processing of high and low value words (Cohen et al., 2014, 2015).
Overall, the results from our experiments add weight to findings from the incidental learning literature (Bunzeck et al., 2010; Mason et al., 2016; Mather & Schoeke, 2011; Murayama & Kitagami, 2014; Wittmann et al., 2011) that stress the importance of reward outcomes and are consistent with findings from value-directed learning where a range of reward values are used (Castel, 2007; Cohen et al., 2014, 2015). Our findings further highlight the need for memory models to explain the processes at work for post-encoding memory enhancements across a range of reward outcomes. The post-encoding effect of reward outcome can be explained as the removal or suppression of items within the framework of several existing memory models, usually focused on directed forgetting (Malmberg, 2006; Norman, Newman, & Detre, 2007; Oberauer et al., 2012). In the framework of the Search of Associative Memory (SAM) model, Malmberg (2006) suggested that directed forgetting is accomplished by shifting the context with which new memories are associated. By shifting the context of to-be-forgotten items away from the context used to cue for items at test, those items will be less activated at recall and are less likely to be recalled. In the case of item directed forgetting, where the "forget" cue is presented after each item to be forgotten, it is assumed that people shift attention away from items to be forgotten by thinking about recently presented items to be remembered, thus giving those items additional rehearsal and making them more likely to be retrieved at a later time. Accordingly, people might respond to reward values here by treating high reward as a "remember" cue, and a low/absent reward as a "forget" cue. Other models assume that forgetting occurs due to the synaptic weakening of memory traces for items.
For example, according to a neural network model of retrieval-induced forgetting (Norman et al., 2007), items in episodic memory tasks are forgotten due to the weakening of memory traces in the hippocampal layer of the network (Anderson, 2003; Norman et al., 2007). Similarly, the forgetting of the unrewarded or lower reward items could occur by a process of unlearning. Models such as SOB-CS (Oberauer et al., 2012) assume that irrelevant information is removed from memory by unlearning the association between that information and the current context in memory. Although a model of working memory tasks, SOB-CS could be extended to explain episodic memory tasks (see e.g., Farrell, 2012), including the effects of reward seen here.
Not all items can be promoted in memory, but rewards potentially serve the purpose of promoting memory for important items. Rewards do not simply enhance memory for all items; instead, some items are prioritised over others (Castel, 2007; Castel et al., 2013). Our series of experiments suggests that reward outcome is the over-riding factor that leads to items being selectively enhanced in memory.