Retrieving knowledge is not a neutral event: Every act of retrieval modifies the state of memory (Bjork, 1975). Most research on the mnemonic effects of retrieval has focused on the effects of successful retrieval, and this research has consistently shown that repeated retrieval directly enhances the subsequent retrievability of knowledge (e.g., Karpicke & Blunt, 2011; Karpicke & Roediger, 2007, 2008; Pyc & Rawson, 2009). In addition, other research has suggested that unsuccessful retrieval attempts can enhance learning by improving the encoding of unrecalled items during subsequent study trials (e.g., Izawa, 1970; Karpicke, 2009; Karpicke & Roediger, 2007; Kornell, Hays, & Bjork, 2009; Pastötter, Schicker, Niedernhuber, & Bäuml, 2011; Richland, Kornell, & Kao, 2009; Slamecka & Fevreiski, 1983; Wissman, Rawson, & Pyc, 2011). This article explores the circumstances under which unsuccessful retrieval attempts improve subsequent encoding and enhance learning.

Recently, Kornell et al. (2009) revived a paradigm originally developed by Slamecka and Fevreiski (1983) to examine the effects of failed retrieval attempts on subsequent encoding. In this procedure, subjects learned weakly associated word pairs (e.g., tide–beach). Some items were “pretested” immediately before each study trial, while others were only studied. For pretested items, subjects were given the cue word and told to guess the target word (tide–?) immediately prior to studying the target (tide–beach). Because the target words were weakly related to the cues, subjects almost always failed to guess the correct target word and typically produced a different related word (e.g., wave). Thus, the procedure was designed to ensure high rates of retrieval failure prior to the encoding of the correct target. On a final cued recall test, subjects recalled a greater proportion of the pretested items than of the studied items. Therefore, the failed retrieval attempts seemed to have enhanced the subsequent encoding of the pretested items.

Kornell et al. (2009) mentioned three theoretical ideas, which we have elaborated and defined here, that might explain why failed retrieval attempts would enhance subsequent encoding. According to the first theory, which we refer to as a search set theory, the presentation of a cue on the pretest (tide–?) initiates a search process wherein related candidates become activated. The activation of candidates during retrieval enhances the encoding of those candidates when they are subsequently presented as target words to study (tide–beach). The idea that retrieval activates related candidates corresponds with theories of retrieval such as SAM (Gillund & Shiffrin, 1984; Raaijmakers & Shiffrin, 1981) and MINERVA2 (Hintzman, 1986, 1988). According to these models, cues present at retrieval are used to probe memory and to activate relevant traces in parallel based on similarity to the cue information. These active traces form the search set, and any one trace is probabilistically sampled from the set and recovered in order to make a response. Although a strongly associated trace (e.g., wave) may be retrieved and produced as a response, other candidates (e.g., beach, surf, ocean, etc.) receive some activation. Thus, according to the search set theory, failed retrieval attempts may enhance learning because activation of the search set facilitates encoding of the target.
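
As a rough illustration of the sampling idea, the following Python sketch treats the search set as a handful of candidates with activation strengths and samples one response with probability proportional to its strength (a ratio rule in the spirit of SAM). The strength values, the candidate list, and the function names are hypothetical; they are not taken from the association norms or from the published parameters of either model.

```python
import random

# Hypothetical cue-to-candidate strengths for the cue "tide"; illustrative only.
SEARCH_SET = {"wave": 0.30, "ocean": 0.15, "surf": 0.08, "beach": 0.05}


def sampling_probabilities(strengths):
    """Ratio rule: each candidate's sampling probability is its share of total strength."""
    total = sum(strengths.values())
    return {word: s / total for word, s in strengths.items()}


def sample_response(strengths, rng):
    """Sample one candidate from the search set in proportion to its strength."""
    words = list(strengths)
    weights = [strengths[w] for w in words]
    return rng.choices(words, weights=weights, k=1)[0]


probs = sampling_probabilities(SEARCH_SET)
rng = random.Random(0)  # seeded only so the sketch is repeatable
response = sample_response(SEARCH_SET, rng)
```

In this sketch every candidate, including the weakly associated target beach, has a nonzero sampling probability even when a stronger candidate is produced, which is the sense in which the whole set is treated as active.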

The second theory, which we refer to as an error correction theory, is that failed retrieval attempts enhance learning through a general error correction process (Carrier & Pashler, 1992; Kang et al., 2011; McClelland & Rumelhart, 1985). When subjects produce a response and then are shown the correct target, any discrepancy between the correct target and a subject’s response would produce an error signal, and a general error correction mechanism would guide adjustment of connection weights among nodes in the system in favor of the target answer. In general, the amount of learning that occurs during error correction corresponds to the size of the discrepancy between the response made and the desired response. The idea that learning is driven by the degree of discrepancy between what is expected and what actually occurs is foundational to several learning theories (Rescorla & Wagner, 1972; Rumelhart, Hinton, & McClelland, 1986). Thus, failed retrieval attempts may enhance learning by virtue of general error correction mechanisms that are sensitive to the discrepancy between the outcome of a retrieval attempt and the subsequent study event.
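
The discrepancy-driven idea can be made concrete with a Rescorla–Wagner-style delta rule, in which the change in associative strength is proportional to the difference between the outcome and the current prediction. This is a minimal sketch rather than the specific mechanism of any of the cited models; the learning rate and the starting strengths are assumed values chosen for illustration.

```python
def delta_update(strength, outcome=1.0, learning_rate=0.3):
    """One delta-rule step: the change is proportional to (outcome - current strength)."""
    error = outcome - strength
    return strength + learning_rate * error


# Assumed starting cue-target strengths: weak for one pair, zero for another.
weak_before, zero_before = 0.05, 0.0
weak_gain = delta_update(weak_before) - weak_before
zero_gain = delta_update(zero_before) - zero_before
```

Under these assumptions the pair that starts with no association shows the larger update, because its discrepancy from the outcome is larger.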

Finally, the third theory, which we will call an additional-cue theory, is that the response that a subject produces during an initial failed retrieval attempt is covertly recalled during a future retrieval attempt, functioning as an additional cue that aids in retrieval of the target item (Soraci et al., 1994). For example, if subjects were given tide–? and produced wave as a response before they studied the pair tide–beach, then wave would be encoded along with the target pair and could potentially serve as a retrieval cue in the future. On a final test, when subjects are given tide–?, they may implicitly recall wave as an additional retrieval cue for beach. Therefore, failed initial retrieval attempts may enhance learning because subjects have two cues to aid recovery of target items. Carpenter (2009, 2011) has recently proposed a similar idea, that during retrieval subjects covertly generate many potential words and that those words can serve as “mediators” for the target word (see also Pyc & Rawson, 2010). Here, we focus only on the one word that is explicitly generated.

In the three experiments reported here, we examined the conditions under which retrieval attempts enhance learning, and tested predictions derived from the theories outlined above. The three theoretical ideas need not be mutually exclusive, but the theories lead to specific predictions that were examined in the following experiments.

Experiment 1

Experiment 1 used the procedure developed by Kornell et al. (2009, Exps. 3–6; see also Slamecka & Fevreiski, 1983), which was shown to create high rates of failure during initial retrieval attempts. In addition to the pretesting manipulation, the other key manipulation in Experiment 1 was the relatedness of the cue–target word pairs. Subjects learned a mixed list containing 30 related word pairs (e.g., tide–beach) and 30 unrelated word pairs (e.g., pillow–leaf). In the no-pretest condition, subjects simply studied each pair one at a time. In the pretest condition, subjects were shown the cue word of each pair (e.g., tide–? or pillow–?) and attempted to retrieve a target item immediately before they studied the correct cue–target pair. Retention was assessed on a final cued recall test over all items.

The search set theory proposes that semantically related candidates become activated during a retrieval attempt. When one of these related words is presented as a target on the subsequent study trial, the residual activation from the retrieval attempt contributes to the enhanced encoding of the target word. It is unlikely that this enhanced encoding would occur if subjects were to study unrelated words during study trials, because in this case, the initial search set of candidates would not include the target. Thus, attempting retrieval during a pretest should enhance encoding of related cue–target word pairs but should not enhance the encoding of unrelated pairs.

Both the error correction and additional-cue theories lead to predictions that differ from the one derived from the search set theory. According to an error correction theory, an error signal would be produced during the encoding of both related and unrelated word pairs, and thus learning should be enhanced for both item types. Moreover, it is reasonable to assume that the magnitude of the discrepancy between the subject’s response and the correct target would be greater for unrelated than for related pairs. Given that the degree of learning may be related to the degree of this discrepancy (Rescorla & Wagner, 1972), the mnemonic benefit of attempting retrieval prior to study may be greater for unrelated word pairs than it would be for related word pairs.

The idea behind an additional-cue theory is that words produced during a retrieval attempt are encoded as additional retrieval cues. Cues that are generated by subjects are typically more effective than are experimenter-provided cues (Arndt & Jones, 2008; Karpicke & Cross, 2011; Mäntylä, 1986). If subjects can retrieve additional cues at final test, then they could use both experimenter-provided cues and their own self-generated cues to retrieve the target items, and it is reasonable to assume that using both cues would be more effective than using the experimenter-provided cue alone. Thus, like the general error correction theory, an additional-cue theory would predict that because subjects generate an additional cue during a retrieval attempt, performance should be enhanced for both unrelated and related words.

Method

Subjects

Thirty-two Purdue University undergraduates participated in exchange for course credit. All subjects indicated that they were fluent in written and spoken English.

Materials

A list of 60 word pairs (containing 120 medium-frequency, medium-concreteness words) was used in the experiment. The word pairs were divided into two sets that were matched on frequency and concreteness. One set (30 of the word pairs) included associatively related word pairs (e.g., tide–beach, jelly–bread, kite–wind), and the other set included unrelated pairs (e.g., stem–candy, pillow–leaf, spray–bone). The forward associative strength of the related pairs was between .050 and .054, based on Nelson, McEvoy, and Schreiber’s (1998) norms. Forward associative strength represents the likelihood that a word will be the first word generated on a free-association test; therefore, the related target words had about a 5% chance of being correctly guessed by subjects during pretest trials. For unrelated pairs, there was no associative relationship between the words, according to the Nelson et al. norms.

Design

The experiment used a 2 (pair relatedness: related or unrelated) × 2 (learning condition: pretest or no-pretest) mixed factorial design. Pair relatedness was manipulated within subjects, and learning condition was manipulated between subjects. Sixteen subjects were assigned to each condition.

Procedure

The subjects were tested in small groups. They were told that they would learn a list of word pairs and take a memory test on the pairs. Subjects in the no-pretest condition studied the list of word pairs (e.g., tide–beach) in a study phase. The word pairs were shown on a computer screen in a random order, and each word pair was presented for 5 s with a 500-ms intertrial interval. Subjects in the pretest condition saw each cue word (tide–?) on the screen for 7 s prior to each study trial. They were instructed to attempt to guess the target word and to type their response into an input field shown on the screen (following Kornell et al.’s, 2009, instructions). The subjects were encouraged to provide a response for every pair. After each pretest trial, subjects immediately studied the correct cue–target pair (tide–beach) for 5 s with a 500-ms intertrial interval.

After the study phase, subjects completed a distractor task that involved solving math problems for 5 min. They then took a cued recall test on the word pairs. On each test trial, subjects were shown a cue word and a blank text input field for 7 s, and they were instructed to recall and type the target word they had studied in the study phase. The cue words were presented in a random order in the test phase. Subjects were told to avoid guessing and to respond only with words they remembered studying, not words they had produced. At the end of the experiment, the subjects were debriefed and thanked for their participation.

Results and discussion

Figure 1 shows the proportions of words recalled on the final cued recall test. For all statistical tests, alpha was set to .05. To ensure that these analyses included only items from initial failed retrieval attempts, items that were correctly guessed during the pretest trials were excluded from the analyses. On average, subjects correctly guessed the target answer for 6% of the related pairs, a rate that is similar to the rates of correct guessing reported by Kornell et al. (2009). For unrelated pairs, the target responses were never correctly guessed during pretest trials. The removal of correctly guessed pairs from the analyses did not affect the pattern of results.
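
The exclusion rule described above can be sketched as a small scoring routine. The record layout (the field names target, pretest_response, and final_response) and the example trials are hypothetical; the scoring software actually used is not described in the text.

```python
def proportion_recalled(trials):
    """Proportion of targets recalled, excluding items guessed correctly on the pretest."""
    kept = [t for t in trials if t["pretest_response"] != t["target"]]
    if not kept:
        return None  # no failed retrieval attempts to score
    recalled = sum(t["final_response"] == t["target"] for t in kept)
    return recalled / len(kept)


# Made-up trials for illustration only.
trials = [
    {"target": "beach", "pretest_response": "wave", "final_response": "beach"},
    {"target": "bread", "pretest_response": "bread", "final_response": "bread"},  # excluded
    {"target": "wind", "pretest_response": "fly", "final_response": ""},
]
score = proportion_recalled(trials)
```

Here the second item is dropped because the target was guessed on the pretest, so the score is based on the two remaining items.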

Fig. 1 Proportions of words recalled on the final cued recall test in Experiment 1. Error bars represent standard errors of the means

The final test results were analyzed using a 2 (learning condition) × 2 (pair relatedness) mixed factorial ANOVA, with learning condition as a between-subjects factor and pair relatedness as a within-subjects factor. There was a main effect of pair relatedness, F(1, 30) = 180.26, ηp² = .86, but the main effect of learning condition did not reach significance, F(1, 30) = 2.14, p = .15. However, there was a significant learning condition × pair relatedness interaction, F(1, 30) = 7.67, ηp² = .20. For unrelated pairs, there was no difference between the pretest and no-pretest conditions (.24 vs. .24), t(30) < 1. For related pairs, the pretest condition produced significantly greater recall relative to the no-pretest condition (.73 vs. .56), t(30) = 2.78, d = 0.94. Thus, attempting retrieval prior to each study trial enhanced subsequent recall of related word pairs but not of unrelated word pairs.
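
For readers checking the effect sizes, partial eta squared can be recovered from an F ratio and its degrees of freedom with the standard identity (F × df1) / (F × df1 + df2). The sketch below applies it to the two significant F values reported above. The function name is illustrative, and because reported F values are rounded, values recomputed this way may differ in the last digit.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared from an F ratio: (F * df1) / (F * df1 + df2)."""
    scaled = f_value * df_effect
    return scaled / (scaled + df_error)


relatedness = partial_eta_squared(180.26, 1, 30)  # main effect of pair relatedness
interaction = partial_eta_squared(7.67, 1, 30)    # learning condition x pair relatedness
```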

The results of Experiment 1 replicate and extend the results reported by Kornell et al. (2009). Subjects who attempted retrieval prior to each study trial recalled more words on the final test than did subjects who did not attempt retrieval. However, this effect only occurred when the cue and target words were related, and it did not occur when the words were unrelated. The results are consistent with the search set theory described at the outset, which holds that related candidates are activated when the subject establishes a search set during a retrieval attempt. Because semantic relatedness is used to establish the search set, attempting retrieval enhances the subsequent encoding of related but not of unrelated words. The lack of a pretesting effect for unrelated words is more difficult to reconcile with predictions from both an error correction and an additional-cue theory. Both theories would suggest that attempting retrieval would enhance encoding of unrelated word pairs, either due to a general error correction mechanism or by virtue of producing additional retrieval cues.

Experiment 2

The purpose of Experiment 2 was to identify aspects of the retrieval processes that are responsible for enhancing subsequent learning. One question was whether the process of retrieval itself matters for the effects observed in Experiment 1. It may be the case that simply encoding another related word with the cue–target pair facilitates recall, and a retrieval attempt simply provides this additional encoding opportunity. This idea would be consistent with an additional-cue theory, because the locus of the effect might not be the process of retrieval but rather the product of retrieval (i.e., the presence of an additional cue at encoding). It would also be consistent with associative network models of memory, wherein activation spreads automatically from one concept to related concepts (e.g., Collins & Loftus, 1975). The presence of an additional word (like wave) may produce spreading activation to related target words (like beach), thereby enhancing encoding of the target words, and this may occur regardless of whether the additional word was studied or retrieved. To examine this possibility, in Experiment 2 we included a study lure condition, which was closely matched to the pretest condition but did not involve retrieving a word prior to each study trial. During the study lure trial, subjects studied the cue and a related but incorrect lure word, which was the most frequently produced word based on data from Experiment 1 and from another pilot study (e.g., subjects studied tide–wave immediately before studying the target pair tide–beach, because wave was the word most frequently produced to tide in our normative data). If the presence of an additional cue is responsible for the pretesting effect, and if engaging in a retrieval attempt is irrelevant, then the study lure condition should enhance learning in the same way as the pretesting condition.

The second purpose of Experiment 2 was to examine the search set theory by manipulating the nature of the search process that subjects engage in during retrieval attempts. If a set of candidates becomes activated during a retrieval attempt, then constraining the search set to a particular candidate should limit the activation accrued by other candidates and reduce the benefits of attempting retrieval for subsequent encoding. To examine this idea, Experiment 2 included a constrained-pretest condition in which subjects were given the stem of a lure word to be generated during pretest trials (e.g., tide–wa_ _). The search set theory predicts that the pretesting effect should be reduced or eliminated in the constrained-pretest condition. Importantly, the error correction and additional-cue theories predict that a constrained retrieval attempt would enhance encoding just as an unconstrained retrieval attempt would, because an error signal or an additional cue would be produced even under constrained-retrieval conditions.

Method

Subjects

One hundred twenty Purdue University undergraduates participated in exchange for course credit. None of the students had participated in Experiment 1.

Materials

The materials used in Experiment 1 were used again in Experiment 2. Lure words or the stems of lure words were presented in the study lure and constrained-pretest conditions, respectively. The lure words used in Experiment 2 were the responses produced most frequently to each cue word in the data from Experiment 1 and from an additional pilot study. The lure words selected for Experiment 2 had been produced 22% of the time, on average, in those two previous studies.

Design

The experiment used a 2 (pair relatedness: related or unrelated) × 4 (learning condition: pretest, constrained-pretest, study lure, or no-pretest) mixed factorial design. Pair relatedness was manipulated within subjects, and learning condition was manipulated between subjects. Thirty subjects were assigned to each condition.

Procedure

The procedure used in Experiment 2 was similar to that used in Experiment 1. The pretest and no-pretest conditions were identical to those used in Experiment 1. Subjects in the constrained-pretest condition were given the cue and a word stem of the lure word (e.g., tide–wa_ _). The word stems always included the first two letters of the lure and the appropriate number of blanks for the remaining letters in the word. Subjects were instructed to complete the word stem by typing the entire word (e.g., typing wave, not ve) into an input field. They were also told that the word was often the first word that would come to mind when thinking of the cue word. Subjects were given 7 s to complete the stem prior to each study trial. Subjects in the study lure condition studied the cue and the lure word (tide–wave) for 7 s prior to each study trial. Subjects were informed that the first pair was an incorrect answer and that the second pair was the correct target answer. Subjects in both the constrained-pretest and study lure conditions were informed at the beginning of the experiment that they would be tested on their memory for the correct second word, not the stem word or lure word. In all other respects, the procedure was identical to the one used in Experiment 1.
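
The stem construction rule (first two letters plus one blank per remaining letter) can be sketched as follows. The function names are hypothetical, the blank spacing follows the printed example rather than any description of the on-screen display, and the response check is an assumed convenience, not a scoring rule stated in the text.

```python
def make_stem(lure, shown=2):
    """Show the first `shown` letters, then one blank per remaining letter."""
    blanks = " ".join("_" * (len(lure) - shown))
    return lure[:shown] + blanks


def fits_stem(response, lure, shown=2):
    """True if a typed response starts with the shown letters and has the lure's length."""
    response = response.strip().lower()
    return response.startswith(lure[:shown]) and len(response) == len(lure)


stem = make_stem("wave")
```

A response can fit the stem without being the intended lure (e.g., wand also fits the stem for wave), so a stem constrains the search without determining a single response.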

Results and discussion

Figure 2 shows the proportions of words recalled on the final cued recall test. As in Experiment 1, only items that were not correctly guessed during pretest trials were included in the analyses. For the related pairs, subjects in the pretest condition correctly guessed 8% of the targets on the pretest, and subjects in the constrained-pretest condition correctly guessed 1% of the targets on the pretest. For the unrelated pairs, target responses were never correctly guessed during pretest trials. The removal of correctly guessed pairs from the analyses did not affect the pattern of results. For completeness, we also computed the probability of producing the lure word during the initial pretest trials. Subjects in the pretest condition produced the lure word on 27% of trials, and subjects in the constrained-pretest condition produced the lure word on 80% of the trials.

Fig. 2 Proportions of words recalled on the final cued recall test in Experiment 2. Error bars represent standard errors of the means

The final test results were analyzed with separate one-way ANOVAs on the unrelated and related pairs. For unrelated pairs, subjects in the constrained-pretest condition recalled fewer target words (.10) than did subjects in the other conditions (.22, .19, and .22 for the no-pretest, pretest, and study lure conditions, respectively), F(3, 116) = 2.99, ηp² = .07. For related pairs, there was also a significant difference in recall among the four conditions, F(3, 116) = 10.56, ηp² = .22. Subjects in the pretest condition recalled significantly more targets than did subjects in the no-pretest condition (.68 vs. .57), t(58) = 2.42, d = 0.58, replicating the results of Experiment 1. Recall in the study lure condition was worse than recall in the no-pretest condition (.47 vs. .57), although this difference did not reach significance, t(58) = 1.62, p = .11, d = 0.41. Finally, recall in the constrained-pretest condition was significantly worse than recall in the no-pretest condition (.39 vs. .57), t(58) = 3.49, d = 0.81.

The lack of an effect in the study lure condition suggests that the process of attempting to retrieve a word, rather than the mere presence of a word as a possible additional cue, was responsible for enhancing the encoding of the target word. If spreading activation alone, as it is typically conceived, were responsible for the enhancements to encoding, then there would have been a positive effect in the study lure condition, but there was not. Perhaps the most striking finding of Experiment 2 is that, whereas engaging in an unconstrained retrieval attempt (in the pretest condition) enhanced performance relative to not attempting retrieval, engaging in a constrained retrieval attempt produced a negative effect, decreasing recall relative to not attempting retrieval. The only difference between the pretest and constrained-pretest conditions was the provision of a stem of the lure word in the constrained-pretest condition. The constrained-pretest condition guaranteed that subjects would produce an error, and the word subjects produced could function as an additional cue. Thus, the results are problematic for both general error correction and additional-cue theories.

It is easier to explain the results in terms of a search set theory. A constrained retrieval attempt likely restricted activation to a particular candidate and limited the activation of other candidates that would occur under unconstrained retrieval conditions. Because other candidates received only limited activation, the retrieval attempt provided no enhancement to the encoding of those candidates during the subsequent study trial. However, the finding that constrained retrieval attempts produced negative effects on subsequent recall, not merely neutral effects, suggests that mechanisms other than the activation of candidates within a search set must be considered to account for the results. We discuss additional possible explanations of these findings in the General Discussion.

One might wonder whether the errors from pretest trials and lures from study lure trials produced proactive interference on the final test, and whether or not proactive interference could explain the differences in recall performance. To examine this possibility, we computed the proportions of intrusion errors on the final test. This analysis only included cases in which the word produced on a pretest or constrained pretest, or studied on a study lure trial, was falsely recalled on the final test. These results are shown in Table 1. For unrelated pairs, there were significant differences in intrusions between the conditions, F(2, 87) = 6.54, ηp² = .13. The study lure condition produced more intrusions than did the pretest condition, t(58) = 4.49, d = 0.68, as did the constrained-pretest condition, t(58) = 3.87, d = 0.99. For related pairs, there were also significant differences in intrusions between the conditions, F(2, 87) = 10.01, ηp² = .19. The study lure condition produced more intrusions than the pretest condition, t(58) = 4.49, d = 1.08, as did the constrained-pretest condition, t(58) = 4.38, d = 1.08. The intrusion results suggest that more proactive interference occurred in the study lure and constrained-pretest conditions than in the pretest condition. However, subjects in the study lure condition produced more intrusions than did those in the pretest condition for unrelated words, yet the groups’ recall performance for unrelated words was equivalent. This suggests that although there was proactive interference, it was not necessarily diagnostic of cued recall performance.
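
The intrusion measure can be sketched in the same spirit. An intrusion is counted when the word produced on the pretest (or studied as a lure) is given as the final answer in place of the target. The field names and example trials are hypothetical, and the denominator (all scored items) is an assumption, because the text does not state how the proportions were normalized.

```python
def intrusion_rate(trials):
    """Proportion of items on which the earlier error or lure was given as the final answer."""
    if not trials:
        return None
    intrusions = sum(
        t["final_response"] == t["prior_word"] and t["prior_word"] != t["target"]
        for t in trials
    )
    return intrusions / len(trials)


# Made-up trials for illustration only.
trials = [
    {"target": "beach", "prior_word": "wave", "final_response": "wave"},   # intrusion
    {"target": "bread", "prior_word": "toast", "final_response": "bread"},
    {"target": "wind", "prior_word": "fly", "final_response": ""},
    {"target": "leaf", "prior_word": "sleep", "final_response": "tree"},   # error, not an intrusion
]
rate = intrusion_rate(trials)
```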

Table 1 Proportions of intrusion errors on the final test as a function of learning condition and pair relatedness in Experiments 2 and 3

Experiment 3

The purpose of Experiment 3 was to test a prediction derived from the search set theory. According to the theory, activation of candidates during the initial retrieval attempt should be relatively short lived. Raaijmakers and Shiffrin (1981) proposed that the activation of candidates within a search set was at least long enough to ensure sampling of a candidate. Similarly, research on semantic priming has shown that priming is reduced when one intervening word occurs between a prime and a target word, and some studies have observed no semantic priming when two intervening words occur (McNamara, 1992). In Experiments 1 and 2, there was no lag (no intervening words or trials) between the pretest trial and the subsequent study trial (e.g., tide–? was immediately followed by tide–beach). In Experiment 3, we examined performance in a delayed-pretest condition in which subjects experienced an entire block of pretest trials (e.g., tide–?, jelly–?, kite–?) and then studied target word pairs in a separate study block (e.g., tide–beach, jelly–bread, kite–wind). If attempting retrieval results in activation of candidates, and if this activation fades quickly, then increasing the interval between the pretest trials and study trials should eliminate the effect in the delayed-pretest condition. The error correction and additional-cue theories also lead to similar predictions, because presumably a general error correction mechanism would depend on the immediate presence of a correct response, and because an additional cue would need to occur in close temporal proximity to the target in order to establish a link between the two.

Method

Subjects

Fifty-four Purdue University undergraduates participated in exchange for course credit. None of the subjects had participated in Experiment 1 or 2.

Materials

The materials used in Experiment 1 were also used in Experiment 3.

Design

The experiment used a 2 (pair relatedness: related or unrelated) × 3 (learning condition: immediate pretest, delayed pretest, or no pretest) mixed factorial design. Pair relatedness was manipulated within subjects, and learning condition was manipulated between subjects. Eighteen subjects were assigned to each condition.

Procedure

The procedure in Experiment 3 was similar to the one used in Experiment 1. The no-pretest and immediate-pretest conditions were identical to the no-pretest and pretest conditions of Experiment 1. In the delayed-pretest condition, subjects experienced a block of pretest trials for all 60 word pairs followed by a block of study trials for all of the pairs. The order of trials was randomized within each block. Note that total exposure times were matched in the immediate-pretest and delayed-pretest conditions. The only difference between the conditions was the interval between the pretest trial and the study trial for each item.
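
The difference between the two pretest schedules can be sketched as follows, using the 7-s pretest and 5-s study durations carried over from Experiment 1. The function names and trial tuples are hypothetical, and shuffling the two blocks independently in the delayed condition is an assumption; the text states only that order was randomized within each block.

```python
import random

PRETEST_S, STUDY_S = 7, 5  # trial durations in seconds, as in Experiment 1


def immediate_schedule(pairs, rng):
    """Each pretest trial is followed at once by the study trial for the same pair."""
    order = pairs[:]
    rng.shuffle(order)
    schedule = []
    for cue, target in order:
        schedule.append(("pretest", cue, None))
        schedule.append(("study", cue, target))
    return schedule


def delayed_schedule(pairs, rng):
    """All pretest trials form one block, then all study trials form a second block."""
    pretest_order, study_order = pairs[:], pairs[:]
    rng.shuffle(pretest_order)  # assumed: each block is shuffled independently
    rng.shuffle(study_order)
    block1 = [("pretest", cue, None) for cue, _ in pretest_order]
    block2 = [("study", cue, target) for cue, target in study_order]
    return block1 + block2


def exposure_seconds(schedule):
    """Total presentation time, ignoring intertrial intervals."""
    return sum(PRETEST_S if kind == "pretest" else STUDY_S for kind, _, _ in schedule)


pairs = [("tide", "beach"), ("jelly", "bread"), ("kite", "wind")]
rng = random.Random(1)  # seeded only so the sketch is repeatable
immediate = immediate_schedule(pairs, rng)
delayed = delayed_schedule(pairs, rng)
```

Both schedules contain the same trials and the same total exposure time; only the lag between each item's pretest and its study trial differs.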

Results and discussion

Figure 3 shows the proportions of words recalled on the final cued recall test. As in Experiments 1 and 2, the results include only items that were not correctly guessed on the pretest. For related pairs, the proportions of items correctly guessed on the pretests were 6% and 5% in the immediate-pretest and delayed-pretest conditions, respectively. No unrelated pairs were correctly guessed in either pretest condition. The exclusion of correctly guessed pairs did not affect the pattern of results.

Fig. 3 Proportions of words recalled on the final cued recall test in Experiment 3. Error bars represent standard errors of the means

A one-way ANOVA on the unrelated-word-pair results did not indicate a significant difference among the conditions, F(2, 51) = 2.28, p = .11. A separate one-way ANOVA on the related-word-pair results did indicate that there was a significant difference among the conditions, F(2, 51) = 5.13, ηp² = .17. Subjects in the immediate-pretest condition recalled more words than did subjects in the no-pretest condition (.70 vs. .56), t(34) = 3.00, d = 1.08, replicating the pretest advantage observed in the previous experiments with related word pairs. However, subjects in the delayed-pretest and no-pretest conditions produced similar levels of final recall (.55 vs. .56), t(34) < 1. Thus, attempting retrieval only enhanced performance when the cue and target words were related, replicating our previous experiments and those of Kornell et al. (2009), and this enhancement only occurred when the study trial occurred immediately after the pretest trial. This result represents a boundary condition of the pretesting effect. It also supports the idea that the activation of candidates during retrieval is short lived, so that attempting retrieval does not enhance encoding when study trials are delayed relative to the pretest trials.

As in Experiment 2, we conducted an analysis of intrusion errors on the final test in order to examine the role of proactive interference. The results are shown in Table 1. For unrelated words, subjects in the delayed-pretest condition produced more intrusions than did those in the immediate-pretest condition, but this effect did not reach significance, t(34) = 1.8, p = .08, d = 0.58. For related pairs, subjects in the delayed-pretest condition again produced more intrusions than did those in the immediate-pretest condition, t(34) = 2.07, d = 0.72. Thus, the intrusion rates corresponded to the cued recall results. For unrelated pairs, the two groups had equivalent cued recall performance and equivalent intrusion rates. For related pairs, the immediate-pretest condition had greater cued recall performance and lower intrusion rates than did the delayed-pretest condition. These results suggest that when a retrieval attempt is followed immediately by encoding of a related item, proactive interference from the initial retrieval attempt is reduced. This is consistent with the search set theory: Improving the recall probability of one item in the search set also reduces the recall probability of other items in the search set (Mensink & Raaijmakers, 1988).

General discussion

In three experiments, we identified conditions under which attempting retrieval enhanced learning, had no effect on learning, and hurt learning. Relatively minor changes in the conditions of initial retrieval attempts were sufficient to bring about sizeable changes in memory performance. In the following section, we briefly review the key findings of these experiments and interpret the results in light of the theoretical explanations discussed throughout this article.

First, attempting retrieval enhanced the subsequent learning of related target words but not of unrelated target words. This pattern of results was observed in all three experiments. The general error correction and additional-cue theories would lead to the prediction that attempting retrieval should enhance subsequent learning of unrelated words, because even when target words were unrelated to the cues, failed retrieval attempts would still generate error signals and still result in the production of an additional cue. Instead, the results are consistent with a search set theory, in which establishment of a search set during retrieval activates related candidates. The encoding of related candidates would be enhanced because of this activation, but the encoding of unrelated words would not be enhanced, because those words were not part of the search set during the retrieval attempt.

Second, the effects of pretesting prior to study trials are retrieval-specific (Exp. 2). Kornell et al. (2009) showed that providing additional study time for target items did not produce the same effects as attempting retrieval prior to studying the items. In Experiment 2, studying the cue paired with a lure word did not produce the same benefit as attempting to retrieve a word (which was typically the lure word) in the presence of the cue. Thus, theoretical explanations of pretesting effects should focus on the nature of the retrieval process, particularly the establishment of a search set.

Third, unconstrained retrieval attempts enhanced learning, but constrained retrieval attempts did not (Exp. 2). In fact, constraining retrieval to a particular word decreased subsequent recall, relative to making no retrieval attempt in the no-pretest control condition. Again, a constrained retrieval attempt would yield an error signal and would lead to production of an additional cue, so this result is difficult to accommodate within error correction or additional-cue theories. The result is more consistent with a search set theory. Constraining retrieval likely restricted the search set and limited the activation of other candidates. Therefore, constrained retrieval attempts would not bolster the encoding of candidates during the subsequent study trial.

Finally, attempting retrieval enhanced learning when study trials occurred immediately after the pretest trials but not when they occurred after a delay (Exp. 3). This finding is also consistent with a search set theory. Activation of a search set is assumed to be short lived (Raaijmakers & Shiffrin, 1981). Therefore, a delayed study trial would occur at a time when the search set is presumably no longer activated, so that the studied word would not benefit from a prior retrieval attempt.

Thus, the majority of the present results are consistent with a search set theory of the pretesting effect. The theory suggests that when subjects attempt retrieval, a search set consisting of related candidates becomes activated, and residual activation enhances encoding when a candidate is presented in a subsequent study trial. The theory can account for the findings that attempting retrieval enhanced the learning of related but not of unrelated word pairs, that constraining retrieval to a particular candidate did not enhance learning, and that inserting a delay between the retrieval attempt and the subsequent study trial eliminated the pretesting effect.

One result from the present experiments is not readily explained by the search set theory: Engaging in a constrained retrieval attempt impaired learning relative to making no attempt in the no-pretest control condition. In Experiment 2, an unconstrained retrieval attempt enhanced recall relative to the control condition (.68 vs. .57), whereas a constrained retrieval attempt impaired recall relative to the control condition (.39 vs. .57). Directing subjects to retrieve a particular candidate likely introduced competition that was not present under unconstrained retrieval conditions (Barnes & Underwood, 1959). This competition during retrieval may have interfered with the subsequent encoding of candidate words enough to produce a negative effect on recall. In Experiment 2, constraining the initial retrieval attempts reduced the levels of final recall for both related and unrelated word pairs, suggesting that the interference arose from general competition during retrieval. At this point, we can only speculate about the possible mechanisms responsible for this negative effect of engaging in initial constrained retrieval attempts. The topic certainly merits further exploration.

Conclusion

The act of attempting retrieval alters the encoding that occurs in a subsequent study episode and thereby affects learning. The effects of attempting retrieval depend not only on the relatedness of the materials and the temporal context of the retrieval and encoding events but also on the nature of the retrieval processes engaged in by subjects. Whereas unconstrained initial retrieval attempts enhanced encoding, constrained retrieval attempts did not. Indeed, constrained retrieval reduced performance, perhaps due to competition afforded by the search for a particular target word. The present experiments provide initial evidence supporting the role of establishing a search set in the positive effects of attempting retrieval. Regardless of the theoretical underpinnings of these effects, it is clear that attempting retrieval has the potential to improve learning by enhancing subsequent encoding.