Fewer Constraints Enhance the Generation Effect for Source Memory in Younger, but not Older Adults

Abstract
The generation effect is the memory benefit for information that is self-generated compared to read. This effect is robust for both younger and older adults. Recent work with younger adults has shown that the generation effect for context memory (i.e., contextual details associated with an episode) can be increased when there are fewer rather than greater experimental constraints placed on what participants can generate. This increase in context memory is attributable to enhanced relational processing. Given older adults’ deficits in context memory, the present study tested whether fewer generation constraints would similarly improve the generation effect for contextual details in older adults. In this study, we examined age differences in item and context (i.e., source and associative) memory across three different tasks comprising the encoding of cue-target pairs: a lower-constraint generation task (i.e., free response to cue, such as assist – ____), a higher-constraint generation task (i.e., solving an anagram, such as assist – hlpe), and a read task (i.e., simply reading the cue-target pair, such as assist – help). Both age groups showed improved item and context memory for materials studied during the generation tasks (both lower- and higher-constraint) compared to the read task. However, only younger adults showed increased source memory for lower-constraint compared to higher-constraint generation, whereas older adults showed equivalent source and associative memory for both lower- and higher-constraint generation tasks. These findings suggest both age groups benefit from self-generation, but older adults may benefit less from conditions that enhance relational processing (lower-constraint generation) in younger adults.

to remember details like the original location or setting in which they encountered information (Spencer & Raz, 1995). Another type of context memory decline in older adults is in associative memory. For example, when studying cue-target word pairs, older adults can remember the individual target words (item memory) as well as younger adults, but have difficulty remembering which cue and target were paired together (i.e., associative memory; Naveh-Benjamin, 2000). Despite these deficits, there is some evidence that certain encoding strategies may partially improve source and associative memory in older adults (Dunlosky & Hertzog, 1998; Hertzog, McGuire, & Lineweaver, 1998; Leshikar, Gutchess, Hebrank, Sutton, & Park, 2010; Naveh-Benjamin, Brav, & Levy, 2007).
One particular encoding strategy that can improve memory in both younger and older adults is self-generation. Research has shown that self-generated information is better remembered than information that is read or otherwise not self-generated, a finding now commonly known as the generation effect (Jacoby, 1978; Slamecka & Graf, 1978). A large body of work has shown that this effect is robust for the information that is generated (i.e., item memory) across a variety of experimental procedures for both younger and older adults (for a review, see Bertsch, Pesta, Wiscott, & McDaniel, 2007). Although the use of varying procedures has been valuable in advancing what we know about the generation effect, an often-overlooked detail in the literature is the influence that this variability in experimental tasks (i.e., different generation tasks) has on memory. Specifically, many studies on the generation effect constrain participants to generate a single "correct" response per trial. As an example, a common generation procedure requires participants to generate target words from a cue, but they are given the constraints of a generation rule and some letters of the to-be-generated target to ensure they generate the "correct" response (e.g., solve the anagram: open – cosle; Foley & Foley, 2007). Memory for generated words is then compared to a control condition in which participants simply read associated word pairs (e.g., hot – cold). Generated words are often better remembered than read words (Bertsch et al., 2007; Slamecka & Graf, 1978); however, some work has shown that constraints placed on what participants self-generate may limit the magnitude of the memory benefits from self-generation (Fiedler, Lachnit, Fay, & Krug, 1992; Gardiner, Smith, Richardson, Burrows, & Williams, 1985; McCurdy, Leach, & Leshikar, 2017; McCurdy, Sklenar, Frankenstein, & Leshikar, Under Review).
We tested this hypothesis in younger adults and found that a generation task with fewer constraints, in which participants responded freely to a cue word (e.g., open – _____), often led to better memory when compared against commonly used higher-constraint generation tasks (i.e., anagram & word fragment tasks; McCurdy et al., 2017). Interestingly, this effect of generation constraint was most robust and consistent for measures of source (i.e., remembering which task was performed with the word pair) and associative memory (i.e., recalling the associated target word when given the cue word). Given the documented declines in older adults' ability to remember these types of contextual details (e.g., source and associative; Naveh-Benjamin, 2000; Spencer & Raz, 1995), the goal of this study was to examine the extent to which older adults might show memory improvements for materials produced under lower compared to higher generation constraints, which might buffer against declining memory across the lifespan.
Prior work on the generation effect has investigated the memory mechanism(s) that might underlie the generation effect. This research has supported a hypothesis known as the two-factor theory that suggests self-generation improves memory (compared to reading) via two primary mechanisms: 1) increased distinctiveness of the generated item and 2) enhanced relational processing (e.g., processing of the association between the cue and the generated target; Hirshman & Bjork, 1988). Our prior work on the influence of generation constraints on the generation effect may also be understood through the two-factor theory (McCurdy et al., 2017). One way to examine differences between the item distinctiveness and relational processing components of the two-factor theory is through the use of different types of memory tests. Specifically, item recognition tests are sensitive measures of item distinctiveness (Einstein & Hunt, 1980; Hunt & Einstein, 1981; Kintsch, 1974), and cued recall tests are known to be sensitive measures of relational processing (Donaldson & Bass, 1980; Hirshman & Bjork, 1988). In some of our prior work in younger adults, we aimed to tie the enhanced memory benefits of lower-constraint generation to either or both components of the two-factor theory through two separate memory tests (item recognition, cued recall; McCurdy et al., Under Review) to better understand how generation constraint influences memory for self-generated information. Across several studies we have consistently found that lower-constraint generation improved memory over higher-constraint generation for measures of cued recall, but not item recognition (McCurdy et al., 2017; McCurdy et al., Under Review). Because cued recall tests are sensitive to relational processing, these findings suggest lower-constraint generation may influence the generation effect through enhanced relational processing, relative to a higher-constraint task.
In addition, we have found evidence that lower-constraint generation consistently leads to improved source memory compared to higher-constraint generation, which provides further support that lower-constraint generation enhances relational processing that can extend to a variety of contextual details (e.g., source and associative memory). Providing an encoding strategy that induces greater relational processing may be an effective way to improve older adults' context memory (both source and associative), which is important given age-related declines for these two types of details (Naveh-Benjamin, 2000; Old & Naveh-Benjamin, 2008; Spencer & Raz, 1995).
Research on the generation effect for source memory has generally found that younger and older adults show similar source memory benefits from self-generation. More specifically, younger and older adults are often equally effective in their ability to distinguish between internally generated and externally presented (e.g., read) information, an ability known as reality monitoring (Cohen & Faulkner, 1989; Johnson & Raye, 1981; D. B. Mitchell, Hunt, & Schmitt, 1986). Critically though, when asked to distinguish between two different types of internally generated pieces of information (e.g., say versus think), older adults often experience pronounced deficits compared to younger adults (i.e., source monitoring; Brown, Jones, & Davis, 1995; Hashtroudi, Johnson, & Chrosniak, 1989). These findings in the source monitoring literature are inconsistent with the aforementioned suggestion that an effective encoding strategy that enhances relational processing may improve context memory similarly between age groups. Given that lower- and higher-constraint generation tasks are both internal generation tasks, the source monitoring framework would predict that older adults should not experience the same gains as younger adults, given their relative inability to discriminate between sources of the same type (Hashtroudi et al., 1989). However, there is no prior work on source monitoring with respect to age-related differences between generation tasks that vary on a dimension of constraint (i.e., different modes of generation, "say versus think", do not vary in amount of generation constraint). Therefore, whether the enhanced relational processing from lower-constraint generation can reduce or eliminate age-related source monitoring and source memory deficits remains an empirical question we aim to answer in the present study.
Finding evidence of an enhanced generation effect for source memory from lower-constraint generation may offer insights into ways to reduce the source memory deficits often experienced by older adults (Spencer & Raz, 1995).
Turning to associative memory, prior investigations have also shown that younger and older adults often experience similar associative memory benefits from self-generation compared to reading (D. B. Mitchell et al., 1986; Taconnat, Froger, Sacher, & Isingrini, 2008; Taconnat & Isingrini, 2004; Whiting, 2003). Taconnat et al. (2008), however, reported age differences in the generation effect for associative memory based on generation difficulty, where both age groups showed a generation effect when the generated word pairs were strongly related, but only younger adults showed a generation effect for associative memory when the generated word pairs were weakly related (i.e., more difficult generation). Taconnat et al. (2008) suggest these age differences occurred as a result of the "associative deficit hypothesis" (Naveh-Benjamin, 2000), which suggests older adults have difficulty spontaneously encoding associations between items, especially between items that are not already strongly related. Based on this research, we see two possible outcomes regarding generation constraint in older adults for associative memory. If older adults experience deficits in spontaneously forming associations as some prior work suggests (Taconnat et al., 2008), then one possibility is that older adults will not experience associative memory benefits from a lower-constraint generation task, as younger adults have shown in our prior work (McCurdy et al., 2017). Alternatively, if lower-constraint generation enhances relational processing as we suspect, then we would expect to see mitigation of associative deficits in older adults. Prior work has not directly investigated age differences in the effect of generation constraint on associative memory; therefore, the present study aims to distinguish between these two possibilities.
In this investigation, we compared memory effects (item, source, associative) for a lower-constraint generation, higher-constraint generation, and read task in both younger and older adults. We had three predictions. First, for all memory measures, we expected both age groups to show the standard generation effect (both lower-constraint and higher-constraint generation will lead to better memory compared to the read task) consistent with previous work that younger and older adults benefit similarly from self-generation (Gregory, Mergler, Durso, & Zandi, 1988; Hashtroudi et al., 1989; D. B. Mitchell et al., 1986; Rabinowitz, 1989; Taconnat et al., 2008; Taconnat & Isingrini, 2004; Whiting, 2003). For item recognition, we did not expect to see generation constraint effects for either age group, in line with our prior work showing the effect of generation constraint is less robust for item memory (Exp. 2 & 3, McCurdy et al., 2017). Second, for younger adults we expected greater context memory (source and associative memory) for materials generated in a lower- compared to higher-constraint task, in line with the enhanced relational processing account of the two-factor theory (Hirshman & Bjork, 1988), and our prior work in younger groups (McCurdy et al., 2017). Third, for older adults we see two possible outcomes for both context memory types (source and associative memory). One possibility in line with the two-factor theory (Hirshman & Bjork, 1988) is that lower-constraint generation will enhance relational processing to a greater extent than a higher-constraint task, leading to improved source and associative memory in older adults. This outcome would be consistent with our prior work in younger adults (McCurdy et al., 2017) and other evidence that suggests generation effects for both source and associative memory measures in younger and older adults are often similar in magnitude (D. B. Mitchell et al., 1986; Taconnat et al., 2008; Taconnat & Isingrini, 2004; Whiting, 2003).
However, because older adults show pervasive declines for both types of contextual details (Old & Naveh-Benjamin, 2008; Spencer & Raz, 1995), a second possibility is that interventions that enhance relational processing (such as lower-constraint generation) are less effective in older adults than in younger adults. In other words, it may be that older adults do not experience the same boost in relational processing, thus leading to equivalent source and associative memory between lower- and higher-constraint generation. This outcome would be another instance of context memory deficits in older adults (Old & Naveh-Benjamin, 2008; Spencer & Raz, 1995) and would be consistent with other work showing older adults still experience memory deficits relative to younger adults, even when given an encoding strategy to use (Dunlosky & Hertzog, 1998; Hertzog, Fulton, Mandviwala, & Dunlosky, 2013; Naveh-Benjamin et al., 2007). Given that no work to date has directly examined the effect of generation constraint in older adults, we aim to explore which hypothesis is supported in the present study.

Method

Participants
Twenty-four younger (mean age: 19.0, SD: 1.1, range: 18-23, 12 females) and 25 older (mean age: 69.4, SD: 7.6, range: 60-85, 13 females) adults participated in this study.¹ Younger adults were recruited through the University of Illinois at Chicago introductory psychology course subject pool. Older adults were recruited from the greater Chicago community. Participants gave written informed consent in accordance with the University of Illinois at Chicago Institutional Review Board and were compensated with course credit or paid for their participation. Each participant completed a language history questionnaire and a battery of neuropsychological tests (including the Mini-Mental State Examination, Shipley vocabulary, word fluency, digit span, and digit symbol). Results of these neuropsychological tests are reported in Table 1. Independent samples t-tests showed that older adults were more educated and performed better on measures of vocabulary and word fluency, yet older adults performed more poorly on a measure of processing speed (digit symbol), which is consistent with the idea that older adults show declines in processing speed relative to the young, but show stable or improved crystallized intelligence (Park et al., 2002; Salthouse, 2012).

Stimuli
A total of 96 unique cue-target word pairs were used, selected from the University of South Florida Free Association Norms (Nelson, McEvoy, & Schreiber, 2004). Word pairs were selected so that both the cue and target were between 4 and 7 letters long. Each word pair was highly associated (mean forward cue-to-target strength: .624, range: .493-.899). Across participants, each pair was counterbalanced so that it occurred in each encoding task (generate, scramble, read), and as a new item at retrieval. The stimulus set and counterbalancing procedure were analogous to those used in our prior study (McCurdy et al., 2017).
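As an illustration of the counterbalancing described above, the following minimal sketch rotates four equal sets of pairs through the four conditions across participant lists. The text states only that each pair occurred in every encoding task and as a new item across participants; the rotation scheme, the function name `assign_conditions`, and the `list_index` parameter are assumptions made for this example, not the authors' procedure.

```python
# Hypothetical Latin-square style rotation; not taken from the original study.
CONDITIONS = ["generate", "scramble", "read", "new"]


def assign_conditions(pair_ids, list_index):
    """Map each word-pair id to a condition for one counterbalancing list.

    The pairs are split into as many equal sets as there are conditions,
    and the set-to-condition mapping is shifted by `list_index`, so that
    over four consecutive lists every pair appears once in each condition.
    """
    n_sets = len(CONDITIONS)
    if len(pair_ids) % n_sets != 0:
        raise ValueError("number of pairs must divide evenly across conditions")
    set_size = len(pair_ids) // n_sets
    assignment = {}
    for position, pair in enumerate(pair_ids):
        set_index = position // set_size
        assignment[pair] = CONDITIONS[(set_index + list_index) % n_sets]
    return assignment


if __name__ == "__main__":
    pairs = list(range(96))  # placeholder ids standing in for the 96 pairs
    first_list = assign_conditions(pairs, list_index=0)
    counts = {c: list(first_list.values()).count(c) for c in CONDITIONS}
    print(counts)  # 24 pairs per condition
```

With 96 pairs this yields 24 pairs per encoding task (72 studied pairs) and 24 new items for each participant, matching the counts reported in the Procedure.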

Procedure
Participants completed three phases in the experiment: encoding, recognition, and cued recall. The encoding and recognition phases were presented on a monitor using E-Prime presentation software (Psychology Software Tools, 2012). Prior to the encoding phase participants were trained on shortened versions of the encoding and recognition phases; thus, participants were aware of the recognition test before encoding, but not of the subsequent cued recall test. In the encoding phase, participants saw a total of 72 word pairs across three conditions: generate, scramble, and read (24 pairs in each task). In the "generate" task (i.e., lower-constraint generation task), participants were shown a cue word followed by a blank line (e.g., brief – _____) and were instructed to generate a word related to the cue word and then say both words aloud (i.e., the cue and the generated target). Participants were instructed that if they did not understand the meaning of the cue word, they could say "skip"; however, no participant skipped a trial in this task. The generate task instructions emphasized that the responses were subjective and that there were no correct answers so that participants were free to generate any word that came to mind. Participant responses during this phase were recorded by the experimenter. In the "scramble" task (i.e., higher-constraint generation task), participants were shown a cue word followed by an anagram of the target word (e.g., blaze – feri) and were instructed to solve the anagram and then say both words aloud. If the participant was unable to determine the scrambled word, they were instructed to say "skip" and that trial was later removed from the memory analyses. The first letter of the scrambled target word was always in its correct position, followed by the rest of the letters randomly arranged, as done previously (Foley & Foley, 2007; Foley, Foley, Wilder, & Rusch, 1989; McCurdy et al., 2017). This procedure served to reduce the number of skipped trials.²
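The anagram rule just described (first letter fixed, remaining letters randomly arranged) can be sketched as follows. This is only an illustrative sketch: the function name `make_anagram` is hypothetical, and reshuffling until the result differs from the original word is an added assumption that the text does not state.

```python
import random


def make_anagram(word, rng=None):
    """Return `word` with its first letter in place and the rest shuffled.

    Assumption (not stated in the source): the tail is reshuffled until it
    differs from the original order, so the anagram never equals the target.
    """
    rng = rng or random.Random()
    first, rest = word[0], list(word[1:])
    if len(set(rest)) < 2:
        # The tail cannot be rearranged into a different string.
        return word
    shuffled = rest[:]
    while shuffled == rest:
        rng.shuffle(shuffled)
    return first + "".join(shuffled)


if __name__ == "__main__":
    rng = random.Random(2017)  # seeded only to make the demo repeatable
    for target in ["fire", "help", "close"]:
        print(target, "->", make_anagram(target, rng))
```

Under this rule the example from the text, blaze – feri, is one valid scramble of the target "fire".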
Importantly, this scramble task is a commonly used procedure that places high constraints on what participants can generate (Foley & Foley, 2007; Foley et al., 1989). In the "read" task, participants were shown a cue and target word (e.g., done – finish) and were instructed to read both words aloud. For all encoding tasks, participants were instructed to say both words aloud to ensure that the participant was processing the cue and target word in all tasks. The presentation of each trial was self-paced; once the participant said both words for that trial, the experimenter advanced the screen and the participant was presented with the next encoding trial after a 500 ms fixation (see Figure 1). Encoding trials were presented in six blocks. Each block contained 12 trials of the same type (generate, scramble, or read). Before each block, an instruction prompt appeared for 3000 ms indicating the next task to perform (e.g., "Get ready to do the generate/scramble/read task."). The order of the blocks was randomized so that the same task was not presented successively (e.g., a generate block never followed another generate block). Following the encoding phase, participants filled out a demographics questionnaire while the experimenter uploaded the generated target words from the generate task into E-Prime. This process took approximately 2 minutes before the participant began the recognition phase. During recognition, participants were shown all 72 target words (i.e., the words they generated, unscrambled, or read) and 24 new words in a random order, for a total of 96 recognition trials. Each recognition trial consisted of two self-paced recognition judgments: an item memory judgment and a source memory judgment (see Figure 1).
First, for item memory, a target or new word was presented on the screen, and participants judged whether the word was old, new, or whether they were unsure ("don't know"), by pressing corresponding keys on a keyboard ('V' = old | 'B' = new | 'M' = don't know). Second, following a 500 ms fixation, participants were shown the same target word and judged whether they encountered that word in the generate, scramble, or read condition, or whether they were unsure ("don't know"), by pressing corresponding keys on a keyboard ('V' = generate | 'B' = scramble | 'N' = read | 'M' = don't know). This response served as the source memory judgment. The "don't know" response was included as an option to reduce the likelihood of guessing (Duarte, Henson, & Graham, 2008; Duarte, Henson, Knight, Emery, & Graham, 2010). For items that were judged as "new" for the first decision (item memory), participants were then instructed to make a "don't know" response for the source memory decision, as done in previous studies (Leshikar, Duarte, & Hertzog, 2012; Leshikar, Dulas, & Duarte, 2015; McCurdy et al., 2017).
Immediately following the recognition test, participants were given a surprise associative memory test (cued recall). Participants were given a pencil and paper that listed all 72 cue words seen during the encoding phase (in a randomized order), each followed by a blank space to write their response. Participants were instructed to write the word that was paired with that cue word from the experiment. If they were unsure, they were instructed to leave the line blank to reduce guessing. After completing the associative memory test, participants completed a language history questionnaire and neuropsychological battery, were debriefed, and were compensated for their participation.
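The encoding procedure specifies six blocks of 12 trials, with no task appearing in two successive blocks. With 24 pairs per task, this implies two blocks per task. One simple way to produce such an order is rejection sampling, sketched below; the function name `block_order` and the sampling approach are assumptions for illustration, since the text does not say how the constrained randomization was implemented in E-Prime.

```python
import random

TASKS = ["generate", "scramble", "read"]


def block_order(rng=None, blocks_per_task=2):
    """Return a random block order in which no task repeats back to back.

    Rejection sampling: shuffle the full list of blocks and keep the first
    shuffle that satisfies the no-successive-repeat constraint.
    """
    rng = rng or random.Random()
    blocks = TASKS * blocks_per_task
    while True:
        rng.shuffle(blocks)
        if all(a != b for a, b in zip(blocks, blocks[1:])):
            return list(blocks)


if __name__ == "__main__":
    print(block_order(random.Random(1)))
```

Because valid orders make up a sizeable share of all possible shuffles of six blocks, the loop typically ends after a few attempts.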

Results
At encoding, both younger and older adult participants gave responses on 100% of the generate³ and read trials. Younger adults successfully unscrambled 95.9%, and older adults successfully unscrambled 98.2%, of words in the scramble task; these rates did not differ significantly, t(47) = 1.45, p = .15. Trials where participants were unable to unscramble the target word correctly were removed from all analyses. The raw (uncorrected) recognition responses for item and context memory are presented in Table 2. First, a 3 (task: generate, scramble, read) by 2 (age group: younger, older) repeated measures mixed ANOVA was conducted separately for each memory test (item, source, associative) to examine memory performance between younger and older adults. Any significant interactions were followed up by conducting one-way ANOVAs for each age group separately, including the appropriate post-hoc pairwise comparisons.
Item recognition scores were corrected for false alarms by taking the number of items correctly judged as "old" for each task (hits) minus the number of "new" items identified as "old" (false alarms). Source recognition scores were calculated using the conditional source identification measure (CSIM; Murnane & Bayen, 1996), a procedure that removes the influence of item recognition on source memory performance. Using this measure, source memory scores are represented by the proportion of correctly recognized items (i.e., item hits) that are also correctly identified with their source (i.e., source "hits" / item "hits" for that source). Associative memory scores were calculated as the percent of correctly recalled target words out of all words seen in the encoding phase. Items that were incorrect or left blank on this test were counted as misses.
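The three scoring rules above can be written out as a minimal sketch. The function names and the example counts are hypothetical and chosen only for illustration. The item score is expressed here as hit rate minus false alarm rate, on the assumption (suggested by the proportion-scale means reported below) that the counts were converted to proportions.

```python
def corrected_item_score(hits, n_old, false_alarms, n_new):
    """Hit rate for one task minus the false alarm rate for new items."""
    return hits / n_old - false_alarms / n_new


def csim(source_hits, item_hits):
    """Conditional source identification measure for one source.

    Proportion of correctly recognized items from a source that were also
    attributed to that source. Undefined when there are no item hits.
    """
    if item_hits == 0:
        return float("nan")
    return source_hits / item_hits


def associative_score(n_recalled, n_studied):
    """Proportion of studied targets correctly recalled (blanks are misses)."""
    return n_recalled / n_studied


if __name__ == "__main__":
    # Hypothetical participant: 18 of 24 generate targets called "old",
    # 3 of 24 new words called "old", 15 of the 18 hits given the right source.
    print(corrected_item_score(hits=18, n_old=24, false_alarms=3, n_new=24))  # 0.625
    print(csim(source_hits=15, item_hits=18))
    print(associative_score(n_recalled=12, n_studied=24))  # 0.5
```

Conditioning the source score on item hits is what keeps a participant with poor item recognition from automatically receiving a low source score.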

Item Memory
Data from the item recognition analysis are graphed in Figure 2. Results showed a significant main effect for task⁴, F(2, 94) = 60.42, p < .001, ηp² = .557. Bonferroni corrected follow-up tests revealed that across younger and older adults, the scramble task (M = .71, SD = .18) led to better item memory than the generate task (M = .62, SD = .19), t(47) = 3.45, p = .004, d = .49, suggesting higher-constraint generation enhanced item recognition over lower-constraint generation across both age groups. This finding is inconsistent with our prior work in younger adults (McCurdy et al., 2017). Critically, both the generate, t(47) = 6.69, p < .001, d = .96, and scramble, t(47) = 11.76, p < .001, d = 1.68, tasks led to better item recognition compared to the read task (M = .42, SD = .17), consistent with the standard generation effect. The main effect of age, F(1, 47) = 0.31, p = .58, ηp² = .006, and the age by task interaction, F(2, 94) = 1.03, p = .36, ηp² = .009, were not significant.

Figure 2. Corrected item memory performance as a function of condition and age. Error bars represent standard errors of the mean. Item recognition was corrected by calculating hits minus false alarms (proportion of "new" words identified as "old"). "Generate" = lower-constraint generation task; "Scramble" = higher-constraint generation task.
⁴ One method of accounting for item selection confounds in the generate task is to separate the trials for which participants produced the normed, expected target from the trials for which they produced a novel, unexpected target. To remove the influence of item-selection confounds, we conducted an analysis that excluded all trials in which the participant did not produce the normed response. All key results (item, source, associative) showed a similar pattern to the analysis with all items included.

Source Memory
Data for the source memory analysis are shown in Figure 3. ANOVA results revealed a significant main effect for task, F(2, 94) = 7.18, p = .001, ηp² = .126. Bonferroni corrected follow-up comparisons showed that across both age groups there was no difference between the generate (M = .76, SD = .20) and scramble tasks (M = .72, SD = .16), t(47) = .75, p = 1.0, d = .11. However, both the generate, t(47) = 3.62, p = .002, d = .52, and scramble, t(47) = 2.68, p = .03, d = .38, tasks led to higher source memory compared to the read task (M = .61, SD = .21), consistent with the standard generation effect. There was also a main effect for age in the source memory analysis, F(1, 47) = 17.76, p < .001, ηp² = .274, where younger adults (M = .77, SD = .12) outperformed older adults (M = .63, SD = .12) across all tasks. The task by age interaction in this analysis was not significant, F(2, 94) = 2.70, p = .07, ηp² = .101. We conducted further analyses to address our specific a priori predictions regarding whether both younger and older adults show a generation constraint effect.

Figure 3. Conditional source memory performance as a function of condition and age. Error bars represent standard errors of the mean. Conditional source memory reflects response rates on the source memory decision only for items that were correctly identified in the item memory judgment. "Generate" = lower-constraint generation task; "Scramble" = higher-constraint generation task.
For younger adults, a one-way ANOVA of the source memory data revealed significant differences between the three tasks, F(2, 22) = 7.23, p = .004, ηp² = .396. Bonferroni corrected follow-up comparisons revealed that source memory was significantly greater for the generate (M = .84, SD = .12) compared to the scramble task (M = .75, SD = .15), t(23) = 2.78, p = .011, d = .57. Importantly, source memory was greater for the generate task compared to the read task (M = .72, SD = .17), t(23) = 3.58, p = .002, d = .73, but there was no significant difference between the scramble and read tasks, t(23) = .622, p = .54, d = .13. This finding is in line with our predictions and prior work in younger adults that shows lower-constraint generation consistently improves source memory compared to a higher-constraint generation task (McCurdy et al., 2017).
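For readers who wish to relate the reported t statistics to the effect sizes, the paired-comparison values above are reproduced by the common conversion d = t / √n, where n is the number of participants. The text does not state how d was computed, so this relation is an assumption; the sketch below simply checks it against the three younger-adult comparisons (n = 24). The function name is hypothetical.

```python
import math


def paired_d_from_t(t_value, n_participants):
    """Effect size for a paired comparison, assuming d = t / sqrt(n)."""
    return t_value / math.sqrt(n_participants)


if __name__ == "__main__":
    # (comparison, reported t) for younger adults, n = 24
    comparisons = [
        ("generate vs. scramble", 2.78),
        ("generate vs. read", 3.58),
        ("scramble vs. read", 0.622),
    ]
    for label, t_value in comparisons:
        print(label, round(paired_d_from_t(t_value, 24), 2))
    # prints 0.57, 0.73 and 0.13, matching the reported d values
```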
For older adults, a one-way ANOVA for the source memory data indicated significant differences between tasks, F(2, 23) = 4.69, p = .01, ηp² = .164. Bonferroni corrected follow-up analyses showed no statistical differences between the generate (M = .66, SD = .26) and scramble task (M = .70, SD = .17), t(24) = -0.54, p = 1.0, d = -.11. However, the scramble task led to significantly greater source memory compared to the read task (M = .51, SD = .24), t(24) = 2.81, p = .03, d = .56, and the generate task showed numerically greater source memory than the read task, t(24) = 2.31, p = .09, d = .46, generally consistent with the standard generation effect. This pattern of results differs from that of younger adults, in that lower-constraint generation showed source memory benefits for younger adults, but not older adults. Because traditional frequentist statistics cannot be used as evidence to support the null hypothesis (i.e., that older adults show equal source memory benefits from lower- compared to higher-constraint generation), we conducted a Bayesian paired samples t-test between lower-constraint and higher-constraint source memory in the older adult sample using JASP statistical software (JASP Team, 2018) to quantify the evidence for the null hypothesis relative to the alternative, given these data. This analysis revealed a BF01 (the ratio of the likelihood of the data under the null hypothesis to their likelihood under the alternative hypothesis) of 4.14, suggesting that these data are about four times more likely under the null hypothesis than under the alternative hypothesis, providing further evidence that older adults likely do not experience source memory benefits for lower-constraint compared to higher-constraint generation.
For comparison, an equivalent Bayesian paired-samples t-test between lower- and higher-constraint generation for source memory in younger adults revealed a BF01 of 0.21, suggesting that in younger adults, the observed data are nearly five times less likely under the null compared to the alternative hypothesis.
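The "nearly five times" statement follows from the reciprocal relation between the two Bayes factors, BF10 = 1 / BF01. The short sketch below applies that relation to the two reported values; the function name is hypothetical.

```python
def bf10_from_bf01(bf01):
    """Convert evidence for the null (BF01) into evidence for the alternative."""
    if bf01 <= 0:
        raise ValueError("a Bayes factor must be positive")
    return 1.0 / bf01


if __name__ == "__main__":
    print(round(bf10_from_bf01(0.21), 2))  # younger adults: 4.76
    print(round(bf10_from_bf01(4.14), 2))  # older adults: 0.24
```

A BF01 of 0.21 thus corresponds to data about 4.76 times more likely under the alternative than under the null, whereas the older adults' BF01 of 4.14 favors the null by roughly the same margin.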
Given the different pattern of results between age groups, we further examined these age-related differences by conducting independent samples t-tests between younger and older adults for each encoding task (generate, scramble, read). These analyses showed that younger adults had significantly better source memory than older adults for the generate task, t(47) = 3.00, p = .004, d = .86, and the read task, t(47) = 3.52, p < .001, d = 1.00. There were no differences in source memory between younger and older adults for the scramble task, t(47) = 1.03, p = .31, d = .29.
Because the pattern of task effects on source memory differed between age groups (although the omnibus age by task interaction did not reach significance), we wanted to investigate whether this pattern might have been driven by bias. In particular, one type of bias is the "it had to be you" effect, where participants are more likely to incorrectly attribute "new" items to external sources (e.g., read) compared to internal sources (e.g., generate; Johnson & Raye, 1981). Therefore, we examined the false alarm data, where participants identified a "new" word as "old", and analyzed the source judgments to determine if there was a bias to attribute these false alarms to the external source (read) compared to internal sources (generate and scramble). This analysis revealed that both age groups were indeed more likely to attribute "new" items to the external source (read) compared to both internal sources (generate: t(47) = 2.75, p = .03; scramble: t(47) = 2.84, p = .02), providing evidence of the "it had to be you" bias in our sample. Critically, there was no interaction by age group, F(2, 94) = 0.27, p = .76, suggesting this bias was similar for both younger and older adults.

Discussion
This study investigated the effects of generation constraint on item and context (source and associative) memory in younger and older adults. There were three main findings. First, both age groups showed greater item, source, and associative memory for generated materials (both lower- and higher-constraint) compared to reading, consistent with the standard generation effect in memory (Bertsch et al., 2007). Second, for younger adults, the lower-constraint task led to increased source memory performance compared to the higher-constraint and read tasks, consistent with prior work (McCurdy et al., 2017); however, for item recognition the opposite pattern was evident (higher-constraint showed greater memory compared to lower-constraint). Third, for older adults, there was no significant difference between the lower- and higher-constraint generation tasks on the context memory measures. Overall, these findings suggest both age groups clearly benefit from self-generation compared to reading (the generation effect); however, only younger adults experienced greater source memory benefits from a lower-constraint generation task, whereas older adults did not.
Turning first to item memory, we found the standard generation effect for both age groups, where both generation tasks (lower- and higher-constraint) led to better item memory than the read task, providing further evidence that self-generation is a powerful mnemonic that can improve item memory in both younger and older adults (Bertsch et al., 2007). However, in our comparison of lower- and higher-constraint generation, we found that in both younger and older adults, the higher-constraint generation task led to greater item recognition compared to the lower-constraint task. Additionally, participants took longer to make their recognition judgements for lower-constraint and read items compared to higher-constraint items, perhaps suggesting they were more confident in their judgements for items generated in the higher-constraint task.
These findings are not consistent with our expectations and prior work showing equivalent item memory between generation tasks (McCurdy et al., 2017, Exp. 2 & 3; McCurdy et al., Under Review). Given that recognition tests are primarily sensitive to the item distinctiveness component of the two-factor theory (Einstein & Hunt, 1980; Hunt & Einstein, 1981; Kintsch, 1974), our present findings suggest that fewer generation constraints do not increase the distinctiveness of the generated item. In contrast, higher-constraint tasks may actually produce greater item distinctiveness, given that the higher-constraint task led to enhanced item recognition. However, more work should be done to corroborate this claim, given that this is the first time we have seen this pattern in our investigations of generation constraint. Overall, it is clear that generating acts to increase item distinctiveness relative to reading, as suggested by the two-factor theory (Hirshman & Bjork, 1988), but more work is necessary to better understand how generation constraint influences distinctiveness.
A primary goal of this study was to investigate the influence of generation constraint on source memory in younger and older adults, to examine whether older adults would experience the enhanced relational processing from lower-constraint generation that we believe underlies the source memory benefits in younger adults. We found a different pattern of results between age groups, however, suggesting age-related differences in the source memory benefits from lower-constraint generation. For younger adults, we found that lower-constraint generation led to better source memory compared to higher-constraint generation. These findings converge with prior work (McCurdy et al., 2017) suggesting that lower-constraint generation leads to enhanced relational processing, in line with the two-factor theory (Hirshman & Bjork, 1988). Although the two-factor theory originally claimed that the relational processing induced by self-generation was limited specifically to the cue-target relationship (associative memory), more recent work has shown that heightened relational processing can lead to improved memory for other context (relational) details (Greenwald & Johnson, 1989; Marsh, 2006; Marsh, Edelman, & Bower, 2001). Enhanced relational processing may allow multiple contextual details (e.g., the task that was performed, the cue word used to generate) to be bound with the target item into a single memory representation (Chalfonte & Johnson, 1996), leading to improved source recognition in younger adults for items generated in a lower-constraint task. It should be noted, however, that more recent work has suggested that not all context memory types are similarly affected by self-generation (Mulligan, 2004, 2011; Mulligan & Lozito, 2004).
Specifically, Mulligan (2011) has suggested that self-generation improves memory (relative to reading) for intrinsic context details (i.e., conceptually related information like source memory and cue-target association), but not for extrinsic context details (i.e., perceptually related information like font color or font type). Although we did not examine extrinsic context details in the present study, Mulligan and colleagues' work implies that the effect of generation constraint on source memory may not generalize to all other types of contextual details (specifically extrinsic details). Future work should examine whether generation constraint has differential effects on intrinsic and extrinsic details, or whether this effect is more general and influences all types of context memory similarly.
For older adults, we found that the lower- and higher-constraint tasks led to equivalent source memory. Although this finding is not consistent with younger adults in the present study or prior work (McCurdy et al., 2017), equivalent source memory between two generation tasks in older adults is supported by other literatures, especially that of reality and source monitoring effects across age (Brown et al., 1995; Hashtroudi et al., 1989; Johnson, Hashtroudi, & Lindsay, 1993). This literature suggests that older adults are often as good as younger adults in discriminating between internal (e.g., "say") and external sources (e.g., "listen"; i.e., reality monitoring); however, they show difficulty in discriminating between two sources of internally generated materials (e.g., "say" versus "think"; i.e., source monitoring; Hashtroudi et al., 1989). In the present study, both younger and older adults were effective at distinguishing generated items (internal source; lower- and higher-constraint) from read items (external source), as predicted by reality monitoring work (Hashtroudi et al., 1989; Johnson & Raye, 1981; Rabinowitz, 1989). However, when asked to discriminate between two "internal" (or self-generated) sources (e.g., lower-constraint vs. higher-constraint), older adults showed deficits compared to the young. Indeed, older adults made nearly twice as many source confusions between the two generation tasks compared to younger adults (see Table 2), in line with source monitoring deficits often reported for older adults (Brown et al., 1995; Hashtroudi et al., 1989; Johnson et al., 1993). Importantly, the present findings provide further evidence that older adults have impoverished memory for source details (Spencer & Raz, 1995), and support the hypothesis that older adults' deficits in source memory diminish the benefits from lower-constraint generation shown in younger adults.
Prior work has suggested older adults' source memory deficits may occur as a result of a reduced ability to bind multiple details of an episode into a coherent memory representation (Chalfonte & Johnson, 1996; K. J. Mitchell, Johnson, Raye, Mather, & D'Esposito, 2000), or a reduced ability to spontaneously engage in elaborative encoding (Craik & Byrd, 1982; Craik & Simon, 1980). If this is indeed the case, and enhanced relational processing is the primary mechanism behind the additional benefits from lower-constraint generation as we suspect (McCurdy et al., 2017; McCurdy et al., Under Review), these deficits could potentially explain why older adults are less able to experience the source memory benefits from lower-constraint generation that we have seen in younger adults.
Although our results suggest that generation constraint had little effect on source memory in older adults, there is some reason to believe that older adults might still show improved source memory for a lower-constraint task over a higher-constraint task under different experimental procedures. Specifically, because the source memory measure we used required participants to distinguish between two internal sources, it may be that lower-constraint generation does improve context memory in older adults over a higher-constraint task, but that the design we used in this experiment (discriminating between two internal sources) obscured our ability to observe this improvement. Future work that compares lower-constraint and higher-constraint tasks under a modified design (e.g., manipulating constraint between subjects) could provide insight into this possibility.
Another factor to consider in source memory effects is the bias for subjects to attribute new items identified as "old" (false positives) to external sources (read) compared to internal sources (generate), a bias known as the "it had to be you" effect (Johnson & Raye, 1981). We attempted to reduce guessing in our data by adding the "don't know" response option. However, an examination of Table 2 suggests that some false positives were also falsely identified as being seen in one of the three tasks (generate, scramble, or read). We examined these false positive source attributions and found evidence of the "it had to be you" bias in our sample, but critically no differences between younger and older adults. This bias is important to consider because it suggests that source memory performance for read items across both age groups is likely overestimated, given evidence of a bias to attribute "new" items to an external (read) source. Thus, it is possible that the benefits of self-generation (lower- and higher-constraint) compared to reading for both age groups may be larger than observed in the present findings, providing further support that self-generation is effective in enhancing source memory in younger and older adults.
Our final goal in this study was to investigate the impact of generation constraint on associative memory in younger and older adults. Although we found numerical differences in associative memory for younger adults between items produced in the lower-constraint compared to the higher-constraint task, this effect was not statistically significant for either age group. Our previous work in younger adults has shown consistent effects of generation constraint (lower-constraint led to better memory compared to higher-constraint) in measures of cued recall (McCurdy et al., 2017; McCurdy et al., Under Review) using similar procedures. Specifically, in Experiment 2 of McCurdy et al. (2017), we used precisely the same procedure as the present study, but without the older adult sample, and in McCurdy et al. (Under Review) we used the same procedure but gave the cued recall test in isolation (without the preceding item recognition and source recognition tests) to reduce the influence of a previous memory test on cued recall performance. Thus, we expected to find similar results in this study, at least with the younger adult sample. One possibility is that ceiling effects limited our ability to detect a statistically significant difference between lower- and higher-constraint generation in our younger adult data. Indeed, the younger adult sample showed high percentages of recall, and more than half (13 of 24) of the younger adults recalled 100 percent of items in the lower-constraint task. Furthermore, a Bayesian paired-samples t-test provided some evidence against the null hypothesis (BF01 = .451) in the younger adult sample, suggesting the data were roughly twice as likely under the alternative hypothesis as under the null.
Future work should continue to investigate the influence of generation constraint on associative memory in younger adults, perhaps using more stimuli to bring these effects off ceiling, to confirm the previous conjecture that lower-constraint generation improves associative memory in younger adults (McCurdy et al., 2017; McCurdy et al., Under Review). For older adults, we found no evidence of differences between lower- and higher-constraint generation on our associative memory measure. Similar to source memory, these findings provide evidence supporting the hypothesis that older adults' deficits in associative memory (Naveh-Benjamin, 2000; Naveh-Benjamin et al., 2007; Old & Naveh-Benjamin, 2008) may have attenuated the benefits of lower-constraint generation for associative memory we have seen in our prior work on younger adults (McCurdy et al., 2017; McCurdy et al., Under Review). Overall, in conjunction with the source memory findings, this study provides evidence supporting the hypothesis that older adults' deficits in context memory reduced their ability to experience the context memory benefits from lower-constraint generation we have seen in younger adults.
Despite these age-related differences in generation constraint, it is important to highlight the general finding that both younger and older adults showed better memory for generated materials (lower-constraint and higher-constraint) compared to read materials for all memory measures. This study provides more evidence that self-generation is a successful encoding strategy that can be used to support various aspects of memory (e.g., item, source, associative), even in older adults who show deficits for these types of memory (Balota et al., 2000; Craik, 2000; Naveh-Benjamin, 2000; Naveh-Benjamin et al., 2007; Old & Naveh-Benjamin, 2008; Park et al., 2002; Schacter et al., 1991; Spencer & Raz, 1995). Prior theoretical work on the generation effect, known as the two-factor theory, has proposed that self-generation improves memory by enhancing the distinctiveness of the generated target and by enhancing relational processing of the cue and target word and of other relational details associated with that target (e.g., source; Hirshman & Bjork, 1988). Our findings further support the view that the two-factor theory can account for the generation effect in both younger and older adults. Additionally, the demonstration of a robust generation effect for older adults in the present study adds to a larger body of work investigating ways to improve memory across the lifespan (Leach, McCurdy, Trumbo, Matzen, & Leshikar, 2018; Leshikar & Duarte, 2014; Leshikar, Dulas, et al., 2015; Leshikar, Park, & Gutchess, 2015).

Conclusion
Prior work on the generation effect has primarily shown that both younger and older adults benefit similarly from self-generation. In this study, we found further evidence that self-generation is a powerful mnemonic that can improve item, source, and associative memory in both age groups. We also found, however, that younger and older adults exhibit differential generation effects under different levels of generation constraint. In younger adults, the data suggest that fewer generation constraints increase the generation effect for source memory due to enhanced relational processing; however, older adults did not show the same effect, suggesting they did not experience enhanced relational processing from the lower-constraint task, which may reflect older adults' source memory deficits (Spencer & Raz, 1995).