Human long-term memory is improved both by emotional processing and by semantic elaboration. Previous psychological studies investigating interactions between emotion and memory have consistently reported that emotion-laden stimuli are remembered more accurately than emotionally neutral stimuli (Bradley, Greenwald, Petry, & Lang, 1992; Heuer & Reisberg, 1990; Kensinger & Corkin, 2003). This memory enhancement reflects stimulus-driven modulation of memory conveyed by emotional stimuli. Other psychological studies have shown that the encoding operation of semantic elaboration also has a beneficial effect on long-term memory (Craik & Lockhart, 1972; Craik & Tulving, 1975). In addition to being driven by stimuli, emotions can be generated intentionally by semantically interpreting neutral stimuli as emotional. Although the neural mechanisms associated with emotional processing generated by semantic elaboration have been reported in functional neuroimaging studies (Ochsner et al., 2009; Teasdale et al., 1999), evidence for interacting mechanisms between self-generated emotion and long-term memory is scarce. The current functional magnetic resonance imaging (fMRI) study investigated how encoding-related activation for emotionally neutral pictures is modulated by the semantic interpretation of emotional context, that is, by emotion generation.

One critical process in the generation of emotion is semantic elaboration. The enhancement of episodic memories by the semantic elaboration process during encoding is known as the levels-of-processing effect (Craik & Lockhart, 1972; Craik & Tulving, 1975), and the neural correlates of this effect have been identified in interacting mechanisms between the left inferior frontal gyrus, which is associated with semantic processing, and the hippocampus, which is associated with episodic memory processing (Fletcher, Stephenson, Carpenter, Donovan, & Bullmore, 2003; Fliessbach, Buerger, Trautner, Elger, & Weber, 2010; Kapur et al., 1994; Otten, Henson, & Rugg, 2001; Schott et al., 2013; Wagner et al., 1998). Other functional neuroimaging studies have linked left inferior frontal activation to cognitive reappraisal, in which emotional feeling is regulated by semantic elaboration such as rethinking the meanings of emotional stimuli (Buhle et al., 2014; Hayes et al., 2010; Ochsner, Bunge, Gross, & Gabrieli, 2002). Left inferior frontal activation has also been linked to self-generation, in which verbal information semantically paired to target words is self-generated (Vannest et al., 2012). Thus, memories encoded through self-generated semantic processing could be enhanced, and such enhancement could arise from interactions between the left inferior frontal cortex, which is associated with the self-generated semantic processing of stimuli, and the hippocampus, which is associated with episodic memory.

Another critical process in the generation of emotion is the processing of emotions induced by intentional processes of semantic elaboration. Emotion-related processing can be divided into two distinct processes (LeDoux, 1995; Ochsner et al., 2009; Scherer, Schorr, & Johnstone, 2001; Teasdale et al., 1999). The primary process of emotion refers to the quick, lower-level processing of emotional stimuli conveyed by sensory inputs (LeDoux, 1992, 1995; Ochsner et al., 2009) and contributes to better remembering of emotionally rich stimuli than emotionally neutral stimuli (Bradley et al., 1992; Heuer & Reisberg, 1990; Kensinger & Corkin, 2003). Functional neuroimaging studies have shown that interacting mechanisms between the amygdala, which is related to the processing of emotional stimuli, and the hippocampus could be involved in the enhancement of memories for emotion-laden stimuli (Hamann, 2001; LaBar & Cabeza, 2006; Phelps, 2004). In addition, other studies have reported that neutral items are remembered more accurately when the items are encoded in emotional context than in neutral context (Anderson, Wais, & Gabrieli, 2006; Dunsmoor, Martin, & LaBar, 2012; Erk et al., 2003; Erk, Martin, & Walter, 2005; Murty, Labar, & Adcock, 2012), and that this memory enhancement involves the amygdala and medial temporal lobe (MTL) structures (Erk et al., 2003, 2005; Murty et al., 2012). The secondary process of emotion reflects the self-generation of emotional feelings through higher-level cognitive appraisal based on stored knowledge (Leventhal & Scherer, 1987; Ochsner et al., 2009). Previous functional neuroimaging studies have identified dissociable activation between the primary and secondary processes of emotion, in which activation of the amygdala was associated with the processing of stimulus-driven emotions and activation of the dorsal medial prefrontal cortex (dmPFC) was associated with the processing of emotions generated by cognitive interpretation (Ochsner et al., 2009; Teasdale et al., 1999). Thus, emotions generated by semantic elaboration could modulate episodic memories through a network that includes the left inferior frontal region, which is involved in semantic elaboration; the dmPFC, which is related to the processing of self-generated emotion; and the hippocampus, which is related to episodic memory.

The underlying psychological and neural mechanisms that mediate the effects of semantic-emotion interactions on episodic memories have been examined in the context of stimulus-driven emotions (Dillon, Ritchey, Johnson, & LaBar, 2007; Jay, Caldwell-Harris, & King, 2008; Reber, Perrig, Flammer, & Walther, 1994; Ritchey, LaBar, & Cabeza, 2011). For example, emotion-related enhancement of episodic memories was not observed when emotional and neutral stimuli were deeply encoded by semantic processing, whereas better retrieval of emotional stimuli was identified only when the stimuli were encoded by shallow processing (Jay et al., 2008; Reber et al., 1994). In functional neuroimaging studies, encoding-related amygdala activation was increased during the encoding of emotional stimuli compared with that of neutral stimuli when the encoding operations were semantically shallow but not deep (Ritchey et al., 2011). These findings suggest that semantic elaboration could inhibit the memory-enhancing effect of stimulus-driven emotions and the associated amygdala activation. Memories encoded by semantic elaboration could be enhanced by interactions between semantic-related left inferior frontal and memory-related hippocampal activation, whereas memory enhancement of emotional stimuli by amygdala-hippocampal interactions could be inhibited by semantic elaboration. In contrast, given that neutral items encoded in emotional context are remembered more accurately than those encoded in neutral context (Anderson et al., 2006; Dunsmoor et al., 2012; Erk et al., 2003, 2005; Murty et al., 2012), emotional feelings generated internally by semantic elaboration could further increase memory performance and activations in the left inferior frontal gyrus and dmPFC related to the processing of semantic elaboration and self-generated emotion. However, little is known about the neural mechanisms underlying an interaction between memory and self-generated emotion.

To investigate the neural mechanisms of how memory-related activations are affected by emotions generated by semantic elaboration, we performed an event-related fMRI experiment during the encoding of emotionally neutral pictures. The design of the present study is summarized in Fig. 1. During encoding with fMRI scanning, healthy young participants were presented with emotionally neutral target pictures, each preceded by a cue word (emotionally negative, positive, or neutral), and were required to apply to the target pictures the emotional feelings generated by semantic elaboration, that is, by imagining stories associated with the cue words. During retrieval without fMRI scanning, participants were presented one at a time with neutral pictures that included target (old) and distracter (new) stimuli and judged whether each picture had been previously learned.

Fig. 1

An example of encoding trials. During encoding, participants were presented with a cue word and asked to imagine a background story associated with the target picture prompted by that cue word. Cue words were categorized into three types of emotion (negative, positive, and neutral valence), and all target pictures were emotionally neutral. After the presentation of each target picture, participants rated their subjective emotional valence and the semantic depth of encoding for that picture. In the encoding phase, participants were not told that they would later be tested on their memory of the target pictures. The example picture shown in this figure was not included in the stimulus sets of our experiment, and all verbal labels were actually presented in Japanese; the picture and English labels are shown here for illustration purposes only.

Based on previous research, we made three predictions for the present study. First, memories for neutral pictures would be enhanced by semantic elaboration during the active generation of emotions and further enhanced by the emotions conveyed by that elaboration. Second, the left inferior frontal cortex and dmPFC would be involved in the generation of emotions by semantic elaboration, in which participants imagined background stories linking emotional or neutral cue words to the neutral target pictures. Third, memories for neutral items encoded with emotional feelings generated by semantic elaboration would be modulated by a neural network including the left inferior frontal gyrus, related to semantic elaboration; the dmPFC, related to the emotion generated by semantic elaboration; and the memory-related hippocampus.

Method

Participants

The study participants included 28 right-handed, college-aged adults who were recruited from the Kyoto University community and paid for their participation. Participants were healthy native Japanese speakers with no history of neurological or psychiatric disease. The data from five participants were excluded from the behavioral and fMRI analyses because three of these participants were suspected of having slept during fMRI scanning, one was absent from the retrieval phase conducted 1 week after encoding, and one showed incidental pathological changes in brain structure on MRI. Thus, behavioral and MRI data from a total of 23 participants (nine women and 14 men) with an average age of 21.2 years (SD = 1.53) were analyzed in this study. All participants gave informed consent to participate in the study. The protocol was approved by the Institutional Review Board (IRB) of the Graduate School of Human and Environmental Studies, Kyoto University (24-H-8).

Stimuli

A total of 264 emotionally neutral pictures were selected from the International Affective Picture System (IAPS; Lang, Bradley, & Cuthbert, 1999), pictures taken by the authors, and free materials from websites. These pictures were divided into 88 distracter stimuli and 176 target stimuli, and the target stimuli were further divided into four lists of 44 pictures each. To control the assignment of the pictures to the five lists of target and distracter stimuli, the pictures were rated by ten young adults (five women and five men, ranging in age from 18 to 25 years) who were recruited from the Kyoto University community and who did not participate in the fMRI study. Ratings were assigned in terms of emotional valence (1: unpleasant−9: pleasant), emotional arousal (1: calm−9: exciting), presence of people (1: no people−9: people), and complexity (1: simple−9: complex). The mean rating scores for the five lists were compared by one-way analyses of variance (ANOVAs). The results showed no significant differences in rating scores among the five lists in terms of emotional valence [F(4,259) = 1.48, p = .21, η² = .02], emotional arousal [F(4,259) = 1.11, p = .35, η² = .02], presence of people [F(4,259) = 0.47, p = .76, η² = .01], or complexity [F(4,259) = 0.70, p = .59, η² = .01].
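For illustration, the list-equivalence check described above can be reproduced as a one-way ANOVA across the five picture lists. The Python sketch below uses placeholder ratings and illustrative variable names rather than the original rating data.

```python
# Sketch of the list-equivalence check: one-way ANOVA across the five picture lists
# (four target lists of 44 pictures and one distracter list of 88 pictures).
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Placeholder mean valence ratings on the 1-9 scale, one value per picture.
lists = [rng.normal(5.0, 0.8, n) for n in (44, 44, 44, 44, 88)]

f_val, p_val = stats.f_oneway(*lists)            # one-way ANOVA across the five lists

all_scores = np.concatenate(lists)
grand_mean = all_scores.mean()
ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in lists)
ss_total = ((all_scores - grand_mean) ** 2).sum()
eta_sq = ss_between / ss_total                   # eta-squared effect size
print(f"F(4, 259) = {f_val:.2f}, p = {p_val:.3f}, eta^2 = {eta_sq:.2f}")
```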

In addition, we prepared 132 Japanese words chosen from a database of two-character Japanese kanji words (Gotoh & Ohta, 2001). The words in this database have scores for frequency, imagery, ease of learning, and emotional valence. We then divided the words into three lists of 44 words (negative, positive, and neutral words) according to their emotional valence scores. To confirm the categorization of the words into the three lists, each word was rated by the same ten young adults who rated the neutral pictures. An ANOVA on the emotional valence scores showed a significant effect of word list [F(2,129) = 1252.8, p < .01, η² = .95], in which valence scores for negative words were significantly lower than those for positive and neutral words (p < .01), and valence scores for neutral words were lower than those for positive words (p < .01). The ANOVAs for the other scores showed no significant effect of word list: frequency [F(2,129) = 2.26, p = .11, η² = .03], imagery [F(2,129) = 1.23, p = .30, η² = .02], and ease of learning [F(2,129) = 1.55, p = .22, η² = .02]. Three of the four lists of target pictures were each combined with a list of negative, positive, or neutral words. Target pictures in the remaining list were combined with the word "Nothing." All pictures were randomly paired with words by the experimenters. The combinations of picture and word lists were counterbalanced across participants.
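The text states only that picture-word list combinations were counterbalanced across participants. A Latin-square rotation, sketched below with hypothetical list names, is one standard way such counterbalancing could be implemented; it is shown as an assumption, not as the scheme actually used.

```python
# Illustrative counterbalancing of the four picture lists with the four cue types
# (Negative, Positive, Neutral, and "Nothing"/View) via a Latin-square rotation.
CUE_TYPES = ["Negative", "Positive", "Neutral", "Nothing"]

def assignment_for(participant_idx: int) -> dict[str, str]:
    """Rotate picture lists across cue types so each pairing occurs equally often."""
    shift = participant_idx % 4
    return {f"picture_list_{i + 1}": CUE_TYPES[(i + shift) % 4] for i in range(4)}

for p in range(4):
    print(p, assignment_for(p))
```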

Experimental procedures

All participants performed both encoding and retrieval tasks, and brain activation was measured only in the encoding phase using the event-related fMRI method. Participants were not informed during the encoding phase that they would be tested on the retrieval of target pictures (incidental encoding).

An example of encoding trials is illustrated in Fig. 1. During encoding, participants were first presented with a cue word for 2 s and were then shown a target picture for 6 s. Participants were instructed to use one of four strategies when viewing each target picture, and the strategy to use was indicated by the preceding cue word. When a cue word with an emotionally positive, neutral, or negative meaning (e.g., "Success," "Preparation," or "Explosion," respectively) was presented before the target picture, participants were required to imagine a meaningful story for the target picture that was associated with the cue word. When the cue word "Nothing" was presented, participants simply viewed the target picture passively without additional thinking. These four strategies for viewing the pictures were defined as the "Negative," "Positive," "Neutral," and "View" encoding conditions. Each encoding strategy was carefully explained to the participants with examples, and the experimenters confirmed that all participants fully understood the differences among the strategies.

After each target picture was presented, subjective rating scales of emotional valence and semantic depth were displayed on the screen for 3.5 s each. In the emotional valence rating, participants rated whether the stories they had imagined in response to the cue words were emotionally pleasant or unpleasant. In the semantic depth rating, participants rated how deeply the meaning of the cue word was associated with the target picture. Participants indicated their ratings by pressing one of eight buttons (1: unpleasant−8: pleasant for the valence rating, and 1: shallow−8: deep for the semantic depth rating). Encoding trials were separated by fixation intervals of variable length (jittered from 1 to 7 s). Each encoding run comprised 44 trials (11 each in the Negative, Positive, Neutral, and View conditions), and four successive runs were performed; thus, each participant completed 176 encoding trials. In each run, the order of stimulus presentation was randomized across participants.
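As a rough illustration of this trial structure, the following sketch builds one randomized encoding run with the timing given above. The uniform 1-s-step jitter distribution is an assumption made for illustration only.

```python
# Sketch of one encoding run: 44 trials (11 per condition), randomized order,
# cue 2 s, picture 6 s, two 3.5-s rating screens, and a jittered 1-7 s fixation.
import random

CONDITIONS = ["Negative", "Positive", "Neutral", "View"]
CUE_S, PICTURE_S, RATING_S = 2.0, 6.0, 3.5

def build_encoding_run(seed: int = 0) -> list[dict]:
    rng = random.Random(seed)
    trials = [cond for cond in CONDITIONS for _ in range(11)]
    rng.shuffle(trials)
    onset, run = 0.0, []
    for cond in trials:
        run.append({"condition": cond, "cue_onset": onset,
                    "picture_onset": onset + CUE_S})
        onset += CUE_S + PICTURE_S + 2 * RATING_S      # cue + picture + two ratings
        onset += rng.choice([1, 2, 3, 4, 5, 6, 7])     # jittered fixation interval
    return run

run = build_encoding_run()
print(len(run), run[0])
```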

One week after encoding, participants performed the recognition test on a Windows PC (outside of the fMRI scanner). During the recognition task, participants were presented with the target and distracter pictures in random order and were required to judge whether each picture was old (target) or new (distracter) at one of two levels of confidence. Participants replied with one of the following four response options: "definitely old (DO)," "probably old (PO)," "probably new (PN)," or "definitely new (DN)," choosing a response by pressing one of four buttons as quickly as possible. Each picture was presented for 3.5 s and was followed by a variable (0.5−6.5 s) fixation interval. One retrieval run included 44 target stimuli (11 for each encoding condition) and 22 distracter stimuli, and four such runs were performed with different sets of stimuli. Participants therefore made old or new judgments for 176 target and 88 distracter stimuli across the four retrieval runs.
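A minimal scoring sketch for this recognition procedure is shown below: the four confidence responses are collapsed into old/new judgments, and hit rate and d-prime are computed per encoding condition with standard signal-detection formulas. The column names and the correction for extreme rates are assumptions, not taken from the original analysis scripts.

```python
# Collapse DO/PO into "old" responses, then compute d-prime per encoding condition.
import pandas as pd
from scipy.stats import norm

OLD_RESPONSES = {"DO", "PO"}   # "definitely old", "probably old"

def dprime(hits: int, misses: int, fas: int, crs: int) -> float:
    # Log-linear correction avoids infinite z-scores for perfect rates (assumption).
    hit_rate = (hits + 0.5) / (hits + misses + 1)
    fa_rate = (fas + 0.5) / (fas + crs + 1)
    return norm.ppf(hit_rate) - norm.ppf(fa_rate)

def score(df: pd.DataFrame) -> pd.Series:
    """df columns (assumed): 'condition' (Negative/Positive/Neutral/View or
    'distracter'), 'is_target' (bool), 'response' (DO/PO/PN/DN)."""
    said_old = df["response"].isin(OLD_RESPONSES)
    fas = int((said_old & ~df["is_target"]).sum())       # false alarms to distracters
    crs = int((~said_old & ~df["is_target"]).sum())      # correct rejections
    out = {}
    for cond, grp in df[df["is_target"]].groupby("condition"):
        hits = int(said_old[grp.index].sum())
        out[cond] = dprime(hits, len(grp) - hits, fas, crs)
    return pd.Series(out, name="d_prime")
```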

MRI acquisition

All MRI data were acquired using a Siemens MAGNETOM Verio 3T scanner located in the Kokoro Research Center, Kyoto University. The timing of stimulus presentation and the recording of behavioral responses were controlled by MATLAB® programs (www.mathworks.com) running on a Windows PC. All stimuli were presented visually through a projector and back-projected onto a screen, and participants viewed the stimuli via a mirror attached to the head coil of the scanner. Behavioral responses were recorded using an eight-button fiber-optic response device (Current Designs, Inc., Philadelphia, PA, USA) comprising two boxes, each with four buttons; one four-button box was assigned to each hand. Scanner noise was reduced with earplugs, and head motion was minimized with foam pads.

MRI scanning began with a T1-weighted sagittal localizer scan. Next, echo-planar functional images (EPIs), sensitive to blood-oxygenation-level-dependent (BOLD) contrast, were acquired for functional scanning (TR = 2 s, TE = 25 ms, flip angle = 75°, FOV = 22.4 × 22.4 cm, matrix size = 64 × 64, slice thickness/gap = 3.5/0 mm, voxel size = 3.5 × 3.5 × 3.5 mm, 39 horizontal slices). Finally, high-resolution T1-weighted structural images were collected (MPRAGE, TR = 2.25 s, TE = 3.51 ms, FOV = 25.6 × 25.6 cm, matrix size = 240 × 240, slice thickness/gap = 1.0/0 mm, 208 horizontal slices).

Data analyses

MRI data were preprocessed and statistically analyzed using Statistical Parametric Mapping 8 (SPM8; Wellcome Department of Cognitive Neurology, London, UK) implemented in MATLAB®. During preprocessing, the first four functional volumes were discarded, and the remaining functional images were corrected for slice timing and head motion, spatially normalized to the Montreal Neurological Institute (MNI) template (resampled voxel size = 3.5 × 3.5 × 3.5 mm), and then spatially smoothed using a Gaussian kernel with a full width at half maximum (FWHM) of 8 mm.
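The 8-mm FWHM smoothing step corresponds to a Gaussian kernel with standard deviation FWHM / (2√(2 ln 2)) ≈ FWHM / 2.355. The short sketch below shows this conversion for the 3.5-mm voxels used here, as an illustration of the arithmetic rather than a substitute for SPM's smoothing routine.

```python
# Convert an 8-mm FWHM to a Gaussian sigma in voxel units and smooth one volume.
import numpy as np
from scipy.ndimage import gaussian_filter

FWHM_MM, VOXEL_MM = 8.0, 3.5
sigma_vox = FWHM_MM / (2.0 * np.sqrt(2.0 * np.log(2.0))) / VOXEL_MM  # ~0.97 voxels

volume = np.random.default_rng(0).normal(size=(64, 64, 39))  # placeholder EPI volume
smoothed = gaussian_filter(volume, sigma=sigma_vox)
print(f"sigma = {sigma_vox:.2f} voxels")
```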

Preprocessed images were analyzed in two steps, at the individual level and at the group level. In the individual-level (fixed-effect) analyses, trial-related activations were modeled by convolving a vector of onsets with a canonical hemodynamic response function (HRF) in the context of the general linear model (GLM). To identify activations related to the processing of emotion generated by semantic elaboration, trial onsets were set to the appearance of the target pictures after the cue presentation, and the duration of each event was modeled as 6 s. The rationale for using a 6-s duration for each event was to separate the transient activation of stimulus-driven emotion elicited by the cue words from the sustained activation of self-generated emotion for the target pictures. This manipulation thus enabled us to identify activation related to the processing of emotions generated for the target pictures.
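The sketch below illustrates how such an event regressor can be constructed by convolving a boxcar of 6-s events with a canonical double-gamma HRF, the same logic SPM applies internally; the onset times and HRF parameters are illustrative assumptions.

```python
# Build one GLM column: 6-s boxcar events convolved with a double-gamma HRF.
import numpy as np
from scipy.stats import gamma

TR, N_SCANS, DURATION_S = 2.0, 300, 6.0
dt = 0.1                                               # high-resolution time grid (s)

def canonical_hrf(t):
    # Standard double-gamma shape: positive peak ~5 s, undershoot ~15 s.
    return gamma.pdf(t, 6) - gamma.pdf(t, 16) / 6.0

onsets = np.array([10.0, 40.0, 75.0])                  # picture onsets (s), placeholder
t_hr = np.arange(0, N_SCANS * TR, dt)
boxcar = np.zeros_like(t_hr)
for onset in onsets:
    boxcar[(t_hr >= onset) & (t_hr < onset + DURATION_S)] = 1.0

hrf = canonical_hrf(np.arange(0, 32, dt))
regressor_hr = np.convolve(boxcar, hrf)[: t_hr.size] * dt
regressor = regressor_hr[:: int(round(TR / dt))]       # downsample to one value per TR
print(regressor.shape)                                 # (300,) -> one column of the GLM
```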

Following the subsequent memory paradigm (Paller & Wagner, 2002), all encoding trials were divided into two categories: subsequent hits collapsed across high- and low-confidence responses (Hit) and subsequent misses collapsed across high- and low-confidence responses (Miss). The Hit and Miss trials were further subdivided according to the encoding conditions (Negative, Positive, Neutral, and View). Thus, trial-related activations were modeled with eight conditions defined by the two factors of encoding condition and subsequent memory (Negative-Hit, Positive-Hit, Neutral-Hit, View-Hit, Negative-Miss, Positive-Miss, Neutral-Miss, and View-Miss), plus one no-response condition. Confounding factors (head motion and magnetic field drift) were included in this model. Significant activations were identified by comparing trial-related activations in each condition with baseline activations related to visual fixation. The contrasts reflecting condition-related activations yielded a t statistic at each voxel.
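As an illustration of this binning step, the sketch below labels each encoding trial by its encoding condition and subsequent-memory status; the data-frame columns are hypothetical.

```python
# Label each encoding trial by condition x subsequent memory (Hit vs. Miss).
import pandas as pd

def label_trials(encoding: pd.DataFrame, retrieval: pd.DataFrame) -> pd.DataFrame:
    """encoding: one row per encoding trial with 'picture_id' and 'condition';
    retrieval: one row per old/new judgment with 'picture_id' and 'response'."""
    hit_ids = set(retrieval.loc[retrieval["response"].isin({"DO", "PO"}), "picture_id"])
    labeled = encoding.copy()
    labeled["memory"] = labeled["picture_id"].map(
        lambda pid: "Hit" if pid in hit_ids else "Miss")
    labeled["regressor"] = labeled["condition"] + "-" + labeled["memory"]
    return labeled  # e.g., "Negative-Hit", "Neutral-Miss", ...
```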

In the group-level (random-effect) analyses, condition-related contrast images identified in the individual-level analyses were analyzed by a two-way repeated-measures ANOVA with the factors of encoding condition (Negative, Positive, Neutral, and View) and subsequent memory (Hit and Miss). The 4 × 2 ANOVA was performed using a flexible factorial design in SPM8, and the factor of subject was included in this model. Four types of statistical analysis were performed with this ANOVA model. First, to identify regions related to semantic elaboration during encoding, an F-contrast of encoding condition was inclusively masked by three t-contrasts of Negative versus View, Positive versus View, and Neutral versus View. This procedure yielded an activation map fulfilling a significant main effect of encoding condition, with greater activations in Negative, Positive, and Neutral than in View. Second, to identify regions related to the processing of emotions generated by semantic elaboration, an F-contrast of encoding condition was inclusively masked by two t-contrasts of Negative versus Neutral and Positive versus Neutral. This procedure yielded an activation map showing a significant main effect of encoding condition, with greater activations in Negative and Positive than in Neutral. Third, successful encoding activations were identified in an F-contrast of subsequent memory masked inclusively by a t-contrast of Hit versus Miss. Finally, a significant interaction between the two factors of encoding condition and subsequent memory was analyzed in an F-contrast reflecting the interaction. In all analyses conducted with this ANOVA, the statistical thresholds at the voxel level were set at p < .001 and corrected for whole-brain multiple comparisons at the cluster level (FWE, p < .05) with a minimum cluster size of 9 voxels. Anatomical sites showing significant activations were primarily defined by the SPM Anatomy Toolbox (Eickhoff, Heim, Zilles, & Amunts, 2006; Eickhoff et al., 2007; Eickhoff et al., 2005).
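The inclusive-masking step can be thought of as a logical AND between the thresholded F-map and each thresholded t-map, as in the conceptual sketch below; thresholds and array shapes are placeholders, and SPM performs this operation on its own statistical maps.

```python
# Conceptual inclusive masking: a voxel survives only if it is significant in the
# F-contrast AND in every masking t-contrast (e.g., Negative > View, etc.).
import numpy as np

def inclusive_mask(f_map: np.ndarray, t_maps: list[np.ndarray],
                   f_thresh: float, t_thresh: float) -> np.ndarray:
    """Return a boolean map of voxels passing the F test and every t-contrast mask."""
    surviving = f_map > f_thresh
    for t_map in t_maps:
        surviving &= t_map > t_thresh
    return surviving

rng = np.random.default_rng(0)
shape = (53, 63, 46)                      # arbitrary normalized-volume grid
mask = inclusive_mask(rng.normal(size=shape) + 3,
                      [rng.normal(size=shape) + 3 for _ in range(3)],
                      f_thresh=5.0, t_thresh=3.1)
print(mask.sum(), "voxels survive")
```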

To investigate how the left inferior frontal region interacted with other brain regions, two functional connectivity analyses were performed using a generalized form of context-dependent psychophysiological interactions (gPPI; McLaren, Ries, Xu, & Johnson, 2012). Before performing the gPPI analyses, the four encoding runs were collapsed into one run; in addition, all Hit and Miss trials were collapsed for the second gPPI analysis. From the fMRI data prepared in this way, new one-run GLMs were constructed at the individual level (fixed-effect analyses), including eight experimental conditions (Negative-Hit, Positive-Hit, Neutral-Hit, View-Hit, Negative-Miss, Positive-Miss, Neutral-Miss, and View-Miss) and one no-response condition for the first gPPI analysis, or four experimental conditions (Negative, Positive, Neutral, and View) and one no-response condition for the second gPPI analysis. In these models, the left inferior frontal seed was defined for each participant as a spherical volume of interest (VOI) with a 6-mm radius centered on the peak voxel (-50, 25, 13), which showed significant activation in the contrasts of Negative versus View, Positive versus View, and Neutral versus View (main effect of encoding condition in the ANOVA model). The VOIs defined in each participant were masked by a left inferior frontal region of interest (ROI; opercular, triangular, and orbital parts) constructed with the WFU PickAtlas (www.fmri.wfubmc.edu) and the AAL ROI package (Tzourio-Mazoyer et al., 2002).

In the present study, we employed the gPPI toolbox (www.nitrc.org/projects/gppi). In the individual-level (fixed-effect) analyses, the toolbox creates a design matrix with three sets of columns: (1) condition regressors formed by convolving a vector of condition-related onsets with a canonical HRF; (2) the BOLD signal deconvolved from the seed region; and (3) PPI regressors formed as the interaction between the first and second sets of regressors at the individual level. Thus, in the first pattern of GLMs, 19 regressors were modeled: PPI regressors for each of the eight experimental conditions (Negative-Hit, Positive-Hit, Neutral-Hit, View-Hit, Negative-Miss, Positive-Miss, Neutral-Miss, and View-Miss) and the no-response condition (nine regressors); condition regressors for the same nine conditions (nine regressors); and the BOLD signal from the left inferior frontal seed VOI (one regressor). In the second pattern of GLMs, 11 regressors were modeled: PPI regressors for the four experimental conditions (Negative, Positive, Neutral, and View) and the no-response condition (five regressors); condition regressors for the same five conditions (five regressors); and the BOLD signal from the left inferior frontal seed VOI (one regressor). Six motion-related regressors were also included in these GLMs. After the creation of these models, the gPPI toolbox estimated the model parameters and computed linear contrasts. Regions showing a significant effect of the PPI regressor contrasts at the statistical threshold were considered to be functionally connected with the left inferior frontal seed. The PPI regressor contrasts were then entered into the group-level (random-effect) analyses.
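Conceptually, each PPI regressor is the product of the seed time course and a condition regressor, as in the simplified sketch below. The gPPI toolbox additionally deconvolves the seed BOLD signal to a neuronal-level estimate before multiplying and reconvolves the product with the HRF; that step is omitted here for brevity, and all variable names are placeholders.

```python
# Simplified PPI regressor: (mean-centered seed signal) x (condition vector).
import numpy as np

def ppi_regressor(seed_ts: np.ndarray, condition_box: np.ndarray) -> np.ndarray:
    """Element-wise product of the mean-centered seed signal and a condition vector."""
    seed_centered = seed_ts - seed_ts.mean()
    return seed_centered * condition_box

n_scans = 300
rng = np.random.default_rng(1)
seed_ts = rng.normal(size=n_scans)                        # seed time course (placeholder)
negative_hit = (rng.random(n_scans) < 0.1).astype(float)  # toy condition vector
design_columns = {
    "PPI_Negative-Hit": ppi_regressor(seed_ts, negative_hit),
    "COND_Negative-Hit": negative_hit,
    "SEED": seed_ts,
}
print({name: col.shape for name, col in design_columns.items()})
```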

In the first gPPI analysis at the group level, the eight PPI regressor contrasts (Negative-Hit, Positive-Hit, Neutral-Hit, View-Hit, Negative-Miss, Positive-Miss, Neutral-Miss, and View-Miss) identified in the first pattern of GLMs were analyzed by a two-way repeated-measures ANOVA with the factors of encoding condition (Negative, Positive, Neutral, and View) and subsequent memory (Hit and Miss). The 4 × 2 ANOVA was modeled by the same procedures described for the prior ANOVA analysis. To investigate whether functional connectivity patterns with the left inferior frontal seed were modulated by these factors, we analyzed F-contrasts reflecting significant main effects of each factor and a significant interaction between the two factors. In the second gPPI analysis at the group level, we analyzed the four PPI regressor contrasts (Negative, Positive, Neutral, and View) identified in the second pattern of GLMs. In this analysis, the contrast images identified in each participant were first analyzed by one-sample t-tests. Second, to identify the functional connectivity patterns modulated by subsequent memory performance, each participant's d-prime was included as a covariate in the one-sample t-test model of the PPI regressor contrasts. This analysis enabled us to find the functional connectivity patterns predicting subsequent memory performance in each encoding condition.
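The covariate analysis amounts to relating, across participants, the PPI contrast value at each voxel to that participant's d-prime. The vectorized sketch below illustrates this with simulated data; SPM fits the equivalent model as a one-sample t-test with d-prime as a covariate, and the corrected thresholds described below would then be applied.

```python
# Per-voxel correlation between participants' PPI contrast values and d-primes.
import numpy as np
from scipy import stats

n_subj, n_vox = 23, 5000
rng = np.random.default_rng(2)
ppi_betas = rng.normal(size=(n_subj, n_vox))   # per-subject PPI contrast images (flattened)
d_primes = rng.normal(1.0, 0.4, n_subj)        # per-subject d-prime in one condition

z_beta = stats.zscore(ppi_betas, axis=0, ddof=1)
z_dprime = stats.zscore(d_primes, ddof=1)
r_map = z_dprime @ z_beta / (n_subj - 1)       # Pearson r per voxel
t_map = r_map * np.sqrt((n_subj - 2) / (1 - r_map ** 2))
p_map = 2 * stats.t.sf(np.abs(t_map), df=n_subj - 2)
print((p_map < 0.001).sum(), "voxels at p < .001 (uncorrected)")
```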

In both gPPI analyses, the statistical thresholds at the voxel level were set at p < .001 and corrected for whole-brain multiple comparisons at the cluster level (FWE, p < .05) with a minimum cluster size of 9 voxels. In addition, based on our hypotheses, the small volume correction (SVC) method (Worsley et al., 1996) was employed for an ROI in the dmPFC, which was defined by the bilateral superior medial frontal gyri. An MTL ROI defined by the bilateral hippocampi and parahippocampal gyri was also analyzed by the SVC method, because successful encoding activations have been identified in the MTL region in previous functional neuroimaging studies (Davachi, 2006; Murty, Ritchey, Adcock, & LaBar, 2010; Rugg et al., 2012; Wais, 2008). These ROIs were defined with the WFU PickAtlas (www.fmri.wfubmc.edu) and the AAL ROI package (Tzourio-Mazoyer et al., 2002). For the ROI analyses, the statistical threshold at the voxel level was set at p < .001 and corrected for multiple comparisons within the defined ROIs at the cluster level (FWE, p < .05). Anatomical sites showing significant functional connectivity were primarily defined by the SPM Anatomy Toolbox (Eickhoff et al., 2006; Eickhoff et al., 2007; Eickhoff et al., 2005).

Results

Behavioral results

Consistent with our first prediction, neutral pictures were recognized more accurately when they had been encoded by semantic elaboration than by passive viewing. In addition, the memory enhancement by semantic elaboration was more pronounced when the generated emotional feeling was negative or positive. As shown in Fig. 2, a one-way repeated-measures ANOVA on hit rates showed a significant effect of encoding condition [F(3,66) = 4.02, p < .05, η² = .15], in which hit rates in Negative (p < .01), Positive (p < .01), and Neutral (p < .05) were significantly higher than those in View. However, there were no significant differences in hit rates between Negative or Positive and Neutral, or between Negative and Positive. In an ANOVA on d-primes, the effect of encoding condition was significant [F(3,66) = 3.52, p < .05, η² = .14], and post-hoc tests showed significant differences between Negative and View (p < .01) and between Positive and View (p < .05). However, there were no significant differences in d-primes between Neutral and View, between Negative or Positive and Neutral, or between Negative and Positive.
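For reference, the sketch below shows how such a one-way repeated-measures ANOVA on per-participant hit rates could be run with statsmodels' AnovaRM; the table layout and simulated values are illustrative only, not the study data.

```python
# One-way repeated-measures ANOVA on hit rates across the four encoding conditions.
import numpy as np
import pandas as pd
from statsmodels.stats.anova import AnovaRM

rng = np.random.default_rng(3)
conditions = ["Negative", "Positive", "Neutral", "View"]
rows = [{"subject": s, "condition": c,
         "hit_rate": float(np.clip(rng.normal(0.75 if c != "View" else 0.68, 0.08), 0, 1))}
        for s in range(23) for c in conditions]
data = pd.DataFrame(rows)

res = AnovaRM(data, depvar="hit_rate", subject="subject", within=["condition"]).fit()
print(res.anova_table)   # F(3, 66) for the effect of encoding condition
```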

Fig. 2

Results of retrieval accuracy and subjective rating scores of emotional valence and semantic depth during encoding. (a) Hit rates [hits/(hits + misses)] calculated for each encoding condition. Hit rates for target pictures encoded by imagining background stories were significantly higher than those for target pictures encoded by simply viewing the pictures. (b) Scores of d-prime calculated for each encoding condition. The d-primes were significantly higher for target pictures encoded by imagining emotional stories than for target pictures encoded by simple viewing. (c) Rating scores of emotional valence for target pictures, calculated for each encoding condition. The results confirmed that emotional feelings for target pictures were appropriately generated by cue words in both Positive and Negative. (d) Rating scores of semantic depth for target pictures, calculated for each encoding condition. The results confirmed that target pictures encoded with cue words were processed more deeply through the semantic processing of background stories than those without cue words. Error bars represent standard errors. **p < .01, *p < .05

The emotional valence ratings demonstrated that emotional feelings were appropriately induced by the cue words in both the Positive and Negative conditions (see Fig. 2). A repeated-measures ANOVA on the emotional valence scores of target pictures showed a significant effect of encoding condition [F(3,66) = 172.0, p < .01, η² = .89], in which valence scores in Negative were significantly lower than those in the other conditions (p < .01 for all contrasts), and valence scores in Positive were significantly higher than those in the other conditions (p < .01 for all contrasts). The semantic depth ratings showed that target pictures encoded through semantic elaboration were processed more deeply than those viewed passively (see Fig. 2). A repeated-measures ANOVA on the semantic depth ratings revealed a significant effect of encoding condition [F(3,66) = 177.8, p < .01, η² = .89]. Post-hoc tests revealed that semantic depth ratings in Negative, Positive, and Neutral were higher than those in View (p < .01 for all contrasts), and that ratings in Negative were higher than those in Neutral (p < .01).

Regarding response time (RT) data (in ms) during retrieval, a repeated-measures ANOVA with the factors of encoding condition (Negative, Positive, Neutral, and View) and memory (Hit and Miss) demonstrated no significant main effect of encoding condition [F(3,66) = 0.70, p = .56, ηp² = .03] or memory [F(1,22) = 0.45, p = .51, ηp² = .02], and no significant interaction between these factors [F(3,66) = 1.15, p = .34, ηp² = .05]. Details of the behavioral results are summarized in Table 1.

Table 1 Behavioral results

fMRI results

Confirming our second prediction, activations in the left inferior frontal gyrus and dmPFC were associated both with semantic elaboration while imagining background stories for the target pictures and with the processing of emotional feelings generated by that elaboration.

To analyze the fMRI data, a two-way repeated-measures ANOVA with the factors of encoding condition and subsequent memory was performed. Details of these activation patterns are summarized in Table 2. As illustrated in Fig. 3, activations in the left inferior frontal gyrus and dmPFC showed a significant main effect of encoding condition, being significantly enhanced in Negative, Positive, and Neutral, which required participants to imagine stories about the target pictures according to the cue words, compared with View, in which participants passively viewed the target pictures without additional thinking. Other regions showing the same pattern of activation were identified in the bilateral middle temporal gyri, middle occipital gyri, left fusiform gyrus, right inferior frontal gyrus, precentral gyrus, angular gyrus, and cerebellum. In addition, activations in the left inferior frontal gyrus and dmPFC were significantly greater in the emotion-related conditions of Negative and Positive than in the non-emotional condition of Neutral. Greater activations during the processing of emotions generated by semantic elaboration for the target pictures were also found in the right inferior frontal gyrus, middle temporal gyrus, and left temporal pole. Given that the left inferior frontal and dmPFC activations were also identified in the contrasts of Negative, Positive, and Neutral versus View, activation in both regions involved in semantic elaboration could be further enhanced by the processing of emotional feelings generated by that elaboration.

Table 2 Regions showing significant activations in an ANOVA with factors of encoding condition and subsequent memory
Fig. 3

Results of the ANOVA for fMRI data. (a) Activation in the left inferior frontal gyrus identified by the main effect of encoding condition, masked inclusively by the contrasts of Negative versus View, Positive versus View, and Neutral versus View. (b) Activation in the left dmPFC region identified by the main effect of encoding condition, masked inclusively by the contrasts of Negative versus Neutral and Positive versus Neutral. Parameter estimates in the graphs were extracted from the peak voxel of each region showing significant activation, and mean values were computed for each condition. Error bars represent standard errors

In the ANOVA of the fMRI data, we also investigated activations associated with the successful encoding of target pictures by comparing Hit and Miss trials. However, no region showed significant activation related to the successful encoding process. In addition, no region showed a significant interaction between the factors of encoding condition and subsequent memory.

Confirming our third prediction, encoding-related functional connectivity of the left inferior frontal seed with the dmPFC and hippocampus predicted the subsequent retrieval accuracy of neutral pictures encoded with negative and positive emotions generated by semantic elaboration. By contrast, the retrieval accuracy of neutral pictures encoded in the Neutral condition was predicted by functional connectivity between the left inferior frontal gyrus and the hippocampus, but not between the left inferior frontal gyrus and the dmPFC.

In the first gPPI analysis, we investigated whether functional connectivity patterns with the left inferior frontal seed were modulated by the two factors of encoding condition and subsequent memory using the 4 × 2 ANOVA model. No region showed functional connectivity with a significant main effect of encoding condition or subsequent memory, and no region showed a significant interaction between these factors. The second gPPI analysis with the left inferior frontal seed demonstrated that functional connectivity of the seed with the dmPFC and hippocampus was significantly correlated with individual d-primes in Negative and Positive, whereas in Neutral, connectivity correlated with individual d-primes was identified only in the hippocampus (see Fig. 4). In View, no region showed significant functional connectivity correlated with individual d-primes. The detailed results of the second gPPI analysis are summarized in Table 3.

Fig. 4

Brain regions reflecting functional connectivity with left inferior frontal seed modulated by individual scores of d-primes. (a) Functional connectivity in Negative. Functional connectivity between the left inferior frontal seed and the dmPFC/hippocampal regions showed significant correlations with d-primes in the Negative condition. (b) Functional connectivity in Positive. Functional connectivity between the left inferior frontal seed and the dmPFC/hippocampal regions showed significant correlations with d-primes in the Positive condition. (c) Functional connectivity in Neutral. Functional connectivity between the left inferior frontal seed and the hippocampus showed a significant correlation with d-primes in the Neutral condition

Table 3 Regions modulated by correlations between functional connectivity with the left inferior frontal gyrus and subsequent retrieval performance in each encoding condition

Discussion

Three major findings emerged from the present study. First, hit rates for pictures encoded by a strategy of semantic elaboration were significantly higher than those for pictures encoded by passive viewing. In addition, d-prime scores for memories encoded with emotional feelings generated by semantic elaboration were significantly enhanced compared with those encoded without semantic elaboration. Second, activations in the left inferior frontal gyrus and dmPFC were enhanced by the process of semantic elaboration, and these activations increased further during the processing of emotions generated by semantic elaboration. Third, functional connectivity between the left inferior frontal gyrus and the dmPFC/hippocampus during the encoding of neutral pictures with emotions generated by semantic elaboration was significantly correlated with individual d-primes in Negative and Positive, whereas the functional connectivity correlated with d-primes in Neutral was identified only between the left inferior frontal gyrus and the hippocampus. These findings suggest that encoding-related activation during emotion generation by semantic elaboration could be modulated by interacting mechanisms among the left inferior frontal region, which is involved in semantic elaboration; the dmPFC, which is involved in the processing of emotions generated by semantic elaboration; and the hippocampus, which is involved in memory processing. These findings are discussed in separate sections below.

Memory enhancement by semantic elaboration and emotion generation

The first main finding of this study was that hit rates for pictures encoded by imagining background stories associated with the cue words were significantly higher than those for pictures viewed passively. In addition, d-primes for pictures encoded with emotionally positive and negative feelings generated by imagining stories were significantly enhanced compared with those for pictures encoded without imagining stories. However, there was no significant difference in d-primes between pictures encoded with neutral stories associated with neutral words and pictures encoded without imagining stories (see Fig. 2). These findings suggest that memories could be enhanced by the process of semantic elaboration, and that this enhancement could be further increased by emotions generated by a cognitive control process such as semantic elaboration.

In the present study, hit rates were significantly higher when target pictures were encoded in the Negative, Positive, and Neutral conditions than when the pictures were encoded in the View condition. This result is consistent with previous findings that stimuli encoded by a deep strategy, such as semantic processing of the target stimuli, are remembered better than stimuli encoded by a shallow strategy, such as perceptual processing of the target stimuli. This effect is known as the levels-of-processing effect (Craik & Lockhart, 1972; Craik & Tulving, 1975). Previous studies have also demonstrated that memories are significantly enhanced when their meaning is self-generated (Bertsch, Pesta, Wiscott, & McDaniel, 2007; Fiedler, Lachnit, Fay, & Krug, 1992; McDaniel, Waddill, & Einstein, 1988; Slamecka & Graf, 1978). This is known as the "generation effect." These findings imply that stimuli might be processed more deeply through the self-generation of their semantic information. Thus, the present finding that hit rates for target pictures encoded in the Negative, Positive, and Neutral conditions were significantly enhanced could reflect the self-generation of semantic information during encoding.

The analysis of d-primes in each encoding condition showed that d-prime scores in both emotional conditions, Negative and Positive, were significantly higher than those in the View condition. However, we did not find a significant difference in d-primes between Neutral and View. This finding may be explained by previous studies that reported a beneficial effect of emotional context on memory for neutral items (Anderson et al., 2006; Dunsmoor et al., 2012; Erk et al., 2003; Erk et al., 2005; Murty et al., 2012). For example, one study reported that emotionally neutral words encoded in the context of emotionally positive pictures were remembered more accurately than those encoded in the context of emotionally negative and neutral pictures (Erk et al., 2003). Taken together with the present finding, memories for neutral items could be enhanced by the emotional context conveyed by self-generated emotions associated with the intentional process of semantic elaboration, as well as by stimulus-driven emotions associated with emotional stimuli.

Activations in the left inferior frontal gyrus and dmPFC

The second main finding of this study was that the left inferior frontal gyrus and dmPFC showed greater activation when participants imagined background stories for target pictures through the process of semantic elaboration than when the participants passively viewed the target pictures. In addition, activations of these regions were also identified during the processing of emotions generated by semantic elaboration (see Fig. 3). These findings suggest that cooperative operations between the left inferior frontal gyrus and dmPFC could contribute to the processing of emotions generated internally by semantic elaboration.

The importance of the left inferior frontal gyrus in the processing of semantic elaboration has been consistently identified in functional neuroimaging studies (Bookheimer, 2002; Costafreda et al., 2006; Jefferies, 2013). For example, left inferior frontal activation related to the semantic processing was found when verbal information semantically associated with target words was self-generated (Vannest et al., 2012). In the context of emotion regulation, previous studies have linked left inferior frontal activation to the cognitive reappraisal process, in which emotions are regulated by semantic processing of emotionally aroused stimuli (Buhle et al., 2014; Ochsner, Silvers, & Buhle, 2012). The present findings of left inferior frontal activation suggest that the left inferior frontal gyrus could contribute to the generation of emotions by semantic elaboration, as well as to the regulation of emotionally aroused stimuli through reappraisal.

In the present study, significant dmPFC activation was associated with the semantic elaboration, and the activation was further increased by the processing of emotions generated by semantic elaboration. The finding of dmPFC activation is consistent with previous findings that revealed significant dmPFC activation during self-generation of emotions through the cognitive appraisal process (Ochsner et al., 2009; Partiot, Grafman, Sadato, Wachs, & Hallett, 1995; Teasdale et al., 1999). For example, one fMRI study demonstrated that the amygdala showed significant activation in the bottom-up perception of aversive stimuli and the top-down interpretation of neutral stimuli as aversive, whereas activation in the dmPFC was associated only with the latter process. In addition, activation in the amygdala was significantly correlated with the subjective rating scores of bottom-up perception of emotional stimuli, whereas dmPFC activation showed significant correlations with the subjective rating scores of top-down interpretations of self-generated emotions (Ochsner et al., 2009). Similar patterns of dmPFC activation were identified in another fMRI study, which found significant dmPFC activation associated with the cognitive generation of emotions (Teasdale et al., 1999). Thus, dmPFC activation in the present study suggests that this region could contribute to the processing of emotions generated by intentional or cognitive interpretations of emotionally neutral stimuli, whereas the amygdala could be critical in the processing of stimulus-driven emotions.

Functional connectivity of the left inferior frontal gyrus with other regions

The third main finding of our study was that functional connectivity between the left inferior frontal gyrus and the dmPFC/hippocampus in the encoding conditions of Negative and Positive was significantly correlated with individual scores of d-primes in each condition (see Fig. 4). However, individual scores of d-primes in the Neutral condition were significantly correlated with functional connectivity between the left inferior frontal gyrus and hippocampus, but not between the left inferior frontal gyrus and dmPFC (see Fig. 4). These findings suggest that memories encoded with emotional feelings generated internally by semantic elaboration could be modulated by region-to-region interactions including the left inferior frontal gyrus, dmPFC and hippocampus, and that the region-to-region interactions during encoding could predict the individual differences in the retrieval of memories associated with the self-generated emotion.

Functional connectivity between the left inferior frontal gyrus, which is involved in semantic elaboration, and the hippocampus, which is involved in memory encoding, has been consistently reported in fMRI studies (Hayes et al., 2010; Schott et al., 2013). For example, one study reported that words encoded through pleasantness judgments, as a deeper process, were remembered more accurately than those encoded through syllable judgments, as a shallower process, and that interactions between left inferior frontal and hippocampal activations were significantly enhanced for words encoded deeply compared with those encoded shallowly (Schott et al., 2013). Another study found that functional connectivity between encoding-related activation in the left inferior frontal and hippocampal regions was significant when emotionally negative stimuli were encoded by the "reappraisal" strategy (Hayes et al., 2010). In addition, there is functional neuroimaging evidence that activations in the left inferior frontal gyrus related to self-initiated semantic elaboration predicted individual differences in context memory discrimination (Raposo, Han, & Dobbins, 2009). In the present study, functional connectivity between the left inferior frontal gyrus and hippocampus during encoding was significantly correlated with individual differences in retrieval accuracy in the Negative, Positive, and Neutral encoding conditions, in which semantic elaboration was required to imagine stories associated with the cue words. These findings suggest that within-participant interactions between the left inferior frontal and hippocampal regions during memory encoding could contribute to the successful encoding of memories by semantic elaboration associated with the self-generation of emotions, as well as by the reappraisal of stimulus-driven emotions. In addition, individual differences in these within-participant interactions could predict the subsequent retrieval performance of memories encoded by semantic elaboration.

The present finding of functional connectivity between the left inferior frontal gyrus and dmPFC is consistent with evidence from functional neuroimaging studies investigating the neural mechanisms underlying the cognitive reappraisal of emotions (Buhle et al., 2014; Cromheeke & Mueller, 2014; Kalisch, 2009). For example, one fMRI study reported greater activations in the dmPFC, dorsolateral PFC, and ventrolateral PFC during the successful reappraisal of negative emotions than during the simple viewing of emotionally negative stimuli (Silvers, Weber, Wager, & Ochsner, 2015). In another fMRI study, functional connectivity between activations in the anterior cingulate and lateral PFC regions was significantly higher during the cognitive reappraisal of emotionally negative films than during passive viewing of them (Allard & Kensinger, 2014). In addition, a voxel-based morphometry (VBM) study revealed that gray matter volumes in the dmPFC were significantly correlated with individual scores for the habitual use of expressive suppression as an emotion regulation strategy (Kuhn, Gallinat, & Brass, 2011). The present results, showing that functional connectivity between activations in the left inferior frontal and dmPFC regions during the encoding of target pictures with self-generated emotions was correlated with retrieval accuracy for the pictures, extend these findings in two ways: the interaction between these regions was modulated by emotions generated by the cognitive control of semantic elaboration, as well as by the cognitive reappraisal of negative emotions, and the interacting mechanisms during the encoding of items with self-generated emotions predicted individual differences in subsequent retrieval performance.

Conclusion

The present study used event-related fMRI to investigate the neural mechanisms underlying the encoding of neutral pictures during the processing of emotions generated by semantic elaboration. Behaviorally, neutral pictures were remembered more accurately when they were encoded by semantic elaboration than when they were encoded by passive viewing, and the memory enhancement was more pronounced when neutral pictures were encoded with emotional feelings generated by semantic elaboration. The fMRI data revealed that left inferior frontal and dmPFC activations were significantly enhanced during encoding involving semantic elaboration and increased further during the processing of emotions generated by that elaboration. In addition, analyses of functional connectivity with the left inferior frontal seed demonstrated that interactions between the left inferior frontal gyrus and the dmPFC/hippocampal regions during the encoding of pictures with self-generated emotion were significantly correlated with individual differences in retrieval accuracy for the pictures. These findings suggest that functional networks including the left inferior frontal region, which is associated with semantic elaboration; the dmPFC, which is associated with the processing of generated emotions; and the hippocampus, which is associated with successful memory encoding, could contribute to the modulation of memories encoded through emotion generation.