The negation-induced forgetting effect remains even after reducing associative interference

negation-induced forgetting effect was still powerful, while controlling for potential contaminating variables. Our finding would support that the impaired long-term memory could be ascribed to reusing the inhibitory mechanism of negation.


Introduction
Negation processing is often associated with the inhibition of the cognitive, motor, and neural representations of conceptual meaning (for a recent review, see Beltrán, Liu, & de Vega, 2021). At the cognitive level, negated concepts are less accessible in working memory than affirmed ones (e.g., Kaup & Zwaan, 2003; MacDonald & Just, 1989; Orenes, Beltrán, & Santamaría, 2014), which suggests that negation somehow counteracts (or inhibits) the ordinary encoding of concepts. In the same vein, negation applied to manual action sentences reduces the automatic hand motor activity frequently observed in their affirmative counterparts (Aravena et al., 2012), and reduces the interference of action verbs with typing. Finally, at the neural level, the results are quite heterogeneous. Some studies report negation-induced reductions of the sensory-motor cortical activity during the processing of action language in neuroimaging (e.g., Tettamanti et al., 2008; Tomasino, Weiss, & Fink, 2010) and brain stimulation studies (Liuzza, Candidi, & Aglioti, 2011; Papeo, Hochmann, & Battelli, 2016; Vitale, Monti, Padrón, Avenanti, & de Vega, 2022). Yet, the evidence is inconclusive, since other neuroimaging studies have obtained increased negation-induced activations in the left premotor cortex (Christensen, 2009) or in Broca's area (Bahlmann, Mueller, Makuuchi, & Friederici, 2011), depending on the particular task demands.
The apparent inhibitory effects mentioned above were observed during or immediately after the comprehension of the negative sentence and thereby reflect the short-term consequences of negation processing. Yet, there are a few behavioral studies reporting that the influence of negation extends over time, also affecting the long-term recall of negated information (Cornish & Wason, 1970; Craik & Tulving, 1975; Fiedler, Walther, Armbruster, Fay, & Naumann, 1996; and more recently Mayo, Schul, & Rosenthal, 2014). Essentially, these authors observed that negated concepts are forgotten more often than non-negated concepts, suggesting that the short-term inhibitory effect of negation might result in later memory loss for the concepts under its scope. The current study aims to consolidate the negation-induced forgetting (NIF) effect as a genuine inhibition-related effect, discarding alternative explanations. To that end, it presents two experiments that replicate and extend the original findings reported by Mayo et al. (2014). In addition, it offers an explanation that emphasizes the parallel between the mechanisms driving NIF and other well-known forgetting paradigms from the memory literature, which link inhibition at encoding and retrieval with further memory loss (for a recent review, see Anderson & Hulbert, 2021).

Inhibition and forgetting
Inhibitory control plays a critical role in recent studies, which report how the retrieval of memories can selectively induce forgetting in goal-directed contexts. One representative task in this field is the retrieval-induced forgetting (RIF) paradigm (Anderson, Bjork, & Bjork, 1994, 2000). After studying different semantic categories, participants retrieve half of the items of the categories through practice tests. Retrieval-induced forgetting is observed when non-practiced items from practiced categories are recalled less than non-practiced items from non-practiced categories. Nowadays, the dominant explanation for this forgetting effect is that inhibitory control processes are recruited during retrieval to resolve the competition between memories, selecting the relevant target while overcoming the competing distractor (Anderson & Hulbert, 2021; Storm & Levy, 2012). A more direct involvement of inhibitory control in forgetting is observed in the Think/No-Think (TNT) paradigm (Anderson et al., 2004; Anderson & Green, 2001). After a study phase, participants are required to remember some paired items while preventing the other paired items from entering consciousness. Suppression-induced forgetting (SIF) is inferred from the impaired recall of No-Think pairs as compared to the recall of baseline pairs. RIF and SIF reflect the operation of different cognitive functions: incidental stimulus selection (between competitors) and intentional stopping, respectively. Yet, both are thought to imply active forgetting processes that are mediated by the domain-general inhibitory control system; in other words, they are implemented by mechanisms that are not exclusive to memory but are shared with other cognitive functions that require inhibition, either selection or stopping (Anderson & Hulbert, 2021).
These mechanisms operate at retrieval over already established memories, and as a consequence weaken the memory traces for either competitors or to-be-prevented items. Then, according to this view, forgetting does not come from an automatic decay but from the inhibitory processes that, either incidentally or intentionally, intervene during goal-directed retrieval (i.e., RIF and SIF). Importantly, this picture of active forgetting, as induced by inhibition at retrieval, partially resembles current views of how negation is processed, and of how it could affect long-term memory (e.g., Beltrán et al., 2021; Kaup & Dudschig, 2020; Mayo et al., 2014).

Negation as inhibition
Evidence gathered with the probe recognition paradigm provides the main support for the conclusion that negated information is less active (or less available in working memory) than affirmed information (Giora, Fein, Aschkenazi, & Alkabets-zlozover, 2007; Hasson & Glucksberg, 2006; Kaup, Lüdtke, & Zwaan, 2006; MacDonald & Just, 1989). In this paradigm, the basic trial procedure consists of a sentence containing a critical concept, followed by a probe stimulus (word or image) that either corresponds to the critical concept or does not (McKoon & Ratcliff, 1980). The relevant measurements are the latency and accuracy associated with the response (e.g., naming, lexical decision or recognition) to the probe stimulus. Shorter latency and/or lower error rate are taken to indicate higher availability of the concept in working memory. For negated concepts, the probe paradigm reflects a two-stage time course (e.g., Kaup et al., 2006). For short sentence-probe intervals, both negated and affirmed concepts (e.g., "The umbrella is [is not] open") produce similar activation levels on the next probe (e.g., "umbrella"); in contrast, for longer intervals (several hundred milliseconds), negated concepts show reduced activation on the probe (longer response times) relative to affirmed ones. This time course indicates that, initially, negated concepts are encoded and represented, but over time they are to some extent deactivated or suppressed.
In line with the above account of active forgetting, an emerging view is that negation induces this suppression effect by recruiting domain-general inhibitory mechanisms, and especially by recruiting those mechanisms associated with either preventing or stopping dominant responses and representations (de Vega et al., 2016; Dudschig, Kaup, Svaldi, & Gulewitsch, 2021; Liu, Gu, Beltrán, Wang, & de Vega, 2020; Montalti, Calbi, Cuccio, Umiltà, & Gallese, 2021). As noted above, inhibition occurs at retrieval for RIF and SIF paradigms, increasing the likelihood of forgetting the inhibited items. For negation, inhibitory control is thought to intervene at comprehension, by suppressing negated information. Then, as in RIF and SIF paradigms, the short-term inhibitory-like actions of negation could provoke subsequent memory loss for the concept(s) under its scope (Beltrán et al., 2021; Mayo et al., 2014).
The above studies provide evidence for the inhibition mechanism underlying the comprehension of sentential negation. However, people produce negations as much as they understand them. Thus, people may produce negative utterances or denials in response to questions or requests for several pragmatic reasons, such as judging or rejecting something as false or somehow incorrect (Givón, 1978; Beltrán, Orenes, & Santamaría, 2008). Furthermore, developmental studies have shown that children begin to use negative responses as early as the second year of life, to reject an object, or to stop or prevent intended or ongoing actions, thus establishing a strong association between the verbal marker of denial and action suppression (e.g., Bloom, 1970; Choi, 1988; Pea, 1980). In the typical verification paradigms, denying a false sentence is found to be more difficult than confirming a true sentence, suggesting an additional cognitive cost (Arroyo, 1982; Gough, 1966; Just & Carpenter, 1971; Wason, 1961). Yet, it remains unexplored whether denial also recruits inhibition mechanisms, as the understanding of sentential negation does. We will focus on this issue in this study.

Negation and memory
Several studies have already examined the impact of negation on memory, though adopting different perspectives. One perspective has been to investigate memory intrusions (false memories) for negated information. For instance, Fiedler et al. (1996) adopted a memory retrieval paradigm in which participants initially watched a short film of an apartment with different objects inside. Subsequently, in a first recognition test, participants were asked to respond whether a series of objects were present or not in the apartment. Negation here corresponded to the action of responding with "no" (negating) to the questioned objects. After a distractive task, participants were asked again to recognize objects that were present in the short film. They found that non-existing objects that were correctly negated in the first recognition test were more often remembered as present than non-existing objects that were never mentioned. Thus, negating a non-existing object or entity can lead to an intrusive false memory. More recently, Maciuszek and Polczyk (2017) arrived at the same conclusion using a slightly modified paradigm. Their participants listened first to a description of an apartment that included both affirmed and negated objects, which means that in this study negation was associated with the negative operators presented in the initial learning phase (e.g., "There is a living room in the house, but not a porch") rather than with producing a "no" response. Next, immediate and delayed (one week later) memory tests were administered. With respect to false memories, the main result arose in the delayed test, revealing that negated objects were remembered more often as present than objects never mentioned. Accordingly, this and the findings of Fiedler et al. indicate that negation cannot completely counteract the effect that pure mentioning (vs. non-mentioning) has on memory, and for that reason, it might be seen as promoting false memories and misinformation (though see Weil, Schul, & Mayo, 2020). This phenomenon might sound inconsistent with negation having inhibitory consequences in the short term. However, it is worth remarking that negation-induced false memories are basically a mention effect, resulting from comparing negated non-existent objects with unmentioned non-existent objects.
By contrast, the other research perspective on negation and memory focuses on the direct comparison between affirmed and negated items, referring in both cases to previously mentioned information. In their classical research on the levels of processing, Craik and Tulving (1975) already used this paradigm, finding differential effects of polarity on memory. In their paradigm, participants had first to encode words at different depths of processing; for example, they were asked to respond whether they were written in capital letters (shallow encoding) or whether they fit into a given semantic category or sentence structure (deep encoding). Then, several incidental memory tests were administered. The authors found worse recall for words with negative (no) responses, as compared to those receiving affirmative (yes) responses, especially in the deep encoding conditions. This finding suggests a weaker encoding strength for negated information and is consistent with the idea that negation can induce forgetting. As already noted, Mayo et al. (2014) recently gave the most systematic demonstration of negation-induced forgetting (NIF). In four experiments, they presented participants with either short videos (Experiments 1 to 3) or verbal stories (Experiment 4) describing everyday places and situations (e.g., a video of a virtual walk through an apartment). Immediately after, participants performed a first memory test in which they were asked about features of some of the entities shown or described, for instance "wine". For some questions, the correct answer was "yes" (e.g., was it white wine?), because the feature in the question matched with the actual feature of the entity. For others, the correct answer was "no" (e.g., was it red wine?), due to a feature mismatch. So, as in Craik and Tulving (1975) and Fiedler et al. (1996), the negation was not explicit in the learned materials, but corresponded to participants' "no" responses. 
Finally, after a distractor task, an incidental recognition test (Experiments 1-3) or a free recall test (Experiment 4) was administered. All these memory tests showed greater memory loss when an entity received a "no" response than when it received a "yes" response. Thus, actively negating the feature of an entity leads to the forgetting of the entity itself, an effect which they suggested is produced by the short-term inhibitory actions of negation at the first memory test. There are, therefore, two memory effects associated with negation: 1) false memories when the comparison is made between negated and never mentioned information, and 2) forgetting (NIF) when the comparison is made between affirmed and negated mentioned information. Taken together, the results of the two research lines suggest that the activation of the entity concepts differs depending on whether or not the entity was mentioned in the first memory retrieval phase, that is, zero activation for non-mentioned concepts, some activation for mentioned and correctly negated concepts (false memory effect), and the strongest activation for mentioned and correctly affirmed concepts. However, only the contrast between mentioned affirmed and negated concepts supports the idea that inhibitory processes occur during negation, leading to subsequent memory loss for the inhibited concepts.
Another important consideration for the study of negation and memory is the distinction between bi-polar (or binary) and uni-polar (or non-binary) negations (e.g., Dudschig et al., 2021; Kaup & Dudschig, 2020; Mayo, Schul, & Burnstein, 2004). A binary negation has a clear alternative for the negated concept (e.g., "not rich" ➔ "poor"), whereas a non-binary negation does not (e.g., "not creative"). Although this distinction has potential consequences for the underlying processes of negation, both binary and non-binary negations predict inhibition of the negated concept (e.g., Beltrán et al., 2019). Specifically, non-binary negation more likely activates the core concept (e.g., not [creative]), and leads to a two-step simulation process (Kaup & Dudschig, 2020), with inhibition demanded to deal with both the initially activated core concept and the updated actual state (Dudschig & Kaup, 2018). By contrast, binary negation could immediately increase the activation of the intended concept (e.g., "poor"), by inhibiting the activation level of the core concept faster, possibly in a single-step process (MacDonald & Just, 1989; Mayo et al., 2004; Orenes, Moxey, Scheepers, & Santamaría, 2016).

The current study
This study is composed of two experiments aimed to replicate and extend the negation-induced forgetting (NIF) effect. The first experiment is essentially a replication of Experiment 4 conducted by Mayo et al. (2014), using a four-step paradigm: a learning phase of a short narrative, then a verification task, followed by a distractive task and then a delayed free recall test. We selected this experiment because it is likely the most robust demonstration of NIF, since it makes it possible to exclude other potential explanations (e.g., response matching between first and second memory tests). This replication is important to consolidate the NIF effect and hence to provide further support for the idea that, as happens in active forgetting paradigms, negation drives inhibition which, in turn, induces forgetting.
The second experiment addresses a potential problem of the NIF reported by Mayo et al. (2014) and replicated in our first experiment, which could result from an artifact of the experimental task. In the first memory test, questions requiring a "yes" response reinforced the originally learned materials, repeating the same object feature (e.g., blue lamp… blue lamp), while the "no" verification questions included a new object feature (e.g., blue lamp… red lamp) that produced a transient mental representation differing from the previously learned mental representation of the target object. This could induce interference during retrieval (fan effect), due to the introduction of new, competing information associated with the same object (Radvansky, 1999a, 1999b). That means that the memory impairment observed by Mayo et al. (2014) for "no" responses could have been caused by associative interference rather than by inhibitory processes driven by negation. To overcome this confound, we modified the materials in such a way that no new feature or content was introduced for the questions in the first memory test. If the NIF were still present in the long-term free recall task, then we could conclude that there is a genuine inhibitory impact of negative responses.

Experiment 1
This experiment strictly followed the experimental and analytical procedure described in Mayo et al. (2014), Experiment 4. That is, the participants read a story describing the daily activity of a protagonist and then completed a first memory test composed of verification questions requiring them to affirm or negate sentences. These sentences were either identical to the original ones ("yes" questions) or differed in just the feature attributed to the object ("no" questions). Note that all the questions repeated the concepts in the original learned sentences, except the object feature, which was modified for "no" questions (Table 1). Following a 20-min filler task, an unexpected free recall task was administered, consisting of writing down as many details of the story as possible. Frequentist and Bayesian statistical comparisons were conducted to test the hypothesis that negation ("no" responses) increases memory loss, as compared to affirmation ("yes" responses). As in Mayo et al. (2014), the effect of repetition was also examined, comparing items that appeared twice, in the initial story and in the verification task, with items that only appeared in the initial story.

Participants
Thirty-two undergraduate students (5 males and 27 females; mean age in years: 18.7; SD: 1.0) from the University of La Laguna participated in exchange for course credits. All gave written informed consent and were right-handed and neurologically healthy native Spanish speakers with normal or corrected-to-normal vision. We determined the sample size required to detect the NIF effect from the results in Mayo et al. (2014, Experiment 4

Materials and procedure
Participants were tested individually in a comfortable chair in a sound attenuated room. The experiment script was programmed and conducted on E-prime (Schneider, Eschman, & Zuccolotto, 2002). The experimental session comprised four different phases: study phase, verification task (first memory test), distractive task, and free recall task (second memory test). For the study phase, we translated and adapted to Spanish the "a typical day of a university student" story employed by Mayo et al. (2014) in their Experiment 4. This story was written from the second-person perspective and included 55 affirmative sentences describing various details of daily activities (Appendix A). Two versions of the same story were created, only differing in the target feature associated with the sentence objects (e.g., ambulance siren vs. police car siren), in such a way that the same verification sentences involved different response polarity in the two versions. Half of the participants received one version of the story and the other half received the other version. Participants were instructed to read the story and attempt to imagine the scene as clearly and vividly as possible. The story started with a 500-ms fixation cross at the center of the screen, and then the fifty-five sentences were presented separately, one after the other, on a computer monitor for 4 s each. Immediately after completing the guided study task, the participants received the verification task, which comprised 40 sentences from the initial story, of which 20 were complete repetitions (e.g., "While waiting for the bus, you hear an ambulance siren"), while the other 20 were modified versions (e.g., "While waiting for the bus, you hear a police car siren"). The repetition and the modified verification sentences were counterbalanced within each version of the stories. 
Participants were required to press the "yes" and "no" keys to indicate whether the sentences were correct (affirming condition) or incorrect (negating condition) according to the story read in the study phase (Table 1). Apart from the 40 sentences presented both in the initial story and in the verification phase, the remaining 15 sentences were also part of the initial story but were not shown in the verification phase, including 10 having identical sentence structure to the experimental sentences (i.e., subject + verb + object + feature), which were regarded as baseline (e.g., "Your friend is sitting in the third row of the classroom"), while the other 5, which had a different sentence structure, were fillers included to make the story natural and coherent (e.g., An unfriendly waiter receives the money). For the verification task, sentences were shown at the center of the screen (4000 ms) following an initial fixation cross (500 ms). Next, "yes" and "no" response options written in capital letters ("SÍ" -"NO") appeared at the bottom of the screen in two separate rectangles, as a reminder of their position in the keyboard. The participants were required to respond as quickly and correctly as possible. If they failed to respond in the 4000 ms, the program moved to the next sentence. The next trial started after a random blank period (1000-1200 ms). After completing the verification task, the participants performed a 20-min non-linguistic distractive task, specifically a match-three puzzle video game (Candy Crush). Finally, for the free recall task, participants were asked to write down as many details as they could remember from the initial story, for 15 min. The experiment started with a practice block, which included 9 sentences in the study phase, of which 6 were presented (with or without changes) in the practice verification phase. The practice did not include the distractive task or the incidental free recall task.

Response coding and analysis
The analytical procedure to evaluate performance in the free recall task was as follows.
First, the non-verified sentences (baseline) and the correctly verified sentences were selected to be analyzed in the free recall task. Second, for both sets of sentences (correctly verified and baseline), the free recall responses were coded by two judges following the coding protocol described in Appendix C. For each recalled sentence, three different types of items (character, action, and object) were coded and considered for further analysis. Recalled items were coded as "1" if they included the correct details; otherwise they were coded as "0". For example, suppose that the guided imagination task included the sentence "While waiting for the bus, you heard the ambulance [police] siren", and that the participant wrote "While waiting for the bus, you noticed an ambulance siren" in the free recall task. The correct character "you" and the correct object "siren" were each coded as "1", whereas omissions and false memories (e.g., remembering "noticed" instead of "heard") were coded as "0". One independent judge coded the free recall answers twice, following the pre-defined coding protocol. Next, a second independent judge coded the answers to check whether the coding instructions yielded consistent results across judges. The agreement between the two judges reached 98.44%.
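The agreement figure can be illustrated with a minimal sketch. It assumes that agreement was computed as simple percent agreement over the binary item codes, which the text does not state explicitly; the function name and the example codes are hypothetical and are not study data.

```python
def percent_agreement(codes_a, codes_b):
    """Percentage of items that two judges coded identically (0 or 1).

    Assumption: simple percent agreement; the article does not say
    which agreement index was actually used.
    """
    if len(codes_a) != len(codes_b):
        raise ValueError("both judges must code the same items")
    if not codes_a:
        raise ValueError("no coded items")
    matches = sum(a == b for a, b in zip(codes_a, codes_b))
    return 100.0 * matches / len(codes_a)


# Hypothetical codes for four items (illustrative only, not study data).
judge_1 = [1, 1, 0, 1]
judge_2 = [1, 0, 0, 1]
print(percent_agreement(judge_1, judge_2))  # 75.0
```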
Third, the proportion of failure in memory was calculated for each participant and condition, separately for the verified and the non-verified (baseline) sentences. For the former, we accumulated the points of the recalled items and calculated the proportions with respect to the correctly verified sentences. Specifically, the points of the three coded items (character, action, and object) of each recalled sentence were summed across the free recall test and divided by the total number of expected points (3 items for each correctly verified sentence in the verification phase). In this way, the resulting numbers represented the proportion of successful recall for correctly affirmed and negated items. Subsequently, the complementary failure in memory rate was obtained by subtracting the success rate from 1 and used as input in further analyses. This error-based rate was preferred here because it best represents the rationale behind a negation-induced forgetting effect (i.e., an increase in memory loss) and, more importantly, because it was the measure reported by Mayo et al. (2014) in their Experiment 4. For the non-verified (baseline) sentences, the proportion of successfully recalled items was calculated by dividing the accumulated points in the free recall task by the total number of expected points (i.e., the number expected if all the items in the initial story sentences had been correctly recalled: three items for each of the 10 baseline sentences). As with the verified sentences, the resulting proportions were transformed into failure in memory rates.
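The computation described above can be sketched as follows. This is an illustrative reading of the procedure, not the authors' analysis script: the function name, the tuple layout, and the example codes are assumptions introduced here.

```python
ITEMS_PER_SENTENCE = 3  # character, action, object


def failure_in_memory_rate(recall_codes):
    """Return 1 minus the proportion of correctly recalled items.

    recall_codes holds one (character, action, object) tuple of 0/1
    codes per sentence in the condition, i.e. per correctly verified
    sentence or per baseline sentence. Sentences that were not
    recalled at all contribute (0, 0, 0).
    """
    if not recall_codes:
        raise ValueError("the condition contains no sentences")
    points = sum(sum(codes) for codes in recall_codes)
    expected = ITEMS_PER_SENTENCE * len(recall_codes)
    return 1.0 - points / expected


# Hypothetical participant with two correctly negated sentences:
# one recalled with the right character and object but the wrong
# action, the other not recalled at all.
negated = [(1, 0, 1), (0, 0, 0)]
print(round(failure_in_memory_rate(negated), 3))  # 0.667
```

The same function applies to the baseline condition by passing the codes for the 10 baseline sentences, so that the denominator is 30 expected points.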
Finally, study predictions were tested by means of one-sided pairwise comparisons on the resulting proportions (i.e., failure in memory rate): 1) between affirmed ("yes" response) and negated ("no" response) items for the hypothesis that negating increases memory loss (negation-induced forgetting, NIF), and 2) between verified and non-verified (baseline) items for the hypothesis that non-repeated items are remembered worse than repeated ones. In addition, the proportions of correct responses during the verification task were analyzed by comparing "yes" and "no" responses. A hybrid approach, including both frequentist and Bayesian statistics, was adopted to analyze and report these planned comparisons. An important advantage of the Bayesian approach is that, unlike the frequentist framework, which focuses on binary decisions regarding the null hypothesis, it quantifies the relative strength of the null and alternative hypotheses by means of Bayes Factors (e.g., Jeffreys, 1961; Rouder, Speckman, Sun, Morey, & Iverson, 2009; van Doorn, Aust, Haaf, Stefan, & Wagenmakers, 2021). In this way, hypothesis testing with Bayes Factors allows researchers 1) to interpret the evidence on a continuous scale, communicating to what extent one hypothesis outperforms the other, and 2) to discriminate among three categorical states of evidence: support for the alternative hypothesis, support for the null hypothesis, and absence of evidence (i.e., data are not sufficiently diagnostic). The data were analyzed with JASP (Version 0.12.2; JASP Team, 2020), while the violin plots shown in Fig. 1 were created with custom MATLAB scripts. An annotated .jasp file including all statistical analyses is available at https://osf.io/ktjfp/?view_only=d5006242e00548629a416191dc41ea48.

Accuracy in the verification task
The main purpose of analyzing verification accuracy was to identify, on a condition and participant basis, the wrongly verified sentences, in order to discard them from the calculations and analyses performed on the free recall data. Yet, we were also interested in exploring the polarity effects in this first recognition test, to see possible differences in performance between affirmative and negative trials. Verification responses were highly accurate, with similar rates for questions requiring "yes" (M = 94.0%, SD = 7.3%) and "no" responses (M = 91.8%, SD = 8.9%). Given that the Shapiro-Wilk tests revealed violations of the normality assumption, ps < 0.001, a two-tailed Mann-Whitney U test was applied to evaluate whether there was any difference between the two conditions. The frequentist analysis did not provide sufficient evidence to reject the null hypothesis, U(31) = 216, p = .061, Cohen's d = 0.440, while the Bayes Factor was close to one, BF10 = 1.62, suggesting that the evidence was insufficient to establish whether "yes" and "no" responses differed in accuracy.

Free recall task
One-tailed (directional) comparisons were conducted to assess whether the proportion of forgotten items was larger 1) for negated ("no" response) than affirmed ("yes" response) items, and 2) for non-repeated (baseline) than repeated items. The Shapiro-Wilk tests revealed no violation of the normality assumption in the data from the three conditions (ps > 0.180). Therefore, we ran the comparisons using Student's t-test. For the Bayesian t-test, a default Cauchy distribution (scale r = 1/√2) truncated to allow only positive effect size values was used as the prior distribution for the alternative hypothesis.
Descriptive statistics showed that negated items were remembered worse (M = 37%, SD = 16.3%) than affirmed items (M = 28.2%, SD = 13.9%), as Fig. 1 illustrates. This pattern was statistically supported by both the frequentist, t(31) = 3.67, p < .001, Cohen's d = 0.649 (medium effect size), and the Bayesian t-tests. The latter revealed that the data were approximately 71 times more likely to occur under the alternative hypothesis (H1) than under the null hypothesis (H0) (BF10 = 71.27), which qualitatively means a very strong effect (BF > 30; Jeffreys, 1961). Similarly, non-repeated, baseline items (M = 53.4%, SD = 17.9%) were recalled in a lesser proportion than both negated and affirmed items. These differences were both reliable and large according to both the frequentist, ts(31) = 6.26 and 8.68, ps < 0.001, Cohen's d = 1.107 and 1.535, and the Bayesian t-tests, with the Bayes Factors indicating that the data were thousands of times more likely under H1 than under H0, suggesting a very strong effect.
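As a consistency check on the reported effect sizes, the sketch below recovers Cohen's d from each t value. It assumes the usual within-participant definition, d = t / √n, with n = df + 1 = 32; the function name is hypothetical, and the article does not state which formula the software applied.

```python
import math


def paired_cohens_d(t_value, n):
    """Cohen's d for a paired (or one-sample) t-test: d = t / sqrt(n).

    Assumption: d is the mean difference divided by the standard
    deviation of the differences, which makes it equal to t / sqrt(n).
    """
    return t_value / math.sqrt(n)


n = 31 + 1  # the tests report 31 degrees of freedom, so n = df + 1
for t_value in (3.67, 6.26, 8.68):
    print(round(paired_cohens_d(t_value, n), 3))
# prints 0.649, 1.107 and 1.534
```

The first two values match the reported effect sizes exactly, and the third differs from the reported 1.535 only in the last digit, which is the kind of discrepancy expected when d is recomputed from a t value rounded to two decimals.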

Experiment 2
Experiment 2 tested the negation-induced forgetting effect using new story material in the study phase. In the new story, two protagonists were included, and the story introduced details of their daily activities at the university. The verification phase consisted of cleft questions referring to the events of the story, which were exactly the same for affirmative and negative trials, but only differed in the attribution of the event to one or the other protagonist. In this way, no new transient representation was created and the potential fan effect confound was reduced.

Participants
Thirty-four undergraduate students from the University of La Laguna carried out Experiment 2 in exchange for course credits (including 27 females, mean age = 19.3, SD = 1.8). As in Experiment 1, the number of participants was three times higher than required to detect a negation-induced forgetting effect, and was the same size as in Experiment 4 of Mayo et al. (2014). All participants gave written informed consent and were right-handed and neurologically healthy native Spanish speakers with normal or corrected-to-normal vision.

Materials and procedure
Experiment 2 followed the same four-phase procedure as Experiment 1: study phase, verification task, distractive task, and free recall task. However, a new story including 62 Spanish sentences was created for the study phase, offering details of the university life of two protagonists, a female (Montse) and a male (Jordi), and thereby involving a third-person perspective (Appendix B) rather than the second-person perspective of the previous experiment. Among the 62 sentences, 50 were about the protagonists and consisted of 25 semantically related pairs, with each member of a pair assigned to one of the protagonists (e.g., Montse studies psychology. Jordi studies computer science/Montse es estudiante de psicología. Jordi es estudiante de ciencias informáticas.). The other 12 sentences were created to make the story natural and coherent. Two versions of the story were created, which differed only in which protagonist each fact within a semantic pair was attributed to. Following a 500-ms fixation cross at the center of the screen, the 62 story sentences were presented one by one, each for a total of 4 s. For the verification task, the sentences also differed from those used in Experiment 1, in this case in both content and format. Specifically, a set of 40 cleft questions was created asking if a specific fact in the story corresponded to one of the characters. The sequence of a verification trial included these events: a 500-ms fixation cross, the first part of the cleft question (e.g., Who a psychology student is …/ Quién es estudiante de psicología es…) lasting for 3000 ms, a 500-ms blank screen, and finally the name of one of the two characters (either "Montse" or "Jordi"). Participants were instructed to press the "yes" or "no" key to indicate, as quickly and accurately as possible during the next 3000 ms, whether the given information about the character was correct (see Table 1). The next trial started after a random period of 1000-1200 ms.
Half of the question-character pairs were correct and required a "yes" response, while the other half were incorrect and required a "no" response. The question-character pairs were presented in a pseudo-random order and counterbalanced across two experimental lists that differed in response polarity within each version of the story. As in Experiment 1, 22 sentences from the initial story were not presented to the participants in the verification task, of which 10 had the same structure as the experimental ones and were regarded as baseline sentences (see Appendix B). The other 12, the fillers, were not presented in the verification task either. Subsequently, the participants performed the same 20-min distractive task as in Experiment 1, after which they were unexpectedly asked to write down as many details as they could remember from the study task. Participants had 18 min to complete this free recall task. The experiment started with a practice story composed of 10 sentences for study, followed by 6 questions for verification. The practice did not include the distractive task or the incidental free recall task. In Experiment 1, the whole statements were presented in the verification trials, and collecting the RTs was not appropriate since they differed in length. By contrast, in this experiment the cleft verification trials were presented in two frames, as described above (the specific event context followed by the name of one of the protagonists). This format allows RTs to be collected from the onset of the protagonist's name.
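The verification trial sequence described above can be written down as a simple timeline. The following is only an illustrative sketch, not the authors' presentation script: the names `Event`, `VERIFICATION_TRIAL`, `onset_ms`, and `rt_from_probe` are hypothetical, and the durations are the ones stated in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    name: str
    duration_ms: int


# Durations taken from the procedure description; the event names are
# hypothetical labels chosen for this sketch.
VERIFICATION_TRIAL = (
    Event("fixation_cross", 500),
    Event("cleft_question", 3000),
    Event("blank_screen", 500),
    Event("probe_name", 3000),  # response window for the yes/no key press
)
ITI_RANGE_MS = (1000, 1200)  # random inter-trial interval


def onset_ms(events, name):
    """Return the onset of the named event relative to trial start."""
    elapsed = 0
    for event in events:
        if event.name == name:
            return elapsed
        elapsed += event.duration_ms
    raise KeyError(name)


def rt_from_probe(keypress_ms, events=VERIFICATION_TRIAL):
    """RT measured from probe-name onset; keypress_ms is from trial start."""
    return keypress_ms - onset_ms(events, "probe_name")


cleft_to_probe_ms = (
    onset_ms(VERIFICATION_TRIAL, "probe_name")
    - onset_ms(VERIFICATION_TRIAL, "cleft_question")
)
print(cleft_to_probe_ms)  # 3000 ms question + 500 ms blank
```

Under these durations, the protagonist's name appears 3500 ms after the onset of the cleft question, and RTs are referenced to that onset rather than to the start of the trial.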

Response coding and analysis
The response coding and data analysis procedures were the same as in Experiment 1. First, we also coded the three different types of the recalled items (character, action, and object) according to the coding protocol (see Appendix C). Two independent judges coded the answers for two coding cycles and obtained 98.83% agreement. The statistical analyses were performed using the same methods as in Experiment 1.

Verification task
The Shapiro-Wilk tests revealed no violation of the normality assumption for the verification task accuracy, so a Student's t-test was performed. Very similar accuracy rates were obtained for both affirmative (M = 86.9%, SD = 8.7%) and negative (M = 85.3%, SD = 9.6%) responses. The frequentist analysis failed to reject the null hypothesis, t(34) = 0.914, p = .367, Cohen's d = 0.154, while the Bayes factor indicated moderate evidence in favor of H0 (BF10 = 0.231, < 1/3), that is, moderate evidence for the absence of a response polarity effect. For the RTs, the Shapiro-Wilk test revealed a violation of the normality assumption; thus, the non-parametric Wilcoxon U test was performed. Negative responses took longer than affirmative responses (Affirmative: M = 600 ms, SD = 166 ms; Negative: M = 687 ms, SD = 186 ms; U(34) = 54.00, p < .001, Cohen's d = 0.916). The Bayesian t-test showed a very strong effect in favor of H1 (BF10 = 501.574).
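The verbal labels attached to the Bayes factors can be made explicit with a small helper. This is a minimal sketch under stated assumptions: the function name `describe_bf10` is hypothetical, the cut-offs at 3 and 30 (and 1/3 for the null) are the ones cited in the text, and the intermediate cut-off at 10 ("strong") and the label "anecdotal" below 3 are assumed from common Jeffreys-style conventions rather than taken from the paper.

```python
def describe_bf10(bf10):
    """Map a Bayes factor BF10 onto a verbal evidence label.

    Cut-offs at 3 and 30 follow the text; the cut-off at 10 and the
    "anecdotal" label are assumptions of this sketch.
    """
    if bf10 <= 0:
        raise ValueError("A Bayes factor must be positive")
    if bf10 >= 1:
        favored, strength = "H1", bf10
    else:
        # Evidence for the null is read on the reciprocal scale (BF01).
        favored, strength = "H0", 1.0 / bf10
    if strength > 30:
        label = "very strong"
    elif strength > 10:
        label = "strong"
    elif strength > 3:
        label = "moderate"
    else:
        label = "anecdotal"
    return f"{label} evidence for {favored}"


print(describe_bf10(0.231))    # accuracy: BF10 below 1/3
print(describe_bf10(501.574))  # RTs: BF10 above 30
```

Reading values below 1 on the reciprocal scale is what allows BF10 = 0.231 to count as support for the absence of a polarity effect on accuracy rather than as a mere failure to reject the null.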

Free recall task
One-tailed (directional) comparisons were performed to test whether the proportion of forgotten items was larger: 1) for negated ("no" response) than affirmed ("yes" response) items, and 2) for non-repeated (baseline) than repeated items. The Shapiro-Wilk tests found no violations of the normality assumption in the data of the three conditions (ps > .161). The comparisons were therefore conducted using Student's t-test. In the Bayesian t-test, a default Cauchy distribution (scale r = 1/√2) truncated to allow only positive effect size values was used as the prior distribution for the alternative hypothesis.
Descriptive statistics revealed that memory loss was more likely for negated (M = 28.8%, SD = 13.0%) than affirmed items (M = 24.3%, SD = 11.8%), which was statistically supported by both the frequentist, t(34) = 2.333, p = .013, Cohen's d = 0.394, and the Bayesian t-test. The latter suggested that the data were approximately 3.8 times more likely to occur under the alternative (H1) than the null (H0) hypothesis (BF10 = 3.81). This result indicated moderate evidence for the alternative hypothesis (BF > 3; Jeffreys, 1961). Furthermore, non-repeated, baseline items were remembered worse than both negated and affirmed items (M = 39.7%, SD = 17.4%). These differences were confirmed by both the frequentist, ts(34) = 5.066 and 5.782, ps < .001, Cohen's d = 0.856 and 0.977, and the Bayesian analysis, with Bayes factors indicating that the data were far more likely under H1 than under H0 (BF10 = 3006.32 and 21,773.98).
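The effect sizes reported for these within-participant comparisons are consistent with the standard relation between a paired t statistic and Cohen's d, d = t / √n. The sketch below is only a consistency check under stated assumptions: the function name `cohens_d_from_paired_t` is hypothetical, n is taken as df + 1 = 35 from the reported degrees of freedom, and it is assumed, not confirmed by the text, that the analysis software computed d in this way.

```python
import math


def cohens_d_from_paired_t(t_value, n):
    """Cohen's d for a paired (within-participant) t-test: d = t / sqrt(n).

    n is the number of paired observations (df + 1); using this formula
    for the reported values is an assumption of this sketch.
    """
    if n < 2:
        raise ValueError("n must be at least 2")
    return t_value / math.sqrt(n)


N_PAIRS = 34 + 1  # inferred from the reported t(34)

# t values as reported for the free recall comparisons in Experiment 2
for t_value in (2.333, 5.066, 5.782):
    print(t_value, round(cohens_d_from_paired_t(t_value, N_PAIRS), 3))
```

With these inputs, the computed values agree with the reported d = 0.394, 0.856, and 0.977 to three decimals.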

General discussion
The current study investigated, in two experiments adopting a revised memory paradigm, how producing a negation in response to a verification question impacts long-term memory. To this end, participants read an initial story, immediately followed by a verification task requiring a "yes" or "no" response. After a 20-min distractive task, they were required to freely recall the initial story with as many details as possible. This study validates and extends the findings of Mayo et al. (2014), with stricter control of materials and avoiding possible confounds. Our two experiments revealed that accuracy was very high in the verification task, without differences between "yes" and "no" responses. Most importantly, in the free recall test, memory loss (measured as the proportion of forgotten items) was larger for negated ("no" response) than affirmed ("yes" response) items. This negation-induced forgetting was also obtained under strict control of interference factors during the verification phase (Experiment 2). This study differs from other studies in the negation literature in two crucial aspects. First, most behavioral studies on negation report working memory effects, measured as reaction times in sentence-picture verification tasks (Arroyo, 1982; Gough, 1966; Just & Carpenter, 1971; Trabasso, Rollins, & Shaughnessy, 1971; Wason, 1959, 1961), or in test probes following sentences (Giora et al., 2007; Hasson & Glucksberg, 2006; Kaup et al., 2006; MacDonald & Just, 1989). Here, in addition to confirming working memory effects on verification times, we obtained long-term memory effects in an incidental free recall task, about 20 min after producing a negative verification response. Second, most studies focus on the comprehension of sentential negation, that is, participants are given sentences with or without a negation marker as input (e.g., Kaup & Zwaan, 2003; MacDonald & Just, 1989).
However, in the current study, the story initially read by the participants consisted of sentences in affirmative format, and the manipulation of polarity only emerged during the verification phase, when participants were asked to respond yes or no. Let us comment on the main findings.
In Experiment 1, negated sentences were recalled worse than affirmed sentences, indicating a robust NIF effect. However, in Experiment 1, as in the study by Mayo et al. (2014), there is a possible confound between the NIF effect and an associative interference or fan effect (Radvansky, 1999a). To better illustrate this, consider the following example from the current experiment. The sentence While you waited for the bus, you heard the ambulance siren was presented to the participants in the initial story. Next, in the verification task, half of the participants saw this sentence again, and had to respond "yes" to indicate that it was indeed the same sentence as before. The other half also saw the sentence but with a small change, police siren instead of ambulance siren, and therefore had to negate it by responding "no". Later, in the free recall task, the participants were asked to retrieve the information presented in the initial story. The results indicated that the words you (character), heard (action), or siren (object) were recalled less frequently after correctly responding "no" than after correctly responding "yes". The problem is that, in affirmative questions, the same sentence appears twice, in the study phase and in the verification phase, possibly reinforcing the memory trace, whereas in negative questions there were two sentences sharing concepts, one with the original feature and one with the new feature, both associated with the same noun. This creates a new memory trace that could induce interference when either encoding or retrieving the whole sentence. The effect of negation might be contaminated, at least partially, by the interference or fan effect induced by two competing representations. Hence, we conducted Experiment 2, which avoids including a new sentence in negative verification trials, thus reducing the fan effect.
In Experiment 2, the verification task consisted of cleft questions with exactly the same words in the affirmative and negative trials, which only differed in the attribution of the events to a given protagonist (e.g., Who a psychology student is… Montse [Jordi]), requiring a "yes" or "no" response, respectively. Furthermore, since the verification trials involved a long delay (3500 ms) between the onset of the cleft sentence and the onset of the probe name, we can assume that in most cases participants mentally retrieved the correct protagonist's name before receiving the probe name and producing any response. Consequently, the entire original sentence could be (re)encoded in each verification trial ("Who a psychology student is Montse"), regardless of the polarity of the response. If so, the response "no" emerges as an act of retrospective denial rather than the encoding of a competing statement ("Who a psychology student is Jordi"). This contrasts with Experiment 1, where each negative verification trial required the encoding of a false (new) representation, which would potentially interfere with the originally encoded representation, while each affirmative verification trial repeated the originally encoded representation, which would strengthen the initial memory. Despite this, Experiment 2 clearly shows that participants remembered denied items worse than affirmed items, confirming that the negation-induced forgetting effect in long-term memory was still powerful while controlling for a potentially contaminating variable.
Although Experiment 2 "purified" the effect of polarity in the NIF, some degree of associative interference cannot be completely ruled out. It is possible that some participants in some negative verification trials initially encode a proposition with the expected protagonist [The psychology student is Montse], and then another proposition with the alternative, wrong protagonist whose name appears on the screen [The psychology student is Jordi], leading to some degree of interference. However, if associative interference played any significant role in negation-induced forgetting, we would expect frequent confusions or false memories in the free recall task, especially for negative verification trials. Thus, in Experiment 1, the confusion would consist in retrieving the new feature of a denial trial rather than the original feature, and in Experiment 2, the confusion would consist in attributing the content of the denial trial to the wrong protagonist. However, a post hoc recount showed very low rates of confusion during the free recall task (see the descriptive data for the rates of confusion in Appendix B-1), indicating that false memories are scarce even in the case of competing representations, as in Experiment 1. Consequently, we could infer that the forgetting effect was mainly related to the response polarity rather than to associative interference. Another explanation for the NIF effect is that it results from assigning less endorsement to negated concepts. Note that this account differs from the associative interference proposal in that it does not require the encoding of two competing propositions, only reduced attention to the negated concepts (the "wrong" protagonist). In any case, one consequence could be that during the free recall phase the participants have less confidence in the status of the negated concepts, which reduces the probability of mentioning them.
This proposal would explain why the false memories predicted by the associative interference account were not observed here. In addition, the reduced endorsement account is compatible with inhibition as an underlying mechanism of NIF; that is, inhibition could cause less endorsement of negated concepts in the verification phase. Admittedly, the details of the inhibition mechanism of the NIF effect could not be unveiled by this behavioral study. Future studies on the neural processes operating at the verification stage could be crucial to confirm the role of inhibition in this task.
In both experiments, the results of the verification phase showed that the accuracy rate was relatively high and similar for affirmative and negative responses, indicating that participants succeeded in encoding the initial story, further evidencing that the amnestic effect obtained in the final free recall task was mainly due to the response polarity. Importantly, the response time in Experiment 2 was slower for negating than for affirming the protagonists' names. Longer time means more elaboration and more cognitive resources required for processing negation compared to processing affirmation, confirming previous findings (Carpenter & Just, 1975;Clark & Chase, 1972). But what happens between the verification phase and the recall phase? Let us consider some explanations in experimental paradigms like ours.
One possibility is that the increased cognitive cost of negation in a verification task like ours is due to its deeper encoding in memory, as Craik and Tulving (1975) suggested. However, deeper encoding should result in better memory performance in the free recall task, which was not the case here. Rather, our findings revealed the opposite pattern: the additional time required to generate negative responses was associated with poorer performance on free recall, so the processing depth hypothesis does not explain the results. Maciuszek and Polczyk (2017), for their part, used a long-term memory paradigm aiming to test how negation induced false memories, especially when the memory test was delayed several days (see introduction). Their experimental procedure differed substantially from ours, in that they directly manipulated the sentence polarity in the initial story, and they used a forced-choice recognition task ("the item was present or absent in the story?") for the long-term memory tasks. Furthermore, they were mainly interested in comparing unmentioned items with negated items, and they found that the former were falsely recognized as "present" more frequently than the latter. We obtained a similar effect. That is, repeated items, regardless of being affirmed or negated, were better recalled than non-repeated items (baseline). Importantly, however, our research directly compared negative and affirmative items, which yielded different long-term recall despite being equally repeated. Under this relatively fair condition, negating a concept impaired memory compared to affirming a concept, supporting the inhibitory-like effect. Another possible explanation for slow negative responses is that, after mentally encoding the original statement in all trials, denials involve detecting a conflict between the expected and the incorrect protagonist, which would require additional time to resolve.
However, conflict probably had a reduced impact on recall, given the low rates of confusion in the free recall task. A final explanation is that generating a negative response in the verification phase takes more time because it is associated with inhibition of the denied contents, which further impairs long-term memory.
Our finding is consistent with recent studies highlighting the recruitment of inhibition mechanisms in the processing of negation. In EEG studies, understanding negative sentences, in comparison with affirmative sentences, modulated neural markers of response inhibition, such as a power reduction of frontal theta oscillations in a Go/NoGo task (de Vega et al., 2016), and an enhancement of N1 and P3 components in a stop-signal task (Beltrán, Muñetón-Ayala, & de Vega, 2018). Moreover, in a TMS study, an increase of the silent period was reported following single-pulse stimulation on M1 associated with negative sentences (Papeo, Hochmann, & Battelli, 2015). Finally, the inhibitory effect of negation can be reduced by producing a "virtual lesion" by means of low-frequency repetitive TMS on the right inferior frontal gyrus (rIFG), a key area of the inhibitory control system (Vitale et al., 2022).
Although our design and experimental control circumvented confounds related to associative interference, the study has several limitations, which open avenues for further research. First, as mentioned above, this behavioral study does not allow us to determine the exact mechanisms underlying this negation-related amnesia effect. The inhibitory mechanism remains a "black box", so further research should focus on brain activity during the verification process to explore and unveil the neural networks involved in the negation-induced forgetting effect. Second, this study provided an incomplete view concerning the generalization of negation-induced forgetting. The stimuli adopted here included only affirmative statements, so only the action of denial was observed, and other kinds of negation, such as sentential negation, were not included. It is not yet known whether understanding sentential negation leads to the same negation-induced forgetting effect as producing negative responses in a verification task. Future studies are needed to reveal how other kinds of negation influence working memory and long-term memory. Third, although the possibility of associative interference was substantially reduced in this study, further research is needed, for instance, using more strictly controlled experimental stimuli or an additional test to clearly dissociate inhibition from associative interference.
In summary, the study demonstrated that denying a concept impairs its retention in an incidental long-term memory task, compared to affirming it, after controlling for repetition effects and interference or fan effects. The findings support the view that reusing the inhibitory mechanism of negation might suppress memory in the long run, indicating that negation should be used with greater caution in situations where the information must be remembered. Further research on how sentential negation influences memory, and on how the cognitive inhibition mechanism underlying negation affects memory, is promising.

Author credit statement
A.Z., D.B. and M.dV. devised the idea and designed the experiments. A.Z. and K.R. ran the experiments and coded participants' responses. A.Z. and D.B. analyzed the data. A.Z. took the lead in writing the experiment, under the supervision of D.B. and M.dV. All authors (A.Z., D.B., K.R., H.W., M.dV.) discussed the results and contributed to the final version of the manuscript.

Data availability
Data will be made available on request.