Belief Inhibition during Thinking: Not So Fast

The present study is a conceptual replication of a study by De Neys and Franssens (2009) about the role of belief inhibition in reasoning, operationalized as the change in reaction times to different categories of words presented after a syllogistic reasoning task. As in the original study, we examined the accessibility of cued beliefs after syllogistic reasoning by presenting participants (N = 145) with incongruent (heuristic and normatively correct answers differ) and congruent categorical syllogisms, followed by lexical decision tasks comprising cued and unrelated words, and we imposed methodological restrictions on the original procedure. Mean RT was shorter to cued than to unrelated words overall, and for all combinations of syllogism congruency and response accuracy on the preceding syllogism, indicating that full neglect of content is not necessary for correct evaluation of logical status. We registered shorter RTs for words cued by incongruent syllogisms after correct than after incorrect evaluation, which indicates that participants actively process the content of the syllogism while reasoning, as a form of cognitive control. Successfully conducted Type 2 reasoning enhances lexical access to the cued content rather than impairing it. In short, the findings of the original study were replicated for the priming effects, but not for the inhibition of content.


Introduction
Erroneous answers in syllogistic reasoning tasks, in which participants judge the logical validity of categorical syllogisms, occur because of less pronounced logical skills, but also because of intuitive reasoning that is assumed to take place and that hinders a reasoner from engaging in deliberate logical thinking (De Neys & Franssens, 2009; Bajšanski & Žauhar, 2019; Teovanović, 2019). Drawing or evaluating logical conclusions is not content-independent. When presented with all three elements of a categorical syllogism (two premises and a conclusion), and with the instruction to determine whether the conclusion is valid, people should discard the content and beliefs about content and focus only on the logical structure. At the same time, in problem-solving situations people usually have some kind of intuitive idea of what the correct answer is, and this idea is supported by the person's knowledge and beliefs about the content of the task. Focusing on the content and the believability of the conclusion of a syllogism can lead to an erroneous evaluation of logical validity, a phenomenon dubbed belief bias (Evans, Barston, & Pollard, 1983).
To reach a valid solution, we have to put in some extra cognitive effort: first to inhibit our intuitive incorrect answer, and second to engage in the slower and cognitively more expensive analytic process, as proposed by the default-interventionist dual-process account (Evans & Stanovich, 2013; Stanovich, 2009). This traditional notion of the duality of cognitive processing is probably most precisely embodied in the experimental practice of using simple tasks where the great majority of participants' responses fall into one of two distinctive categories: the typical erroneous ones and the correct ones. The experimental elicitation of these two types of responses supports the dual-process approaches to human reasoning, in which two types of processing algorithms are presumed to be in conflict: the encapsulated heuristic (Type 1), which primes an intuitive, normatively incorrect response, and the analytical (Type 2), which conveys the normatively correct response (Evans, 1989; Stanovich, 2009; Wason & Evans, 1974; Kahneman, 2011; Kahneman & Frederick, 2002; Sloman, 1996; Stanovich & West, 2000; Stanovich, 2018). Computation of the normatively rational response, in terms of processing, relies on detecting that overriding miserly Type 1 processing is necessary, while the failure of such detection is a processing defect (Stanovich, 2018). The more recent iterations of the DPT, however, propose continuous processing (Melnikoff & Bargh, 2018) and the existence of multiple heuristic processes, which may be both logical and intuitive (Bago & De Neys, 2017; De Neys, 2012; Evans & Stanovich; Pennycook, Fugelsang, & Koehler, 2015; De Neys, Thompson, & Pennycook, 2018).
Tasks in which participants judge the logical validities of conclusions of categorical syllogisms convey the described sequence of processing. Such tasks are one of the most commonly employed dual response tasks in the field of higher cognition (Evans, 2003;Evans, 2008). They comprise three categorical propositions: two premises and a conclusion, and participants are asked if the conclusion indeed logically follows from the premises, or not. Categorical syllogisms can be formed as conflict and non-conflict problems (De Neys & Franssens, 2009). In the non-conflict versions, the believability of the conclusion and the logical validity of the conclusion are congruent, while in the conflict versions they are purposefully incongruent, thus simulating a hostile environment in which reasoning can take place. Human reasoning is always immersed and hence interrelated with the surrounding. Two types of surrounding conditions in which our reasoning takes place are benign and hostile environments (Chater, Felin, Funder, Koenderink, Krueger, Noble, Nordli, Oaksford, Schwartz, Stanovich, & Todd, 2018). A benign environment is an environment that contains useful cues, such as beliefs congruent with the logical structure, thus making the task of evaluating the conclusion easy, even for Type 1 processing. A hostile environment for Type 1 processing is one in which none of the available cues are useful, or are even misguiding, causing the substitution of an attribute only weakly correlated with the true target, that is substituting the logical status of the conclusion with its believability (Evans & Stanovich, 2013). The incongruent tasks are designed to simulate a hostile environment, and thus trigger and allow assessment of the miserly cognitive processing.
Sound reasoning in such a hostile environment requires inhibiting belief about content, or the content itself, and engaging in analytical processing, which enhances the probability of reaching the correct answer. Inhibition, as one of the components of cognitive control, refers to the ability to ignore information or responses that are irrelevant to the task at hand (Gilmore, Göbel, & Inglis, 2018), and as one of the executive functions refers to the propensity to deliberately override dominant, automatic, or prepotent responses (Miyake, Friedman, Emerson, Witzki, Howerter, & Wager, 2000; Miyake & Friedman, 2012). However, cognitive control is a costly process. Being cognitive misers, people are tempted to respond based on the believability of the conclusion rather than its logical validity (De Neys & Franssens, 2009; Handley, Newstead, & Trippas, 2011; Pennycook, Fugelsang, & Koehler, 2015). In the non-conflict tasks, miserly and normative approaches cue the same response, so inhibition is not necessary.
One of the first studies to investigate the role of belief inhibition in the evaluation of syllogistic conclusions was conducted by De Neys and Franssens (2009). The study rationale was based on findings from memory research showing that when people purposely neglect information, access to that information is subsequently distorted; such temporary inaccessibility of information is referred to as inhibition (De Neys & Franssens, 2009). In line with this, the authors explored the nature of inhibition failure and the resulting erroneous responses by testing the accessibility of cued beliefs after participants' analytic and miserly reasoning. For this purpose, they used sequences comprising two tasks. In each sequence, the first task was to evaluate the logical status of a conclusion in a conflict (validity and believability incongruent) or a non-conflict (validity and believability congruent) categorical syllogism (ET: evaluation task). The second task in the sequence, administered immediately after the ET, was the lexical decision task (LDT). In LDTs participants are presented with letter strings and instructed to indicate whether each presented letter string is a word or not by pressing designated buttons. The 24 stimuli used in the LDT comprised 12 pseudo-words and 12 words. Half of the words were target words (core words used in the preceding ET and words closely semantically related to the core words), while the other half were words unrelated to the six target words. This sequence, the ET+LDT, was repeated 8 times per participant.
The results of this study showed that RTs for lexical decisions on target words were longer after evaluating conflict syllogisms (i.e., syllogisms for which the normatively correct response and the belief-based but erroneous response differed) than after no-conflict syllogisms, while RTs to unrelated words were statistically the same regardless of the type of preceding syllogism. Because only the RTs to target words were affected (longer after conflict than after no-conflict syllogisms), the authors concluded that the distortion of memory access to words was not general and was due to inhibition of beliefs. In other words, when beliefs cued a response consistent with the logical status of the syllogism, additional cognitive control, that is, inhibition of beliefs, was not required, and RTs for target words in those cases were significantly shorter than in conflict tasks. All participants, regardless of their overall achievement on syllogistic reasoning tasks, displayed the memory distortion after solving conflict problems, which suggests that even "the poorest reasoners" were engaged in fighting the biasing beliefs. Based on that, the authors postulated that the process of belief inhibition is initiated during reasoning in all participants, and implied that an erroneous response results from a failure to complete the inhibition process, not from a failure to initiate it (De Neys & Franssens, 2009). Using the same lexical access paradigm, other research confirmed that processing belief-logic conflicts involves effortful belief inhibition (Svedholm-Häkkinen, 2015). Further research on these topics confirmed that distinct heuristic and analytic processing systems underpin reasoning in belief-bias tasks (Stupple, Ball, Evans, & Kamal-Smith, 2011).
Findings were generalized from the belief bias paradigm to matching bias tasks (Stupple, Ball, & Ellis, 2013), and it was confirmed that beliefs, indeed, automatically influence reasoning and that ignoring them comes with an attentional cost (Barton, Fugelsang, & Smilek, 2009).
Some particular features of the tasks employed in the study conducted by De Neys and Franssens (2009) could have, to some extent, shaped the findings. This pertains both to reasoning as a complex cognitive process required for soundly judging the validity of a deduced conclusion, and to reasoning as a simpler cognitive process such as deciding whether a string of letters is a word of a certain language. First, the authors used different items in the conflict and no-conflict conditions, each with different content. Each of the four possible types of syllogistic reasoning tasks (2 conflict and 2 non-conflict) was represented by two tasks. As those tasks differed in content, there were altogether 8 tasks, all heterogeneous in terms of the content (themes) that was expected to be discarded in the experimental procedure. Human reasoning is, however, not content-independent: the content, which is to be discarded in syllogisms used to register belief bias, influences task performance, and this notion is supported by voluminous empirical documentation (for a comprehensive review see e.g., Casadio, 2016; Davies, Fetzer, & Foster, 1995; Gigerenzer & Hug, 1992; Stanovich, 2018). Second, in the original study, words used in the LDT were not controlled for the factors known to influence RTs within the lexical decision paradigm: word length (Balota, 1994; Hudson & Bergman, 1985), word type (Kostić & Katz, 1987; Tyler, Bright, Fletcher, & Stamatakis, 2004), and relative frequency (Gardner, Lapan, & Lafferty, 1987; Brysbaert, Mandera, & Keuleers, 2017). Admittedly, the authors established in a pilot study that there were no a priori lexical decision time differences between the two sets of (only) target words presented after conflict and no-conflict syllogisms.
However, it remains possible that the differences in RTs registered at the level of the word type factor (2: target, unrelated), that is, the fact that RTs to target words were shorter than RTs to unrelated words, were at least partly due to accidental systematic differences in length, frequency, and type between the words within each set, and not solely due to identity or semantic priming (from exposure to the same or semantically related words in the preceding syllogism) that was reduced by belief inhibition in the case of the LDT after a conflict syllogism.
Our rationale was to test whether De Neys and Franssens' finding of memory impairment after logical reasoning, on which they (partially) based the notion that the belief inhibition always occurs (but is not always completed successfully), would still hold after implementing methodological improvements into the original design. For that purpose, we have run a conceptual replication where contents of the syllogisms and characteristics of words were directly matched and controlled. As in the original study, the design of our study is grounded in the relative complexity methodology, in which time is simply used as a measure of how many steps information processing took (Pylyshyn, 1999). More precisely, this conveys the idea that, next to the general priming effects of the mere presentation of reasoning problem containing target words, the process of inhibiting the content will affect access to target words in the subsequent LDT task, which can be registered as prolonged RTs on those words.

Aim and Hypotheses
The present study aimed to observe the impact or cognitive cost of reasoning (ET) on subsequent simpler cognitive tasks (LDT), in a more restrictive, controlled, and parsimonious experimental design, both in terms of the content and the characteristics of stimuli. Directly, our aim was to test if the inhibition of belief is indeed always occurring in syllogistic reasoning.
In line with this, we adopted the hypotheses of De Neys and Franssens (2009), with an additional hypothesis that takes into account the possibility that the findings would not replicate. H1a: if people indeed try to discard beliefs when solving conflict syllogisms, that is, if everybody always engages in an inhibition process but not all complete it successfully, RTs to lexical decisions on cued words should be longer after conflict syllogisms than after no-conflict syllogisms, regardless of the accuracy of syllogistic reasoning (access to target words should be distorted regardless of whether participants were biased or not). No difference in RTs for unrelated words between conflict and no-conflict syllogisms should occur.
H1b: if people exhibit belief bias because they do not even initiate belief inhibition due to not detecting that their beliefs conflict with the syllogism validity, those who fail to solve conflict syllogisms should not show longer RTs to stimuli related to beliefs that should have been inhibited during reasoning, but those who solved all conflict syllogisms correctly should.

Method
Participants
A total of 145 first-year psychology students at the University of Belgrade, all native speakers of the Serbian language, who provided answers to all tasks were included in the study. Data on gender and age were not collected. All participants had normal or corrected-to-normal eyesight and reported no neurological impairments. Participants received course credit for participation.

Evaluation Tasks
The ETs used were the type of syllogisms employed by Sá, West, and Stanovich (1999) and Markovits and Nantel (1989), as well as by De Neys and Franssens (2009). Each categorical syllogism consisted of two premises (arguments) and a conclusion that was either valid or invalid. Moreover, each syllogism was either believable or not, meaning that its content, particularly the conclusion, either did or did not match common beliefs. When these two binary factors, the logical status and the believability of a syllogism, are crossed, four types of tasks appear: believable and valid, not believable and invalid, believable and invalid, and not believable and valid. The first two types belong to the higher category of congruent syllogisms, as their believability and logical status agree. The last two types, those in which logical status and believability are in conflict, belong to the category of incongruent syllogisms. An example of an incongruent (not believable and valid) syllogism is: 'All plants are oak trees. The root is a plant. Therefore, the root is an oak tree.' Each evaluation task, i.e., syllogism, was defined by three keywords that pertained to one of the four themes: oak, dove, school, and lemonade. Each of these themes (represented by triplets of keywords) was represented by all four types of tasks. This allowed all four themes (triplets of keywords) to appear in all four types of syllogisms. For example, the words from the theme "oak" (oak, plants, root) appeared in a believable invalid, a believable valid, a not believable valid, and a not believable invalid syllogism. This procedure yielded a total of 16 syllogisms that were counterbalanced across participants using a Latin square, so that each participant was presented with four different types of syllogisms on four different themes. Participants' response times to syllogisms were recorded from the moment a conclusion appeared on the screen until the answer was given.
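The counterbalancing described above can be illustrated with a minimal sketch. It assumes a cyclic 4 x 4 Latin square and assignment of lists by participant index; the text states only that a Latin square was used, so the function names, the specific square, and the assignment rule are illustrative assumptions, and the order of presentation within a list is not modelled.

```python
# Hypothetical sketch of Latin-square counterbalancing of 4 themes x 4 syllogism types.
THEMES = ["oak", "dove", "school", "lemonade"]
TYPES = [
    ("believable", "valid"),        # congruent
    ("not believable", "invalid"),  # congruent
    ("believable", "invalid"),      # incongruent
    ("not believable", "valid"),    # incongruent
]


def latin_square_lists(themes, types):
    """Build len(themes) stimulus lists; in list `row`, theme i gets type (i + row) mod n."""
    n = len(themes)
    return [
        [(themes[i], types[(i + row) % n]) for i in range(n)]
        for row in range(n)
    ]


def assign_list(participant_index, lists):
    """Assumed rule: rotate through the lists by participant index."""
    return lists[participant_index % len(lists)]


LISTS = latin_square_lists(THEMES, TYPES)

if __name__ == "__main__":
    for row, stimulus_list in enumerate(LISTS):
        print(row, stimulus_list)
```

Under this scheme every list contains each theme and each syllogism type exactly once, and across the four lists all 16 theme-by-type syllogisms are used.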

Lexical Decision Task
In total, 96 letter strings were used, half of which were pseudo-words. From the total of 48 pseudo-words, four sets, each comprising 12 pseudo-words, were randomly assigned to the four ET themes. The other half of the letter strings, 48 preselected words, was also divided into four sets of 12 words. The words from each of those four sets belonged to two categories: cued and unrelated words. Cued words comprised three keywords that appeared in a syllogism and defined the theme (e.g., oak, plant, root), and three words that were semantically related to those keywords (e.g., tree, trunk, leaf). Unrelated words comprised six words that were semantically unrelated to any of the words from the cued words category (e.g., money, lamp, snake, jacket, etc.). Thus, a set of 24 letter strings was assigned to each of the four themes of syllogisms. Letter strings within each set were presented together in one lexical decision task. For the complete list of words see Appendix A.
All used words were nouns matched by frequency and length. To check the stimuli selection, LDT comprising the very same words and pseudo-words as used in the main experiment was administered, but without the preceding ETs. A fixation cross was presented for 300 ms at the center of the screen before each word. Words were presented at the center of the screen in a randomized order. Participants' task was to indicate whether each of the presented letter strings was a word or not by pressing one of two buttons on a keyboard.
In comparison to the original study by De Neys and Franssens (2009), we applied two new solutions to control confounding influences. First, all stimuli in the LDTs were nouns and were matched for length and frequency. Second, we controlled the themes of the reasoning tasks. Namely, we predefined four themes, each of which was then represented in all four types of syllogisms. Counterbalancing themes and syllogism types allowed the LDT words to be entirely counterbalanced too. This enabled direct comparison of RTs to the very same words in all available conditions, that is, after four different types of syllogisms, instead of comparing reaction times to different words presented in different conditions. In other aspects, we followed the original procedure of De Neys and Franssens (2009) of repeating the ET + LDT sequence.

Procedure
Two experimental sessions, six months apart, were set up. In both sessions, experiments were administered using the OpenSesame software 3.0.7 (Mathôt, Schreij, & Theeuwes, 2012). All stimuli were presented on a black screen at 1366x768 screen resolution.
In the first session, all participants completed four ET + LDT sequences. Participants were familiarized with both tasks and the procedure before starting the main experiment. The data from the practice were not recorded. Each participant was presented with four types of syllogisms belonging to all four themes. Participants were instructed to a) assume that the information presented in the syllogisms was true and b) focus only on the rules of logical reasoning. In the LDT, participants responded to strings of letters by indicating on the keyboard whether they represented a real word or not. They were instructed to respond as fast as they can. The exact instructions for both tasks are provided in Appendix B.
The ETs started by presenting each of the two premises for three seconds. After six seconds both the premises and the conclusion were presented on the screen until the participants gave their response. Immediately after the response was given, a short lexical decision task corresponding to the ET theme comprising a total of 24 letter strings began.
To avoid the possibility of immediate priming within each LDT, stimuli were presented in a pseudo-random order, that is, the cued word was never immediately preceded or followed by a related word. All letter strings were preceded by a fixation cross presented for 300 ms at the center of the screen. After completing the LDT, participants had a short break before starting the next ET, which differed from the preceding ET by congruency. For the graphical presentation of the procedure employed in each ET + LDT sequence see Figure 1.
Figure 1. The graphical representation of an ET+LDT sequence (Oak theme). Note. Left: Evaluation Task (congruent syllogism); the labels next to the upper right corners of the boxes containing premises are presentation times of premises, and the label above the upper right corner of the third box indicates that response time was recorded from the moment of the presentation of an entire syllogism. Right: Lexical Decision Task; the labels next to the upper right corners of the boxes containing letter strings used in LDT indicate the word categories. The left to right order of graphical presentation of the two tasks and the corresponding arrows indicate the order of stimuli presentation in a sequence.
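One way to produce such a pseudo-random order is rejection sampling: shuffle the 24 letter strings and keep the first shuffle in which no two cued words are adjacent. The sketch below is an illustration only. It assumes that "related word" means any other cued word of the same theme, and the function name, the placeholder pseudo-words, and the last two unrelated words are hypothetical; the actual randomization was implemented in OpenSesame and its exact procedure is not described in the text.

```python
import random


def pseudo_random_order(cued, unrelated, pseudo, rng=None, max_tries=10_000):
    """Return a shuffled list of (string, category) pairs with no two cued words adjacent."""
    if rng is None:
        rng = random.Random()
    items = (
        [(w, "cued") for w in cued]
        + [(w, "unrelated") for w in unrelated]
        + [(w, "pseudo") for w in pseudo]
    )
    for _ in range(max_tries):
        rng.shuffle(items)
        # Accept the shuffle only if no pair of neighbours is cued-cued.
        if all(not (a[1] == "cued" and b[1] == "cued")
               for a, b in zip(items, items[1:])):
            return list(items)
    raise RuntimeError("no valid order found; relax the constraint or raise max_tries")


if __name__ == "__main__":
    cued = ["oak", "plant", "root", "tree", "trunk", "leaf"]
    # The last two unrelated words and all pseudo-words are placeholders.
    unrelated = ["money", "lamp", "snake", "jacket", "unrelated5", "unrelated6"]
    pseudo = [f"pseudo{i:02d}" for i in range(1, 13)]
    for string, category in pseudo_random_order(cued, unrelated, pseudo):
        print(category, string)
```

With 6 cued words among 24 strings, roughly one shuffle in five satisfies the constraint, so the loop typically ends after a handful of attempts.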
To confirm that there were no differences in RTs between cued and unrelated words when they were unaffected by preceding syllogistic reasoning tasks, 78 participants completed only the LDT again after six months. The comparison of RTs showed no significant difference between cued and unrelated words (F(1, 77) = 1.643, p = .204, ηp² = .021).

Design
The study employed a factorial design. For registering belief bias and potential differences in response times to different types of syllogisms, we used factorial design with syllogism congruency, i.e., belief-logic conflict (2: conflict, no-conflict) as the within-subjects factor. The dependent variables were accuracy of evaluation (number of correctly solved syllogisms), and evaluation task response time, recorded from the moment of conclusion display until the response was given. For registering whether belief-logic conflict affected reaction times to cued and unrelated words in subsequent lexical decision task, we used a 2 x 2 factorial design with congruency, i.e., belief-logic conflict (2: conflict, no-conflict) and word category (2: cued and unrelated words) as within-subjects factors. The dependent variable was reaction time in lexical decision tasks.
Finally, to test whether accuracy on evaluation tasks influenced reaction times in lexical decision tasks after solving different types of syllogisms, we employed a 2 x 2 x 2 mixed factorial design with syllogism belief-logic conflict, i.e., congruency (2: conflict, no-conflict) and word category (2: cued, unrelated) as within-subjects factors and reasoning skill (2: bad reasoners, good reasoners) as the between-subjects factor. The dependent variable was, again, RT in the LDTs.

Evaluation Tasks
Overall accuracy on evaluation tasks was 63%. The percentage of correct answers was lower for conflict syllogisms (50%) than for no-conflict syllogisms (75%), with 29 participants who correctly solved all conflict syllogisms and 28 participants who did not solve any conflict syllogism correctly. By comparison, the numbers of participants who correctly solved all or none of the no-conflict syllogisms were 74 and 1, respectively. The difference between accuracy (number of correctly solved tasks per participant) on conflict and no-conflict syllogisms was significant (F(1, 144) = 49.384, p < .001, ηp² = .255). Further, participants took less time to solve conflict (M = 3475.36 ms, SD = 4109.13 ms) than no-conflict (M = 4075.33 ms, SD = 4977.71 ms) syllogisms (F(1, 144) = 4.529, p = .035, ηp² = .030). The finer-grained analyses including logical status and believability as separate factors are in the Online Supplement.

Evaluation Tasks and Reaction Time on Lexical Decision Tasks
To test the effects of word type and belief-logic conflict on lexical decision times we conducted a repeated-measures ANOVA and registered a significant interaction between the two factors (F(1, 144) = 5.111, p = .025, ηp² = .034), as well as a significant main effect of word type (F(1, 144) = 74.744, p < .001, ηp² = .342). As expected, due to identity and semantic priming from exposure to cued words in the preceding evaluation tasks, RTs to cued words were shorter (M = 646.579 ms) than RTs to unrelated words (M = 672.357 ms). The main effect of belief-logic conflict on lexical decision times was not significant (F(1, 144) = 2.529, p = .114, ηp² = .017): participants took statistically the same time to make lexical decisions about words after conflict and no-conflict syllogisms. Simple main effects analysis showed, as can also be seen in Figure 2, that RTs for cued words were statistically the same after solving (whether correctly or incorrectly) conflict and no-conflict syllogisms. The mean difference between these lexical decision times was M conflict - M no-conflict = 0.190 ms (F(1, 144) = 0.003, p = .957, ηp² = .000).
There was a significant simple main effect of belief-logic conflict on unrelated words (F(1, 144) = 6.340, p = .013, ηp² = .042): lexical decisions regarding unrelated words were longer by 9.545 ms after no-conflict syllogisms. More importantly, we registered significant simple effects of the word type factor at both levels of belief-logic conflict. Participants' lexical decisions were longer for unrelated than for cued words by a mean difference of 20.910 ms (F(1, 144) = 32.537, p < .001, ηp² = .184) in the case of conflict syllogisms and by 30.646 ms (F(1, 144) = 68.976, p < .001, ηp² = .324) in the case of no-conflict syllogisms. The analysis of the three-way interaction of believability, logical status, and word type can be found in the Online Supplement.
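The partial eta squared values reported above can be cross-checked from the F statistics and their degrees of freedom using the standard identity ηp² = (F × df_effect) / (F × df_effect + df_error). The short sketch below is a reader-side consistency check, not part of the original analysis; the function name is an illustrative choice.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


if __name__ == "__main__":
    # (label, F, df_effect, df_error) as reported in the text
    reported = [
        ("word type main effect", 74.744, 1, 144),
        ("word type x conflict interaction", 5.111, 1, 144),
        ("conflict main effect", 2.529, 1, 144),
        ("conflict effect on unrelated words", 6.340, 1, 144),
        ("word type effect, conflict", 32.537, 1, 144),
        ("word type effect, no-conflict", 68.976, 1, 144),
    ]
    for label, f_value, df1, df2 in reported:
        print(f"{label}: {partial_eta_squared(f_value, df1, df2):.3f}")
```

Rounded to three decimals, these reproduce the reported effect sizes (.342, .034, .017, .042, .184, and .324).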

Reasoning Skill and Reaction Time on Lexical Decision Tasks
Next, we included reasoning skill in the analyses because the absence of the effects of belief inhibition could be due to averaging RTs to cued words after conflict and no-conflict syllogisms across participants who were biased and participants who were unbiased by beliefs. Namely, it could be the case that even if not all reasoners detect conflict and initiate inhibition, those who give unbiased validity judgements do. In that case we would expect longer reaction times to cued words after solving conflict syllogisms than after solving no-conflict syllogisms, but only in the good reasoners group.
Figure 3. The word type, congruency, and reasoning skill interaction. Note. Error bars are standard errors of the mean.
Results of post-hoc tests showed a few differences and are presented in Tables 1 through 3. First, lexical decisions about cued words were always faster than decisions about unrelated words. This difference was significant for good reasoners regardless of syllogism congruency, while bad reasoners showed significantly shorter reaction times to cued words only after no-conflict syllogisms. After solving conflict syllogisms, bad reasoners were equally fast to make a lexical decision regardless of the word type (see Table 1). These findings indicate priming effects, which are even stronger in the "good reasoners" group. In line with this is the finding that bad reasoners in general took longer than good reasoners to make a lexical decision, and this difference reached significance in the case of words cued by conflict syllogisms (see Table 3 and Figure 3).
To summarize, we registered longer reaction times to unrelated than to cued words, and longer reaction times in bad reasoners than in good reasoners, indicating priming effects that are stronger in the good reasoners group.

Discussion
When evaluating the validity of a syllogism, reasoners should focus only on the validity of the conclusion, which means that they should not pay any attention to the content of that syllogism. However, not paying attention is neither a passive nor an automatic process. Tossing content away is not cognitively cheap, especially for cognitive misers, because human reasoning is related to and even dependent on the content about which we are reasoning (e.g., Casadio, 2016; Davies, Fetzer, & Foster, 1995; Gigerenzer & Hug, 1992; Stanovich, 2018; etc.). Keeping content on hold requires cognitive effort; one should inhibit beliefs and knowledge about that content to override intuitive belief-based answers, and then initiate more demanding Type 2 processing, as proposed by De Neys and Franssens (2009). The aim of the present study was a conceptual replication that repeated the procedure comprising the reasoning task and the lexical decision task, established by De Neys and Franssens (2009), and tested whether the findings of the original study would still hold after stricter control was implemented in the design. We employed two-response and lexical access paradigms in a repeated experimental design. We presented participants with four types of categorical syllogisms; in two of them the believability of the content conflicted with the logical status, and in the other two it did not. Participants' task was to evaluate whether the conclusion logically follows from the premises. Each syllogistic ET was followed by the LDT comprising the words cued by the syllogism (identical and semantically related), unrelated words, and pseudo-words as control. There were two important additions in our repeated design. First, there were altogether four themes of syllogisms, which means that each content (or belief) appeared in every type of syllogism.
This enabled each word to be primed by the four types of syllogisms without changing the content, e.g., the word oak was primed by the believable-logical, believable-illogical, non-believable-logical, and non-believable-illogical syllogism. Secondly, all the words in our LDT were controlled for length, frequency, and type.
Results of our study are following findings of a large body of research on belief bias, which show that evaluation of the logical validity of the incongruent syllogisms is indeed significantly harder compared to the congruent syllogisms (De Neys et al., 2011;De Neys, & Franssens, 2009;Sá et al., 1999;Stupple & Ball, 2008). Accuracy of the evaluation depended on the logicality of the syllogism, but to a different degree depending on the believability of the content, which is in accordance with the classical notion that belief bias is more pronounced on invalid syllogisms (Evans et al., 1983). The logicality did not make it any easier to evaluate the logical status of the syllogism when the content of the identically themed syllogism was not believable (e.g., Root has oak does not differ from Oak has plant). When the content is believable, the evaluation generally takes a smaller amount of time, and it is easier to correctly assess the logicality of the valid syllogisms in comparison to the invalid ones. When the content is not believable, differences in the accuracy and the lasting of the evaluation process between valid and invalid syllogisms were not registered. The accuracy of evaluation of believable syllogisms was higher in comparison to the not-believable on both valid and invalid syllogisms. Even though evaluation of not believable invalid syllogism took a longer time than evaluation of believable ones, time invested did not pay off in terms of accuracy rates. These findings are in accordance with De Neys and Franssens' findings, and indicate that the interplay of the logicality and believability works in the following manner: positive logical status helps to reason when content is believable, but if that is not the case -then reasoning requires an additional effort, that is -the inhibition of unbelievable content.
The question was what happened when that very content, conflicted or not, was presented in the subsequent lexical decision task. As expected, every target word was identically or semantically primed by the preceding syllogism which contained those words, as found in previous studies employing the lexical access paradigm (De Neys & Franssens, 2009; Svedholm-Häkkinen, 2015). Overall, mean RT for cued words was significantly shorter than mean RT for unrelated words. However, the registered general priming effect is not the crucial finding. The focus of both the original study and ours was the comparison between RTs for cued and uncued words in four different reasoning conditions. We analyzed the expected inhibitory effects within priming, registered as the difference in RTs to different types of words after different types of syllogisms. De Neys and Franssens' (2009) findings showed that RTs to the cued words after conflict syllogisms were longer than RTs to the cued words after non-conflict syllogisms. Based on this result, they proposed that inhibition of content is a crucial phase in the evaluation of the logical validity of a deductive syllogism, initiated by all reasoners though not always completed. The difference between RTs to words cued by conflict and non-conflict syllogisms in our study was virtually zero, although differences were observed between "good" and "bad" reasoners.
It is proposed that inhibitory processes are triggered by the successful detection of the conflict between content and validity of the conclusion (Stanovich, 2018), which may be observed as longer RTs to the cued words after a conflict syllogism compared to RTs to the cued words after a non-conflict syllogism, regardless of the correctness of the answers. Our findings do not support this notion, since, as stated, RTs to the cued words after conflict syllogisms were not longer than RTs to cued words after non-conflict ones. Moreover, the type of the task did not play a role in reshaping the priming effects, meaning that neither inhibition nor a difference in inhibition was observed. RTs to the words cued by conflict syllogisms were shorter than RTs to unrelated words after conflict syllogisms, and the same pattern was observed after non-conflict syllogisms. In short, after methodological restrictions were introduced into the original design, priming effects were registered, but neither the believability of the content nor the logicality of the conclusion was a factor; the only difference in priming effects came from the type of words.
The interpretation of our findings should take into account theoretical considerations about the nature of the inhibition process. Inhibition may indeed be treated as memory impairment, meaning that the content of the syllogism should be put aside while judging logicality, so that its subsequent retrieval requires a longer time, as implied in the original study. Yet inhibition, as a form of cognitive control, demands cognitive resources, meaning that the content which is inhibited is being actively processed, even if more shallowly than when the content is computed. It could be that the inhibited content was actively processed, which enabled easier access to that content in the subsequent lexical decision tasks.
The absence of registered effects of the type of the task still does not mean that different types of answers based on Type 1 and Type 2 processes do not influence word access. Therefore, we analyzed the interplay between syllogism believability, logical status, and word type on RTs in the lexical decision task. The only significant effect was that of word type, though the effect of believability was extremely close to being significant. Both findings dispute our prediction that, because of inhibition, RTs for cued words after conflict syllogisms would be longer than RTs for the same words after non-conflict syllogisms. This pattern implies that sound evaluation of congruent syllogisms and evaluation of incongruent syllogisms do differ in terms of cognitive effort, most probably due to the existence of conflict in the latter and its detection, despite the shorter response times for incongruent ones.
Besides the role of inhibition, the findings of our study, more specifically those regarding the good and the bad reasoners, could also be discussed in light of the more recent iterations of the dual-process theories (DPT) framework. One plausible assumption is that the good reasoners do not need to engage in inhibitory processing because they simply do not generate a belief-based response. This is supported by the findings of Svedholm-Häkkinen (2015), which are based on the observed inhibitory lexical decision effects among less gifted reasoners; this effect was not registered in a group of highly reflective reasoners. De Neys and Franssens' median split data also indicated that the effect of belief inhibition was less pronounced among the better reasoners. Findings of these three studies (De Neys & Franssens, 2009; Svedholm-Häkkinen, 2015; and the present study) point toward a theoretical re-conceptualization in light of fast logic. Namely, the description of a successful evaluation of the validity of a categorical syllogism was, until recently, based on the traditional DPT framework, or the "basic and binary" one (Evans, 2003; Evans & Over, 1996; Smith & DeCoster, 2000; Stanovich, 1999; Kahneman, 2011). However, as previously pointed out, recent iterations of the DPT propose that people can be cognitive misers and logicians simultaneously (De Neys, 2012, 2014, 2018; De Neys & Bonnefon, 2013; Bago & De Neys, 2017; Evans & Stanovich, 2013; Pennycook, Fugelsang, & Koehler, 2015; De Neys, Thompson, & Pennycook, 2018). The simulation of the interplay of reasoning and performance on subsequent simpler cognitive tasks can contribute to the understanding of the cognitive price of the logical intuitions. Research suggests that intuitive sensitivity to logical structure arises because logical arguments are more fluent and based on mindware (Howarth, Handley, & Walsh, 2018; Klauer & Singmann, 2013; Morsanyi & Handley, 2012; Stanovich, 2018).
That would mean that reaching a correct answer on a conflict syllogism does not require effort and time; rather, it can be the result of implicit and automatic processing. If this is the case, the implication could be that we found shorter RTs for cued words after correctly evaluated conflict syllogisms than after erroneously evaluated ones because there is no need to discard content to give the correct answer (De Neys & Pennycook, 2019).
The main limitation pertains to the materials. Rigorous control for type, length, and frequency of words limited the number of ETs, so every participant solved only one ET per syllogism type, which made an individual differences approach impossible, and analyses were conducted only at the level of the experimental material. On a paradigmatic level, binary-response syllogistic reasoning tasks remove the possibility of registering different heuristic answers and allow guessing the right answer. Further studies should overcome these limitations by increasing the number of ETs and, if possible, by including different, more pronounced modes of answering, as well as other reasoning tasks.

Conclusion
The sequence of syllogism evaluation and lexical decision task introduced by De Neys and Franssens (2009) proved to be useful for experimentation within the dual-process approach. Our replication study imposed methodological restrictions, which improved control in the proposed method, and the findings, to some extent, shed light on the notion that the inhibition of belief in syllogistic reasoning requires cognitive effort and that, as such, the effects of belief inhibition, even though subtle, are one of the markers of logicality. To summarize, findings of the original study were replicated for the priming effects, but not for the impairment of lexical access. Our findings suggest that, using the original terminology, successfully conducted Type 2 reasoning enhances lexical access to the cued content rather than impairing it, and that, at the same time, heuristic reasoning could be more cognitively costly than commonly presumed.