The inhibitory control reflex

Response inhibition is typically considered a hallmark of deliberate executive control. In this article, we review work showing that response inhibition can also become a 'prepared reflex', readily triggered by information in the environment, or, after sufficient training, a 'learned reflex' triggered by the retrieval of previously acquired associations between stimuli and stopping. We present new results indicating that people can learn various associations, which influence performance in different ways. To account for previous findings and our new results, we present a novel architecture that integrates theories of associative learning, Pavlovian conditioning, and executive response inhibition. Finally, we discuss why this work is also relevant for the study of 'intentional inhibition'.


Introduction
Few seem to doubt the importance of response inhibition for optimal and goal-directed behaviour. Without the ability to stop habitual or no-longer relevant actions, we would be slaves of our past; we would be impulsive creatures that respond to any potentially relevant stimulus that presents itself; and we would not be able to respond adequately to changes in the environment. Quite often, this would lead to terrible outcomes. One could even say that we would be doomed without inhibition… At least, it seems this way when you look at the pivotal role of response inhibition in current theories of self-control. There is a vast amount of cognitive and neuroscience literature that suggests that response inhibition is one of the core 'executive' or 'cognitive control' functions (Logan, 1994; Miyake et al., 2000; Ridderinkhof, van den Wildenberg, Segalowitz, & Carter, 2004; Verbruggen & Logan, 2008d). Furthermore, work in psychiatry and clinical psychology suggests that deficits in response inhibition are associated with various clinical disorders (Bari & Robbins, 2013; Chambers, Garavan, & Bellgrove, 2009). It is not always obvious whether the response inhibition deficit is the cause or a consequence of the disorder, but some longitudinal studies suggest that the ability to stop one's actions can influence behavioural and substance addictions later in life (e.g. Nigg et al., 2006). In this article we will not dispute that response inhibition is a critical aspect of cognitive and emotional functioning. However, we will question the general idea that response inhibition is always a deliberate act of control. We will demonstrate that learning to stop can lead to automatisation of response inhibition. We focus primarily on 'external' or 'stimulus-driven' response inhibition, but also consider briefly how this work can have implications for the study of 'intentional' inhibition (Brass & Haggard, 2007; Filevich, Kühn, & Haggard, 2012).
We will review previous research on 'automatic' inhibition and related topics, and present new empirical material that speaks to the issues of what is learned and how it is learned.

Response inhibition in the laboratory
Popular paradigms to study top-down or deliberate response inhibition include the go/no-go paradigm and the stop-signal paradigm. In the go/no-go paradigm, subjects are presented with a series of stimuli and are told to respond when a go stimulus is presented and to withhold their response when a no-go stimulus is presented (e.g. press the response key for a square but do not press the response key for a diamond). In the stop-signal paradigm, subjects usually perform a choice reaction task on go trials (e.g. press the left response key for a square and press the right response key for a diamond). On a random selection of the trials (stop trials), a stop signal (e.g. an auditory tone or a visual cue, such as the outline of the go stimulus turning bold) is presented after a variable delay (stop-signal delay; SSD), which instructs subjects to withhold the response to the go stimulus on those trials. Popular variants of the stop-signal paradigm include the countermanding task, in which eye movements have to be cancelled, and the stop-change task, in which the cancelled response has to be replaced by another response (Verbruggen & Logan, 2009b).
Performance in response inhibition paradigms can be modelled as an independent "horse race" between a go process, which is triggered by the presentation of a go stimulus, and a stop process, which is triggered by the presentation of the no-go stimulus or the stop signal (Logan & Cowan, 1984; Logan, Van Zandt, Verbruggen, & Wagenmakers, 2014; Verbruggen & Logan, 2009b). When the stop process finishes before the go process, response inhibition is successful and no response is emitted (signal-inhibit); when the go process finishes before the stop process, response inhibition is unsuccessful and the response is incorrectly emitted (signal-respond). The latency of the stop process (stop-signal reaction time or SSRT) is covert, but it can be estimated in the stop-signal task (Logan & Cowan, 1984). SSRT has proven to be an important measure of the cognitive control processes that are involved in stopping (but see Verbruggen, Chambers, & Logan, 2013, for a cautionary note).
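The race logic described above can be illustrated with a small simulation. The following is only a minimal sketch: the Gaussian finishing-time distributions, their parameters, the function names, and the use of the 'integration' estimator of SSRT are all assumptions made for illustration, not values or procedures taken from the studies reviewed here.

```python
import random

def simulate_stop_trials(ssd, n_trials=20000, seed=1,
                         go_mean=500.0, go_sd=80.0,
                         ssrt_mean=200.0, ssrt_sd=20.0):
    """Return P(respond | stop signal) under an independent race.

    All distribution parameters are hypothetical illustration values (ms).
    """
    rng = random.Random(seed)
    responded = 0
    for _ in range(n_trials):
        go_finish = rng.gauss(go_mean, go_sd)              # go process finishing time
        stop_finish = ssd + rng.gauss(ssrt_mean, ssrt_sd)  # stop process finishing time
        if go_finish < stop_finish:                        # go wins: signal-respond
            responded += 1
    return responded / n_trials

def estimate_ssrt(go_rts, p_respond, ssd):
    """Integration estimate: go RT at quantile P(respond | signal), minus SSD."""
    ordered = sorted(go_rts)
    index = min(len(ordered) - 1, max(0, int(p_respond * len(ordered))))
    return ordered[index] - ssd

rng = random.Random(2)
go_rts = [rng.gauss(500.0, 80.0) for _ in range(20000)]  # simulated no-signal go trials
p_respond = simulate_stop_trials(ssd=250.0)
ssrt_estimate = estimate_ssrt(go_rts, p_respond, ssd=250.0)
print(round(p_respond, 3), round(ssrt_estimate, 1))
```

Under these assumptions the estimate should land close to the simulated mean stop latency of 200 ms, and P(respond | signal) should rise as the SSD is lengthened, because a later stop signal gives the go process a head start.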
The independent race model of Logan & Cowan (1984) assumes stochastic independence between the go and stop processes. However, complete independence between the go and stop processes is unlikely. Neuroscience studies indicate that going and stopping interact in the basal ganglia (note that for the inhibition of eye movements, the interaction seems to take place in the frontal eye fields and the superior colliculus; see e.g. Schall and Godlove (2012)). A motor response can be activated via the direct cortical-subcortical pathway (Nambu, Tokuno, & Takada, 2002). This involves the activation of 'Go' cells in the striatum, which inhibit the internal segment of the globus pallidus (GPi); this reduces inhibition of the thalamus, leading to the execution of a motor response. But the execution can be cancelled via activation of the indirect or hyperdirect pathways (Nambu et al., 2002). The indirect pathway involves the activation of 'No-go' striatal cells, which inhibit the external segment of the globus pallidus (GPe); this reduces tonic inhibition between GPe and the GPi, resulting in increased activity in GPi, and consequently, increased inhibition of the thalamus. It is thought that this can lead to the selective inhibition of a particular response (Aron & Verbruggen, 2008; Smittenaar, Guitart-Masip, Lutti, & Dolan, 2013). The downside of this pathway is that inhibition may be relatively slow (Aron, 2011; Aron & Verbruggen, 2008). Fast but global response inhibition could be achieved via a third pathway, namely the hyperdirect pathway (Aron et al., 2007; Wiecki & Frank, 2013). This involves activation of the subthalamic nucleus, which in turn has a broad effect on GPi, leading to global suppression of the thalamus. Computationally, the interaction between the go and stop processes can be described by the interactive race model (Boucher, Palmeri, Logan, & Schall, 2007).
In this model, the go process is initiated by the go stimulus and a go representation is activated after an afferent delay. The stop process is initiated by the stop signal and a stop representation is activated after an afferent delay. Once the stop representation is activated, it inhibits go processing strongly and quickly. In this interactive race model, SSRT primarily reflects the period before the stop unit is activated, during which stop and go processing are independent, so its predictions correspond to those of the independent race model (Logan & Cowan, 1984).
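The description above can be turned into a toy simulation. This is a deliberately simplified, noise-free sketch and not the fitted model of Boucher et al. (2007): the function name, the afferent delays, the growth rates, the inhibition strength, and the threshold are all hypothetical values chosen only to make the qualitative behaviour visible.

```python
def interactive_race(ssd=None, max_time=1000):
    """Simulate one noise-free trial in 1-ms steps.

    Returns the RT in ms, or None if the go unit never reaches threshold
    (i.e. the response is stopped). All parameters are hypothetical.
    """
    go_delay, stop_delay = 50, 60      # afferent delays (ms)
    go_rate, stop_rate = 0.005, 0.05   # activation growth per ms
    inhibition = 0.1                   # strength of stop -> go inhibition
    threshold = 1.0
    go_act, stop_act = 0.0, 0.0
    for t in range(max_time):
        # The stop unit only becomes active after SSD plus its afferent delay.
        if ssd is not None and t >= ssd + stop_delay:
            stop_act = min(1.0, stop_act + stop_rate)
        # The go unit grows after its own delay and is suppressed by the stop unit.
        if t >= go_delay:
            go_act = max(0.0, go_act + go_rate - inhibition * stop_act)
        if go_act >= threshold:
            return t
    return None

print(interactive_race())          # go trial: a response is emitted
print(interactive_race(ssd=100))   # early stop signal: response is cancelled
print(interactive_race(ssd=230))   # late stop signal: response escapes
```

With these settings the go unit is unaffected until the stop unit switches on, after which go activation collapses within a few milliseconds. Whether a response escapes therefore depends almost entirely on when the stop unit is activated, which is the sense in which the interactive model's predictions stay close to those of the independent race.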
Most research on response inhibition focuses on 'reactive' control processes after a no-go or stop signal is presented. However, successful performance in inhibition tasks requires finding a balance between going quickly on go trials and withholding a response on no-go or stop trials (Verbruggen & Logan, 2009c). Reaction time (RT) is typically longer in blocks in which stop signals can occur than in blocks in which no stop signals can occur. Several researchers have argued that this slowing reflects 'proactive' control adjustments: when subjects expect a no-go or stop signal, they adjust attentional settings, increase response thresholds, or proactively suppress all motor outputs to prevent premature responses (e.g. Aron, 2011; Jahfari, Stinear, Claffey, Verbruggen, & Aron, 2010; Verbruggen & Logan, 2009c; Zandbelt, Bloemendaal, Neggers, Kahn, & Vink, 2013). Inter- and intra-individual differences in proactive control may influence overall stopping performance. Therefore, proactive control is an important avenue for future research. But in this paper, we will highlight another aspect of response inhibition, namely the impact of priming and learning on performance.

Inhibition as a primed or prepared reflex
Most researchers assume that response inhibition in the go/no-go and stop-signal paradigms is a goal-driven and deliberate act of control. But in a series of studies, Van Gaal et al. demonstrated that response inhibition could be triggered by low-visibility primes in both the go/no-go and stop-signal paradigms. In their studies, they contrasted no-go or stop trials on which the briefly presented no-go or stop signal was masked, with go trials without a signal and with no-go or stop trials without a mask. Behaviourally, they found that the presentation of low-visibility no-go or stop signals slowed down responding and increased the percentage of missed responses slightly (van Gaal, Ridderinkhof, Fahrenfort, Scholte, & Lamme, 2008; van Gaal, Ridderinkhof, van den Wildenberg, & Lamme, 2009; van Gaal, Ridderinkhof, Scholte, & Lamme, 2010; van Gaal, Lamme, Fahrenfort, & Ridderinkhof, 2011). They attributed this pattern to the 'unconscious' activation of the response inhibition network (but see Newell and Shanks (2014) for general concerns about procedures to assess consciousness) 1. In the stop-signal experiments, the slowing seemed to increase over practice (van Gaal et al., 2009, 2011), suggesting that there was a learning component to the priming effect. The idea that the response inhibition network could be primed was further supported by a comparison between the low-visibility primes and the high-visibility no-go/stop signals. More specifically, the low-visibility primes elicited activation in frontal regions that are typically associated with deliberate, top-down inhibition (van Gaal et al., 2008, 2010), although it should be noted that there were some differences as well (van Gaal et al., 2011). Importantly, the activation of this 'unconscious inhibition network' correlated positively with the degree of slowing.
In one of our own studies we demonstrated that stopping could also be primed by task-irrelevant (highly visible) features (Verbruggen & Logan, 2009a). In a series of experiments, we presented the primes GO, ♯♯♯, or STOP inside stimuli (circles or squares). In Experiment 1, subjects were instructed to respond to the shape (e.g. circle = left, and square = right) but to withhold the response when an auditory stop signal was presented. They were instructed to ignore the primes in the go stimulus. Even though the words were always irrelevant, we found that reaction times on go trials were significantly longer for STOP than for ♯♯♯ and GO primes; there was no reliable difference between ♯♯♯ and GO (Verbruggen & Logan, 2009a). In another experiment, GO, ♯♯♯, or STOP were presented as stop signals. Subjects were told to inhibit the go response whenever any of these stimuli appeared. An analysis of stop-signal reaction times revealed that stop performance was slower for GO than for ♯♯♯ or STOP (Verbruggen & Logan, 2009a). Combined, these findings suggest that task goals, such as going and stopping, can be primed and that response inhibition and executive control can be influenced by automatic processing 2. However, the priming effects were influenced by task context; the 'STOP' prime slowed responding on go trials in a stop-signal task or a go/no-go task, but not in a task in which subjects could always respond (go-only task) (Verbruggen & Logan, 2009a, Experiment 2). In other words, 'STOP' primed the stop goal only in conditions in which the goal was relevant to the task context. Note that Chiu and Aron have shown that the effects of low-visibility primes may also be context-dependent (Chiu & Aron, 2014; but see Lin and Murray (2014), who raised some methodological concerns about their priming manipulation).
Finally, contingent involuntary response inhibition was also demonstrated by Anderson and Folk (2012, 2014), who found that flankers that share the colour of a no-go stimulus could suppress motor responses. They also concluded that response inhibition can be automatically triggered by a stimulus based on top-down goals.
These priming studies suggest that stopping or inhibitory control may not require high levels of awareness. It may even be an automatic act of control, triggered by the presentation of irrelevant stimulus features. However, the context effects demonstrate that bottom-up and top-down acts of control interact. Indeed, it is possible that inhibitory control became a 'prepared reflex' in the studies discussed above. We have recently reviewed the action control literature, which demonstrates that people can proactively allocate attention to a specific location or to a specific stimulus feature, proactively select an action, or prepare a movement (Verbruggen, McLaren, & Chambers, 2014). When attention is preallocated or a response is prepared, goal-directed actions may not require much control anymore (Hommel, 2000; Logan, 1978; Meiran, Cole, & Braver, 2012); instead, actions can be activated easily by stimuli in the environment (Chiu & Aron, 2013; van Gaal et al., 2008, 2009, 2010, 2011), even when they are inappropriate (Verbruggen & Logan, 2009a). We consider response inhibition to be a specific form of action control (Verbruggen, Aron, Stevens, & Chambers, 2010), which involves the selection of a stop response rather than another go response. Therefore, it seems plausible that both going and stopping can become a 'prepared' reflex (Hommel, 2000; Logan, 1978; Meiran et al., 2012). Note that the 'prepared reflex' account overlaps strongly with the 'implementation intention' idea. Implementation intentions refer to the linking of critical situations or cues to specific actions (e.g. 'Whenever I see an unhealthy food item, I will not buy it'). This could lead to a prepared reflex; indeed, Gollwitzer noted that after implementation intentions are formed, 'action initiation becomes swift, efficient, and does not require conscious intent' (Gollwitzer, 1999, p. 495).

Response inhibition as a learned reflex
In the previous section, we explored the idea that response inhibition could become a 'prepared' reflex. The studies reviewed suggest an interaction between bottom-up and top-down processes. But is top-down control required at all? It is well documented that responding to a stimulus or cue can become 'automatised' over practice (Dickinson, 1985; Logan, 1988). Given that we consider response inhibition to be similar to response execution in many ways, it naturally follows that we hypothesise that response inhibition can also become 'automatised' over practice. In this section, we will review some work that suggests response inhibition can indeed become a 'learned' reflex, easily triggered by the retrieval of acquired stimulus-stop associations.

Sequential effects of stopping
After a stop trial, response latencies generally increase. This post-signal slowing is more pronounced when the stimulus or stimulus category of the previous trial is repeated (Bissett & Logan, 2011; Enticott, Bradshaw, Bellgrove, Upton, & Ogloff, 2009; Oldenburg, Roger, Assecondi, Verbruggen, & Fias, 2012; Rieger & Gauggel, 1999; Verbruggen & Logan, 2008a, 2008c; Verbruggen, Logan, Liefooghe, & Vandierendonck, 2008). We have attributed this stimulus-specific slowing to the retrieval of stimulus-stop associations: a go stimulus becomes associated with a 'stop' representation on a stop trial; when it is repeated on the next go trial, the stop representation is activated via associative retrieval, and this will suppress the go response. This idea is related to the 'do-not-respond tag' account of the negative priming effect of Neill et al. (Neill, Valdes, Terry, & Gorfein, 1992). Negative priming refers to the finding that after a stimulus has appeared as a distractor in congruency tasks such as a picture-naming task or an Eriksen flanker task, responding to it on the next trial is usually impaired. Neill et al. proposed that a distractor becomes associated with a do-not-respond representation; when it is repeated on the next trial as a target, the do-not-respond association is activated via associative retrieval, and this will interfere with responding. It is no coincidence that our 'stimulus-stop association' account and Neill's 'do-not-respond tag' account overlap, as both are explicitly based on the Instance Theory (Logan, 1988). Logan suggested that every time people respond to a stimulus, processing episodes are stored as instances in memory. These episodes consist of the stimulus (e.g. a shape), the interpretation given to a stimulus (e.g. 'square'), the task goal ('shape judgment'), and the response ('left').
When the stimulus is repeated, previous processing episodes are retrieved, facilitating performance if the retrieved information is consistent with the currently relevant information but impairing performance if the retrieved information is inconsistent. On a stop trial, the go stimulus or stimulus category becomes associated with stopping; when the stimulus (or category) is repeated, the stimulus-stop association is retrieved, and this interferes with responding on go trials. The idea here, then, is that the go response/goal and the stop response/goal are mutually inhibitory (cf. Boucher et al. (2007)).
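The retrieval account above can be sketched as a simple episodic store. This is an illustrative toy only, under the assumption that an episode can be reduced to a stimulus paired with a goal label; the class and method names are hypothetical and are not part of the Instance Theory itself.

```python
from collections import defaultdict

class InstanceMemory:
    """Toy episodic store: each stimulus maps to the goals it was paired with."""

    def __init__(self):
        self._episodes = defaultdict(list)  # stimulus -> list of stored goals

    def store(self, stimulus, goal):
        """Store one processing episode (reduced here to stimulus + goal)."""
        self._episodes[stimulus].append(goal)

    def retrieve(self, stimulus):
        """Return all goals previously stored with this stimulus."""
        return list(self._episodes.get(stimulus, []))

    def conflicts_with(self, stimulus, current_goal):
        """True if any retrieved episode carries a goal other than the current one."""
        return any(goal != current_goal for goal in self.retrieve(stimulus))

memory = InstanceMemory()
memory.store("square", "stop")   # stop trial: 'square' is paired with stopping
memory.store("diamond", "go")    # go trial: 'diamond' is paired with going

# On the next go trial, only the repeated stop stimulus retrieves a conflicting goal.
print(memory.conflicts_with("square", "go"))
print(memory.conflicts_with("diamond", "go"))
```

In this sketch a retrieved 'stop' episode is simply flagged as inconsistent with the current 'go' goal; in the account described above, that inconsistency is what would slow responding to the repeated stimulus.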
A recent study has shown that stimulus features that are not relevant to the current task goal can also become associated with a stop representation (Giesen & Rothermund, 2014). In their Experiment 1, Giesen and Rothermund used a prime-probe design in which subjects had to press the space bar whenever a word appeared. The identity of the word was always irrelevant. Nevertheless, they found that stopping the response on the prime trial delayed responding on the probe trial if the prime word was repeated. This suggested that the word was associated with stopping, even though its identity was never relevant. In a second experiment, they demonstrated that responding was delayed even when the prime and probe response were different. In this experiment, the colour of a letter indicated whether subjects had to execute a left or right response; the identity of the letter ('D' or 'L') was irrelevant. Giesen and Rothermund found that responding to a letter was slowed down if a stop signal was presented on the prime trial, regardless of the 'to-be-executed' or 'to-be-stopped' response (e.g. a green D on the prime, followed by a red D). This suggests that the stimulus-stop associations may have a global effect on responding.
2 We assumed that priming reflected automatic and unintentional processing because the identity of the primes never predicted whether subjects needed to go or stop or which go response they should make (Tzelgov, 1997).

Response inhibition as an automatic act of control?
The stimulus-stop effects are observed up to 20 trials after the presentation of the stop signal (Verbruggen & Logan, 2008c). Similar long-term effects have been observed in task-switching studies, suggesting that stimuli can become associated with tasks or task goals (Koch & Allport, 2006; Waszak, Hommel, & Allport, 2003, 2005). Such long-term associations may support the development of 'automatic' response inhibition. Instance-based accounts of automatisation attribute automaticity to retrieval of stored instances 3, which will occur after practice in a consistent environment (Logan, 1988). The repetition priming effects could be a first step towards automatisation (Logan, 1990). In a series of experiments, we examined the idea that inhibitory control in go/no-go and stop-signal tasks can be triggered automatically via the retrieval of stimulus-stop associations from memory (Verbruggen & Logan, 2008b). Initially, we used go/no-go tasks in which the stimulus category defined whether subjects had to respond (e.g. living objects = go) or not (e.g. non-living objects = no-go). We trained subjects to stop their response to a specific stimulus, and then reversed the go/no-go mappings in a test phase. In this test phase, subjects were slower to respond to that stimulus compared with stimuli that they had not seen before (Verbruggen & Logan, 2008b, Experiment 1). This slowing was still observed when the tasks changed from training to test: subjects made living/non-living judgements in training but large/small judgements in test (or vice versa; Experiment 2), and RTs were longer for inconsistent items (i.e. no-go in one task but go in the other task) than for consistent items (i.e. go in both tasks). Again, this is consistent with findings in the task-switching literature. For example, Pösse, Waszak, and Hommel (2006) have demonstrated that stimulus-response associations could survive one or more task switches.
Based on these findings, we proposed the automatic inhibition hypothesis: 'automatic inhibition' occurs when old no-go stimuli retrieve the stop goal when they are repeated, and this interferes with go processing (Verbruggen & Logan, 2008b). Stimulus-stop mapping is typically consistent in the go/no-go paradigm, so automatic inhibition is likely to occur. However, automatic inhibition can also occur in the stop-signal task when the mapping is manipulated. In the training phase of Experiment 5 of Verbruggen and Logan (2008b), a subset of the stimuli was consistently associated with stopping or going, and another subset was inconsistently associated with stopping and going, as is typical in stop-signal experiments. In the test phase, the stimulus-stop and stimulus-go mappings were reversed for consistent stimuli. Consistent with the go/no-go experiments, we found that responding was slowed down for items that always occurred on stop trials in the training phase.
The stop-signal experiment of Verbruggen and Logan (2008b) suggests that even in the stop-signal task, response inhibition is not always an effortful or deliberate act of control. As discussed in more detail below, similar behavioural results were obtained in a neuroimaging study of automatic inhibition (Lenartowicz, Verbruggen, Logan, & Poldrack, 2011). However, slowing may not have been caused by associative retrieval in these experiments but by sequential dependencies. Due to the stimulus-stop manipulation and the fixed overall probability of stop signals, an old stop item (which was always 'go' in test) was more likely to follow a stop trial than the control items (which were 'go' or 'stop') in the test phase. As mentioned above, response latencies generally increase after a stop trial; this slowing can reflect top-down shifts in goal priorities (Bissett & Logan, 2011). Furthermore, subjects can detect even small statistical regularities in sequential designs (e.g. Yeates, Jones, Wills, McLaren, & McLaren, 2012). Therefore, we have conducted an experiment to check whether the response slowing for old stop items is due to top-down goal-shifts or sequential learning rather than bottom-up factors. A detailed description of the procedure and the results can be found in Appendix A. There were two groups: an experimental group and a control group. Subjects in the experimental group made speeded semantic categorisations (living/non-living) on a series of words. On some trials (stop trials) an additional signal was presented, instructing subjects to withhold their planned response (see Fig. 1). Each word was presented five times within the block; the first four presentations were 'training' or acquisition trials, the fifth and final presentation was the 'test' trial.
Fig. 1. Example of a trial sequence in the stop-learning paradigm that is used to check whether the response slowing for old stop items is due to top-down goal-shifts or sequential learning rather than bottom-up factors. The stop-then-go and stop/go-then-go words are depicted; the first four presentations are the training phase, and the fifth presentation is the test phase. The distinction between the training and test phase is for illustration only, as subjects are not informed about this distinction. FIX = duration of the fixation interval; SSD = stop-signal delay; MAXRT = maximum reaction time (see Appendix A for further details).
3 Instance Theory construes automaticity as a memory phenomenon: 'Automaticity is memory retrieval: Performance is automatic when it is based on single-step direct-access retrieval of past solutions from memory. The [Instance Theory] assumes that novices begin with a general algorithm that is sufficient to perform the task. As they gain experience, they learn specific solutions to specific problems, which they retrieve when they encounter the same problems again. Then, they can respond with the solution retrieved from memory or the one computed by the algorithm. At some point, they may gain enough experience to respond with a solution from memory on every trial and abandon the algorithm entirely. At that point, their performance is [completely] automatic' (Logan, 1988, p. 493).
There were four stimulus types within each block that occurred with equal probability. 'Stop-then-go' items always occurred on stop trials during training, but occurred on a go trial in the test phase (stop-stop-stop-stop-go). 'Stop/go-then-go' items occurred with equal probability on stop and go trials during training (50%) but the order was otherwise random; they always occurred on a go trial in the test phase (e.g. go-stop-go-stop-go). Other stimuli in the block were the 'stop/go-then-stop' (e.g. go-stop-go-stop-stop) and the 'go-then-stop' (go-go-go-go-stop) items. Hence, the overall probability of a stop trial was 0.5. New words were used in each block to prevent re-learning. Subjects in the control group made the same speeded semantic categorisations, but no items were consistently associated with stopping or going in the training phase. For each subject in the control group, the signal sequence was yoked to the signal sequence of a subject in the experimental group. This allowed us to test whether effects in the experimental group were due to item-specific learning or effects of the stop-signal sequence.
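The stated overall stop probability of 0.5 follows directly from the four item types, and the arithmetic can be checked in a few lines. The sketch below assumes equal numbers of items per type (as the text states) and uses the example orders given above; for the stop/go items only the counts matter, since their training order was random. The variable and function names are hypothetical.

```python
from fractions import Fraction

# Example stop/go sequences per item type (four training trials, then one test trial).
ITEM_TYPES = {
    "stop-then-go": ["stop", "stop", "stop", "stop", "go"],
    "stop/go-then-go": ["go", "stop", "go", "stop", "go"],
    "stop/go-then-stop": ["go", "stop", "go", "stop", "stop"],
    "go-then-stop": ["go", "go", "go", "go", "stop"],
}

def stop_probability(item_types):
    """Proportion of stop trials across all listed sequences."""
    trials = [trial for sequence in item_types.values() for trial in sequence]
    return Fraction(trials.count("stop"), len(trials))

overall = stop_probability(ITEM_TYPES)
training = stop_probability({name: seq[:4] for name, seq in ITEM_TYPES.items()})
test = stop_probability({name: seq[4:] for name, seq in ITEM_TYPES.items()})
print(overall, training, test)  # prints: 1/2 1/2 1/2
```

Under these assumptions the stop probability is one half not only overall (10 of 20 trials) but also within the training presentations (8 of 16) and within the test presentations (2 of 4), so the item type, rather than the phase, carries the stimulus-stop contingency.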
The main result of this 'sequential dependencies' experiment is shown in Fig. 2 (see Appendix A for a full overview of the descriptive and inferential statistics). We found an interaction between stimulus type and group in the test phase (p = 0.015; Table A3). Planned comparisons showed that subjects in the experimental group were slower to respond to stop-then-go items, which were consistently associated with stopping in the training phase, than to stop/go-then-go items in the test phase (mean difference: 25 ms; t(20) = 3.11, p = 0.005, Cohen's dz = 0.68). This is consistent with the 'automatic inhibition' hypothesis. Importantly, there was no significant difference between these two items in the test phase of the control group (mean difference: −7.6 ms; t(20) = −0.763, p = 0.45, Cohen's dz = 0.16). This indicates that the slowing observed in the experimental group is not due to general (non-stimulus specific) sequential effects because the overall signal sequence is the same in the experimental and control groups. There were no other significant interactions between stimulus type and group (see Table A3). The difference in the experimental group is consistent with our previous findings: reaction times are longer for items that are associated with stopping.
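The reported effect sizes can be related to the reported t values with a short calculation. This sketch assumes that the planned comparisons were paired (within-subject) t-tests, so that the number of pairs is df + 1 = 21 and dz = t / sqrt(n); the function name is hypothetical, and the formula is the standard one for paired designs rather than something stated in the text.

```python
import math

def cohens_dz(t_value, df):
    """Cohen's dz for a paired t-test: t / sqrt(n), assuming n = df + 1 pairs."""
    return t_value / math.sqrt(df + 1)

dz_experimental = cohens_dz(3.11, 20)    # stop-then-go vs stop/go-then-go, experimental group
dz_control = cohens_dz(-0.763, 20)       # same comparison, control group
print(round(dz_experimental, 2), round(abs(dz_control), 2))
```

Under this assumption the experimental-group value reproduces the reported dz of 0.68. The control-group magnitude comes out at roughly 0.17, close to the reported 0.16; the small discrepancy plausibly reflects rounding in the reported statistics.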
To conclude, responding to items that are associated with stopping is slowed. We have attributed this slowing to the retrieval of stimulus-stop associations from memory, making response inhibition an automatic act of control. Our novel experiment reported in this section provides a replication of our earlier findings and demonstrates that the slowing is not due to non-stimulus specific sequential effects. In the next sections, we will explore the neural mechanisms of 'automatic inhibition' and discuss what is learned when people withhold their response.

The cognitive neuroscience of automatic inhibition
Only a few studies have examined the neural substrates of automatic inhibition. Lenartowicz et al. (2011) used neuroimaging to study the neural mechanisms underpinning automatic inhibition. On Day 1, subjects made gender judgments (male/female) about face stimuli. In some trials, a tone was presented, requiring participants to withhold their response. Unbeknown to the subjects, some faces always occurred on stop-signal trials. On Day 2, subjects performed the task in the scanner. The first two blocks were training blocks, to refresh the subjects' memory. Then the mapping was reversed and the old stop faces became go faces. Behaviourally, it was found that during training stop performance was better for the consistent stop faces than for faces that could occur on both go and stop trials. But in the test phase, go performance was impaired for these old stop faces. Thus, stimulus-stop associations had the effect of slowing down RTs on go trials and increasing accuracy on stop trials, consistent with our other findings. Neurally, it was found that the right inferior frontal gyrus (rIFG) was activated upon the presentation of the old stop faces in the test phase (Lenartowicz et al., 2011). The rIFG is generally considered to play a key role in top-down response inhibition (Aron, Robbins, & Poldrack, 2004; Chambers et al., 2009), although the precise role of the area is still debated. The results of Lenartowicz et al. suggest that, in addition to being activated in response to explicit stop signals, the rIFG, and possibly other regions in the fronto-striatal network, can be activated automatically on go trials. This is similar to the findings of van Gaal et al. (2010), who showed that both high- and low-visibility stop signals activated a similar cluster in the right inferior frontal cortex.
However, which control functions are influenced by learning or priming is less clear because the rIFG has been associated with a multitude of roles, including context monitoring, response selection and reversal learning. For example, rIFG could have been activated to deal with the higher response selection demands after reversal. Chiu, Aron, and Verbruggen (2012) used transcranial magnetic stimulation to probe the excitability of motor cortex after practicing stimulus-stop associations in a go/no-go task. This study showed that motor excitability was suppressed a mere 100 ms after the presentation of stimuli that were previously associated with no-go, but now required going. This seems consistent with the idea that stimuli can automatically activate the fast hyperdirect pathway, independent of the task instructions. Surprisingly, this reduction was not observed in a condition in which no-go items were always associated with no-go throughout training and test. This can indicate that the effects of stimulus-stop associations on the motor cortex are context-dependent (Chiu et al., 2012). However, it could also indicate that the decreased motor excitability for inconsistent items was driven (at least partly) by conflict between competing goal or response representations 4. Some have argued that a global inhibition mechanism is activated to suppress all motor responses when conflict between alternative actions is detected (Frank, 2006; Wiecki & Frank, 2013); this global mechanism would effectively allow the system to prevent premature responses and to select the appropriate response. The detection of conflict would trigger the global braking mechanism via the hyperdirect pathway, and this can explain the reduced motor excitability when the mapping is reversed.
In other words, the main difference between the 'automatic suppression' account and the 'conflict' account is the trigger of the braking or stopping mechanism: the stimulus itself or the conflict caused by the retrieved information, respectively. Initially we had doubts about the likelihood that conflict could be detected early enough to cause reduced motor evoked potentials (MEPs; Chiu et al., 2012); furthermore, MEPs were lower for old stop items on a go trial than for old go items on a no-go trial. However, the goal- or response-conflict account receives some support from the short-term after-effect literature. As discussed above, a go stimulus may become associated with a 'stop' representation on a stop trial; when it is repeated on the next go trial, the stop representation will be activated via associative retrieval. This will interfere with responding on a go trial. An event-related potential (ERP) study has demonstrated that stopping on the previous trial affected the stimulus-locked parietal P300, but only when the stimulus was repeated (Oldenburg et al., 2012). Response-locked motor components were not influenced in this study. This suggests that stimulus-specific response slowing after a stop trial is not caused directly by 'automatic' suppression of motor output, but by interference between a stop and a go goal.

Learning associations between stimuli and signals
In a recent study, we demonstrated that signal detection processes are an important part of both reactive and proactive response inhibition. Computational work even suggests that most of SSRT is occupied by afferent or detection processes (Salinas & Stanford, 2013). Based on this literature, it seems plausible that subjects may learn about the stop signal and build up links between the stimulus and the stop signal (or the no-go signal in experiments in which the go and no-go signals are superimposed on the stimuli). When no-go/stop items (i.e. items such as images or words associated with no-go or stopping) are presented, they can activate a representation of the no-go/stop signal, which activates the stopping network.
Research on learning and Pavlovian conditioning provides some support for the stimulus-signal learning idea. In Pavlovian conditioning, a conditioned stimulus (e.g. a bell) and an unconditioned stimulus (e.g. the delivery of food) usually occur one after the other. After practice the conditioned stimulus can activate the conditioned response (e.g. salivation) via this CS-US link, or directly via a CS-R link (Hall, 2002). Stimulus-signal learning is also exactly what the associative APECS model would predict (McLaren, Forrest, & McLaren, 2012; McLaren, 1993, 2011). APECS is a two-layer backpropagation connectionist network that can learn associations; it does this by selecting 'mediating units' to carry mappings between input and output (for an application of APECS to task switching, see Forrest, Monsell, and McLaren (2014)). These units have dynamically parameterised biases (in effect these play the role of thresholds) that control how easily a mediating unit can be activated by input as a result of their associative history. Thus, the system serves as a substrate for an associative memory that can vary the accessibility of what is learned in an adaptive manner. This allows it to produce priming effects as well as direct influences on behaviour. Recent versions of the model can act as an autoassociator, allowing stimuli that co-occur to associate with one another via a mediating representation. In the context of stop-signal experiments, the idea is that the go stimulus and stop signal representations jointly activate the mediating unit which then activates the stop system.
The possibility that subjects can learn various associations may explain discrepancies between studies. To further study the neural mechanisms of automatic inhibition, we recently conducted a neuroimaging study using the paradigm of the 'sequential dependencies' experiment reported above (Fig. 1). A detailed description of the procedure and the behavioural results of this neuroimaging experiment can be found in Appendix B. Subjects performed the semantic categorisation task on two consecutive days in the scanner. As in the 'sequential dependencies' experiment, each word was presented five times within the block; the first four presentations were 'training' or acquisition trials, the fifth and final presentation was the 'test' trial. The four stimulus types ('stop-then-go', 'stop/go-then-go', 'stop/go-then-stop', and 'go-then-stop') occurred with equal probability, and new words were used in each block. There was no control group in this experiment. Due to a few technical issues and the unanticipated behavioural pattern, we will focus on the behavioural effects only.

Fig. 3. Mean RTs and p(respond|signal) as a function of stimulus type and stimulus presentation (1-5; X-axis). Error bars indicate 95% confidence intervals.
The main results of this experiment are shown in Fig. 3 (see Appendix B for a full overview of the descriptive and inferential statistics). Contrary to our expectations, responding was not slower for the stop-then-go words (657 ms) than for the stop/go-then-go words (660 ms) in the test phase, p = 0.634 (Table B3). This result is inconsistent with the findings of the 'sequential dependencies' experiment reported in Section 4.2, and suggests that subjects do not learn stimulus-stop associations. However, the analysis of the p(respond|signal) data indicates that learning did influence performance in the training phase. As expected, there was no difference in the p(respond|signal) between the words on the first presentation (Fig. 3; Table A2), but a difference between the stop-then-go (0.28) words and the stop/go-then-n (0.31) words emerged throughout training (i.e. stimulus presentations 2-4). The difference between the stop-then-go and stop/go-then-n words was reliable (p = 0.042; Table B3). In other words, we found evidence of learning during training in the p(respond|signal) but no effect in RTs upon the reversal of the consistent stimulus-stop mappings. We propose that this pattern of results indicates that subjects learned stimulus-signal associations rather than stimulus-stop associations. Such associations between the stop items (i.e. the stop-then-go words) and the stop signal (i.e. the line turning bold) will prime the representation of the stop-signal detection rather than the stop goal or stop response. This can explain why learning influences the probability of stopping in training without influencing responding on go trials in test. We are currently exploring why some paradigms or experiments lead to stimulus-signal learning, stimulus-stop learning, or both.
In sum, our theoretical analysis and the findings of the experiment reported in this section suggest that, at least in some paradigms, stimuli can become associated with stop signals or a 'mediating unit' that carries the mapping between the stimulus (cue) and stop signal. The stopping network could then become activated via this S-S or S-unit link. The effect of associative learning on perception and associative learning could also help us to provide a more detailed explanation of some other findings. For example, Manuel et al. documented that following performance of an auditory go/no-go task, the topography of auditory-evoked potentials was modulated within ~80 ms of presenting no-go stimuli previously consistently associated with stopping (Manuel, Grivel, Bernasconi, Murray, & Spierer, 2010). They attributed this to the development of 'automatic inhibition' (broadly defined). Our analysis suggests that this effect can be due to associative processes influencing detection of the no-go stimulus (rather than direct activation of an inhibition or no-go network).

Expectancy and awareness of the stop associations
In the studies of van Gaal et al. (van Gaal et al., 2008), subjects were reportedly unaware of the presentation of the low-visibility primes (but see Footnote 1). In most go/no-go experiments reported above, the stimulus-no-go rules were explicit. By contrast, the rules were implicit in the stop-signal studies. This raises the question whether subjects were also unaware of the stimulus-stop associations in these studies. Whether or not subjects were aware of them could have theoretical implications. In the associative-learning literature, there is an ongoing debate as to whether learning associations between a stimulus and an action is rule-based or based on the formation of specific stimulus-response associations (Mitchell, De Houwer, & Lovibond, 2009). Furthermore, awareness of the stimulus-stop association can indicate that the response slowing observed for old stop items is due to proactive control, rather than 'automatic inhibition'. When a cue [e.g. 'p(stop-signal) = 0.75'] indicates that a stop signal is likely to occur on the following trial(s), subjects proactively slow their responses (e.g. Verbruggen & Logan, 2009c).
Stimuli associated with stopping could act as such cues (e.g. 'if stimulus X then p(stop) is high'), and subjects would adjust their response strategies accordingly. The studies of Chiu et al. and Manuel et al. (see above) indicate that strategies would have to be adjusted very quickly (within ~100 ms), making it at least a very efficient, acquired, form of top-down control. However, the 'automatic inhibition as a form of proactive control' account fails to explain why associatively mediated effects are still observed when the go/no-go rules are explicit and subjects are informed about the rule change (e.g. Verbruggen & Logan, 2008b, Experiments 1-4). Furthermore, in a recent study (Verbruggen, Best, Stevens, & McLaren, 2014), we used a modified version of a paradigm that was specifically developed to examine effects of proactive control in the stop-signal task. We found that subjects learned associations between specific items and stopping in this paradigm, but that this did not interact with measures of proactive control. These results are inconsistent with the 'automatic inhibition as a form of proactive control' account. Instead, they suggest that stopping can indeed become an automatic act of control, triggered by the retrieval of associations from memory.
It is also important to stress that awareness of a stimulus-stop association does not necessarily indicate that inhibition is not automatic. For example, Tzelgov argued that the defining feature of non-automatic processing is 'monitoring' and not awareness. In this context, monitoring refers to 'the intentional setting of the goal of behaviour and to intentional evaluation of the outcome of the process' (Tzelgov, 1997, p. 444). He argues that all psychological processes return some symbolic representation in humans, so the organism will be 'conscious' or 'aware' of most behaviour, including automatic processing. Thus, awareness should not be used to distinguish between automatic and non-automatic processing. We have also recently hypothesised that similar learning mechanisms may underlie rule-based behaviour and stimulus-response link-based behaviour. The main difference between the two is the kind of representation that is linked with the stimulus: an abstract, rule-like representation (X-'if x then left'), or more concrete stimulus-response associations (X-left). In other words, even if subjects are aware of the contingencies, this does not necessarily imply that an entirely different form of learning has taken place compared with situations in which subjects were not aware of the contingencies.
In sum, we think it is unlikely that proactive adjustments triggered by the presentation of an old stop item can account for all findings reported above. However, more systematic research is needed to determine whether subjects learn specific stimulus-(stop)response associations or more abstract rules that can be monitored.

Conditioned inhibition vs. conditioned inhibitory control
We claim that response inhibition can become associatively mediated. Stimuli or items that are reliably paired with stopping can prime and potentiate that act of control, and may even be able to instigate it in their own right. In learning-theory terms, one could say that response inhibition or inhibitory control becomes 'conditioned'. But there is another type of conditioned inhibition, which has been studied in animals. A viable recipe for producing 'conditioned inhibition' in animals is to use a design such as A+ AB−, which simply denotes trials where A (e.g. a light) and an unconditioned stimulus (US; e.g. the delivery of food or a shock) are paired, interspersed with trials where A and B (e.g. a tone) occur in compound but without the US. The result is that B acquires the properties of being hard to condition to that US (i.e. it passes the retardation test for a conditioned inhibitor), and of suppressing excitatory responding when presented in compound with A or with another excitatory CS that has been conditioned with the same US (i.e. it passes the summation test for conditioned inhibition). Our recent theoretical analysis suggests that 'conditioned inhibition' and 'conditioned inhibitory control' show some important similarities.
Our review of the animal-learning literature shows that two types of associations can be learned during conditioned inhibition (see also e.g. Hall, 2002; Dickinson & Balleine, 2002). First, animals can learn a specific inhibitory association between the conditioned stimulus (CS) and the US, which suppresses the US representation (Konorski, 1948). The basic idea here is that an inhibitory association is simply a negative excitatory one. This type of associative structure emerges naturally from the Rescorla-Wagner view of conditioning (Rescorla & Wagner, 1972), and from the idea that inhibition is the consequence of a disconfirmed expectation of an outcome. In essence, the contingencies involved in the A+ AB− training lead to the development of the excitatory connection from the representation of A (e.g. light) to the US representation (e.g. food), and the inhibitory connection from the representation of B (e.g. tone) to that same US representation. Thus, excitation is simply the converse of inhibition and vice versa (see Footnote 5). Second, animals can learn an excitatory link from the 'B' representation to a 'No-US' centre or representation that then inhibits the US representation (e.g. Dickinson & Dearing, 1979; Konorski, 1967; Le Pelley, 2004; Pearce & Hall, 1980). The key difference between this structure and the earlier one is the use of this 'No-US' representation, which is susceptible to at least two different interpretations. In one (initially favoured by Konorski) the representation is US-specific, and so, in the case where A is trained with food pellets, the No-US representation would be 'No food pellets', but in the case where A is trained with sucrose, the No-US representation would be 'No sucrose'. The other approach is that all conditioning is either appetitive or aversive, and that there are "centres" corresponding to this that mutually inhibit one another (Dickinson & Balleine, 2002; Dickinson & Dearing, 1979; Konorski, 1967).
Support for this idea comes from transreinforcer blocking and counterconditioning. For example, Dickinson and Dearing (1979) showed that training B to be an inhibitor for a food US (i.e. the presentation of B predicted the absence of food) enabled it to successfully block learning involving a shock US (for other related findings, see Dickinson and Balleine (2002)). In another study, Dickinson and Lovibond (1982) (as cited in Dickinson and Balleine (2002)) demonstrated that a conditioned appetitive jaw movement could be suppressed by an aversive defensive eye-blink in rabbits; because rabbits can usually blink and swallow at the same time, this interference was attributed to an inhibitory interaction between an appetitive centre and an aversive centre. These centres can function as the US and No-US centres, with the aversive acting as the No-US centre for appetitive learning and vice versa. Thus, there is evidence for (i) a specific form of inhibition that is equivalent to (though it may not be instantiated as) a direct inhibitory link to the stimulus representation (be it CS or US) in question, and (ii) a more general form of inhibition mediated via excitatory connections to appetitive/aversive centres that mutually inhibit one another.
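The first, specific form of inhibition can be illustrated with a minimal simulation of A+ AB− training under the Rescorla-Wagner rule. This is only a sketch under stated assumptions: every cue present on a trial is updated by ΔV = α(λ − ΣV), the salience and US learning-rate parameters are collapsed into a single α, and λ is 1 on reinforced trials and 0 otherwise. The function name and parameter values are hypothetical and are not taken from any of the studies reviewed here.

```python
def rescorla_wagner(trial_types, alpha=0.2, lam=1.0, n_epochs=200):
    """Return associative strengths after repeated training.

    trial_types: list of (cues, reinforced) pairs, e.g. (("A", "B"), False).
    alpha, lam, n_epochs: illustrative values (assumptions, not fitted).
    """
    strength = {}
    for _ in range(n_epochs):
        for cues, reinforced in trial_types:
            # Summed prediction of the US from all cues present on this trial.
            prediction = sum(strength.get(cue, 0.0) for cue in cues)
            # Prediction error: obtained outcome minus expected outcome.
            error = (lam if reinforced else 0.0) - prediction
            # Every cue that was present shares the same error term.
            for cue in cues:
                strength[cue] = strength.get(cue, 0.0) + alpha * error
    return strength


# A+ trials interspersed with AB- trials (the design described in the text).
strengths = rescorla_wagner([(("A",), True), (("A", "B"), False)])
print(strengths)
```

Under these assumptions A approaches a strength of +1 and B approaches −1, so that B cancels the prediction generated by A when the two are presented in compound. This is the sense in which the inhibitory association is 'simply a negative excitatory one'.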
The 'CS-no-US' link in conditioned inhibition paradigms could be the Pavlovian equivalent of a link between the stimulus and a 'do not respond' or 'no response' representation in negative priming and response-inhibition paradigms, respectively. Indeed, some studies indicate that conditioned inhibitory control (as studied in go/no-go and stop-signal paradigms) can also have general effects on behaviour. For example, several studies have found that consistent pairing of food-related pictures to stopping in a go/no-go or stop-signal paradigm reduced subsequent food consumption (Houben, 2011; Houben & Jansen, 2011; Veling, Aarts, & Papies, 2011; Veling, Aarts, & Stroebe, 2013). Furthermore, a similar procedure with alcohol-related stimuli reduced alcohol intake in the laboratory (Jones & Field, 2013) and even self-reported weekly alcohol intake of heavy drinking students (Houben, Havermans, Nederkoorn, & Jansen, 2012; but see Jones and Field (2013)). These effects have been linked to devaluation of the stop or no-go stimuli as several studies have demonstrated that stopping responses to stimuli can lead to devaluation of these stimuli (Houben et al., 2012; Kiss, Raymond, Westoby, Nobre, & Eimer, 2008; Veling, Holland, & van Knippenberg, 2008). Ferrey, Frischen, and Fenske (2012) showed that stop associations not only impact on the hedonic value of the stimuli associated with stopping but also on the behavioural incentive of them. They paired sexually attractive images with either going or stopping in a training phase, and then asked subjects to rate the attractiveness of the images. They found that the no-go (stop) images were rated less positively than the go images. In a second study, Ferrey et al. showed that subjects were less willing to work to see the erotic images that were paired with stopping.
Thus, conditioned inhibitory control may impact on the motivational value of stimuli, perhaps via creating links between the stimuli and the appetitive/aversive centres postulated by Dickinson and Dearing (1979).
To conclude, research on Pavlovian conditioning in animals (and humans; see e.g. McLaren, Kaye, and Mackintosh (1989)) suggests that a stimulus can become a conditioned inhibitor which suppresses activation of another (stimulus or response) representation directly or via a link with a 'no-US' or aversive centre that suppresses appetitive behaviour. Note that these two mechanisms are not mutually exclusive. Indeed, some have argued that dual-association formation is the norm (Hall, 2002). The research focusing on 'far transfer' effects of response inhibition suggests that similar mechanisms may operate when subjects learn to stop in 'instrumental' go/no-go or stop-signal experiments.

What is learned?
We have reviewed a series of studies and novel empirical findings that demonstrate that learning can play an important role in response inhibition. This could lead to 'automaticity' in stopping, as some experiments have demonstrated that responding can be suppressed even in the absence of an explicit instruction, rule, or no-go/stop signal. What is learned is perhaps less obvious: multiple factors seem to interact with each other (but again, this is not too different from Pavlovian conditioning; see Hall (2002)). Here we expand on a previously proposed general framework to provide some characterisation of the multiple pathways that might allow learning of a stimulus-stop relationship. Fig. 4 shows the architecture of the system that we have in mind. The top portion of the figure, characterised by the presence of mediating units, is an associative system running the APECS algorithm (McLaren, 1993, 2011; McLaren et al., 2012) that takes stimulus input (whatever is presented) and learns about it (as discussed in Section 4.4). The black arrows denote possible associations that can be acquired via the mediating units. To prevent the figure from becoming unduly cluttered, we have not shown either the auto-associative links from the mediating units back to the cue or stimulus units, or the possible Pavlovian associations from cues or stimuli to the appetitive or aversive systems.

Footnote 5: The fact that there is little evidence for relatively long-distance inhibitory connections at the neural level is not an immediate argument invalidating this architecture, as we can imagine the inhibitory connection being made up of a long-distance excitatory connection to an inhibitory neurone that operates at a local level.
The middle portion of the figure (the red and green units) represents the system that implements stopping and going as instrumental actions, and is adapted from the interactive race model (Boucher et al., 2007). The stop (in red) and go (in green) 'response units' have hard-wired mutually inhibitory connections, so that when one is activated it suppresses the other (although there may be an asymmetry, such that the stop unit inhibits the go unit more than the other way around; see Boucher et al. (2007)). This inhibitory connection can explain why both going and stopping are slowed when the stop and go units are activated simultaneously (note that detection of conflict could trigger a general braking mechanism, slowing go RTs further; see, e.g. Frank, 2006). We also have an excitatory connection from the go signal to the go unit (i.e. the green arrow), to denote that this influence is brought about by means of task instructions and is not acquired associatively. The stop system has a similar arrangement that implements the stop signal as a means of stopping. Finally, cues or stimuli could also become associated directly with the go and stop units via instructions (i.e. the second pair of red and green arrows). For example, in a stop-signal task, the primary-task stimulus would be represented by the cue/stimulus unit, and have a direct instructionally-mediated link with the 'go response' unit.
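The mutually inhibitory go and stop units described above can be sketched as two interacting accumulators. The following is a deliberately simplified, noise-free illustration and not the published interactive race model: the update rule, the function name, and all parameter values are assumptions chosen for readability. The only properties carried over from the text are that the two units inhibit each other and that the stop unit inhibits the go unit more strongly than the reverse.

```python
def interactive_race(ssd=None, go_drift=0.010, stop_drift=0.010,
                     stop_inhibits_go=0.05, go_inhibits_stop=0.001,
                     threshold=1.0, max_time=1000):
    """Return the go RT in time steps, or None if the response is stopped.

    ssd: stop-signal delay in time steps (None means a no-signal go trial).
    All numeric defaults are hypothetical illustration values.
    """
    go = stop = 0.0
    for t in range(1, max_time + 1):
        # The stop unit only receives input once the stop signal has appeared.
        stop_input = stop_drift if (ssd is not None and t > ssd) else 0.0
        # Asymmetric mutual inhibition: stop suppresses go far more than vice versa.
        d_go = go_drift - stop_inhibits_go * stop
        d_stop = stop_input - go_inhibits_stop * go
        # Activations cannot drop below zero.
        go = max(0.0, go + d_go)
        stop = max(0.0, stop + d_stop)
        if go >= threshold:
            return t  # go unit reached threshold: a response is executed
    return None  # go unit never reached threshold: the response was inhibited


rt_go = interactive_race()          # no stop signal
rt_early = interactive_race(ssd=20)  # early stop signal
rt_late = interactive_race(ssd=95)   # late stop signal
print(rt_go, rt_early, rt_late)
```

With these illustrative settings an early stop signal prevents the go unit from reaching threshold, whereas a very late stop signal arrives too late to cancel the response and can only delay it slightly. An associatively acquired link from a stimulus to the stop unit could be added to such a sketch as extra stop input on go trials, which would slow responding in the same way.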
The lower portion of the system represents the Pavlovian aversive and appetitive centres described by Dickinson and Balleine (2002) (see Section 4.6). We see the stop and go systems as the instrumental equivalents of the Pavlovian aversive and appetitive systems, and we show this by means of hard-wired reciprocal excitatory connections between the stop system and the aversive system, and between the go system and the appetitive system. Thus, stop and go as actions have intrinsic motivational qualities by virtue of these connections, and "nice" stimuli will tend to activate approach (i.e. go) and "nasty" stimuli withdrawal (via activation of the stop system). Finally, we posit hard-wired mutually inhibitory links between appetitive and aversive systems as well (see Section 4.6).
It is immediately apparent from this framework that there are multiple pathways that will allow a stimulus consistently paired with stopping to acquire some associatively-mediated influence over stopping. A cue or primary-task stimulus in the stop-signal task that precedes a stop signal may become associated with the representation of the stop signal, which will then have the effect of priming the detection of that stop signal and/or the stop system via the instructionally-mediated pathway between the stop signal and the stop unit (shown in red). The stimulus will also form a more direct link to the stop system itself via the appropriate mediating unit, and in this way may come to act, to some extent, as a stop signal in its own right. Which associations predominate after training will depend on the particular contingencies and schedule in play. Hence, it comes as no surprise that we have been able to find examples of associatively-mediated stopping that affect performance both as measured by slowing of RTs to go stimuli and a reduction in errors to stop signals (e.g. Verbruggen & Logan, 2008b, Experiment 5), and sometimes only seem to affect performance on stop trials (see Section 4.4). In the latter case, we hypothesise that the associative pathway involved is the pathway that predicts and primes the stop signal; consequently, it requires the presentation of that signal to become effective.
Another implication of this framework is that if a stimulus becomes associated with stopping, then this may devalue that stimulus via the stop system's interaction with the aversive system. This could explain the recent findings of studies that focused on inhibitory control training and consumption of food and alcohol (see above). We might also expect stop training to lead to persistent activation of the aversive system (perhaps controlled by contextual cues) leading to a general devaluation of outcomes experienced in that environment. Our work on the training of inhibitory control involving gambling is consistent with this position. We have shown that stopping motor responses can reduce gambling (Verbruggen, Adams, & Chambers, 2012). A recent series of experiments suggests that this carry-over effect is not due to a change in cognitive processing style or increased control. Instead, we hypothesise that stopping generally reduced approach motivation via the link with the aversive system. The stop/go-aversive/appetitive component of our framework is based on the work of Dickinson et al. and draws heavily on the comparative literature. Thus, one corollary of our proposal is that automatic inhibition or conditioned inhibitory control may tap into the same mechanisms that are present in infra-humans. Admittedly, this is still speculative and we are starting a research programme to explore the similarities and dissimilarities across species. But for now we can say that it does offer a parsimonious account that can explain conditioned inhibition in a variety of species (including humans), why pairing a signal with stopping leads to slowing/reduced approach, and why it can result in devaluation.

Implications for intentional inhibition
In this article we have focused on response inhibition triggered (initially) by the presentation of a no-go stimulus or stop signal. Recent studies have explored to what extent people can also 'intentionally inhibit' actions when there is no obvious external stimulus. Intentional inhibition is an important component of the 'what, when, whether' model of intentional action proposed by Brass and Haggard (2008). In this model, the 'what' component reflects the decision of what action to execute; the 'when' component reflects the decision of when to execute the action; and the 'whether' component reflects the decision to execute the intentional action or not (Brass & Haggard, 2008, p. 320).

Fig. 4. A schematic overview of the architecture of the associative stop system. We combine elements of APECS (top section), the interactive race model (middle section), and the Konorskian model of motivational systems (bottom section). See the main text for further details. Excitatory and inhibitory connections are represented, respectively, by arrows and filled circles. The red and green connections represent connections established via instructions; the black connections represent connections of the associative system. In APECS, the mediating units (dashed ovals) act as a 'glue' to link the stimulus and response representations. There are also reciprocal connections from these mediating units back to the inputting units that are not shown in this figure. It is also quite possible for the cue, go signal, and stop signal to become associated directly with the appetitive and aversive system (i.e. Pavlovian conditioning). In this overview, we focus mostly on the instrumental components of our framework, and so, for clarity, we have not drawn these connections. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)
If an action is prepared, deciding not to execute it at the very last minute may require intentional inhibition according to Brass and Haggard. The work reviewed in Section 3 shows that go and stop actions can be primed by information in the environment. Previously we have proposed that such priming effects can be explained in terms of an accumulator model in which evidence for an action accumulates until it reaches a threshold (Verbruggen & Logan, 2009a). Both low- and high-visibility primes may influence the accumulation rates, resulting in an increased probability of selecting the primed action and shorter decision times when the primed information is congruent with the selected action, but increased decision times when the information is incongruent. The accumulation rate is constrained by higher-order goals and task settings, which can explain why the priming effects depend on task context (see above). We speculate that priming can influence the decision components of the 'what, when, whether' model in a similar way. Schurger, Sitt, and Dehaene (2012) have demonstrated that 'when' decisions can be modelled as an accumulation of spontaneous fluctuations of neural activity, and that an action is executed when this activity reaches a certain response threshold. It seems likely that the 'whether' component or intentional inhibition can also be described as the accumulation of information. Based on previous work, one could therefore hypothesise that information in the environment could influence the 'whether' decision by altering the fluctuations of neural activity towards the 'whether'-decision thresholds. Of course, one of the defining characteristics of intentional inhibition is that there is no obvious external stimulus or specific cue triggering the stopping system. But decisions about whether or not to act are never made in a 'vacuum'. Therefore, it seems likely that multiple sources of information may influence our decisions.
The priming work of van Gaal et al. (see above) even suggests that subjects do not have to be aware of such influences. Thus, we suggest that even 'intentional inhibition' can be primed by cues in the internal and external environment. Even a small (unconscious) 'push' towards one of the possible action options may be sufficient to alter behaviour.
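The accumulator account of priming outlined above can be made concrete with a small deterministic sketch. It is an illustration under assumptions rather than the model of Verbruggen and Logan (2009a): evidence for each action is taken to grow linearly, a prime simply adds a fixed amount to the rate of the primed action, and competing actions reduce each other's net rate. The function name, the competition term, and all numerical values are hypothetical.

```python
import math


def decide(base_rates, prime=None, prime_boost=0.2, competition=0.5,
           threshold=1.0):
    """Return (selected action, decision time) for linear accumulators.

    base_rates: dict mapping each action to its stimulus-driven rate.
    prime: optional action whose accumulation rate is boosted by a prime.
    prime_boost, competition, threshold: illustrative assumptions.
    """
    rates = dict(base_rates)
    if prime is not None:
        rates[prime] += prime_boost
    times = {}
    for action, rate in rates.items():
        # Net rate: own evidence minus a fraction of the evidence for rivals.
        rival = sum(r for a, r in rates.items() if a != action)
        net = rate - competition * rival
        # An accumulator with a non-positive net rate never reaches threshold.
        times[action] = threshold / net if net > 0 else math.inf
    winner = min(times, key=times.get)
    return winner, times[winner]


# Stimulus-driven choice: the stimulus mainly supports 'go'.
stimulus = {"go": 1.0, "stop": 0.2}
_, t_neutral = decide(stimulus)
_, t_congruent = decide(stimulus, prime="go")
_, t_incongruent = decide(stimulus, prime="stop")

# 'Whether' decision without a decisive stimulus: a small push is enough.
free_choice, _ = decide({"act": 0.5, "withhold": 0.5}, prime="withhold")
print(t_congruent, t_neutral, t_incongruent, free_choice)
```

In this sketch a congruent prime shortens the decision time and an incongruent prime lengthens it, and when the options are otherwise balanced a small push from a prime determines which option is selected.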
In a similar vein, associative processes are likely to influence the decision of whether or not to execute an action. We have argued that subjects will learn associations between stopping and stimuli, task contexts, or social contexts. This may influence seemingly 'intentional' behaviour. For example, the inhibition of taboo words in certain social contexts could be a form of automatic inhibition, triggered by associations between stopping and the taboo word (Severens, Kühn, Hartsuiker, & Brass, 2011). Thus, when somebody is asked to repeat a highly insulting word in front of an audience, they might refuse. To the person, this may seem an intentional decision, but it might be 'automatically' triggered by the retrieval of stimulus-'no' associations from long-term memory. More generally, throughout development, people have been instructed not to do certain things many times; eventually, they will learn some of this, and this could influence decision-making in a direct, automatic fashion. Sequential dependencies, which can influence the 'whether' decision (Schel et al., 2014), may also be (partly) associatively mediated. After executing an action a couple of times, subjects may be more (or less) likely to execute the action again. Such effects may be driven by expectancy or by other deliberate processes. However, associative learning also seems to play a role. Work by ourselves and others has demonstrated that such 'sequential' effects are at least partly due to associatively mediated processes (Livesey & Costa, 2014; McAndrew, Yeates, Verbruggen, & McLaren, 2013; Perruchet, Cleeremans, & Destrebecqz, 2006). These processes can be dissociated from conscious expectancy. For example, in a choice reaction time task, after a series of B's, subjects indicate that they expect A (the gambler's fallacy), but they respond faster when another B is presented compared to when an (expected) A is presented.
Such findings are consistent with a dual-processing account, which suggests that behaviour is determined by the interaction between a conscious system and an associative system. Above, we discussed Tzelgov's idea that the difference between non-automatic behaviour and automatic behaviour is 'intentionality', which refers to setting the task goal and monitoring the task outcome. Our proposal that priming and associative learning interact with making the decision to execute an action or not may seem at odds with this idea. To clarify our position, we have to go back to the Instance Theory, and the idea that action selection can be construed as a race between an algorithmic process and a memory-retrieval process; the process that finishes first determines which action is selected (see above). When the memory-retrieval process wins the race, the decision is said to be automatic (Logan, 1988), whereas decisions based on algorithmic processing are deliberate or intentional. The accumulation process described by Schurger et al. (see above) could be an example of an algorithmic process, and we have already discussed how priming could influence it. Thus, first, even though the process itself may be 'intentional', non-intentional factors can still influence it. Second, in some situations, people may 'intentionally' try to set a goal or make a decision, but the memory-retrieval process can win the race with the algorithmic process. According to our definition, this would make this decision 'automatic'. However, the person who makes the decision would probably still consider their decision to be 'intentional' as they will only have access to the outcome and not to how this outcome was achieved.
To conclude, the fact that no external stop signal is presented does not necessarily imply that priming and learning will not influence performance, or that the 'whether' decision is less susceptible to bottom-up priming effects. We have recently argued that the dichotomy between automaticity and executive control (as measured in tasks with external control signals) is a false dichotomy that should be abandoned. Instead, we have proposed that learning associations between stimuli and actions, and learning which actions lead to reward, are an integral part of executive control. Similarly, to explain seemingly 'intentional' behaviour, we should not only look at what is happening in the brain at a specific moment, we should also look at what is present in the current environment and at what happened in the past.

Conclusion
The work on priming suggests that response inhibition can be a prepared reflex, readily triggered by information in the environment. Furthermore, the work on associative learning indicates that response inhibition can become a learned reflex: it may initially depend on top-down biasing and rely on instruction-based pathways, but it may gradually become automatised, with the need for top-down bias disappearing altogether. This should not be taken to imply that inhibition does not serve an important role in action control. However, our review does indicate that inhibition can be achieved in various ways. Therefore, in order to understand how people control their impulses and urges, we should explore these different possibilities.
Appendix A. Is the response slowing for old stop items due to top-down goal-shifts or sequential learning?

Subjects
Forty-six subjects participated for monetary compensation (£6). Two pairs of subjects (i.e. two subjects in the experimental group and their yoked controls) were excluded because the percentage of missed go trials was > 0.40 for at least one of the subjects of the pair. Thus, there were 21 subjects in each group.

Apparatus and stimuli
Stimuli were presented on a 21-inch CRT monitor. The task was run using Psychtoolbox (Version 3; Brainard, 1997). Twenty-eight lists of 8 words (4 living items, 4 non-living items) were selected. Fourteen lists were used in the experimental group, and the other lists were used in the control group. The lists were matched for word frequency (SUBTLEXWF; Brysbaert & New, 2009) and word length (average word frequency: 2.2; average word length: 5.2).
All words were presented in a white lower-case font against a black background. Subjects had to make living (natural)/non-living (man-made) judgments about the referents of words. They responded by pressing the 'j' and 'k' keys of a QWERTY keyboard with the index and middle fingers of the right hand, respectively. The category-response mapping was counterbalanced. The words appeared above a white fixation line that remained on the screen during the whole trial. On stop trials, the line turned bold after a variable stop-signal delay (SSD).

Procedure
In the experimental group, the first four stimulus presentations were 'training' (or acquisition) trials; the fifth (final) word presentation was the 'test' trial. Stimulus presentation was pseudorandomised. First, stop-then-go stimuli (one living item and one non-living item) were presented on stop trials in the training phase, and on go trials in the test phase. Second, stop/go-then-go stimuli (one living item and one non-living item) were presented on both stop trials (50%) and go trials in the training phase, and on go trials in the test phase. Third, stop/go-then-stop stimuli (one living item and one non-living item) were presented on both stop trials (50%) and go trials in the training phase, and on stop trials in the test phase. Fourth, go-then-stop stimuli (one living item and one non-living item) were presented on go trials in the training phase, and on stop trials in the test phase. Each stimulus type occurred with equal probability, so the overall stop-signal probability was .50. Subjects were neither informed about the different stimulus types nor about the different phases of the experiment.
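The arithmetic behind the overall stop-signal probability can be checked with a short sketch. This is only an illustration in Python, not the script used to run the experiment; the dictionary layout and the function names are our own assumptions, and the 50% entries are treated as probabilities.

```python
# Illustrative sketch only: names and data layout are assumptions, not the
# original experiment code. Values are stop-signal probabilities per phase.
DESIGN = {
    "stop-then-go":      {"training": 1.0, "test": 0.0},
    "stop/go-then-go":   {"training": 0.5, "test": 0.0},
    "stop/go-then-stop": {"training": 0.5, "test": 1.0},
    "go-then-stop":      {"training": 0.0, "test": 1.0},
}

N_TRAINING = 4  # first four presentations of each word
N_TEST = 1      # fifth (final) presentation


def overall_stop_probability(design, phase):
    """Mean stop-signal probability across equally likely stimulus types."""
    return sum(item[phase] for item in design.values()) / len(design)


def expected_stop_trials(design, item_type):
    """Expected number of stop trials for one word across its five presentations."""
    p = design[item_type]
    return N_TRAINING * p["training"] + N_TEST * p["test"]


if __name__ == "__main__":
    for phase in ("training", "test"):
        print(phase, overall_stop_probability(DESIGN, phase))
    for item_type in DESIGN:
        print(item_type, expected_stop_trials(DESIGN, item_type))
```

Under these assumptions the stop-signal probability is the same in the training and test phases, so the phase change is not signalled by a change in the overall signal rate.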
Trial presentation was also pseudo-randomised in the control group. Each subject in the control group was 'yoked' to a subject in the experimental group to determine signal presentation. For example, when a stop signal was presented on the first trial of the first block for Subject 1 in the experimental group, then a stop signal was presented on the first trial of the first block for Subject 1 in the control group. In the control group, items were not consistently associated with stopping in the training phase. We achieved this by randomly swapping two 'stop-then-go' items with two 'go-then-stop' items in the training phase.
All trials started with the presentation of the fixation line (see Fig. 1 in the main text; FIX interval). After 1000 ms, the stimulus appeared above the line. The stimulus was removed after 1500 ms (MAXRT, Fig. 1 main text), regardless of RT. After the stimulus was removed, the next trial started immediately. On stop trials, the fixation line turned bold after a variable stop-signal delay (SSD), instructing subjects to withhold their response. Our previous work indicates that subjects are less likely to learn stimulus-stop associations when stopping is unsuccessful (see Verbruggen & Logan, 2008b). Therefore, SSD was adjusted with a two-up/one-down staircase procedure (Verbruggen & Logan, 2009b) based on the subject's performance on stop/go-then-n items to ensure that they were able to stop their responses to those items approximately 70% of the time. The SSDs for the stop-then-go and go-then-stop items were yoked to the SSD values for stop/go-then-n items. Subjects were told not to wait for the stop signal and that it would be easy to stop on the majority of the trials but difficult or impossible to stop on a minority of the trials.
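The tracking rule can be sketched as follows. We assume here that 'two-up/one-down' means that SSD increases (making stopping harder) after two consecutive successful stops on the tracked items and decreases after each failed stop; the starting SSD, the 50 ms step, and the bounds are hypothetical values that are not given in the text. Under this rule the staircase settles where p(stop)^2 = 0.5, that is, p(stop) of about 0.71, which matches a stopping rate of roughly 70%.

```python
from dataclasses import dataclass


@dataclass
class Staircase:
    """Two-up/one-down SSD tracking (all default values are hypothetical)."""
    ssd: int = 250           # assumed starting SSD (ms)
    step: int = 50           # assumed step size (ms)
    min_ssd: int = 50        # assumed lower bound (ms)
    max_ssd: int = 1250      # assumed upper bound (ms)
    successes_in_a_row: int = 0

    def update(self, stopped: bool) -> int:
        """Update SSD after a stop trial on a tracked item and return the new SSD."""
        if stopped:
            self.successes_in_a_row += 1
            if self.successes_in_a_row == 2:
                # Two successful stops in a row: lengthen SSD (harder to stop).
                self.ssd = min(self.ssd + self.step, self.max_ssd)
                self.successes_in_a_row = 0
        else:
            # One failed stop: shorten SSD (easier to stop).
            self.ssd = max(self.ssd - self.step, self.min_ssd)
            self.successes_in_a_row = 0
        return self.ssd


if __name__ == "__main__":
    staircase = Staircase()
    outcomes = [True, True, False, True, False, True, True]
    print([staircase.update(outcome) for outcome in outcomes])
```

In a design like the one described above, only stop trials on the tracked items would call `update`, and the other item types would simply read the current `ssd` value.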
The experiment consisted of 14 blocks of 40 trials. We used a new list of 8 words in every block, and each word was presented five times per block. To familiarise subjects with the new words, the whole list was presented at the beginning of the block. After 5 s, the trials started.

Analyses
All data processing and analyses were completed using R (R Development Core Team, 2014). Proactive response-strategy adjustments could result in a higher percentage of omitted responses as well as higher accuracy (Verbruggen & Logan, 2009), so we distinguished between the proportion of correct go trials [p(correct)] and the proportion of missed go trials [p(miss)]. Mean reaction time (RT) on go trials was calculated after removal of incorrect trials. As we used novel words in each block, outliers could have influenced mean RT. Therefore, we detected outlying RTs with the nonparametric box-and-whisker method (Tukey, 1977) as a function of stimulus type and stimulus presentation (1-5) for each subject, and subsequently removed the trials with outlying values before calculating the mean RT (<1% of all correct go trials were excluded). Inclusion of these outliers did not alter the overall pattern of results in a meaningful way.
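The outlier step can be illustrated with a short sketch. It assumes the conventional Tukey fences (1.5 times the interquartile range beyond the quartiles) and one particular quartile definition; neither the multiplier nor the quartile method is specified above, and the function names are ours. The analyses themselves were run in R; Python is used here only for illustration.

```python
from statistics import mean, quantiles


def tukey_fences(rts, k=1.5):
    """Return (lower, upper) fences; k = 1.5 is the conventional value (assumed)."""
    # 'inclusive' treats the data as containing the population minimum and
    # maximum; other quartile definitions give slightly different fences.
    q1, _, q3 = quantiles(rts, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def trimmed_mean_rt(rts, k=1.5):
    """Return (mean RT after removing outliers, number of trials removed)."""
    lower, upper = tukey_fences(rts, k)
    kept = [rt for rt in rts if lower <= rt <= upper]
    return mean(kept), len(rts) - len(kept)


if __name__ == "__main__":
    # Hypothetical correct go RTs (ms) for one subject, stimulus type, and presentation.
    example_rts = [400, 420, 430, 450, 460, 1500]
    print(tukey_fences(example_rts))
    print(trimmed_mean_rt(example_rts))
```

In an analysis of this kind the function would be applied separately to each cell defined by subject, stimulus type, and stimulus presentation, so that a slow response to a newly introduced word is judged against comparable trials.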
In the training phase, we collapsed stop/go-then-go and stop/go-then-stop items as these were equivalent in this phase. The mean RT, proportion of correct go trials [p(correct)], and proportion of missed responses [p(miss)] for go trials appear in Table A1. Probability of responding [p(respond|signal)] and mean SSDs for stop trials appear in Table A2. The low probability of responding (due to the two-up/one-down tracking procedure) and high signal probability ensured that this design was optimal to examine stimulus-stop learning but made it suboptimal for the estimation of stop latencies; therefore, SSRTs were not estimated or analysed.
To compare the experimental group and control group, we contrasted the item types in the experimental group with performance for items that occurred at the same moment in the control group. For example, if a stop-then-go item occurred on trial 38 of block 2 for Subject 1 in the experimental group, then we labelled the item that occurred on trial 38 of block 2 a 'stop-then-go' item for Subject 1 in the control group.
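This relabelling amounts to a lookup keyed on block and trial number. The sketch below is a minimal illustration; the record layout (dictionaries with `block`, `trial`, and `item_type` keys) and the function name are hypothetical, and the deposited R scripts may organise the data differently.

```python
def label_control_trials(experimental_trials, control_trials):
    """Copy item-type labels from an experimental subject to the yoked control.

    Both arguments are lists of dicts with 'block' and 'trial' keys; the
    experimental records also carry 'item_type'. (This layout is an assumption.)
    """
    labels = {
        (t["block"], t["trial"]): t["item_type"] for t in experimental_trials
    }
    # Return new records so the control data are not modified in place.
    return [
        {**t, "item_type": labels[(t["block"], t["trial"])]}
        for t in control_trials
    ]


if __name__ == "__main__":
    experimental = [{"block": 2, "trial": 38, "item_type": "stop-then-go"}]
    control = [{"block": 2, "trial": 38, "rt": 512}]
    print(label_control_trials(experimental, control))
```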
All data files and R scripts used for the analyses are deposited in the Open Research Exeter data repository (http://hdl.handle.net/10871/15358).

Results
Tables A1 and A2 provide an overview of the go and stop data, respectively. An overview of the analyses appears in Table A3.
Appendix B. Do subjects learn associations between the stimulus and the stop signal?

Method

Subjects
Twenty-one students from the University of Exeter participated for monetary compensation (£10). Two subjects were excluded from analyses because their percentage of correct go trials was ≤50% and one subject was excluded because their percentage of signal-respond trials was ≥50% (a high proportion of correct stop trials may be required for optimal stimulus-stop learning; Verbruggen & Logan, 2008b). Exclusion of these subjects did not substantially alter the overall pattern of results (see Table B3).

Apparatus, stimuli, and procedure
We will focus on the differences with the experiment discussed in Appendix A. We created 24 matched lists of eight words (four living items and four non-living items). Twelve lists were used in the first experimental session, and the remaining 12 lists were used in the second experimental session (average word frequency: 2.3; average word length: 5.2). In addition, four lists of eight words were selected for the practice phases (two lists per session).
The experiment was run on a PC using Psychtoolbox (Brainard, 1997). The stimuli were projected onto a presentation screen and viewed via a 45° headcoil-based mirror. All words were presented in a black lower-case Arial font on a white background. Subjects responded via button presses of the left (living) and right (non-living) buttons of an MRI-compatible response box using the index and middle fingers of the right hand, respectively. The duration of the intertrial interval varied randomly on an exponential distribution with a range of 500-4000 ms and a mean of 1000 ms (as in Lenartowicz et al., 2011). The screen was blank during the intertrial interval. Subjects completed two experimental sessions on consecutive days. Each session consisted of 12 blocks of 40 trials. New word lists were used in every block to prevent re-learning. To familiarise subjects with the new words, the words were presented at the beginning of each block. After 10 s, the words were removed from the screen and the first trial started.

Table A1
Overview of the go data. Probability of an accurate go response [p(correct)], probability of a missed go response [p(miss)] and average reaction time as a function of stimulus type, trial (i.e. stimulus presentations 1-5), and group. Accuracy is the ratio of correct go trials to the number of correct and incorrect go trials (missed trials were excluded). P(miss) is the ratio of omitted responses to the total number of go trials. M = mean; sd = standard deviation.

Analyses
The go and stop data were analysed using repeated-measures ANOVAs. All analyses were conducted using R (R Development Core Team, 2014). The training and test phase data were analysed separately. In the training phase, we collapsed the stop/go-then-go and the stop/go-then-stop items (as these were equivalent in the training phase; see Appendix A). Outliers in the RTs were again detected using the same methods as in Appendix A, and were removed prior to analysis (4.5% of all correct go trials). Inclusion of these outliers did not alter the overall pattern of results in a meaningful way. All data files and R scripts are deposited in the Open Research Exeter data repository (http://hdl.handle.net/10871/15358).

Results
Tables B1 and B2 provide an overview of the go and stop data, respectively. An overview of the analyses appears in Table B3.

Table A3
Overview of the results of the mixed univariate analyses of variance. For the analysis of the go data in the training phase, we excluded 'stop-then-go' items in the control group. For the analysis of the signal data in the training phase, we excluded 'go-then-stop' items in the control group.

Table B1
Overview of the go data. Probability of an accurate go response [p(correct)], probability of a missed go response [p(miss)] and average reaction time as a function of stimulus type and stimulus presentation (1-5). Accuracy is the ratio of correct go trials to the number of correct and incorrect go trials (missed trials were excluded). P(miss) is the ratio of omitted responses to the total number of go trials. M = mean; sd = standard deviation.

Table B3
Overview of repeated measures analyses of variance. Stimulus type (stop-then-go, stop/go-then-go, stop/go-then-stop, go-then-stop) and stimulus presentation (1-5) were within-subjects factors. In the go RT analysis, incorrect and missed go trials were removed. We did not analyse p(miss) because values were low. To account for potential violations of sphericity, the Huynh-Feldt correction was applied where appropriate (uncorrected degrees of freedom are reported). Note that the main effect of stimulus type on p(respond|signal) was marginally significant when the three outliers were included (p = 0.056); the main effect of trial on p(respond|signal) remained significant (p < 0.001). The RT and accuracy main effects and interactions remained nonsignificant.