The concepts of spatial selective attention and response inhibition have been topical in the psychological literature since at least Helmholtz (1867) and James (1890). The ability to inhibit some elements in the environment while focusing on other target elements is crucial to cognitive functioning (e.g., Broadbent, 1958; Cowan, 1995). Moreover, failures to properly inhibit responses to inappropriate elements in the environment are used to describe and understand pathological development (Barkley, 1997; Hasher & Zacks, 1988).

Inhibition is often treated as a unified or homogeneous concept (see Rey-Mermet, Gade, & Oberauer, 2018 for a review). Accordingly, one might expect tasks that measure inhibition to covary. If inhibition is a unified process, then people who are particularly good at excluding irrelevant information in one task should be good at doing so in other tasks. Yet, in individual-difference studies, there have been meager correlations across individuals among tasks that purportedly localize inhibition. For example, consider the correlation between Stroop interference and flanker interference. In the Stroop task, information from reading an item must be suppressed to accurately respond; in the flanker task, information from neighboring items must be suppressed to accurately respond. Earlier work showed some evidence for a correlation between the flanker and Stroop tasks (e.g., Friedman & Miyake, 2004), but more recent work has shown low or even slightly negative correlations (e.g., Hedge, Powell, & Sumner, 2018; Pettigrew & Martin, 2014; Rey-Mermet et al., 2018). Rey-Mermet et al. (2018) provide a salient account of similarly meager correlations across a number of tasks and conclude that inhibition is distinct rather than unified.

The distinction in results is also seen in studies that employ large batteries of inhibition tasks run across hundreds of people. In these studies, researchers typically reduce the dimension of the data by using latent variables to decompose the covariation among tasks. Unfortunately, most of these analyses have resulted in the awkward situation where factors largely load onto single tasks (MacKillop et al., 2016; Rey-Mermet et al., 2018), indicating a lack of covariation to decompose.

In this paper, we provide a different, more targeted assessment of inhibition. Here, we focus on an experimental method rather than latent variable modeling. We focus our attention on two tasks that are very similar. Both tasks are versions of Eriksen flanker tasks where the participant must suppress surrounding visual information. We leverage here the fact that sometimes very similar tasks result in exactly opposite behaviors.

In the Eriksen flanker task, the usual behavior is in a specific direction that we term assimilation. Consider the display in Fig. 1a where the goal is to identify the center letter as either an “A” or an “H”. If assimilation occurs, then people are more likely to misidentify the target H as an “A” when surrounded by A’s than when surrounded by H’s. Because they are making responses seemingly driven by the identity of the flankers, we say their inhibition failure leads to an assimilation of the background.

Fig. 1

Three flanker paradigms. The participants’ task is to identify the center letter as either an A or an H. a Conventional flanker paradigm results in assimilation (Eriksen & Eriksen, 1974). b Modified paradigm with morph-letter targets results in contrast (Rouder & King, 2003). c Word contexts result in assimilation (Neisser, 1967)

Assimilation, however, is not the only possible outcome. Rouder and King (2003) used a modified version of the flanker task and found the opposite effect, which we call contrast. Rather than using well-formed letters, Rouder and King’s targets were morphed letters between A and H (see Fig. 1b). Perhaps surprisingly, morphs surrounded by H’s were less likely to be identified as an “H” than morphs surrounded by A’s. This effect is exactly opposite of the typical assimilation because inhibition failures lead to a response that is in contrast to the background.

The presence of two different inhibition effects, assimilation and contrast, provides a fruitful window for examining the unity of inhibition. Both contrast and assimilation here reflect a failure to completely inhibit the background, but they lead to opposite behavioral patterns. The question then is whether the inhibition processes are the same in both tasks. We address this question by studying the correlation of inhibition abilities across people. Are people who are affected by assimilation also affected by contrast? We interpret the presence or absence of correlation rather conventionally. If assimilation and contrast effects are correlated, especially if strongly so, then the tasks seemingly rely on some elements in common. Conversely, if the effects of assimilation and contrast are unrelated, then the pattern serves as a marker that inhibition in these tasks relies on statistically separable processes.

We employ a similar procedure to Rouder and King (2003). In Rouder and King, assimilation occurred when the target letter was clear, a pure letter. In this condition, accuracy was high and response times were the salient indicator of performance. Contrast occurred when the target letter was morphed. Here, the choice proportions were variable and these choices were the salient indicator of performance. Our concern with this earlier work is that assimilation and contrast occurred for different stimuli (clear vs. morphed) and for different behavioral measures (reaction time vs. choice). To that end, we measure response choice and focus on morphed targets.

To induce contrast and assimilation, we manipulate the background as follows: Our targets were morphed letters like those in Fig. 1b. Our backgrounds came in two types, either a letter context such as in Fig. 1b or a word context such as in Fig. 1c. For the letter context, we expect large contrast effects as was observed by Rouder and King (2003). The rationale for the word context (Fig. 1c) comes from Neisser (1967). We expect that a morph between A and H will be judged more “A” like in the C_T context than in the T_E context. This effect is assimilative, and it has been repeatedly demonstrated that there is an assimilative effect of word contexts for both visually and aurally presented letters (Baron & Thurston, 1973; Reicher, 1969). As an aside, this assimilation effect has been a motivating phenomenon in the development of connectionist models (e.g., Carpenter & Grossberg, 1987; Rumelhart & McClelland, 1982) where word nodes feed positive activation to corresponding letter nodes.

In the following experiment, each participant identified several A-to-H morphs (see the top row of Fig. 2). These target morphs were embedded within four background contexts (see the bottom row of Fig. 2). By comparing performance in the A-letter and H-letter contexts, we assess each participant’s ability to inhibit contrastive information. By comparing performance in the C_T-word and T_E-word contexts, we assess each participant’s ability to inhibit assimilative information. Note that for each target, there are assimilative and contrastive background contexts such that the critical comparisons may be made across backgrounds. We assess the correlation between inhibition measures for the two context types across individuals. The critical question is whether inhibition across contrastive and assimilative contexts is unified or distinct. If the two forms of inhibition share some common mechanism, then a non-zero, positive correlation is expected. Conversely, if contrast and assimilation are distinct mechanisms of inhibition, then a null correlation is expected.

Fig. 2

Stimuli. Top The five targets used in the experiment. Bottom The four backgrounds. The stimulus on a trial consisted of one of the five targets placed into the blank center location of the 3 × 3 grid background

Method

Participants, exclusion, and sampling justification

Ninety-nine undergraduates from the University of Missouri served as participants as part of an introductory course requirement. Data from two participants were discarded due to a computer error and data from an additional four were discarded because 20 or more of their responses were shorter than a criterial 200 ms in duration. The data from the remaining 93 participants were used in analysis.

The critical question is how many participants to use. Because the dependent measure, response choice, is not common, there is little guidance in the literature to justify a sampling plan. We use Bayesian analysis and, consequently, may use optional stopping. As discussed by Rouder (2014) and several others before him, the interpretation of the Bayes factor does not depend on whether one uses a set stopping rule or proceeds haphazardly. We first peeked at the data after about 20 participants and observed a Bayes factor of 2-to-1, which we were not satisfied with. By 50 participants, we had obtained a Bayes factor of 5-to-1. At that point, we decided to run as many participants as we could until the end of the semester. By semester’s end, we had run 93 usable participants, who provided 65,541 usable responses.

Design

The experiment was a 5 × 2 × 2 within-subject factorial design. The first factor was the target, and it was manipulated through five levels from the letter H through the morphs to the letter A. The second factor was the context type: the background was either a letter or a word. The third factor was context direction: whether the context promotes “A” or “H” responses. We coded the H background and the C_T background as promoting “A” responses based on prior literature. This coding does not determine the direction of results; it simply provides a clear language for discussing them.

Material

The stimuli are shown in Fig. 2. The top row contains the five target letters. The bottom row contains the four 3 × 3 letter grid contexts with a space left in the middle for the target. The backgrounds are exactly as shown. The X-in-a-box characters were used in the word contexts to give them the same overall dimension as the letter contexts. All target letters appeared in each of the four contexts, though the rates were not equal. To emphasize the morphs, the three central targets in Fig. 2 were each twice as likely to appear as each well-formed letter.
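
As a rough illustration of the design and the unequal target rates, the following Python sketch enumerates the four contexts and builds one 96-trial block. It assumes, and the text does not confirm, that each block was balanced, with every context crossed with the five targets in a 1:2:2:2:1 ratio and repeated three times. The target labels, the function name make_block, and the repetitions parameter are hypothetical.

```python
import itertools
import random

# Hypothetical labels; the paper only says there are five targets from A to H.
TARGETS = ["A", "morph1", "morph2", "morph3", "H"]
TARGET_WEIGHTS = [1, 2, 2, 2, 1]  # each morph twice as likely as each pure letter
CONTEXT_TYPES = ["letter", "word"]
DIRECTIONS = ["promotes_A", "promotes_H"]


def make_block(repetitions=3, seed=None):
    """Build one shuffled block of (target, context_type, direction) trials.

    Assumption: blocks are balanced across the four contexts. The source
    states only the 2:1 target ratio and the 96-trial block length.
    """
    rng = random.Random(seed)
    trials = []
    for ctx, direction in itertools.product(CONTEXT_TYPES, DIRECTIONS):
        for target, weight in zip(TARGETS, TARGET_WEIGHTS):
            trials.extend([(target, ctx, direction)] * (weight * repetitions))
    rng.shuffle(trials)
    return trials


block = make_block(seed=1)
print(len(block))  # 96
```

Under these assumptions, the eight weighted target slots times four contexts times three repetitions give exactly the 96 trials per block reported in the Procedure.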

Procedure

Participants were presented with the stimuli and asked to judge whether the center letter was more similar to an “A” or an “H” by pressing the corresponding keys on a standard keyboard. Participants were explicitly instructed to ignore the background context and base their responses on the central target alone.

An experimental trial proceeded as follows: The screen was blank during a 1.5-s foreperiod. To warn participants that a target within one of the context grids was about to appear, two brief tones were presented 500 ms and 250 ms before the stimulus. These tones allowed participants to anticipate the onset of the stimulus precisely. Next, the stimulus was presented for 100 ms and was thereafter replaced by a blank screen. This blank screen remained until the participant pressed either the “A” or “H” key to indicate their judgment about the target. The response marked the end of the current trial and the beginning of the next one. Responses and the time taken to respond were recorded. A block consisted of 96 trials and the experimental session consisted of ten blocks for a total of 960 trials. Participants were encouraged to take breaks between blocks. No feedback was given about participant responses during the course of the experimental session.

All experimental sessions were conducted on MacMini computers running the operating system MacOSX 10.6.2 with Octave version 3.2.3. This experimental procedure was approved by the Institutional Review Board at the University of Missouri.

Data curation

Data were born-open (Rouder, 2016) in that they were made publicly available as they were collected. They may be found at https://github.com/PerceptionCognitionLab/data1/tree/master/ctxIndDif/flankerMorph4.

Results

Data were cleaned by discarding responses with latencies less than 200 ms or greater than 2 s. These discards comprised about 1% of the total. Additionally, the first 20 trials of the session were considered practice and excluded. These criteria were chosen before data collection.
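
The cleaning rules can be sketched in a few lines of Python. This is a minimal illustration, not the analysis code used for the paper; the record layout (a dict with a 1-based session trial index and a response time in seconds) and the name clean_trials are assumptions.

```python
def clean_trials(trials, n_practice=20, min_rt=0.2, max_rt=2.0):
    """Apply the stated exclusion rules to one participant's session.

    trials: list of dicts with hypothetical keys
        'trial' (1-based position within the session) and 'rt' (seconds).
    """
    kept = []
    for t in trials:
        if t["trial"] <= n_practice:
            continue  # first 20 trials of the session are practice
        if t["rt"] < min_rt or t["rt"] > max_rt:
            continue  # latency shorter than 200 ms or longer than 2 s
        kept.append(t)
    return kept


example = [
    {"trial": 5, "rt": 0.60},   # practice trial: dropped
    {"trial": 21, "rt": 0.15},  # too fast: dropped
    {"trial": 22, "rt": 0.65},  # kept
    {"trial": 23, "rt": 2.40},  # too slow: dropped
]
print([t["trial"] for t in clean_trials(example)])  # [22]
```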

Figure 3a and b show the proportion of “H” responses as a function of target and context. As expected, the curves start low for A and A-like stimuli and increase as the targets become more H-like. Figure 3a displays the results for the letter contexts (A-letter and H-letter) and Fig. 3b displays results for the word contexts (C_T-word and T_E-word). Solid and dashed lines in Fig. 3a show the effect of contexts. The effect here is contrast, as a morph is more likely to be identified as an “H” when surrounded by A’s than when surrounded by H’s. The opposite effect, assimilation, may be seen in the word contexts. A morph in the context T_E is more likely to be identified as an “H” than in the context C_T.

Fig. 3

Empirical Results. a Response proportions for letter contexts. Solid and dashed lines differentiate between A-letter and H-letter contexts. b Response proportions for word contexts. Solid and dashed lines differentiate between C_T-word and T_E-word contexts. c Individual effects in the letter contexts. The negative direction denotes a robust contrast effect. d Individual effects in the word contexts. The positive direction denotes a robust assimilation effect

Figure 3a and b show effects averaged across individuals. Effects for each individual are shown in Fig. 3c and d. To observe individual effects in the letter-context condition, we subtracted the proportion of “H” responses for the A-letter context from that for the H-letter context. In this graph, positive values indicate an assimilation effect; zero indicates no effect of context direction; negative values indicate a contrast effect. The following three points are noted: 1) The contrast effects of letter contexts are robust across individuals. 2) The size of these effects is much larger than usual. The differences in proportions average as much as 0.34, which dwarfs the size of differences in most experiments. 3) The degree of individual variability is also quite large. This degree provides increased resolution in the following correlational analysis. Figure 3d shows the same plot for the word context; it is formed by subtracting the proportion of H responses in the C_T-word context from that in the T_E-word context. The story about individuals is largely the same: 1) seemingly every individual shows an assimilation effect, 2) the effect is large, averaging as much as 0.21, and 3) there is a suitable range of variation.
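
The raw individual effects described above are differences in the proportion of “H” responses between two contexts. The Python sketch below shows one way to compute them; it pools over targets, and the tuple layout, context labels, and function name context_effect are assumptions for illustration rather than the authors' code.

```python
from collections import defaultdict


def context_effect(trials, minus_context, plus_context):
    """Per-participant P("H" | plus_context) - P("H" | minus_context).

    trials: iterable of (participant_id, context, response) tuples,
            with response either "A" or "H" (hypothetical layout).
    """
    # participant -> context -> [number of "H" responses, number of trials]
    counts = defaultdict(lambda: defaultdict(lambda: [0, 0]))
    for pid, context, response in trials:
        cell = counts[pid][context]
        cell[0] += int(response == "H")
        cell[1] += 1

    effects = {}
    for pid, by_ctx in counts.items():
        if plus_context in by_ctx and minus_context in by_ctx:
            p_plus = by_ctx[plus_context][0] / by_ctx[plus_context][1]
            p_minus = by_ctx[minus_context][0] / by_ctx[minus_context][1]
            effects[pid] = p_plus - p_minus
    return effects


toy = [
    ("p1", "A-letter", "H"), ("p1", "A-letter", "H"),
    ("p1", "H-letter", "A"), ("p1", "H-letter", "H"),
]
# Letter contexts: H-letter minus A-letter, so negative values mean contrast.
print(context_effect(toy, minus_context="A-letter", plus_context="H-letter"))
# {'p1': -0.5}
```

For the word contexts, the same function would be called with minus_context set to the C_T-word label and plus_context set to the T_E-word label, so that positive values indicate assimilation.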

To assess the correlation between individual assimilation and contrast effects, we developed a Bayesian hierarchical mixed model with a probit link. The benefits of the modeling approach are two-fold: First, it provides a principled means of combining data across the different targets. Second, and more importantly, the hierarchical structure provides a form of regularization used to avoid overstating the range of individual variation (Davis-Stober, Dana, & Rouder, submitted; Efron & Morris, 1977; Lehmann & Casella, 1998). The model and corresponding analysis are described in an Online Supplement at https://github.com/PerceptionAndCognitionLab/ctx-flanker/tree/public/papers/app/public. The main outputs are individual estimates of assimilation and contrast effects and a posterior distribution of the correlation. The individual estimates are shown in the scatter plot in Fig. 4a; more positive values indicate stronger assimilation and stronger contrast effects in the respective background contexts.

Fig. 4

Model Results. a Each participant’s estimated assimilation effect against their estimated contrast effect. An ellipse denotes the standard deviations of the estimate and the blue regression line is the line of best fit. b The posterior distribution of the population correlation, ρ, between assimilation and contrast. The solid line denotes the prior distribution

As can be seen in Fig. 4a, there is a fair degree of variation across individuals as well as an overall positive relationship. An issue, however, is the presence of an outlying point, indicated with an arrow. This participant had the highest degree of assimilation and the lowest degree of contrast. Performance here stands apart from that of the other participants; if that point is included in analysis, then it would have great leverage. We decided in a post hoc manner to exclude this participant, and the following analyses are based on this exclusion. Our results therefore pertain to the vast majority of individuals who are in the main cluster.

Assimilation and contrast estimates show a positive relationship as seen by the OLS regression line in Fig. 4a. OLS regression, however, is inappropriate as an inferential tool because the estimates are correlated through the hierarchical structure. To perform inference, we plot the prior and posterior distributions of the population-level correlation coefficient. The prior distribution here was chosen to be flat, placing equal plausibility on all values of the correlation coefficient. The resulting posterior distribution is well localized for positive values away from zero. The mean of this posterior distribution serves as a point estimate, and it is 0.35. One way of competitively assessing the null-correlation hypothesis vs. the alternative is to compute the change in plausibility at zero. The plausibility was reduced by a factor of 24.98. This reduction indicates the data are 24.98 times more plausible under the alternative than the null. This computation is the Savage–Dickey approach to Bayes factors (Dickey, 1971; Gelfand & Smith, 1990).
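
To make the Savage–Dickey computation concrete, the sketch below estimates the posterior density of the correlation at zero from posterior draws and divides the prior density at zero by it. It assumes a flat prior on the interval from -1 to 1, whose density at zero is 0.5, and uses a simple Gaussian kernel estimate with a normal-reference bandwidth. The estimator, the function name, and the synthetic draws are illustrative assumptions; the supplement describes the analysis actually used.

```python
import numpy as np


def savage_dickey_bf10(posterior_samples, prior_density_at_zero=0.5):
    """Bayes factor for rho != 0 over rho = 0 from posterior draws of rho.

    The posterior density at zero is approximated with a Gaussian kernel
    estimate (an assumption of this sketch, not the paper's stated method).
    """
    x = np.asarray(posterior_samples, dtype=float)
    n = x.size
    bandwidth = 1.06 * x.std(ddof=1) * n ** (-1 / 5)  # normal-reference rule
    z = (0.0 - x) / bandwidth
    posterior_density_at_zero = (
        np.exp(-0.5 * z**2).sum() / (n * bandwidth * np.sqrt(2 * np.pi))
    )
    return prior_density_at_zero / posterior_density_at_zero


# Synthetic draws for illustration only; these are NOT the paper's posterior.
rng = np.random.default_rng(0)
fake_draws = np.clip(rng.normal(0.3, 0.15, size=20_000), -0.999, 0.999)
print(savage_dickey_bf10(fake_draws))
```

A ratio above 1 means the posterior places less density at zero than the prior did, which is the reduction in plausibility referred to in the text.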

In this experiment, the effects of the surround are minimal for the first several trials and grow slowly to an asymptote. Both contrast and assimilation effects tend to reach their asymptotes after the second of ten blocks. We reran the analysis eliminating the first two blocks (about 20% of the data). The Bayes factor was found to be smaller: 6-to-1 rather than the 25-to-1 with all data. We think this lower value serves as a lower limit; perhaps the strength of evidence is best thought of as a range, say from 6-to-1 to 25-to-1, or roughly one order of magnitude. The supplement provides more information about this analysis.

To place the correlation value in context, it is helpful to consider the reliability of individual estimates. We suspected high reliability because we had each participant perform a total of 960 trials, which is quite numerous. We split each individual’s data into odd and even trials, and reran the Bayesian probit regression analysis separately for the odd and even trials. Each analysis provides an estimate of each individual’s assimilation and contrast effects. The assimilation estimates from the odd trials may be correlated with the assimilation estimates from the even trials, and likewise for the contrast estimates. The correlation among individuals’ assimilation estimates was 0.76; the correlation among individuals’ contrast estimates was 0.75. These values, when extrapolated to the full set of trials, imply reliabilities of 0.86 for both effects. Hence, much of the variation in the scatter is not due to trial-by-trial noise, but reflects true latent variation across individuals.
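
The text does not name the extrapolation formula, but the reported values are consistent with the Spearman–Brown prophecy formula, which projects a split-half correlation to a test of full length. The short Python check below is a sketch under that assumption; the function name is hypothetical.

```python
def spearman_brown(r_half, k=2.0):
    """Projected reliability when the test is lengthened by a factor k.

    For a split-half correlation, k = 2 gives 2r / (1 + r).
    """
    return k * r_half / (1.0 + (k - 1.0) * r_half)


for label, r in [("assimilation", 0.76), ("contrast", 0.75)]:
    print(label, round(spearman_brown(r), 2))
# assimilation 0.86
# contrast 0.86
```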

Discussion

Our goal here was to assess whether inhibition was mediated by common or distinct mechanisms in assimilation and contrast contexts. To address this goal, we assessed the correlation between individuals’ ability to inhibit background assimilative information and their ability to inhibit background contrastive information. Our approach relied on morph-letter targets. Large assimilation effects were found when the surrounding, to-be-ignored information could potentially be a word, replicating the often observed top-down effect of word contexts on letter identification. Contrast effects were found when the surrounding contexts were letters, replicating the main finding of Rouder and King (2003). The key finding here is a positive correlation across individuals. Individuals who were better able to inhibit the contrastive effects of surrounding letter contexts were better able to inhibit the assimilative effects of surrounding word contexts.

The findings are in line with the view that selective attention is achieved by narrowing receptive fields. This narrowing process occurs whether the to-be-excluded information is contrastive or assimilative (e.g., Cowan, 1995; Desimone & Duncan, 1995; Eriksen & Schultz, 1979; Hedden & Gabrieli, 2010). Yet, there are strong arguments that contrast and assimilation are not the same process. Most theories of contrast rely on a center-surround organization of low-level, perceptual receptive field structures (Palmer, 1999). Most theories of assimilative flanker effects, however, attribute them to failures of response inhibition, which is conceptualized as a higher-level, top-down process (Eriksen & Eriksen, 1974). Rouder and King (2003) interpreted their original findings as evidence for distinct processes of contrast and assimilation. They theorized that the assimilative effect was at the response level and the contrastive effect was at a perceptual level.

Our finding, the correlation between inhibition effects, could be interpreted as evidence for a unified mechanism of inhibition. Here is how: attention acts fairly early but is imperfect, and some irrelevant information is processed. When it is, the ensuing contrast and assimilation effects result. In this view, the common variation across these tasks indexes the individual’s raw ability to control selective attention. This explanation follows the parsimonious account from Lachter et al. (2004), who provide an updated version of Broadbent’s classic early-attention theory (Broadbent, 1958).

While this interpretation fits with the current finding, we caution readers against relying on correlations to explain structure and processing. The correlation may not result from a leaky early filter, but might reflect more mundane explanations such as variability in demand characteristics. Some participants are going to take more time, respond with more care, and simply try harder to exclude the background. These participants will show smaller differences between A and H backgrounds, that is, smaller contrast, and also smaller differences between T_E and C_T contexts, that is, smaller assimilation. So, as is often the case in individual-differences research, diagnosing the cause of correlations remains difficult. The next step is to manipulate spatial attention within this paradigm and to explore the effects of those manipulations on the correlations across contrastive and assimilative tasks.