Studying contextual influences on speaking provides a window into the cognitive mechanisms underlying speech planning, and such studies have informed influential speech production models (e.g., Levelt, Roelofs, & Meyer, 1999). The present study is concerned with taboo interference in the picture-word interference (PWI) task, that is, the observation that picture naming is slowed down by taboo distractor words (i.e., offensive or socially inappropriate words) compared with neutral distractor words (Dhooge & Hartsuiker, 2011; for a similar effect in color naming, see MacKay et al., 2004; Siegrist, 1995). Previous studies have related this effect to attentional capture by taboo distractors (Hansen, McMahon, Burt, & de Zubicaray, 2017), prearticulatory verbal self-monitoring (Dhooge & Hartsuiker, 2011), or a combination of mechanisms across multiple processing stages (White, Abrams, Koehler, & Collins, 2017). In the present study, we further investigated the source of the taboo interference effect by testing for taboo interference in three tasks, which systematically differ with respect to the processing stages involved.

Attentional capture accounts of taboo interference posit that increased attention to taboo distractor words (relative to neutral words) slows preparation of the target response at prelexical and lexical processing stages. Increased attention to taboo words might affect prelexical, conceptual processing of target pictures because binding of word meaning and contextual information is promoted with taboo words, which would divert resources away from target processing (Hansen et al., 2017; MacKay et al., 2004). However, increased attention to distractor words might also affect lexical processing of target pictures. For instance, in the WEAVER++ model of PWI (Roelofs, 2003), increased attention to taboo words might slow down reactive blocking of the distractor and increase distractor activation, which would increase interference of the distractor in lexical selection and phonological encoding of the picture name.

Dhooge and Hartsuiker (2011) argued against an attentional capture account of taboo interference based on two observations. In a standard PWI task, they observed slower naming latencies with taboo distractors compared with neutral distractors. However, in a speeded PWI task, they found fewer intrusion errors (i.e., distractor naming) with taboo distractors compared with neutral distractors. The authors argued that an attentional capture account cannot explain both effects simultaneously. Rather, they proposed that both effects reflect a verbal self-monitoring mechanism that scrutinizes readily prepared responses before articulation. To account for the error effect, they proposed that taboo word intrusions into an articulatory buffer can be detected and prevented more efficiently than neutral word intrusions because verbal self-monitoring is fine-tuned to prevent potentially offensive and embarrassing errors (Motley, Camden, & Baars, 1981). To account for the latency effect, they proposed that the monitoring mechanism adapts trial by trial to the context, so that correctly prepared target responses are scrutinized longer before executing the response in trials in which a taboo distractor word is encountered, compared with trials in which a neutral word is encountered.

This monitoring account of taboo interference has recently been questioned as well. Hansen et al. (2017) observed that taboo interference did not correlate with rated offensiveness or valence of taboo words, but rather with arousal ratings, which suggests attentional modulations as the cause of the effect. Furthermore, Hansen et al. (2017) observed additivity of phonological facilitation and taboo interference (cf. White, Abrams, LaBat, & Rhynes, 2016), suggesting that the two effects arise at different loci. Thus, taboo interference could either arise before word-form encoding (e.g., attentional capture affecting conceptual or lexical processing) or afterward (e.g., verbal self-monitoring). Importantly, when distractor words were backward masked, taboo interference persisted, whereas phonological facilitation disappeared. This suggests that masked distractor words were not processed deeply enough to enter an articulatory buffer. The persistence of taboo interference with masked distractors rules out that the effect arises after taboo distractor words have entered the articulatory buffer. However, the monitoring account by Dhooge and Hartsuiker (2011) does not tie taboo interference directly to the removal of taboo distractors from the buffer. Instead, detecting the presence of a taboo word could trigger a more conservative (slower) checking of the target response. Possibly, backward-masked taboo distractors still trigger an adaptation of verbal self-monitoring, even if the distractors themselves do not intrude into the articulatory buffer. In summary, previous studies on the localization and functional explanation of taboo interference have led to different conclusions, and it is so far not clear whether attentional modulations of distractor processing, verbal self-monitoring, or a combination of these processes causes the effect.

In the present study, we used a different approach to explore the locus and cause of taboo interference. We tested for the effect of taboo distractor words on picture processing in three tasks, which systematically differed in the processing stages involved. In picture naming, participants named the target pictures—this requires conceptual processing, lexical processing, and articulation. In phoneme decision, participants indicated via a button press whether the picture name starts with a b or a k—this requires conceptual and lexical processing, but not articulation. In size decision, participants indicated via a button press whether the depicted object is typically bigger or smaller than a given standard—this requires conceptual processing, but not lexical processing or articulation. Attentional capture by taboo distractor words should be effective independent of the involvement of lexical processing or articulation. Thus, on this account, taboo interference should be observable in all three tasks, although it might differ in size depending on the processing stages involved. Specifically, if attentional capture affects both conceptual and lexical processing, then taboo interference should be larger in picture naming and phoneme decision (involving conceptual and lexical processing) than in size decision (involving conceptual processing but not lexical processing). In contrast, under the verbal self-monitoring account, the effect should only emerge in a task in which erroneously responding to the distractor would result in a (potentially) embarrassing articulatory response. Thus, according to this account, the effect should only be found in picture naming.
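
The contrasting predictions can be written down compactly. The following is a minimal sketch, not part of the study materials: the stage labels and account names are taken from the text, whereas the dictionary layout and the function name are illustrative assumptions. It encodes only the presence or absence of a predicted effect, not its size.

```python
# Processing stages assumed for each task, as described in the text.
TASK_STAGES = {
    "picture naming": {"conceptual", "lexical", "articulation"},
    "phoneme decision": {"conceptual", "lexical"},
    "size decision": {"conceptual"},
}


def predicts_interference(account: str, task: str) -> bool:
    """Return True if the account predicts taboo interference in the task."""
    stages = TASK_STAGES[task]
    if account == "attentional capture":
        # Capture should operate whenever the picture is processed conceptually,
        # regardless of lexical or articulatory involvement.
        return "conceptual" in stages
    if account == "verbal self-monitoring":
        # Only an overt spoken response could yield an embarrassing error.
        return "articulation" in stages
    raise ValueError(f"unknown account: {account}")


# Prediction table: account -> task -> effect expected?
PREDICTIONS = {
    account: {task: predicts_interference(account, task) for task in TASK_STAGES}
    for account in ("attentional capture", "verbal self-monitoring")
}
```

Under these assumptions, the two accounts diverge exactly in the two manual decision tasks, which is why those tasks are the critical ones.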

Experiment 1

Experiment 1 contrasted the effects of taboo versus neutral distractor words in picture naming, phoneme decision, and size decision. Picture naming served as the baseline task to establish the sensitivity of our materials. Of critical interest was whether taboo interference would also be obtained in the two other tasks.

Method

Participants

Seventy-two native speakers of German participated (64 female, mean age: 21.7 years, range: 18–40 years). They received either course credit or €8. One additional participant was replaced because she misread the (size decision) instructions. Participants were informed about the potentially offensive nature of the experimental materials in the invitation to the experiment. The study procedures were approved by the ethics board of the Medical Faculty of Leipzig University.

Materials

Picture stimuli were 32 black line drawings of objects (max. 8.4° × 8.4° at 60 cm distance). Half of the objects had a natural size smaller than, the other half bigger than, a selected standard (a box of 26.5 × 18 × 12.5 cm). Within the sets of small and large objects, half of the object names started with b and the other half with k. Thirty-two taboo words (swear words, obscenities, sexual terms) and 32 neutral words were selected as distractors (see Supplemental Material). Taboo status of the distractor words was confirmed by a rating administered after the experiment proper. For each word, participants rated how “taboo or socially inappropriate” it would be in a regular conversation with colleagues or friends (1 = not at all to 7 = very strongly). Taboo and neutral words were matched word by word in length, word frequency, cumulative bigram frequency, and neighborhood density (see Table 1). No distractor word started with b or k. Two different sets of target–distractor pairings were created by combining each pair of matched taboo and neutral words with one small and one large object. This ruled out that the critical contrast between taboo and neutral words would be confounded with potential distractor-response congruency differences of the words and their associated pictures in size decision.

Table 1. Properties of the taboo and neutral distractor word sets

Distractor words were presented in regular orthography (i.e., with an initial capital letter) superimposed onto the pictures and embedded in a rectangle of background color (RGB: 220, 220, 220). The rectangle’s size was constant for a given picture but varied between pictures (depending on the maximal width of the distractors associated with a given picture). Distractor words appeared in Arial bold font (height: 1.7°, width: 2.9°–7.2°). Four additional pictures, taboo words, and neutral words were selected for creating practice and warm-up trials.

Design

There were two independent variables: distractor condition (taboo vs. neutral word) and task (picture naming vs. phoneme decision vs. size decision). Distractor condition was tested within participants and items. Task was tested within items but (partially) between participants. Every participant performed two of the three tasks. Tasks were blocked; task sequence was counterbalanced across participants. Overall, each task was administered first to one group of 24 participants (and second to another group of 24 participants). Each task comprised two experimental blocks. In each of these experimental blocks, one of the two sets of target–distractor pairings was used (see Materials). The sequence of target–distractor pairings across the two experimental blocks was counterbalanced across participants. The sequence of conditions per item was counterbalanced in two parallel lists (i.e., lists with the same sequence of target pictures, but different sequences of distractor words). Overall, 12 lists (two parallel lists × six pseudo-randomizations) were constructed, and each list was used six times (once with each task sequence). Trials were pseudo-randomized with the following restrictions: (a) picture repetitions were separated by at least eight trials, (b) specific sequences of any two pictures were not repeated within a task, (c) semantically related pictures did not appear on consecutive trials, (d) semantically or phonologically related distractor words did not appear on consecutive trials, and (e) target onset, target size, and distractor condition were each repeated on at most four consecutive trials.
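
The counterbalancing arithmetic can be checked with a short sketch. This is an illustration only: the list labels ("A", "B", 1 to 6) and all function names are hypothetical, and the run-length helper reflects one reading of restriction (e), namely that no value of a given trial property occurs on more than four trials in a row.

```python
from itertools import permutations, product

TASKS = ("picture naming", "phoneme decision", "size decision")

# Every ordered pair of two different tasks: 3 * 2 = 6 task sequences.
TASK_SEQUENCES = list(permutations(TASKS, 2))

# 2 parallel lists x 6 pseudo-randomizations = 12 lists (labels are hypothetical).
LISTS = list(product(("A", "B"), range(1, 7)))

# Each list is used once with each task sequence: one entry per participant.
ASSIGNMENTS = list(product(LISTS, TASK_SEQUENCES))


def count_by_position(task: str, position: int) -> int:
    """Participants who perform `task` at `position` (0 = first, 1 = second)."""
    return sum(1 for _, sequence in ASSIGNMENTS if sequence[position] == task)


def max_run_length(values) -> int:
    """Length of the longest run of identical consecutive values."""
    longest = current = 0
    previous = object()  # sentinel that equals no real value
    for value in values:
        current = current + 1 if value == previous else 1
        previous = value
        longest = max(longest, current)
    return longest
```

With these definitions, `len(ASSIGNMENTS)` is 72 and `count_by_position(task, 0)` is 24 for every task, matching the sample described above. A candidate trial order would satisfy the assumed reading of restriction (e) if `max_run_length` applied to each of the three property sequences returns 4 or less.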

Procedure

First, participants received a booklet showing all pictures with their names. Then, participants were familiarized with their first task. For picture naming, participants were instructed to name each depicted object as fast as possible. For phoneme decision, they were instructed to decide as quickly as possible whether the object name starts with a b (left button press) or a k (right button press). For size decision, they were instructed to decide as quickly as possible whether the object is typically smaller than the standard box (left button press) or larger than it (right button press). Printed signs indicating the button assignment remained visible during the decision tasks (in size decision, the standard box was also visible). The experiment started with two practice blocks. First, the task was introduced, and each picture (experimental and practice items) was responded to once (36 trials). Second, distractor words were introduced by presenting each practice picture twice with a (practice) distractor word (eight trials). Participants were instructed to pay no attention to the words and were informed that some of the words would be unpleasant and socially inappropriate. Afterward, the two experimental blocks were presented (two warm-up trials followed by 64 experimental trials per block, blocks separated by a short pause). After completion of the first task, the second task was administered in exactly the same manner (including new practice blocks). After completion of the second task, participants performed the rating.

Stimuli were presented on a TFT monitor. Naming latencies were registered with a microphone connected to a hardware voice key. Decision latencies were registered with a push-button box. Stimulus presentation and data collection were controlled by NESU (Nijmegen Experimental Setup). Experimental trials were structured similarly to those in Dhooge and Hartsuiker (2011). First, a fixation cross was presented (500 ms), followed by a blank screen (500 ms), followed by the picture-word stimulus (350 ms). Responses were registered until 3 s after picture onset, and 500 ms later the next trial started.
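
The trial structure can be laid out as a timeline. The sketch below is illustrative and is not taken from the NESU scripts. The durations come from the description above. The assumption, which the text does not state explicitly, is that the 500 ms pause starts at the end of the 3 s response window rather than at the response, so that every trial has the same length.

```python
# Event durations in ms, taken from the procedure description.
FIXATION_MS = 500
BLANK_MS = 500
STIMULUS_MS = 350
RESPONSE_WINDOW_MS = 3000  # measured from picture onset
PAUSE_MS = 500             # assumed to follow the end of the response window


def trial_timeline():
    """Return (event, onset_ms, offset_ms) tuples relative to trial start."""
    picture_onset = FIXATION_MS + BLANK_MS
    window_end = picture_onset + RESPONSE_WINDOW_MS
    return [
        ("fixation cross", 0, FIXATION_MS),
        ("blank screen", FIXATION_MS, picture_onset),
        ("picture-word stimulus", picture_onset, picture_onset + STIMULUS_MS),
        ("response window", picture_onset, window_end),
        ("pause before next trial", window_end, window_end + PAUSE_MS),
    ]
```

Under this assumption, the stimulus disappears well before the response window closes, and a full trial lasts 4,500 ms.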

Results and discussion

Originally, we had planned to test 48 participants (12 for each of these task sequences: picture naming–phoneme decision, phoneme decision–picture naming, picture naming–size decision, size decision–picture naming). This would have allowed us to contrast picture naming with each decision task within participants. However, initial analyses showed that the baseline taboo interference effect in picture naming decreased substantially with distractor repetitions and thus also across the two parts of the experiment (see below). Therefore, collapsing the data for a specific task from the first and second part might underestimate distractor effects and would therefore have been problematic for assessing the presence versus absence of taboo interference in the decision tasks. However, separating the data by part of experiment would have left only 12 participants per part for each decision task (i.e., half of the planned sample size). We therefore tested another 24 participants (12 for each of these task sequences: phoneme decision–size decision, size decision–phoneme decision), resulting in 24 participants per part of the experiment for each task. Here, we report the data of the first task of each participant only. The data from the second task, which could only be interpreted with reservation in light of the apparent attenuation of taboo interference over the course of the experiment and potential additional task-specific carryover effects, are reported in the Supplemental Material (all data and scripts are available at https://osf.io/dm6eu/).

For the latency analyses, we discarded trials with (a) no or an erroneous response, (b) a disfluent naming response or a nonspeech sound produced by the participant triggering the voice key, or (c) a technical problem (e.g., voice key malfunction). Furthermore, we discarded trials with a response latency below 150 ms. Overall, 3.8% of all trials were removed from latency analyses. Trials of type (a) and (b) were entered as errors into error analyses. Analyses were conducted in R (R Core Team, 2017). Averaged response latencies and error rates of each task were submitted to ANOVAs (distractor repetition × distractor condition) and subsequent one-sided t tests for the distractor condition effect. Additionally, we report corresponding Bayes factors (BF10) to quantify the evidence in favor of a taboo interference effect over a null effect. Bayes factors were computed with the BayesFactor package (Morey & Rouder, 2015) using the default prior (r = .707). We based our interpretation of the Bayes factors on the descriptive classification scheme presented in Wagenmakers et al. (2017). Table 2 presents mean response latencies and error rates per task, distractor condition, and distractor repetition.
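
The exclusion rules can be expressed as two predicates. The following Python sketch is illustrative only and is not the analysis script (the reported analyses were run in R). The `Trial` record and its field names are hypothetical; only the criteria themselves and the 150 ms cutoff come from the text.

```python
from dataclasses import dataclass
from typing import Optional

MIN_LATENCY_MS = 150  # latencies below this value were discarded


@dataclass
class Trial:
    latency_ms: Optional[float]       # None when no response was registered
    correct: bool                     # False for an erroneous response
    disfluent_or_noise: bool = False  # criterion (b)
    technical_problem: bool = False   # criterion (c)


def is_error(trial: Trial) -> bool:
    """Criteria (a) and (b): trials entered as errors into the error analyses."""
    return trial.latency_ms is None or not trial.correct or trial.disfluent_or_noise


def keep_for_latency(trial: Trial) -> bool:
    """True if the trial enters the latency analysis."""
    if is_error(trial) or trial.technical_problem:
        return False
    return trial.latency_ms >= MIN_LATENCY_MS
```

Note that, in this sketch, a trial with a technical problem or an implausibly fast response is dropped from the latency analysis without being counted as an error.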

Table 2. Mean response latencies (RT in ms) and error rates (ERR in %) separated by task and distractor condition for Experiment 1

Picture naming

In the latency analysis, the main effect of distractor condition, reflecting taboo interference, F1(1, 23) = 85.76, p < .001, ηp2 = .789, F2(1, 31) = 125.07, p < .001, ηp2 = .801, and the main effect of distractor repetition, reflecting faster responses with distractor repetition, F1(1, 23) = 38.82, p < .001, ηp2 = .628, F2(1, 31) = 118.90, p < .001, ηp2 = .793, were significant. The interaction between these factors was also significant, reflecting reduced taboo interference with distractor repetition, F1(1, 23) = 28.53, p < .001, ηp2 = .554, F2(1, 31) = 36.37, p < .001, ηp2 = .540. The taboo interference effect was significant for the first distractor presentation, t1(23) = 9.11, p < .001, dz = 1.86, t2(31) = 11.61, p < .001, dz = 2.05, and also with distractor repetition, t1(23) = 5.92, p < .001, dz = 1.21, t2(31) = 6.12, p < .001, dz = 1.08. Bayesian t tests suggest extreme evidence for a taboo interference effect for the first and the second distractor presentation, in each case for participants and items BF10 > 1,000. In the error analyses, only the main effect of distractor repetition, reflecting decreasing error rates with repetition, was significant, F1(1, 23) = 6.90, p = .015, ηp2 = .231, F2(1, 31) = 9.30, p = .005, ηp2 = .231.
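
The reported effect sizes are consistent with the standard conversions from the test statistics, namely ηp2 = (F × df_effect) / (F × df_effect + df_error) and, for a paired contrast, dz = t / √n with n = df + 1. The sketch below illustrates these conversions. It is not the authors' script, and the function names are chosen here for illustration.

```python
import math


def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Partial eta squared recovered from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


def cohens_dz(t_value: float, df: int) -> float:
    """Cohen's dz for a paired contrast with n = df + 1 participants or items."""
    return t_value / math.sqrt(df + 1)


# Participant analysis of the distractor condition effect in picture naming.
print(round(partial_eta_squared(85.76, 1, 23), 3))  # 0.789
print(round(cohens_dz(9.11, 23), 2))                # 1.86
```

The same functions reproduce the item-based values, for example ηp2 = .801 for F2(1, 31) = 125.07 and dz = 2.05 for t2(31) = 11.61.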

Phoneme decision

In the latency analysis, the main effect of distractor condition, reflecting taboo interference, F1(1, 23) = 15.71, p = .001, ηp2 = .406, F2(1, 31) = 25.98, p < .001, ηp2 = .456, and the main effect of distractor repetition, reflecting faster responses with distractor repetition, F1(1, 23) = 64.80, p < .001, ηp2 = .738, F2(1, 31) = 107.55, p < .001, ηp2 = .776, were significant. The interaction between these factors was also significant, reflecting a reduced taboo interference effect with distractor repetition, F1(1, 23) = 16.31, p = .001, ηp2 = .415, F2(1, 31) = 9.84, p = .004, ηp2 = .241. The taboo interference effect was significant for the first distractor presentation, t1(23) = 4.96, p < .001, dz = 1.01, t2(31) = 5.16, p < .001, dz = 0.91. With distractor repetition, it was not significant in the participant analysis, t1(23) = 1.51, p = .072, dz = 0.31, t2(31) = 1.87, p = .035, dz = 0.33. Bayesian t tests suggest extreme evidence for a taboo interference effect for the first distractor presentation, for participants and items BF10 > 900, and anecdotal evidence for the effect with distractor repetition, for participants BF10 = 1.1, for items BF10 = 1.7. There were no significant effects in the error analyses.

Size decision

In the latency analysis, the main effect of distractor condition was not significant, F1 < 1, F2(1, 31) = 2.13, p = .154, ηp2 = .064, but there was a significant main effect of distractor repetition, reflecting faster responses with distractor repetition, F1(1, 23) = 8.25, p = .009, ηp2 = .264, F2(1, 31) = 47.47, p < .001, ηp2 = .605. The interaction between these factors was not significant in the participant analysis, F1(1, 23) = 2.90, p = .102, ηp2 = .112, F2(1, 31) = 4.62, p = .040, ηp2 = .130. Bayesian t tests suggest anecdotal evidence against a taboo interference effect (across distractor presentations), for participants BF10 = 0.4, for items BF10 = 0.9. There were no significant effects in the error analyses.
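
The verbal labels attached to the Bayes factors follow a descriptive threshold scheme. The sketch below assumes the commonly used cutoffs of 3, 10, 30, and 100 (with reciprocal values for evidence against the effect). The function name and the handling of values that fall exactly on a cutoff are illustrative assumptions.

```python
def classify_bf10(bf10: float) -> str:
    """Map a Bayes factor BF10 to a descriptive evidence label."""
    if bf10 <= 0:
        raise ValueError("a Bayes factor must be positive")
    if bf10 == 1:
        return "no evidence"
    direction = "for" if bf10 > 1 else "against"
    # Evidence against the effect is judged on the reciprocal scale (BF01).
    strength = bf10 if bf10 > 1 else 1 / bf10
    if strength < 3:
        label = "anecdotal"
    elif strength < 10:
        label = "moderate"
    elif strength < 30:
        label = "strong"
    elif strength < 100:
        label = "very strong"
    else:
        label = "extreme"
    return f"{label} evidence {direction} the effect"
```

For example, `classify_bf10(0.4)` returns "anecdotal evidence against the effect", because the reciprocal 1 / 0.4 = 2.5 stays below the first cutoff.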

In sum, we replicated the taboo interference effect in picture naming from previous studies, demonstrating the sensitivity of our materials. Importantly, taboo interference was also observed with manual responses in the phoneme decision task. However, the taboo interference effect was larger in picture naming than in phoneme decision (ps < .002 for the interaction of task and condition both at the first and the second distractor presentation). In the size decision task, the results were inconclusive (i.e., the null hypothesis was only very slightly favored in Bayesian analyses), suggesting that there either is no taboo interference in this task or that the effect was too small to be detected. One critical factor in this regard might have been the substantially shorter response latencies in size decision (see Table 2). While the overall latencies (fastest in size decision and slowest in picture naming) correspond to our hypothesized involvement of the different processing stages in these three tasks, it might be that the size decision task was just not sensitive enough for taboo interference to emerge. One possibility is that because of the fast size decision responses, the distractor words were not processed long enough to influence these responses. Another possibility is that attentional capture by taboo words did not become effective because target processing was too easy (so that the diversion of processing resources was not detrimental). We tested these two possibilities in Experiment 2.

Experiment 2

Experiment 2 tested two new variants of the size decision task. In the first version, distractor words were presented earlier than the target pictures with a stimulus-onset asynchrony (SOA) of −200 ms. In the second version, distractor words were presented simultaneously with the target pictures (as in Experiment 1), but task difficulty was increased by visually degrading the target pictures. If the absence of taboo interference in size decision in Experiment 1 was caused by insufficient time to process the distractor words before completing target processing, then both manipulations (head start of distractor processing with an SOA of −200 ms vs. prolonged target processing duration with degraded targets) should promote taboo interference. In contrast, if it was caused by insufficient task difficulty, then taboo interference should be promoted only by the degradation manipulation.
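
The logic of the two manipulations can be summarized in a small lookup. This is a sketch of the reasoning in the preceding paragraph, with hypothetical key and function names. It records only whether a manipulation is expected to promote taboo interference under each explanation of the null result in Experiment 1.

```python
# What each manipulation changes, following the paragraph above.
MANIPULATIONS = {
    "negative SOA": {"more_distractor_time": True, "harder_target": False},
    "degraded pictures": {"more_distractor_time": True, "harder_target": True},
}

# Which property each explanation treats as the limiting factor.
LIMITING_FACTOR = {
    "insufficient distractor processing time": "more_distractor_time",
    "insufficient task difficulty": "harder_target",
}


def promotes_interference(explanation: str, manipulation: str) -> bool:
    """True if the manipulation should promote taboo interference under the explanation."""
    return MANIPULATIONS[manipulation][LIMITING_FACTOR[explanation]]
```

Only the negative SOA version separates the two explanations: it should promote interference if distractor processing time was the limiting factor, but not if task difficulty was.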

Method

A new sample of 48 native speakers of German (24 for each task version) participated (36 female, mean age: 20.5 years, range: 18–33 years). One additional participant of the size decision task with a negative SOA was replaced due to exceedingly slow responses (M = 1,031 ms). The material, design, and procedure were identical to Experiment 1, with the following exceptions: All participants performed only the size decision task; there was no second task. For the size decision task with a negative SOA, distractor words were presented 200 ms before picture onset (for a duration of 350 ms, as in Experiment 1). For the size decision task with degraded pictures, picture identification was made more difficult. To this end, parts of the picture outlines were removed by superimposing a masking pattern in background color onto each picture and rotating each picture around its center (see Fig. 1 for an example). The orientation of the masking pattern and the rotation angle were chosen individually for each picture so that identification was still possible (which was verified by three independent participants).

Fig. 1. Example of a stimulus pair in the neutral condition (target picture: bed, distractor word: Stapel [stack]). The left side depicts the nondegraded version of the picture (as used in Experiment 1 and in size decision with negative SOA in Experiment 2). The right side depicts the visually degraded version of the picture (as used in size decision with degraded target pictures in Experiment 2).

Results and discussion

Analyses were conducted as described for Experiment 1 (for descriptive results, see Table 3). Overall, 4.9% of all trials were removed from latency analyses. Size decision latencies and error rates with an SOA of −200 ms were similar to those with an SOA of 0 ms in Experiment 1 (for the neutral condition 523 ms and 2.7% errors vs. 508 ms and 2.6% errors), suggesting comparable task difficulty regardless of SOA. In contrast, size decision latencies and error rates increased substantially with degraded pictures (680 ms and 6.6% errors), suggesting higher task difficulty with visually degraded pictures.

Table 3. Mean response latencies (RT in ms) and error rates (ERR in %) separated by task and distractor condition for Experiment 2

Size decision with negative SOA

The main effect of distractor condition was not significant in the item analysis, F1(1, 23) = 4.53, p = .044, ηp2 = .165, F2(1, 31) = 3.68, p = .064, ηp2 = .106. The main effect of distractor repetition, reflecting faster responses with distractor repetition, was significant, F1(1, 23) = 8.41, p = .008, ηp2 = .268, F2(1, 31) = 138.6, p < .001, ηp2 = .817. The interaction between these factors was not significant, Fs < 1. Bayesian t tests suggest anecdotal evidence for a taboo interference effect (across distractor presentations), for participants BF10 = 2.8, for items BF10 = 1.8. There were no significant effects in the error analyses.

Size decision with degraded target pictures

The main effect of distractor condition, reflecting taboo interference, F1(1, 23) = 8.86, p = .007, ηp2 = .278, F2(1, 31) = 11.45, p = .002, ηp2 = .270, and the main effect of distractor repetition, reflecting faster responses with distractor repetition, F1(1, 23) = 113.30, p < .001, ηp2 = .831, F2(1, 31) = 148.69, p < .001, ηp2 = .827, were significant. The interaction between these factors was not significant in the item analysis, F1(1, 23) = 6.60, p = .017, ηp2 = .223, F2(1, 31) = 2.72, p = .109, ηp2 = .081. Bayesian t tests suggest strong evidence in favor of a taboo interference effect (across distractor presentations), for participants BF10 = 13.4, for items BF10 = 36.11. In the error analyses, only the main effect of distractor repetition, reflecting decreasing error rates with repetition, was significant, F1(1, 23) = 14.63, p = .001, ηp2 = .389, F2(1, 31) = 7.24, p = .011, ηp2 = .189.

In sum, there was again no reliable taboo interference effect in size decision when distractor words were presented earlier than the target pictures, whereas taboo interference emerged when target pictures were visually degraded. This suggests that task difficulty, rather than distractor processing time, is the critical factor driving taboo interference in the nonlexical size decision task.

General discussion

We investigated the cause and locus of interference from taboo distractor words on picture processing by contrasting three tasks that systematically differed with respect to the processing stages involved: picture naming (requiring conceptual processing, lexical processing, and articulation), phoneme decision (requiring conceptual and lexical processing), and natural size decision (requiring conceptual processing only). We replicated the taboo interference effect in picture naming from previous studies (Dhooge & Hartsuiker, 2011; Hansen et al., 2017; White et al., 2017; White et al., 2016), demonstrating the sensitivity of our materials. Importantly, taboo interference was also observed in phoneme decision and in size decision, although in the latter task only when task difficulty was increased by visually degrading the target pictures.

The presence of taboo interference in phoneme decision and size decision (with degraded targets) demonstrates that neither articulatory processing nor lexical processing is necessary for the effect to arise. This is in line with attentional capture by taboo distractors affecting target processing at the conceptual and the lexical processing stage (Hansen et al., 2017; MacKay et al., 2004). In size decision, reliable taboo interference was only found when task difficulty was high (with degraded target pictures). This can be readily explained by the attentional account, because detrimental effects of taboo words drawing attentional resources away from target processing would only be expected to occur in a situation in which the remaining resources are insufficient to swiftly complete target processing. Furthermore, taboo interference was attenuated with stimulus repetition over the course of the experiment. This can also be explained by the attentional capture account, as previous research has shown that attention to highly arousing words diminishes substantially with repetition (Harris & Pashler, 2004).

The presence of a taboo interference effect with manual responses in the decision tasks is problematic for a verbal self-monitoring account that views overt articulation as a necessary precondition for taboo interference to arise. Under this account, the effect is driven by a monitoring mechanism particularly fine-tuned to prevent potentially embarrassing or offensive naming responses (Dhooge & Hartsuiker, 2011). This idea is not easily applied to manual responses, because an erroneous push-button response would not be potentially embarrassing. Therefore, one would need to assume that the timing of executing the manual response is linked to the evaluation of a hypothetical articulatory response by the verbal self-monitoring mechanism. Although one might argue that the phoneme decision task triggers subvocal articulatory responses and therefore engages verbal self-monitoring as picture naming does, there is evidence that internal speech is not fully articulatorily specified (Oppenheim & Dell, 2008) and that phoneme monitoring tasks similar to our task do not involve articulatory preparation (Wheeldon & Levelt, 1995). In addition, an articulatory involvement seems implausible for the size decision task (with degraded targets).

Overall, taboo interference in the manual decision tasks can be well explained by attentional capture but not by verbal self-monitoring. This does not rule out that verbal self-monitoring, at the postlexical stage, might additionally contribute to taboo interference in picture naming. In fact, such an additional contribution might explain why the effect was larger in picture naming than in phoneme decision. This means that there may not be only a single cause or locus of taboo interference in picture naming, but multiple mechanisms operating at different levels contributing to the effect (see also White et al., 2017). However, the present data demonstrate that verbal self-monitoring alone cannot account for the full pattern of taboo interference we observed. Instead, taboo interference appears to arise already prior to articulatory preparation, during lexical processing and—at least with sufficiently high processing demand—also during prelexical processing stages.

Author note

We thank Ismail Ayoub and Tobias Struck for their help in data collection. Our research was supported by DFG-Grant MA 6633/1-1.