In this study, we examined the effects of suggestion on Stroop interference (Stroop, 1935). More specifically, we sought to shed some additional light on the idea, emphasized in past research, that the suggestion to construe words as meaningless symbols de-automatizes word reading in highly suggestible individuals.

The Stroop interference effect is the finding that reaction times (RTs)Footnote 1 to naming the color of the ink in which a word is printed are longer when that word names a color different from its ink color (e.g., when the word blue is displayed in green) than when it is a color-neutral word (e.g., when ship is displayed in green). A surprising finding is that when highly suggestible individuals are instructed to construe words as meaningless symbols, such as characters of an unknown foreign language, the normal-sized Stroop effect that they show when no suggestion has been made is dramatically reduced, and sometimes even eliminated (Raz & Campbell, 2011; Raz, Fan, & Posner, 2005; Raz, Kirsch, Pollard, & Nitkin-Kaner, 2006; Raz, Moreno-Iniguez, Martin, & Zhu, 2007; see also MacLeod & Sheehan, 2003; Raz et al., 2003; Raz, Shapiro, Fan, & Posner, 2002).Footnote 2

Various explanations have been offered for this result. Raz et al. (2005) argued that “suggestion may instigate lowered visual system activation by reducing attention either to specific visual stimuli (e.g., words) or to the actual input stream (e.g., dampening down all visual stimuli)” (p. 9982). Later, Raz et al. (2006) concluded that the reduction of the Stroop effect via suggestion points to the fact that “cognitive processes (such as reading) that have been automatized through practice can be deautomatized and brought under control” (p. 94). In 2007, Raz et al. argued that “the combined imaging findings propose that rather than selective abrogation of orthographic processing, the entire visual input stream was dampened down” (p. 336). Finally, their most recent interpretation favored the idea that “suggestion likely operates through a top-down effect that modulates the processing of input words” (Raz & Campbell, 2011, p. 319).

In sum, the processes underlying those findings remain rather unclear. If low-level processing is the driving mechanism (i.e., if suggestion influences the visual input of highly suggestible individuals), the Stroop task seems unnecessarily sophisticated, and is perhaps not the paradigm best suited for demonstrating such influence. In contrast, the use of the Stroop task remains highly relevant if suggestion influences the processes implicated in word reading per se (i.e., visual word recognition from visual features to meaning).

Following on from the latter idea, it is equally plausible that the suggestion simply reduces nonsemantic task-relevant response competition taking place in this task. Indeed, as long ago as 2001, Neely and Kahan stressed that the standard Stroop interference effect is, to a great extent, the result of such mechanisms (see also Dalrymple-Alford, 1972; Klein, 1964). Consistent with such a claim, the narrowing of attention through coloring and spatially cuing a single letter (vs. all letters) in a word, for instance, reduces the standard Stroop effect (see, e.g., Augustinova & Ferrand, 2007; Besner & Stolz, 1999; T. L. Brown, Joneleit, Robinson, & Brown, 2002; Manwell, Roberts, & Besner, 2004) but does not eliminate, or even reduce, the semantic contribution to the Stroop effect (Augustinova & Ferrand, 2007; Augustinova, Flaudias, & Ferrand, 2010), which is thought to be automaticFootnote 3 (see also, e.g., T. L. Brown, Gore, & Carr, 2002; Neely & Kahan, 2001; Tse & Neely, 2007).

Given that all of the previous findings on the effects of suggestion have been restricted to the standard Stroop task, in the present study we examined the direct influence of suggestion on the semantic contribution to Stroop interference. To this end, one of the semantic factors examined by Neely and Kahan (2001) was manipulated. More specifically, standard incongruent trials (e.g., the word blue displayed in green) were supplemented by the presentation of words that were simply associated with an incongruent color (e.g., sky displayed in green). In this case, the presence of a significant semantically based Stroop effectFootnote 4 (i.e., a positive difference in mean response latencies between color-associated trials and neutral trials) could consequently be interpreted as prima facie evidence that word reading cannot be de-automatized. Indeed, it would be difficult to argue that semantic activation occurs without reading. More importantly, such a finding would support the idea that suggestion simply reduces nonsemantic task-relevant response competition.

To assess such a possibility, highly suggestible individuals completed both standard and semantically based Stroop tasks with or without it being suggested to them that the words should be perceived as meaningless symbols.

Experiments 1 and 2

Method

Participants and design

A group of 43 native French speakers with normal (or corrected-to-normal) vision participated in the study (28 in Exp. 1 and 15 in Exp. 2). In Experiment 1, the participants were randomly assigned to either a suggestion or a no-suggestion condition (14 participants in each condition), since the experiment made use of a 3 (type of stimulus: standard incongruent vs. color-associated incongruent vs. neutral) × 2 (suggestion: with vs. without) mixed design with Type of Stimulus as a within-subjects factor and Suggestion as a between-subjects factor. In Experiment 2, suggestion was manipulated at a within-subjects level, and the order of the conditions was counterbalanced.Footnote 5 Thus, as in Raz et al. (2006), in Experiment 2 we made use of a 4 (type of stimulus: standard incongruent vs. color-associated incongruent vs. congruent vs. neutral) × 2 (suggestion: with vs. without) × 2 (order of suggestion: first vs. second) mixed design with Type of Stimulus and Suggestion as within-subjects factors.

Procedure

Approximately 750 individuals (all undergraduates at Blaise Pascal University, Clermont-Ferrand, France) were tested for levels of suggestibility. All of the participants scoring 10–12 out of a possible 12 on the French version of the Harvard Group Scale of Hypnotic Susceptibility, Form A (Shor & Orne, 1962), and 9–11 out of a possible 11 on the French version of the Stanford Hypnotic Susceptibility Scale, Form C (Weitzenhoffer & Hilgard, 1962), were invited to participate in one of the two experiments in exchange for financial compensation (€20). Less than 6% of the 750 total participants achieved the criterial level of suggestibility needed to participate in the experiment. On their arrival at the laboratory, all of the participants were informed that the purpose of the study was to investigate the effects of suggestion on cognitive performance, and they were told that a suggestion might be administered at certain points during the experiment (see Raz et al., 2006, for details).

After they had agreed to participate, the participants were presented with the Stroop task and completed 24 practice trials whose composition mirrored the experimental trials. It should be remembered that, in Experiment 1, the participants were randomly assigned to either the suggestion or the no-suggestion condition. Consequently, those in the suggestion condition of this experiment heard the following tape-recorded suggestion (translated and adapted from Raz et al., 2006):

Very soon you will play a computer game similar to the one in the task you just completed. When I clap my hands once, meaningless symbols will appear in the middle of the screen instead of words. They will feel like characters in a foreign language that you do not know, and you will not attempt to attribute any meaning to them. This gibberish will be printed in one of six colors: red, blue, green, brown, orange, or yellow. Although you will only be able to concentrate on the color in which the symbols are displayed, you will look straight at the scrambled signs and see all of them clearly. Your job is to name the display color quickly and accurately. You will find that you can play this game easily and effortlessly. When I clap my hands twice, you will regain your normal reading abilities.

All of the participants in Experiment 1 then completed 90 experimental trials. As explained in the above text, the experimental trials in the suggestion condition started with a handclap and ended with a double handclap.

In Experiment 2, we followed the procedure of Raz et al. (2006). Each participant performed the Stroop task twice: once after activation of the suggestion (by means of a handclap) and once without activation of the suggestion. There was a 15-min break between these two sessionsFootnote 6 (each of which consisted of 120 experimental trials), and the session order was counterbalanced. For participants in the suggestion-first condition, the experimental trials were preceded by the handclap. At the end of the first set of trials, participants in the suggestion-first condition heard a double handclap, which was the signal for canceling the suggestion. For participants in the suggestion-second condition, a single handclap preceded the second set of trials, and a double handclap followed at the end. The participants were then thanked and fully debriefed.

Apparatus and stimuli

The participants were seated approximately 60 cm in front of a 17-in. Dell color monitor. Stimulus presentation and data collection were controlled by DMDX software (Forster & Forster, 2003) running on a PC. The participants’ responses were recorded via a Koss 70-dB microphone headset and stored on the computer’s hard disk. Latencies were measured to the nearest millisecond. The stimuli were presented individually in lowercase letters. Each word subtended an average visual angle of 0.9° in height × 3.0° in width. At the beginning of each trial, a fixation point (“+”) appeared in the center of the screen. The participants were instructed to concentrate on the fixation point, which was presented for 500 ms and then replaced by a word printed in color. The stimulus remained on the screen until the participant responded or for a maximum of 2 s. After this response, a new word appeared on the screen, again replacing the fixation point and beginning the next trial. The intertrial interval was 3 s.
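The reported angular sizes can be converted to approximate on-screen dimensions with the standard relation size = 2 × distance × tan(angle / 2). The following Python sketch is only an illustration: the function name is hypothetical, and it treats the 60-cm viewing distance as exact, although the text gives it only as approximate.

```python
import math


def visual_angle_to_size_cm(angle_deg: float, distance_cm: float) -> float:
    """Physical extent (cm) that subtends angle_deg at distance_cm."""
    return 2.0 * distance_cm * math.tan(math.radians(angle_deg) / 2.0)


# Values taken from the text; 60 cm is assumed exact for illustration.
height_cm = visual_angle_to_size_cm(0.9, 60.0)
width_cm = visual_angle_to_size_cm(3.0, 60.0)
print(f"{height_cm:.2f} cm x {width_cm:.2f} cm")
```

Under these assumptions, an average word would have measured roughly 0.94 cm in height by 3.14 cm in width on the screen.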

In Experiment 1, stimuli identical to those employed by Augustinova et al. (2010) were used. These consisted of six neutral words (balcon “balcony,” robe “dress,” pont “bridge,” chien “dog,” train “train,” and studio “studio”), six color-associated words (tomate “tomato,” maïs “corn,” ciel “sky,” salade “salad,” chocolat “chocolate,” and carotte “carrot”), and six incongruent color words (rouge “red,” jaune “yellow,” bleu “blue,” vert “green,” marron “brown,” and orange “orange”). In Experiment 2, these were supplemented by the same six congruent color words (rouge “red,” jaune “yellow,” bleu “blue,” vert “green,” marron “brown,” and orange “orange”). In each condition, all of the stimuli were similar in length (5, 5.8, and 5 letters on average for the color-associated, standard incongruent, and neutral conditions, respectively) and frequency (53, 60, and 65 occurrences per million for the color-associated, standard incongruent, and neutral conditions, respectively) according to Lexique (New, Pallier, Brysbaert, & Ferrand, 2004). In Experiment 1, the color-associated and color words were always presented in incongruent colors (i.e., carotte “carrot” appeared only in red, yellow, green, brown, or blue).

Results

Latencies more than 3 SDs above or below each participant’s mean for each condition (accounting for less than 1.8% of the total data in Exp. 1, and 1.2% in Exp. 2) were excluded from the analyses.
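A trimming rule of this kind could be sketched as follows in Python. This is a minimal illustration, not the authors' analysis script: the function name and the tuple layout are hypothetical, and the use of the sample standard deviation (n − 1 denominator) is an assumption, since the text does not specify which estimator was used.

```python
from collections import defaultdict
from statistics import mean, stdev


def trim_latencies(trials, n_sd=3.0):
    """Keep trials whose RT lies within n_sd SDs of the cell mean.

    trials: iterable of (participant, condition, rt_ms) tuples.
    A "cell" is one participant in one condition.
    """
    trials = list(trials)
    cells = defaultdict(list)
    for participant, condition, rt in trials:
        cells[(participant, condition)].append(rt)

    bounds = {}
    for key, rts in cells.items():
        if len(rts) < 2:
            # An SD cannot be estimated from one trial; keep it.
            bounds[key] = (float("-inf"), float("inf"))
            continue
        m, sd = mean(rts), stdev(rts)  # sample SD: an assumption
        bounds[key] = (m - n_sd * sd, m + n_sd * sd)

    return [
        (p, c, rt)
        for p, c, rt in trials
        if bounds[(p, c)][0] <= rt <= bounds[(p, c)][1]
    ]


# Made-up latencies for illustration only (not data from the study).
example = [("p01", "neutral", 600.0)] * 20 + [("p01", "neutral", 2000.0)]
kept = trim_latencies(example)
```

In this made-up example, the single 2,000-ms latency falls more than 3 SDs above the cell mean and is dropped, whereas the twenty 600-ms latencies are retained.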

Consistent with our reasoning, all standard and semantically based Stroop effects were significant in both experiments (see Table 1 for magnitudes and the corresponding Cohen’s, 1988, d values). To make comparisons between experiments possible, the computed magnitudes of the Stroop effects and the differences in the percentages of errors (see Table 1 for all descriptive statistics) in both experiments were subsequently analyzed in a 2 (type of Stroop effect: standard vs. semantically based)Footnote 7 × 2 (suggestion: with vs. without) repeated measures ANOVA.
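For readers who wish to see how the two effect magnitudes are derived, the short Python sketch below computes each as a difference from the neutral baseline: the standard effect as standard incongruent minus neutral, and the semantically based effect as color-associated minus neutral. The function name, the dictionary keys, and the numerical values are hypothetical and serve only as an illustration; they are not the values in Table 1.

```python
def stroop_effects(mean_rt):
    """Return both Stroop effect magnitudes (ms) relative to neutral trials.

    mean_rt: dict of mean correct RTs (ms) keyed by stimulus type.
    """
    neutral = mean_rt["neutral"]
    return {
        "standard": mean_rt["standard_incongruent"] - neutral,
        "semantic": mean_rt["color_associated"] - neutral,
    }


# Made-up means for illustration only (not taken from Table 1).
illustrative_means = {
    "standard_incongruent": 700.0,
    "color_associated": 640.0,
    "neutral": 620.0,
}
effects = stroop_effects(illustrative_means)
print(effects)
```

Computing both effects against the same neutral baseline is what allows them to be entered together as the two levels of the type-of-Stroop-effect factor.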

Table 1 Mean correct response times (in milliseconds), percentages of errors, and standard deviations (in parentheses) as a function of type of stimulus and suggestion

In Experiment 1, these analyses revealed a significant main effect of type of Stroop effect, F(1, 26) = 25.51, p < .001, ηp² = .50. This effect also contributed to a significant Type of Stroop × Suggestion interaction, F(1, 26) = 7.92, p < .01, ηp² = .23. Similarly, in Experiment 2, the analyses revealed a significant main effect of type of Stroop effect, F(1, 14) = 186.57, p < .0001, ηp² = .93, and a marginally significant main effect of suggestion, F(1, 14) = 3.49, p = .083, ηp² = .20, which in turn contributed to a significant Type of Stroop × Suggestion interaction, F(1, 14) = 7.24, p = .018, ηp² = .34.

Decompositions of these interactions showed that suggestion significantly reduced the magnitude of the standard Stroop effect in both Experiment 1, F(1, 26) = 6.19, p < .05, ηp² = .19, and Experiment 2, F(1, 14) = 8.46, p = .01, ηp² = .38. Yet suggestion had no effect on the semantically based Stroop effect in either Experiment 1, F(1, 26) = 0.08, p = .93, n.s., or Experiment 2, F(1, 14) = 0.36, p = .59, n.s.

The analysis of differences in percentages of errors (see Table 1) revealed a significant main effect of type of Stroop effect in Experiment 2, F(1, 14) = 12.25, p < .01, ηp² = .47, but this effect was only marginally significant in Experiment 1, F(1, 26) = 3.93, p = .058, ηp² = .13.
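The partial eta squared values reported in this section can be cross-checked against the corresponding F ratios using the standard identity: partial eta squared = (F × df_effect) / (F × df_effect + df_error). The Python sketch below applies it to three of the reported tests; the function name is hypothetical.

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Recover partial eta squared from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# F ratios and degrees of freedom as reported in the text.
reported = [
    ("Exp. 1, type of Stroop effect", 25.51, 1, 26),
    ("Exp. 2, type of Stroop effect", 186.57, 1, 14),
    ("Exp. 2, interaction", 7.24, 1, 14),
]
for label, f_value, df1, df2 in reported:
    print(label, round(partial_eta_squared(f_value, df1, df2), 2))
```

Rounded to two decimals, these recomputed values match the reported effect sizes (.50, .93, and .34, respectively).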

Discussion

The present experiments closely replicated the work of Raz et al. (2006), showing that the suggestion to construe words as meaningless symbols such as characters of an unknown foreign language substantially reduces the standard Stroop effect in highly suggestible individuals. There are at least two reasons why this replication is important. First, the experiments conducted by Raz and colleagues employed “nonstandard” manual responses rather than the more conventional color-naming response (e.g., Manwell et al., 2004; but see also M. S. Brown & Besner, 2001). It is therefore possible to extend the previously reported effects of suggestion to the latter version of the Stroop task. Second, a replication conducted in a different laboratory is particularly welcome, given the fact that several researchers have reported difficulties in replicating these effects (see, e.g., Raz & Campbell, 2011, and Raz et al., 2007, for a discussion of these difficulties;Footnote 8 see also Casiglia et al., 2010, for a recent corroboration of the effect).

Most importantly, the results for the critical condition (i.e., the semantically based Stroop task) provide reliable evidence that word reading cannot be de-automatized. Indeed, semantic activation reliably occurred in all conditions. Since it would be difficult to argue that semantic activation in the Stroop task occurs without reading, we are inclined to conclude that the suggestion does not eliminate or prevent word reading in highly suggestible individuals. Interestingly, and in agreement with past research (Augustinova & Ferrand, 2007; Augustinova et al., 2010), the reported results suggest that the magnitude of the semantically based Stroop effect remained about the same in all conditions. Such an observation is compatible with our initial claim that the effects of suggestion are likely to operate at the level of competition between responses.

Relatedly, it should be noted at this point that the inclusion of congruent trials induced participants to pay more attention to the verbal responses associated with the word (because on some trials, namely the congruent trials, doing so would facilitate RTs; see, e.g., Besner, Stolz, & Boutilier, 1997; MacLeod & McDonald, 1995). Consistent with such an idea, in the no-suggestion condition, the standard Stroop interference effect (relative to the neutral condition) increased twofold when the congruent trials were included (i.e., in Exp. 2) relative to Experiment 1 (the difference of 146 ms vs. 70 ms was significant at p < .01). Yet the fact that the semantic Stroop effect was not increased by the inclusion of congruent trials shows that, unlike the standard Stroop effect (i.e., an effect caused by competition between two task-relevant verbal responses), the semantic Stroop effect is instead produced by “semantic competition,” because competition between the color required for the verbal response and the task-irrelevant verbal responses of the words “tomato” and “balcony” would be equated. This analysis reinforces our conclusion that the suggestion reduces response competition. In sum, while suggestion undoubtedly disrupts processing in highly suggestible individuals, this type of top-down modulation seems to influence nonsemantic task-relevant response competition.

Conclusion

Beyond shedding further light on the effect of suggestion, these results improve our understanding of the automaticity of semantic activation, as they add to the growing body of evidence suggesting that semantic activation in the Stroop task is indeed automatic and ballistic, in the sense that it occurs without intent and cannot be prevented (e.g., T. L. Brown, Gore, & Carr, 2002; T. L. Brown, Joneleit, et al., 2002; Heil, Rolke, & Pecchinenda, 2004; Küper & Heil, 2008; Tse & Neely, 2007). It should be remembered that findings from different fields (e.g., psycholinguistics and social cognition) have independently challenged such a view. For instance, Huguet, Galvaing, Monteil, and Dumas (1999; see also Sharma, Booth, Brown, & Huguet, 2010) reported that the presence of a passive, nonevaluative observer reduces standard Stroop interference (as compared to an “alone” condition). More recently, Goldfarb, Aisenberg, and Henik (2011) showed that the standard Stroop effect failed to reach significance in participants primed with the concept of “dyslexia” (as compared to a no-priming condition). However, as impressive as these effects might be, they all suffer from the lack of a clear conceptual definition of de-automatization and from the use of an inappropriate means to measure it (Neely & Kahan, 2001). Thus, given these limitations, such results are clearly inconclusive with regard to the automaticity of semantic activation (hence, the automaticity of word reading). Indeed, it is very likely that the methods reported above generally reduce nonsemantic task-relevant response competition (see, e.g., Augustinova & Ferrand, 2007; Augustinova et al., 2010). Future research should examine this possibility more directly. In the meantime, it seems premature to consider that the automaticity of semantic activation is a myth, and it is clearly risky to do so when only the standard Stroop task is being used.