Introduction

Kandinsky (1912, 1947) first proposed that colors and shapes are inherently associated (i.e., circle–blue, triangle–yellow, and square–red), and his color–shape correspondence theory has attracted considerable interest and debate among artists and researchers. Other color–shape associations (CSAs) have since been proposed (Albertazzi et al., 2013; Chen et al., 2015; Dreksler & Spence, 2019; Dumitrescu, 2011; Hanada, 2019; Jacobsen, 2002; Jacobsen & Wolsdorff, 2007; Kharkhurin, 2012; Lupton & Miller, 1993; Walter, 2018). For example, Jacobsen (2002) revisited Kandinsky’s CSAs in German participants, finding red–triangle, blue–square, and yellow–circle associations, while Kandinsky’s assignment was the least preferred. He suggested that world-knowledge associations might account for these observed CSAs (e.g., the red triangle may resemble a traffic signal in Germany and the yellow circle may resemble the sun). Albertazzi et al. (2013) showed that Italian participants systematically associated shapes with colors using an explicit color-matching task (e.g., circle/square–red and triangle–yellow). Chen et al. (2015) replicated Albertazzi et al.’s experiment with Japanese participants, finding a pattern of CSAs consistent with that of the Italian participants (e.g., circle–red, triangle–yellow, and square–blue). They suggested that CSAs could be explained by semantic sensory correspondence, whereby colors and shapes convey some common semantic information (e.g., “warm/cold”). Dreksler and Spence (2019) reported significant but inconsistent CSAs across three online color–shape matching experiments with cross-cultural samples, suggesting that the emotional mediation hypothesis could explain those CSAs (e.g., via liking and arousal appraisals; Palmer et al., 2013).
Hanada (2019) observed that colors were systematically matched with shapes in Japanese participants (e.g., round shapes with red/orange and sharp shapes with yellow), suggesting world-knowledge (co-occurrence of colors and shapes) contributed considerably to those CSAs.

While CSAs have been reported in several studies, research on their cognitive and behavioral consequences remains limited. Studies that revisited Kandinsky’s CSAs using implicit methods failed to show clear evidence for them, except for a weak triangle–yellow link (Holmes & Zanker, 2013; Kharkhurin, 2012; Makin & Wuerger, 2013). Chen, Tanaka, and Watanabe (2015) provided the first empirical evidence that CSAs could be measured using implicit association tasks (IATs). They observed that response times for discriminating colors and shapes were significantly faster when congruent color–shape combinations were mapped onto the same response key than when they were mapped onto different keys. Thus far, most studies have focused on explicit associations, with relatively few examining the influence of those associations on visual processing.

Evidence from neuroanatomy and neurophysiology indicates that visual processing of color and shape is largely separated across independent, functionally specialized cortical areas (Rentzeperis et al., 2014; Seymour et al., 2010; Zeki & Shipp, 1988). This presents a binding problem: How does the brain combine these distributed features to generate a unified percept? One of the most influential theories of feature integration is the Feature Integration Theory (FIT) proposed by Treisman and Gelade (1980). It posits that visual features are initially processed in parallel in separate feature maps, and later bound through spatial attention and/or top-down processes to their common location (Treisman, 1988; Treisman, 1998; Treisman & Gelade, 1980; Treisman & Gormican, 1988). When spatial attention is unavailable, the binding breaks down, leading to a mis-binding of features called illusory conjunction (IC; Treisman & Schmidt, 1982). For example, when two colored letters (a red T and a yellow X) are displayed briefly, people might mis-combine the features, reporting a red X or a yellow T. ICs often occur when spatial attention is diverted, or when stimuli are presented briefly, close together, or in the peripheral visual field, where spatial resolution is decreased (Cohen & Ivry, 1989; Ivry & Prinzmetal, 1991; Prinzmetal et al., 1995; Robertson, 2004; Treisman & Schmidt, 1982).

ICs could occur at various levels of stimulus representation. For example, high-level semantic meaning and stored knowledge of feature conjunctions could constrain ICs (Becker, Neel, & Anderson, 2010; Esterman, Prinzmetal, & Robertson, 2004; Goldfarb & Treisman, 2010; Prinzmetal, 1981; Prinzmetal, Hoffman, & Vest, 1991; Prinzmetal & Millis-Wright, 1984; Rappaport, Humphreys, & Riddoch, 2013; Treisman, 1988; Virzi & Egeth, 1984). Goldfarb and Treisman (2010) reported a binding congruency effect: participants made more ICs when binding semantically incongruent features, and replaced the incongruent features with congruent features, regardless of whether those features were digits and physical sizes or color words and colored fonts. Furthermore, studies have suggested that frequent exposure to a conjunction stimulus can produce strong binding that might occur at an early stage of visual processing and require less attention (Anderson & Humphreys, 2015; Li, VanRullen, Koch, & Perona, 2002; Rappaport, Humphreys, & Riddoch, 2013; Robertson, 2003; VanRullen, 2009). Given that CSAs are learned associations that are strong enough to be detected by both direct and indirect experimental methods (Chen et al., 2015; Chen et al., 2015; Chen et al., 2016), it is possible that CSA congruency influences binding processes.

More specifically, in the phenomenon of synesthesia (i.e., one attribute of a stimulus may inevitably lead to the conscious experience of an additional attribute; Ward, 2013), people often experience abnormally strong (mis-)binding even though one of the features is not present in the stimulus (Cytowic, 1997; Robertson & Sagiv, 2004). Some researchers have suggested that the synesthetic binding of color and shape occurs pre-attentively, while others have argued that it also requires attention (Arend & Henik, 2017; Mattingley et al., 2001; Palmeri et al., 2002; Ramachandran & Hubbard, 2001; Sagiv et al., 2006). Arend and Henik (2017) observed that synesthetic binding follows normal binding processes, in that ICs occurred more frequently when the color of a number was incongruent with its synesthetic color. While non-synesthetic people also appear to establish some synesthesia-like CSAs, whether CSAs influence binding processes is not yet known.

Here, we therefore examined whether binding is influenced by the congruency of known CSAs (i.e., red–circle, yellow–triangle, blue–square) in Japanese participants, by testing whether ICs occurred more frequently for incongruent than for congruent colored-shape pairs. Participants were presented with brief displays of paired incongruent or congruent colored-shapes. They were instructed to report the color of one of the two shapes under diverted attention. If participants reported more ICs in the incongruent condition than in the congruent condition, it might suggest that CSAs are strong enough to penetrate binding processes, and that CSAs may emerge at a relatively early stage and/or influence binding via top-down feedback; if not, it might suggest that CSAs are not available to binding processes and emerge later than binding.

Method

Participants

Eighteen Japanese undergraduates (nine males, M age = 20.2 years, SD = 1.8) from Waseda University participated. Sample size was determined using prior experiments investigating similar effects (Arend et al., 2013; Arend & Henik, 2017). All participants had normal or corrected-to-normal visual acuity and normal color vision, and were naïve to the experimental purpose. This study was approved by the institutional review board (IRB) of Waseda University and conducted in accordance with the Declaration of Helsinki. Written informed consent was obtained from all participants in advance.

Apparatus and stimuli

We adopted the experimental paradigm used by Arend et al. (2013) with modifications. The experiment was programmed in E-Prime 2.0 (Psychology Software Tools, Inc). The stimuli were displayed on a 24-in. LCD monitor (EIZO FG2421, EIZO Corp., Hakusan, Japan) with a 1,920 × 1,080 pixel resolution and a 100-Hz refresh rate. Participants viewed the monitor binocularly from a distance of approximately 60 cm.

Nine colored-shapes and two black letters were used as stimuli. The shapes were combinations of three colors (i.e., red, yellow, blue) and three shapes (i.e., circle, triangle, square). The three colors were measured using a PR-655 color meter (Photo Research, Chatsworth, CA, USA) ten times consecutively and averaged. The color information was then set as follows to ensure similarity in luminance: Yellow: L* = 77.6, a* = -18.5, b* = 58.3; Red: L* = 75.6, a* = 71.4, b* = -7.3; Blue: L* = 76.8, a* = -35.5, b* = -135.8. All shape stimuli were black (8 cd/m²) line drawings presented in an upward orientation. The circle diameter was 2.1° (visual angle), the square 1.8° (height) × 1.8° (width), and the triangle 2.4° (height) × 2.1° (width). The background was white (90 cd/m²) throughout the experiment. The two colored-shapes in each display were horizontally aligned and separated by 2.6°. The shape near the center of the screen maintained a distance of 10° (visual angle) from the left or right side of the screen.
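For readers who wish to reproduce the stimulus sizes, the visual angles above can be converted to physical extents on the screen. The following is a minimal sketch, assuming the approximate 60-cm viewing distance reported above and treating each extent as centered on the line of sight (an approximation for peripheral stimuli). The function name is illustrative and is not taken from the original experiment script.

```python
import math

VIEWING_DISTANCE_CM = 60.0  # approximate viewing distance reported in the text


def visual_angle_to_cm(angle_deg, distance_cm=VIEWING_DISTANCE_CM):
    """Physical extent (cm) that subtends `angle_deg` at `distance_cm`.

    Assumes the extent is centered on the line of sight.
    """
    return 2.0 * distance_cm * math.tan(math.radians(angle_deg) / 2.0)


# Stimulus extents in degrees of visual angle, as reported in the text
stimuli_deg = {
    "circle diameter": 2.1,
    "square side": 1.8,
    "triangle height": 2.4,
    "triangle width": 2.1,
}

for name, deg in stimuli_deg.items():
    print(f"{name}: {deg} deg is about {visual_angle_to_cm(deg):.2f} cm")
```

Converting these extents to pixels would additionally require the physical width of the display area, which is not reported here.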

The two black letters (H and F) served as fixation stimuli, appearing at the center of the screen and subtending ~0.7° × 0.7° from the center. Two hash signals served as visual masks for the colored-shapes (see Fig. 1). These masks were printed in black and were presented about 1.8° × 1.8° from the center.

Fig. 1

Example of the trial sequence. Following a variable interval, two letters appeared in the middle of the screen together with two colored-shapes presented to either the left or the right side of the letters. At the end of the trial, participants were required to give three responses by pressing keys: whether the letters were the same or different, the color of one particular shape (e.g., “What is the color of the circle?”), and their confidence in the color choice.

Procedure

The experiment was carried out in a dimmed laboratory. The experimental procedure was based on Arend et al. (2013). Following a variable interval (600–1,200 ms), a stimulus display comprising two letters (at the center of the screen) and two colored-shapes (to either the left or the right of the letters) was presented for 160 ms (Fig. 1). The letters were vertically presented and could be either the same (i.e., H/H, F/F) or different (i.e., H/F, F/H). Following the offset of the display, the two masks were presented for 250 ms in the same locations as the colored-shapes. The masks were followed by a fixation screen for 500 ms. At this point, participants had to give three responses. First, they reported whether the two letters were the “same” or “different.” Second, they reported the color of one of the two colored-shapes (e.g., “What color is the circle?”; Fig. 1). Finally, they indicated their level of confidence in their choice of color on a scale of 1–6 (1 = least confident, and 6 = most confident). The order of these responses was kept constant throughout the experiment. The response options were indicated by labeled keys on the computer keyboard: letters U and I were labeled “same” and “different,” respectively, and letters J, K, and L were labeled “red,” “yellow,” and “blue,” respectively. The confidence response was given by pressing the numbers 1–6 on the keyboard. These three questions, with some modifications, have often been used in previous studies examining illusory conjunctions (Treisman & Schmidt, 1982). This design allowed the investigation of binding under conditions in which focused attention to a particular colored-shape was prevented. Perceptual binding errors (i.e., ICs, or mis-combinations of colors and shapes) were assessed from the responses to the second question.

The two shape stimuli presented were always different and were randomly chosen from the three shapes (i.e., circle, triangle, square), leading to six combinations. We used one within-subjects factor, color–shape congruency, which concerned the CSAs known to be present in Japanese participants (i.e., circle–red, triangle–yellow, and square–blue). Each pair of shapes was thus either congruent or incongruent with these CSAs, leading to 12 possible combinations. Combined with the order in which the two shapes appeared, there were 24 combinations of shapes in total. Taking the circle and triangle pair as an example, the colored-shape pairs were either congruent (circle–red/triangle–yellow pairs) or incongruent (circle–yellow/triangle–red pairs; the opposite of congruent pairs). The other colored-circle or colored-triangle stimuli (e.g., blue circle or blue triangle) were not presented in the experiment. The colored-shape stimuli were presented on either the left or the right of the screen center. For the letters, we generated three conditions (i.e., same [H/H or F/F], H/F, and F/H). Thus, each participant took part in 288 trials (24 [colored-shape combination] × 2 [color choice for one shape] × 2 [stimulus location] × 3 [letter condition]), broken into six blocks of 48 trials. At the end of each block, participants took a self-determined break. The formal experiment was preceded by 30 practice trials. The entire experiment took approximately 40 min to complete.
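The factorial structure of the trial list can be checked with a short sketch. It takes the 24 colored-shape combinations as given and crosses them with the remaining factors. The factor labels, the shuffling, and the seed are assumptions made for illustration, since the text does not specify how trial order was determined.

```python
import itertools
import random

N_PAIR_COMBINATIONS = 24                # colored-shape combinations, as stated in the text
PROBED_SHAPE = ("first", "second")      # which shape's color is asked about (hypothetical labels)
LOCATION = ("left", "right")            # side of the screen center
LETTER_CONDITION = ("same", "H/F", "F/H")  # "same" covers H/H and F/F


def build_trials(seed=0):
    """Cross all factors, then shuffle (shuffling is an assumption)."""
    trials = [
        {"pair": pair, "probe": probe, "side": side, "letters": letters}
        for pair, probe, side, letters in itertools.product(
            range(N_PAIR_COMBINATIONS), PROBED_SHAPE, LOCATION, LETTER_CONDITION
        )
    ]
    random.Random(seed).shuffle(trials)
    return trials


def split_into_blocks(trials, n_blocks=6):
    """Split the trial list into equally sized consecutive blocks."""
    size = len(trials) // n_blocks
    return [trials[i * size:(i + 1) * size] for i in range(n_blocks)]


trials = build_trials()
blocks = split_into_blocks(trials)
print(len(trials), [len(block) for block in blocks])
```

The product 24 × 2 × 2 × 3 gives the 288 trials reported above, and six equal blocks contain 48 trials each.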

We recorded three possible outcomes for the shape’s color response: hit, IC, and feature error (FE). Taking the stimuli displayed in Fig. 1 as an example, when the question asks for the color of the circle, a hit is correctly reporting the target color (i.e., “blue”), an IC is reporting the color of the distractor (i.e., “red”), and an FE is reporting a color not presented in the display (i.e., “yellow”).
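The scoring rule described above can be sketched as follows. The function and variable names are hypothetical, and the example trials are invented to mirror the Fig. 1 display (a blue circle as target and a red square as distractor); they are not data from the experiment.

```python
from collections import Counter


def classify_color_response(reported, target_color, distractor_color):
    """Score one color report as a hit, an IC, or an FE."""
    if reported == target_color:
        return "hit"   # correct color of the probed shape
    if reported == distractor_color:
        return "IC"    # color migrated from the other shape in the display
    return "FE"        # color that was not present in the display


def outcome_proportions(trials):
    """Proportion of each outcome over (reported, target, distractor) tuples."""
    if not trials:
        raise ValueError("no trials to summarize")
    counts = Counter(classify_color_response(r, t, d) for r, t, d in trials)
    return {label: counts[label] / len(trials) for label in ("hit", "IC", "FE")}


# Invented example: the question asks for the color of the blue circle
example_trials = [
    ("blue", "blue", "red"),
    ("red", "blue", "red"),
    ("yellow", "blue", "red"),
    ("blue", "blue", "red"),
]
print(outcome_proportions(example_trials))
```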

Results

The mean accuracy of the letter task was 93.75%. We excluded the trials in which the letter judgment was incorrect. The mean proportion of hits, ICs, and FEs for each participant with color–shape congruency as a factor (congruent color shape associations are circle–red, triangle–yellow, and square–blue) was calculated.

Mean proportion of ICs, hits, and FEs

Figure 2 shows the congruency effect of CSAs on ICs. A paired-sample t-test revealed a significant difference in ICs between the congruent and incongruent conditions, t(17) = 4.37, Bonferroni-corrected p < .01, Cohen’s d = 1.03, with participants making more ICs in incongruent conditions (18.6%) than in congruent conditions (11.1%). For hits, a paired-sample t-test also revealed a significant difference, t(17) = 4.37, Bonferroni-corrected p < .01, Cohen’s d = 1.03, with participants making significantly more hits on congruent (36.59%) than on incongruent trials (28.75%). FEs did not differ between the congruent (2.62%) and incongruent conditions (2.37%), t(17) = 0.87, Bonferroni-corrected p > .1, d = 0.21. These results indicated that participants tended to make more ICs for incongruent colored-shape pairs than for congruent ones.
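As an illustration of the comparison reported above, the sketch below computes a paired-sample t statistic and a standardized effect size from per-participant proportions. The effect sizes reported here appear consistent with Cohen’s d computed on the difference scores (d = t/√n, e.g., 4.37/√18 ≈ 1.03), but that reading is an inference rather than a stated method. The data in the example are invented and the function name is hypothetical.

```python
import math
import statistics


def paired_t_and_dz(cond_a, cond_b):
    """Paired-sample t statistic, degrees of freedom, and d on difference scores."""
    diffs = [a - b for a, b in zip(cond_a, cond_b)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)        # sample SD of the differences
    t = mean_diff / (sd_diff / math.sqrt(n))
    d_z = mean_diff / sd_diff                # equals t / sqrt(n)
    return t, n - 1, d_z


# Invented IC proportions for six hypothetical participants
incongruent = [0.20, 0.15, 0.25, 0.18, 0.22, 0.17]
congruent = [0.12, 0.10, 0.14, 0.13, 0.11, 0.09]

t, df, d_z = paired_t_and_dz(incongruent, congruent)
print(f"t({df}) = {t:.2f}, d = {d_z:.2f}")
```

The p value would additionally require the t distribution (for example, from a statistics library), which is omitted here to keep the sketch dependency-free.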

Fig. 2

Mean proportion of hits, ICs, and FEs in congruent and incongruent conditions. Error bars represent the standard errors of the mean. IC = illusory conjunction; FE = feature error

ICs in each color–shape pair

Next, we further explored the ICs in the congruent and incongruent conditions by examining each pair of colored-shape combinations separately (i.e., for the circle/triangle pair, congruent: circle–red/triangle–yellow; incongruent: circle–yellow/triangle–red; for the circle/square pair, congruent: circle–red/square–blue; incongruent: circle–blue/square–red; for the triangle/square pair, congruent: triangle–yellow/square–blue; incongruent: triangle–blue/square–yellow; Fig. 3). We conducted a two-way analysis of variance (ANOVA) with colored-shape combination (3) and congruency condition (2; congruent/incongruent) as factors and the mean proportion of ICs as the dependent variable. Only the main effect of congruency condition was significant, F(1, 17) = 27.67, p < .01, ηp² = .62, showing that ICs were more frequent in the incongruent condition than in the congruent condition. Paired-sample t-tests showed that for each colored-shape pair, the proportion of ICs in the incongruent condition was significantly higher than that in the congruent condition (circle/triangle pair, 19.5% vs. 10.7%, t(17) = 3.51, Bonferroni-corrected p < .01, d = 0.83; circle/square pair, 22.6% vs. 12.8%, t(17) = 3.65, Bonferroni-corrected p < .01, d = 0.86; triangle/square pair, 21.0% vs. 13.5%, t(17) = 3.22, Bonferroni-corrected p < .05, d = 0.76). Therefore, ICs occurred more frequently in the incongruent conditions than in the congruent conditions.
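For reference, partial eta squared can be recovered from an F statistic and its degrees of freedom using the standard relation ηp² = (F × df1) / (F × df1 + df2). The sketch below applies this relation to the congruency main effect reported above. The function name is hypothetical, and the sketch shows the general relation rather than the software procedure used for the original analysis.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared from F and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Main effect of congruency condition reported above: F(1, 17) = 27.67
print(round(partial_eta_squared(27.67, 1, 17), 2))
```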

Fig. 3

Mean proportion of ICs in congruent and incongruent conditions for each colored-shape combination. Error bars represent the standard errors of the mean. IC = illusory conjunction

Hits and ICs as a function of confidence ratings

We next examined the confidence ratings for the reported colors. Previous studies have shown that high confidence is associated with ICs (e.g., Prinzmetal et al., 2002). We followed Arend et al.’s (2013) approach to examine binding as a function of confidence ratings. First, we classified confidence ratings as low (answering 1 or 2), medium (answering 3 or 4), and high (answering 5 or 6). As there were far fewer FEs (4.5%) compared to other response types, we excluded them from this analysis.
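The binning of confidence ratings and the exclusion of FEs can be sketched as below. The trial tuples are invented for illustration and the function names are hypothetical.

```python
from collections import Counter


def confidence_level(rating):
    """Map a 1-6 confidence rating onto the three levels used in the analysis."""
    if not 1 <= rating <= 6:
        raise ValueError(f"rating out of range: {rating}")
    if rating <= 2:
        return "low"
    if rating <= 4:
        return "medium"
    return "high"


def tabulate_by_confidence(trials):
    """Count hits and ICs per congruency and confidence level, skipping FEs.

    `trials` holds (congruency, outcome, rating) tuples.
    """
    counts = Counter()
    for congruency, outcome, rating in trials:
        if outcome == "FE":
            continue  # FEs are excluded from this analysis
        counts[(congruency, outcome, confidence_level(rating))] += 1
    return counts


# Invented trials for illustration
example = [
    ("congruent", "hit", 6),
    ("congruent", "IC", 2),
    ("incongruent", "IC", 5),
    ("incongruent", "FE", 3),
    ("incongruent", "hit", 4),
]
for key, n in sorted(tabulate_by_confidence(example).items()):
    print(key, n)
```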

We conducted a two-way ANOVA of confidence level (3) × congruency condition (2) on the mean proportion of hits. We found a significant main effect of congruency condition, F(1, 17) = 14.94, p < .01, ηp² = .47, a marginally significant effect of confidence level, F(2, 17) = 2.74, p = .08, ηp² = .24, and no significant interaction, F(2, 34) = 2.19, p = .13, ηp² = .11. That is, participants were more confident when giving a correct answer in congruent than in incongruent conditions at high (27.0% vs. 20.0%, t(17) = 3.85, Bonferroni-corrected p < .01, d = 0.91) and medium confidence levels (16.9% vs. 13.0%, t(17) = 2.76, Bonferroni-corrected p = .04, d = 0.65). No difference was observed at the low confidence level (12.5% vs. 10.5%, t(17) = 0.95, Bonferroni-corrected p = 1.00, d = 0.22). A similar two-way ANOVA for ICs revealed a significant main effect of congruency condition, F(1, 17) = 27.67, p < .01, ηp² = .62, but not of confidence level, F(2, 17) = 0.19, p = .84, ηp² = .02, and no significant interaction, F(2, 34) = 1.18, p = .32, ηp² = .06. Thus, participants tended to be less confident when reporting errors in congruent than in incongruent conditions at high (9.6% vs. 19.8%, t(17) = 3.35, Bonferroni-corrected p = .01, d = 0.79) and medium confidence levels (12.0% vs. 23.5%, t(17) = 3.40, Bonferroni-corrected p = .01, d = 0.80). No difference was observed at the low confidence level (15.4% vs. 20.0%, t(17) = 1.28, Bonferroni-corrected p = .65, d = 0.30).
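A Bonferroni adjustment multiplies each uncorrected p value by the number of comparisons in the family and caps the result at 1, since a probability cannot exceed 1. The sketch below assumes three comparisons (one per confidence level). The uncorrected p values are invented, because only corrected values are reported here.

```python
def bonferroni(p_values):
    """Bonferroni-adjusted p values, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Invented uncorrected p values for three follow-up comparisons
uncorrected = [0.355, 0.013, 0.0013]
for raw, adjusted in zip(uncorrected, bonferroni(uncorrected)):
    print(f"uncorrected p = {raw}, corrected p = {adjusted:.3f}")
```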

Location effect on ICs

Previous studies have suggested that stimulus location can influence binding errors (Treisman, 1998). Thus, we examined whether ICs differed depending on whether the stimuli were presented on the left or the right of the display. A two-way ANOVA with location (2) and congruency condition (2) as factors revealed significant main effects of location, F(1, 17) = 6.08, p < .05, ηp² = .36, and congruency condition, F(1, 17) = 27.67, p < .01, ηp² = .62, but no significant interaction, F(1, 17) = 1.78, p = .20, ηp² = .01. Paired-sample t-tests revealed that ICs tended to occur more frequently on the left than on the right in incongruent conditions (35.3% vs. 27.7%, t(17) = 2.66, Bonferroni-corrected p = .03, d = 0.63; Fig. 4), and there was no difference in the congruent conditions (20.7% vs. 16.3%, t(17) = 1.73, Bonferroni-corrected p = .20, d = 0.41).

Fig. 4

Location effect on the proportion of ICs in congruent and incongruent conditions. Error bars represent the standard errors of the mean. IC = illusory conjunction

Discussion

We observed that ICs occurred more frequently with incongruent colored-shape displays than with congruent ones. This study is the first to show that CSAs affect the visual processing thought to underlie feature binding.

According to FIT, color and shape features are automatically processed in parallel, and later bound together through spatial attention and/or top-down influences (Treisman & Gelade, 1980). Viewing an incongruent display (e.g., a blue circle and a red square in Fig. 1) might co-activate the stored color–shape mapping (i.e., circle–red and square–blue), modulating the later stage at which color and shape features are bound together. The higher occurrence of ICs in incongruent displays suggests that CSAs may arise at an early processing stage prior to binding, and/or exert a top-down effect on binding processes. This finding is consistent with previous studies showing that stored knowledge constrains ICs (Becker, Neel, & Anderson, 2010; Goldfarb & Treisman, 2010; Hommel & Colzato, 2009; Treisman, 1988; VanRullen, 2009). For example, Arend et al. (2013) observed that when asked to report the color of the larger of two colored numbers, participants made more ICs when the numerical distance between the numbers was 2 than when it was 5. Thus, the long-term representation linking numbers and space interferes with feature-binding processes. Goldfarb and Treisman (2010) also observed semantic congruency effects on feature binding, such that incongruent features tended to induce more ICs and to be replaced with congruent features.

CSAs are considered a form of intramodal correspondence, grounded in the field of cross-modal correspondence research. Cross-modal correspondence refers to the tendency of the brain to match distinct features or dimensions of experience across different sensory modalities (Spence, 2011). There are at least four types of cross-modal correspondence: statistical (via perceptual learning; Parise, Knorre, & Ernst, 2014), structural (via neural coding; Walsh, 2003), semantic (via language development; Melara & Marks, 1990; Parise & Spence, 2013), and emotional mediation (via emotion associations; Palmer et al., 2013). Our previous study suggested that CSAs could be explained by semantic sensory correspondence (e.g., through “warm/cold” dimensions; Albertazzi et al., 2013; Chen et al., 2015). Some researchers have argued that CSAs are emotion mediated, i.e., that the preferences, arousal appraisals, and emotions that individuals hold for colors and shapes construct CSAs (Dreksler & Spence, 2019; Malfatti, 2014). The present results indicate that high-level semantic/affective CSAs could mediate low-level feature-binding processes, in line with previous findings (Becker, Neel, & Anderson, 2010; Esterman, Prinzmetal, & Robertson, 2004; Neel et al., 2012; Prinzmetal, Hoffman, & Vest, 1991; Treisman & Souther, 1986; Virzi & Egeth, 1984). For example, Virzi and Egeth (1984) showed that ICs may occur with high-level codes as well as with perceptual features. Some researchers have reported that ICs follow stereotypic expectations such as gender–emotion associations: male faces tend to “grab” anger from neighboring faces, and female faces tend to “grab” happiness (Becker et al., 2010; Jelonek et al., 2015; Neel et al., 2012).

Importantly, the top-down effect on binding driven by the CSAs could also have a structural or statistical basis. For example, perceptual learning, due to repeated exposure to co-occurrences in the environment, may account for CSAs: the red–circle association may derive from the Japanese flag, or the triangle–yellow association from the image of a pyramid (Jacobsen, 2002; Jacobsen & Wolsdorff, 2007; Hanada, 2019). Previous studies have observed similar associations in various countries (e.g., circle–red and triangle–yellow in Italian participants, triangle–yellow in British participants; Albertazzi et al., 2013; Holmes & Zanker, 2013; Makin & Wuerger, 2013). Furthermore, studies have suggested that past learning experience of color–shape conjunctions (e.g., yellow for a banana shape) could modulate binding, which may occur at an early stage of processing and require only modest attention (Hommel & Colzato, 2009; Olivers, 2011; Rappaport, 2013; Rappaport et al., 2016; Rappaport, Humphreys, & Riddoch, 2013; Wildegger, Riddoch, & Humphreys, 2015). Olivers (2011) observed that when searching for a gray-scale target, participants were faster when the color of the only colored distractor was unrelated to the target than when it was related, suggesting that long-term associations between color and shape automatically affect attentional orienting. Recent studies have suggested a dual system for binding in the brain (Hommel & Colzato, 2009; VanRullen, 2009). One type is automatic, pre-attentive, “hardwired” binding for familiar features: through repeated exposure to specific combinations of features, the brain may establish “hardwired” conjunction representations, speeding up object recognition. The other type is “on-demand” binding, which conjoins unfamiliar, arbitrary, or meaningless features and is mediated by attention.
Consistent with this, the present findings suggest that learned CSAs might be strong enough to exert a top-down influence, and/or emerge at an early stage in visual processing, consequently modulating binding processes. Future studies may help to shed light on this statistical hypothesis by directly investigating cultural differences in the strength of the influence of CSAs on visual processing.

Alternatively, the CSAs may derive from structural correspondence shaped by the neural coding system. Evidence from neuropsychology and brain imaging has shown that learned color–shape conjunctions could be efficiently conjunction-coded in early visual areas of the brain (Lu et al., 2010; Price & Humphreys, 1989; Rappaport, 2013; Seymour et al., 2010; Walsh, Ashbridge, & Cowey, 1998; Zeki & Marini, 1998). For example, Rappaport (2013) showed that early visual regions are sensitive to the congruency of color–shape binding, and that processing congruent color–shape bindings is computationally less costly and more efficient than processing inappropriate pairings. It is possible that some neurons carry a unified coding of color–shape combinations at an early stage, leading to a bias in feature binding (Ramachandran & Hubbard, 2001; Seymour et al., 2009, 2010; Sumner et al., 2008). Future studies may explore the neural mechanism of CSAs using neuroimaging techniques.

After participants chose the target’s color, we asked them to rate their confidence in that choice. The results showed that the proportion of ICs did not vary with the confidence rating, implying that the participants were unaware of their perceptual errors. These results are consistent with previous findings showing that feature integration errors might be subjectively impenetrable; for example, a greater number of binding errors might be related to high confidence levels (Ashby et al., 1996; Simione et al., 2017; Treisman & Schmidt, 1982).

Interestingly, we also found that ICs occurred more frequently in the left than in the right periphery in incongruent conditions. This result might indicate that language plays a role in mediating the occurrence of ICs. Previous studies have suggested that language affects perception more in the right visual field than in the left visual field. When the colored-shape pairs were presented in the right visual field, they were processed more by the left hemisphere, where participants could better use language to code the color–shape mapping on each trial (e.g., verbally rehearsing “red-circle/yellow-triangle”), which may in turn have reduced ICs (Damasio et al., 1996; Gilbert, Regier, Kay, & Ivry, 2006). These results provide further evidence for the semantic-mediation hypothesis of CSAs (which is based on language development; Chen et al., 2016). The result might also reflect the right hemisphere’s poorer spatial perception performance. Previous studies have suggested that conjunction errors to local targets were greater when the stimulus was projected to the right hemisphere (Flevaris, Bentin, & Robertson, 2010; Hübner & Volberg, 2005).

In conclusion, we found that ICs occurred more frequently in incongruent than in congruent colored-shape displays. Thus, CSAs could be strong enough to affect feature binding, perhaps via top-down feedback, and/or CSAs may occur at an earlier stage than binding. The present study provides further evidence for the stored-knowledge effect on feature binding, namely that learned high-level semantic associations can influence low-level feature-binding processes. These findings raise new questions about how cross-modal correspondence affects feature-binding processes, and may shed light on the mechanism of cross-modal correspondence and improve our understanding of the binding problem.