Introduction

The visual world around us contains a rich array of information. To make sense of this complexity, visual features processed in distinct regions of cortex must be integrated into coherent object representations. Kahneman, Treisman, and Gibbs (1992) were among the first to propose a mechanism for dealing with this issue. They suggested that perceptual information is integrated into “object files,” or short-term episodic representations that temporarily link together codes of the relevant features of perceptual objects. This binding proposal has since been expanded to include contextual information and associated actions (Hommel, 1998; Hommel & Colzato, 2004).

Kahneman et al. (1992) highlighted a behavioral consequence of forming an object file using a simple preview letter-naming task. Participants were required to respond to a target letter that matched both the location and identity of a preceding preview letter, or that matched the identity of one preview letter and the location of a different preview letter. Naming responses were faster for the condition in which the target matched both location and identity of a preview letter, implicating some form of memory for the binding between these two features. Kahneman et al. named this effect the object-specific preview benefit. Hommel (1998) expanded on this idea by demonstrating that response representations can also be bound together with visual features in these object representations, called “event files.” Both sets of findings suggest that preview (or repetition) effects do not depend simply on repetition of object features – they also depend on the repetition of feature bindings. When features repeat from preview to target display, but the bindings involving those features change, additional time is needed to update those bindings.

In the present study, we examined whether event file binding can occur for representations that support visual imagery rather than vision itself. The theory of event encoding proposes that event file binding is produced by transient links involving perception and action codes (Hommel, Müsseler, Aschersleben, & Prinz, 2001). If vision and visual imagery have common underlying representations (Ishai et al., 1999; Kosslyn, Thompson, & Ganis, 2006; O’Craven & Kanwisher, 2000), then it seems possible that event file binding could occur for representations that support visual imagery. In line with this idea, several recent studies have shown that visual imagery can influence visual search (Cochrane, Nwabuike, Thomson, & Milliken, 2018a; Cochrane, Zhu, & Milliken, 2018b; Reinhart, McClenahan, & Woodman, 2015), binocular rivalry (Chang, Lewis, & Pearson, 2013), and visual identification (Wantz, Borst, Mast, & Lobmaier, 2015; though see Cochrane, Siddhpuria, & Milliken, 2018c). Together, the theory of event encoding and the empirical work cited above constitute a solid basis for examining whether representations that support visual imagery can contribute to event file binding. The experiments reported here constitute a first attempt at exploring this issue.

We examined this issue by asking whether visual imagery at one point in time can produce partial repetition costs for a visual object presented at a following point in time. We used an event file procedure derived from the work of Hommel (1998). The first stimulus (S1) on a trial required participants to imagine a color and then make an arbitrary left/right keypress response. The second stimulus (S2) on a trial was a colored square that required a two-alternative forced-choice left/right keypress response. If imagined objects involve feature bindings that are similar to those of perceptual objects, then the pattern of event file binding effects ought to be similar for participants who perceive S1 and participants who imagine S1.

Experiment 1

In Experiment 1, repetition effects were compared across two groups, one in which S1 and S2 were both perceptual colors, and another in which S2 was a perceptual color but S1 was an imagined color.

Method

Participants

Thirty-two undergraduates at McMaster University (26 female, mean age = 18.2 years) took part in exchange for course credit. Sixteen participants were randomly assigned to each of the perception and imagery groups. A power analysis revealed that a sample size of eight participants per group would be sufficient to detect an event file binding effect of the size typically reported in the literature (Cohen’s f = .80) with power = .80.

Apparatus and stimuli

Stimuli were presented using PsychoPy v1.82 on a BenQ 24-in LED monitor that was connected to a Dell 300 computer. All visual displays were set on a black background. S1 and S2 were centrally located squares that subtended vertical and horizontal visual angles of 3°. For the perception group, S1 was presented in either red or green. For the imagery group, S1 was a white outline square with a white ‘R’ or ‘G’ inside the square. The cue presented prior to S1 was three white carets facing either left (‘<<<’) or right (‘>>>’). S2 for both the perception and imagery groups was a red or green square.

Procedure

Participants were seated approximately 60 cm from the computer screen. Each trial began with text displayed on-screen inviting participants to press the spacebar to begin the trial. Following a spacebar press, a central fixation cross was displayed for 500 ms. Three carets facing either left or right were then presented centrally for 1,500 ms, followed by the fixation cross for 500 ms, and then onset of S1. In the perception group, S1 was a centrally located red or green square. In the imagery group, S1 was a centrally located white outline square containing either an ‘R’ or ‘G’ (see Fig. 1). Participants in the imagery group were instructed to imagine the square was solid green if it contained a ‘G’ and to imagine it was solid red if it contained an ‘R.’ Participants in both groups responded to S1 with a keypress indicating the caret direction. Participants in the perception group made this response immediately upon onset of S1, whereas participants in the imagery group made this response when they had completed the imagery task as requested. Participants pressed ‘z’ if the carets were facing left and ‘m’ if they were facing right. Following the response to S1, a centrally located fixation cross was displayed for 1,500 ms, followed by S2, a red or green square. Half of the participants responded red by pressing the ‘z’ key and green by pressing the ‘m’ key, while the other half of participants had the opposite response mapping. Following response to S2, participants were prompted on-screen to indicate the vividness of their visual imagery on a 4-point scale: 1 = no imagery, 2 = weak imagery, 3 = moderate imagery, and 4 = strong imagery, almost like perception.

Fig. 1 An example of a partial match trial for the imagery group, together with a depiction of the four trial types tested

The experiment began with 15 practice trials. The first five practice trials required only color identification of S2. The next five practice trials involved the full trial sequence with the exception that responses to the caret direction task were made to a white outline box. For the final five practice trials, participants performed a full trial sequence corresponding to their assigned group. Participants then performed 200 experimental trials.
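The four trial types follow directly from crossing color (repeat/alternate) with response (repeat/alternate) across S1 and S2. The sketch below shows one way the classification could be expressed. The names `Trial`, `build_trial`, and `classify` are hypothetical and are not taken from the experiment code; the key assignments follow the procedure described above.

```python
from dataclasses import dataclass

# Caret direction determines the S1 keypress (as described in the Procedure).
CARET_KEYS = {"left": "z", "right": "m"}


@dataclass(frozen=True)
class Trial:
    s1_color: str     # "red" or "green" (perceived or imagined)
    s1_response: str  # "z" or "m", set by the caret direction
    s2_color: str     # "red" or "green"
    s2_response: str  # "z" or "m", set by the participant's color-key mapping


def build_trial(caret, s1_color, s2_color, color_keys):
    """Assemble a trial; color_keys is the counterbalanced S2 mapping."""
    return Trial(s1_color, CARET_KEYS[caret], s2_color, color_keys[s2_color])


def classify(trial):
    """Return (color relation, response relation, is_partial_match)."""
    color = "repeat" if trial.s1_color == trial.s2_color else "alternate"
    response = "repeat" if trial.s1_response == trial.s2_response else "alternate"
    # A partial match is a trial on which exactly one of the two features repeats.
    return color, response, color != response


# One of the two counterbalanced S2 mappings (hypothetical participant).
mapping = {"red": "z", "green": "m"}
print(classify(build_trial("right", "red", "red", mapping)))
```

With this mapping, a right-facing cue followed by a red S1 and a red S2 repeats the color but alternates the response, so it is classified as a partial match.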

Results

Mean correct response time (RT) to S2 was the primary dependent variable in all experiments. The following were excluded from the computation of mean RTs: (1) trials in which an incorrect response was made to S1 or S2; (2) correct RTs greater than 2,000 ms or less than 200 ms (2.9% of observations); and (3) correct RTs identified as outliers (3.0% of observations) by the non-recursive moving criterion procedure of Van Selst and Jolicoeur (1994). The resulting mean RTs and corresponding error rates were submitted to separate mixed-factor ANOVAs that treated group (imagery/perception) as a between-subjects variable and color (repeat/alternate) and response (repeat/alternate) as within-subject variables. Mean RTs are displayed in Fig. 2, and error rates are displayed in Table 1.
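As an illustration of the first two exclusion steps and of how the four cell means relate to a partial repetition cost, a minimal sketch follows. It assumes a hypothetical trial record with the fields `color`, `response`, `rt_ms`, `s1_correct`, and `s2_correct`, and it deliberately omits the Van Selst and Jolicoeur (1994) outlier step, whose sample-size-dependent criteria are not reproduced here. The summary function `partial_repetition_cost` is an illustrative contrast, not a statistic reported in this paper.

```python
from collections import defaultdict
from statistics import mean

RT_MIN_MS, RT_MAX_MS = 200, 2000  # fixed cutoffs stated in the text


def keep_trial(trial):
    """Keep trials with correct S1 and S2 responses and an in-range RT."""
    return (trial["s1_correct"] and trial["s2_correct"]
            and RT_MIN_MS <= trial["rt_ms"] <= RT_MAX_MS)


def cell_means(trials):
    """Mean RT for each (color, response) cell after the exclusions above."""
    cells = defaultdict(list)
    for trial in trials:
        if keep_trial(trial):
            cells[(trial["color"], trial["response"])].append(trial["rt_ms"])
    return {cell: mean(rts) for cell, rts in cells.items()}


def partial_repetition_cost(means):
    """Mean of the two partial-match cells minus mean of the two other cells."""
    partial = (means[("repeat", "alternate")] + means[("alternate", "repeat")]) / 2
    complete = (means[("repeat", "repeat")] + means[("alternate", "alternate")]) / 2
    return partial - complete
```

A positive value indicates slower responses on partial-match trials, which is the signature of the color by response interaction analyzed below.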

Fig. 2 Mean response times for the perception and imagery groups of Experiment 1 and the no-imagery and imagery groups of Experiment 2. Error bars represent the standard error of the mean corrected to remove between-subjects variability (Morey, 2008)
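The corrected error bars can be sketched as follows, assuming the commonly used normalization approach: each participant's scores are re-centered on the grand mean, the standard error is computed per condition, and the result is scaled by the Morey (2008) factor sqrt(M / (M - 1)), where M is the number of within-subject conditions. The function name and the input layout (one list of condition means per participant) are assumptions made for illustration.

```python
from math import sqrt
from statistics import mean, stdev


def within_subject_se(scores):
    """Corrected SE per condition.

    scores: list of per-participant lists, one mean RT per within-subject
    condition, with conditions in the same order for every participant.
    """
    n_participants = len(scores)
    n_conditions = len(scores[0])
    grand_mean = mean(mean(row) for row in scores)
    # Remove each participant's overall level, keeping the condition differences.
    normalized = [[x - mean(row) + grand_mean for x in row] for row in scores]
    correction = sqrt(n_conditions / (n_conditions - 1))
    return [stdev(column) / sqrt(n_participants) * correction
            for column in zip(*normalized)]
```

Because between-subjects differences in overall speed are removed before the standard error is computed, two participants who differ by 200 ms overall but show the same condition effect contribute no variability to the error bars.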

Table 1 Percentage of errors in Experiments 1 and 2

The analysis of RTs revealed a main effect of group that approached significance (p = .07), with slower RTs for the imagery group. The interaction between color and response was also significant, F(1,30) = 43.5, p < .001, η2p = .59, implying that event file binding effects did indeed occur in our study. This interaction did not differ statistically for the perception and imagery groups (p = .31, for the three-way interaction). As our primary interest was whether event file binding would occur for each of the perception and imagery groups, separate two-way repeated-measures ANOVAs were then conducted for each group.
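For readers who wish to check the reported effect sizes, partial eta squared can be recovered from an F ratio and its degrees of freedom with the standard identity ηp² = (F × df_effect) / (F × df_effect + df_error). The helper below is a small sketch of that identity; the function name is hypothetical.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared computed from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# The color x response interaction reported above: F(1,30) = 43.5
print(round(partial_eta_squared(43.5, 1, 30), 2))  # 0.59
```

The value agrees with the effect size reported for the overall color by response interaction.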

Perception group

In the analysis of RTs, there was a significant interaction between color and response, F(1,15) = 23.9, p < .001, η2p = .61. For the color alternate condition, responses were faster for the response alternate than response repeat condition, F(1,15) = 21.2, p < .001, d = .40. In contrast, for the color repeat condition, responses were faster for the response repeat than response alternate condition, F(1,15) = 19.2, p < .001, d = .32.

In the analysis of error rates, there was also a significant interaction between color and response, F(1,15) = 16.7, p < .001, η2p = .61. For the color alternate condition, the effect of response repetition was not significant (p = .18). For the color repeat condition, fewer errors were made for the response repeat than response alternate condition, F(1,15) = 12.6, p = .001, d = .67.

Imagery group

In the analysis of RTs, there was a significant main effect of color, with faster responses for the color repeat than color alternate condition, F(1,15) = 11.3, p = .004, η2p = .34. More importantly, there was a significant interaction between color and response, F(1,15) = 21.5, p < .001, η2p = .59. For the color alternate condition, responses were faster for the response alternate than response repeat condition, F(1,15) = 24.8, p < .001, d = .54. For the color repeat condition, responses were faster for the response repeat than response alternate condition, F(1,15) = 8.62, p = .01, d = .60 (see Footnote 1).

In the analysis of error rates, the interaction between color and response was significant, F(1,15) = 8.46, p = .01, η2p = .36. For the color alternate condition, there were fewer errors for the response alternate than response repeat condition, F(1,15) = 11.5, p = .004, d = .75. For the color repeat condition, there were fewer errors for the response repeat than response alternate condition, F(1,15) = 12.6, p = .001, d = .67.

Discussion

The results in the perception group replicated those reported in many prior studies (e.g., Hommel, 1998, 2004), with the effect of color repetition strongly dependent on the influence of response repetition. Most importantly, the results in the imagery group revealed a qualitatively similar interaction, with the effect of color repetition again strongly dependent on the influence of response repetition. This interaction is consistent with the proposal that visual imagery and visual perception involve similar event file binding processes. Of lesser importance, the trend toward slower responses in the imagery group than in the perception group may reflect a task-switching cost across the S1 and S2 tasks.

Experiment 2

As a control for Experiment 1, we examined whether the pattern of results for the imagery group hinged on the imagery instruction rather than on the mere presentation of the S1 letters (‘R’ and ‘G’). Two groups were tested in Experiment 2: a replication of the imagery group from Experiment 1, and a control group with identical S1 and S2 stimuli but no imagery instruction.

Method

Participants

Thirty-two undergraduates at McMaster University (27 female, mean age = 18.0 years) took part in exchange for course credit. Sixteen participants were randomly assigned to each of the imagery and no-imagery groups.

Apparatus and stimuli

The apparatus and stimuli were identical to those in the imagery group of Experiment 1.

Procedure

The procedure for the imagery group was identical to Experiment 1. The procedure for the no-imagery group was similar, with the exception that participants were not instructed to imagine a color in response to S1.

Results

Correct RTs greater than 2,000 ms or less than 200 ms (1.4% of observations) and correct RTs identified by the Van Selst and Jolicoeur (1994) outlier method (2.8% of observations) were excluded from analyses. Mean RTs and error rates for the imagery and no-imagery groups were submitted to mixed factor ANOVAs as in Experiment 1. Mean RTs are displayed in Fig. 2, and error rates are displayed in Table 1.

The analysis of RTs revealed a main effect of group that approached significance (p = .051), with higher RTs for the imagery group than the no-imagery group. Importantly, there was a significant three-way interaction between color, response, and group, F(1,30) = 12.38, p = .001, η2p = .29. Separate repeated-measures ANOVAs were then conducted for each group.

Imagery group

The analysis of RTs revealed significant main effects of both color and response, with faster responses to the color repeat than the color alternate condition, F(1,15) = 16.0, p = .001, η2p = .52, and faster responses to the response repeat than the response alternate condition, F(1,15) = 12.1, p = .003, η2p = .45. Critically, there was a significant interaction between color and response, F(1,15) = 19.0, p < .001, η2p = .55. For the color alternate condition, responses were faster for the response alternate than for the response repeat condition, F(1,15) = 5.8, p = .03, d = .27. For the color repeat condition, responses were faster for the response repeat than for the response alternate condition, F(1,15) = 31.6, p < .001, d = .66.

The analysis of error rates also revealed a significant interaction between color and response, F(1,15) = 13.4, p = .002, η2p = .47. For the color alternate condition, there were fewer errors for the response alternate than response repeat condition, F(1,15) = 11.1, p = .004, d = .83. For the color repeat condition, there were fewer errors for the response repeat than for the response alternate condition, F(1,15) = 12.3, p = .001, d = .57.

No-imagery group

There were no significant effects in the analysis of either RTs or error rates (all F < 1).

Discussion

This experiment replicated the results of the imagery group from Experiment 1 and demonstrated a dependence of this pattern of results on the imagery instructions – when no instructions were given to participants for S1, no event file binding effects occurred. At the same time, we must acknowledge that the no-imagery instructions may have failed to produce event binding effects because participants attended only to the onset of S1; that is, a response was made upon onset of S1 without a requirement for any additional processing of S1.

Experiments 3 and 4

In the final two experiments, we examined more closely the types of S1 processing that produce event file binding effects, and in particular whether verbal rather than visual representations could be responsible for event file binding effects observed with visual imagery instructions. In Experiment 3 we evaluated whether event file binding effects would emerge with instructions to verbalize rather than to imagine the S1 color, and in Experiment 4 we evaluated whether visual imagery instructions would produce event file binding effects while verbal coding was occupied by an articulatory suppression task.

Method

Participants

Thirty-two undergraduates at McMaster University (26 female, mean age = 18.1 years) were assigned randomly to either the imagery or verbal group in Experiment 3. Sixteen McMaster University undergraduates (14 female, mean age = 18.4 years) participated in Experiment 4.

Apparatus and stimuli

The apparatus and stimuli were identical to those of the previous experiments.

Procedure

The procedure for the imagery group of Experiment 3 was the same as in previous experiments. The procedure for the verbal group of Experiment 3 required participants to say aloud “red” in response to the letter ‘R’ and “green” in response to the letter ‘G’ prior to responding to the caret direction task (see Footnote 2). In Experiment 4, the procedure was identical to that of the imagery group of previous experiments, with the exception that participants were also required to repeat the phrase “ba, ba, ba” from onset of presentation of the carets to onset of S2.

Results

Experiment 3

Correct RTs greater than 2,000 ms or less than 200 ms (4.7% of observations) and RTs identified by the Van Selst and Jolicoeur (1994) outlier method (2.3% of observations) were excluded from analyses. Mean RTs and error rates for the imagery and verbal groups were submitted to mixed factor ANOVAs as in previous experiments. Mean RTs are displayed in Fig. 3, and error rates are displayed in Table 2.

Fig. 3 Mean response times for the verbal and imagery groups of Experiment 3 and the articulatory suppression group of Experiment 4. Error bars represent the standard error of the mean corrected to remove between-subjects variability (Morey, 2008)

Table 2 Percentage of errors in Experiments 3 and 4

The analysis of RTs revealed a non-significant three-way interaction between color, response, and group (p = .70). Nonetheless, a priori hypotheses led us to conduct separate two-way repeated-measures ANOVAs for each group.

Imagery group

The analysis of RTs revealed a significant main effect of color, with faster responses for the color repeat than color alternate condition, F(1,15) = 9.89, p = .007, η2p = .40. Critically, there was also a significant interaction between color and response, F(1,15) = 13.8, p = .002, η2p = .48. For the color alternate condition, responses were faster for the response alternate than the response repeat condition, F(1,15) = 5.6, p = .03, d = .24. For the color repeat condition, responses were faster for the response repeat than the response alternate condition, F(1,15) = 16.1, p = .001, d = .38.

The analysis of error rates also revealed a significant interaction between color and response, F(1,15) = 14.2, p = .002, η2p = .49. For the color alternate condition, there were fewer errors in the response alternate than the response repeat condition, F(1,15) = 7.65, p = .01, d = .52. For the color repeat condition, there were fewer errors in the response repeat than the response alternate condition, F(1,15) = 10.7, p = .005, d = .63.

Verbal group

In the analysis of RTs, there was a significant interaction between color and response, F(1,15) = 16.2, p = .001, η2p = .52. For the color alternate condition, responses were faster for the response alternate than the response repeat condition, F(1,15) = 5.4, p = .03, d = .29. For the color repeat condition, responses were faster for the response repeat than the response alternate condition, F(1,15) = 11.9, p = .004, d = .32.

In the analysis of error rates, there was a significant interaction between color and response, F(1,15) = 20.6, p < .001, η2p = .58. For the color alternate condition, there were fewer errors in the response alternate than the response repeat condition, F(1,15) = 12.6, p = .003, d = .83. For the color repeat condition, there were fewer errors in the response repeat than the response alternate condition, F(1,15) = 22.7, p < .001, d = 1.10.

Experiment 4

Correct RTs greater than 2,000 ms or less than 200 ms (3.3% of observations) and correct RTs identified by the Van Selst and Jolicoeur (1994) outlier method (2.9% of observations) were excluded from analyses. Mean RTs and error rates were submitted to repeated-measures ANOVAs that treated color (repeat/alternate) and response (repeat/alternate) as factors. Mean RTs are displayed in Fig. 3, and error rates are displayed in Table 2.

The analysis of RTs revealed a significant main effect of color, with faster responses for the color repeat than color alternate condition, F(1,15) = 15.3, p = .001, η2p = .50. There was also a significant interaction between color and response, F(1,15) = 37.4, p < .001, η2p = .71. For the color alternate condition, responses were faster for the response alternate than response repeat condition, F(1,15) = 4.7, p = .05, d = .21. For the color repeat condition, responses were faster for the response repeat than the response alternate condition, F(1,15) = 15.4, p = .001, d = .42.

The analysis of error rates also revealed a significant interaction between color and response, F(1,15) = 5.9, p = .03, η2p = .28. For the color alternate condition, there were fewer errors in the response alternate than the response repeat condition, F(1,15) = 4.98, p = .04, d = .61. For the color repeat condition, there were fewer errors in the response repeat than the response alternate condition, F(1,15) = 6.11, p = .03, d = .60.

Discussion

Event file binding effects were observed for both the imagery and verbal groups in Experiment 3, demonstrating the novel finding that verbal representations can also produce event file binding effects. These results suggest that verbal representations could conceivably underlie event file binding effects observed with visual imagery instructions. However, the results of Experiment 4, in which participants engaged in concurrent visual imagery and articulatory suppression, demonstrate that event file binding effects observed with visual imagery instructions do not depend on verbal coding of S1.

General discussion

The results of the present study demonstrate the novel finding that bindings can be formed between imagined colors and actions, just as they are formed between perceptual colors and actions. Moreover, these bindings contribute to repetition effects between consecutive S1 and S2 items. S2 trials in which the bindings repeat from S1, or S2 trials in which entirely new bindings are formed, are responded to more efficiently than trials in which S1 bindings partially match those required for S2. These partial-match binding effects were present when S1 colors were imagined, verbalized, and imagined during articulatory suppression, but not when participants were given no instructions for S1 other than to perform an already prepared response. Overall, the results suggest that color imagery produces event file bindings that are sufficiently perception-like to influence performance with perceptual objects.

A finding of particular interest was that event file binding effects were observed with both imagery and verbal instructions in Experiment 3. In some ways this finding fits with the literature, and in other ways it does not. Prior studies have shown that verbal representations sometimes have only a modest influence on visual search performance (Cochrane, Nwabuike, Thomson, & Milliken, 2018a; Theeuwes, Reimann, & Mortier, 2006). On the other hand, there are also theoretical accounts that posit that semantic information is represented in the same brain networks that represent perceptual information (Amsel, Urbach, & Kutas, 2014; Barsalou, 2008; Tomasello et al., 2017). These theories would predict that saying a color aloud should automatically induce activation of a color representation that is visual in nature. We suspect that this was indeed the case in our study – half of the participants in the verbal group of Experiment 3 reported imagining the color when they named it. Ultimately, whether event file binding effects for verbal and visual imagery instructions involve separate or overlapping representations awaits further study.