When you open a door, the force with which you push (or pull) the door will decide how fast the door will move. In situations like this, we generally think that we can consciously predict the subsequent stimulus that is determined by our actions. However, if the subsequent stimulus occurs coincidentally after our own action, could this contingency between action and the stimulus be consciously learned? Moreover, could the contingency be learned even if the subsequent stimulus is “consciously unpredictable”? In nature, living organisms often confront many coincidences that are caused by their own actions. Even though these coincidences are not consciously discernible in many cases, living organisms seem to implicitly learn this contingency. For instance, if the early bird catches the worm, and the bird wakes up early enough times, then the bird should anticipate catching a worm in the early morning. This implies that when the coincidental dynamic changes in our environment are predictable by our own response (e.g., moving quickly vs. slowly), the contingency between our response and the following consequences could be learned even if this relationship is not consciously discernible. However, only a few studies have shown that the contingency between a response and the subsequent stimulus can be learned unconsciously. Thus, the main aim of this study was to examine whether a contingency between our own response and a following stimulus can be implicitly learned.

In the situation of the bird catching the worm, associative learning of the contingency between the response and the subsequent stimulus appears to be used unconsciously and could represent one example of procedural memory (Squire, 1987). Procedural memory is a consequence of learning the association between the sequences of stimulus-response pairs. Moreover, procedural memory can be implicitly acquired (Cleeremans, Destrebecqz, & Boyer, 1998; Squire, 1992). Associative learning between a response and the subsequent stimulus is similar to operant conditioning, a type of learning in which an individual’s behavior is modified by its antecedents and consequences (Skinner, 1938). However, procedural memory or operant conditioning is, in most cases, accompanied by immediate consequences after each response. Thus, in the majority of response-stimulus associative learning scenarios, the consequence of the response could be consciously discernible.

Chun and Jiang (1998) demonstrated that a contextual cueing stimulus and a target location can be associatively learned in an implicit manner, and that the contextual information around the target can guide spatial attention to the target. This contextual cueing effect suggests that even when a contextual representation cannot be consciously identified, a more efficient visual search is possible on the basis of implicit contextual cues. That is, implicitly represented contextual information can guide spatial attention in a visual search task (VST; Chun & Jiang, 1998, 1999). Furthermore, Kim, Kim, and Chun (2010) demonstrated that spatial working memory can be used as a cue to predict the target location in a VST. Although the spatial working memory task preceded the VST, the effect was similar to the contextual cueing effect. Thus, the authors inferred that spatial working memory could be linked to a long-term representation, which can be associated with the target location. However, it was not possible to conclude from these findings that the participant’s overt response can be implicitly associated with the target location in the VST. Thus, we examined whether participants’ overt response could be used as an implicit cue to predict the target feature and whether the cue could guide top-down attention to the target.

In everyday life, we exhibit various responses and experience various contingent stimuli. However, in response-stimulus associative learning experiments, it is difficult to display a wide variety of stimuli and to induce a wide variety of responses from the participants. Previous response-stimulus associative learning studies (Elsner & Hommel, 2004; Hommel, 1996; Stöcker, Sebald, & Hoffmann, 2003; Ziessler, 1994) have focused only on the association between a few fixed responses and the results that follow them. Moreover, in these studies, it is difficult to manipulate the response-stimulus contingency without the participant being aware of the manipulation. Ziessler (1994) used a serial search-and-reaction task (SSRT) to argue that the participant’s response and the subsequent stimulus can be associated. However, in Ziessler’s experiment, each individual stimulus corresponded to a distinctive response key, and each response in turn produced a particular subsequent stimulus. This leaves open the possibility that both a response-stimulus association and a stimulus-stimulus association were present. Therefore, further research is needed to conclude that response-stimulus associative learning actually occurs.

In our study’s experimental group, we designed two tasks in which the participant’s response in the first task affected the target feature in the second task in an implicit manner. We measured the response time in the first task. Based on the relative speed of the participant’s response (i.e., very fast, fast, slow, or very slow) in the first task, the target location in the second task was determined. From a participant’s perspective, the first and the second tasks appeared to be independent. Therefore, it would have been difficult for them to consciously notice the contingency. The main purposes of this study were to determine whether the response speed–contingent target feature can be implicitly learned and whether the contingency can be used as a predictive cue for an efficient search in the second task. We hypothesized that if the participant’s response speed could be used as a predictive cue of the upcoming target feature, then the response time in the second task would become faster as the blocks progressed. We called this search facilitation effect the “cueing-by-response” (CBR) effect. Additionally, we expected CBR to guide top-down attention to the response speed–contingent target feature, as in the contextual cueing effect.

A random control experiment was conducted to trace the practice effect with regard to the location of the target in the second task. In addition, a yoked control experiment was performed to exclude the possibility that regularities in the sequence of target locations, which might have been introduced by the design used for the experimental group, produced the effect. In the yoked control group, the sequence of the target locations in the second task was identical to that of the experimental group.

Method

Participants

Forty-nine students (experimental group: 17, random control group: 16, yoked control group: 16) at Yonsei University participated for course credits or monetary rewards after providing informed consent. All participants reported normal or corrected-to-normal visual acuity and normal color perception. None of the participants were aware of the purpose of this experiment. One participant from the experimental group was excluded from the analysis because the participant’s sensitivity, d’ = Z(hit rate) – Z(false alarm rate) (Macmillan & Creelman, 2005), was 3 standard deviations below that of the other participants, leaving 16 participants in the experimental group.
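As an illustration of the sensitivity index used for this exclusion, the following is a minimal Python sketch of d’ = Z(hit rate) – Z(false alarm rate). The function name `d_prime` and the example rates are hypothetical and are not taken from the study. How hit or false alarm rates of exactly 0 or 1 were handled is not stated in the text, so the sketch simply rejects them rather than assuming a correction.

```python
from statistics import NormalDist


def d_prime(hit_rate: float, false_alarm_rate: float) -> float:
    """Sensitivity index d' = Z(hit rate) - Z(false alarm rate).

    Z is the inverse of the standard normal cumulative distribution.
    Rates of exactly 0 or 1 have no finite Z value; no correction is
    assumed here, so such inputs raise an error (an assumption of this sketch).
    """
    for rate in (hit_rate, false_alarm_rate):
        if not 0.0 < rate < 1.0:
            raise ValueError("rates must lie strictly between 0 and 1")
    z = NormalDist().inv_cdf  # standard normal quantile function
    return z(hit_rate) - z(false_alarm_rate)


# Hypothetical example: 90% hits on target O, 10% false alarms on catch trials.
example = d_prime(0.9, 0.1)
```

A participant who responds to O and X indiscriminately would have equal hit and false alarm rates and therefore a d’ of zero.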

Stimuli and Apparatus

The experiment was executed on an IBM computer with a 24-in. 1920 × 1080-pixel LED monitor using the Psychophysics Toolbox in MATLAB (Brainard, 1997; Pelli, 1997). On a gray background, a black fixation point was presented in the center of the screen. In the simple detection task (SDT), the stimulus had a visual angle of 1.9° in width and length and was presented in the center of the screen. In the VST, the stimulus had a visual angle of 1° in width and length and was 12° distant from the center. In the SDT, the target stimulus was O, and the catch trial stimulus was X. In the VST, the target stimulus was T, and the distractor stimulus was L. The target T was rotated ±90°. The distractor L was randomly rotated between 0° and 270° in 90° increments.

Procedure

Participants performed two tasks consecutively within a single trial. The first task was an SDT. Participants were instructed to press the f key quickly and accurately when the target O appeared and to withhold their response when the catch trial stimulus X appeared. After the fixation point was presented (600 ms), the stimulus in the SDT appeared at the center of the monitor (see Fig. 1). The ratio of target (O) to catch trial (X) stimuli was 10:1. The VST followed the SDT only when the target O was presented in the SDT. The stimulus in the SDT was presented for 1,500 ms or until response. When the participant responded within 1,500 ms, the VST was immediately presented. If the participant did not respond within 1,500 ms, or if the participant responded to the catch trial stimulus, auditory feedback was provided. After the SDT, the fixation point was presented (600 ms), and the VST was subsequently presented. The stimuli in the VST were presented until the participant responded. The participant responded with the j key when the target T was rotated −90°, and with the k key when it was rotated +90°. If the response was incorrect, a negative feedback alarm was provided. The SDT and VST were bound together as one trial, for a total of 1,320 trials. There were 24 blocks in total, and each block contained 55 trials. Each trial was self-paced using the space bar.

Fig. 1

Example of the experimental procedure. In the simple detection task, the participants responded to target O. In the visual search task, the participants responded to the orientation of target T. The visual search task was not administered when the catch trial stimulus X was presented in the simple detection task

Forty practice trials were administered prior to the main experiment, with the target and distractor locations randomized. The locations were also randomized in the first block of the main experiment. From the second block on, however, the contingency between the response in the SDT and the location of target T was applied. After the first block, the distribution of response times to target O in the SDT was divided into four equal-sized response speed ranges (quartiles). Response times to target O that deviated from the mean by more than 3 standard deviations were excluded from the data analysis. The four response speed ranges were classified as very fast, fast, slow, and very slow. In the second block, the location of target T in the VST on each trial was determined by the response time to target O in the SDT on that trial: each of the four response speed ranges from the first block corresponded one-to-one with one of the four possible locations where target T could appear. For example, if a response time to target O fell in the very fast range defined from the first block, then target T was presented in the first quadrant. The one-to-one correspondence between the four response speed ranges and the four quadrants in which target T could be presented was counterbalanced across participants. After the second block, the four response speed ranges were obtained from the response times to target O in the immediately preceding block, and this calculation was renewed before the start of each subsequent block. To demonstrate that the contingency had an effect, in the 24th block target T was presented randomly at one of the three locations other than the quadrant assigned by the contingency to the response speed range on that trial. For example, if the participant responded to target O within the very fast range, target T would have appeared in the first quadrant under the contingency; in the 24th block, however, it was presented at a random location other than the first quadrant. With this design, we tested whether the response time to target T increased in the 24th block as a result of the contingency disappearing.
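The response speed–contingent assignment described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the original MATLAB implementation: the function names, the quadrant labels 1 to 4, the use of `statistics.quantiles` with its default method for the quartile cut points, the application of the 3 standard deviation trimming before the cut points are computed, and the treatment of a response time exactly on a cut point (assigned to the slower range) are all choices made for this sketch.

```python
import random
import statistics
from bisect import bisect_right


def trim_outliers(rts, n_sd=3.0):
    """Drop response times more than n_sd standard deviations from the mean."""
    mean = statistics.fmean(rts)
    sd = statistics.stdev(rts)
    return [rt for rt in rts if abs(rt - mean) <= n_sd * sd]


def quartile_cutoffs(prev_block_rts):
    """Three cut points splitting the previous block's RTs into four ranges."""
    return statistics.quantiles(trim_outliers(prev_block_rts), n=4)


def speed_range(rt, cutoffs):
    """Return 0 (very fast), 1 (fast), 2 (slow), or 3 (very slow)."""
    return bisect_right(cutoffs, rt)


def target_quadrant(rt, cutoffs, mapping, contingent=True, rng=random):
    """Choose the quadrant of target T from the RT to target O.

    mapping[i] is the quadrant assigned to speed range i (counterbalanced
    per participant). With contingent=False, as in the final block, the
    quadrant is drawn at random from the three non-assigned quadrants.
    """
    assigned = mapping[speed_range(rt, cutoffs)]
    if contingent:
        return assigned
    return rng.choice([q for q in mapping if q != assigned])


# Hypothetical data: 50 RTs (ms) from the previous block, one participant's mapping.
previous_block = list(range(300, 500, 4))
cutoffs = quartile_cutoffs(previous_block)
mapping = (3, 1, 4, 2)
quadrant = target_quadrant(310, cutoffs, mapping)  # very fast response
```

Because the cut points come from the preceding block, a participant whose responses speed up or slow down from one block to the next will not fall into the four ranges equally often in the current block.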

In the random control experiment, target T was presented in a random location regardless of the response speed to target O in all blocks. Compared with the random control group, participants in the experimental group experienced a significantly higher number of trials in which the target was presented in the same location as the target in the previous trial. Thus, the yoked control experiment was performed to demonstrate that the result of the main experiment was not caused by target T appearing more frequently in the same location. From the second block on, the sequence of target (T) locations for each participant in the yoked control group was identical to the sequence recorded for one of the sixteen participants in the experimental group. In the yoked control experiment, there was no contingency in any of the blocks, and the location of target T was random in the first block. After the experiment, we asked all participants in the experimental and control groups whether they had noticed the contingency.

Results

For analysis, we combined two consecutive blocks as one epoch, with the exception of the first and the last blocks, where no contingency existed between the participant’s response and the location of target T. To exclude response time outliers, we used the median of the response time to target T as a representative value for each epoch. The results in the experimental, random control, and yoked control groups are shown in Fig. 2.
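The epoch summary can be sketched as below. This is a hedged illustration: the function name `epoch_medians` and the dictionary layout are hypothetical, and pooling the trials of the two blocks before taking the median is an assumption, since the text does not state whether the median was computed over pooled trials or in some other way.

```python
from statistics import median


def epoch_medians(rts_by_block, first_block=2, last_block=23):
    """Median RT to target T per two-block epoch.

    rts_by_block maps a block number (1-24) to that block's list of RTs.
    Blocks 2-3, 4-5, ..., 22-23 are paired into 11 epochs; blocks 1 and 24
    (no contingency) are left out of the pairing.
    Assumption: trials from both blocks are pooled before the median is taken.
    """
    medians = {}
    for start in range(first_block, last_block, 2):
        pooled = rts_by_block[start] + rts_by_block[start + 1]
        medians[(start, start + 1)] = median(pooled)
    return medians
```

Using the median rather than the mean keeps a handful of very slow searches from dominating an epoch's representative value.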

Fig. 2

The response times in the experimental, random control, and yoked control groups. The x-axis shows the blocks. From the 2nd to the 23rd block, two consecutive blocks were combined into one epoch. The y-axis shows the response time for target T in the visual search task. Error bars indicate the within-participant standard error (Cousineau, 2005).
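For readers unfamiliar with the within-participant standard error cited in the caption, the following is a minimal Python sketch of the normalization proposed by Cousineau (2005): each participant's scores are recentered on the grand mean before the standard error is computed per condition. The function name and the data layout are hypothetical, and later bias corrections to this method are deliberately not applied here.

```python
from math import sqrt
from statistics import fmean, stdev


def within_participant_se(scores):
    """Cousineau (2005) within-participant standard errors.

    scores[i][j] is the value for participant i in condition j (e.g., epoch).
    Each score is normalized as x - participant_mean + grand_mean, which
    removes between-participant differences in overall speed; the standard
    error is then computed per condition on the normalized scores.
    """
    grand_mean = fmean(v for row in scores for v in row)
    normalized = [[v - fmean(row) + grand_mean for v in row] for row in scores]
    n = len(scores)
    return [stdev(column) / sqrt(n) for column in zip(*normalized)]
```

If every participant shows the same difference between conditions and differs only in overall speed, the normalized scores are identical across participants and the error bars shrink to zero.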

Based on the response times in the 22nd–23rd blocks and the 24th block from each group, we performed a group (experimental group, random control group, yoked control group) × epoch (22nd–23rd blocks, 24th block) factorial analysis. In the 3 × 2 mixed analysis of variance (ANOVA), the main effect of group was not significant. However, the main effect of epoch was significant, F(1, 45) = 13.67, p < .05, \( \eta_p^2 \) = .23. The 22nd–23rd blocks (M = 918.47, SD = 165.30) exhibited a faster response time to target T than the 24th block (M = 980.31, SD = 165.06). The 3 × 2 two-way interaction between group and epoch was also significant, F(2, 45) = 5.98, p < .05, \( \eta_p^2 \) = .21. To decompose this two-way interaction, we compared the three possible 2 × 2 two-way interactions. First, the two-way interaction between group (experimental group, random control group) and epoch (22nd–23rd blocks, 24th block) was significant, F(1, 30) = 10.92, p < .05, \( \eta_p^2 \) = .27. Second, the two-way interaction between group (experimental group, yoked control group) and epoch (22nd–23rd blocks, 24th block) was significant, F(1, 30) = 4.28, p < .05, \( \eta_p^2 \) = .13. Finally, the two-way interaction between group (random control group, yoked control group) and epoch (22nd–23rd blocks, 24th block) was not significant. In the experimental group, there was a significant difference between the 22nd–23rd blocks and the 24th block, F(1, 15) = 18.87, p < .05, \( \eta_p^2 \) = .56. The 22nd–23rd blocks (M = 822.42, SD = 177.05) exhibited a faster response time to target T than the 24th block (M = 959.60, SD = 180.00). In the random control group, there was no significant difference between the 22nd–23rd blocks and the 24th block. In the yoked control group, there was a marginal difference between the 22nd–23rd blocks and the 24th block, F(1, 15) = 3.77, p = .07, \( \eta_p^2 \) = .20. The 22nd–23rd blocks (M = 964.36, SD = 177.36) exhibited a faster response time to target T than the 24th block (M = 1,016.09, SD = 175.82).
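The partial eta squared values reported with each F test follow from the standard identity relating the effect size to the F statistic and its degrees of freedom. The short Python sketch below applies that identity to two of the tests above; the function name is hypothetical, and the inputs are the rounded F values as printed, so results can differ from the reported effect sizes in the last digit.

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Partial eta squared recovered from an F statistic.

    eta_p^2 = (F * df_effect) / (F * df_effect + df_error)
    """
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Main effect of epoch, F(1, 45) = 13.67
epoch_effect = partial_eta_squared(13.67, 1, 45)
# Epoch effect within the experimental group, F(1, 15) = 18.87
experimental_effect = partial_eta_squared(18.87, 1, 15)
```

Rounded to two decimals, these give .23 and .56, matching the values reported for those two tests.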

The response time to target O was not significantly different among the groups. In a postexperiment interview, we confirmed that participants did not notice the contingency between their response time to target O and the location of target T. None of the participants recognized the relationship between the SDT and the VST, or the CBR.

Discussion

Here, we investigated whether the response speed can be used as a cue to predict a target feature in the VST. Moreover, we scrutinized whether CBR can be implicitly used and can guide spatial attention to the target feature.

Compared with the two control groups, the experimental group’s response time to target T shortened as the blocks progressed. However, when the contingency was removed in the experimental group’s last block, the difference between the groups disappeared. In the experimental group, the response time to target T was significantly slower in the 24th block, with no contingency, than in the 22nd–23rd blocks, with contingency. These findings could, in principle, be the result of the high frequency with which target T was presented sequentially at the same location in the experimental group rather than of implicit learning of the CBR. The results of the yoked control group, however, indicate that the results in the experimental group were not caused by the high frequency with which target T was presented in an identical location. In the yoked control group, the response time to target T was not significantly different between the 22nd–23rd blocks and the 24th block, even though the presented locations of target T were identical to those of the experimental group. These results imply that individuals can exploit CBR to yield a better performance in the VST. The participants implicitly used CBR even though they were not consciously aware of it. We inferred that the participants guided spatial top-down attention to the presented location of target T based on their own relative response speed. This inference is consistent with the finding that the association between a target and a context can guide spatial top-down attention to the target location (Chun & Jiang, 1998).

In the yoked control group, the response time to target T exhibited no difference between the 22nd–23rd blocks and the 24th block, even though the 22nd–23rd blocks had substantially higher frequencies of target T repeatedly being presented in the same location. However, because there was a tendency toward statistical significance, this result is partially consistent with the studies in which a high presenting frequency in the same location guides spatial attention to the location (Jiang, Swallow, & Rosenbaum, 2013; Jiang, Swallow, Rosenbaum, & Herzig, 2013).

In contrast to the previous studies (Elsner & Hommel, 2004; Hommel, 1996; Ziessler, 1994), we exploited the participants’ “response speed” to establish the contingency between a response and the subsequent stimulus. In the previous studies, each stimulus corresponded to a distinctive response key, and the responses respectively induced a consequential stimulus. In the present study, however, we constrained the target as a single stimulus (O) and measured the response speed to the target. Therefore, there could not be a stimulus-dependent response. Moreover, by using the response speed, we allowed the participants to react relatively freely to the target stimulus.

In the present study, the response speed–contingent target feature was irrelevant to the tasks that the participants overtly performed. For example, the contingency that the location of target T was determined by the response speed to target O was irrelevant to the VST, in which the participants chose the orientation of target T. Thus, we reasoned that CBR could be implicitly learned even if it is irrelevant to the participants’ tasks. This implication is consistent with previous studies (Elsner & Hommel, 2004; Hommel, 1996).

The results in the present study could be considered an example of operant conditioning (Skinner, 1938). In operant conditioning, however, a reinforcer or a punisher is provided as an output of an individual’s reaction to a certain stimulus. In contrast, in the present study, the response-contingent target was a neutral stimulus (not a reinforcer or a punisher). Additionally, the participants could not be reinforced to a certain response speed. Thus, the participants’ responses were not conditioned to a particular speed. Therefore, it was possible to learn the association between a response and the subsequent stimulus even though the response-contingent stimulus did not condition a certain response speed range.

One possible limitation of our design was that, compared to the random control group, the experimental group had a higher number of trials in which target T was presented at the same location as the target in the previous trial. The number of such repetition trials was significantly higher in the experimental group than in the random control group in the 2nd–23rd blocks, Fs(1, 30) > 8.81, ps < .05, \( \eta_p^2 \)s > .23. This repetition might have been caused by the method used in the experimental group. For the experimental group, the response times from the SDT were split into four quartiles, each containing 25% of the trials. Thus, the two middle quartiles might span a much narrower response time range than the outer quartiles. Consequently, if participants responded differently in a later block than in the previous block, their responses would have been more likely to fall in the very fast or the very slow quartile calculated from the previous block. For example, because of a possible practice effect, the participants’ responses in the SDT might have been skewed toward the very fast quartile as the blocks progressed. Another possibility is that participants’ own rhythmical pace in making sequential motor responses caused the repetition of the location of target T. However, the repetition is not a major concern because the yoked control group had the same repetitions, but the RT advantage was not observed.
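The repetition measure discussed here can be made concrete with a short Python sketch that counts how often target T appears at the same location on consecutive trials. The function name and the example sequence are hypothetical; the study reports only the group comparison, not the underlying computation.

```python
def repetition_rate(locations):
    """Proportion of trials on which target T repeats the previous trial's location.

    locations is the ordered sequence of target quadrants within a block.
    The first trial has no predecessor, so it is not counted.
    """
    if len(locations) < 2:
        raise ValueError("at least two trials are needed")
    repeats = sum(prev == cur for prev, cur in zip(locations, locations[1:]))
    return repeats / (len(locations) - 1)


# Hypothetical block fragment: quadrants of target T on six consecutive trials.
example_rate = repetition_rate([1, 1, 2, 2, 2, 3])  # 3 repeats out of 5 transitions
```

With four equally likely locations chosen independently on every trial, as in the random control group, the expected rate is 0.25, so rates clearly above that value indicate the kind of sequential regularity that the yoked control group was designed to match.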

Ziessler (1998) asserted that response-stimulus associative learning is a key factor in implicit serial learning. Ziessler designed a task in which a target determined the location of a subsequent target. Each stimulus corresponded to a different response, and three experimental groups were systematically varied in the relationship between the participants’ responses and the location of the subsequent target. On the basis of the results, Ziessler argued that response-stimulus associative learning could be more important than stimulus-stimulus associative learning in the implicit serial learning paradigm. In Ziessler’s study, however, the consequence of the participants’ responses became a new target in the next trial. Thus, the explicit relevance between two adjacent trials was very obvious. Moreover, the participants were informed in advance about the four possible target locations in the subsequent trial based on the target location in the present trial. These factors represent important differences from the present study, which used two tasks to conceal the contingency between the SDT and VST. Additionally, in Ziessler’s study, the participants consciously noticed their different responses, because they were instructed to respond discretely to each different target. In the present study, however, it was difficult for participants to be explicitly aware of their response speed because it was measured within a very short time range (~1,500 ms). Thus, the differences in their own response speeds were thought to be implicitly perceived. Consequently, the present study excluded the possible alternative explanations in Ziessler’s study.

We used participants’ response speeds to confirm that CBR can be implicitly learned, even when two successive tasks are explicitly unrelated. Moreover, we demonstrated that the participants could implicitly use CBR to guide their top-down attention to the predicted target feature.