Introduction

We are often faced with the task of scanning our visual environment for any one of multiple objects, such as when looking for any one of our friends in a crowd. To accomplish this, we must use attention to explore our visual environment, while also searching long-term memory. This combination of visual and long-term memory searches is termed hybrid search, and recent work examining this ability has demonstrated that searching through memory adds little cost to the efficiency of visual search (Schneider & Shiffrin, 1977; Wolfe, 2012). In a series of experiments, Wolfe had participants commit pictures of objects to memory and then perform a visual search task wherein any one of those studied objects could be presented as the target among novel distractors. The number of distractors in the visual search arrays varied from 1 to 16, and the number of objects committed to memory, referred to as memory set size, ranged from 1 to 100. As has commonly been found, visual search response times for recognizing a studied item increased linearly with greater numbers of visual distractors. Surprisingly, however, response times increased logarithmically with the number of targets stored in long-term memory. Put differently, it did not take much longer to visually search for 1 of 16 objects stored in long-term memory than it did to search for 1 of 8.

It remains unclear what type of memory recognition processes support such a rapid and efficient search through memory. Previous research on long-term memory suggests that two neurally distinct processes underlie recognition judgments: familiarity and recollection (Jacoby, 1991; Mandler, 1980). From a phenomenological perspective, familiarity is our awareness that we have encountered an item before, in the absence of any corresponding contextual details of the encoding experience (Yonelinas, 2002). Familiarity is also driven by processing fluency, whereby ease of processing is used as a cue to infer that something has been encountered before (Jacoby & Brooks, 1984; Whittlesea, Jacoby, & Girard, 1990). From a cognitive neuroscience perspective, familiarity judgments are thought to be made through retrieval of unitized, inflexible memory representations (Atkinson & Juola, 1973; Diana, Yonelinas, & Ranganath, 2008; Henke, 2010; Mandler, 1980). Consequently, the constituent elements of the target and their relations are fused, and if a portion of the unitized item is viewed at retrieval, the whole representation becomes reactivated (Graf & Schacter, 1989). This familiarity-based retrieval process is supported by the perirhinal cortex specialized for item-specific memory representations (Diana et al., 2008; Henke, 2010; but see Wais, Wixted, Hopkins, & Squire, 2006).

Recognition can also be supported by a recollection-based process. In contrast to familiarity, recollection traditionally requires cognitive control but affords retrieval of the context in which a target was first encountered (Jacoby, 1991; Yonelinas, 2002). From a cognitive neuroscience perspective, recollection is supported by the hippocampus (Eichenbaum, Otto, & Cohen, 1994), a structure specialized for retrieving memories that have been encoded flexibly. That is, individual elements of the target can be activated in different ways, such as through different modalities. Comparing the two memory processes, familiarity is traditionally thought to be effortless but limited to environments identical to past experience, whereas recollection is effortful but less limited by context.

A familiarity-based recognition process seems like a probable candidate for supporting previous hybrid search results. In the investigation by Wolfe (2012), memorized targets were both perceptually identical to the visual search targets and repeated frequently throughout the search task. Repeated presentation of the target item is likely to increase processing fluency, and the more frequently an item is presented, the greater the familiarity signal and the less neural processing required (Brown & Xiang, 1998; Brozinsky, Yonelinas, Kroll, & Ranganath, 2005). Moreover, because the target item was identical to the item stored in memory, the participant could invoke a retrieval process that relies on inflexible, unitized representations. Thus, it is possible that hybrid search may be efficient only when it relies on a one-to-one match between the item presented and the stored memory representation. In the present study, we investigate whether hybrid search is possible and remains efficient even after familiarity is minimized and flexible retrieval of the target information is encouraged.

Thinking back to the example of spotting one of your many friends in a crowd, it is unlikely that your friend will be a perfect one-to-one match with your stored memory representation(s) of that person; his or her hairstyle, clothes, and so forth can all vary. It would be more adaptive if hybrid search could rely on recollection of flexible representations. Evidence is emerging that recollection may be faster and more efficient than traditionally thought (Moscovitch, 2008). Moreover, searching for a member of a category produces slower but still logarithmic hybrid search slopes (Cunningham & Wolfe, 2012), again pointing to the possible involvement of flexible recollection. Here, we explored the roles of familiarity and recollection in hybrid visual memory search. In particular, we added two levels of control to the task used by Wolfe (2012). In Experiment 1, we sought to reduce the contribution of familiarity to hybrid search by using predominately target-absent trials. We reasoned that this would reduce the accrual of familiarity based on processing fluency. In Experiment 2, we further minimized familiarity and examined the flexibility of the memory representations used to guide hybrid search by requiring participants to search for previously unseen information. That is, participants had to search for target information (pictures) that was perceptually distinct from the information previously studied (words). In both experiments, we continued to observe logarithmic search slopes suggesting that efficient hybrid search likely is not due to the strength of familiarity but can be accomplished using a rapid form of recollection.

Experiment 1

The aim of Experiment 1 was to replicate the logarithmic hybrid search slope using a paradigm consisting of predominately target-absent trials, intended to reduce the opportunities for processing targets. Similar to Wolfe (2012), we examined five memory set sizes (1, 2, 4, 8, and 16) in separate blocks, but with visual set size held constant (12 objects). Each block consisted of two phases (Fig. 1). In the memory training phase, participants committed visual objects to long-term memory. These objects then served as search targets in the visual search phase.

Fig. 1

Schematic representation of procedure for both experiments. Per block, participants studied a set of 1, 2, 4, 8, or 16 target images (Experiment 1) or words (Experiment 2). Once recognition memory performance met criterion, participants completed the visual search phase, searching for any one of the set of targets

Method

Participants

Eighteen undergraduate students from the University of Guelph with an average age of 18.28 years (range, 17–22; 6 males) participated in the experiment for partial course credit. All participants reported having normal or corrected-to-normal vision. All procedures were approved by the University of Guelph Research Ethics Board.

Apparatus and stimuli

The experiment was conducted on a Macintosh computer with a 1,680 × 1,050 resolution LCD monitor. Stimuli consisted of visual objects selected from Brady, Konkle, Alvarez, and Oliva (2008). A subset of 62 images was selected from the 200-item “state pair” database, all unique objects, for use as target memoranda and recognition test lures. Specifically, 31 of these objects were randomly assigned as targets across the five memory set sizes. The other 31 objects were randomly assigned as lures to test recognition accuracy during the memory training phase. Visual search distractors were sampled with replacement from a pool of 2,400 unique-object images (Brady et al., 2008). Any objects belonging to the same category as the 62 memory-training objects were excluded, leaving 2,133.
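The stimulus counts follow from the design: the five memory set sizes (1, 2, 4, 8, and 16) sum to 31 targets, and the remaining 31 images serve as lures. The following is a minimal sketch of such a random assignment, not the experiment code itself; the helper name `assign_stimuli`, the integer image IDs, and the fixed seed are assumptions made for illustration.

```python
import random

MEMORY_SET_SIZES = [1, 2, 4, 8, 16]  # 1 + 2 + 4 + 8 + 16 = 31 targets in total


def assign_stimuli(image_ids, seed=0):
    """Randomly split 62 images into per-set-size targets and a pool of 31 lures.

    Hypothetical helper: the paper states only that assignment was random.
    """
    n_targets = sum(MEMORY_SET_SIZES)
    image_ids = list(image_ids)
    if len(image_ids) != 2 * n_targets:
        raise ValueError("expected 62 images (31 targets + 31 lures)")

    rng = random.Random(seed)
    rng.shuffle(image_ids)

    # Carve consecutive slices of the shuffled list into the five target sets.
    targets_by_size = {}
    start = 0
    for size in MEMORY_SET_SIZES:
        targets_by_size[size] = image_ids[start:start + size]
        start += size

    lures = image_ids[n_targets:]  # the other 31 images
    return targets_by_size, lures


targets_by_size, lures = assign_stimuli(range(62))
```

How the 31 lures were distributed across set sizes is not specified in the text, so the sketch leaves them as a single pool.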

Procedure

Experiment 1 consisted of five blocks, with memory set size randomized across blocks, each consisting of a memory training phase and a visual search phase. During memory training, participants studied a set of targets by viewing each visual object for 3 s, one at a time. Visual objects were 5.1º × 5.1º of visual angle in size and were presented in the center of the screen at fixation. After viewing all targets, participants immediately completed a forced choice recognition memory test wherein a target object was presented alongside a new lure object. One object was presented to the left of fixation, and the other to the right, determined randomly. Participants made a keyboard press to report whether the studied target was on the left or right. Participants had 4 s to make a correct response, and if not, an error sound was presented as feedback and the trial was considered incorrect. As in Wolfe (2012), the memory training phase repeated until the participant completed the forced choice recognition task with 80 % accuracy twice consecutively.
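The training criterion described above (80 % accuracy on two consecutive recognition tests) can be sketched as a simple streak counter. This is an illustrative sketch only: the function name `blocks_to_criterion` is hypothetical, and treating the criterion as "at least 80 %" is an assumption.

```python
def blocks_to_criterion(block_accuracies, criterion=0.80, required_streak=2):
    """Return the number of training blocks run before the criterion is met
    on `required_streak` consecutive blocks, or None if it is never met.

    Assumption: a block passes when accuracy is at least `criterion`.
    """
    streak = 0
    for n_blocks, accuracy in enumerate(block_accuracies, start=1):
        # A failed block resets the run of consecutive passes.
        streak = streak + 1 if accuracy >= criterion else 0
        if streak == required_streak:
            return n_blocks
    return None


# Toy accuracies (not study data): a failure in block 2 restarts the count.
n_blocks = blocks_to_criterion([0.90, 0.50, 0.80, 0.85])
```

Under this rule the minimum possible number of training blocks is two, which matches the modal value reported in the Results.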

Once criterion was met for the memory training phase, the visual search task began. Displays, each containing 12 visual objects, were presented one at a time, and participants were instructed to press the space bar if 1 of the objects was a target from the memory set. To prevent the accrual of familiarity from target repetitions, the visual search task terminated after the first successful target detection, at which point the participant moved to the next block of trials. To ensure an adequate number of trials for each set size, the search task was composed of 118 target-absent trials and only 2 target-present trials (one appearing randomly between trials 9 and 60 and, if the first target was missed, the other between trials 61 and 120). Search items were 2.2º × 2.2º in size, presented at random locations on a 12 × 6 item grid, with positions spaced 3º apart.
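The trial schedule for one search block can be sketched as follows. This is a sketch under stated assumptions rather than the original experiment code: the names `build_schedule` and `run_block` are hypothetical, the trial windows are treated as inclusive, and `detects_target` stands in for the participant's response on a target-present trial.

```python
import random


def build_schedule(seed=0, n_trials=120):
    """Build a 120-trial schedule with 118 target-absent and 2 target-present trials.

    Assumption: the windows 9-60 and 61-120 are inclusive at both ends.
    """
    rng = random.Random(seed)
    first = rng.randint(9, 60)     # first target-present trial
    second = rng.randint(61, 120)  # backup target-present trial
    return ["present" if t in (first, second) else "absent"
            for t in range(1, n_trials + 1)]


def run_block(schedule, detects_target):
    """Run trials in order and stop at the first successful target detection.

    `detects_target(trial_number)` is a hypothetical stand-in for the participant.
    Returns the trial number of the first hit, or None if both targets are missed.
    """
    for trial_number, kind in enumerate(schedule, start=1):
        if kind == "present" and detects_target(trial_number):
            return trial_number  # block terminates; participant moves on
    return None


schedule = build_schedule(seed=1)
last_trial = run_block(schedule, detects_target=lambda trial_number: True)
```

Because the block ends at the first hit, the second target-present trial is reached only when the first target is missed.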

Results and discussion

Long-term memory performance

While criterion for completion of the memory training phase was set at 80 % accuracy twice consecutively, participants’ recognition performance was considerably higher. Across set sizes, accuracy during all recognition trials ranged from a high of 100 % at set size 1 to a low of 99.1 % (SD = 2.6) at set size 8. For all set sizes, the modal number of blocks required to meet criterion was the minimum (i.e., two blocks).

Visual search accuracy

We investigated participants’ engagement in the search task by comparing the hit and false alarm rates. The average hit rate (M = .38, SD = .29) was significantly greater than the false alarm rate (M = .01, SD = .01), t(17) = 5.50, p < .001, suggesting that, despite the high frequency of target-absent trials, participants actively searched for the target rather than always reporting “absent”. The average hit rate was low and may reflect a low prevalence effect—a phenomenon describing how participants more frequently miss, or fail to detect, rarely presented targets (Wolfe, Horowitz, & Kenner, 2005).

In addition, to assess whether a speed–accuracy trade-off could explain the observed logarithmic changes in response time, we examined hit rate, false alarm rate, and the number of trials required to detect a target (average block, M = 56.57 trials, SD = 7.27) across set size using separate one-way ANOVAs. None varied systematically (all three Fs < 1), suggesting that increased search efficiency at high set sizes was not a result of more liberal responses.
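The degrees of freedom reported for the hit versus false alarm comparison (17, with 18 participants) are consistent with a paired-samples t test. A minimal sketch of that statistic is given below, assuming a paired design; the function name `paired_t` is hypothetical, and the example values are toy numbers, not the study's data.

```python
import math


def paired_t(x, y):
    """Paired-samples t statistic and degrees of freedom for two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with at least 2 pairs")
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    mean_diff = sum(diffs) / n
    # Unbiased sample variance of the per-participant differences.
    var_diff = sum((d - mean_diff) ** 2 for d in diffs) / (n - 1)
    t = mean_diff / math.sqrt(var_diff / n)
    return t, n - 1


# Toy per-participant rates (illustrative only).
toy_hits = [0.40, 0.60, 0.20, 0.80]
toy_false_alarms = [0.00, 0.02, 0.01, 0.01]
t_value, df = paired_t(toy_hits, toy_false_alarms)
```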

Memory search times

Figure 2 depicts average correct response times for target-absent trials plotted by memory set size. Our primary comparison of interest was to examine whether the obtained response time function better fits a linear model or a log-linear model. Accordingly, we correlated each participant’s average response times with memory set size (1, 2, 4, 8, and 16) and the log of memory set size (1, 2, 3, 4, and 5). For all statistical tests, correlation coefficients were first variance-stabilized using the Fisher’s Z transform; reported correlation coefficients have been reconverted using the inverse transform. Examining the average correlation across participants, the logarithmic scale yielded a significantly better fit, r = .68, as compared with the average correlation with the linear scale, r = .59, t(17) = 3.18, p = .005. Experiment 1 demonstrates that even when using predominately target-absent trials that minimize recognition judgments based on processing fluency, hybrid visual memory search times remain log-linear.
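The model comparison described above can be sketched in a few lines: correlate each participant's response times with set size and with log set size, average each set of correlations through Fisher's Z, and convert back. This is an illustrative sketch, not the authors' analysis script; the function names and the toy response times are assumptions. Because Pearson's r is unchanged by adding a constant to a predictor, coding the log scale as 0 to 4 (log base 2) or as 1 to 5 gives the same correlations.

```python
import math

SET_SIZES = [1, 2, 4, 8, 16]
LOG_SET_SIZES = [math.log2(m) for m in SET_SIZES]  # 0, 1, 2, 3, 4


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def mean_correlation(rs):
    """Average correlations on the Fisher Z scale (atanh), then back-transform (tanh).

    Note: atanh is undefined for |r| = 1, so perfectly correlated data need special handling.
    """
    zs = [math.atanh(r) for r in rs]
    return math.tanh(sum(zs) / len(zs))


def compare_fits(rts_by_participant):
    """Return (mean linear r, mean log-linear r) across participants."""
    linear = [pearson_r(SET_SIZES, rts) for rts in rts_by_participant]
    log_linear = [pearson_r(LOG_SET_SIZES, rts) for rts in rts_by_participant]
    return mean_correlation(linear), mean_correlation(log_linear)


# Toy response times in ms for two hypothetical participants (not study data).
toy_rts = [
    [500, 620, 690, 800, 870],
    [450, 520, 640, 700, 790],
]
mean_linear_r, mean_log_r = compare_fits(toy_rts)
```

In the reported analysis, the two sets of Z-transformed correlations are then compared with a t test across participants.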

Fig. 2

Mean search response times from both experiments. To complement the regression analyses reported in the main text, set size 16 response times were predicted using a linear extrapolation of set sizes 1–8 (gray markers) and a log-linear extrapolation (white markers). For both experiments, the log-linear predictions were significantly closer to the observed values than were the linear predictions (both ps < .001). Error bars are within-subjects 95% CIs (Cousineau, 2005) and should be compared within, but not across, experiments
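The extrapolation described in the Fig. 2 caption can be sketched as two least-squares fits to set sizes 1 to 8, one on the raw scale and one on the log scale, each projected to set size 16. This is a sketch under assumptions: ordinary least squares, base-2 logs, and the function names are illustrative choices, and the text does not state whether fits were computed per participant or on group means.

```python
import math


def fit_line(x, y):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    slope = (sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
             / sum((a - mean_x) ** 2 for a in x))
    return slope, mean_y - slope * mean_x


def extrapolate_to_16(rts_1_to_8):
    """Predict the set size 16 response time from set sizes 1, 2, 4, and 8.

    Returns (linear prediction, log-linear prediction).
    """
    sizes = [1, 2, 4, 8]
    slope, intercept = fit_line(sizes, rts_1_to_8)
    linear_pred = slope * 16 + intercept

    log_sizes = [math.log2(m) for m in sizes]
    slope, intercept = fit_line(log_sizes, rts_1_to_8)
    log_pred = slope * math.log2(16) + intercept
    return linear_pred, log_pred


# Toy response times in ms that grow by 100 ms per doubling (not study data).
linear_pred, log_pred = extrapolate_to_16([500, 600, 700, 800])
```

For response times that grow by a fixed amount per doubling of set size, the linear extrapolation overshoots at set size 16 while the log-linear extrapolation continues the same step.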

Experiment 2

Building on the results of Experiment 1, Experiment 2 assessed hybrid search performance when participants could not rely on perceptual familiarity and, instead, had to flexibly retrieve information from the study phase. To this end, we had participants memorize written object names and then visually search for images of the memorized words. By asking participants to search for previously unseen information, together with using predominately target-absent trials, we not only reduce the potential sources of target familiarity, but also encourage participants to call upon flexible memory representations to make recognition judgments.

Method

Participants

A new sample of 22 undergraduate students from the University of Guelph participated in Experiment 2. Data from 2 participants were excluded for failing to follow instructions, leaving a sample of 20 participants with an average age of 18.00 years (range, 17–19 years; 6 males). As in Experiment 1, students received partial course credit for participation. All participants reported having normal or corrected-to-normal vision. All procedures were approved by the University of Guelph Research Ethics Board.

Apparatus, stimuli, and procedure

The visual search phases of Experiments 1 and 2 were identical, and the memory-training phases differed in only one respect. In Experiment 1, participants studied images of target objects, whereas in Experiment 2, they studied one- or two-word labels of those objects (e.g., polar bear; see Fig. 1). Words were approximately 2.8º × 0.5º in size, and presented in Myriad Pro font in the center of the screen. The experiments were otherwise identical, including presentation times and stimulus sizes.

Results and discussion

Long-term memory performance

Participants’ average recognition accuracy ranged from a high of 100 % at set size 1 to a low of 99.3 % (SD = 1.3) at set size 16. For all set sizes, the modal number of blocks required to meet criterion was the minimum (i.e., two blocks).

Visual search accuracy

As in Experiment 1, the average hit rate (M = .37, SD = .28) was significantly greater than the average false alarm rate (M = .02, SD = .01), t(19) = 5.71, p < .001, suggesting that participants were engaged in the task. Neither hit rate nor the number of trials required to detect a target (average block, M = 54.48 trials, SD = 4.97) varied with set size, both Fs < 1. There was a numerical increase in false alarm rate from 0.7 % at set size 1 to 3.2 % at set size 16, although the effect of set size did not reach significance, F(4, 76) = 2.71, MSE = 1.70 × 10⁻³, p = .061.

Memory search times

The average response times for target-absent trials at each memory set size are presented in Fig. 2. Relative to Experiment 1, we see an overall increase in response times, F(1, 36) = 15.86, MSE = 4.14 × 10⁷, p < .001. This may reflect the additional process of flexibly retrieving information from long-term memory, such as comparing novel perceptual information against conceptual-level memory representations. This increase in response times is consistent with the results of Cunningham and Wolfe’s (2012) category-defined searches.

As in Experiment 1, our primary comparison of interest was to examine whether the obtained response time function better fits a linear model or a log-linear model. Examining the average correlation between search times and memory set size, or the log of memory set size, the logarithmic scale once again yielded a significantly better fit, r = .77, as compared with the linear scale r = .72, t(19) = 2.39, p = .028. Moreover, when we compared across experiments using a 2 (experiment: 1, 2) × 2 (model type: linear, log-linear) ANOVA, we found a significant main effect of model, F(1, 36) = 15.18, MSE = .08, p < .001, and no interaction, F < 1, suggesting that the increase in model fit from linear to log-linear was equivalent for both experiments. Thus, even when participants were encouraged to rely on flexible retrieval to complete the search task, response times remained efficient and better fit by a log-linear model.

General discussion

In two experiments, we examined the potential roles of familiarity and recollection in hybrid search. To review, while recollection traditionally involves the retrieval of the contextual details of a prior learning episode, familiarity is the awareness that one has previously encountered an item without knowledge of the learning context. Recent models of memory systems further this definition by suggesting that episodic recollection may be recruited for retrieval of flexible, relational representations, while familiarity often supports recognition of unitized items (Diana et al., 2008; Eichenbaum et al., 1994; Henke, 2010).

Experiment 1 demonstrated that response times continued to increase logarithmically with memory set size despite using predominately target-absent trials, which likely reduced the accruing feeling of familiarity (Brown & Xiang, 1998) and judgments based on processing fluency (Jacoby & Brooks, 1984; Whittlesea et al., 1990). In Experiment 2, when participants were encouraged to rely on a more flexible retrieval process to search for previously unseen information, performance remained highly efficient and log-linear. One possible explanation of this result is that participants completed the hybrid search on the basis of conceptual familiarity. That is, reading words during the study phase may have activated conceptual, or even perceptual, representations of the word (Schreuder, d’Arcais, & Glazenborg, 1984), which could later have driven feelings of familiarity during hybrid search (Wang & Yonelinas, 2012). While this is one possibility, there are reasons to suspect that, instead, a recollection-based process was engaged in Experiment 2. First, given that all of our stimuli overlapped with other stimuli both conceptually and perceptually, it is not clear that a familiarity strategy would be adequate for differentiating search targets from distractors in Experiment 2. For example, the target word apple would conceptually prime other fruits used as search distractors (e.g., orange, grapefruit, and mango) and perceptually prime other round objects used as search distractors (e.g., globe, coin, and tire). Second, recent evidence has emerged demonstrating a relation between rapid behavioral responses (reaction times, eye movements) and detail-rich or relational retrieval (Hannula & Ranganath, 2009; Hannula, Ryan, Tranel, & Cohen, 2007; Sheldon & Moscovitch, 2010). Both detail-rich and relational retrieval are thought to rely on hippocampally mediated recollection processes (Eichenbaum et al., 1994; Yonelinas, 2002). Therefore, we suggest that participants remained efficient at hybrid search in Experiment 2 by relying on a rapid, flexible recollection process.

A limitation of the present study is that, due to the structure of the hybrid visual memory search task, we were unable to obtain direct estimates of familiarity or recollection. Instead, we used theory-driven manipulations of these processes and examined their impact on hybrid search performance. An important step for future studies will be to use measures such as neuroimaging, which can estimate the contribution of each memory system.

One way of understanding how we can recollect episodic memory representations so quickly and efficiently is to consider recollection as a two-stage process. According to a recent model (Moscovitch, 2008), recollection may be composed of an initial rapid, unconscious process that is followed by a slower, conscious, effortful process. The initial, rapid recollection may underlie such highly efficient hybrid search.

This notion of rapid recollection aligns with emerging theories about the role of the hippocampus in perception as well as episodic memory formation and retrieval. Episodic recollection is believed to emerge from the flexible encoding of associations into long-term memory, a process requiring a network of brain regions, including the hippocampus (Henke, 2010). Importantly, the hippocampus is also thought to be situated to receive the products of perceptual processing and is intimately involved in the comparison of these current inputs with stored memory representations (Olsen, Moses, Riggs, & Ryan, 2012). Neuroimaging evidence has emerged demonstrating that hippocampal responses have a much faster time course than originally believed, comparable to time windows observed for perceptual processing (Riggs et al., 2009).

The results of the present study, demonstrating interactions among attention, perception, and episodic memory, help bring together these recent theoretical frameworks. Our data suggest that a hippocampal network may support hybrid search by (1) initially encoding flexible episodic memories (Henke, 2010) and then (2) continuously comparing incoming perceptual information with internally stored representations (Olsen et al., 2012), until (3) a match is detected and rapid recollection is engaged (Moscovitch, 2008).