The human face is a special stimulus of critical importance in survival and social interaction. One notable feature of face perception is that although the human face is a complex visual stimulus, containing a substantial amount of information, the processing of facial information seems to be rapid and automatic (Farah, 1996; Lavie, Ro, & Russell, 2003; Young et al. 1986). This automatic nature of face perception is particularly evident when people are faced with familiar faces; the identification of familiar faces can proceed in parallel with other capacity-limited processes, such as response selection and working memory consolidation (Jackson & Raymond, 2006; Jung, Ruthruff, & Gaspelin, 2013).

However, even though it is well established that the processing of familiar faces can be done simultaneously with other tasks that consume capacity at the central stage of information processing, it remains unclear whether the perceptual processing of a familiar face at the early stage is subject to any capacity limitation. Indeed, a previous study showed that detecting a familiar face among nonfamiliar faces might depend on a capacity-limited process (Tong & Nakayama, 1999); when participants searched for their own face among others’ nonfamiliar faces, the search reaction time increased as the number of the distractors (nonfamiliar faces) increased.

Although this set-size effect strongly suggests that detecting one’s own face is subject to a capacity limit, an alternative explanation is possible; the observed set-size effect might be due to statistical decision error even when the search process is capacity-unlimited (Huang & Pashler, 2005; Palmer, 1994). Specifically, assuming that the sensory signals of search stimuli are noisy, the probability of confusing one of the nontarget items with the target at least once should increase as the number of items in the display increases. In the presence of this statistical decision noise, search performance would suffer, especially when many items are presented in the display, even though all of those stimuli are processed with unlimited capacity. Given this, it remains unclear whether detecting familiar faces depends on a capacity-limited or -unlimited process.
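The decision-noise account can be illustrated with a minimal sketch. It assumes, for illustration only, that each nontarget produces an independent, unit-variance Gaussian noise sample and that the observer responds "present" whenever any sample exceeds a fixed criterion (a max rule). The function name and the criterion value are hypothetical and are not taken from the cited studies.

```python
from statistics import NormalDist


def false_alarm_rate(set_size: int, criterion: float) -> float:
    """Probability that at least one of `set_size` independent noise
    samples exceeds the criterion on a target-absent trial."""
    p_below = NormalDist().cdf(criterion)  # one nontarget stays below criterion
    return 1.0 - p_below ** set_size       # at least one nontarget exceeds it


# Per-item processing is identical at every set size (unlimited capacity),
# yet the false alarm rate still grows with the number of items.
for n in (1, 4, 8):
    print(n, round(false_alarm_rate(n, criterion=2.0), 3))
```

Under these assumptions, the false alarm rate rises with set size even though no item is processed less well, which is why a set-size effect alone cannot establish a capacity limit.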

In the present study, we tested whether the process of detecting an extremely familiar face—one’s own face—among others’ nonfamiliar faces is subject to a capacity limit. To test this without the confound of statistical decision noise, we adopted a well-established paradigm to reveal the capacity-limited nature of perception: the simultaneous–sequential paradigm (Eriksen & Spencer, 1969; Huang & Pashler, 2005; Scharff, Palmer, & Moore, 2011; Shiffrin & Gardner, 1972). In this paradigm, participants search for a specific target among distractors. Importantly, the target and the distractors are presented in two different ways. In the simultaneous-presentation condition, all of the search stimuli are presented briefly at the same time. In the sequential-presentation condition, only half of the stimuli are presented briefly first, and then, after a certain interval, the other half are presented. If the target search is a capacity-limited process, the search performance should be better in the sequential-presentation condition than in the simultaneous-presentation condition, because the limited capacity that can be allocated to each item at a given time would be twice as great with sequential as with simultaneous presentation. Importantly, the two conditions include the same numbers of search items, equating the amount of statistical decision noise. Hence, any performance difference between the two conditions should be due to the capacity limit.
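The logic of the paradigm can be sketched numerically. The sketch below is illustrative, not the model fitted in any of the cited studies: it assumes a max-rule observer with Gaussian noise and, for the capacity-limited case, a fixed-capacity "sample-size" assumption under which per-item sensitivity grows with the square root of the capacity available to each item. All names and parameter values (`base_d`, `criterion`) are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

PHI = NormalDist().cdf


def max_rule_rates(d_prime, n_items, criterion):
    """Hit and false alarm rates for a yes/no max-rule observer."""
    fa = 1.0 - PHI(criterion) ** n_items
    hit = 1.0 - PHI(criterion - d_prime) * PHI(criterion) ** (n_items - 1)
    return hit, fa


def predicted_rates(model, presentation, base_d=1.5, n_items=8, criterion=2.0):
    """model: 'unlimited' or 'fixed'; presentation: 'simultaneous' or 'sequential'."""
    d = base_d
    if model == "fixed" and presentation == "sequential":
        # Half as many items compete in each frame, so the capacity per item
        # doubles and per-item sensitivity grows by sqrt(2) (assumption).
        d = base_d * sqrt(2)
    # The decision is always made over all n_items, so decision noise is equal.
    return max_rule_rates(d, n_items, criterion)


for model in ("unlimited", "fixed"):
    for presentation in ("simultaneous", "sequential"):
        print(model, presentation, predicted_rates(model, presentation))
```

Because the number of items entering the decision is the same in both conditions, the unlimited-capacity observer performs identically in the two conditions, whereas the fixed-capacity observer gains from sequential presentation.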

In the present study, we had a group of participants search for their own face among others’ faces. To examine whether this self-face processing is subject to a capacity limit, we compared the search performance when all of the search stimuli were presented simultaneously and when two different subsets of search stimuli were presented sequentially. If the process of detecting a self-face suffers from a capacity limit, the search performance should be better in the sequential-presentation than in the simultaneous-presentation condition. By contrast, if one detects her or his own face in a capacity-unlimited manner, we should observe no performance difference across the different presentation conditions (Huang & Pashler, 2005; Scharff et al. 2011). We also had another group of participants search for someone else’s unfamiliar face. This condition was included to compare the extents to which the processes of detecting familiar and unfamiliar faces suffer from a capacity limit.

Method

Participants

Two groups of 15 adults (eight males and seven females in each group, 18–25 years old) with normal or corrected-to-normal vision participated for course credit. On the basis of the previous studies using a similar approach (Huang & Pashler, 2005; Scharff et al. 2011), we reasoned that the sample size of 15 should be sufficient to detect significant effects. The Institutional Review Board of Chungbuk National University approved the experimental protocol, and informed consent was obtained from each participant.

Stimuli and apparatus

The experiment was programmed and run using PsychoPy (Peirce, 2007). The search stimuli, face images (2° × 2° of visual angle), were presented on a 21-in. LCD monitor with a gray background. The participants were required to fixate on a small white (0.3° × 0.3° of visual angle) dot presented at the center of the screen throughout the experiment. The task stimuli consisted of a set of eight face images (four male), which served as the distractors, and another set of 15 face images (eight males), each of which served as the target for each individual participant. For a group of 15 participants, the target was the participant’s own face, which was photographed immediately prior to the experimental session. Importantly, to mimic natural viewing conditions, the facial images were minimally altered; no hair cropping was done. Instead, we included participants without distinctive visual features, such as a unique hairstyle, facial color, or facial hair, which would differentiate their facial images from the other faces (see also Tong & Nakayama, 1999, for a similar control).

This same set of 15 images also served as the target faces for the other 15 participants. Hence, half of the participants searched for their own face, whereas the other half searched for someone else’s. In either case, a male participant searched for a male target face, and a female participant searched for a female target face.

The face stimuli were positioned on an imaginary circle (6° from the center of the screen). Specifically, each stimulus was centered at one of eight locations, divided into four quadrants (top right, top left, bottom left, and bottom right; see Fig. 1). Within a quadrant, the distance between the two stimulus locations was approximately 1.6°, whereas across quadrants, the distance between the two nearby locations was 6°. This was done to control for stimulus density and potential crowding effects between the stimuli across different presentation conditions (see the following section).

Fig. 1

Trial design of the experiment. (a) Set-size 4 condition. (b) Set-size 8 condition. (c) Set-size 8–sequential condition. The search display duration was adjusted for each individual participant, to yield about 70%–80% accuracy in the set-size 8 condition.

Design and procedure

One group of participants performed the task of searching for their own face among others’ unfamiliar faces (self-face group). Prior to the experimental session, each individual participant’s facial photo was taken. This self-face image was the search target for that participant. Another group of participants (the other-face group) searched for one of the same set of face images that served as the targets in the self-face group. Half of the trials contained the target face, and no target was presented on the other half of the trials. Participants indicated the presence/absence of the target via buttonpresses. They were instructed to respond as accurately as possible without time pressure.

A trial started with a 500-ms fixation presentation, followed by the search display presentation. The duration of the search display was adjusted individually prior to the main experimental session, to yield about 70%–80% target accuracy when eight faces were presented simultaneously (set-size 8 condition; see below). To prevent eye movements, the longest search duration was set to 200 ms (Huang & Pashler, 2005). The resulting stimulus durations ranged from 20 to 200 ms. After the offset of the entire search display, participants were prompted to indicate whether the target was present or absent.

There were three presentation conditions: the set-size 4, set-size 8, and set-size 8–sequential conditions. In the set-size 4 and 8 conditions, four and eight faces, respectively, were presented simultaneously. In the set-size 8–sequential condition, two different sets of stimuli, each of which included four faces, were presented sequentially across two frames with a 500-ms interframe interval (Huang & Pashler, 2005). Notably, in the set-size 4 condition, two pairs of stimuli were always presented in two opposing quadrants on the screen (top right and bottom left or top left and bottom right; see Fig. 1a). Similarly, in the set-size 8–sequential condition, the stimuli were always distributed in two opposing quadrants for a single frame. This was done to prevent the manipulations of set size and presentation type from affecting the stimulus density and any potential crowding effects between the stimuli.
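The quadrant constraint described above can be sketched as a small helper. This is an illustration rather than the authors' PsychoPy code: the function and constant names are hypothetical, each quadrant is taken to hold two faces, and the second frame of the sequential condition is assumed to use the remaining pair of opposing quadrants.

```python
import random

QUADRANTS = ("top_right", "top_left", "bottom_left", "bottom_right")
# The two possible pairs of opposing quadrants.
DIAGONALS = (("top_right", "bottom_left"), ("top_left", "bottom_right"))


def frames_for(condition, rng=random):
    """Return a list of frames; each frame is the tuple of quadrants shown.

    Every quadrant contains two faces, so a two-quadrant frame shows four
    faces and a four-quadrant frame shows eight.
    """
    if condition == "set_size_8":
        return [QUADRANTS]                    # all eight faces in one frame
    first, second = rng.sample(DIAGONALS, 2)  # random order of the two diagonals
    if condition == "set_size_4":
        return [first]                        # four faces on one diagonal
    if condition == "set_size_8_sequential":
        return [first, second]                # assumed: other diagonal in frame 2
    raise ValueError(f"unknown condition: {condition}")


print(frames_for("set_size_8_sequential"))
```

With this layout, every frame that contains four faces has the same local arrangement, so the set-size 4 and sequential frames differ from each other only in the number of frames shown.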

Taken together, the experimental design consisted of a 2 × 3 mixed design with Target Face (self vs. other) as a between-subjects factor and Presentation Type (set size 4 vs. 8 vs. 8–sequential) as a within-subjects factor.

Results

The results are shown in Fig. 2. To analyze the data, we calculated the perceptual sensitivity (d prime) to the target face. Note that we found the same pattern of results when the data were analyzed using response accuracy. A two-way mixed analysis of variance (ANOVA) with Target Face (self vs. other) as a between-subjects factor and Presentation Type (set size 4 vs. 8 vs. 8–sequential) as a within-subjects factor applied to the perceptual sensitivity data revealed significant main effects of target face, F(1, 28) = 4.64, p < .05, and presentation type, F(2, 56) = 20.93, p < .001, with no interaction, F < 1.
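For reference, d prime is conventionally computed as the difference between the z-transformed hit rate and false alarm rate. The sketch below shows one way to do this from trial counts. The log-linear correction for rates of 0 or 1 is an assumption added for illustration; the text does not state how extreme rates were handled.

```python
from statistics import NormalDist


def d_prime(hits, misses, false_alarms, correct_rejections):
    """Sensitivity index d' = z(hit rate) - z(false alarm rate)."""
    z = NormalDist().inv_cdf
    # Log-linear correction (assumption): add 0.5 to each count so that
    # rates of exactly 0 or 1 do not produce infinite z-scores.
    hit_rate = (hits + 0.5) / (hits + misses + 1)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1)
    return z(hit_rate) - z(fa_rate)


# Hypothetical counts for one participant in one condition.
print(d_prime(hits=40, misses=10, false_alarms=10, correct_rejections=40))
```

Separating sensitivity from response bias in this way is what allows the hit rate and false alarm rate patterns to be summarized by a single measure per condition.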

Fig. 2

Results of the experiment. (a) Target detection performance, measured by d prime. (b) Hit rate data. (c) False alarm rate data. Error bars represent the standard errors of the means

For the self-face group, the search performance in the set-size 4 condition was significantly better than that in the set-size 8 condition, t(14) = 3.77, p < .005, consistent with a previous study also showing a significant set-size effect of self-face search (Tong & Nakayama, 1999). Although this result indicates that the current target face was not a pop-out stimulus, it does not inform us as to whether processing the target face was subject to a capacity limit. Importantly, we found a significant difference in perceptual sensitivity between the set-size 8 and set-size 8–sequential conditions, t(14) = 3.62, p < .01. This result shows that detecting a familiar face, such as one’s own, depends on a capacity-limited process. Notably, the search performance in the set-size 8–sequential condition was equivalent to that in the set-size 4 condition, p > .53. This result indicates that the observed set-size effect was primarily due to a capacity limitation of self-face perception, rather than to decisional noise. In line with this main finding, the hit rate was significantly lower in the set-size 8 condition than in the set-size 4 and set-size 8–sequential conditions, whereas the false alarm rate was significantly higher in the former condition than in the latter two, ps < .05.
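The pairwise comparisons reported here are within-participant contrasts with 14 degrees of freedom, which is what a paired t test on 15 participants yields. A minimal sketch of the statistic follows; the function name is hypothetical, and the values in the usage example are made-up placeholders, not data from the experiment.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(a, b):
    """Paired-samples t statistic and degrees of freedom for two
    equal-length lists of per-participant scores."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / sqrt(n))  # mean difference / its standard error
    return t, n - 1


# Placeholder d' values for three hypothetical participants (not real data).
sequential = [2.0, 3.0, 4.0]
simultaneous = [1.0, 1.0, 1.0]
print(paired_t(sequential, simultaneous))
```

With 15 participants per group, the same computation gives the 14 degrees of freedom shown in the reported tests.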

For the other-face group, similar patterns of results were found. Specifically, a significant difference in perceptual sensitivity emerged between the set-size 8 and set-size 8–sequential conditions, t(14) = 3.99, p < .005. The difference between the set-size 4 and set-size 8 conditions was also significant, t(14) = 4.22, p < .005. The false alarm rate and hit rate data also showed patterns similar to those of the self-face group, ps < .05.

It is important to emphasize that since the ANOVA revealed no significant interaction between target face and presentation type, the magnitude of the performance benefit from sequential presentation did not differ depending on whether the target face was a self- or an other-face. That is, the processes of identifying self- and other-faces suffered from capacity limits to similar extents. However, the main effect of target face was significant, p < .05 (see above), indicating that the overall search performance was better for the self-face group than for the other-face group. This difference in sensitivity was derived from significantly higher hit rates for the self-face in all presentation conditions, ps < .05. We found no difference in false alarm rates across the target face conditions, ps > .21. Given that the mean search display durations for the self- and other-face groups were not significantly different, p > .15, this result indicates that familiar faces are perceived better than unfamiliar faces for a given duration of stimulus presentation. This finding points to the role of stimulus familiarity in face search performance. Presumably, one’s own face is very familiar, leaving a stronger memory trace than others’ unfamiliar faces. Given this, one would have a clear representation of one’s own face, whereas someone else’s face would be easily confused with others’. This finding is also in line with a previous study showing that the perceptual representation of a familiar face is more robust than that of an unfamiliar face (Tong & Nakayama, 1999).

Finally, we considered the possibility suggested by an anonymous reviewer that the observed capacity limit of self-face perception occurred because one’s own face is not so familiar to oneself as we expected; in practice, one is called to recognize others’ faces more often than one’s own. To address this issue, we ran a control experiment (N = 15) in which participants were required to detect a friend’s face among unfamiliar faces. Specifically, we asked participants to bring their classmates, whose photos were taken. Each friend’s image served as the target image for each participant. The result of this experiment was similar to that of the main experiment: The perceptual sensitivity to the target in the set-size 8–sequential condition was significantly better than the sensitivity in the set-size 8 condition, t(14) = 4.11, p < .005. The difference between the set-size 4 and set-size 8 conditions was also significant, t(14) = 3.85, p < .005. The magnitude of the benefit of sequential presentation was similar to the benefit calculated from the self- and other-face groups’ data, ps > .68. This finding further supports the claim that familiar-face processing is dependent on a capacity-limited process similar to the one used for processing unfamiliar faces.

Discussion

In the present study, we found that detecting one’s own face among nonfamiliar faces depends on a capacity-limited process: Search performance was significantly better when the search stimuli were presented sequentially across two frames than when they were presented simultaneously. A performance benefit of magnitude similar to the one from sequential presentation was found when the search target was an unfamiliar face. This finding suggests that the processing of familiar faces suffers from a capacity limit similar in extent to the limit for unfamiliar face processing.

Note that there has been evidence that identifying familiar faces can proceed simultaneously with other capacity-limited processes, such as response selection and working memory consolidation (Jackson & Raymond, 2006; Jung et al. 2013). Given that those processes take place at the central stage of information processing (Jolicœur, 1998; Pashler, 1991, 1994), the present finding suggests that a process that does not require central capacity can still be subject to a capacity limit of perceptual resources. This finding provides further evidence that perceptual attention and central attention are separable.

Another important point is that even though both familiar and unfamiliar faces suffer from a capacity limit, the overall sensitivity was higher for the familiar than for the unfamiliar faces. This result is consistent with the account that the perceptual representations of familiar faces are more robust than those of unfamiliar faces, as Tong and Nakayama pointed out (Tong & Nakayama, 1999). In the study by Tong and Nakayama, the visual search slope was steeper when participants searched for unfamiliar faces than when they searched for their own face. This processing efficiency for self-face search was explained by the robust representation of familiar faces. The perceptual robustness of familiar faces might also allow one to process them with relatively little cost, even when one’s attention is diverted to other tasks.

We would also note that the present finding provides further evidence that attention is needed for conscious perception. In a contentious debate regarding the relationship between attention and conscious perception, the claim that conscious perception can be separated from attention is mainly based on findings that some classes of stimuli, such as images of natural scenes and faces, can be brought into conscious perception without the aid of attention (Cohen, Cavanagh, Chun, & Nakayama, 2012; Koch & Tsuchiya, 2012). However, a recent study has reported that attention is necessary for the conscious perception of natural scenes (Cohen, Alvarez, & Nakayama, 2011). In the present study, we found that the conscious perception of faces—even of an extremely familiar face, such as one’s own—depends on a capacity-limited process, such that perceiving multiple faces without impairment is impossible. This finding implies that selective attention, through which a subset of sensory inputs are prioritized or processed more deeply at the expense of others, should play an important role in face processing in order to optimally allocate limited capacity. Given this situation, the present findings broadly support the critical role of selective attention in conscious perception.

Finally, the present results fit well with a recently proposed theoretical framework on the capacity limit of visual object recognition (Scharff, Palmer, & Moore, 2013). According to this framework, the nature of the underlying process, rather than the task difficulty, determines whether or not a given visual task will suffer from a capacity limit. Specifically, the processes of simple sensory analysis and feature segmentation for contrast discrimination, size discrimination, and detecting specific conjunctions of features are capacity-unlimited. By contrast, object and semantic processes are subject to capacity limits. Considering that the face image is a complex object with abundant visual features and information, detecting or identifying a particular face should depend on a capacity-limited process.

To conclude, the present findings suggest that the processing of familiar faces, even though their perceptual representations are more robust than those of unfamiliar faces, still depends on a capacity-limited process. Although the perceptual robustness of familiar faces enables one to process those faces with little cost in parallel with the performance of other capacity-limited tasks, the processing of familiar faces does not take place in a capacity-unlimited manner. Attention is certainly needed for the perception of faces, even for an extremely familiar one, such as one’s own face.