The importance of awareness in face processing: A critical review of interocular suppression studies

Human faces convey essential information for understanding others' mental states and intentions. The importance of faces in social interaction has prompted suggestions that some relevant facial features such as configural information, emotional expression, and gaze direction may promote preferential access to awareness. This evidence has predominantly come from interocular suppression studies, with the most common method being the Breaking Continuous Flash Suppression (bCFS) procedure, which measures the time it takes different stimuli to overcome interocular suppression. However, the procedures employed in such studies suffer from multiple methodological limitations. For example, they are unable to disentangle detection from identification processes, their results may be confounded by participants' response bias and decision criteria, they typically use small stimulus sets, and some of their results attributed to detecting high-level facial features (e.g., emotional expression) may be confounded by differences in low-level visual features (e.g., contrast, spatial frequency). In this article, we review the evidence from the bCFS procedure on whether relevant facial features promote access to awareness, discuss the main limitations of this very popular method, and propose strategies to address these issues.


Introduction
Human faces convey a wealth of information, which we use to evaluate others' mental states and intentions and guide our social behaviour [1,2]. It has been claimed that faces are remarkably effective at capturing attention [3,4], especially when expressing emotional states. For instance, several studies have reported that fearful and angry expressions are detected faster than neutral and happy expressions [5,6] (but see [7]). Another facial feature that has been claimed to be effective at capturing attention is eye-gaze. For instance, faces making eye contact typically draw attention toward the face whereas averted gaze draws attention to the gaze's direction [8][9][10]. Various aspects of face processing are altered in some psychiatric and neurological conditions; for example, it has been claimed that depression enhances salience of sad expressions [11][12][13][14] (but see [15]), anxiety enhances salience of fearful and angry expressions [16][17][18] (but see [19]), and autism affects face processing by inducing avoidance of eye contact [20]. Understanding how facial information is processed can therefore shed light on the cognitive mechanisms of perception and emotion, both in healthy and psychopathological conditions. More recently, a growing body of research has indicated that some facial features like emotional expression and gaze direction may even promote access to awareness. However, the evidence for these claims is inconsistent [21], which raises the question of how reliable the methods used in those studies are, as well as suggesting that other factors might explain these studies' findings.
In this article, we survey recent findings from this new line of research about how facial information gains access to awareness. We place particular emphasis on reviewing studies that used the most popular method used to produce these findings: breaking Continuous Flash Suppression (bCFS); we discuss this method's limitations and propose strategies that future studies could implement.

Dissociating visual processing and conscious awareness
To study whether certain facial features promote access to awareness, we can ask people to provide subjective reports of their visual experience in its simplest forms (e.g., whether they saw a face's emotional expression or gaze direction after it was presented very briefly) and measure their visual discrimination together with their neural correlates. But since simply impoverishing visual stimuli to render them faint or invisible (i.e., subjectively undetectable) will very likely eliminate any possible visual processing as a consequence, different strategies have been developed with the objective of testing what types of visual information are prioritised when entering awareness. The aim of these strategies is to dissociate sensory processing from awareness; this is based on the assumption that by suppressing a stimulus from awareness while keeping the sensory stimulation intact, one can learn about the aspects of visual processing that do not require awareness to unfold (the dissociation paradigm [22,23]), or that may be prioritised for entering awareness [24,25].

Brief exposure durations
Probably the simplest method for studying how faces enter awareness is presenting face images for predefined exposure durations and measuring whether certain visual features can be detected or recognised with shorter exposure durations than others. If observers are able to report one type of stimulus presented for a briefer exposure duration than another type of stimulus using an objective task (e.g., detection, location discrimination, expression discrimination; Fig. 1A), this performance may be interpreted as evidence of faster access to awareness by the former type of stimulus (for a more comprehensive description, see [26]). However, due to hardware limitations (specifically, computer monitors' refresh rates), it is currently difficult to present visual stimuli for briefer durations than 10-16 ms. Crucially, these presentation durations are sufficiently long for detection, categorisation, and even physiological markers of visual and emotion processing to arise [27][28][29][30][31], thus preventing researchers from testing hypotheses about conscious awareness.
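The hardware limit described above follows from simple arithmetic: a monitor can only display a stimulus for a whole number of refresh cycles, so the shortest possible exposure is one frame. The following minimal Python sketch illustrates this. The function names and the example refresh rates (60 and 100 Hz) are illustrative assumptions, not values taken from the reviewed studies.

```python
import math


def frame_duration_ms(refresh_hz: float) -> float:
    """Duration of a single monitor frame in milliseconds."""
    if refresh_hz <= 0:
        raise ValueError("refresh_hz must be positive")
    return 1000.0 / refresh_hz


def achievable_exposure_ms(requested_ms: float, refresh_hz: float) -> float:
    """Shortest exposure the monitor can show for a requested duration.

    A stimulus stays on screen for a whole number of frames, so the request
    is rounded up to the next frame boundary (and to at least one frame).
    """
    frame = frame_duration_ms(refresh_hz)
    # The small tolerance avoids rounding up when the request is an exact
    # multiple of the frame duration but floating-point division overshoots.
    n_frames = max(1, math.ceil(requested_ms / frame - 1e-9))
    return n_frames * frame


if __name__ == "__main__":
    for hz in (60, 100):
        print(f"{hz} Hz: one frame lasts {frame_duration_ms(hz):.2f} ms")
```

Under these assumptions, a 100 Hz monitor has a 10 ms frame and a 60 Hz monitor a frame of roughly 16.7 ms, which corresponds to the 10-16 ms range mentioned above; a requested 5 ms exposure would be shown for a full frame in either case.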

Masking visual stimuli
Masks interrupt visual processing. The most commonly used masking technique, backward masking, allows stimuli to be presented on screen without observers being aware of them, thus allowing researchers to measure distinctions in how stimulus categories are processed, at a manipulable temporal grain. Backward masking consists of briefly presenting a target stimulus that is quickly followed by another image, the mask (Fig. 1B). For example, Greene and Oliva [32] found that different exposure durations were required for observers to consciously perform different categorisation tasks on the same masked image; they found, for instance, that the detection of scenes' global properties requires shorter exposure durations than the detection of scenes' basic properties. While these findings do not directly test whether certain visual features of scenes may gain access to awareness faster than others, they do suggest that processing of global properties occurs before processing of scenes' basic properties.
Fig. 1. Methods to study unconscious processing and conscious access of visual stimulus features. (A) Brief exposure durations. A stimulus of interest is presented on the screen for a predefined exposure duration, typically followed by a detection, location discrimination, or expression discrimination task. (B) Backward masking. Backward masking is one of the most popular masking techniques. The stimulus of interest is followed by a mask stimulus that interrupts visual processing, including perceptual awareness. (C) Continuous Flash Suppression. One eye is flashed with changing Mondrian-like patterns typically updated at a rate of 10 Hz creating a strong interocular suppression effect on the contents shown to the other eye.
In a similar vein, Codispoti et al. [33] investigated the effect of stimulus valence by presenting observers with pleasant, neutral, and unpleasant pictures. The study used both masked and unmasked images, across exposure durations ranging from 25 to 6000 ms, and assessed participants' emotional reactivity using various measures, including electromyography, EEG, skin conductance response (SCR), and both pleasure and arousal ratings. Crucially, when masked pictures were employed, evidence of emotional processing only arose at exposure durations greater than 80 ms; but when unmasked pictures were employed, emotion processing was apparent at all exposure durations. This study exemplifies the usefulness of masking techniques at interrupting visual processing and awareness.
However, an important limitation of masking techniques is that they are only effective at suppressing awareness when the target stimulus is presented very briefly [34,35], typically for less than 100 ms. This is an important limitation for studying complex stimuli such as faces since one would like to allow periods of unconscious (i.e., masked or suppressed) processing of the stimulus and the integration of its different features for as long as possible [36]. Furthermore, masking techniques are also very vulnerable to unstable fixation and eye movements [37].

Continuous flash suppression
When one eye is presented with a high-contrast, dynamic visual pattern and the other eye is presented with a less salient image at the corresponding retinal location, the weaker image is suppressed from awareness, typically for several seconds. This phenomenon is known as Continuous Flash Suppression (CFS [38]; Fig. 1C), and has been used extensively to examine whether there is unconscious processing of the suppressed image. Typically, the dominant CFS mask consists of so-called Mondrian-like patterns (randomly arranged geometric shapes with different brightness/colour levels), which are flashed to one eye (most commonly, but not always, at a rate of 10 Hz) while a lower-contrast, often stationary target image is shown to the other eye. Therefore, CFS enables suppression of images for longer durations, circumventing the limitations of backward masking and other similar techniques [39].
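The structure of a CFS mask sequence can be sketched in a few lines of Python. The sketch below is illustrative only: it describes each Mondrian-like pattern as a list of randomly placed grey rectangles and works out which pattern is shown on each monitor frame when masks change at 10 Hz. All names and parameter values (e.g., `make_mondrian`, 50 rectangles, a 200 x 200 pixel mask, a 60 Hz monitor) are assumptions made for this example rather than settings reported in the reviewed studies, and actual drawing to each eye would require stimulus-presentation software and a stereoscopic setup.

```python
import random


def frames_per_mask(refresh_hz: int, mask_hz: int = 10) -> int:
    """Number of monitor frames each mask stays on screen."""
    if refresh_hz % mask_hz != 0:
        raise ValueError("refresh rate must be a multiple of the mask rate")
    return refresh_hz // mask_hz


def make_mondrian(n_rects=50, width=200, height=200, rng=None):
    """Describe one Mondrian-like pattern as (x, y, w, h, grey) rectangles."""
    rng = rng or random.Random()
    rects = []
    for _ in range(n_rects):
        w = rng.randint(10, width // 4)
        h = rng.randint(10, height // 4)
        x = rng.randint(0, width - w)   # keep the rectangle inside the mask
        y = rng.randint(0, height - h)
        grey = rng.random()             # brightness between 0 and 1
        rects.append((x, y, w, h, grey))
    return rects


def mask_schedule(duration_s: float, refresh_hz: int, mask_hz: int = 10):
    """For every frame, the index of the mask shown to the dominant eye."""
    hold = frames_per_mask(refresh_hz, mask_hz)
    n_frames = int(round(duration_s * refresh_hz))
    return [frame // hold for frame in range(n_frames)]


if __name__ == "__main__":
    schedule = mask_schedule(duration_s=1.0, refresh_hz=60)
    masks = [make_mondrian(rng=random.Random(i)) for i in set(schedule)]
    print(len(schedule), "frames,", len(masks), "distinct masks")
```

With these assumed settings, each mask is held for six frames, so one second of suppression uses ten different patterns.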
Several paradigms have employed CFS to study how facial information promotes access to awareness [21,34], but the most widely used one is Breaking Continuous Flash Suppression (bCFS [25]; Fig. 2). This paradigm is built on the assumption that stimulus categories that overcome suppression faster (thus gaining access to awareness) enjoy prioritised processing outside of awareness compared with stimuli that take longer to overcome suppression. For instance, Jiang et al. [40] presented Chinese speakers and Hebrew speakers with CFS-suppressed words that could match their native language or not (i.e., Chinese or Hebrew words), and asked them to report the location of the word on the screen (left or right) as soon as they were able to see it. Participants' response times (RTs) were faster for words in the language they spoke than for words in the language that they did not speak, suggesting that words that are familiar and recognisable reach awareness faster than unrecognisable words, possibly due to unconscious semantic processing.
The bCFS paradigm presents a crucial advantage over other masking techniques: it measures breakthrough times, which are an estimate of how long a stimulus took to gain access to a participant's awareness. Crucially, breakthrough times can be informative about two different aspects of awareness: when a stimulus gains access to awareness and whether it gains access faster than another stimulus of interest (or a control stimulus) does. This makes bCFS an attractive choice when the researcher's goal is to test whether certain stimulus categories enjoy prioritised access to awareness. The bCFS procedure has other advantages, including a straightforward implementation and results that are easy to interpret, making it a popular experimental paradigm [41]. However, as argued below in Section 4, bCFS suffers from a number of important limitations, some of which may be detrimental for studying complex stimuli such as facial features.

bCFS studies on face processing and access consciousness
Myriad bCFS studies have claimed that different facial features may promote access to awareness. Below, we review the main facial features that have been studied in this context.

Facial configuration and the face-inversion effect
In order to perceive faces, we need to see beyond their isolated parts, integrating them into coherent wholes, a process known as configural or holistic processing [42]. The integration of facial features conveys a large amount of high-level semantic/conceptual information, including emotional states, intentions, identity, gender, age, race, ethnicity, health, attractiveness, and personality traits [43][44][45][46][47][48].
An elegant approach to investigating the holistic or configural processing of faces is turning them upside down [49] as this disrupts their configural information while keeping their physical properties intact [50]. Many studies have found that upright faces enjoy better detection and recognition than inverted faces (the face-inversion effect or FIE; [51,52]); likewise, the former activate the fusiform face area (FFA; a key neural structure for face perception) more than the latter [53,54]. The fact that presenting faces upside down disrupts their holistic processing has been used to determine whether configural face processing makes faces gain access to awareness faster, by testing whether upright faces break through CFS masking faster than inverted faces [55]. Jiang et al. [40] were the first to address this question using bCFS, finding that upright faces were, on average, about 400 ms faster than inverted faces to overcome suppression. This result was interpreted as indicating that the holistic processing of upright faces gives them prioritised access to awareness.
Many studies have employed CFS to replicate the FIE [24,56,57,58,59,60,61,62,63], which appears to be stronger for faces than for other objects [63]. Zhou et al. [60], for example, replicated the FIE in a bCFS task but did not find an inversion effect for houses, suggesting that other stimulus categories do not contain the same type of high-level information and therefore may not be processed holistically (promoting prioritised access to awareness), or at least not to the same extent as faces. More recently, Kobylka et al. [58] showed that the FIE can be found both when using localisation (left/right) and stimulus categorisation (face/house) bCFS tasks. Similarly, Lanfranco et al. [61,62] and Stein & Peelen [63] replicated the FIE by showing participants CFS-suppressed faces for predefined exposure durations; in these studies, the FIE was replicated by measuring both sensitivity to face location (left/right) and stimulus categorisation (face inversion, emotion, or gaze direction; however, note that it has been suggested that the FIE may be noisy or weak when using CFS [62,64]). They found that at specific exposure durations sensitivity to suppressed upright faces was significantly higher than to suppressed inverted faces, thus indicating that the FIE arises from differences in perceptual sensitivity.
Fig. 2. The Breaking CFS paradigm. In this variant of the CFS procedure, a stimulus of interest is presented to one of the participants' eyes and ramped up gradually in contrast while their other eye is flashed with CFS masks. Participants are asked to perform a task as soon as they become aware of the stimulus; this is typically a detection, discrimination, or identification task. In this example, the task requires participants to report the location of the face (left or right side of the screen).
These studies suggest that faces' holistic properties confer them a processing advantage for entering awareness and that this effect may be attributed to an increase in perceptual sensitivity.
However, whether these results truly reflect holistic processing has been controversial since other studies have attributed the FIE to low-level properties of the images such as convexity [59,65], suggesting that high-level characteristics such as inversion effects may correlate with low-level characteristics. Furthermore, other complex stimuli such as human bodies have also been reported to exhibit a significant but smaller inversion effect [66], which may suggest that the inversion effects typically found in bCFS tasks, such as the FIE, may be driven by the complexity and specificity of their configurations.
In summary, some (but not all) of the evidence suggests that holistic face processing can facilitate faces' access to awareness. However, while the main body of research indicates that the FIE in bCFS tasks is due to high-level processing (e.g., semantic/conceptual information), suggesting that holistic facial information promotes access to awareness, a few other studies suggest that this may not be entirely the case: low-level information may also contribute to it.

Emotional expressions
Emotional expressions are very informative for understanding people's mental states. Several studies have shown that threatening expressions such as fearful and angry faces draw attention faster than other expressions [3,5,67], perhaps serving an evolutionary purpose. But are threatening expressions prioritised for entering awareness?
Many bCFS studies have tested whether emotional expressions enjoy prioritised access to awareness. In the first study to investigate this issue, Yang et al. [80] reported shorter suppression times for fearful expressions than for happy and neutral expressions, suggesting an advantage for fearful expressions in gaining access to awareness. However, this difference was also found when the faces had been inverted, which calls into question the original interpretation. Since turning faces upside down disrupts high-level information processing (and should thus impair emotion perception), this could be evidence that the effect was in fact caused by lower-level features of the stimuli, such as differences in contrast, luminance, or spatial frequency. Stein and Sterzer [81] addressed this problem by presenting participants with schematic faces expressing angry, neutral, happy, and sad expressions. Unexpectedly, they found shorter suppression times associated with happy expressions. Follow-up experiments demonstrated that this effect was driven by the curvature of the faces' mouths, suggesting that the advantage of emotional expressions could indeed be driven by low-level features.
Indeed, many bCFS studies to date have suggested that emotion-related advantages reported in the past may be driven by low-level features like contrast, luminance, and spatial frequency. For instance, Gray et al. [82] manipulated face orientation and luminance polarity (i.e., normal or colour-inverted), and presented fearful, happy, angry, and neutral expressions in a bCFS paradigm. The advantage for fearful expressions over happy and neutral ones was still present even when faces were shown upside down, or colour-inverted, or both, suggesting that low-level features may account for the difference in breakthrough latencies. Other studies have found that other low-level features, such as differences in spatial frequency [83], luminance, and contrast [84,85], can contribute to the advantage of emotion too. Therefore, low-level information may fully account for differences in suppression times between emotional and non-emotional expressions, meaning that low-level information either promotes visual processing of emotional expressions or confounds suppression times in favour of emotional expressions. Importantly, emotion studies employing the bCFS procedure have typically used a very restrictive set of stimuli, which raises the question of whether processing differences between emotional and non-emotional expressions are driven by the particular face stimuli chosen [62].
The role of spatial frequency was studied by Stein et al. [86], who used it to investigate claims about the processing pathways of emotional facial information. To test this, Stein et al. [86] created face images that lacked either low-spatial-frequency or high-spatial-frequency information. These images were CFS-suppressed and presented to participants in one of four possible screen locations. Participants were asked to localise the faces as quickly and accurately as possible as soon as the face or any part of it became visible. Stein et al. [86] reasoned that visually responsive neurons in the subcortical visual route (specifically those in the superior colliculus) receive afferents mainly from magnocellular retinal ganglion cells, which are more sensitive to low-spatial-frequency information; therefore, face images with low spatial frequency should be predominantly processed by these neurons. Conversely, visually responsive neurons in cortical areas predominantly receive afferents from parvocellular ganglion cells, which are more sensitive to high-spatial-frequency information; therefore, face images with high spatial frequency should mainly be processed by these neurons [87,88]. Stein et al. [86] analysed differences in suppression times and found a consistent fear advantage associated with high spatial frequencies, suggesting that the specific suppression advantage for fearful expressions was due to high-spatial-frequency information. Interestingly, these results contradict past findings, which had suggested that unconscious emotion processing (which arguably drives the emotion-related bCFS advantage) might be mediated by a specialised subcortical pathway involving the amygdala and preceding visual cortex processing [78,89] (for reviews, see [90,91]).
In conclusion, the prioritised access to awareness that fearful expressions may enjoy, as reported in many bCFS studies, may rely primarily on visual cortical areas rather than subcortical structures, although, as argued by Stein et al. [86], both pathways may contribute. Further studies are needed to determine whether their finding is specific to CFS-suppressed expressions or generalisable to other masked or unmasked expressions.
An important limitation of the studies testing for an advantage of emotional faces at entering awareness, however, is that they involved masks to suppress face images from awareness, typically by using backward masking or CFS. While these techniques are widely used in the field, it is still unclear what specific visual mechanisms they interrupt. A simpler and more intuitive approach would be to simply present (unmasked) face images for brief exposure durations to determine their detection thresholds, but as argued in Section 2.1, even the shortest exposure durations that can be presented with computer screens may be sufficient for observers to detect faces, identify their expression, and elicit the corresponding physiological activation of these processes. However, very recently, Lanfranco et al. [92] used a newly developed LCD tachistoscope that enables visual presentations with sub-millisecond precision. They presented participants with intact and scrambled human faces with fearful or neutral expressions for predefined exposure durations ranging from 0.8 to 6.2 ms and measured participants' sensitivity to the location of the face and to its expression, while also recording EEG, to determine the minimal exposures required for these processes. In a series of experiments, they found that emotion processing does not unfold before conscious awareness does, suggesting that emotion processing of faces may require awareness. These findings may suggest that prior reports of unconscious emotion processing of faces may indeed have been due to low-level visual information (or to its interaction with masks) rather than to the high-level emotional content.
In summary, multiple studies have shown that emotional expressions, particularly fearful expressions, enjoy prioritised access to awareness. However, some studies have claimed that such an advantage could be explained by differences in low-level features between face images, such as luminance, contrast, and spatial frequency, especially in the case of bCFS studies.

Gaze direction
Eye-gaze is very informative for understanding other people's intentions [44,93]. For example, many studies have claimed that direct-gaze faces draw attention towards themselves whereas faces with an averted gaze draw attention to the place they look at [8][9][10][94]. Several reports have suggested that gaze is a relevant cue for the recognition of gender, age, and emotional expression in a face [44,95,96], especially when gaze makes eye contact, as it provides a processing advantage that makes face detection faster [10,97]; this advantage is known as the eye-contact effect.
The first studies on the relationship between eye-gaze processing and awareness explored whether gaze direction can be processed in the absence of awareness. For instance, Sato et al. [98] presented observers with faces looking either at the left or right side of the screen. These backward-masked faces were employed as cues for a target subsequently presented either at the left or right side of the screen. Observers were instructed to give a detection response as soon as they became aware of the target. Crucially, RTs were consistently shorter for valid gaze cues (gaze directed at the side of the screen where the target appeared) than for invalid gaze cues in both conditions, when the face was masked and unmasked, thus suggesting that gaze direction may be processed in the absence of awareness.
Most studies on this topic, however, have employed CFS to render faces invisible in order to test whether eye contact also promotes preferential access to awareness. For example, Stein et al. [99] reported shorter bCFS suppression times for direct-gaze faces than averted-gaze faces. They concluded that these results may be due to an enhanced unconscious representation for direct gaze, perhaps preparing individuals for social interaction. This finding was recently replicated by Lanfranco et al. [61], and extended by using a novel approach to CFS based on the method of constant stimuli: suppressing the faces for a series of brief, prespecified times, and having participants report features of the stimuli (e.g., where on the screen the face was, what its gaze direction was). This method has the advantage that it is less easily confounded by decisional factors, and it revealed that even with this high level of control, participants were more accurate at reporting where the suppressed face was on screen if it was gazing directly at them. Thus, there is good evidence that direct-gaze faces break through CFS faster than averted-gaze faces, indicating that the former enter awareness faster than the latter.
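One common way to separate perceptual sensitivity from decisional factors in data of this kind is signal detection theory, which yields a sensitivity index (d') and a separate response criterion (c). The sketch below is a generic illustration under stated assumptions and is not necessarily the analysis used in the studies cited above: it treats "face on the left" as the signal in a left/right localisation task, uses hypothetical trial counts, and applies a log-linear correction so that rates of exactly 0 or 1 do not produce infinite values.

```python
from statistics import NormalDist


def sdt_measures(hits, misses, false_alarms, correct_rejections):
    """Return (d_prime, criterion) from trial counts.

    Assumed coding for a left/right localisation task:
      hit               = face on the left,  response "left"
      miss              = face on the left,  response "right"
      false alarm       = face on the right, response "left"
      correct rejection = face on the right, response "right"
    """
    # Log-linear correction: add 0.5 to each count to avoid rates of 0 or 1.
    hit_rate = (hits + 0.5) / (hits + misses + 1)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1)
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    d_prime = z(hit_rate) - z(fa_rate)            # sensitivity
    criterion = -0.5 * (z(hit_rate) + z(fa_rate))  # response bias
    return d_prime, criterion


if __name__ == "__main__":
    # Hypothetical counts for one exposure duration, 50 trials per side.
    d, c = sdt_measures(hits=45, misses=5, false_alarms=5, correct_rejections=45)
    print(f"d' = {d:.2f}, c = {c:.2f}")
```

In this scheme, a preference for responding "left" shifts c away from zero without necessarily changing d', which is why sensitivity measured at fixed exposure durations is less exposed to response bias than raw breakthrough times.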
However, why direct-gaze faces break through CFS faster is still unclear. Although Stein and colleagues argued that faces provide a social benefit, this advantage could also be driven by low-level information. Both Stein et al. [99] and Lanfranco et al. [61] found that inverting a face did not disrupt the advantage of direct-gaze faces over averted-gaze faces, which may suggest that this effect relies on low-level information. Other findings are also consistent with this low-level interpretation: First, Caruana et al. [100] found that gaze direction does not modulate the advantage of fearful expressions over neutral ones, i.e., the former having shorter suppression times than the latter. Prior studies have found that gaze direction interacts with emotional expression such that unsuppressed fearful faces enjoy better detection during attentional blink tasks when their gaze is looking away compared to when it makes eye contact [101,102]; this has been interpreted as evidence of a contextual value of gaze direction for emotional face detection. Importantly, this effect was not present when the face images were CFS-suppressed, suggesting that eye contact promotes access to awareness through low-level processing mechanisms.
Second, using schematic faces, Chen and Yeh [103] found shorter suppression times for direct-gaze not only when stimuli were upright faces or inverted faces, but also when stimuli were simply pairs of eyes, removed from any face. Thus, this suggests that high-level information processing may not be driving the effect. Interestingly, a similar advantage has been found, using bCFS, for faces turned towards the viewer in comparison to faces turned away regardless of their gaze direction, indicating that a similar effect can be found with head angle alone [104].
In summary, the evidence suggests that direct eye-contact can promote access to awareness, but that this effect likely relies on processing of low-level visual features.

Familiarity traits
There is also evidence that more familiar faces enjoy prioritised access to awareness. Geng et al. [105], using bCFS, found shorter suppression times for self-faces (observers' own faces) than celebrities' faces, suggesting that facial information containing self-features gains faster access to awareness. Next, they measured EEG-ERPs during subliminal (CFS) and supraliminal (non-CFS) conditions. They found enhanced N170 amplitude to self-faces in the supraliminal condition and a decreased vertex positive potential (VPP) amplitude to self-faces in the subliminal condition, suggesting a distinct neural modulation associated with familiar faces. Gobbini et al. [48] also used bCFS and expanded on these results by finding an advantage for faces of family members compared to faces of unknown people (and this effect may be specific for self-faces and not for self-adjectives; see [106]). Taken together, these studies suggest that familiarity in the form of own-faces and relatives' faces may enjoy a prioritised access to awareness.

Social information
People extract information from faces to evaluate social qualities such as friendliness and trustworthiness, and then use it to adjust their own social behaviour accordingly. Several studies have suggested that two major axes, trustworthiness and dominance, predominantly characterise this evaluation process [107,108]. Does this evaluation affect how faces access awareness? Stewart et al. [109] generated multiple facial expressions that covered a large range of possible combinations of trustworthiness and dominance traits. They used CFS to render these face images invisible and to test for RTs in a bCFS task. In a series of experiments, they found that faces that were either highly dominant or highly untrustworthy elicited significantly longer suppression times than less dominant or less untrustworthy faces. Interestingly, participants who scored lower in dominance and untrustworthiness took longer to report awareness of the dominant or untrustworthy faces; these findings were replicated and expanded by Getov et al. [110]. The researchers interpreted these results as evidence of slowed visual perception resulting from a possible passive fear response. However, another study conducted by Stein et al. [111] found that the results obtained by Stewart et al. [109] could be explained by differences in low-level visual features: Stein et al. [111] successfully replicated the dominance- and untrustworthiness-related longer suppression times, though they found the same effect when turning faces upside down and when presenting only the eye region of faces to participants, suggesting that the effect of dominance and trustworthiness in bCFS studies may be due to differences in low-level visual features in the eye region.
A more recent study by Abir et al. [112] used bCFS and found evidence that social information modulates faces' access to awareness. They replicated the FIE by presenting participants with upright faces and inverted faces and employed reverse correlation to model the low-level (such as contrast and spectral content) and high-level (perceived power/dominance) facial properties that predicted breakthrough times. They found a dimension that explained a large part of the variance, which in addition correlated with power/dominance, suggesting that these social traits play an important role in how fast faces break through CFS. Crucially, though, the social dimension found could still predict breakthrough times when low-level features were controlled for in the model, based on data obtained from scrambled and inverted faces. These findings suggest that social dimensions (e.g., power/dominance) in faces may promote a face's access to awareness.
But can more stable traits such as race and gender modulate a face's access to awareness? Are faces of our own race, gender or even age group prioritised for awareness? Using bCFS, Stein et al. [113] found shorter suppression times for faces matching the observer's own race or age group, in addition to larger FIEs for own-race and own-age faces compared to other-race and other-age faces, suggesting that experience-based facial information can promote access to awareness. Yuan et al. [114,115] explored this question further by using own-race and other-race faces as CFS-suppressed primes for an affective priming task. They found that suppressed other-race faces facilitate identification (as indexed by shorter response times) of subsequent unsuppressed negative words whereas suppressed own-race faces facilitate identification of unsuppressed positive words, suggesting that semantic information associated with the race of suppressed faces may be processed in the absence of awareness.
Taken together, these reports suggest that social information such as power and dominance, as well as race and gender, modulates the time it takes for a face to break through suppression and enter awareness, and that visual features associated with race and gender may be processed unconsciously. The role of low-level features in this effect nonetheless remains controversial.

Attractiveness and aesthetic traits
A classic study that presented unmasked stimuli for predefined exposure durations of either 150 or 1000 ms found that participants' ratings of facial attractiveness were highly reliable at both durations [116]. However, do attractive faces gain access to awareness faster than less attractive ones? A few studies have claimed that this is the case. Hung et al. [47] were the first to address this. In a series of bCFS experiments, they found that more attractive faces break through CFS faster than less attractive faces. Additionally, by using a staircase procedure, they found that more attractive faces had lower visibility thresholds than less attractive ones under CFS. Finally, they presented two CFS-suppressed faces, one on each side of the screen (a more attractive one and a less attractive one), followed by a brief flash of a Gabor patch on the left or right side of the screen; they found lower accuracy detecting the Gabor patch when it was shown on the same side as the attractive face. These results suggest that more attractive faces reach awareness faster than less attractive faces and that when suppressed they draw spatial attention more effectively than less attractive faces.
However, this advantage of attractive faces might, like the effects noted in previous sections, be due to low-level features rather than holistic face properties. Nakamura and Kawabata [117] used bCFS and successfully replicated the effect of shorter suppression times for more attractive faces, but in two additional experiments, they showed that this effect could also be found when turning attractive faces upside down, i.e., when disrupting faces' holistic information. Conversely, the effect was absent when they compared intact attractive faces to scrambled attractive faces. One explanation is that the advantage effect of attractive faces is driven by low-level features such as differences in contrast and spatial frequency, given that the effect was also found with inverted faces. However, the fact that the effect disappeared when using scrambled faces (i.e., when destroying all facial information contained in face images) may suggest that it relies on a minimal amount of facial information to occur, perhaps just sufficient to convey that the images are face stimuli.
These studies indicate convergent evidence of an advantage of attractive faces when gaining access to awareness compared to less attractive faces. However, this advantage could be due to low-level features.

Conclusions
The studies described above suggest that an array of different properties can influence how faces gain access to awareness. These properties include configural relations, emotional expressions, gaze direction, familiarity features, perceived trustworthiness, and attractiveness. They have been studied mainly using CFS procedures (predominantly bCFS), with the occasional measurement of neural and physiological data. In summary, it has been claimed that faces gain access to awareness faster due to their configural properties, by comparing upright to inverted faces. Fearful expressions have been claimed to enjoy a processing advantage in comparison to other expressions. Similarly, other studies have suggested that facial features such as gaze direction, familiarity, and attractiveness may also promote faces' access to awareness. However, most studies have failed to convincingly attribute their effects to processing of high-level information in faces. This is an important matter because if the advantages attributed to facial features can be explained by low-level features, then said advantages may not be of facial nature. Furthermore, some findings are contradictory or have not replicated, raising the question of how reliable these methods and their paradigms are. To clarify whether different facial features may promote access to awareness, it is necessary to conduct further research and, as detailed below, such research must not suffer from the methodological limitations inherent in the bCFS procedure.

Methodological issues and solutions
The study of how faces and their features gain access to awareness suffers from limitations that stem from the methods and paradigms employed in the field, with the most common being the bCFS procedure. Since these limitations cast doubt on the conclusions that can be drawn, future studies need to address them. Next, we list these limitations and propose strategies that future studies can implement.

The problem of disentangling detection from identification
Most bCFS studies ask participants to give a report as soon as the target stimulus breaks through suppression, by performing a detection or localisation task. The assumption is that RTs reflect breakthrough times: a faster response for detection or localisation should reflect a prioritised access to awareness [21,25]. For this rationale to be valid, neither reporting the presence of a stimulus nor reporting its location on the screen should involve the stimulus category's identification or classification, let alone any more complex recognition processes. For example, if the task is to say whether a face is shown, or whether it was shown on the left or right, it should not need to involve recognising its expression or gender. It is therefore assumed that participants do not waste time identifying and categorising the face's emotional expression or gender before responding: they simply press the relevant key as soon as they see any part of it. But is this assumption justified? Participants have control over the amount of visual information they receive since trials are self-terminated, which makes it impossible to determine whether participants' RTs are the result of pure detection processes or of a combination of detection and identification processes. If identification processes influence RTs, not only could they confound the results when identification performance differs between experimental conditions, but they could also bring additional identification-related post-perceptual confounders (e.g., decision criterion differences) into the equation. Thus, disentangling detection from identification is necessary to avoid potential confounds (e.g., [61,62]), especially since less information is required to detect a stimulus than to identify its nature [58], but we cannot assume that participants are able to suppress their identification processes just because identification is task-irrelevant.
Disentangling detection from identification (or discrimination) can also be useful to test whether a difference in detection bCFS RTs is driven by unconscious or conscious processes. For instance, recently Stein & Peelen [63] combined tasks of detection and discrimination of faces shown for predefined exposure durations in different orientations to test whether observers are able to detect the location of a face (and exhibit a FIE) while being unable to identify its orientation. By teasing apart these two processes, this method allows measuring detection and discrimination sensitivity, separately, and their corresponding decision criteria (also see [61,62,92]).
The potentially confounding nature of identification processes in bCFS studies raises the question of how one should interpret bCFS findings. For example, what is the nature of the FIE measured with bCFS RTs? Classic face perception studies have shown that turning a face upside down disrupts its recognition [51,52,118], which has been attributed to a disruption of its holistic configuration. If so, then why does it affect bCFS RTs when the task is to just detect the presence or location of the face? As stated above, many bCFS studies have found shorter RTs to suppressed upright faces than inverted faces [24,40,57,59,61,66,80,99]. On the one hand, access to awareness that enables detection could be partly driven by holistic features, as suggested in the literature. On the other hand, the FIE could be due to task-irrelevant identification processes that some participants are not able to suppress, a feasible possibility given the high between-subject variability found for this effect [57]. Future studies should come up with more stringent paradigms that allow disentangling detection from identification.

The problem of post-perceptual factors
Participants' bCFS RTs may be contaminated by their decision criteria for reporting a stimulus breaking through suppression. For example, using traditional visual detection paradigms, it has been shown that anxious participants are more willing (i.e., have more lenient decision criteria) to report ambiguous faces as angry [119] or fearful faces as more fearful [120] compared to healthy individuals. Post-perceptual factors such as response bias (a preference to give a particular response) and decision criterion (the willingness to report a signal) are separate from perceptual sensitivity (the ability to discriminate a signal from noise). Importantly, they may confound participants' subjective reports, especially in bCFS tasks where the main dependent variable is RTs, and the amount of information collected before making a decision is controlled by the participants themselves. In the case of unconscious face processing, for example, participants could exhibit a more liberal decision criterion for reporting fearful expressions than happy expressions, thus needing less information to report the former than the latter, leading to shorter RTs.
Three approaches have been proposed to deal with potential criterion confounders. One approach is to use a conscious control condition that emulates all aspects of the experimental condition except the interocular suppression manipulation. Some control conditions try to achieve this by presenting the stimulus binocularly or monocularly on top of the CFS masks, thus not suppressing it from awareness. The assumption is that such a task should replicate all detection-related post-perceptual factors; if the difference in RTs to different stimulus categories is larger in the experimental condition than in the control condition, or if there is a difference in the experimental condition whereas no difference is found in the control condition, such effects could be attributed to differences in unconscious processing [24]. For example, this is the case of the study by Jiang et al. [40], who found significantly shorter RTs to upright faces than inverted faces when the stimuli were suppressed (experimental condition). Such an advantage, however, was not found when they presented the target stimuli on top of the CFS masks (control condition).
There are, however, concerns about this approach. Perceptual uncertainty is higher when stimuli are suppressed from awareness than when they are not, leading to wider response-time distributions and longer tails, and making stimuli easier to predict in control conditions [24]. Therefore, as both conditions differ substantially, participants could adopt different decision criteria per condition, thereby making the control condition useless. Furthermore, because these control conditions are meant to measure access to awareness in the absence of interocular suppression, they may also be too stringent: If the control condition fails to replicate the main condition's findings, it does not necessarily mean that the results of the latter are due to unconscious processes; at the same time, if the control condition replicates the main experiment's findings, the possibility that they are due to unconscious processing cannot be ruled out either.
Another approach is to ask participants to perform a task that is orthogonal to the experimental manipulation. The assumption is that the experimental manipulation's outcome should be unaffected by differences in post-perceptual factors if these factors are unrelated to the task. For example, Gayet et al. [121] asked participants to identify the orientation of suppressed Gabor patches. However, they were interested in the effect of the colour of the annulus surrounding the patches (for which associations had been created earlier by conditioning), making the task irrelevant to the experimental manipulation. Similarly, Salomon et al. [122] asked participants to identify the orientation of suppressed Gabor patches presented inside a hand image, when in fact they were interested in the effect of congruency between the position of the hand image and the participants' hand. This approach has also been adopted in studies about unconscious face processing. For instance, Yang and Yeh [65] presented participants with different facial expressions both in upright and inverted orientations. Participants were asked to press a key as soon as any part of the face broke through suppression. Next, they were asked to report the location of the face on the screen and to rate its emotional valence. The detection and localisation tasks were probably not as orthogonal to the experimental manipulation (emotional expression) as in the previous two examples, given that participants could have guessed the purpose of the study and thereby adjusted their decision criterion to it. In fact, we cannot be certain about what specific aspects of the stimuli are relevant for each participant's decision criterion, even in the first two examples. As long as participants have control over the amount of information received, we cannot know whether their RTs are confounded by criterion differences or not, let alone whether they reflect differences in perceptual sensitivity [61,62].
A newer approach that is gaining traction is using non-speeded tasks where experimenters control how long stimuli are presented and assess participants by combining detection and identification tasks. This paradigm enables control and quantification of both decision criterion (in presence/absence detection tasks) and response bias (in forced-choice tasks) under a signal detection theoretic framework [61][62][63][92]. This approach is promising because it allows ruling out or at least accounting for post-perceptual factors.
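To make the distinction between sensitivity and decision criterion concrete, the following minimal Python sketch computes the standard equal-variance SDT indices d' and c from the four response counts of a presence/absence detection task. It is an illustration only, not the analysis pipeline of any study cited here: the function name, the example counts, and the log-linear correction (adding 0.5 to each cell so that hit or false-alarm rates of 0 or 1 remain finite) are our assumptions.

```python
from statistics import NormalDist


def sdt_measures(hits, misses, false_alarms, correct_rejections):
    """Return (d_prime, criterion) for a yes/no detection task.

    Assumes the equal-variance Gaussian SDT model. The +0.5 / +1.0
    log-linear correction is one common convention, chosen here only
    for illustration.
    """
    hit_rate = (hits + 0.5) / (hits + misses + 1.0)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1.0)

    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    d_prime = z(hit_rate) - z(fa_rate)            # perceptual sensitivity
    criterion = -0.5 * (z(hit_rate) + z(fa_rate))  # c < 0 means a liberal criterion
    return d_prime, criterion


# Hypothetical counts for two expression conditions (not real data).
fearful = sdt_measures(hits=45, misses=5, false_alarms=25, correct_rejections=25)
happy = sdt_measures(hits=40, misses=10, false_alarms=10, correct_rejections=40)
print("fearful: d' = %.2f, c = %.2f" % fearful)
print("happy:   d' = %.2f, c = %.2f" % happy)
```

Because d' and c are estimated separately, a condition that merely elicits a more liberal criterion (more "present" responses on both signal and noise trials) shows up as a shift in c rather than as an apparent gain in sensitivity, which is the dissociation that RT-based bCFS measures cannot provide.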

The problem of low-level features, failed replications, and small stimulus sets
CFS paradigms (particularly bCFS) have been used to explore whether high-level facial features promote access to awareness. For instance, emotional expressions, identity, and attractiveness are commonly seen as high-level features since their processing allegedly involves integration of visual features and even memory recall. However, for this logic to be valid, suppressed face images should not differ from each other in any other respect; otherwise, low-level factors such as differences in luminance and contrast [123], spatial frequency [124], and retinal size [125] may confound suppression times, thereby increasing false positives. The problem of low-level features has confounded some bCFS studies. For example, it was originally claimed that semantic relations between an object and its surrounding context could be extracted unconsciously by showing that suppressed scenes with an incongruent object (e.g., a basketball player tossing a watermelon instead of a basketball) broke through suppression faster than congruent scenes [126]. However, studies with more stringent low-level control exploring the same effect failed to replicate it [127,128]. A similar case was made for a study about unconscious visual cueing, in which experimenters showed that separate fragments organised to elicit a Kanizsa triangle illusion broke through suppression faster than when presented in a disorganised way [129]. A subsequent study showed this bCFS effect to be due to a low-level confounder: the presence of collinear edges [130]. Similarly, another study showed that images of snakes have shorter breakthrough times than images of birds, which the researchers interpreted as evidence of prioritised access to awareness [131]. However, a follow-up study found that this effect was due to differences in spatial frequency [132].
Face images are particularly vulnerable to low-level confounding factors. They naturally differ from each other (e.g., when comparing different identities and features) and, crucially, some of their high-level features may depend on low-level feature differences. For instance, emotional expressions differ in spatial frequency [133][134][135] and, importantly, certain expressions such as fearful expressions may enjoy prioritised subcortical processing thanks to their spatial frequency [86,89,134]. As described above, multiple studies have shown that fearful expressions break through suppression faster than other expressions [80], but subsequent studies have indicated that differences in low-level features such as contrast could explain differences in breakthrough times [82,84,85]. While it may be the case that the visual system could have evolved this way, with a preference for high-contrast facial features, even involving distinct eye-movement patterns when presented with different expressions [43], it is essential to account for potential low-level confounders to determine what factors drive these effects. This may be a tricky task when studying faces. For example, Stein and Sterzer [81] tried to replicate the advantage effect of emotional expressions over non-emotional ones using schematic faces to avoid low-level confounders. Unexpectedly, they found an advantage of happy expressions instead, which, as they showed, was due to another low-level confound: a visual relation between mouth curvature and face contour.
A different but related issue is the fact that many bCFS studies of face processing have relied on a small and circumscribed set of stimuli (e.g., [80,136]), which raises worries about generalisability. This makes bCFS studies even more vulnerable to low-level confounders since using fewer stimuli (e.g., face identities) will make the results more likely to be confounded by the particular visual features of those stimuli rather than the high-level categories that they intend to test.
Another issue is the fact that bCFS studies often differ in the depth of suppression they achieve. CFS Mondrian-like patterns may differ in temporal frequency [124][137][138][139][140][141], spatial frequency [124,137], colour [123,142], motion [143,144], internal structure [24], and spatial density across studies [145] (for a review, see Pournaghdali & Schwartz [21]); these visual attributes may interact with the visual attributes of the stimuli, thus introducing potential confounders that in turn may contribute to the number of failed replications. This issue is not specific to face processing studies but relevant to all bCFS studies.
As shown, many failed replications occurred when researchers controlled for confounding factors. This casts doubt on the validity of the conclusions reached in bCFS studies that did not control for these potential confounds and calls for more stringent procedures.

Conclusions
CFS and especially the bCFS paradigm have received several methodological critiques, which revolve around stimulus-related and participant-related potential confounding factors. The former involve the need to control for low-level visual features that could confound participants' breakthrough times, and the latter involve task-irrelevant identification processes and post-perceptual factors that could confound participants' RTs. See Table 1 for a summary.
How can low-level visual effects be controlled for? While some differences in low-level features may be controlled for by equating them between stimuli (e.g., luminance), other aspects may be inherent to the stimuli themselves (e.g., differences in spatial features between emotional expressions) and therefore may be difficult or impossible to control in a direct manner. Future studies should employ larger stimulus sets so that low-level differences between stimuli are more likely to be reduced or cancelled out.
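As an illustration of what equating low-level features between stimuli can look like in practice, the sketch below rescales a set of greyscale face images to a common mean luminance and a common root-mean-square (RMS) contrast using NumPy. It is a minimal example under stated assumptions rather than a procedure taken from any of the studies reviewed: the function name and target values are hypothetical, images are assumed to be floating-point arrays scaled to the range 0 to 1, and RMS contrast is taken to be the standard deviation of pixel intensities. It does not equate spatial frequency content or any expression-specific spatial features.

```python
import numpy as np


def equate_luminance_contrast(images, target_mean=0.5, target_rms=0.15):
    """Rescale greyscale images to a shared mean luminance and RMS contrast.

    `images` is an iterable of 2-D arrays with values assumed to lie in [0, 1].
    The target values are arbitrary choices made for this illustration.
    """
    equated = []
    for img in images:
        img = np.asarray(img, dtype=float)
        std = img.std()
        if std == 0:
            raise ValueError("uniform image has no contrast to rescale")
        # Standardise the pixel values, then map them onto the shared targets.
        standardised = (img - img.mean()) / std
        equated.append(standardised * target_rms + target_mean)
    return equated


# Two synthetic "images" that differ in mean luminance and contrast.
rng = np.random.default_rng(0)
bright_high_contrast = rng.random((64, 64))
dark_low_contrast = 0.2 + 0.1 * rng.random((64, 64))

matched = equate_luminance_contrast([bright_high_contrast, dark_low_contrast])
for m in matched:
    print(round(float(m.mean()), 3), round(float(m.std()), 3))
```

After rescaling, some pixel values may fall outside the displayable range; clipping them would alter the equated statistics, so in practice the targets would need to be chosen with the display range in mind.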
Several of the problems described above stem from the fact that participants in bCFS tasks get to decide how much information they will receive before committing to a response. This makes their results particularly vulnerable to their own decision criteria. Future studies should develop paradigms that do not allow participants to control the amount of information they will receive, thus reducing the reach of participants' response biases and decision criteria. Very recent developments, such as the approaches used by Stein & Peelen [63] and Lanfranco et al. [61,62], which dissociate detection and discrimination using non-speeded tasks and calculate bias-independent SDT measures, are particularly promising.

Table 1
Main methodological issues in bCFS studies of face processing.

Issue: The need to disentangle detection and identification
Possible implications:
• Breakthrough RTs may not exclusively reflect detection processes.
• Recognition and identification processes may confound RTs.

Issue: Post-perceptual factors may confound detection performance
Possible implications:
• Differences in identification-related processing may confound RTs.
• Response bias may also confound RTs.

Proposed solutions (for the two issues above):
• Use non-speeded tasks where participants do not control the amount of visual information they receive.
• Use detection and identification tasks simultaneously, to quantify the influences of both processes.
• Use Signal Detection Theoretic (SDT) measures to tease apart sensitivity and criterion/bias.

Issue: Differences in low-level features between stimuli
Possible implications:
• Differences in breakthrough times may be due to differences in low-level features.
Proposed solutions:
• Select stimuli or equate them for luminance, contrast, size, and spatial frequency.

Issue: Many studies have used small stimulus sets
Possible implications:
• Performance is more likely to be confounded by differences in low-level visual features.
• Effects due to idiosyncrasies of specific stimuli may contribute to failed replications.
Proposed solutions:
• Use larger stimulus sets.
• Use validation studies to select stimuli for each category (e.g., emotional expression).

Issue: CFS Mondrian-pattern visual attributes differ across studies
Possible implications:
• The depth of suppression may vary between studies that employ different visual attributes in their CFS maskers.
• Differential suppression may contribute to failed replications.
Proposed solutions:
• Report the CFS visual attributes used in each study, to facilitate replication.
• Base the choice of the CFS masks' visual attributes on the visual stimuli that will be suppressed.

Concluding remarks
Faces contain a wealth of information essential for social interaction. Myriad studies have explored whether different facial features promote access to awareness, claiming that holistic configuration, emotional expression, gaze direction, familiarity, social information, and aesthetic traits in faces may be processed unconsciously and promote faces' access to awareness. However, the methods that many of these studies have used suffer from important limitations that undermine their claims. Future studies should address these matters by developing more stringent methods that can more reliably distinguish between detection and identification of faces, control for post-perceptual factors that may affect participants' detection decisions, reduce the number of low-level visual confounders in their stimuli, and use larger stimulus sets.

Data Availability
No data were used for the research described in the article.