Awareness is needed for contextual effects in ambiguous object recognition

Despite its centrality to human experience, the functional role of conscious awareness is not yet known. One hypothesis suggests that consciousness is necessary for allowing high-level information to refine low-level processing in a “top-down” manner. To test this hypothesis, in this work we examined whether consciousness is needed for integrating contextual information with sensory information during visual object recognition, a case of top-down processing that is automatic and ubiquitous to our daily visual experience. In three experiments, 137 participants were asked to determine the identity of an ambiguous object presented to them. Crucially, a scene biasing the interpretation of the object towards one option over another (e


Introduction
Imagine seeing a yellowish circle placed on a table from a distance; at first glance, it could be perceived as a lemon. Give it a second look, and it might seem like a tennis ball. Now, imagine that this circle was hanging from a tree; then, you would probably immediately perceive it as a lemon, because lemons grow on trees and tennis balls do not. Such contextual influences on object recognition are typically referred to as top-down processes (Bar, 2004; Palmer, 1975). These processes are driven by our expectations, drawn from innate knowledge or previous experience with the world. They differ from bottom-up processes that are data-driven, and shaped by the physical properties of the stimulus (Balcetis, 2016; Bar, 2004; Hubel & Wiesel, 1962; Riesenhuber & Poggio, 1999). Many studies have demonstrated top-down effects on object identification (Biederman et al., 1982; Henderson & Hollingworth, 1999; Truman & Mudrik, 2018). For example, perception of an ambiguous object can differ based on the scene in which it is embedded, because of its associated context (Bar, 2004).
Do top-down processes require conscious awareness? The answer to this question is controversial. Some top-down, contextual effects were found to occur without awareness, yet these were mostly confined to lower-level perceptual cases. For example, the brightness illusion, in which two equally bright circles appearing on different backgrounds seem to have different luminosity, can occur without awareness of the background (Harris et al., 2011). Similarly, the tilt illusion, in which a surrounding tilted grating causes a vertical central grating to be perceived as tilted in the opposite direction, also persisted despite unawareness of the surrounding grating (Clifford & Harris, 2005; see also Mareschal & Clifford, 2012, for a replication and further evidence). On the other hand, the Kanizsa illusion (Kanizsa, 1979), in which participants perceive an illusory shape based on amodal completion, was sometimes reported to be found without awareness (Fahrenfort et al., 2017; Wang et al., 2012; but see Moors et al., 2015), and sometimes not (Banica & Schwarzkopf, 2016; Harris et al., 2011; Sobel & Blake, 2003). Finally, a more recent study (Biderman et al., 2020) has focused on symbolic contextual effects. Relying on the classical B-13 manipulation, first employed by Bruner and Minturn (1955), participants were presented with an ambiguous object that could be perceived either as the letter B (when appearing with the letters A and C) or as the number 13 (when appearing with the numbers 12 and 14). Though the contextual inducers were invisible to the participants, they still elicited a reliable (albeit weak) effect on their classification of the ambiguous stimulus. Interestingly though, this effect was not found for lexical contexts (words), aimed at biasing perception of an ambiguous letter. Thus, current evidence suggests that lower-level integration, in the form of simple visual illusions or categorical symbolic contextual effects, can occur even when the context is perceived unconsciously. However, the picture is not clear with respect to higher-level contextual effects.
Critically, no study has examined unconscious contextual influences on visual object recognition. Given our vast experience with objects in scenes in daily life, such processing might have become so automated and trained that it might be independent of conscious processing (for a somewhat similar claim with respect to driving a car, see Dehaene & Naccache, 2001). While integrating scenes and objects is currently considered to depend on conscious processing (Biderman & Mudrik, 2018; Moors et al., 2016; Mudrik et al., 2014), despite earlier findings implying otherwise (Mudrik et al., 2011; Mudrik & Koch, 2013), suppressed scenes might still be potent enough to facilitate the processing of a visible object. Indeed, gist processing occurs very rapidly (Joubert et al., 2008; Sun et al., 2011), even when the scene is presented for 8 msec (Furtak et al., 2022). Importantly, gist extraction was reported also in the near absence of attention (Li et al., 2002), suggesting that it may indeed take place also without awareness.
Here, we accordingly asked whether top-down contextual effects on object recognition could be evoked by invisible scenes. We describe three experiments, all preregistered, in which participants were presented with visible objects that were rendered ambiguous by blurring (e.g., a tent and a pyramid; when blurred, the two can be easily confused due to their similar outline). Each object in each ambiguous pair was matched with a scene in which it is likely to appear (e.g., a camping site for the tent, a desert for the pyramid). Blurred objects were presented while embedded either in their own scene or in the other object's scene. Participants were asked to report the identity of the objects. Critically, scene visibility was manipulated such that the scenes were either consciously perceived or not. We hypothesized that scenes would bias identity judgments when consciously perceived, in line with previous findings (Biederman et al., 1982; Brandman & Peelen, 2017; Henderson & Hollingworth, 1999; Truman & Mudrik, 2018), and that participants would be faster when making judgments congruent with the scene. Our major question was whether similar effects would be found when scenes were presented unconsciously. The codes and materials of all experiments, along with the data obtained in them and the scripts used to analyze the data, are available at https://osf.io/afzre/.

Methods
Below, we report how we determined our sample size, all data exclusions, all inclusion/exclusion criteria, whether inclusion/exclusion criteria were established prior to data analysis, all manipulations, and all measures in the study.

Participants
The experiment was carried out in two phases. First, the conscious condition was run with 12 participants (11 females, 11 right-handed, mean age 24.1, SD = 2.5), to validate the design. Then, 34 naive participants were included in the unconscious condition (23 females, 32 right-handed, mean age 24.7, SD = 3.1). All analysis plans and exclusion criteria were preregistered on the OSF platform (https://osf.io/ayqcz/).¹ Sample size was determined to reach statistical power of 90% based on the effect size observed in Biderman et al. (2020). All participants were students from Tel Aviv University, receiving either course credit or payment for their participation. The study was approved by the Tel Aviv University ethics committee. All participants signed a consent form and were informed that they could leave the experiment at any stage if they wished to do so.
¹ Notably, when the first experiment was preregistered, we had a different analysis plan, yet while working on a similar project (with a somewhat different research question; Biderman et al., 2020), we developed a better, Bayesian analysis scheme that allowed us to evaluate null results, and to better fit RT data. Importantly, the analysis as it was registered for the first experiment does not change the results and adds one additional effect of congruency on RT in the conscious condition. This analysis can be found in the supplementary material.
Six additional participants were excluded from the unconscious condition based on our predefined exclusion criteria: four had fewer than thirty usable trials in at least one of the four experimental conditions, and two reached the maximum number of post-test trials without sufficient invisible trials and so were also removed from analysis (see Trial exclusion). Two additional criteria for participant exclusion were defined but were not met in the current experiment: an average RT of 3 SDs or more away from the group average, and a discrimination between real and scrambled scenes under 1-visibility trials in post-test at a sensitivity index (d′) of 1.5 or more. Our final sample therefore included 46 participants: 12 in the conscious condition and 34 in the unconscious condition.

Stimuli and apparatus
Stimuli were presented on an LCD monitor (23″ ASUS Sync-Master) with 1920 × 1080 resolution and 60 Hz refresh rate using Matlab and Psychtoolbox 3 (Brainard, 1997). Participants were seated in a dimly lit room with their heads stabilized by a chin rest located 60 cm from the monitor.
Stimuli included 23 pairs of objects and corresponding 23 pairs of scenes. Paired objects shared similar overall shape, so that they resembled each other when blurred (e.g., a fish and a leaf; Fig. 1A; henceforth, ambiguous objects). Blurring was done using Adobe Photoshop. Objects spanned .629°–6.472° in width and 1.047°–4.263° in height (visual angle). Twenty pairs of objects were used in the conscious condition and nineteen in the unconscious condition (one pair was excluded, see explanation in the Trial exclusion section below). The remaining pairs were used for the practice trials.
Scene images were selected to maximally disambiguate the object pairs. Thus, they matched one object and not the other (e.g., a chess board image and an envelope on a table were the two scenes selected for the chess pawn–stamp pair; Fig. 1B). Scenes were 9.44° in width and 6.88° in height. Note that in all cases, the scenes were not the original ones from which the objects were taken, but different exemplars of these scenes. Combining the ambiguous objects with their corresponding scenes created a set of four stimuli: object A on scene A, object A on scene B, object B on scene A and object B on scene B. In each set, special care was taken to equate the size, position and orientation of the objects as much as possible.
Forty masks were created by scrambling other scene images taken from the IAPS picture bank (Lang et al., 1997). Scrambling was done by dividing the scenes into 9 × 6 rectangular segments and shuffling their order. The fixation cross was 1.24° × 1.24° in size. All stimuli were presented against a gray background (RGB = 128, 128, 128).

Procedure
Each of the four combinations produced by every pair of objects and its corresponding pair of scenes was presented four times during the study, amounting to 16 presentations per pair (and, overall, 320 trials in the conscious condition and 304 in the unconscious condition, corresponding to the 20 and 19 pairs used, respectively). The experiment was divided into 16 blocks, each containing one presentation of each pair. Presentation order was pseudo-random, with the constraint that all four combinations appeared before a new cycle of their repetitions began. To familiarize participants with the task, an additional practice block (20 trials) was administered before the main experiment began, using different sets of objects and scenes.
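The blocking scheme described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' Matlab/Psychtoolbox code: the function name `build_schedule`, the tuple layout and the seed are hypothetical, and shuffling the order of pairs within each block is an assumption (the text specifies only the cycle constraint).

```python
import random

# The four object-on-scene combinations of one stimulus pair: (object, scene).
COMBOS = [("A", "A"), ("A", "B"), ("B", "A"), ("B", "B")]


def build_schedule(n_pairs, n_cycles=4, seed=0):
    """Return a list of blocks; each block holds one (pair, object, scene) per pair."""
    rng = random.Random(seed)

    # For every pair, concatenate n_cycles independently shuffled cycles, so all
    # four combinations appear before the next cycle of repetitions begins.
    per_pair = []
    for _ in range(n_pairs):
        sequence = []
        for _ in range(n_cycles):
            cycle = COMBOS[:]
            rng.shuffle(cycle)
            sequence.extend(cycle)
        per_pair.append(sequence)

    # Block b shows the b-th scheduled combination of every pair, in random order.
    n_blocks = n_cycles * len(COMBOS)
    blocks = []
    for b in range(n_blocks):
        block = [(pair, *per_pair[pair][b]) for pair in range(n_pairs)]
        rng.shuffle(block)  # assumption: pair order is randomized within a block
        blocks.append(block)
    return blocks


conscious_blocks = build_schedule(n_pairs=20)    # 16 blocks of 20 trials
unconscious_blocks = build_schedule(n_pairs=19)  # 16 blocks of 19 trials
```

With 20 pairs this yields 16 blocks and 320 trials; with 19 pairs, 304 trials, matching the counts reported in the text.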
The experimental sequence was identical in the conscious and unconscious conditions, apart from the order of masks and blanks. Every trial began with a fixation cross for a random duration between 900 msec and 1100 msec, located at the center of the to-be-presented ambiguous object. In the conscious condition, this was followed by a 50 msec mask, which in turn was followed by a 100 msec blank display. In the unconscious condition, the blank was presented before the mask. Then, the scene image with the ambiguous object on top was presented for 33 msec. In the conscious condition, this was followed by a 100 msec blank and then a 50 msec mask, with the ambiguous object presented on top of both. In the unconscious condition, their order was reversed. In both cases, therefore, the ambiguous object was presented on screen for a total of 183 msec, while the scene was presented for 33 msec only (Fig. 1B left panel).
In each trial, before this experimental sequence, the names of the two possible objects in that trial were shown to the left and to the right of center (location counterbalanced between participants, but kept constant within a participant to avoid confusion). Names remained on screen until participants pressed the space bar, initiating the presentation sequence. When the sequence ended, the object names appeared again, and participants were asked to decide which object was presented by pressing the left or the right arrow keys. After making a choice, participants were asked to rate the extent to which they saw the scene on a Perceptual Awareness Scale (PAS; Ramsøy & Overgaard, 2004), in which 1 indicated "I saw nothing", 2 indicated "I had a vague perception of something but don't know what it was", 3 indicated "I saw a clear part of the scene", and 4 indicated "I saw the entire image clearly". PAS responses were given using the number keys.

Post-test
At the end of the experiment in the unconscious condition, participants performed an additional post-test phase to further assess the visibility of the masked stimuli (adopted from Jiang et al., 2009). In the post-test task, participants were presented with two consecutive sequences that were similar to the sequences presented in the main task (Blank–Mask–Scene–Mask–Blank, with an object overlaid on the latter three; Supplementary Fig. S1). One of the sequences contained a real scene, as in the main task, and the other contained a scrambled mask instead (both with an ambiguous object on top of them, as in the main task). For each such pair of sequences, participants were asked to determine whether the first or the second sequence contained the real scene. Then, they were asked to rate scene visibility using the PAS, as they had done in the main experiment.
cortex 173 (2024) 49–60

The number of trials in the post-test varied between participants based on their performance. In order to have sufficient trials to assess objective performance under each participant's subjective report of no visibility (PAS of 1), we aimed to terminate the post-test when the goal of forty 1-visibility ratings was reached both in scene-first and in scrambled-first trials, or when a pre-determined maximum of 240 trials was reached. However, due to a bug in our code, post-test termination was imprecise and the maximum was set to 228 trials. Participants who were eventually included in analysis ended up having M = 102.5, SD = 19.3 post-test trials (range: 80–160), out of which M = 41.6, SD = 5.9 trials were given a PAS rating of 1 in each condition (range: 30–55; see Participant exclusion).
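The intended stopping rule can be written as a single predicate. The sketch below encodes the rule as it is described in the text (forty PAS-1 ratings in each trial type, or a 240-trial cap), not the buggy behavior that actually ran; the function and argument names are hypothetical.

```python
def posttest_should_stop(n_trials, pas1_scene_first, pas1_scrambled_first,
                         target=40, max_trials=240):
    """Intended post-test stopping rule, as described in the text.

    n_trials             -- post-test trials completed so far
    pas1_scene_first     -- PAS-1 ratings collected on scene-first trials
    pas1_scrambled_first -- PAS-1 ratings collected on scrambled-first trials
    """
    goal_reached = (pas1_scene_first >= target
                    and pas1_scrambled_first >= target)
    return goal_reached or n_trials >= max_trials
```

Under this rule the shortest possible post-test is 80 trials (forty PAS-1 ratings of each type with no other responses), which agrees with the lower bound of the reported range.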

Trial exclusion
From among the participants included in the analysis, trials with reaction times (RTs) shorter than 250 msec or longer than 4 sec (4% of trials) were excluded. Also, trials in which the RT deviated by 3 standard deviations (SDs) or more from the participant's mean RT in each experimental condition (1.7% of trials) were excluded. Lastly, trials with a PAS rating higher than 1 (2.8% of trials) were also excluded from analysis. Overall, 8.5% of trials were removed from analysis due to meeting one or more of the trial exclusion criteria.

Fig. 1 – A) Pairs of objects with similar overall shape (e.g., a fish and a leaf) were used as stimuli. Objects were blurred (referred to as ambiguous objects), and presented either on top of their own congruent scene (e.g., a coral reef in the case of the fish), or on top of the other object's scene (e.g., a tree in the case of the fish). B) Designs of the three experiments. Each trial began with a presentation of the prospective options for the object identity and a fixation screen. Then, the trial sequence with the scene and the object was shown. Following each sequence, participants were required to select the identity of the object they had just seen given the two identity options, and then rate how well they saw the scene on an awareness scale. Within each trial sequence, scenes were presented briefly between a forward and a backward mask. In all experiments, the conscious and unconscious conditions differed in the order of the masks and the blanks which surround the scenes. Both conditions are depicted for Experiment 1, whereas only the unconscious conditions are depicted for Experiments 2 and 3 (there, the round blue arrows indicate screens whose order is swapped for creating the conscious conditions). Ambiguous objects were presented on top of different sequence screens for a duration allowing their conscious perception. In Experiment 1, the onset of the object coincided with the presentation of the scene. In Experiment 2, the object appeared after the scene had disappeared. In Experiment 3, the object appeared 130 msec before the scene appeared.
In addition, one pair of stimuli (baseball glove and muffin) used in the conscious condition was found to elicit extremely high classification accuracy (p < .01 after FDR correction for all pairs), indicating that it was not sufficiently ambiguous for the task. This pair was therefore removed from analysis of the conscious condition (5% of trials), and was not used when running the unconscious condition. Results of both conditions therefore pertain to 19 pairs of stimuli only (304 trials per participant).
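The three trial-level criteria in this section can be expressed as a small filter. The following is a sketch under stated assumptions: the dictionary keys and function name are hypothetical, the criteria are applied sequentially (the text does not say whether the SD criterion was computed before or after removing extreme RTs), and the PAS criterion is shown as it would apply to the unconscious condition.

```python
from collections import defaultdict
from statistics import mean, stdev


def exclude_trials(trials, rt_min=250, rt_max=4000, sd_cutoff=3.0, max_pas=1):
    """Return the trials that survive all three trial-level criteria.

    Each trial is a dict with keys 'participant', 'condition',
    'rt' (in msec) and 'pas' (1-4). These key names are illustrative.
    """
    # 1) Absolute RT bounds: drop RTs shorter than 250 msec or longer than 4 sec.
    kept = [t for t in trials if rt_min <= t["rt"] <= rt_max]

    # 2) Relative RT criterion: drop RTs 3 SDs or more away from the
    #    participant's mean within each experimental condition.
    cells = defaultdict(list)
    for t in kept:
        cells[(t["participant"], t["condition"])].append(t["rt"])
    cell_stats = {k: (mean(v), stdev(v)) for k, v in cells.items() if len(v) > 1}

    def is_outlier(t):
        key = (t["participant"], t["condition"])
        if key not in cell_stats:
            return False
        m, sd = cell_stats[key]
        return sd > 0 and abs(t["rt"] - m) >= sd_cutoff * sd

    kept = [t for t in kept if not is_outlier(t)]

    # 3) Visibility criterion: keep only trials rated PAS = 1.
    return [t for t in kept if t["pas"] <= max_pas]
```

Because a trial can meet more than one criterion, the overall exclusion rate need not equal the sum of the three individual rates.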

Classification analysis
Analysis followed Biderman et al. (2020). Our dependent variable was classification of the ambiguous object, coded as a binary 0 or 1 according to whether the classification matched the original (pre-blurred) identity of the object (i.e., when the ambiguous object was created from the chess pawn, classification as a chess pawn was coded "1" and classification as a stamp was coded "0"). The independent variables were object identity and scene identity, also binary.
Classification responses were modeled using a multilevel logistic regression, in which participant-specific and group-level coefficients are fit simultaneously. This approach allows estimating group-level effects while still accounting for individual differences (Gelman et al., 2013), and it can handle a different number of trials per participant, as in our data. A random intercept and random slopes for both independent variables and their interaction were included in the model, allowing the maximal random effects structure supported by the design (Equation 1; Barr et al., 2013).
A posterior distribution over coefficients was obtained using Markov chain Monte Carlo (MCMC) methods using Stan (Carpenter et al., 2017), via the rstanarm (Gabry & Goodrich, 2016) and brms (Bürkner, 2017) packages in R statistical software (R Core Team, 2021). Six MCMC chains were run, with 8000 iterations (and 3000 warm-up iterations) per chain, and weakly informative priors (Student's t distribution; M = 0, scale = 2.5, DF = 7) were used for the group-level intercept and coefficients. To interpret results, we computed the median and the 95% highest-density interval (HDI) of the posterior distribution of each group-level regression coefficient. A reliable effect was considered one in which the HDI excluded zero (Gelman et al., 2013; Kruschke, 2014).
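To make the decision rule concrete, the sketch below computes a 95% HDI from a set of posterior draws as the narrowest interval containing 95% of them, and then checks whether it excludes zero. It is an illustration only: the draws are simulated from a normal distribution as a stand-in for real MCMC output, the function names are hypothetical, and this is not the routine used by the R packages cited above.

```python
import math
import random


def hdi(samples, mass=0.95):
    """Narrowest interval containing `mass` of the samples (sample-based HDI)."""
    xs = sorted(samples)
    n = len(xs)
    k = max(1, math.ceil(mass * n))  # number of draws inside the interval
    # Slide a window of k consecutive sorted draws and keep the narrowest one.
    start = min(range(n - k + 1), key=lambda i: xs[i + k - 1] - xs[i])
    return xs[start], xs[start + k - 1]


def is_reliable(samples, mass=0.95):
    """Decision rule from the text: an effect is reliable if the HDI excludes zero."""
    lo, hi = hdi(samples, mass)
    return lo > 0 or hi < 0


# Simulated stand-ins for posterior draws of a coefficient (hypothetical values).
rng = random.Random(0)
null_like_draws = [rng.gauss(0.0, 1.0) for _ in range(20000)]
effect_like_draws = [rng.gauss(2.0, 0.3) for _ in range(20000)]
```

For a symmetric, unimodal posterior the HDI is close to the equal-tailed interval; the two diverge for skewed posteriors, where the HDI is shorter.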

Reaction time analysis
To model the nonnegative distribution of RTs, we used a similar multilevel regression model with the same independent variables (Equation 1), but with an ex-Gaussian response distribution instead of a logistic link function. All other implementation parameters were the same.
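For readers unfamiliar with the ex-Gaussian, it is the distribution of the sum of a normal and an independent exponential variable, which gives the right-skewed shape typical of RT data. The sketch below simulates it; the parameter values (mu, sigma, tau, in msec) are hypothetical and chosen only for illustration, not estimates from this study.

```python
import random


def sample_ex_gaussian(mu, sigma, tau, rng):
    """One ex-Gaussian draw: Normal(mu, sigma) plus Exponential with mean tau."""
    return rng.gauss(mu, sigma) + rng.expovariate(1.0 / tau)


rng = random.Random(1)
# Hypothetical parameters in msec, for illustration only.
simulated_rts = [sample_ex_gaussian(700, 80, 300, rng) for _ in range(20000)]

sample_mean = sum(simulated_rts) / len(simulated_rts)        # close to mu + tau
sample_median = sorted(simulated_rts)[len(simulated_rts) // 2]  # below the mean
```

The mean of the distribution is mu + tau, and the long exponential tail pulls the mean above the median.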

Scene visibility
As expected, scenes were seen in most trials of the conscious condition (3.71%, 8.15%, 25.2% and 62.94% for PAS values 1 to 4, respectively), and unseen in the vast majority of the trials of the unconscious condition (96.54%, 3.14%, .28% and .04% for PAS values 1 to 4, respectively).

Ambiguous object classification
In the conscious condition, ambiguous objects were predominantly given the classification that matched the context in which they were embedded (M = 66.7%, SD = 13.5%; logistic regression: intercept = −. Overall, participants were quicker to perform classification in the unconscious group compared to the conscious

Fig. 2 – Individual average classification matching the scene and original identity in Experiment 1. Classifications of ambiguous objects were biased towards the interpretation promoted by a disambiguating scene when that scene was presented consciously, but not when it was presented unconsciously. When the scene was presented unconsciously, the ambiguous objects' original pre-blurred identity affected classification. White dots indicate mean value, gray boxes indicate SEM. These conventions will be used in Figs. 3 and 4.

Post-test visibility
The distribution of PAS ratings during post-test indicated that the context was indeed unseen in most trials of this phase as well (83.4%, 13.8%, 2.1% and .7%, for PAS of 1–4 respectively). However, results suggested that the presented contexts were not completely suppressed. Considering only trials in which a PAS visibility rating of 1 was given, objective performance was slightly above chance level [M = 53.78%, SD = 5.75%; t(33) = 3.84, p < .001], and the average sensitivity index (d′) was above zero [M = .22, SD = .30; t(33) = 4.26, p < .001].
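A sensitivity index of this kind can be computed from response counts as the difference between z-transformed hit and false-alarm rates. The sketch below is one common formulation and rests on assumptions not stated in the text: "first" responses on scene-first trials are treated as hits, "first" responses on scrambled-first trials as false alarms, and a log-linear correction is applied to avoid rates of 0 or 1. The exact formula used in the study may differ.

```python
from statistics import NormalDist


def d_prime(hits, n_signal, false_alarms, n_noise):
    """d' = z(hit rate) - z(false-alarm rate), with a log-linear correction.

    hits         -- 'first' responses on scene-first trials (assumed coding)
    n_signal     -- number of scene-first trials
    false_alarms -- 'first' responses on scrambled-first trials
    n_noise      -- number of scrambled-first trials
    """
    # Assumption: log-linear correction keeps both rates strictly inside (0, 1).
    hit_rate = (hits + 0.5) / (n_signal + 1)
    fa_rate = (false_alarms + 0.5) / (n_noise + 1)
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return z(hit_rate) - z(fa_rate)


chance_level = d_prime(hits=20, n_signal=40, false_alarms=20, n_noise=40)
above_chance = d_prime(hits=30, n_signal=40, false_alarms=10, n_noise=40)
```

A participant responding at chance has equal hit and false-alarm rates and therefore a d′ of zero, whatever their bias towards one response.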

Discussion
When perceived consciously, scenes affected object recognition, biasing classification of the objects towards the meaning congruent with them. This is in line with literature on the role of top-down processes in perception in general (Bar, 2004; Palmer, 1975), and in object recognition in particular (Biederman et al., 1982; Brandman & Peelen, 2017; Henderson & Hollingworth, 1999; Truman & Mudrik, 2018). Notably, the effect was restricted to classification judgments, with no difference in response speed. When scenes were presented unconsciously, however, results were different. Context had no effect on object recognition. Instead, the original identity of the object stimuli was found to influence recognition. Arguably, in the absence of visible scene information, participants may have focused more on the visible object, allowing them to extract more information about its original identity and perform better in the main task. Interestingly, post-test results showed that for trials where participants reported not seeing the stimulus, they were nevertheless slightly above chance in detecting in which of two sequences the scene appeared. Although this result does not affect the main conclusion of this experiment, as in any case we did not find any effect of the invisible scenes, it does relate to an ongoing discussion about the best approaches to measure consciousness: subjective measures that rely on introspective report versus objective ones that are based on performance (Hesselmann et al., 2011). It is not uncommon for these two measures to dissociate; above-chance performance found in subjectively unconscious trials is typically referred to as a blindsight-like phenomenon (Azzopardi & Cowey, 1997; Cowey, 2004; Meeres & Graves, 1990; Weiskrantz et al., 1974). In the brain, objective awareness has been linked to low-level activity, while subjective awareness has been linked to activation of higher-level areas specializing in stimulus processing (Stein et al., 2021).
Taken together, the results imply that consciousness might be needed for top-down contextual effects on object recognition. That is, scenes that strongly bias perception when consciously perceived exert no such effect when invisible. However, the lack of a context effect might reflect a failure to simultaneously integrate, and perhaps even process, a consciously perceived object with an unconsciously perceived scene, rather than a failure of top-down processes per se. Such a failure would provide an alternative explanation for the current findings.
To examine this option, Experiment 2 decoupled the scene and the object, to allow top-down effects without the need to simultaneously integrate the visible object and the invisible scene. If indeed the null result reflects a failure to integrate, rather than a failure of top-down processes to bias perception in the absence of awareness, we should be able to obtain a context effect by presenting the invisible scene prior to the ambiguous visible object.

Methods
The procedure and analysis methods closely followed those of Experiment 1, apart from the differences described below.

Participants
Here, five participants were excluded: two for having fewer than thirty usable trials in at least one of the four experimental conditions, and three who reached the maximum number of post-test trials with an insufficient number of invisible trials. The final sample included 12 participants in the conscious condition (10 females, 9 right-handed, mean age 22.8, SD = 1.72), and 33 participants in the unconscious one (24 females, 28 right-handed, mean age 24, SD = 2.9). The experiment was fully preregistered (https://osf.io/9nv6m/).

Procedure
In Experiment 2, the ambiguous object was not presented on top of the masked scene. Therefore, the first time that the object appeared in a trial was during the backward mask, and its total duration on screen (including the backward mask and subsequent blank) was 150 msec (Fig. 1B middle panel). The masked scene was presented for 33 msec as before. Only 10 stimulus pairs were used in this study, since we were not able to remove the object from the other scenes in a way that did not introduce visual distortions onto the scenes. Therefore, each of the four versions of each pair was now repeated eight times during the experiment, with 320 trials per participant.

Post-test
The post-test phase in Experiment 2 was terminated if participants reached a maximum of 240 trials. Participants in the unconscious condition who were eventually included in analysis underwent M = 109.2, SD = 26 post-test trials (range: 80–177), of which M = 41.3, SD = 5 trials were given a PAS rating of 1 in each condition (range: 31–52).

Trial exclusion
With the same trial exclusion criteria as in Experiment 1, 3.9% of trials had extreme RTs below 250 msec or above 4 sec, 1.7% of trials had an RT deviating by 3 SDs or more from the participant's mean, and 15.2% had a visibility rating higher than 1. Together, these amounted to 23.6% of trials that were excluded from analysis for at least one of these reasons.

Discussion
The results of Experiment 2 replicate those of Experiment 1, demonstrating the importance of top-down processing in object recognition, and suggesting that these effects depend on conscious awareness of the scene. This latter finding negates the hypothesis that the null result in the unconscious condition in Experiment 1 stemmed from a failure to simultaneously integrate the invisible scene and the visible object. Hence, Experiment 2 strengthens the claim that consciousness may indeed be essential for top-down scene-based contextual effects on object classification.
Yet before making such a strong claim, another possible explanation should be tested, focusing on the temporal dynamics of top-down processes. According to the matching model of object-scene processing (Bar, 2004), early visual areas send initial information based on low spatial frequencies of an observed object to the orbitofrontal cortex (OFC), which generates informed guesses about the possible identity of the object. These predictions are activated and back-propagated to inferior temporal areas (IT). There, they are integrated with upcoming information based on high spatial frequencies, so the correct identity of the object is determined (Trapp & Bar, 2015). Critically, this process is held to occur about 130 msec after the stimulus was presented (Tal & Bar, 2014), and the integration in IT is assumed to peak at 180 msec. Based on this model, therefore, top-down effects should be strongest after the guesses about object identity have been activated. It could thus be that the null results in Experiments 1 and 2 resulted from the temporal order of stimuli, which were not presented at the most ideal timing for top-down processes to exert their effect.

Fig. 3 – Individual average classification matching the scene and original identity in Experiment 2. When the disambiguating scene was presented before the ambiguous object, it again biased classifications when it was presented consciously, but not when it was presented unconsciously. The original identity of the ambiguous objects had no effect on classification in this study.
Thus, we conducted an additional experiment in which the ambiguous object was presented early enough to generate predictions in OFC about the object's identity, which could then arguably be resolved upon the presentation of the scene. The ambiguous object was accordingly presented 130 msec before the scene. If the null effects in Experiments 1 and 2 stemmed from insufficient time to form object-based hypotheses of identity, a context effect should be found in Experiment 3.

Methods
Again, the methods followed closely those of Experiment 1, and the differences are described below.

Participants
In this experiment, twelve participants were excluded: eight participants had fewer than thirty usable trials in at least one of the four experimental conditions and four were excluded for technical difficulties. The final sample included 12 participants in the conscious condition (7 females, 10 right-handed, mean age 27.7, SD = 4.5) and 34 participants in the unconscious one (26 females, 29 right-handed, mean age 26.5, SD = 4.0). Like the other two experiments, Experiment 3 was also preregistered (https://osf.io/bx5z7/).

Procedure
In Experiment 3, presentation of the ambiguous object began before the presentation of the masked scene. The object appeared on top of the blank screen before the forward mask (80 msec), and then remained on screen during the forward mask (50 msec), the scene (30 msec), the backward mask (50 msec) and the last blank (80 msec; Fig. 1B, right panel). It was presented, therefore, for a total of 290 msec, of which 130 msec were before the scene was presented. Notably, in order to achieve 130 msec of object presentation prior to the scene, a screen refresh rate of 100 Hz was used in Experiment 3, and the masked scene was presented for 30 msec, slightly shorter than in Experiments 1 and 2. All 19 stimulus pairs used in Experiment 1 were used in Experiment 3 as well, amounting to 304 trials per participant.
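The link between refresh rate and achievable durations can be illustrated with a short calculation: a stimulus can only be shown for a whole number of screen refreshes. The sketch below checks which of the durations mentioned in the text fit the frame grid at 60 Hz and at 100 Hz. The function names and the tolerance are hypothetical choices made for this illustration.

```python
def frames_needed(duration_ms, refresh_hz):
    """Number of screen refreshes spanned by a duration (may be fractional)."""
    return duration_ms * refresh_hz / 1000


def fits_frame_grid(duration_ms, refresh_hz, tolerance=0.05):
    """True if the duration is (nearly) a whole number of refreshes.

    The tolerance, in frames, is an illustrative choice: it accepts 33 msec
    as two frames at 60 Hz, where one frame lasts about 16.7 msec.
    """
    n = frames_needed(duration_ms, refresh_hz)
    return abs(n - round(n)) <= tolerance


# Durations used in Experiments 1 and 2 (60 Hz) and in Experiment 3 (100 Hz).
at_60_hz = {ms: fits_frame_grid(ms, 60) for ms in (33, 50, 100, 80, 130)}
at_100_hz = {ms: fits_frame_grid(ms, 100) for ms in (30, 50, 80, 130, 33)}

# Total object duration in Experiment 3: blank + mask + scene + mask + blank.
object_total_ms = 80 + 50 + 30 + 50 + 80
object_before_scene_ms = 80 + 50
```

At 60 Hz the 33, 50 and 100 msec durations fall on the frame grid, whereas 80 and 130 msec do not; at 100 Hz the 30, 50, 80 and 130 msec durations all do, while 33 msec no longer does, which is consistent with the shortened 30 msec scene.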

Post-test
The post-test phase in Experiment 3 was terminated if participants reached a maximum of 228 trials. Participants in the unconscious condition who were eventually included in analysis underwent M = 91.5, SD = 14.9 post-test trials (range: 80–143), of which M = 41.1, SD = 4 trials were given a PAS rating of 1 in each condition (range: 30–50).

Trial exclusion
Examining trial exclusion criteria, 6.1% of trials had an RT below 250 msec or above 4 sec, 1.7% had an RT deviating by 3 SDs or more within condition and within participant, and 4.4% had a visibility rating higher than 1. Overall, 25.4% of trials met one or more exclusion criteria and were removed from analysis.
As in Experiment 1, classification RT of the unconscious group was quicker than that of the conscious group [t(44) = 3.66, p < .001; M = 926.1, SD = 321.3 msec vs M = 1310.3, SD = 285.4 msec, respectively]. Context and object identity did not affect the RTs of the conscious group (logistic regression:

Fig. 4 – Individual average classification matching the scene and original identity in Experiment 3. When the ambiguous object was presented before the disambiguating scene, classifications were biased towards the consciously processed scene, but not the unconsciously processed one. Object identity affected classification both in the conscious and in the unconscious condition.

Discussion
As in Experiments 1 and 2, a clear contrast between conscious and unconscious processing was observed: while top-down, contextual effects were evoked by consciously processed scenes, these processes failed to affect performance for unconsciously processed scenes. This excludes yet another alternative explanation that could have accounted for the results: that insufficient time had been given for object-based associations to form, according to the model proposed by Tal and Bar (2014). Finding no contextual effect in the unconscious condition supports the hypothesis that consciousness plays an essential role in top-down contextual effects on object recognition.

Aggregated analysis
Finally, to better estimate the evidence for the context effect in the conscious condition and the null results in the unconscious condition, we conducted an exploratory analysis in which we pooled data from all three experiments into a single analysis. This allowed us to examine results with higher statistical power. The same analysis used in each experiment was applied to the aggregated data of the conscious conditions and the aggregated data of the unconscious conditions of the three experiments. In addition, following a suggestion by one of the reviewers, classification in the conscious conditions was examined as a function of subjective visibility, providing a within-subject test that complements the between-subject results obtained in this work. Over all experiments, classification in the conscious condition was biased by the scenes in which objects were embedded (logistic regression: intercept = −1.04, 95% HDI = [−1.34, −.75]; context coefficient = 1.67, 95% HDI = [1.11, 2.23]) and also by object identity (identity coefficient = .22, 95% HDI = [.02, .41]). No interaction was found between the two factors (interaction coefficient = −.02, 95% HDI = [−.20, .15]). Most importantly, even with the triple-sized sample, in the unconscious condition, context had no effect on classification (logistic regression: intercept = −.27, 95% HDI = [−.38, −.15]; context coefficient = .04, 95% HDI = [−.03, .11]), while object identity did (identity coefficient = .33, 95% HDI = [.22, .44]). No interaction was found between the two (interaction coefficient = .01, 95% HDI = [−.09, .11]). In addition, neither context, identity nor their interaction was found to affect classification RT in either the conscious or the unconscious condition.
Lastly, subjective visibility reports (PAS) in the conscious condition were found to modulate the effect of scenes on classification, such that higher ratings of visibility yielded greater conformity with the presented scenes [F(2.3, 62) = 14.53, p < .001, ηp² = .35; Fig. S2]. Follow-up analysis revealed that the contextual scene effect was abolished in trials rated as invisible (PAS = 1), wherein scene-conformity did not differ significantly from chance [M = 54.36%, SD = 21.08%; t(29) = 1.13, p = .266]. A similar modulation by visibility was not found with respect to the original identity of the objects [F(2.5, 66.4) = .75, p = .501, ηp² = .027]. These results further strengthen our claim that conscious processing of scenes is needed for a top-down effect of scenes on object recognition.
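To help read the aggregated context coefficients, the sketch below converts a logistic regression coefficient (on the log-odds scale) into an odds ratio and checks whether a 95% HDI excludes zero. The coefficients and intervals are those reported above; the helper names are hypothetical. How the predictors were coded is not stated in this section, so the odds ratio is per one-unit change in whatever coding was used, and the "HDI excludes zero" check is a common heuristic rather than necessarily the decision rule applied in the study.

```python
import math


def odds_ratio(log_odds_coefficient):
    """Odds ratio for a one-unit change in the predictor."""
    return math.exp(log_odds_coefficient)


def hdi_excludes_zero(lower, upper):
    """True if the whole interval lies on one side of zero."""
    return lower > 0 or upper < 0


# Aggregated context coefficients and 95% HDIs as reported in the text.
reported = {
    "conscious": {"coef": 1.67, "hdi": (1.11, 2.23)},
    "unconscious": {"coef": 0.04, "hdi": (-0.03, 0.11)},
}

for condition, r in reported.items():
    print(condition, round(odds_ratio(r["coef"]), 2), hdi_excludes_zero(*r["hdi"]))
```

Read this way, the conscious-condition coefficient corresponds to a several-fold change in the odds of a scene-consistent classification, whereas the unconscious-condition coefficient corresponds to an odds ratio close to 1, with an interval that spans zero.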

General discussion
We report a series of three studies examining the role of consciousness in top-down contextual effects on object recognition. Ambiguous objects were embedded in scenes that disambiguate them, which were either suppressed from conscious awareness or not, under three different temporal conditions: the object and scene appearing concurrently (Experiment 1), the scene appearing before the object (Experiment 2), and the object appearing before the scene (Experiment 3). Under all conditions, scene-based contextual effects were found when the scenes were consciously perceived, but were abolished when they were processed unconsciously. Together, our results strongly support a fundamental role for consciousness in top-down scene-based contextual effects.
Our findings are in line with previous studies suggesting that consciousness is essential for semantic integration and top-down processes (Mudrik et al., 2014). Most relevant is the work by Biderman et al. (2020), where invisible symbolic contextual inducers affected the classification of an ambiguous object, whereas invisible lexical contexts did not. This suggests that higher-level top-down effects might indeed require conscious processing. Notably though, our null result may simply reflect that scenes cannot be unconsciously processed; this interpretation is in line with previous studies that failed to find effects of congruency processing without awareness (e.g., Biderman & Mudrik, 2018; Faivre et al., 2019). It also accords with the lack of a motion aftereffect for invisible scenes (Faivre & Koch, 2014), and a lack of semantic priming for unseen pictures (Stein et al., 2020). Thus, our results emphasize the limited nature of unconscious processing (Hesselmann & Moors, 2015; Moors et al., 2017, 2019; Newell & Shanks, 2014; Peters & Lau, 2015), and the possible functions of conscious processing: either in allowing for the processing of complex stimuli like real-life scenes and, more precisely, gist extraction (but see Furtak et al., 2022), or in evoking top-down processes that facilitate and constrain object recognition.
Importantly, we are not making a claim against unconscious semantic processing in general. There is ample, albeit debated, evidence in the literature for such processing, either manifested in behavioral effects (e.g., semantic priming) or in neural activity (for review, see Kouider & Dehaene, 2007; Mudrik & Deouell, 2022). Our experiments examined the capacity of unconsciously processed scenes to exert top-down influences on perception. As scenes are richer and more complex than verbal stimuli, they might require deeper processing, which in turn might require conscious processing (Biderman & Mudrik, 2018; Faivre & Koch, 2014). Our results therefore highlight the limits of unconscious processing in exerting top-down, contextual influences on perception.
Making such a claim based on a null result should be done with caution; one has to demonstrate that the null finding is convincing and cannot be explained away by alternative hypotheses. Here, we replicated the null result in three different experiments, adopting a Bayesian approach which allows us to estimate the likelihood of the null model given the data, while excluding two alternative explanations: difficulty in performing an online, simultaneous integration between the visible object and the invisible scene, and insufficient presentation time of the stimulus, rendering the timing less ideal for top-down effects to be found. Disjoining the presentation of the object and the scene during trials allowed us to address both hypotheses. The claim that top-down scene-based contextual effects do not occur without awareness is strengthened by the absence of contextual effects under these conditions, their absence in an aggregated analysis of all three experiments pooled together, and the within-subject analysis, based on subjective visibility, which yielded similar results.
Our findings accord with theories of consciousness that tie consciousness with specific functions, such as integration for the 'Global Neuronal Workspace' (GNW; Dehaene & Naccache, 2001; Mashour et al., 2020) and high-level perceptual organization for the recurrent processing theory (Lamme, 2020; Lamme & Roelfsema, 2000). According to GNW, the first stage of processing is expressed in bottom-up processes carried out automatically and involving sensory brain areas, mostly in a feedforward manner. The second stage, expressed in a global broadcasting of information, involves conscious processing, and allows feedback from higher areas as well as long-range integration. Thus, the dependency of top-down processes on conscious processing, found in this study, is in line with this theory.
Interestingly, when the scenes were invisible, the properties of the objects (which were visible) were more dominant in determining object recognition, while this was not the case when the scenes were visible. This contrasts with Biderman et al. (2020), in which an effect of the object itself was found both under conscious and under unconscious context conditions. A critical difference between the two studies is that the properties of the stimuli were more pronounced in that study than in the current one, in which the objects were blurred. Thus, it might be that in this study, the strong difference in stimulus reliability between object and scene in the conscious condition abolished the effect of the object. That is, the blurred object was substantially devalued compared with the intact scene, thereby exerting no effect on performance. Indeed, studies of multisensory integration suggest that sensory channels associated with less noise gain precedence over more noisy channels in determining perception (Ernst & Banks, 2002). For instance, Alais and Burr (2004) have found that whereas vision typically dominates over sound (i.e., the ventriloquist effect), this effect is reversed when vision is blurred and thus degraded. Our findings may result from a similar process occurring between two visual cues. In the conscious condition, clear scenes were granted dominance in perception over blurred objects, hence context, but not identity, biased recognition. In the unconscious condition, on the other hand, the scenes were unseen, and the blurred objects were experienced as appearing on top of scrambled masks. In that situation, they were deemed more informative than the scrambled masks, and were accordingly granted dominance, leading to identity, but not context, biasing recognition.
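The reliability argument above can be illustrated with the standard maximum-likelihood cue-combination rule, in which each cue is weighted by its inverse variance. This is a minimal sketch of that textbook rule only: it is not a model fitted in this study, and the function names and noise values are hypothetical choices for illustration.

```python
def reliability_weights(sigma_a, sigma_b):
    """Inverse-variance weights for two cues with noise SDs sigma_a and sigma_b."""
    if sigma_a <= 0 or sigma_b <= 0:
        raise ValueError("noise standard deviations must be positive")
    rel_a = 1 / sigma_a ** 2   # reliability of cue A
    rel_b = 1 / sigma_b ** 2   # reliability of cue B
    total = rel_a + rel_b
    return rel_a / total, rel_b / total


def combine(estimate_a, sigma_a, estimate_b, sigma_b):
    """Reliability-weighted average of two single-cue estimates."""
    w_a, w_b = reliability_weights(sigma_a, sigma_b)
    return w_a * estimate_a + w_b * estimate_b


# Hypothetical noise levels: a clear cue (SD 1) paired with a degraded cue (SD 3).
w_clear, w_degraded = reliability_weights(1.0, 3.0)
print(w_clear, w_degraded)
```

Under this rule, tripling the noise of one cue reduces its weight to about a tenth, which mirrors the proposed account: a clear scene would dominate a blurred object, while a blurred object would dominate an uninformative scrambled mask.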
Another interesting aspect of our results relates to the dissociation between objective and subjective measures of consciousness, which was found in all three experiments: though participants reported not seeing the stimuli (subjective measure), they showed reliable, albeit weak, above-chance performance in discriminating between them (objective measure). This can be taken as further corroboration for the role of consciousness in top-down perception, since even under subjective-but-not-objective invisibility a contextual effect was not found, despite being strong in the conscious condition. Thus, whether above-chance performance stemmed from residual conscious processing (Gelbard-Sagiv et al., 2016; Kouider & Dupoux, 2004), or from blindsight-like processing (Azzopardi & Cowey, 1997; Cowey, 2004; Meeres & Graves, 1990; Weiskrantz et al., 1974), it was not potent enough to generate a contextual effect.
Irrespective of the question of consciousness, the contextual effects we found in the conscious condition strengthen the contextual facilitation model for object recognition (Bar, 2004). According to this model, predictions are rapidly derived from contextual information and are sent back to the IT cortex, activating object representations and facilitating recognition. As in Brandman and Peelen (2017) (see also Truman & Mudrik, 2018), we find that context facilitates recognition of ambiguous objects.
To conclude, we demonstrate that a disambiguating scene biased recognition of an ambiguous object when the scene was consciously processed, but not when it was unconsciously processed. In the absence of conscious awareness of the scene, top-down contextual modulation was not observed, and instead bottom-up modulation by object properties was found. Our findings accordingly suggest that consciousness may play an essential role in top-down contextual effects and in integrating objects with scenes.

Open practices
The study in this article earned Open Data, Open Material and Preregistered badges for transparent practices. The data and materials used in this study are available at https://osf.io/afzre/ and the preregistered studies at: https://osf.io/ayqcz/, https://osf.io/9nv6m/ and https://osf.io/bx5z7/.