Is auditory awareness graded or dichotomous: Electrophysiological correlates of consciousness at different depths of stimulus processing

The level-of-processing (LoP) hypothesis postulates that the transition from being unaware to being aware of visual stimuli is either graded or dichotomous depending on the depth of stimulus processing. Humans can be progressively aware of low-level features, such as colors or shapes, while high-level features, such as semantic category, enter consciousness in an all-or-none fashion. Unlike visual stimuli, sounds always unfold in time, which might require mechanisms dissimilar from visual processing. We tested the LoP hypothesis in hearing for the first time by presenting participants with words of different categories, spoken in different pitches near the perceptual threshold. We also assessed whether different electrophysiological correlates of consciousness, the auditory awareness negativity (AAN) and late positivity (LP), were associated with LoP. Our findings indicate that LoP also applies to the auditory modality. AAN is an early correlate of awareness independent of LoP, while LP was modulated by awareness, performance accuracy and the level of processing.


Introduction
Do we perceive a weak sound with gradual clarity or in an all-or-none manner? Windey and Cleeremans (2015) proposed the level of processing (LoP) hypothesis, which postulates that the transition from unaware to aware depends on the complexity of perceptual processing and on the selected set of stimulus features. At the low level of processing, which handles features such as shape or color that are hard to associate with a single label, awareness is graded. At the high level of processing, which involves semantic evaluation and determining an object's category, awareness is dichotomous. The LoP hypothesis was proposed for vision, and some of its predictions have been confirmed for the visual modality (Windey & Cleeremans, 2013; Derda et al., 2019; Jimenez et al., 2018, 2020, 2021).
In contrast to the visual stimuli used in most experiments on awareness, which are static and very briefly presented, sounds always unfold in time, and this structural difference may imply differences in perceptual mechanisms. Although auditory perception has been described by various dual-level processing frameworks, including separate steps for auditory stream and object processing (Bregman et al., 1990; Marinato et al., 2019; Shamma, 2008) or for acoustic cues and phonetic categories in speech (Pisoni, 1973; Toscano et al., 2010), it remains unknown whether the mechanisms of auditory consciousness follow a similar dual-level logic. Even if so, levels of processing in hearing might differ from those in vision: some studies on auditory perception show graded categorization of speech stimuli (Toscano et al., 2010), which, if also true for auditory consciousness, could contradict LoP, which associates categorization with a dichotomous awareness of the categorical information. There are alternative ways in which they could be related: awareness of both high- and low-level features could be graded due to its temporal extension, or alternatively, it could be graded for the auditory streams of low-level features, while dichotomous at the level of physically holistic auditory objects. In the present study, we investigate these alternatives by testing how LoP is related to auditory awareness of speech stimuli.
The relationship between the different levels of processing and the various neural correlates of consciousness (NCC) in hearing has not been empirically investigated. When it comes to the electrophysiological correlates of consciousness in vision, the concept of "phenomenal consciousness" has been linked with the visual awareness negativity (VAN), whereas the concept of "access" or "reflective" consciousness has been linked with the late positivity (LP) (Förster et al., 2021; Koivisto & Revonsuo, 2010). It should be noted, however, that the conceptual distinction between "phenomenal" and "access" consciousness (Block, 1995; Revonsuo, 2010) is still a subject of debate (Cohen & Dennett, 2011; Naccache, 2018). As originally framed by Block (1995), phenomenal consciousness refers to subjective experience in the Nagelian what-it-is-likeness sense (Nagel, 1974), while access consciousness begins when the phenomenal experience becomes available for cognition. A fundamental debate surrounds both the phenomenal versus access distinction itself, questioning whether there are truly two types or stages of consciousness (Phillips, 2018; Naccache, 2018; Cohen & Dennett, 2011; Amir et al., 2023), and the degree of dissociation and overlap between them in case they are reliably distinct. Another dimension of this debate concerns the overflow argument, which asserts that phenomenal consciousness is richer than the capacity for access (Block, 2011, 2014; Brown, 2012; Cohen, Dennett, & Kanwisher, 2016; Aru & Bachmann, 2017), evidence for which was already proposed by Sperling (1960). Some other arguments for the phenomenal vs access distinction also arise from research on dreaming (Sebastian, 2014; Crespin, 2015). To date, one empirical study by Amir et al. (2023) has reported a dissociation between phenomenal and access consciousness, in which participants were able to retrospectively report changes in a background pink noise, presented in a mixture of target and nontarget sounds, while lacking immediate access to it.
The theoretical debate concerning phenomenal and access consciousness could be influenced by what the empirical relationship between NCC and LoP turns out to be. If PAN represents phenomenal consciousness (P) and is modulated by awareness only at the low level of processing, while LP represents access/reflective consciousness (A) and is modulated by awareness solely at the higher level of processing, the dissociation would further support the conceptual distinction between "phenomenal" and "access". The former would be tied to sensory qualia, whereas the latter would be tied to the conceptual or categorical cognition of the stimulus. Otherwise, if one still rejected the P-A distinction, following the GNWT line of argument, low-level awareness (such as of color or pitch) would have to be treated as preconscious or non-conscious: if PAN, underlying the lower level of processing, is treated as preconscious in the absence of LP, one has to admit that the entire lower level would be preconscious as well.
Indirect evidence for a double dissociation between NCC under different levels of processing can be found outside consciousness research, particularly in studies on auditory perception. For instance, Toscano et al. (2010) reported that the amplitudes of the N1 ERP component were influenced by acoustic differences in low-level stimulus properties, while P3 amplitudes were affected by the phonological categories. Given that LP is associated with the P3 time window, one can speculate about a similar link between LP and a higher level of processing, which should, however, be rigorously investigated. One study on LoP and electrophysiological NCC in vision reported a double dissociation, where VAN was modulated by awareness rating only in the low-level task and LP changed during the high-level task (Jimenez et al., 2021), attributing these components to different levels of processing. Another study by Koivisto et al. (2017) reported that VAN was influenced by stimulus detection, but not identification. On the other hand, Wiens et al. (2023) used Bayesian linear mixed effects models and found extreme evidence that both VAN and LP were stronger and therefore more sensitive for detection compared to identification in near-threshold stimuli with separate detection and identification thresholds. A connection between the Koivisto et al. (2017) and Wiens et al. (2023) studies and LoP could be made, considering that detection and identification at least partially correspond to the low and high levels of processing, respectively. It should be noted, however, that various studies (including Wiens et al., 2023; Derda et al., 2019) have not found a double dissociation between ERP markers pertaining to different LoP and that the evidence is still controversial (see Jimenez et al., 2020 for a review).
D. Filimonov et al.
To investigate whether auditory awareness is graded or dichotomous at the different levels of processing and how NCC relate to each level, we implemented the LoP paradigm using two animal and two object words spoken in four different pitches. Participants were asked to evaluate either the word category (high-level task) or the pitch height (low-level task) in different blocks. Previous studies on LoP suggest that the most distant levels of processing, such as detection vs semantics, can vary radically in the processing or stimulus presentation time that is sufficient for awareness, and can also influence the steepness of the psychometric function (Jimenez et al., 2020). To mitigate these effects and avoid radical criterion shifts during awareness ratings between the tasks, which would confound the results, we chose pitch discrimination rather than stimulus detection as the low-level task.
In alignment with the cumulative evidence from the visual modality and studies on auditory NCC, we assume that lower-level awareness will be more graded than higher-level awareness at the behavioral level. In the LoP paradigm, the proportion of intermediate awareness ratings on the awareness scale and/or the number of correct trials at different scale levels are compared between the tasks (Jimenez et al., 2020). A greater number of intermediate ratings or a linear increase of correct trials would indicate graded awareness and is expected for the low-level compared to the high-level task. The LoP effect is usually observable in the intermediate ratings, since in clearly aware trials accuracy will be similarly high, while unaware trials will result in similar near-chance accuracy. In addition, LP will be linked to access/reflective consciousness and will be modulated by awareness rating level, accuracy and levels of processing. This type of consciousness will have cognitively more complex contents during the high level of processing, in the sense that the experience of the stimulus including an awareness of its category has more conceptual complexity than an experience without any awareness of the stimulus category.
As we tend to think that both levels of processing possess phenomenal properties and a recent study by Wiens et al. (2023) reported VAN and LP at both levels of processing (LP was more prominent at the higher level), a double dissociation between NCCs specifying different levels is unlikely. Therefore, we predicted that awareness rating level will modulate the ERPs in the AAN time window for both tasks, because the higher level depends on lower-level processing results but not necessarily the other way around, while the LP time window will be modulated by awareness, level of processing and response accuracy. Accuracy was preregistered as a separate predictor to further support evidence that the LP is task-specific. As evidence exists that LP can be divided into two parts with distinct functional profiles (Filimonov et al., 2022), we separately analyzed its early and late components and predicted that the early LP will be mostly modulated by awareness rating level and task accuracy, while the late LP will be affected by awareness and the level of stimulus processing.

Participants
We preregistered (https://osf.io/rgack) a minimum sample size of twenty-five participants. In total, thirty-seven healthy right-handed participants (age: M = 24.3, SD = 3.95) were recruited from the Turku area. Before the experiment, they gave informed consent in accordance with the Declaration of Helsinki. The study was approved by the Ethics Committee for Human Sciences at the University of Turku. All participants reported normal or corrected-to-normal vision and normal hearing. The preregistered exclusion criteria included fewer than 15 trials in any of the conditions of interest and noisy EEG data. In addition to the preregistered criteria, failure to calibrate individual auditory thresholds within a 35 %-65 % detection rate was a reason for exclusion. Decisions on noisy data were made by viewing the raw EEG files before ERP computations: in case of excessive noise in one third of the electrodes that persisted for two thirds of the recording period, the data were considered bad. One participant's EEG file was corrupted, two datasets contained excessive noise in most of the electrodes throughout the recording, four participants were not able to calibrate individual thresholds and five had zero trials in some of the conditions of interest. The resulting sample was N = 25.

Fig. 1. Trial structure. A trial started with a random 500-1000 ms blank interval, followed by a 200 ms fixation cross, after which the masker noise played for 2000 ms and a stimulus was presented for 350 ms in a 420 to 1580 ms interval. The trial finished with a 200 ms blank interval before a perceptual awareness rating and either the high- or low-level task.

Stimuli
Four word stimuli (cat, elk, cup, oak) were generated in four different pitches with a free voice synthesizer (https://voicegeneratior.io/) using Google US English as a voice template. They were binaurally presented via in-ear earphones (Neuroscan, 10 Ω, ¼″ stereo) using PsychoPy (version 3.0.7) on a Windows 10-based computer. The pitch levels differed by 7.5 semitone steps from each other, or by 0.1 step on the voicegenerator.io scale in the 0.7 to 1.0 range. The length of the words was in the range of 310 to 340 ms. The computer's own volume was adjusted to 25 % for the participants' comfort. The stimuli were chosen from various object and animal words of similar lengths in a pilot study, and those which participants discriminated best were selected. Words were presented at a random time between 420 ms and 1580 ms after the onset of the masker stimulus, which was a 2000 ms white noise burst. Participants' responses were recorded with a Logitech gamepad (model F310).

Procedure
The experiment consisted of eight blocks of 100 trials each, including 80 target stimuli (each word at every pitch level was presented 5 times per block) and 20 catch (empty) trials. In half of the blocks, participants performed the high-level task, while in the other half they performed the low-level task; the task order was randomized. At the beginning of each block, the prompt for either the low- or the high-level task was presented: "In this block please assess the stimulus category (pitch): animal or object (high or low)".
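The block composition described above can be illustrated with a short sketch. It is only an illustration under stated assumptions: the function name `build_block`, the numeric pitch labels 1-4 and the shuffling of trial order within a block are hypothetical and are not taken from the experiment code.

```python
import random
from collections import Counter

WORDS = ["cat", "elk", "cup", "oak"]   # word stimuli named in the text
PITCH_LEVELS = [1, 2, 3, 4]            # hypothetical labels for the four pitches
REPEATS_PER_BLOCK = 5                  # each word x pitch combination, per block
N_CATCH = 20                           # empty (catch) trials per block


def build_block(seed=None):
    """Return one block of 100 trials: 80 targets plus 20 catch trials.

    Shuffling the order within the block is an assumption of this sketch.
    """
    rng = random.Random(seed)
    targets = [
        {"word": word, "pitch": pitch, "catch": False}
        for word in WORDS
        for pitch in PITCH_LEVELS
        for _ in range(REPEATS_PER_BLOCK)
    ]
    catches = [{"word": None, "pitch": None, "catch": True} for _ in range(N_CATCH)]
    trials = targets + catches
    rng.shuffle(trials)
    return trials


block = build_block(seed=1)
per_combination = Counter((t["word"], t["pitch"]) for t in block if not t["catch"])
print(len(block), len(per_combination))
```

With 4 words, 4 pitches and 5 repetitions, the target trials add up to 4 x 4 x 5 = 80, which together with the 20 catch trials gives the 100 trials per block.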
The trial structure is shown in Fig. 1. Each trial started with a random blank interval of 500 to 1000 ms, followed by a 200 ms fixation cross, after which a masker noise was played for 2000 ms. A stimulus was presented within the masker noise at a time between 420 and 1580 ms after masker onset. After the noise, a 200 ms blank interval was presented, after which participants were asked to rate their awareness on a perceptual awareness scale (PAS) (Ramsøy & Overgaard, 2004; Sandberg & Overgaard, 2015), which had four levels corresponding to whether they heard the stimulus clearly (PAS3), almost clearly (PAS2), weakly (PAS1) or did not hear it (PAS0). At the end of each trial, depending on the task block, either a high-level question ("Animal or object?" A − animal, B − object) or a low-level question ("High or low pitch?" A − high, B − low) was presented, and the response was given with the corresponding A and B buttons on the gamepad. In the high-level task participants decided whether the stimulus referred to an animal or an object, and in the low-level task they decided whether the pitch was high or low. The two lower pitch levels were considered low and the two upper levels high.
Before the actual experiment, participants underwent a preparation block, where they were familiarized with the stimuli and the tasks by listening to each word in all four pitches (16 trials in total). The words were presented without the masker noise and were clearly audible. Participants were instructed to discriminate the two higher pitches as high and the two lower pitches as low, as well as "cat" and "elk" as animals and "cup" and "oak" as objects. After the preparation block, the individual awareness threshold was calibrated by adjusting the masker volume with a "one down-two up" staircase procedure to achieve a 35 %-65 % awareness level (step sizes in linear units: 5, 5, 5, 4, 4, 4, 3, 3, 3, 3, 2, 2, 2, 2, 1, 1, 1, 1, 1, 1, 0.5, 0.5, 0.5, 0.5, 0.5). Participants responded with the same PAS scale as in the actual experimental trials, where PAS1, PAS2 and PAS3 were treated as aware and PAS0 as unaware. During the calibration, participants only rated their awareness in order to exclude the influence of the task on PAS: they were given oral and written instructions to rate awareness of the word in general. After the calibration block (119 trials), participants performed a validation task, which included 40 low-level and 40 high-level trials and PAS. If the individual threshold failed to calibrate or the accuracy of any task was outside the 15 %-65 % range, a second calibration block of 68 trials was performed, followed by a new validation block. The ranges were specifically chosen in order to acquire aware and unaware trials with correct and incorrect performance for both tasks under every PAS level. After the subject-by-subject adjustments of stimulus volume to approximate individual thresholds, the stimuli remained the same throughout the experiment.
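The calibration criterion above can be sketched as a simple check on a set of PAS ratings. This is a minimal sketch, not the authors' procedure: the function names are hypothetical, and it assumes that the awareness level is computed as the share of trials rated PAS1-PAS3 among all rated trials.

```python
def detection_rate(pas_ratings):
    """Share of trials rated PAS1-PAS3 (aware); PAS0 counts as unaware."""
    if not pas_ratings:
        raise ValueError("at least one rating is required")
    aware = sum(1 for rating in pas_ratings if rating > 0)
    return aware / len(pas_ratings)


def calibration_ok(pas_ratings, low=0.35, high=0.65):
    """True when the awareness level falls inside the 35 %-65 % target range."""
    return low <= detection_rate(pas_ratings) <= high


# Hypothetical ratings from eight trials: four aware, four unaware.
example = [0, 1, 0, 2, 3, 0, 1, 0]
print(detection_rate(example), calibration_ok(example))
```

A participant whose ratings fall outside the range would, following the text, go through a second calibration block.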

EEG recording
EEG was recorded using 32 silver/silver chloride ring electrodes attached to the recording cap (EASYCAP GmbH) and a NeuroOne Tesla amplifier (Bittium, 2022) with a band pass of 0.05-100 Hz and a 500 Hz sampling rate. The standard 10-20 electrode grid layout was extended with two extra electrodes: one was attached to the nose as an online reference electrode, and another was attached to the forehead between the FP1 and FP2 electrodes as an online ground electrode.

Preregistration
The preregistration included two behavioral analyses: one on awareness ratings with task (2: high level, low level) and accuracy (2: correct, incorrect) as factors, and another on accuracy with task (2: high level, low level) and awareness (3: PAS0, PAS1, PAS2) as factors. The ERP analysis was an awareness (3: PAS0, PAS1, PAS2), task (2: high level, low level) and accuracy (2: correct, incorrect) analysis on the mean amplitudes in the AAN and LP time windows. In addition to the main analysis, we preregistered an exploratory factorial mass univariate analysis (FMUT) on the whole trial time window and all electrodes to obtain a more nuanced picture of the effects of interest, which could be missed if the scalp topography of the ERP activity shifts away from the preregistered clusters. We chose to change the preregistered repeated measures ANOVAs to linear mixed effects models for both behavioral and EEG data due to the low number of trials in some of the experimental conditions. The models are described in the corresponding sections below.
As there were few PAS3 ("heard clearly") trials throughout the experiment, we combined PAS3 with PAS2 ("heard almost clearly") into a single variable PAS2, so that PAS2 would denote "clearly". We failed to anticipate in the preregistration that some experimental contrasts would have a low probability, for example, that incorrect pitch or category estimation in clearly aware trials would be very rare, which resulted in fewer than 15 trials per condition for some participants. Therefore, instead of performing repeated measures ANOVA in both the behavioral and EEG analyses, we chose to implement linear mixed effects models, which are a current state-of-the-art method for analyzing ERPs and avoid listwise deletion of the data (Heise et al., 2022; Volpert-Esmond et al., 2021).
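The recoding of PAS3 into PAS2 and the resulting trial counts per condition can be sketched as follows. The data layout (one `(pas, task, correct)` tuple per trial), the task labels and the function names are hypothetical; only the merge of PAS3 into PAS2 and the 15-trial criterion come from the text.

```python
from collections import Counter
from itertools import product

MIN_TRIALS = 15             # preregistered minimum number of trials per condition
TASKS = ("high", "low")     # hypothetical labels for the two task types


def collapse_pas(rating):
    """Merge PAS3 into PAS2, leaving PAS0 and PAS1 unchanged."""
    return min(rating, 2)


def condition_counts(trials):
    """Count trials per (PAS, task, correct) cell after merging PAS3 into PAS2."""
    return Counter((collapse_pas(pas), task, correct) for pas, task, correct in trials)


def sparse_conditions(trials, min_trials=MIN_TRIALS):
    """List every cell of the 3 x 2 x 2 design with fewer than min_trials trials."""
    counts = condition_counts(trials)
    cells = product((0, 1, 2), TASKS, (True, False))
    return [cell for cell in cells if counts[cell] < min_trials]


# Hypothetical participant with trials in only one cell of the design.
trials = [(3, "high", True)] * 12 + [(2, "high", True)] * 5
print(condition_counts(trials)[(2, "high", True)], len(sparse_conditions(trials)))
```

Cells with zero trials are included in the check because `Counter` returns 0 for missing keys, which mirrors the situation in which some participants had no trials at all in a condition.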

Behavior
We used the lme4 (Bates et al., 2015) and lmerTest (Kuznetsova et al., 2017) packages in R (R Core Team, 2019) to fit a linear mixed effects model on the number of trials in the behavioral analysis. Awareness (PAS0, PAS1, PAS2), level of processing (high-level task, low-level task), accuracy (correct, incorrect), and their interactions were introduced as fixed effects, while random intercepts for participants and for words ("cat", "elk", "cup", "oak") were included as random effects (number of trials ~ Awareness X LoP X Accuracy + (1|Id) + (1|word)). The reference categories were PAS0, low-level task, incorrect trials and the word "cat". We used simple contrast coding to obtain main effects and confidence intervals. This design was implemented to detect all possible interactions. R's anova function with Satterthwaite's method was applied to obtain the ANOVA tables. In addition, we ran post-hoc t-tests with a Bonferroni correction on the proportion of accurate responses and on the number of trials at each PAS level between the high- and low-level tasks.

EEG
EEG was processed using EEGLAB (Delorme and Makeig, 2004) (version 2021.1) and Matlab (version R2021b). It was re-referenced to the linked mastoids (average of electrodes TP9 and TP10). Bad channels were removed with the EEGLAB function "pop_rejchan", with the options kurtosis, joint probability and spectrum and an absolute threshold of 4 SD. After additional visual inspection, a 0.5 Hz high-pass filter was applied (FIR, Hamming windowed; transition bandwidth, 1 Hz; filter order, 1650). The high-pass filter cutoff was chosen following recent recommendations for improving data quality in similar ERP components and time windows (Zhang, Garrett & Luck, 2023). The 50 Hz line noise was filtered out with the "pop_cleanline" function and a 30 Hz low-pass filter (FIR, Hamming windowed; transition bandwidth, 6.675 Hz) was applied to attenuate noise in the higher frequencies. The number of removed electrodes per participant ranged from 0 to 7 (M = 3.590, SD = 2.04). Bad electrodes were interpolated using the built-in spherical interpolation function, "pop_interp".
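The reported high-pass filter order is consistent with a common rule of thumb for Hamming-windowed FIR filters, in which the order is about 3.3 divided by the transition bandwidth expressed as a fraction of the sampling rate, rounded up to an even number. The sketch below only reproduces that arithmetic; the function name is hypothetical, and whether the analysis software derived the order in exactly this way is an assumption.

```python
import math

HAMMING_NORMALIZED_WIDTH = 3.3  # rule-of-thumb constant for a Hamming window


def hamming_fir_order(sampling_rate_hz, transition_bw_hz):
    """Estimate an even FIR filter order from the transition bandwidth."""
    raw = HAMMING_NORMALIZED_WIDTH * sampling_rate_hz / transition_bw_hz
    order = math.ceil(round(raw, 6))   # rounding guards against float noise
    return order + (order % 2)         # force an even order


# 500 Hz sampling rate and 1 Hz transition bandwidth, as reported in the text.
print(hamming_fir_order(500, 1.0))  # → 1650
```

An order of 1650 at 500 Hz corresponds to a filter spanning about 3.3 s of data, which is typical for high-pass filters with such a narrow transition band.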
As we were also interested in ICA-based source reconstruction for AAN, but will not report it because of the absence of any statistically significant results, we performed independent component analysis on rank-deficient data in order to avoid ghost ICs caused by the interpolation (Kim et al., 2023). Artefactual components were visualized with the ICLabel plugin (Pion-Tonachini, Kreutz-Delgado, Makeig, 2019) (version 1.3) and removed by manual inspection (M = 6.040, SD = 4.650, min = 0, max = 15). The criterion for removal was substantial noise in the IC scalp distribution, power spectrum and trial-to-trial variability chart. The baseline was corrected using the −400 to 0 ms EEG activity preceding the onset of the target auditory stimulus.
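Baseline correction as described above amounts to subtracting the mean pre-stimulus amplitude from every sample of an epoch. The following is a minimal single-channel sketch; the function name and the choice to exclude the 0 ms sample from the baseline are assumptions of this example, not details given in the text.

```python
def baseline_correct(epoch_uv, times_ms, start_ms=-400.0, end_ms=0.0):
    """Subtract the mean of the pre-stimulus interval from a single-channel epoch.

    Samples with start_ms <= t < end_ms form the baseline; excluding the
    0 ms sample is an assumption of this sketch.
    """
    baseline_idx = [i for i, t in enumerate(times_ms) if start_ms <= t < end_ms]
    if not baseline_idx:
        raise ValueError("no samples fall inside the baseline interval")
    baseline = sum(epoch_uv[i] for i in baseline_idx) / len(baseline_idx)
    return [value - baseline for value in epoch_uv]


# Toy epoch with four samples; the baseline is the mean of the first two values.
times = [-400, -200, 0, 200]
epoch = [1.0, 3.0, 5.0, 7.0]
print(baseline_correct(epoch, times))  # → [-1.0, 1.0, 3.0, 5.0]
```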
We fitted linear mixed effects models of the form amplitude ~ Awareness X LoP X Accuracy + (Awareness|Id) with the lme4 (Bates et al., 2015) and lmerTest (Kuznetsova et al., 2017) R packages (R Core Team, 2019) on the mean amplitudes in the preregistered AAN and LP time windows and electrode clusters. Awareness (PAS0, PAS1, PAS2), level of processing (high-level task, low-level task), accuracy (correct, incorrect), and their interactions were introduced as fixed effects, while a random intercept and a random slope for Awareness were included as random effects. The model with random slopes for Awareness was selected because it fitted the data better than the others, having the lowest AIC. We used R's anova function with Satterthwaite's method to obtain ANOVA tables for the models. We ran the model on the subset of the data with more than 9 trials per condition and used simple contrast coding to obtain main effects and confidence intervals. The reference categories were PAS0, low-level task and incorrect trials. The AAN channels included Fp1, Fz, F7, F3, FC1, FC2, Cz, C3, CP5, T7, P3 and P7, and the LP channels included Fz, F3, F4, F7, F8, FC5, FC6, FC1, FC2, Cz, C3, C4, CP1, CP2, CP5, CP6, P3, P4, Pz and T7. The channels were chosen because they showed most of the ERP activity of interest in a pilot study. Both components, especially AAN, showed left lateralization in the pilot study, which was not the case in the actual experiment. The scalp distribution of AAN and LP in the present study, with the strongest activation over the central electrodes, may be caused by the linked mastoid reference, as was also observed in previous studies using the same reference electrodes (Filimonov et al., 2022). The AAN time window was specified as 160-300 ms after stimulus onset and the LP time window as 450-800 ms after stimulus onset. In addition, we separately analyzed the early and late LP parts, at 450-600 ms and 600-800 ms, respectively. The time windows and electrode clusters of interest were obtained via an exploratory analysis with the Factorial Mass Univariate ERP Toolbox, FMUT (Fields & Kuperberg, 2020), on the pilot study data and were preregistered.
To conclude our findings, we performed post-hoc comparisons of the raw ERP amplitudes at PAS2 and PAS1 for each task and between tasks in the AAN and LP time windows using t-tests with a Bonferroni correction. In the present study we also applied an exploratory FMUT, which is an extension of the Mass Univariate ERP Toolbox, MUT (Groppe et al., 2011), and implements a mass-univariate factorial ANOVA, to capture effects which could be present over electrodes outside the preregistered clusters. A non-parametric approach with 1000 repetitions (Maris and Oostenveld, 2007) was selected with permutation-based cluster mass correction for multiple comparisons (Groppe et al., 2011b). The family-wise alpha of the test was set to 0.05.
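Extracting a mean amplitude from the time windows described above reduces to converting window edges into sample indices at the 500 Hz sampling rate and averaging the samples in between. The sketch below is illustrative only: the function name and the −400 ms epoch start are assumptions, the AAN window is read as 160-300 ms, and including both window edges is a choice made for this example.

```python
SAMPLING_RATE_HZ = 500  # one sample every 2 ms, as in the recording

# Time windows in ms after stimulus onset, as read from the text.
WINDOWS_MS = {
    "AAN": (160, 300),
    "LP": (450, 800),
    "early LP": (450, 600),
    "late LP": (600, 800),
}


def window_mean(erp_uv, epoch_start_ms, window_ms, sampling_rate_hz=SAMPLING_RATE_HZ):
    """Mean amplitude of a single-channel ERP inside a window (both edges included)."""
    step_ms = 1000 / sampling_rate_hz
    first = round((window_ms[0] - epoch_start_ms) / step_ms)
    last = round((window_ms[1] - epoch_start_ms) / step_ms)
    if first < 0 or last >= len(erp_uv):
        raise ValueError("window lies outside the epoch")
    segment = erp_uv[first:last + 1]
    return sum(segment) / len(segment)


# Toy ERP whose amplitude equals its latency in ms, epoch from -400 to 998 ms.
toy_erp = [float(t) for t in range(-400, 1000, 2)]
print(window_mean(toy_erp, -400, WINDOWS_MS["AAN"]))  # → 230.0
```

In the actual analysis the averaging would additionally run over the channels of the relevant electrode cluster and over trials within each condition.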
In addition to the main analyses, we performed ICA-based source reconstruction in order to investigate the cortical generators of AAN and LP. The dipolar sources were based on the average MRI with standard channel coordinates. Components with a residual variance of more than 15 % were removed, in addition to the artefactual ICs that were removed in the preprocessing. ICs were clustered with a k-means function in EEGLAB, and the clustering was based on dipole location and orientation. In order to avoid circular inference in the statistical tests (Kriegeskorte et al., 2010), no time-frequency data were used for clustering. To correct for multiple comparisons, permutation statistics with Holmes correction were applied with a p = 0.05 threshold. This analysis did not reveal any statistically significant results and therefore will not be reported.
Post-hoc comparisons revealed a difference in the proportion of accurate trials (percentage of correct trials out of all trials in a particular condition) for PAS1 ("heard weakly") between the high- and low-level tasks, M = -8.435, 95 % CI [-13.850, -3.020], t = -3.215, p = 0.004. This corroborates the LoP hypothesis: while both tasks have a similarly high accuracy for clearly aware trials and a near-chance accuracy for unaware trials, at the intermediate level only the low-level task performance increases linearly. This implies that low-level processing is gradual, while high-level processing is more dichotomous. When the tasks were compared, the number of trials differed at all PAS ratings. In the high-level task, the number of false alarm (empty aware) trials, M = 6.700, 95 % CI [3.620, 9.780], t = 4.372, p < 0.001, and weakly aware trials, M = 8.440, 95 % CI [3.062, 13.819], t = 3.153, p = 0.003, was higher compared to the low-level task. Conversely, in the low-level task there were more clearly aware trials than in the high-level task, M = 7.760, 95 % CI [1.908, 13.612], t = 2.665, p = 0.010. A lower number of clearly aware trials and lower accuracy imply that the high-level task requires a higher level of awareness. A higher number of unaware trials in the high-level task suggests that this task was more demanding.

EEG
Fig. 3 shows the grand average ERPs over the Cz electrode for the high- and low-level tasks. Fig. 4 shows difference waves between PAS1/PAS2 and PAS0 for the high- and low-level tasks with confidence intervals. Scalp topographies of the aware-unaware differences between PAS1/PAS2 and PAS0 in both tasks are shown in Fig. 5. Table 1 presents descriptive statistics and confidence intervals for AAN and LP at different PAS levels and tasks.
Figs. 3, 4 and 5 indicate that AAN and LP are present for both tasks and at both levels of awareness. AAN begins at approximately 160-200 ms and lasts until about 350 ms post stimulus, while LP starts at around 350 ms after stimulus onset and extends to approximately 700 ms or further, which fits the preregistered time windows for these components. The preregistered AAN and LP channels still cover most of the EEG activity. Fig. 5 also demonstrates a shift in scalp topography for the ERP activation in the late LP time window, leading to a noticeable difference compared to the preregistered cluster of electrodes for the LP. To investigate effects that could be missed by the main analysis, we complemented it with the results from FMUT.

LP
In the LP time window, ERP amplitudes were modulated by significant effects of Awareness, F(2, 24) = 6.800, p = 0.004, M = 0.461, 95 % CI [0.152, 0.769] at PAS1 and M = 1.089, 95 % CI [0.509, 1.668] The main effect of Accuracy indicates that correct trials elicited a higher amplitude in the corresponding time windows, while the direction of the main effect of LoP suggests that the late LP amplitude decreased in the high-level task compared to the low-level task.

FMUT
Fig. 6 shows the results of the factorial mass univariate analysis. The main effect of Awareness (cluster mass = 62881.930, p-value < 0.001) starts at around 100-120 ms post stimulus with a cluster that begins over left temporo-parietal areas and propagates towards the central area, later forming a considerable cluster from 200 ms to 300 ms, which covers most of the scalp electrodes and corresponds to the AAN time window and scalp topography. At around 350 ms post stimulus the activity shifts towards the posterior areas, followed by a cluster at around 380-400 ms after stimulus onset, which also covers most of the scalp channels. Later, from 550 ms, it moves towards the central-parietal-occipital region. This activation corresponds to the time windows and scalp topography of LP. The main effect of Accuracy forms a cluster (cluster mass = 4001.450, p-value = 0.034) that starts at 450 ms post stimulus over the right central, temporal and parietal areas, later spreading to all scalp channels at 500 ms and shifting to the frontal, right temporal and right parietal areas from around 550 ms, which places this activity in the LP time window. The main effect of LoP (cluster mass = 5706.063, p-value = 0.008) starts at around 700 ms post stimulus in the late LP time window over all scalp electrodes and moves towards the central areas at around 800 ms after stimulus onset. Finally, the Awareness X LoP interaction (cluster mass = 2394.655, p-value = 0.027) is present in the 450-600 ms time range and forms clusters over the central areas, the left temporal-parietal and central-temporal-parietal-occipital areas, and finally over the occipital electrodes.
In light of the evidence from FMUT we argue that the Awareness x LoP interaction was not captured by the main analysis for several reasons. Firstly, in the analysis of the entire LP time window, the interaction effect was weaker and more topographically scattered relative to the other effects and therefore was not captured by averaging over long temporal windows. Secondly, the late LP time window started when the effect of the interaction was already ending. Finally, it was not captured in the early LP time window because some of the activation was over channels other than the preregistered ones. We performed follow-up FMUT analyses of the Awareness x LoP interaction for PAS2 vs PAS0, PAS1 vs PAS0 and PAS2 vs PAS1, where high- and low-level tasks were contrasted. The PAS2 vs PAS1 analysis confirmed the Awareness x LoP interaction, which formed a cluster from 214 ms to 368 ms post-stimulus (cluster mass = 6084.789, p-value = 0.022). This early cluster was not observed in the original FMUT performed on all PAS levels. However, there was a later cluster showing the Awareness x LoP interaction, which overlaps with the cluster found in the original FMUT. It started at 396 ms post-stimulus and continued until the end of the trial time (cluster mass = 19296.012, p-value = 0.001). This finding suggests that the original Awareness x LoP interaction (in the 450-600 ms time range) reflected a larger difference (i.e., early LP) between PAS2 and PAS1 in the low-level task than in the high-level task. The results from the PAS2 vs PAS1 analysis also showed a significant LoP effect, with a cluster from 138 ms to 368 ms post-stimulus (cluster mass = 16250.724, p-value = 0.002) and a cluster from 370 ms post-stimulus to the end of the trial time (cluster mass = 25538.249, p-value = 0.001). Thus, these pair-wise comparisons of PAS levels revealed extended temporal windows of the clusters compared to those in the original FMUT. Therefore, one should not place much weight on the additional time windows which did not show effects in the original FMUT on the whole dataset. We report the results from the PAS2 vs PAS0 and PAS1 vs PAS0 analyses and figures for all three pair-wise comparisons of the PAS levels in the Supplementary Material.

Table 1
Raw descriptive statistics and confidence intervals for mean amplitudes of AAN and LP clusters as well as different parts of LP in high- and low-level tasks. LP1 and LP2 refer to the early (450-600 ms) and late (600-800 ms) LP parts.
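The mean amplitudes summarized in Table 1 for the early (450-600 ms) and late (600-800 ms) LP parts can be sketched as a simple window average. This is a minimal illustration, not the study's code: the function name, the half-open treatment of window boundaries, the sampling grid and the amplitude values are all assumptions.

```python
EARLY_LP = (450, 600)  # ms, window taken from the Table 1 caption
LATE_LP = (600, 800)   # ms, window taken from the Table 1 caption


def mean_amplitude(times_ms, amplitudes, window):
    """Average the samples whose latency falls in [start, end) of the window.

    Assumption: the window is half-open, so a sample at 600 ms belongs to
    the late part only.
    """
    start, end = window
    selected = [a for t, a in zip(times_ms, amplitudes) if start <= t < end]
    if not selected:
        raise ValueError("no samples inside the requested window")
    return sum(selected) / len(selected)


# Hypothetical ERP at one electrode, sampled every 50 ms from 400 to 800 ms
times = list(range(400, 850, 50))
erp = [0.0, 1.0, 2.0, 3.0, 4.0, 4.0, 4.0, 4.0, 0.0]

early = mean_amplitude(times, erp, EARLY_LP)
late = mean_amplitude(times, erp, LATE_LP)
```

Averaging over a long window in this way is also why a short-lived or topographically scattered effect can be diluted in a mean-amplitude analysis.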
Taken together, the results suggest that the level of processing hypothesis can be applicable to the auditory modality. LP is significantly affected by the level of processing, indicating complex cognitive operations happening in its time window, whereas AAN increases with the level of awareness, but not with the task complexity or accuracy of performance. However, the direction of the effect is opposite to the expected LoP prediction: the amplitude of LP was stronger in the low-level compared to the high-level of processing, which is also evident from the mean amplitude values and ERP topographical figures. The results further support the findings that AAN is an early correlate of basic auditory sensory awareness or "phenomenal" consciousness of hearing, while LP is modulated by a variety of cognitive processes, such as awareness, task accuracy and level of stimulus processing in the current experiment.

Discussion
We tested the level of processing framework in hearing for the first time and also looked for the relation between electrophysiological correlates of consciousness and depths of stimulus processing. Our findings indicate that awareness of the low-level features, from no awareness through weak awareness up to clear awareness, is graded in a near-linear fashion. For the high-level categorical processing the awareness accrual along this continuum is also monotonic, but not strictly linear. It may well be that with different categories and/or parameters of stimulation the higher LoP awareness formation function could become truly dichotomous. We further conclude that AAN is an early correlate of auditory awareness, while LP is correlated with awareness and accuracy. A factorial mass univariate analysis showed that LP was also modulated by the level of processing interacting with awareness, and a follow-up analysis confirmed this for the difference in ERP amplitudes at PAS2 vs PAS1, where the amplitude was larger in the low-level task.
Behavioral results show that the number of aware trials was modulated by both level of processing and accuracy. A significantly lower number of correct responses was observed for weakly aware trials in the high-level task compared to the low-level task. At the very low level of awareness it seems to be easier to discriminate pitch than the phonematic pattern of words. Unaware trials exhibited similar near-chance accuracy in both tasks, and in clearly aware trials accuracy was similarly high. This pattern suggests a linear increase in accuracy for the low level of processing and a more nonlinear trend for the high level, supporting the idea that the function describing the transition from unaware to aware perception accelerates faster at its low argument values for the lower LoP than for the higher LoP. Another finding is a higher number of weakly aware trials in the high-level task compared to the low-level task. Combined with lower accuracy in weakly aware trials, this suggests that the high-level task requires a higher level of awareness, aligning with LoP predictions. Additionally, the larger number of unaware trials in the high-level task indicates that a higher level of processing is more challenging, consistent with a similar conclusion from Kiefer & Kammer (2017), who demonstrated that higher-level tasks require more processing time. Although statistically significant, these differences alone cannot definitively determine whether awareness at the high level of processing was dichotomous or just less graded: as also discussed in Jimenez et al. (2021), there is no quantitative threshold that distinguishes between dichotomy and gradedness.
The levels of processing influenced accuracy, but not the distribution of awareness scale ratings. Some studies in the visual modality also reported similar results (Derda et al., 2019; Windey et al., 2014), while others suggested that awareness scale ratings also differ between the levels of processing (Jimenez et al., 2021; see Jimenez et al., 2020, for a comprehensive review). In the present study, the auditory modality, the target auditory threshold and the experimental design could influence the distribution of the awareness scale. As our aim was to obtain correct and incorrect responses for both clearly and weakly aware trials, the stimuli had to be relatively hard to hear and identify. Otherwise, all "clear" trials would result in perfect response accuracy or, conversely, all the "weak" trials would give chance-level accuracy. The auditory modality could also modulate the distribution of PAS ratings: if the auditory psychometric function for the same level of processing is generally steeper than that in vision, a wider range of evenly distributed ratings would be harder to achieve. Finally, the low-level task chosen for this experiment was at the level of feature extraction, which is not the lowest level of processing possible (Kouider et al., 2010); the lowest level would correspond to mere stimulus detection. The low-level task included a relative pitch judgement, which required cognitive comparison. Nevertheless, as pitch height is not easily associated with a single label, which was originally proposed by LoP as a demarcation line between levels of processing (Windey & Cleeremans, 2015), the pitch discrimination is still a low-level task. Even with uniformly distributed awareness ratings, the linearity/nonlinearity difference between the levels of processing suggests that LoP remains a promising theoretical framework for hearing, as it is for vision.
Although the PAS distribution was proportionally similar for both LoP tasks, the number of trials under each awareness rating differed between the tasks. In the high-level task, a higher number of false alarms and weakly aware trials were reported, while in the low-level task there were more clearly aware trials. Combined with lower accuracy for weakly aware trials in the high-level task, this further supports the idea that a higher level of processing is generally more demanding and requires stronger awareness. A similar pattern of results has been previously reported in visual LoP studies (Derda et al., 2019; Windey et al., 2014).
Electrophysiological analysis revealed AAN and LP in the predicted time windows and electrode locations. AAN was a correlate of awareness and remained unaffected by the level of processing or accuracy, while LP was associated with awareness, accuracy and the level of processing, indicating its relation to conscious access or other cognitive operations related to reflective consciousness. Numerous studies have suggested that PAN, which includes AAN, is an NCC proper, while the LP is associated with later cognitive processes and executive functions, especially the reportability of the stimulus (Railo et al., 2009; Koivisto et al., 2017; Wiens et al., 2023; Schlossmacher et al., 2021; Sergent et al., 2021; Filimonov et al., 2022), although this debate is not yet settled (Dellert et al., 2021; Pitts et al., 2012, 2014; Shafto & Pitts, 2015). AAN and LP time windows were typical for the auditory modality. LP appeared right after AAN, just as it starts after VAN in the visual modality (Koivisto & Grassini, 2016).
Surprisingly, our analysis demonstrated that in the LP time window, amplitudes were lower for the high-level compared to the low-level task. This could be partially explained by the distribution of clearly aware trials (a lower number of clearly aware trials in the high-level task could result in a bigger variance) along with the complexity of the high-level task itself, since accuracy modulates LP. In addition, since the pitch discrimination task was about the relative pitch between the different stimuli (not absolute pitch intrinsic to each sound), framing the low-level task in this relational way may have induced a cognitive comparison process: each time the heard pitch was judged, it needed to be compared in working memory with the previously learned high- and low-pitch samples. This would be a complex and more resource-demanding task affecting the LP range and might explain the higher LP for the low-level stimuli. The word discrimination, by contrast, is intrinsic to each stimulus: if one hears it clearly, the intrinsic features of the experience map it to a specific word or label and no relative comparison process between the stimuli is required. In vision, Derda et al. (2019) found that LP amplitudes correlated with PAS ratings in a high-level task, but not in the low-level task. Due to the difference in levels of processing across studies and specifically in our pitch task discussed above, one might expect that with an even lower level than that of feature extraction, LP would demonstrate a similar pattern as in Derda et al. (2019).
In the factorial mass univariate analysis (FMUT), an Awareness x Level of processing interaction was found in the early LP time window. FMUT also indicated the presence of the LoP effect in the late LP time window. It is known that mass univariate analysis can decrease both type I and type II errors (Bürki et al., 2018; Groppe et al., 2011a, 2011b; Luck & Gaspelin, 2017) and highlight effects neglected by the traditional analyses conducted on the peak or mean amplitudes. In our study, FMUT showed the critical interaction between awareness and levels of processing, which was missed by the mean amplitude analysis due to the signal distribution in the late LP time window, which did not match the selection of electrodes. Other FMUT results were similar to the results of the mixed effects models. The Awareness x LoP interaction was supported for a difference between PAS2 and PAS1 ERP amplitudes by a follow-up FMUT. Specifically, LP increases with awareness for the lower level of processing, which points to a neurophysiological difference between the levels. Taken together, the results suggest that the LP is not a correlate of awareness per se: rather, it could underlie a level-specific and accuracy-specific process associated with awareness.
Unlike what has been reported in vision by some studies (Jimenez et al., 2018; Jimenez et al., 2021), we did not find a double dissociation between AAN/LP modulation and levels of processing in hearing, where AAN would be modulated only by the low level of processing and LP by the high level. In fact, many studies in vision also failed to report it (Jimenez et al., 2020; Wiens et al., 2023). The possibility of the double dissociation between the different classes of NCC and levels of processing across modalities could be a potential direction for future research. Considering the current state-of-the-art findings, which treat PAN as a direct correlate of awareness, the lower level of processing could index phenomenal consciousness, while the higher level, which is associated with LP, could underlie conscious access. This, in turn, may support or challenge some of the theories of consciousness, such as the Recurrent Processing Theory (Lamme, 2000) and the Global Neuronal Workspace Theory (Dehaene et al., 2011; but see Sergent et al., 2021 for a Global Playground extension). Our results are in line with the Recurrent Processing Theory in that phenomenal conscious experience is influenced by early processing in primary sensory areas. All in all, there is strong converging evidence from multiple sources that VAN and AAN reflect phenomenal features of awareness (Dembski, Koch, & Pitts, 2021; Eklund, Gerdfeldter, & Wiens, 2019, 2020; Eklund & Wiens, 2019; Filimonov, Railo, Revonsuo, & Koivisto, 2022; Förster, Koivisto, & Revonsuo, 2020; Koivisto & Grassini, 2016; Koivisto & Revonsuo, 2010; Koivisto, Salminen-Vaparanta, Grassini, & Revonsuo, 2016; Revonsuo, 2009; Van Gaal & Lamme, 2012; Ye & Lyu, 2019), but which features of awareness (if any) LP uniquely reflects remains to be seen.
Studies on the visual modality have reported that VAN is associated with stimulus detection, but not with identification, which is associated with LP (Railo et al., 2009; Koivisto et al., 2017; Wiens et al., 2023). Similarly, LP in hearing has been shown to be modulated by the experimental task and not by awareness per se (Schlossmacher et al., 2021; Sergent et al., 2021; Filimonov et al., 2022; but see Eklund, Gerdfeldter, & Wiens, 2019; Eklund & Wiens, 2019). Studies on auditory perception further support this by reporting that early ERPs are affected by the lower level and late ERPs by the higher level of perceptual processing (Toscano et al., 2010). Although we have observed AAN and LP in both levels of processing, our results show that only LP is associated with identification, which is represented by accuracy. Our results show that LP is modulated by awareness and accuracy, or, in the case of the early and the late parts of the LP, by awareness, accuracy and LoP, but never by awareness alone. Both pitch and category tasks required identification to make a correct decision. The pitch task also required cognitive comparison between stimuli, whereas the word category task involved no such comparison in working memory. Therefore, the tasks may have been qualitatively different in terms of working memory engagement, which would be expected to influence LP. We argue that although different in depth, our selected levels of processing include detection and identification and therefore involve a mixture of phenomenal and reflective consciousness.
The observation of two possible parts of LP has been recently reported by Filimonov et al. (2022), where the early part corresponded to the modality-specific and the late part to the modality-general component in an experiment with visual, auditory and bimodal stimulation. The distinction was based on both scalp topography and function. Our results show that the ERPs in the early LP time window were modulated by awareness, task accuracy and an interaction between awareness and level of processing, while in the late time window they were modulated by accuracy and the level of processing. Our post-hoc comparisons also indicated a dissociation between the LP parts at different levels of awareness: only the early part was found in weakly aware trials, while in clearly aware trials both parts were present. Furthermore, there is a clear change in the scalp topography between the presumptive parts of LP. As the LP itself could be composed of several ERP components (Dembski et al., 2021), it is also possible that it comprises functionally distinct processes separated topographically, in time, or both. Still, such a division could turn out somewhat problematic when one compares these results with similar studies that did not report topographical or functional distinctions. As evidence for the two parts of the LP component is still scarce, we interpret it with caution.
Our main finding is that different depths of perceptual processing exist in hearing and can be explained by the LoP framework: the extent of gradedness of awareness and its growth function dynamics depend on the corresponding level of processing. We also further conclude that auditory awareness negativity is a neural correlate of consciousness, while the late positivity is a correlate of conscious access, involving additional cognitive processes. AAN is only influenced by awareness, while the ERPs in LP time window are

Fig. 2. Left: percentage of correct responses for each PAS rating. Right: PAS scale distribution. 0 denotes unaware, 1 weakly aware and 2 clearly aware trials.

Fig. 3. ERPs and difference waves at each PAS level. Upper panel shows ERP components for the high-level category task at each PAS level and for correct and incorrect trials. Middle panel shows the same ERPs for the low-level pitch task. The ERPs are shown over the Cz electrode.

Fig. 4. Difference waves between PAS1/PAS2 and PAS0 for high- and low-level tasks and confidence intervals. The ERPs are shown over the Cz electrode.
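A difference wave of the kind plotted in Fig. 4 is the sample-by-sample subtraction of the unaware (PAS0) ERP from an aware (PAS1 or PAS2) ERP. The following minimal sketch illustrates the computation only; the function name and the amplitude values are hypothetical and are not taken from the study.

```python
def difference_wave(aware_erp, unaware_erp):
    """Subtract the unaware ERP from the aware ERP at every time point."""
    if len(aware_erp) != len(unaware_erp):
        raise ValueError("both ERPs must have the same number of samples")
    return [a - u for a, u in zip(aware_erp, unaware_erp)]


# Hypothetical averaged ERPs at one electrode (arbitrary amplitude units)
pas0 = [0.0, -0.5, -1.0, 0.5]
pas2 = [0.0, -1.5, -3.0, 2.5]

diff = difference_wave(pas2, pas0)
```

In such a wave, negative values in the AAN window and positive values in the LP window indicate larger deflections for aware than for unaware trials.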

Fig. 5. Scalp topography of the aware-unaware difference waves for different PAS levels in both tasks. Upper part: Panel A displays the difference between PAS1 (weakly aware trials) and PAS0 (unaware trials) in the high-level task. Panel B shows the difference between PAS2 (clearly aware trials) and PAS0 in the high-level task. Panel C shows the difference wave between PAS1 and PAS0 in the low-level task, while panel D displays the difference between PAS2 and PAS0 in the low-level task. Lower part: mean amplitudes at AAN and LP intervals for different tasks and PAS levels.

Fig. 6. Results from the factorial mass univariate analysis. Upper panel shows clusters of F values in a channel x timepoint array. Lower panel shows these clusters on a scalp topography.