Modality-specific and modality-general electrophysiological correlates of visual and auditory awareness: Evidence from a bimodal ERP experiment

To date, most studies on the event-related potential (ERP) correlates of conscious perception have examined a single perceptual modality. We compared electrophysiological correlates of visual and auditory awareness in the same experiment to test whether there are modality-specific and modality-general correlates of conscious perception. We used near-threshold stimulation and analyzed event-related potentials in response to aware and unaware trials in visual, auditory and bimodal conditions. The results showed modality-specific negative amplitude correlates of conscious perception between 200 and 300 ms after stimulus onset. A combination of these auditory and visual awareness negativities was observed in the bimodal condition. A later positive amplitude difference was also observed: its early part was modality-specific, possibly reflecting access to the global workspace, and its later part shared modality-general features, possibly indicating higher-level cognitive processing involving decision making.


Introduction
An observer is aware of the external world when a specific type of activity in her brain "transforms" the incoming sensory information into subjective conscious experience, and one of the aims of neuroscience is to figure out where and when this takes place. The debate over whether the true neural correlates of consciousness (NCCs) are associated with early activity in sensory-specific cortical areas or with late activity in higher-order brain regions (i.e., fronto-parietal areas) remains unsettled (Koch et al., 2016; Boly et al., 2017; Odegaard et al., 2017). Research on NCCs has mostly been carried out in the visual modality (Koivisto and Revonsuo, 2010; Railo et al., 2011; Förster et al., 2020). Recently, studies on NCCs in other modalities, especially in hearing (Eklund et al., 2020; Dykstra et al., 2017; Gutschalk et al., 2008; Sadaghiani et al., 2009), have also started to appear. Preliminary results from studies using electroencephalography (EEG) (Snyder et al., 2015), functional magnetic resonance imaging (fMRI) (Eriksson et al., 2007) and magnetoencephalography (MEG) (Sanchez et al., 2020) revealed similarities between visual and auditory NCCs, suggesting that conscious experience is based on comparable neural mechanisms across senses.
To date there are two main NCC candidates for visual awareness: while some research posits late activation in fronto-parietal networks as the true NCC (Lau and Rosenthal, 2011), other studies argue that the visual NCC is rather an early recurrent activation in occipitotemporal areas (Railo et al., 2015; Lamme, 2010). In electrophysiological measurements these NCC candidates are reflected by two separate event-related potential (ERP) difference waves between aware and unaware sensory stimuli: the Visual Awareness Negativity (VAN) and the Late Positivity (LP) (for a recent review, see Förster et al., 2020). VAN is a negative amplitude difference, peaking around 200 ms after stimulus onset. It is typically observed over posterior scalp electrode sites and has a stronger amplitude in the contralateral hemisphere if the aware stimulus is presented to one side of the visual field (Eklund and Wiens, 2018). LP is a positive amplitude difference between aware and unaware stimuli in the P3b time window. It is detected in central electrodes and could be correlated with various cognitive processes (Verleger, 2020), including consciousness. VAN has been found in many studies and is thought to be the earliest systematically observed electrophysiological correlate of visual awareness (Förster et al., 2020; Eklund and Wiens, 2018; Koivisto et al., 2017; Koivisto et al., 2018). Some researchers propose that both VAN and LP could be neural correlates of different stages of consciousness, viewed as a process (Revonsuo and Koivisto, 2010; Rutiku et al., 2017). Others state, based on the global workspace theory, that awareness happens solely in the LP time window, which contains both the modality-specific correlate of sensory experience and the modality-general correlate of access to cognitive functions (Baars, 1997; Dehaene et al., 2006, but see also Sergent et al., 2021 for recent modifications).
☆ This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.
Similarly to VAN in the visual modality, an Auditory Awareness Negativity (AAN), which is observed in temporal and parietal electrodes and also peaks around 200 ms after stimulus onset, has been proposed as an NCC in hearing. Comparisons of experimental results across studies reveal at least partial similarities between the auditory and visual modalities, such as the presence of both early negativity and late positivity difference waves (Dykstra et al., 2017). Only a few recent studies on auditory awareness have focused on the detection of a simple stimulus without requiring higher-level cognitive processing, along with some very early ones (Hillyard et al., 1971) in which AAN was called awareness-associated vertex negativity. Other studies used oddball paradigms (Bekinschtein et al., 2009), informational masking paradigms (Gutschalk et al., 2008), or bistable percepts with fMRI (Brancucci et al., 2016). In our opinion, methods that directly compare the presence and absence of awareness of a simple stimulus are so far the best choice for revealing the auditory NCCs. In more complex paradigms, such as bistable percepts, change detection or oddball paradigms, the memory or other higher cognitive functions required by the experimental task might introduce additional confounding factors that get mixed up with the true NCC. Some of these paradigms (such as that of Bekinschtein et al., 2009) rely on attentional modulation as an indirect measure of conscious experience, which gives them the benefits of no-report paradigms in the case of passive mind-wandering.
Although revealing the NCC for each modality separately is extremely valuable, conscious experience is rarely limited to a single sense in daily life; rather, it represents a combination of senses and complex scenes in what has been called the "phenomenal unity of consciousness" (Revonsuo, 1999) or "unified qualitative subjectivity" (Searle, 2004). There is evidence that cross-modal integration is not a mere summation of percepts (Papai and Soto-Faraco, 2017): for instance, incongruent audiovisual stimuli can worsen the detection of visual motion compared to congruent ones (Rosemann et al., 2017), auditory stimuli can improve or deteriorate the temporal resolution of visual detection, and a single visual flash can be perceived as multiple flashes when accompanied by multiple sounds (Shimojo et al., 2001). Such cross-modal interactions happen unconsciously, and stimuli from one sensory modality can enhance awareness of stimuli in other modalities (Faivre et al., 2017). Therefore, even simple stimuli that are unrelated to each other may modulate conscious experience in various ways.
Research on ERP correlates of auditory and visual consciousness has been largely conducted without interaction between the research in the two modalities. To date there are no experiments that directly compare ERP correlates of visual and auditory awareness across early and late latencies, so we chose to address this issue. In the present study we compared electrophysiological correlates of visual and auditory conscious detection of simple stimuli in the same experiment to test whether there are modality-specific and modality-general correlates of awareness. We hypothesize that auditory awareness correlates with AAN, which should be present over central electrodes, and that visual awareness correlates with VAN, which should be present over posterior electrodes. Thus, the scalp topographies of AAN and VAN should be dissociated. We also hypothesize that if LP reflects post-perceptual cognitive processes (Verleger, 2020), then it should be similar for visual and auditory awareness, assuming that a post-perceptual access mechanism is similar across modalities; this would indicate that the LP is a domain-general correlate of conscious processing. On the other hand, the global neuronal workspace theory holds that LP reflects a combination of sensory-specific and domain-general processes, and therefore predicts that its topography and timing may differ between modalities. In the bimodal condition, a topographically widespread scalp distribution for awareness is expected, covering the sites involved in both auditory and visual awareness. As the bimodal condition contains both visual and auditory components, its ERPs should show similarities to both visual and auditory ERPs. However, as multimodal perception is not a simple combination of unimodal percepts, the ERP pattern in the bimodal condition is not expected to equal the sum of the visual and auditory ERPs.

Participants
Twenty-four healthy right-handed participants (4 males and 20 females, aged 18 to 30 years) were recruited from local faculties and through online advertisement. Before the experiment, participants provided written informed consent. The research was conducted in accordance with the Declaration of Helsinki and approved by the ethics committee of the Hospital District of Southwest Finland. Participants had normal or corrected-to-normal vision and did not experience problems with hearing. Exclusion criteria were noisy EEG data and failure to calibrate individual visual and hearing thresholds within a 20%-80% detection rate. The resulting sample was 20 participants for the behavioral analyses and 21 participants for the EEG analysis. One participant, who was excluded from the behavioral analysis, ended the experiment early at their own request, after having performed 7 of the 8 experimental blocks.

Stimuli
Auditory stimuli were pure tones of 1000 Hz frequency presented for 58 ms with 5 ms fade-in and fade-out periods. Tones were presented binaurally through Etymotic ER-3A earphones. Visual stimuli were low-contrast sinusoidal Gabor patches (225 pixels, Michelson contrast starting level = 0.0401) presented against a grey background (R = 128, G = 128, B = 128). A VIEWPixx/EEG LCD monitor (resolution: 1920 × 1080, refresh rate: 120 Hz) was used to display visual stimuli. The experiment was run using E-prime 2.

Procedure
Participants received both written and verbal instructions; the written instructions were in Finnish and English. The experiment comprised 8 blocks of 100 trials each, 800 trials in total. Every block had 25 trials for each condition: visual, auditory, bimodal, and catch trials without any stimuli. Each trial started with a 1166 ms blank screen, after which an eye fixation cross (0.5 visual degrees) was presented in the center of the screen for 500 ms, followed by a 500-1000 ms blank screen period, a stimulus presented for 58 ms, and a 500 ms blank screen (see Fig. 1). Visual stimuli were presented in the center of the screen and auditory stimuli were presented binaurally. Each trial ended with two questions, "I heard stimulus" and "I saw stimulus", with three response options (A = "not at all", B = "weakly", C = "clearly"). The order of the questions changed after each block, with the first question at the beginning of the experiment counterbalanced across participants. Participants rated their subjective awareness by pressing one of three buttons ("A", "B", "C") on the gamepad (Xbox Wireless Controller). At the end of each trial participants had time to assess their awareness and rest, and the next trial started after they answered the second question. The trial structure is illustrated in Fig. 1.
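The nominal durations above line up with whole refresh cycles of the 120 Hz display (for example, 58 ms is close to 7 frames). The following is a minimal sketch of that arithmetic, not the E-prime implementation used in the study; the function names, the frame-locking of every event, and the uniform distribution of the jittered interval are assumptions made for illustration.

```python
import random

REFRESH_HZ = 120  # refresh rate reported for the VIEWPixx/EEG monitor


def ms_to_frames(duration_ms, refresh_hz=REFRESH_HZ):
    """Nearest whole number of refresh cycles for a nominal duration in ms."""
    return round(duration_ms * refresh_hz / 1000)


def trial_timeline(rng=random):
    """Return (event, frames) pairs for one trial, up to the two questions.

    Assumption: the 500-1000 ms interval is drawn uniformly; the source
    only states the range.
    """
    jitter_ms = rng.uniform(500, 1000)
    return [
        ("blank", ms_to_frames(1166)),
        ("fixation", ms_to_frames(500)),
        ("blank_jitter", ms_to_frames(jitter_ms)),
        ("stimulus", ms_to_frames(58)),
        ("blank_post", ms_to_frames(500)),
    ]
```

Under these assumptions the 1166 ms and 58 ms values correspond to 140 and 7 frames, which would explain why they are not round numbers in milliseconds.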
Prior to the actual experiment, each participant ran a calibration procedure in order to obtain the required stimulus intensities. Calibration consisted of blocks of 60 trials, with 15 stimuli for each condition and 15 catch trials without any stimuli. Visual, auditory and bimodal (visual + auditory) stimuli were presented in each block in random order, as in the actual experiment and with the same trial structure. The calibration of the visual Gabor stimulus began at a Michelson contrast level of 0.0401. The contrast of the next visual stimulus was decreased or increased by one unit (Michelson contrast step = 0.0125), depending on whether the participant reported seeing the stimulus (either "weakly" or "clearly") or "not at all". Auditory stimulation started with the maximum computer volume attenuated by 95 dB. The volume of the next auditory stimulus was increased or decreased in 1 dB steps, depending on whether the stimulus was heard "not at all" or either "weakly" or "clearly". If both the visual and the hearing detection rates were in the range between 20% and 80%, calibration was considered complete. Otherwise, another calibration block was performed with the stimulus parameters (volume and Michelson contrast) taken from the previous block. Finally, a validation block with the calibrated stimuli was performed to ensure that the thresholds were calculated correctly, with 15 trials for each condition and 10 catch trials.
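The update rules described above amount to a simple one-up/one-down adjustment. A minimal sketch follows, using the step sizes and the 20%-80% criterion given in the text; the function names, the lower bound of zero on contrast and attenuation, and the inclusive treatment of the 20% and 80% limits are assumptions, not details reported for the study.

```python
AWARE_RESPONSES = ("weakly", "clearly")


def next_contrast(contrast, response, step=0.0125, floor=0.0):
    """Lower the Michelson contrast after a 'seen' report, raise it otherwise.

    Assumption: contrast is clamped at `floor` so it cannot become negative.
    """
    seen = response in AWARE_RESPONSES
    updated = contrast - step if seen else contrast + step
    return max(floor, updated)


def next_attenuation_db(attenuation_db, response, step_db=1.0):
    """More attenuation means a quieter tone.

    A 'heard' report makes the next tone quieter; 'not at all' makes it louder.
    Assumption: attenuation cannot drop below 0 dB (maximum volume).
    """
    heard = response in AWARE_RESPONSES
    if heard:
        return attenuation_db + step_db
    return max(0.0, attenuation_db - step_db)


def calibration_done(visual_rate, auditory_rate, low=0.20, high=0.80):
    """True when both detection rates fall inside the accepted range."""
    return low <= visual_rate <= high and low <= auditory_rate <= high
```

For example, starting from the reported contrast of 0.0401, one "clearly" response would give 0.0276 and one "not at all" response would give 0.0526.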

EEG data acquisition
EEG was recorded with 64 active sintered Ag/AgCl ring electrodes attached to a recording cap (EASYCAP GmbH, Germany) using a NeurOne system (Mega Electronics Ltd) and amplified using a band pass of 0.05-100 Hz, with a 500 Hz sampling rate. EEG was processed using EEGLAB (Delorme and Makeig, 2004) (version 2019.1) and Matlab (version R2018a).

Analysis
Behavioral data were processed with R (R Core Team, 2016) and the lme4 package (Bates et al., 2015). Behavioral results were analyzed using a linear mixed effects model which included a participant-wise random slope for modality as a random effect. A mixed effects model was chosen because of the within-subjects design, to take into account that the data come from different participants. The visual unaware condition was coded as the baseline. Both "clearly" and "weakly" heard/seen trials were counted as aware and combined because participants rarely rated their awareness as clear (Eklund & Wiens, 2018). In the bimodal condition a trial was counted as aware if the participant reported awareness of both the visual and the auditory stimulus (and, conversely, as bimodal unaware if both stimuli were reported as unaware). In the unimodal conditions only trials without false alarms in the other modality were counted as aware. In accordance with signal detection theory (Stanislaw and Todorov, 1999), sensitivity (d') and response bias (c) were calculated for each condition. To illustrate how the sensitivity and the response bias differed between modalities, and whether this difference was statistically significant, linear regression analyses were performed.
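For reference, the two signal detection measures named above follow from the hit rate and the false alarm rate via the inverse normal cumulative distribution function. The sketch below uses the standard formulas d' = z(H) − z(F) and c = −(z(H) + z(F))/2; the function name is hypothetical, and how rates of exactly 0 or 1 were handled in the study is not stated, so this version simply rejects them.

```python
from statistics import NormalDist

_z = NormalDist().inv_cdf  # inverse of the standard normal CDF


def sdt_measures(hit_rate, fa_rate):
    """Return (d_prime, criterion_c) from a hit rate and a false alarm rate.

    Assumption: rates of exactly 0 or 1 are rejected rather than corrected,
    because z() is undefined there.
    """
    if not (0.0 < hit_rate < 1.0 and 0.0 < fa_rate < 1.0):
        raise ValueError("rates must lie strictly between 0 and 1")
    z_hit, z_fa = _z(hit_rate), _z(fa_rate)
    d_prime = z_hit - z_fa              # sensitivity
    criterion = -0.5 * (z_hit + z_fa)   # positive values: bias toward "no"
    return d_prime, criterion
```

A positive c corresponds to a conservative observer who tends to report "not at all", and a negative c to a liberal one.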
EEG was referenced to linked mastoids (average of electrodes TP9 and TP10). The baseline was corrected to the activity in the 200 ms preceding the onset of the visual or auditory stimulus. Bad channels were removed using the EEGLAB function "pop_rejchan" with the probability, spectrum and kurtosis options and an absolute threshold of 4 SD, after which channels were also inspected visually. A 1 Hz high-pass filter was applied (FIR, Hamming windowed; transition bandwidth, 1 Hz; filter order, 1650). Independent component analysis was performed, and artefactual ICA components were removed using the ICLabel plugin (Pion-Tonachini, Kreutz-Delgado, Makeig, 2019) (version 1.3). Remaining bad trials were rejected via an EEGLAB function using joint probability on the recorded electrodes (local activity probability limit: 4 SD, global limit: 2 SD).
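Baseline correction of the kind described above subtracts the mean pre-stimulus amplitude from every sample of an epoch. The following is a minimal single-channel sketch in plain Python, not the EEGLAB routine used in the study; the function name and the assumption that each epoch starts at −200 ms are illustrative, while the 500 Hz sampling rate comes from the acquisition description.

```python
def baseline_correct(epoch, sfreq=500.0, epoch_start_ms=-200.0,
                     baseline_ms=(-200.0, 0.0)):
    """Subtract the mean of the baseline window from one channel's epoch.

    `epoch` is a list of amplitudes whose first sample is at `epoch_start_ms`.
    The baseline window is half-open: the sample at the window end
    (stimulus onset) is not included.
    """
    start = round((baseline_ms[0] - epoch_start_ms) * sfreq / 1000)
    stop = round((baseline_ms[1] - epoch_start_ms) * sfreq / 1000)
    if not 0 <= start < stop <= len(epoch):
        raise ValueError("baseline window falls outside the epoch")
    baseline = sum(epoch[start:stop]) / (stop - start)
    return [sample - baseline for sample in epoch]
```

At 500 Hz, the 200 ms baseline covers the first 100 samples of an epoch that begins at −200 ms.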
Fig. 1. Each trial started with a blank grey screen, after which a black fixation cross was displayed in the center for 500 ms, followed by a random interval from 500 ms to 1000 ms and a visual, auditory or bimodal stimulus for 58 ms. The trial ended with two questions, "I heard stimulus" and "I saw stimulus", and 3 choice options ("not at all", "weakly", "clearly").
ERPs were analyzed using linear mixed effects models which included modality, awareness and their interaction as fixed effects and intercept, modality, awareness and their interaction as by-participant random effects (amplitude ~ modality*awareness + (1 + modality*awareness |ID)). We used a mass-univariate approach, analyzing ERPs of all channels and time points in a −200 to 600 ms window. For this analysis we used all the trials from all the participants, although some participants had very few trials per condition (the range was from 4 to 160 trials). To test for statistical significance, we used a non-parametric permutation approach with 1000 repetitions (Maris and Oostenveld, 2007) and performed threshold-free cluster enhancement (TFCE) to take into account clustering of effects (Smith and Nichols, 2009). The permutations were calculated in the Neuroscience Gateway Portal environment (Sivagnanam et al., 2013). TFCE was performed on the real dataset and on the permuted data, in which the condition labels were randomly shuffled. To control for multiple comparisons, the null distribution was obtained by selecting the maximum TFCE value from each permutation iteration (Maris and Oostenveld, 2007). TFCE values with p < .001 when compared to the null distribution were considered statistically significant. The TFCE was performed using a function from the LIMO EEG package with default parameters (E = 0.5, H = 2, dh = 0.1; Pernet et al., 2011).
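The logic of TFCE with a maximum-statistic null distribution can be illustrated in one dimension. The sketch below is a simplified stand-in for the LIMO EEG function, which operates on channel × time data with an electrode neighbourhood structure; here clusters are runs of adjacent samples along a single axis, only positive statistics are enhanced (negative effects would be handled by applying the same function to the sign-flipped values), and the function names and the p-value convention are assumptions.

```python
def tfce_1d(stat, E=0.5, H=2.0, dh=0.1):
    """Threshold-free cluster enhancement of positive values along one axis.

    For each threshold h = dh, 2*dh, ... every sample inside a supra-threshold
    run of length `extent` receives extent**E * h**H * dh.
    """
    scores = [0.0] * len(stat)
    peak = max(stat, default=0.0)
    k = 1
    while k * dh <= peak:
        h = k * dh
        i = 0
        while i < len(stat):
            if stat[i] < h:
                i += 1
                continue
            j = i
            while j < len(stat) and stat[j] >= h:
                j += 1  # extend the run of samples at or above h
            weight = ((j - i) ** E) * (h ** H) * dh
            for m in range(i, j):
                scores[m] += weight
            i = j
        k += 1
    return scores


def max_stat_p_values(observed, null_maxima):
    """Proportion of permutation maxima that reach each observed TFCE score.

    `null_maxima` holds one value per permutation: the largest TFCE score
    found anywhere in that permuted dataset.
    """
    n = len(null_maxima)
    return [sum(m >= obs for m in null_maxima) / n for obs in observed]
```

In use, `tfce_1d` would be applied to the observed statistics and to each label-shuffled dataset; collecting the largest score from every permutation gives `null_maxima`. With 1000 permutations and this convention, an observed score passes p < .001 only if it exceeds every permutation maximum.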

Awareness in different modalities
Percentages of aware trials in the visual, auditory, and bimodal conditions are shown in Fig. 2. A linear mixed effects model (nr of trials ~ modality + (1 + modality|ID)) for N = 20 was calculated to assess the average proportion of aware reports in the different modalities, with the visual aware condition taken as the baseline. The effect of auditory modality (B = −18.725, SE = 6.218, t = −3.012, p < .003) indicates that there were fewer aware trials in the auditory modality than in the visual modality. The effect of bimodal condition (B = −48.250, SE = 6.218, t = −7.760, p < .0001) indicates that there were fewer aware trials in the bimodal condition than in the visual modality. Thus, participants most often reported awareness of visual stimuli and least often of bimodal stimuli (i.e., both auditory and visual stimuli).
The percentage of visual false alarm responses in auditory (M = 47.9, SD = 22.6) and catch (M = 44.4, SD = 24) trials also differed from the percentage of auditory false alarms in visual (M = 15.7, SD = 13.8) and catch (M = 15.15, SD = 17.0) trials. Overall, there were more visual than auditory (B = −61.450, SE = 9.876, t = −6.222, p < .001) and bimodal (B = −89.900, SE = 9.876, t = −9.103, p < .001) false alarms, where a bimodal false alarm was an aware response in a catch trial for both the visual and the auditory modality. Also, there were more visual false alarms in auditory trials (B = 32.150, SE = 5.927, t = 5.424, p < .001) than auditory false alarms in visual trials. The signal detection theoretical approach was applied to compute the sensitivity (d') and the response bias (c) for each modality (Stanislaw and Todorov, 1999). For the visual modality, average d' = 0.5404, c = 0.5512; for the auditory modality, d' = 1.6934, c = 0.8043; and for the bimodal condition, d' = 1.3248, c = 1.7255. A linear regression analysis (d' ~ modality) shows higher sensitivity for the auditory (B = 1.1529, SE = 0.2207, t = 5.225, p < .001) and bimodal conditions (B = 0.7844, SE = 0.2207, t = 3.555, p < .001) compared to the visual modality. A second linear regression analysis (c ~ modality) indicates that the response bias in the auditory (B = 0.25306, SE = 0.09963, t = 2.540, p = .014) and bimodal conditions (B = 1.17427, SE = 0.09963, t = 11.786, p < .001) was shifted more towards the "no" response compared to the visual modality. As displayed in Fig. 2, participants reported seeing visual stimuli more often than hearing sound stimuli or simultaneously seeing and hearing a bimodal stimulus, but at the same time the visual false alarm rate was the highest, meaning that participants were less sensitive and made more errors in the visual modality. The bimodal condition was the hardest, having the highest response bias and the lowest number of aware trials.
Performance in the auditory condition was most accurate, having moderate response bias and highest sensitivity. Taken together, results of these analyses suggest that participants were more sensitive to detect auditory and bimodal stimuli, and more conservative in reporting auditory and bimodal awareness compared to visual awareness in this experiment.

EEG
Event-related potentials (ERPs) were calculated for 21 participants. Grand averages in several electrode sites are shown in Fig. 3. Scalp topographies for aware-unaware differences in each condition are shown in Fig. 4. In all conditions, negative peaks around 200 ms post stimulus and positive peaks around or after 300 ms were observed.
Results of the mass univariate regression analyses (modality × awareness) comparing visual, auditory and bimodal aware and unaware conditions are presented in Fig. 5. The upper panels show statistically significant contributions of each predictor to the ERP amplitudes at each time point and channel; color denotes t value. The leftmost panel shows the intercept (i.e., the reference category), which corresponds to the visual unaware condition. The intercept panel in the figure is empty because the stimulus was too weak to elicit a statistically significant ERP in the unaware condition. The main effect of awareness shows how ERPs differed between unaware and aware visual trials (note that these effects also apply to auditory/bimodal awareness, assuming these regressors do not show further effects). Negative t values in the N2 time window and positive t values in the P3 time window show that the amplitudes of these waves were strengthened in aware trials. Additionally, two clusters of statistically significant effects are present: a cluster of positive activation around 0-10 ms after stimulus onset, localized in left temporal, frontal and central electrodes, and a cluster of negative activation around 100 ms after stimulus onset, localized in right temporal and parietal electrodes. The main effect of auditory modality shows how ERPs in the auditory unaware condition differed from those in the visual unaware condition. A cluster of positive activation around 20 ms after stimulus onset, localized in right electrodes, and a cluster of negative activation around 400-440 ms after stimulus onset, localized in occipital electrodes, are present. A small cluster of negative activation around 550 ms, localized in left frontal and parietal electrodes, is also present in the main effect of auditory modality. The main effect of bimodal condition shows how ERPs in the bimodal unaware condition differed from the ERPs in the visual unaware condition. A cluster of positive activation around 0-20 ms after stimulus onset, localized in frontal, central and occipital electrodes, and a cluster of negative activation around 500-600 ms after stimulus onset, localized in frontal and occipital electrodes, are present.
The auditory modality × awareness interaction shows how the aware-unaware ERP difference in the auditory modality differed from that in the visual modality. Negative t values in the N2 time window and positive t values in the P3 time window indicate that auditory awareness modulated ERPs more strongly than visual awareness did, in similar time windows, and that both the negative and the positive activations started earlier. Additionally, three clusters of statistically significant effects are present: a cluster of positive activation in temporal electrodes around 170 ms before stimulus onset, a cluster of negative activation in right frontal, temporal and central electrodes 0-40 ms after stimulus onset, and a cluster of negative activation in right frontal, central and occipital electrodes around 500-550 ms after stimulus onset.
Finally, the bimodal × awareness interaction shows how the aware-unaware ERP difference in the bimodal condition differed from that in the visual modality. Negative t values in the N2 time window and positive t values in the P3 time window are detected in the analysis of the bimodal condition. The negative activation started earlier than in the visual modality. Additionally, the following clusters of statistically significant effects are shown: a cluster of positive activation in right fronto-temporal electrodes at 40 ms after stimulus onset, a left-lateralized negative cluster at 520 ms, a right-lateralized negative cluster at 550 ms and a posterior positive cluster at 590 ms after stimulus onset. The lower panel in Fig. 5 shows the corresponding statistically significant effects on scalp topographic maps. Supplementary Material Fig. 1 shows, on a more precise time scale, all clusters in the range from 0 ms to 600 ms after stimulus onset for the main effects of awareness, auditory modality and bimodal condition, as well as for the modality × awareness and bimodal × awareness interactions. In addition, Fig. 6 shows clusters of t values in the LP time window for the main effect of awareness, the auditory × awareness interaction, and the bimodal × awareness interaction. As Figs. 3-5 show, VAN emerges in posterior areas at occipital electrode sites around 220-270 ms and lasts until 300 ms, while AAN emerges earlier in temporal-parietal and central areas. The lack of an auditory modality × awareness interaction at occipital sites (i.e., the lack of red areas in the occipital electrodes in Fig. 5) from around 170 ms to 280 ms suggests that the AAN was so strong that it could not be statistically dissociated from VAN, and thus it was also detected in occipital electrodes.
The interactions around 350-440 ms and 460-560 ms in temporal, parietal, central, and frontal electrodes show that the auditory LP differed in topography from the visual LP: the amplitude of the early part of the LP was increased and that of the late part was decreased in the auditory modality. The visual LP started in occipital electrodes around 250-300 ms and spread over time to parietal-central electrodes, being present at least until 600 ms. At the very late latencies (>570 ms) no statistically significant difference between the auditory and visual LP was detected (as is evident from the lack of a modality × awareness interaction).
In the bimodal condition VAN and AAN emerge around the same time as in the unimodal conditions, VAN in occipital and AAN in temporal, parietal and central areas. They start a bit earlier than AAN in the auditory modality, and a later VAN effect similar to that in the visual modality is present. At both early and late latencies, the bimodal LP partially resembles patterns of both the visual and the auditory LPs.
Figure caption (continued): The main effect of modality shows how the ERPs in auditory unaware condition differ from the ERPs in visual unaware ones, C) The main effect of bimodal condition shows how the ERPs in bimodal unaware condition differ from the ERPs in visual unaware condition, D) The modality × awareness interaction indicates how the ERP difference between aware and unaware auditory trials differs from the corresponding difference in visual trials, E) The bimodal × awareness interaction shows how the ERP difference between aware and unaware bimodal trials differs from the corresponding difference in visual trials. (For interpretation of the references to color in this figure legend, the reader is referred to the Web version of this article.)

Discussion
Research on electrophysiological correlates of auditory and visual awareness has progressed separately, largely without interaction between the research lines within each modality. Here we aimed to bring the research lines together by comparing electrophysiological correlates of detecting visual and auditory stimuli in the same experiment to test whether there are modality-specific and modality-general correlates of awareness that are shared by vision and hearing. In summary, the results of the present study indicate that VAN and AAN are modality-specific early correlates of visual and auditory awareness, while LP is a late correlate that shares both modality-specific and modality-general features. Fig. 7 shows difference waves in different electrodes in the visual, auditory and bimodal conditions, and Fig. 8 illustrates their scalp distributions in the LP time window. Early parts of the LP are dissociated between modalities in both latency and scalp distribution: the auditory LP peaks earlier than the visual one. In the bimodal condition, both the early and the late parts of the LP partially resemble the LPs from the visual and auditory modalities.
Figure caption (continued): The main effect of awareness shows how the ERPs in aware trials differ from those in unaware ones, B) The modality × awareness interaction indicates how the ERP difference between aware and unaware auditory trials differs from the corresponding difference in visual trials, C) The bimodal × awareness interaction shows how the ERP difference between aware and unaware bimodal trials differs from the corresponding difference in visual trials.
Several studies suggest AAN as an early correlate of auditory awareness (Eklund et al., 2020; Snyder et al., 2015), similarly to VAN in vision (Railo et al., 2015). Recently, Dembski et al. (2021) proposed General Perceptual Awareness Negativity (PAN) as an umbrella term for the Awareness Negativity family covering at least the visual, auditory and somatosensory modalities. Our findings of modality-specific AAN, VAN and LP provide evidence for the view that neural correlates of consciousness in each sensory system are organized according to similar principles, so that they first involve sensory cortical areas and then common fronto-parietal areas (Eriksson et al., 2007; Snyder et al., 2015; Sanchez et al., 2020). Consistent with this view, awareness in the auditory and visual modalities correlated around 200 ms with negative potentials (AAN, VAN) which had modality-specific topographies, while the later positive correlate (LP) showed a topography that partially overlapped for vision and hearing. In the bimodal condition the LP showed a partially distinct pattern, which might be interpreted as evidence that the scalp topography of the LP also to some extent reflects the sensory modalities that are currently accessing the modality-general workspace. If the LP were purely modality-specific, we would expect to see different topographies, peaks and onset latencies across the time window in all conditions. If, on the other hand, the LP were purely modality-general, similar peaks, latencies and topographies would be present in all conditions. As shown in Figs. 3 and 7, the different peak and onset latencies of the LP indicate modality-specificity of its early part. The later part of the LP contains modality-general features, having the same scalp topography but in different time windows across conditions, as is evident from Figs. 4 and 8. This could indicate that the same global system becomes activated in different time windows.
The differences in scalp topographies between the auditory and visual modality cannot be explained by different response latencies between the modalities, because the order of giving the auditory and visual awareness reports was counterbalanced.
The neural sources of VAN are thought to be localized in the lateral occipital complex in the ventral stream according to source reconstructions such as LORETA (Koivisto and Revonsuo, 2010) or LAURA (Pitts et al., 2011). A recent study localized the source of AAN in bilateral auditory cortices and that of LP in ventral temporal areas using dynamic statistical parametric mapping. It is not surprising that in the present study the early ERP correlates in the bimodal condition represented a combination of those in the auditory and visual modalities, indicating that the bimodal condition shares activation patterns of both modalities.
The late positive difference in the P3 time window between electrophysiological responses to aware and unaware stimuli (LP) has previously been documented for different modalities. Noel et al. (2018) examined EEG complexity, trial variability and the presence of a late P300 for visual, auditory and bimodal stimuli and also found signs of LP in all conditions. They did not report VAN or AAN, but they did not statistically test the VAN/AAN time windows either. Studies using no-report paradigms show an absence of the LP (Schlossmacher et al., 2020; Sergent et al., 2021) in experiments where participants do not need to report their awareness or the stimulation is not related to a task. In studies where stimulation is task-relevant, LP is present. A recent MEG study (Sanchez et al., 2020) compared visual, auditory and tactile stimulation separately and found late activation of task-unrelated primary sensory regions; for instance, after detection of an auditory stimulus, the visual cortex was also activated at a late processing stage. Their results suggest that common late activation could be a neural correlate of access consciousness for different sensory modalities. A novel finding of the present study is that the late correlate of awareness, LP, comprises modality-specific and modality-general features and that its scalp topography depends on the contents of consciousness. Recent ERP studies on visual awareness suggest that LP is related to higher-level cognitive processing such as identification and semantics (Derda et al., 2019; Jimenez et al., 2018; Koivisto et al., 2017) or decision-making and response selection. Thus, the later part of the LP is likely to correlate with such higher-level processes, which are relatively independent of the modality. The earlier part of the LP must be related to the processes that occur between the processes in sensory cortical areas and those in the latest phases.
Prominent theories of consciousness, such as the global neuronal workspace theory or the recurrent processing theory (Lamme, 2010), emphasize that a reportable conscious representation requires recurrent processing between the fronto-parietal network and earlier cortical areas. According to the global neuronal workspace theory, sensory information needs to be broadcast into a globally distributed brain network in order to enter consciousness and be reported. The recurrent processing theory, by contrast, relates conscious awareness to dynamic local recurrent processes between higher and lower sensory areas, starting as soon as the feedforward sweep has reached a higher level (Lamme, 2010). According to the recurrent processing theory, higher-level cognitive processing and reports become possible later, when fronto-parietal areas are engaged in global recurrent processing with lower areas. Thus, we suggest that the early part of the LP may correlate with recurrent processing between the higher fronto-parietal areas and lower modality-specific areas, which might represent the active mechanisms creating access to the global workspace or global recurrent processing. Such processes, mediating between the modality-specific and modality-general stages of conscious processing, might be related to the first stages of binding the unimodal phenomenology into a common unity of consciousness, an issue referred to in the literature as the "binding problem" (Revonsuo, 1999).
We decided to keep the stimulus intensity constant throughout the experiment to ensure that differences in neural activity always corresponded to the same state of the environment. The drawback was that some participants' thresholds tended to shift even after threshold validation, so that they began to see or hear almost all or almost none of the stimuli. We kept the bimodal and empty conditions in the threshold calibration and validation phases to ensure that the stimulation, in other words the environment, would be identical to that in the actual experimental setting. In spite of these procedures, there were more aware visual trials than auditory ones and more visual false alarm responses than auditory ones. Despite the higher number of visual aware trials, the signal detection analysis (d') suggests that sensitivity to detect the stimuli was lower in the visual condition than in the auditory condition. This is because the criterion to report awareness was more liberal in the visual than in the auditory condition. We speculate that audio-visual interference in the experiment and in the bimodal condition may be the reason for the shifts in visual and auditory thresholds. Alternatively, the participants' attention may have shifted more toward auditory stimuli during the experiment. Nevertheless, even with the mentioned limitations, the mass univariate analyses showed statistically significant AAN, VAN, and LP for all three conditions. Although the behavioral results suggest that visual awareness ratings must have been lower than auditory ones and VAN was relatively weaker than AAN, we observed some dissociations between VAN and AAN, as there was no AAN in the visual condition. For example, the visual difference ERPs were clearly present in the time window from 220 ms to 270 ms at the most posterior electrode locations.
On the other hand, the auditory predominance complicates the interpretation of the onset latencies of the components, as the more strongly experienced auditory awareness may also have an earlier onset than the weaker visual awareness. Therefore, the topographical ERP differences between modalities, rather than the exact onset latencies, should be stressed as the main findings of this study.
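The signal detection argument made above, that a condition can yield more aware trials and more false alarms yet lower sensitivity, can be illustrated with a minimal sketch. The trial counts below are hypothetical and are not the data of this experiment; the log-linear correction (adding 0.5 to each cell) is likewise an assumption made here to keep the rates away from 0 and 1, not a description of the analysis actually performed.

```python
from statistics import NormalDist


def sdt_measures(hits, misses, false_alarms, correct_rejections):
    """Return (d_prime, criterion) computed from trial counts."""
    # Assumed log-linear correction: add 0.5 to each cell so that
    # neither rate can reach exactly 0 or 1.
    hit_rate = (hits + 0.5) / (hits + misses + 1.0)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1.0)

    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    d_prime = z(hit_rate) - z(fa_rate)            # sensitivity
    criterion = -0.5 * (z(hit_rate) + z(fa_rate))  # lower = more liberal
    return d_prime, criterion


# Hypothetical counts chosen only to illustrate the pattern.
visual = sdt_measures(hits=70, misses=30, false_alarms=30, correct_rejections=70)
auditory = sdt_measures(hits=60, misses=40, false_alarms=5, correct_rejections=95)

print("visual   d' = %.2f, c = %.2f" % visual)
print("auditory d' = %.2f, c = %.2f" % auditory)
```

With these illustrative counts the visual condition has more hits and more false alarms, a lower d' and a lower (more liberal) criterion than the auditory condition, which is the pattern described in the text.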
Finally, a more general limitation in studies on the neural correlates of consciousness is that the interpretation of the results is highly dependent on the definition of consciousness itself (Revonsuo, 2006), which is based on certain philosophical claims and specific theories of consciousness. If, for instance, one's favorite theory of perception does not include any concept of subjective phenomenal consciousness as such, then the early components AAN and VAN of course cannot be interpreted as the correlates of (this type of) "consciousness". Rather, they could only be interpreted as the correlates of "preconscious" processes, or as the (necessary) preconditions for conscious access, which would then be the only type of consciousness. In that case, the LP would be interpreted as the only true correlate of consciousness. The Global Neuronal Workspace Theory suggests exactly this kind of interpretation. Yet even then some minimal report is already available in the VAN/AAN time window (Railo et al., 2015). If we use a theory that distinguishes two types of consciousness and allows subjective phenomenal experience to occur independently of and earlier than global ignition and conscious access, then the early components AAN and VAN are interpreted as the earliest true correlates of subjective, modality-specific experience (as in e.g. the RPT, Lamme, 2010), and the late correlates (LP) could be linked either to some properties of reflective/access consciousness or to higher-level cognitive processes. For example, the late correlate in the P3 time window (LP) would not necessarily correlate with consciousness per se, but may reflect later task-relevant conscious processing (Koivisto et al., 2005;Koivisto and Revonsuo, 2008;Pitts et al., 2014;Sergent et al., 2021).
It is clear, however, that full perceptual awareness emerges as a function of time from mere phenomenal experience of the object's presence to a richer representation that can be reported (Campana and Tallon-Baudry, 2013;Bachmann, 2000), and we suggest that AAN/VAN and the early and late LP reflect different phases in this process.

Conclusions
Our experiment showed both early and late electrophysiological correlates of conscious awareness in all three conditions: visual, auditory and bimodal. We conclude that the early components AAN and VAN are modality-specific neural correlates of phenomenal consciousness and that at least partially similar neural activations take place in the bimodal condition. LP is a correlate of conscious access: its early part is modality-specific and its late part contains modality-general features, and it may denote cognitive post-perceptual processing of task-related stimuli or access consciousness.