The Timing of Gaze Direction Perception: ERP Decoding and Task Modulation

Distinguishing the direction of another person's eye gaze is extremely important in everyday social interaction, as it provides critical information about people's attention and, therefore, intentions. The temporal dynamics of gaze processing have been investigated using event-related potentials (ERPs) recorded with electroencephalography (EEG). However, the moment at which our brain distinguishes gaze direction (GD), irrespective of other facial cues, remains unclear. To address this question, the present study investigated the time course of gaze direction processing using an ERP decoding approach based on the combination of a support vector machine and error-correcting output codes. We recorded EEG in young healthy participants, 32 of whom performed a GD detection task and 34 a face orientation task. Both tasks presented 3D realistic faces with five different head and gaze orientations each: 30° and 15° to the left or right, and 0°. While classical ERP analyses did not show clear GD effects, ERP decoding analyses revealed that discrimination of GD, irrespective of head orientation, started at 140 ms in the GD task and at 120 ms in the face orientation task. GD decoding accuracy was higher in the GD task than in the face orientation task and was highest for direct gaze in both tasks. These findings suggest that the decoding of brain patterns is modulated by task relevance, which changes the latency and the accuracy of GD decoding.


Introduction
The human face is arguably the most frequent visual stimulus that we see from the second we are born. Face perception requires specific complex visual processing, involving the extraction of information about face familiarity, a person's identity, emotional expression, and the direction of the eyes. Direct gaze (DG) has been shown to induce attentional capture (Bockler et al., 2014; Senju and Hasegawa, 2005), to be perceived and processed faster than averted gaze (AG) (Conty et al., 2007), and to improve face encoding and subsequent face recognition (Conty and Grezes, 2012; Hood et al., 2003; Mason et al., 2004; Vuilleumier et al., 2005). Importantly, these eye contact effects appear to emerge very early in life (Farroni et al., 2007), continue to develop throughout early childhood (Smith et al., 2006), and are disrupted in atypically developing children (Lauttia et al., 2019; Senju and Johnson, 2009), suggesting that the perception of gaze direction taps into a core function of cognitive development, indispensable for effective everyday interaction and providing cues on how to adapt our behaviour to a specific social context (Lewis and Edmonds, 2003).
Although the eye region alone provides important cues about other people's attentional focus and intentions, this information can also be extracted from other visual signals, such as the angle of the other person's body posture, if it is available, and, importantly, the orientation of the head. As in everyday life we normally see the eyes in a face and not in isolation, the accurate detection of gaze direction not only involves information about the eye region but also integrates information about head orientation. For instance, the same eyes embedded in two different faces will be perceived either as looking towards or away from the observer depending on the head orientation of the face, the so-called Wollaston effect (Wollaston, 1824), clearly demonstrating that we process faces holistically rather than by extracting separate features. Since this demonstration, a great body of research has shown that perceived gaze direction can indeed be influenced by head orientation (e.g. Anstis et al., 1969; Hecht et al., 2020; Langton et al., 2004; Otsuka and Clifford, 2018; Otsuka et al., 2014; Ricciardelli and Driver, 2008).
In one respect, the rotation of the head provides a coarse spatial cue for gaze direction and thus facilitates the judgement: detection of gaze direction is faster when head orientation and gaze direction are congruent (Florey et al., 2015; Langton, 2000; Ricciardelli and Driver, 2008). In another respect, reverse congruency effects have also been reported (Anstis et al., 1969; Langton, 2000, Exp. 1; Ricciardelli and Driver, 2008): when the head is deviated in the direction opposite to the eye gaze, perceived gaze direction is shifted away from the head orientation, leading to a biased perception of gaze direction (Anstis et al., 1969; Gamer and Hecht, 2007; Otsuka et al., 2014) but better performance in gaze detection.
The magnitude of this so-called repulsive gaze direction effect was shown to be greater when only the eye region was present and less pronounced when the whole head was present, whereby gaze direction seems to be pulled towards the orientation of the head (Otsuka et al., 2014). It seems, thus, that head orientation alone can act as a direct cue for gaze direction perception. In the framework of the dual-route model, gaze perception consists of two distinct routes: the detection of the deviation of the pupil in the visible part of the sclera, which provides information about local changes in the eye region, and the orientation of the head, which provides global spatial cues (Otsuka and Clifford, 2018; Otsuka et al., 2014). Jointly, eye region and head orientation information ensure the accurate judgement of gaze direction.
Interestingly, Ricciardelli and Driver (2008) revealed an effect of task instructions on performance, whereby gaze detection was faster for congruent gaze and head conditions when the instruction required a speedy response, and for incongruent conditions when no time limit was given. These results suggest that task instructions can modulate the processing of gaze and head cues in detecting gaze direction and, in turn, affect performance. Task demand is thus one of the top-down influences from higher brain centres, exerted via feedback connections to visual circuits, which can indeed alter the response tuning of the neurons conveying information into visual areas (Gilbert and Li, 2013). Indeed, several studies have shown that behavioural performance (Latinus et al., 2015; Ricciardelli and Driver, 2008) and neuroimaging effects (Burra and Kerzel, 2019; Tautvydaitė et al., 2022) in face perception vary depending on task demand. Nevertheless, to date, it is unclear how task demand may alter the temporal processing of gaze direction and face orientation, and how the neural representations of these features interact.
To provide a holistic and unified picture of the face we are observing, the information extracted from the eye region and head orientation has to be integrated somewhere. At the neural level, studies with macaque monkeys showed that a large population of neurons in the macaque superior temporal sulcus (STS) coding for head orientation also coded for gaze direction (Perrett et al., 1985, 1992). De Souza et al. (2005), on the other hand, reported more heterogeneous coding within the macaque STS, selective for certain gaze directions and head orientations. Neuroimaging data in humans, too, revealed fine-grained coding of gaze direction in the anterior STS and precuneus that was invariant to head orientation (Carlin et al., 2011), whereas the posterior STS seems to be responsive to both varying gaze directions and head orientations (Carlin et al., 2011; Nummenmaa and Calder, 2009). It seems that the processing of social attention cues reflects both distinct and shared neural coding systems for gaze direction and head orientation (Battaglia et al., 2022). However, mutual neural coding might actually reflect the integration of eye and head cues, while each of them might be processed independently before being integrated. While the processing of gaze direction and head orientation, and the effect of their interaction on perceived gaze direction, has been addressed in some studies (Burra et al., 2017; Carlin et al., 2011; George et al., 2001), the neural mechanisms of these processes and the independence of their neural representations remain unclear. Event-related potential (ERP) studies, measured with electroencephalography (EEG), are particularly suitable on this point, as they provide measures of brain activity changes in response to a specific event (for example, seeing and distinguishing specific head orientations with a specific gaze direction). In the present study, we elucidate the processing of gaze direction and head orientation, as well as their interaction, by decoding brain activity patterns from the EEG signal. Decoding of ERPs not only reveals the temporal dynamics of neural processing with millisecond precision but, when combined with machine learning techniques, makes it possible to disentangle neural representations of visual stimuli (Bae, 2021; Bae and Luck, 2019a; Mares et al., 2023; Nemrodov et al., 2016), allowing for a more detailed understanding of the underlying neural mechanisms in a specific event.
The ERP decoding approach aims to quantify the information within the neural signal and makes it possible to detect how information content relevant to the experimental manipulation varies over time. Traditional ERP analyses assess whether the amplitude or latency of a specific component varies across trial types. Studies using such analyses have shown that early markers of face processing, in particular the P1 and N170 components, are sensitive to face perception and gaze direction processing (for a review see Tautvydaitė et al., 2022). Yet, the decoding approach offers advantages over classical ERP analyses, as it does not pre-determine the time periods and electrode locations of the effects of interest, as required for waveform analyses, but rather exploits combined information across all electrode sites and time intervals. For example, by decoding the ERP signal, it is possible to determine how the processing stages of face identity and emotion differ in explicit vs implicit task contexts (Smith and Smith, 2019), or to identify the temporal evolution of face representations during stimulus perception and maintenance in working memory (Bae, 2021). ERP decoding thus offers increased sensitivity in detecting subtle differences in the EEG signal that are not seen by traditional ERP analyses.
Building upon previous research, our study aimed to contribute to a more profound understanding of the neural mechanisms underlying gaze direction and head orientation processing. We particularly aimed to explore how the dynamics of brain representations conveying information about gaze direction and face orientation evolve over time, and how these processes are modulated by task demands. We presented realistic faces with varying gaze directions and head orientations while recording brain activity with EEG under two different task demand settings: detection of gaze direction and detection of head orientation. The ERP decoding approach (Bae, 2021) was used to isolate and analyse information specifically associated with the processing of gaze direction and head orientation.

Materials and methods
The study consisted of two experiments conducted in the same settings but at different time points and with different participants. We first designed Experiment 1 to distinguish gaze direction and face orientation EEG decoding. In this experiment, participants had to identify the gaze direction (Gaze task, see Fig. 1). We then wanted to discern how task instructions might affect gaze direction and head orientation decoding, and thus conducted Experiment 2 with the same experimental manipulations as in Experiment 1, except that participants now had to report the orientation of the head (Head task, see Fig. 1).
The experimental tasks, behavioural data, averaged EEG datasets, and the code underpinning the decoding analyses can be found at https://osf.io/tn78q/.

Participants
Study participants were young (18-35 years) healthy students recruited from the Psychology Faculty of the University of Geneva, who received course credits for participation. They all had normal or corrected-to-normal vision. Forty-four healthy participants took part in Experiment 1. Data from 12 participants were discarded from the analyses because of poor EEG signal (5) or poor performance in the task (7), yielding a final sample of 32 participants (3 male; mean (M) age ± standard error (SE): 22.1 ± 3.3 years). Experiment 2 included 35 participants (8 male, M = 21 ± 2 years), of whom one (male) was excluded due to technical issues during EEG recording.
As no power analyses were available before data collection, the sample size was determined based on previous EEG decoding studies (Bae and Luck, 2018, 2019b; Nemrodov et al., 2016), i.e. around 20 participants. To ensure that the number of participants in our study was sufficient to detect robust effect sizes, we estimated the power of our study by performing sensitivity analyses. These analyses confirmed that our study has high enough power to detect robust effect sizes with the sample size we used (please refer to the Supplementary Material for a detailed description and the results of the sensitivity analyses). All participants gave written informed consent to participate in the study, which was approved by the local ethics committee (Faculty of Psychology and Educational Sciences, University of Geneva). The study was conducted according to the Declaration of Helsinki.

Stimuli
Stimulus displays consisted of 3D realistic faces (1 male and 1 female) created using the software Poser 5. The luminance and spatial frequencies of each face were matched with the SHINE package (Willenbockel et al., 2010). Faces were presented with combinations of 5 gaze directions and 5 face orientations: −30°, −15°, 0°, +15°, and +30°, comprising in total 17 different variations of the male and female stimuli (see Fig. 1). The choice of one male and one female face, along with 17 different gaze/head combinations, was made to balance the experimental complexity of the stimuli.

Procedure
The experiment took place in a soundproof room with dim lighting. Participants sat 75 cm from a 53 × 29 cm computer screen with a refresh rate of 60 Hz. Stimulus presentation and response collection were controlled by a computer running Psychtoolbox v.3 (Kleiner et al., 2007). Each trial began with the presentation of a black fixation cross (0.5° × 0.5°) in the middle of a grey background screen for 500 ms. Then, the face stimulus appeared in the centre of the screen for 200 ms, followed by the grey screen with a fixation cross until participants pressed a key to answer. The inter-trial interval varied randomly between 700 and 1500 ms (see Fig. 1).
Participants were asked to indicate as accurately and as rapidly as possible (1) the direction of the gaze (left, direct, right; Gaze task) or (2) the head orientation (left, direct, right; Head task) by pressing one of three keys on a regular keyboard with three fingers of their right hand. Each block consisted of 34 trials, covering the 17 gaze direction and head orientation combinations for a female and a male face. The whole experiment included 20 blocks (680 trials), meaning that each stimulus type was randomly presented on 20 trials for the female and male faces. After every block, participants had a 1-min pause to rest their eyes. An illustrative sketch of the trial timeline is given below.
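The trial sequence described above can be illustrated with a minimal Psychtoolbox sketch. This is a reconstruction from the description, not the authors' code; the stimulus file name, cross size in pixels, and grey level are placeholders.

```matlab
% Illustrative Psychtoolbox sketch of one trial (not the authors' script).
AssertOpenGL;
screenNum = max(Screen('Screens'));
[win, rect] = Screen('OpenWindow', screenNum, 128);        % grey background (assumed level)
[cx, cy] = RectCenter(rect);
faceTex = Screen('MakeTexture', win, imread('face.png'));  % hypothetical preloaded face image

% Fixation cross for 500 ms
Screen('DrawLine', win, 0, cx-10, cy, cx+10, cy, 2);
Screen('DrawLine', win, 0, cx, cy-10, cx, cy+10, 2);
Screen('Flip', win);
WaitSecs(0.5);

% Face stimulus for 200 ms
Screen('DrawTexture', win, faceTex);
Screen('Flip', win);
WaitSecs(0.2);

% Fixation again until a response key is pressed
Screen('DrawLine', win, 0, cx-10, cy, cx+10, cy, 2);
Screen('DrawLine', win, 0, cx, cy-10, cx, cy+10, 2);
Screen('Flip', win);
KbWait;                                    % wait for the keyboard response
WaitSecs(0.7 + rand * 0.8);                % jittered inter-trial interval, 700-1500 ms
Screen('CloseAll');
```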

EEG recording and pre-processing
EEG was recorded continuously during the experimental tasks using a 64-channel Biosemi Active-Two system (BioSemi Active-Two, V.O.F., Amsterdam, The Netherlands) at a sampling rate of 1024 Hz. Common mode sense (CMS) and driven right leg (DRL) electrodes were used as reference channels. Electrodes were placed according to the International 10-10 system (Jurcak et al., 2007). Electrodes placed at the outer right and left canthi measured the horizontal electrooculogram (HEOG), and electrodes above and below the right eye measured the vertical electrooculogram (VEOG). The Matlab (Mathworks Inc.) toolboxes EEGLAB (Delorme and Makeig, 2004) and ERPLAB (Lopez-Calderon and Luck, 2014) were used for offline analysis. The data were band-pass filtered using a zero phase-shift Butterworth filter with half-amplitude cutoffs at 0.1 and 30 Hz. The filter was set to 0.1 and 10 Hz for HEOG and VEOG. Then, the data were re-referenced to the average of all scalp electrodes and down-sampled to 256 Hz. Independent component analysis (ICA) was applied to the continuous EEG data to remove ocular and cardiac artefacts; a minimal sketch of this pipeline appears below.
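For illustration, the continuous pre-processing could look roughly as follows in EEGLAB/ERPLAB. This is a sketch reconstructed from the text; the input file name and the filter order are assumptions, not reported parameters.

```matlab
% Sketch of the continuous pre-processing described above (reconstructed).
eeglab nogui;
EEG = pop_biosig('raw.bdf');                          % load the Biosemi recording (placeholder name)
EEG = pop_basicfilter(EEG, 1:EEG.nbchan, ...          % ERPLAB zero phase-shift Butterworth
      'Filter', 'bandpass', 'Design', 'butter', ...
      'Cutoff', [0.1 30], 'Order', 2);                % 0.1-10 Hz would be applied to the EOG channels
EEG = pop_reref(EEG, []);                             % re-reference to the scalp average
EEG = pop_resample(EEG, 256);                         % down-sample to 256 Hz
EEG = pop_runica(EEG, 'icatype', 'runica');           % ICA for ocular/cardiac components
```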
ICA-corrected EEG data were segmented from 200 ms before to 1200 ms after stimulus onset for each subject and stimulus type, and baseline corrected using the 200 ms pre-stimulus period. Noisy electrodes (Gaze task, M = 1; Head task, M = 2.1) were interpolated using 3D spline interpolation (Perrin et al., 1987). Epochs with eye blinks (difference in VEOG > 60 µV within a 150 ms period), saccadic eye movements to the left or right (difference in HEOG > 30 µV within a 150 ms period), and bad epochs (any electrode > 80 µV) were removed, and the data were inspected visually for any remaining noise; a sketch of these thresholds appears below. Further, trials with incorrect responses and with a response shorter than 200 ms or longer than 2000 ms were excluded from the analyses. Five participants were excluded from further analyses due to noisy EEG signals: two participants due to excessive alpha waves, and three participants due to too few clean epochs (< 50, see Boudewyn et al. (2018)) needed for EEG analyses.
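The rejection thresholds can be expressed in plain Matlab as in the following sketch; `data`, `veog`, and `heog` are assumed variable names standing for the epoched scalp and bipolar EOG signals.

```matlab
% Sketch of the epoch rejection criteria (illustrative variable names).
% data: electrodes x time x trials (in uV); veog, heog: time x trials.
fs  = 256;                                        % sampling rate after down-sampling
win = round(0.150 * fs);                          % 150 ms sliding window, in samples
bad = false(1, size(data, 3));
for tr = 1:size(data, 3)
    blink   = any(movmax(veog(:, tr), win) - movmin(veog(:, tr), win) > 60);  % VEOG > 60 uV
    saccade = any(movmax(heog(:, tr), win) - movmin(heog(:, tr), win) > 30);  % HEOG > 30 uV
    noisy   = any(abs(data(:, :, tr)) > 80, 'all');                           % any electrode > 80 uV
    bad(tr) = blink || saccade || noisy;
end
data(:, :, bad) = [];                             % drop flagged epochs before visual inspection
```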

ERP analysis
To align our study with the existing literature (for a review see Tautvydaitė et al., 2022), we performed an analysis of event-related potentials (ERPs) on two clusters of parieto-occipital electrodes located over the left and right hemispheres (P7, PO7, P9 and P8, PO8, P10, respectively) to measure the P1 and N170 components. We then extracted the mean amplitudes in the 120-160 ms and 170-210 ms time intervals for the P1 and N170, respectively. These mean values were analysed using rmANOVAs with Electrode side (2: left and right hemispheres), Head orientation (3: 30°, 15°, and 0°), and Gaze direction (3: 30°, 15°, and 0°) as within-subject factors, separately for the P1 and N170 and for the Gaze and Head tasks. A sketch of the amplitude extraction is given below.
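A minimal sketch of the mean-amplitude extraction, assuming a per-subject average `erp` (electrodes × time) and a latency vector `times` in ms; these names are illustrative, not the authors' variables.

```matlab
% Sketch of P1/N170 mean-amplitude extraction over the parieto-occipital clusters.
leftCluster  = {'P7', 'PO7', 'P9'};
rightCluster = {'P8', 'PO8', 'P10'};
chanIdx = @(labels) find(ismember({EEG.chanlocs.labels}, labels));
p1win   = times >= 120 & times <= 160;            % P1 window
n170win = times >= 170 & times <= 210;            % N170 window
p1Left    = mean(erp(chanIdx(leftCluster),  p1win),   'all');
p1Right   = mean(erp(chanIdx(rightCluster), p1win),   'all');
n170Left  = mean(erp(chanIdx(leftCluster),  n170win), 'all');
n170Right = mean(erp(chanIdx(rightCluster), n170win), 'all');
% These per-subject, per-condition values enter the rmANOVAs described above.
```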

Decoding analyses of gaze direction and head orientation

Data organization for decoding analyses
Critically, we aimed to decode gaze direction and head orientation from the phase-locked ERP voltage signal. All decoding analyses were performed using Matlab (Mathworks Inc.). As the decoding procedure is known to be affected by alpha-band activity (8-12 Hz) (see, e.g., Bae and Luck, 2019a, 2019b), we low-pass filtered our data at 6 Hz using the EEGLAB eegfilt() routine. To speed up the analyses, the data were then resampled to 50 Hz (20 ms per sample, 71 time points per epoch). The decoding analyses were performed for gaze direction and head orientation in the Gaze task and Head task separately, and the decoding results were then compared across tasks (see Bae et al., 2020). To decode gaze direction, the epoched ERP data were organized according to the variations of gaze direction, i.e., trials for each of the 5 gaze directions were collapsed across the 5 head orientations (resulting in 120 trials each for Gaze −30°, Gaze −15°, Gaze +15°, and Gaze +30°, and 200 trials for Gaze 0°, as it was presented with 5 different head orientations: Head +30°, Head +15°, Head 0°, Head −15°, and Head −30°). The same organization rule was applied for head orientation decoding, except that here the trials were collapsed across the 5 gaze directions (resulting in 120 trials each for Head −30°, Head −15°, Head +15°, and Head +30°, and 200 trials for Head 0°, as it was presented with 5 different gaze directions: Gaze +30°, Gaze +15°, Gaze 0°, Gaze −15°, and Gaze −30°). We thus had two conditions, Gaze direction and Head orientation, with 5 levels ("bins") each: Gaze −30°, Gaze −15°, Gaze 0°, Gaze +15°, Gaze +30°, and Head −30°, Head −15°, Head 0°, Head +15°, Head +30°. Note that if, after EEG pre-processing, the numbers of clean trials differed across condition bins, we randomly sub-sampled the trials according to the smallest number per condition bin. The number of trials was thus equalized across condition bins so as not to bias the classifier. This organization process resulted in a matrix composed of 4 dimensions: time (51 time points, from 200 ms pre-stimulus onset to 800 ms after stimulus onset, one time point every 20 ms), condition (Gaze direction and Head orientation, 5 direction bins each), trial (120 trials per condition), and electrode (64). Such a 4-dimensional matrix was produced for each participant and each task (Head task or Gaze task); a sketch of this organization step is given below.
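For illustration, the bin organization and trial equalization for gaze direction decoding could be sketched as follows (the head orientation case is symmetric). `epochs` and `gazeLabel` are assumed variable names, not the authors' code.

```matlab
% Sketch of the trial organization for gaze direction decoding (a reconstruction).
% epochs: electrodes x time x trials, already low-pass filtered at 6 Hz and
% resampled to 50 Hz; gazeLabel: 1 x trials, gaze direction in degrees.
gazeBins = [-30 -15 0 15 30];
counts   = arrayfun(@(g) sum(gazeLabel == g), gazeBins);
nKeep    = min(counts);                              % the smallest bin sets the trial count
nTime    = size(epochs, 2);  nElec = size(epochs, 1);
X = nan(nTime, numel(gazeBins), nKeep, nElec);       % time x bin x trial x electrode
for b = 1:numel(gazeBins)
    idx = find(gazeLabel == gazeBins(b));            % trials of this gaze direction,
    idx = idx(randperm(numel(idx), nKeep));          % randomly sub-sampled to nKeep
    X(:, b, :, :) = permute(epochs(:, :, idx), [2 4 3 1]);
end
```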

Decoding data classification
To classify ERP signals of Gaze direction and Head orientation, we used a support vector machine (SVM) with error-correcting output codes (ECOC; Dietterich and Bakiri, 1994), implemented through the Matlab function fitcecoc(). Gaze direction and Head orientation were decoded based on the ERP signal over the 64 electrodes for each of the 51 time points.
A 3-fold cross-validation procedure was used to train and test the decoder and estimate the decoding accuracy (Grootswagers et al., 2017). Decoding was performed on averaged ERP trials per condition rather than on single-trial data, where the signal-to-noise ratio is typically reduced. The trials for a given condition were randomly divided into 3 subsets, where 2/3 of the dataset was used to train the classifier to decode the stimuli from the ERP signal, and the remaining 1/3 was used as a testing set. In the training set, the SVM learned to distinguish the 5 gaze directions and 5 head orientations across the 64 electrodes at each time point. Precisely, the training set, containing the labels of gaze directions and head orientations, was submitted to the ECOC model to train 5 SVMs. A binary classification scheme was applied with a one-vs-rest coding design, whereby each SVM learns to separate one gaze direction or head orientation from the 4 others. Once the classifier was trained, the SVMs of each condition bin (5 gaze directions and 5 head orientations) were used to predict the unlabelled conditions in the test set. Finally, the comparison between the true labels of gaze direction and head orientation in the test set and the predicted labels allowed us to compute the decoding accuracy. The performance of the classifiers was considered above chance level if the obtained decoding accuracy was higher than 0.2 (1/5). A minimal sketch of this procedure is given below.
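A hedged sketch of this one-vs-rest SVM/ECOC scheme with 3-fold cross-validation, assuming the time × bin × trial × electrode matrix `X` built above. Training on fold-averaged ERPs follows the description of decoding averaged rather than single trials; the exact averaging scheme is our reading of the text.

```matlab
% Sketch of the per-time-point SVM/ECOC decoding (a reconstruction, not the authors' code).
nBins = 5;  nFolds = 3;
[nTime, ~, nTrials, nElec] = size(X);
foldOf = repmat(1:nFolds, 1, ceil(nTrials / nFolds));
foldOf = foldOf(randperm(nTrials));                  % random fold assignment per trial
acc = zeros(nTime, 1);
for t = 1:nTime
    % Average the trials within each fold: 3 averaged ERPs per direction bin
    avg = zeros(nFolds, nBins, nElec);
    for f = 1:nFolds
        m = mean(X(t, :, foldOf == f, :), 3);        % 1 x nBins x 1 x nElec
        avg(f, :, :) = reshape(m, [1, nBins, nElec]);
    end
    hits = 0;
    for f = 1:nFolds                                 % each fold serves once as the test set
        trainIdx  = setdiff(1:nFolds, f);
        trainFeat = reshape(avg(trainIdx, :, :), [], nElec);  % 2 folds x 5 bins patterns
        trainLab  = repelem((1:nBins)', numel(trainIdx));     % bin labels (fold-fastest order)
        testFeat  = reshape(avg(f, :, :), [], nElec);         % 5 test patterns, bins 1..5
        mdl  = fitcecoc(trainFeat, trainLab, ...
                        'Coding', 'onevsall', 'Learners', templateSVM());
        pred = predict(mdl, testFeat);
        hits = hits + sum(pred == (1:nBins)');
    end
    acc(t) = hits / (nFolds * nBins);                % chance level = 1/5 = 0.2
end
```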
The whole procedure was repeated 3 times so that each of the subsets served once as the testing set. It was then iterated 10 times with different randomly selected trials assigned to the training and testing sets, for each time point and each subject, in the Gaze task and Head task separately. The number of iterations was determined according to previous ERP decoding studies (Bae, 2021; Bae and Luck, 2019a, 2019b). Decoding accuracies were then averaged by collapsing the 150 decoding outputs (5 directions × 3 cross-validation folds × 10 iterations) for each time point of the epoch. Finally, to decrease noise, the averaged decoding accuracies were smoothed using a 5-point (± 40 ms) moving window, as sketched below.
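The iteration and smoothing steps can be sketched as follows; `decodeOnce()` is a hypothetical wrapper around the cross-validation sketch above, not a named function from the paper.

```matlab
% Sketch of the iteration loop and 5-point smoothing (illustrative).
nIter  = 10;
accAll = zeros(nTime, nIter);
for it = 1:nIter
    accAll(:, it) = decodeOnce(X);   % new random fold assignment on every iteration
end
accMean   = mean(accAll, 2);         % collapse the 150 outputs (5 x 3 x 10) per time point
accSmooth = movmean(accMean, 5);     % 5-point (+/- 40 ms) moving-window smoothing
```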
The same procedure was applied to decode the Gaze direction and Head orientation in the Gaze task and Head task.

Statistical analyses of decoding accuracy
If the ERP signal recorded in the Gaze and Head tasks contains information about Gaze direction or Head orientation, then the obtained decoding accuracy should be higher than chance level. We applied a non-parametric cluster-based permutation approach (similar to the cluster-based mass univariate analyses usually reported in the EEG literature (Groppe et al., 2011; Maris and Oostenveld, 2007)) to verify whether the decoding accuracy was significantly different from chance.
We first collapsed decoding accuracies across participants and computed one-sample t-tests separately for Gaze direction and Head orientation at each time point after stimulus onset to verify whether the decoding accuracy was significantly greater than chance (1/5 = 0.2). We then found the clusters where significant t values (p < 0.05) extended over several contiguous time points. The t values within these time points were summed to obtain a cluster-level t mass, which was then compared to a null distribution of cluster-level t mass values.
The null distribution was constructed using a permutation analysis. Specifically, the permutation analysis was performed by shuffling the true labels of Gaze direction and Head orientation in the test data before determining whether the decoding output corresponded to the true labels. This allowed us to reconstruct the null distribution: the decoding output that would have been obtained if the decoder had no information about Gaze direction and Head orientation. Crucially, the same shuffled target labels were used for each time point in a given epoch to respect the temporal auto-correlation of EEG data, a common pitfall in EEG statistical analyses. The whole permutation process was repeated many times to populate the null distribution; a sketch of the procedure is given below.
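A hedged sketch of the full cluster-based permutation test. The permutation count is an assumption (the text does not state it), the stored prediction arrays are assumed variables, and bwconncomp requires the Image Processing Toolbox.

```matlab
% Sketch of the cluster-based permutation test on decoding accuracy (a reconstruction).
% Assumed inputs: acc (subjects x time decoding accuracy, post-stimulus points),
% pred (subjects x time x nTest predicted labels), truth (nTest x 1 true labels).
[~, p, ~, stats] = ttest(acc, 0.2, 'Tail', 'right');        % one-sample t-test per time point
cc = bwconncomp(p < 0.05);                                   % contiguous significant runs
obsMass = cellfun(@(ix) sum(stats.tstat(ix)), cc.PixelIdxList);  % observed cluster t mass

nPerm = 1000;                                                % assumed; count not stated in the text
nullMass = zeros(nPerm, 1);
for k = 1:nPerm
    sh = truth(randperm(numel(truth)));                      % same shuffle at every time point
    permAcc = mean(pred == reshape(sh, 1, 1, []), 3);        % accuracy under shuffled labels
    [~, pp, ~, st] = ttest(permAcc, 0.2, 'Tail', 'right');
    ck = bwconncomp(pp < 0.05);
    if ck.NumObjects > 0
        nullMass(k) = max(cellfun(@(ix) sum(st.tstat(ix)), ck.PixelIdxList));
    end
end
pCluster = arrayfun(@(m) mean(nullMass >= m), obsMass);      % cluster-level p-values
```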

Results

ERPs
To compare the effects of gaze direction and head orientation on the ERP components P1 and N170, we conducted a repeated-measures analysis of variance (rmANOVA) with the factors Side (two levels: left and right), Head orientation (three levels: 30°, 15°, and 0°), and Gaze direction (three levels: 30°, 15°, and 0°). ERP waveforms are displayed in Fig. 2.

Decoding analysis
Decoding performance was computed for ERP epochs from −200 ms to 800 ms relative to stimulus onset to reveal how well Gaze direction and Head orientation can be read out from the multi-channel ERP data under the Gaze direction and Head orientation detection tasks.

Decoding in Gaze task
Gaze direction. Decoding of Gaze direction in the Gaze task (i.e., when participants had to indicate whether the gaze of the presented face was oriented towards the observer, or averted to the left or the right) became significant at 140 ms (with 21 % decoding accuracy) and remained significant across the entire epoch (Fig. 3A). The Gaze direction decoding accuracy first peaked at 500 ms and then at 660 ms, with average decoding accuracies of 40.9 % and 41.6 %, respectively. The confusion matrices (Fig. 3B, first panel) depict the probabilities of correct classification averaged over the early (0-400 ms) and later (400-800 ms) time periods, reflecting correct classification during both. In the early time period, all gaze directions were classified just above chance level, whereas in the later time period, gaze directions were better separated, with direct gaze having the best classification accuracy (over 55 %) and gaze aversions to the left or right lower classification accuracies (around 30 %). These results suggest that different gaze directions produced distinct neural patterns across the time course of stimulus perception, whereby gaze directions were much better decodable later (> 400 ms) in the stimulus processing time course.
Head orientation. Reliable decoding of Head orientation in the Gaze task began shortly after stimulus onset, at 20 ms. It peaked at 380 ms with 51.7 % decoding accuracy, and then decreased gradually until the end of the epoch while still remaining above chance level (Fig. 3A, middle panel). The corresponding confusion matrices in the early (0-400 ms) and later (400-800 ms) time windows are depicted in Fig. 3B, second panel, and show that all head orientations could be well distinguished (all probabilities > 20 %, p < 0.05) in both time periods, demonstrating that decoding was not driven by a single distinctive head orientation. Thirty-degree head orientations obtained the best discrimination (over 50 %) in the early and late time periods, reflecting that the neural patterns underlying distinct head orientations remain stable over the time course of stimulus processing.
Comparison. Gaze direction and Head orientation were each decoded separately, by collapsing across the 5 head orientations and the 5 gaze directions, respectively (see Methods). The decoding accuracy for both dimensions was above chance level (Fig. 3A), suggesting that the neural representations produced by gaze direction and head orientation are at least partially independent. The time course of decodability supports this independence: while Head orientation became decodable already at 20 ms after stimulus onset, Gaze direction could only be reliably decoded much later, at 140 ms, pointing to the precedence of neural patterns generated by head orientation over gaze direction processing. This early representation is also evident from the cluster-based permutation analyses (see Methods), which compared Gaze direction and Head orientation decoding accuracies across time and showed significantly greater decoding (p < 0.001, two-tailed) of Head orientation than Gaze direction starting at 20 ms and remaining greater until 500 ms after stimulus onset (Fig. 3A, right panel). Around 520 ms, the decoding accuracy of Gaze direction rose higher than that of Head orientation, but this increase did not reach significance.

Decoding in Head task
Gaze direction. In the Head task, when participants were instructed to detect the orientation of the head of the presented face, decoding of Gaze direction became significant at 120 ms, with a just-above-chance decoding accuracy of 21.7 %. Decoding accuracy then increased progressively, reaching its peak at 340 ms with 28.9 % average accuracy. The confusion matrices, divided into two time periods (0-400 ms and 400-800 ms; Fig. 3B, third panel), revealed that all 5 gaze directions could be correctly discriminated (all probabilities > 20 %, p < 0.05) over the entire epoch.
Head orientation. Reliable decoding of Head orientation also started at 120 ms, with 21.5 % average accuracy, rose sharply to a peak of 44.6 % at 380 ms, and then decreased progressively, although remaining significantly above chance level until the end of the epoch. The corresponding confusion matrices, depicted in Fig. 3B, fourth panel, show that all head orientations could be well discriminated (all probabilities > 20 %, p < 0.05) in the early (0-400 ms) and late (400-800 ms) time periods. Heads deviated 30° to the right yielded the greatest classification (over 30 %) in the early, but not the later, time period, reflecting that this head orientation could be read out best early after stimulus onset.
Comparison. The same decoding accuracy comparison analysis as in the Gaze task was performed for Gaze direction and Head orientation decoding in the Head task. As is evident from Fig. 3, decoding accuracies for both dimensions were significantly higher than chance, showing that both dimensions could be read out independently of one another. Although the onset of reliable decodability was the same (120 ms) for both Gaze direction and Head orientation, the Head orientation accuracy rose much more sharply and peaked higher than the flatter Gaze direction slope. This increased decodability for Head orientation was confirmed by cluster-based permutation analyses, which highlighted significantly greater (p < 0.001, two-tailed) decoding for Head orientation as compared to Gaze direction starting at 160 ms after stimulus onset and remaining until 780 ms.

Impact of task on decoding accuracy
The effect of task instructions on the decoding of Gaze direction and Head orientation is depicted in Fig. 4 (A and B, respectively). A cluster-based permutation analysis between tasks showed that decoding of Gaze direction was significantly higher in the Gaze task than in the Head task, starting at 300 ms after stimulus onset and remaining significant until the end of the epoch (Fig. 4A). This reflects, as expected, that Gaze direction information is better represented in the Gaze task, when participants are explicitly asked to detect the gaze direction. Interestingly, the decoding accuracy of Head orientation was significantly weaker in the Head task as compared to the Gaze task, from stimulus onset until 800 ms (Fig. 4B). Importantly, this result demonstrates that the decodability of Head orientation is greater even under conditions where gaze direction, and not head orientation, is required to be detected.
To complement this analysis, we extracted and averaged accuracy values for each participant in the time window between 320 and 520 ms in the Gaze task, and from 280 to 480 ms in the Head task (i.e., a 200 ms time window surrounding the rise and/or peak of the decoding slope obtained at the group level for each task; see Fig. 4A and B). As is evident from the strip chart (see Rousselet et al., 2017) in Fig. 4C, decoding accuracy values for gaze direction and head orientation vary depending on the task instructions. Comparisons of individual peak decoding values in the selected time windows revealed a significant decrease in decoding accuracy as a function of task for both Gaze direction (t(64) = −6.14, p < 0.001, two-tailed; d = −1.51; Gaze task = 38 %, Head task = 26 %) and Head orientation (t(64) = −2.25, p < 0.001, two-tailed; d = −2.25; Gaze task = 49 %, Head task = 38 %).

Discussion
In the present study, we employed a multi-channel EEG decoding approach to elucidate the neurodynamic patterns of gaze direction and head orientation processing. We aimed to understand how task instructions shape the neural representations related to these processes. Firstly, the behavioural data showed that, when presenting faces with varying gaze directions and head orientations, the detection of head orientation was in general easier than the detection of gaze direction. Importantly, in both the Gaze and Head tasks we found congruency effects in task performance: detection of both gaze direction and head orientation was more precise and faster when gaze direction and head rotation pointed to the same side (see Supplementary Material for more details). The P1 and N170 components failed to reveal a strong sensitivity to gaze direction but showed a main effect of head orientation. From an EEG decoding perspective, however, our findings indicate that gaze direction and head orientation could be decoded irrespective of one another in both task sets. Our study also yielded strong evidence for temporally distinct decoding of gaze direction and head orientation. These results suggest that the ERP signals containing information about gaze direction and head orientation are partially dissociable.

One of the objectives of the present study was to identify whether the neural mechanisms involved in gaze direction and head orientation processing are distinguishable. Most studies aiming to disentangle these processes use rather unrealistic stimuli, presenting only the eye region in isolation or excluding the eye region from the whole face, rarely seen in everyday life (for example, Langton, 2000; Otsuka and Clifford, 2018; Otsuka et al., 2014; Ricciardelli and Driver, 2008). We specifically opted for an EEG decoding technique that allowed us to unravel the processing of different information dimensions while keeping the whole realistic facial stimulus, with no omission of the eye region. In two different task settings, both gaze direction and head orientation could be reliably decoded independently of one another, suggesting that the neural representations containing information about gaze direction and head orientation are at least partially distinguishable. Such independence was also supported by the distinct temporal decodability dynamics of each dimension (although only in the Gaze task). Regarding the disparity between gaze and head processing, our findings are consistent with functional neuroimaging studies also reporting functional specificity of gaze and head representations (Carlin et al., 2011; Materna et al., 2008). In the Carlin et al. (2011) study, the anterior part of the STS was shown to be preferentially involved in the perception of gaze direction, independently of head rotation and physical image features, whereas the posterior part of the STS was sensitive to both different gaze directions and head views (Carlin et al., 2011). It thus seems that the representational content of gaze direction and head orientation is shared in the posterior STS (Nummenmaa and Calder, 2009), while representations of gaze direction that are invariant to head orientation prevail in the anterior STS. Importantly, in the Carlin et al. (2011) experiment, the temporal information of gaze processing was not available, while the present study could establish different decoding onsets for gaze direction and head orientation, which, additionally, vary depending on task context.
Our study adds important value to the distinction between gaze and head perception systems not only by showing that information about these dimensions can be reliably decoded from EEG signals independently of one another in different task settings, but also by showing that they have distinct temporal dynamics depending on task instructions. In the Gaze task, participants needed to detect gaze direction. However, head orientation became decodable as early as 20 ms after stimulus onset, preceding the onset of gaze direction decodability by 120 ms. This timing difference indicates that head orientation cues were prioritized over gaze direction cues in order to efficiently detect gaze direction. Jointly, our findings on the different decoding onsets and the independence of gaze direction and head orientation decoding, together with previous neuroimaging studies (Carlin et al., 2011; Nummenmaa and Calder, 2009; Pageler et al., 2003), raise an important point: the perception of gaze might be organized hierarchically, whereby information conveying more global visual cues (i.e., head orientation) is extracted and perceived first in the posterior STS (Carlin et al., 2011), and then filtered out to concentrate on the processing of more local, gaze direction-specific visual cues of the eye region in the anterior STS (Carlin et al., 2011, 2012). Such a trajectory would also be in line with studies in macaque monkeys showing neuronal responses in the temporal cortex to the integral information of the face first, followed by the processing of more refined facial characteristics at a later processing stage (Cauchoix et al., 2012; Sugase et al., 1999). Such a stratified time trajectory does not rule out the possibility that an integration process arises at later stages of face processing, once the global and more detailed facial features have been extracted.
Notably, such a decoding timing difference was not observed in the Head task, where decoding of both Gaze direction and Head orientation became significant at the same time point, 120 ms after stimulus onset. This finding suggests that, while in the Gaze task information about head orientation becomes available before information about gaze direction, in the Head task both representations seem to become accessible at the same time.

Task relevance
Our decoding results vary depending on task context and decoding dimension, indicating that the decoding of brain patterns is affected by task relevance. In the Gaze task, head orientation had the higher decoding accuracy, meaning that the task-irrelevant dimension was decoded better than the task-relevant dimension. This suggests that neural patterns containing information about head orientation are better represented than those of gaze direction in the Gaze task. Such higher decoding accuracy for the task-irrelevant dimension was not observed in the Head task, however. This means that head orientation serves as an important cue for detecting gaze direction, while gaze direction does not seem to play an essential role in extracting face orientation.
One intriguing finding of our study was that head orientation was decoded better in the Gaze task than in the Head task; that is, head orientation decoding was better in the irrelevant task context ("detect gaze direction") than in the relevant task context ("detect head orientation"). Gaze direction decoding, however, was higher in the task-relevant context, i.e., the Gaze task. These results suggest that information about gaze direction is not needed as much in the Head task as head orientation was needed in the Gaze task. Note that, in general, detecting head orientation was much easier for participants than detecting gaze direction, another point supporting the idea that head orientation serves as an important, and probably primary, cue for gaze detection, while for head orientation detection gaze direction is not needed as much. Taken together, these findings reflect the enhanced importance of head orientation in extracting gaze direction.
Our initial interpretation suggested an intentional prioritization of head orientation cues over gaze direction cues. However, upon further consideration, it is more accurate to state that head orientation cues might be more readily available or processed earlier than gaze direction cues. Early decoding differences might result from activity specific to some form of global form-related activation, needed to further extract the eye region information and successfully complete the task. This does not necessarily imply conscious prioritization but rather reflects the hierarchical nature of visual processing. In this sense, early visual processing stages may focus on global visual features, such as basic spatial features, followed by a more detailed analysis, such as gaze direction, as supported by the work of Grill-Spector and Kanwisher (2005) (see Burra et al., 2017 for a more detailed discussion of the matter).

Limitations
The results of this study deepen our understanding of the temporal dynamics of gaze direction processing, but there are also several limitations. For example, to control for all possible visual characteristics, the face images used in this study were realistic 3D avatar faces, not real human faces, which weakens their ecological validity compared with human faces. Furthermore, we used a between- and not a within-subjects design, which might introduce subject-related variability into the reported effects.
Recent concerns have been raised regarding the early decoding of head orientation within 20 ms post-stimulus. This rapid discrimination might not necessarily reflect the cognitive processing of the stimulus as a face or head. Instead, it could be indicative of pre-attentive processing of basic visual features such as low-frequency contrast or shape differences. This perspective aligns with the findings of Thorpe et al. (1996), who demonstrated that early visual processing primarily involves the extraction of basic features before the stimulus is fully resolved as a specific object. To further elucidate this phenomenon, future research could involve control experiments using non-facial stimuli with similar symmetry and contrast properties to determine whether the early decoding is specific to facial stimuli or reflects a more general feature of visual processing.
In response to concerns about potential differences in attentional focus during the Gaze and Head tasks, we ensured consistent visual stimulation across tasks. Participants were instructed to maintain their gaze on a fixation cross, and horizontal electrooculogram recordings were used to monitor eye movements. This approach aimed to minimize variations in visual attention that could influence the ERP responses. However, it is acknowledged that differential patterns of attention, such as a focus on the eyes in the Gaze task versus a more peripheral focus in the Head task, could lead to a differential pattern of attentional deployment, which, in turn, could affect the ERP signal. Future studies incorporating eye-tracking technology (as suggested in Henderson, 2003) could provide more precise control and understanding of attentional focus during these tasks.

Conclusions
The current study used an EEG decoding approach to identify the temporal dynamics and relations of two facial features conveying the direction of social attention: gaze direction and head orientation. Firstly, we showed that congruency between gaze direction and head orientation leads to better performance in the Gaze and Head tasks. We then revealed that gaze direction and head orientation representations become decodable at different times and that their temporal dynamics depend on task context.

Fig. 1.
Fig. 1. Experimental paradigm and stimuli. (A) Task procedure showing the time course of an experimental trial. The task design and stimuli were identical in both tasks; only the task instructions changed. In the Gaze task, participants had to indicate whether the gaze of the avatar was directed to the left, to the right, or towards them. In the Head task, participants were asked to indicate the head orientation (left, right, or frontal). (B) Variations of the experimental stimuli with 5 different gaze directions and head orientations, shown on the female avatar. NB: stimuli were grayscaled and controlled for low-level differences in the actual experiment.

Fig. 2.
Fig. 2. ERP waveform analyses on the parieto-occipital cluster for Head orientations. The ERP waveforms for 30° (blue line), 15° (red line), and 0° (black line) head orientations, with 30°, 15°, and 0° gaze directions combined, in the Gaze and Head tasks. The vertical axes indicate the amplitude (in µV), and the horizontal axes show time (in ms). The vertical grey bars depict the time intervals over which analyses were performed: 120-160 ms for P1 and 170-210 ms for N170. We measured a main effect of head orientation on the P1 and N170 components, but no effects of Gaze direction, Gaze direction × Head orientation, or Side × Gaze direction interaction in either experiment.

Fig. 3.
Fig. 3. Decoding of Gaze direction and Head orientation in the Gaze (A, B) and Head (C, D) tasks. A-D, top panels: time course of mean decoding accuracy for gaze direction (A) and head orientation (B) decoding in the Gaze task, and in the Head task (C: gaze direction, D: head orientation). The dashed line indicates above-chance accuracy. The grey areas indicate the time periods when the decoding accuracy was significantly different from chance. The light shadings indicate the SEM. A-D, bottom panels: the corresponding confusion matrices showing classification probabilities in the early (0-400 ms) and late (400-800 ms) time windows.

Fig. 4.
Fig. 4. Effect of task on Gaze direction and Head orientation decoding. Decoding accuracy for Gaze direction (A) and Head orientation (B) in the Gaze (yellow) and Head (blue) tasks. The dashed line shows the above-chance decoding accuracy (0.2 for both Gaze direction and Head orientation). The grey areas represent the time windows where decoding accuracy differed significantly between tasks (p < 0.05). (C) Strip chart depicting mean decoding accuracy values between 320-520 ms in the Gaze task and 280-480 ms in the Head task for each participant.