Data-driven analysis of gaze patterns in face perception: Methodological and clinical contributions

Gaze patterns during face perception have been shown to relate to psychiatric symptoms. Standard analysis of gaze behavior involves calculating fixations within arbitrarily predetermined areas of interest. In contrast to this approach, we present an objective, data-driven method for the analysis of gaze patterns and their relation to diagnostic test scores. This method was applied to data acquired in an adult sample (N = 111) of psychiatry outpatients while they freely looked at images of human faces. Dimensional symptom scores of autism, attention deficit, and depression were collected. A linear regression model based on Principal Component Analysis coefficients computed for each participant was used to model symptom scores. We found that specific components of gaze patterns predicted autistic traits as well as depression symptoms. Gaze patterns shifted away from the eyes with increasing autistic traits, a well-known effect. Additionally, the model revealed a lateralization component: the left visual field bias diminished as autistic traits and depression symptoms increased, independently of each other. Taken together, our model provides a data-driven alternative for gaze data analysis, which can be applied to dimensionally rather than categorically defined clinical subgroups within a variety of contexts. Methodological and clinical contributions of this approach are discussed.


Introduction
The tracking of eye gaze in response to social stimuli has been studied experimentally for more than half a century (Yarbus, 1967). From these pioneering studies, several key findings emerged, including prioritized attention to faces; modern eye tracking has further identified prioritized attention to internal facial features, such as the eyes, nose and mouth (Groner, Walder, & Groner, 1984; Henderson, Williams, & Falk, 2005; Johnson, 2019). With the increasing accessibility and convenience of eye tracking, research on face perception has extended into psychiatric populations, with a focus on patient groups with difficulties in the social domain, including autism (Pelphrey et al., 2002), schizophrenia (Phillips & David, 1997) and social phobia (Horley, Williams, Gonsalvez, & Gordon, 2004), as well as patients diagnosed with mood disorders such as depression (Duque & Vázquez, 2015) and attention deficit disorders including ADD/ADHD (Muszkat et al., 2015). As recently highlighted by Falck-Ytter and colleagues, "eye tracking has a huge translational potential [including] risk assessment [and] monitoring treatment outcome in clinical trials" (Falck-Ytter et al., 2020).
Among these conditions, autism spectrum disorder (ASD) has been examined in particular, due to its association with deficits in social communication and its characteristically atypical eye contact (American Psychiatric Association, 2013). In terms of gaze behavior, some studies have provided evidence for increased attention to external rather than internal facial features in ASD (for example, forehead, hair, chin) (Chawarska & Shic, 2009; Åsberg Johnels, Gillberg, Falck-Ytter, & Miniscalco, 2014), while others have also shown differences in the way individuals with ASD scan the internal facial features themselves, such as a tendency to prioritize the mouth over the eyes (Klin, Jones, Schultz, Volkmar, & Cohen, 2002). In the current study, we chose to contrast the autism face-perception profile with two additional clinical traits, depression and ADHD, for three reasons. First, though the literature on face scanning in depression is considerably smaller than that in ASD, research suggests that individuals with depressive symptomatology tend to focus on the lower (mouth and nose) followed by the middle portion (eyes and cheekbones) of the presented face (Hunter, Roland, & Ferozpuri, 2020). Second, a tendency for atypical face scanning patterns has also been shown in children diagnosed with ADHD (Muszkat et al., 2015). Finally, and importantly, since (clinical or subclinical) signs and symptoms of autism, depression and ADHD often overlap in the same individual (Astle, Holmes, Kievit, & Gathercole, 2021; Coghill & Sonuga-Barke, 2012; Gillberg, 2010), a multivariate approach is needed to unravel the specificity of these associations.
In addition to atypical attention to facial features, another gaze pattern atypicality observed in autistic individuals is a reduction of the left visual field bias (LVF bias) (Dundas, Best, Minshew, & Strauss, 2012; Guillon, Hadjikhani, Baduel, & Rogé, 2014). In typically developing individuals, the LVF bias reflects the more extensive exploration of the left side of the face (from the viewer's perspective) than the right side. The LVF bias has been found to be mediated by right hemispheric dominance in face processing, which involves face-selective brain regions, most notably the fusiform face area (FFA) (Kanwisher, McDermott, & Chun, 1997; Luh, Rueckert, & Levy, 1991; Rhodes, 1985; Sergent & Bindra, 1981). Correlational evaluations further show that a stronger LVF bias reflects the magnitude of asymmetry of face-selective activation, with stronger activation in the right FFA (Yovel, Tambini, & Brandman, 2008). While people with ASD traits might lack the LVF bias, evidence suggests this might not be the case for ADHD traits, since individuals with ADHD have been shown to display a typical leftward bias when presented with facial stimuli in at least one study (Klimkeit, Mattingley, Sheppard, Lee, & Bradshaw, 2003). As far as we are aware, the association between depression and the LVF bias has not previously been examined during spontaneous face gazing, but hypotheses of lateralization alterations in depression have been proposed for several decades (Jaeger, Borod, & Peselow, 1987).
Unfortunately, results from eye tracking studies investigating visual behavior in response to social stimuli have been rather mixed, and findings that initially gained a lot of attention in the clinical literature have at times been difficult to replicate. For instance, in a comprehensive review of published research on face perception in autism, Guillon and colleagues report that findings diverged regarding the presence and direction of alterations in eye-versus-mouth gazing in autism (Guillon et al., 2014).
There might be several underlying causes for this phenomenon: one possible reason for decreased reproducibility is the sole reliance on Areas Of Interest (AOIs) as a basis for gaze data analysis. In this technique, commonly used in eye tracking studies on social processing, the researcher manually (and arbitrarily) defines areas in the stimulus space that are relevant to the research question, in order to identify which among these (e.g., the eye region of a face) attract and hold the attention of the participants. Once AOIs have been determined, gaze is classified as a dichotomous variable: gaze that falls within AOI borders is accounted for, while gaze registrations falling outside are excluded from the gaze metrics. By its very nature, AOI-based analysis allows for a great degree of variability, simply because it is up to each researcher to determine the number, shape, size and exact location of a particular AOI, resulting in differences even between studies that use similar stimuli (Caldara & Miellet, 2011; Hessels, Kemner, van den Boomen, & Hooge, 2016). Furthermore, the parameters of the criteria or filters used to identify the dependent variable also vary from one study to another. This scenario results in large researcher degrees of freedom, which have been suggested to contribute to decreased replicability in cognitive research more generally (Munafò et al., 2017). Various methods have been proposed to make the AOI definition process less arbitrary and more robust to noise, such as the use of large AOIs with the Voronoi tessellation method (Hessels et al., 2016) or automatic Haar cascade filtering, which relies on a strategy used by artists of defining horizontal lines for the eyes, nose and mouth (Duchowski, Gehrer, Schönenberg, & Krejtz, 2019; Hessels et al., 2016).
However, in most cases even those methods rely on input from an individual researcher and therefore fall short of the objectivity they strive for (Rim, Choe, Scrivner, & Berman, 2021).
As alluded to above, another challenge in linking gaze behavior to a particular clinical profile during face perception has to do with the fact that many psychiatric patients do not fall under a singular diagnostic category, though some prior research has attempted (with more or less success) to carve out singular presentations of these disorders. In reality, clinical patients often represent a combination of symptoms with varying degrees of severity. This underscores the importance and prevalence of "comorbidity" or the co-existence of disorders, probably with shared symptoms (Astle et al., 2021;Casey et al., 2013;Gillberg, 2010) with some disorders co-occurring at higher rates than others (Antshel, Zhang-James, & Faraone, 2013;Gillberg & Coleman, 2000;Joshi et al., 2017;Pastor & Reuben, 2008). In addition, it is more and more widely acknowledged that psychiatric and neurodevelopmental "disorders" actually represent extreme cases on a continuum with no clear-cut boundaries between diagnostic "cases" and "normality" (Gillberg, 2010;Plomin, Haworth, & Davis, 2009).
Given this level of variability, eye-tracking studies comparing a singular group of patients with a singular group of non-clinical controls assume not only that the clinical group is relatively uniform, but also that the two groups only differ on the variable that constitutes the basis of the group's assignment. This type of assumption often ignores the potential presence of other factors that can be influential above and beyond those that were chosen to define the groups. Against this background, recent work (Dalgleish, Black, Johnston, & Bevan, 2020;Newby, McKinnon, Kuyken, Gilbody, & Dalgleish, 2015;Reininghaus et al., 2019;Titov et al., 2011) proposes a transdiagnostic approach which steps away from relying on diagnosis based on discrete categories, toward characterizing disorders in terms of overlapping dimensions. One particularly promising avenue for this approach is the dimensional systems neuroscience perspective developed by the NIMH Research Domains Criteria (RDoC) (Cuthbert & Insel, 2013;Insel et al., 2010) which combines the dimensional measures of clinical traits with objective biomarkers. The resulting RDoC is a research framework for investigating mental disorders that is used, not as a diagnostic tool, but as a way to conceptualize mental disorders in terms of varying degrees of dysfunction in general psychological and biological system dimensions. The current work was influenced by such recent developments, as we inclusively recruited a sample of adult psychiatric patients with a range of different clinical profiles.
In order to address several of the above-mentioned issues, here we propose a data analysis method for eye-gaze data during social processing, with minimal assumptions about both gaze data and psychiatric diagnosis. Using Principal Component Analysis (PCA), we automatically identify components in the gaze data that correspond to areas of the stimuli showing relevant gaze patterns for the recorded data. Even though the components are defined in a purely data-driven fashion, without input on the type of stimuli or the participant, the resulting components highlight relevant internal and external features as well as lateralization (preference for the left or right side) of the face stimuli. The identified components form the basis for linear regression models in which clinical scores related not only to autism (Autism Quotient (Baron-Cohen, Wheelwright, Skinner, Martin, & Clubley, 2001)), but also to depression (Beck's Depression Inventory II (Beck, Steer, & Brown, 1996)) and attention deficit and hyperactivity disorder (ADHD; Wender Utah Rating Scale-25 (Ward, Wender, & Reimherr, 1993)) are the dependent variables. A feature selection step allows us to identify which PCA components are the best predictors for each clinical symptom score. The aim of the present study is thus to determine the usefulness of a data-driven gaze analysis method in a heterogeneous sample of psychiatric patients.

Materials and methods

Participant information
In the following, we report how we determined our sample size, all data exclusions, all inclusion/exclusion criteria, whether the inclusion/exclusion criteria were established prior to data analysis, all manipulations, and all measures in the study. All procedures were approved by the Regional Ethics Review Board in Gothenburg (Dnr 047-14). Participants were recruited from an adult psychiatric outpatient clinic in Stockholm, Sweden. All 18–25-year-old participants, patients at that clinic, who had been referred either by other professionals or through self-referral (regardless of reason for referral), and who had been clinically assessed by an adult psychiatrist and a psychologist as part of routine care (diagnostic process and treatment planning), were invited to participate in a further in-depth clinical study focused on diagnostic considerations and comorbidity. It was made clear to the patients that their participation was completely voluntary and would not influence their access to care and/or treatment. Written consent was obtained from all participants. In all, 128 patients participated in the eye tracking part of the neuropsychological evaluation. Of the 128, 17 patients were excluded because of poor data quality, following the procedure explained in the Data pre-processing section. Thus, the final sample included in the present analyses consisted of 111 individuals (38 male, 73 female) with a mean age of 21.1 (SD = 2.4) years. See Table 1 for demographic and clinical characteristics of the participant sample.
The study utilized consecutive sampling of individuals with mental health/neurodevelopmental concerns, rather than a convenience sampling of a group of individuals with or without a specific diagnosis. All participants were thoroughly assessed for psychiatric diagnosis, and the final sample consisted of a heterogeneous group in terms of DSM/ICD diagnoses, often with complex comorbidities.
In the final sample, 21 met criteria for ASD, and 58 for ADHD, several of whom met criteria for both (ASD and ADHD) and also for other psychiatric disorders, including a depressive disorder (a total of 80 patients met criteria for this diagnosis). The detailed findings from the clinical evaluation will be presented in a separate paper (Eberhard et al., in prep). In the current paper, we took a dimensional approach and focused our selection on clinical dimensional traits (autistic, depressive and ADHD traits) irrespective of the formal diagnoses of the individuals (Astle et al., 2021; Gillberg, 2010).

Standard clinical assessments
In the current study, we capitalize on three dimensional, psychometrically sound measures of clinical symptoms of established or potential relevance for face perception: autistic, depressive and ADHD traits. Autistic traits were evaluated in all participants using the Autism Spectrum Quotient (AQ), a self-report questionnaire (Baron-Cohen et al., 2001) consisting of 50 items assessing five domains known to be affected in ASD, namely, social skills, communication skills, imagination, attention to detail and attention switching/tolerance of change. AQ scores are continuously distributed in the general population and range between 0 and 50 points, with internal consistency (α between .63 and .84) (Baron-Cohen et al., 2001; Broadbent, Galic, & Stokes, 2013). All participants also completed the Beck's Depression Inventory (BDI-II) (Beck, Steer, & Carbin, 1988; Beck & Steer, 1987), a 21-item self-report forced-choice scale with 4-point items (0–3), used to assess the extent of depressive features and changes in depressive states. Sub-components include assessments of symptoms of guilt, pessimism, self-esteem, social withdrawal and suicidal thoughts. Scores on the BDI range from 0 to 63, with 0–13 considered minimal, 14–19 mild, 20–28 moderate and 29–63 severe. The BDI also has a good internal consistency estimate (α = .9) (Storch, Roberti, & Roth, 2004).
Finally, all participants completed the Wender Utah Rating Scale (WURS-25), a self-screening instrument for the assessment of ADHD. It is a 25-item 5-point scale (from 0 = 'not at all' to 4 = 'very much', giving a possible score of 0–100, with scores >44 indicating a probability of ADHD) for adult patients recalling their childhood behavior, with high internal consistency (α = .95) (Rossini & O'Connor, 1995). In one study, a cutoff score of 36 or higher adequately identified 96% of the patients with ADHD compared to controls (Ward et al., 1993).

Apparatus
Gaze measures were collected using a Tobii X2-30 (Tobii Technology Inc., Stockholm, Sweden), which records near-infrared reflections of both eyes at 30 Hz as the subject watches an integrated 17-in (33.7 × 27 cm) monitor at approximately 60 cm distance. A 9-point calibration procedure was used before the experiment, in which a small ball is shown at nine locations on the screen, repeating until data were collected for all points to ensure proper calibration. The iMotions software (iMotions A/S, Copenhagen, Denmark) was used for recording eye gaze.

Stimuli
The stimuli were 24 black-and-white photographs depicting angry, happy and neutral faces (8 photographs each), created from the NimStim set of facial expressions (Tottenham et al., 2009). Half of the images depicted female and half male faces. Images were shown in random order, with each stimulus shown once. The photographs were presented in a 1920 × 1080-pixel array, of which the faces (from the top of the head to the chin) occupied approximately 400 × 580 pixels (6.72 horizontal × 13.78 vertical visual degrees), presented against a light gray circular background, embedded in a dark gray square against a black background. According to the NimStim dataset requirements, only pictures of model #1 are shown in the present article.

Procedure
Eye tracking acquisition was conducted in a silent testing room without natural light, where participants sat in front of a computer monitor. Following the standard 9-point calibration, participants were instructed to look freely at the images shown on the screen. Each of the 24 stimulus images was presented once in fixed random order for 6 s, followed by a fixation cross presented for 15 s. The reason for the long intermediate period was that event-related galvanic skin responses were also collected; these data are not analyzed here. Data files for the eye tracking were exported from iMotions and analyzed with scripts developed for the purpose of this project using the R statistical software, as described in the remainder of the Methods section. The code used in our analysis is available on GitHub at https://github.com/pmasulli/EyeTrackingPCA.

Data pre-processing
To check the quality of the recorded eye tracking data and exclude participants with invalid data, we defined a measure based on the gaze recorded during the inter-stimulus fixation cross. For each participant, we considered the location of the gaze fixations (determined using the I-VT filter) on the fixation cross stimuli and excluded the participants who had a standard deviation of the fixation location above a threshold of 3° of visual angle. The criterion and threshold level were determined heuristically, with the purpose of excluding participants whose eye tracking data were characterized by low accuracy or precision. This pre-processing led to the exclusion of 17 participants (13.3%). All the collected eye-tracking data from the remaining 111 participants were included in the analysis.
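The exclusion rule above can be sketched in a few lines. The original analysis was implemented in R (see the linked repository); the Python sketch below is purely illustrative, and the pixels-per-centimeter scale factor and the pooling of horizontal and vertical dispersion into one standard deviation are our assumptions, not details from the paper.

```python
import numpy as np

def fixation_dispersion_deg(fix_x, fix_y, px_per_cm, viewing_cm=60.0):
    """Dispersion of fixation locations during the fixation cross, in visual degrees.

    px_per_cm converts screen pixels to centimeters (assumed parameter);
    viewing_cm is the approximate 60 cm viewing distance from the paper.
    """
    # pool horizontal and vertical variance into one 2-D dispersion value
    sd_px = np.sqrt(np.var(fix_x) + np.var(fix_y))
    sd_cm = sd_px / px_per_cm
    return np.degrees(np.arctan2(sd_cm, viewing_cm))

def keep_participant(fix_x, fix_y, px_per_cm, threshold_deg=3.0):
    # participants whose dispersion exceeds the threshold are excluded
    return fixation_dispersion_deg(fix_x, fix_y, px_per_cm) <= threshold_deg
```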

Heatmaps via kernel convolution
Heatmaps were generated for each participant according to the following procedure. The stimulus images, taken from the NimStim database, show faces photographed frontally, normalized in size, and with eyes at corresponding locations. Therefore, all gaze points for each participant were pooled across all stimuli when generating heatmaps, in order to have a sufficient amount of gaze data for the analysis. Each gaze point is represented as a pair of (x, y) coordinates expressed in pixels on the rectangular surface of the screen, which had a resolution of 1920 × 1080. A participant's gaze points were smoothed via convolution with a Gaussian kernel with a standard deviation of .18° of visual angle. This allowed us to represent a participant's gaze as a matrix of size 1920 × 1080, where a larger value at a given coordinate corresponds to a larger amount of gaze occurring near that point. For data resolution and computational feasibility, the resulting heatmaps were cropped to the region of the stimulus face. Gaze points occurring just at the boundary rectangle of the stimulus face can encode relevant information, which is spread to a slightly larger area by the kernel convolution. For this reason, and in order to avoid potential boundary effects caused by sharply cropping the heatmap data at the boundary of the stimulus face, we considered a slightly larger area around the stimuli, measuring 601 × 676 pixels. The resulting matrix was then represented as a vector of dimension 406,276 (601 × 676). This vector, the participant's heatmap, constituted the representation of a participant for the remainder of the data analysis. An example of the obtained heatmaps is displayed in Fig. 1.
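The heatmap construction can be illustrated as follows (a minimal Python sketch, not the project's R code; for clarity it evaluates the Gaussian at every pixel for every gaze point rather than binning and convolving, which would be the efficient equivalent at full 1920 × 1080 resolution).

```python
import numpy as np

def gaze_heatmap(points, width, height, sigma_px):
    """Smoothed gaze density: each (x, y) gaze point contributes a 2-D Gaussian.

    Equivalent to accumulating gaze points into a pixel grid and convolving
    with a Gaussian kernel of standard deviation sigma_px (here in pixels;
    the paper specifies the kernel width in visual degrees).
    """
    ys, xs = np.mgrid[0:height, 0:width]
    heat = np.zeros((height, width))
    for px, py in points:
        heat += np.exp(-((xs - px) ** 2 + (ys - py) ** 2) / (2.0 * sigma_px ** 2))
    return heat.ravel()  # flattened, as in the participant heatmap vector
```

In practice one would crop this grid to the (slightly enlarged) face region before flattening, as described above.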

Dimensionality reduction and feature selection
Principal Component Analysis (PCA) (Jolliffe, 2002) is a standard statistical technique for dimensionality reduction, based on matrix Singular Value Decomposition. PCA was applied to the heatmap data of all participants, which were first centered by removing the mean along each dimension. This produced 111 principal components, or factors, named PC1 to PC111, sorted by decreasing amount of expressed variance. In keeping with widely applied practices (Jolliffe, 2002), we limited the number of components to those that collectively express 90% of the variance in the initial set, which amounted to 18 Principal Component heatmaps (PC1–PC18). The gaze data can then be reduced in dimension by projecting them onto the space spanned by these 18 components, so that each participant's gaze data are represented as an 18-dimensional vector of coefficients corresponding to the principal components (the participant's representation in the PCA space).
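The reduction step can be sketched as follows (an illustrative Python version via SVD, with participants' heatmaps stacked as rows of a matrix; the original analysis used R):

```python
import numpy as np

def pca_project(heatmaps, var_threshold=0.90):
    """PCA of participant heatmaps, keeping the leading components that
    collectively explain var_threshold of the total variance."""
    X = heatmaps - heatmaps.mean(axis=0)           # center each dimension
    U, S, Vt = np.linalg.svd(X, full_matrices=False)
    var = S ** 2 / np.sum(S ** 2)                  # explained-variance ratios
    k = int(np.searchsorted(np.cumsum(var), var_threshold)) + 1
    coeffs = X @ Vt[:k].T                          # participants in PCA space
    return coeffs, Vt[:k], var[:k]
```

Each row of `coeffs` is one participant's coefficient vector; each row of the returned component matrix can be reshaped back into a heatmap for visualization, as in Fig. 3.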
For each of the three clinical scores (for autism, depression, and attention deficit), a procedure of step-wise feature selection with 2-layer Cross Validation (CV) was applied in order to determine the subset of features most predictive of the clinical score. The inner CV layer was a 10-fold cross-validation used to choose a subset of features. The outer layer was a Leave-One-Out (LOO) CV used to test which model gives the best prediction. The goal of the procedure is to determine the combination of components that best predicts the clinical score, via an iterative process in which each component is added to the model and the change in the error is considered. In order to avoid over-fitting, the procedure has a two-layered cross-validation step, which splits the dataset into a number of folds and uses them alternately as training and test sets. The models are evaluated using the normalized Root Mean Square Error (RMSE) metric, which is calculated on the test folds during the CV procedure and normalized by dividing by the range of the model's dependent variable. The procedure is illustrated in Fig. 2, and a detailed description of the algorithm is included in Appendix A.
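A simplified version of the selection loop might look like the sketch below, assuming ordinary least squares as the regression model. For brevity it uses a single k-fold layer and a plain stopping rule; the actual procedure (Appendix A) nests a 10-fold inner layer inside a leave-one-out outer layer, and was implemented in R.

```python
import numpy as np

def cv_rmse(X, y, folds=10, seed=0):
    """Cross-validated RMSE of a least-squares fit, normalized by the range of y."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    errs = []
    for f in range(folds):
        test = idx[f::folds]                       # every folds-th sample
        train = np.setdiff1d(idx, test)
        A = np.column_stack([np.ones(len(train)), X[train]])
        beta, *_ = np.linalg.lstsq(A, y[train], rcond=None)
        pred = np.column_stack([np.ones(len(test)), X[test]]) @ beta
        errs.append(np.mean((pred - y[test]) ** 2))
    return np.sqrt(np.mean(errs)) / (y.max() - y.min())

def forward_select(X, y, max_feats=5):
    """Greedy forward selection: add the component that most reduces CV error."""
    chosen, best = [], np.inf
    while len(chosen) < max_feats:
        scores = {j: cv_rmse(X[:, chosen + [j]], y)
                  for j in range(X.shape[1]) if j not in chosen}
        j, s = min(scores.items(), key=lambda kv: kv[1])
        if s >= best:          # stop when no candidate improves the error
            break
        chosen.append(j)
        best = s
    return chosen, best
```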
The face stimuli used in this study presented neutral and emotional faces. This was motivated by the wish to provide variation in the presented stimuli, which would serve to build a more robust model of how participants gaze at faces. Even so, we did consider whether the different emotions, considered individually, could lead to different results. The results of this analysis do not indicate any clear evidence for an influence of emotion type on the obtained regression models, with one exception: the model for the autism score computed on the neutral face stimuli alone. Since in all other cases we did not observe an effect of emotion on the obtained models, we pooled gaze data from all stimuli in the main part of the results. Further details on the models obtained for the different emotions can be found in the Supplementary Materials.

Data availability
The collected dataset (Masulli et al., 2021) is archived, in anonymized format, in the Harvard Dataverse repository at https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/ZHWLUO. Legal copyright restrictions prevent public archiving of the various assessment tests and instruments described in this manuscript, which can be obtained from the copyright holders via the cited references.

Results

Principal components
Of the calculated Principal Components, the first eight encoded 75.7% of the total variance and are represented in Fig. 3. We observed that the Principal Component heatmaps identify relevant regions of the face stimuli. In particular, several of the components expressing most of the total variance are clearly related to the AOIs traditionally used in eye tracking data analysis, as one can see from their representation (Fig. 3). Specifically, the first principal component (PC1), which accounted for 17.1% of the total variance, corresponded to the distinction between the eye region and the lower part of the face stimulus, including the mouth, with more gaze toward the eye region. This means that the largest direction of variability across participants lies along this axis. The second highest variance (16.4%) is accounted for by PC2, which corresponds to the left versus the right side of the face. PC3 (13.4% of the total variance) separates gaze in the eye region from other non-central areas of the face, such as the lower left side and the forehead. PC4 (9.0% of the total variance) accounts for a distinction between the eye and mouth regions and the center of the face, such as the nose and forehead. PC5 (7.2% of the total variance) separates a peak at the nose and mouth from the eyes. Further components (PC6–PC8 in Fig. 3) (respectively, 5.3%, 4.0%, and 3.4% of the total variance) encode smaller regions centered at or around the central face features, and in some cases the gaze patterns they model are less immediately interpretable. The first eight principal components are visually described in Table 2.

Modeling of the clinical scores with forward selection
Based on a cross-validated feature selection procedure, we obtained a linear regression model for each clinical score. Table 3 shows which Principal Components were chosen as predictors in each case.
Gender was added as a covariate when running the feature selection procedure, in order to investigate its possible effects. For all three clinical scores, gender was not selected in any of the models computed for the folds of the cross-validation. This indicates that gender does not have an effect on our models, and we therefore did not consider it further in our analysis.
For the autism score (AQ), the selected components are PC2 (left-right), PC3 (eyes vs lower face features), and PC7 (eyes and mouth vs areas around the nose and the left eye). Furthermore, the coefficients obtained in the linear regression model indicate that more gaze on the left side of the face and on the central face features is associated with lower autism scores (i.e., fewer autistic traits).
For the depression score (BDI), the selected components are PC2 (left-right) and PC5 (eyes vs nose and mouth), with lower scores associated with more gaze toward the eye region and the left side of the face.
The procedure for the attention deficit score (WURS) chose only component PC17, which encodes just .9% of the total variance. This indicates that the gaze patterns exhibited by the study participants are not a good predictor of the variation in the attention deficit score.
When looking at the goodness of fit for the three regression models, we can see that the normalized test RMSE of the models for the autism and depression scores are lower than the one for the attention deficit score (Table 3).

Role of the lateralization (left-right) component (PC2)
Since the lateralization component (PC2) is novel relative to the usual focus on internal face structures, and since it is the only component selected in the feature selection procedure for both the autism and depression scores, we decided to examine its meaning in greater detail. To do so, we analyzed how the participants' scores along PC2 (participant data in the PCA space) correlate with the scores obtained with the clinical assessments. PC2 (left-right component) showed correlations with the autism and depression scores. In particular, significant correlations were found between the autism score and PC2 (Pearson's correlation r(98) = .26, p = .01) and between the depression score and PC2 (Pearson's correlation r(98) = .34, p = .0001). The attention deficit score did not significantly correlate with PC2.
Results of the linear regression analysis with the depression and autism scores entered simultaneously showed a significant model (F(2, 98) = 8.68, p < .001, Adj. R² = .133). The depression score (standardized β = .127, p < .01) and the autism score (standardized β = .134, p < .05) both accounted for unique variance in the regression model predicting PC2 (left-right component). At the same time, the autism and depression scores did not correlate in the dataset (p = .07).

Generating gaze heatmaps from the linear models

Using the final linear regression models obtained with the forward selection procedure, it is possible to calculate a value of the independent variables that is mapped by the model to a given value of the dependent variable. This allowed us to choose values along the range of variation of each clinical score and generate the characteristic gaze pattern heatmaps that the model associates with each value. This procedure was applied for each of the three considered clinical scores. Fig. 4 displays, for each clinical score, the gaze patterns predicted by the models for five values chosen along the range of variation of the score. For the autism score, low values are modeled as most of the gaze being directed at the eye region, with more gaze toward the left side of the face. As the autism score increases, the bias for the left side vanishes and the gaze is more diffused over the face, in particular over non-central regions (forehead, chin). For the depression score, low values are modeled as most gaze toward the eye region, also with a bias for the left side. As the depression score increases, the gaze spreads over the face, with more gaze directed at the mouth region. Finally, for the attention deficit score, low values are modeled with gaze toward both eyes. As the attention deficit score increases, the modeled gaze appears more diffuse but does not exhibit any immediately interpretable patterns.
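The inversion underlying this visualization can be sketched as follows: given the fitted intercept and coefficients, find a coefficient vector that the model maps to the target score, then project it back through the retained components and add the mean heatmap. This is an illustrative sketch, not the authors' code; since a linear model maps many coefficient vectors to the same score, choosing the smallest-norm vector is our assumption.

```python
import numpy as np

def heatmap_for_score(target, intercept, betas, components, mean_heatmap):
    """Heatmap a linear regression model associates with a target clinical score.

    Picks the smallest-norm coefficient vector c satisfying
    intercept + betas . c == target, then maps c back to heatmap space.
    """
    betas = np.asarray(betas, dtype=float)
    c = betas * (target - intercept) / np.dot(betas, betas)
    return np.asarray(mean_heatmap) + c @ np.asarray(components)
```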

Comparison with standard AOI-based method
For comparison purposes, we analyzed correlations of the clinical scores with time-based gaze features using hand-drawn AOIs. As previously discussed, this is a common practice in the literature. The hand-drawn AOIs are shown in Fig. 5. We focused on AOIs that are easily interpreted and have been utilized in previous AOI-based eye-tracking studies (Chita-Tegmark, 2016; Reisinger et al., 2020). Specifically, these are limited to the central features of the face, most notably the eyes, mouth, and nose. Even though they have not been utilized in previous AOI-based studies, we also added the left and right sides of the image, which are mostly encoded by PC2 in our PCA-based method. The metric used in the following comparisons is the average gaze time spent in each AOI. The correlations are reported in Table 4. The results indicate that the autism score correlates with the central regions, the eye regions, and the left side. These AOIs closely match the PCA components found for the autism score with our method. The depression score correlates with the AOIs for the left and right sides of the face, so in this case too there is a close correspondence with the PCA components found for the depression score. Finally, the attention deficit score does not correlate with any of the tested AOIs, which is compatible with the poor performance of the regression model for this score in the PCA-based method.
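For reference, the dwell-time metric used in this comparison reduces to counting gaze samples inside each AOI (an illustrative sketch; the rectangular AOIs and the 30 Hz sampling interval below stand in for the hand-drawn shapes of Fig. 5):

```python
import numpy as np

def aoi_dwell_time(gaze_xy, aois, sample_dt):
    """Time (s) spent in each rectangular AOI, given evenly sampled gaze.

    aois maps a name to an (x0, y0, x1, y1) rectangle. Gaze falling outside
    every AOI is simply discarded -- the dichotomy the PCA approach avoids.
    """
    g = np.asarray(gaze_xy, dtype=float)
    out = {}
    for name, (x0, y0, x1, y1) in aois.items():
        inside = (g[:, 0] >= x0) & (g[:, 0] < x1) & (g[:, 1] >= y0) & (g[:, 1] < y1)
        out[name] = inside.sum() * sample_dt
    return out
```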

Discussion
The present study builds on data from a representative and, thereby, heterogeneous sample of young adult psychiatric outpatients who were assessed on a variety of clinical measures examining autistic traits, depressive features and deficits in attention. The study integrated measures from an eye-tracking task in which the participants were free to gaze at pictures of human faces. One of the critical aspects of the present study is the use of a transdiagnostic approach that moved away from categorical classification of psychopathology and instead focused on a data-driven methodological approach that linked dimensional clinical scores with gaze behavior. An important contribution of the present study reflects the Research Domain Criteria (RDoC), where a dimensional approach to psychiatric diagnoses is coupled with biomarker development in order to build a better understanding of mental health issues and psychological constructs such as autism (social and communicative), depression (mood) and attention deficits (Gottesman & Gould, 2003;Kendler & Neale, 2010;Levy, 2020). Following this approach, we focused on clinical traits rather than categorical diagnoses, i.e., we took a dimensional and transdiagnostic approach.
A second major contribution of this paper is the data-driven methodological approach taken in analyzing the eye-tracking data. Eye tracking provides a precise and objective measure of behavior grounded in neural function and has been well studied in both humans and animals (Holmqvist et al., 2011). Given the unique contributions of the present study, the results can be examined from both the clinical and the methodological perspective. Below, we discuss both points individually, followed by a discussion of the limitations and future directions.

Clinical results
From the clinical point of view, our method allowed us to obtain several notable results. First, in line with a number of past studies, a significant portion of the variability in gaze to the eye region of the presented faces (PC3, PC7) contributed to the modeling of the autism score, meaning that patients with higher impairments and more autistic features looked less at the eyes and the central features of the face. These findings are in line with a rather substantial literature demonstrating such abnormalities in the visual scanning of individuals with autism (Dalton et al., 2005;Galazka et al., 2019;Klin et al., 2002;Pelphrey et al., 2002). Our findings are also compatible with recent results showing a correlation between autism scores (AQ) in healthy adults and increased gaze towards the lower half of the face (Wegner-Clemens, Rennig, & Beauchamp, 2020). Second, the results indicate that patients with higher impairment in the autism domain, as evidenced by higher autism scores, looked less at the left side of the centrally presented face stimuli (Yovel et al., 2008), a finding shown using both the PCA (PC2 component) and the AOI-based analysis. A number of behavioral and eye-tracking studies have shown that typical observation of neutral (Luh et al., 1991) and emotional faces (Voyer, Voyer, & Tramonte, 2012) involves a more extensive exploration of the left side (from the viewer's perspective) than the right side. This tendency, known as the left visual field bias (LVF), has been attributed to right hemispheric dominance and the presence of face-selective brain regions, such as the FFA (Kanwisher et al., 1997;Luh et al., 1991;Rhodes, 1985;Sergent & Bindra, 1981). Correlational evaluations further revealed that a stronger LVF bias is indeed reflective of greater right FFA activation (Yovel et al., 2008).
Similar to the findings in the autism domain, we observed that those with higher depression scores exhibited less gazing at the left side of the face stimuli than those with lower depression scores. The multiple regression analysis shows that the depression score has an effect on the left-right gaze component, even after correcting for the autism score. We are not aware of any prior research showing a link between depressive symptoms and a reduction in the left visual field bias during face gaze. In light of the right-hemisphere mediation of this function, these findings do however align with previous research showing that patients with unipolar depression are less likely to identify facial affect when it is displayed in the left visual field (Jaeger et al., 1987;Mikhailova, Vladimirova, Iznak, Tsusulkovskaya, & Sushko, 1996) and are in general slower in reacting to stimuli presented on that side (Liotti, Sava, Rizzolatti, & Caffarra, 1991). Some research also suggests right lateralization at the frontocentral brain region and a shorter P3 latency in the right hemisphere of participants who scored above the median on the MMPI depression scale (Biondi et al., 1993), while other studies indicate increased overall activity of the right amygdala when viewing neutral human faces in individuals diagnosed with depression (Farahbod, Cook, Korb, Hunter, & Leuchter, 2010). Important questions arise as to why asymmetries such as the left visual field bias are reduced in depression. In particular, it would be important to learn more about the state-versus-trait status of these alterations, and whether proneness to depression can be identified and treated prior to the manifestation of symptoms. The use of eye-tracking techniques, which are increasingly accessible in clinical research and practice, could be instrumental in this regard.
Regarding the measure of ADHD (WURS), the gaze data showed no statistically significant correlations in either the PCA-based analysis or the traditional AOI-based one. This is perhaps not surprising, since case-control research has not revealed consistent differences in face processing in this population compared with healthy controls (for a systematic review, see Romani et al., 2018), though emotional face recognition might be an exception in the processing of facial features by individuals with ADHD (Muszkat et al., 2015).

Methodological results
The main methodological contribution of our study is the use of PCA to extract features from gaze data, combined with linear modelling of the clinical scores. The use of PCA for gaze data was initially proposed by Fookes and Sridharan (2010) under the name eigengaze. In comparison with the existing literature, the method proposed in the present study is novel because the PCA components were used to build a regression model linking them to psychometric scores relevant for our participant sample. The model explained the scores and employed a feature-selection method to select the components most predictive of each score. A cross-validation step was built into the feature-selection algorithm to avoid over-fitting. This allowed us to determine which of the PCA components varied with each clinical symptom score. We applied our analysis to a clinically diverse sample of psychiatric outpatients, with the goal of generating components corresponding to their clinical profiles. Another difference is that we based our PCA generation on raw gaze data, as opposed to fixations. In doing so, we removed subjective decision-making (for example, which fixation filter to use) from the individual researcher and the analysis pipeline. Furthermore, we performed a confirmatory analysis in which we compared our findings with the traditional method of determining AOIs. When comparing the regions that resulted from our data-driven method with those generated by a standard AOI approach, we can see that they are compatible: for instance, our method identifies the eye region as relevant for the autism score, and the eye-region AOI correlates with that score. This compatibility of findings provides a data-driven foundation for the AOI choices that are commonly made in the literature.
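As a concrete sketch of the feature-extraction step, an eigengaze-style PCA over flattened per-participant gaze heatmaps can be written as follows. The shapes, names, and SVD route are our own illustration, not the authors' exact preprocessing:

```python
import numpy as np

def eigengaze(heatmaps, n_components):
    """PCA over per-participant gaze heatmaps ("eigengaze").
    heatmaps: (n_participants, n_pixels) array, each row a flattened
    gaze-density map.  Returns per-participant component coefficients,
    the component maps (eigen-heatmaps), and the mean map, computed via
    SVD of the centered data matrix."""
    mean_map = heatmaps.mean(axis=0)
    X = heatmaps - mean_map
    U, S, Vt = np.linalg.svd(X, full_matrices=False)
    coeffs = U[:, :n_components] * S[:n_components]  # PCA scores
    components = Vt[:n_components]                   # eigen-heatmaps
    return coeffs, components, mean_map

# The per-participant coefficients then serve as candidate predictors
# in the linear model:  score ~ b0 + b1*PC_i + b2*PC_j + ...
rng = np.random.default_rng(1)
maps = rng.random((20, 64))        # 20 participants, 8x8 maps flattened
coeffs, comps, mu = eigengaze(maps, n_components=5)
```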
A critical distinction between PCA and hand-drawn AOIs is that the latter depend largely on a priori assumptions about gaze in response to particular stimuli. Our new methodology is valuable because it can generate "candidate" areas to investigate for any class of visual stimuli, not being limited to faces, and because it can be used where the importance of specific regions is not obvious or thoroughly researched. This also makes it suitable for exploratory analyses of gaze data. Furthermore, unlike the traditional AOI approach, PCA uses the recorded gaze data to define the relevant attentional components of the tested sample before correlating them with clinical measures. Because this analysis is driven by the gaze data, it allows for the discovery of novel AOIs that might otherwise go unnoticed, and that are relevant to one clinical sample but not another. For instance, in the present case, left/right AOIs would not necessarily have been considered.
It should be stressed that, in our view, the methodological and clinical contributions go hand in hand. The current data-driven method pointed our attention to gaze regions that were then linked to established clinical measures. Given our population sample, the analytical path taken in the present study may prove useful in linking gaze behavior to clinically heterogeneous and co-morbid conditions with no clear-cut boundaries between diagnoses or between illness and typicality.

Limitations & future work
It is important to consider the present results in light of some limitations. First, no "healthy" or "typically developing" controls were included in the study, which can be seen as a limitation but is not necessarily so. In keeping with the proposals discussed here (RDoC, transdiagnostic approaches), our approach moves beyond a "standard" case-control design in that our sample consisted of individuals seeking help for mental health and neurodevelopmental issues, whose scores on different clinical traits (in our case, autism, depression and attention deficit) were variable but on average high. In doing this, no more or less arbitrary dichotomizations into subgroups were needed, and the very large co-occurrence of different neurodevelopmental/psychiatric traits could be capitalized on in our analyses. Still, we acknowledge that including a group of individuals who score low on clinical traits in general (i.e., "healthy controls") might have provided additional insights. Another limitation is the lack of visual analysis (e.g., eye movements and foveal fixations) across time. In order to examine saccadic patterns, an eye-tracking system with a higher sampling rate is needed. Previous research using the AOI-based methodology has considered temporal features of gaze behavior in response to visual stimuli, including faces (Groner et al., 1984;Malcolm, Lanyon, Fugard, & Barton, 2008) and across various clinical groups (Drusch, Bastien, & Paris, 2014;Falck-Ytter, Von Hofsten, Gillberg, & Fernell, 2013;Kennedy et al., 2017); however, fundamental aspects of these analyses continue to rely on the individual researcher's decision-making. Much as here, we believe future research efforts should steer toward more objective, more inclusive data-driven analyses. One potential avenue could be generating and comparing individual scan-paths across individuals (as proposed by Jarodzka, Holmqvist, & Nyström, 2010).
Future research effort should undertake this challenge especially with psychiatric populations such as those tested in our present sample.
There might be other relevant stimulus-related factors, such as gaze direction (Bindemann, Burton, & Langton, 2008) and the gender of the viewed face and of the observer (Armann & Bülthoff, 2009), that could influence gaze patterns and were not considered in the present study; future studies with larger groups of participants should consider these factors. It is possible that the emotional expression of the face stimuli also plays a role in gaze behavior (Eisenbarth & Alpers, 2011;Åsberg Johnels et al., 2017), although in the current study we did not find any clear evidence for a specific influence of emotional expression on the types of gaze features captured, except for the autism score computed on the neutral face stimuli. The small amount of data available per emotion made it difficult to determine whether this difference was genuine; the present study may simply have included too few trials per emotional expression for such an effect to surface (should there be one), and replication would be helpful. See the supplementary materials for more details. An additional limitation and potential source of noise in our data is the risk of calibration drift. Our protocol contained only one calibration stimulus, at the beginning of the study. Although calibration was not repeated in the course of the trials, the data analysis excluded any participant whose gaze deviated from the fixation cross shown during the inter-stimulus interval. As explained in the Methods section, data from 17 participants were excluded based on this criterion. Therefore, although no explicit recalibration was performed, the analysis did include a way to account for potential fixation drift. Finally, in terms of clinical variables, the current work focused on autism, depressive and ADHD traits considered as dimensional characteristics.
As mentioned, the inclusion of depressive traits yielded novel and important insights. However, our choice of these three measures is by no means exhaustive. For instance, several other relevant clinical and personality-related individual differences could and should be considered in future (multivariate) work such as the present one, including social anxiety (Günther et al., 2021;Löwenberg et al., 2020) and alexithymia (Cuve et al., 2021). Despite these limitations, the present work has made clear methodological and clinically relevant contributions, while pointing out important avenues for continued study and replication in future research. In fact, the proposed method can be applied to any eye-tracking study based on static stimuli as a pre-processing step, allowing researchers to determine interesting regions in the stimuli based exclusively on the data, thereby highlighting regions that might otherwise remain unnoticed. In future work, we plan to extend the model to include temporal aspects of the recorded gaze data and the ability to encode transitions between components as they relate to the type of stimuli. This would allow us to capture additional differences in gaze behavior among individuals that cannot be identified by traditional AOI-based analyses. We will also examine how such gaze patterns align with an even broader range of clinical and personality-related individual differences.

Conclusion
The proposed analysis method for gaze data is data-driven, rather than researcher-driven, and as such is suited to capturing the variability of eye-tracking data without a priori assumptions about their content. Our findings for the studied participant sample show that the components determined in a data-driven way are compatible with what one would obtain with a traditional analysis based on manually drawn AOIs, confirming the validity of findings based on traditional methods and, at the same time, providing a way to generalize and extend AOI-based methods. Finally, the obtained components shed light on the underlying bases of autistic personality traits and depressive traits in this sample of adult psychiatry patients.

Pre-registration
No part of the study procedures or analyses was preregistered prior to the research being conducted. All authors have seen and approved the final version of the manuscript being submitted, and we confirm that this work is original and has not been published elsewhere, nor is it currently under consideration for publication elsewhere.

Open practices
The study in this article earned an Open Data badge for transparent practices. Data for this study can be found at: https://dataverse.harvard.edu/privateurl.xhtml?token=b9ccb1c4-d5cf-4145-9661-37aef61fb4b0.
(e) Repeat the addition of predictors until the average RMSE stops decreasing.
(f) Store the chosen predictors for the current (outer) fold.
3. For n from 1 to the number of outer folds:
(a) Consider the model with the predictors that were chosen in at least n outer folds.
(b) For each outer fold: train the model on the remaining data and measure RMSE on the chosen fold.
(c) Measure the average RMSE for the model.
4. Choose the n with minimum RMSE. If two models differ by a small difference in RMSE (less than 10% of the minimum RMSE), choose the simplest one (with the fewest predictors). This gives the final model.
The algorithm yields a set of features (Principal Components) and a linear model for each of the three clinical scores.
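The selection procedure listed above can be sketched in code as follows. This is a minimal reimplementation under our own assumptions (plain least-squares fits, equally sized contiguous folds, synthetic data), not the authors' code:

```python
import numpy as np

def rmse(y_true, y_pred):
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def fit_predict(X_tr, y_tr, X_te):
    """Ordinary least squares with intercept."""
    A = np.column_stack([np.ones(len(X_tr)), X_tr])
    beta, *_ = np.linalg.lstsq(A, y_tr, rcond=None)
    return np.column_stack([np.ones(len(X_te)), X_te]) @ beta

def cv_rmse(X, y, folds):
    """Average RMSE over the given list of test-index arrays."""
    idx = np.arange(len(y))
    errs = []
    for te in folds:
        tr = np.setdiff1d(idx, te)
        errs.append(rmse(y[te], fit_predict(X[tr], y[tr], X[te])))
    return float(np.mean(errs))

def forward_select(X, y, n_folds):
    """Greedily add the predictor that most lowers inner-CV RMSE;
    stop as soon as no addition improves it (steps (a)-(e))."""
    folds = np.array_split(np.arange(len(y)), n_folds)
    chosen, best = [], np.inf
    while len(chosen) < X.shape[1]:
        cand = {j: cv_rmse(X[:, chosen + [j]], y, folds)
                for j in range(X.shape[1]) if j not in chosen}
        j, err = min(cand.items(), key=lambda kv: kv[1])
        if err >= best:
            break
        chosen.append(j)
        best = err
    return chosen

def select_model(X, y, n_folds=5):
    """Steps 2-4: vote for predictors across outer folds, then pick the
    vote threshold n minimising outer-fold RMSE, preferring the simplest
    model within 10% of the minimum."""
    idx = np.arange(len(y))
    folds = np.array_split(idx, n_folds)
    votes = np.zeros(X.shape[1], dtype=int)
    for te in folds:                                 # step 2 (outer loop)
        tr = np.setdiff1d(idx, te)
        votes[forward_select(X[tr], y[tr], n_folds)] += 1
    candidates = []
    for n in range(1, n_folds + 1):                  # step 3
        preds = np.flatnonzero(votes >= n)
        if preds.size:
            candidates.append((preds, cv_rmse(X[:, preds], y, folds)))
    best = min(err for _, err in candidates)         # step 4
    simple = [(p, e) for p, e in candidates if e <= 1.1 * best]
    return min(simple, key=lambda pe: len(pe[0]))[0]

# Synthetic check: the score depends only on component 0
rng = np.random.default_rng(2)
X = rng.normal(size=(60, 6))
y = 3.0 * X[:, 0] + 0.1 * rng.normal(size=60)
selected = select_model(X, y)
```

On such data the informative component is chosen in every outer fold, so the final model retains it; the 10% tolerance in step 4 then discards noise predictors that only marginally change the RMSE.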