Introduction

Social anxiety disorder (SAD) affects between 12% and 13% of children aged between 8 and 13 years (American Psychiatric Association, 2013) and is defined as the fear of social or performance situations involving exposure to unfamiliar people or possible scrutiny by others (Furmark, 2002; Schneier, 2006). This disorder is associated with adverse effects on children’s familial (Costello et al., 2005), academic (Essau et al., 2010), and social (Greco & Morris, 2005) quality of life. Furthermore, children with SAD have an increased risk of developing other psychopathological disorders during adolescence or adulthood (Costello et al., 2005; Egger & Angold, 2006). Given that current models conceptualize SAD as being on a continuum starting from timidity and subclinical social anxiety (Judah et al., 2013; Rapee & Heimberg, 1997) and given the existing comorbidities between SAD and other mental disorders (Fehm et al., 2008), the identification of the cognitive factors associated with subclinical levels of social anxiety is of central importance, because it may contribute to the development of appropriate prevention methods and treatments (Pine et al., 2009; Pine & Fox, 2015).

The influence of anxiety on cognition has been widely studied in adults, and the main models posit that increased anxiety levels are associated with an attentional system biased in favor of threat-related stimuli, particularly negative facial expressions (Bar-Haim et al., 2007; Mathews & MacLeod, 2002; Waters et al., 2008; Williams et al., 1999). More specifically, anxious individuals may present a facilitated orienting of attention towards threat-related stimuli, followed by attentional disengagement impairments, which are conceptualized as difficulties in shifting attention away from threats (Cisler & Koster, 2010). Over the past decades, many studies have examined the cognitive processes sustaining attentional biases (AB) for threat in anxiety disorders (Evans et al., 2016). While the initial orientation is thought to be an automatic response, disengagement difficulties instead involve top-down mechanisms such as attentional control, which corresponds to the ability to regulate the allocation of attentional resources (Cisler & Koster, 2010; Vromen et al., 2015). According to the attentional control theory (ACT; Eysenck et al., 2007), attentional control impairments in anxious adults lead to difficulties in disengaging attention from threat, whereas the hyperactivation of the bottom-up attention system leads to easier detection of emotionally salient stimuli (Bishop, 2009). The ACT also supports the processing efficiency hypothesis, which suggests that anxious adults recruit more neurocognitive resources (i.e., processing efficiency) to reach a behavioral performance (i.e., performance effectiveness) similar to that of their nonanxious peers on a given task (Eysenck et al., 2007; Eysenck & Calvo, 1992).

AB for threats are typically measured with the visual dot-probe task (MacLeod et al., 1986; for a review, see Dudeney et al., 2015 and Puliafico & Kendall, 2006) in which pairs of threat-neutral stimuli, either words, photographs of facial expressions, or pictures of natural scenes, are presented on a computer and are then replaced by a visual probe (e.g., a dot or a letter) appearing at the location previously occupied by one of the two stimuli. Participants are asked to locate or identify the targets and AB for threats are traditionally inferred using the difference in reaction times (RT) between invalid trials, where the target replaces the neutral stimulus, and valid trials, where the target replaces the threatening stimulus. Faster RT in valid trials compared with invalid trials are thought to reflect an attentional vigilance for threat (Britton et al., 2012).
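As a minimal illustration of the traditional score described above, the sketch below subtracts the mean RT of valid trials from the mean RT of invalid trials for one participant. The function name and the RT values are hypothetical and only serve to show the arithmetic.

```python
from statistics import mean

def classic_bias_score(valid_rts, invalid_rts):
    """Traditional dot-probe bias: mean invalid RT minus mean valid RT (ms)."""
    return mean(invalid_rts) - mean(valid_rts)

# Hypothetical reaction times (ms) for one participant.
valid_rts = [480, 510, 495, 505]     # probe replaced the threatening stimulus
invalid_rts = [520, 540, 515, 525]   # probe replaced the neutral stimulus

bias = classic_bias_score(valid_rts, invalid_rts)
print(f"Bias score: {bias:.1f} ms")  # a positive value suggests vigilance for threat
```

Because this score collapses all trials into two means, it only indicates the general direction of the bias.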

Although widely used, this paradigm has yielded inconsistent results over the past years in pediatric populations. For example, while Roy et al. (2008), Waters et al. (2010), and Vasey et al. (1995) demonstrated a greater AB for angry faces in socially anxious children, Waters et al. (2004) found comparably increased attention for fearful faces in both anxious and nonanxious children. Methodological discrepancies between studies (e.g., various emotions used as threatening, various presentation times) and variations in the clinical characteristics of the samples (e.g., type and severity of anxiety disorder) could explain those mixed results. However, some authors also pointed out the poor psychometric properties of the dot-probe task, such as poor internal consistency and test-retest reliability (Cristea et al., 2015; Schmukle, 2005). Furthermore, the calculation method of the dot-probe task has been questioned, because it characterizes attention patterns between experimental conditions only by indicating the direction in which attention is generally biased (Evans & Britton, 2018). Moreover, recent simulation studies have suggested that this kind of measure may reflect intra-individual variability in reaction times rather than patterns of threat-related attention specifically (Evans & Britton, 2018; Kruijt et al., 2016).

The first goal of this research was, therefore, to further investigate AB towards social threats in pediatric social anxiety by using the new calculation method proposed by Evans and Britton (2018). This method appears to have better psychometric properties, because it captures intra-individual variability by allowing the separate investigation of the orientation and disengagement of attention through the comparison of individual trial RTs to a mean reference RT. Based on previous literature reporting the presence of AB in socially anxious children (Roy et al., 2008; Vasey et al., 1995; Waters et al., 2010), we predicted that, compared with a control group, children with high levels of social anxiety would be characterized by the joint presence of a vigilant orientation towards and a slow disengagement from threat-related stimuli.

The second goal of this research was to take advantage of the event-related potential (ERP) technique which, due to its high temporal resolution, allows the precise study of the temporal sequence of cerebral activations directly related to stages of threat processing. This noninvasive technique is well-suited to experiments with children (D’Hondt et al., 2017), but only a few studies have been performed in children with SAD (Bechor et al., 2019; Pollak & Tolley-Schell, 2003) or children with high levels of social anxiety (Thai et al., 2016). Those few studies demonstrated that AB for threat are potentially associated with specific electrophysiological changes in the ERP components (depicted by significant variations in amplitudes), occurring at different stages of information processing. At early stages, Bechor et al. (2019) demonstrated larger P1 amplitudes for threatening stimuli than for neutral faces in children and adolescents (ages 8–16 years) suffering from various types of anxiety disorders. They also demonstrated increased N170 amplitudes for all emotional facial expressions in anxious children compared with the control group. Because the P1 reflects cognitive resources allocated to the early visual processing of a stimulus and the N170 is related to the processing of facial structures (Balconi & Lucchiari, 2005; Eimer, 2000), Bechor et al. (2019) interpreted their results as reflecting an increased orientation of children’s attention towards threatening faces. At later stages of face processing, Bechor et al. (2019) found lower P2 and P3b amplitudes in response to all face displays in the anxious group. Given that the P2 is known to reflect the elaborated and sustained perceptual processing of the emotion at stake and that the P3b is associated with the voluntary shift in attention towards target stimuli and response inhibition (Wauthia & Rossignol, 2016), Bechor et al. (2019) hypothesized poorer attentional control abilities in anxious children. Using a similar paradigm, Thai et al. (2016) had previously shown no group differences on the P1, N1, P2, and P3 components when comparing the processing of angry and neutral faces in children with high levels of behavioral inhibition (at increased risk of developing SAD at a later age) and a control group. Nevertheless, these authors found an interaction between children’s social anxiety level and the P2 amplitude, such that increased P2 amplitudes were associated with lower social anxiety symptoms in behaviorally inhibited children. This observation led Thai et al. (2016) to conclude that the P2 amplitude could reflect compensatory recruitment of neural resources used by behaviorally inhibited children to moderate the causal link between the preferential processing of threat and the subsequent social anxiety symptoms (Rapee & Heimberg, 1997; Thai et al., 2016). Here again, based on the existing literature and on the assumption that the results between subclinical and clinical populations are consistent (Judah et al., 2013), we postulated early preferential processing of threatening faces, which would appear as larger P1 and N170 amplitudes for threatening (disgusted) faces in children with high levels of social anxiety compared with a control group. Furthermore, and from a more speculative viewpoint, we postulated that difficulties of attentional control in children with high levels of social anxiety would be associated with lower P2 and P3 amplitudes in this population compared with the healthy control group for both faces and targets in all emotional conditions. However, when comparing threatening and positive conditions, we expected increased P2 and P3 amplitudes for threat, particularly in high socially anxious children because of their hypothesized AB for threat.

It should be noted that the investigation of ERP components in children raises two types of difficulties. First, the still-developing brain structures of children may lead to particularities in the topography or timing of components compared with ERPs traditionally measured in adults (Ciesielski et al., 2004; Jonkman et al., 2003; Okazaki et al., 2004). Second, EEG data in children are known to include more noise and artifacts than adult recordings (Kujawa et al., 2013). There is thus a need for more refined analysis methods to obtain more consistent ERP data (Kujawa et al., 2013) and to address our hypotheses. Therefore, this study used temporal-spatial principal component analyses (PCA), which are useful and reliable for separating sources of variability in components and for differentiating latent components from unsystematic sources of noise in adults (Foti et al., 2009; Weinberg & Hajcak, 2011) and children (Arbel & Donchin, 2011; Weinberg & Hajcak, 2011).

Methods

Participants

Two hundred children, aged 8 to 12 years, enrolled in the elementary schools of the region of Mons (Belgium), were invited to participate in this study. We selected this age range because more than 50% of socially anxious children develop their symptoms before age 13 years (Chavira et al., 2002) and because social anxiety symptoms can already be seen in children as young as 8 years (Van Roy et al., 2009). To participate, children were asked to fill in the Social Phobia and Anxiety Inventory for Children (SPAI-C; Beidel et al., 1995), and the State-Trait Anxiety Inventory for Children (STAI-C; Spielberger et al., 1973). Parents had to give their informed consent.

The Social Phobia and Anxiety Inventory for Children (SPAI-C; Beidel et al., 1995) is a 26-item scale used to assess the severity and the range of social fears, experienced by children as young as age 8 years. Each item describes a social situation and the respondent had to endorse how often he or she feels nervous or scared in this situation on a Likert scale ranging from 0 (“Never”) to 2 (“Always”). Total scores range from 0 to 52, with higher scores indicating higher levels of social anxiety. The SPAI-C has excellent validity (Beidel et al., 1995) and test-retest reliability (scores > 0.72) across 2-week and 10-month intervals.

The State-Trait Anxiety Inventory for Children (STAI-C; Spielberger et al., 1973) is a self-report inventory used to assess anxiety in children that consists of 2 scales of 20 items each: a State scale that measures transient anxiety reactions to particular situations and a Trait scale that measures a stable predisposition to react anxiously to any situation. Each item is answered on a 3-point scale. The total scores on each scale range from 20 to 60. The validated French version of the STAI-C showed good internal consistency, i.e., 0.77 for the state scale and 0.82 for the trait scale (Turgeon & Chartrand, 2003).

Of the 200 solicited families, 60 parents allowed their children to participate in the study. To avoid confounding effects due to potential comorbid disorders, we recruited children with subclinical levels of social anxiety. Consequently, 20 children (M = 10.01 years; SD = 1.01) were assigned to the high socially anxious (HSA) group, because they had a minimum score of 18 on the SPAI-C, as suggested by previous research using this inventory (Beidel et al., 1995) to discriminate between socially anxious children and healthy controls. Twenty-two children matched for age and gender (15 females; M = 10.20 years; SD = 1.30) were assigned to the healthy control (HC) group. The remaining 18 children canceled their participation afterward or were excluded based on our initial inclusion criteria. The final sample characteristics are presented in Table 1. All children were free from learning, neurologic, or other psychiatric disorders as assessed with the Child Behavior Checklist (Achenbach et al., 1983; Fombonne et al., 1988) completed by parents before participation. This study was conducted in accordance with the recommendations of the ethical board of the Faculty of Psychology and Educational Sciences of the University of Mons and with the Declaration of Helsinki. Participants received a €20 gift card for their participation.

Table 1 Sociodemographic and psychological data

Task and materials

The visual dot-probe task was administered using E-Prime 2.0 software (Psychology Software Tools, Pittsburgh, PA) on an Asus X756U PC (screen of 17 inches) in the Cognitive Psychology and Neuropsychology research laboratory of the University of Mons. Participants were placed at 70 cm from the computer screen. Each trial began with a fixation cross appearing at the center of the computer screen for 500 milliseconds. Then, two pictures (4.9° X 8.2°) of the same adult with either a neutral, disgusted, or happy expression were presented on each side of the fixation cross for 500 milliseconds. We chose to use faces expressing disgust, because they convey a negative evaluation of rejection and avoidance and are rated even more negatively than angry faces by anxious individuals (Amir et al., 2005; Buckner et al., 2010; Charash & McKay, 2002; Mao et al., 2020; Yuan et al., 2021). We selected ten different adults (5 males and 5 females) from the Radboud Faces Database (Langner et al., 2010). After face offset, a target (a fish or a cat) appeared at the location previously occupied by one of the two faces and remained on the screen until response. Children had to identify the target as fast as possible by using the response pad (Cedrus RB-740). Children were asked to perform the task using both hands. The order, position, and type of targets were counterbalanced throughout the experiment. Before the task, participants received 16 randomly presented practice trials. The entire task included 512 trials. There were 248 disgust-neutral, 248 happy-neutral trials, and 16 neutral-neutral trials. The order of presentation of the faces was counterbalanced across the trials. In 50% of the trials, targets replaced the emotional face (valid condition), and in the remaining 50%, targets replaced the neutral face (invalid condition). The position of the emotional face on the left or the right side of the screen also was counterbalanced across the trials. 
Finally, the target type was counterbalanced across trials and conditions. A new trial began 500 milliseconds after the target offset or after 2,500 ms when no answer was given. Figure 1 illustrates the presentation sequence of one trial.
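As a rough worked example of the viewing geometry, the sketch below converts the reported visual angles (4.9° × 8.2°) into on-screen sizes for the 70 cm viewing distance, using the standard relation size = 2 · d · tan(θ/2). The helper name is hypothetical, and the resulting sizes are derived here for illustration; they are not values reported in the study.

```python
import math

def visual_angle_to_size_cm(angle_deg, distance_cm):
    """Physical size (cm) subtending angle_deg at a viewing distance of distance_cm."""
    return 2 * distance_cm * math.tan(math.radians(angle_deg) / 2)

VIEWING_DISTANCE_CM = 70  # reported viewing distance

face_width_cm = visual_angle_to_size_cm(4.9, VIEWING_DISTANCE_CM)
face_height_cm = visual_angle_to_size_cm(8.2, VIEWING_DISTANCE_CM)
print(f"Face size: {face_width_cm:.1f} cm x {face_height_cm:.1f} cm")
```

Under these assumptions, each face picture would measure roughly 6 cm by 10 cm on the screen.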

Fig. 1

One trial-sequence of the visual dot-probe task

Behavioral data

Response accuracies (RA) and response times (RTs) were measured during the dot-probe task. RTs corresponded to the time between the presentation of the target and the button press. RTs longer than 2,000 milliseconds or shorter than 150 milliseconds were excluded from further analyses. RTs more than 2.5 SDs above or below the participant’s mean were also discarded to reduce the influence of outliers. Individual mean RTs were computed for each experimental condition on the remaining trials (98%). As a ceiling effect was observed in RA [M = 94.13; SD = 5.35], analyses were only conducted on correct response latencies (Rossignol et al., 2013). No group difference appeared on RA [t(41) = −0.671; p = 0.506].
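The two-step RT cleaning described above can be sketched as follows. This is a minimal illustration: the function name and data are hypothetical, and it assumes that the 2.5 SD criterion is applied after the absolute cutoffs and across all of a participant's remaining trials, which the text does not specify.

```python
from statistics import mean, stdev

def clean_rts(rts, low_ms=150, high_ms=2000, sd_cutoff=2.5):
    """Drop implausible RTs, then RTs beyond sd_cutoff SDs of the remaining mean."""
    kept = [rt for rt in rts if low_ms <= rt <= high_ms]
    if len(kept) < 2:
        return kept  # an SD cannot be computed from fewer than two trials
    m, s = mean(kept), stdev(kept)
    return [rt for rt in kept if abs(rt - m) <= sd_cutoff * s]

# Hypothetical RTs (ms) for one participant: 100 and 2300 fail the absolute
# cutoffs, and 1800 lies more than 2.5 SDs above the remaining mean.
raw_rts = [100, 450, 460, 470, 480, 490, 500, 510, 520, 530, 540, 550, 560, 1800, 2300]
cleaned = clean_rts(raw_rts)
print(f"{len(cleaned)} of {len(raw_rts)} trials retained")
```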

AB calculation

For the dot-probe task, AB indexes were computed for each emotional condition separately through the response-based method proposed by Evans and Britton (2018). This method creates separate measures of vigilance towards and avoidance of the emotion at stake. For each participant and each emotional condition, the RT of each valid trial is indexed against the mean RT of invalid trials as reference (RTInvalidMean − RTValid[Trial 1, Trial 2, …, Trial n]). Positive scores indicate vigilance, while negative scores indicate avoidance. To further decompose attention into orientation and disengagement (Koster et al., 2004), we used an identical approach with the mean RT for neutral-neutral pairs as the reference for valid and invalid trials, respectively (Orientation = RTNeutralMean − RTValid[Trial 1, Trial 2, …, Trial n]; Disengagement = RTNeutralMean − RTInvalid[Trial 1, Trial 2, …, Trial n]). For valid trials, a positive score indicates a vigilant orientation and a negative score indicates an avoidant orientation. For invalid trials, a positive score indicates a faster disengagement and a negative score indicates a slower disengagement. Internal consistency of the task was assessed using split-half reliability coefficients. To assess split-half reliability, trials were randomly split into two halves; then, each half was correlated with the other and corrected for length using the Spearman-Brown formula (Meissel et al., 2022). Using Shrout’s (1998) conventions, analyses showed overall substantial reliability for the response-based indices linked to disengagement, avoidance, and vigilance (split-half coefficients > 0.90), and slight reliability for the orientation index (split-half coefficient = 0.320).
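The trial-level indexing described above can be sketched as follows for one participant and one emotional condition. This is an illustrative sketch, not the authors' implementation: all names and RT values are hypothetical, and the way trial-level scores are summarized (averaging positive and negative bias scores separately, and averaging orientation and disengagement scores) is an assumption. The Spearman-Brown correction uses the standard formula 2r / (1 + r).

```python
from statistics import mean

def trial_level_scores(reference_mean, trial_rts):
    """Index each trial RT against a mean reference RT (reference minus trial)."""
    return [reference_mean - rt for rt in trial_rts]

def mean_or_none(values):
    """Average a list, returning None when no trial qualifies."""
    return mean(values) if values else None

def response_based_indices(valid_rts, invalid_rts, neutral_rts):
    invalid_mean = mean(invalid_rts)   # reference for vigilance/avoidance
    neutral_mean = mean(neutral_rts)   # reference for orientation/disengagement
    bias = trial_level_scores(invalid_mean, valid_rts)
    return {
        # Assumption: positive and negative trial scores are averaged separately.
        "vigilance": mean_or_none([s for s in bias if s > 0]),
        "avoidance": mean_or_none([s for s in bias if s < 0]),
        # Assumption: these two indices are summarized by their mean.
        "orientation": mean(trial_level_scores(neutral_mean, valid_rts)),
        "disengagement": mean(trial_level_scores(neutral_mean, invalid_rts)),
    }

def spearman_brown(r_half):
    """Correct a split-half correlation for test length."""
    return 2 * r_half / (1 + r_half)

# Hypothetical RTs (ms).
valid = [480, 540, 500]
invalid = [510, 530, 520]
neutral = [500, 510, 505, 505]
indices = response_based_indices(valid, invalid, neutral)
print(indices)
print(f"Corrected reliability for r = 0.50: {spearman_brown(0.5):.3f}")
```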

EEG acquisition, preprocessing, and analysis

EEG data were recorded at a sampling rate of 500 Hz (V-Amp, Brain Products, GmbH, Munich, Germany: 0–500 Hz bandwidth, 24-bit A/D conversion) from 64 electrode sites (AF3, AF4, AF7, AF8, AFz, C1, C2, C3, C4, C5, C6, CP1, CP2, CP3, CP4, CP5, CP6, CPz, Cz, F1, F2, F3, F4, F5, F6, F7, F8, FC1, FC2, FC3, FC4, FC5, FC6, FCz, Fp1, Fp2, FT10, FT7, FT8, FT9, Fz, O1, O2, Oz, P1, P2, P4, P3, P5, P6, P7, P8, PO3, PO4, PO7, PO8, POz, Pz, T7, T8, TP10, TP7, TP8, and TP9) with an actiCap acquisition system arranged in a standard 10-20 layout (Brain Products, GmbH, Munich, Germany). The reference electrode was located at FCz throughout the recording as per common practice with Brain Vision Recorder (Brain Products, GmbH, Munich, Germany). The ground electrode was placed on the forehead between Fp1 and Fp2. The P4 electrode was excluded from further analyses because of poor functioning (impedance >10 kΩ) for all participants. Impedances of all other electrodes were kept below 10 kΩ. Offline analysis was performed using Brain Vision Analyzer 2 software (Brain Products, GmbH, Munich, Germany). First, we applied a high-pass IIR filter with a cutoff of 0.1 Hz and a low-pass IIR filter with a cutoff of 30 Hz. Second, we used a semiautomatic independent component analysis (ICA) to correct ocular artifacts, such as blinks and vertical and horizontal eye movements (Makeig et al., 1997). Third, all data were re-referenced to the average of all scalp electrodes. Fourth, we created epochs by segmenting data from 200 milliseconds before to 500 milliseconds after stimulus onset. Similar epochs were used for face and target processing. Following previously published studies using a similar paradigm (Li et al., 2018), epochs were then baseline corrected using the mean voltage calculated from the 200 milliseconds preceding the event. 
Next, algorithmic artifact rejection of voltages exceeding ±100 μV was followed by visual inspection of the segmented data, in which segments with artifacts (e.g., eye blinks, horizontal and vertical eye movements, muscle artifacts) were manually rejected. Stimulus-locked ERPs were averaged separately for face trials (M = 91.27; SD = 2.63; min = 49; max = 116) and target trials (M = 47.19; SD = 1.30; min = 28; max = 58). No group difference appeared on the number of clean trials used for the averaging step [F(1,38) = 2.09; p = 0.156].
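The baseline correction and the ±100 μV rejection step can be sketched as follows. This is a simplified, single-channel illustration in plain Python, not the Brain Vision Analyzer procedure; the function names and voltage values are hypothetical. It assumes 350 samples per epoch (−200 to 500 ms at 500 Hz), of which the first 100 form the baseline.

```python
from statistics import mean

SAMPLING_RATE_HZ = 500   # reported sampling rate
BASELINE_MS = 200        # reported pre-stimulus baseline
N_BASELINE = SAMPLING_RATE_HZ * BASELINE_MS // 1000  # 100 samples

def baseline_correct(epoch, n_baseline=N_BASELINE):
    """Subtract the mean pre-stimulus voltage from every sample of the epoch."""
    baseline = mean(epoch[:n_baseline])
    return [v - baseline for v in epoch]

def is_clean(epoch, threshold_uv=100.0):
    """Keep an epoch only if no sample exceeds +/- threshold_uv microvolts."""
    return all(abs(v) <= threshold_uv for v in epoch)

# Hypothetical epochs (microvolts): 100 baseline samples + 250 post-stimulus samples.
quiet_epoch = [5.0] * 100 + [15.0] * 250
noisy_epoch = [5.0] * 100 + [150.0] * 250

corrected = [baseline_correct(e) for e in (quiet_epoch, noisy_epoch)]
kept = [e for e in corrected if is_clean(e)]
print(f"{len(kept)} of {len(corrected)} epochs kept")
```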

ERP data were submitted to temporal-spatial principal component analyses (PCA) using the ERP PCA Toolkit, version 2.7 (Dien, 2010a) according to the published guidelines (Dien, 2010b; Dien et al., 2007). PCA is used to extract linear combinations of data that distinguish patterns of electrocortical activity across all timepoints and recording sites (Kujawa et al., 2013). Separate analyses were conducted for face and target processing. First, a temporal PCA was performed on the data to capture the variance across timepoints and to maximize the initial separation of ERP components (Dien & Frishkoff, 2005). All timepoints were used as variables, and we considered all subjects, conditions, and recording sites as observations. We used Promax rotation based on previous simulation studies, suggesting that this rotation is the most effective for temporal analyses, because it does not force orthogonality amongst the components (Dien, 2010a; Dien & Frishkoff, 2005). Following this first rotation, we conducted a parallel test (Cattell, 1966), which compared the Scree plot of our dataset to a Scree plot derived from a fully random dataset. The number of factors retained was based on the largest number of factors that accounted for a greater proportion of variance than the fully random dataset (see Dien, 2010a for more information). Based on this criterion, 18 temporal factors were extracted for face processing (total variance explained = 97.1%), and 18 temporal factors were extracted for target processing (total variance explained = 96.7%). Each temporal factor may be considered as a virtual epoch and can be described by both its factor loading (which describes the time course of that factor) and its factor score (which gives that factor’s value for each combination of subject, picture type, and recording site). 
Spatial information is preserved by temporal PCA; scalp topography can be reconstructed for any timepoint, subject, and condition by multiplying the corresponding electrode scores by factor loading and standard deviation (Dien, 1998).

To synthesize the spatial dimensions of the dataset, spatial PCAs were then performed for face and target processing separately. We used an Infomax rotation as suggested by simulation studies showing that Infomax rotations are the most effective for spatial analyses (Falkenstein et al., 1999; Polich, 2003). Recording sites were used as variables, and all subjects and conditions were used as observations. A separate spatial PCA was performed for each temporal factor retained at the previous step, although the resulting Scree plots were averaged across all temporal factors so that the same number of spatial factors was extracted in each case. Five spatial factors were extracted for face processing (total variance explained = 78.5%), resulting in 102 unique factor combinations, and six for target processing (total variance explained = 79.1%), resulting in 108 combinations. As suggested by Dien et al. (2005), the covariance matrix and Kaiser normalization were used for each PCA. To assess the timing and spatial voltage distributions directly, we translated all factors back into voltages (μV) by multiplying factor scores by their corresponding loadings and standard deviations. In this way, both the time course and scalp topography of the electrocortical activity captured by that temporospatial factor combination can be directly assessed (Foti et al., 2009).
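The back-translation into voltages can be illustrated with a small sketch. It assumes, as a simplification, that a single factor score is multiplied at each timepoint by the factor loading and by the standard deviation of that timepoint; the function name and all numbers are hypothetical and do not reproduce the ERP PCA Toolkit routine.

```python
def factor_to_microvolts(factor_score, loadings, time_sds):
    """Rescale one factor score into a voltage time course (microvolts).

    loadings: factor loading at each timepoint (time course of the factor).
    time_sds: standard deviation of the data at each timepoint.
    """
    return [factor_score * loading * sd for loading, sd in zip(loadings, time_sds)]

# Hypothetical values for three timepoints of one temporal factor.
score = 2.0                  # factor score for one subject/condition/site
loadings = [0.1, 0.5, 0.9]   # the factor peaks at the third timepoint
time_sds = [4.0, 5.0, 6.0]

waveform = factor_to_microvolts(score, loadings, time_sds)
print([round(v, 2) for v in waveform])
```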

Only temporal-spatial factors that accounted for at least 0.5% of the variance were retained for further analyses (Dien, 2010a; Dien et al., 2006), representing 38 factors for faces processing and 53 for targets processing. These PCA factors were identified by using the AutoPCA function of the ERP PCA Toolkit, which runs through all factors and chooses the peak channel and the peak-time point for each factor.

Statistical analyses

Analyses of reaction times were completed using the software IBM SPSS Statistics for Windows (Version 21, Armonk, NY, IBM Corp.). The significance level was set at p < 0.05 (two-tailed) throughout the analyses. Between-groups comparisons were performed on demographic (age and gender) and anxiety (social anxiety, trait, and state anxiety) characteristics using: (i) for quantitative variables, Wilcoxon rank-sum tests, or Welch’s t-test for independent samples when the distributions of these variables were significantly different from normal or not, respectively; (ii) Pearson chi-square tests for qualitative variables. The normality of the distribution of quantitative parameters was assessed with Shapiro–Wilk tests.

For AB response-based indexes (vigilance, avoidance, orientation, and disengagement) measured from RTs, General Linear Model (GLM) analyses were conducted with 2-Emotion (Disgust and Happiness) as the within-subject factor X 2-Group (HSA and HC) as the between-subjects factor. We used a Greenhouse-Geisser correction for possible violations of sphericity. Post hoc analyses were used to examine significant interaction and main effects.

Regarding ERP data, to avoid the biasing effects of nonnormality and (co)variance heterogeneity between groups (nonorthogonal groups) and to reduce Type I errors (Dien & Santuzzi, 2004), robust analyses of variance (ANOVA) were conducted on the selected temporal-spatial factors using the ERP PCA Toolkit (Dien, 2010a; Keselman et al., 2003). As mentioned earlier, only temporal-spatial factors that accounted for at least 0.5% of the variance were retained for further analyses (Dien, 2010a; Dien et al., 2006), representing 38 factors for face processing and 53 for target processing. Robust ANOVAs were conducted separately on the 38 temporal-spatial factors associated with face processing and on the 53 temporal-spatial factors associated with target processing. The seed for the random number generation was set at 1,000, and the number of iterations used for bootstrapping was 50,000 (Dien, 2010a; Dien et al., 2006). Given the potential variability in p values using this approach (84), simulations were run 11 times, with median p values reported. For each series of robust ANOVA tests, the threshold of significance was adjusted using the Bonferroni correction to correct for multiple comparisons. Only results in which the median p-value plus 2 standard deviations remained below 0.05 were considered significant after the correction. Robust ANOVA tests are indicated by "TWJT/C", and the interpretation of this statistic and the resulting p-values is identical to that of a conventional ANOVA. Further information about robust ANOVA tests can be found in Dien (2010a).
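The decision rule applied to the repeated bootstrap runs can be sketched as follows. This is an interpretation rather than the toolkit's code: the function names and the p-values are hypothetical, and how the Bonferroni-adjusted threshold combines with the "median plus 2 standard deviations" criterion is not detailed in the text, so the two are shown as separate helpers.

```python
from statistics import median, stdev

def bonferroni_alpha(alpha, n_tests):
    """Bonferroni-adjusted significance threshold for a family of n_tests tests."""
    return alpha / n_tests

def stable_significance(p_values, alpha=0.05):
    """Return (median p, SD of p, whether median + 2 SD stays below alpha)."""
    med = median(p_values)
    spread = stdev(p_values)
    return med, spread, (med + 2 * spread) < alpha

# Hypothetical p-values from 11 runs of the same robust ANOVA.
run_p_values = [0.004, 0.005, 0.005, 0.006, 0.005, 0.004,
                0.006, 0.005, 0.005, 0.004, 0.006]

med, spread, significant = stable_significance(run_p_values)
print(f"median p = {med:.3f}, SD = {spread:.4f}, significant = {significant}")
# Example of an adjusted threshold for a hypothetical family of 5 tests.
print(f"Bonferroni threshold: {bonferroni_alpha(0.05, 5):.3f}")
```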

A 2-Emotion (Disgust and Happiness) X 2-Visual Field (Left and Right) X 2-Group (HSA and HC) analysis was conducted on temporal-spatial factors associated with faces processing. A 2-Emotion (Disgust and Happiness) X 2-Visual Field (Left and Right) X 2-Validity (Valid and Invalid) X 2-Group (HSA and HC) analysis was conducted on the temporal-spatial factors associated with targets processing.

Significant interactions were decomposed with robust analyses of variance to control for familywise Type I error. For the sake of clarity, only statistically significant results are presented in the Results section. Finally, visual inspection of the waveforms associated with these factors was used to select those whose spatial and temporal characteristics corresponded to ERP components relevant to the paradigm used (Dien et al., 2005). Following previous literature in children (for a review, see Wauthia & Rossignol, 2016), we considered as representative of the P1 a relative positivity maximally recorded at occipital electrodes at ∼100 to 150 ms after stimulus onset. Factors associated with a negativity peaking at ∼140 to 190 ms at right posterior temporal regions were associated with the N170 component. A posterior positivity that peaked between 180 and 350 ms following stimulus onset was considered as representative of the P2 component. For the processing of targets, a positivity elicited at 450 to 700 ms after target onset at parietal sites was considered as representative of the P3b component.

Results

Demographic and anxiety data

Data are presented in Table 1. Groups did not differ regarding age, gender, state, and trait anxiety [all p-values > 0.05]. The HSA group showed higher scores on the SPAI-C [t(40) = −7.56; p < 0.001]. The intercorrelation matrix conducted between inventories revealed a positive and significant correlation between the SPAI-C and the STAI-Trait score (ρ = 0.340; p = 0.028) but not between the SPAI-C and the STAI-State score (ρ = 0.013; p = 0.933).

Behavioral data

Mean RTs for each group and each condition are displayed in Table 2. No significant group difference appeared on RT [F(1,40) = 0.194; p = 0.662].

Table 2 Mean reaction times (RTs) for both groups in each experimental condition of the dot-probe task

Response-based indexes are shown in Table 3. For avoidance, no effect of Emotion [F(1,40) = 0.473; p = 0.496], Group [F(1,40) = 0.135; p = 0.715] and no significant interaction between these two variables [F(1,40) = 0.074; p = 0.788] appeared. Similarly, for vigilance, no effect of Emotion [F(1,40) = 0.034; p = 0.854], Group [F(1,40) = 0.120; p = 0.742], and no significant interaction was indicated [F(1,40) = 0.589; p = 0.447; η²p = 0.015]. For orientation, we failed to show a significant effect of Emotion [F(1,40) = 0.007; p = 0.933], Group [F(1,40) = 3.345; p = 0.075], and interaction [F(1,40) = 0.002; p = 0.967]. Similar results were found on disengagement, with no effect of Emotion [F(1,40) = 0.099; p = 0.755], Group [F(1,40) = 0.785; p = 0.381] and interaction [F(1,40) = 0.026; p = 0.381].

Table 3 Attentional bias, orientation, and disengagement indices (ms) for HC and HSA children as measured using Evans and Britton’s method

ERP data

ERP data for faces processing

The grand average waveforms for each face type according to the visual field are presented for each group in Fig. 2a.

Fig. 2

a Grand average ERPs reconstructed from PCA analyses for the two faces-pairs types (disgust–neutral and happy–neutral) and the two visual fields (RVF = right visual field; LVF = left visual field) obtained at P8 and O1 for each group (HC = healthy controls; HSA = high socially anxious). b Topographic maps for the temporal-spatial factors associated with faces processing

P1

Two temporal-spatial factors have been associated with the P1 component (TF3/SF1; TF3/SF2), but analyses failed to indicate any significant effects on these factors [all p-values > 0.05].

N170

The TF1/SF4 factor combination was associated with the N170 component, peaking at 224 milliseconds at parieto-occipital sites (P8) (Fig. 2b). The robust ANOVA revealed a significant effect of Group [TWJT/C (1.0, 28.1) = 4.55; p = 0.044], highlighting a larger (more negative) N170 amplitude in the HC group [M = −1.50] than in the HSA group [M = 0.17].

P2

The TF8/SF1 factor was associated with the P2 component, peaking at 356 milliseconds at occipital sites (O1) (Fig. 2b). First, analyses revealed a main effect of Emotion [TWJT/C (1.0, 33.6) = 6.98; p = 0.013], highlighting a larger P2 amplitude for disgusted-neutral face pairs [M = 2.40] than for happy-neutral face pairs [M = 2.06]. Second, analyses showed a significant interaction between Group and Emotion [TWJT/C (1.0, 33.6) = 8.88; p = 0.005]. While no significant effect for HC children was found [TWJT/C (1.0, 21.0) = 0.07; p = 0.80], a significant main effect of Emotion was found for the HSA group [TWJT/C (1.0, 16.0) = 13.39; p = 0.001]. For HSA children, the P2 amplitude was higher for disgusted-neutral faces [M = 3.27] than for happy-neutral faces [M = 2.56].

ERP data for targets processing

The original grand average waveforms associated with targets processing are presented in Fig. 3 for each group according to facial expressions, validity, and visual field conditions.

Fig. 3

a Grand average ERPs reconstructed from PCA analyses for targets following disgusted-neutral and happy-neutral pairs of faces in the valid and invalid conditions and for the two visual fields (RVF = right visual field; LVF = left visual field) obtained at POz and Oz for each group (HC = healthy controls; HSA = high socially anxious). b Topographic maps for the temporal-spatial factors associated with target processing

P1 and P3b

The temporal-spatial principal factor analysis failed to identify a significant factor combination that could be associated, according to its topography and latency, with the P1 or P3b component.

P2

The TF5/SF4 factor was associated with the P2 component, peaking at 356 milliseconds at occipital sites (O2) (Fig. 3b). For this factor, a significant Emotion x Validity interaction was found [TWJT/C (1.0, 35.7) = 11.4; p = 0.001] and was qualified by a significant interaction between Group, Emotion, and Validity [TWJT/C (1.0, 35.7) = 10.7; p = 0.002]. Further analyses revealed a significant interaction between Emotion and Validity for the HSA group only [TWJT/C (1.0, 16.0) = 21.03; p < 0.001], with a larger P2 amplitude for targets following disgusted-neutral faces in the invalid condition [M = 1.33] than in the valid condition [M = 0.79; TWJT/C (1.0, 16.0) = 5.67; p = 0.040].

Discussion

This study was designed to investigate the behavioral and electrophysiological correlates of AB for threat in children aged 8 to 12 years who presented high levels of social anxiety. Twenty-two HC and 20 HSA children participated in a dot-probe task while their responses and electroencephalograms were collected. The main results showed significant group differences at the electrophysiological level only: (i) lower N170 amplitude in response to all facial stimuli in the HSA group than in the HC group; (ii) greater P2 amplitude for disgusted faces than for happy faces in the HSA group; and (iii) larger P2 amplitude for invalid targets following disgusted faces compared with valid targets in HSA children.

The absence of group differences on RT-based measures of AB is in line with a growing body of literature using the dot-probe task (see Dudeney et al., 2015 for a review). Our study extends these findings using, for the first time in a subclinical pediatric population, the new calculation method proposed by Evans and Britton (2018), which demonstrates superior psychometric properties compared with standard AB measures. Conversely, EEG analyses revealed three differences according to the level of social anxiety. First, during the processing of face pairs, HSA children showed increased P2 amplitudes in the threatening (disgust-neutral) compared with the nonthreatening (happy-neutral) condition. We also found an increased P2 for invalid targets following disgusted faces in the HSA group. The P2 component has traditionally been associated with the mobilization of attentional resources toward salient stimuli (Bar-Haim et al., 2005; Eldar et al., 2010) and with sustained perceptual processing (Schupp et al., 2003; Schupp et al., 2004). Therefore, our results first suggest an impact of a high level of social anxiety on the initial orientation of attention towards threats, and second, they imply that the presentation of a threatening face provokes an enduring neural response in HSA children. Interestingly, Thai et al. (2016) suggested that an enhancement of the P2 amplitude could reflect the recruitment of compensatory resources needed for attentional flexibility and the control of the behavioral manifestations of social anxiety through AB. This assumption is particularly relevant in our case because the increased P2 amplitude for the HSA group is significant in the invalid condition, which requires participants to flexibly move their attention towards the side of the screen opposite to a threat. 
Therefore, the P2 enhancement in this condition could reflect the increased effort that HSA children need for controlled processes, such as disengaging their attention from threat and then correctly processing a subsequent target.

Taken together with the absence of a group effect at the behavioral level, our results may indicate that, as suggested by the processing efficiency hypothesis of the ACT (Eysenck et al., 2007; Eysenck & Derakshan, 2011), high social anxiety levels have no impact on performance effectiveness but may affect processing efficiency, with anxious individuals engaging more neurocognitive resources than nonanxious individuals (Eysenck et al., 2007; Eysenck and Calvo, 1992). However, this assumption should be taken with caution: although our sample size equals or exceeds that of most studies on the topic (Bechor et al., 2019; Usler and Weber, 2021), we cannot rule out that our null finding on reaction times is due to a lack of statistical power.

During the processing of faces, HSA children also showed a smaller N170 amplitude in response to all facial stimuli than the HC group. This result conflicts with the previous literature (Bechor et al., 2019) and with our initial hypothesis of facilitated attention for threatening faces in HSA children. However, it is important to note that the N170 component relies on the lateral temporal cortices (e.g., superior temporal sulcus and fusiform gyrus) (Itier & Taylor, 2004; Sadeh & Yovel, 2010), which follow a protracted developmental course from childhood to age 14-15 years (Batty & Taylor, 2006). Therefore, the potential immaturity of these areas in our age range may explain this unexpected result. Surprisingly, we failed to show any significant effect on the P1 and P3b components, meaning that these early and later visual components may not be affected by either the emotional content or the level of anxiety in our study. This observation agrees with the results previously obtained by Thai et al. (2016) but conflicts with those obtained by Bechor et al. (2019). Because all of these studies required participants to passively look at the faces, methodological variations cannot explain the mixed findings for face processing. However, we hypothesize that these discrepancies may arise from differences in clinical status between samples. Indeed, while Bechor et al. (2019) recruited children suffering from clinical levels of anxiety, we evaluated subclinical children with high levels of social anxiety, as in Thai et al. (2016). Our results thus indicate that subclinical social anxiety levels are also associated with particularities in the electrophysiological processing of threat, which could shed light on the etiological factors of the disorder and its developmental course. 
Accordingly, studies conducted in subclinical socially anxious adults demonstrated that they continue to process negative facial expressions differently at the electrophysiological level but that, contrary to children, their behavioral performance is altered (Rossignol et al., 2007), suggesting that difficulties may increase with age.

As already outlined, this study has some limitations. First, even if our sample size is comparable to those encountered in studies investigating ERPs in psychopathological contexts (Bechor et al., 2019; Usler and Weber, 2021), it remains limited and prevents us from asserting that the null behavioral finding is not due to a lack of statistical power. Future studies should use facilitated paradigms (i.e., shorter tasks, fewer inventories, reaction-times-based tasks only, or with a mobile EEG system) to include larger samples. The use of larger samples would allow, for example, the investigation of the specific effect of some variables, such as the age of the participants, knowing that executive abilities increase with age (Bishop, 2009; Hare et al., 2008; Kindt & Van Den Hout, 2001). We therefore may assume that the attention control abilities of our sample were still developing, potentially leading to similar AB for social threat in our anxious and nonanxious pediatric samples. However, the electrophysiological differences observed in our study are in favor of an increased negative effect of social anxiety on the processing of negative emotional expressions in this age group of children and, therefore, on their attentional control abilities. Despite this observation, there is a need to conduct longitudinal studies on larger samples to further investigate these issues. Second, we chose to recruit children with high levels of social anxiety instead of clinical SAD to avoid the potential confounding effect of comorbidities. However, our results attest that particularities in the processing of threat can be observed at the cerebral level in children with nonclinical levels of social anxiety and allow us to question the causal relationship between AB for threat and the level of anxiety. 
As a reminder, the cognitive model of social phobia developed by Rapee and Heimberg (1997) posited that the behavioral, cognitive, and physical symptoms of social anxiety increase the attentional focus on the potential threat present in the environment, which, in turn, promotes the development of symptoms through a vicious circle. Therefore, future research with a clinical sample could observe a continuum in which AB for threat and ERP amplitudes increase when the level of social anxiety increases. We may thus consider that the inherent characteristics of socially anxious youth (e.g., disorder severity, symptoms, attentional control abilities) may explain the conflicting results obtained until now in the literature on that topic. A better understanding of this hypothetical continuum also may benefit future research on the clinical effects of attentional bias modification (ABM) protocols on AB and anxiety levels (Cristea et al., 2015). A way to circumvent this limitation would be to use the SPAI-C scores as a continuous variable in further similar studies. Finally, we used adult faces in our paradigm to facilitate the comparison of our results with previous studies conducted in socially anxious populations (for a review, see Wauthia & Rossignol, 2016). Future research should compare AB towards adults' and children's faces, because social anxiety may have various symptomatic manifestations (e.g., anxiety when confronted with the evaluations of adults vs. anxiety when confronted with the rejection of peers).

Conclusions

This study showed significant differences between a group of children with high levels of social anxiety and a group of healthy control children at the electrophysiological level only. We notably found that socially anxious children produced increased P2 amplitudes when processing disgust-neutral pairs of faces as well as targets replacing disgusted faces in the invalid condition. These results indicate that AB toward threat associated with subclinical levels of social anxiety in children could be linked to enhanced perceptual processing for these stimuli. This recruitment of additional resources could be used by this population to circumvent the attentional control deficits that cause AB for threat. Our results confirm the relevance of using the ERP technique when investigating AB for threat in a pediatric population and support the idea that some components could be considered as neuromarkers of biased processing of threatening faces and of the causal relationship between AB and anxiety levels. A thorough investigation of these components could help to identify children at risk of developing social anxiety.