Introduction

To navigate in a complex human society, people assign individuals to specific social groups along a given category such as gender, occupation, socioeconomic class, and so on. We attribute these groups specific stereotypes, which are generalized characteristics, such as personal traits (e.g., the engineer is intelligent) and circumstantial attributes (e.g., the celebrity is rich; Amodio, 2014). Nevertheless, our social expectations of particular group members are sometimes violated, such as when we expect lawyers to be dishonest but experience honest lawyers, or when we come across reliable thieves (Cloutier et al., 2011; Macrae et al., 1999; Sherman et al., 1998). Although neuroimaging research has revealed the brain regions supporting social groups and their stereotypes, only a handful of studies have investigated the neural mechanism involved in identifying stereotype-inconsistent behaviors (Van der Cruyssen, Heleven, Ma, Vandekerckhove, & Van Overwalle, 2015).

Neuroscientific studies have revealed that key areas in the mentalizing network, including the medial prefrontal cortex (mPFC) and the temporoparietal junction (TPJ), are preferentially involved in the process of person perception (Schurz et al., 2014; Van Overwalle, 2009). These areas are recruited when inferring the mental states of others (Frith & Frith, 2006), personality traits (Ma, Vandekerckhove, Van Overwalle, Seurinck, & Fias, 2011), and group stereotypes (Delplanque et al., 2019). Although the cerebellum has been traditionally considered as supporting mainly motor and movement coordination, neuroscientists have demonstrated its involvement also in cognitive processes, including social cognition (Van Overwalle et al., 2014), language (Mariën et al., 2014), and emotions (Adamaszek et al., 2017).

Researchers have proposed that the cerebellum constructs internal models to encode, detect, and predict temporally structured sequences of motor and nonmotor (i.e., mental) events (Leggio & Molinari, 2015). With respect to social cognition, Van Overwalle et al. (2019b) hypothesized that the cerebellum builds internal models of social action sequences to predict and control the behaviors of oneself or other persons in social interaction. This sequencing mechanism makes it easier to understand the social motives behind behaviors, to detect violations, and to adjust subsequent reactions accordingly. For instance, allowing another person to enter a room first might be interpreted as polite, whereas the reverse order, entering the room first oneself, might be seen as inconsiderate. Research has revealed that the posterior cerebellum supports learning of action sequences that involve social mentalizing, such as inferring another person’s beliefs (Heleven et al., 2019), traits (Haihambo et al., 2021; Pu et al., 2020), and goals (Li et al., 2021), even when sequencing was manipulated implicitly (Ma et al., 2021a). In clinical research, cerebellar patients performed worse when they were asked to generate the temporal order of social actions requiring the understanding of others’ outdated or “false” beliefs, more so than when actions involved routine physical or social scripts, such as shopping (Van Overwalle et al., 2019a).

More importantly, recent fMRI studies showed that the posterior cerebellum is recruited when memorizing the temporal order of actions that imply a personality trait (Pu et al., 2020), even when there was no inherent logical order in the actions (Li et al., 2021). Critically, when participants were asked to memorize the temporal order of a series of actions implying either a consistent or an inconsistent trait of a person, the posterior cerebellum was even more activated when an action violated the trait implied by prior actions (Pu et al., 2021). Because there was no inherent logical order in the actions, participants could not predict precisely the next action, only that it should match the same trait. These studies therefore suggest the critical role of the cerebellum in generating predictions during action sequence learning and processing prediction errors due to inconsistencies during social understanding (Sokolov, Miall, & Ivry, 2017).

The present study investigates whether the cerebellum might support such an inconsistency effect not only for trait inferences that are related to an individual, but also for stereotypes related to a social group. Social groups (e.g., murderers) and their stereotypes (e.g., violent) are fundamental types of social constructs that help people to simplify social perception and decrease cognitive load in social understanding (Macrae et al., 1993). Stereotypes therefore facilitate both the prediction of behaviors of other persons and social interaction (Bodenhausen et al., 1976; Mitchell et al., 2009).

What happens in the brain when we perceive stereotype-inconsistent behaviors from group members? Although fMRI studies have explored the neural correlates of social categories and their stereotypes (Lau & Cikara, 2017; Mitchell et al., 2009; Van der Cruyssen et al., 2015), research investigating the violation of stereotypical behaviors is limited. To illustrate, in an fMRI study, the photographs of either Republican or Democrat politicians were paired with either typical Republican or Democrat political views and presented to participants. Results showed that the mentalizing network, including the TPJ and mPFC, was recruited for stereotypically incongruent social targets (Cloutier et al., 2011). No cerebellar activation was reported. In another fMRI study, participants were presented with faces that were either congruent or incongruent with stereotypes of various races and emotions. The mentalizing mPFC and, additionally, the lateral prefrontal cortex (PFC) were more strongly activated for faces that violated rather than confirmed stereotypical expectancies (Hehman et al., 2014). Moreover, the cerebellum also showed stronger activation to stereotype-inconsistent faces. An fMRI study comparing trait-related actions by group members versus individuals, found that inconsistent actions recruited the mentalizing TPJ more strongly than consistent actions (Van der Cruyssen et al., 2015). However, the posterior cerebellum did not show stronger recruitment for inconsistencies, although, interestingly, it showed stronger activation for group stereotypes than for individual traits. In a related study, trait-inconsistent behaviors recruited the mentalizing mPFC and TPJ, as well as the conflict-related lateral PFC and posterior medial frontal cortex (pmFC; also known as dorsal anterior cingulate; Botvinick et al., 2004; Van Overwalle, 2009). Again, no cerebellar involvement was observed.

In summary, inconsistent stereotype behaviors increased activation in cortical mentalizing areas (TPJ and mPFC) and, additionally, in the pmFC and lateral PFC, which are part of the conflict monitoring network (Botvinick et al., 2004; Van Overwalle, 2009). However, evidence for a contribution of the cerebellum to stereotype inconsistency was contradictory across previous studies. Although this might in part be due to the fact that the cerebellum was not always measured by fMRI protocols in previous studies (Baetens et al., 2020), a more likely explanation in line with our theorizing is that sequencing was not explicitly manipulated, so that the contribution of the cerebellum remained undiscovered. Given these unclear results, the aim of the present study is to investigate the neural correlates of stereotype-inconsistent social behaviors of group members in the context of a sequencing learning paradigm.

Present study

The research question in this study is: does the posterior cerebellum contribute to detecting stereotype-inconsistent behaviors of group members? We expected that the cerebellar function of detecting violations of trait-implying behaviors by individuals (Pu et al., 2021) extends to stereotype-violating behaviors by group members. To investigate this, we adapted a sequencing learning paradigm from Pu et al. (2021). Specifically, participants were asked to memorize the temporal sequence of sentences describing various group members (e.g., celebrities) who all performed stereotype-consistent behaviors (Social Consistent Sequencing condition), or who after initial stereotype-consistent behaviors by some members, engaged in stereotype-inconsistent behaviors (Social Inconsistent Sequencing condition). After memorizing the temporal sequence of these behaviors, participants were tested on their correct recognition of the sequence. As a comparison, we created two control conditions in which participants had to simply read the social consistent sentences without memorizing the order (Social Consistent Nonsequencing Control condition) or in which participants had to memorize the order of nonsocial consistent events (Non-social Consistent Sequencing Control condition).

We expected that the posterior cerebellum would be recruited more when participants memorize the order of social actions, as opposed to the two non-sequencing and non-social control conditions. Of particular interest, we predicted that the posterior cerebellum would contribute to identifying social behaviors that are inconsistent with the stereotype of social groups, by revealing higher activation during stereotype-inconsistent than consistent behavior. In addition, we expected that cortical brain networks, consisting of mentalizing regions (e.g., TPJ and mPFC) would be activated during all social sequencing conditions, while domain-general conflict monitoring regions (e.g., lateral PFC and pmFC) would support the processing of stereotype-inconsistent behaviors (Hehman et al., 2014; N. Ma et al., 2012; Van der Cruyssen et al., 2015).

Methods

Participants

Twenty-eight healthy, right-handed, native Dutch-speaking volunteers were recruited to participate in this fMRI study. We excluded one participant due to excessive head movements (more than 10% outlier scans; see below). Thus, 27 participants (18 females; age mean ± SD, 23 ± 4 years) were included in the analyses. All participants had normal or corrected-to-normal vision, and reported no neurological or psychiatric disorders. Informed consent was obtained with the approval of the Medical Ethics Committee at the Hospital of the University of Ghent, where the study was conducted. Participants were paid 20 euros in exchange for their participation and transportation costs.

Stimulus materials

Five sentences were shown in each sentence set, which described five different members belonging to a social group (e.g., jury members) who performed a social behavior that implied a stereotype. For example, “The 1st judge has the final say on the sentence” and so on until “The 5th judge sentences a criminal to five years in prison.” The groups covered a wide range of stereotyped categories, such as gender, occupation, and social status. These sentence sets were used in four conditions (Table 1). Specifically, in the Social Consistent Sequencing and Social Consistent Nonsequencing conditions, each sentence set consisted of social actions that were all consistent with a stereotype. For example, “The 1st athlete breaks the gold medal record in the Olympic games” implies the stereotype “sportive.” In the Social Inconsistent Sequencing condition, each sentence set consisted of social actions with the initial three or four actions implying a consistent stereotype and the remaining two or one actions implying an inconsistent stereotype. For example, “The 1st celebrity lives in a splendid house in the suburbs” implies a consistent stereotype of “rich.” However, “The 4th celebrity cannot afford health insurance” implies an inconsistent stereotype of “poor.” To create a strong expectation of stereotype-consistent behaviors, inconsistent actions were always presented after at least three consistent sentences. Half of the inconsistent sets included one inconsistent action in either the fourth or the fifth position (randomly determined), and the other half included two inconsistent actions in the fourth and fifth positions. This manipulation was introduced to vary the occurrence and number of inconsistent sentences, to avoid a growing anticipation of the position of inconsistent sentences that might weaken our inconsistency effect.
As a comparison, we included a Nonsocial Consistent Sequencing Control condition; the five sentences of each set described five different objects belonging to a specific nonsocial category that implied a consistent characteristic. For example, “The 1st skyscraper offers a landscape view all over the city” implies “tall,” which is consistent with the feature of skyscrapers.
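The layout of the Social Inconsistent Sequencing sets described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' stimulus script: the function name, the use of 10 sets, and the rule that any odd leftover set goes to the two-inconsistent group are all assumptions made for the example.

```python
import random


def inconsistent_positions(n_sets, rng):
    """Return, per set, the 1-based positions that hold inconsistent sentences.

    Half of the sets get one inconsistent sentence (position 4 or 5, chosen at
    random); the remaining sets get two (positions 4 and 5). Positions 1-3 are
    therefore always consistent.
    """
    n_single = n_sets // 2  # assumption: an odd leftover set joins the "double" group
    layouts = [(rng.choice((4, 5)),) for _ in range(n_single)]
    layouts += [(4, 5)] * (n_sets - n_single)
    rng.shuffle(layouts)  # mix single and double sets across the run
    return layouts


# Hypothetical example with 10 sets and a fixed seed for reproducibility.
layouts = inconsistent_positions(10, random.Random(0))
for set_index, positions in enumerate(layouts, start=1):
    print(set_index, positions)
```

Because the single-inconsistent position and the order of the sets are both random, a participant cannot reliably anticipate where the violation will appear, which is the stated purpose of the manipulation.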

Table 1 Abbreviated examples of experimental stimuli

All nonsocial sentences and features were selected from a previous study (Pu et al., 2021). Some of the stereotype-consistent sentences and their stereotypes were adapted from an earlier study (Delplanque et al., 2019). Additional consistent sentences, and all inconsistent sentences, as well as their stereotypes, were newly created by the first author. All social sentences and their stereotypes were pilot tested. In the pilot, participants (n = 25) were asked to rate “How applicable is the stereotype for the group?” using a 7-point scale (1 = not applicable at all, 4 = neutral, and 7 = very applicable). The consistent stereotypes were selected when the applicability rating was >5.5, while inconsistent stereotypes were selected when the applicability rating was <2.5. Next, to identify the applicability of stereotypical sentences regarding the social group, participants rated “How stereotypical is the behavior in the sentence?” using a 7-point scale. The sentences describing the consistent stereotype were selected when the applicability rating was >5.5, while sentences describing the inconsistent stereotype were selected when the applicability rating was <2.5. For the Consistent condition, we selected sentence sets that consisted of five consistent sentences, while for the Inconsistent condition, sentence sets consisted of three (or 4) consistent sentences and two (or 1) inconsistent sentences. All selected sentences contained between 9 and 12 words, including the group name (e.g., “The 1st celebrity”), with most sentences containing 10 words in total.
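The pilot selection rule above amounts to a simple threshold on the 7-point ratings. The sketch below illustrates it under the assumption that the rating compared against the cut-offs is the mean across pilot participants; the function name and the example ratings are hypothetical and not taken from the study.

```python
CONSISTENT_MIN = 5.5    # ratings above this value count as stereotype-consistent
INCONSISTENT_MAX = 2.5  # ratings below this value count as stereotype-inconsistent


def classify_rating(mean_rating):
    """Classify a mean pilot rating on the 7-point scale.

    Returns "consistent", "inconsistent", or None when the item falls in the
    middle range and would not be selected for either type.
    """
    if not 1 <= mean_rating <= 7:
        raise ValueError("rating must lie on the 1-7 scale")
    if mean_rating > CONSISTENT_MIN:
        return "consistent"
    if mean_rating < INCONSISTENT_MAX:
        return "inconsistent"
    return None


# Hypothetical mean ratings for three candidate items.
pilot_means = {"item_a": 6.2, "item_b": 1.9, "item_c": 4.1}
selection = {item: classify_rating(r) for item, r in pilot_means.items()}
print(selection)
```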

Procedure

This study included three Sequencing conditions (i.e., Social Consistent Sequencing, Social Inconsistent Sequencing, and Nonsocial Consistent Sequencing), preceded by a baseline Nonsequencing condition (i.e., Social Consistent Nonsequencing), for four conditions in total. The Nonsequencing condition was presented at the beginning of the experiment, whereas the Sequencing conditions were presented in the remainder of the experiment in a randomized order. In all conditions, the order of the sentences within each set was fully randomized for each participant, with the proviso that inconsistent sentences appeared only in positions four or five (see below).

In the Sequencing task, participants were instructed to memorize a given temporal order of a set of five sentences involving a group member or an object. Then they had to infer from these sentences a common stereotype/characteristic of that group or that kind of object. There were 11 sentence sets for each Sequencing condition, all presented in a random order across all sequencing conditions. In each set, participants had to memorize the order of sentences in 25 s or 35 s (randomly determined). The two different durations were intended to create different levels of difficulty in which participants performed at neither chance nor ceiling level. This factor was of no further relevance for our hypotheses, and is therefore reported in Supplementary Table S1. Before the task, participants performed two practice sets including Social and Nonsocial sets.

For each set of the Sequencing task, the same procedure was followed (Fig. 1). During the study phase, participants were asked to memorize the correct temporal order of a set of sentences. First, the name of the social groups or nonsocial objects (e.g., “the celebrities”, or “the skyscrapers”) was presented on the top of the screen. Then five sentences were shown on screen one by one with a duration of 3 s each (while prior sentences were not shown), leaving sufficient time to read each sentence carefully. Immediately afterwards, all sentences were presented together for a total duration of 25 s or 35 s, to memorize their order. A red notice appeared on the top of the screen to indicate that 10 s remained before the memorizing phase ended. To optimize the estimation of the event-related fMRI response for inconsistent sentences, a mean 500-ms jitter (randomly ranging between 0 and 1000 ms) was presented between the third and fourth sentence regardless of condition.
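The timing of the study phase can be made concrete with a small timeline sketch. It assumes that time zero is the onset of the first sentence and that sentences follow one another without gaps other than the jitter before the fourth sentence; neither detail is stated in the text, and the function and event names are illustrative only.

```python
SENTENCE_S = 3.0            # each sentence is shown alone for 3 s
N_SENTENCES = 5
WARNING_BEFORE_END_S = 10.0  # red notice appears 10 s before memorizing ends


def study_phase_timeline(memorize_s, jitter_s):
    """Return (event, onset in s) pairs for one Sequencing set.

    memorize_s: duration of the all-sentences display (25 or 35 s).
    jitter_s:   jitter inserted between the third and fourth sentence (0-1 s).
    """
    if memorize_s not in (25, 35):
        raise ValueError("memorize_s must be 25 or 35")
    if not 0 <= jitter_s <= 1:
        raise ValueError("jitter_s must lie between 0 and 1 s")
    events = []
    t = 0.0
    for position in range(1, N_SENTENCES + 1):
        if position == 4:
            t += jitter_s  # jitter precedes the fourth sentence in every condition
        events.append((f"sentence_{position}", t))
        t += SENTENCE_S
    events.append(("all_sentences", t))
    events.append(("red_notice", t + memorize_s - WARNING_BEFORE_END_S))
    events.append(("end_of_study_phase", t + memorize_s))
    return events


# Example with the mean jitter of 0.5 s and the shorter memorizing duration.
timeline = dict(study_phase_timeline(memorize_s=25, jitter_s=0.5))
print(timeline)
```

In the experiment itself the jitter would be drawn at random per set, for instance with `random.uniform(0, 1)`, which gives the stated mean of 500 ms.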

Fig. 1

Experimental procedure. In the beginning, participants were asked to memorize the order of the behaviors of five group members. Five sentences describing the actions of each group member were first shown on screen one by one with a duration of 3 s each. Immediately afterwards, all sentences were presented together for a total duration of 25 s or 35 s. Importantly, the sentences in the second half (positions 4-5) of the set implied either a consistent or inconsistent stereotype (Social Consistent Sequencing and Social Inconsistent Sequencing conditions). Afterwards, participants had to infer a common stereotype/characteristic of the group members and rate how stereotypical the behaviors of the group members were, followed by a factual check question. All these questions had to be answered within 5 s. Next, participants had to retrieve the correct order of the sentences. An example of the Social Inconsistent Sequencing condition is shown here; Table 1 shows examples of the other conditions. For the Nonsequencing Control condition, the procedure was identical to the Sequencing conditions, except for not having a subsequent sequence retrieval phase.

Participants were then asked, “Which characteristic best describes these five group members or five objects?” Two options were given in a random order: one option was the correct stereotype/characteristic, and the other was a distractor with the same valence. Participants then rated, “How stereotypical are the behaviors in the sentences?” (i.e., Stereotypicality rating phase) using a 4-point rating scale (1 = not at all, 4 = very much). Next, to verify whether the participants had fully understood the sentences and to avoid that they used a minimal strategy whereby they memorized only some keywords of the sentences, they had to answer a factual check question “Which of the two sentences was shown before?” One sentence came from the original set, and the other was a slightly reworded version (e.g., “The celebrity orders a bag from Hermes,” and “The celebrity orders a bag from Dior”). Note that the factual check question does not demonstrate sequence memorization of the sentences (which is the main dependent behavioral variable), but only memorization of some details in the sentences (as a manipulation check that the sentences were read).

Finally, during the sequence retrieval, participants were instructed to recognize the correct order of the sentences in two trials by answering the question, “Which of the two sentences was shown earlier during the study phase?” (1 = the first sentence, 2 = the second sentence). The order (1st, 2nd, ...5th) of the group members or objects was omitted from the sentence options during the factual check and sequence retrieval questions. On each trial, two sentences were shown in a random order.

Before the start of the Sequencing task, a Social Consistent Nonsequencing Control condition was introduced. This condition was presented first to avoid spontaneous memorization of the order of the sentences after going through the Sequencing conditions first (see also Pu et al., 2020, 2021). Participants were required to read sets of social sentences implying a consistent stereotype of a social group (11 sets in total) but without memorizing their order. For this reason, participants were allowed to end the reading earlier (i.e., before 25 or 35 s) once they understood all the sentences. All other aspects of the procedure were identical to the Sequencing task, except for not having a sequence retrieval phase.

All questions and ratings had to be answered within 5 s and were preceded by a blank screen with a fixation cross in the center, which was jittered randomly between 0 ms and 2000 ms (mean = 1000 ms). All responses were given with the (nondominant) left hand on a response box. Overall, the participants failed to respond within 5 s (missed) in 6.5% (SD = 5.6%) of the retrieval trials across all sequencing conditions. These missed trials were excluded from the behavioral and fMRI analysis.

Imaging procedure and preprocessing

Images were collected with a Siemens Magnetom Prisma fit scanner system (Siemens Medical Systems, Erlangen, Germany) using a 64-channel radiofrequency head coil. Stimuli were projected onto a screen at the end of the magnet bore that participants viewed by way of a mirror mounted on the head coil. Stimulus presentation was controlled by E-Prime 2.0 (www.pstnet.com/eprime; Psychology Software Tools) running under Windows XP. Participants were placed head first and supine in the scanner bore and were instructed not to move their heads to avoid motion artifacts. Foam cushions were placed within the head coil to minimize head movements. First, high-resolution anatomical images were acquired using a T1-weighted 3D MPRAGE sequence [TR = 2250 ms, TE = 4.18 ms, TI = 900 ms, FOV = 256 mm, flip angle = 9°, voxel size = 1 × 1 × 1 mm]. Second, a fieldmap was calculated to correct for inhomogeneities in the magnetic field (Cusack & Papadakis, 2002). Third, whole-brain functional images were collected in a single run using a T2*-weighted gradient echo sequence, sensitive to BOLD contrast (TR = 1000 ms, TE = 31.0 ms, FOV = 210 mm, flip angle = 52°, slice thickness = 2.5 mm, distance factor = 0%, voxel size = 2.5 × 2.5 × 2.5 mm, 56 axial slices, acceleration factor GRAPPA = 4).

SPM12 (Wellcome Department of Cognitive Neurology, London, UK) was used to process and analyze the fMRI data. To remove sources of noise and artifacts, data were preprocessed. Inhomogeneities in the magnetic field were corrected using the fieldmap (Cusack & Papadakis, 2002). Functional data were corrected for differences in acquisition time between slices for each whole-brain volume, realigned to correct for head movement, and co-registered with each participant’s anatomical data. Then, the functional data were transformed into a standard anatomical space (2-mm isotropic voxels) based on the ICBM152 brain template (Montreal Neurological Institute). Normalized data were then spatially smoothed (6-mm full-width at half-maximum, FWHM) using a Gaussian kernel. Finally, using the Artifact Detection Tool (ART; http://web.mit.edu/swg/art/art.pdf; http://www.nitrc.org/projects/artifact_detect), the preprocessed data were examined for excessive motion artifacts and correlations between motion and experimental design, and between global mean signal and experimental design. Outliers were identified in the temporal differences series by assessing between-scan differences (Z-threshold: 3.0 mm, scan-to-scan movement threshold: 0.5 mm; rotation threshold: 0.02 rad). These outliers were omitted from the analysis by including a single regressor for each outlier. A default high-pass filter of 128 s was used, and serial correlations were accounted for by the default auto-regressive AR(1) model.
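The outlier logic can be illustrated with a simplified stand-in. The sketch below is not the ART implementation: it assumes the Z-threshold applies to z-scored scan-to-scan differences in the global mean signal (treated as unitless), takes precomputed scan-to-scan translation and rotation values as inputs, and combines the criteria with a simple "any threshold exceeded" rule. All names and data are hypothetical. The 10% limit is the participant exclusion criterion mentioned in the Participants section.

```python
from statistics import mean, pstdev


def flag_outlier_scans(global_signal, translation_diff_mm, rotation_diff_rad,
                       z_thresh=3.0, move_thresh_mm=0.5, rot_thresh_rad=0.02):
    """Return the indices of scans flagged as outliers.

    global_signal holds one value per scan; the two motion lists hold one
    value per scan-to-scan transition. Index k + 1 is flagged when the
    transition from scan k to scan k + 1 exceeds any threshold.
    """
    signal_diff = [b - a for a, b in zip(global_signal, global_signal[1:])]
    mu, sd = mean(signal_diff), pstdev(signal_diff)
    flagged = set()
    for k, diff in enumerate(signal_diff):
        z = (diff - mu) / sd if sd > 0 else 0.0
        if (abs(z) > z_thresh
                or abs(translation_diff_mm[k]) > move_thresh_mm
                or abs(rotation_diff_rad[k]) > rot_thresh_rad):
            flagged.add(k + 1)
    return flagged


def exceeds_outlier_limit(n_outliers, n_scans, max_fraction=0.10):
    """True when a participant has more than 10% outlier scans."""
    return n_outliers / n_scans > max_fraction


# Hypothetical run of 21 scans: a signal spike at scan 10, a 0.6-mm jump into scan 5.
signal = [100.0] * 21
signal[10] = 110.0
translation = [0.1] * 20
translation[4] = 0.6
rotation = [0.0] * 20
flagged = flag_outlier_scans(signal, translation, rotation)
print(sorted(flagged), exceeds_outlier_limit(len(flagged), len(signal)))
```

Each flagged scan would then receive its own nuisance regressor in the first-level model, as described in the text.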

Statistical analysis of neuroimaging data

Whole-brain analysis of study phase and retrieval phase

The general linear model of SPM12 (Wellcome Department of Cognitive Neurology, London, UK) was used to conduct the analyses of the fMRI data. For the analysis at the first (single participant) level, the event-related design was modeled with one regressor for each of the four conditions (Social Consistent Non-sequencing; Social Consistent Sequencing; Social Inconsistent Sequencing; Nonsocial Consistent Sequencing). During the study phase, onsets for all conditions were specified at the presentation of all sentences-at-once of the sentence set, that is, when memorizing the order of the sentences most likely began. After the study phase, onsets were specified at the presentation of each question (i.e., stereotype/characteristic judgment, stereotypicality rating, and sequence retrieval) for each of the Sequencing conditions. Each regressor was convolved with a canonical hemodynamic response function of which the duration was set to 0 s for all questions. During the study phase, event duration was determined in the same manner as Pu et al. (2020): For the Nonsequencing Control condition, the duration for reading all sentences was set to 4 s (on average the shortest reading time to understand the sentences). Sentence sets with reading times shorter than 4 s were excluded from the analysis, and the mean rejection rate of sentence sets was 16% (SD = 22%). For the Sequencing conditions, duration was limited to 10 s to capture memorizing the order of all sentence sets and participants.
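The duration rules for the first-level regressors can be summarized in a short sketch. It is only a restatement of the rules in the paragraph above in code form; the function names and condition labels are hypothetical, and the actual model was specified in SPM12 rather than in Python.

```python
QUESTION_DURATION_S = 0.0  # all questions are modeled as instantaneous events
NONSEQ_DURATION_S = 4.0    # fixed reading duration in the Nonsequencing condition
SEQ_DURATION_S = 10.0      # modeled memorizing duration in the Sequencing conditions


def study_event(condition, onset_s, reading_time_s=None):
    """Return (onset, duration) for a study-phase event, or None if excluded.

    condition: "nonsequencing" or "sequencing" (hypothetical labels).
    reading_time_s: self-paced reading time, required for "nonsequencing".
    """
    if condition == "nonsequencing":
        if reading_time_s is None:
            raise ValueError("reading_time_s is required for nonsequencing sets")
        if reading_time_s < NONSEQ_DURATION_S:
            return None  # sets read in under 4 s are dropped from the analysis
        return (onset_s, NONSEQ_DURATION_S)
    if condition == "sequencing":
        return (onset_s, SEQ_DURATION_S)
    raise ValueError(f"unknown condition: {condition}")


def question_event(onset_s):
    """Return (onset, duration) for a judgment, rating, or retrieval question."""
    return (onset_s, QUESTION_DURATION_S)


print(study_event("nonsequencing", 12.0, reading_time_s=3.2))  # excluded set
print(study_event("nonsequencing", 12.0, reading_time_s=9.0))
print(study_event("sequencing", 40.0))
```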

At the second (group) level, the regressors from the single-subject, first-level analyses were entered into a second-level, random-effects analysis. For the study phase and all questions, we conducted a one-way within-subject ANOVA and defined all possible t-contrasts of interest comparing Sequencing and Nonsequencing conditions during the study phase, and comparing in a similar manner the stereotype/characteristic judgment, stereotypicality rating, and sequence retrieval questions of the Sequencing conditions. A full factorial analysis was not conducted, because there were not enough conditions to combine into a full factorial design that would allow us to test our hypotheses, and because SPM does not have a within-participants version of this analysis that controls for individual differences.

Whole-brain analysis of detecting stereotype-inconsistent behaviors

We then examined brain activity associated with stereotype-inconsistent behaviors of social groups. Recall that each sentence set was split into a first half of consistent sentences (positions 1-3) and a second half of mixed consistent and inconsistent sentences (positions 4-5). At the second (group) level, to analyze inconsistency detection in more detail, for the Social Inconsistent Sequencing sets, we compared the first occurrence of an inconsistent sentence (in position 4 or 5) > the third occurrence of a consistent sentence (in position 3). We did so to have an inconsistent sentence following as closely as possible on a consistent sentence, in an attempt to identify a sudden inconsistency effect at the relevant moment. Note that because the order of (in)consistent sentences was fully randomized for each participant, the exact sentence in these comparisons always differed. In addition, to verify that the cerebellar Crus was preferentially activated when detecting inconsistent actions, as opposed to consistent actions, we next examined brain activation for the parallel (Social and Nonsocial) Consistent Sequencing conditions. Specifically, we compared the fourth occurrence of a consistent sentence (to parallel a similar position of the 1st inconsistent sentence) > the third occurrence of a consistent sentence. This analysis required six regressors of interest at the first level, involving two sentences in the three Sequencing conditions.

For all whole-brain analyses, significant activation maps were defined at a cluster-defining threshold of p < 0.001, uncorrected, with a minimum cluster extent of 10 voxels, and we restricted the analysis to clusters that survived a cluster-wise threshold of p < 0.05, Family Wise Error (FWE) corrected.

Regions of Interest Analysis

Several a priori Regions of Interest (ROIs) were determined by our specific hypotheses (e.g., Cerebellar Crus) and by earlier findings (Ma et al., 2012) indicating that mentalizing and conflict monitoring networks were activated in updating consistent and inconsistent social behaviors. Specifically, the ROIs for cerebellar Crus were taken from an earlier study (MNI coordinates: Crus 2, ±24, −76, −40; Crus 1, 40, −70, −40; Van Overwalle et al., 2020). The cortical ROIs were derived from prior meta-analyses on social cognition (Van Overwalle, 2009) and involved the following areas and center coordinates: social mentalizing: TPJ, ±50, −55, 25; dmPFC, 0, 50, 35; mPFC, 0, 50, 20; precuneus, 0, −60, 40; conflict monitoring: pmFC, 0, 20, 45; lateral PFC, ±40, 25, 20. A sphere of 10-mm radius for cerebellar ROIs (given the smaller volume of the cerebellum) and 15-mm radius for cortical ROIs (see similar analysis by Pu et al., 2021) around the centers was used to perform a small volume correction using the same cluster-defining threshold as the whole-brain analysis, with p < 0.001, uncorrected, with a minimum of 10 voxels. Significant ROIs were identified using a threshold of p < 0.05, FWE corrected at the cluster level.
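The spherical ROIs can be written down as a small lookup, which makes it easy to check whether a given MNI coordinate falls inside any of them. The sketch below is illustrative only: the ROI labels and the left/right naming (negative x taken as left) are chosen for the example, and the small volume correction itself was carried out in SPM12, not with this code.

```python
import math

# Center coordinates (MNI, mm) from the text; bilateral ROIs are listed per hemisphere.
ROI_CENTERS = {
    "crus2_left": (-24, -76, -40), "crus2_right": (24, -76, -40),
    "crus1_right": (40, -70, -40),
    "TPJ_left": (-50, -55, 25), "TPJ_right": (50, -55, 25),
    "dmPFC": (0, 50, 35), "mPFC": (0, 50, 20), "precuneus": (0, -60, 40),
    "pmFC": (0, 20, 45),
    "latPFC_left": (-40, 25, 20), "latPFC_right": (40, 25, 20),
}


def roi_radius_mm(name):
    """Cerebellar spheres use a 10-mm radius, cortical spheres 15 mm."""
    return 10.0 if name.startswith("crus") else 15.0


def rois_containing(coord):
    """Return the names of all ROI spheres that contain an MNI coordinate."""
    return [name for name, center in ROI_CENTERS.items()
            if math.dist(coord, center) <= roi_radius_mm(name)]


print(rois_containing((24, -76, -40)))  # the Crus 2 center itself
print(rois_containing((0, 50, 28)))     # lies within both medial prefrontal spheres
```

The second example shows that the dmPFC and mPFC spheres overlap, since their centers are only 15 mm apart while each sphere has a 15-mm radius.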

Results

Behavioral results

We used one-way repeated measures analysis of variance (ANOVA) on the accuracy of stereotype/characteristic judgment, the factual check question, the stereotypicality rating, and the accuracy and response time of sequence retrieval, with Sequencing condition (Social Consistent, Social Inconsistent, and Nonsocial Consistent Sequencing conditions) as the within-participant factor.

First, for the accuracy of the stereotype/characteristic judgment, the main effect of Sequencing condition was not significant, F(2,52) = 0.91, p = 0.41, η2p = 0.034. The average accuracy across all conditions was 96% (SD = 7.8%).

Second, for the accuracy of the factual check question, the main effect of Sequencing condition was significant, F(2,52) = 15.36, p < 0.001, η2p = 0.37. The accuracy in the Social Inconsistent Sequencing condition (mean ± SD: 85% ± 13%) was significantly lower than in the Nonsocial Consistent Sequencing (mean ± SD: 95% ± 7%), p = 0.024, and Social Consistent Sequencing conditions (mean ± SD: 91% ± 8%), p < 0.001 (Bonferroni correction). The average accuracy of the factual check question across all Sequencing conditions (Social + Nonsocial) was 90% (SD = 9%).

Third, for the stereotypicality rating, the main effect of Sequencing condition was significant, F(2,52) = 36.79, p < 0.001, η2p = 0.59. The stereotypicality rating in the Social Inconsistent Sequencing was significantly lower than the Social Consistent and Non-social Consistent Sequencing conditions, both p < 0.001 (Bonferroni correction). Overall, these results indicated that participants performed well on inferring the stereotype/characteristic of both social groups and non-social objects, and that the manipulation of stereotype (in)consistency was successful.

Finally, for the main dependent variable—accuracy of sequence retrieval—we analyzed the % correct response. The main effect of Sequencing condition was significant, F(2,52) = 5.41, p = 0.007, η2p = 0.17. The retrieval accuracy in the Social Inconsistent Sequencing condition (mean ± SD: 85% ± 11%) was higher than the Nonsocial Consistent Sequencing condition (mean ± SD: 78% ± 12%), p = 0.012 (Bonferroni correction). There was no significant difference for other comparisons (Social Inconsistent vs. Social Consistent Sequencing, Social Consistent vs. Non-social Consistent Sequencing), all p > 0.10.

In addition, for the response time of sequence retrieval, the main effect of Sequencing condition was not significant, F(2,52) = 2.59, p = 0.09, η2p = 0.09.
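As a consistency check on the statistics above, partial eta squared can be recovered from each F value and its degrees of freedom with the standard identity η2p = (F × df_effect) / (F × df_effect + df_error). The short sketch below applies it to the reported values; the function names are illustrative, and the degrees of freedom follow from 27 participants and three conditions.

```python
def rm_anova_df(n_participants, n_levels):
    """Degrees of freedom for a one-way repeated measures ANOVA."""
    df_effect = n_levels - 1
    df_error = (n_participants - 1) * (n_levels - 1)
    return df_effect, df_error


def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared computed from an F statistic and its df."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


df_effect, df_error = rm_anova_df(n_participants=27, n_levels=3)

# F values reported in the behavioral results.
reported_f = {
    "stereotype judgment accuracy": 0.91,
    "factual check accuracy": 15.36,
    "stereotypicality rating": 36.79,
    "retrieval accuracy": 5.41,
    "retrieval response time": 2.59,
}
for measure, f_value in reported_f.items():
    print(measure, round(partial_eta_squared(f_value, df_effect, df_error), 3))
```

Rounded to the precision used in the text, these values agree with the reported effect sizes.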

Neuroimaging results

The analysis of the data was hypothesis-driven to avoid an overload of nonessential results and because of the lack of a full factorial design. Hence, we focused only on the hypothesized contrasts between Sequencing versus Nonsequencing, between Social versus Nonsocial, and between Consistent versus Inconsistent conditions. We also report the reverse contrasts to ensure that the involvement of the posterior cerebellum is observed only in the hypothesized direction of the contrasts.

Given our focus on the cerebellum, we start the description of our results with this brain area, followed by the other (posterior to anterior) areas.

Study phase: memorizing the order of social actions

First, to identify the cerebellar involvement in learning action sequences, we computed a Social Consistent Sequencing > Social Consistent Nonsequencing contrast. As expected, the results from a whole brain analysis revealed significant posterior cerebellar Crus 2 activation. Additional brain activations were found in the mPFC, dmPFC, TPJ, cerebellar lobule VI, postcentral gyrus, inferior frontal gyrus (IFG; including lateral PFC), and superior frontal gyrus (Table 2; Fig. 2A). For the reverse contrast, we observed brain activations in the cuneus, precuneus, supramarginal gyrus, superior frontal gyrus, and middle frontal gyrus.

Table 2 Whole-brain analysis of the study phase (e.g., memorizing the order of actions)
Fig. 2

Memorizing social action sequences. Top: Sagittal and Transverse views of the contrasts at an uncorrected threshold of p < 0.001. Bottom: Cerebellar activations of the same contrasts drawn on a SUIT flat map, together with a flatmap atlas and the functional network flatmap from Buckner et al. (2011; http://www.diedrichsenlab.org/imaging/AtlasViewer/viewer.html). The results show that the posterior cerebellum Crus was significantly activated in the contrast (A) Social Consistent Sequencing > Social Consistent Non-sequencing Control; (B) Social Inconsistent Sequencing > Social Consistent Non-sequencing Control; (C) Social Inconsistent Sequencing > Social Consistent Sequencing; (D) Social Inconsistent Sequencing > Nonsocial Consistent Sequencing. For all contrasts, whole brain activation, p < 0.05, FWE corrected

For the parallel contrast of Social Inconsistent Sequencing > Social Consistent Non-sequencing, the posterior cerebellar Crus 2 was again significantly activated, together with activations in the mPFC, dmPFC, TPJ, cerebellum VI and VII, middle temporal gyrus, inferior parietal lobule, postcentral gyrus, inferior temporal gyrus, middle frontal gyrus, and IFG including lateral PFC (Table 2, Fig. 2B). The brain activations for the reverse contrast were found in the cuneus, supramarginal gyrus, superior frontal gyrus, and middle frontal gyrus.

Next, to explore the effect of inconsistency on cerebellar activation, we directly compared Social Inconsistent Sequencing > Social Consistent Sequencing. We found significant brain activation in the posterior cerebellar Crus 1 and 2 (Table 2, Fig. 2C); no significant brain activation was found in the reverse contrast.

Finally, to demonstrate that cerebellar activation for processing action sequences was preferentially recruited in social contexts, we compared Social Consistent or Inconsistent Sequencing > Nonsocial Consistent Sequencing. Cerebellar Crus activation was found for the contrast Social Inconsistent Sequencing > Non-social Consistent Sequencing (Table 2, Fig. 2D), but nothing for Social Consistent Sequencing > Nonsocial Consistent Sequencing. No significant brain activation was found for the reverse contrasts.

As mentioned earlier, this study does not constitute a full factorial design (e.g., it lacks a Social Inconsistent Nonsequencing condition). To further support our hypothesis that the posterior cerebellar activation in the inconsistent sequencing condition was predominantly stronger compared to all other consistent conditions (Ma, Pu, Haihambo et al., 2021; Ma et al., 2021c), we conducted a spreading interaction contrast by comparing Social Inconsistent Sequencing > all other conditions (Social Consistent Sequencing + Social Consistent Nonsequencing + Nonsocial Consistent Sequencing; using contrast weights +3, −1, −1, −1, respectively; Table 2). Consistent with the hypothesis, this spreading interaction revealed significant activation in the bilateral posterior cerebellar Crus 2. Additional activations in the cerebrum were found in the dmPFC, mPFC, superior frontal gyrus, inferior frontal gyrus, middle temporal gyrus, and inferior parietal lobule. As hypothesized, the reverse contrast revealed no activation in the cerebellum, but it did reveal activations in the cerebrum, including the cuneus, superior orbital gyrus, middle frontal gyrus, and superior frontal gyrus.
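The logic of the spreading interaction contrast can be illustrated with a short sketch. The contrast weights (+3, −1, −1, −1) are those given in the text. The condition estimates below are hypothetical placeholder numbers in arbitrary units, chosen only for illustration; they are not data from this study, and the helper function is an assumption rather than the software routine actually used.

```python
# Contrast weights from the text, in the order the conditions are listed there.
weights = {
    "Social Inconsistent Sequencing": 3,
    "Social Consistent Sequencing": -1,
    "Social Consistent Nonsequencing": -1,
    "Nonsocial Consistent Sequencing": -1,
}

# The weights sum to zero, so equal activation in all conditions yields no effect.
assert sum(weights.values()) == 0


def contrast_value(estimates, contrast_weights):
    """Weighted sum of per-condition estimates for one voxel or region."""
    return sum(contrast_weights[name] * estimates[name] for name in contrast_weights)


# HYPOTHETICAL estimates (arbitrary units), not taken from the study.
hypothetical_estimates = {
    "Social Inconsistent Sequencing": 10,
    "Social Consistent Sequencing": 4,
    "Social Consistent Nonsequencing": 2,
    "Nonsocial Consistent Sequencing": 3,
}
spreading = contrast_value(hypothetical_estimates, weights)

# Identical activation everywhere gives a contrast value of exactly zero.
flat = contrast_value({name: 5 for name in weights}, weights)

print("spreading pattern:", spreading)
print("flat pattern:", flat)
```

With these placeholder numbers the contrast value is 3 × 10 − 4 − 2 − 3 = 21, whereas the flat pattern gives 0. A positive value thus indicates that the inconsistent condition exceeds the average of the three consistent conditions.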

Detecting stereotype-inconsistent actions

More importantly, to investigate the neural process in detecting stereotype-inconsistent behaviors more directly, we compared the first occurrence of an inconsistent sentence > the third consistent sentence in each set of the Social Inconsistent Sequencing condition. This analysis demonstrated stronger activation in the posterior cerebellar Crus 1 and 2, together with activation in the mentalizing regions including TPJ, precuneus and dmPFC, and conflict monitoring regions, including the lateral PFC and pmFC (Table 3; Fig. 3). Note, however, that cerebellar Crus activation in this inconsistency contrast is predominantly located in the executive control network (Fig. 3; Buckner et al., 2011). Additional activations were found in the middle occipital gyrus, superior frontal gyrus, insula, and middle frontal gyrus. No significant brain activation was found for the reverse contrast.

Table 3 Whole-brain analyses when detecting the stereotype-inconsistent actions
Fig. 3

Detecting stereotype-inconsistent behaviors when learning action sequences (i.e., 1st inconsistent sentence (of the second half) > 3rd consistent sentence (of the first half)). Top: Left panel shows sagittal view with activation in the posterior cerebellar Crus 1 indicated by a circle representing the ROI (sphere with radius 10 mm) at an uncorrected threshold of p < 0.001 (whole brain p < 0.05, FWE corrected). Right panel shows activation of the cerebellar Crus on a SUIT flatmap; together with a flatmap atlas and the functional network flatmap from Buckner et al. (2011; http://www.diedrichsenlab.org/imaging/AtlasViewer/viewer.html). Bottom: activation in cortical regions including TPJ, dmPFC, precuneus, pmFC, and lateral PFC denoted by a circle representing the ROIs (cluster-level p < 0.05, FWE using a small volume correction with a sphere with 15-mm radius)

As might be expected, for both Social and Nonsocial Consistent Sequencing conditions, no cerebellar Crus activation was found when contrasting the fourth consistent sentence > the third consistent sentence. Specifically, for this contrast in the Social Consistent Sequencing condition, activations were found in the middle occipital gyrus, middle frontal gyrus, medial cingulate cortex, basal ganglia (putamen), paracentral lobule, precentral gyrus, and superior temporal gyrus (Table 3). In the Nonsocial Consistent Sequencing condition, we observed activations in the middle occipital gyrus, cerebellum VII, insula, superior frontal gyrus, basal ganglia (caudate), and middle frontal gyrus (Table 3). No significant brain activation was found for the reverse contrasts.

For sequence retrieval, stereotype/characteristic judgment, and stereotypicality rating, we did not observe any significant brain activation for the Social Inconsistent Sequencing > Social Consistent Sequencing contrast (nor the reverse contrast).

Discussion

The goal of the current study was to investigate the neural substrates of stereotype-conflicting behaviors of group members. Participants were asked to memorize the temporal order of a series of social actions that were either stereotype-consistent or inconsistent with prior knowledge or expectations of a social group. This was designed to explore two fundamental functions of the cerebellum, related to its basic role of predicting upcoming behaviors. First, we expected that the posterior cerebellum supports learning the sequence of social actions implying group stereotypes, as this facilitates the prediction of subsequent behaviors. Second and more importantly, we hypothesized that the posterior cerebellum contributes to identifying stereotype-inconsistent behaviors, as this disrupts predictions of ongoing behaviors. Moreover, cortical activation was expected in mentalizing regions in all social sequencing conditions, and additionally in conflict monitoring areas during the stereotype-inconsistent condition (Buckner et al., 2011; Yeo et al., 2011).

Cerebellar Crus and memorizing the sequence of social actions

In line with our hypothesis, our results confirmed that the posterior cerebellum contributes to memorizing social action sequences as opposed to simply reading and recognizing social actions that implied a consistent group stereotype. This finding extends prior research demonstrating the general role of the posterior cerebellar Crus in action sequencing along a large variety of social mentalizing tasks, most often without any a priori inherent order, including memorizing the temporal order of trait-implying actions (Pu et al., 2020, 2021), predicting social action sequences based on personality traits (Haihambo et al., 2021), memorizing social trajectories involved in goal-directed navigation (Li et al., 2021), and even implicitly learning the order of others’ beliefs (Ma et al., 2021b). Together, these studies confirmed the “sequence hypothesis” (Leggio & Molinari, 2015) applied to social cognition (Van Overwalle et al., 2019b), which states that the cerebellum identifies and encodes sequences of actions in the social domain. In addition, the mentalizing network including the TPJ and mPFC also was activated in this contrast, which indicates that mentalizing processes were involved during learning social action sequences.

However, the current results failed to find a preferential recruitment of the cerebellum for social sequencing under consistent conditions (i.e., nonsignificant Social Consistent Sequencing > Nonsocial Consistent Sequencing contrast). This runs against the hypothesis and the prior finding of significant effects of social versus non-social conditions under consistent trait-implying conditions (Pu et al., 2020). The present finding is paralleled by a similar pattern of retrieval accuracy, which revealed that although accuracy was highest under social inconsistent sequencing, it did not differ between the consistent conditions mentioned above (Social Consistent Sequencing vs. Nonsocial Consistent Sequencing). This seems to suggest that the inconsistency manipulation became so salient that it received processing precedence over other information, so that other distinctions in the manipulation, such as human versus nonhuman agents, became much less relevant or salient. This is an interesting qualification to the social sequencing hypothesis of the social cerebellum originally put forward by Van Overwalle et al. (2019b), which seems to indicate that monitoring one’s alignment with others’ actions is more important than representing upcoming actions of self and other (for a similar view, see Deschrijver & Palmer, 2020). Both strategies are in line with the overall idea that the major function of the social cerebellum is to aid predicting and preparing social interaction.

Cerebellar Crus and detecting stereotype-inconsistent actions

As expected, the posterior cerebellar Crus was preferentially recruited when social sequencing involved actions that were stereotype-inconsistent rather than consistent (i.e., 1st occurrence of Social Inconsistent action > 3rd occurrence of Social Consistent action). Consistent with our hypothesis, cerebellar Crus activation was not found in the same contrast involving Social and Non-social Consistent conditions (e.g., 4th Consistent sentence > 3rd Consistent sentence). We did, however, observe a weak activation in the anterior cerebellar lobule VII in the Nonsocial condition. Given that this effect was weak and not robust across all Consistent conditions, and given that previous research associated lobule VII with sensorimotor (Stoodley & Schmahmann, 2009) and working memory tasks (Brissenden et al., 2021), it is difficult to pinpoint a specific interpretation of this finding. Taken together, the present results extend the “sequence hypothesis” (Leggio & Molinari, 2015), according to which the cerebellum detects violations between predicted and actual sequences, to a higher level of implied social meaning (i.e., stereotype inconsistencies). Thus, the function of the posterior cerebellum is to identify and potentially correct “prediction errors” not only in the sequences of actions, but also in the social implications of these actions, thus allowing adjustment of future interactions with members of a social group.

In addition, we observed stronger cerebellar Crus activation when we contrasted Social Inconsistent Sequencing > Nonsocial Consistent Sequencing, although no cerebellar activation was observed for the parallel Social Consistent Sequencing > Nonsocial Consistent Sequencing contrast. This seems to highlight the sensitivity of the posterior cerebellum to inconsistencies in the present study, rather than any domain-specific preference, unlike prior research where the posterior cerebellum was preferentially recruited in the social domain when inconsistencies were absent (Pu et al., 2020). This is likely due to a somewhat diminished sensitivity to, or processing of, social information in general, to facilitate processing of inconsistencies in the social information. Therefore, taking these two studies together, our results provide evidence on the critical function of the posterior cerebellum in processing social action sequences as well as in detecting violations in the social implications of these actions.

One might argue that the increased cerebellar activation in processing (Social) Inconsistent Sequencing against (Social and Nonsocial) Consistent Sequencing was potentially due to memory difficulty (e.g., increased cognitive load) in the study phase. However, this interpretation is unlikely. The behavioral data show that there was no significant difference in accuracy and response time of sequence retrieval between the Social Inconsistent and Social Consistent Sequencing conditions. In fact, we only found significantly higher retrieval accuracy in the Social Inconsistent Sequencing than in the Nonsocial Consistent Sequencing condition, indicating lower memory demands and cognitive load in the Social Inconsistent condition. This is probably because inconsistent information elicits increased top-down attention to task-relevant sequence information, thus improving memory performance when retrieving the sequence of sentences (Egner & Hirsch, 2005; Krebs et al., 2015). These results are incompatible with the possibility that the stronger cerebellar activation was due to the difficulty of memorization. Note that increased top-down attention to task-relevant sequence information about sentences may have caused less attention to task-irrelevant information, thus decreasing the accuracy of the factual manipulation check question.

Previous studies have demonstrated that the cerebellum is involved in the prediction and violation detection of language and social cognition (Sokolov et al., 2017), including semantic processing (Moberget et al., 2014), linguistic prediction (Lesage et al., 2017), and social norms (Berthoz et al., 2002). Nevertheless, the exact function of the cerebellum in processing violations in high-order social functioning remained largely unexplored. In a novel study, Pu et al. (2021) recently observed that, while memorizing the temporal order of trait-implying actions of individuals, the cerebellar Crus was more strongly activated during trait-inconsistent than trait-consistent actions. Our study extends this individual social inconsistency effect (Pu et al., 2021) for the first time to stereotype-inconsistent behaviors of group members.

This finding is theoretically important because it suggests a greater role of the posterior cerebellum in predicting and monitoring conflicts at a higher and more complex social level. This role is functionally very relevant, because the involvement of the cerebellum may help to adjust the order of subsequent actions during social interactions to avoid future inconsistencies (e.g., we let other people enter a room first, to appear more polite and considerate). Taken together, our results are consistent with the hypothesized role of the cerebellum in predicting sequential events to anticipate others’ social behaviors, so that we can automatize current and future social interactions and instantly detect violations (e.g., error signals) in these action sequences (Van Overwalle et al., 2019b), which signal potential ways for improvement. Importantly, our findings provide novel evidence for a greater role in detecting inconsistencies in high-level social understanding, providing even richer signals for adjusting one’s social repertoire.

Note that the sequences in the present study did not involve a logical order, as they do when they are an inherent part of an event or story as in previous research (Heleven et al., 2019; Van Overwalle et al., 2019a). As mentioned, even in the absence of a logical order, posterior cerebellar activation was observed in other studies when memorizing the temporal order of actions that imply a personality trait (Pu et al., 2020) or a goal (Li et al., 2021). This strongly suggests that the posterior cerebellum is sensitive to all sorts of sequences that might help to predict upcoming behaviors, even if they are quite idiosyncratic and limited to specific situations, events, and persons, not only when logically or inherently plausible across a large set of contexts. This coincides with the common observation that we often receive a great variety of behavioral information about different people and the chronological order of their behaviors, which is sometimes idiosyncratic but is nevertheless crucial for understanding others’ motivations and predicting their upcoming behaviors.

With respect to functional anatomy, cerebellar Crus activation while detecting stereotype-inconsistent behaviors was mainly located in the executive control network (Buckner et al., 2011). This is consistent with previous research that demonstrated that the identification and resolution of conflicting information requires the involvement of executive control processes (Yeo et al., 2011) located in the conflict monitoring network, encompassing the lateral PFC and pmFC in the cortex (Cohen et al., 2000). In the social domain, these areas also were activated in updating trait-inconsistent behaviors (Ma et al., 2012; Mende-Siedlecki, Cai, & Todorov, 2013) and identifying stereotype-conflicting facial emotional expressions (e.g., white angry faces instead of stereotypical black angry faces; Hehman et al., 2014). Our study is in line with these findings in that we also observed activation of the lateral PFC and pmFC during stereotype-inconsistent behaviors. Moreover, the present findings extend this stereotype-inconsistency effect to the cerebellar Crus, analogous to the trait-inconsistency effect observed in the executive cerebellum by Pu et al. (2021).

In addition, the mentalizing network including the TPJ, dmPFC and precuneus showed stronger sensitivity to stereotype-inconsistent behaviors in comparison with consistent behaviors. This indicates that these regions are responsible not only for social mentalizing (Van Overwalle, 2009), but are also involved during inconsistency resolution of social information. This is consistent with previous studies indicating that the dmPFC, TPJ, and precuneus play an important role in evaluating inconsistencies in person perception, such as inconsistent trait behaviors within a person (Ma et al., 2012) and social categories (e.g., a Democrat wants a small government; Cloutier et al., 2011). This is probably because these regions are involved in making sense of the social violation, which requires heightened mentalizing processing to resolve the inconsistency and to form a coherent perception of persons (Cloutier et al., 2011; Ma et al., 2012).

That cerebellar and cortical regions collaborate during mentalizing and conflict monitoring is supported by growing evidence on the functional connectivity between the cerebellum and the cerebral cortex (Van Overwalle & Mariën, 2016). As identified by earlier research using Dynamic Causal Modeling (Van Overwalle et al., 2020; Van Overwalle et al., 2019c), it is likely that there is substantial connectivity between the mentalizing and executive regions in the cerebellum and the cortex via closed loops. Testing this connectivity in the present paradigm is an avenue for future research.

This study has a number of limitations. First, the Social Consistent Nonsequencing control condition was always presented first, whereas the Sequencing conditions were presented afterwards. We did so to avoid spontaneous memorizing of the action sequences in the control condition. However, this might have an adverse impact on the validity of contrasts between the Nonsequencing and Sequencing conditions. Hence, it is possible that non-specific time-related factors (e.g., novelty) influenced these contrasts. Nonetheless, this is not very likely given that an earlier study on social sequencing (Li et al., 2021) found similar posterior cerebellar activation when contrasting Sequencing versus Nonsequencing conditions when the Sequencing task was presented before the Nonsequencing task during social navigation. To eliminate this bias, one possible approach is to counterbalance the Nonsequencing and Sequencing conditions across participants. This should be considered in future studies.

Second, one might argue that an fMRI analysis specifying 10-s duration for memorizing the sentences during the Study phase in the Sequencing conditions may have allowed for subvocal rehearsal or mind-wandering, which might have led to posterior cerebellar activations for this reason alone. To exclude this possibility, we ran an alternative model with duration = 0 s during the Study phase. This analysis yielded approximately the same results as before with duration = 10 s (Supplementary Table S2), indicating that the duration in our analysis is of no critical importance. Therefore, subvocal rehearsal or mind-wandering is an unlikely explanation for our findings.

Third, we noticed that the accuracy on the factual check question was significantly lower in the Social Inconsistent Sequencing than in the Social/Nonsocial Consistent Sequencing conditions. As noted before, we speculate that inconsistent information that conflicts with prior expectations directs attention to task-relevant information and hence less to task-irrelevant details. As a result, participants in this condition were worse at recognizing whether or not they had seen some details in a sentence before. However, note that the accuracy of the factual check in general was quite high in the inconsistent condition (85%) as well as across all sequencing conditions (90%), which indicates that participants understood the sentences well when memorizing the sequence of sentences.

Conclusions

Our findings show that the posterior cerebellum contributes to memorizing social action sequences that require mentalizing about stereotypes of social groups, and so confirm the role of the cerebellum in prediction and error-based learning in the nonmotor social domain (Leggio & Molinari, 2015). Crucially, this study sheds new light on the cerebellar function to identify stereotype-inconsistent behaviors that conflict with prior expectations and knowledge about social groups, and so raises the error-correcting function of the cerebellum to a higher, social level, beyond mere action sequences. This demonstrates the importance of the posterior cerebellum for adapting upcoming social interactions when conflicting information arises.