Introduction

Autism spectrum disorder (ASD) is a lifelong, complex neurodevelopmental disorder that affects individuals’ social interactions, verbal and non-verbal communication, and behaviors (i.e., manifestation of mannerisms, restricted interests, and repetitive behaviors) (Diagnostic and Statistical Manual of Mental Disorders [DSM]-5-TR, American Psychiatric Association, 2022). Apart from these social deficits, impaired executive function (EF) is another salient characteristic of several ASD samples across age (Demetriou et al., 2019). EF, conceptualized as an umbrella term, refers to a set of higher-order, goal-directed processes that regulate cognitive control of one’s behavior, thoughts, and emotions. Aspects of EF include, although the list is not exhaustive, working memory (the ability to hold in mind and manipulate relevant information), inhibition (the ability to willfully suppress automatic responses or ignore distractions and attend to relevant tasks), planning (the ability to set goals and organize the steps required to complete a cognitive task), cognitive flexibility (the ability to switch flexibly between two tasks), and self-regulation/control (see Elliott, 2003 and Lezak, 2012 for a more detailed discussion of the various areas of EF). EF abilities are suggested to depend neurally on circuits mostly of the prefrontal cortex (e.g., Best & Miller, 2010; Duncan, 2013) and other areas (e.g., posterior, cortical, subcortical, and thalamic areas) (Monchi et al., 2006). EF emerges in the early years and changes dramatically throughout the preschool period. Existing data suggest that EF continues to develop well into childhood and adolescence and reaches maturation in early adulthood, in parallel with the protracted development of the underlying brain area (prefrontal cortex) (Anderson, 1998). The developmental pathway of EF in ASD is less clear, owing to limited data and mixed results to date.

According to the executive dysfunction theory (Russell, 1997; Russo et al., 2007), early disruptions in EF may account for the manifestation of ASD symptomatology, and such disruptions may have indirect effects on other crucial sociocognitive skills such as theory of mind (ToM). ToM, which reflects one’s ability to interpret others’ mental/emotional states, goals, and desires, has also very often been found to be impaired in ASD (e.g., Brewer et al., 2017). Evidence from typical development has consistently shown that EF is strongly associated with ToM across development (e.g., Austin et al., 2014; Carlson & Moses, 2001; Devine et al., 2016; Phillips et al., 2011), but this association is less studied in ASD. This study thus investigated the cross-sectional developmental trajectories of EF, as well as their relation to ToM, in an attempt to shed more light on EF development in children and adolescents on the spectrum.

Traditionally, despite its diversity, EF has been mainly examined under a cognitive lens in typical development and ASD. However, behavioral and neural data from lesion and neuroimaging studies (e.g., Gilbert & Burgess, 2008; Wagner et al., 2001) over the last two decades suggest that different measures of EF activate different prefrontal cortex regions and that EF abilities vary, as a function of the emotional/motivational significance of the task, from “hot EF” to “cool EF” (Zelazo, 2020; Zelazo & Carlson, 2012). Cool EF skills are elicited in abstract, decontextualized, and relatively emotionally neutral contexts, with lateral parts of the prefrontal cortex (dorsolateral and ventrolateral) suggested to regulate these aspects (e.g., Stuss, 2011). Hot EFs, on the other hand, refer to abilities needed in motivationally and emotionally significant situations and depend on orbitofrontal cortex and ventromedial prefrontal cortex areas (e.g., Stuss, 2011). These brain areas have also been found to engage with the limbic system (amygdala), which underlies emotional processing (Phan et al., 2004). Cool EF processes are distinct from hot ones, but the two are believed to work together as part of a more general function, according to each task’s demands (Zelazo, 2020). Cool EFs encompass processes such as working memory, cognitive flexibility, inhibition, and planning, while hot EF processes include abilities such as affective decision-making (i.e., the mental process taking place when one is to choose among possible options under risk) and delay discounting (i.e., individuals’ tendency to choose more immediate but smaller rewards over larger delayed ones, viewed as an impulsivity measure; Logue, 1988; Monterosso & Ainslie, 1999). Hot EF tasks thus involve meaningful rewards/losses for the individual. We employed this cool and hot EF model because theory suggests that it may further clarify the role that EF deficits play in such neurodevelopmental disorders (Zelazo & Carlson, 2012).

The developmental EF pathway in ASD has been mainly examined with cool EF tasks; thus, very little is known regarding hot EF development and whether it follows a similar or a different trajectory to cool EF within the spectrum. Research within the hot EF domain is underway, yet the available data remain limited, despite their notable clinical significance for ASD. Neurobiological studies have suggested that brain regions responsible for hot EF processes (e.g., orbitofrontal and ventromedial prefrontal cortex) may exhibit atypicalities in individuals with ASD (see the MRI review of Li et al., 2017). Differences in these brain regions could contribute to both hot EF deficits and social challenges in ASD. More specifically, hot EF processes may be closely associated with social interactions and social cognition. For example, hot EF skills, such as affective decision-making and delay of gratification/discounting, involve processes related to motivation, reward processing, and decision-making based on emotional and social cues. Moreover, ToM relies on the ability to recognize and interpret emotional/social cues. Impairments in hot EF could therefore influence an individual’s engagement in social activities and interactions, hindering individuals with ASD in developing and navigating social relationships and in understanding the perspectives of others (social and emotional cues). A deeper understanding of the interplay between hot EF, social cognition, and ASD could have implications for interventions and support strategies targeting social and emotional difficulties in individuals with ASD.

Development of EF

In typical development, EF emerges in the early years and develops through middle childhood and adolescence (Best & Miller, 2010), with maturation reached in early adulthood (Korkman et al., 2001) and a decline in older adulthood, linked to prefrontal cortex changes (Gogtay et al., 2004). Differentiated developmental profiles are evident during middle childhood and adolescence (e.g., Best & Miller, 2010; De Luca et al., 2003), with EF components reaching peak maturity at different times. For instance, “cool” inhibition, working memory, and planning develop through middle/later childhood into adolescence (Bedard et al., 2002; Bishop et al., 2001; Gathercole et al., 2004; Williams et al., 1999), while cognitive flexibility matures around 12 years (Cragg & Nation, 2008; Crone et al., 2006). Studies of hot EF in typical development have produced mixed results regarding the differentiation of hot from cool EF trajectories. For instance, Hooper et al. (2004) reported similar trends (age-related gains for hot and cool EF) in middle childhood and adolescence (9–17 years), while Prencipe et al. (2011) extended this and revealed later maturation for hot EF. Finally, Kouklari et al. (2018) demonstrated improvements in cool EF aspects from childhood to adolescence, but no significant changes in hot EF. It is possible that hot and cool EF develop independently in the transition from childhood to adolescence, which warrants further study (Ferguson et al., 2021).

The neurodevelopmental diversity within ASD suggests that the developmental paths of hot and cool EF may diverge in ASD. The intricate neurobiological foundation of EF in ASD involves atypical connectivity, particularly in brain regions like the prefrontal cortex (e.g., Leisman et al., 2023). Altered neural circuitry may influence EF development timing and pace, leading to distinct trajectories from typically developing individuals. Investigations into the cool EF development in ASD during childhood and adolescence have revealed either a delayed developmental path (e.g., Chen et al., 2016; Happé et al., 2006) or persistent deficits compared to typically developing peers, despite the age-related progress (e.g., Andersen, Skogli, Hovik, Egeland, & Øie, 2015; Fossum et al., 2021; Kouklari et al., 2018, 2019; Luna et al., 2007; Ozonoff & McEvoy, 1994). Although cool EF impairments seem to persist throughout the course of ASD development, certain preservation tendencies may exist from childhood to adolescence. Some other studies though have indicated either no noticeable improvement or even a decline in performance with age for specific cool EF aspects, such as working memory, in children and adolescents with ASD (e.g., Andersen, Skogli, Hovik, Geurts, et al., 2015; Kouklari et al., 2018). These findings reflect the complex interplay between different cognitive and neural processes in ASD that may account for differentiated EF trajectories.

Hot EF regulates emotional and motivational/reward factors of cognitive processes. Considering that individuals with ASD frequently encounter difficulties in these areas (manifesting, for instance, as challenges with social/emotional interactions), “hot” abilities could be affected by various confounding elements of emotionally/motivationally charged situations (e.g., anxiety). Consequently, this interplay could contribute to differentiated hot EF developmental trajectories in ASD. Research on hot EF in ASD is limited, but it has indicated a developmental trajectory distinct from that of cool EF. To date, only two studies (Kouklari et al., 2018; Kouklari et al., 2019) have explored the developmental patterns of both hot and cool EF in children and adolescents with ASD. In a cross-sectional investigation, Kouklari et al. (2018) found that hot EF (affective decision-making and delay discounting) did not show age-related differences from childhood to adolescence in ASD, unlike most cool EF trajectories, which exhibited changes during adolescence. In a longitudinal study, Kouklari et al. (2019) observed that only one of the two hot EF components (affective decision-making but not delay discounting), along with all cool EFs, improved after a 12-month follow-up period in children (aged 7–11 years) with ASD. These limited findings underscore the variability in developmental trajectories for cool and hot EF in ASD and the need for further in-depth investigation of the shape, rate, and direction of these trajectories.

EF and ToM

The strong associations between EF and ToM have been extensively documented in typical development, especially in early childhood, for which the emergence account of ToM states that EF may serve as a platform for ToM development (Moses, 2001). The extension of this relation to middle childhood, let alone adolescence, has been less examined. However, as an increasing number of more recent studies in typical development have reported a significant EF-ToM correlation beyond the preschool period (e.g., Austin et al., 2014; Bock et al., 2015; Im-Bolter et al., 2016; Kouklari et al., 2017), it is likely that EF and ToM are associated across development.

In several ASD samples, ToM deficits have been found either in false belief measures (e.g., Begeer et al., 2012; Sobel et al., 2005) or in tasks such as the Reading the Mind in the Eyes test, which tap mental state/emotion recognition (e.g., Brent et al., 2004; Holt et al., 2014; Kouklari et al., 2017). It is plausible that impairments in EF contribute to the difficulties observed in ToM in ASD (e.g., Russell, 1997). In preschool, evidence of such a relation between EF and ToM in ASD has been previously documented (e.g., Kimhi et al., 2014; Pellicano, 2007, 2010). Beyond the preschool stage, however, little is known about the nature of this relationship in ASD, including whether it attenuates or dissipates over time or remains a noteworthy correlation. As children with ASD transition to middle childhood and adolescence, the cognitive demands and social complexities of their environments increase. They are required to engage with more complex social contexts, cues, and interactions, and given the inherent cognitive demands of such interactions, one could suspect that EFs remain indispensable for ToM in older children and adolescents with ASD. Indeed, the few previous studies of the EF-ToM relation beyond preschool in ASD have provided some evidence for this notion (e.g., Joseph & Tager-Flusberg, 2004; Kouklari et al., 2019; Ozonoff et al., 1991). Results indicated that different ToM tasks (e.g., location change and unexpected content false belief; mental state and emotion recognition) were significantly related to cool EF in school-aged children and adolescents with ASD, not only cross-sectionally but even after a 12-month follow-up (e.g., Kouklari et al., 2019).

However, only three studies to date (Kouklari et al., 2017, 2019; Yu et al., 2021) have also attempted to examine the contribution of hot EF to ToM aspects in ASD, despite hot EF being regulated by brain areas strongly associated with emotional processing. Results from these studies demonstrated that not only cool but also hot EFs share a significant relation with ToM. However, their findings need to be interpreted cautiously and extended considerably, owing to the limited measures used. More specifically, Yu et al. (2021) used only one type of task to measure cool and hot EF respectively, while Kouklari et al. (2017, 2019) used only two ToM tasks (first-order false belief and mental state/emotion recognition tests) and did not assess the “cool” cognitive flexibility aspect of EF, which is considered to be central in ASD (a neurocognitive dimension related to the core ASD characteristics; Cheng et al., 2021). Thus, the present study sought to expand on the hot EF-ToM relation in children and adolescents with ASD by employing an extensive battery to capture both domains.

Current Objectives

The present study had two objectives. The first goal was to examine the cross-sectional developmental trajectories of cool and hot EF relative to age in children and adolescents with ASD. As already mentioned, previous studies (e.g., Andersen, Skogli, Hovik, Geurts, et al., 2015; Happé et al., 2006; Kouklari et al., 2019) have demonstrated that the EF developmental framework in ASD is not clear due to mixed results. Moreover, to date, only two studies (Kouklari et al., 2018, 2019) have investigated the developmental trajectory of hot EF (along with cool EF) in ASD across both middle childhood and adolescence, and the variable developmental data they reported highlight the need for further examination with more extensive batteries. Thus, the assessment of both cool and hot EF skills attempts to shed more light on the developmental pathway of EF in childhood and adolescence in ASD and to address the age gap in the literature. Based on limited previous data (Kouklari et al., 2018, 2019) and the theory-based expectations discussed above (i.e., the “Development of EF” section), we hypothesized that hot and cool EF cross-sectional developmental trajectories would differ in ASD (i.e., would not present similar developmental patterns).

The second goal of the study was to investigate the relation between ToM and hot and cool EF in middle childhood and adolescence in ASD. Middle childhood and adolescence are crucial developmental periods of cognitive change, in which children face increasing environmental demands and need to understand their sense of self and others (Siegel, 2013). For that reason, it is important to examine whether the relation between EF and ToM extends beyond early childhood and well into adolescence, as this can serve as solid ground for further longitudinal studies aimed at identifying the neural developmental EF-ToM mechanisms. Most notably, the hot and cool EF distinction employed in the present study addresses the limited knowledge in the literature regarding the hot EF-ToM association in ASD. Theoretically, situations that involve ToM abilities may require the control of behavior or thought under emotionally significant conditions (hot EF) (Zelazo & Müller, 2002). Therefore, based on theory and limited previous evidence (Kouklari et al., 2017, 2019; Yu et al., 2021), we hypothesized that hot EF would relate to ToM and examined whether ToM could be predicted by hot EF over and above cool EF in middle childhood and adolescence in ASD.

Methods

Participants

Eighty-two (82) children and adolescents with an official diagnosis of ASD, aged between 7 and 16 years (M = 11.02, SD = 2.71), participated in the research. Inspection of participants’ records showed that they were all high functioning, held an official ASD diagnosis made by clinicians using DSM-5 (Diagnostic and Statistical Manual of Mental Disorders 5th Edition) criteria (American Psychiatric Association [APA], 2013), and met criteria for “broad ASD” on the Autism Diagnostic Observation Schedule (Lord et al., 2000). We corroborated the clinical diagnoses of the participating children and adolescents with the Greek versions of the Autism Spectrum Quotient (child version: Auyeung et al., 2008; adolescent version: Baron-Cohen et al., 2006), which quantify ASD traits. Any participant who presented comorbid disorders (i.e., attention-deficit/hyperactivity disorder, psychiatric illnesses) or a Full-Scale Intelligence Quotient (FSIQ) below 70 was excluded. Ethical approval for the study was obtained from the hospital’s ethics board, and all participants’ parents/carers provided the researchers with written informed consent. Table 1 presents participants’ characteristics.

Table 1 Participants’ characteristics (n = 82)

Measures

Cool EF Tasks

Stroop Color and Word Test (Stroop, 1935)

The Stroop test measures participants’ ability to inhibit the cognitive interference that occurs when the processing of one stimulus feature interferes with the simultaneous processing of a second feature (Scarpina & Tagini, 2017). Color names are printed in ink of a different color, and participants are asked to name the ink color of each word instead of reading the word itself, which requires them to produce a response that conflicts with the automatic one. The number of errors made when naming the ink color was recorded. The Stroop test has been widely used as a color-word interference measure and has been found to present high test-retest reliability (e.g., Strauss et al., 2005).

Berg’s Card-Sorting Task-64 (from PEBL Platform; Mueller & Piper, 2014)

This test is a computerized version of the Wisconsin Card Sorting Test in which participants are asked to sort cards of multiple features into piles (i.e., categorize them) according to an unknown and changing rule. There are four different piles to sort each card and the only feedback given is whether the sorting is correct or incorrect. Participants can sort the cards according to the color of their symbols, the shape of the symbols, or the number of shapes on each card. The sorting rule changes every 10 cards. This sorting test assesses participants’ flexibility to adapt to changing rules. The number of correct responses was recorded. Reliability estimates for computerized versions of this test have been found to be above .90 (e.g., Steinke et al., 2021).

Tower of London (Shallice, 1982)

The Tower of London test was used to assess participants’ planning skills. This task includes two identical wooden boards, which the researcher places side by side in front of the participant. Each board has three wooden beams holding three balls, one green, one red, and one blue. Participants need to reproduce a series of patterns with the wooden balls using only a certain number of moves, according to the researcher’s instructions. After the instructions are presented, participants are given a 3-move problem as practice. Participants then proceed with the 12 planning problems: two 2-move problems, two 3-move problems, four 4-move problems, and four 5-move problems. In order to successfully complete each planning problem, participants have to adhere to two rules. Firstly, each problem must be completed with a specific number of moves. Secondly, participants can remove only one ball from each beam at a time. In terms of scoring, the number of problems completed successfully and without violating the rules was recorded. Specifically, we gave 1 point if participants completed the problem successfully and 0 points if they failed to complete it. This test has been the most commonly used measure of planning across the lifespan (Chang et al., 2011) and presents good test-retest reliability (Köstering et al., 2015).

Forward and Backward Digit Span Subtests (Wechsler Intelligence Scale for Children-Fifth Edition, 2014; WISC-V)

In order to assess participants’ verbal working memory, we used the forward and backward digit span subtests from the WISC-V. In the forward subtest, participants have to recall and repeat sequences of random numbers of increasing length in the exact same order as presented by the examiner (e.g., “Please listen carefully and then repeat the following sequence of numbers back to me in the exact same order: 5689”). The researcher reads each number sequence at a rate of one number per second. In the backward subtest, the sequence of numbers must be repeated in reverse order (e.g., “4598” will be repeated as “8954”). When participants repeat the 2 trials of a block successfully, the examiner proceeds with the next block. In terms of scoring, we gave participants 1 point for each correct trial. The points awarded on the two subtests were summed to form a composite working memory score. Digit span has been extensively researched and is considered to be a highly reliable and valid measure of working memory (e.g., Siegel et al., 1996).

Hot EF Tasks

Iowa Gambling Task (IGT; Bechara et al., 1994)

Affective decision-making was tapped by a computerized (modified) version of the Iowa Gambling Task developed by Mueller and Piper (2014). In the IGT, participants are presented with four different card categories (i.e., A, B, C, and D). They are told to use the mouse to select a card of their choice each time, from any of the four categories. Some cards are advantageous and give participants money, while others are disadvantageous and take money away. One hundred (100) card choices are made throughout the game. Two of the card categories, namely A and B, are equivalent in total net loss, while the other two (C and D) are equivalent in total net winnings. For each selected card, wins and losses are set by default in such a way that for every block of 20 cards from categories A or B, there is a total potential win of €1000. This potential win can, however, be offset by potential losses of up to €1250. In category B, losses are less frequent but larger than in category A, in which losses are more frequent but smaller. For categories C and D, wins for each block can go up to a total of €500, while the potential losses are €250. In category D, the losses are less frequent and of a higher magnitude than those in category C. Thus, categories A and B are equally “disadvantageous” in the long run, while categories C and D are equally “advantageous.” Participants’ affective decision-making is assessed on the basis of whether they make predominantly “advantageous” or “disadvantageous” choices. In terms of scoring, we adopted the technique used in previous studies (e.g., Verdejo-Garcia et al., 2006), in which the number of disadvantageous choices (categories A and B) is subtracted from the number of advantageous choices (categories C and D) and the difference is divided by the overall number of trials (100). The IGT has been documented to have good test-retest correlations, ranging between rs = .64 and .82 (e.g., Sullivan-Toole et al., 2022).
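The scoring rule just described can be illustrated with a minimal sketch. The function name `igt_net_score` and the example choice sequence are hypothetical and serve only as an illustration; they are not the PEBL implementation or the scoring script used in the study.

```python
from collections import Counter

ADVANTAGEOUS = {"C", "D"}      # categories with a net gain in the long run
DISADVANTAGEOUS = {"A", "B"}   # categories with a net loss in the long run


def igt_net_score(choices):
    """Return (advantageous - disadvantageous choices) / total number of choices."""
    if not choices:
        raise ValueError("at least one card choice is required")
    counts = Counter(choices)
    unknown = set(counts) - ADVANTAGEOUS - DISADVANTAGEOUS
    if unknown:
        raise ValueError(f"unknown category labels: {sorted(unknown)}")
    good = sum(counts[label] for label in ADVANTAGEOUS)
    bad = sum(counts[label] for label in DISADVANTAGEOUS)
    return (good - bad) / len(choices)


# Hypothetical participant: 60 advantageous and 40 disadvantageous picks in 100 trials.
example_choices = ["C"] * 35 + ["D"] * 25 + ["A"] * 15 + ["B"] * 25
print(igt_net_score(example_choices))  # 0.2
```

Under this rule, scores range from -1 (only disadvantageous choices) to 1 (only advantageous choices), with positive values indicating predominantly advantageous decision-making.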

Delay Discounting Task (Richards et al., 1999)

In this task, participants are asked to choose hypothetically between small amounts of money available immediately and €10 available after a delay (e.g., sample question: would you prefer to have (a) €2 now or (b) €10 in 30 days?). An algorithm adjusts the amount of immediate money until the participant is indifferent between the two offered options (random adjusting procedure; for a more detailed description of the procedure, see Richards et al., 1999). The indifference point for each participant (i.e., the small, immediate amount of money estimated by the algorithm to be equivalent to €10) represents the subjective value of the delayed large reward relative to an immediate money amount (Richards et al., 1999). In the present task, delay discounting is determined by five delays (0, 10, 30, 180, and 365 days later). With regard to scoring, the procedure described in Myerson et al. (2001) was followed. More specifically, indifference points were estimated for each participant and were then plotted against delay (time). We normalized indifference points and delays by converting indifference points into proportions of the maximum delayed reward (€10) and delays into proportions of the maximum delay (365 days). The normalized delays and indifference points were used as the x and y axes, respectively, to plot delay discounting. Vertical lines drawn from each data point to the x axis created four distinct trapezoids. The area of each trapezoid was calculated with the formula (x2 − x1) ∙ [(y1 + y2)/2], where x1 and x2 are successive normalized delays and y1 and y2 are the corresponding normalized indifference points. The area under the discounting curve (AUC) was calculated by summing the areas of the resulting trapezoids. The AUC ranges from 0 (maximum discounting) to 1 (no discounting). Higher scores (larger AUCs) here reflect less discounting by delay (suggesting less impulsivity). Delay discounting has been documented to have good test-retest reliability (rs = .67 and .76; e.g., Anokhin et al., 2015).
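As an illustration of the AUC computation described above, the following minimal sketch normalizes delays and indifference points and sums the trapezoid areas. The function name `discounting_auc` and the example indifference points are hypothetical; only the five delays, the €10 delayed reward, and the trapezoid formula are taken from the task description.

```python
def discounting_auc(delays, indifference_points, max_delay=365.0, max_reward=10.0):
    """Area under the normalized discounting curve (trapezoid rule)."""
    if len(delays) != len(indifference_points) or len(delays) < 2:
        raise ValueError("need at least two (delay, indifference point) pairs")
    pairs = sorted(zip(delays, indifference_points))
    xs = [delay / max_delay for delay, _ in pairs]    # delays as proportions of 365 days
    ys = [value / max_reward for _, value in pairs]   # indifference points as proportions of 10 euros
    area = 0.0
    for i in range(len(xs) - 1):
        # (x2 - x1) * [(y1 + y2) / 2] for each pair of successive data points
        area += (xs[i + 1] - xs[i]) * (ys[i] + ys[i + 1]) / 2
    return area


delays_in_days = [0, 10, 30, 180, 365]
# Hypothetical indifference points (in euros) for one participant.
example_points = [10.0, 8.0, 6.0, 4.0, 2.0]
print(round(discounting_auc(delays_in_days, example_points), 3))
```

With five delays the loop sums four trapezoids, matching the description above. A participant whose indifference point stays at €10 at every delay obtains an AUC of 1 (no discounting).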

ToM

ToM Scenarios (Sullivan et al., 1994)

Two ToM stories measuring second-order false belief knowledge (a false belief about someone else’s belief) and second-order ignorance (not knowing/understanding that others do not know) were used in the present study. Both stories were quoted verbatim from Sullivan et al. (1994). The authors mention in their paper (Sullivan et al., 1994) that the first story (ice-cream van story) is based on Perner and Wimmer’s (1985) original story, in which two children are independently informed about the unexpected transfer of an object (ice-cream van) to a different location. Their second story (birthday puppy), also used in the present study, is a scenario in which a mother deceives her son about his birthday present (puppy). In terms of scoring, we created four categorical variables, one for each of the ToM questions of interest (2 second-order false belief and 2 second-order ignorance questions), in which 0 was awarded if participants failed to answer correctly and 1 point if they answered successfully. The reliability of second-order false belief tasks, akin to those originally formulated by Perner and Wimmer (1985), has been evaluated and deemed acceptable (e.g., Hughes et al., 2000).

Reading the Mind in the Eyes (Baron-Cohen et al., 2001)

This test is a widely used measure of the mental state/emotion recognition ability. It presents 28 images of different people’s eyes with four different choices around each image. Participants are told to look at each image carefully and then make a choice of what they think that person may be feeling/thinking. Successful performance requires participants to correctly identify the emotional or mental state of each person. In terms of scoring, one point was awarded for each correct answer. Scores range from 0 to 28. Reading the Mind in the Eyes has been used in hundreds of studies to date and has been found to have good test-retest reliability (e.g., Fernández-Abascal et al., 2013; Vellante et al., 2013).

Autism Quotient (Children (Auyeung et al., 2008) and Adolescent Version (Baron-Cohen et al., 2006))

Both 50-item parent-report questionnaires were used to measure the expression of ASD traits in our sample. All items assess behaviors related to ASD, such as social skills, attention to detail, communication, and imagination. Items in both versions are rated on a 4-point Likert scale (0 = definitely agree, 1 = slightly agree, 2 = slightly disagree, 3 = definitely disagree). Higher scores here reflect more “autistic-like” behavior (cut-off score 32).

Results

Statistical Analysis

SPSS-28 was used for all statistical analyses. Two outliers were removed from each of the Stroop and IGT variables and one from the digit span variable. Visual inspection of scatterplots of all cool EF tasks (see Figs. 1, 2, and 3) revealed that their relationship to age was linear. Linear regression analysis was therefore run for each cool EF task, in order to construct a single developmental trajectory for each task relative to age. For the hot EF tasks, scatterplots revealed non-linear (curve-like) relationships between age and both delay discounting and affective decision-making (Figs. 4 and 5). Curve estimation analysis was conducted in order to identify the curve that best represented the pattern observed in the hot EF data. The best fitting model for each of the two hot EF tasks was selected by comparing established goodness-of-fit indices across linear, quadratic, and cubic models. Firstly, we used Akaike’s Information Criterion (AIC), with lower (more negative) values corresponding to better fitting models. Secondly, the F test was used to contrast the simpler model against the more complex model each time. If the p value was greater than .05, the simpler model was selected as the best fitting model; if the p value was less than .05, the more complex model was selected. See Table 2 (Appendix) for model comparison results. In addition, hierarchical regression (for the continuous dependent Eyes Test variable) and hierarchical logistic regression (for the categorical dependent false belief and ignorance variables) analyses were run in order to assess the extent to which EF scores would relate to ToM skills above and beyond age. Age was entered in block 1, cool EF variables (Stroop, sorting test, digit span, and ToL scores) were entered in block 2, and hot EF variables (delay discounting and IGT scores) were entered in block 3 (to assess the extent to which hot EF would show a unique contribution to ToM above and beyond age and cool EF). No violations of multivariate assumptions for these variables were found. All tests were two-tailed, and statistical significance was set at p < .05.
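The curve estimation logic described above (comparing linear, quadratic, and cubic models by AIC and by nested F tests) can be sketched as follows. This is a minimal illustration in plain Python, not the SPSS procedure used in the study: the data are hypothetical, the helper names are our own, and the AIC is computed with one common least-squares form, n·ln(RSS/n) + 2k, which may differ from the SPSS output by a constant. The sketch stops at the F statistic; obtaining the p value requires the F distribution from a statistics package.

```python
import math


def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def polynomial_rss(ages, scores, degree):
    """Residual sum of squares of a least-squares polynomial fit on mean-centred age."""
    mean_age = sum(ages) / len(ages)
    xs = [age - mean_age for age in ages]
    k = degree + 1
    normal = [[sum(x ** (i + j) for x in xs) for j in range(k)] for i in range(k)]
    rhs = [sum((x ** i) * y for x, y in zip(xs, scores)) for i in range(k)]
    coef = solve_linear_system(normal, rhs)
    return sum((y - sum(c * x ** p for p, c in enumerate(coef))) ** 2
               for x, y in zip(xs, scores))


def aic(rss, n, n_params):
    """One common least-squares form of AIC (lower values indicate better fit)."""
    return n * math.log(rss / n) + 2 * n_params


def nested_f(rss_simple, rss_complex, n, params_simple, params_complex):
    """F statistic contrasting a simpler model with a more complex nested model."""
    df_numerator = params_complex - params_simple
    df_denominator = n - params_complex
    return ((rss_simple - rss_complex) / df_numerator) / (rss_complex / df_denominator)


# Hypothetical U-shaped data: ages 7 to 16, two participants per age.
ages = list(range(7, 17)) * 2
scores = [0.02 * (a - 11) ** 2 + (0.03 if a % 2 == 0 else -0.03) for a in ages]
n = len(ages)
rss = {degree: polynomial_rss(ages, scores, degree) for degree in (1, 2, 3)}
for degree in (1, 2, 3):
    print(degree, round(aic(rss[degree], n, degree + 1), 2))
f_linear_vs_quadratic = nested_f(rss[1], rss[2], n, 2, 3)
f_quadratic_vs_cubic = nested_f(rss[2], rss[3], n, 3, 4)
print(round(f_linear_vs_quadratic, 2), round(f_quadratic_vs_cubic, 2))
```

Age is mean-centred before fitting only to keep the normal equations numerically well behaved; centring does not change the residuals of a polynomial fit of a given degree.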

Fig. 1 Trajectory of planning (ToL) and working memory (digit span) relative to age for ASD participants

Fig. 2 Trajectory of inhibition relative to age for ASD participants

Fig. 3 Trajectory of cognitive flexibility relative to age for ASD participants

Fig. 4 Trajectory of affective decision-making relative to age for ASD participants

Fig. 5 Trajectory of delay discounting relative to age for ASD participants

Cross-sectional Trajectories

Cool EF Tasks

Planning

Linear regression analysis showed that performance on the Tower of London test improved with increasing age, R² = .25, F(1, 80) = 27.03, p < .001. As shown in Fig. 1, adolescents with ASD performed better in planning than children with ASD.

Working Memory

Linear regression analysis showed that digit span performance improved with increasing age, R² = .22, F(1, 79) = 21.9, p < .001. As shown in Fig. 1, adolescents with ASD performed better in working memory than children with ASD.

Inhibition

Linear regression analysis showed that performance on the Stroop test improved with increasing age, R² = .28, F(1, 78) = 29.8, p < .001. As shown in Fig. 2, adolescents with ASD performed better in inhibition than children with ASD.

Cognitive Flexibility

Linear regression analysis showed that performance on Berg’s Card Sorting Test improved with increasing age, R² = .13, F(1, 80) = 11.5, p = .001. As shown in Fig. 3, adolescents with ASD performed better in cognitive flexibility than children with ASD.

Hot EF Tasks

Affective Decision-making

Results (see Table 2) showed that the quadratic model was the best fitting model for the relationship between age and affective decision-making scores, R² = .25, F(2, 77) = 12.8, p < .01. The quadratic fit indicated a U-shaped pattern (Fig. 4): affective decision-making performance deteriorated from middle childhood to early adolescence (7–11 years) and then improved into mid adolescence (11–16 years in our sample).

Delay Discounting

Results (see Table 2) indicated that the cubic model was the best fitting model for delay discounting, R² = .1, F(3, 78) = 2.73, p = .049. To interpret the non-linear relationship between delay discounting and age, we consider the model displayed in Fig. 5, which shows an initial increase in delay discounting scores from 7 to 9 years of age, a decrease from 9 to 13 years of age, and an increase from 13 years of age onwards.

Regression Analyses

The EF-ToM relations were first investigated with a hierarchical regression analysis to determine whether EF predicted ToM Eyes Test performance above and beyond age, and whether hot EF made a unique contribution to ToM above and beyond age and cool EF. The first block (age) contributed significantly to the variance of the ToM Eyes Test, F(1, 76) = 19.73, p < .001, explaining 19.6% of the variance; age was a significant predictor of the Eyes Test (p < .001). When the cool EF aspects were introduced in block 2, the variance explained rose to 30.5%, a significant increase of 10.9%, F(4, 72) = 3.98, p = .006. Digit span scores significantly predicted the ToM Eyes Test (p = .034): higher performance on digit span was associated with higher performance on the Eyes Test. Finally, the hot EF aspects entered in block 3 explained no significant additional variance, F(2, 70) = 0.2, p = .82, and neither hot EF variable was a significant predictor.
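
The block-wise test of additional variance in a hierarchical regression rests on the F statistic for a change in R². The sketch below shows that standard formula under the assumption that unadjusted R² values are supplied. The function name and the input values are hypothetical, and the study's own figures are deliberately not plugged in because the text does not state whether its percentages are adjusted.

```python
def r2_change_f(r2_reduced, r2_full, n_added, df_residual_full):
    """F statistic for the gain in (unadjusted) R^2 when n_added predictors enter the model.

    df_residual_full is n - k - 1 for the fuller model with k predictors.
    """
    if not 0.0 <= r2_reduced <= r2_full < 1.0:
        raise ValueError("expected 0 <= r2_reduced <= r2_full < 1")
    gain_per_predictor = (r2_full - r2_reduced) / n_added
    unexplained_per_df = (1.0 - r2_full) / df_residual_full
    return gain_per_predictor / unexplained_per_df


# Hypothetical values for illustration only (not taken from the study)
f_change = r2_change_f(r2_reduced=0.20, r2_full=0.30, n_added=4, df_residual_full=72)
```

The resulting statistic is referred to an F distribution with `n_added` and `df_residual_full` degrees of freedom, which is how each added block is judged to contribute significantly or not.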

A series of hierarchical logistic regressions was performed to examine the effects of hot and cool EF on the likelihood that participants would pass the ToM second-order false belief and ignorance tests.

ToM 2nd-Order False Belief (Ice-cream Van Story)

Results showed that the first block (age) did not significantly explain variance in ToM 2nd-order false belief (p = .22). Adding the second block of predictors (cool EF) yielded a statistically significant model, χ²(5) = 14.55, p = .012, which explained 23.1% (Nagelkerke R²) of the variance in ToM 2nd-order false belief and correctly classified 61% of cases. Higher performance on digit span (working memory) was associated with an increased likelihood of passing this ToM test (p = .01). Finally, adding the third block of predictors (hot EF) also yielded a statistically significant model, χ²(7) = 21.33, p = .003, which explained 32.5% (Nagelkerke R²) of the variance in ToM 2nd-order false belief and correctly classified 63.6% of cases. Higher performance on the “hot” IGT (affective decision-making) was also associated with an increased likelihood of passing this 2nd-order false belief test (p = .02).
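
For readers less familiar with the indices reported for each logistic model, the following sketch shows how a model χ² and a Nagelkerke R² are conventionally derived from the log-likelihoods of an intercept-only model and a fitted model. The function names and the numbers are hypothetical and are not values from the study.

```python
import math


def model_chi_square(ll_null, ll_model):
    """Likelihood-ratio chi-square of the fitted model against the intercept-only model."""
    return 2.0 * (ll_model - ll_null)


def nagelkerke_r2(ll_null, ll_model, n):
    """Cox-Snell R^2 rescaled by its maximum attainable value, so the index can reach 1."""
    cox_snell = 1.0 - math.exp(2.0 * (ll_null - ll_model) / n)
    max_cox_snell = 1.0 - math.exp(2.0 * ll_null / n)
    return cox_snell / max_cox_snell


# Hypothetical log-likelihoods and sample size, for illustration only
chi_square = model_chi_square(ll_null=-55.0, ll_model=-47.0)
pseudo_r2 = nagelkerke_r2(ll_null=-55.0, ll_model=-47.0, n=80)
```

The χ² is evaluated with degrees of freedom equal to the number of predictors in the model, which is why the degrees of freedom grow as each block is added.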

ToM 2nd-Order Ignorance (Ice-cream Van Story)

Results showed that the first block (age) yielded a statistically significant model, χ²(1) = 9.91, p = .002, which explained 19.7% (Nagelkerke R²) of the variance in ToM 2nd-order ignorance and correctly classified 81.8% of cases. Older participants had an increased likelihood of passing this ToM test (p = .007). Adding the second block of predictors (cool EF) also yielded a statistically significant model, χ²(5) = 16.3, p = .006, which explained 31.1% (Nagelkerke R²) of the variance in ToM 2nd-order ignorance and correctly classified 88.3% of cases. However, none of the cool EF tests was a significant predictor. Finally, adding the third block of predictors (hot EF) also yielded a statistically significant model, χ²(7) = 16.88, p = .018, which explained 32.1% (Nagelkerke R²) of the variance in ToM 2nd-order ignorance and correctly classified 88.3% of cases. Similarly, none of the hot EF tests was a significant predictor of this ToM test.

ToM 2nd-Order False Belief (Birthday Puppy Story)

Results showed that the first block (age) yielded a statistically significant model, χ²(1) = 4.28, p = .04, which explained 9.3% (Nagelkerke R²) of the variance in ToM 2nd-order false belief and correctly classified 84.4% of cases. Older participants had an increased likelihood of passing this ToM test. Adding the second block of predictors (cool EF) also yielded a statistically significant model, χ²(5) = 18.92, p = .002, which explained 37.6% (Nagelkerke R²) of the variance in ToM 2nd-order false belief and correctly classified 88.3% of cases. However, none of the cool EF tests was a significant predictor. Finally, adding the third block of predictors (hot EF) also yielded a statistically significant model, χ²(7) = 19.8, p = .006, which explained 39.2% (Nagelkerke R²) of the variance in ToM 2nd-order false belief and correctly classified 89.6% of cases. Similarly, none of the hot EF tests was a significant predictor of this ToM test.

ToM 2nd-Order Ignorance (Birthday Puppy Story)

Results showed that the first block (age) yielded a statistically significant model, χ²(1) = 7.37, p = .007, which explained 13.8% (Nagelkerke R²) of the variance in ToM 2nd-order ignorance and correctly classified 76.6% of cases. Older participants had an increased likelihood of passing this ToM test. Adding the second block of predictors (cool EF) also yielded a statistically significant model, χ²(5) = 20.98, p < .001, which explained 36% (Nagelkerke R²) of the variance in ToM 2nd-order ignorance and correctly classified 88.3% of cases. Higher performance on ToL (planning) was associated with an increased likelihood of passing this ToM test (p = .046). Finally, adding the third block of predictors (hot EF) also yielded a statistically significant model, χ²(7) = 21.36, p = .003, which explained 36.5% (Nagelkerke R²) of the variance in ToM 2nd-order ignorance and correctly classified 88.3% of cases. However, none of the hot EF tests was a significant predictor of this ToM test.

Discussion

This study assessed the cross-sectional developmental trajectories of cool and hot EF and their relationships to aspects of ToM in children and adolescents with ASD. To date, most developmental EF research has focused on cool EF aspects, leaving gaps in our understanding of hot EF development in ASD. This scarcity of data, together with mixed findings, underscores the need for a comprehensive investigation, which our study addressed through an extensive battery of assessments covering both cool and hot EF domains alongside ToM measures. Our findings revealed that all cool EF aspects showed linear age-related improvements in ASD, whereas both hot EF aspects showed non-linear associations with age. The quadratic model of affective decision-making revealed a U-shaped trajectory, indicating a decline in performance from middle childhood to early adolescence followed by improvement into mid and later adolescence. Moreover, the cubic model of the delay discounting task indicated turning points in the trajectory during development. Finally, selective cool EFs (i.e., working memory and planning) significantly predicted ToM over and above age, while “hot” affective decision-making was significantly associated with ToM over and above age and cool EFs in middle childhood and adolescence in ASD.

EF Development in ASD

The performance gains in working memory are in line with previous findings in typical development showing that this EF aspect continues to improve throughout childhood and well into adolescence (e.g., Best & Miller, 2010; Gathercole et al., 2004; Luciana & Nelson, 2002), but they contradict previous research in ASD that reported either a developmental arrest (Andersen, Skogli, Hovik, Geurts, et al., 2015; Van den Bergh et al., 2014) or even age-related deterioration (Kouklari et al., 2018; Rosenthal et al., 2013) in working memory. Although these previous studies suggested that working memory impairments might persist or increase with age in ASD (given more demanding environments and increased needs for working memory processes during adolescence), our data showed that adolescents with ASD were more capable of manipulating higher loads of working memory information. Considering the maturation process perspective of Luna et al. (2007), these significant improvements in working memory could imply that the brain developmental/maturation processes for working memory may be intact in ASD participants. Because the study lacked a control group, however, this statement is hypothetical and needs to be corroborated in future longitudinal and neuroimaging studies (this caveat applies throughout the discussion). Similar to working memory, results for planning showed significant age-related improvements across development in ASD, in line with previous evidence in young children (Pellicano, 2010) and young adolescents (Happé et al., 2006) with ASD. Generally, little is known about the developmental patterns of planning in individuals with ASD (Van den Bergh et al., 2014), and the present data highlight the likelihood that the brain developmental/maturation processes for planning may be intact and progress from middle childhood to adolescence in ASD (as with working memory, this hypothesis needs to be examined in future neuroimaging studies). 
The age-related improvements in inhibition support the previously reported advances of this skill not only during early and middle childhood (e.g., Carlson et al., 2013; Romine & Reynolds, 2005) but also beyond the age of 10 years in typical development (Best & Miller, 2010). These age-related performance gains in inhibition are in line with previous similar findings in ASD (e.g., Happé et al., 2006; Luna et al., 2007; Van Eylen et al., 2015) and paint a more positive picture of cognitive developmental trends in ASD, perhaps indicating an intact developmental progression for this ability too (as with working memory and planning above, this hypothesis needs to be examined in future neuroimaging studies). Finally, regarding cognitive flexibility, although previous evidence is rather inconsistent (Geurts et al., 2009), our results are in line with limited data showing that cognitive flexibility improves in ASD during childhood (Happé et al., 2006). Our results suggest that although cognitive flexibility problems are very characteristic of ASD, this capability increases, rather than flexibility problems increasing, from childhood to adolescence in ASD. Generally, the reported age-related improvements in cool EF could be explained on the basis of the underlying prefrontal cortex regions undergoing substantial maturation (Otero & Barker, 2014) during the transition from middle childhood to adolescence, not only in typical development but in ASD as well (as already highlighted, this hypothesis needs to be cautiously examined in future neuroimaging studies). During these crucial developmental periods, cognitive demands increase, which could account for performance improvements in cool EF in ASD. 
It should be noted that, alongside the maturation of the prefrontal cortex, benefits of treatment over time may also play a significant role in enhancing cool EF skills in individuals with ASD. Cognitive interventions, for instance, have been reported to improve cool EFs (e.g., Pasqualotto et al., 2021), while social and emotional interventions can address social interaction challenges and/or reduce stress, which may free cognitive resources for EF tasks. However, as this matter was beyond the scope of our research, we did not keep track of the individualized therapeutic interventions/treatment that our participants may have undertaken or were undertaking at the time of recruitment. Thus, we cannot draw any conclusions about the interrelation between treatment gains and cool EF development.

The relationship between affective decision-making and age was quadratic, represented by a U-shaped parabola, which provides important insights into how affective decision-making evolves as individuals with ASD age. At the early age range of the sample, approximately between 7 and 10 years old, affective decision-making scores declined with age. This initial decrease in affective decision-making abilities during middle childhood in ASD could be attributed to several factors, including challenges in integrating social and emotional cues and potential difficulties in navigating increasingly complex social interactions during this stage of development. A further decline in affective decision-making scores occurred between 10 and 11 years of age, reaching the lowest point of the curve, indicative of additional challenges or changes in hot EF processing during the transition from childhood to early adolescence. During early adolescence, individuals with ASD may face heightened social and emotional challenges, as well as increased anxiety-related difficulties, which could negatively influence their affective decision-making. However, as ASD participants progressed into later adolescence (from 11 years onwards), the U-shaped curve displayed a notable upturn. Affective decision-making scores began to increase, reaching the highest levels (for the specific sample) at the upper end of the age range (16 years). This resurgence in affective decision-making abilities could be associated with an extended development of “hot” processes and, potentially, the acquisition of adaptive coping strategies over time. As individuals with ASD approach late adolescence, they may develop more sophisticated motivational and emotional understanding, leading to improved affective decision-making skills. 
The U-shaped pattern observed in affective decision-making in this study suggests that affective decision-making abilities in ASD children and adolescents are dynamic and undergo distinct changes throughout their developmental trajectory. The non-linear association between age and affective decision-making highlights the need for a comprehensive understanding of how emotionally/motivationally charged cognitive processes evolve in individuals with ASD. Current findings should be interpreted with caution, acknowledging the cross-sectional design and the use of a specific age range. Future longitudinal research with a larger and more diverse sample would be valuable to validate and further elucidate the dynamic relationship between age and affective decision-making in individuals with ASD.

In terms of delay discounting, our findings revealed a complex, non-linear association (non-monotonic pattern) between age and delay discounting. These observed patterns align with previous research in typical development suggesting that delay discounting follows a non-linear trajectory during development (Steinberg et al., 2009). Findings indicated that as age increased, delay discounting behavior did not linearly decrease or increase, nor did it exhibit a simple U-shaped or inverted U-shaped curve. Instead, the relationship appeared to be more intricate, characterized by three distinct phases in its developmental trajectory. More specifically, from 7 to 9 years of age, children with ASD showed an initial increase in delay discounting scores, which indicates less impulsivity when faced with delayed rewards (for this age group). Second, between 9 and 13 years of age, a decline in delay discounting scores occurred, indicating increased impulsivity in decision-making regarding delayed rewards in this age group. This phase corresponds to the onset of early adolescence, a period characterized by heightened emotional reactivity and increased risk-taking behaviors (Somerville et al., 2010). The surge in impulsivity during this stage might be influenced by a combination of pubertal changes and heightened sensitivity to rewards, which can be amplified in ASD, due to additional sensory and social challenges (e.g., higher anxiety levels). Third, from 13 years of age onwards (up to 16 years of age), there was an increase in delay discounting scores, indicating a reduction in impulsivity with age in this group. This phase corresponds to the later stages of adolescence, during which individuals tend to exhibit greater cognitive control and emotional regulation (Steinberg, 2005). 
Generally, the development of a relative preference for larger, delayed rewards may follow a timetable akin to the maturation of time perspective or the anticipation of future consequences (Steinberg et al., 2009), a phenomenon that undergoes significant growth during adolescence. The present findings are important, as they suggest that the underlying neural circuits involved in delay discounting and impulse control may progress after mid adolescence in ASD as well. Relevant hot EF brain regions, such as the ventromedial prefrontal cortex, exhibit robust connections with the ventral striatum, which is responsible for reward processing and motivation (Pujara et al., 2016). These interconnections are crucial in integrating emotional and motivational information, thereby facilitating adaptive behavioral responses based on the anticipated outcomes of various actions. Previous fMRI studies conducted in clinical populations have demonstrated a correlation between ADHD and ventral striatum activity during reward anticipation in delay tasks (Scheres et al., 2007; Ströhle et al., 2008). Investigating similar associations in individuals with ASD could shed light on whether ASD shares a complex neural association with the ventral striatum that may affect reward processing atypically across different developmental phases of ASD. These findings may have practical implications for understanding how age relates to the discounting of delayed rewards, as certain age ranges appear to be associated with higher or lower levels of delay discounting in ASD. Tailored interventions could thus be designed to address the specific age ranges where delay discounting appears to be more pronounced, aiming to reduce impulsive temporal decision-making and promote better long-term planning. As with affective decision-making, these findings should be interpreted cautiously due to the cross-sectional design of our study and the limited age range. 
Future longitudinal studies are warranted to elucidate the dynamic nature of this relationship over time and generalize results to broader age groups.

In summary, the current research highlights the need to examine the development of different EF aspects in ASD separately, as it seems that the EF structure differentiates more with age and that different EF aspects develop differently in ASD as well. Our findings paint a more positive picture of EF developmental progression, as the age-related improvements found in all cool EF aspects and the turning points of improvement in hot EF suggest that there may be windows of plasticity in ASD across development. These findings need to be examined further in future longitudinal and neuroimaging studies, since understanding the clinical implications of EF plasticity in ASD can allow for the design of interventions tailored to specific areas of deficit or strength. In particular, identifying areas of relative EF strength and building upon them can empower individuals with ASD, boost their confidence, and foster a sense of competence, which may positively impact other areas of their life.

EF-ToM Relation

Results from regression analyses indicated that ToM shares a significant association with selective hot and cool EF in middle childhood and adolescence in ASD, that is, beyond the early years of its emergence and first development (preschool period). In line with previous studies in young or school-aged children as well as adolescents with ASD (e.g., Kimhi et al., 2014; Kouklari et al., 2017, 2019; Ozonoff et al., 1991; Pellicano, 2007, 2010), we corroborated that ToM (i.e., mental state/emotion recognition, second-order false belief, and second-order ignorance) was predicted by cool EF aspects (planning and working memory). More specifically, participants who performed better in the working memory test also scored higher in the Eyes Test (tapping mental state/emotion recognition) and had a higher likelihood of passing the second-order false belief task. The ability to understand others’ beliefs and to attribute/recognize more complex mental/emotional states seems to be associated with the capacity to handle heavier working memory loads in middle childhood and adolescence in ASD. Working memory is crucial for children and adolescents during social interactions, as they need to maintain and manipulate new, complex information (i.e., other people’s emotional or mental states) (McQuade et al., 2013). Moreover, the recognition and successful attribution of the appropriate emotions/beliefs may require the active maintenance and manipulation of one’s own and others’ perspectives (e.g., social or emotional cues in social circumstances) in ASD. Finally, participants who performed better in planning had a higher likelihood of passing the second-order ignorance test. It is likely that school-aged children and adolescents with ASD are in greater need of their planning abilities as they cope with advanced forms of knowledge and social interactions with peers and/or adults (Del Giudice, 2014).

Regression results demonstrated that ToM second-order false belief was significantly associated with “hot” affective decision-making in ASD, over and above cool EF. This finding aligns with the limited existing evidence of associations between ToM abilities and facets of hot EF, including delay discounting (Kouklari et al., 2017, 2018) and affective decision-making (Yu et al., 2021), in individuals with ASD. Hot EF, encompassing the processing of the emotional and motivational salience of various selections, may be elicited when children need to successfully apply ToM (Zelazo et al., 2005). Thus, in the context of ASD, these hot EF mechanisms could play a role in facilitating the awareness of others’ beliefs, emotions, and motivations, which are essential components of successful ToM functioning. Of particular note is the congruence between the developmental stage under study and the theoretical underpinnings of cognitive maturation. The present investigation lends support, albeit partial, to the notion that there is likely an extended developmental association between the underlying neural structures of hot processes and ToM in middle childhood and adolescence in ASD. The observed association between affective decision-making and ToM second-order false belief, even after controlling for cool EF, underscores the interplay between emotional/motivational processing and social cognition. Emotional salience and motivational context play a pivotal role in the formation and interpretation of social interactions. As individuals with ASD navigate complex social scenarios, hot EFs may help them evaluate the affective significance of different options and anticipate the emotional responses of others. 
These findings accentuate the clinical relevance of hot EF in ASD, as deficits in affective decision-making could hinder the acquisition of false belief understanding, thereby contributing to the challenges observed in social interactions among individuals with ASD. Future longitudinal and neuroimaging studies need to explore this relationship further in ASD.

Findings of the current study need to be interpreted cautiously due to several limitations. First, the lack of a control group was an important limitation. Comparing the ASD developmental pathway to that of a control group would have helped to clarify whether there is a developmental delay or deviance, or whether deficits are constant across the cross-sectional trajectories in ASD. Moreover, the present sample of children and adolescents may not represent the more general ASD population. Participants’ age also ranged between 7 and 16 years; thus, future studies need to establish whether the present findings can be replicated in younger children and/or adults. Finally, as the reliability of measurements may vary across different tests (i.e., when comparing distinct tasks, such as “cool” and “hot” tasks, as well as ToM measures), it is essential to acknowledge the potential impact of such discrepancies on the observed relationships. We refrain, however, from making definitive claims about the impact of varying reliability, as further investigations are warranted to elucidate such complex interplays.

In conclusion, the investigation of the developmental trajectories and correlates of higher cognitive processes such as EF could shed more light on the theoretical understanding of the cognitive maturation processes beyond preschool years in ASD. This research is limited and used cross-sectional approaches; thus, results should be interpreted with caution. Future longitudinal and neuroimaging studies need to corroborate findings of intact EF developmental progression across childhood and adolescence in ASD.