Difficulty sensitivity replaces reward sensitivity during adolescence: Task-related fMRI and functional connectivity during self-regulative learning choices

E.H.H. Keulers et al.

A hallmark of development is the increasing ability to weigh competing thoughts, emotions and actions and select the optimal behaviour to achieve a (long-term) goal. This self-regulating decision-making ability as well as the underlying brain circuits undergo maturational changes throughout adolescence (e.g., [1][2][3]). Although self-regulation is still developing, the educational system appeals to self-regulating abilities from early adolescence on, such as metacognitive monitoring and control of study (e.g., [4,5]). For example, in secondary school adolescents are expected to plan their homework, monitor their learning activities, decide what to remember, and select the most relevant information when studying for a test. Research has shown, however, that such regulating and decision-making skills are still far from optimal at the beginning of secondary school [6,7]. This may lead to suboptimal learning choices, experienced stress, and lower academic outcomes. Particularly during adolescence, decision-making is modulated by affective factors such as reward (e.g., [1]), but how this affects choice behaviour in educational settings is still largely unknown.
Adolescents are thought to be more sensitive to reward than adults, both at the behavioural and neural level (e.g., [8][9][10]). However, functional magnetic resonance imaging (fMRI) studies are inconsistent regarding how reward affects activity in core reward structures during adolescence. Compared to adults, studies sometimes report higher activation in the ventral striatum (VS) during reward anticipation (e.g., [11,12]) and other times no difference or even reduced activation in adolescents (e.g., [13,14]). This inconsistency might be explained by differences in the tasks used and the accompanying effort required [15]. Higher reward-related activity in the VS of adolescents is typically observed in tasks involving a passive submission to the outcome, which is characteristic of gambling tasks such as the slot machine [12], wheel of fortune [16], or 50/50 odds forced-choice tasks [11,17]. Furthermore, in these tasks the higher sensitivity of the adolescent VS to reward is mostly observed during the receipt of reward ([11,12,16]; but not in [17]), and less consistently during reward anticipation ([11]; but not in [12,17]). Contrasting results are obtained with the monetary incentive delay task, in which participants need to respond as fast as possible to the appearance of a stimulus to obtain the reward [13,14,18,19]. The response window is individually adjusted to keep the success rate constant (often at 66.67%). Under these circumstances, when participants have active control over the trial outcome, the adolescent VS response to reward does not exceed that of adults, and is often even lower, particularly during the anticipation of reward ([13,18,19]; only Joseph et al. [19] also report a lower response during the receipt of reward).
A possible explanation for these discrepant findings is that the cognitive control mode induced by the more engaging speeded performance task (i.e., monetary incentive delay task) has a regulative influence on valuation structures (e.g., VS), such that in adolescents the VS responsiveness to reward is suppressed. Tasks requiring a higher cognitive control level would then induce a transition in adolescents from a 'default state' (e.g., during passive reward tasks), featuring an adult-like response during anticipation and a higher response during receipt of reward, to a 'cognitive performance state' (e.g., during performance-dependent reward tasks), featuring a suppressed response during anticipation and an adult-like response during receipt of reward. Jarcho et al. [17] indeed showed that the increased neural response to reward (in insula and anterior cingulate cortex) in adolescents compared to adults was present when rewards were obtained without decision-making, but disappeared when rewards were obtained with decision-making. They concluded that cues signalling the need for decision-making engage top-down processes and thus minimize developmental differences in the engaged valuation circuitry [17]. That cognitive control might play a role is also suggested by the findings of Lamm et al. [14]. They revealed that specifically the cognitive control related caudate nucleus showed a longitudinal increase in the activity surplus for high versus low incentive trials between 16 and 20 years of age [14].
A comparable regulative modulation of valuation structures by anticipated cognitive effort is observed in effort discounting, i.e., the finding that the subjective value of an option depends not only on reward expectation, but also on the expected amount of effort required to obtain the reward [20,21]. Studies in which adults are asked to choose between high-load and low-load tasks for a high or low reward showed that cognitive effort discounts the value of rewards and that both components of subjective value are weighted in the same brain areas [21,22]. Children are also sensitive to task difficulty manipulations, but it is not until the age of 11-12 years that they use this information to guide their decisions away from engaging in difficult tasks [23,24], as adults do [25,26]. Likewise, decision-making in an educational context is sensitive not only to the reward associated with the learning goal (e.g., how important is an exam and what are the consequences of passing/failing), but also to the effort required to obtain it (e.g., how much content do I have to learn and how familiar is it) [27]. Literature on self-regulated learning stresses the importance of item difficulty in making decisions about which items to select for study [4,28]. Unlike in typical goal-oriented decision-making, where difficulty triggers avoidance, in metacognitive decisions about studying, effective regulation requires allocating more study time to difficult than to easy items [29,30]. With increasing age (i.e., between ages six and 12) children are better able to distribute time effectively, studying difficult items longer than easy items, but even at ages 13 to 15 years adolescents are still less effective than adults in optimizing learning [31][32][33].
This complex interplay between cognitive and affective factors in value-based decision-making relies on interconnected circuits of (sub)cortical structures in the brain [2,34,35]. Computation of the subjective value of choice options occurs in a so-called valuation network centred on the VS and the ventromedial prefrontal cortex (vmPFC) [21,36,37]. Neural activity in these regions varies parametrically with the subjective values participants attribute to options [36,37]. This parametric variation is also observed in the amygdala [22], which is functionally connected with VS and vmPFC in a neural circuit involved in regulating behaviour [3]. The amygdala assesses the affective value of rewards, choice options, and action outcomes and shares this information with the valuation network [38]. Other brain areas that are engaged during value-based decision-making are frontoparietal areas such as the middle cingulate cortex, lateral prefrontal cortex and intraparietal sulcus, as well as striatal regions beyond VS [22,36,39]. These structures are associated with cognitive control and their activation scales with effort cost, decision difficulty and arousal [21,36]. This cognitive control circuit in turn modulates the subcortical structures mentioned earlier, i.e., VS and amygdala, which code motivational values in value-based decision-making [3,37,40].
The connections within and between valuation and cognitive control networks undergo maturational changes throughout adolescence [2,3,41]. Subcortical local circuits mature first, with stronger functional connections between the amygdala and VS heightening emotional reactivity to value cues in early adolescence [42]. The top-down regulating influence of prefrontal areas on these subcortical circuits shows a more protracted maturational course until young adulthood, diminishing emotional and reflexive actions [43]. Similarly, the valence of the interactions between anterior-medial prefrontal regions and limbic structures, particularly the amygdala, changes from co-fluctuation (i.e., positive correlations) in children to complementarity (i.e., negative correlations) in adults [42,[44][45][46][47]]. This is interpreted as the result of the maturation of regulative control of medial prefrontal circuits over the responsiveness of subcortical valuation structures (amygdala, VS) [3,48].
In the present study, we examine the impact of anticipated reward as well as anticipated difficulty of learning material on brain activation in valuation and cognitive control structures during metacognitive decision-making. We asked adolescents to learn word-pairs while in the MRI scanner, using a task based on value-directed remembering tasks [49,50]. Before each word-pair was presented, however, they were asked to choose prospectively whether their later recall performance for the upcoming word-pair would be evaluated for points (and monetary gains) or not (no consequence for their gains). The participants based their metacognitive decision on the subjective valuation of the word-pairs, which was determined by the independently manipulated level of reward (low/high levels of points to gain) and of cognitive effort (low/high levels of effort). If the adult-like bias towards item difficulty considerations in study choices is related to the maturation of cognitive control, then this should be evident in the activity modulation of brain structures involved in cognitive control and their effect on the activation of structures in the valuation network. We examined brain activation parameters during metacognitive decision-making in six regions of interest (ROI), in the valuation network (VS, vmPFC, amygdala) and the cognitive control network (caudate nucleus, dorsal medial prefrontal cortex, and lateral prefrontal cortex), respectively, as delineated in the literature (e.g., [22,37]). Additionally, we examine the functional connectivity (FC) between valuation and cognitive control structures, to identify developmental differences that may underlie dissimilarities in activation and improvement of self-regulation in adolescence. We focus on low-frequency functional connectivity, rather than dynamic task-related interactions, because maturational changes are likely long-term neurophysiological alterations that are not task or task-condition specific [51,52]. FC reflects the prior history of co-activation between brain structures [53], resulting from neuroanatomically constrained and experience-shaped interactions between neuron populations [54][55][56]. Maturational changes in constraints and/or experience result in lasting changes in the relative strengths of couplings between the neurons involved [53,57], and consequently in functional connectivity. We only included 13- and 17-year-old adolescents, each in a narrow age range, because previous studies have shown that between these ages major changes occur in risky decision-making as well as in the functionality of the ROIs (e.g., [3,[58][59][60]]). Regarding brain activation during value-based decision-making, we hypothesize that valuation areas are more active in young compared to old adolescents when reward level is modulated, while cognitive control areas are more active in old compared to young adolescents when difficulty level is modulated. Regarding FC, the functional connections between subcortical valuation areas are hypothesized to be stronger in young than old adolescents, whereas the functional connections between the medial PFC on the one hand and the subcortical valuation areas on the other hand are thought to be stronger in old than young adolescents.

Participants and procedure
Participants were recruited through an advertisement in a local newspaper. They provided demographic information, such as their birth date and sex (defined as 'sex assigned at birth' with a binary categorization: male/female). Participants self-reported that they had normal or corrected-to-normal vision, had never repeated or skipped a school grade, had no psychiatric or neurobiological abnormalities and did not use medication that could influence cognitive functioning. Written informed consent was obtained from all adolescents and their parents. They received travel expenses reimbursement and vouchers with monetary value after completing the study. The present study was approved by the local ethical review committee (ECP-113 08-04-2012).
In total, 47 adolescents participated in the study: 25 young adolescents (mean age 13.1 ± 0.42 years; range 12.2-13.8 years; 15 females) and 22 old adolescents (mean age 17.0 ± 0.27 years; range 16.5-17.5 years; 15 females). Compared to earlier studies we used narrow age ranges to create homogeneous groups and limit inter-individual differences related to age. A post hoc power analysis (G*Power 3.1.9.7; [61]) indicated that with a sample size of 47, our F-tests in a 2 × 2 × 2 design could detect small to medium effect sizes (Cohen's f = 0.17) given an alpha level of 0.05 and a statistical power of 0.80.
All adolescents attended one training and one scanning session. During the training session participants received head motion training in a mock scanner and practiced the word-learning task, in which they were instructed to use a visualization technique for encoding word-pairs. The two age groups did not differ in estimated verbal IQ (Peabody Picture Vocabulary Test-III-NL [62]; F (1, 39) = 0.04; p = 0.850) or non-verbal IQ (Raven Standard Progressive Matrices [63]; F (1, 45) = 0.95; p = 0.335). All participants scored within one SD of the norm on the Youth Self Report (YSR) and the Child Behavior Checklist (CBCL) [64], and age groups did not differ on the total problem scales (YSR: F (1, 45) = 2.18; p = 0.147; CBCL: F (1, 45) = 1.24; p = 0.273).

Word-learning task
Each trial in the task consisted of a metacognitive decision-making phase, during which participants received information about the upcoming word-pair's reward value and difficulty level and had to decide whether they wanted to be tested on the item, and an encoding phase, during which the word-pair was presented for memorization. After every 20 trials, a recall test was administered, after which a new block of 20 trials started.
Each word-pair consisted of a cue and a target word. Only the target word had to be recalled when presented with the cue. The word-pairs differed in the reward value received upon correct recall (low gain of one-two points vs. high gain of nine-ten points) and in the difficulty level of learning the word-pair in terms of concreteness or abstractness of the word content (low vs. high difficulty). Concrete words depicted solid objects (e.g., melon, schoolboard, mirror, pencil), whereas abstract words depicted feelings, actions, features, or non-concrete things (e.g., envy, prayer, heavy, color).
During the metacognitive decision-making, participants saw the difficulty and reward level of the upcoming word-pair and decided to either "play" (i.e., earning points if correctly recalling the target word afterwards) or not to "play" (i.e., not earning points if correctly recalling the target word afterwards). This decision incited participants to contemplate the value information, instead of only passively registering it. No points could be lost when a "play" target word was not recalled, because task try-outs had shown that losing points made participants reluctant to "play". After participants indicated their choice with a key press, the actual word-pair appeared for encoding.
A recall test was administered after every 20 trials, in which every cue word of the 20 word-pairs was shown once and the associated target word had to be given verbally. At the end of the experiment the points earned in the "play"-items were exchanged for a small amount of money. This whole procedure of decision-making, encoding and recall after each block of word-pairs was repeated for a total of five runs. The word lists used in these five runs were randomized across participants.
The task had a slow event-related design. The information during the metacognitive decision-making and the word-pair during the encoding were presented for six s each. The start of each phase was scanner triggered and the interval between the phases (i.e., decision, encode, next decision, etc.) varied pseudo-randomly between six and ten s. The long phase onset asynchrony of 12 to 16 s (six s presentation plus six to ten s interval) ensured minimal overlap in BOLD signals of different phases, and the jittering ensured adequate deconvolution of any remaining overlap [65,66]. Duration per task run was 9.46 min.
To force participants to disengage from encoding the word-pair during the inter-phase intervals, an unrelated visual detection task was administered in the interval, in which a 3D face configuration had to be detected in an array of seemingly randomly moving dots. The face detection stimulus lasted two s, and contained a face in 40 % of the stimuli, a cylinder in 30 %, and a box in the remaining 30 %. One to three stimuli were shown in inter-stimulus intervals (ISI) of six to ten s, respectively, with a fixation cross in the remaining time.
Because our hypotheses centre on age-dependent brain activity differences during the anticipation of and decisions about value information, the paper will only present results related to the metacognitive decision-making, and not to the encoding, the recall tests or the face detection task.

Behavioural analyses
Behavioural analyses were conducted using the statistical package SPSS 26.0. To meet the normality criterion, dependent variables were transformed prior to statistical analyses to a square root scale (number of times decided to "play") or a logarithmic scale (reaction times). Analyses were conducted to examine the influence of difficulty and reward associated with each word-pair as well as age group on task performance during metacognitive decision-making. For each dependent variable, an analysis of covariance (ANCOVA) was conducted with age group (two levels: young vs old adolescents) as between-subject factor, difficulty (two levels: low vs high) and reward (two levels: low vs high) as within-subject variables and sex, verbal and non-verbal IQ as covariates. The alpha level was set at 0.05. Greenhouse-Geisser correction was applied to compensate for deviations from the sphericity assumption.
Data were preprocessed using SPM version 12 (Wellcome Department of Cognitive Neurology, London, UK). First, functional images were corrected for slice scanning time differences, followed by rigid realignment to the first scan and co-registration to the participant's structural image. Spatial normalization to the MNI template was performed on the segmented anatomical images and applied to the functional images, which were in that step re-sliced to a two mm isotropic resolution. For the functional connectivity and multivariate pattern analyses, no smoothing beyond that imposed by the re-slicing was applied. The functional connectivity data were not smoothed to avoid contaminating the signal of the relatively small seed regions with signal from adjacent voxels, thus improving specificity of functional connections. The contribution of local noise was suppressed, nonetheless, by averaging the voxel signals within each seed region. When the same functional data were used for task-related GLM analyses, the images were spatially smoothed using a four mm Gaussian kernel.

Task-related activation analyses
Head motion handling.
In order to control for the confounding effect of head motion [67,68], we applied three corrections to our data [51,58]. First, participants were excluded when start-to-end head movement exceeded 3 mm in the z direction and/or 2.33° in x-axis rotation in any of the five runs. This led to the exclusion of data from five young adolescents and two old adolescents, resulting in the described final sample of 47 participants. Secondly, the six affine realignment parameters were included as covariates in the statistical analysis, to remove signal fluctuations related to magnetic field inhomogeneity [69]. Lastly, volumes for which head position changed by more than one-tenth of the slice thickness (i.e., 0.4 mm) were identified. These fast movements relative to the gradient setting during acquisition distort the measured signal. Task trials with an onset eight seconds or less prior to such a fast movement volume were modelled in a separate covariate of no interest. This covariate included 3.9 % of all volumes in the younger group and 4.4 % in the older group. For the remaining trials, there were no significant age differences in the absolute intra-volume z-translation (t (45) = 0.7; one-tailed p = 0.246), or the intra-volume x-rotation (t (45) = 0.1; one-tailed p = 0.464).
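The third correction can be illustrated with a minimal sketch. It assumes a single displacement trace per run (in mm), a volume-to-volume difference as the measure of change, and a repetition time of 2 s; none of these details is specified above, and the function names are hypothetical.

```python
def fast_movement_volumes(positions_mm, threshold_mm=0.4):
    """Return indices of volumes whose head position changed by more than
    threshold_mm relative to the preceding volume (assumed criterion)."""
    return [
        i
        for i in range(1, len(positions_mm))
        if abs(positions_mm[i] - positions_mm[i - 1]) > threshold_mm
    ]


def contaminated_trials(trial_onsets_s, flagged_volumes, tr_s=2.0, window_s=8.0):
    """Return indices of trials whose onset lies window_s seconds or less
    before a flagged volume. tr_s is an assumed repetition time."""
    flagged_times = [v * tr_s for v in flagged_volumes]
    return [
        idx
        for idx, onset in enumerate(trial_onsets_s)
        if any(0.0 <= t - onset <= window_s for t in flagged_times)
    ]


# Toy example: six volumes, three trials.
z_trace = [0.0, 0.1, 0.6, 0.65, 0.7, 0.2]
flagged = fast_movement_volumes(z_trace)
trials = contaminated_trials([0.0, 4.0, 20.0], flagged)
```

Trials returned by `contaminated_trials` would be the ones assigned to the separate covariate of no interest.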

First level imaging analyses.
A first level, or within-participant, analysis was conducted using a traditional univariate general linear model (GLM) in SPM12. The results of this analysis constitute the starting point for both the second level univariate ROI-based GLM analysis and the ROI-based multivariate pattern (MVP) classification analysis. Within the word-learning task, word-pair value information defined four conditions (i.e., low reward-low difficulty, etc.), resulting in four decision regressors, for the choice to "play" or not, and four encoding regressors for the subsequent encoding of the word-pair. The events were marked in time by their onset (the presentation of the value information, or the word-pair itself, respectively), and convolved with a theoretical hemodynamic response function (HRF) combined with time and dispersion derivatives. The face detection stimuli presented in the ISI were also included in the design matrix, marked by their onset time and convolved with the theoretical HRF. Additionally, the six motion parameters as well as trials contaminated with fast head movement were modelled as events of no interest. For subsequent second level analyses, estimated weights for the eight task-related regressors of interest were transformed to percent signal change maps (PSC maps) relative to the baseline signal intensity [70].

Regions of interest definition.
Mass-univariate whole brain analyses are less sensitive to developmental differences, which, due to the subtle and distributed nature of these changes [51,52,58], do not easily co-localize across individual brains and/or get lost in the smoothing intended to improve co-localization. The ROI-approach allows us to avoid inter-individual and inter-age-group co-localization issues and to select in each participant the most responsive voxels for analysis. Moreover, it allows us to assess for each participant with multivariate pattern analysis whether reward-related and/or difficulty-related information is encoded within each ROI.
For the present study, we selected three brain structures representative of the cognitive control network and three of the valuation network. The cognitive control ROIs were the posterior inferior frontal sulcus (IFS) in the lateral PFC, the dorsal medial prefrontal cortex associated with motor processing (dmPFC) and the caudate nucleus (e.g., [21,36,71]). ROIs in the valuation network were the vmPFC, VS and amygdala (e.g., [21,36,37]), which are central to the processing of rewards [72] and subjective value of choice options [38,73]. The ROIs were derived from existing atlas data [74][75][76][77]. They are visualized in Fig. 1, and their delineation is described in detail in the Supplementary Information 1 'Regions of interest definition'.
These ROI masks define the location and extent of the selected brain structures at the group level. They were individually optimized in two steps: only participant-specific grey matter voxels (density > 0.5) within each ROI were selected, and within this sub-sample only voxels that showed a significant activity change (see 'ROI-based GLM group analysis') were included. ROI masks and individual grey matter density maps were resliced to the image grid of the functional data.

[...] event yielded a significant F-statistic at α = 0.005. The significance level was not corrected for multiple comparisons, because the thresholding was merely intended to provide an estimate of the responsiveness of the ROI, for the purpose of subsequent second level group analysis. In addition to the number of active voxels, the response amplitude was quantified as the average percent signal change (PSC) in the 10% most activated voxels.
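As an illustration of the response amplitude measure, the following minimal sketch averages the PSC over the 10% most activated voxels of a ROI. How the 10% is rounded for small voxel counts is not stated in the text; rounding up with a minimum of one voxel is an assumption, as are the function and parameter names.

```python
import math


def response_amplitude(psc_values, fraction=0.10, positive=True):
    """Average percent signal change in the most activated voxels.

    positive=True selects the largest values (activation);
    positive=False selects the most negative values (deactivation).
    """
    if not psc_values:
        raise ValueError("ROI contains no voxels")
    ordered = sorted(psc_values, reverse=positive)
    # Assumed rounding rule: round up, keep at least one voxel.
    n_top = max(1, math.ceil(len(ordered) * fraction))
    top = ordered[:n_top]
    return sum(top) / len(top)


# Toy ROI with ten voxels: the top 10% is the single strongest voxel.
activation = response_amplitude([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0])
# Deactivation example using the two most negative of four voxels.
deactivation = response_amplitude([-3.0, -1.0, 0.5, 2.0], fraction=0.5, positive=False)
```

The `positive` switch mirrors the distinction between ROIs analysed for activation and ROIs analysed for deactivation.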
Depending on the ROIs and their known functional properties, the analysis focused on either positively activated voxels (IFS, dmPFC, caudate nucleus and VS), negatively activated (deactivated) voxels (vmPFC), or both (amygdala). Each ROI and dependent variable (number of voxels, and response amplitude) were assessed in a separate analysis. This was deemed necessary, given the large difference between ROIs in response valence and amplitude, and in fMRI accessibility/susceptibility. This resulted in 14 analyses (six ROIs × two dependent variables, plus two extra for positive and negative activations in the amygdala). Each analysis assessed the effect of reward, difficulty, and hemisphere as within-subject variables and age group as between-subject variable, with the participants' sex, verbal IQ score, non-verbal IQ, and the average reaction time in the metacognitive decision-making as covariates. Greenhouse-Geisser correction was applied to compensate for deviations from the sphericity assumption.
ROI-based MVP analyses.
MVP analyses were performed per unilateral ROI, participant, and valuation dimension (i.e., difficulty or reward). Single trial BOLD response estimates were obtained by averaging the signal measured during the third and fourth scans (i.e., four to eight seconds) after trial onset. Only correct trials not contaminated by fast movement (see 'Head motion handling') were included. Only voxels within an individual's grey matter map (see 'Regions of interest definition') were selected. The regularized logistic regression algorithm (http://www.csbmb.princeton.edu/mvpa) was used for classification, with a run-wise cross-validation. In addition, a 12-step recursive feature elimination procedure [78] was performed, which consisted of eliminating the 33% of voxels that contributed the least to the prior classification and then repeating the analysis with the reduced voxel set. The highest of these 12 averaged classification accuracies was reported as the end result of the MVP analysis. The MVP accuracies were transformed to Z-scores per ROI, participant and value dimension using a randomization procedure in which the exact same analysis was repeated 250 times, each time with a random redistribution of the training labels. The mean and standard deviation of the empirical null-distribution obtained from the 250 randomization accuracies were used to transform the observed accuracy to a Z-score. Then, across these participant-and-ROI-specific Z-scores a one-sample t-test against the expected value of 0 was performed. The Z-scores were the dependent variable in a ROI-and-value-dimension-specific analysis of variance with age as between-subject and value (high or low) as within-subject factor. The dependent variables (number of voxels, and response amplitude) were assessed in a separate analysis. A more detailed description of the MVP analyses can be found in the Supplementary Information 2: 'ROI-based MVP analyses'.
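The randomization-based Z-transformation can be sketched as follows. The sketch only covers the final transformation step and the label shuffling; the classifier itself is not reproduced. Using the sample standard deviation (n - 1 denominator) is an assumption, and the function names are hypothetical.

```python
import random
import statistics


def shuffled_labels(labels, rng):
    """Return a random redistribution of the training labels."""
    permuted = list(labels)
    rng.shuffle(permuted)
    return permuted


def accuracy_to_z(observed_accuracy, null_accuracies):
    """Express an observed classification accuracy as a Z-score relative to
    the empirical null distribution from the randomization runs."""
    mu = statistics.mean(null_accuracies)
    sd = statistics.stdev(null_accuracies)  # assumed: sample SD
    if sd == 0:
        raise ValueError("null distribution has zero variance")
    return (observed_accuracy - mu) / sd


# Toy example with three null accuracies instead of 250.
z = accuracy_to_z(0.7, [0.4, 0.5, 0.6])

# Label shuffling keeps the class counts but breaks the trial-label pairing.
rng = random.Random(0)  # fixed seed only for reproducibility of the sketch
permuted = shuffled_labels(["high", "low", "high", "low"], rng)
```

In the procedure described above, each of the 250 shuffles would be followed by the full classification analysis, and the resulting accuracies would form `null_accuracies`.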

Functional connectivity analysis
Cleaning and head motion handling.
The functional connectivity analysis was conducted on the task-related fMRI data, after removing the task-specific effects and filtering out high-frequency fluctuations in the data. It should be noted that task execution invokes subtle differences in the functional connectivity between task-relevant brain structures, compared to a state of rest, both during execution itself [79][80][81][82] and in the period of rest following execution [83,84]. Such task effects may have influenced the low-frequency structure of the data used here (e.g., [82,85]). However, the overall layout of networks in task and rest data is very similar [79,86]. Moreover, such task effects on connectivity cannot have induced a systematic difference between the groups, because similar task-related data were used in both groups and the groups did not differ in the performance measures for the task (and hence in the level of commitment to the task). Task-induced modulations of the signal were removed by regressing out the signal variance associated with the task-related GLM model [79,87]. The regression model further included linear trend removal by applying the standard SPM whitening filter, the six affine realignment parameters, the session-specific mean, and the average signal from the white matter and from the CSF. Neither the global brain signal nor the average grey matter signal was included. Intrinsic autocorrelations were removed from the data following the standard SPM procedures. The residual values were Fourier band-pass filtered from 0.01 to 0.1 Hz. Finally, for every fast head movement volume (as defined above), the volumes from −1 to +2 relative to the contaminated volume were removed from the time series.
The clean low-frequency time series data from the five runs were concatenated into a single time series data set for the functional connectivity analysis.
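The volume removal and concatenation steps can be sketched as below. This is an illustration under the assumption that removal is applied per run before concatenation and that windows are clipped at the run boundaries; the function names are hypothetical.

```python
def volumes_to_remove(flagged_volumes, n_volumes, before=1, after=2):
    """Indices of all volumes from -before to +after around each flagged
    volume, clipped to the run boundaries (assumed handling of edges)."""
    drop = set()
    for v in flagged_volumes:
        for i in range(v - before, v + after + 1):
            if 0 <= i < n_volumes:
                drop.add(i)
    return sorted(drop)


def scrub(time_series, flagged_volumes):
    """Return the time series without the contaminated volumes."""
    drop = set(volumes_to_remove(flagged_volumes, len(time_series)))
    return [x for i, x in enumerate(time_series) if i not in drop]


def concatenate_runs(runs, flagged_per_run):
    """Scrub each run separately and join the runs into one series."""
    joined = []
    for series, flagged in zip(runs, flagged_per_run):
        joined.extend(scrub(series, flagged))
    return joined


# Toy example: one run of eight volumes with fast movement at volumes 0 and 5.
removed = volumes_to_remove([0, 5], n_volumes=8)
kept = scrub(list(range(8)), [0, 5])
joined = concatenate_runs([[1, 2, 3, 4], [5, 6]], [[3], []])
```

Removing volumes per run keeps a contaminated window at the end of one run from spilling into the start of the next.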

Seed definition.
For the investigation of differences in the interactions between cognitive control and valuation structures in the two age groups, 28 small spherical seeds with a three mm radius (14 per hemisphere) were positioned in a systematic manner in these brain structures, as indicated in Fig. 5. The seed locations covered all the ROIs of the task-related analyses, except the IFS: seven seeds covered the cingulate cortex from the motor/cognition-related section to the subgenual region, four were placed in the medial striatum from dorsal to ventral with one extra in the putamen, and two in the amygdala (lateral and medial). Multiple smaller seeds were used to be able to observe differences in FC with age at a higher spatial resolution. While studies agree that adolescent brain maturation involves changes between medial PFC and subcortical structures, the specific region of the medial cortex involved varies considerably, from affect-related medial cortex (vmPFC; [45,88,89]) and pregenual anterior cingulate cortex [45,46], to the transition zone between affective and motor-related cingulate cortex [3,42,47], to the motor control related cingulate cortex (MCC; [44]). The spherical seeds were combined with the individual participant grey matter density images (density > 0.5) to exclude non-grey matter voxels.

Seed-by-seed based group analysis.
For each participant a seed-by-seed correlation matrix was computed. Seed time courses were obtained by averaging the voxel time courses within each seed and grey matter mask. A Fisher z transformation was applied to improve the normality of the distribution of the correlation coefficients.
The correlation matrices per participant were subjected to a seed-specific statistical analysis, with age as the between-subject factor, and hemispheric side of the seed as a within-subject factor. Because our research question does not involve any laterality effects, only the main effect of age group will be presented in the results section. Since there were 14 seeds, the total number of tests performed was (14² − 14)/2 = 91.
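The seed-by-seed computation can be illustrated with a minimal sketch: a Pearson correlation between two seed time courses, the Fisher z transformation z = atanh(r), and the count of unique seed pairs. The function names are hypothetical and the time courses are toy values.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equally long time courses."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def fisher_z(r):
    """Fisher z transformation; undefined for |r| = 1."""
    return math.atanh(r)


def n_unique_pairs(n_seeds):
    """Number of off-diagonal entries in one triangle of the matrix."""
    return (n_seeds * n_seeds - n_seeds) // 2


r = pearson_r([1.0, 2.0, 3.0, 4.0], [1.0, 3.0, 2.0, 4.0])
z = fisher_z(r)
n_tests = n_unique_pairs(14)
```

With 14 seeds, `n_unique_pairs` reproduces the 91 tests stated above.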

Correction for multiple comparisons
The number of ROIs and dependent measures investigated necessitates a protection against the chance of observing false positives due to multiple comparisons. A traditional correction for multiple comparisons, such as the Bonferroni correction, would greatly undermine the sensitivity of the analyses. Therefore, a different procedure was followed to take into account the accumulation of chance due to repeated tests. In this procedure, each statistical test was evaluated for significance at the uncorrected α-level of 0.05. Then, we used the cumulative binomial distribution to assess the likelihood that the observed pattern of positive tests over the repeated analyses was attributable to false positives. This test assesses the probability of obtaining a number of hits (i.e., significant tests) equal to or higher than the number actually observed, given the number of tests performed and a random hit rate equal to α (i.e., 0.05). These probability checks were computed separately for all the tests on different ROIs that related to the same dependent variable.
Single ROI effects at α = 0.05 were only interpreted as meaningful if the cumulative binomial probability of the observed single effects on similar tests for all the ROIs was less than 0.05.
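The cumulative binomial check can be written down in a few lines. The sketch below is an illustration under the assumption that the tests are independent with a false-positive rate equal to α; the function name `binomial_tail` is hypothetical and is not taken from the original analysis scripts.

```python
from math import comb

def binomial_tail(k_observed, n_tests, alpha=0.05):
    """P(X >= k_observed) for X ~ Binomial(n_tests, alpha).

    Probability of obtaining at least k_observed significant tests out of
    n_tests if every significant test were a false positive.
    """
    return sum(comb(n_tests, k) * alpha ** k * (1 - alpha) ** (n_tests - k)
               for k in range(k_observed, n_tests + 1))

# One or more hits in seven tests at alpha = 0.01
p_single = binomial_tail(1, 7, alpha=0.01)   # approximately 0.0679

# Fourteen or more hits in 91 tests at alpha = 0.05
p_fc = binomial_tail(14, 91, alpha=0.05)
```

The first example reproduces the borderline value reported in the Results for one or more positive tests out of seven at α = 0.01. Note that the check evaluates the whole family of tests at once; it does not control the error rate of any individual ROI effect.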
While performance was influenced by both the reward and the difficulty associated with the word-pair, there were no differences between age groups in metacognitive decision-making parameters. Given that the two adolescent age groups did not differ in any measure of task performance, it can be assumed that any BOLD difference observed between age groups reflects maturational differences rather than differences in task performance.

Task-related imaging
General effects of age (independent of the value parameters associated with the word-pairs) and of the value dimensions (independent of the age of participants) are described in Supplementary Information 3. These effects do not address our research question, but support the validity of the task paradigm. Main age effects confirm previous findings that cognitive control related cortical areas show higher response amplitude with age, while default mode areas become more strongly deactivated with age during cognitive performance [58,90]. Conversely, activity in subcortical structures, such as the caudate nucleus and amygdala, is stronger in young compared with old adolescents. The complementarity of cortical and subcortical activity differences between age groups could signal a shift towards more automated processing with age, with young adolescents relying more on active decision processes via cortical-subcortical circuits, while old adolescents rely more on automated processing implemented in trained cortical networks [91]. Main effects of value were seen in the ventral striatum, amygdala and dmPFC, in line with the literature on neural correlates of valuation [22,36]. Although the dmPFC is a cognitive control region, this value-based sensitivity is in agreement with the idea that action selection is outcome-oriented [92,93].
The main research question was whether young and old adolescents differ in their way of processing value information during metacognitive decision-making. To address this question, we investigated task-induced brain response amplitude and extent of activation in each of six ROIs, in relation to reward and difficulty associated with learning the word-pair. If age matters in processing this value-related information, this should be manifest in different modulations of brain activation during metacognitive decision-making in the two adolescent age groups.
The results of the age by value interaction analyses are summarized in Fig. 3 for the reward dimension and in Fig. 4 for the difficulty dimension. Although sometimes subtle, age differences in the value responses of the ROIs showed a consistent pattern. While deciding whether to be evaluated on the upcoming word-pair, young adolescents' brain activity showed sensitivity to the reward level associated with the word-pairs, while old adolescents' brain activity tended to be more affected by the difficulty level associated with the word-pairs. Reward level could be decoded above chance in 13 year-old participants from activity in four ROIs: right amygdala (t (24) = 2.63, p = 0.0073), right vmPFC (t (24) = 2.73, p = 0.0058), bilateral ventral striatum (left: t (24) = 2.28, p = 0.0160; right: t (24) = 2.01, p = 0.02814) and right caudate nucleus (t (24) = 2.34, p = 0.0141). The number of significant tests was above chance level (p = 0.0059, cumulative binomial coefficient for ≥ five positives out of 24 tests (six ROIs x two sides x two ages) at α = 0.05). Moreover, assuming that all five positive tests are false positives, the chance that four of them occur in the tests of the left and/or right hemisphere valuation ROIs of the 13 year-olds (three ROIs x two sides) is statistically unlikely (p = 0.0065, i.e., the chance of obtaining four or more hits out of five attempts, when the hit probability is 6/24). The caudate nucleus was additionally implicated in age-dependent processing of reward-related information, because young adolescents had significantly more voxels activated in the bilateral caudate nucleus than old adolescents during low compared to high reward items (F (1, 41) = 7.24, p = 0.0100). The multiple comparison probability for one or more positive tests at α = 0.01 out of seven tests is borderline significant (cumulative binomial p = 0.0679).

Fig. 2. Adolescents' behavioural performance during metacognitive decision-making in the word-learning task: transformed data are represented with error bars indicating standard error of the mean.

Difficulty rather than reward tended to modulate brain activity in the 17 year-old participants. The amygdala showed a significantly wider deactivation in old compared to young adolescents while deciding on items associated with high levels of difficulty (F (1, 41) = 4.51, p = 0.0400). This age effect adds to the global deactivating effect that difficulty has on bilateral amygdala activity, as discussed above (Results section 'Task-related imaging'; Supplementary Information 3). At the same time, in the other two valuation structures (VS: F (1, 41) = 3.19, p = 0.0810; vmPFC: F (1, 41) = 3.79, p = 0.0590) and again in the caudate nucleus (F (1, 41) = 3.25, p = 0.079), the response amplitude in old adolescents tended to be stronger while deciding on more difficult items, while in the young adolescents the pattern tended to be the inverse. In the vmPFC, which is part of the default mode network and therefore deactivated during task execution [94], the pattern appeared as stronger deactivation for anticipated difficult items. Although these age-dependent difficulty effects are only borderline significant, observing four hits in 14 tests by chance, even at the highest observed p-value (p = 0.0810), is statistically unlikely (cumulative binomial p = 0.0222). The three-way interaction age x reward x difficulty was not significant.

Functional connectivity
Seed-to-seed functional connectivity was computed for 14 spherical seeds, yielding 91 unique analyses. At a significance level of α = 0.05 (critical one-tailed t = 1.68), 14 tests revealed lower FC for old than young adolescents, which is significantly more than expected under the null hypothesis (cumulative binomial p = 0.0002, i.e., the probability of ≥14 positive tests). In the opposite direction, 18 tests showed higher FC for old than young adolescents (cumulative binomial p < 0.0001). Given these small probabilities, it is highly unlikely that the observed FC differences are chance observations.
The FC results are used here to address three hypotheses regarding adolescent brain maturation, derived from the previous findings discussed in the introduction. The first hypothesis is that FC between amygdala and VS reduces with age. Fig. 5B shows the significant changes in FC between striatal and amygdala seeds. One-tailed t-tests confirmed a main effect of age for our VS seed and the lateral amygdala, with negative FC in older, but not in younger adolescents (young M = 0.006, old M = −0.0413, t (45) = 4.13, one-tailed p < 0.0001). This confirms the hypothesis. No significant FC was evident between the accumbens seed and the amygdala (highest t (45) = 0.12, one-tailed p = 0.4496). Additionally, we found reduced FC between dorsal and lateral striatal seeds (caudate and putamen) and amygdala seeds (smallest t for caudate-lateral amygdala: t (45) = 1.93, one-tailed p = 0.0298).
The second hypothesis is that cognitive control maturation during adolescence is paralleled by increased FC between the medial PFC and the striatum. Fig. 5C shows that the dominant pattern of age-related differences in medial PFC-striatum connectivity is that of increased coupling: 11 ROI pairs had stronger coupling in older adolescents, targeting all striatal seeds, and only three had weaker coupling, confined to ventral striatal seeds. It is noteworthy, first, that the medial PFC region most implicated in age-related differences is ACC-pg, with different couplings to all striatal seeds, followed by TCC-p, with different couplings to three striatal seeds. In both cases, the coupling with the accumbens seed was weaker, while the coupling with the VS and caudo-ventral seeds was stronger in old adolescents. Second, the pattern of coupling differences with age was opposite for the ACC-sg seed, with stronger coupling to the accumbens seed and weaker coupling to the VS seed in old adolescents. This differentiation between ACC-sg and ACC-pg gives the impression of fine-tuning the specificity of the connection patterns, with ACC-sg interacting more exclusively with nucleus accumbens and ACC-pg (and TCC-p) more exclusively with VS and more dorsally located striatal regions.
The third hypothesis is that FC between medial prefrontal regions and the amygdala changes from co-fluctuation in children (i.e., positive correlations) to complementarity in adults (i.e., negative correlations). The coupling differences related to age between medial PFC and amygdala are presented in Fig. 5D. The amygdala coupling was (more) negative in old adolescents for the MCC seed (M young = 0.020; M old = −0.017; t (45) = 2.26, one-tailed p = 0.0144) and the TCC-a seed (M young = −0.001; M old = −0.030; t (45) = 2.11, one-tailed p = 0.0202), but it became more positive for the ACC-psg seed (M young = −0.010; M old = 0.035; t (45) = −2.41, one-tailed p = 0.0101). It is worth noting that the medial PFC seeds with age-related differences in amygdala connectivity were interleaved with the medial PFC seeds that showed age-related differences in striatal connectivity.

Discussion
The anticipated levels of reward and difficulty both played a role in differentiating the age groups during the metacognitive decision to be evaluated or not on the upcoming word-pair: young adolescents' brain activation showed sensitivity to the reward level associated with the word-pair, while old adolescents' brain activation tended to be more affected by the difficulty level of the word-pair. It is noteworthy that the same brain areas reflected this age-related difference between reward-related and difficulty-related modulation of activation, namely the valuation-related structures VS, vmPFC and amygdala, and the cognitive control related caudate nucleus. These differences in brain responsiveness with age co-occurred with age-related differences in low frequency functional coupling between these brain structures, in a way that suggests increased regulative cognitive control exerted within medial PFC-medial striatum cortico-striatal circuits.
The anticipated level of difficulty affected brain activation related to metacognitive decision-making differently in the two age groups. Activity modulations in the VS and the caudate nucleus tended to vary from a larger BOLD amplitude for easy items in 13 year-olds to a larger amplitude for difficult items in 17 year-olds. The stronger response for higher difficulty in the striatum with increasing age is in line with recent studies suggesting that VS and caudate nucleus are involved in the regulation of effort [95][96][97]. Chiu et al. [96], for instance, manipulated the control level required to identify pictures of famous actors by associating some faces frequently and others infrequently with incongruent names. They found that caudate nucleus activity reflected the trial-by-trial control level demanded by the different pictures. Similarly, Suzuki et al. [95] reported that the BOLD signal in a medial striatal region, stretching from VS into caudate nucleus, increased with the effort required to navigate through a virtual reality maze. This encoding of task difficulty was already evident during the anticipation phase prior to action [95]. In the light of these previous studies, the difficulty-related stronger activity in VS and caudate nucleus in 17 year-olds can be seen as an indication of the anticipated demand for cognitive control in the upcoming word-pair. This increased sensitivity to item difficulty is carried by an increased low frequency functional coupling between medial PFC (particularly ACC-pg and TCC-p seeds) and the medial striatum (particularly caudo-ventral and ventral seeds), suggesting a generally more intensive neural exchange between these structures in old than in young adolescents. The location of the most involved seeds, at the transition between the cognitive and affective loops of the cortico-striato-thalamic circuits, suggests increased regulation of motivational and emotional influences in metacognitive decision-making in old compared to young adolescents.
The interpretation that 17 year-old adolescents are more sensitive to implementing cognitive control receives further support from the finding that they showed stronger deactivation of the amygdala and the vmPFC during metacognitive decision-making. Increased cognitive conflict suppresses the amygdala's responsiveness to fearful stimuli, both in adults [98,99] and adolescents [100]. Moreover, working memory performance in patients with amygdala atrophy is facilitated compared to healthy controls [101], suggesting that the amygdala interferes with effortful cognitive performance. Similarly, difficulty-related deactivation of vmPFC is well established. The default mode network, of which vmPFC is part [102,103], was originally defined by its complementarity to cortical regions engaged during cognition [104,105] and the degree of its deactivation varies with the difficulty of the task [106,107]. This task-induced deactivation becomes more pronounced during adolescence [51]. While in 17 year-olds the deactivation was larger for deciding on high compared to low levels of difficulty associated with the word-pairs, in 13 year-olds it was larger for easy than difficult word-pairs. This inverse pattern of difficulty-related activation in 13 year-olds suggests a higher level of cognitive control while deciding on anticipated easy items. Consequently, we hypothesize in hindsight that the 13 year-olds attached more weight to the word-pairs associated with low levels of difficulty (they find it more difficult to decide on whether to be tested or not on the easy trials), whereas the 17 year-old adolescents attached more importance to the word-pairs associated with high levels of difficulty. In terms of FC, the emerging deactivation of amygdala and vmPFC in older adolescents is mirrored by the appearance of anti-coupling between MCC (and TCC-a) and medial amygdala, together with a positive coupling between amygdala and the seed closest to vmPFC, namely ACC-pg. This pattern agrees with the above-discussed deactivation of default mode network nodes, of which vmPFC and also amygdala are representatives [108], by the cognitive network nodes, of which MCC is a representative.
The anticipated reward level, too, affected neural activation during metacognitive decision-making differently in 13 and 17 year-old adolescents. The caudate nucleus in 13 year-olds showed a higher number of active voxels while deciding on low compared to high reward items, whereas reward level did not cause a different brain response in 17 year-old adolescents. Following the above reasoning, this would mean that the information that the upcoming word-pair is associated with low reward triggers a heightened control signal in 13 (but not in 17) year-olds. Apparently, 13 year-olds find it more difficult to decide on being tested when the anticipated reward is low, similar to when the item is anticipated to be easy. In addition, voxel activation patterns in the caudate nucleus and in the three valuation ROIs allow above-chance classification of multivoxel activity according to whether the items judged were associated with a low or a high reward in 13 but not in 17 year-old adolescents. Because the degree of decodability of task-relevant information has been shown to depend on the reward value of the information [109,110] and on the difficulty of the task [111][112][113], this result may again indicate that in 13 year-olds the reward level plays a more important role in making the decision to study or not than in 17 year-olds. In the FC analysis, the amygdala-to-VS coupling was less strong in old compared to young adolescents, whereas the coupling between most of the medial PFC seeds and the mid-medial striatum (caudo-ventral and ventral seeds) was stronger in old compared to young adolescents. These findings are compatible with the idea that affective responses (e.g., to rewards) are more regulated in older adolescents, while the cortico-striatal loops at the intersection of affect and cognition become more active.
In spite of these age differences at the level of brain activation and functional connectivity, the outcome of the metacognitive decisions and the speed of arriving at a decision did not differ between the age groups. This means that the differences in brain activation observed are not just reflections of a different strategy or a different interpretation of the experimental situation that adolescents find themselves in. The young adolescents arrived at the same behavioural result as did the old adolescents. Nevertheless, the neural (control) processes triggered by the events in this experimental situation did differ. A comparable neural difference in the absence of a behavioural difference is not unusual in reward-based decision-making tasks [58,114,115]. For instance, in Evers, Stiers and Ramaekers [114], participants were asked to decide between an anticipated smaller or larger reward in a simple 50% chance gamble paradigm, in which there was no relationship between the choice and obtaining the outcome. While this choice for the lower or higher anticipated reward did not affect the speed of the decision, the value chosen and the reward history in previous trials did reliably modulate activity in the VS during these choices. Moreover, reward expectations are reflected in VS activity without participants being conscious of these expectations [11].
The involvement of amygdala and VS in the developmental differences in value sensitivity is in line with the model proposed by Casey, Heller, Gee and Cohen [3]. This model proposes that excitatory signals from the amygdala to the VS are responsible for impulsivity in younger individuals when confronted with emotional stimuli. Due to a later developing, indirect regulatory loop from the medial prefrontal cortex to amygdala [42,46,116], this subcortical path becomes cortex-controlled in young adults. In our study, the amygdala's decodable distributed representation of reward level in 13 year-olds was replaced by a higher number of deactivated voxels for word-pairs associated with high compared to low levels of difficulty in 17 year-olds. These value-related activity changes are mirrored in the activity of VS, caudate nucleus and vmPFC. In young adolescents, caudate nucleus and vmPFC, and to a lesser extent VS, also allow reward level classification during metacognitive decision-making. Within the Casey et al. [3] model this would mean that the amygdala signal projected to VS resonates in the limbic-cortico-striatal loop and in the striatal cognition zone. This effect is no longer present in 17 year-old adolescents. Instead, at that age the strongest effect is deactivation of the amygdala while deciding on difficult compared to easy items. That suggests down-regulation of amygdala involvement in old adolescents. The fact that the direction of modulation is opposite to the modulation observed in the VS and caudate nucleus (in contrast to the reward effects in 13 year-olds, which were all in the same direction) suggests that it reflects the influence of stronger cognitive control. In the Casey et al. [3] model this is hypothesized to be the consequence of direct regulative control of medial PFC over amygdala. We found only weak support for this direct control, in the form of emerging anti-coupling between cognition-related medial PFC (MCC and TCC-a seeds) and medial amygdala. In fact, the observed age differences better fit the broader picture of complementarity between cognitive and default mode networks, which becomes more pronounced during adolescence [51]. Instead of direct cortical control over amygdala, we see improved functionality of the cortico-striatal decision-making circuitry. Namely, in older participants affective processes are more effectively deactivated during metacognitive decision-making, whereas deliberation in cortico-striatal circuits has gained in importance. It seems that for 17 year-olds the knowledge that the item will be more difficult makes the choice to be tested or not more difficult (lower choice value, more vmPFC deactivation, as discussed above), and that the affective value of the anticipated difficulty, which might interfere with the cognitively best choice, is suppressed (more widespread deactivation of amygdala voxels). Therefore, the results confirm our interpretation of the inconsistent findings in the literature, as discussed in the introduction, regarding VS responsiveness to reward: when the task incites the exertion of cognitive control, the "average" adolescent is able to suppress the affective responses towards value aspects of the task, but when the task does not provide any active control over the outcome, the "average" adolescent responds affectively to value aspects of the task. The "average" adolescent is a mixture of young and old adolescents. Our choice to work with clearly defined, narrow age groups allows us to conclude that the transition takes place grosso modo between the ages of 13 and 17 years.
The current effects are not strong despite the optimal ROI-based voxel selection and the narrow age ranges used. The relatively small sample size is a limitation, even though the post hoc power analysis indicated sufficient power to detect small to medium effects. Another contributing factor could be inter-individual differences in developmental trajectories, as well as other individual differences such as personality traits involved in valuation and metacognitive decision-making [20,117,118]. In the present study we focussed on sex-transcending age effects in metacognitive decision-making and its underlying functional brain patterns. Given that sex has an impact on neurocognitive maturation at both the brain and the behavioural level [119,120], we added sex as a regressor-of-no-interest in our GLM analyses. It would be interesting to examine sex-specific effects or interactions between sex and age on metacognitive decision-making in forthcoming research. For future studies, it would also be beneficial to define neurodevelopmental maturity in other ways than simply chronological age, such as by hormonal, neuroimaging and/or neurobehavioural measures. A more accurate picture could also be obtained by longitudinal follow-up studies through the age trajectory from 13 to 17 years of age, and beyond, to improve interpretation of age-related differences. Nevertheless, the observed pattern of brain activation differences in relation to reward and difficulty at ages 13 versus 17 years was consistent across ROIs, and supports the idea that the brain processes value information in making (educational) metacognitive decisions differently throughout adolescence. This finding has important implications for educational guidance and support of teenagers on different sides of this maturational marker. Adolescent age should be taken into account, for example, in ways to optimise learning behaviour: young adolescents might profit from associating rewards with studying, whereas old adolescents might need more challenging, difficult study material.

Fig. 1. Illustration of regions of interest (ROI) for the task-related fMRI analyses in the left hemisphere. ROI masks are overlaid on the Colin brain in MRIcron (https://www.nitrc.org/projects/mricron). Green color indicates ROIs of the cognitive control network: IFS, inferior frontal sulcus; dmPFC, dorsomedial prefrontal cortex; Caudate n., caudate nucleus. Red color indicates selected ROIs from the valuation network: vmPFC, ventromedial prefrontal cortex; ventral striatum; and amygdala. The crosshair demarcates the MNI centre position (0, 0, 0 mm).

Fig. 3. Age-dependent modulation of brain activity by the reward level associated with word-pairs during metacognitive decision-making. A) Accuracies of the MVP decoding of reward level in ROI-specific trial-wise brain activity data in 13 or 17 year-old adolescents (per ROI: 13 year-old left hemisphere, 13 year-old right hemisphere, 17 year-old left hemisphere, 17 year-old right hemisphere). Accuracies per participant and ROI were Z-transformed based on mean and SD of 250 randomization accuracies for the same data. B) Percent signal change (PSC) difference calculated as PSC during high minus low level of reward associated with word-pairs in 13 year-old (left, dark-grey/red bars) and 17 year-old adolescents (right, light-grey/blue bars). C) Difference in number of activated voxels calculated as voxel count of high minus low level of reward associated with word-pairs in 13 year-old (left, dark-grey/red bars) and 17 year-old adolescents (right, light-grey/blue bars). Statistical notes. Values shown are Mean±SE; Statistical significance: (A) paired t-tests; (B) and (C): age-by-value ANOVA interaction; Significance level: **/red or blue bar, p < 0.01; */red or blue bar, p < 0.05. # These ROIs were tested for negative responses, i.e., deactivated voxels.

Fig. 4. Age-dependent modulation of brain activity by the anticipated word-pair difficulty during metacognitive decision-making. A) Accuracies of the MVP decoding of difficulty level associated with the word-pair in ROI-specific trial-wise brain activity data in 13 or 17 year-old adolescents (per ROI: 13 year-old left hemisphere, 13 year-old right hemisphere, 17 year-old left hemisphere, 17 year-old right hemisphere). Accuracies per participant and ROI were Z-transformed based on mean and SD of 250 randomization accuracies for the same data. B) Percent signal change (PSC) difference calculated as PSC during high minus low levels of difficulty associated with the word-pairs in 13 year-old (left, dark-grey/red bars) and 17 year-old adolescents (right, light-grey/blue bars). C) Difference in number of activated voxels calculated as voxel count of high minus low level of difficulty associated with word-pairs in 13 year-old (left, dark-grey/red bars) and 17 year-old adolescents (right, light-grey/blue bars). Statistical notes. Values shown are Mean±SE; Statistical significance: (A) paired t-tests; (B) and (C): age-by-value ANOVA interaction; Significance level: **/red or blue bar, p < 0.01; */red or blue bar, p < 0.05; ( * ) /pale red or pale blue, 0.05 < p < 0.1. # These ROIs were tested for negative responses, i.e., deactivated voxels.
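The Z-transformation of decoding accuracies mentioned in the captions of Fig. 3 and Fig. 4 can be sketched as follows. This is an illustration only: the simplified randomization (shuffling the labels against fixed predictions rather than retraining a classifier), the use of the sample standard deviation, and all function names are assumptions that are not stated in the captions.

```python
import random
from statistics import mean, stdev

def accuracy(predicted, labels):
    """Proportion of trials on which the predicted label matches the true label."""
    return sum(p == t for p, t in zip(predicted, labels)) / len(labels)

def randomization_accuracies(predicted, labels, n_randomizations=250, seed=0):
    """Accuracies obtained after randomly shuffling the trial labels."""
    rng = random.Random(seed)
    shuffled = list(labels)
    accuracies = []
    for _ in range(n_randomizations):
        rng.shuffle(shuffled)
        accuracies.append(accuracy(predicted, shuffled))
    return accuracies

def z_transform(observed_accuracy, null_accuracies):
    """Express an observed accuracy in SD units of the randomization distribution."""
    return (observed_accuracy - mean(null_accuracies)) / stdev(null_accuracies)

# Hypothetical participant: 40 trials (20 high, 20 low), 28 classified correctly
labels = [1] * 20 + [0] * 20
predicted = [1] * 14 + [0] * 6 + [0] * 14 + [1] * 6
null_accs = randomization_accuracies(predicted, labels)
z_score = z_transform(accuracy(predicted, labels), null_accs)
```

Expressing accuracy relative to each participant's own randomization distribution puts ROIs with different numbers of voxels on a common scale before the group-level t-tests.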