Introduction

Working memory, an executive function that supports higher-level cognition (e.g., fluid intelligence), has been extensively examined using the N-back task, often with brain imaging to investigate the underpinning neural processes1,2,3,4. Typically, blocks with memory load (e.g., 2-back), in which participants are required to track and remember the sequence of stimuli, and blocks of 0-back, in which participants are required to detect a pre-specified target, are inter-mixed5,6. As 2-back requires both attentional monitoring and working memory to keep track of the stimuli while 0-back requires only attentional monitoring of the target, a contrast of brain activation during 2-versus 0-back would highlight regional responses specific to working memory. Investigators quantify the accuracy rate (AR) and reaction time (RT) to evaluate individual performance, with a higher AR and shorter RT during correct responses of 2-relative to 0-back indicating superior working memory. However, whereas abundant research has characterized the regional responses to N-back memory, relatively little is known about the neural processes supporting individual difference in AR2–0 or RT2–0 or whether these neural processes are inter-related. Earlier work showed that activations of bilateral premotor and/or lateral prefrontal cortex (PFC) during 3- vs. 1-back or fixation were associated with higher 3-back accuracy across subjects7,8. A more recent study demonstrated that the linear slope of load-related activity (from 1- to 6-back) of the left lateral PFC was associated with individual accuracy in target identification3. No studies to our knowledge have examined the neural correlates of RT or addressed whether or how the neural correlates of AR and RT may be inter-related in the N-back task.

Distinguishing the neural mechanism subserving individual variation in AR2–0 and RT2–0 is of conceptual interest in understanding the psychological constructs of working memory. On one hand, successful performance requires both maintenance of the stimuli in the memory (keeping track of the 1-back stimulus when the 2-back stimulus represents a potential target) and focus switching (updating the target identity with the appearance of each new stimulus)9. Thus, a high AR reflects superior capacity in maintenance and updating, whereas a short RT more narrowly reflects the ability in target switching/updating. Individuals with neuropsychiatric conditions typically perform worse compared to healthy controls by showing diminished AR and prolonged RT in the N-back task10. On the other hand, with few exceptions (e.g.11, the great majority of N-back studies have not purposely constrained the RT by imposing a response window; hence, participants may slow down to optimize AR. In other words, it is possible that participants are slower but more accurate in identifying the target, resulting in a ceiling effect on the AR and rendering the RT of correct responses a more sensitive metric of individual performance12,13. Examining the potentially distinct neural underpinnings of the AR and RT (i.e., AR2–0 and RT2–0) would facilitate research of working memory and how working memory relates to higher-level cognition such as fluid intelligence (Gf).

Working memory is strongly related to Gf. As one of the primary predictors of Gf, working memory contributes to Gf via the process of central executive or cognitive control14,15,16,17,18. Moreover, training on working memory appears to improve Gf19,20. The close link between working memory and Gf may result from shared underlying neural mechanisms. For instance, bilateral prefrontal and parietal cortical activity accounts for a significant proportion of the shared variance in working memory and Gf21. Nevertheless, it remains unclear whether or how individual variation in Gf is reflected in the AR or RT of the N-back task and how regional brain activation inter-links Gf and the performance measures.

The current study aimed to address these issues by using a large data set curated from the Human Connectome Project (n = 949). We contrasted brain activities between the 2- and 0-back blocks to control for the effects of attentional monitoring within subjects. We computed a critical success index (CSI) to reflect AR and the RT of correct trials for individual subjects. We performed a whole-brain regression of 2- vs. 0-back each against the block differences (i.e., 2- minus 0-back) in CSI2–0 and in RT2–0, with age, sex and years of education as covariates, to identify the regional correlates. We also performed a whole-brain linear regression against the PMAT24_A_CR, the number of correct responses in the Raven’s Standard Progressive Matrices, to identify regional responses to Gf. We hypothesized that individual variations in CSI2–0 and RT2–0 are associated with both distinct and shared regional activities and that some of these regional processes reflect individual variation in Gf.

Materials and methods

Data set

We employed the 1200 Subjects Release (S1200) data set, including behavioral and 3 T MR imaging data of 1206 healthy young adults collected from 2012 to 2015, for this study. Of all 1206 subjects, 124 did not participate or participate fully in the N-back task. Further, 133 subjects who had head movements greater than 2 mm in translation or 2 degrees in rotation or for whom the images failed in registration to the template were excluded. As a result, a total of 949 (493 women; 22–37 with mean ± SD = 28.8 ± 3.7 years) were included in the current study. Individuals were without documented history of psychiatric, including developmental (e.g., autism), neurological (e.g., Parkinson’s disease), or medical (e.g., diabetes) disorders known to influence brain function. Twins/non-twins born prior to 34/37 weeks of gestation were excluded. The HCP included smokers, alcohol drinkers and users of illicit substances as long as they did not experience severe symptoms (e.g., inability to stop using; substance abuse despite health consequences) or receive treatment for substance use, so that the data collected of the study reflected the broader populations22. We have obtained permission from the HCP to use the Open and Restricted Access data. Participants provided written informed consent and all aspects of the study, including subject recruitment, experimental procedures were conducted according to a protocol in accordance with the Declaration of Helsinki and approved by the Washington University Institutional Review Board (IRB #201204036; title: “Mapping the Human Connectome: Structure, Function and Heritability”).

The Raven’s Standard Progressive Matrices (RSPM) is a 60-item test of abstract reasoning, a nonverbal estimate of fluid intelligence (Gf). All HCP participants were evaluated with the Form A, an abbreviated version of the RSPM with 24 items and 3 bonus items, arranged in order of increasing difficulty23. Participants were instructed to complete all items or until they made 5 incorrect responses in a row. The total number of correct responses was coded as PMAT24_A_CR, which we employed as an index of Gf in the current work.

N-back task

Each subject completed two runs each of eight blocks (four 0- and four 2-back) of the N-back task in a fixed order (first run: 2,0,2,0,2,2,0,0; second run: 2,0,2,0,0,2,0,2). Four categories of stimuli (body part, face, place, tool) were used in individual blocks. The first run was shown in Fig. 1A. In each block a cue was presented for 2.5 s to indicate the current task (0- or 2-back, including target for 0-back) at block start. In 0-back participants were to identify the specified target and in 2-back blocks participants identified the target, a cue that was the same as the one that appeared two time steps back. A “null” block of 15 s was inserted every two blocks in the task. There was a total of 10 trials in each block, of which 2 were targets and 2–3 were non-target lures (i.e., same items in wrong n-back position, either 1-back or 3-back). In each trial, the stimulus was presented for 2 s, followed by an inter-trial interval of 0.5 s. Before undergoing MR scans, participants were engaged in a practice session in a mock scanner to become familiar with the task and acclimated to the environment22.

Figure 1
figure 1

N-back task and the correlation of CSI2–0 (%) and RT2–0 (ms) with Gf. (A) Block sequence of the first run of the N-back task. Linear regression of (B) the difference in critical success index (CSI) vs. the difference in reaction time (RT) of correct trials between 2- and 0-back: CSI2–0 vs. RT2–0 (r = -0.350, Cohen’s f2 = 0.140, p = 1.1588 × 10–28); (C) Gf, as indexed by PMAT24_A_CR vs. CSI2–0 (r = 0.006, Cohen’s f2 = 3.6 × 10–5, p = 0.847); and (D) PMAT24_A_CR vs. RT2–0 (r = 0.155, Cohen’s f2 = 0.0246, p = 0.000002), with age, sex, and years of education as covariates. Each data point represents one subject. B, C and D were generated by SPSS Statistics 22.0 (https://www.ibm.com/support/pages/spss-statistics-220-available-download).

We used the Critical Success Index (CSI) to evaluate N-back performance. A modified estimate of percent accuracy, the CSI was defined as: hits number (correct intentional responses) divided by the sum of hits, false alarms (incorrect intentional responses), and misses (incorrect intentional non-responses)24,25. Because correct intentional non-responses (“rejections”) could not be discriminated from correct unintentional non-responses, the CSI was preferred over standard percent accuracy. That is, percent accuracy as computed conventionally resulted in overinflated accuracy estimates, especially for designs with a high percentage of non-response trials, as in the present study ( 80% of trials). Thus, the CSI provided a performance measure less biased by the ambiguity of non-response trials24,25.

For linear regression between CSI and RT, CSI and Gf, RT and Gf, the effect sizes were quantified by Cohen’s f2 = r2/(1 − r2) and interpreted following the convention: 0.02 ~ small, 0.15 ~ medium, 0.35 ~ large26.

Imaging protocol and data preprocessing

In the HCP imaging protocol27, MRI scanning was done using a customized 3 T Siemens Connectome Skyra using a standard 32-channel Siemens receiver head coil and a body transmission coil. T1-weighted high-resolution structural images were acquired using a 3D MPRAGE sequence with 0.7 mm isotropic resolution (FOV = 224 × 224 mm, matrix = 320 × 320, 256 sagittal slices, TR = 2400 ms, TE = 2.14 ms, TI = 1000 ms, FA = 8°) and used to register functional MRI data to a standard brain space. N-back fMRI data were collected using gradient-echo echo-planar imaging (EPI) with 2.0 mm isotropic resolution (FOV = 208 × 180 mm, matrix = 104 × 90, 72 slices, TR = 720 ms, TE = 33.1 ms, FA = 52°, multi-band factor = 8, 405 frames, ~ 4 m and 51.6 s/run).

We followed the same published routines in our earlier studies28,29. Imaging data were preprocessed using SPM8. Images of each individual subject were first realigned (motion corrected). A mean functional image volume was constructed for each subject from the realigned image volumes. These mean images were co-registered with the MPRAGE image and then segmented for normalization with affine registration followed by nonlinear transformation. The normalization parameters determined for the structural volume were then applied to the corresponding functional image volumes for each subject. Finally, the images were smoothed with a Gaussian kernel of 4 mm at Full Width at Half Maximum.

Imaging data modeling and statistics

We modeled the BOLD signals to identify 2-back and 0-back responses. We followed our previous routine in image data modeling28,29. A statistical analytical block design was constructed for each individual subject, using a general linear model (GLM) by convolving the canonical hemodynamic response function (HRF) with a boxcar function in SPM. Realignment parameters in all six dimensions were entered in the model as covariates. The GLM estimated the component of variance that could be explained by each of the regressors.

In the first-level analysis, we constructed for each individual subject a statistical contrast “2- minus 0-back” for second-level, group analyses. In group analyses, we conducted a one-sample t test of the contrast “2- minus 0-back” to identify regional responses to working memory. In addition to the T maps, effect size maps were computed using tools available in CAT12 toolbox (http://www.neuro.uni-jena.de/cat/), by approximating Cohen’s d26 from the t-statistics using the expression \(d=\frac{2t}{\sqrt{df}}\) as employed in30. To examine how regional brain activations to working memory varied across subjects in relation to accuracy and RT, we conducted whole-brain multiple regressions each on the contrast (2- minus 0-back) against differences in critical success index (2- minus 0- back; CSI2–0) and differences in RT (2- minus 0-back; RT2–0) of correct trials only as the regressor, with age, sex and years of education as covariates. We performed another whole-brain multiple regression on the contrast (2- minus 0-back) against PMAT24_A_CR with the same covariates.

We evaluated the results at voxel p < 0.05, corrected for family-wise error (FWE) of multiple comparisons, on the basis of Gaussian random field theory, as implemented in SPM. We identified brain regions using the Data Processing & Analysis of Brain Imaging toolbox (DPABI)31 and an atlas32, if the peak was not identified by the DPABI.

Path analyses

We employed path analysis to evaluate how CSI2–0, RT2–0, and the neural correlates (see “Results”) were inter-related. Model fit was assessed with standard fit indices which included the Root Mean Square Estimation of Approximation (RMSEA, < 0.08 for an acceptable fit), Chi-square (χ2/df, < 3), Comparative Fit Index (CFI, > 0.9), and Standardized Root Mean Square Residual (SRMR, < 0.06)33,34,35.

Results

Behavioral performance and its relationship to fluid intelligence

Subjects averaged at a CSI of 73.9 ± 22.3 (mean ± S.D.) % in 0-back and 58.0 ± 19.6% in 2-back, and a RT (correct trials only) of 790 ± 138 ms in 0-back and 989 ± 137 ms in 2-back. The CSI was significantly lower in 2- than in 0-back (t = − 21.695, Cohen’s d = -1.14, p = 4.6532 × 10–85, paired-sample t test) and the RT was significantly longer in 2- than in 0-back (t = 49.217, Cohen’s d = 3.20, p = 2.3886 × 10–263, paired-sample t test). Across subjects, the differences in CSI between 2- and 0- back, i.e., CSI2–0, was negatively correlated with the differences in RT between 2- and 0- back, i.e., RT2–0 (r = -0.350, Cohen’s f2 = 0.140, p = 1.1588 × 10–28, Pearson regression with age, sex and years of education as covariates, Fig. 1B). That is, higher CSI2–0 was associated with smaller RT2–0.

Across subjects, the Gf, as indexed by the PMAT24_A_CR, was not correlated with CSI2–0 (r = 0.006, Cohen’s f2 = 3.6 × 10–5, p = 0.847, Fig. 1C) but positively correlated with RT2–0 with a small effect size (r = 0.155, Cohen’s f2 = 0.0246, p = 0.000002, Fig. 1D) in Pearson regression with age, sex and years of education as covariates.

Regional activations to 2- vs. 0-back

Figure 2 shows the results of one-sample t test of 2- vs. 0-back in a whole-brain analysis. Two- vs. 0-back involved higher activation in bilateral superior/middle/inferior frontal cortex, bilateral inferior parietal cortex, medial frontal cortex in the area of pre-supplementary motor area (preSMA), extending to an area just anterior to the dorsal anterior cingulate cortex (dACC), bilateral caudate/lentiform nucleus, anterior thalamus, anterior insula, superior parietal lobule, including the dorsal precuneus. Conversely, 0- vs. 2-back involved higher activation in the dACC, middle/posterior cingulate cortex, bilateral somatomotor cortex, paracentral lobule, ventral precuneus, frontopolar cortex, and thalamus in the area of pulvinar and habenula. Many of these brain regions were contiguous to form larger clusters, as summarized in Supplementary Table S1.

Figure 2
figure 2

One-sample T-test of the contrast 2- minus 0-back. Voxel p < 0.05, FWE corrected. Voxels showing higher activity during 2- vs. 0-back and 0- vs. 2-back are shown in warm and cool colors, respectively. BOLD contrasts are overlaid on a structural image in axial sections from z = − 16 to + 64 with 8 mm gaps. Color bars show voxel T values and the corresponding Cohen’s d scores. Neurological orientation: right = right. The inset shows a mid-sagittal section of the clusters. Figures 2, 3, 4, 5A and 7A were generated by DPABI_V4.0_190305 (http://rfmri.org/dpabi).

Regional activations to 2- vs. 0-back in correlation with CSI2–0 and RT2–0

Whole-brain linear regression against CSI2–0 shows regional activations in Fig. 3 (clusters summarized in Table 1). Briefly, CSI2–0 was correlated positively with activation of a small cluster located in the dorsal anterior cingulate cortex (dACC) bordering the supplementary motor area (SMA) and negatively with activation of the preSMA, bilateral frontoparietal cortex (biFPC), and right anterior insula (rAI). Figure 4 shows regional activations to 2- vs. 0-back in correlation with RT2–0 (clusters summarized in Table 2). RT2–0 was positively correlated with activation in biFPC, preSMA, rAI, caudate head and dorsal precuneus, and negatively correlated with activation of a large cluster extending from the midcingulate cortex to paracentral lobule, ventral precuneus, bilateral primary motor cortex, middle/posterior insula, and right superior temporal sulcus.

Figure 3
figure 3

Whole-brain multiple regression against CSI2–0 with age, sex, and years of education as covariates. Voxel p < 0.05, FWE corrected. Voxels in warm/cool colors show positive/negative correlations. Color bars show voxel T values and the corresponding Cohen’s d scores. The inset shows a mid-sagittal section of the clusters.

Table 1 Clusters showing correlation with CSI2–0 (Critical Success Index), with age, sex, years of education as covariates.
Figure 4
figure 4

Whole-brain multiple regression against RT2–0 (correct trials only) with age, sex, and years of education as covariates. Voxel p < 0.05, FWE corrected. Voxels in warm/cool colors show positive/negative correlations. Color bars show voxel T values and the corresponding Cohen’s d scores. The inset shows a mid-sagittal section of the clusters.

Table 2 Clusters showing correlation with RT2–0 (correct trials only), with age, sex, years of education as covariates.

A number of clusters showed positive correlation with CSI2–0 and negative correlation with RT2–0 (CSI + RT −; dACC) or negative correlation with CSI2–0 and positive correlation with RT2–0 (CSI − RT +; preSMA, biFPC, and rAI) (Fig. 5). These shared regional activities may represent the neural substrates interlinking accuracy and RT in the N-back task. Thus, we performed path analyses to examine the inter-relationship between the shared correlates (CSI + RT − or CSI − RT +), CSI2–0, and RT2–0. For the sake of completeness, we evaluated all 12 models, although the models with CSI + RT − and CSI − RT + as dependent variables were conceptually unlikely. The results of path analyses showed the model CSI + RT- → RT2–0 → CSI2–0 with the best fit (Fig. 6), suggesting that the dACC may facilitate target switching during the stimulus stream and target identification accuracy. Supplementary Table S2 shows the statistics of all other models (Supplementary Fig. S2). In contrast, the counterpart CSI − RT +  → RT2–0 → CSI2–0 or any other models did not show a significant model fit.

Figure 5
figure 5

(A) A number of clusters showed positive correlation with CSI2–0 and negative correlation with RT2–0 (CSI + RT −), highlighted in green, or negative correlation with CSI2–0 and positive correlation with RT2–0 (CSI − RT +), highlighted in magenta. The CSI + RT − cluster comprised solely of the dACC. Linear regression of (B) beta estimates of green cluster (CSI + RT −) vs. the difference in CSI between 2- and 0-back: CSI + RT − vs. CSI2–0; (C) beta estimates of green cluster (CSI + RT −) vs. the difference in RT of correct trials between 2- and 0-back: CSI + RT − vs. RT2–0; (D) CSI − RT + vs. CSI2–0; and (E) CSI − RT + vs. RT2–0. Note that the scatter plots show the residuals, after age, sex, and years of education were accounted for. (BE) were generated by SPSS Statistics 22.0 (https://www.ibm.com/support/pages/spss-statistics-220-available-download).

Figure 6
figure 6

The model of CSI + RT − → RT2–0 → CSI2–0, which showed the best fit of all 12 models, is shown with path coefficients, covariates and covariance structures. Figure was generated by IBM SPSS Amos 22 (https://www.ibm.com/support/pages/downloading-ibm-spss-amos-22).

Regional activations to 2- vs. 0-back in correlation with Gf (PMAT24_A_CR)

Figure 7 shows regional activation in positive correlation with PMAT24_A_CR, with age, sex and years of education as covariates. Summarized in Table 3, these clusters involved biFPC, preSMA, dorsal precuneus, and rAI. Almost all of the clusters overlapped with those with activities in positive correlation with RT2–0 as highlighted in magenta in Fig. 5A. No clusters showed activation in negative correlation with PMAT24_A_CR. Because the correlation between PMAT24_A_CR and RT2–0 showed a very small effect size (Cohen’s f2 = 0.0246), we did not follow up with path or mediation analyses on these variables.

Figure 7
figure 7

(A) Whole-brain multiple regression against PMAT24_A_CR with age, sex, and years of education as covariates. Voxel p < 0.05, FWE corrected. The clusters are summarized in Table 3. Clusters in warm color show higher activation during 2- vs. 0-back blocks in positive correlation with PMAT24_A_CR. Color bars show voxel T values and the corresponding Cohen’s d scores. The inset shows a mid-sagittal section of the clusters. The voxels (almost all) with activation in positive correlation with both Gf and RT2–0 are highlighted in magenta. Linear regression of (B) beta estimates of the magenta clusters (Gf + RT +) vs. Gf (PMAT24_A_CR): Gf + RT + vs. Gf; and (C) Gf + RT + vs. RT2–0. Note that the scatter plots show the residuals, after age, sex, and years of education were accounted for. B and C were generated by SPSS Statistics 22.0 (https://www.ibm.com/support/pages/spss-statistics-220-available-download).

Table 3 Clusters showing correlation with PMAT24_A_CR, with age, sex, years of education as covariates.

To determine if the working memory-specific neural correlates engaged during the n-back task and predicting working memory performance also reflect Gf, we computed the beta values of the clusters and ran a regression of the beta value against PMAT performance scores (Gf). The results showed that the beta value of (CSI + RT −; dACC) was not correlated with Gf (r = − 0.013, p = 0.698, Cohen’s f2 = 0.000169). The beta value of (CSI − RT + ; preSMA, biFPC, and rAI) was only weakly correlated with Gf (r = 0.130, p = 0.000062, Cohen’s f2 = 0.01719).

Discussion

Bilateral superior/middle/inferior frontal cortex, bilateral inferior parietal cortex, dorsomedial prefrontal cortex (in the area of the pre-SMA), bilateral caudate head, thalamus, and anterior insula showed higher activation during 2- vs. 0-back, in accord with earlier findings14,36,37,38,39,40,41,42,43,44. In the aims to characterize individual variation in behavioral performance and the neural correlates, we showed that, first, CSI2–0 was negatively correlated with the RT2–0, suggesting that “taking time to identify the target” did not improve the accuracy in 2- vs. 0-back; participants who performed with higher accuracy were also faster in identifying the target. The dACC showed activities during 2- vs. 0-back both in positive correlation with CSI2–0 and negative correlation with RT2–0. Further, path analyses showed a most significant fit of the model dACC → RT2–0 → CSI2–0, suggesting a critical role of target switching and the dACC in determining performance accuracy. Individual variations in RT2–0 and Gf were positively correlated, though only with a small effect size, whereas CSI2–0 and Gf were not significantly correlated. We highlighted the main findings in discussion.

Individual variation of CSI2–0 and RT2–0 in the N-back task

Individuals differ markedly in working memory performance, as reflected in the accuracy and RT (Supplementary Fig. S1). Here, the CSI2–0 was negatively correlated with the RT2–0, indicating that participants who were more accurate were also faster in identifying 2- vs. 0-back target. Conventionally, accuracy is emphasized without specific constraint on response time in the N-back task, which can lead to a ceiling or near-ceiling effect13,45. That is, individuals may intentionally slow down to optimize accuracy. However, we observed here that the CSI2–0 and RT2–0 were negatively correlated across subjects, suggesting that prolonging RT did not confer an advantage in achieving higher accuracy.

In whole-brain linear regressions we identified the correlates of individual variation in CSI2–0 and RT2–0. CSI2–0 was correlated with higher activation of a small cluster on the border of dACC and SMA and with lower activation of the preSMA, bilateral FPC and right AI. Despite a more limited coefficient of variation (CV = SD/mean; 0.61 for RT2–0 vs. 2.04 for CSI2–0), RT2–0 was associated with a wider swath of regional responses, in the preSMA, biFPC, thalamus, basal ganglia, dorsal precuneus, rAI, and cerebellum in positive correlation and in the dACC, middle and posterior cingulate, ventral precuneus, bilateral somatomotor cortex in negative correlation. This finding may suggest RT as a more sensitive index of regional responses to support working memory13 and individual variation reflecting not only differences in memory capacity but also the efficiency in utilizing the memory46,47. The preSMA showed higher activation in correlation with RT2–0, consistent with its role in decision making and controlled actions48,49. In contrast, somatomotor cortex showed higher activities in support of speedier responses during 2- vs. 0- back, consistent with earlier reports of motor cortical activities in relation to RT50,51.

It is possible that CSI2–0 may be determined with a number of different neural processes, including encoding, maintenance and target-updating during stimulus presentation, with each engaged to different degrees across subjects that altogether accounted for the individual variation in CSI2–0. In contrast, RT2–0 may more specifically reflect the process of target updating and identification, allowing its neural correlate to reveal in group regression. This contrast is reminiscent of earlier findings from the stop signal task52,53,54. Whereas individuals exhibited behavioral slowing following both stop success (SS) and stop error (SE) trials, a direct contrast between post-SE and post-go trials identified right hemispheric ventrolateral PFC activation but one between post-SS and post-go trials failed to demonstrate regional activities. We similarly argued that post-SS likely involved more complex mental processing, including motor hesitation, which differed too extensively across subjects to yield a consistent pattern of regional responses53.

Neural processes inter-relating CSI2–0 and RT2–0

The dACC showed lower activation during 2- vs. 0- back blocks (Fig. 2), as also demonstrated in an earlier study55, seemingly in contrast with the role of the ACC in cognitive control. However, we observed across the whole-brain regressions dACC activity in positive correlation with CSI2–0 and negative correlation with RT2–0. Thus, less diminution of dACC activity during 2- vs. 0- back is associated with more efficient response and higher accuracy. These findings together suggest that while dACC is overall less engaged during 2- vs. 0- back, a greater extent of dACC engagement during 2-back would facilitate target identification. An extensive body of work demonstrated that the dACC responds to saliency and set switching56,57,58,59,60. For instance, in the monetary delay incentive task, pupil dilations were linked to increased activity in the dACC, which may trigger an increase in arousal to enhance task performance61. It is likely that the dACC was more engaged in 0- vs. 2- back because 0-back required simply target detection and focused, moment-to-moment attention whereas 2-back required attention to be distributed over the sequence of stimuli. On the other hand, higher dACC activity in switching the target identity would facilitate N-back performance. Notably, without clearly distinguishing the subregions, studies have reported higher activation of the medial prefrontal cortex to 2- vs. 0-back, but a closer examination revealed that the clusters, with Z coordinates ranging from + 40 to + 44, appeared to be largely in the preSMA44,62, as we also observed here.

A number of brain regions, including the preSMA, biFPC and rAI showed activation during 2- vs. 0-back in negative correlation with CSI2–0 and positive correlation with RT2–0. As described earlier, the preSMA is widely implicated in volitional, controlled action and decision making. For instance, the preSMA monitored conflict and facilitated slowing of motor response as a result of expected conflicts49. Bilateral FPC likewise is known for its role in restraining impulsive responses63,64, consistent with the current findings. While also considered as part of the salience network61,65,66,67,68, the AI showed higher activation during 2- vs. 0-back, suggesting that saliency alone cannot account for these regional activities. A recent study showed that the rAI increased in activity monotonically as a function of cognitive load in a backward masking majority function task41, broadly in accord with the current finding of higher response during 2- vs. 0-back. The AI has also been implicated in higher demand of effort across many other behavioral paradigms69,70,71. Together, these considerations highlight the multiple component processes involved in working memory; how areal activations are dedicated specifically to these component processes remain to be investigated.

Importantly, we showed in path analyses the model with the most significant fit: dACC activity → RT2–0 → CSI2–0, suggesting that, by way of enhancing target switching, the dACC decreases the RT and facilitates accuracy during 2- vs. 0- back. In contrast, the preSMA, biFPC and rAI did not significantly form paths with RT2–0 → CSI2–0. These findings support a central role of the dACC in supporting N-back performance, whereas the CSI − RT + clusters—preSMA, biFPC, and rAI—did not appear to partake specifically in relating RT2–0 to CSI2–0.

Working memory and the Gf

Gf was not significantly correlated with CSI2–0 and only correlated with RT2–0 with a small effect size. Thus, Gf did not appear to be well reflected in individual N-back performance. On the other hand, both Gf and RT2–0 shared activations in positive correlation in the preSMA, bilateral but predominantly right FPC, rAI and the dorsal precuneus. These findings suggest that Gf was at best marginally captured by individual differences in RT2–0 and, to the extent these variances could be accounted for by regional brain activities, the activities reflect slower and perhaps more cautious responding during 2- vs. 0-back. While these results appear to be consistent with an earlier literature associating Gf with activities of the cognitive control network72,73,74 and with the parieto-frontal integration hypothesis of human intelligence75,76,77, a large proportion of the variance in Gf was notably not explained by RT2–0 or shared regional activities.

The same brain regions showed higher activations proportionally to the extent to which participants anticipated conflict and slowed down in the stop signal task49,56,78. This finding suggested that individuals with higher Gf were more inclined to strategize their RT and the preSMA, right FPC and AI support these behavioral processes. Post-conflict slowing reflects cognitive control. In the cognitive control model of human intelligence, cognitive control serves as a core component of working memory and intellectual abilities, especially the Gf, and drives the relationships between these constructs14. Regional activations in behavioral paradigms other than the N-back task may better capture individual differences in Gf.

Limitations of the study and conclusions

A few limitations need to be considered. First, in the HCP N-back task, the stimulus was presented for 2 s, followed by an inter-trial interval of 0.5 s, in each trial, which allowed participants plenty of time (2.5 s in total) to respond to the target. However, the mean RTs were 989 and 790 ms for 2- and 0-back, respectively, suggesting that the majority of participants responded well before the trial ended most of the time. We reviewed 18 studies from the literature and noted that this appeared to be typically the case (Supplementary Table S3). Although we cannot speculate whether or how participants were engaged in the decision about speed-accuracy trade-off on the basis of these data, it is likely that regional brain activities were dictated by these task parameters and performance metrics, and the current findings should be considered as specific to the HCP. A study that systematically manipulates the task constraints within the same group of participants would be needed to thoroughly investigate how task parameters influence speed and accuracy and the neural processes underlying individual speed-accuracy trade-off in the N-back task. Second, although working memory is central to fluid intelligence, to what extent the N-back performance reflects Gf, as evaluated by the RSPM, represents a potential issue6 and suggests the need to consider the current findings on Gf as specific to RSPM. Third, the voxels that showed overlap between two sets of regressions were not identified on the basis of a statistical procedure. However, to our knowledge, there are no formal statistical approaches to assessing the significance of “overlap” between two sets of regressions (i.e., two different models). Conjunction or disjunction analyses, as implemented in SPM, could only be performed for different contrasts within the same GLM. Finally, although the HCP aimed to recruit “healthy populations,” the participants were heterogeneous in clinical characteristics, including many with history of or current substance use. Whereas we did not control for these variables, in the hope that, as true to the HCP, the data may reflect a broader population, the influences of the variables on the current findings remain to be clarified.

In conclusion, our findings highlight the key neural correlates of N-back performance metrics. The findings suggest the RT as a more sensitive measure of N-back performance and a key role of the dACC in supporting efficient target identification during 2- vs. 0-back.