Brain network dynamics during spontaneous strategy shifts and incremental task optimization

With practice, humans may improve their performance in a task by either optimizing a known strategy or discovering a novel, potentially more fruitful strategy. How does the brain support these two fundamental abilities? In the present experiment, subjects performed a simple perceptual decision-making task. They could either use and progressively optimize an instructed strategy based on stimulus position, or spontaneously devise and then use a new strategy based on stimulus color. We investigated how local and long-range BOLD coherence behave during these two types of strategy learning by applying a recently developed unsupervised fMRI analysis technique that was specifically designed to probe the presence of transient correlations. Converging evidence showed that the posterior portion of the default network, i.e. the precuneus and the angular gyrus bilaterally, has a central role in the optimization of the current strategy: these regions encoded the relevant spatial information and increased both their level of local coherence and the strength of their connectivity with other relevant regions in the brain (e.g. visual cortex, dorsal attention network). This increase was proportional to the task optimization achieved by subjects, as measured by the reduction of reaction times, and was transiently disrupted when subjects were forced to change strategy. By contrast, the anterior portion of the default network (i.e. medial prefrontal cortex), together with the rostral portion of the fronto-parietal network, showed an increase in local coherence and connectivity only in subjects who would at some point spontaneously choose the new strategy. Overall, our findings shed light on the dynamic interactions between regions related to attention and to cognitive control, underlying the balance between strategy exploration and exploitation.
Results suggest that the default network, far from being “shut-down” during task performance, has a pivotal role in the background exploration and monitoring of potential alternative courses of action.


Introduction
"Practice makes perfect", they say. By engaging long enough in any activity, we expect major improvements in both accuracy and speed. This is true for tasks as complex as playing piano and as mundane as preparing homemade pasta (Ericsson and Lehmann, 1996). These improvements may happen through multiple paths. One way is incremental task optimization: while following the same solution strategy, one can optimize the implementation of the adopted algorithm, achieving measurable processing gains (e.g., becoming quicker in mixing the same ingredients for pasta). Alternatively, one can optimize the task by learning about useful but previously unknown contingencies (e.g., that changing the order in which ingredients are mixed speeds up the procedure). This new information can be used to devise a new strategy, and then reach the same task goals with greater efficiency (Heathcote et al., 2000;Cohen et al., 2007;Badre et al., 2010;Hayden et al., 2011;Collins and Frank, 2013;Donoso et al., 2014;Schuck et al., 2015;Roeder and Ashby, 2016;Cole et al., 2017;Gaschler et al., 2019).
Task optimization has been associated with a decrease of activation both in areas specialized for the task and in a set of brain regions associated with control and attentional functions (Chein and Schneider, 2005; Patel et al., 2013; Hampshire et al., 2016). This evidence hints at an increasingly efficient processing of task-relevant information, but how this efficiency increase is reflected in the interplay of different brain circuits remains an open question. The distributed nature of the effects suggests that optimization entails an increasingly efficient routing of information across the brain. This may be understood as a modulation of neural circuits involving multiple brain regions, rather than just a local activation change. Evidence relating learning to connectivity changes has become available thanks to the recent development of tools for network dynamics analysis (Bassett et al., 2011, 2015; Cole et al., 2013; Bassett and Mattar, 2017). These studies showed that learning induces brain-wide network modifications, with task-related regions becoming increasingly segregated. Yet, the evidence available to date is essentially limited to motor sequence learning. Whether these observations generalize to other types of progressive task optimization, or whether they also underlie a strategy shift, is currently unknown. In fact, other studies comparing task execution with rest observed an increase of integration across brain regions (Shine and Poldrack, 2018).
In this work, we investigate the modulation of functional coupling occurring during task optimization, instructed strategy shifts and the spontaneous discovery of new strategies. Subjects were instructed to press one of two alternative buttons based on the spatial features of the stimuli (as in Schuck et al., 2015). Although subjects were not informed, the color of the stimuli could be used to determine the correct response. Subjects could either spontaneously discover and use the new color strategy, or continue to use the instructed spatial strategy until they were explicitly told otherwise (Fig. 2). Our previous work (Schuck et al., 2015) revealed that the encoding of stimulus color in Medial Prefrontal Cortex predicts a strategy change. However, with our previous analysis approach, we could not assess whether and how brain networks changed their behavior during task optimization, or whether they behaved differently in people that would or would not generate a strategy change.
Here, we fill this gap by analyzing the same data set with a novel analysis approach, Coherence Density Peak Clustering (CDPC, Allegra et al., 2017). By detecting sets of temporally coherent voxels ("clusters"), CDPC can simultaneously reveal functional coupling between neighboring and between distant voxels. An important feature of this algorithm is that it can identify clusters even in short time windows (approximately 20 s). This makes CDPC more sensitive to transient coherence than other analysis approaches, for instance group ICA (for a systematic comparison between the two methods see Allegra et al., 2017). Consistent with previous work (Bassett et al., 2015), our main hypothesis is that task optimization will induce changes of the neural connectivity within the task-relevant network. Whether change occurs in the form of an increased segregation, as suggested by Bassett et al. (2015), or rather an increased integration (as in Finc et al., 2017), is one of the open questions motivating our investigation. Furthermore, following our previous work (Schuck et al., 2015), we hypothesize that brain networks centered on rostro-medial prefrontal cortex (e.g. medial BA10, part of the default mode network) will behave differently in subjects who will or will not generate an alternative strategy.

Fig. 1. Summary of the neuroimaging analyses. In a window of 11 scans (22 s), we identify the subset of voxels that are locally coherent with at least four spatial neighbors (a). Voxels surviving this local coherence filter are shown in yellow. The filtering procedure is applied separately for each subject, using sliding windows (1-11, 2-12, etc.) and including only gray matter voxels (b). For each subject, we identify the frequency (fraction of time windows) with which a voxel has local coherence with its neighbors, producing a clustering frequency map Φ (c). We compute an average Φ map for all subjects (d). On this map, we identify voxels having significant values of average Φ. Significance is defined by computing Φ on white matter voxels (e). We threshold the gray matter Φ on the maximum value found in white matter. The above-threshold voxels are divided into 22 regions around each peak of the average Φ. Different regions are shown in different colors (f). Finally, we compute a connectivity matrix between high-Φ regions. For each subject separately, in each time window of 11 scans we consider the voxels surviving the spatial filter and we divide them into different coherent clusters based on density peak clustering (g). Voxels assigned to two different clusters are shown in red and green respectively. We compute the clusters in all time windows (h) and define a pairwise connectivity between two regions by computing the frequency with which voxels belonging to the two regions are assigned to the same cluster (j).

Fig. 2. Stimulus-response mappings in the task. (a) Instructed S-R mapping used by corner users. (b) Learned S-R mapping used by color users.
Overall, the CDPC analysis method combined with our experiment allowed us to reveal how the dynamics of local coherence and long-range connectivity is associated with instructed strategy optimization, the discovery of possible alternative strategies, and strategic shifts.

Methods
Task. Behavioral and imaging data of the main experiment were recorded while participants performed a simple perceptual decision-making task (Spontaneous Strategy Switch Task, Schuck et al., 2015). Participants were instructed to respond manually to the position of a patch of colored dots within a square reference frame. They were asked to select one of two responses depending on which corner of the reference frame the colored squares were closest to (Fig. 2a). Participants held a button box in each hand and could press either left or right. Two opposite corners (along the diagonal) were mapped to the same response. The main task during scanning included twelve runs with 168 trials each. In Runs 1 and 2 (Random Runs), the stimulus color was unrelated to the position of the stimulus and the response. In Runs 3-10 (Correlated Runs), the color had a fixed relation to the response (e.g., all upper-left and lower-right stimuli were green, the remaining ones were red). Participants were not informed about this contingency but could learn and apply it spontaneously (Fig. 2b). At the end of Run 10, all participants were informed about the existence of a fixed association between color and corner (without specifying the relation) and instructed to use the color from then on (Instructed Runs). Each of the twelve runs of the main experiment lasted about 5 min and was followed by a short break. The experimenter monitored the performance of the participants. Written and oral feedback was given between runs if the error rate exceeded 20%. The response-stimulus interval was 400 ms. To measure the learning and use of color information, different trial conditions were used (for details, see Schuck et al., 2015).
In the standard condition (80 out of 168 trials/run), the patch of dots was presented for 400 ms and was closest to one of the four corners of the reference frame; in the ambiguous trials (32 out of 168) the stimulus was centered within the reference frame and was presented for 400 ms; in the NoGo trials (32 out of 168) the colored squares were displayed for 2000 ms without a reference frame in some trials and the task afterward continued with the next trial, with participants having to hold back any key press on the current trial; in the LateGo trials (16 out of 168), the frame was displayed after the initial 2000 ms, and the participants had to react in a regular fashion; finally, in eight trials of each run the screen remained black for 3000 ms (baseline condition). Due to the duration of the hemodynamic response function, the fast design of the experiment resulted in event-related BOLD signals, which also contained a signal proportion that reflected brain activation caused by previous and following events.
Before entering the scanner, participants were instructed and trained in the task. The instructions described all conditions (except ambiguous trials). Participants were only told to press any key of their choosing in case they were uncertain about the stimulus location. The color of the stimuli was mentioned only in an unspecified manner ("A stimulus can be either red or green."). The response mapping was shown in all color combinations (a stimulus in each of the four corners was shown in both red and green during the instruction). In the training phase, participants were slowly accustomed to the short display durations (the display duration was successively shortened until it reached 400 ms). Feedback was given for all wrong and premature responses and time-outs (2500 ms threshold). The color of the stimuli was not systematically related to stimulus position during training. The training lasted at least 50 trials and ended when the participant made fewer than 20% errors in 24 consecutive trials. If the participant exceeded 168 trials without reaching the criterion, the training was restarted. Participants were further instructed that upon entering the scanner, no more feedback would be provided. After completion of the main experiment, participants completed a questionnaire with the following questions: (1) "In the experiment, which you have just completed, each corner had one associated color. Did you notice this while you were performing the task?" [yes/no]. (1b) "If yes, when did you notice this (after what percentage of the experiment)" [participants had to mark their answer on a scale from 0% to 100%]. (1c) "Did you use this color-corner relation to perform the task, i.e. to choose which button to press?" [yes/no]. (2) "Please indicate now which color the stimulus had for each of the four corners. If you did not notice this relation during the experiment or you are uncertain, you can guess."
Scanning and preprocessing. Acquisition of magnetic resonance images was conducted at the Berlin Center for Advanced Neuroimaging, Charité Berlin. We used a 3 T Magnetom Trio (Siemens) research-dedicated MRI scanner to acquire all data. T1-weighted structural images were acquired with an MP-RAGE pulse sequence with a resolution of 1 mm³. A T2*-weighted echo-planar imaging (EPI) pulse sequence was used for functional imaging (3 × 3 × 3 mm voxels, slice thickness = 3 mm, TR = 2000 ms, TE = 30 ms, FOV = 192 mm, flip angle = 78°, 33 axial slices, descending acquisition). EPI slices were aligned to the anterior-posterior commissure axis. Field maps for distortion correction were also acquired using an EPI sequence. To allow for T1 equilibration effects, the experiment was started 6 s after the acquisition of the first volume of each run. Image pre-processing was performed using SPM12 (Wellcome Trust Centre for Neuroimaging, London, United Kingdom) running under Matlab 7.4 (R2007a) (Mathworks, Sherborn, MA, USA). The preprocessing steps were: correction for magnetic inhomogeneities using field maps, slice timing correction, realignment to correct for motion, co-registration of anatomical images with functional images, and tissue segmentation based on the co-registered structural images to build a brain mask. Whenever required to allow for comparison of results across subjects, spatial normalization and/or smoothing was performed on the Φ maps (Allegra et al., 2017). The Φ maps were normalized to the standard MNI template and spatially smoothed with a Gaussian kernel of 9 mm FWHM. For visualization purposes, we used the MRIcron software (www.mricron.com) and BrainNet Viewer (Xia et al., 2013; www.nitrc.org/projects/bnv/).
Overview of the neuroimaging analyses. The core of our neuroimaging analyses is based on Coherence Density Peak Clustering (CDPC) (Allegra et al., 2017). We assume that: (i) the task-evoked modulation of activation elicits temporal coherence in the frequency domain f ≥ 0.05 Hz (see Bassett et al., 2011, 2015; Sun, 2004); the elicited temporal coherence may be discontinuous within each block, and thus better detectable over short time windows; (ii) the changes in coherence, while involving multiple spatial scales from neighboring to distant voxels, affect at least a few spatially contiguous voxels. On the basis of assumption (i), we measured coherence in time windows of length 22 s and then assessed how frequently coherence occurred within a block. On the basis of assumption (ii), we excluded from further analysis all voxels that were not coherent with at least 4 of their spatial neighbors.
In Fig. 1, we summarize the main steps of the analysis. We first applied CDPC over short (22 s) sliding windows. This is the same time-window length for which CDPC was validated, and the shortest time scale over which coherence can be reliably detected (see Methods and Allegra et al., 2017). We then measured how frequently, within each block, any voxel is functionally coupled to its neighboring voxels, i.e. how often any voxel is locally clustered within a cube of 9 mm side. This defines a clustering frequency map Φ (Fig. 1a-c). The large majority of voxels in the brain are hardly ever clustered, resulting in very low values of Φ: we assume that these voxels are not involved in task-dependent modulation of coherence. We thus focused the subsequent analysis on voxels with a relatively high value of Φ. The latter can be grouped into 22 brain regions (Fig. 1d-f), which can be used as a basis to explore the time-dependence and the subject-dependence of Φ.
Next, we focused on long-range coherence. We assumed that a voxel with a time series not coherent with its spatial neighbors is not providing a task-related signal, either at a local or a global scale. In Allegra et al. (2017), we showed that spurious long-range coherence can be observed by chance even when only noise is present in the data. However, the value of Φ for the voxels involved in these spurious correlations is low, offering a route to identify these artifacts. Intuitively, voxels producing spurious long-range coherence are sparse. On the basis of this analysis, voxels with low Φ (thus, with poor coherence with their spatial neighbors) should not be part of long-range clusters of coherent activity. Therefore, high-Φ regions represent a suitable basis to study not only the local coherence, but also the long-range one. We used CDPC to compute a pairwise connectivity matrix between the 22 regions: for each pair of regions, we measured the frequency with which voxels of the first region are functionally coupled with voxels of the second (Fig. 1g-j). Here, the functional coupling is measured by whether two voxels have similar BOLD time series and are thus assigned to a common long-range coherent cluster, as identified via CDPC (see Methods for details). Again, we explored the subject- and time-dependence of results.
Besides CDPC, we also used two state-of-the-art fMRI analysis methods to corroborate the interpretation of the CDPC results. We used a task-vs-rest GLM contrast to verify which of the regions are activated or deactivated during the task. Moreover, we applied multivariate pattern analysis (MVPA, as described in detail in Schuck et al., 2015) to verify which of the regions encode the color/corner features of the stimuli.
Clustering frequency maps. Coherence Density Peak Clustering (CDPC, Allegra et al., 2017) aims at finding groups of voxels (clusters) whose BOLD signal is coherent in a given time window, usually short (e.g. 20 s). Contrary to other methods, such as community partition on a connectivity matrix, CDPC does not simply split all voxels into different clusters. In fact, voxels can be poorly coherent with other voxels (and hence not clearly part of any well-defined cluster), or coherent with other voxels only as a consequence of correlated fluctuations in the noise (a spurious cluster). CDPC first discards all voxels displaying poor or potentially artifactual coherence and then assigns only the remaining voxels (usually a small fraction of the total) to clusters.
The method starts by defining a distance $d_{ij}$ that captures the coherence between the BOLD signals of voxels $i$ and $j$. The distance is the Euclidean distance between the BOLD time series of the two voxels, computed after the raw time series $\nu_i(t)$, $\nu_j(t)$ have been preprocessed by demeaning and amplitude normalization. Note that the lowest frequency affecting the distances is $1/T$ (where $T$ is the time window length), which for $T = 22$ s is 0.045 Hz.
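As a minimal sketch of the distance just described, the following pure-Python snippet demeans two window-length time series, rescales them, and takes their Euclidean distance. The text does not specify the form of the amplitude normalization; scaling to unit Euclidean norm is an assumption made here, and the function names are illustrative rather than taken from the CDPC implementation.

```python
import math


def normalize(series):
    """Demean a time series and scale it to unit Euclidean norm.

    Unit-norm scaling is an assumption; the source only says
    "amplitude-normalization".
    """
    mean = sum(series) / len(series)
    centered = [x - mean for x in series]
    norm = math.sqrt(sum(x * x for x in centered))
    if norm == 0.0:
        raise ValueError("a constant time series cannot be normalized")
    return [x / norm for x in centered]


def coherence_distance(series_i, series_j):
    """Euclidean distance between two preprocessed time series."""
    a = normalize(series_i)
    b = normalize(series_j)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
```

Under this unit-norm assumption the squared distance equals 2(1 − r), where r is the Pearson correlation of the two series, so the distance ranges from 0 (perfectly correlated) to 2 (perfectly anticorrelated).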
Two voxels are regarded as coherent if the distance between their BOLD signals is below a threshold, $d_{ij} < d_c$. Coherence between voxels can occur even if only noise is present. However, when only noise is present, high coherence tends to be observed between isolated voxels, while the presence of several coherent voxels within a close spatial neighborhood is unlikely (see Allegra et al., 2017). More formally: for each voxel $i$, we define its neighbors as the voxels falling within a cube of 9 mm side centered on $i$, corresponding to about 27 voxels. We denote by $n_i$ the number of neighbors that are coherent with $i$. In our previous work, we showed that $n_i$ was generally lower than a threshold $n_0 = 4$ when only noise is present. All voxels with $n_i \geq n_0$ are thus considered in the clustering procedure, while voxels with $n_i < n_0$ are discarded. This filtering procedure was shown to minimize the rate of detection of spurious clusters (see Allegra et al., 2017). This criterion of cluster membership relies entirely on a measure of coherence that is strictly spatially local (27 neighboring voxels).
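The local coherence filter can be sketched as follows. This is an illustrative pure-Python version, not the authors' code: voxels are assumed to sit on an integer grid (so that a 9 mm cube with 3 mm voxels is the 3 × 3 × 3 neighborhood), the voxel itself is not counted among its neighbors, time series are normalized to unit norm (an assumption), and the value of `d_c` is a hypothetical parameter because the text does not report it.

```python
import math
from itertools import product


def normalize(series):
    """Demean and scale to unit Euclidean norm (assumed normalization)."""
    mean = sum(series) / len(series)
    centered = [x - mean for x in series]
    norm = math.sqrt(sum(x * x for x in centered))
    return [x / norm for x in centered]


def coherent_neighbor_counts(voxels, d_c):
    """Count, for each voxel, the coherent voxels in its 3x3x3 cube.

    voxels: dict mapping integer grid coordinates (x, y, z) to the
    time series of that voxel within one time window.
    """
    normed = {pos: normalize(ts) for pos, ts in voxels.items()}
    offsets = [o for o in product((-1, 0, 1), repeat=3) if o != (0, 0, 0)]
    counts = {}
    for (x, y, z), ts in normed.items():
        n_i = 0
        for dx, dy, dz in offsets:
            other = normed.get((x + dx, y + dy, z + dz))
            if other is not None and math.dist(ts, other) < d_c:
                n_i += 1
        counts[(x, y, z)] = n_i
    return counts


def local_coherence_filter(voxels, d_c, n_0=4):
    """Return the set of voxels with at least n_0 coherent neighbors."""
    counts = coherent_neighbor_counts(voxels, d_c)
    return {pos for pos, n_i in counts.items() if n_i >= n_0}
```

In this sketch an isolated voxel is discarded even when its signal is identical to that of a distant group, which mirrors the rationale that noise-driven coherence tends to involve spatially sparse voxels.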
We ran this procedure on sliding windows of 11 scans (22 s). This is the same length for which CDPC was validated (see Allegra et al., 2017), and it is considered the minimal window length for which transient connectivity clusters can be reliably identified (Hutchison et al., 2013). We used overlapping windows, progressively shifting the center of the window by 1 scan. The procedure was applied twice for each subject, the first time including only gray matter voxels, and the second time only white matter voxels. For each subject and for each voxel $i$, the clustering yields a binary value for every time window $t$, indicating whether or not the voxel was part of a cluster. We devised an index, $\Phi_i$, measuring how often a voxel $i$ is part of a cluster in an interval comprising $N_t$ time windows:

$$\Phi_i = \frac{1}{N_t} \sum_{t=1}^{N_t} \chi\big(n_i(t) > n_0\big),$$

where $t$ is a time window label, $N_t$ is the number of time windows, and $\chi$ is a step function ($\chi(n_i(t) > n_0) = 1$ if $n_i(t) > n_0$, and $\chi(n_i(t) > n_0) = 0$ otherwise). Intuitively, $\Phi_i$ is an aggregate measure of the coherence of the local activity of a voxel with its surroundings. We call $\Phi_i$ the clustering frequency map since, as we discuss below, if a voxel satisfies the condition $n_i(t) > n_0$ it is automatically included in one of the clusters (Allegra et al., 2017). We computed a clustering frequency map for each block, i.e. half of a run (~150 s). Thus, for every subject, we generated 24 maps. The information given by a Φ map is not equivalent to that obtained by running CDPC on a single time window equal to the entire block. The latter choice would include and emphasize the contribution of low frequencies (0.005 Hz < f < 0.05 Hz) in the computation of $d_{ij}$. This would reduce the sensitivity of the procedure to higher frequencies (f > 0.05 Hz), which are likely those critical for capturing the task-related signals. The Φ maps focus on transient coherence occurring over timescales shorter than the whole block (Sakoğlu et al., 2010).
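To make the sliding-window bookkeeping concrete, here is a small sketch that enumerates overlapping 11-scan windows and computes the clustering frequency of one voxel from its per-window binary cluster membership. The function names are illustrative assumptions, and the 75-scan block length used in the note below is derived from the approximate figures in the text (a block of about 150 s at TR = 2 s) rather than stated by the authors.

```python
def sliding_windows(n_scans, width=11, step=1):
    """Yield (start, stop) scan indices of overlapping windows.

    stop is exclusive, so each window covers `width` scans.
    """
    for start in range(0, n_scans - width + 1, step):
        yield start, start + width


def clustering_frequency(in_cluster):
    """Fraction of time windows in which a voxel was part of a cluster.

    in_cluster: iterable with one boolean per time window.
    """
    flags = list(in_cluster)
    if not flags:
        raise ValueError("need at least one time window")
    return sum(flags) / len(flags)
```

With these assumptions a block of 75 scans yields 75 − 11 + 1 = 65 windows, and a voxel clustered in 13 of them would have a clustering frequency of 13/65 = 0.2.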
High-coherence regions. The $\Phi_i$ maps can be used to identify voxels that are potentially relevant for a task, under the assumption that task-relevant voxels would be part of a cluster more often than voxels that are not (Allegra et al., 2017). For each subject, we averaged Φ maps over all blocks, obtaining one map per subject. We normalized the average individual maps to MNI space and applied Gaussian smoothing (FWHM = 9 mm). Finally, we averaged individual maps to obtain a single group map $\Phi_i$ representing, for each voxel, the probability of being part of a cluster during task execution over all subjects.
To define "high-$\Phi_i$" regions, we selected a threshold as follows. We carried out the same CDPC procedure for voxels outside the gray matter, i.e., in regions for which we can assume that no real effects were present (Logothetis and Wandell, 2004). Outside gray matter, we observed the highest Φ values in the white matter (as compared e.g. with CSF). We decided to use as threshold the most conservative value from white matter to exclude most of the Φ measurements not related to a real underlying signal in gray matter. Therefore, to identify potentially task-relevant voxels we conservatively thresholded the $\Phi_i$ map of the gray matter with the maximum value $\Phi_{\max} = 0.11$ observed in the white matter.
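The thresholding step amounts to using the largest white-matter value as a null reference. A minimal sketch, assuming the group-average maps are available as dictionaries from voxel identifiers to Φ values (a hypothetical data layout, not the authors' format) and that "above threshold" means strictly greater:

```python
def high_phi_voxels(phi_gray, phi_white):
    """Select gray-matter voxels whose Phi exceeds the white-matter maximum.

    phi_gray, phi_white: dicts mapping a voxel identifier to its
    group-average clustering frequency.
    Returns the threshold and the set of above-threshold voxels.
    """
    threshold = max(phi_white.values())
    selected = {v for v, phi in phi_gray.items() if phi > threshold}
    return threshold, selected
```

Taking the maximum rather than, say, a high percentile of the white-matter distribution is the conservative choice described above: no white-matter voxel would survive the same cut.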
We focused on voxels with high $\Phi_i$, dividing them into a set of regions. To define the latter, we aggregated voxels above the $\Phi_{\max}$ threshold around every peak in the Φ map. The detailed procedure is reported in the Supplementary Materials. In this way, we could define regions tailored to the average spatial distribution of Φ (the center of each region corresponds to a point of high Φ). These regions are used as a "basis" to explore the time-dependence and the subject-dependence of Φ. In particular, they are used to perform statistical tests to compare Φ maps obtained in different blocks (see "statistical tests" in Supplementary Materials). This approach has several advantages as compared to a priori parcellation schemes. First, it allows focusing only on those voxels showing a high local coherence in their activation pattern, which are the best candidates for showing measurable dynamical effects in the subsequent analyses. Second, it allows shaping the ROIs on the observed spatial distribution of the relevant signal (coherence). This may reduce the washing out of the signal caused by averaging different voxels within (possibly large) a priori parcels. Third, it enhances the statistical power, focusing the analysis on a limited set of regions of interest and thus limiting the severity of the correction for multiple comparisons.
Connectivity: regional and long-range coherence. Φ maps identify voxels that are frequently coherent with their close spatial neighbors, and are thus assigned to clusters. Given that, voxels in the same high-Φ region can be considered, over the whole experiment, mutually coherent. However, Φ does not measure to what extent voxels within a high-Φ region, or in different high-Φ regions, are mutually coherent. To answer this question, one needs to know not only whether two voxels are part of any cluster (as Φ does) but, more specifically, whether two voxels are part of the same cluster.
To assign each voxel, in each time window and in each subject, to a specific cluster, we proceeded as follows (Allegra et al., 2017).
We first define the density $\rho_i$ as the number of (non-isolated) voxels that are coherent with $i$, over the whole brain:

$$\rho_i = \sum_{j \neq i} \chi\big(d_{ij} < d_c\big),$$

where the sum runs over all non-isolated voxels $j$. Notice that $\rho_i$ is usually higher than the number of coherent neighbors (measured by $n_i$ for Φ) because a voxel is typically coherent with many voxels outside its local neighborhood. Cluster centers are identified as peaks in the density distribution. Following Rodriguez and Laio (2014), we compute $\delta_i = \min_{j:\, \rho_j > \rho_i} d_{ij}$, which is the minimum distance (in the space of BOLD signals) from a voxel with higher density. Cluster centers stand out as isolated points with large values of $\delta_i$. We rank the voxels according to their value of $\delta_i$ and consider as putative cluster centers the first 10 (Allegra et al., 2017). After the cluster centers have been chosen, all remaining voxels are assigned to a cluster following a recursive procedure. Each voxel is assigned to the same cluster as the most similar voxel having a higher density; if the latter voxel is not yet assigned, one looks for the voxel most similar to it having a higher density, and so forth until either an already assigned voxel or one of the cluster centers is reached. At the end of the procedure, we obtain a map, for each time window, assigning each voxel to a specific cluster. Given two regions $a$ and $b$, we define a measure of their mutual coherence, $N_{ab}$, from the cluster assignments of their voxels. In this definition, $s_a$ ($s_b$) is the number of voxels in region $a$ (resp. $b$), and the term $\delta(c_i(t) = c_j(t))$ is equal to one if voxel $i$ and voxel $j$ belong to the same cluster at time $t$. We weight this term by the voxel density normalized to that of the cluster center, $\rho^c_i(t) = \frac{\rho_i(t)}{\rho_c(t)}\, \chi\!\left(\frac{\rho_i(t)}{\rho_c(t)} > \alpha\right)$, where $\rho_c(t)$ is the density of the cluster peak and $\alpha$ is a lower cutoff threshold (here, $\alpha = 0.3$). In this way, pairs of voxels in the cluster cores (high density) are weighted more than pairs in the cluster tails (low density).
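The clustering steps above can be sketched in pure Python as follows. This is an illustration under stated assumptions, not the CDPC implementation: the input is a precomputed symmetric distance matrix over the voxels that survived the local filter, density ties are broken by voxel index (the text does not say how ties are handled), the highest-density voxel receives the largest distance in its row as $\delta_i$ (the convention of Rodriguez and Laio, 2014), and `n_centers` stands in for the 10 putative centers mentioned in the text.

```python
def density_peak_clusters(dist, d_c, n_centers):
    """Cluster items from a symmetric distance matrix (list of lists).

    Returns (rho, labels): the density of each item and its cluster label.
    """
    n = len(dist)
    # Density: number of other items closer than the cutoff d_c.
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < d_c)
           for i in range(n)]
    # Decreasing density; ties broken by index (an assumption).
    order = sorted(range(n), key=lambda i: (-rho[i], i))
    delta = [0.0] * n
    nearest_higher = [None] * n
    for rank, i in enumerate(order):
        if rank == 0:
            delta[i] = max(dist[i])  # convention for the global density peak
            continue
        # Closest item among those ranked as denser than i.
        j = min(order[:rank], key=lambda k: dist[i][k])
        delta[i] = dist[i][j]
        nearest_higher[i] = j
    # Putative centers: the items with the largest delta.
    centers = sorted(order, key=lambda i: -delta[i])[:n_centers]
    labels = [None] * n
    for label, i in enumerate(centers):
        labels[i] = label
    # Visiting items by decreasing density guarantees that the nearest
    # denser item has already been labeled.
    for i in order:
        if labels[i] is None:
            labels[i] = labels[nearest_higher[i]]
    return rho, labels


def core_weight(rho_i, rho_center, alpha=0.3):
    """Density normalized to the cluster peak, set to zero below alpha."""
    if rho_center == 0:
        return 0.0
    ratio = rho_i / rho_center
    return ratio if ratio > alpha else 0.0
```

Processing voxels in order of decreasing density replaces the recursive lookup described in the text with a single pass, and it produces the same assignment because each voxel still inherits the label of its most similar denser voxel.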
The diagonal terms $N_{aa}$ measure the local coherence within the region $a$, while the off-diagonal terms $N_{ab}$ measure long-range coherence between different regions.
GLM analysis. We performed a GLM analysis, with standard trials and resting trials as separate regressors, and motion parameters as nuisance variables. The response in standard trials was modeled as a response to stimulus presentation: onsets and durations corresponded to the onsets and durations of stimulus presentation. Resting trials had a duration of 3000 ms. We tested for significant activations or deactivations in the high-Φ regions by averaging the contrast map over each region and performing a region-wise t-test over subjects.
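The region-wise test can be illustrated with a short sketch. It assumes a one-sample t-test of the region-averaged contrast against zero (the text does not specify the exact test), a hypothetical data layout in which each subject's contrast map is a dictionary from voxel identifiers to contrast values, and it returns only the t statistic and degrees of freedom, leaving the p-value to a statistics package.

```python
import math
from statistics import mean, stdev


def region_mean(contrast_map, region_voxels):
    """Average a subject's contrast values over the voxels of one region."""
    return mean(contrast_map[v] for v in region_voxels)


def one_sample_t(values, popmean=0.0):
    """t statistic and degrees of freedom for a one-sample t-test."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two subjects")
    standard_error = stdev(values) / math.sqrt(n)
    return (mean(values) - popmean) / standard_error, n - 1


def region_t_test(subject_maps, region_voxels):
    """Test whether the region-averaged contrast differs from zero."""
    per_subject = [region_mean(m, region_voxels) for m in subject_maps]
    return one_sample_t(per_subject)
```

A positive t value would indicate activation of the region during the task relative to rest, and a negative value deactivation.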
MVPA analysis. Representation of stimulus features (color and corner) was analyzed by a multivariate classification approach based on a support vector machine (SVM) with a linear kernel in combination with a searchlight approach (Kriegeskorte et al., 2006; Norman et al., 2006; Haynes, 2015). For details, we refer to our previous work (Schuck et al., 2015). For color representation, the SVM was trained on parameter estimates ("betas") from a general linear model of red and green NoGo trials in the last two runs (where all participants use the color strategy), and then tested on betas from Runs 1-10. This resulted in one accuracy map for each block and subject. For corner representation, the classifier was trained on betas of standard trials in the first two runs (where no participants use the color strategy) and then tested on betas from Runs 3-12.

Results
Participants performed a simple perceptual decision task (Schuck et al., 2015). They were instructed to respond manually to the position of a patch of colored dots within a square reference frame by selecting one of two responses. Participants held a button box in each hand and could press either the left or the right button. Color patches closer to the upper left or lower right corner mapped onto the left button, while patches closer to the lower left or upper right corner mapped onto the right button (Fig. 2). Therefore, there was a four-to-two stimulus-response mapping, where two opposite corners (along the diagonal) were mapped onto the same response. Participants performed 12 runs of the task, each one comprising 168 trials and lasting ~5 min. In runs 1 and 2, the stimulus color (red or green) was unrelated to the position of the stimulus and the response. In runs 3-12, the color had a fixed relation to the response (e.g., all upper-left and lower-right stimuli were green, the remaining ones were red, Fig. 2). Participants were not informed about this contingency but they could learn it and generate a new task strategy based on the stimulus color. Before the last two runs (11-12), participants were explicitly instructed to switch to the color strategy. For the following analyses, we considered experimental "blocks", where 1 block = ½ run, lasting ~2.5 min. Such block length is roughly the timescale over which the targeted behavioral changes would become reliably measurable. The same definition for blocks was used in our preceding work (Schuck et al., 2015).
Behavioral results. Most of the behavioral results have been already reported in our previous work (Schuck et al., 2015). We briefly summarize here the findings relevant to the present work. The majority of subjects (25/36, the "spatial strategy users") used the instructed spatial strategy over the first 20 blocks. As expected, spatial strategy users showed evidence of incremental task optimization during the first 20 blocks, as indexed by a progressive reduction of reaction times (RTs) and errors (Fig. 3a). After the instructed switch to the color strategy from block 21, RTs and errors further decreased, thus confirming the effectiveness of the color strategy.
A minority of the subjects (11/36, "color users") switched spontaneously to the color strategy before the end of block 20. The switch point could be precisely and robustly identified by several behavioral markers (see Methods). In particular, in a fraction of the trials (ambiguous trials) the dots were centered within the square reference frame, equidistant from all corners: in these trials, evidence of a color-based strategy comes from the number of responses that are consistent with the stimulus color (while a strategy based on stimulus position should yield essentially random responses). The fraction of color-consistent choices in color users shows an abrupt increase in the switch block (Fig. 3b). Before the strategy switch, color users also showed a progressive reduction in RTs and errors (Fig. 3c). This trend exhibits a transient stop just before the spontaneous switch. After the spontaneous switch to the color strategy, RTs and errors further decrease also in color users, albeit less abruptly than in spatial strategy users (Fig. 3d).
Identification of a set of relevant regions. In the following analyses, we considered fMRI data from 35 participants. One subject was excluded because the field of view did not cover the whole brain. The core of our neuroimaging analyses is based on Coherence Density Peak Clustering (CDPC) (Allegra et al., 2017). See "Overview of the neuroimaging analyses" in Methods for a quick overview. For each subject, and for each of the 24 independent blocks, we applied CDPC to gray matter voxels as identified by tissue segmentation. CDPC was applied on sliding windows of 11 volumes (22 s).
For each subject and each block, we computed a clustering frequency map Φ_i that measures the fraction of time windows within a block in which a voxel is coherent with its spatial neighbors and hence clustered. The Φ_i maps are consistent across subjects: averaging Φ_i maps over all blocks and performing spatial smoothing (9 mm FWHM), we obtained a between-subject correlation of 0.623 (SD = 0.003). A comparison between the average Φ_i maps for color and spatial strategy users failed to detect any significant difference. Also, we did not observe any effect of the sex of the participants. We thus averaged Φ_i maps over all subjects, and we obtained a Φ_i map reporting the average clustering frequency across subjects and blocks. The large majority of voxels have low values of Φ_i (<0.1) and are thus rarely part of coherent clusters: we assume that these voxels are not involved in task-dependent modulation of coherence. Areas with high Φ_i represent voxels that are consistently clustered over different blocks and subjects.
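The definition of the clustering frequency can be illustrated with a minimal sketch. It assumes that the clustering step has already produced, for each sliding window, a flag saying whether each voxel was assigned to a coherent cluster; the function name, the data layout and the toy values are hypothetical and are not taken from the CDPC implementation.

```python
def clustering_frequency(membership):
    """Fraction of windows in which each voxel is clustered.

    membership[w][v] is True when voxel v belongs to a coherent cluster
    in sliding window w (assumed output of a clustering step).
    """
    n_windows = len(membership)
    n_voxels = len(membership[0])
    return [
        sum(membership[w][v] for w in range(n_windows)) / n_windows
        for v in range(n_voxels)
    ]


# Toy block: 4 sliding windows, 3 voxels (hypothetical values).
windows = [
    [True, False, False],
    [True, True, False],
    [True, False, False],
    [False, True, False],
]
phi = clustering_frequency(windows)
print(phi)
```

In this toy block the first voxel is clustered in three windows out of four, the second in two, and the third never, so the third would fall among the low-Φ voxels that are excluded from further analysis.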
To study the time- and subject-dependence of the results, we focus attention on voxels with high Φ_i, dividing them into a set of regions serving as a common "basis" for analysis. To focus analysis on potentially task-relevant voxels we conservatively thresholded the Φ_i map of the gray matter with the maximum value Φ_max = 0.11 observed in the white matter (see Methods for details).
Approximately 7% of all brain voxels survive this threshold. In Fig. 4 we show the resulting thresholded Φ_i map for gray matter. We grouped spatially contiguous voxels above the white matter threshold in regions around each peak in the Φ_i map, thus obtaining 22 regions (Table 1). These high-Φ regions are distributed throughout the brain, including areas in the occipital cortex, parietal cortex, prefrontal cortex, temporal cortex, thalamus, and mesencephalon. We stress that our main results are robust with respect to the choice of the Φ_i threshold: in the Supplementary Information (Fig. S1-S4), we replicated our major results with two different thresholds (such that less than half, or more than twice the number of voxels are included) and correspondingly two different sets of regions. The main results did not change.
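The thresholding and grouping step can be sketched as follows. This is a simplified illustration under stated assumptions: it works on a small 2D grid rather than a 3D volume, it uses 4-connectivity, and it labels plain connected components instead of assigning voxels to regions around each peak as described above. All names and values are hypothetical.

```python
from collections import deque


def high_phi_regions(phi_map, threshold):
    """Label 4-connected groups of grid cells whose value exceeds threshold."""
    rows, cols = len(phi_map), len(phi_map[0])
    labels = [[0] * cols for _ in range(rows)]
    n_regions = 0
    for r in range(rows):
        for c in range(cols):
            if phi_map[r][c] <= threshold or labels[r][c]:
                continue
            # Start a new region and flood-fill its contiguous neighbors.
            n_regions += 1
            labels[r][c] = n_regions
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and phi_map[ny][nx] > threshold
                            and not labels[ny][nx]):
                        labels[ny][nx] = n_regions
                        queue.append((ny, nx))
    return labels, n_regions


# The threshold is the largest value seen in white matter (toy values).
white_matter_phi = [0.02, 0.11, 0.07]
phi_max = max(white_matter_phi)

gray_phi = [
    [0.30, 0.25, 0.05, 0.02],
    [0.04, 0.08, 0.03, 0.20],
    [0.01, 0.02, 0.06, 0.18],
]
labels, n_regions = high_phi_regions(gray_phi, phi_max)
print(n_regions, labels)
```

Taking the white-matter maximum as the threshold is conservative in the sense that no white-matter voxel can survive it, so any surviving gray-matter voxel has a clustering frequency never reached where coherent task-related signal is not expected.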
Since CDPC is an unsupervised technique, the high level of local coherent activity in the identified high-Φ regions is not necessarily related to the task. To verify whether the high-Φ regions are indeed task-related, we used multiple approaches. First, we compared CDPC results with the supervised, multivariate pattern analysis (MVPA) performed on the same regions (see Methods, and for details Schuck et al., 2015). For each subject and each block, MVPA produced accuracy maps assessing whether local activity patterns represented relevant features of the stimuli, namely, to which corner the patch is nearest. For each high-Φ region, we tested whether the average accuracy of the multivariate classifier was above the chance level. We found that several regions in the occipital, parietal and prefrontal cortex encoded spatial information (p < .05, FDR corrected, Fig. 5a and b). A second approach to investigate task-relevance is to test whether the identified regions had a different average activation during task performance as compared to a resting baseline. Almost all regions showed either significant activation or significant de-activation (p < .05, FDR corrected, Fig. 5c and d). Regions more active than baseline are the medial and lateral occipital cortex bilaterally (1-2) and the superior parietal lobule bilaterally (9-10). All remaining areas, including occipital (3-4), central and lateral parietal (5-8, 11), prefrontal (12-15), and temporal (16-20) regions, are deactivated. Taken together, these findings suggest that most of the identified regions are likely involved in the task. Further converging evidence is provided by the subsequent analyses. Results concerning the occipital regions deserve an additional comment. The two regions in the anterior calcarine sulcus (3-4, Fig. 4) did not encode task-relevant information, while they showed learning effects similar to parietal regions (discussed below).
By contrast, the two regions in the posterior calcarine sulcus and lateral occipital cortex (1-2) did not show learning effects but they encoded spatial information. Notably, posterior calcarine was activated compared to baseline while anterior calcarine was deactivated.
We interpret this pattern as the effect of an attentional negative modulation on the portion of the calcarine sulcus processing the peripheral visual field, which is not related to the task at hand. Thus, negative attentional modulation produces both a deactivation (Broday-Dvir et al., 2018) and a progressive increase in local coherence. By contrast, central visual field regions are activated, but their coherence, already high at the beginning, does not further increase with learning.
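Several of the region-wise tests in this subsection are reported as FDR corrected. The text does not name the procedure used, so the sketch below assumes the common Benjamini-Hochberg step-up rule; the function name and the p-values are hypothetical.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return True for each test rejected at false discovery rate alpha.

    Assumes the Benjamini-Hochberg step-up procedure: find the largest
    rank k such that the k-th smallest p-value is <= k / m * alpha, and
    reject all hypotheses up to that rank.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            k_max = rank
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= k_max:
            rejected[idx] = True
    return rejected


# Toy p-values for five regions (hypothetical).
p_vals = [0.001, 0.045, 0.025, 0.20, 0.012]
significant = benjamini_hochberg(p_vals)
print(significant)
```

Note that a p-value below .05 (here 0.045) can fail the corrected criterion, which is why results are reported separately as corrected and uncorrected.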
Temporal dynamics of the clustering frequency (Φ). Having identified the set of high-Φ regions, we focused on differences in Φ related to behavioral changes across time and across subjects. We found that the Φ maps obtained were similar for different blocks within single subjects. We computed a block similarity for each subject by computing the Pearson correlation between all pairs of block Φ maps within each subject and averaging over blocks. Over all subjects, we obtained an average similarity of 0.64 (SD = 0.09), suggesting an overall qualitative stability of the brain regions involved during the experiment.
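The block similarity described above can be sketched for a single subject as the mean Pearson correlation over all pairs of block maps. The maps are flattened to lists of voxel values; function names and values are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def block_similarity(block_maps):
    """Average Pearson correlation over all pairs of block maps."""
    pairs = list(combinations(block_maps, 2))
    return sum(pearson(a, b) for a, b in pairs) / len(pairs)


# Toy subject: 3 blocks, 4 voxels each (hypothetical Φ values).
maps = [
    [0.1, 0.4, 0.2, 0.8],
    [0.2, 0.5, 0.1, 0.9],
    [0.1, 0.3, 0.3, 0.7],
]
similarity = block_similarity(maps)
print(round(similarity, 3))
```

Averaging the per-subject value over subjects would then give a group-level similarity of the kind reported in the text.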
Consistent with this finding, the set of regions with significant coherence appeared stable during the experiment. This is not entirely obvious. Indeed, high-Φ regions were identified by applying a fixed threshold obtained from white matter to the average (across subjects and blocks) Φ map. Having a high average Φ across blocks does not imply that Φ is high in each block separately: there may be large variations over blocks, e.g., with some regions exhibiting high coherence only over a few blocks. For this reason we repeated the analysis for each block separately, to check whether the set of high-Φ regions changes from block to block. For both color and spatial strategy users, we considered the voxels above the threshold Φ_max in each block. Results are reported in Fig. 6. High-Φ regions do not qualitatively change during the experiment. No regions disappeared, and no new regions appeared.
The lack of qualitative change, however, does not imply that Φ remains constant over time. In fact, quantitative variations of Φ were observed. We first analyzed the subjects who used the spatial strategy up to block 20 and switched to color strategy in blocks 21-24, after receiving explicit instructions. We computed the average Φ in each region as a function of the block. The variations of Φ across blocks were different between regions. In regions 3,4 (occipital) and 5,6,7,9,11 (lateral and medial parietal) we observed three phenomena, as shown in Fig. 7a. First, in blocks from 1 to 20, i.e. when subjects used and improved the spatial strategy, Φ rose. Second, after the switch to the color strategy, Φ suddenly dropped. Third, Φ underwent a fast recovery, so that the measure was back to the pre-switch level after just one block (~2.5 min). Regions 9,10 (lateral superior parietal) showed both the increase and the drop but did not show the recovery (Fig. 7b). Regions 16-20 (temporal lobe) showed the increase, but no drop (Fig. 7d). Finally, the remaining regions, including 1,2 (occipital lobe), 12-15 (prefrontal cortex) and 21,22 (thalamus and mesencephalon) show little variation of Φ across time (Fig. 7c and e). The effects corresponding to the increase, drop and recovery are summarized for all regions in Fig. 8 and in Table 2.
As mentioned, several regions showed at the same time an increase in Φ during task optimization and a decrease upon strategy change (Fig. 8, Table 2). In these regions, the dynamics of Φ in blocks 1-20 and 20-21 is what would be expected for a variable correlated with the optimization of the spatial strategy: a gradual increase during learning of a specific strategy and a drop when such strategy is abandoned. In order to explore this possibility further, we evaluated the correlation between Φ and participants' reaction times (RTs). Overall, the average RTs in blocks 1-20 are negatively correlated with Φ averaged over all regions (Fig. 9a).
We computed the correlation between Φ and RTs in each region and then averaged over subjects. The statistical significance of the resulting average correlation was assessed by using a permutation approach, recomputing the Pearson coefficient for 10,000 random permutations of the blocks. We found that the increase in Φ in most regions was significantly correlated with the decrease of RTs (p < .05, FDR corrected for multiple comparisons). The strongest effects were in the parietal, occipital and temporal regions (Fig. 9b). If the observed negative correlation were just due to an unspecific, time-dependent increase of Φ, we would expect to find the same relation also in brain areas outside the 22 high-Φ regions examined. We assessed whether the negative correlation is present also in other brain regions by using a standard whole-brain atlas parcellating the brain in 268 regions (Finn et al., 2015). Only 28 regions out of 268 showed a significant negative correlation (p < .05, uncorrected). These regions are located in the parietal, precuneus and prefrontal cortex, with a large overlap with high-Φ regions. Finally, we checked whether the increase of Φ would be present even in an experiment in which task optimization is not expected to occur. If the increase of Φ were just due to artifacts or physiological noise, the increase should be observed irrespective of whether some learning occurs in the experiment or not. We repeated the same analysis procedure on an experiment (Reverberi et al., 2018) where 15 subjects performed 7 runs of a simple language task (naming of common objects) not expected to trigger any learning. We evaluated the dynamics of Φ as a function of the run in the same regions showing a time-dependent increase in the present study.

Table 1. Summary information about the 22 high-Φ regions, including brain location (main AAL region and Brodmann area, MNI coordinates of the region center), size (number of voxels), and the short name used in figures.
We did not observe any evidence for an increasing Φ in the language experiment (region 1: p = .01 uncorrected; region 6: p = .04 uncorrected; all other ps > .1, uncorrected). Thus, the increase of Φ in the present study is likely related to the task. We carried out on color users an analysis similar to the one performed on spatial strategy users. It should be noticed that in color users the timing of the strategy change was not fixed as in spatial strategy users, but it was variable from subject to subject. Thus, we considered the increase of Φ from block 1 to the block of the strategy change. We again observed a significant increase of Φ in the regions already reported for spatial strategy users (Fig. 7f-j). In contrast with spatial strategy users, however, in color users such an increase was observed also in prefrontal cortex (regions 12-15). Furthermore, color users showed no sudden decrease in Φ between blocks 20 and 21. This was expected given that color users did not switch strategy at that point in time.

We mark with an asterisk (*) regions with significant mean corner accuracy (FDR correction at α = 0.05, t-test). (b) Overlap (purple) between CDPC regions (red) and voxels with significant spatial representation (cluster-wise FWE correction at α = .05, t-test) for spatial strategy users (blue). (c) Mean activation for task versus rest contrast for all subjects. We mark with an asterisk (*) regions with significant activation or deactivation (FDR correction at α = 0.05, t-test). (d) Regions with significant activation (red) or deactivation (blue) for task versus rest contrast as revealed by GLM analysis for all subjects (color scale represents significance, -log(p), 5 corresponds to p = 10^-5). O: region 1-4, occipital; P: region 5-11, parietal; F: region 12-15, frontal; T: region 16,18,19,20, temporal; C: region 18, caudate; Θ: region 21, thalamus; M: region 22, midbrain.
Relatedly, we explored the presence of a decrease of Φ between the block in which subjects spontaneously changed strategy and the following one (equivalent to blocks 20/21 in spatial strategy users). We observed a decrease of Φ in the same regions that showed an effect in the spatial strategy users. The effect, however, is considerably weaker compared to the spatial strategy users. In fact, the results are not significant after FDR correction; the largest effect is observed for region 8 (left parietal) with p = .005 (uncorrected). The weakness of the effect may be related to the reduced sample of color users (11 instead of 24 subjects). More importantly, the transition between the two strategies, when non-instructed, is likely to be gradual and, for a short lapse of time, the two strategies might be used simultaneously.
Finally, we checked whether the observed dynamic changes of Φ might be explained by motion artifacts. This is not the case (see supplementary information, Fig. S5-S6).
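The permutation approach used above to assess the correlation between Φ and RTs can be sketched for a single series of block values. This is a simplified illustration: the analysis in the text averages correlations over subjects before testing, whereas the sketch handles one series, and the one-sided test, the seed and the toy values are assumptions.

```python
import random
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def permutation_p_value(phi, rts, n_perm=10000, seed=0):
    """One-sided p-value for a negative correlation.

    The block order of the reaction times is shuffled n_perm times and
    the correlation recomputed; the p-value is the fraction of shuffles
    at least as negative as the observed value (with +1 smoothing).
    """
    rng = random.Random(seed)
    observed = pearson(phi, rts)
    shuffled = list(rts)
    count = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if pearson(phi, shuffled) <= observed:
            count += 1
    return observed, (count + 1) / (n_perm + 1)


# Toy data for 10 blocks: Φ rises while reaction times (ms) fall.
phi = [0.10, 0.12, 0.11, 0.14, 0.15, 0.17, 0.16, 0.19, 0.20, 0.22]
rts = [620, 610, 600, 590, 585, 570, 575, 560, 550, 540]
r_obs, p_val = permutation_p_value(phi, rts)
print(round(r_obs, 3), p_val)
```

Shuffling the blocks destroys any temporal alignment between the two series while preserving their marginal distributions, so the null distribution reflects correlations expected by chance alone.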
Cluster-based connectivity analysis. Clustering frequency (Φ) maps reveal how often a voxel is involved in coherent activity in a time window of approximately 20 s and in a spatially local neighborhood. A group of voxels generally shows high coherence not only with voxels in the spatial neighborhood but also at larger distance. Φ maps, however, do not directly measure long-distance coherence, nor can they reveal which regions, among those with high Φ, have mutually coherent time series, i.e. are connected. To quantify long-range coherence effects we measured the frequency with which voxel pairs in two regions are assigned to the same cluster, weighted by a measure of the robustness of cluster assignment (Methods). By computing these values for all regions, we built a clustering-based connectivity matrix (Fig. 10). The diagonal terms of the matrix represent a measure of the within-region coherence. By contrast, the off-diagonal terms represent the coherence or connectivity between regions. The usage of the high-Φ regions for analysis of the long-range coherence is optimal, since only regions with high local coherence are involved in long-range coherent clusters.
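The idea behind the clustering-based connectivity can be illustrated with an unweighted sketch. It assumes that each window yields one cluster label per voxel, with 0 meaning "not clustered", and it omits the robustness weighting mentioned above, whose definition is given in Methods. Function name, data layout and values are hypothetical.

```python
def coclustering_connectivity(window_labels, region_a, region_b):
    """Fraction of (window, voxel pair) combinations sharing a cluster.

    window_labels[w][v] is the cluster label of voxel v in window w,
    with 0 meaning the voxel is not clustered. region_a and region_b
    are lists of voxel indices; pairs of a voxel with itself are skipped.
    """
    hits = 0
    total = 0
    for labels in window_labels:
        for i in region_a:
            for j in region_b:
                if i == j:
                    continue
                total += 1
                if labels[i] != 0 and labels[i] == labels[j]:
                    hits += 1
    return hits / total


# Toy data: 3 windows, 5 voxels; two regions of two voxels each.
window_labels = [
    [1, 1, 1, 0, 0],
    [1, 1, 2, 2, 0],
    [0, 1, 1, 1, 0],
]
region_a = [0, 1]
region_b = [2, 3]

n_ab = coclustering_connectivity(window_labels, region_a, region_b)
n_aa = coclustering_connectivity(window_labels, region_a, region_a)
print(n_aa, n_ab)
```

In the toy data the within-region value exceeds the between-region value, mirroring the general pattern of stronger diagonal than off-diagonal terms.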
Like the Φ-maps, the connectivity matrices obtained are similar for different blocks within a single subject (r = 0.79, SD = 0.06) and, upon averaging over blocks, across different subjects (r = 0.65, SD = 0.15). In general, the within-region coherence (average N_aa = 0.035, SD = 0.006) is higher than the between-region coherence (average N_ab = 0.008, SD = 0.010). Nevertheless, we found strong long-range links between subsets of regions. In Fig. 10 we show the connectivity matrix averaged over blocks and subjects, and the 50% strongest links. We used the popular Louvain modularity optimization method (Blondel et al., 2008) to assign regions to subnetworks or modules. Regions within a module display higher mutual connectivity. The optimal partition identifies 6 modules. The modules are frontal (F), parietal (P), occipito-parietal (OP), occipital (O), temporal (T) and thalamus (Fig. 10). Furthermore, the precuneus (region 5) acts as a hub connecting the four fronto-parieto-occipital modules, thus showing high connectivity with all of them. More in detail, all regions in the anterior frontal cortex (12-15) are assigned to the frontal module. The regions in the temporal cortex (16-20) form the temporal module including also the midbrain (22). The thalamus (21) is not connected with other regions and forms a module alone. The occipital and parietal regions are split into 3 modules. The parietal module includes the precuneus (5) and the inferior lateral parietal regions (6-8, 11). The occipito-parietal module joins two occipital regions (1,2) and the superior lateral parietal regions (9,10). Finally, two occipital regions (3,4) form the occipital module.

Fig. 6. Stability of the high-Φ regions during the experiment. For each block separately, we considered the Φ for that block, averaged over subjects (top: corner users; bottom: color users), and identified the voxels with average Φ higher than the maximum value Φ_max found in the white matter. Most voxels pass the threshold for almost all blocks. Here we show the conjunction map, representing the number of blocks where a voxel passes the threshold.
These modules do not strictly follow mere anatomical proximity or functional subdivisions at rest. All frontal regions are in one module, but parietal regions and occipital regions are split. The module including regions in the angular gyrus and precuneus is largely composed of voxels belonging to the default network. Similarly, the module including two occipital regions is entirely composed of voxels from the visual network. By contrast, the other two modules are mixed: one (the occipito-parietal) includes regions from both the visual and the dorsal attention network, the other (the frontal) includes voxels from both the default network and the fronto-parietal control network.
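The assignment of regions to modules relies on modularity, the quantity that the Louvain method greedily maximizes. The sketch below only evaluates the standard Newman modularity Q of a given partition on a small weighted graph; it does not implement the Louvain search itself, and the toy graph and names are hypothetical.

```python
def modularity(adjacency, communities):
    """Newman modularity Q of a partition of an undirected weighted graph.

    Q = (1 / 2m) * sum_ij [A_ij - k_i * k_j / (2m)] * delta(c_i, c_j),
    where k_i is the weighted degree of node i and 2m the total degree.
    adjacency is a symmetric matrix without self-loops.
    """
    n = len(adjacency)
    degree = [sum(row) for row in adjacency]
    two_m = sum(degree)
    q = 0.0
    for i in range(n):
        for j in range(n):
            if communities[i] == communities[j]:
                q += adjacency[i][j] - degree[i] * degree[j] / two_m
    return q / two_m


# Toy graph: two triangles (nodes 0-2 and 3-5) joined by one edge (2-3).
adjacency = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [0, 0, 1, 0, 1, 1],
    [0, 0, 1, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
# Fix the deliberate asymmetry check: node 4 links to nodes 3 and 5 only.
adjacency[4] = [0, 0, 0, 1, 0, 1]

q_split = modularity(adjacency, [0, 0, 0, 1, 1, 1])
q_single = modularity(adjacency, [0, 0, 0, 0, 0, 0])
print(round(q_split, 4), round(q_single, 4))
```

Splitting the graph into its two triangles gives a clearly positive Q, whereas placing all nodes in a single module gives Q = 0, which is why a modularity-maximizing method prefers the split.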
By using an approach similar to the one used for Φ, we explored variations of the connectivity network in time. We first analyzed spatial strategy users (Fig. 11). We observed an increase of connectivity centered on the medial and lateral parietal lobe, involving namely the parieto-parietal, parieto-occipital and parieto-frontal links (p < 0.05 uncorrected, Wilcoxon test; links with p < 0.01 are also FDR corrected). Upon switch to color strategy (block 21), the strength of the links mainly centered on the parietal lobe decreased and then, as for clustering frequency, the same links showed a rebound to the connectivity level reached before the switch (p < 0.05 uncorrected, Wilcoxon test). In Fig. 11a-c, we show the links with an increase between block 1 and block 20, those with a decrease between block 20 and 21, and those with an increase between block 21 and 24 (p < .05 uncorrected, Wilcoxon test). It is apparent that there is an increase between block 1 and block 20 of the connectivity within the P module, and between the P module and the OP, O, and F modules. To improve statistical sensitivity, we performed the same tests focusing on module-wise connectivity. We averaged over all pairs of links between regions assigned to two modules: for example, by averaging all links between module P regions and module OP regions we obtained P-OP connectivity (Fig. 12a). The P-P, P-OP, P-O, P-F, and O-F links have a significant increase between block 1 and block 20 (p < .05 FDR corrected, Wilcoxon test) and a significant decrease between block 20 and block 21 (p < .05 FDR corrected, Wilcoxon test).
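The module-wise connectivity described above can be sketched as an average over all region pairs linking two modules. The sketch assumes a symmetric region-by-region connectivity matrix and, for within-module values, averages over distinct region pairs only (excluding the diagonal); that choice, the 0-based region indices and the toy values are assumptions.

```python
def module_connectivity(matrix, modules, name_a, name_b):
    """Average link strength between the regions of two modules.

    For two different modules, all cross pairs are averaged. For a
    single module, distinct unordered region pairs are averaged
    (the diagonal, i.e. within-region coherence, is excluded).
    """
    regions_a, regions_b = modules[name_a], modules[name_b]
    if name_a == name_b:
        pairs = [(i, j) for i in regions_a for j in regions_a if i < j]
    else:
        pairs = [(i, j) for i in regions_a for j in regions_b]
    return sum(matrix[i][j] for i, j in pairs) / len(pairs)


# Toy connectivity matrix for 4 regions (hypothetical values).
matrix = [
    [0.04, 0.02, 0.01, 0.03],
    [0.02, 0.05, 0.02, 0.02],
    [0.01, 0.02, 0.03, 0.01],
    [0.03, 0.02, 0.01, 0.04],
]
modules = {"P": [0, 1], "OP": [2, 3]}

p_op = module_connectivity(matrix, modules, "P", "OP")
p_p = module_connectivity(matrix, modules, "P", "P")
print(round(p_op, 4), round(p_p, 4))
```

Averaging over many region pairs reduces the number of tests and the noise of each tested quantity, which is the rationale given in the text for the gain in statistical sensitivity.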
We performed similar analyses for color users (Fig. 11d and e, 12b) by realigning the time series to the switch block. Overall, the pattern is similar to the one found in spatial strategy users. Notably, however, color users showed an increase in fronto-frontal connectivity that was not present in spatial strategy users: connectivity increased within medial prefrontal cortex (region 13) and between medial prefrontal and the two anterior lateral frontal regions bilaterally (regions 12 and 14). Using the module-wise connectivity, we found a significant increase in P-P, P-OP, P-F, O-F, OP-F and F-F links between block 1 and the transition block (p < .05 FDR corrected, Wilcoxon test). Notably, the increase was significant even before the color-corner correlation was introduced (p < .05 FDR corrected, Wilcoxon test). Thus, color users seem to be characterized by a greater integration between regions of the F module, and between the F module and the occipito-parietal modules. Finally, in the passage to color strategy (from the transition block to the subsequent block) we observe a weak decrease of connectivity within the P module (p = .007 uncorrected), and between the latter and the F, OP and O modules (p = .06, p = .07, p = .06 uncorrected).
Summary of the results. In summary, we identified 22 brain regions displaying high average local coherence over the ~1 h of task performance in all subjects.
Twenty-five out of thirty-six subjects continued to use the instructed spatial strategy and steadily improved their performance for ~50 min until they were explicitly told to switch to the color strategy for the last ~10 min. The optimization of the spatial strategy was associated with a progressive increase of local coherence in the precuneus, the lateral parietal lobe and in the medial occipital lobe. The increase of coherence in these regions was correlated with the reduction of reaction times during the optimization of the spatial strategy. When subjects were finally instructed to apply the color strategy, local coherence sharply dropped, as would be expected if those regions were indeed involved in spatial strategy optimization. In the course of the instructed application of the new color strategy, coherence showed a fast recovery, returning to the pre-switch level after only one block (2.5 min) in the anterior calcarine sulcus, in the precuneus and the angular gyrus bilaterally, thus suggesting that these regions were also involved when applying the new color strategy. While increasing their local coherence, these parietal and occipital regions showed either a decreasing or a constant average activation. Furthermore, all occipital and parietal regions encoded the relevant spatial information of the stimulus, with the exception of the anterior calcarine sulcus that processes the peripheral visual field (see also Broday-Dvir et al., 2018).

(f-j) Change of the clustering frequency as a function of the block for color strategy users. On the left of each plot, we show the first 7 blocks. On the right, we show the 6 blocks around the switch after realigning the time series of each subject to the individual switch blocks (-1 is the block in which the subjects spontaneously switch strategy, +1 the subsequent block). The vertical dashed line identifies the switch.

(e-f) Difference of frequency variation in the high-Φ regions between block 24 and block 21 in spatial strategy users. (g-h) Difference of clustering frequency between the block in which subjects spontaneously switched strategy (block "-1") and the first block (block "1") in color strategy users. (i-j) Difference of clustering frequency in the high-Φ regions between the block in which the subject spontaneously switches strategy (block "-1") and the following block (block "+1") for color strategy users. Column bars represent the average over subjects, error bars the standard error. We mark with an asterisk (*) regions where Φ increases significantly (Wilcoxon test, FDR correction at α = 0.05). In the rendering, we show regions with p < 0.05 (unc.), the colorbar represents -log(p). O: region 1-4, occipital; P: region 5-11, parietal; F: region 12-15, frontal; T: region 16,18,19,20, temporal; C: region 18, caudate; Θ: region 21, thalamus; M: region 22, midbrain.

Fig. 9. (a) Average reaction times (red) and average clustering frequency (blue) in high-Φ regions as a function of the block for spatial strategy users. Points are the mean over subjects, shaded regions ± the standard error. For each subject, reaction times are averaged over all trials in a block, and Φ is averaged over all regions. (b) Pearson correlation between Φ and reaction times in high-Φ regions for spatial strategy users. We mark with an asterisk the regions that have a significant correlation (permutation test, p < .05). O: region 1-4, occipital; P: region 5-11, parietal; F: region 12-15, frontal; T: region 16,18,19,20, temporal; C: region 18, caudate; Θ: region 21, thalamus; M: region 22, midbrain.
Eleven subjects discovered the uninstructed color strategy and applied it at a variable moment during task performance. These subjects showed overall coherence and connectivity dynamics similar to those described for spatial strategy users, but with revealing differences. Similar to spatial strategy users, color strategy users showed an increased local coherence in the occipital and parietal regions before the spontaneous strategy change. Importantly, however, only color users showed an increase of local coherence in the anterior prefrontal regions. This specificity in local coherence was mirrored also in connectivity. While the intra-module connectivity of the prefrontal module remained constant in spatial strategy users, in color users the connectivity increased (Fig. 11). Notably, the tendency to increase started immediately, even before the color and the response were associated (first 4 blocks). Moreover, like spatial strategy users, color users showed a switch-related drop in local coherence and connectivity in the parietal module.

Discussion
Humans can improve their performance in any task by gradually optimizing the implementation of a known strategy, or by devising and then adopting novel, more efficient strategies (Heathcote et al., 2000; Badre et al., 2010; Collins and Frank, 2013; Donoso et al., 2014; Schuck et al., 2015; Roeder and Ashby, 2016). Previous research has shown that practicing a task induces changes not only in the activation level of specific brain regions, but also in the long-range organization of the relevant brain networks (Chein and Schneider, 2005; Cole et al., 2013; Patel et al., 2013; Bassett et al., 2015; Bassett and Mattar, 2017). However, the network dynamics governing strategy optimization versus the discovery of a new strategy are still unknown. By applying an analysis approach integrating the Coherence Density Peak Clustering (CDPC) (Allegra et al., 2017) with Multi-Voxel Pattern analysis, standard GLM and behavioral analysis, we could identify the brain regions involved in the task learning, and describe the dynamics of local coherence and long-range connectivity involving these regions.
Incremental task optimization and instructed strategy change. Our first aim was to understand how coherence and connectivity vary when an established strategy is improved and when a forced shift to a new strategy occurs. We found that progressive task optimization shapes the activity of neural populations to become more coherent, specifically in regions involved in task processing. To the best of our knowledge, such a relation between local coherence, learning, and task performance has not been previously reported. Previous literature has mainly analyzed local coherence in the context of the resting state, highlighting its usefulness as a marker of several pathologies rather than its modulation during a task (Jiang and Zuo, 2016). A modulation of local coherence is possibly achieved by a competitive mechanism enhancing task-relevant signals while reducing unrelated signals (Aston-Jones and Cohen, 2005; Eldar et al., 2013; Schmitz and Duncan, 2018). Furthermore, an increase in local coherence may indicate a progressive noise reduction, in line with recent neurophysiological findings. Works studying neuronal variability across multiple trials, as a measure of the internal noise of a neural system, have shown that the variability of task-relevant neurons decreases when stimuli are attended or perceived (Mitchell et al., 2007; Churchland et al., 2010; Hussar and Pasternak, 2010; Schurger et al., 2015; Broday-Dvir et al., 2018; Nougaret and Genovesio, 2018, 2018). The reduction in neuronal variability has been associated with individual differences in perceptual ability, and with training in a working memory task (Qi and Constantinidis, 2012; Arazi et al., 2017).

Fig. 10. (a) Connectivity matrix between high-Φ regions, averaged over all subjects. The matrix element value corresponds to the value of N_ab. Regions have been assigned to 6 modules according to a modularity maximization algorithm. The 6 modules are separated by black lines and are shown in different colors: occipito-parietal module (OP, regions 1-2, 9-10), occipital module (O, regions 3-4), parietal module (P, regions 5-8, 11), frontal module (F, regions 12-15), temporal module (T, regions 16-20, 22), thalamus (Th, region 21). (b) The 50% strongest links in the average network in axial and sagittal view. Nodes assigned to different modules are shown in different colors: (blue) O module, (cyan) OP module, (green) P module, (yellow) F module, (orange) T module, (red) Th module.
The analysis of the connectivity between regions and its dynamics provided both a confirmation of the findings on local coherence and further insights. Four modules were identified in the fronto-parieto-occipital network with stronger connectivity between regions within each module (Blondel et al., 2008; Sporns and Betzel, 2016). Modules did not strictly follow mere anatomical proximity or functional subdivisions at rest. Thus, the engagement in the task introduced major modifications in the network organization of the brain observable at rest, as already reported elsewhere (Power et al., 2011a,b; Yeo et al., 2011; Spadone et al., 2015). The connectivity dynamics followed a pattern similar to that of local coherence, particularly in the parietal module ("P" in Fig. 10). During the task optimization phase, the regions belonging to the parietal module greatly increased connectivity, both intra-module and with all the other modules. In addition, the connectivity showed a sharp drop upon strategy change. By contrast, other modules did not generally show a systematic increase of connectivity. It would certainly be interesting to compare these findings with those obtained with a more traditional method, chiefly, standard connectivity analysis. In principle, such analysis may highlight other long-range coherent patterns correlating with behavior but escaping the CDPC analysis. This is possible only if the latter involve regions with poor local coherence. We leave such an analysis to a future investigation. Overall, our findings suggest that the increase in regional and long-range connectivity is a driver of learning. Increased connectivity favors transfer of information and integration between brain regions, with possibly different functional specializations, but all involved in processing a task (Deco and Kringelbach, 2016; Shine and Poldrack, 2018).

Panel (f) shows the links with a significant increase (p < .05 uncorrected). (g-h) Connectivity increase between block 1 and the block when subjects switched strategy (block "-1") for color strategy users. The matrix in panel (g) shows the p-value of the increase between blocks 1 and -1 (Wilcoxon test). Panel (h) shows the links with a significant increase (p < .05 uncorrected). (i-j) Connectivity decrease between the block when subjects switched strategy (block "-1") and the subsequent block (block "+1") for color strategy users. The matrix in panel (i) shows the p-value of the decrease between blocks -1 and +1 (Wilcoxon test). Panel (j) shows the links with a significant increase (p < .05 uncorrected).

Fig. 12. Strength of the links between modules as a function of the block for corner and color users. We show only links that have a significant effect (increase or change after the strategy shift) for either corner or color users (see Table 3). To facilitate the inspection of results, we show in green all links involving the parietal module, which have a similar behavior, and in yellow those involving the frontal module. (a) Strength of the links between modules as a function of the block for spatial strategy users. (b) Strength of the links between modules as a function of the block for color users. We show the 6 blocks around the switch after realigning the time series of each subject to the individual switch blocks (-1 is the block in which the subject spontaneously switches strategy, +1 the subsequent block). The vertical dashed line identifies the switch.
Compatible effects have been reported when comparing brain connectivity during rest to visuospatial attention (Al-Aidroos et al., 2012; Spadone et al., 2015), working memory (Cohen and D'Esposito, 2016; Shine et al., 2016) or flexible rule application (Vatansever et al., 2017). By contrast, the evidence on the network dynamics associated with learning is still limited, partly because the analysis tools available until recently had low sensitivity in detecting network changes (Bassett et al., 2015; Bassett and Mattar, 2017). One important study by Bassett and collaborators reported that visuomotor learning was associated with increased functional segregation (i.e. decreased temporal coherence; Bassett et al., 2015), a finding seemingly in contrast with ours. We argue that this difference is due to the different task-processing requirements of the two studies. In the task that we considered, subjects needed to rely on visual information to produce the correct motor response even when the task was highly practiced. In the study by Bassett and collaborators, subjects produced motor sequences that, once learned, could be recognized and generated from memory without further relying on visual information. This may explain why learning produced a segregation of the visual and motor networks (see also Cohen and D'Esposito, 2016). Our results would then better relate to all those real-life situations that require the integration of multiple processing paths.
While generally indicating a tendency towards integration rather than segregation, the observed connectivity increase was far from homogeneous. The parietal module, a part of the default network, had a central role, being the only one that increased connectivity with all other modules. This finding highlights an active role of the default network during task processing, in contrast with the commonly held idea that the default network is shut down when a subject is engaged in a task (Spreng, 2012; Crittenden et al., 2015; Vatansever et al., 2015; Margulies et al., 2016). Thanks to its widespread connectivity, the default network could both receive sensory information and affect all task-relevant regions, to optimize stimulus processing and decision (Bar, 2007; Margulies et al., 2016; Vatansever et al., 2017; Dohmatob et al., 2018). Interestingly, the fact that the intra- and extra-module connectivity of the parietal module rebounds one block after the switch suggests that optimization is an abstract process, recruited for strategies as diverse as those based on spatial information and color information. By contrast, during switching, when the system is reorganizing to cope with the new strategy, the optimization is transiently paused.
Spontaneous alternative strategy discovery and change. Our second aim was to understand how the spontaneous generation of new strategies is related to coherence and connectivity dynamics. We found that connectivity patterns in medial prefrontal and rostrolateral prefrontal cortex reflected the engagement of processes for the discovery of novel strategies. Notably, connectivity among different frontal regions differed between participants long before their behavior began to change, foreshadowing who would discover a novel strategy and who would not. Rostrolateral prefrontal cortex has been proposed to be responsible for the evaluation of potential alternative strategies (Donoso et al., 2014; Domenech and Koechlin, 2015; Badre and Nee, 2018), while our own work has suggested that medial prefrontal cortex is involved in the internal simulation of an alternative strategy (Schuck et al., 2015). Moreover, the frontal regions also include parts of orbitofrontal cortex that have been linked to the representation of task states, i.e. the information underlying choice selection (Wilson et al., 2014; Schuck et al., 2016; Badre and Nee, 2018). It is thus possible that the connectivity increases reflect cross-talk between the above-named computations that are involved in finding and implementing a novel strategy. While this is also consistent with proposals relating the default network to background exploration (Bar, 2007; Crittenden et al., 2015; Vatansever et al., 2015; Margulies et al., 2016; Dohmatob et al., 2018), our findings additionally show a functional differentiation within the default network (Karahanoğlu and Van De Ville, 2015). The observation that the connectivity dynamics in the two subject groups diverged from the beginning of the experiment suggests that the equilibrium between these two poles might be a relatively stable individual feature (Melnick et al., 2013; Beaty et al., 2018), possibly present even from childhood.
Limitations. The present study has some limitations. First, we could not identify robust network modifications associated with spontaneous strategy change. This is probably due to the relatively small number of spontaneously switching subjects. While the proper number of subjects necessary to uncover network changes with CDPC was not known in advance, a posteriori we observe that n = 11 is sufficient to detect the stronger, abrupt changes brought about by instructed strategy change, but not the subtle, possibly gradual ones occurring in spontaneous strategy change. A second limitation is that we did not perform resting-state fMRI on the same subjects. This would have provided a more straightforward "baseline" to benchmark the task CDPC results, and possibly allowed us to detect subtle differences between the spatial and color strategy users existing already in their spontaneous BOLD signal. Finally, the quite homogeneous demographics (all subjects were university students aged 21-31) did not allow us to investigate how general individual traits such as age, education, or general intelligence might modulate the observed behavioral and functional changes.
Conclusions. We explored how brain networks behaved while human subjects optimized their strategy or created a new one. The observed network dynamics indicate a pivotal role of default-mode network regions, but with a clear functional differentiation within the network. While the posterior part of the default-mode network increased connectivity and local coherence when subjects optimized their current strategy, the anterior part of the network, together with the rostrolateral prefrontal cortex, was only involved in subjects who changed strategy. We speculate that the different behavior of the default-mode network in different people is a stable individual feature. A key ingredient for performing this analysis and highlighting this complex scenario is the use of an unsupervised clustering approach, capable of capturing the presence of transient coherence and of monitoring the subtle changes in this coherence during learning.
CRediT authorship contribution statement