Rapid suppression and sustained activation of distinct cortical regions for a delayed sensory-triggered motor response

SUMMARY
The neuronal mechanisms generating a delayed motor response initiated by a sensory cue remain elusive. Here, we tracked the precise sequence of cortical activity in mice transforming a brief whisker stimulus into delayed licking using wide-field calcium imaging, multiregion high-density electrophysiology, and time-resolved optogenetic manipulation. Rapid activity evoked by whisker deflection acquired two prominent features for task performance: (1) an enhanced excitation of secondary whisker motor cortex, suggesting its important role connecting whisker sensory processing to lick motor planning; and (2) a transient reduction of activity in orofacial sensorimotor cortex, which contributed to suppressing premature licking. Subsequent widespread cortical activity during the delay period largely correlated with anticipatory movements, but when these were accounted for, a focal sustained activity remained in frontal cortex, which was causally essential for licking in the response period. Our results demonstrate key cortical nodes for motor plan generation and timely execution in delayed goal-directed licking.

In brief
Esmaeili, Tamura, et al. investigate cortical contributions to a task in which mice learn to respond to a brief whisker stimulus with delayed licking for reward. They find that suppression of orofacial sensorimotor cortex inhibits premature licking, whereas excitation of secondary motor cortex maintains a lick plan during the delay period.

INTRODUCTION
Incoming sensory information is processed in a learning- and context-dependent manner to direct behavior. Timely execution of appropriate action requires motor planning, in particular when the movement triggered by a sensory cue needs to be delayed. In this situation, the motor plan must persist throughout the delay period while the immediate execution of the motor response needs to be suppressed. Delayed-response paradigms are often used to study the neuronal circuits of sensorimotor transformation, because they allow the neuronal activity that bridges sensation and action to be isolated in time. In such paradigms, prominent delay-period activity has been reported in many cortical areas (Chabrol et al., 2019; Chen et al., 2017; Erlich et al., 2011; Esmaeili and Diamond, 2019; Fassihi et al., 2017; Funahashi et al., 1989; Fuster and Alexander, 1971; Gilad et al., 2018; Guo et al., 2014; Li et al., 2015; Makino et al., 2017; Tanji and Evarts, 1976). In particular, a previous study in mice identified delay-period activity in the anterolateral motor (ALM) cortex, which causally contributed to a lick motor plan (Guo et al., 2014). The persistent delay-period activity in ALM is driven through a recurrent thalamocortical loop (Guo et al., 2017) and further supported by cerebellar interactions (Chabrol et al., 2019; Gao et al., 2018). The circuit mechanisms maintaining the persistent activity in ALM are therefore beginning to be understood. However, less is known about the circuits that initiate such persistent activity and how task learning shapes such circuits. In addition, how the persistent neuronal activity is related to body movements that animals often exhibit during delay periods needs to be carefully considered (Musall et al., 2019; Steinmetz et al., 2019; Stringer et al., 2019). Similarly, the neuronal circuits contributing to withholding a premature motor response during the delay are poorly understood.
To dissect this process, it would be crucial to examine how neuronal activity flows across brain areas as sensory information is transformed into goal-directed motor plans (de Lafuente and Romo, 2006) and investigate how the underlying sensory and motor circuits become connected through reward-based learning (Esmaeili et al., 2020).
Here, we address these questions in head-restrained mice performing a whisker-detection task with delayed licking to report perceived stimuli. In our task, a brief and well-defined sensory input is rapidly transformed into a decision, and mice need to withhold the response until the end of the delay period.

Article
Through a unified and comprehensive approach, we detail the spatiotemporal map of causal cortical processing that emerges across learning. We found that following the fast sensory-evoked response in somatosensory cortex (Petersen, 2019), the activity in orofacial sensorimotor cortex (Mayrhofer et al., 2019) was rapidly and transiently suppressed, which contributed causally to withholding premature licking. The subsequent rapid sequential excitation of frontal cortical regions and their changes across task learning revealed that secondary whisker motor cortex (wM2) likely plays a key role linking whisker sensation to motor planning. We also found that the global activation of dorsal cortex during the delay period could be largely ascribed to preparatory movements that develop with learning, except for a localized neuronal activity in ALM (Komiyama et al., 2010), consistent with previous studies (Guo et al., 2014). Our results therefore point to task-epoch-specific contributions of distinct cortical regions to whisker-triggered planning of goal-directed licking and timely execution of planned lick responses.

RESULTS
Behavioral changes accompanying delayed-response task learning
We designed a go/no-go learning paradigm where head-restrained mice learned to lick in response to a whisker stimulus after a 1-s delay period (Figures 1A-1C). To precisely track the sequence of cortical responses, we used a single, short (10 ms) deflection of the right C2 whisker. Trial start was indicated by a 200-ms light flash, followed 1 s later by the whisker stimulus in 50% of the trials (referred to as go trials); after a subsequent 1-s delay, a 200-ms auditory tone signaled the beginning of a 1-s response window. Licking during the response window, in go trials, was rewarded with a drop of water, whereas licking before the auditory tone (early lick) led to abortion of the trial and time-out punishment (Figure 1B). To study essential neuronal circuit changes specific to the coupling of the whisker stimulus with the licking response, a two-phase learning paradigm was implemented: (1) pretraining and (2) whisker training (Figure 1C). Pretraining included trials with visual and auditory cues only, and licking during the response window was rewarded, while licking before the auditory cue aborted the trial. Novice mice only went through the pretraining, which established the general task structure. Expert mice followed an additional whisker-training phase, during which they learned the final task structure (Figures 1B, 1C, and S1A).
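The trial structure described above can be summarized as a small outcome classifier. This is only an illustrative sketch in Python: it assumes that all intervals are measured from cue onsets (so the auditory tone falls 2 s after the light flash), and the names `classify_trial`, `TONE_TIME`, and `RESPONSE_WINDOW` are hypothetical rather than taken from the study's analysis code.

```python
from typing import Optional

# Assumed event times in seconds from visual-cue onset (hypothetical constants).
# The whisker stimulus (go trials only) would fall at 1.0 s under this assumption.
TONE_TIME = 2.0          # auditory tone onset = start of the response window
RESPONSE_WINDOW = 1.0    # duration of the response window


def classify_trial(is_go: bool, first_lick: Optional[float]) -> str:
    """Label a trial from its type and the time of the first lick.

    first_lick is in seconds from visual-cue onset; None means no lick.
    """
    # Any lick before the tone aborts the trial, whatever the trial type.
    if first_lick is not None and first_lick < TONE_TIME:
        return "early lick"
    # A response counts only if it falls inside the response window.
    licked = first_lick is not None and first_lick < TONE_TIME + RESPONSE_WINDOW
    if is_go:
        return "hit" if licked else "miss"
    return "false alarm" if licked else "correct rejection"
```

For example, `classify_trial(True, 2.3)` labels a go trial with a first lick 300 ms after tone onset as a hit, whereas `classify_trial(True, 1.6)` is an early lick during the delay.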
Novice and expert mice were recorded in the same final task condition but performed differently. While novice mice licked in both go and no-go trials, expert mice had learned to lick preferentially in go trials (Figures 1D and S1A; mean ± SEM, novice: hit = 70.6% ± 3%, false alarm = 71.1% ± 2.7%, p = 0.85, n = 15 mice; expert: hit = 67% ± 1.5%, false alarm = 19.7% ± 1.6%, p < 0.001, n = 25 mice; Wilcoxon signed-rank test). Expert mice made more frequent premature early licks in go trials compared to novice mice ( Figure 1D; mean ± SEM, novice = 12.5% ± 2.7%, expert = 25.1% ± 3.3%, p = 0.02; Wilcoxon rank-sum test), and most of their early licks happened toward the end of the delay period, reflecting predictive licking. Considering trials with licking during the response window, expert mice showed longer reaction times in no-go trials (false alarm) compared to go trials (hit) ( Figure S1B; mean ± SEM, novice: hit = 298.5 ± 21.2 ms, false alarm = 292.7 ± 21.4 ms, p = 0.14; expert: hit = 297.8 ± 16.7 ms, false alarm = 380.3 ± 16.3 ms, p < 0.01; Wilcoxon signed-rank test). These results indicate that expert mice used whisker information and learned to produce delayed licking. After whisker training, mice also adopted new movement strategies ( Figures 1E, 1F, S1C, and S1D). In hit trials, expert mice compared to novice mice decreased whisker movement before whisker stimulus, possibly to improve the detection of brief whisker stimuli in the receptive mode of perception (Diamond and Arabzadeh, 2013;Kyriakatos et al., 2017). The tongue and jaw movements in the delay period after the whisker stimulus increased in hit trials of expert mice compared to novice mice, reflecting preparation for licking. These anticipatory movements were absent in miss and correct-rejection trials ( Figure S1C) and thus correlated with the perceptual response. These patterns were similar comparing mice used for electrophysiology and imaging (Figures 1F and S1D).
Emergence of cortical activation and deactivation patterns through whisker training
The delay task enables the investigation of different aspects of neuronal computations underlying reward-based behavior, including sensory processing, motor planning, and motor execution in well-isolated time windows. As a first step, we mapped the large-scale dynamics of cortical activity using wide-field calcium imaging at a high temporal resolution (100 frames per second) (Figures 2, S2, and S3). In transgenic mice expressing a fluorescent calcium indicator in pyramidal neurons (RCaMP mice) (Bethge et al., 2017), functional images of the left dorsal cortex were obtained through an intact skull preparation and registered to the Allen Mouse Brain Common Coordinate Framework (Figures 2A and 2B; Lein et al., 2007; Wang et al., 2020).
To examine the changes in cortical processing upon learning, we compared the activity in the same mice (n = 7) before (novice, 62 sessions) and after (expert, 82 sessions) whisker training (Figure 2C for hit trials and Figure S2A for correct-rejection trials; Videos S1 and S2). The visual cue evoked responses in the primary visual (Vis) and surrounding areas (Andermann et al., 2011; Marshel et al., 2011; Wang and Burkhalter, 2007), which decreased after whisker training (Figures 2C and S2A; subtraction between novice and expert mice images; Wilcoxon rank-sum test, p < 0.05; for details, see STAR Methods). Stimulation of the C2 whisker evoked two focal responses, in the primary and secondary whisker somatosensory areas (wS1 and wS2), in both novice and expert mice (Figure 2C). Immediately after, activity transiently decreased in orofacial areas, including the primary tongue/jaw sensory and motor areas (tjS1 and tjM1), followed by a widespread gradual increase toward the auditory cue, initiating in the primary and secondary motor areas for whisker (wM1 and wM2) and tongue/jaw (tjM1 and ALM), as well as posterior parietal cortex (PPC) and limb/trunk areas. These positive and negative responses during the delay period were selective to hit trials of expert mice (Figures 2C, S2B, and S2C; Videos S3 and S4). We further quantified response selectivity of different cortical regions for hit and correct-rejection trials by comparing their trial-by-trial activity based on receiver operating characteristic (ROC) analysis (Figure 2D; see STAR Methods). Across all cortical regions tested, selectivity was significantly enhanced in expert compared to novice mice during the delay period and response window (p < 0.05; non-parametric permutation test). Therefore, important learning-induced global changes of information processing emerged during the delay period.
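As an illustration of the ROC-based selectivity analysis, the sketch below computes the area under the ROC curve (AUC) between hit and correct-rejection trials for each time bin. The text states only that selectivity was derived from the AUC; the rescaling to the range -1 to 1 and the function names (`roc_auc`, `selectivity_index`, `binned_selectivity`) are assumptions made here for illustration, and the exact definition is given in STAR Methods.

```python
def roc_auc(pos, neg):
    """AUC as the probability that a value drawn from `pos` exceeds one
    drawn from `neg`, with ties counted as one half."""
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def selectivity_index(hit, cr):
    """Assumed rescaling of the AUC from [0, 1] to [-1, 1]:
    +1 = always larger in hit trials, -1 = always larger in correct rejections."""
    return 2.0 * (roc_auc(hit, cr) - 0.5)


def binned_selectivity(hit_trials, cr_trials):
    """Selectivity per time bin; each trial is a list with one value per bin."""
    n_bins = len(hit_trials[0])
    return [
        selectivity_index([t[b] for t in hit_trials], [t[b] for t in cr_trials])
        for b in range(n_bins)
    ]
```

The pairwise AUC used here is quadratic in the number of trials, which is adequate for session-sized data; a rank-based formulation would give the same value more efficiently.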
To control for hemodynamic effects of the wide-field fluorescence signal (Makino et al., 2017), we also imaged transgenic mice expressing an activity-independent red fluorescent protein, tdTomato, which has excitation and emission spectra similar to RCaMP (Figure S3; 57 sessions from 7 expert mice). We imaged RCaMP and tdTomato mice at the same baseline fluorescence intensity (Figure S3E; p = 0.80, Wilcoxon rank-sum test, n = 7 RCaMP mice and n = 7 tdTomato mice) by adjusting illumination light power and using identical excitation and emission filters. The tdTomato control mice showed significantly smaller task-related changes in fluorescence than the RCaMP mice (Figures S3A-S3D; subtraction between RCaMP and tdTomato mice images; Wilcoxon rank-sum test, p < 0.05). In visual cortex of both RCaMP and tdTomato mice, negative intrinsic signals were evoked ~1 s after the visual stimulus. However, the short whisker stimulation evoked a rapid positive sensory response only in RCaMP mice, and no clear response was evoked in tdTomato mice (Figure S3F). On the other hand, some positive intrinsic optical signals were evoked in midline and frontal regions of tdTomato mice, but the amplitude of these signals was significantly smaller than for RCaMP mice (Wilcoxon rank-sum test, p < 0.05). These results suggest that the spatiotemporal patterns of fluorescence signals in RCaMP mice largely reflected the calcium activity of the cortex.

Figure 2 (partial legend). ... In vivo images were registered to the Allen Mouse Brain Atlas (right). Cortical areas targeted for electrophysiological experiments are indicated: wS1, primary whisker somatosensory cortex; wS2, secondary whisker somatosensory cortex; wM1, primary whisker motor cortex; wM2, secondary whisker motor cortex; Aud, auditory area; Vis, visual area; PPC, posterior parietal cortex; tjM1, primary tongue and jaw motor cortex; ALM, anterior lateral motor area. (C) Grand-average time course of global cortical activity in hit trials for novice versus expert mice. Each frame shows ΔF/F0 without temporal smoothing (10 ms/frame). For each pixel, baseline activity in a 50-ms window before visual cue onset was subtracted. Mean calcium activity of 62 novice and 82 expert sessions from seven mice, novice and expert difference, and the statistical significance of the difference (p value of Wilcoxon rank-sum test, FDR corrected, p < 0.05) are plotted from top to bottom. Green traces, anatomical borders based on Allen Mouse Brain Atlas. Black plus sign indicates bregma. (D) Selectivity index in novice and expert mice. For each brain region, selectivity between hit versus correct-rejection trials was determined in non-overlapping 50-ms bins based on the area under the ROC curve. Mean selectivity of each area in 62 novice and 82 expert sessions from seven mice, and the statistical significance of the difference (p value of non-parametric permutation test, FDR corrected, p < 0.05) is plotted. Region-of-interest (ROI) size, 3 × 3 pixels. See also Figures S2 and S3 and Videos S1, S2, S3, and S4.

Distinct modification of early and late whisker processing in single neurons
To further investigate learning- and task-related cortical dynamics with higher temporal and spatial resolution, we carried out high-density extracellular recordings (Buzsáki, 2004) from 12 brain regions, with guidance from wide-field calcium imaging (Figures 2 and S2), optical intrinsic imaging, and previous literature (Esmaeili and Diamond, 2019; Guo et al., 2014; Harvey et al., 2012; Kyriakatos et al., 2017; Le Merre et al., 2018; Mayrhofer et al., 2019; Sippy et al., 2015; Sreenivasan et al., 2016), including: Vis, wS1, wS2, wM1, wM2, tjM1, ALM, PPC, auditory cortex (Aud), the dorsolateral region of striatum innervated by wS1 (DLS), medial prefrontal cortex (mPFC), and the dorsal part of hippocampal area CA1 (dCA1). Two areas were recorded simultaneously during any given session. The precise anatomical location of the recording probes was determined by 3D reconstruction of the probes' tracks using whole-brain two-photon tomography and registration to the Allen atlas (Figures 3A and S4A-S4C; for details, see STAR Methods; Lein et al., 2007; Wang et al., 2020). In total, 4,415 neurons (classified as regular spiking units [RSUs] based on their spike waveform) were recorded in 22 expert mice, and 1,604 RSUs were recorded in 8 novice mice. Single neurons encoded different task aspects such as whisker sensory processing, lick preparation, and lick execution (Figure 3B). Assuming that neurons with similar firing dynamics perform similar processing, it is informative to identify those temporal patterns and investigate whether a single pattern is confined or distributed across the brain. We therefore performed unsupervised clustering of neurons according to their temporal firing pattern in different trial types (hit, miss, correct rejection, and false alarm) by pooling neurons from different brain regions of both novice and expert mice (Figures 3C and S5; see STAR Methods). Gaussian mixture model (GMM) clustering (Figures S5A and S5B) yielded 24 clusters of neurons, among which 20 were modulated in at least one of the task epochs (Hastie et al., 2009). By sorting task-modulated clusters by their onset latency and labeling them based on their task-epoch-related response, we analyzed the distribution of clusters across areas along a functional axis (Figures 3C and 3D). Clusters composed predominantly of neurons from expert mice were particularly modulated during the delay period (clusters 5-7) and the response window (clusters 14, 15, and 17) and were mainly distributed across different motor-related areas (Figure 3D). Next, we calculated a "distribution index," which quantifies within-area versus between-area composition of clusters (Figures 3D and S5C; for details, see STAR Methods). The distribution index was small for visual and whisker clusters, indicating localized distribution of those clusters in specific brain regions. On the other hand, the distribution index was large in the majority of response clusters, indicating broad distribution of those clusters across brain areas.
Across learning, prominent activity patterns remained similar in wS1, wS2, and Vis areas, whereas they changed in all other regions (Figure S5D).
To reveal spatial changes in neuronal firing following whisker training, we calculated the average time-dependent firing rate for all recording probes ( Figure S4D; Videos S5 and S6) and for the 12 anatomically defined areas ( Figure 3E). The visual cue evoked responses localized in Vis and PPC of novice and expert mice. Following the auditory cue, excitation rapidly covered all recorded regions in both mice groups. Major changes following whisker training appeared in the delay period between the whisker and auditory stimuli. Similar to deactivation patterns of orofacial cortex revealed by wide-field imaging ( Figure 2C), tjM1 showed a transient suppression of firing after whisker stimulation in expert mice. The whisker stimulus also evoked a widespread excitation across whisker sensorimotor areas (wS1, wS2, wM1, and wM2), as well as PPC, DLS, and ALM, with different latencies. The initial excitation was significantly enhanced in wM2 and ALM (non-parametric permutation test, p < 0.05). Firing rates of all areas in novice mice returned to baseline levels shortly after whisker stimulation, whereas in expert mice, wS2, PPC, DLS, wM2, ALM, and tjM1 neurons showed increased activity in different parts of the delay. PPC neuronal firing remained elevated only during the first part of the delay period, returning to baseline before the auditory cue, while the activity of wM2, DLS, and tjM1 neurons ramped up toward the lick onset. Average neuronal firing in ALM remained elevated throughout the entire delay period. These results suggest that the whisker training enhanced the initial distributed processing of the whisker stimulus and formed the memory of a licking motor plan among higher-order areas of whisker and tongue/jaw motor cortices, while introducing a transient inhibitory response in tjM1.
We further investigated what was encoded in the acquired neural activity by considering other trial types. First, we found that the pronounced delay-period activity during hit trials was absent in miss trials and thus correlated with percept (Figures S4E and S4F). Second, we quantified the selectivity of single neurons for whisker detection and delayed licking by comparing their trial-by-trial spiking activity in hit and correct-rejection trials based on ROC analysis (Figure 3F; see STAR Methods). We found that a significantly larger percentage of neurons became selectively recruited during the delay period in many areas of the expert mice, suggesting the possible involvement of widespread cortical networks in the acquisition of motor planning for delayed licking (p < 0.05; non-parametric permutation test).

Figure 3 (partial legend). ... Figure 2B. (B) Example neurons from expert mice. Raster plots and peristimulus time histograms (PSTHs) for three representative units in wS1, ALM, and tjM1 encoding whisker, delay, and licking, respectively. Trials are grouped and colored based on trial outcome. (C) Unsupervised neuronal clustering. Activity maps of all single units from novice and expert mice clustered based on their trial-type average normalized firing rate. Black horizontal lines separate different clusters. Labels on the right indicate the task epoch, where the response onset was observed on cluster average response. Only task-modulated clusters (20/24) are shown. (D) Composition of clusters. Left: weighted proportion of neurons within each cluster belonging to different brain regions in novice and expert mice. Right: percentage of neurons in each cluster from novice and expert mice and distribution index. To calculate distribution index for each cluster, the probability distribution of the area composition was compared to a uniform distribution, and an index between 0 (localized in one area) and 1 (uniformly distributed) was defined.
Values are corrected for different sample size in different areas and mouse groups. (E) Population firing rate in hit trials. Left: baseline-subtracted mean firing rate (mean ± SEM) in each region is superimposed for expert (purple) and novice (cyan) mice. Right: p value map of expert versus novice mice comparison in 50 ms non-overlapping windows (non-parametric permutation test, FDR corrected). (F) Proportion of neurons with significant selectivity index in novice and expert mice. For individual neurons, selectivity between hit versus correct-rejection trials was determined in 100-ms non-overlapping windows based on the area under the ROC curve. Percentage of neurons with significant negative (left) or positive (right) selectivity in each region is shown across time in novice (top) and expert (bottom) mice. Significance of selectivity was determined using non-parametric permutation tests (p < 0.05). See also Figures S4 and S5 and Videos S5 and S6.
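The legend specifies only that the distribution index compares a cluster's area composition with a uniform distribution and ranges from 0 (localized in one area) to 1 (uniformly distributed). One measure with these properties is the normalized Shannon entropy, sketched below. Choosing entropy, correcting for sample size by dividing by the number of neurons recorded per area, and the function name `distribution_index` are all assumptions made for illustration; the definition actually used is in STAR Methods.

```python
import math


def distribution_index(cluster_counts, area_totals):
    """Normalized entropy of a cluster's area composition (assumed definition).

    cluster_counts[i]: neurons of this cluster recorded in area i.
    area_totals[i]:    all neurons recorded in area i (sampling correction).
    """
    # Fraction of each area's neurons that fall in the cluster, so that
    # heavily sampled areas do not dominate the composition.
    weights = [c / t for c, t in zip(cluster_counts, area_totals)]
    total = sum(weights)
    if total == 0:
        raise ValueError("cluster contains no neurons")
    probs = [w / total for w in weights]
    entropy = -sum(p * math.log(p) for p in probs if p > 0)
    # Divide by the entropy of the uniform distribution over all areas.
    return entropy / math.log(len(probs))
```

Under this definition, a cluster found in a single area scores 0, and a cluster that recruits the same fraction of neurons from every area scores 1 regardless of how many neurons were recorded in each.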

Active suppression of orofacial sensorimotor areas
In the delay period, expert mice showed a transient suppression in broad orofacial sensorimotor cortices selectively in hit trials (Figures 2, 3, S2, and S4). The suppression of activity in this region coincided with the onset of the whisker-evoked excitation in adjacent secondary motor cortices, including ALM (Figure 4A). This inhibition could contribute to suppressing immediate licking in response to the whisker stimulus. To test this hypothesis, we first compared trials in which mice successfully withheld licking until the end of the delay period (hit), with trials in which mice made premature licking following the whisker stimulus (early licks). We found that tjM1 activity was significantly suppressed in hit compared to early lick trials (Figure 4B) in both calcium imaging signals (tjM1: p = 0.040; Wilcoxon signed-rank test) and neuronal firing rate (tjM1: p = 0.017; non-parametric permutation test). Next, to evaluate the causal role of tjM1 in the suppression of premature licking, we optogenetically manipulated tjM1 activity during task execution (Figure 4C). Activation of tjM1 in transgenic mice expressing ChR2 in excitatory neurons (Emx1-ChR2) increased the fraction of early licks (Figure 4C; n = 19 sessions in six expert mice; light-off versus light trials, no-go trials: p = 4.27 × 10⁻⁴; go trials: p = 1.94 × 10⁻³; Wilcoxon signed-rank test). Conversely, inactivation of tjM1 in transgenic mice expressing ChR2 in GABAergic inhibitory neurons (VGAT-ChR2) (Guo et al., 2014) significantly reduced premature licking (Figure 4C; 32 sessions in nine mice; light-off versus light trials, go trials: whisker: p = 0.018, delay: p = 2 × 10⁻⁶; Wilcoxon signed-rank test). The opposite effect of these optogenetic manipulations indicates that the behavioral changes are not visually induced by the stimulation light. Altogether, these results suggest that the tjM1 suppression acquired in expert mice plays an important role in delaying the lick response.

Figure 4 (partial legend). ... left) and expert (n = 82 sessions, lower left) sessions from seven mice, calcium traces (middle, mean ± SEM), and firing rates (right, mean ± SEM) in tjM1 and ALM after whisker stimulus. For the calcium signal, the mean during a 50-ms period before whisker stimulation is subtracted, and for spiking data, the mean during 200 ms before whisker onset is subtracted. (B) tjM1 suppression during delayed licking in expert mice. Top: calcium traces averaged (mean ± SEM) across hit and early lick trials in ALM (left) and tjM1 (middle) and comparison of signal amplitude in the suppression window (right, 160-210 ms after whisker stimulus; n = 82 sessions from seven mice; ALM: p = 2.93 × 10⁻⁴, tjM1: p = 0.040; Wilcoxon signed-rank test, FDR corrected). Mean signal during 50-ms period before whisker onset is subtracted. Bottom: average spiking activity (mean ± SEM) in hit versus early lick trials in ALM (left) and tjM1 (middle) and comparison in the suppression window (right, 50-100 ms; ALM: n = 766 neurons, p = 0.466, tjM1: 377 neurons, p = 0.017, non-parametric permutation test, FDR corrected). Mean spike rate during 200 ms before whisker stimulus is subtracted. Trials with first-lick latency ranging from 300 to 1,000 ms after whisker stimulus onset were selected for early lick trials. (C) Causal contribution of tjM1 activity to delayed licking. Left: optogenetic activation and inactivation of tjM1 were performed in Emx1-ChR2 and VGAT-ChR2 transgenic mice, respectively. Middle: fraction of early lick trials in go and no-go conditions upon tjM1 activation and no-light control trials (n = 19 sessions in six expert mice; light-off versus light trials, no-go trials: p = 4.27 × 10⁻⁴, go trials: p = 1.94 × 10⁻³; Wilcoxon signed-rank test, FDR corrected). Right: fraction of early licks in go and no-go trials upon tjM1 inactivation during whisker or delay epochs (n = 32 sessions in nine expert mice; light-off versus light trials, no-go trials: whisker: p = 0.239, delay: p = 1.2 × 10⁻⁴; go trials: whisker: p = 0.018, delay: p = 2 × 10⁻⁶; Wilcoxon signed-rank test, FDR corrected). Thick lines show mean ± SEM; lighter lines show individual sessions. For details, see STAR Methods. (D) Movement suppression in no-lick trials. Top: wide-field images 250 ms after auditory cue in miss (left) and correct-rejection (middle) trials and p value of comparison (right; n = 82 expert sessions from seven mice; p value of Wilcoxon signed-rank test, FDR corrected). Mean signal during the 50-ms period before auditory onset is subtracted. Bottom: baseline-subtracted (200 ms prior to auditory cue) average firing rate (mean ± SEM) of tjM1 neurons in miss versus correct-rejection trials (left) and the comparison of mean tjM1 spike rate during the response window (200-1,000 ms window after auditory cue; n = 377 neurons; p = 0.005, non-parametric permutation test); percentage of neurons with positive (solid lines) and negative (dotted lines) modulation in miss (red) and correct-rejections (blue) trials during the response period compared to baseline (right) (p < 0.05; non-parametric permutation test, FDR corrected).
To further investigate the relationship between reduction of cortical activity and movement suppression, we compared neural activity after the auditory cue between correct-rejection and miss trials, as they likely reflect distinct origins of a ''no-lick'' response ( Figure 4D). We found that the calcium signal in orofacial sensorimotor cortices showed significantly stronger suppression in correct-rejection trials compared to miss trials (Figure 4D; p < 0.05; Wilcoxon signed-rank test). Consistently, the spiking activity in tjM1 during the response window revealed a stronger inhibition in correct-rejection trials ( Figure 4D, p = 0.005; non-parametric permutation test). Moreover, in the same behavioral epoch, a larger proportion of neurons in tjM1 were negatively modulated in correct-rejection trials ( Figure 4D; p < 0.05; non-parametric permutation test). These results highlight the correlation and causality between the deactivation of orofacial sensorimotor cortex and active suppression of licking.
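Several of the comparisons in this section rely on non-parametric permutation tests. The general idea can be sketched as follows, assuming an unpaired, two-sided test on the difference of means with random label shuffling. The test statistic, the pairing of observations, and the number of permutations used in the study are specified in STAR Methods and may differ; `permutation_test` and its parameters are hypothetical names.

```python
import random


def _mean(values):
    return sum(values) / len(values)


def permutation_test(a, b, n_perm=10000, seed=0):
    """Two-sided permutation p value for a difference in means between
    two independent samples (assumed variant, for illustration only)."""
    rng = random.Random(seed)          # fixed seed for reproducibility
    observed = abs(_mean(a) - _mean(b))
    pooled = list(a) + list(b)
    n_a = len(a)
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)            # reassign group labels at random
        diff = abs(_mean(pooled[:n_a]) - _mean(pooled[n_a:]))
        if diff >= observed:
            extreme += 1
    # Add-one correction so the estimated p value is never exactly zero.
    return (extreme + 1) / (n_perm + 1)
```

With many comparisons across time windows or regions, the resulting p values would still need a multiple-comparison correction, as the FDR correction noted in the figure legends indicates.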

Routing of whisker information to frontal cortex
The brief whisker stimulation allowed us to follow the sequence of evoked responses across cortical regions. Frame-by-frame analysis of high-speed calcium imaging data and high-resolution quantification of spiking activity showed that the whisker stimulus evoked the earliest responses in wS1; activity then spread to wS2, wM1, wM2, and finally reached ALM. This earliest sequence of excitation, as well as the deactivation of tjM1/S1, was significantly enhanced across learning by whisker training (Figure 5A; Wilcoxon rank-sum test, p < 0.05). This sequential activation and deactivation were diminished when mice failed to lick (miss trials) (Figures S6D and S6E; Wilcoxon signed-rank test, p < 0.05), supporting their involvement in whisker detection and delayed licking (see also Figure 4).
To test whether the sequential activation of cortical areas occurs in single trials, we examined whether the variability of the response latency in wS1 propagates to downstream areas in the imaging data. We divided the data into slow and fast trials based on the latency of the whisker-evoked response in wS1 (Figure 5B), and we analyzed the latencies in other areas where single-trial analysis of whisker-evoked response latency was feasible (wS2, wM1, and wM2). The latencies of those areas were significantly longer in slow trials (wS1: p = 2.2 × 10⁻⁸, wS2: p = 1.1 × 10⁻⁷, wM1: p = 2 × 10⁻⁹, wM2: p = 3.3 × 10⁻⁴; Wilcoxon signed-rank test), further suggesting a chain of activation from wS1 to the other regions.
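The fast/slow comparison can be illustrated with a short sketch. The criterion used to divide trials is not stated in the text, so a median split on the wS1 latency is assumed here, and `split_fast_slow` is a hypothetical helper name.

```python
from statistics import mean, median


def split_fast_slow(ref_latency, other_latency):
    """Mean latency in a downstream area for fast and slow trials.

    ref_latency:   per-trial latency in the reference area (e.g. wS1), in ms.
    other_latency: per-trial latency in another area, same trial order.
    Trials are split at the median reference latency (assumed criterion).
    Raises statistics.StatisticsError if one of the two groups is empty.
    """
    cut = median(ref_latency)
    fast = [o for r, o in zip(ref_latency, other_latency) if r <= cut]
    slow = [o for r, o in zip(ref_latency, other_latency) if r > cut]
    return mean(fast), mean(slow)
```

If latency variability propagates downstream, the second value returned (slow trials) should exceed the first (fast trials) in each downstream area, which is then tested across sessions.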
We also analyzed the change in response latency in single-neuron data between novice and expert mice. For neurons with significant firing rate modulation in the 200-ms window following the whisker stimulus compared to the 200 ms before the whisker stimulus (p < 0.05, non-parametric permutation test), latency was calculated as the half-maximum (minimum for suppressed neurons) whisker-evoked response (see STAR Methods). The latency of the whisker-evoked response in wM2 was shorter following whisker training, whereas that of wM1 was longer (Figure 5C; wM1: p = 0.008, wM2: p = 0.041, Wilcoxon rank-sum test). Moreover, among all areas recorded, wM2 showed the earliest significant increase in firing upon whisker training (Figure 5D; novice versus expert: p = 0.015, non-parametric permutation test), as well as the earliest significant difference comparing hit and miss trials (Figure 5E; hit versus miss: p = 0.01, non-parametric permutation test).
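A minimal sketch of the half-maximum latency measure is given below. It assumes a baseline-subtracted PSTH whose first bin starts at whisker onset, takes the first bin that reaches half of the peak (or of the trough, for suppressed neurons), and does not interpolate between bins; these details and the name `half_max_latency` are assumptions rather than the procedure in STAR Methods.

```python
def half_max_latency(psth, bin_ms=1.0, suppressed=False):
    """Latency (ms) at which the response first reaches half of its extremum.

    psth:       baseline-subtracted firing rate per bin, starting at stimulus onset.
    suppressed: if True, use the minimum (trough) instead of the maximum.
    Returns None when there is no response in the expected direction.
    """
    # Flip suppressed responses so that both cases reduce to finding a peak.
    signal = [-x for x in psth] if suppressed else list(psth)
    peak = max(signal)
    if peak <= 0:
        return None
    half = peak / 2.0
    for i, value in enumerate(signal):
        if value >= half:
            return i * bin_ms   # time of the start of the first bin at half-max
    return None
```

In the analysis described above, `psth` would be restricted to the 200-ms window following the whisker stimulus and computed only for neurons with significant modulation.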
The neuronal clustering revealed three main patterns of activity during the delay period (Figures 3C and 5F): (1) a fast and transient increase in neuronal activity following the whisker stimulus (clusters 2-4) that was mostly represented in wS1 and wS2 of both novice and expert mice; (2) a slow ramping activity (cluster 6) that was mostly represented in ALM but only in expert mice; and (3) the activity of cluster 5 rose and peaked after clusters 2-4, but before cluster 6, and slowly decayed along the delay period, thus bridging the activities of clusters 2-4 and cluster 6. Interestingly, cluster 5 was most prevalent in wM2 of expert mice, as well as contributing importantly to activity in wS2, wM1, and ALM (Figures 3C, 3D, and 5F).
Figure 5. Conversion of a sensory signal into a motor plan. (A) Wide-field signal after whisker stimulus in novice and expert mice in hit trials. Each frame shows the instantaneous calcium activity (10 ms/frame). Mean signal during the 50-ms period before whisker onset is subtracted. From top to bottom, average calcium signal of 62 novice and 82 expert sessions from seven mice, and the statistical significance of the difference (p value of Wilcoxon rank-sum test, FDR corrected). (B) Propagation of whisker-evoked response latency to downstream regions in expert mice (82 sessions, seven mice). Left: calcium traces (mean ± SEM) in different regions were grouped based on single-trial response latencies in wS1. Right: latencies of whisker-evoked calcium response (mean ± SEM) in fast and slow trials (wS1: p = 2.2 × 10⁻⁸, wS2: p = 1.1 × 10⁻⁷, wM1: p = 2 × 10⁻⁹, wM2: p = 3.3 × 10⁻⁴; Wilcoxon signed-rank test, FDR corrected). (C) Latency of the whisker-evoked spiking response. Cumulative distribution of single neuron latencies for key cortical areas in novice (left) and expert (middle) mice. Distribution of latencies across different areas and their change across learning (right). Boxplots indicate median and interquartile range. Only neurons with significant modulation in the 200-ms window following whisker stimulus compared to a 200-ms window prior to the whisker stimulus are included (p < 0.05, non-parametric permutation test). Latency was defined at the half-maximum (minimum for suppressed neurons) response within the 200-ms window. (D) Early whisker-evoked spiking activity in hit trials. Baseline-subtracted (200 ms prior to whisker onset) mean ± SEM firing rate across critical cortical areas in expert and novice mice are overlaid. Gray horizontal bars represent the p value of novice/expert comparison in 50-ms consecutive windows (non-parametric permutation test, FDR corrected). (E) Spiking activity in hit versus miss trials. Baseline-subtracted (200 ms prior to whisker onset) mean ± SEM firing rate across critical cortical areas in hit and miss trials of expert mice are overlaid. Gray horizontal bars represent the p value of hit/miss comparison in 50-ms consecutive windows (non-parametric permutation test, FDR corrected). (F) Whisker and delay responsive neuronal clusters, related to Figures 3C and 3D. Left: average normalized firing rate (mean ± SEM) of whisker (clusters 2-4) and two distinct delay clusters (clusters 5 and 6). Right: proportion of neurons within each cluster belonging to different brain regions and groups of mice, related to Figure 3D. See also Figures S5 and S6.
Figure 6. Delay processing beyond preparatory movement. (A and B) Focalized delay activity in quiet hit trials. Imaging and neuronal data were averaged across selected quiet trials with no preparatory jaw movements during the delay period (see STAR Methods). (A) Mean wide-field calcium signal in a 50-ms window during the delay period (270-320 ms after whisker onset) subtracted by the mean during the 50-ms period before whisker onset. From top to bottom, mean calcium signal of 62 novice and 82 expert sessions from seven mice, their difference, and the statistical significance of the difference (p value of Wilcoxon rank-sum test, FDR corrected). (B) Mean ± SEM firing rate in expert and novice mice (left) and p-value map of expert/novice comparison in 50-ms non-overlapping windows (non-parametric permutation test, FDR corrected) (right). (C-F) Poisson encoding model capturing trial-by-trial neuronal variability. (C) Schematic of the Poisson encoding model. Concatenated spike trains from hit and correct-rejection trials (y(t)) were fitted using a Poisson regression model (GLM). The design matrix (X(t)) included different types of task-related and movement variables (see STAR Methods). (D) Fraction of neurons significantly encoding whisker (top), delay (middle), and lick initiation (bottom) (p < 0.05, likelihood ratio test, see STAR Methods) in different regions. Asterisks represent significant change comparing the fraction of novice and expert neurons (proportion test, ***p < 0.001 (legend continued below)

Altogether (Figures 5C-5F), these results highlight wM2 as a potential node bridging sensory processing and motor planning, perhaps helping to relay whisker sensory information from wS1/wS2 to ALM.

Focalized delay-period activity in frontal cortex
The most prominent cortical change after whisker training was the emergence of widespread delay-period activity (Figures 2 and 3). In the late delay period, expert mice showed uninstructed, anticipatory movements of whisker, jaw, and tongue (Figures 1E, 1F, S1C, and S1D), which could be broadly correlated with activity across the brain (Musall et al., 2019; Steinmetz et al., 2019). To identify neural activities more directly related to task execution, we leveraged trial-by-trial variability of the neuronal activity and anticipatory movements (Figure 6).
First, we separated neural activities by selecting trials in which mice did not make jaw movements during the delay period (quiet trials) (Figures 6A, 6B, and S7A; see STAR Methods). When only quiet trials were considered, the increased calcium activity during the delay became more localized to ALM (Figure 6A). This focal activation emerged across learning (Wilcoxon rank-sum test, p < 0.05). Electrophysiology data also demonstrated a consistent localization of the neuronal delay-period activity (Figure 6B). In quiet-hit trials, only ALM population firing remained elevated throughout the delay period and was clearly enhanced by whisker training. In the other regions, the whisker-evoked firing during the delay period returned to baseline, just as in novice mice. Thus, selecting quiet trials demonstrated that the essential processing in cortex during the delay period is localized in a focal frontal region that includes ALM.
Assessing the impact of movements by considering only quiet trials highlighted the unique activity pattern of ALM during the delay period. However, quiet hits represented a minority of all hit trials in expert mice (42% ± 2%; mean ± SEM). Trials with movements during the delay period may carry richer information about how neuronal activity drives behavior. Therefore, to capture neuronal encoding during single trials, we used a generalized linear model (GLM) (Nelder and Wedderburn, 1972) to fit a Poisson encoding model to the spike trains of individual neurons. The design matrix included three types of predictors (Figure 6C): discrete task events (e.g., sequential boxcars time-locked to sensory stimuli and first-lick onset), analog movement signals (whisker, tongue, and jaw speed), and slow variables capturing motivational factors (e.g., current trial number) and trial history (e.g., outcome of the previous trial).
We assessed fit quality using predictor-spike mutual information and selected only the neurons with a good quality of fit for the rest of the analysis (Cover and Thomas, 1991; Gerstner et al., 2014; Figure S7C; see STAR Methods). The contribution of each model variable to the neuron's spiking activity was tested by re-fitting the data after excluding the variable of interest (reduced model) and comparing the fit quality to the model including all variables (full model) using a likelihood ratio test (Figures 6D and S7D; Buse, 1982).
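The full-versus-reduced comparison can be illustrated with a deliberately simplified sketch. Here the "full" model contains only an intercept and one boxcar predictor, so the Poisson maximum-likelihood rates have a closed form (the mean spike count inside and outside the boxcar), and the "reduced" model is a constant rate. This is an assumption made to keep the example self-contained; the model in the paper contains many predictors and is fitted numerically. The function names and the spike counts are hypothetical.

```python
import math


def poisson_loglik(counts, rates):
    """Poisson log-likelihood up to the -log(k!) terms, which cancel in the ratio."""
    total = 0.0
    for k, r in zip(counts, rates):
        if r > 0:
            total += k * math.log(r) - r
        # r == 0 only occurs when every count in that group is 0: contributes 0.
    return total


def lrt_boxcar(counts, in_window):
    """Likelihood ratio test of 'intercept + boxcar' against 'intercept only'."""
    n = len(counts)
    inside = [k for k, w in zip(counts, in_window) if w]
    outside = [k for k, w in zip(counts, in_window) if not w]
    rate_all = sum(counts) / n
    rate_in = sum(inside) / len(inside)
    rate_out = sum(outside) / len(outside)
    ll_reduced = poisson_loglik(counts, [rate_all] * n)
    ll_full = poisson_loglik(counts, [rate_in if w else rate_out for w in in_window])
    stat = 2.0 * (ll_full - ll_reduced)
    # One extra parameter, so the statistic is compared with chi-square (df = 1),
    # whose survival function is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(max(stat, 0.0) / 2.0))
    return stat, p_value


# Made-up spike counts per time bin: low rate outside the boxcar, higher inside.
counts = [0, 1, 0, 1] * 10 + [3, 4, 5, 4] * 5
boxcar = [False] * 40 + [True] * 20
stat, p_value = lrt_boxcar(counts, boxcar)
print(stat, p_value)
```

With several predictors removed at once, the degrees of freedom equal the number of removed parameters and a general chi-square survival function is needed instead of the one-parameter shortcut used here.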
Whisker-related sensorimotor areas (wS1, wS2, wM1, and wM2) had the largest proportion of neurons significantly modulated by whisker stimulus in the first 100 ms in both novice and expert mice (Figures 6D and S7E). The fraction of whisker-encoding neurons decreased across whisker training in wM1 (p = 0.029). In contrast, delay-encoding neurons that were significantly modulated between 100 ms and 1 s after the whisker stimulus (Figures 6D and S7E) were found mainly in ALM but also in wM2, which was strikingly enhanced by whisker training (p = 5 × 10⁻⁵). Some neurons in wM2, ALM, tjM1, and DLS were found to be significantly modulated during the 200 ms prior to the lick onset, before and after whisker training (Figures 6D and S7E), reflecting the licking initiation signal in these areas beyond those captured by orofacial movements or sound onset predictors in the model.
We next asked to what extent the same neurons encode different task variables. To address this question, we quantified the degree of overlap across populations of whisker, delay, and lick initiation encoding neurons in the key areas of interest and visualized it using Venn diagrams (Figure 6E). We found that enhanced delay and lick initiation encoding populations were largely non-overlapping. Finally, we asked whether our encoding model, fitted using all trials, can reproduce neuronal activity in quiet trials (Figures 6F and S7F). Model-reconstructed peristimulus time histograms (PSTHs) after removing movement-related regressors confirmed that neurons in ALM kept their firing throughout the delay period, while the firing in other areas returned to baseline, in agreement with the empirical data. This result supports the model validity and highlights the prominence of ALM for motor planning.
(Figure 6 legend, continued) and *p < 0.05). (E) Venn diagrams showing the amount of overlap among neuronal populations in different regions significantly encoding whisker, delay, and lick initiation variables. The sizes of the circles are proportional to the fraction of significantly modulated neurons. (F) Comparison of empirical (data, dotted lines) and reconstructed (model, solid lines) PSTHs for quiet (blue) and all (black) trials in expert mice. See also Figure S7 and Table S1.

Temporally specific causal contributions of different cortical regions
Imaging and electrophysiology data suggested multiple phases of neural processing for whisker detection, motor planning, and delayed licking. To examine the causal contribution of cortical regions in each of these phases, we performed spatiotemporally selective optogenetic inactivation in transgenic mice expressing ChR2 in GABAergic neurons (n = 9 VGAT-ChR2 mice). We applied blue light pulses to each brain region through an optical fiber randomly in one third of the trials, occurring in one of the four temporal windows (Figure 7A): baseline (from visual cue onset to 100 ms before whisker stimulus onset), whisker (from 100 ms before to 200 ms after whisker stimulus onset), delay (from 200 ms to 1,000 ms after whisker stimulus onset), or response (from 0 ms to 1,100 ms after auditory cue onset).
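The four light windows can be laid out on a single trial timeline, as in the following sketch. Times are expressed in ms relative to visual cue onset, assuming the whisker stimulus at 1,000 ms and the auditory cue at 2,000 ms (a 1-s delay), as described for the behavioral paradigm. The names `LIGHT_WINDOWS_MS` and `window_at` are hypothetical and do not come from the published code.

```python
# Assumed trial timeline (ms relative to visual cue onset).
WHISKER_ONSET_MS = 1000
AUDITORY_ONSET_MS = 2000

# Window boundaries as stated in the text, converted to the common time axis.
LIGHT_WINDOWS_MS = {
    "baseline": (0, WHISKER_ONSET_MS - 100),
    "whisker": (WHISKER_ONSET_MS - 100, WHISKER_ONSET_MS + 200),
    "delay": (WHISKER_ONSET_MS + 200, WHISKER_ONSET_MS + 1000),
    "response": (AUDITORY_ONSET_MS, AUDITORY_ONSET_MS + 1100),
}


def window_at(time_ms):
    """Return the name of the light window containing time_ms, or None."""
    for name, (start, stop) in LIGHT_WINDOWS_MS.items():
        if start <= time_ms < stop:
            return name
    return None


for t in (500, 950, 1500, 2000, 3200):
    print(t, window_at(t))
```

Under these assumptions the windows tile the trial without gaps, so the end of the delay window coincides with the auditory cue.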
Inactivation in different time windows provided spatiotemporal maps of the behavioral impact (Figures 7B, 7C, and S8). During the baseline window, a significant decrease in hit rate occurred after inactivation of Vis, dCA1, and mPFC (light off versus light, Vis: p = 0.031, dCA1: p = 0.016, mPFC: p = 0.031; Wilcoxon signed-rank test). During the whisker window, a significant decrease in hit rate occurred in every region tested with the strongest impact in wS2 (light off versus light, p = 0.016; Wilcoxon signed-rank test). During the delay period, inactivation of ALM and mPFC produced a strong reduction in hit rate (light off versus light, ALM: p = 0.016, mPFC: p = 0.016; Wilcoxon signed-rank test). Finally, during the response window, when the licking behavior had to be executed, inactivation of tongue-related tjM1 and ALM, but also whisker-related wM2, impaired behavior by decreasing both hit and false-alarm rate (light off versus light, tjM1: p = 0.016, ALM: p = 0.016, wM2: p = 0.016; Wilcoxon signed-rank test), supporting a causal role for the lick-initiation encoding of wM2 neurons (Figure 6D). The differential impact of inactivating nearby cortical regions is consistent with high spatiotemporal specificity of our optogenetic manipulations. Inactivation during the whisker and delay periods also broadly reduced the fraction of premature licking and reduced preparatory movements, with spatiotemporal specificities relatively similar to those observed in hit rate changes (Figure S8). Thus, spatiotemporal mapping of causal impacts suggests that critical whisker processing is initially distributed across diverse cortical regions, and then converges in frontal regions for planning lick motor output, in agreement with neural activity.
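For readers unfamiliar with the paired test used above, the following sketch computes an exact two-sided Wilcoxon signed-rank p value by enumerating all sign patterns, which is feasible for n = 9 mice. It assumes no zero differences and no tied magnitudes, and it does not apply any multiple-comparison correction. The hit-rate differences are made up for illustration and are not data from the paper.

```python
from itertools import product


def signed_rank_exact_p(differences):
    """Exact two-sided Wilcoxon signed-rank p value (no zeros, no ties assumed)."""
    n = len(differences)
    # Rank the absolute differences from 1 (smallest) to n (largest).
    order = sorted(range(n), key=lambda i: abs(differences[i]))
    ranks = [0] * n
    for rank, index in enumerate(order, start=1):
        ranks[index] = rank
    total = n * (n + 1) // 2
    w_plus = sum(r for r, d in zip(ranks, differences) if d > 0)
    w_obs = min(w_plus, total - w_plus)
    # Count sign patterns at least as extreme as the observed one.
    extreme = 0
    for signs in product((False, True), repeat=n):
        w = sum(r for r, positive in zip(ranks, signs) if positive)
        if min(w, total - w) <= w_obs:
            extreme += 1
    return extreme / 2 ** n


# Hypothetical per-mouse hit-rate decreases (light off minus light), n = 9.
all_decrease = [0.31, 0.42, 0.18, 0.55, 0.27, 0.36, 0.49, 0.22, 0.61]
one_increase = [-0.05, 0.42, 0.18, 0.55, 0.27, 0.36, 0.49, 0.22, 0.61]
print(signed_rank_exact_p(all_decrease))
print(signed_rank_exact_p(one_increase))
```

With nine paired observations the smallest attainable uncorrected two-sided p value is 2/512, reached when all differences share the same sign.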
To directly compare the obtained causal maps with observed neural correlations, we quantified the difference in firing rate between hit and correct-rejection trials and the change in hit rate upon optogenetic inactivation for each brain area and time window (Figure 8A). If a brain region is critically involved in task execution, then neural activity in that area would code behavioral decision (large hit-correct rejection difference), and its inactivation would cause behavioral impairments (strong decrease in hit rate). This is further quantified by an involvement index as the product of the two terms described above (Figure 8B). The involvement index during the whisker period was largest in wS2 and wS1 (mean ± SEM, wS2: 0.7 ± 0.11, p < 0.01, wS1: 0.58 ± 0.11, p < 0.05; non-parametric permutation test versus other areas), highlighting these areas as the main nodes of whisker sensory processing. During the delay period, ALM had the largest involvement index (mean ± SEM, ALM: 0.48 ± 0.09, p < 0.001; non-parametric permutation test versus other areas). Although mPFC inactivation during the delay provoked the largest reductions in hit rate, there was little change in neuronal activity in this area, resulting in small involvement values. The most critical areas in the response window were tjM1 and ALM (mean ± SEM, tjM1: 1.16 ± 0.15, p < 3 × 10⁻⁵, ALM: 0.76 ± 0.09, p < 0.05; non-parametric permutation test versus other areas). This reflects the prominent role of tjM1 in licking execution. Interestingly, wM2 had a moderate but significant involvement index in all three time windows, supporting its possible role in bridging sensory processing and motor execution.
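The involvement index is a product of two terms, which the following sketch makes explicit. The sign convention (hit-rate decrease counted as positive), the units, and any normalization are assumptions; the input numbers are hypothetical and chosen only to show why a large behavioral effect with little neural modulation yields a small index.

```python
def involvement_index(rate_hit, rate_correct_rejection,
                      hit_rate_light_off, hit_rate_light_on):
    """|hit - correct-rejection firing difference| x decrease in hit rate.

    Hypothetical helper; any normalization used in the paper is not reproduced.
    """
    neural_term = abs(rate_hit - rate_correct_rejection)
    behavioral_term = hit_rate_light_off - hit_rate_light_on
    return neural_term * behavioral_term


# Made-up values for two areas in one time window.
strong_both = involvement_index(12.0, 4.0, 0.8, 0.3)    # large neural and behavioral terms
behavior_only = involvement_index(5.2, 5.0, 0.8, 0.1)   # large behavioral term only
print(strong_both, behavior_only)
```

Because the two terms multiply, an area scores highly only when its activity distinguishes hit from correct-rejection trials and its inactivation also reduces hit rate.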

DISCUSSION
We found converging evidence for the temporally distinct involvement of diverse cortical regions in delayed sensorimotor transformation using an array of complementary technical approaches. Our analyses of the learning-induced changes in causal neural activity revealed three key findings further discussed below: (1) widespread neuronal delay-period activity was dominated by preparatory movements, but essential causal neuronal delay-period activity was predominantly localized to ALM; (2) sequential activation of cortical regions wS1, wS2, wM2, and ALM suggests the possible contribution of a corticocortical pathway for whisker sensory information to reach ALM, with wM2 showing the earliest increase in sensory-evoked response across learning; and (3) suppression of orofacial sensorimotor cortex was observed in the early delay period, likely contributing to inhibition of premature licking.

Essential cortical delay-period activity in ALM
Broad regions of cortex showed elevated activity in expert mice during the delay period in hit trials (Figures 2 and 3), correlating with preparatory movements (Figures 1 and 6). These results are thus in good agreement with widespread motor-related cortical activity (Musall et al., 2019; Steinmetz et al., 2019; Stringer et al., 2019). When we analyzed only trials free from the delay-period preparatory movements, wide-field imaging and electrophysiology demonstrated a localized excitatory activity in a small region of secondary motor cortex including ALM (Figures 6A and 6B). Inactivation of ALM during the delay period was highly effective in reducing hit rates in the subsequent response period (Figure 7). Essential causal neuronal delay-period activity therefore appears to be predominantly localized to ALM (Figures 8A and 8B), in good agreement with previous closely related tasks (Guo et al., 2014; Li et al., 2015).
Figure 7 (legend continued). (B) Behavioral impact of optogenetic inactivation across time windows for each brain region (mean ± SEM). For each area, hit rate (black) and false-alarm rate (red) are plotted for light-off (off), baseline (B), whisker (W), delay (D), and response (R) windows. Asterisks represent significant difference comparing hit (black) or false alarm (red) in light trials versus light-off trials (n = 9 mice; *p < 0.05; Wilcoxon signed-rank test, Bonferroni correction for multiple comparison). (C) Spatiotemporal map of behavioral impact of focal inactivation in go (top) and no-go trials (bottom). Circles represent different cortical regions labeled on the schematic in (A); color shows change in lick probability, and circle size shows the p value of the significance test comparing light trials versus light-off trials (n = 9 mice, Wilcoxon signed-rank test, Bonferroni correction for multiple comparison). See also Figure S8.

By accounting for movement contributions using linear regression analysis of trial-by-trial variability, we found that most delay-period-responsive neurons were indeed localized in ALM but that the fraction of delay-encoding neurons was also significantly enhanced by learning in wS2, wM1, wM2, and tjM1 (Figures 6C-6E). Furthermore, during the delay period, inactivation of several cortical areas, including not only ALM but also wS1, wS2, mPFC, and tjM1, significantly reduced hit rates (Figure 7). Indeed, causal contributions to the delay period measured by the involvement index were also significant in wS1, wS2, PPC, mPFC, wM2, and tjM1, as well as ALM. In addition to the strongest causal involvement found for ALM, these causal impacts observed in broader cortical areas during the delay period might in part result from reduced preparatory movements induced by inactivation (Figures S8B and S8C). The preparatory movements, which were most prominent in hit trials of expert mice, may thus contribute a form of embodied sensorimotor memory in which ongoing movements might help maintain a plan for delayed licking (Mayrhofer et al., 2019).
Figure 8 (legend continued). (B) A causal involvement index was defined as the region- and epoch-specific absolute value of the difference in firing rate comparing hit and correct-rejection trials (n = 22, expert mice) multiplied by the change in hit rate induced by optogenetic inactivation (n = 9, VGAT-ChR2 mice). Error bars are obtained from bootstrap (see STAR Methods) and represent standard deviation (bootstrap standard error). Asterisks represent significance level (*p < 0.05 and ***p < 0.001; non-parametric permutation test, Bonferroni correction for multiple comparison). (C) Proposed cortical circuits connecting whisker somatosensory cortex to tongue/jaw motor cortex upon task learning.
During the delay period, mPFC inactivation had the largest impact on hit rate across the tested areas (Figure 7). However, we did not find robust sustained activity in mPFC during this window for maintenance of the motor plan. Interestingly, mPFC inactivation during all task epochs (including baseline) impaired behavior. One possibility is that the observed behavioral effect relates to the representation of task rules (Durstewitz et al., 2010), behavioral strategy (Powell and Redish, 2016), or motivation (Popescu et al., 2016).

A putative corticocortical signaling pathway linking sensory to motor cortex through learning
Our measurements at high spatiotemporal resolution revealed a rapid sequential activation of cortical areas evoked by whisker deflection, ultimately reaching ALM in hit trials of expert mice. The earliest cortical response to whisker stimulus occurred in wS1 and wS2, which changed relatively little after whisker training (Figures 2, 3, and 5). This initial processing was essential as shown by optogenetic inactivation (Figure 7), and therefore, wS1 and wS2 appear to form the cortical starting points for task execution, in agreement with previous studies of whisker detection tasks without a delay period (Kwon et al., 2016; Kyriakatos et al., 2017; Le Merre et al., 2018; Mayrhofer et al., 2019; Miyashita and Feldman, 2013; Sachidhanandam et al., 2013; Yang et al., 2016).
Sensory cortical areas project directly and strongly to frontal cortex through parallel pathways, with wS1 innervating wM1, and wS2 innervating wM2 (Ferezou et al., 2007; Mao et al., 2011; Oh et al., 2014; Sreenivasan et al., 2017). Whisker deflection evoked rapid sensory responses in these downstream motor regions. Interestingly, the sensory response in wM2 showed the earliest significant increase in whisker-evoked firing and a decrease in response latency across learning (Figures 5C and 5D), whereas a decrease in amplitude and increase in latency were found in wM1. Neuronal activity in wM2 also showed the earliest choice-related activity when comparing hit and miss trials (Figure 5E). Thus, wM2 might serve as a key node in the corticocortical network to begin the process of converting a whisker sensory stimulus into longer-lasting preparatory neuronal activity. Shortly after wM2 activation, ALM, an important premotor area for control of licking (Guo et al., 2014; Li et al., 2015; Mayrhofer et al., 2019), started to increase firing (Figure 5). Through corticocortical connectivity (Luo et al., 2019), activity in wM2 could contribute directly to exciting its neighboring region, ALM, which manifested the most prominent delay-period activity through whisker training (Figures 3 and 6), consistent with previous studies (Li et al., 2015).
Our results suggest a hypothesis for a minimal cortical network connecting whisker sensory coding to preparatory neuronal activity for motor planning; a pathway wS1 → wS2 → wM2 → ALM could be the main stream of signal processing (Figure 8C). Some of the most prominent whisker-related changes through whisker training occurred in wM2 and ALM, and it is possible that reward-related potentiation of synaptic transmission between wS2 → wM2 and wM2 → ALM could underlie important aspects of the present learning paradigm. All of these cortical areas are likely to be connected through reciprocal excitatory long-range axonal projections, which could give rise to recurrent excitation helping to prolong firing rates of neurons in relevant brain regions during the delay period of hit trials. Interestingly, in a related whisker detection task without a delay period, enhanced reciprocal signaling between wS1 and wS2 has already been proposed to play an important role (Kwon et al., 2016; Yamashita and Petersen, 2016). It is also important to note that a large number of subcortical structures are also likely to be involved in task learning and performance, including thalamus (El-Boustani et al., 2020; Guo et al., 2017), basal ganglia (Sippy et al., 2015), and cerebellum (Chabrol et al., 2019; Gao et al., 2018).

Lick and no-lick signals in tjM1
In expert mice, we found that the whisker stimulus evoked a sharp deactivation broadly across orofacial sensorimotor cortex, including tjM1, an area thought to be involved in the initiation and control of licking (Mayrhofer et al., 2019). In contrast, tjM1 neurons were activated soon after whisker deflection in a previous study of a detection task without a delay period before licking (Mayrhofer et al., 2019). One interesting possibility is thus that the deactivation in tjM1 develops through learning of a task where suppression of immediate licking is demanded. In support of this hypothesis, here, we found that premature early licking during the delay period was accompanied by reduced suppression of tjM1 (Figure 4B) and that activation of tjM1 increased early licks, whereas inactivation of tjM1 reduced early licks (Figure 4C). We furthermore found that tjM1 activity was suppressed after the auditory cue in correct-rejection trials where mice are supposed to suppress licking compared to miss trials where mice failed to lick, suggesting that the reduction of activity in orofacial cortex reflects active response inhibition (Figure 4D). Finally, inactivation of tjM1 in the response window evoked the strongest decrease in hit rates, further supporting the causal involvement of this area in the control of licking (Figure 7).
Previous studies in human subjects have suggested the importance of inhibitory mechanisms for preventing actions from being emitted inappropriately (Chikazoe et al., 2009; Duque et al., 2017). Parallel suppression and activation during a delay period might be a common principle of response preparation preserved across species (Cohen et al., 2010). Here, we reveal causal contributions of inhibitory and excitatory cortical delay-period activity in a precisely defined task, and, as a hypothesis, we put forward a specific corticocortical circuit that could contribute to task learning and execution, which requires further experimental testing.

STAR★METHODS
Detailed methods are provided in the online version of this paper and include the following:

Lead contact
Further information and requests for resources and reagents should be directed to and will be fulfilled by the Lead Contact, Carl Petersen (carl.petersen@epfl.ch).

Materials availability
This study did not generate new unique reagents.

Data and code availability
The complete dataset and MATLAB analysis code are freely available at the open access CERN Zenodo database https://doi.org/10.5281/zenodo.4720013.
[...] cages at a temperature of 22 ± 2 °C with food available ad libitum. Water was restricted to 1 mL a day during behavioral training with at least 2 days of free access to water in the cage every 2 weeks. All mice were weighed and inspected daily during behavioral training.

Experimental design
This study did not involve randomization or blinding. We did not estimate sample size before carrying out the study. However, the sample size in this study is comparable with those used in related studies (Allen et al., 2017; Guo et al., 2014; Harvey et al., 2012; Hattori et al., 2019; MacDowell and Buschman, 2020; Pinto et al., 2019).

Implantation of metal headpost
Mice were deeply anesthetized with isoflurane (3% with O2) and then were maintained under anesthesia using a mixture of ketamine and xylazine injected intraperitoneally (ketamine: 125 mg/kg, xylazine: 10 mg/kg). Carprofen was injected intraperitoneally (100 µl at 0.5 mg/ml) for analgesia before the start of surgery. Body temperature was kept at 37 °C throughout the surgery with a heating pad. An ocular ointment (VITA-POS, Pharma Medica AG, Switzerland) was applied over the eyes to prevent them from drying. As local analgesic, a mix of lidocaine and bupivacaine was injected below the scalp before any surgical intervention. A povidone-iodine solution (Betadine, Mundipharma Medical Company, Bermuda) was used for skin disinfection. To expose the skull, a part of the scalp was removed with surgical scissors. The periosteal tissue was removed with cotton buds and a scalpel blade. After disinfection with Betadine and rinsing with Ringer solution, the skull was dried well with cotton buds. A thin layer of super glue (Loctite super glue 401, Henkel, Germany) was then applied across the dorsal part of the skull and a custom-made head fixation implant was glued to the right hemisphere without a tilt and parallel to the midline. A second thin layer of the glue was applied homogeneously on the left hemisphere. After the glue had dried, the head implant was further secured with self-curing denture acrylic (Paladur, Kulzer, Germany; Ortho-Jet, LANG, USA). For electrophysiological recordings, a chamber was made by building a wall with denture acrylic along the edge of the bone covering the left hemisphere. Particular care was taken to ensure that the left hemisphere of the dorsal cortex was free of denture acrylic and only covered by super glue for optical access. This intact, transparent skull preparation was used to perform wide-field calcium imaging as well as intrinsic optical signal (IOS) imaging experiments.
Mice were returned to their home cages and ibuprofen (Algifor Dolo Junior, VERFORA SA, Switzerland) was added to the drinking water for three days after surgery.

Skull preparation and craniotomies
For wide-field calcium imaging and optogenetic activation, an intact transparent skull was used as described above. For electrophysiological recordings, up to 10 small craniotomies were made over the regions of interest using a dental drill under isoflurane anesthesia (2%-3% in O2). The craniotomies were protected using a silicone elastomer (Kwik-Cast, World Precision Instruments, Sarasota, FL, USA). Regions of interest were selected based on the hotspots of activity from wide-field calcium imaging experiments, functionally relevant areas based on previous studies (Esmaeili and Diamond, 2019; Guo et al., 2014; Harvey et al., 2012; Le Merre et al., 2018; Mayrhofer et al., 2019; Sachidhanandam et al., 2013; Sippy et al., 2015; Sreenivasan et al., 2016) and IOS imaging (Lefort et al., 2009). IOS was performed under isoflurane anesthesia (1%-1.5% with O2) to map the C2-whisker representation in primary and secondary whisker somatosensory cortex (wS1 and wS2), as well as the auditory area (Aud). A piezoelectric actuator was used to vibrate the right C2 whisker, or to generate rattle sounds. Increase in absorption of red light (625 nm) upon sensory stimulation indicated the functional location of the corresponding sensory cortex. For the other regions, stereotaxic coordinates relative to bregma were used: primary and secondary whisker motor cortices (wM1: AP 1.0 mm; Lat 1.0 mm and wM2: AP 2.0 mm; Lat 1.0 mm), primary and secondary tongue/jaw motor cortices (tjM1: AP 2.0 mm; Lat 2.0 mm and ALM: AP 2.5 mm; Lat 1.5 mm), visual cortex (Vis: AP −3.8 mm; Lat 2.5 mm), posterior parietal cortex (PPC: AP −2 mm; Lat 1.75 mm), medial prefrontal cortex (mPFC: AP 2 mm; Lat 0.5 mm), dorsal part of the CA1 region of the hippocampus (dCA1: AP −2.7 mm; Lat 2.0 mm) and dorsolateral striatum (DLS: AP 0.0 mm; Lat 3.5 mm).
For optogenetic inactivation experiments the bone over the regions of interest was thinned and a thin layer of superglue was applied to protect the skull for stable optical access over days. For the inactivation of mPFC and dCA1 a small craniotomy was made for the insertion of an optical fiber or an optrode.

Behavioral paradigm
A total of 55 mice were examined in the delayed whisker detection task including 9 RCaMP, 24 wild-type or negative, 6 Emx1-ChR2, 9 VGAT-ChR2 and 7 tdTomato mice. During the behavioral experiments, all whiskers were trimmed except for the C2 whiskers on both sides, and the mice were water restricted to 1 mL of water/day. Mice were trained daily with one session/day and their weight and general health status were carefully monitored using a score sheet. Both groups of mice (Expert and Novice) went through a Pretraining phase which consisted of trials with visual and auditory cues (without any whisker stimulus) (Figure 1C). Mice were rewarded by licking a spout, placed on their right side, in a 1-s response window after the auditory cue onset. Trials were separated by 6-8 s and started after a quiet period of 2-3 s in which mice did not lick the spout. Each trial consisted of a visual cue (200 ms, green LED) and an auditory cue (200 ms, 10 kHz tone of 9 dB added on top of the continuous background white noise of 80 dB). The stimuli were separated with a delay period which gradually was increased to 2 s over Pretraining days. Licking before the response period (Early lick) aborted the trial and introduced a 3-5 s timeout. After 3-6 days of Pretraining, mice learned to lick the spout by detecting the auditory cue and to suppress early licking. The wide-field imaging and electrophysiological recordings from the Novice group of mice were performed when mice finished the Pretraining phase and were introduced to the whisker delay task (Figure 1C). In this phase a whisker stimulus (10 ms cosine 100 Hz pulse through a glass tube attached to a piezoelectric driver) was delivered to the right C2 whisker 1 s after the visual cue onset in half of the trials.
Importantly, the reward was available only in trials with the whisker stimulus (Go trials), and time-out punishment (together with an auditory buzz tone) was given when mice licked in trials without the whisker stimulus (No-Go trials) (Figure 1B). Thus, mice were required to use the whisker stimulus to change their lick/no-lick behavior. Since the whisker stimulus was weak, Novice mice continued licking in most Go and No-Go trials irrespective of the whisker stimulus and did not show any sign of whisker learning (Figures 1D and S1B).
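The trial contingencies described above can be summarized in a short sketch that assigns an outcome to each trial. It assumes the final task timing (visual cue at 0 s, auditory cue 2 s later, 1-s response window) and takes the time of the first lick relative to visual cue onset. The function name `classify_trial` and the outcome labels are hypothetical and are not taken from the published analysis code.

```python
# Assumed final task timing, in seconds relative to visual cue onset.
AUDITORY_ONSET_S = 2.0
RESPONSE_WINDOW_S = 1.0


def classify_trial(is_go, first_lick_s):
    """Label a trial from its type and first-lick time (None if no lick)."""
    # Any lick before the response period aborts the trial (Early lick).
    if first_lick_s is not None and first_lick_s < AUDITORY_ONSET_S:
        return "early_lick"
    licked = (first_lick_s is not None
              and first_lick_s < AUDITORY_ONSET_S + RESPONSE_WINDOW_S)
    if is_go:
        # Go trial (whisker stimulus present): licking is rewarded.
        return "hit" if licked else "miss"
    # No-Go trial (no whisker stimulus): licking is punished with a time-out.
    return "false_alarm" if licked else "correct_rejection"


for is_go, lick in [(True, 2.3), (True, None), (False, 2.5), (False, None), (True, 1.4)]:
    print(is_go, lick, classify_trial(is_go, lick))
```

Hit rate and false-alarm rate then follow as the fraction of Go and No-Go trials, respectively, in which the mouse licked within the response window.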
The Expert mice entered a Whisker-training phase of 2-29 days during which a stronger whisker stimulus (larger amplitude and/or train of pulses) and shorter delays (for some mice) were introduced (Figure S1A). As the mice learned to lick correctly, the whisker stimulus amplitude was gradually returned to a smaller amplitude and delay was extended to 1 s, eventually matching the conditions in Novice mice. Expert mice decreased licking in No-Go trials but increased their premature early licks after the whisker stimulus, as monitored by the piezoelectric lick sensor (Figure 1D, see below). Behavioral hardware control and data collection were carried out using data acquisition boards (National Instruments, USA) and custom-written MATLAB code (MathWorks).

Quantification of orofacial movements
Contacts of the tongue with the reward spout were detected by a piezoelectric sensor. Continuous movements of the left C2 whisker, tongue and jaw were filmed by a high-speed camera (CL 600 X 2/M, Optronis, Germany; 200 or 500 Hz frame rate, 0.5 or 1 ms exposure, and 512 × 512 pixel resolution) under blue light or infrared illumination. Movements of each body part were tracked using custom-written MATLAB code. For the imaging sessions, arc regions of interest were defined around the basal points for both the whisker and jaw (Mayrhofer et al., 2019). Crossing points on these arcs were detected for the whisker (the pixels with the minimum intensity) and the jaw (pixels with the maximum slope of intensity). A vector was then defined for each pair of basal point and crossing point, and the absolute angle was calculated for each vector with respect to the midline. For the electrophysiology sessions, whisker angular position was quantified in a similar manner, while movements of tongue and jaw were quantified as the changes in mean image intensity within rectangular regions of interest (ROIs) defined separately on the tracks of tongue and jaw. These signals were then normalized to the area covered by the tongue and jaw ROIs. Absolute derivatives of the orofacial time series (whisker/jaw/tongue) were calculated to derive angular whisker speed and normalized tongue/jaw speed.
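The speed computation described above can be illustrated with a minimal Python sketch. It assumes a simple frame-to-frame difference scaled by the camera frame rate; the function name and the example values are hypothetical, and the original analysis was performed in MATLAB.

```python
def absolute_speed(trace, frame_rate_hz):
    """Absolute frame-to-frame derivative of a tracked time series.

    trace: position samples (e.g., whisker angle in degrees), one per video frame.
    Returns speed in trace units per second, one value per frame transition.
    """
    return [abs(b - a) * frame_rate_hz for a, b in zip(trace, trace[1:])]


# Hypothetical whisker angles (degrees) filmed at 500 Hz.
whisker_angle_deg = [10.0, 10.5, 9.5, 9.5]
whisker_speed = absolute_speed(whisker_angle_deg, frame_rate_hz=500)
```

The same function applies to the normalized tongue and jaw signals, in which case the result is in normalized units per second.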

Wide-field calcium imaging
Mice were mounted with a 24-degree tilt along the rostro-caudal axis. The red fluorescent calcium indicator R-CaMP1.07 or the red fluorescent protein tdTomato was excited with 563-nm light (567-nm LED, SP-01-L1, Luxeon, Canada; 563/9-nm band pass filter, 563/9 BrightLine HC, Semrock, USA) and red emission light was detected through a band pass filter (645/110 ET Bandpass, Semrock). A dichroic mirror (Beamsplitter T 588 LPXR, Chroma, USA) was used to separate excitation and emission light. Through a face-to-face tandem objective (Nikkor 50 mm f/1.2, Nikon, Japan; 50 mm video lens, Navitar, USA) connected to a 16-bit monochromatic sCMOS camera (ORCA FLASH4.0v3, Hamamatsu Photonics, Japan), images of the left dorsal hemisphere were acquired with a resolution of 256 × 320 pixels (4 × 4 binning) aligned along the rostro-caudal axis at a frame rate of 100 Hz (10 ms exposure). Behavioral task and imaging were synchronized by triggering acquisition of each image frame by digital pulses sent by the computer for behavioral task control. For each trial, 600 frames (6 s) of images were acquired from 1 s before the visual cue onset to 3 s after the auditory cue onset. To control for calcium-independent changes in cortical fluorescence (Makino et al., 2017), we imaged transgenic mice expressing tdTomato in vasoactive intestinal peptide-expressing neurons (tdTomato mice) using the same optical filters as for the imaging of RCaMP. tdTomato had excitation and emission spectra similar to RCaMP, and the illumination condition was adjusted so that tdTomato mice and RCaMP mice had comparable fluorescence intensity.

Electrophysiological recording
Extracellular spikes were recorded using single-shank silicon probes (A1x32-Poly2-10mm-50s-177, NeuroNexus, MI, USA) with 32 recording sites covering 775 μm of the cortical depth. In each session two probes were inserted acutely in two different brain targets. Probes were coated with DiI (1,1'-Dioctadecyl-3,3,3',3'-Tetramethylindocarbocyanine Perchlorate, Invitrogen, USA) for post hoc recovery of the recording location (see below). The neural data were filtered between 0.3 Hz and 7.5 kHz and amplified using a digital headstage (CerePlex M32, Blackrock Microsystems, UT, USA). The headstage digitized the data with a sampling frequency of 30 kHz. The digitized signal was transferred to our data acquisition system (CerePlex Direct, Blackrock Microsystems, UT, USA) and stored on an internal HDD of the host PC for offline analysis.

Optogenetic manipulations
Optogenetic activation of tjM1 was performed in 6 Expert Emx1-ChR2 mice with the same transparent skull preparation and 24-deg tilt as the wide-field imaging. A 473-nm laser beam (S1FC473MM, Thorlabs) was steered on the cortex by a pair of Galvo mirrors

each frame, and F0 is the mean intensity of that pixel during the 1 s baseline period before the onset of the visual cue. In each imaging session, mean ΔF/F0 images for different trial outcomes (Hit, Miss, False-alarm and Correct-rejection trials) were calculated by averaging all trials of each trial type, or by averaging "Quiet" trials in which mean jaw speed during the 1-s delay period after the whisker stimulus did not exceed 4 times the mean absolute deviation of the jaw speed (angle) during the 1-s baseline period in each trial. Images from different mice were aligned according to the functionally identified C2-barrel (RCaMP mice) (Mayrhofer et al., 2019) and the cerebellar tentorium (RCaMP and tdTomato mice), and smoothed by a spatial Gaussian filter (sigma = 1 pixel, 111 μm). Those trial-averaged images in each session were used as individual samples for statistical analysis. To test statistical differences in the pixel values, a Wilcoxon rank-sum test (Expert versus Novice and RCaMP versus tdTomato) or a Wilcoxon signed-rank test (Hit versus Miss and Miss versus Correct-rejection) was performed for each pixel, and p-values were corrected for multiple comparisons by false-discovery rate, FDR (Benjamini and Hochberg, 1995). The corrected p-values were log-scaled ($-\log_{10} P$) to create spatial p-value maps. Borders between anatomical areas were drawn on the functional images (Vanni et al., 2017) by using Allen Mouse Common Coordinate Framework version 3 (CCF) (Lein et al., 2007; Wang et al., 2020) and ARA tools (Han et al., 2018; MacDowell and Buschman, 2020; Musall et al., 2019; Pinto et al., 2019).
First, we defined the three-dimensional location of bregma in the 25-μm resolution Allen CCF by considering brain structures in the stereotaxic atlas (Paxinos and Franklin, 2019) and the thickness of the skull (325 μm) (Soleimanzad et al., 2017). Second, the atlas was rotated by 24 degrees along the rostro-caudal axis. Third, anatomical borders were projected onto the horizontal plane to make a 24-deg tilted border map. Then, the border map was linearly scaled and horizontally shifted to match the functional images of RCaMP mice according to the C2-barrel, bregma, and the anteromedial end of the left hemisphere.

Electrophysiology data
Spiking activity on each probe was detected and sorted into different clusters using Klusta, an open source spike sorting software suited for dense multielectrode recordings (Rossant et al., 2016). After an automated clustering step, clusters were manually inspected and refined. Single units were categorized as regular spiking (RSU) or fast-spiking neurons based on the duration of the spike waveform, and, in this study, we specifically focus on the putative excitatory RSUs (spike peak-to-baseline > 0.34 ms, 4415 units in 22 Expert and 1604 units in 8 Novice mice). Activity maps in Figures S4D and S6B were computed by averaging the trial-aligned peristimulus time histograms of all excitatory units recorded on the same probe.

Assessing expert/novice and hit/miss differences
Statistical differences between mean firing rates of Expert versus Novice (Figures 3E and 5D) and Hit versus Miss (Figures 5E, S4E, and S4F) in each area were identified using non-parametric permutation tests in 50-ms bins, and p-values were corrected by FDR.
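A per-bin permutation test followed by FDR correction can be sketched as below. This is a minimal Python illustration, not the authors' MATLAB code: the test statistic (absolute difference of group means), the number of permutations, and the function names are assumptions. The correction follows the standard Benjamini-Hochberg step-up procedure.

```python
import random


def permutation_p_value(group_a, group_b, n_perm=2000, seed=0):
    """Two-sided permutation test on the difference of group means (assumed statistic)."""
    rng = random.Random(seed)
    mean = lambda xs: sum(xs) / len(xs)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    count = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # randomly reassign group labels
        if abs(mean(pooled[:n_a]) - mean(pooled[n_a:])) >= observed:
            count += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return (count + 1) / (n_perm + 1)


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value down
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical firing rates (Hz) of two groups of neurons in one 50-ms bin.
expert_rates = [5, 6, 7, 6, 5, 7, 6, 5]
novice_rates = [1, 2, 1, 2, 1, 2, 1, 2]
p_bin = permutation_p_value(expert_rates, novice_rates)
p_adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.5])
```

In practice one p-value would be computed per 50-ms bin and the full vector of p-values passed to the correction step.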

Receiver operating characteristic (ROC) analysis
To quantify the selectivity of ROI calcium traces for Go versus No-Go trials, we built ROC curves comparing the distribution of calcium activity in bins of 50 ms, including only correct trials (Hit and Correct-rejection). The selectivity index was defined by scaling and shifting the area under the ROC curve (AUC) between −1 and 1: $\text{Selectivity index} = 2(\mathrm{AUC} - 0.5)$, where positive selectivity reflects higher activity in Hits and vice versa (Figure 2D). Similarly, to quantify the selectivity of single units for Go versus No-Go trials, we built ROC curves comparing the distribution of spiking activity in bins of 100 ms, including only correct trials (Hit and Correct-rejection). The area under the ROC curve was then compared to a baseline distribution (5 bins of 100 ms before visual cue onset) to examine the significance of selectivity beyond baseline fluctuations. Non-parametric permutation tests were performed, p-values were corrected by FDR, and the percentage of neurons with significant positive or negative selectivity in each area was identified (p < 0.05, FDR-corrected, Figure 3F).
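As an illustration of the selectivity index, the following Python sketch computes the AUC through its rank-based (Mann-Whitney) equivalent, i.e., the probability that a randomly drawn Hit value exceeds a randomly drawn Correct-rejection value, with ties counted as one half. The function names and example values are hypothetical.

```python
def roc_auc(hit_values, cr_values):
    """Area under the ROC curve via pairwise comparison (ties count 0.5)."""
    wins = 0.0
    for h in hit_values:
        for c in cr_values:
            if h > c:
                wins += 1.0
            elif h == c:
                wins += 0.5
    return wins / (len(hit_values) * len(cr_values))


def selectivity_index(hit_values, cr_values):
    """Scale and shift the AUC to the range [-1, 1]: 2 * (AUC - 0.5)."""
    return 2.0 * (roc_auc(hit_values, cr_values) - 0.5)


# Hypothetical activity in one time bin across Hit and Correct-rejection trials.
si_positive = selectivity_index([3.0, 4.0, 5.0], [0.0, 1.0, 2.0])
si_none = selectivity_index([1.0, 2.0], [1.0, 2.0])
si_negative = selectivity_index([1.0, 2.0], [3.0, 4.0])
```

A value of 1 means every Hit trial had higher activity than every Correct-rejection trial, 0 means the two distributions are indistinguishable by rank, and −1 means the reverse ordering.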

Clustering neuronal responses
For clustering the neuronal response patterns, RSUs from both Novice and Expert mice (1) with more than 200 spikes throughout the recording, and (2) with more than 5 trials for each trial type (i.e., Hit, Miss, CR and FA) were included in the analysis (n = 5405 out of 6019 RSUs). For each neuron and each trial type, time-varying PSTHs (100 ms bin size) were computed over a 4-s window starting from 1 s before the visual cue and lasting until 1 s after the auditory cue. PSTHs from different trial types were baseline subtracted, normalized to the range of values across all bins (of all 4 trial types) and then concatenated, resulting in an activity matrix $X \in \mathbb{R}^{5405 \times 160}$ whose row i corresponds to the concatenated normalized firing rate of neuron i across different trial types (Figure S5A). Other normalization methods such as z-scoring resulted in similar clustering outcomes. To reduce the existing redundancy between firing rate time bins, we used Principal Component Analysis (PCA) and linearly projected firing rate vectors on a low-dimensional space. We applied PCA on the centered version of X (i.e., $x_i - \bar{x}$) and found 14 significant components (permutation test with Bonferroni correction for controlling family-wise error rate by 0.05) (Macosko et al., 2015). The weight of different components was equalized by normalizing the data, resulting in unity variance for different components ($X' \in \mathbb{R}^{5405 \times 14}$).
Next, we employed spectral embedding on the data to detect non-convex and more complex clusters (Abbe, 2017; Von Luxburg, 2007). To do so, we computed the similarity matrix $S \in \mathbb{R}^{5405 \times 5405}$ whose element at row i and column j measures the similarity between neurons i and j in the PCA space, where $\sigma$ is a free parameter determining how local similarity is measured in the feature space. We tuned $\sigma$ by setting the average of the similarity values equal to 0.5 (the tuned value for $\sigma$ is 0.0987). Then, we computed the normalized Laplacian matrix as $L = I - D^{-0.5} W D^{-0.5}$, where I is the identity matrix, and D is the diagonal degree matrix defined as $\mathrm{diag}(\{\sum_j W_{ij}\}_i)$. It should be noted that the new feature space is a non-linearly transformed version of the PCA space, which is itself a linearly transformed version of the original firing rate space. Such a transformation is believed to naturally separate data points which are clustered together (Abbe, 2017; Von Luxburg, 2007). Using the elbow method on the eigenvalues of matrix L (i.e., finding the sharp transition in the derivative of sorted eigenvalues), we considered (after excluding the very first eigenvector) the first 13 eigenvectors of matrix L as representative features, which yielded the matrix $\tilde{X} \in \mathbb{R}^{5405 \times 13}$. Finally, neurons were clustered based on the resulting matrix $\tilde{X}$ using a Gaussian Mixture Model (GMM). The algorithm considers that the underlying distribution of the data is a mixture of K Gaussians with means $\{\mu_1, \ldots, \mu_K\}$, diagonal covariance matrices $\{\Sigma_1, \ldots, \Sigma_K\}$, and weights $\{\pi_1, \ldots, \pi_K\}$. For a given K, we estimated the parameters of this mixture model by using the expectation-maximization (EM) algorithm (5000 repetitions and 1000 iterations). The number of clusters was then selected ($K = 24$) by minimizing the Bayesian information criterion (BIC) (Engelhard et al., 2019) (Figure S5B). Using the fitted parameters, we assigned a cluster index $c_i \in \{1, \ldots, 24\}$ to each neuron corresponding to the Gaussian distribution to which it belongs with the highest probability.
The output of the GMM step was the vector $C \in \{1, \ldots, 24\}^{5405}$ containing the cluster indices of neurons. Task-modulated clusters (20/24) were sorted by their onset latency and were labeled based on their task epoch-related response (Figure 3C).
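The first steps of the spectral embedding (similarity matrix, tuning of the locality parameter, and normalized Laplacian) can be sketched in Python as follows. Several points here are assumptions rather than details from the text: the similarity is taken to be a Gaussian kernel of the form exp(−d²/(2σ²)), the average similarity includes the diagonal, the tuning uses bisection, and all function names are hypothetical. The eigen-decomposition and the GMM fit are omitted.

```python
import math


def gaussian_similarity(points, sigma):
    """Pairwise similarity exp(-d^2 / (2 sigma^2)); the kernel form is an assumption."""
    n = len(points)
    return [[math.exp(-sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
                      / (2.0 * sigma ** 2))
             for j in range(n)] for i in range(n)]


def mean_similarity(points, sigma):
    s = gaussian_similarity(points, sigma)
    return sum(map(sum, s)) / len(points) ** 2


def tune_sigma(points, target=0.5, lo=1e-3, hi=1e3, n_iter=60):
    """Bisection on a log scale; mean similarity grows monotonically with sigma."""
    for _ in range(n_iter):
        mid = math.sqrt(lo * hi)
        if mean_similarity(points, mid) < target:
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo * hi)


def normalized_laplacian(similarity):
    """L = I - D^{-1/2} S D^{-1/2}, with D the diagonal matrix of row sums."""
    n = len(similarity)
    degree = [sum(row) for row in similarity]
    return [[(1.0 if i == j else 0.0)
             - similarity[i][j] / math.sqrt(degree[i] * degree[j])
             for j in range(n)] for i in range(n)]


# Hypothetical low-dimensional feature vectors forming two groups.
points = [[0.0, 0.0], [0.0, 1.0], [3.0, 0.0], [3.0, 1.0]]
sigma = tune_sigma(points)
S = gaussian_similarity(points, sigma)
L = normalized_laplacian(S)
```

The leading eigenvectors of `L` (after excluding the first) would then serve as features for the mixture model, for example using a numerical library for the eigen-decomposition.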
To study to what extent neurons from different brain regions and from Novice and Expert mice contribute to the composition of clusters, we took 3 steps. First, we quantified the distribution of neurons of each cluster across different brain regions in Novice and Expert mice (Figures 3D, 5F, and S5C). To account for the differences in the total number of neurons belonging to each group and brain region, weighted proportions were considered. Next, to identify the patterns which are more prevalent after whisker training, we quantified the percentage of neurons in each cluster that belong to Expert mice (Figure 3D). Similarly, in computing this percentage, weighted proportions were considered to correct for the difference in sample sizes (n = 3960 neurons from Expert, n = 1445 neurons from Novice). Finally, we defined a "distribution index" which quantifies the spread of each cluster among different brain regions (Figure 3D). For this purpose, we measured the total-variation distance between the weighted distribution of neurons of each cluster across 12 brain regions and the uniform distribution:

$$TV_c = \frac{1}{2} \sum_{a=1}^{12} \left| p_{c,a} - \frac{1}{12} \right|,$$

where $p_{c,a}$ is the weighted proportion of neurons in cluster c belonging to area a. Note that $p_{c,a}$ is normalized with respect to areas, i.e., $\sum_a p_{c,a} = 1$. The distance $TV_c$ takes 0 as its minimum value when the neurons of cluster c are uniformly distributed in all areas, and takes $\frac{11}{12}$ as its maximum value when all neurons of cluster c belong to a single brain area. To scale this value between zero and one, for each cluster c we defined a distribution index ($D_c$) as:

$$D_c = 1 - \frac{12}{11} TV_c,$$

where $D_c = 1$ indicates that cluster c is uniformly distributed among areas, and $D_c = 0$ indicates that cluster c is concentrated in a single brain region.
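The distribution index can be illustrated with a short Python sketch. It assumes the standard definition of the total-variation distance (half the summed absolute differences from the uniform distribution), which is consistent with the stated minimum of 0 and maximum of 11/12 for 12 areas; the function name and inputs are hypothetical.

```python
def distribution_index(weighted_proportions):
    """Distribution index: 1 = uniform across areas, 0 = confined to one area.

    weighted_proportions: non-negative weights of one cluster across areas;
    they are renormalized here so that they sum to 1.
    """
    n_areas = len(weighted_proportions)
    total = sum(weighted_proportions)
    p = [w / total for w in weighted_proportions]
    # Total-variation distance to the uniform distribution (assumed standard form).
    tv = 0.5 * sum(abs(x - 1.0 / n_areas) for x in p)
    tv_max = (n_areas - 1) / n_areas  # reached when one area holds all neurons
    return 1.0 - tv / tv_max


d_uniform = distribution_index([1.0] * 12)
d_single = distribution_index([1.0] + [0.0] * 11)
d_two_areas = distribution_index([0.5, 0.5] + [0.0] * 10)
```

Intermediate cases fall between the two extremes; for instance, a cluster split evenly between two of 12 areas gives a value close to 0.09.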
To characterize changes across learning of the delay task in each area, we computed separately in Novice and Expert mice, the activity pattern of the two most representative clusters (i.e., clusters with the highest number of neurons among all clusters) by averaging the activity among neurons belonging to the pair of area and cluster. The two most representative clusters are labeled as 1st and 2nd rank ( Figure S5D).

Single neuron whisker-evoked response latency
To quantify the latency of the whisker-evoked sensory response in the spiking activity of single neurons (Figure 5C), we limited the analysis to the first 200-ms window following the whisker stimulus. We first examined whether each neuron was modulated (positively or negatively) in the 200-ms window following the whisker stimulus compared to a 200-ms window prior to the whisker onset. For responsive neurons (p < 0.05, non-parametric permutation test), latency was calculated on the temporally smoothed PSTHs (1 ms non-overlapping binned PSTH filtered with a Gaussian kernel with σ = 10 ms) and was defined as the time where the neural activity reached half maximum (half minimum for suppressed neurons) within the 200-ms window. Only responsive neurons are included in the cumulative distributions and boxplots in Figure 5C.
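A half-maximum latency estimate of this kind can be sketched in Python as below. The sketch makes assumptions that the text does not specify: the response is measured relative to a pre-stimulus baseline rate, the Gaussian kernel is truncated at three standard deviations and renormalized at the edges, and the function names are hypothetical.

```python
import math


def gaussian_smooth(psth, sigma_bins):
    """Smooth a PSTH with a truncated, edge-normalized Gaussian kernel."""
    radius = int(3 * sigma_bins)
    offsets = range(-radius, radius + 1)
    kernel = [math.exp(-0.5 * (k / sigma_bins) ** 2) for k in offsets]
    smoothed = []
    for t in range(len(psth)):
        num = den = 0.0
        for k, w in zip(offsets, kernel):
            if 0 <= t + k < len(psth):
                num += w * psth[t + k]
                den += w
        smoothed.append(num / den)
    return smoothed


def half_max_latency(psth_post, baseline_rate, excited=True):
    """First bin where the response reaches half of its peak deviation.

    psth_post: smoothed PSTH in 1-ms bins starting at stimulus onset.
    For suppressed neurons (excited=False) the deviation is sign-flipped.
    """
    sign = 1.0 if excited else -1.0
    deviation = [sign * (r - baseline_rate) for r in psth_post]
    half = 0.5 * max(deviation)
    for t, d in enumerate(deviation):
        if d >= half:
            return t  # latency in ms, given 1-ms bins
    return None


# Hypothetical excited neuron: firing steps from 0 to 10 Hz at 50 ms.
raw_psth = [0.0] * 50 + [10.0] * 150
latency_ms = half_max_latency(gaussian_smooth(raw_psth, sigma_bins=10), baseline_rate=0.0)
```

Because the smoothing kernel is symmetric, a step-like response crosses half maximum at the bin where the step occurs, so smoothing does not bias this latency estimate for such a response.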

GLM encoding model
We used Poisson regression to fit an encoding model (generalized linear model, GLM) to predict the spiking activity of each individual neuron given behavioral data (Nelder and Wedderburn, 1972; Park et al., 2014). For each session, we concatenated all correct trials (Hit and Correct-rejection) and then split the data to perform five-fold cross-validation. In Poisson regression, one aims at predicting the spike count y(t) in a time bin t according to the formula: $y(t) \sim \mathrm{Poisson}(\exp(X(t)\,\beta))$, i.e., assuming that the spike counts are sampled from a Poisson distribution with a rate that depends on the design matrix $X(t)$ and on the weight vector $\beta$. In our case, y was constructed by binning the spikes in 100-ms bins. The weights $\beta$ were fit by maximizing the likelihood with Ridge regularization for each fold, and then averaged across the five folds. The parameter that controls the strength of the regularization was determined separately for each neuron using evidence optimization (Cunningham et al., 2008; Park et al., 2014). The design matrix was constructed by including three types of variables: "event" variables, associated with task-related events; "analog" variables, associated with real-valued behavioral measures from videography; and "slow" variables, which were constant during one trial but could vary over the course of one session. Event variables included the visual cue onset, the whisker stimulus onset, the auditory cue onset and the onset of the first lick. The exact time of lick onset was determined from the high-speed video using a custom algorithm. To assess the delayed effect of such task-related variables, each of these event-like variables was associated with a set of ten 100-ms wide and unit height boxcar basis functions, spanning in total one second after each event. The first-lick variable was associated with two additional boxcar functions covering 0.2 s prior to the lick onset, to capture lick-specific preparatory neuronal activity.
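The boxcar basis functions for event variables can be illustrated with a small Python sketch. Since the spike counts use 100-ms bins and each boxcar is 100 ms wide, each basis function reduces here to an indicator for a single bin at a fixed lag from the event. The function name and the example bin indices are hypothetical.

```python
def boxcar_columns(event_bin, n_bins, n_post=10, n_pre=0):
    """Design-matrix columns for one event in one trial.

    event_bin: index of the 100-ms bin containing the event onset.
    Returns n_pre + n_post columns; each column is 1 in exactly one bin,
    at a fixed lag from the event, and 0 elsewhere.
    """
    columns = []
    for lag in range(-n_pre, n_post):
        col = [0] * n_bins
        if 0 <= event_bin + lag < n_bins:
            col[event_bin + lag] = 1
        columns.append(col)
    return columns


# Hypothetical 4-s trial (40 bins): visual cue in bin 10, first lick in bin 25.
visual_cols = boxcar_columns(event_bin=10, n_bins=40)         # 10 post-event boxcars
lick_cols = boxcar_columns(event_bin=25, n_bins=40, n_pre=2)  # 2 pre + 10 post boxcars
```

Stacking such columns for all events, together with the analog and slow variables, gives one row of regressors per time bin; the weight fitted to each boxcar then describes the event's effect at that lag.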
Analog variables included in the design matrix were the whisker, tongue and jaw speed. These quantities were first extracted from the high-speed videos using custom code and then averaged in 100-ms bins. Among the slow variables, we included the trial index, i.e., a variable that at each trial k took a constant value equal to $k / k_{\text{total}}$, where $k_{\text{total}}$ is the total number of trials in a session. This variable could capture shifts in a neuron's baseline activity due to slow effects across the session such as changes in satiety and motivation. Finally, we included three binary variables that took value one only if the previous trial was an early lick, a False-alarm or a Hit trial, to capture the effect of the previous trial outcome on the subsequent trial. In total, our design matrix had 50 columns, corresponding to the number of free parameters of the model. To assess the significance of each variable in the design matrix, we fitted a new GLM obtained by removing the variable of interest from the full model (reduced model). If for a certain neuron the reduced model fitted the data significantly worse than the full model (p < 0.05, according to a likelihood ratio test [Buse, 1982]), then that neuron was considered significantly modulated by the removed variable. The reduced model was fitted independently for each fold, using the same data splitting used for the full model.
In the likelihood ratio test, the test statistic is given by $2 \log (L_{\text{full}} / L_{\text{reduced}})$, where $L_{\text{full}}$ and $L_{\text{reduced}}$ are the full and reduced model likelihoods, respectively. These statistics were computed for each fold and then averaged to obtain an average statistic, from which the final p-value was computed (Buse, 1982). Note that in the presence of correlations among variables, this approach is stringent in that it tends to underestimate the significance of different variables. To separately assess the effect of the onset of event-like variables from their delayed effects, we quantified their significance independently by separately removing the first two basis functions or the remaining eight basis functions (Visual, Auditory and Lick). For the whisker variable, since it was very brief in time (10 ms), we removed either the first or the remaining nine bins (referred to as 'Whisker' and 'Delay' respectively in Figures 6D and 6E). To assess the significance of the modulation due to lick-preparatory neuronal activity, we separately removed the two basis functions that preceded the lick onset (referred to as 'Lick initiation' in Figures 6D and 6E). Spatial weight maps for selected model variables (Figure S7E) were built by first averaging the weights over the time course of the variable, i.e., by averaging over the weights of the boxcar basis functions. Next, for each neuron these weights were projected on the reconstructed anatomical location in 2D, and were then averaged across all neurons within a given spatial bin (50 × 50 μm). The resulting spatial weight map was smoothed using a 2D Gaussian kernel (sigma = 150 μm). All the GLM analysis was performed in MATLAB using a combination of existing and custom-written code.
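The likelihood ratio statistic for a Poisson model can be sketched in Python as follows. The sketch takes the predicted rates of the full and reduced models as given (the model fitting itself is omitted), and the function names and example values are hypothetical. Converting the averaged statistic into a p-value would conventionally use a chi-square reference distribution with degrees of freedom equal to the number of removed parameters, which is stated here as an assumption about the procedure.

```python
import math


def poisson_log_likelihood(counts, rates):
    """Log-likelihood of spike counts under Poisson rates (expected counts per bin)."""
    return sum(y * math.log(r) - r - math.lgamma(y + 1)
               for y, r in zip(counts, rates))


def likelihood_ratio_statistic(counts, rates_full, rates_reduced):
    """2 * log(L_full / L_reduced), computed from log-likelihoods."""
    return 2.0 * (poisson_log_likelihood(counts, rates_full)
                  - poisson_log_likelihood(counts, rates_reduced))


def mean_statistic_across_folds(folds):
    """Average the statistic over folds; each fold is (counts, rates_full, rates_reduced)."""
    stats = [likelihood_ratio_statistic(*fold) for fold in folds]
    return sum(stats) / len(stats)


# Hypothetical held-out data for two folds: the full model tracks the counts,
# whereas the reduced model predicts a constant rate.
fold_1 = ([0, 1, 3, 0, 2, 4], [0.2, 1.0, 3.0, 0.2, 2.0, 4.0], [10 / 6] * 6)
fold_2 = ([1, 0, 2, 5, 0, 1], [1.0, 0.3, 2.0, 4.5, 0.3, 1.0], [9 / 6] * 6)
lr_stat = mean_statistic_across_folds([fold_1, fold_2])
```

A larger statistic indicates that removing the variable degrades the fit more strongly.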

Assessing optogenetic manipulation impact
We measured the impact of optogenetic activation in tjM1 by counting early licks evoked during the delay period. Sessions with a difference between Hit rate and False alarm rate smaller than 0.2 were excluded from the analysis. The early lick rates with the strongest optogenetic stimulation (9 mW) were calculated in each session to test statistical difference between light-off and light trials.
To quantify the impact of optogenetic inactivation we compared mouse averaged performance (n = 9; Hit rate, False alarm rate and Early lick rate) for different light windows (i.e., Baseline, Whisker, Delay, Response) to light-off control trials. P-values were corrected for multiple comparison (i.e., 4 windows) using Bonferroni correction.
To assess the effect of inactivation on movements, we quantified the change in light versus no-light trials by defining a movement modulation index as: for different orofacial movements (whisker, jaw and tongue speed) and lick spout reading with the piezo sensor ( Figure S8).

Quantifying involvement index
The involvement index was defined by combining the neuronal correlates and behavioral impact of optogenetic inactivation. For each pair of area and temporal window of interest, we built two distributions of bootstrap estimation of the mean, separately for neuronal correlates and inactivation impact, by bootstrapping 1000 times. The neuronal correlates were quantified as the mean firing rate difference in Hit versus Correct-rejection trials across all neurons recorded from 22 Expert mice. The inactivation impact was quantified as the mean change in Hit rate across 9 VGAT-ChR2 mice. The distribution of involvement index was calculated as the product distribution of the two bootstrap distributions.
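One way to form such a product distribution is sketched below in Python. The element-wise pairing of independent bootstrap draws is an assumption about how the product distribution was constructed, and the function names, seeds, and example values are hypothetical.

```python
import random


def bootstrap_means(values, n_boot, rng):
    """Bootstrap distribution of the mean: resample with replacement n_boot times."""
    n = len(values)
    return [sum(rng.choice(values) for _ in range(n)) / n for _ in range(n_boot)]


def involvement_index_distribution(neural_diffs, hit_rate_changes, n_boot=1000, seed=0):
    """Element-wise product of two independent bootstrap distributions of the mean.

    neural_diffs: per-neuron firing rate difference (Hit minus Correct-rejection).
    hit_rate_changes: per-mouse change in Hit rate under inactivation.
    """
    rng = random.Random(seed)
    neural = bootstrap_means(neural_diffs, n_boot, rng)
    behavior = bootstrap_means(hit_rate_changes, n_boot, rng)
    return [a * b for a, b in zip(neural, behavior)]


# Hypothetical data for one area and one time window.
neural_diffs = [2.1, 0.5, 3.0, 1.2, 0.8, 2.6]   # Hz, one value per neuron
hit_rate_changes = [-0.30, -0.25, -0.40, -0.10, -0.35, -0.20, -0.15, -0.30, -0.22]
involvement = involvement_index_distribution(neural_diffs, hit_rate_changes)
```

With these illustrative inputs every product is negative, reflecting increased firing in Hit trials combined with a drop in Hit rate under inactivation; the sign convention of the published index is not specified here.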

Statistics
Data are represented as mean ± SEM unless otherwise noted. The Wilcoxon signed-rank test was used to assess significance in paired comparisons, and the Wilcoxon rank-sum test was used for unpaired comparisons (MATLAB implementations). Analysis of spiking activity, selectivity of calcium signals, and involvement index was performed using a non-parametric permutation test.

(D) Orofacial movements in Hit trials for Novice and Expert RCaMP mice (n = 7). Same configuration as Figure 1F, left.
A and C, each frame shows instantaneous ΔF/F0 without averaging (10 ms/frame). For each pixel, baseline activity in a 50 ms window before visual cue onset was subtracted.
(F) Hemodynamic signal in tdTomato mice evoked by whisker stimulation. The brief whisker stimulation (a single 10 ms pulse) used in the task did not evoke detectable changes in wS1, but a prolonged whisker stimulation (100 pulses each of 10 ms at 100 Hz lasting 1000 ms) evoked a strong reduction of cortical fluorescence (n = 3 tdTomato mice).
(F) Same as E but for Expert mice. Note prominent differences during the delay period after whisker stimulus across many regions including wS2, DLS, wM2, ALM and tjM1.
ROI size, 3×3 pixels. Activity did not change by learning in the input node wS1, but did diverge in other regions. Note sharp decrease of signal in tjM1 of Expert mice.
(B) Time-lapse maps of mean firing rate immediately after whisker onset. Novice and Expert mice in Hit trials. Same configuration as Figure S4D.
(A) Orofacial movements in selected Quiet trials. The same configuration as Figure 1F, left. Left, grand average movements in all Hit trials without selection (same data as Figure 1F). Right, grand average movements for selected Quiet trials where mice did not show jaw movements. Note that preparatory movements in the delay period after whisker stimulus disappeared.
(C) Changes in movement during response window. Similar to (B), but when light was applied and movements were quantified during the response window. Similarly, only Hit trials are included for both light and light-off trials.

Supplemental Table S1
Supplemental