Maintained avalanche dynamics during task-induced changes of neuronal activity in nonhuman primates

Sensory events, cognitive processing and motor actions correlate with transient changes in neuronal activity. In cortex, these transients form widespread spatiotemporal patterns with largely unknown statistical regularities. Here, we show that activity associated with behavioral events carry the signature of scale-invariant spatiotemporal clusters, neuronal avalanches. Using high-density microelectrode arrays in nonhuman primates, we recorded extracellular unit activity and the local field potential (LFP) in premotor and prefrontal cortex during motor and cognitive tasks. Unit activity and negative LFP deflections (nLFP) consistently changed in rate at single electrodes during tasks. Accordingly, nLFP clusters on the array deviated from scale-invariance compared to ongoing activity. Scale-invariance was recovered using ‘adaptive binning’, that is identifying clusters at temporal resolution given by task-induced changes in nLFP rate. Measures of LFP synchronization confirmed and computer simulations detailed our findings. We suggest optimization principles identified for avalanches during ongoing activity to apply to cortical information processing during behavior.

Recent analysis of neuronal avalanches during sustained dynamic movie stimulation in the visual cortex of turtle  and visuo-motor tasks in human fMRI (Fagerholm et al., 2015) and EEG (Arviv et al., 2015) suggest transient dynamics that differ from scale-invariant avalanches. Similarly, neural models implied transient deviations from avalanche dynamics during strong external inputs (Millman et al., 2010;Taylor et al., 2013;Hartley et al., 2014;Stepp et al., 2015;Williams-García et al., 2014). Here, we hypothesize that transient changes in activity during evoked responses could simply reflect a change in the rate of avalanches rather than a transition to a different dynamical regime, for example a change in synchronization. Such a rate change likely requires modifications to standard avalanche analysis, as the standard was originally introduced for stationary rates of activity (Beggs and Plenz, 2003). In fact, by taking changes in activity rate into account, here we show in the behaving nonhuman primate that cortical dynamics during sensory and cognitive processing exhibit clear power law organization in line with the maintenance of neuronal avalanches. In line with these findings, our phase-synchronization analysis demonstrates that synchrony levels are maintained for ongoing and task-related activity. We expand on our experimental results using computational models and explore the sensitivity of this advanced approach to distinguish avalanche dynamics from states with changed synchrony, for example supercritical dynamics. Our results extend previous work on the dynamics of resting state activity and suggest that the optimization principles identified for critical dynamics during resting state activity extend to and sculpt cortical information processing during behavior.

Results
In order to demonstrate avalanche maintenance during evoked activity that is independent of specific brain region or behavioral paradigm, we chose two types of tasks and two cortical areas in two adult macaque monkeys. High-density microelectrode arrays (MEA; 10 Â 10; corner electrodes missing; inter electrode distance Dd = 400 mm) were chronically implanted in left prefrontal cortex (PFC) of monkey A and left premotor cortex (PM) of monkey B. We recorded extracellular unit activity (0.3-3 kHz) to confirm involvement of recording sites in behavioral tasks as well as the LFP (1-100 Hz), which is commonly used for avalanche analysis (Beggs and Plenz, 2003;Petermann et al., 2009;Scott et al., 2014).

Avalanche maintenance during a cue-triggered cognitive task
In our first behavioral condition, we asked whether scale-invariant dynamics can be found when strong transient activity changes due to cognitive processing are present. To this end, we analyzed avalanche dynamics in the dorsal-lateral prefrontal cortex (dlPFC; left hemisphere) for a visual-motor mapping task (monkey A; Figures 1-3). Monkey A applied a learned rule (>95% success rate) to two individually presented visual cues, which instructed the retrieval of food from a left feeder (left trials) or right feeder (right trials).
It is known that the dlPFC is involved in such visual-motor mapping task (Asaad et al., 1998;Puig and Miller, 2012). Consistently, in all of 29 putative single-units recorded in the dlPFC (1 day, 1 hr recording session), unit activity was variable with transient increases or decreases in firing rate around cue on-and off-set that depended on left or right trials ( Figure 1A; three examples shown). No significant change in Fano Factor (Churchland et al., 2010) was found during task execution ( Figure 1B; see Materials and methods). On the other hand, the average LFP changed consistently for each electrode exhibiting two waveforms that differed between left and right trials ( Figure 1C). We extracted peak times of the negative LFP (nLFP) at a threshold thr = -2 SD for further analysis (see Materials and methods), and observed a rapid~8-fold increase in nLFP rate~0.5 s after cue   Figure 1. Neuronal avalanches in prefrontal cortex during a cognitive task. (A) Firing rate changes for putative single units demonstrate PFC region recorded by the array is involved in task performance. Unit raster (top) separated into right (green) and left (black) trials with corresponding average firing rates (bottom). Cue presentation elicits distinct rate changes for units 1 and 2 during left and right trials, but not for unit 3. Colored area (orange/purple) indicates periods that differ significantly (high/low) from baseline (see Materials and methods). Vertical solid lines: Cue on-and off-set. ( presentation, when the monkey was preparing to reach for the left feeder ( Figure 1D, single electrode; Figure 1E, all electrodes). A much weaker, more delayed increase was observed for right trials. Based on the delayed increase in nLFP rate for left trials, we divided trials into a baseline period before cue-onset (BASE; -0.4-0 s), and an early (EARLY; 0-0.4 s) and late epoch (LATE; 0.4-0.8 s) after cue-onset.
To study neuronal avalanches, two 1 hr recording sessions from 2 consecutive days resulting in 322 trials (day 1: 139, day 2: 183) were combined. In short, for each trial successive nLFPs on the array were concatenated into contiguous clusters at temporal resolution Dt (Beggs and Plenz, 2003). A cluster started with the transition from an empty time bin of width Dt to a time bin with at least one nLFP on the array. The concatenation continued until a time bin with no nLFP was encountered (see also Figure 2A). The temporal resolution Dt was defined by the inverse of the average nLFP rate on the array and is not a free parameter (Beggs and Plenz, 2003). In line with standard avalanche analysis, we kept Dt fixed ('fixed binning') and determined its value from the full recording to be Dt = 4.75 ms for monkey A. We found that the probability density function of cluster size s, that is the number of nLFPs in a cluster, followed a power law, for both left and right trials, with slope a close to -1.5 and a cut-off for s > 96, the maximal number of MEA electrodes (Klaus et al., Figure 1F, solid black and green lines for left and right trials, respectively; exponential vs. power law: p<10 À5 ; see Materials and methods). Importantly, these hallmarks of avalanche dynamics were destroyed by trial shuffling, which removes spatial trial-to-trial correlations while maintaining average rate changes ( Figure 1F, broken lines; exponential vs. power law: 1 À p<10 À5 ). This control demonstrates that the power law in cluster sizes, indicative of avalanche dynamics, does not simply arise from transient activity changes during behaviorally relevant periods.   By combining clusters from different behavioral epochs into one single distribution, however, transient deviations from scale-invariance could cancel each other, giving the appearance of maintained avalanche dynamics during behavior. Indeed, when separating clusters according to trial epochs BASE, EARLY and LATE, systematic differences in the corresponding avalanche raster were revealed. Specifically, the increase in nLFP rate during LATE for left trials yielded significantly larger avalanches during the period prior to reach ( Figure 2B, red dots; Figure 2C, orange and purple segments indicate significantly high and low values, respectively; p 0:05; see Materials and methods). Accordingly, a preponderance of large avalanches was visible in the corresponding size distribution during LATE for left trials ( Figure 3A and B, left), whereas size distributions remained close to a power law during BASE and EARLY and for right trials ( Figure 3A and B; p<10 À5 , all cases). In the case of right trials, where the transient changes in activity rate are much less pronounced, size distributions are more similar among all epochs, with a small tendency for larger avalanches during the LATE epoch ( Figure 3A and B, right; p<10 À5 , all cases).
These deviations from scale-invariance compared to BASE, on the other hand, were abated when the temporal resolution Dt to define clusters was obtained from the average nLFP rate for each trial i and epoch separately (Dt BASE;EARLY;LATE i ; Figure 2D-F). We define this approach as 'adaptive binning'. It decreases Dt during periods of high nLFP rate, while increasing Dt for low activity periods ( Figure 2D). Indeed, adaptive binning obtained scale-invariant power laws in avalanche sizes that were similar for each epoch ( Figure 3C,D; p ¼ 0:06 0:08 ð Þ and p ¼ 0:11 0:1 ð Þ for EARLY (LATE) compared to BASE, for left and right trials respectively; see Materials and methods), with a change in exponent consistent with previous results (Beggs and Plenz, 2003;Petermann et al., 2009;Priesemann et al., 2014) and a cut-off for s > 96 (Klaus et al., 2011;Yu et al., 2014). Comparing BASE and LATE avalanches for each trial, the bias toward larger ones in the latter epoch seen when employing fixed binning is significantly decreased after adaptive binning for left trials ( Figure 3E, left), while during right trials, in which scale-free avalanches were obtained even for fixed binning, the bias does not change after adaptive binning. As can be seen in the corresponding raster plots and trial averages ( Figure 2E,F), avalanche size decreased in LATE concomitant with an increase in avalanche rate (more prominently for left trials). We conclude that the apparent over-representations of large avalanches during behavioral epochs were due to a mismatched temporal resolution Dt by not accounting for systematic changes in nLFP rate during different epochs.
We further explored if nLFP rate within an epoch correlates with behavioral outcome. Indeed, reaction time, defined as the time it took the monkey to reach the reward after the visual cue was switched off, did not correlate with nLFP rate (left trials: R = 0.036, p=0.73; right trials: R = À0.039, p=0.71; p-values indicate the likelihood of obtaining a correlation at least as high by chance), avalanche rate (left trials: R = 0.041, p=0.68 for fixed Dt and R = À0.022, p=0.79 for adaptive Dt; right trials: R = À0.033, p=0.74 for fixed Dt and R = 0.012, p=0.87 for adaptive Dt) or avalanche size (left trials: R = 0.085, p=0.52 for fixed Dt and R = 0.062, p=0.66 for adaptive Dt; right trials: R = 0.058, p=0.63 for fixed Dt and R = À0.025, p=0.72 for adaptive Dt), in line with our finding of a scale-free regime regardless of rates for each trial. Given the monkey's high performance (>95% successful trials), rates could not be analyzed for correct/incorrect trial outcome.

Avalanche maintenance during self-initiated movements
In monkey B, we studied activity changes during self-initiated movements. The monkey, without further cue or prompting, initiated the touching of a pad with her right hand, after which a reward was provided. Similar to monkey A, two 1 hr sessions were recorded in 2 consecutive days. Since results were highly consistent between different days, the data were pooled together, resulting in 663 trials analyzed (day 1: 319; day 2: 344). As the recording site was within the arm representation area within the PM, recorded activity clearly reflected task execution ( Figure 4). About 200 ms before touching the pad, 18/43 (~42%) putative single-units increased in firing rate, whereas 5/43 units (~12%) decreased their firing rate around the time of touching ( Figure 4A,B). Fano Factor of unit activity decreased in line with previous reports ( Figure 4C) (Churchland et al., 2010). Similarly, the LFP transiently turned negative~200 ms before touching for all electrodes ( Figure 4D), together with a~4fold increase in the nLFP rate and a slight decrease thereafter ( Figure 4E), which readily separated activity into a baseline epoch (BASE; -0.8 to -0.4 s relative to the touch), followed by a pre-touch (PRE; -0.4 to 0 s) and post-touch epoch (POST; 0 to 0.4 s) ( Figure 4F).
When all data across epochs was analyzed using fixed binning, that is Dt based on the full recording (Dt = 10.3 ms), avalanche sizes followed a power law with -1.5 exponent and cut-off at~96 electrodes ( Figure 4G; exponential vs. power law: p<10 À5 ), which was destroyed by trial shuffling (exponential vs. power law: 1 À p<10 À5 ), again demonstrating that scale-invariant LFP clusters were  based on the trial-by-trial correlations between cortical sites and did not simply arise from transient changes in nLFP rate. Evaluation of avalanche raster plots based on epochs revealed systematic differences in the corresponding avalanche raster ( Figure 5A,B). Specifically, the increase in nLFP rate during PRE yielded significantly more avalanches and of larger size during that period, whereas the converse was true during POST when nLFP rate decreased ( Figure 5A; colored dots; Figure 5B; orange and purple segments respectively; p 0:05; see Materials and methods). The corresponding size distributions exhibited deviations from scale-invariance obtained for BASE ( Figure 6A and B, left) with an excess of large avalanches for PRE and a deficit of large avalanches for POST, despite all distributions remaining close to power laws (exponential vs. power law: p<10 À5 , all three epochs).
As in the case for the cognitive task, these systematic deviations in size distributions were abated when calculating Dt from the average nLFP rate for each trial i and epoch (Dt BASE;PRE;POST i ; Figure 6A and B, right). Through adaptive binning we once again obtained scale-invariant power laws in avalanche size that were similar for each epoch (p ¼ 0:177, PRE vs. BASE; p ¼ 0:05, POST vs. BASE), with the bias towards larger avalanches during PRE being significantly reduced ( Figure 6C). As can be seen in the corresponding raster plots and trial averages ( Figure 5C,D), avalanche size decreased in PRE concomitant with an increase in avalanche rate confirming our findings in monkey A. During POST, under representation of large avalanches due to mismatched temporal resolution Dt can also be partially compensated for by adaptive binning.

change in dynamics
We next illustrate the expected deviation from scale-invariance when rates change and its remediation using 'adaptive binning' in a simple schematic. In the hypothetical example shown in Figure 7A, we consider a baseline period with nLFP rate of 3 Hz per electrode, followed by a period with the rate increased to 9 Hz ( Figure 7A, left). Fixed binning, which sets Dt according to the average rate of 6 Hz, underestimates the temporal resolution for baseline activity, whereas it overestimates Dt for the high activity period. The resulting power law size distributions will be steeper for the baseline period and shallower for the high activity period (Beggs and Plenz, 2003). The combined distribution will show an upward bend at the cross-over of the individual distributions ( Figure 7A; middle, solid line) which seems to suggest a deviation from avalanche dynamics with an overabundance of large clusters similar to what would be expected for supercritical dynamics. Adaptive binning, that is calculating Dt for each period separately approximates the exponents for both power laws, resulting in a final distribution that matches a power law with an intermediate slope ( Figure 7A, right) (Beggs and Plenz, 2003;Petermann et al., 2009;Priesemann et al., 2014). We note that simply using shorter time bins for all data would retain different slopes for different regimes and thus is not an alternative to adaptive binning. Although we illustrated this principle using two sequential periods with different rates, the same holds for repeat trials with different rates. In Figure 7B, we separated trials of monkey B into lowand high-rate responses based on PRE epochs ( Figure 7B, left). Size distributions obtained from high-rate trials at fixed binning exhibit an overabundance of large avalanches ( Figure 7B, middle, arrow). With adaptive binning, deviations from the baseline distribution decreased and all distributions approached power laws with a -1.2 exponent ( Figure 7B, right). This demonstrates that trials in monkey B exhibit similar dynamics as those observed for monkey A or, more generally, that deviations from scale-invariance observed in the two experimental regimes can be explained by nLFP rate changes alone.
Neuronal simulations support 'adaptive' binning to recover scaleinvariance for critical dynamics Using simulations, we explored the conditions under which adaptive binning recovers scale-invariant size distributions. Simulations were carried out on a 10 Â 10 cellular automaton network, the approximate size of our MEAs, with cascades unfolding according to a critical branching process (Zapperi et al., 1995) (see Materials and methods). Quiescent times between cascades were randomly drawn from the experimentally obtained quiescent time distribution of monkey B. Simulated baseline activity, by randomly initiating avalanches in the network, was superimposed after 0.4 s by a second process, which produced a transient increase in rate mimicking movement initiation. We tested three different processes for two levels of rate increase and obtained corresponding size distributions for fixed and adaptive binning. Increasing rate by adding Poisson inputs that could trigger avalanches ( Figure 7C, 'Input driven') closely matched our experimental results. For fixed binning, an overabundance of large avalanches close to the cut-off was found in the size distribution ( Figure 7C, middle, arrow), which originated from the concatenation of spontaneous with input-triggered avalanches. We note that this overabundance manifests while the model interactions remained tuned to the critical point in the limit of zero input. Accordingly, the power law in the size distribution was recovered by means of adaptive binning ( Figure 7C, right). Next, a high rate was achieved by increasing the likelihood of nodes exciting each other, that is changing to a supercritical branching process with increased synchronization ( Figure 7D, 'Supercritical'). The corresponding size distribution exhibits an overabundance of large avalanches that ( Figure 7D, middle) reflects explosive growth as activity propagates in the network. Adaptive binning, which is tied to the overall increase in event rate, was insufficient to compensate for the increase in synchrony. It failed to collapse the cut-off of the distributions, instead further separating non-global cascades from synchronized, global activity ( Figure 7D, right, arrows). Finally, event rate was increased by adding uncorrelated activity, that is Poisson inputs that do not trigger avalanches ( Figure 7E, 'Poisson noise'). The resulting size distribution exhibits a preponderance of intermediate size avalanches that reflects the mean rate of uncorrelated events introduced ( Figure 7E, middle). Here, even low-noise levels destroy the power-law regime. While distributions tend to get closer to a power law with adaptive binning, their cut-off at system size is lost in the process ( Figure 7E, right).

'Adaptive' thresholding as an alternative approach to 'adaptive' binning
Adaptive binning reduces Dt for periods of high rates, thereby reducing the number of nLFPs per time bin. Alternatively, one can increase the threshold for nLFP detection during these periods, likewise reducing the number of nLFPs encountered per time bin (see also ). The reverse argument holds when periods of low rates are encountered. We tracked the mean and standard deviation of the LFP in successive windows of 100 ms width and used a moving threshold thr = mean -2SD to identify nLFPs within each window ( Figure 8A,C). Thus, activity modulations induced by the task were mainly reflected in nLFP amplitude, but not rate. This approach, which trades temporal resolution for local sensitivity, collapsed distributions from different periods (grey and orange areas in Figure 8B,C) in monkey A into a similar power law, indicative of sustained avalanche dynamics during task performance ( Figure 8D,E).

Phase synchrony analysis supports recovered scale-invariance through 'adaptive' binning
Experiments and theory have consistently shown that size distributions of clustered activity are sensitive to the degree of synchrony in a system (Shew et al., 2009;Larremore et al., 2011;Shew et al., 2011;Yang et al., 2012;Bellay et al., 2015;Shew et al., 2015). Our demonstration of scale-invariant size distributions suggests avalanche dynamics are maintained throughout task performance, which should mirror a sustained average synchrony among cortical sites. Previous studies have shown overall phase-synchronization between sites sensitively captures fast dynamical changes in a network (Kitzbichler et al., 2009;Yang et al., 2012), in particular at higher frequencies (Meisel et al., 2016). In line with our results from adaptive binning and thresholding, levels of LFP synchrony R (see Materials and methods) were not different across epochs in monkey A (Left: 0.56 ± 0.015, 0.56 ± 0.01 and 0.56 ± 0.01; Right: 0.56 ± 0.01, 0.55 ± 0.01 and 0.56 ± 0.004; mean ± standard deviation for BASE, EARLY and LATE respectively; p>0.05; two-tailed paired student's t-test) as well as in monkey B (BASE, 0.450 ± 0.003; mean ± standard deviation; vs. PRE, 0.452 ± 0.003; vs. POST, 0.455 ± 0.003; p>0.05).

Discussion
Our findings suggest that cortical avalanches are maintained in nonhuman primates during information processing, thus adding avalanche dynamics to the many links found between ongoing and evoked activities in the brain (Arieli et al., 1996;Tsodyks et al., 1999;Womelsdorf et al., 2006).
During ongoing activity, avalanche dynamics have been demonstrated in various brain areas including the primary visual (Hahn et al., 2010), primary motor , somatosensory (Gireesh and Plenz, 2008), premotor and prefrontal (Yu et al., 2014) cortices, reflecting a common dynamic feature regardless of specific cortical areas. Therefore, we aimed to extract a similar rule that govern avalanches during evoked activity. To achieve this goal while minimizing the number of nonhuman primates used, we chose to record activity from different cortical sites and different behavioral tasks in two monkeys, while the common condition is that the recorded region was strongly activated by the corresponding task. Monkey A performed a visual-motor mapping task, in which the correct actions have to be applied to specific cues (Wise and Murray, 2000). This task is known to involve the dorsal-lateral prefrontal cortex (Asaad et al., 1998;Puig and Miller, 2012), in line with our finding of differential spiking as well as LFP responses for the left and right cues. Conversely, the target area in our self-initiated motor task for monkey B was the premotor cortex, which is involved in movement planning (Weinrich et al., 1984;Brasted and Wise, 2004) and self-initiated movement (Hoffstaedter et al., 2013). Consistent with previous studies, we found that, at both the single neuron and population level, the premotor cortex exhibited strong activation a few hundred milliseconds before the self-initiated touch. Our highly consistent results obtained for these different cortical areas and behavioral tasks suggest that avalanche dynamics might be a general principle governing cortical dynamics underlying various behaviors.
Our trial-shuffling control demonstrated that the power law in avalanche sizes reflects correlations that are maintained during behavior on a trial-by-trial basis and that this statistical feature did not simply arise from non-stationary rates. Recently, power laws, for example Zipf's law, have been shown to arise when the range of event probabilities becomes enlarged, for example from underlying common correlations due to common inputs typically captured in latent variables (Schwab et al., 2014;Aitchison et al., 2016). Because trial-shuffling maintains such latent variables as captured in the average evoked response, yet, abolishes scale-invariance, our findings suggest that the scaleinvariance encountered in our behavioral data reflects intrinsic correlation structure of trial-to-trial variability and not common correlations.
Originally introduced for resting or spontaneous activity (Beggs and Plenz, 2003), a fixed temporal resolution Dt assumes stationary rates, which is typically violated during sensory epochs or periods of movement when event rates change systematically. For avalanche analysis, the temporal resolution Dt is based on the inter-event interval distribution of the population, for example successive nLFP events on the array. This temporal scale Dt is not a free parameter. It further depends on the spatial scale, that is inter-electrode distance Dd, as well as local minimal event threshold thr. Importantly, all three parameters Dt, Dd and thr have been experimentally linked and shown to affect the slope of the power law in a predictable manner (Beggs and Plenz, 2003;Petermann et al., 2009). In our initial approach, that is adaptive binning, we kept the threshold thr and inter electrode distance Dd constant while changing Dt in accordance with the change in event rate, which in our experiments increased up to a factor of 10. This recovered power law statistics, which was confirmed by our simulation in which rate changes were created by increasing avalanche rate. These findings are in line with reported changes of the power law slope with changes in Dt (Beggs and Plenz, 2003;Petermann et al., 2009;Priesemann et al., 2014).
Our results highlight the importance of the principle of 'separation of time scales' (Bak et al., 1988;Vespignani and Zapperi, 1998;Plenz, 2012;Priesemann et al., 2014) for measuring avalanche dynamics. That is, the time scale at which cascading events unfold within individual avalanches, characterized by the avalanche duration, should be significantly shorter than the time scale at which different avalanches emerge, characterized by the inter-avalanche interval. Estimating the temporal resolution Dt from the inverse of the event rate provides a reasonable separation of time scales, because it balances two potential errors -false concatenation of separate avalanches and false separation of a single avalanche (Beggs and Plenz, 2003;Plenz, 2012).
In an alternative approach, we kept spatial resolution Dd and temporal resolution Dt fixed while changing threshold thr. This approach does not affect the power law organization, as demonstrated in nonhuman primates ) and human fMRI (Tagliazucchi et al., 2012). Particularly when Dt is difficult to change, changing thr might provide an alternative means to reestablish a separation of time scales, since large local events are less common than small local events. Our recovery of a power law for large local events during high activity periods demonstrates the analogous roles time and local event size play in avalanche dynamics. Our separate demonstration that synchrony does not change between BASE, EARLY/PRE and LATE/POST conditions further supports our finding that avalanches are maintained despite changes in event rates. Conversely, if thresholds are calculated individually for different temporal segments corresponding to different conditions of interest and the resulting distributions deviate from power laws along with concurrent changes in synchrony measures, then one has strong indication that the underlying dynamics deviate from criticality (Meisel et al., 2013).
It is important to emphasize that adaptive binning and thresholding are only tools to reveal the signatures of scale-free dynamics after they become blurred by increased drive and absence of separation of time-scales. They are not meant to be a perfect theoretical solution to the problem and have obvious limitations. For instance, there is no clear answer as to how large the window for calculating adaptive binning or thresholding should be. If it is too small, the inherent fluctuations of a critical system will be regarded as rate changes and therefore no power law dynamics can be observed. If it is too large, the rate changes can be averaged out inside them, and therefore the problem of the non-stationarity would remain. In the case of our experiments, we had a clear indication of when we should expect a rate change, which gave us a good parameter to choose the window size (~400 ms). Another problem arrives for extreme rate regimes. For very high drive, in which Dt would be much shorter than the time it takes for spikes to propagate from a neuron to its neighbors, it is unlikely that a power law can be recovered since activity becomes uncorrelated at that temporal resolution. On the other extreme, if there is almost no activity to be observed, a very large bin may require unrealistic long recordings to produce power laws. These two extreme cases may explain some of the residual error we found for left trials during PRE for monkey A (slightly larger avalanches even after adaptive binning) and during POST for monkey B (slightly smaller avalanches even after adaptive binning).
Another important point is that the advantages of criticality within a certain area allow it to optimally process information, regardless of the nature of the input received by that area. In fact, one of the advantages of criticality is maximal dynamic range (Kinouchi and Copelli, 2006;Shew et al., 2009;Gautam et al., 2015), which means that the downstream network which employs criticality will be able to respond properly for a larger range of different inputs. Nevertheless, it is possible downstream neurons employ a sort of adaptive binning mechanism. It has been suggested in simulations that the activity of an upstream network can influence how individual downstream neurons perform spatial and temporal integration at the synaptic level (Bernander et al., 1991;Rapp et al., 1992). The more input a cell receives, the more synchronized these incoming spikes have to be in order to produce a spike on the post-synaptic cell, effectively increasing the temporal resolution by which this downstream cell process its input (Bernander et al., 1991), similarly to how our proposed adaptive binning method works.
Neuronal avalanche dynamics are predominantly located in cortical layers 2/3 (Stewart and Plenz, 2006;Petermann et al., 2009) and exhibit strong non-linear components involving the interaction between cortical sites, that is the interactions of neurons or neuronal groups (Thiagarajan et al., 2010;Yu et al., 2011;Plenz, 2012). Accordingly, maintained avalanche dynamics during self-initiated and cue-related responses suggest non-linear interactions in superficial layers during cortical processing, which numerous studies have shown to be indeed the case. For example, evoked responses in superficial layers of visual cortex in the nonhuman primate have been found to be non-linear, potentially involving the local recurrent network (Williams and Shapley, 2007;Xing et al., 2012). Similarly, functional connectivity based on the interaction from neuronal firing in layer 2/3 of nonhuman primates has been found to contribute to motor coding (Hatsopoulos et al., 1998) and to be as robust or even superior over tuning curves in predicting motor outcome (Stevenson et al., 2012). With respect to their spatiotemporal spread, avalanches reveal a fractal dimension when analyzed individually (Plenz, 2012), with nearest-neighbor relationships in the average (Gireesh and Plenz, 2008;Yu et al., 2014), in line with compact spatiotemporal spreading of average evoked responses reported, for example, for premotor cortex (Rubino et al., 2006).
Our current findings may retroactively explain previous reports that found deviations from avalanche dynamics during stimulus presentation. Arviv et al. (2015) found power law avalanche size distribution when considering the full 1 s window for a visual detection task in human MEG recordings. This is consistent with our results when considering the entire period of task performance (~2 s). They also observed a trend toward supercritical dynamics during transient activity periods, just as we found predominantly large avalanches during high rate periods. Our analysis, though, shows adaptive binning as one potential approach to distinguish true supercritical dynamics from avalanche dynamics. Fagerholm et al. (2015) reported subcritical dynamics during focused attention in human EEG, in line with the observation of reduced correlations during such episodes (Cohen and Maunsell, 2009;Mitchell et al., 2009;Harris and Thiele, 2011). However, changes in activity levels were not reported in that study, and size distributions were evaluated outside their cut-off (Yu et al., 2014). Finally, early stimulus responses in the turtle's visual system  have been reported to deviate from avalanches, a finding which might reflect the statistics of the stimulus used rather than intrinsic cortical dynamics.
Numerical simulations of systems under external drive or with non-zero spontaneous neuronal activity can never be truly critical, since the quiescent phase (and with it the transition connecting it to an active phase) disappears under these conditions (Taylor et al., 2013;Hartley et al., 2014;Williams-García et al., 2014). This effect is increasingly pronounced in simulations with small network size and large activity drive. Given the relatively small change in firing rate observed during behavior and the large size of the cortical network, our findings are in line with simulations that show weakly driven critical system to exhibit power law behavior when the temporal resolution is matched to the activity rate (Hartley et al., 2014).
In conclusion, our findings provide strong evidence that the scale-free organization of neuronal avalanches is maintained during evoked responses in premotor and prefrontal cortex, despite systematic fluctuations in firing rates. We therefore suggest that neuronal information processing during tasks might capitalize on various functional benefits shown for critical dynamics. This calls for further investigation into how critical dynamics may explain the execution of specific brain functions. Of equal importance, our results also shed new, methodological light on how to identify true deviations from avalanche dynamics under pathological conditions.

Materials and methods
Behavioral training and electrophysiological setup All procedures followed the Institute of Laboratory Animal Research (part of the National Research Council of the National Academy of Sciences) guidelines and were approved by the NIMH Animal Care and Use Committee (protocol #LSN-11). Two adult rhesus monkeys (Macaca mulatta) were surgically implanted with a titanium head post each under sterile conditions while under isoflurane anesthesia. After recovery, the monkeys were trained to sit head-fixed in a primate chair for behavioral performance. In the cue-initiated task, monkey A (male, 9 years old, 8 kg) had to press a bar in front of the chair upon presentation of the 'trial-initiation' cue on a computer screen (a grey square in the center of the screen). After~2 s, the initiation cue was followed by an 'instruction' cue, either 'green cross' or 'red circle', for the duration of 1 s. Upon cue disappearance, monkey A had to release the bar and reach with his right arm to one of two specialized feeders (Mitz et al., 2001). The 'green cross' instructed the monkey to reach to the left feeder for reward ('left trials'; contralateral to the reaching hand); a 'red circle' instructed reaching for the right feeder ('right trials'; ipsilateral to the reaching hand). Approaching the incorrect feeder rapidly triggered a proximity sensor to sequester the food rewards in both feeders, which prevented the monkey from obtaining a reward on that trial. The inter trial interval was 3-5 s. In the self-initiated motor task, monkey B (female, 8 years old, 7 kg) had to move her right arm to touch a pad placed~30 cm in front of the monkey chair after which a food reward was given. Pad touching was self-initiated: no cue was presented. After the monkeys learned their respective tasks, a multi-electrode array (MEA; 96 channels -10 Â 10 without corners, inter-electrode distance: 400 mm; BlackRock Microsystems) was chronically implanted in the left prefrontal area (area 46, monkey A; electrode length: 0.55 mm) or the arm representative region of the left premotor cortex (monkey B; electrode length: 1 mm -PM is thicker than PFC, therefore longer shanks were employed). After recovering from the implantation surgery (~1 week), behavioral training resumed. The LFP (1-100 Hz band pass filtered; 2 kHz sampling frequency) and extracellular unit activity (0.3-3 kHz band pass filtered; 30 kHz sampling frequency) were simultaneously obtained from the implanted MEA. Electrophysiological signals as well as the timing of behaviorally relevant events, for example touching the pad, presentation of visual cues, etc., were stored for off-line analysis.

Unit analysis
Extracellular unit activity was extracted offline by spike sorting (Offline Sorter, Plexon Inc.). Putative single-unit activity was identified whenever a clear clustering and separation of waveforms could be identified in at least one feature space. In total, 29 and 43 putative single-units were identified in monkeys A and B, respectively. For calculating the peristimulus time histogram (PSTH), firing rates were calculated for a 250 ms window moving in steps of 50 ms. Statistical significance of firing rate changes was based on surrogate spike trains from individual units obtained by randomly permuting inter-event intervals. Specifically, a firing rate change was considered significantly above (below) expectation when it was among the top (bottom) 2.5% of the firing rate distribution obtained from the shuffled data. For this analysis, left and right trials were treated separately for monkey A.

Fano Factor
We employed the method described by Churchland et al. (2010) to calculate the Fano Factor, defined as the variance (over trials) of the spike count divided by the mean, in order to evaluate how the variability of the neuronal population studied evolved with time. We computed the variance and mean spike count for each time window (100 ms sliding window moving in steps of 40 ms) for each neuron separately. After that, a linear regression was performed on the scatterplot of the variance versus the mean spike count in which each point represents a single neuron. The slope of this regression measures the Fano Factor for the relevant time window, and the estimated error in the slope calculation provides a 95% confidence interval. To account for the effect from the increase in firing rate on the Fano Factor, a mean-match procedure was employed as described previously (Churchland et al., 2010). In short, this procedure removes units from the analysis until the mean firing rate of the remaining units does not change significantly between the periods studied (see Figures 1B and 4C; the amount of data surviving the respective mean-matching is given in the legends). Statistical significance of changes in the Fano Factor due to the task was assessed by a p-value computed from the probability of observing a given slope based on the confidence interval calculated from the baseline level.

LFP analysis
Negative deflections in the LFP (nLFPs) were detected by applying a threshold at -2 (monkey A) or -2.5 (monkey B) SD of the continuous LFP fluctuations estimated for each electrode separately (fixed and adaptive binning; for adaptive thresholding see below). The time stamps of nLFPs, determined by the data points with the largest negative amplitude, were then employed for avalanche analysis. The same procedure employed to assess the significant of firing rates was also employed for nLFP rates.

Avalanche analysis using fixed and adaptive temporal binning
Avalanches were identified by concatenating nLFPs occurring in successive time bins of width Dt on the array into temporally contiguous spatiotemporal clusters as described originally (Beggs and Plenz, 2003). The avalanche size was defined as the number of nLFPs within the avalanche. The temporal resolution Dt for the avalanche analysis was defined by the inverse of the average nLFP rate on the array over a given period and thus is not a free parameter. Two approaches for binning the data were employed. The first one, fixed binning, was obtained by calculating the average nLFP rate across all epochs (see Figures 1 and 4 for different epochs). The second one, adaptive binning, was obtained by calculating the average nLFP rate over different epochs separately. Therefore, this second method resulted in Dt that varied between epochs according to changes in the activity rate. The adaptive binning method calculates the temporal resolution on a trial-by-trial basis, taking into account highly variable responses across trials.

Significance of changes in average avalanche rate and size
In order to test for significant changes in average avalanche rate and size during task performance, we compared the values from each point in time to those obtained from 100 shuffled sets, providing a p-value. Significance was defined at p 0:05. For size analysis, shuffled sets were constructed by randomly permuting avalanche sizes, while keeping avalanche occurrence time (thus also keeping the avalanche rate unchanged). For rate analysis, shuffled sets were constructed by randomly assigning a new occurrence time (within the same trial, drawn from a uniform distribution) for each avalanche. For monkey A, left and right trials were treated separately.

Avalanche size distribution analysis
In order to fit power laws and exponentials to the probability densities of avalanche size, and obtain the corresponding exponents, we employed the Maximum Likelihood Estimation method, as previously described (Clauset et al., 2009;Klaus et al., 2011). The distribution used for the power law fit had a sharp cut-off: p s ð Þ ¼ Cs Àa exp À s=s 0 ð Þ g ½ , where the normalization constant is given by The three parameters to be determined were the exponent of the power law a, the cut-off s 0 and the strength of the decay beyond the cut-off g. In order to evaluate the uncertainty in the size distributions, we employed a bootstrap procedure (Efron, 1979) to estimate the 95% confidence interval for the probability density at each avalanche size from 1000 resampled data sets, which were obtained by randomly sampling the same number of avalanches found in each case from the original distribution. In the case of model distributions (see below), 100 different instances of the network were simulated, from which the confidence intervals were obtained. The significance of the different fits to the data was obtained by employing the likelihood test: where D is the test statistics and L pl is the likelihood for the power-law fit. A p-value is obtained using the calculated statistics from the chi-squared distribution. When comparing power laws to exponentials L alt is the likelihood for the exponential fit, and the p-value computes the chances that the fit with lower likelihood value is actually better. When comparing different distributions L alt is the likelihood for the alternative power-law fit (e.g. when comparing PRE distribution to BASE in Figure 6A right, the alternative fit is the one obtained for the BASE distribution). In this case, the p-value computes the chances that the compared distribution cannot be explained by the alternative fit or, in other words, how similar they are. Note that a p-value above 0.05 in this case does not imply that the distributions are significantly different (whereas p-values of 0.05 or lower indicate that the distributions are significantly similar).

Trial shuffling
In order to assess the influence of rate modulation introduced by the task on avalanche size distributions, we compared the original data to data obtained by randomly shuffling the trial order for each electrode independently, that is, the shift predictor (Gerstein and Perkel, 1969;1972). This procedure created datasets that preserved the average change in nLFP rate for all electrodes and trials, but destroyed spatio-temporal correlations within individual trials.

Cellular automaton models
To study the interplay between avalanches and activity rate, we employed a network of 10 Â 10 cellular automata (Kinouchi and Copelli, 2006;Ribeiro et al., 2014). Each site, which represents the activity of a single channel from the experimental data, cycles through its 11 states: x i t ð Þ ¼ 0 if the i th site is quiescent at time t, x i t ð Þ ¼ 1 if it is active, and x i t ð Þ ¼ 2; . . . ; 10 if it is refractory. A quiescent site at time t can become active at t + 1 if any of its pre-synaptic neighbors is active at t and transmits successfully, each connection independently with probability P. Once a site is active, its state is incremented according to the following equation until it is back to quiescence: x i t þ 1 ð Þ ¼ x i t ð Þ þ 1 ½ mod 11 (deterministic refractory period). Each site sends k ¼ 16 connections to randomly chosen post-synaptic sites, thus forming a random network. Each connection has a probability P of transmitting a spike, which was tuned according to P ¼ 1=k in order to achieve critical dynamics in the network. For avalanche initiation, we randomly chose a site to become active in an otherwise quiescent network. After the propagation of activity ended, a time interval drawn from the inter-avalanche interval distribution obtained from experimental data was imposed before another avalanche was initiated.
In this model, the event, that is nLFP, rate was adjusted according to the baseline level in the experimental data by employing a time step of 2 ms (implying a refractory period of 18 ms). Four different approaches to introduce a transient increase in the activity rate, as observed in the recordings from both monkeys, were studied. In the Input driven model, we added independent Poisson inputs that triggered avalanches for each site, with a rate that increases and then decreases linearly from trial time -0.35 s to -0.05 s, peaking at two different possible levels: in the low drive regime, h max ¼ 1 s À1 and in the high drive regime, h max ¼ 6:5 s À1 . In the supercritical model, we increased the probability of transmission P by multiplying it by a modifier with the same temporal profile described above for the drive, with the modifier ranging from 1 (no change) to a maximum of either 1.5 (50% increase in the probability of spike transmission) or 2.5 (150% increase). Note that this led to both increased activity rate and supercritical dynamics. In the Poisson noise model, we added independent Poisson inputs that could not trigger avalanches to each site, with a rate similar to the one employed for the Input driven model, but peaking at either 17.5 s À1 (low-noise regime) or 35 s À1 (high-noise regime).
For avalanche analysis, the methods used for experimental data were also employed for the model. In order to reduce artificial separations of single avalanches during adaptive binning, that is when Dt becomes smaller than the time step of the model, we introduced a small jitter in event times ( half of the 2 ms time-step). This procedure effectively transformed the discrete temporal dynamics of the model into a continuous one. Jittering did not change our results for experimental data or simulations (data not shown).

Adaptive thresholding
For each electrode, the mean LFP amplitude and SD were calculated for consecutive windows of duration 100 ms. For each window, the adaptive threshold thr was set to mean -2SD.

Synchronization measures
We derived estimates of mean phase synchronization for band-pass filtered data. After filtering the data (50-100 Hz frequency band; phase neutral filter by applying a second-order Butterworth filter in both directions), we first obtained a phase trace q i (t) from each LFP channel F i (t) by applying its Hilbert transform H[F i (t)]: i t ð Þ ¼ tan À1 H F i t ð Þ ½ =F i t ð Þ ð Þ. Next, we quantified the mean synchrony R in each segment by R ¼ r t ð Þ h i ¼ L À1 P L t¼1 r t ð Þ, where L is the length of the data segment in samples and r(t) is the Kuramoto order parameter: r t ð Þ ¼ N À1 P N j¼1 e ij t ð Þ , which was used as a time-dependent measure of phase synchrony. Here, N is the number of channels in the data segment. The length of the segment in samples L is the product of the time segment considered (0.4 s) and the sampling frequency (2000 Hz), that is L ¼ 800.