Brain criticality beyond avalanches: open problems and how to approach them

A homeostatic mechanism that keeps the brain highly susceptible to stimuli and optimizes many of its functions: although this is a compelling theoretical argument in favor of the brain criticality hypothesis, the experimental evidence accumulated over the last two decades is still not entirely convincing, and the idea remains largely unknown in the more clinically oriented neuroscience community. In this perspective review, we briefly revisit the theoretical framework underlying this bold hypothesis and point to where theory and experiments agree and disagree, highlighting potential ways to bridge the gap between them. Finally, we discuss how the standpoint of statistical physics could yield practical applications in neuroscience and help with the interpretation of what constitutes a healthy or unhealthy brain, regardless of whether the critical brain hypothesis can be validated.


Neuronal avalanches as a proxy for the critical point
Bak, Tang, and Wiesenfeld [1] (BTW) proposed that cascades of events (or avalanches) were responsible for self-organizing a system into a critical point, and that the signature of this self-organized criticality (SOC) was that its avalanches obeyed power law (PL) distributions for size and duration. Soon, a hunt for avalanches in nature started, from real sandpiles to superconductor vortices, from forest fires to earthquakes [2,3]. A breakthrough came in neuroscience when Beggs and Plenz [4] were able to measure neuronal avalanches in cortical cultures. To their surprise, the slice activity, which initially looked synchronized, had a finer spatiotemporal structure in which local field potential (LFP) events seemed to spread through the electrodes just like the avalanches did in the BTW model. Not only that, but the number of LFP events in each of these cascades, separated by periods of no activity, was PL-distributed, just as predicted by SOC theory: with τ = 1.5 the critical exponent related to the number of spikes s in a cascade, and τ_t = 2 the one related to its duration T. More striking was that the branching ratio within these avalanches was around 1, leading to the hypothesis that cortical activity could be described by a critical branching process [5]. This is an example of a universality class. Each class is uniquely identified by the set of values of the critical exponents (τ, τ_t, and others that we will see further ahead), and it defines all the macroscopic properties of the system [6,7]. This means that any two systems that belong to the same universality class (i.e. have the same values for their critical exponents) behave exactly like one another. This is the reason why it is crucial to precisely determine the set of exponents that governs brain criticality. Once it is determined, we can be sure about which theoretical model is the one for the brain, allowing us to make better predictions and descriptions of the healthy and unhealthy brain.
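The critical branching process invoked here is simple enough to simulate directly, and doing so shows where τ = 1.5 comes from. The sketch below is a minimal illustration (a binary offspring distribution is an assumption made for convenience): when the branching ratio is exactly 1, avalanche sizes follow P(s) ∝ s^(−3/2).

```python
import random

def avalanche_size(branching_ratio=1.0, max_size=5_000):
    """Total number of activations in one avalanche of a branching
    process: each active unit independently produces 2 descendants
    with probability branching_ratio/2 (otherwise 0), so the mean
    offspring number equals the branching ratio."""
    p = branching_ratio / 2.0
    active, size = 1, 1
    while active and size < max_size:
        offspring = sum(2 for _ in range(active) if random.random() < p)
        active = offspring
        size += offspring
    return size

random.seed(1)
sizes = [avalanche_size() for _ in range(10_000)]
# At criticality P(s) ~ s^(-3/2), so the survival function decays as
# P(S >= s) ~ s^(-1/2): roughly 8% of avalanches should reach size 100.
frac_ge_100 = sum(s >= 100 for s in sizes) / len(sizes)
print(f"fraction of avalanches with size >= 100: {frac_ge_100:.3f}")
```

Setting the branching ratio slightly below or above 1 makes the distribution exponentially cut off or dominated by runaway avalanches, respectively, which is why a measured branching ratio near 1 was taken as evidence of criticality.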
An avalanche of theoretical and experimental works followed. On the theory side, part of the quest was finding a mechanism that generated PL-distributed avalanches. To cite a few, theories mainly involved synaptic plasticity [8-10], rare-region effects due to spatial heterogeneity [11,12], excitation/inhibition (E/I) synaptic balance [13,14], or simply synaptic noise [15]. In parallel, both theory and experiments also attempted to understand the consequences of brain criticality, such as optimizing information transmission [4], processing [16], and capacity [17], memory [18,19], and sensitivity to external stimuli [20,21].
[Figure 1 caption: Avalanche exponents τ_t versus τ across experiments. Different symbols differentiate between experimental techniques, regardless of animal or brain region. Data points connected by lines: experiments that yielded many exponents along the interval given by the line connecting the maximum and minimum values of the exponents (the intermediate point, in these data, is the mean value of the exponents). Colored points are experimental avalanches that obey the crackling noise scaling law, whereas filled gray symbols do not. Empty gray symbols represent works that did not test the scaling of avalanches. Notice that the same brain region of the same animal yields different exponents, either weakening the brain criticality hypothesis or suggesting that some mechanism hides the true critical exponents (see text for discussion). The data in the figure were taken from: Beggs and Plenz [4], Gireesh and Plenz [22], Plenz and Chialvo [23], Friedman et al [24], Palva et al [25], Shew et al [26], Bellay et al [27], Kohn and Smith [28] (given in [29]), Hahn et al [31], Ponce-Alvarez et al [32], Yaghoubi et al [33], Fekete et al [34], Fontenele et al [29], Senzai et al [35] (given in [29]), Varley et al [36], Carvalho et al [37], Fosque et al [38], Mariani et al [39]. More information can be found in tables 1-4. Brain regions: DL, dorsolateral; S, somatosensory; V, visual; Hipp, hippocampus.]
However, experimental studies rarely find the same exponents reported by Beggs and Plenz, and many models have each yielded a different set of exponents too (see figure 1 and the appendix tables 1-4 for a summary of 45 experimental findings involving neuronal avalanches, reported in 30 different papers).
Thus, the purpose of this work is threefold: in this first section, we review the neuronal avalanche literature, point out its open problems and discuss what they mean; next, we briefly review the theory of SOC, pointing to how it fares in interpreting the avalanche data. Finally, we bring attention to an alternative framework that could be used to probe for criticality in the brain. The framework is not new: it comprises the thermostatistics of phase transitions (section 2) and the phenomenological scaling theory (section 3), ideas that form the foundations of a strong definition of universality at a critical point [7]. We introduce these ideas keeping the inherent interdisciplinarity of brain criticality in mind, aiming to convey the information to audiences of non-specialists in statistical mechanics. Throughout the text, we provide examples of their application to brain science and discuss how they could help close the gap between theory and experiments in corroborating the brain criticality hypothesis.
A quick disclaimer: recording from living animals and tissues is definitely a difficult task. It depends on expensive technology and requires dedication, perseverance, time, and other resources. Clearly, all the neuronal avalanche experiments are already great endeavors put forward by many authors, and they must be acknowledged on their merits. This work intends to introduce tools and suggest ideas that could be useful to build on the avalanche experiments and evolve together as a scientific community, fostering the interaction between theorists and experimentalists. Thus, we are not interested in countering or supporting any particular experiment or theory; instead, we critically review the experimental findings to make an argument about future directions for the field.
Let us illustrate the dependence of avalanche exponents on experimental technique, animal, and brain region by looking at specific experiments from figure 1. First, we fix the experiment to be Ca imaging. This technique applied to the whole brain of larval zebrafish gives τ ≈ 2.03, τ_t ≈ 2.81, and 1/(σν⊥z) = 1.75, resulting in an Ising-like universality class [32]; applying the same technique to rat hippocampus [33], we get τ ≈ 1.6, τ_t ≈ 2.4, and 1/(σν⊥z) ≈ 2.2 under healthy conditions, and τ ≈ 2.25, τ_t ≈ 3.4, and 1/(σν⊥z) ≈ 2 for folate-reared cultures (a condition that may generate high excitation and mimic seizures [48]). These are two further unique universality classes, distinct from that of the zebrafish, and both are still theoretically unknown. So far, these differences can be justified, as we discussed earlier, since evolution might drive different animals towards particular universality classes. However, take the rat hippocampus recorded with LFP and filtered by spike sorting [38], and we get τ ≈ 1.65, τ_t ≈ 1.98, and 1/(σν⊥z) = 1.48, pointing to a different universality class than the one obtained in [33] for the same brain region of the same animal species.
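These comparisons lean on the crackling-noise scaling relation of equation (4), which ties the three measured exponents together as (τ_t − 1)/(τ − 1) = 1/(σν⊥z) [24]. A quick numerical check of the values quoted above (a sketch, with the exponent triplets copied from the text):

```python
def crackling_lhs(tau, tau_t):
    """Left-hand side of the crackling-noise relation,
    (tau_t - 1)/(tau - 1), to be compared with the independently
    measured scaling exponent 1/(sigma nu_perp z)."""
    return (tau_t - 1.0) / (tau - 1.0)

# (tau, tau_t, measured 1/(sigma nu_perp z)), as quoted in the text
experiments = {
    "zebrafish whole brain, Ca imaging [32]": (2.03, 2.81, 1.75),
    "rat hippocampus, healthy culture [33]":  (1.60, 2.40, 2.2),
    "rat hippocampus, spike sorting [38]":    (1.65, 1.98, 1.48),
}
for name, (tau, tau_t, measured) in experiments.items():
    predicted = crackling_lhs(tau, tau_t)
    print(f"{name}: predicted {predicted:.2f} vs measured {measured:.2f}")
```

Each experiment is roughly internally consistent with equation (4), yet the three exponent triplets are mutually incompatible, which is precisely the tension discussed next.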
So, here is the first paradox: the healthy culture in [33] and one dataset from [38] are both from the same brain region (dissociated hippocampal cultures of rats); yet, they have very distinct avalanche exponents. This tells us that either the experiments are mutually incompatible, or that the system being recorded does not present a critical point, since it breaks universality. However, each of these experiments individually obeys the scaling relation in equation (4), which supports the criticality hypothesis for both data sets. Assuming that obeying the crackling scaling relation is not an experiment-specific artifact, and that this equality is truly representative of critical dynamics, this favors the argument that Ca imaging and spike sorting are incompatible techniques. If that is the case, then the exponents are only apparent and experiment-dependent, instead of being representative of the true underlying critical dynamics. Alternatively, it could be the case that obeying the scaling relation in equation (4) is not a sufficient condition for criticality (we only know that it is a necessary condition). This would explain the absence of matching exponents, but would weaken the brain criticality hypothesis.
On the other hand, nine experiments for different animals and conditions yield similar avalanche exponents. They are the ex vivo turtle V1 [26], the in vivo rat V1 under urethane [29,37], the V1 of freely behaving mice [29], the V1 of the sufentanil-anesthetized monkey [29], the in vivo rat barrel cortex under tiletamine [39], the in vitro rat somatosensory cortex/hippocampus [24], the in vitro rat hippocampus [38], and the in vitro mouse S3 [38]. The avalanches in [26] are made of LFP events (having good correlation with spike events), and the avalanches in [39] are extracted both by spike sorting and LFP events, independently, with slightly different exponents. All the other seven works computed the avalanches using the spike sorting technique. However, the exponents of all these experiments fall along a wide region in the τ_t versus τ plot (figure 1), agreeing with a previously reported exponent correlation [30]. This is the second paradox: the spread of exponents in this plot happens even for the same brain region of animals of the same species. This either means that exponents are only apparent [37], or that the system is away from the critical point due to external stimulus causing self-sustained spontaneous activity [38]. Or, again, it could mean that the crackling noise relation is not a sufficient condition for criticality, fostering the search for an alternative explanation for the avalanche distributions and scaling.
A general look at the data reviewed here shows that these nine experiments are the only ones in strict mutual agreement. This is because only a few experiments calculate τ_t (usually due to a poor PL in the duration distribution), and even fewer of them calculate the scaling exponent, 1/(σν⊥z). Relaxing our criteria to matching only the τ exponent, many experiments seem to agree. Conversely, the data in figure 1 and in tables 1-4 also show that the same τ may come with a wide range of different τ_t, not to mention the scaling exponent. Thus, matching only the τ exponent is not sufficient to claim equivalence between experiments, or between experiments and models.
All these issues are magnified by the lack of a standard definition for experimental avalanches. Clearly, the many different ways used to extract neuronal avalanches from data are a direct consequence of the particularities of the multiple recording techniques. Here is a list of the different procedures: (a) Extracting a binary time series from experimental recordings of brain activity can be done in different ways; at times, the data of each electrode or imaged region are subjected to thresholds, yielding so-called LFP avalanches, since each LFP event is not necessarily made up only of neuronal spikes [49]. (b) When the initial data come from M/EEG or fMRI, channels/BOLD signals are filtered to extract a point process [50,51]. (c) In addition, some experiments filter LFP signals to reconstruct the underlying neuronal spikes, yielding spike sorting avalanches. (d) Other experiments further threshold the binary event time series to obtain periods of time with no activity, allowing for the separation of avalanches. This is known to hide the correct scaling of the system if avalanche sizes are not correctly defined according to the threshold [52].
(e) Finally, a few imaging experiments distinguish different avalanches by imposing a spatial constraint, allowing one to separate two simultaneous avalanches happening in two distinct regions of the cortex [32]. (f) Sometimes a criterion to group data before calculating the avalanche sizes and durations is used. For instance, data may be grouped by their coefficient of variation (CV) [29,37], by their maximum variance [38], or by the principal frequency of the LFP signal [22,31]. (g) The distribution of instantaneous clusters of active regions/neurons is sometimes confused with the avalanche distributions, as in [53,54] (see section ). (h) None of these definitions clearly corresponds to the theoretical definition [55]: an avalanche is all the activity between two consecutive absorbing states (and an absorbing state is not necessarily a state of silence [56]). More precisely, an absorbing state is a configuration from which the system cannot escape unless externally stimulated [7].
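As a concrete reference point for the procedures above, the most common operational pipeline, temporal binning of a discrete event series with avalanches separated by empty bins (procedure (d)), can be sketched as follows. This is a minimal illustration, not any specific paper's pipeline; real analyses differ in thresholding details, and the result is known to depend on the bin width.

```python
def extract_avalanches(event_counts, bin_width):
    """Group a time series of event counts into avalanches.

    event_counts[t] is the number of events (e.g. sorted spikes or
    suprathreshold LFP peaks) in time step t.  Steps are merged into
    bins of bin_width; an avalanche is a maximal run of non-empty
    bins, its size the total event count and its duration the number
    of bins.  Returns (sizes, durations)."""
    # coarse-grain the series into bins
    binned = [sum(event_counts[i:i + bin_width])
              for i in range(0, len(event_counts), bin_width)]
    sizes, durations = [], []
    s = d = 0
    for count in binned:
        if count > 0:
            s += count
            d += 1
        elif d > 0:            # an empty bin closes the avalanche
            sizes.append(s)
            durations.append(d)
            s = d = 0
    if d > 0:                  # series ended mid-avalanche
        sizes.append(s)
        durations.append(d)
    return sizes, durations

counts = [0, 1, 2, 0, 0, 3, 1, 0, 1, 0]
print(extract_avalanches(counts, bin_width=1))  # -> ([3, 4, 1], [2, 2, 1])
```

Note how doubling the bin width on the same toy series merges everything into a single avalanche, a small-scale version of the bin-width sensitivity that plagues real analyses.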
To make matters worse, recording from only a few dozen to hundreds of neurons in a system made of thousands to billions of elements raises the issue of subsampling [57,58], a phenomenon that can either hide the true avalanche exponents [37,59] or turn PLs into lognormals [15,57,59,60]. A covariance was recently discovered between the level of randomness in brain activity (measured by the coefficient of variation, CV) and the avalanche exponents [29]; models of branching processes can only reproduce this covariance when subjected to subsampling and tuned to within 3% of the critical point towards the supercritical state (where larger avalanches are slightly predominant) [37]. This is in direct opposition to other findings that could only be reproduced by a branching model under subsampling in a slightly subcritical state with external driving [59].
In summary, the study of avalanche scaling alone is, at best, confusing. This motivates the investigation of how the many experimental techniques relate to one another, as in the work done in [61], which shows that the positioning of electrodes may generate correlations and PLs. In parallel, it also brings about the need for a systematic and standardized way of computing avalanches. But more importantly, it calls for other methods to independently verify the criticality hypothesis in the brain. Luckily, we have methods that make good candidates, inherited from the condensed matter/statistical physics community. These will be discussed in detail in sections 2 and 3. Other approaches involve looking at the underlying bifurcations of the system [62], which ultimately might correspond to nonequilibrium phase transitions [63,64].

But we are measuring avalanches to probe for SOC... or aren't we?
To begin answering this question, we first must understand what SOC is and what it requires. Then, we have to address whether these features could be found in the brain, so that we understand what an avalanche actually stands for.
SOC is a mechanism that converts slow external inputs into a critical point; it does so by redistributing the accumulated energy throughout the system in the form of avalanches, and by dissipating energy through the system's boundaries [55]. Let us quickly go back to the original sandpile model to illustrate this statement: we have a square lattice where each site accumulates grains (energy) coming randomly from a slow external source; every time a site reaches a given threshold, external input stops and the site initiates a chain reaction by redistributing its grains (energy) to its neighbors (either the nearest ones, or sites randomly chosen from the lattice if one is considering the mean-field version, known as the Manna model). Importantly, grains cannot disappear during the redistribution process in the bulk of the lattice; they may only leave the system through its boundaries. For illustration, you can imagine a pile of sand on top of a square table, where a machine drops grains of sand on top of the pile very slowly; when the slope of the pile crosses a threshold, an avalanche happens and redistributes these grains. Some grains closest to the edge of the table will fall out of the system (dissipate), but the grains in the bulk of the pile have to be entirely redistributed across neighboring sites.
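The driven, dissipative sandpile just described can be sketched in a few lines. Below is a minimal Manna-style toy model (an illustration, not a research-grade implementation): sites topple when they hold two or more grains, each toppled grain hops to a random nearest neighbor, grains stepping off the lattice are dissipated at the boundary, and the avalanche size is the number of topplings.

```python
import random

def relax(grid, L):
    """Topple until no site holds >= 2 grains; each toppled grain hops
    to a random nearest neighbour, and grains leaving the LxL lattice
    are dissipated.  Returns the avalanche size (number of topplings)."""
    size = 0
    stack = [(i, j) for i in range(L) for j in range(L) if grid[i][j] >= 2]
    while stack:
        i, j = stack.pop()
        if grid[i][j] < 2:
            continue                              # stale entry
        grid[i][j] -= 2
        size += 1
        for _ in range(2):
            di, dj = random.choice([(1, 0), (-1, 0), (0, 1), (0, -1)])
            ni, nj = i + di, j + dj
            if 0 <= ni < L and 0 <= nj < L:       # else: boundary dissipation
                grid[ni][nj] += 1
                if grid[ni][nj] >= 2:
                    stack.append((ni, nj))
        if grid[i][j] >= 2:                       # still unstable
            stack.append((i, j))
    return size

random.seed(3)
L = 12
grid = [[0] * L for _ in range(L)]
sizes = []
for _ in range(20_000):
    # slow drive: add one grain, then let the avalanche finish before
    # driving again (the separation of time scales)
    grid[random.randrange(L)][random.randrange(L)] += 1
    sizes.append(relax(grid, L))

density = sum(map(sum, grid)) / L**2
print(f"self-organized grain density: {density:.2f}")
print(f"largest avalanche: {max(sizes)} topplings")
```

Starting from an empty lattice, the grain density settles by itself around a stationary value, and the avalanche sizes become broadly (approximately PL) distributed, with a cutoff set by the lattice size: self-organization without any parameter tuning.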
To understand this mechanism, Dickman et al [55] realized that the system needs two variables: one accounts for the number of sites crossing the threshold at each moment in time (known as the density of active sites, ρ); the other is the total number of grains in the system, ζ. Now, cease external input and close the lattice with periodic boundary conditions, stopping grains from dissipating through the edges: what you get is a system that possesses a continuous absorbing phase transition as the total number of grains ζ is varied. You can tune this new sandpile to an absorbing inactive phase (when ζ < ζ c ) where avalanches only propagate locally; and an active phase (when ζ > ζ c ) where activity never stops and grains keep getting redistributed forever between sites.
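The closed, fixed-energy version of the sandpile described by Dickman et al can likewise be sketched. This is a minimal illustration under the stated rules; the two densities used below are only indicative, bracketing the numerically known critical density of the two-dimensional Manna model (ζ_c ≈ 0.72).

```python
import random

def fixed_energy_manna(L, density, steps, seed):
    """Closed (periodic) Manna lattice at a fixed grain density.
    Returns the fraction of active sites (>= 2 grains) after up to
    `steps` parallel updates: ~0 in the absorbing phase, > 0 in the
    active phase."""
    rng = random.Random(seed)
    grid = [[0] * L for _ in range(L)]
    for _ in range(int(density * L * L)):   # drop grains at random; no drive afterwards
        grid[rng.randrange(L)][rng.randrange(L)] += 1
    for _ in range(steps):
        active = [(i, j) for i in range(L) for j in range(L) if grid[i][j] >= 2]
        if not active:
            break                           # absorbing configuration reached
        for i, j in active:
            grid[i][j] -= 2
            for _ in range(2):              # grains hop to random neighbours (periodic)
                di, dj = rng.choice([(1, 0), (-1, 0), (0, 1), (0, -1)])
                grid[(i + di) % L][(j + dj) % L] += 1
    n_active = sum(1 for i in range(L) for j in range(L) if grid[i][j] >= 2)
    return n_active / L**2

for zeta in (0.55, 0.90):
    rho = fixed_energy_manna(L=24, density=zeta, steps=2_000, seed=7)
    print(f"density {zeta:.2f}: active fraction after relaxation = {rho:.3f}")
```

Below ζ_c the activity dies into an absorbing configuration; above it, a finite fraction of sites keeps toppling indefinitely, exactly the two phases described in the text.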
This tells us that the self-organization (autotuning) towards ζ_c is only possible with three ingredients: (a) a system with an absorbing phase transition, otherwise avalanches would never cease; (b) boundary dissipation and bulk conservation, otherwise avalanches would either never end (no boundary dissipation) or have a characteristic size smaller than the system size (no bulk conservation); and (c) a separation between external input and internal relaxation dynamics via avalanches (i.e. when an avalanche starts, external input must be temporarily halted: the infinite separation of time scales). This works because, when there is no activity (i.e. no excess grains to cause an avalanche), ζ < ζ_c, but the slow injection of grains into the system generates a small flow dζ/dt > 0. During an avalanche, ζ > ζ_c, but no grains are added; they can only leave the system through its boundaries, causing a small dζ/dt < 0. This homeostatic adaptation of ζ, coupled to the activity ρ, is what pushes the system towards ζ_c [55].
Now, let us look at whether these ingredients can be found in neuronal systems. The first ingredient seems straightforward. Spikes (their instants and/or frequencies) are the fundamental units of brain activity [65]. Our bodies, then, have to keep a sort of chemical imbalance in the brain between the exterior and interior of neurons, maintaining their membranes hyperpolarized and susceptible to emitting spikes. Spike emission can happen spontaneously due to chemical fluctuations, or due to external stimuli coming from other systems in the body or from sensory inputs. If these stimuli cease and the biological mechanism that keeps the chemical imbalance stops (i.e. we die), then, waiting long enough for all the chemical potential energy to dissipate, the average firing frequency of neurons will gradually fade and neurons will eventually stop emitting spikes. The only way to reactivate these spikes is by externally generating a chemical potential and/or injecting electrical currents into the neurons' membranes. Thus, this state without activity is definitely absorbing.
[Table 1 caption: Spike sorting avalanche exponents from LFP recordings. Avalanche exponents agree with scaling. † Agreement within 0.2 error calculated from figure. *** Mutual agreement between different experiments. 1 Avalanche distributions collapse despite being lognormal. 2 No collapse of avalanche distributions despite being PL. 3,4,5,6 Spikes were grouped by CV level. 3 τ = 1.54, τ_t = 1.73 when CV = 1.46. 4 τ = 1.52, τ_t = 1.7 when CV = 1.4. 5 τ = 1.66, τ_t = 1.9 when CV = 1.1 (done by [29]). 6 τ = 1.77, τ_t = 1.99 when CV = 1.1 (done by [29]). 7 No CV grouping. 8,9,10 Only considered spikes in bins of high Var ρ / supercritical avalanches. 11 Urethane mimicking healthy activity; changing inhibition impaired avalanche distributions. 12,13 Data grouped by frequency PCA; PL avalanches observed in synchronized activity. 14 PL exponents grew after suture of one eye. 15 Neither measuring only uncorrelated avalanches nor stimulating the whiskers significantly changed the exponents. 1,2,3,4,5,7,11,12,13,14,15 In vivo. 6,8,9,10 In vitro. All recordings involved temporal binning. Values in square brackets indicate minimum and maximum measured exponents. Brain regions: S, somatosensory; V, visual; Hipp, hippocampus. "Isoflu." corresponds to isoflurane, and "ureth." to urethane.]
[Table 2 caption: Avalanche exponents from Ca and voltage imaging. 1 Awake, τ = 1.5, τ_t = 2.0. 2 Whole-brain recordings in larvae; avalanches are spatially connected clusters, allowing independent detection of simultaneous cascades. 4 Mimics epilepsy. 5 Awake, rest, dark. 6 Initially ketamine + diazepam. 7 Initially ketamine + diazepam. 8 Initially medetomidine hydrochloride; recordings from areas 17 and 18. 10 Excess dye causing cortical disinhibition. 9,10 Recording area not specified. 1,2,3,4 Ca imaging. 5,6,7,8,9,10 Voltage imaging. 1,2,5,6,7,8,9,10 In vivo. 1,3,4 In vitro. All recordings involved temporal binning (except for zebrafish). Brain regions: S, somatosensory; V, visual; Hipp, hippocampus.]
[Table 3 caption: LFP thresholding avalanche exponents. Avalanche exponents agree with scaling. * Suggested values (data not clear/not PL). *** Agrees with experiments marked similarly in table 1. 1 PL observed in LFPs, but inconsistent in spike sorting. a Spike sorting exponent. b LFP exponent. 3 No PL in brain tissue of epileptic humans (temporal lobe). 4 LFP peaks apparently correlate with spikes (no CV grouping), τ = 2.03, τ_t = 2.23, 1/(σν⊥z) = 1.21. 5 Measured during nested θ- and β/γ-oscillations. 6 Monkey recorded awake; rat recorded in vitro. 7 Resting state, τ = 1.66. 8 Scaling is kept under slow driving. 9 Neither measuring only uncorrelated avalanches nor stimulating the whiskers significantly changed the exponents. 1,4,5,6,7,9 In vivo. 2,3,5,6,8 In vitro. All recordings involved temporal binning. Values in square brackets indicate minimum and maximum measured exponents. Brain regions: DL, dorsolateral; PM, premotor; S, somatosensory; V, visual; Hipp, hippocampus. "Isoflu." corresponds to isoflurane, and "ureth." to urethane.]
[Table 4 caption: Avalanches in whole-brain imaging. Avalanche exponents do not agree with scaling. All recordings were thresholded to obtain a point process. 3 Changes in PL avalanches correlated with changes in DFA exponents across tasks: r resting, a auditory, v visual.]
The bulk-conservation/boundary-dissipation requirement is more difficult to spot. Bulk conservation can only be guaranteed if a neuron receives a sum of inputs from its presynaptic neighbors that is equal to all the electric potential it lost while producing its own spike [66].
Although conservation can be improved by having many neighbors (a common characteristic of neuronal networks that enhances synaptic reliability), many other factors contribute to locally dissipating neuronal activity. Three examples are the lack of a myelin sheath in dendrites, the presence of inhibitory synapses, and the loss of neurotransmitters. Moreover, brain networks have no clear separation between a boundary and a bulk, unlike what is usually seen in crystals and magnets, where criticality is well understood. Thus, it is not obvious how to determine the network locations through which activity would be allowed to dissipate (the network boundaries) such that the only characteristic scale present in the system is given by its size [69]. This is mainly due to the complex topology of the brain network. Perhaps the rich-club structure of the brain [46] is responsible for defining bottlenecks where activity could be dissipated. This scheme would potentially result in avalanches no larger than the size of the largest rich-club cluster. These theoretical ideas need deep scrutiny and careful consideration in models of neuronal networks.

Finally, the requirement of an infinite separation of time scales is most probably not present in the brain: spontaneous electrical cortical activity seems to happen continuously, in a globally asynchronous and locally irregular (AI) fashion, due to E/I synaptic balance [67,68]. In fact, many works try to compensate for this by imposing arbitrary thresholds on their recordings (either on the LFP signals or on the spike-sorted data). Other works even group the data by their statistical features before calculating avalanches, as we saw in the last section.
Relaxing the need for bulk conservation and separation of time scales, and adding homeostatic mechanisms to the network, some systems tend to hover around the critical point, generating the self-organized quasi-critical (SOqC) state [66,69]. The critical point would only be reachable when dissipation is balanced by the driving injected currents [69]. As a consequence, a SOqC system does not present generic scale invariance (i.e. irrespective of parameters). These systems are not strictly critical, but may display PL avalanches limited by some characteristic scale [66]. A few mechanisms are known to yield SOqC dynamics in neuronal networks: depressing synapses [9,10,14], adaptive firing [70,71], and adaptive thresholds [14,72] are some examples.
[Figure 2 caption: Order parameter and its associated susceptibility. Two parameters control the phase of the system. The critical point only happens when the control parameter and the field are at their critical values (with h_c = 0), and it is characterized by χ → ∞, according to equation (13), as the control parameter approaches its critical value; in the SOC literature, the state with h = h_c and ρ = 0 (control parameter above its critical value) is called subcritical, whereas ρ > 0 (control parameter below its critical value) is supercritical [55]. Here, for h < h_c, there is a first-order (discontinuous) phase transition. No phase transition happens for h > h_c because ρ does not go to zero. In an Ising-like magnet, the control parameter and h stand for the temperature and external magnetic field, respectively. In neuronal networks, the usual choice of control parameter is the synaptic coupling intensity, and h is the mean external current applied to the network [14] (often chosen as a Poisson process applied to the membrane voltage of each neuron, with ρ the network firing rate [79]). For each pair of control parameter and h, ρ and χ are calculated by equations (6) and (7). In equilibrium, χ can be calculated via equation (8); in nonequilibrium systems, k_B T may be ignored, and the variance of ρ and the susceptibility are two distinct quantities, each scaling with its own critical exponent [7].]
These quantities are directly coupled to the activity of neurons, playing a similar role to the homeostatic ζ in the sandpile. The characteristic recovery times of these mechanisms naturally generate periods of high and low activity, which can be thresholded to yield PL avalanches [72]. Not only that, but the SOqC state has also been shown to display AI activity in a synaptically balanced state [14,72].
So, on the one hand, true SOC might not be the answer to how the brain processes information (in spite of other evidence, such as the 1/f noise [73,74]). On the other hand, if spikes are units of activity, interpreting the brain as some sort of self-organized branching process, where intermittent activity emerges in the form of (quasi-)critical avalanches, can be compelling. In fact, changes in neuronal firing patterns can be detected by injecting excitatory or inhibitory synaptic blockers into the system [19,21]. All of this suggests that a critical phase transition could be observed in cortical experiments. We need to understand what can be used as a control parameter, and what can be measured to determine the nature of the critical brain dynamics. Thus, the two following sections are dedicated to briefly introducing this theory to an interdisciplinary audience, relating it to the brain whenever possible.

The thermodynamic origin of criticality
Let us start where the concept of criticality is well defined. In macroscopic thermodynamic equilibrium, a critical point is a special type of phase transition where two coexisting phases become a single one, yielding long-range correlations and, hence, fluctuations that are able to span the whole extent of the system [6]. The two best-known examples of this phenomenon are the liquid-vapor and the ferromagnetic transitions. In the former, liquid and vapor, each with its own particular density, become a single critical fluid at a particular configuration of pressure and temperature (i.e. the critical point) where both densities become equal; this is described by the paradigmatic van der Waals equation of state. In the latter, ferromagnetism forms from the local alignment of spins as the temperature goes down to the so-called Curie (or critical) point while the external magnetic field is held asymptotically close to zero; this situation is paradigmatically described by the Ising model.
The critical point fluctuations are so strong that they change the observable structure of the system. For instance, they give rise to what is known as critical opalescence, since the density fluctuations end up scattering the light that crosses the fluid. A similar phenomenon occurs in the ferromagnetic phase transition, although the fluctuations are now in the magnetization: some regions (or domains) of the material align with the vanishing magnetic field, while other regions are entirely flipped. These regions of fluctuating density or magnetization are reshaped over time, and eventually individual domains are able to extend all over the system (the closer the system is to the critical point, the more often this happens, due to the diverging correlation length). However, both the average liquid-vapor density difference and the average magnetization remain arbitrarily close to zero.
Theoretically, these two systems and their phase transitions are described by their free energies-a function of the macroscopic variables (e.g. temperature, pressure, magnetic field, and so on) from which all the thermodynamic (macroscopic) quantities may be derived (e.g. the magnetization, the heat capacity, magnetic susceptibility, the density, the entropy, and many others). So, here comes the catch: is it possible to identify the variables with which we would be able to determine the macroscopic behavior of the brain and, hence, its phase transitions?
Unfortunately, the answer is 'No!' (at least at the time this paper was written, although we wish it could have been 'Yes!'). Surely, some attempts to define a sort of fundamental free energy have been made [75], but they rely on the abstract entity called information, whereas here we are concerned with actual physical variables (such as electric and magnetic fields, temperature, density, etc), which are the quantities that effectively execute and shape all brain functions, development and evolution [76]. The free energy function is oftentimes impossible to compute exactly, even in equilibrium thermodynamics, but this does not stop us from understanding such systems. In fact, we study them relying on observables that are defined as derivatives of the free energy. More formally, let G(T, h) be the free energy as a function of the temperature T and the external field h; then we define

ρ = −(1/N) ∂G/∂h, (5)

χ = ∂ρ/∂h = −(1/N) ∂²G/∂h². (6)

For example, the magnetization ρ is the first derivative of the free energy with respect to the magnetic field h, and the magnetic susceptibility χ is the derivative of the magnetization with respect to h. Similarly, the density of the fluid ρ is proportional to the derivative of the free energy with respect to the pressure h, and the compressibility is the derivative of the density with respect to h [6]. Magnetization and magnetic field, volume and pressure, temperature and entropy, and so on, are called conjugate variables, because one is directly coupled to the other [77]. The susceptibility (or compressibility) is of particular interest because it measures how ρ responds to variations in its conjugate external field h.
In an equilibrium phase transition, ρ is often called the order parameter because it informs us of the underlying organization of the system components. For example, the magnetization per unit spin is the order parameter for the ferromagnetic transition because when it is different from zero, we know that the spins that make up the magnet must be ordered (i.e. aligned); on the other hand, when the magnetization equals zero, the spins are completely disordered (usually by a temperature effect). Similarly, the density (a first derivative of the free energy) of a fluid informs us about the underlying state of the system, since the density of the liquid state is greater than the density of the vapor state, yielding the order parameter as the difference between the liquid and gas densities. All these concepts and ideas have been applied in physics for decades, and they are part of the foundations of statistical mechanics, yielding theoretical and computational techniques to rigorously study phase transitions, such as the density matrix, the master equation [78], and the Monte Carlo method [77]. Now, we are ready to define a 'critical point' more formally: it is a particular value of temperature and external field, (T_c, h_c), where the second derivative of the free energy (i.e. the susceptibility) diverges [80] (see figure 2). This is interpreted as a direct consequence of a diverging correlation length [6].
To understand why the diverging correlations define this phase transition, let us write the order parameter and the susceptibility making use of the equilibrium statistical mechanics tools [77],

ρ = ⟨X⟩/N,    (7)
χ = (⟨X²⟩ − ⟨X⟩²)/(N k_B T),    (8)

where N is the number of elements in the system, T is the temperature, X ≡ Σ_i x_i is the macroscopic quantity that defines the underlying order of the system (such as the magnetization of a magnet, in which case x_i would be the local magnetic moment of each atom or molecule in the material), and k_B is Boltzmann's constant. Equations (7) and (8) are equivalent to equations (5) and (6) if the system is in thermodynamic equilibrium [81], and the averages ⟨·⟩ are taken over the ensemble of all possible states of X. Continuing with the magnet (Ising) model for the sake of the argument, ρ is the specific magnetization and χ is the magnetic susceptibility [82]. We can open up the definition in equation (8), together with the definition of X, to get [45]

χ = (1/(N k_B T)) Σ_ij C_ij,    (9)

where the function C_ij ≡ ⟨x_i x_j⟩ − ⟨x_i⟩⟨x_j⟩ is called the two-point correlation function [77], and it has a positive value when the local variables x_i and x_j fluctuate together in the same direction. When the system has translation symmetry (i.e. the reference frame can be put on top of any element of the system), and the material is densely packed in a space of dimensionality d, the last sum can be taken in the continuous limit [45],

χ ∝ (1/(k_B T)) ∫ C(r) d^d r,

where r ≡ |r_i − r_j| is the distance between two elements. Equation (9) is the root of why a critical point is so interesting, since it links the observable properties of the system to its constituents' microscopic fluctuations. Thus, a diverging susceptibility χ comes together with a diverging two-point correlation [6]. This means that the closer the system is to its critical point, the more sensitive ρ is to fluctuations in h. Let us formalize this statement a bit better.
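As a concrete check of the fluctuation route to χ, here is a minimal numerical sketch (our own illustration, not taken from the references, with k_B T set to 1) verifying that the variance formula in equation (8) and the correlation sum in equation (9) are the same quantity:

```python
import numpy as np

rng = np.random.default_rng(0)

# Ensemble of M microstates for N binary local variables x_i (e.g. spins 0/1).
# Correlations are induced by a shared random field, mimicking interactions.
M, N = 20000, 50
field = rng.normal(size=(M, 1))            # common fluctuation shared by all sites
x = (rng.normal(size=(M, N)) + field > 0).astype(float)

X = x.sum(axis=1)                          # macroscopic variable X = sum_i x_i
chi_fluct = X.var() / N                    # chi = (<X^2> - <X>^2)/N, eq. (8), k_B T = 1

# Same quantity via the two-point correlations C_ij = <x_i x_j> - <x_i><x_j>, eq. (9)
C = np.cov(x, rowvar=False, bias=True)     # C[i, j] estimated over the ensemble
chi_corr = C.sum() / N

print(abs(chi_fluct - chi_corr) < 1e-8)    # the two routes agree identically
```

The identity holds exactly for any ensemble, which is why measuring fluctuations of X is a practical way of estimating χ in equilibrium.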
Assume that the critical point happens at a particular temperature T_c with a particular external field H_c being applied; then we define the reduced temperature as ε ≡ (T − T_c)/T_c, and the reduced field as h ≡ (H − H_c)/H_c, such that ε_c = h_c = 0 is the critical point. Thus, the correlation function can be shown to obey [6]

C(r) ∼ r^{−(d−2+η)} e^{−r/ξ⊥},    (10)
ξ⊥(ε, h)|_{h=0} ∼ |ε|^{−ν⊥},    (11)

with ξ⊥ ≡ ξ⊥(ε, h) being the spatial correlation length [83], and where η and ν⊥ are two positive constants called critical exponents; the vertical bar means that ξ⊥ must be evaluated at h = h_c = 0. In fact, equation (11) is expected due to the (empirical) divergence of χ. Note that when ε → 0, ξ⊥ → ∞, leaving a power-law decay in the spatial pair correlation function, C(r) ∼ r^{−(d−2+η)}. The smaller the value of d − 2 + η, the more strongly correlated the system is at the critical point, making it more and more sensitive to local perturbations.
What is more striking is that the long reach of the correlation function (expressed by the PL decay) allows global changes of state in the system from the propagation of local perturbations through merely local interactions; this is the explanation for the formation of magnetic domains and critical opalescence [6]. To illustrate, this is analogous to a person on the other side of the world having a finite chance of receiving an intact spoken message from you, even if all you can do is say it to your immediate neighbor, all your neighbor can do is say it to their own neighbors, and so on. Thus, criticality is a collective phenomenon emerging in a network of interacting particles. Now that the critical-point divergence of χ has been linked to a PL-diverging pair correlation, it is possible to show that the other observables of the system will also follow PLs. These PLs define the other critical exponents. The most usual ones are those given by the Greek letters β, γ and δ_h, which describe the behavior of the order parameter and the susceptibility (both at h = 0), and the field behavior of the order parameter at ε = 0, respectively,

ρ(ε, 0) ∼ |ε|^β,    (12)
χ(ε, 0) ∼ |ε|^{−γ},    (13)
ρ(0, h) ∼ h^{1/δ_h}.    (14)

Two things are worth noticing: first, equations (11)-(14) give us only the behavior of these observables very close to the critical point, when ε or h is close to zero (and the other is strictly zero). These laws are not guaranteed to hold away from the critical point as the general behavior of these quantities [6]. Secondly, the critical exponents depend on one another, and these dependencies are called scaling relations (of which equation (4) is an example); they can be derived by applying different statistical mechanical and mathematical arguments [6,7,45].
The second striking fact about criticality is that each set of values for the exponents (say, β = 1/2, γ = 1, δ_h = 3, and ν⊥ = 1/2) defines a particular universality class. This set, for instance, corresponds to the so-called mean-field Ising class [45], describing the ferromagnetic phase transition. More importantly, any two systems that pertain to the same universality class display exactly the same macroscopic behavior [given by equations (11)-(14)], and hence exactly the same phase transition. Thus, given two systems in the same universality class, we can always choose to study the simplest one, because we know that the other will obey the same thermodynamic functions close to the critical point [45]. Universality classes are determined by physical properties like the dimensionality of the system, the range of the interactions, and the symmetries of the system (for instance, whether energy is conserved or not) [45]. This is the reason why two copies of the same system (e.g. the same region of the brain of two individuals of the same species) must pertain to the same class (within an acceptable statistical error). There is no reason to believe that the brains of healthy animals of the same species display different symmetries, dimensionality or energy conservation.
Therefore, if the brain had a critical point and it were possible to precisely determine the set of exponents that governs it, then we could safely postulate that the simplest model matching these critical exponents is the one and only brain model. This is a strong and deep statement, but it is a consequence of the scaling theory of universality [84]. For example, if the set of critical exponents of the brain were the one given in the previous paragraph, then the mean-field Ising model would be the one model of the brain. In turn, we would be sure that all the macroscopic brain properties (close to the critical point) are exactly the same as those of a magnet! This alone would save us a lot of time in pointless arguments about which model is the best brain model; not to mention the amount of saved simulation time, since we would not need to care about any differential equation for neurons or synapses (or any other cell or brain process, for that matter) in the effort of modeling the brain.

Phase transitions in the brain
So far, we discussed criticality based on the phenomenology of the Ising model, a paradigmatic magnet in thermodynamic equilibrium. Nevertheless, the brain is not a magnet (although it can interact magnetically), and the brain is not in thermodynamic equilibrium (homeostatic mechanisms are necessary to keep it, and us, from thermalizing with the environment, that is, from dying). So, what is the use of the theory described in the last section?
To begin with, all those concepts can be generalized into a theory for nonequilibrium phase transitions, which frequently happen in all sorts of stochastic and deterministic systems [7,85]: the notions of phase transition, criticality, and universality can be, and are, extended to systems that are out of equilibrium [7,84,85]. The theory of the last section holds for systems whose dynamics obey a fundamental Hermitian Hamiltonian (i.e. which are reversible and have steady states given by the usual Gibbs-Boltzmann distribution); these are the systems that give rise to classical equilibrium thermodynamics. But the universality and the divergences of the critical point also hold for systems that either: (i) have dynamics defined by transition rates without an underlying Hermitian Hamiltonian, for which detailed balance is not satisfied (i.e. local time-reversal symmetry is broken, such that a Gibbs steady state is not guaranteed to exist); or (ii) have no relation whatsoever to equilibrium models, being defined in terms of purely statistical processes, such as Markov chains, or generic systems of interacting particles like the contact process [84]. Surely, the time component of this last class of systems, its interactions, and its spatial extent add other symmetries that are relevant to their behavior, yielding extensions of equilibrium universality classes, but also generating completely new ones. A great summary of the nonequilibrium critical exponents and universality classes is given in [7,84].
The nonequilibrium universality classes of particular interest to brain sciences come from systems generally called branching processes [5,37]. In them, the global state of the system fluctuates in time through spreading activity patterns, where a single site (or node) can branch out its state (either deterministically or probabilistically), exciting its neighbors in a network, and so on. These systems present one or many absorbing states, and are sometimes related to percolation universality classes. They are of particular interest because neurons (and therefore neuronal networks) are excitable systems, possessing a threshold above which an action potential (or simply a voltage spike) is generated, and below which there is no activity (i.e. an absorbing state). Each spike can electrically excite neighboring cells through synapses and so on, yielding the typical picture of branching activity. In addition, the timing and/or frequency of these spikes are believed to carry the information content transmitted throughout the brain [65], motivating models that describe spike timings and their spread, instead of complex voltage equations. Alternatively, the important feature of neuronal activity could be the firing rates. In this case, each neuron could be regarded as a two-state cell: either the neuron is active (firing) or not; this dynamic would render the brain a sort of spin glass, or a Hopfield model [86][87][88], instead of a branching process. In both of these cases, the parameter ε can be taken as the average intensity of synaptic coupling (or something that correlates with it, such as the concentration of synaptic blockers), and h is the average intensity of any current that is applied to the system, measured relative to the firing threshold (either coming from a natural source in a behavioral experiment [26,91], or from an implanted electrode, or from random fluctuations in the culture of neurons, or even coming in from other brain regions).
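To make the branching picture concrete, the sketch below (a hypothetical minimal model of our own, not one from the references) simulates a Galton-Watson branching process in which each active unit excites a Poisson-distributed number of units with mean m (the branching ratio); for m < 1 the activity always falls into the absorbing state, while for m > 1 it survives with finite probability:

```python
import numpy as np

def survival_prob(m, trials=2000, n_gen=50, seed=0):
    """Fraction of trials still active after n_gen generations of a
    Galton-Watson branching process with branching ratio m."""
    rng = np.random.default_rng(seed)
    alive = 0
    for _ in range(trials):
        active = 1                       # a single seed spike
        for _ in range(n_gen):
            if active == 0:
                break                    # absorbing state: nothing restarts activity
            active = rng.poisson(m, size=active).sum()
            if active > 10_000:          # clearly exploding: count as surviving
                break
        alive += active > 0
    return alive / trials

p_sub, p_sup = survival_prob(0.8), survival_prob(1.5)
print(p_sub, p_sup)
```

Subcritical trials die out essentially always, while a finite fraction of supercritical trials never stops; right at m = 1 the survival probability would instead decay as a power law of the generation number.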
Despite these properties of neurons, and the self-organizing nature of neuronal networks via adaptive synapses, the brain is not widely accepted to be critical. This is due mainly to two reasons: (i) different techniques yield apparently incompatible critical exponents; and (ii) the exponents suggested by experiments are not always directly compatible with known universality classes, as we saw in section 1. Of course, the second point is not necessarily a problem: if our brain models do not take into account all the symmetries and properties that are present in the empirical brain, then it is only expected that models and experiments will not agree; the solution would be to identify further symmetries and conservation laws present in the brain which could have been ignored by the models. Concerning point (i), three explanations are possible: (a) perhaps there is no single universality class for the brain, making different brains have slightly different exponents (both within and across species); (b) the methods used to measure the critical exponents are not reliable (i.e. power-law neuronal avalanches are a bad estimate of the underlying criticality, the main point of this critical perspective review); or (c) a single universality class is enough, but either the brain is pushed away from the true critical point by external fields [38], or the limited number of sampled neurons hides the true exponents (and universality class) of the system, yielding apparent (and misleading) critical exponents [37]; in fact, any combination of these three factors is also possible. So, there is no well-established universality class for brain criticality.
Here, we are not interested in determining whether the brain is indeed critical. Instead, we want to bring attention to how phase transitions could be quantified in the brain (or in neuronal networks, for that matter) to better synergize with theory. Thus, we can systematically approach the discrepancies between experiments and models and, hopefully, advance the understanding of the brain in the context of criticality. Any phase transition in nonequilibrium systems can, and should, be quantified via equations (12)-(14); and if you are feeling more adventurous, you can also characterize the correlations of the system by estimating the correlation function and length in equations (10) and (11). Notice that these quantities are far more fundamental than the characterization of avalanche distributions. In what follows, we will discuss how these critical exponents can be measured in the brain.
However, you might say that avalanches are important to determine whether the brain is a SOC system (instead of just having a critical point). We addressed this issue in more detail in section 1.2. Here, it suffices to say that PL-distributed avalanches appear on top of the critical point of a wide range of systems, such as the branching processes in the directed percolation universality class [43] or the switching of states in Ising magnets [44]. Systems like the sandpile are usually associated with a phase transition in the Manna universality class [3,7,55], since it possesses an infinite number of absorbing states. Moreover, the SOqC mechanisms identified in theoretical models tell us that the PL avalanches in self-organizing systems without bulk conservation are obtained with the tuning of a parameter, similarly to what happens in a phase transition. Hence, describing the phase transition is a more general and less controversial way to probe for criticality, although it requires a significant amount of extra work when compared to measuring avalanches.

Quantities of interest
Given the out-of-equilibrium nature of the brain, to properly characterize its phase transitions we first need to make a few remarks about how to approach equations (7) and (8), and then we can define the variable of interest (X) and the order parameter ρ. Finally, we introduce an alternative method that is more complete than simply probing for critical exponents; it consists of calculating the scaling functions and performing finite-size scaling [7,77,89]. Defining X is easy: spike trains are already binarized representations of the timing of neuronal action potentials. Then, each neuron i must have a time series x_i(t), such that x_i(t) = 1 if a spike occurred at time t (and x_i(t) = 0 otherwise). The time series can be sampled at a given time precision [90], say 1 ms. Therefore, X(t) = Σ_i x_i(t) is the total number of spikes in a given time bin t, and ρ(t) = X(t)/N is the instantaneous firing rate of the network. So far, this is a standard technique used in many experimental settings, not only when probing for a critical point. Moreover, ρ can be sampled for multiple combinations of ε and h. For instance, ε may be controlled via CNQX or PTX-like drugs to systematically change the average synaptic balance of the neurons [19,21], and h may be controlled by the amount of injected current (either coming from a natural source in a behavioral experiment [26,91], or from an implanted electrode, or even via light excitation of cells prepared for calcium imaging).
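In code, building ρ(t) from spike times is a few lines; the sketch below uses made-up spike times and a 1 ms bin (all names and numbers are illustrative):

```python
import numpy as np

# Hypothetical spike data: spike_times[i] lists the spike times (in ms) of neuron i
spike_times = [[2.3, 15.1, 15.9], [4.0, 15.2], [15.4, 30.8]]
N, t_max, bin_ms = len(spike_times), 40.0, 1.0

n_bins = int(np.ceil(t_max / bin_ms))
x = np.zeros((N, n_bins))
for i, times in enumerate(spike_times):
    bins = (np.asarray(times) // bin_ms).astype(int)
    x[i, bins] = 1.0                      # x_i(t) = 1 if neuron i spiked in bin t

X = x.sum(axis=0)                         # X(t): total number of active neurons per bin
rho = X / N                               # rho(t): instantaneous network firing rate
print(rho[15], rho.max())
```

Note that binarization discards multiple spikes of the same neuron within one bin, which is one reason the bin width must be chosen with care.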
The central feature defining criticality is the divergence of the correlation lengths (both temporal and spatial). We can probe for it via the equal-time pair-correlation and the autocorrelation functions [7],

C⊥(r) = ⟨x_i(t) x_j(t)⟩,    (15)
C∥(t′) = ⟨x_i(t) x_i(t + t′)⟩,    (16)

where we introduced the subscripts ⊥ and ∥ for space and time quantities, respectively, and neurons i and j are at a distance r from one another. The autocorrelation can also be calculated by C∥(t′) = ⟨ρ(t)ρ(t + t′)⟩. The correlation lengths, respectively ξ⊥ and ξ∥, are given by the values of r and t′ at which C⊥(r) and C∥(t′) cross the value ⟨ρ⟩². The firing rate ρ(t) is also called the density of active sites, or activity density, and it can be used to compute many different critical exponents. The steady state is reached after waiting a time t₀ ≫ ξ∥ (see figure 3). Then, one should sample ρ(t) for a time interval Δt ≫ ξ∥ to be able to calculate the averages [7]

ρ̄ = (1/Δt) Σ_t ρ(t),

where the upper bar is used to represent a time average.

[Figure 3. Left: the averages in equations (17) and (18) are defined for nonequilibrium systems. Right: scaling of the initial transient activity starting with ρ(0) ≈ 1, yielding equation (19) near the critical point (ε_c = 0); the grey lines are different trials of the system, all under the same configuration; the blue line is the ensemble average for each time step. Each time-step average must be taken ignoring trials that have already fallen into the absorbing state ρ(t) = 0; after t ≈ 400, the system reaches the steady state.]
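Since the literal crossing of ⟨ρ⟩² is noisy in finite recordings, the sketch below (surrogate AR(1) data with a known correlation time, not real recordings) uses the e-folding of the connected autocorrelation as a practical proxy for the temporal correlation length ξ∥:

```python
import numpy as np

rng = np.random.default_rng(3)

# Surrogate activity: AR(1) process with known correlation time -1/ln(a) ≈ 9.5 bins
a, T = 0.9, 300_000
rho = np.empty(T)
rho[0] = 0.0
noise = rng.normal(size=T)
for t in range(1, T):
    rho[t] = a * rho[t - 1] + noise[t]
rho += 5.0                                    # shift to a positive mean activity

# C_par(t') = <rho(t) rho(t + t')>; its connected part decays over ~xi_par bins
lags = np.arange(101)
C = np.array([np.mean(rho[: T - L] * rho[L:]) for L in lags])
conn = C - rho.mean() ** 2                    # connected autocorrelation
xi = int(np.argmax(conn < conn[0] / np.e))    # e-folding time as a proxy for xi_par
print(xi)
```

With a long enough recording the estimate lands close to the true correlation time of about 9.5 bins; the same recipe applies to C⊥(r) with distance in place of lag.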
Using these quantities, equations (7) and (8) can be promptly applied for many combinations of parameters, (ε, h), in order to define the order parameter, ρ, and its variance per unit neuron, Var_ρ,

ρ = ⟨ρ̄⟩,    (17)
Var_ρ = N (⟨ρ̄²⟩ − ⟨ρ̄⟩²),    (18)

where ⟨·⟩ denotes an average over independent trials. A few things must be noted here: (i) out of equilibrium, the fluctuation-dissipation theorem is not generally valid [7], such that now χ ≠ Var_ρ in general, contrary to what is expected in equilibrium; thus, the susceptibility, χ = ∂ρ/∂h, and the variance each define a different critical exponent; (ii) one is certainly interested in calculating the averages in equations (17) and (18) at ε_c = h_c = 0, but it takes a very long time to reach the steady state as the critical point is approached, since ξ∥ → ∞ as ε → 0 and h → 0 (this phenomenon is known as critical slowing down [77]); (iii) higher moments of ρ may be calculated, yielding cumulants that help us identify the critical point [7, see section 4.1.9.2] and [89]; (iv) error estimates for the variance can be computed via bootstrap or jackknife techniques [77]; (v) it is not trivial to measure these quantities in a computer simulation of a system undergoing an absorbing state phase transition, since the absorbing state ρ = 0 will eventually be reached in any finite system [7,85], and similar issues might appear in experiments; thus, some simulation techniques were developed to optimize the calculations [92].
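A sketch of the steady-state estimators in equations (17) and (18), with a bootstrap error bar as in remark (iv); the data here are surrogate binomial firing rates (not a real recording), and time bins are treated as approximately independent samples:

```python
import numpy as np

rng = np.random.default_rng(7)

# Hypothetical steady-state recording: rho(t) sampled over Dt time bins
# for a network of N neurons (surrogate data, not an actual experiment)
N, Dt = 100, 5000
rho_t = rng.binomial(N, 0.2, size=Dt) / N     # instantaneous firing rate per bin

rho_bar = rho_t.mean()                        # order parameter, eq. (17)
var_rho = N * rho_t.var()                     # variance per neuron, eq. (18)

# Bootstrap error bar for Var_rho (remark iv): resample time bins with replacement
boot = np.array([N * rng.choice(rho_t, size=Dt).var() for _ in range(500)])
err = boot.std()
print(round(rho_bar, 2), round(var_rho, 2))
```

For genuinely correlated data the resampling should be done over blocks longer than ξ∥ rather than over individual bins.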

Scaling properties for brain systems
Inspired by the experimental work in [93], which found a directed percolation (branching process) phase transition in turbulent states of nematic liquid crystals, we describe here some laws that could be used to look for criticality in the brain. A complete list can be found in [7, chapter 4] and [43]; we selected a few that we judged to be the easiest to calculate. The firing rate ρ is the order parameter, ε is a parameter that correlates with the average synaptic intensity, and h is any external current injected into the system, measured relative to its firing threshold. The critical exponents are given by:

• The decay of the firing rate after full excitation [ρ(0) ≈ 1 or homogeneous ρ(0) > 0; the average for each t is taken over surviving trials, i.e. with ρ(t) > 0]:

ρ(t) ∼ t^{−α},    (19)

• The initial increase of the firing rate after a single-site stimulus [0 < ρ(0) → 0; average over all trials for each t]:

ρ(t) ∼ t^{Θ},    (20)

• The steady state of the order parameter:

ρ(ε, 0) ∼ |ε|^{β},    (21)
ρ(0, h) ∼ h^{1/δ_h},    (22)
χ = ∂ρ/∂h ∼ |ε|^{−γ},    (23)
Var_ρ ∼ |ε|^{−γ′},    (24)

• Correlations:

C⊥(r) ∼ r^{−(d−2+η)},    (25)
ξ⊥ ∼ |ε|^{−ν⊥},    (26)
C∥(t′) ∼ t′^{−β/ν∥},    (27)
ξ∥ ∼ |ε|^{−ν∥},    (28)

• Steady-state avalanche properties (size s, duration T):

P(s) ∼ s^{−τ},    (29)
P(T) ∼ T^{−τ_t}.    (30)

Remember, ε and h are, respectively, the average synaptic coupling (or something correlated with it) and the average external current intensity.
A few remarks: (i) all the exponents in equations (19)-(30) are positive real constants, and they depend on each other through the so-called scaling laws, many of which are collected in [43]. (ii) The time dependencies in equations (19) and (20) hold for the ensemble average; this means that the experiment must be run several times under the same configuration to obtain an ensemble of ρ(t), and then the average ρ(t) is calculated over the ensemble for each t (see figure 3, and an experimental example in [93]). (iii) Depending on the fraction of active sites at the beginning of the experiment, a crossover between equations (19) and (20) may be observed, where ρ(t) initially grows and then decays. (iv) These laws are expressed in terms of the absolute value of ε because some are only valid in the active phase (also known as supercritical), where the activity is capable of spreading through the network (i.e. ρ > 0 even as h → 0); and the active phase may happen either with ε > 0 or ε < 0, depending on the system (figure 2 has the active phase for ε < 0). (v) The size of an avalanche, s, should be measured between two consecutive visits to the absorbing state [7,56] (i.e. ideally, avalanches from different seeds should be separated in space and time, a procedure that is not always possible experimentally).
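The surviving-trial ensemble average of remarks (ii) and (v) can be illustrated with a toy stochastic model (our own construction, not from the references): N sites, each active site emitting on average one activation aimed at a uniformly random site, so that coalescence of coincident inputs makes the dynamics effectively critical and ρ(t) decays approximately as t^{−1} (the mean-field directed percolation value α = 1):

```python
import numpy as np

def trial(N, n_steps, rng):
    """rho(t) for a toy critical process: each active site emits on average
    one activation aimed at a uniformly random site; coincident hits coalesce."""
    active = N                                   # full excitation, rho(0) = 1
    rho = [1.0]
    for _ in range(n_steps):
        if active == 0:
            rho.append(0.0)
            continue
        n_attempts = rng.poisson(active)         # ~1 activation attempt per active site
        targets = np.unique(rng.integers(0, N, size=n_attempts))
        active = targets.size
        rho.append(active / N)
    return np.array(rho)

rng = np.random.default_rng(4)
N, n_steps, n_trials = 100_000, 200, 5
rhos = np.array([trial(N, n_steps, rng) for _ in range(n_trials)])

# ensemble average over surviving trials, for each t separately
surv = np.where(rhos > 0, rhos, np.nan)
rho_mean = np.nanmean(surv, axis=0)

# apparent decay exponent alpha from a log-log fit between t = 20 and t = 200
ts = np.arange(20, 201)
alpha = -np.polyfit(np.log(ts), np.log(rho_mean[ts]), 1)[0]
print(round(alpha, 2))
```

The fitted exponent comes out close to 1, with small deviations from the short-time transient; in an experiment the same fit would be applied to the trial-averaged firing rate after a strong stimulus.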
Temporal correlations hold a more controversial relation to the critical point in the SOC-related literature, and consequently in brain criticality. This is because the original work on SOC was published as an explanation for 1/f noise [1]; this means that the power spectrum of ρ(t) should decay inversely proportionally to the frequency. However, it was later found that the original sandpile presented 1/f² noise instead [94]. The power spectrum S(f) is related to the autocorrelation function by a Fourier transform (a result known as the Wiener-Khinchin theorem [95]); this means that S(f) ∼ f^{−b} yields C∥(t′) ∼ t′^{−β/ν∥} [equation (27)] with b + β/ν∥ = 1, valid for b ≠ 1. When b = 1, the PL decay of C∥(t′) breaks down and gives place to a slow logarithmic decay, yielding extremely long temporal correlations [2]. White noise is obtained for b = 0, and Brownian noise for b = 2. In summary, the important point is that having a PL power spectrum is enough to obtain the long-range autocorrelations that are typical of a critical point, obeying equation (27). Thus, requiring b = 1 is not necessary for criticality, although evidence suggests this is the type of noise present in the brain [73,74,96].
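The spectral exponent b can be estimated directly from ρ(t); the sketch below (surrogate noises, with illustrative windowing choices) recovers b ≈ 0 for white noise and b ≈ 2 for Brownian noise from segment-averaged periodograms:

```python
import numpy as np

rng = np.random.default_rng(11)

def spectrum_exponent(sig, nseg=100, nfft=1024):
    """Average the periodogram over segments and fit S(f) ~ f^-b (log-log slope)."""
    segs = sig[: nseg * nfft].reshape(nseg, nfft)
    S = np.mean(np.abs(np.fft.rfft(segs, axis=1)) ** 2, axis=0)
    f = np.fft.rfftfreq(nfft)
    sl = slice(2, nfft // 4)                      # avoid the DC bin and the Nyquist end
    return -np.polyfit(np.log(f[sl]), np.log(S[sl]), 1)[0]

white = rng.normal(size=120_000)                  # b ≈ 0 (white noise)
brown = np.cumsum(rng.normal(size=120_000))       # b ≈ 2 (Brownian noise)
b_white = spectrum_exponent(white)
b_brown = spectrum_exponent(brown)
print(round(b_white, 1), round(b_brown, 1))
```

Averaging over many segments is what makes the log-log fit stable; a single periodogram is far too noisy for a reliable exponent.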
Finally, there is a quantity that is particularly interesting due to its simplicity: the distribution of active clusters (not avalanches), P_cluster(ρ). In practice, it is the histogram of ρ(t). This quantity has been employed in some brain-criticality-related works [53,54,97] (see table 5), possibly mistaken for the avalanche size distribution. The confusion with avalanche distributions happens because P_cluster(ρ) shows a PL scaling at the critical point of some phase transitions related to percolation [98, see the equations for n_s in chapter 2], defining the Fisher exponent [98,100]. There is theoretical evidence that a homeostatic system hovering around a critical point also generates a PL distribution of active clusters [72]. However, the PL of this distribution does not seem to be a general feature of all non-equilibrium phase transitions [98, section 2.5].
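In practice P_cluster(ρ) is just a (log-binned) histogram of ρ(t); the sketch below builds it for a surrogate heavy-tailed activity trace (a shifted Pareto stand-in, purely illustrative) and fits the apparent Fisher-like exponent:

```python
import numpy as np

rng = np.random.default_rng(5)

# Surrogate activity: heavy-tailed rho(t) with density ~ rho^-2.5 above rho_min
rho_min = 1e-3
rho = rho_min * (rng.pareto(1.5, size=50_000) + 1.0)
rho = rho[rho < 1.0]

# P_cluster(rho): normalized histogram of the activity, on logarithmic bins
bins = np.logspace(np.log10(rho_min), 0, 30)
counts, edges = np.histogram(rho, bins=bins, density=True)
centers = np.sqrt(edges[1:] * edges[:-1])

# apparent PL exponent from a log-log fit over the populated bins
mask = counts > 0
expo = -np.polyfit(np.log(centers[mask]), np.log(counts[mask]), 1)[0]
print(round(expo, 1))
```

Logarithmic binning is essential here: with linear bins the tail of the histogram is dominated by bins containing zero or one sample and the fit becomes meaningless.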

Scaling functions applied to the brain
In equilibrium, the PLs that define the critical exponents are theoretically derived from the assumption that the free energy is a generalized homogeneous function sufficiently close to the critical point. Thus, the derivatives of the free energy must also fulfill this same requirement [7]. A generalized homogeneous function is any function satisfying

f(λ^{a_x} x, λ^{a_y} y, . . .) = λ f(x, y, . . .),    (31)

[Table 5. Active cluster distribution in whole-brain imaging. Distribution of active clusters P_cluster(ρ) calculated for whole-brain experiments. [53]: activity clusters of spatially connex cascades. † Exponents estimated from figures. All recordings performed in humans. These distributions are sometimes confused with avalanche size distributions, but in fact they define the Fisher exponent from percolation theory [98,100]; see also section 3.2.]

References | P_cluster(ρ) | Source | # of units | Condition
for all positive real values of λ. This means that f is rescaled by a constant factor λ whenever its arguments are collectively rescaled by corresponding powers of λ. We follow the same premise in non-equilibrium systems, but now relative to a generating Langevin equation instead [7]. This causes all quantities on the left-hand side of equations (19)-(30) to be written so as to satisfy equation (31) sufficiently close to the phase transition. To do that, we must remember that all these quantities are functions of the interaction parameter ε, the external currents h, time t, lateral system size L, and so on, and then multiply each quantity by its corresponding scaling function. For instance, defining the critical exponent [99] σ = βδ_h, the steady state of the order parameter can be written as [89]

ρ(ε, h) = λ^{−β} R̃(a_ε ε λ, a_h h λ^{σ}),    (32)
a_v Var_ρ(ε, h) = λ^{γ′} Ṽ(a_ε ε λ, a_h h λ^{σ}),    (33)

where the a's are specific metric factors that are not universal (i.e. different models will have different a's) and must not depend on L, ε or h; and R̃ and Ṽ are two universal scaling functions that are shared by all systems in the same universality class. Thus, determining the scale invariance of the system by determining these universal functions is a robust way to claim that a phase transition is critical. It also helps to determine the universality class of the system, and complements the computation of the critical exponents [7,89]. The exponent of λ for each argument of the scaling functions can be determined from dimensional analysis, but a vast list is given in [7, appendix B, and throughout chapter 4].
In practice, the method of finding the scaling functions is called curve collapse, and consists in choosing an appropriate value of λ to rescale the order parameter and the other quantities. In doing so, all systems pertaining to the same universality class will fall on top of the same curve with rescaled arguments. For example, assuming the normalization R̃(1, 0) = 1, R̃(0, 1) = 1, and Ṽ(0, 1) = 1, then a_ε, a_h, and a_v are given by the angular coefficients of PL fits to equations (21), (22) and (24). Finally, choosing λ to satisfy a_h h λ^{σ} = 1, the functions R̃ and Ṽ become dependent on a single variable u = a_ε ε (a_h h)^{−1/σ}, and can be isolated from equations (32) and (33),

R̃(u, 1) = (a_h h)^{−β/σ} ρ(ε, h),    (34)
Ṽ(u, 1) = a_v (a_h h)^{γ′/σ} Var_ρ(ε, h).    (35)

Since ε and h are control parameters, ρ and Var_ρ are measured, and the a's and exponents are obtained by fitting, we can plot the right-hand side of the last expressions versus the new variable u to obtain the shape of R̃ and Ṽ. The collapse is only expected asymptotically as ε → 0 and h → 0. All the methods described in this section can be applied to any power-law quantity close to the critical point.
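To make the collapse procedure concrete, here is a sketch using an exactly solvable mean-field toy form (the steady state of ρ̇ = ερ − ρ² + h, for which β = 1, δ_h = 2, and hence σ = βδ_h = 2); rescaled curves for different h collapse onto a single function of u:

```python
import numpy as np

# steady state of the mean-field Langevin equation rho' = eps*rho - rho^2 + h
def rho_ss(eps, h):
    return (eps + np.sqrt(eps**2 + 4 * h)) / 2

beta, sigma = 1.0, 2.0                       # mean-field values: beta = 1, delta_h = 2
u_grid = np.linspace(-3.0, 3.0, 50)          # collapse variable u = eps * h^(-1/sigma)

# For each field h, sample eps so that the rescaled curves share the same u values,
# then compute rho * h^(-beta/sigma) versus u: all curves trace the same function R(u)
curves = []
for h in (1e-4, 1e-5, 1e-6):
    eps = u_grid * h ** (1.0 / sigma)
    curves.append(rho_ss(eps, h) * h ** (-beta / sigma))

spread = max(np.max(np.abs(curves[0] - c)) for c in curves[1:])
print(spread < 1e-9)                         # perfect collapse (up to rounding)
```

With experimental data the exponents would come from the PL fits of equations (21), (22) and (24) rather than being known in advance, and the collapse quality itself becomes the test.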
Other scaling forms that may be of interest are related to the finite size of the system and to time. For that, we just need to add arguments to the scaling functions with the correct exponents of λ (usually determined by dimensional analysis; a list of the most usual forms for λ is in [7, appendix B]). For example, a more complete form of R̃ containing time t is [7,89]:

ρ(ε, t) = λ^{−β} R̃_t(a_ε ε λ, a_t t λ^{−ν∥}).    (36)

Choosing λ = (a_t t)^{1/ν∥} and having ε as the control parameter, we see that α = β/ν∥. Then, we define the variable u = t|ε|^{ν∥} for the x-axis and plot on the y-axis the function R̃_t(u) = (a_t t)^{β/ν∥} ρ(ε, t).

[Figure 4. Left: steady-state order parameter ρ and variance Var_ρ [equations (17) and (18)] as functions of the synaptic coupling ε for different system sizes L. The scaling forms from equations (37) and (38) are shown, yielding the exponents β/ν*⊥ and γ′ = 0. A zero exponent is frequently obtained for a quantity that has a discontinuous jump at the critical point ε = 0. We write ν*⊥ because in this particular example the phase transition is of the mean-field directed percolation universality class; this means that the scaling law ρ ∼ L^{−β/ν*⊥} holds with corrections [89]. Right: for a system of fixed size L = 10^5, the average ρ does not depend on the subsampling fraction f = n/L, where n is the number of recorded neurons; the variance depends on f homogeneously. This shows that steady-state properties are not strongly influenced by subsampling, making them good candidates to replace avalanche distributions. Supercritical corresponds to ε < 0 and subcritical to ε > 0.]
The curves for systems in the same universality class should collapse onto the universal scaling function R̃_t(u). This was done experimentally in [93] to determine the universality class of a phase transition in a turbulent nematic liquid crystal.
The finite-size scaling is accomplished by adding a dependence on the system size L. It is a very common technique in simulations, since the system size can be easily varied [12]. It is sometimes applied to experiments too, when it is possible to directly control the spatial dimensions of the experiment. In addition, varying L can be mimicked by box-size scaling. This was done in [4] to probe the PL decay of avalanches as the number of electrodes changed, and also in [53], where the authors determined the correlation length in the resting state of fMRI recordings of healthy humans. This technique is inherited from fractal analysis and consists in calculating ρ, the avalanches, or any other quantity, confined to a box of lateral size L contained in the full system; then the box is increased, the quantities are recalculated in the new box, and so on, resulting in ρ being a function of the box size L (instead of the system size). For a finite system, the scaling forms of ρ and Var_ρ are

ρ(ε, h, L) = λ^{−β} R̃_FSS(a_ε ε λ, a_h h λ^{σ}, a_L L λ^{−ν⊥}),    (37)
a_v Var_ρ(ε, h, L) = λ^{γ′} Ṽ_FSS(a_ε ε λ, a_h h λ^{σ}, a_L L λ^{−ν⊥}).    (38)

Choosing λ = (a_L L)^{1/ν⊥} and h = 0, we get (omitting the a's)

ρ(ε, L) = L^{−β/ν⊥} R̃_FSS(ε L^{1/ν⊥}),
Var_ρ(ε, L) = L^{γ′/ν⊥} Ṽ_FSS(ε L^{1/ν⊥}).

These last expressions are useful for two things: (i) they help us find the apparent critical point in finite systems, since when ε = ε_c = 0, the functions R̃_FSS and Ṽ_FSS become constants, yielding ρ(L) ∼ L^{−β/ν⊥} and Var_ρ(L) ∼ L^{γ′/ν⊥}; (ii) they can be used in the same way as the other scaling functions, collapsing the data for different ε or h on top of a single curve R̃_FSS(u) ∼ ρ(ε, L) L^{β/ν⊥}, with u = |ε| L^{1/ν⊥} (and similarly for the variance). An example is given in figure 4. Nevertheless, dealing with finite-size systems is tricky, since the dimensionality of the system may bring corrections to the scaling laws from the last section and to the critical exponents too. This is particularly known in the case of the mean-field directed percolation phase transition [89], where the exponent ν⊥ gains a new value.
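The finite-size recipe can be checked on synthetic data generated from an assumed scaling form (the function F below, 1 + tanh, is a made-up stand-in for the unknown universal function; the exponents are also arbitrary choices):

```python
import numpy as np

# Toy scaling form rho(eps, L) = L^(-beta/nu) * F(eps * L^(1/nu)),
# with F(x) = 1 + tanh(x) standing in for an unknown universal function
beta_nu, inv_nu = 0.5, 2.0            # assumed beta/nu_perp and 1/nu_perp

def rho_fss(eps, L):
    return L ** (-beta_nu) * (1.0 + np.tanh(eps * L ** inv_nu))

# (i) apparent critical scaling: at eps = 0, rho(L) ~ L^(-beta/nu)
Ls = np.array([100, 200, 400, 800])
slope = np.polyfit(np.log(Ls), np.log(rho_fss(0.0, Ls)), 1)[0]

# (ii) collapse: rho * L^(beta/nu) versus u = eps * L^(1/nu) is L-independent
u = np.linspace(-2, 2, 41)
ys = [rho_fss(u * L ** (-inv_nu), L) * L ** beta_nu for L in Ls]
spread = max(np.max(np.abs(ys[0] - y)) for y in ys[1:])
print(round(-slope, 2), spread < 1e-12)
```

In a real analysis the exponents are unknown, so one scans over trial values of β/ν⊥ and 1/ν⊥ and keeps those that minimize the spread between the rescaled curves.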
However, the ensuing discussion is outside the scope of this work, since most of the time it is enough to determine the exponent ratios β/ν⊥ and γ′/ν⊥ by using equations (37) and (38).
Performing the curve collapse for two distinct brain experiments would allow us to determine whether the two brains pertain to the same universality class without bothering with the actual class per se. The class itself could be determined by the collapse of the experiments with a particular model, and different models could be tried.

[Figure 5. Avalanche distributions of a model [14,72] pertaining to the mean-field directed percolation universality class. Top row: fully sampled system (all neurons are recorded from); the distributions extend to larger sizes and durations as the simulated system grows; each panel can be collapsed onto one universal scaling function by plotting P(s)s^τ versus s/N^D for a particular choice of D (same for P(T)) [3,43]. Size and duration obey the crackling noise scaling rule (right panel). Bottom row: same system as above, although now only a fraction fN of the neurons is recorded from the network; notice that the distribution deviates more and more from a PL as f decreases. The crackling noise relation is not obeyed anymore. Importantly, the system is adjusted to be critical (ε = h = 0), so the subsampling is just hiding the true nature of the network. Solid and dashed lines are the theoretical predictions.]

For the sake of the example, say you extract pieces of the same brain region (e.g. the same layers of the primary visual cortex) from two different animals (e.g. a mouse and a rat). Provided you have many cortical
slices for each animal, after a good amount of trial and error, you have many preparations for the same animal species, each one containing a different concentration of synaptic blockers; this allows you to control for Δ. Then, you measure the firing rate ρ(t) without any current injection (h = 0). Finally, you control for h in small steps: inject a little current and hold until the system reaches its steady state, measuring the new ρ(t); inject a little more current and hold again, measuring ρ(t) again, and so on. This is probably a very difficult experiment (not to mention whether it is really doable at all). However, assuming it is possible, you will end up with the steady-state ρ and Var_ρ as functions of Δ and h. Therefore, you will be able to directly apply the scaling-function method described here to check whether the data from both animals collapse onto a single curve for each quantity, say R̃(u, 1) and Ṽ(u, 1).

Avalanche scaling
For completeness, the scaling theory developed in the last subsection can also be applied to avalanche distributions [3,43]. The usual way to do it is by introducing scaling functions into equations (1) and (2),

P(s) = s^{−τ} P̃(s/s_c),    (39)

P(T) = T^{−τ_t} P̃_t(T/T_c),    (40)

where s_c and T_c are the avalanche characteristic size and duration, respectively, obeying s_c ∼ |Δ|^{−1/σ} and T_c ∼ |Δ|^{−ν_∥}. Notice that the σ defined here is different from the σ = βδ_h. Their finite-size scaling forms, derived similarly to equation (37), are s_c ∼ L^D and T_c ∼ L^z to leading order, where L is the lateral system size, D is a characteristic spatial dimensionality, and z = ν_∥/ν_⊥ is the dynamical exponent [101]. Determining the numeric value of s_c and T_c is not trivial, but sometimes they can be estimated from the complementary cumulative distributions of s and T [15]. Equations (39) and (40) can be used to view the scaling functions P̃(s/s_c) and P̃_t(T/T_c): you just need to plot P(s)s^τ versus s/s_c (and the same for T); see the upper insets in figure 5. The crackling noise scaling relation is given by equation (4). It puts together many critical exponents related to avalanches and can be derived in general, for any absorbing phase transition, in at least two different ways [43,44]. Here is a simplified version. We start by introducing two quantities that are known to define the spreading critical exponents [43]: the survival probability P_s(t) ∼ t^{−δ} (expressing the chance that a configuration has still not reached the absorbing state at time t), and the average number of active sites over all trials N(t) ∼ t^Θ. Since N(t) ∼ P_s(t)N_s(t), where N_s is the number of sites that are still active at time t, we can write N_s ∼ t^{Θ+δ}.
The size of an avalanche that dies at time T is s(T) ∼ ∫_0^T N_s(t)dt ∼ T^{1+Θ+δ}. Now we can calculate the PL dependence of the distribution of avalanche durations P(T) from P(s), since s depends on T: P(T) = P(s)|ds/dT| ∼ T^{−τ(1+Θ+δ)+Θ+δ}, yielding τ_t − 1 = (τ − 1)(1 + Θ + δ). Notice that (a) T_c ∼ ξ_∥ ∼ |Δ|^{−ν_∥}, since at criticality the characteristic time is ξ_∥; (b) s_c ∼ |Δ|^{−1/σ}; and (c) s_c ∼ T_c^{1+Θ+δ}, from the size–duration relation above. Putting these three facts together with z = ν_∥/ν_⊥, we get 1 + Θ + δ = 1/(σν_⊥z), arriving at equation (4). Again, the exponent z is sometimes defined differently, see [101].
This scaling law is derived by assuming that a critical point exists where the avalanche sizes and durations obey scale invariance. Thus, the crackling noise relation is a necessary condition for the critical point. However, this derivation does not guarantee the converse: we do not know whether obeying the crackling noise scaling relation implies having a critical point. In other words, it is not clear whether the crackling noise scaling relation is also a sufficient condition for criticality. More investigation in this direction is needed.
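The arithmetic of the relation can be checked numerically. In the sketch below, a critical branching process serves as a hypothetical stand-in for critical neuronal dynamics, and the exponent values are the known mean-field directed percolation ones; the average size of avalanches lasting T should then grow as T^{(τ_t−1)/(τ−1)} = T².

```python
import numpy as np

# Known mean-field directed percolation exponent values (for illustration)
tau, tau_t = 3/2, 2          # P(s) ~ s^-tau, P(T) ~ T^-tau_t
sigma = 1/2                  # s_c ~ |Delta|^(-1/sigma)
nu_perp, nu_par = 1/2, 1     # correlation-length exponents; z = nu_par/nu_perp
theta, delta_sp = 0, 1       # spreading exponents: N(t) ~ t^theta, P_s(t) ~ t^-delta
z = nu_par / nu_perp

# The crackling noise relation: all three expressions must coincide
assert (tau_t - 1) / (tau - 1) == 1 + theta + delta_sp == 1 / (sigma * nu_perp * z)

# Critical branching process (branching ratio 1) as a toy critical system
rng = np.random.default_rng(0)
sizes, durs = [], []
for _ in range(50_000):
    n, s, t = 1, 1, 1
    while n > 0 and t < 200:            # cap duration to bound the run time
        n = rng.poisson(n)              # n active units spawn Poisson(n) descendants
        s += n
        t += 1
    sizes.append(s)
    durs.append(t)
sizes, durs = np.array(sizes), np.array(durs)

# Average avalanche size within logarithmic duration bands [T, 2T)
T_bands = np.array([4, 8, 16, 32, 64])
mean_s = np.array([sizes[(durs >= T) & (durs < 2 * T)].mean() for T in T_bands])
slope = np.polyfit(np.log(T_bands), np.log(mean_s), 1)[0]   # expect ~2
```

The fitted slope only approaches 2 for long durations; finite-size and subsampling effects can easily mask it in experimental data, which is part of the difficulty discussed above.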

Criticality beyond avalanches
Neuroscience is a rapidly evolving field. Every day we see an exciting breakthrough, ranging from genetic manipulations for optical imaging to whole-brain recordings of electromagnetic fields, all to help us make sense of the organ that is responsible for making sense: the brain. As if the brain were not complex enough, recording neuronal activity at any scale is itself an intricate job, and can certainly be compared to a great work of art. Just imagine the precision needed to implant electrodes in the correct spot, and the odds of facing an artifact coming from unknown sources in the system being recorded. All these tasks require following a strict protocol, and countless hours of fine-tuning.
In parallel, we have theoretical models of neuronal networks. Sometimes they are built exclusively to solve a particular problem (the field of artificial intelligence is an example of this endeavor). Nevertheless, some models are built to render insight regarding the mechanism responsible for a given phenomenon. For example, the PLs in the stimulus-response curve in psychophysics (sometimes known as the Weber-Fechner law) can be explained by a non-equilibrium mean-field directed percolation model on its critical point [20]; not only that, but this model also predicts that when the system is on the critical point, the range of sensitivity to external stimuli must be maximized-a fact that was experimentally corroborated only a few years later [21].
A third use of models, however, is far less explored by the brain criticality community: a personalized model, built using subject-specific data, can be used to make predictions and draw conclusions about a specific person or neurological disorder. In fact, there is an extensive review of the relations between some criticality metrics and clinical parameters [102]. For example, the absorbing state model developed in [53] was improved [97] and applied to measure the recovery of stroke patients over a year [103]. The authors found that as the stroke-related lesions recovered, so did the criticality markers of the patients, passing from a subcritical to a critical state; and the improvement of criticality depended on the rewiring of white-matter tracts. Interestingly, this type of data-driven modeling yields a subject-specific critical point (though most probably with similar critical exponents), since each person has their own particular connectome [97]. The criticality (or lack thereof) associated with a given neurological disorder could then be estimated by quantifying in which state (sub-, super-, or critical) the model provides the best fit of each subject's features. This technique could lead to potentially useful markers for the diagnosis and recovery of patients.
In the same vein, a simpler model with a percolation phase transition has been employed in temporal lobe epilepsy to identify dynamical properties of the brain [104]. The connectome of each patient and control subject was measured using fractional anisotropy of water diffusion. Then, the model was simulated on each individual network, and the percolation time between every two nodes was measured; this was regarded as a measure of the communication between brain regions. The system did not have a critical point, only a first-order phase transition. Nevertheless, running the simulations on top of the transition point enhanced the differences in the communication between brain regions in patients. These differences were correlated with cognitive markers (such as memory and motor coordination), and vanished after statistically controlling for the hippocampus volume, a structural marker known to be significantly decreased in this disease. Again, the phase transition coming out of a data-driven model plays an important role in distinguishing patients from healthy individuals.
Generalizations of a critical point are made through the theory of bifurcations [62]. Some bifurcations directly correspond to a critical point [63], presenting the scaling laws that we discussed in section 3. Markers such as critical slowing down, known to happen close to critical points due to the diverging autocorrelation time, are also used to forecast tipping points in dynamical systems [105,106]. In particular, it has been found in the brain of epilepsy patients [107], although there is still controversy about whether this is a general phenomenon related to the disease [108]. This is intriguing, because there is an ongoing debate about whether it is epilepsy (or an unhealthy condition in general), rather than a healthy condition, that should be related to the critical state. This led to the proposition that healthy systems should be subcritical [59]. Conversely, in vitro neuronal avalanche findings in folate-reared cultures indicate a switch between universality classes when compared to healthy cultures [33]. The folate-reared cultures are expected to be experimental models of epilepsy [48]. Thus, the finding in [33] is in direct conflict with both the slightly subcritical brain hypothesis and the critical brain hypothesis: in the former, criticality would be related to seizures (but a healthy brain would not be critical); in the latter, a healthy brain would be critical, and an unhealthy condition would deviate from criticality.
Avalanche distributions appear in many experiments, but they are difficult to measure and suffer from a lack of standardization. Future work should systematically address the relation between different experimental techniques, specifically employing models that are able to generate avalanches. Not only that, but we also need to establish a protocol or a pipeline to detect and sample avalanches. For instance, some works group avalanches by a particular feature of the noise that generated them. The grouping features for spikes and avalanches include the coefficient of variation (CV) of the underlying noise [29,37], the maximum variance of the noise [38], the principal component of the LFP frequencies [22,31], or even the spatial extent of avalanches [32]. These grouping features could serve as implicit control parameters of the system, although they might introduce bias in the estimation of the statistical distributions of avalanche size and duration, and in the computation of the order parameter steady state.
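As a concrete starting point for such a pipeline, the sketch below implements the standard detection convention of [4]: spikes are binned in time, and an avalanche is a run of consecutive occupied bins bounded by empty bins. The spike times and the bin width are made up for illustration.

```python
import numpy as np

def detect_avalanches(spike_times, bin_size):
    """Bin spike times and group runs of consecutive nonempty bins into
    avalanches; size = total spikes in the run, duration = number of bins."""
    t_max = spike_times.max()
    n_bins = int(np.ceil((t_max + bin_size) / bin_size))
    counts, _ = np.histogram(spike_times, bins=n_bins,
                             range=(0.0, n_bins * bin_size))
    sizes, durations = [], []
    s = d = 0
    for c in counts:
        if c > 0:                       # bin occupied: avalanche continues
            s += c
            d += 1
        elif d > 0:                     # empty bin: avalanche just ended
            sizes.append(s)
            durations.append(d)
            s = d = 0
    if d > 0:                           # close a run ending at the last bin
        sizes.append(s)
        durations.append(d)
    return np.array(sizes), np.array(durations)

# Toy example: three bursts separated by silence (bin_size = 1.0)
spikes = np.array([0.1, 0.2, 0.9,      # bin 0: size 3, duration 1
                   3.1, 3.5, 4.2,      # bins 3-4: size 3, duration 2
                   7.7])               # bin 7: size 1, duration 1
sizes, durations = detect_avalanches(spikes, 1.0)
print(sizes.tolist(), durations.tolist())   # -> [3, 3, 1] [1, 2, 1]
```

The choice of bin width (often taken as the average inter-event interval) strongly affects the measured exponents, which is precisely why a shared protocol is needed.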
Relying on avalanche distributions alone to claim that a system is critical may lead to confusion. Critical systems without clear PL avalanches, different exponents measured for the same system under diverse experimental setups, ambiguities in the definition of an avalanche and in the procedure to detect one, and the bias caused by having a very small sample of a very large system all bring controversial conclusions about the nature of the system under scrutiny. To help circumvent that, many aspects of the general theory of criticality were described and discussed in the context of the brain throughout this paper, the main ones being the scaling functions and the scaling relations between critical exponents. Although certainly difficult to measure, they provide clearer evidence of criticality. Importantly, we know that subsampling does not disrupt the measurement of the steady-state firing rates in mean-field-like networks; at most, subsampling may only systematically shift the firing rate fluctuations in these networks, as we showed in figure 4. To measure them, we need to record the instantaneous firing rate of the system as a function of two independent control parameters, and many trials must be run for each parameter configuration. One of the control parameters is Δ: it is related to the average interaction between neurons or brain regions (e.g. E/I balance, connection strength, synaptic conductance, density of connections [109]). The other is simply the external input relative to the threshold, h. Other candidates for control parameters are the features that some authors used to group the data before calculating avalanches, such as the CV or the variance of the firing rate, or the main frequency of the oscillations. In cultures, the external input is controlled by current injection.
In in vivo experiments, the external stimulus can take many forms; from blowing onto the whiskers of the animal [110], to visual patch stimulus, to any kind of sensory input that makes sense in the particular experimental context. Then, to apply the theory developed here, one must perform the appropriate average of the firing rate, either over the ensemble of trials, or over time.
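The logic of this two-parameter protocol can be sketched in silico. Below, a hypothetical stochastic mean-field network stands in for the preparation: the branching parameter m plays the role of the synaptic-blocker control (Δ = m − 1) and a small spontaneous activation rate h plays the role of the injected current. All parameter values are illustrative.

```python
import numpy as np

def steady_rho(m, h, N=10_000, T=300, burn=150, seed=0):
    """Steady-state mean and variance of the active fraction rho for a
    stochastic mean-field network: each unit activates with probability
    1 - exp(-(m*rho + h)), where m is the branching parameter and h is a
    small spontaneous activation rate (the 'external field')."""
    rng = np.random.default_rng(seed)
    n = 0
    tail = []
    for t in range(T):
        p = 1.0 - np.exp(-(m * n / N + h))
        n = rng.binomial(N, p)
        if t >= burn:                   # discard the transient
            tail.append(n / N)
    tail = np.array(tail)
    return tail.mean(), tail.var()

# Sweep the control parameter m (Delta = m - 1) at fixed small field h:
rhos = {m: steady_rho(m, h=1e-4)[0] for m in (0.8, 1.0, 1.2)}
# Subcritical (m < 1): rho stays at the h-driven level; supercritical
# (m > 1): rho settles at the nonzero solution of rho = 1 - exp(-m*rho).
```

Repeating such sweeps over both m and h yields exactly the tables of steady-state ρ and Var_ρ that the collapse method requires.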
Another type of average that can be considered is over spatial disorder [111]. This scenario happens when the network varies from sample to sample while keeping the same statistical properties. For example, the synaptic weights may vary between two samples although they obey the same weight distribution; or the local in- and out-degrees of nodes may vary between samples, provided that the network-wide distribution of these quantities is kept. Experimentally, this could happen when performing experiments over multiple cultures of the same brain region, each one extracted from a different animal of the same species. Averaging the firing rate ρ(t) over the recordings of each culture can sometimes create so-called rare-region effects: exponentially rare regions of the network in which the activity lingers; the averaging procedure then accumulates these contributions and generates a PL in ρ. Differently from the critical point, where the PL exists only at the particular Δ_c and h_c, the PLs in ρ generated by network disorder persist over an extended region of the parameter space. This region is known as a Griffiths phase [11,12,111]. This regime can be systematically tested with the same tools that we presented here, and would be an alternative explanation for PLs in the brain. Some authors have applied thermodynamic concepts to analyze their experimental data (see, for example, [31,112–114]). Roughly speaking, this method assumes an Ising model with arbitrary parameters (external fields and pairwise couplings), and then determines the distribution of these parameters that maximizes the model entropy constrained by experimentally observed quantities. Some examples of quantities that were used to constrain the data are the average firing rate and pairwise correlations [112], or the probability of transition between two firing rate states [114].
Since the entropy of the equilibrium Ising model is a function of an arbitrary 'temperature' T, it can be used as a control parameter, yielding a set of fitted external field and coupling distributions for each value of T. This allows the estimation of the model's specific heat c(T), whose peak marks the critical point similarly to the peak in the susceptibility (this is because c(T) is also a second-order derivative of the free energy [6]). The peak in the specific heat is strongly dependent on the underlying cortical state, be the cortical state given by the LFP frequency [31] or by the CV of the spike rate [114]. The firing rate CV can thus be used as a control parameter, yielding an estimate of the CV at which a critical point is predicted [114]. This prediction matched the critical CV value obtained by verifying the validity of the avalanche crackling noise scaling relation [29,37]. Another interesting consequence of this method is that the critical exponents of the fitted Ising model can be estimated [113]. However, the fitted model exponents might not correspond to the critical exponents of the underlying experimental system, simply because the Ising model might fail to account for all the symmetries (or lack thereof) of the neuronal network. Nevertheless, it is probably a powerful classifier for the data, letting us know whether the experimental system is performing sub-, super-, or critical dynamics.
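The last step of this procedure, locating the peak of c(T), can be sketched by exact enumeration of a small system. In the sketch below, the fields and couplings are random placeholders standing in for the output of a maximum-entropy fit, which is not performed here.

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)
N = 8  # small enough for exact enumeration of all 2^N spin states

# Placeholder parameters standing in for a maximum-entropy fit result:
h = rng.normal(0.0, 0.3, size=N)                                  # external fields
J = np.triu(rng.normal(0.0, 1.0 / np.sqrt(N), size=(N, N)), k=1)  # couplings, i < j

states = np.array(list(itertools.product([-1, 1], repeat=N)))
E = np.array([-(s @ J @ s) - h @ s for s in states])              # Ising energies

def specific_heat(T):
    """c(T) = Var(E) / (N T^2), from exact Boltzmann averages."""
    w = np.exp(-(E - E.min()) / T)   # shifted weights for numerical stability
    Z = w.sum()
    Em = (w * E).sum() / Z
    E2 = (w * E ** 2).sum() / Z
    return (E2 - Em ** 2) / (N * T ** 2)

Ts = np.linspace(0.2, 5.0, 60)
c = np.array([specific_heat(T) for T in Ts])
T_peak = Ts[c.argmax()]   # pseudo-critical temperature of the fitted model
```

In the published analyses the fit is redone for each T before evaluating c(T); here the parameters are held fixed only to keep the sketch short.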
Spatiotemporal correlations also play an important role in determining the system's critical point. In fact, the PL divergence of the correlation lengths both in space (ξ_⊥) and in time (ξ_∥) is the defining feature of a critical point. By using the box-size scaling technique, it was found that the spatial correlation length extracted from fMRI data was consistent with a diverging correlation length due to a critical point in a percolation model [53]. Not only that, but the amplitude of the fluctuations persisted in the fMRI BOLD signal through many spatial scales, a feature that could only be observed at the critical point of the model. More importantly, the model could only generate a good estimate of the human resting-state networks when adjusted on top of the critical phase transition.
Temporal correlations have also been used to probe for criticality [25,115,116] and to relate neuronal avalanche distributions to behavior [25]. That is accomplished either by the direct calculation of the autocorrelation or the power spectrum, or by applying a technique called detrended fluctuation analysis (DFA) [117,118], developed to unveil the structure of non-stationary time series. As we saw in section 3, the exponent b of the power spectrum is related to the exponent of the steady-state average of ρ(t) via the decay of the autocorrelation, b + β/ν_∥ = 1. This is another scaling law that could be employed in data analysis, since it is only expected to hold at the critical point and it does not require the imprecise calculation of avalanches.
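For reference, DFA itself is straightforward to implement. The sketch below (first-order detrending) applies it to white noise, for which the known exponent is α ≈ 0.5; the scales and series length are illustrative.

```python
import numpy as np

def dfa(x, scales):
    """Detrended fluctuation analysis (first-order detrending): integrate
    the signal, split it into non-overlapping windows of length n, remove
    a linear trend per window, and return the RMS fluctuation F(n)."""
    y = np.cumsum(x - x.mean())
    F = []
    for n in scales:
        t = np.arange(n)
        msq = []
        for i in range(len(y) // n):
            seg = y[i * n:(i + 1) * n]
            trend = np.polyval(np.polyfit(t, seg, 1), t)   # linear detrend
            msq.append(np.mean((seg - trend) ** 2))
        F.append(np.sqrt(np.mean(msq)))
    return np.array(F)

# White noise has no temporal correlations, so F(n) ~ n^alpha with alpha ~ 0.5
rng = np.random.default_rng(0)
x = rng.normal(size=2 ** 14)
scales = np.array([16, 32, 64, 128, 256])
F = dfa(x, scales)
alpha = np.polyfit(np.log(scales), np.log(F), 1)[0]
```

Long-range temporal correlations of the kind reported near criticality would instead yield α closer to 1.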
Furthermore, phase transitions and bifurcations are present at every brain scale, from the diffusion of molecules through the membranes of cells to the macroscopic whole-brain activity. The endeavor of looking for an explanation of brain PLs continues as more evidence is gathered around the theme of brain criticality. Other phenomena, such as synchronization and brain rhythms, also play important roles in behavior, and their relation to brain criticality or neuronal avalanches is not entirely understood. Bringing in statistical mechanics tools and insights is not only an exercise of thought, but necessary to produce a unifying theory of brain electrophysiology, development, and function. Not only that, but these studies may serve to develop technological tools to better diagnose diseases and evaluate patients' recovery from brain injuries. This paper is, thus, an attempt to bring attention to these methods and to help pave the way towards understanding how, ultimately, all of these facets come together to generate animal behavior.