Integrated information in the thermodynamic limit

The capacity to integrate information is a prominent feature of biological and cognitive systems. Integrated Information Theory (IIT) provides a mathematical approach to quantify the level of integration in a system, yet its computational cost generally precludes its application beyond relatively small models. In consequence, it is not yet well understood how integration scales up with the size of a system or with different temporal scales of activity, nor how a system maintains its integration as it interacts with its environment. Here, we show for the first time how measures of information integration scale when systems become very large. Using kinetic Ising models and mean-field approximations from statistical mechanics, we show that information integration diverges in the thermodynamic limit at certain critical points. Moreover, by comparing different divergent tendencies of blocks of a system at these critical points, we delimit the boundary between an integrated unit and its environment. Finally, we present a model that adaptively maintains its integration despite changes in its environment by generating a critical surface where its integrity is preserved. We argue that the exploration of integrated information for these limit cases helps in addressing a variety of poorly understood questions about the organization of biological, neural, and cognitive systems.


I. INTRODUCTION
Cognition emerges from the distributed activity of many neural, bodily, and environmental processes. The problem of large-scale integration of neural processes is crucial for understanding how unified cognitive and behavioural states arise from the coordination of these distributed sources of activity. Evidence [1,2] suggests this integration process is non-decomposable: we cannot understand it in terms of modular components or timescales of activity in a neural system, nor can we decouple neural activity from the external environment [3]. The different components and scales of the cognitive process are deeply intertwined. Yet, the functional components of the process are still able to maintain their differentiated characteristics in order to generate complex adaptive patterns of behaviour.
How can such an integrated, complex organization emerge and be maintained? One of the most attractive theories is that neural activity is coordinated into a coherent yet flexible 'dynamic core' [4,5], which balances opposing tendencies of integration and segregation. The interplay of these opposing tendencies generates information (understood as described by information theory, not in a semantic or intensional sense) that is highly diversified among functional parts of the nervous system, and at the same time unified into a coherent whole, thus displaying highly complex patterns of activity.
Integrated information is defined as the information possessed by a system which is above and beyond the information that is available from the sum of its parts. Information integration was first conceived of as linked to consciousness [5,6] but it can also be manifested without awareness [7] and has been used more generally to describe biological autonomy [8]. Although the topic of information integration has received interest from different communities in recent years, we are still lacking a full understanding of the principles that underlie this fundamental process: how integrative forces are deployed temporally or spatially, how they cope with the surrounding environment, or how they scale with the size of the system.
Different approaches have proposed ways to formalize this idea; one of the most popular has been developed as a measure connected to consciousness under the name of integrated information theory (IIT, [6]). In its latest versions, IIT is based on interventionist notions of causality to characterize the causal influences between the components of a system [6,8]. That is, instead of assessing whether a system is unified into a coherent whole by analysing its behaviour in regular conditions, IIT proposes that the forces integrating the behaviour of the system are better captured by observing its behaviour under perturbations.
IIT postulates that any subset of elements of the system is a mechanism [9] integrating information if its intrinsic cause-effect power (i.e., its ability to determine past and future states) is irreducible. Irreducibility is measured in terms of integrated information ϕ, which when larger than 0 indicates that the subset of elements at its current state constrains the past and future states of the system in a way that cannot be decomposed into two or more independent cause-effect sets of relations. That is, ϕ captures the level of irreducibility of the system, understood in the sense that even the least disrupting bipartition of the system into two disconnected halves (this is called the minimum information partition, MIP) would imply a loss of information in the causal power of the system. Aside from computing integrated information at the level of mechanisms, IIT postulates a composite measure Φ, which is computed from the set of all mechanisms (each one defined by a value of ϕ) computed in the original system and the system under bidirectional partitions. A system with Φ > 0 is described as forming an irreducible unitary whole. Since many subsets of the system may present Φ > 0, the boundaries of the system are defined around the subset with the largest Φ. A detailed description of IIT measures is provided in Appendix A.
Nevertheless, current formulations of IIT present some limitations for studying brain organization. We propose that, in order to extend current uses of IIT to capture some important aspects of neural organization, we should re-examine some of the main assumptions behind its conception:
• Scalability. A system can present different levels of integration at different spatial and temporal scales [10,11] and, in general, it is not well understood how integration behaves at different scales. However, analyses of the properties of brain-inspired statistical mechanical models have unveiled how many processes in neural systems take the form of phase transitions occurring in the thermodynamic limit, showing properties that diverge as the size of the system scales up. Here we apply models from statistical mechanics to describe integration in terms of the tendencies of the system near the thermodynamic limit.
• Temporal deployment. The latest formulations of IIT [6] attempt to capture the dynamical nature of neural systems by focusing on the dynamics of causal processes, not taking the stationarity or ergodicity of the system as initial assumptions. Nevertheless, IIT is only measured at a single scale of temporal activity, since it analyses integration in the causal power of a mechanism from one time step to the next. We propose a modification of ϕ to study integration along different temporal spans, showing that systems at critical points must be evaluated for very long timescales.
• Non-decomposability. As we mentioned, empirical evidence points to the non-decomposability of cognitive processes. In its current formulation, IIT considers elements outside the system under analysis as independent sources of noise. Here, we propose instead that the level of integration of a system must be evaluated in the context of the other systems it is coupled to (therefore not assuming that elements in the environment are just sources of statistical noise). This modification allows us to correctly determine the boundary between a system and its environment in the thermodynamic limit.
Some of the assumptions and modifications pointed out here are explained later in the text, and a detailed account and comparison between IIT and our measure of integrated information can be found in Appendix B.
Part of the reason why some of the aspects above have not yet been addressed is that, due to its computational complexity, the application of current IIT measures is limited to very small systems and short timescales. In general, IIT has been tested in small toy models (e.g., [6,12], although some alternative formulations try to circumvent this problem, see [13,14]). In contrast, our approach, apart from the modifications proposed above, introduces some simplifications and approximations in order to measure integrated information as a system scales to very large sizes. Specifically, we introduce a simple kinetic Ising model of infinite size and quasi-homogeneous connectivity, which presents an exact mean field solution that we use to simplify the calculation of integrated information ϕ of the mechanisms of a system.
We proceed as follows. First, we introduce the kinetic Ising model and a mean field approximation for solving it. Then, we introduce a measure of integrated information and how it can be computed for Ising models of infinite size. Finally, we present the results of our method in three scenarios of increasing complexity for depicting how integrated information can be used to characterize an integrated system interacting with an environment:
• In the first scenario, we illustrate the measure in a simple homogeneous model. In the thermodynamic limit, we can describe integrated information as the susceptibility of the system to changes in the direction of the minimum information partition (MIP). Consequently, integrated information diverges when the system is near a critical point.
• The second scenario depicts a system coupled to an external environment, showing that the system and the system-environment compound both show integrated information diverging near a shared critical point. Nevertheless, depending on the coupling strength, the system and system-environment mechanisms present different speeds of divergence. This allows us to delimit the dominant dynamical unit where integration takes place.
• Finally, we tune the parameters of a system with internal self-regulation in order to present high integration when interacting with a variety of environments. The system's internal inhibitory interactions generate a critical surface in the direction of the MIP which describes the viable region in which its integration is maintained.
The results presented here represent a first attempt at using integrated information theory to delimit the boundaries of a family of infinite size systems that can be formally solved. The interest of the study is twofold. First, it allows us to check some of the assumptions of IIT and propose some modifications to maintain its consistency in the thermodynamic limit, and to propose a way to adapt IIT measures for very large systems. Second, although the results presented are obtained from relatively simple cases, they offer an opportunity to speculate about how the causal integrative forces of a system (both its internal cohesion and the coupling with its environment) might scale up when a system approaches the thermodynamic limit. This provides an opportunity to address unanswered questions about integrated organization of biological and cognitive systems.

II. MODEL
We start by describing a general model defining causal temporal interactions between variables. Looking for generality, we use the least structured statistical model (i.e., a maximum caliber model [15]) defining causal correlations between pairs of units from one time step to the next. We study a kinetic Ising model where N binary variables (Ising spins) s_i evolve in discrete time, with synchronous parallel dynamics (Fig 1.A). Given the configuration of spins at the previous step, s(t − 1) = {s_1(t − 1), . . ., s_N(t − 1)}, the spins s_i(t) are independent random variables drawn from the distribution:
$$P(s_i(t) \mid \mathbf{s}(t-1)) = \frac{e^{\beta s_i(t) h_i(t)}}{2\cosh(\beta h_i(t))}, \tag{1}$$
where
$$h_i(t) = H_i + \sum_j J_{ij}\, s_j(t-1). \tag{2}$$
The parameters H_i and J_ij represent the local fields at each spin and the couplings between pairs of spins, and β is the inverse temperature of the model. Without loss of generality, we assume β = 1.
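As an illustration of the synchronous parallel dynamics described above, the following minimal sketch performs one update of a small kinetic Ising model. It assumes the standard kinetic Ising (Glauber) form, in which each spin takes the value +1 with probability (1 + tanh(β h_i))/2; the function name `kinetic_ising_step` and the list-based data layout are hypothetical choices made for this example only.

```python
import math
import random


def kinetic_ising_step(s, H, J, beta=1.0, rng=random):
    """Return the next spin configuration under synchronous parallel dynamics.

    s    : list of spins (+1 or -1) at time t - 1
    H    : list of local fields H_i
    J    : square list-of-lists of couplings J[i][j]
    beta : inverse temperature (the text assumes beta = 1)
    rng  : source of randomness exposing .random() (assumption of this sketch)
    """
    n = len(s)
    new_s = []
    for i in range(n):
        # Effective field on spin i, computed from the previous configuration only.
        h_i = H[i] + sum(J[i][j] * s[j] for j in range(n))
        # P(s_i = +1) = exp(beta*h) / (2*cosh(beta*h)) = (1 + tanh(beta*h)) / 2
        p_up = 0.5 * (1.0 + math.tanh(beta * h_i))
        new_s.append(1 if rng.random() < p_up else -1)
    return new_s


if __name__ == "__main__":
    rng = random.Random(1)
    n = 5
    J = [[1.0 / n] * n for _ in range(n)]  # homogeneous couplings, illustrative
    s = [rng.choice([-1, 1]) for _ in range(n)]
    for _ in range(3):
        s = kinetic_ising_step(s, [0.0] * n, J, rng=rng)
    print(s)
```

All spins are updated from the same previous configuration, which is what makes the update synchronous; an asynchronous scheme would instead overwrite `s` in place.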

A. Mean field kinetic Ising model
We focus on the particular case of a system of infinite size where H_i = 0. The system is divided into different regions (from 1 to 3 depending on the example), and the coupling values J_ij are positive and homogeneous for each intra- or inter-region connection, J_ij = (1/N_R) J_SR, where R and S are regions of the system with sizes N_R, N_S and i ∈ S, j ∈ R.
For a system of infinite size (with all regions also of infinite size), a mean field approximation allows us to calculate the field of all units i belonging to the region S as:
$$h_i(t) = \sum_R J_{SR}\, m_R(t-1), \tag{3}$$
where m_R(t − 1) is the mean field of region R at time t − 1. Now we can exactly define the update of the mean field variables using Eq 1 as:
$$m_S(t) = \tanh\Big(\sum_R J_{SR}\, m_R(t-1)\Big). \tag{4}$$
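The deterministic mean field update can be sketched in a few lines. The example below assumes the update takes the form m_S(t) = tanh(Σ_R J_SR m_R(t − 1)), with couplings normalized by region size; the dictionary-based representation and the names `mean_field_step` and `iterate_mean_field` are hypothetical.

```python
import math


def mean_field_step(m, J):
    """Apply one mean field update to every region.

    m : dict mapping region name -> mean field at the previous step
    J : dict mapping (S, R) -> coupling J_SR from region R onto region S;
        missing pairs are treated as unconnected (assumption of this sketch)
    """
    return {
        S: math.tanh(sum(J.get((S, R), 0.0) * m[R] for R in m))
        for S in m
    }


def iterate_mean_field(m0, J, steps):
    """Iterate the update for a fixed number of steps from the state m0."""
    m = dict(m0)
    for _ in range(steps):
        m = mean_field_step(m, J)
    return m


if __name__ == "__main__":
    # Single homogeneous region: below J = 1 the mean field decays to zero,
    # above it settles on a non-zero solution of m = tanh(J m).
    for coupling in (0.5, 2.0):
        print(coupling, iterate_mean_field({"R": 0.5}, {("R", "R"): coupling}, 200))
```

Because the regions are infinite, no sampling is needed: the whole trajectory of the mean fields follows from the initial values.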

B. Integrated Information ϕ
We use a simplified version of the integrated effect information described by IIT [6], implementing some modifications to measure the scaling of integrated information in the thermodynamic limit. In IIT, both causes and effects of a state are taken into account. For simplicity, we consider only the effects of a particular state. Also, although IIT is defined only for the immediate effects after one update of the state of the system, we define integrated information ϕ(τ) for an arbitrary number of updates of the system. See Appendix B for a list of the differences between IIT and the measure employed here.
Given an initial state s(τ_0), we define a 'mechanism' M (following IIT's nomenclature) as a subset of units {s_i(τ_0)}_{i∈M}. The integrated information of mechanism M, ϕ_M, is defined as the distance between the behaviour of the original system and that of a system in which a partition (from the set of possible bipartitions) is applied over the units in M. When a partition is applied, the input coming from the partitioned connections of the system is replaced by random unconstrained noise (binary white noise in the case of an Ising model).
Once the partition is applied, the probability of the state s(τ_0 + τ) is computed after τ updates, injecting noise at the partitioned elements during each update. Then, integrated information is defined as the distance D between the conditional probability distributions at t + τ:
$$\phi^{cut}_M(\tau) = D\big(p(\mathbf{s}(\tau_0+\tau) \mid \mathbf{s}(\tau_0)),\; p^{cut}(\mathbf{s}(\tau_0+\tau) \mid \mathbf{s}(\tau_0))\big),$$
where D(p_1, p_2) refers to the Wasserstein distance (also known as earth mover's distance) used by IIT to quantify the statistical distance between probability distributions. Here cut specifies the partition applied over the elements of the mechanism: it designates the blocks of a bipartition of the mechanism at the current state {s_i(t)}_{i∈M}, while S^f_1, S^f_2 refer to the blocks of a bipartition (not necessarily the same) of the updated state of the units. Specifically, IIT computes integrated information as the value of ϕ^cut under the minimum information partition (MIP), which is the partition of the mechanism that makes the least difference with respect to the original, unpartitioned system (i.e., ϕ^MIP_M(τ) = min_cut ϕ^cut_M(τ)). We use ϕ_M(τ) to denote the minimum information partition integrated information ϕ^MIP_M(τ).
Note that some important modifications have been made. The most important one is that IIT considers the elements outside of the mechanism as unconstrained sources of noise. As we show in Figure B2, this can radically change the results of integrated information theory, provoking spurious divergences at points other than the critical point. To preserve the consistency of our results, we let elements outside the mechanism operate normally (see Appendix B B3 for details).

C. Integrated information in the mean field model
We now show how integrated information can be computed for the mean field approximation of the Ising model. Thanks to the mean field approximation we can simplify the calculation of the probability distributions of trajectories p(s(τ_0 + τ)|s(τ_0)), p^cut(s(τ_0 + τ)|s(τ_0)) to a Markovian distribution dependent on the mean field at the previous step.
In general, p(s(τ_0 + τ)|s(τ_0)) can be computed recursively applying the equation:
$$p(\mathbf{s}(\tau_0+\tau) \mid \mathbf{s}(\tau_0)) = \sum_{\mathbf{s}(\tau_0+\tau-1)} p(\mathbf{s}(\tau_0+\tau) \mid \mathbf{s}(\tau_0+\tau-1))\, p(\mathbf{s}(\tau_0+\tau-1) \mid \mathbf{s}(\tau_0)).$$
In the kinetic Ising model of infinite size, the mean fields of the system's regions are deterministic, and instead of computing all possible paths of the system we can just determine the evolution of the mean field using Equation 4. Moreover, knowing the mean field of each region we can calculate the value of the effective fields h(τ_0 + τ) received by each unit using Equation 3. Also, given the mean field value at a specific point, the posterior probability distribution of each unit is independent. Thus, using the value of h(τ_0 + τ) computed evolving from s(τ_0) we can just take:
$$p(\mathbf{s}(\tau_0+\tau) \mid \mathbf{s}(\tau_0)) = \prod_i \frac{e^{s_i(\tau_0+\tau)\, h_i(\tau_0+\tau)}}{2\cosh(h_i(\tau_0+\tau))}.$$
In this context, the calculation of the Wasserstein distance D is drastically simplified, and we can compute ϕ as the sum of distances between independent binary variables, which is equivalent to computing the difference of their mean values:
Once we can calculate ϕ, we still have the problem of finding the MIP of the system. Luckily, since the connectivity of the system is homogeneous for all nodes in the same region, finding the MIP is equivalent to finding the partition that cuts the lowest number of connections. For infinite size systems where inter-region connections are not zero, the MIP will be one of the possible partitions that isolate just one node of the system. Also, the partition that isolates a single unit at time t always has a smaller value of ϕ than the partition isolating a node at time t + 1, since partitioning the posterior distribution corresponds to a larger difference between m_R(τ_0 + τ) and m^cut_R(τ_0 + τ). Thus, finding the MIP corresponds to finding which region R of the system least affects future states when one node of the region is isolated in the partition at time t (e.g., Fig 1.B).
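The statement that the distance between independent binary variables reduces to a difference of mean values can be checked with a short sketch. It assumes a ground distance |s − s'| between the spin values −1 and +1 (under a Hamming ground distance the result would be halved), and the helper names `wasserstein_1d` and `spin_distribution` are hypothetical.

```python
def wasserstein_1d(support, p, q):
    """First-order Wasserstein distance between two discrete distributions.

    support : sorted list of points shared by both distributions
    p, q    : probabilities assigned to each support point
    Uses the one-dimensional identity W1 = sum_k |F_p(x_k) - F_q(x_k)| * (x_{k+1} - x_k).
    """
    total = 0.0
    cdf_p = 0.0
    cdf_q = 0.0
    for k in range(len(support) - 1):
        cdf_p += p[k]
        cdf_q += q[k]
        total += abs(cdf_p - cdf_q) * (support[k + 1] - support[k])
    return total


def spin_distribution(m):
    """Probabilities of s = -1 and s = +1 for a spin with mean value m."""
    return [(1.0 - m) / 2.0, (1.0 + m) / 2.0]


if __name__ == "__main__":
    m_original, m_cut = 0.3, -0.5  # illustrative mean values only
    w = wasserstein_1d([-1, 1], spin_distribution(m_original), spin_distribution(m_cut))
    print(w, abs(m_original - m_cut))
```

For a single spin, moving probability mass |p − q| across a distance of 2 costs 2|p − q|, which equals the absolute difference of the two means; summing over independent units gives the simplification used in the text.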
Finally, we define a function F_R(m(τ_0), τ, {J_S,R}) that recursively applies the update rule in Eq 4 for τ steps starting from an initial mean field value m(τ_0), such that m_R(τ_0 + τ) = F_R(m(τ_0), τ, J). In our mean field approximation, applying the MIP to the quasi-homogeneous system described here is equivalent to just removing one connection [16] between one or more pairs of regions {S, R}_cut, whereas the connections between the rest of regions {S, R}_uncut remain intact. Therefore the update rule applied by function F to the partitioned system is:
Assuming that the number of units per region is equal to N_R = r_R N, with Σ_R r_R = 1, we get a simplified expression for the partitioned and unpartitioned terms:
where m_0 = m(τ_0) and x = 1/N in the partitioned case and x = 0 otherwise. Now, computing the unpartitioned and partitioned cases is equivalent to calculating F_{R_cut}(m_0, τ, 0) and F_{R_cut}(m_0, τ, 1/N) respectively. Given this, assuming N → ∞ we calculate the final form of ϕ as a sum of the derivatives of function F_{R_cut}(m_0, τ, x):
Note that this defines integrated information in similar terms to the magnetic susceptibility typically used in Ising models to identify critical points, although in this case the mean field of the system is differentiated along the parametrical direction of the MIP.

III. RESULTS

A. Integrated information in a homogeneous kinetic Ising model
As an example, we compute numerically the value of ϕ_{M_N}(τ) for a homogeneous kinetic Ising model containing just one region (as in Fig 1.A). The system only has one parameter J describing all connections in the system.
For different values of J, we compute ϕ for the system starting from a state in the stationary solution. To do so, we need to know how to compute F_cut(m_0, τ, x), that is, how to compute the mean field of units at a particular time.
First, we numerically compute F_cut(m_0, τ, x) and ϕ_{M_N} for different values of J for the largest mechanism M_N of size N, for different values of τ, and with m(τ_0) equal to the value at the stationary solution of the system. We estimate the values of the derivative as F′_cut(m_0, τ, 0) = (F_cut(m_0, τ, dx) − F_cut(m_0, τ, 0))/dx, using a value dx = 10^−10. As we observe in Fig 2.B, the value of ϕ_{M_N}(τ) appears to diverge as τ grows [17].
Similarly, we numerically compute ϕ_{M_N}(τ → ∞) by using the mean field of the model, iterating the equation m(t) = tanh(Jm(t − 1)) until the difference in the update is smaller than 10^−15. In Fig 2.C we observe how ϕ_{M_N}(τ → ∞) shows an apparent divergence around J = 1. Also, we compute the value of ϕ_{M_M}(τ → ∞) for different mechanisms of size M as a fraction of N. As shown in Fig 2.D, the resulting value of integrated information still diverges but is smaller than the value of ϕ_{M_N}(τ) of the whole system, indicating that the system is irreducible. We can go beyond numerical computations and calculate the analytic value of ϕ_{M_N}(τ → ∞) near the point of divergence by approximating the values of F_cut(m_0, τ → ∞, 0) around J = 1 as the value of m that solves m = tanh(Jm). Note that, more generally, we can compute F_cut(m_0, τ → ∞, x) just by substituting J ← J(1 − x).
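The numerical procedure for the stationary case can be sketched as follows. The code iterates m(t) = tanh(J(1 − x) m(t − 1)) until the update changes by less than 10^−15 and then estimates the derivative with respect to x by a forward difference with dx = 10^−10, as described above. It computes only this derivative, not ϕ itself, since the prefactors relating the two are not reproduced here; the function names, the initial value 0.5, and the implicit-differentiation cross-check are assumptions of this sketch.

```python
import math


def stationary_m(J, x=0.0, m_init=0.5, tol=1e-15, max_iter=1_000_000):
    """Fixed point of m = tanh(J * (1 - x) * m), found by direct iteration."""
    j_eff = J * (1.0 - x)  # substitution J <- J(1 - x) for the partitioned case
    m = m_init
    for _ in range(max_iter):
        m_new = math.tanh(j_eff * m)
        if abs(m_new - m) < tol:
            return m_new
        m = m_new
    return m  # not converged within max_iter; returned as a best effort


def dF_dx_finite_difference(J, dx=1e-10):
    """Forward-difference estimate of dF/dx at x = 0 in the limit tau -> infinity."""
    return (stationary_m(J, dx) - stationary_m(J, 0.0)) / dx


def dF_dx_implicit(J):
    """Cross-check obtained by differentiating m = tanh(J(1 - x)m) implicitly."""
    m = stationary_m(J)
    slope = 1.0 - m * m
    return -J * m * slope / (1.0 - J * slope)


if __name__ == "__main__":
    for J in (1.5, 1.1, 1.01):
        print(J, dF_dx_finite_difference(J), dF_dx_implicit(J))
```

The magnitude of the derivative grows as J approaches 1 from above, which is the behaviour the text reports for the apparent divergence around J = 1. Convergence of the iteration also slows down near that point, so the iteration cap matters there.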
The system has a trivial solution at m = 0. Also, for J > 1 the solution at m = 0 becomes unstable and a pair of solutions appears in a pitchfork bifurcation (Fig 2.A). Although there is no analytic solution of the problem, we can compute the value of m near J = 1 by approximating the hyperbolic tangent by the first two terms of its Taylor series, finding that in the limit J → 1+ we approximate: Thus, we can confirm that the value of integrated information ϕ_{M_N}(τ → ∞) diverges when J → 1+. This has interesting implications. If a system must maintain a growing level of integration as its size increases, it needs to be poised near a critical point that shows a divergence of the values of ϕ.
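The expansion described above can be sketched as follows. This working uses only the stationary condition m = tanh(Jm) and the substitution J ← J(1 − x); it is not necessarily the expression given by the authors, and the prefactors relating the derivative to ϕ are not reproduced.

```latex
m = \tanh(Jm) \approx Jm - \frac{(Jm)^3}{3}
\quad\Longrightarrow\quad
m^2 \approx \frac{3(J-1)}{J^3}
\quad\Longrightarrow\quad
m \approx \pm\sqrt{3(J-1)} \qquad (J \to 1^+),

\left.\frac{\partial m}{\partial x}\right|_{x=0}
= -J\,\frac{\partial m}{\partial J}
\approx \mp\frac{\sqrt{3}}{2}\,(J-1)^{-1/2}.
```

The derivative along the direction of the partition therefore grows as (J − 1)^{−1/2} when J → 1+, consistent with the divergence stated in the text.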

B. Integrated information for measuring agent-environment asymmetries
We apply the proposed measure of integrated information to the problem of determining the boundaries of an agent interacting with an environment. One of the central aspects of agency is the existence of agent-environment asymmetries [18], in which the part of the system corresponding to the agent is able (to an extent) to define the terms in which it relates to the surrounding milieu. We test our measure in two simple cases of systems presenting asymmetries in their interaction.
We model a minimal case of agent-environment bidirectional interaction with two regions, where only the region corresponding to the 'agent' has the capacity to self-regulate through recurrent connections (Fig 3.A). In this case, we have two regions A and E, with only A presenting self-connections. The mean field of the system is updated as: For simplicity, we study the case where agent-environment connections are symmetric, J_AE = J_EA = J_c, and J_AA = J_r. We numerically compute that the system has a solution similar to that of the previous case, presenting a pitchfork bifurcation at a critical point (Fig 3.B,D). Moreover, we compute the value of ϕ_M(τ → ∞) for different mechanisms. For the case of the mechanism covering the whole system, M = AE, we look for the MIP of the system by isolating single units of the mechanism at s(t) (Fig 1.B). If we isolate a unit from region A, two connections are cut (one with value J_r and one with value J_c). Otherwise, if we isolate a unit from region E, only one connection with value J_c is cut. Thus, this second partition is always the MIP of the system (MIP_AE). For M = A, the only candidate for the MIP is isolating one node from A, therefore cutting one connection with value J_r (MIP_A). Finally, for mechanism E there are no connections within the mechanism and we can directly conclude that ϕ_E = 0.
Now, the question is: can we consider A as an individual system or should we consider instead the coupled system AE as an integrated unit? Assuming r_A = r_E = 0.5, we define the values of integrated information as: In Fig 3.C,E we estimate the value of ϕ_A, ϕ_AE for τ → ∞, an initial value m_0 corresponding to the stationary solution of the system, and values of J_c = 0.8 (left) and J_c = 1.2 (right). We observe that in all cases the values of ϕ_A, ϕ_AE diverge close to the critical point. Nevertheless, in the first case, when agent-environment connections are weaker, ϕ_A > ϕ_AE close to the critical point. In contrast, for stronger couplings between agent and environment, ϕ_A < ϕ_AE in the vicinity of the critical point.
We validate these results by solving Eq 12 near criticality. We do this by transforming it into a system of one equation, m_A = tanh((1/2)(J_AA m_A + J_AE tanh(J_EA m_A))), and finding its Taylor series near m_A = 0. We obtain that near the critical point: and F_E(m_0, τ → ∞, x) are easily calculated by adding a (1 − x) factor to the partitioned connections. Thus, we find that the critical point is located where J_AA + J_AE J_EA = 2 (Fig 3.F). From here, we get: Near the critical point at (J_AA + J_AE J_EA) → 2+, the values of integrated information are approximated by the expressions: By defining K_A = J_AA K and K_AE = J_AE J_EA K we describe with these variables the level of integrated information of the agent and the whole agent-environment system near the critical point. In Fig 3.G we observe that there is a transition from the agent being the system with highest integration to the agent-environment system taking that role. This illustrates that, near a critical point, the value of integrated information scales up indefinitely in an agent-environment system. In the case of symmetric interaction, the agent can be identified as the predominant integrated unit only in some cases, while in others the agent-environment system is the predominant unit.
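The location of the critical point can be checked numerically from the one-equation form quoted above, m_A = tanh((1/2)(J_AA m_A + J_AE tanh(J_EA m_A))). The sketch below iterates that equation as a map for symmetric couplings J_AE = J_EA = J_c and J_AA = J_r; treating the stationary equation as an iterated map, the initial value 0.5, and the function names are assumptions made for illustration.

```python
import math


def stationary_m_agent(J_r, J_c, m_init=0.5, steps=20000):
    """Iterate m_A = tanh(0.5 * (J_r * m_A + J_c * tanh(J_c * m_A)))."""
    m = m_init
    for _ in range(steps):
        m = math.tanh(0.5 * (J_r * m + J_c * math.tanh(J_c * m)))
    return m


def critical_J_r(J_c):
    """Self-coupling at which J_AA + J_AE * J_EA = 2 for symmetric coupling J_c."""
    return 2.0 - J_c ** 2


if __name__ == "__main__":
    for J_c in (0.8, 1.2):
        J_crit = critical_J_r(J_c)
        below = stationary_m_agent(J_crit - 0.3, J_c)
        above = stationary_m_agent(J_crit + 0.3, J_c)
        print(J_c, J_crit, below, above)
```

Linearizing the map around m_A = 0 gives a slope of (J_r + J_c^2)/2, so the zero solution loses stability exactly when J_r + J_c^2 exceeds 2, which matches the condition stated in the text.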

C. Adaptive integrated information facing environmental diversity
We have just used integrated information for delimiting an agent interacting with a static environment. The environment was 'passive' in the sense that it showed no self-interaction. This is not a common scenario, since typically environments change and display their own dynamics. A key aspect of agency is the ability of an agent to sometimes modulate the coupling with its environment to preserve its individuality [18], generating an interactional asymmetry between agent and environment. Thus, a basic feature of living and cognitive systems is to display adaptive mechanisms regulating their coupling to the environment to maintain their level of functional integration for a range of external environments.
In order to characterize a scenario that is more realistic in this sense, we model an agent with two internal regions A and B, interacting with an environment E with recurrent connections (Fig 4.A). A and B present feedback loops that we fit in order to maintain integration for a range of environmental parametric configurations. The evolution of the system is described by:
$$\mathbf{m}(t+1) = \tanh(\mathbf{J}\,\mathbf{m}(t)), \tag{16}$$
where m and J describe in vector and matrix notation the mean fields and couplings of the three regions A, B and E. We assume that the environment is defined by two parameters: the agent-environment couplings J_AE = J_BE = J_EA = J_EB = J_c and the environmental self-coupling J_EE = 1. Values of J_AA, J_AB, J_BA, J_BB will be tuned to maximize integration. We also assume r_A = r_B = r_E = 1/3. In particular, the system will be tuned to maximize the integrated information of the agent AB, ϕ_AB, while facing 5 different environments defined by values of J_c from the set {0.8, 0.9, 1.0, 1.1, 1.2}. We calculate ϕ for different parameters as in previous cases, testing the possible candidates for the MIP (in the case of ϕ_AB, the MIP candidates are isolating one node either from A or B) and choosing the one that minimizes integrated information.
In order to find the parameter values that maximize ϕ_AB for the set of environments, we first run a microbial genetic algorithm [19] and then (using the parameters of the agent with the highest fitness) a Nelder-Mead algorithm [20] to adjust the results. For both algorithms, the fitness function is defined as the value of ϕ_AB(τ), with some exceptions. To reduce the computational cost, the value of τ is 10^4 for the genetic algorithm and 10^5 for the Nelder-Mead algorithm. In order to avoid the case where A and B are independent integrated units, fitness is set to zero when ϕ_A or ϕ_B is larger than ϕ_AB. Likewise, fitness is set to zero when ϕ_AB does not converge to a stationary value.
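The fitness gating and the first optimization stage can be sketched as below. The gating function follows the two exceptions stated above. The tournament loop is a generic microbial genetic algorithm (two random individuals compete and the loser receives genes from the winner plus mutation); the population size, recombination rate, mutation scale, and the toy fitness used in the demonstration are assumptions of this sketch and are not the settings used in the study. The Nelder-Mead refinement stage is not shown.

```python
import random


def gated_fitness(phi_AB, phi_A, phi_B, converged):
    """Fitness equals phi_AB, except in the two cases where it is set to zero."""
    if not converged or phi_A > phi_AB or phi_B > phi_AB:
        return 0.0
    return phi_AB


def microbial_ga(fitness, n_genes, pop_size=20, tournaments=2000,
                 rec_rate=0.5, mut_std=0.05, rng=None):
    """Generic microbial GA; returns the best genotype and its fitness."""
    rng = rng or random.Random(0)
    pop = [[rng.uniform(-1.0, 1.0) for _ in range(n_genes)]
           for _ in range(pop_size)]
    fit = [fitness(genes) for genes in pop]
    for _ in range(tournaments):
        a, b = rng.sample(range(pop_size), 2)
        win, lose = (a, b) if fit[a] >= fit[b] else (b, a)
        for k in range(n_genes):
            # The loser copies each gene from the winner with probability rec_rate...
            if rng.random() < rec_rate:
                pop[lose][k] = pop[win][k]
            # ...and every gene of the loser is then perturbed by Gaussian noise.
            pop[lose][k] += rng.gauss(0.0, mut_std)
        fit[lose] = fitness(pop[lose])
    best = max(range(pop_size), key=fit.__getitem__)
    return pop[best], fit[best]


def toy_fitness(genes):
    """Stand-in objective for the demonstration only (not phi_AB)."""
    return -sum((g - 0.3) ** 2 for g in genes)


if __name__ == "__main__":
    # Four genes, standing in for J_AA, J_AB, J_BA and J_BB.
    best_genes, best_fit = microbial_ga(toy_fitness, n_genes=4)
    print(best_genes, best_fit)
```

Because only the loser of each tournament is modified, the best fitness in the population never decreases, which makes this scheme a convenient first stage before a local refinement.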
After running the genetic and Nelder-Mead algorithms, we obtain an agent with parameters J_AA = 0.09973671, J_AB = −0.85774749, J_BA = −0.8995672 and J_BB = 0.14326043. This agent presents negative weights connecting A and B and positive self-coupling values. Thus, each region will inhibit the behaviour of the other while reinforcing itself, therefore regulating its activity to maintain high integrated information for the presented environments.
After tuning the parameters of the system, we evaluate its behaviour for different environments. The value of ϕ_AB diverges, but not only for one specific environment as a result of fine-tuning its self-couplings, as in the previous case. Instead, the divergence is maintained for an approximate range of J_c of [−1.21, 1.21]. Moreover, this divergence is also maintained if we modify the value of J_EE, displaying a surface in which the value of ϕ(τ) diverges (Fig 4.D). This means that the points of divergence from previous examples are transformed here into a critical surface that maintains integration of the system for a wide range of environmental parameters. That is, the agent is able to self-regulate to some extent to maintain its integration, and thus its viability as an agent.

IV. DISCUSSION
We have proposed a simplified version of the IIT measure ϕ which, together with mean field approximations in a kinetic Ising model, allows us to capture for the first time integrated information in very large systems, up to the thermodynamic limit. Using this method we are able to compute ϕ for infinite size mean field kinetic Ising models with quasi-homogeneous infinite-range connectivity.
Our models, although highly idealized, allow us to speculate about some of the properties of integrated neural organization. First, we observe that, despite the infinite size of the models, the amount of integrated information is bounded for most of its parameter space. Only near critical points does the level of total integrated information diverge, suggesting that integrated entities need to organize themselves close to critical points in their parameter space to maintain their level of integration as their size grows. This suggests that it may be of greater interest to describe brain organization in terms of diverging tendencies of IIT in different modules rather than in terms of the specific values of ϕ in finite systems. Furthermore, we have shown how integrated information can be used to define the boundaries between a system and its environment by comparing the divergent tendencies of their joint and individual integration. To do so, some of the assumptions of current formulations of IIT had to be modified. Our tests show that integrated information cannot, in principle, be measured in a brain independently of its environment (bodily and extra-bodily), nor by assuming that the environment is an independent source of noise. Moreover, our results show that near critical points, in some cases, both the system and the system-environment integrated information diverge. Nevertheless, we have shown how to characterize the dominant dynamical unit by comparing the difference in the diverging tendencies between the two configurations.
Our results connect the emergence of boundaries of integration with phenomena related to criticality. Systems near critical points are maximally sensitive to changes in some directions of their parameter space (generally measured as the susceptibility of the system to changes in this parametrical direction). Here, we capture integrated information measures by applying different partitions to the system which are interpreted as changes in particular directions of the parameter space. Thus, the level of integrated information corresponds to the susceptibility of the system for the minimum information partition, i.e., the partition with the least significant effect on the system's causal powers. In the framework of IIT, systems highly sensitive to their minimum information partition are interpreted as maximally irreducible units.
This could allow further simplifications in order to measure integrated information in complex models or even empirical setups. By testing the behaviour of a system when perturbations in its components are introduced (i.e., noise injected in partitioned connections), the integrated information of a mechanism can be described as the minimal susceptibility to the set of perturbations from different partitions. The connection between information integration and critical susceptibility allows us to speculate about the link between integration and properties that have been postulated as pervasive in living beings, such as self-organized criticality [21].
By interpreting integrated information in terms of susceptibilities in the parametrical direction of partitions of the system, we can think of integration as the sensitivity of a system to the decoupling of the modules composing it. In our last example, we show how internal regulation results in the capacity for maintaining this susceptibility for a range of different situations. We hypothesize that this can be achieved by dynamics similar to those of systems showing self-organized criticality, which are attracted to critical points of maximum susceptibility. This could be achieved in systems capable of self-organizing near points where they maintain maximal sensitivity to the integrity of their internal organization while they interact with changing environments (e.g., maintaining internal invariances near critical surfaces [22]).

V. CONCLUSION
The core ideas that IIT intends to capture apply to a variety of poorly understood questions in biological and cognitive systems. By introducing some modifications to take into account different temporal spans and influences from the environment, and by studying the behaviour of integration measures in the thermodynamic limit, we have shown the existence of critical points that maximise the integration of a system such as an organism or a cognitive agent. The fact that our case studies remain general and abstract (we do not specify any detail about the neural, sensorimotor, and environmental processes involved) suggests that robust individuation and susceptibility towards loss of integration are inherent consequences of maximising a tendency towards integration, and so they are likely to be observable trends in all systems that are able to do so.
A limiting assumption in our approach is the homogeneity of the elements within each region. Biological systems cannot be assumed to present such a degree of homogeneity, and the variability in their components and interactions has to be accounted for. Our framework, however, can take into account higher levels of heterogeneity by introducing a larger number of regions. In the case of three regions, we observe that tuning the parameters of the system extends the critical points of diverging integration into regions of the parameter space. We expect (but have not yet verified) that increasing the number of interacting regions will still result in critical regions of divergent integration. In brain network models, it has been found that structural heterogeneity can generate extended critical-like regions [23]; thus, we may also expect this phenomenon to be reinforced in the presence of higher heterogeneity in our models. Our results are also limited to models with stationary solutions, where we can evaluate the stable solution when the temporal span tends to infinity. This is not a limitation of the method, though. The results for more realistic systems presenting cyclic or chaotic dynamics could be harder to interpret, although they are in principle tractable within the framework presented here and could be explored in further work.
The models presented here allow a shift of focus toward the integrative tendencies of systems as they grow or evolve. This opens up the applicability of IIT to a range of questions about changes over developmental and evolutionary time. Even in the simple cases we have considered, the existence of critical points that maximise integration may be important for understanding apparent jumps in complexity, including the transitions at the origin of life [24] or cognitive developmental transitions [25].
By focusing on the divergent tendencies of integration measures, we are able to capture the asymmetry of agent-environment interactions. Thinking of interactions with the environment in these terms is fruitful for grounding notions such as the individuality or the autonomy of a system. Often, these concepts have been formalized in terms of self-determination and independence from an environment [26,27]. By contrast, our examples show how both the integration of a system and the integration between system and environment can diverge together, while the level of individuality of the system can be quantified by the relative divergence speed of both terms. This is a robust finding obtained under minimal assumptions and thus, we suggest, reflects a general trend in large complex systems. The key data of interest as systems scale up are not so much the absolute values of integrated information, but the relative divergent tendencies of system integration and system-environment integration.
In addition, by exploring different kinds of agent-environment configurations, we observe that agents assumed to maximise integration are likely to do so robustly for a range of environmental situations due to the existence of critical surfaces. The existence of these surfaces that guarantee maximal integration is coherent with postulates at the theoretical foundations of adaptive systems research, such as the existence of 'regions of viability' that guarantee the integrity of an agent [28,29]. While such conditions of viability have often been imposed by the designer or assumed to be given by evolutionary or material constraints, our approach allows us to think of them as critical regions emerging at the level of the integrative forces of the system. This illustrates how viability regions could scale up from material or pre-given constraints to regions defined by the increasing complexity of the integrated activity of a system.

B2. Purview
In IIT, the integrated information of a mechanism, $\phi^{\mathrm{MIP}}_M$, is evaluated not only for a particular mechanism $M$ but also for a purview $P$. Whereas the mechanism defines which units of the present state $\{s_i(t)\}_{i \in M}$ we take into account, the purview defines which units of the future state $\{s_i(t+\tau)\}_{i \in P}$ we take into account. Given these subsets of present and future states, partitions are computed over the joint space of $\{s_i(t)\}_{i \in M}$ and $\{s_i(t+\tau)\}_{i \in P}$, and the purview $P$ with maximum integrated information for its MIP is selected. Here, for simplicity, we apply the partition over $\{s_i(t)\}_{i \in M}$ and $\{s_i(t+\tau)\}_{i \in M}$, making the mechanism and purview coincide, and the distance used to compute integrated information is measured over all elements of the system, not only the elements contained in the purview.
Allowing more choices of purview could make a substantial difference in certain systems, although in the quasi-homogeneous systems tested in the paper the differences are small.

B3. Elements outside of a mechanism
More importantly, there are significant differences from the IIT framework in the way we treat the elements that are outside of the evaluated mechanism $M$. In IIT, elements outside the mechanism are assumed to be unconstrained (i.e., as random as possible). We decided to modify this assumption because it can have dramatic effects when measuring the behaviour of large systems. Specifically, assuming unconstrained elements outside the mechanism creates an artifact that shifts the critical point of the system (this will be detailed in future work).
As an example, consider a homogeneous Ising model with local fields $H_i = 0$ and couplings $J_{ij} = J$. As we have shown, computing the value of $\phi$ for the whole system using continuous noise injection at the partitioned connections yields a divergence around the critical point at $J = 1$. We now show how its internal mechanisms behave under different assumptions about the units outside of the mechanism.
First, we compute the values of $\phi_M$ for a mechanism covering a fraction $M/N$ of the system (since the system is homogeneous, any subset of that size behaves in the same way), assuming that the elements outside of the mechanism $M$ keep operating normally (Figure B2.A). In this case, we observe that the divergence of $\phi_M$ is maintained, although the value of $\phi_M$ decreases with the mechanism size.
In contrast, if we accept the IIT assumption and take the elements outside of the mechanism as independent sources of noise, the behaviour of $\phi_M$ changes radically. In this case, the divergence is maintained but takes place at a different value of the parameter $J$ (Figure B2.B). This happens because independent sources of noise have a zero mean field value, and thus the phase transition of the system takes place at larger values of $J$, which compensate for the units that now contribute a zero mean field. Thus, we think that considering the elements outside of the mechanism as independent sources of noise can be misleading about the operation of mechanisms that are embedded in large systems.
A less loaded assumption would be to keep the units outside of the mechanism fixed at the static values they had at time $t$, that is, to keep their mean field constant. Figure B2.C shows that this choice is also unsatisfactory, since for mechanism sizes smaller than $N$ the value of $\phi_M$ decreases very rapidly and is exactly zero at the critical point. We can understand this by noting that the effect of constant fields is equivalent to adding to each $H_i$ a value equal to the input from the frozen units, thereby breaking the symmetry of the system and precluding a phase transition.
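The shift of the critical point can be illustrated with a minimal numerical sketch. It assumes a stationary mean-field condition of the form m = tanh(J f m) for a homogeneous model with zero local fields, where f (a symbol introduced here for illustration) is the fraction of units that still contribute their mean field m and the remaining units contribute zero mean field. This form is an assumption for illustration; the function name and parameters are hypothetical and this is not the code used to produce Figure B2. Under this simplified reading, the transition moves from J = 1 to J = 1/f.

```python
import math

def stationary_m(J, f=1.0, m0=0.5, steps=20_000, tol=1e-12):
    """Fixed point of m <- tanh(J * f * m), an assumed mean-field update.

    f is the fraction of units that still contribute their mean field m;
    the remaining 1 - f are treated as zero-mean noise (illustrative assumption).
    """
    m = m0
    for _ in range(steps):
        m_next = math.tanh(J * f * m)
        if abs(m_next - m) < tol:
            return m_next
        m = m_next
    return m

# f = 1.0: all units operate normally; f = 0.5: half of the units are replaced by noise.
for f in (1.0, 0.5):
    print(f"fraction f = {f}: linearized transition at J = {1 / f:.1f}")
    for J in (0.8, 1.5, 2.5):
        print(f"  J = {J:.1f} -> m = {stationary_m(J, f):.4f}")
```

In this sketch, J = 1.5 gives a nonzero magnetization when all units operate normally but a vanishing one when half of the units are replaced by zero-mean noise, which is the kind of shift toward larger J described above.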
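The symmetry-breaking argument can be made explicit with a short calculation. The sketch below assumes the standard stationary mean-field condition for a homogeneous model and treats the input from the frozen units as a constant field $H_0$, a symbol introduced here for illustration. Equations 3 and 4 are not reproduced in this section, so the exact form is an assumption.

```latex
% Assumed stationary mean-field condition with a constant extra field H_0
\[ m = \tanh\left(H_0 + J m\right) \]

% Susceptibility to the field, obtained by implicit differentiation
\[ \chi = \frac{\partial m}{\partial H_0} = \frac{1 - m^2}{1 - J\,(1 - m^2)} \]

% H_0 = 0:     m = 0 is a solution for every J, and \chi = 1/(1 - J) diverges as J \to 1^-.
% H_0 \neq 0:  m \neq 0 for every J, and J(1 - m^2) < 1 on the stable branch
%              aligned with the field, so \chi remains finite.
```

Under these assumptions, any nonzero constant input removes the divergence of the susceptibility, in line with the absence of a phase transition described above.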

B4. Mean field approximation of partitioned systems
We simplify the calculation of the probabilities $p(\{s_i(t+\tau)\}_{i \in M} \mid \{s_i(t)\}_{i \in M})$ and $p^{\mathrm{cut}}(\{s_i(t+\tau)\}_{i \in M} \mid \{s_i(t)\}_{i \in M})$ by using the mean field approximation described by Equations 3 and 4.
When a system is partitioned to compute integrated information, cutting a connection injects uniform noise into the input node. In the mean field approximation, this is equivalent to injecting a zero-mean field signal, that is, to setting the affected connection weights to zero when computing $h_i(t)$.
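As a rough illustration of this equivalence, the sketch below computes mean-field inputs with and without a set of cut connections. It assumes the usual mean-field form h_i = H_i + sum_j J_ij m_j, which may differ in detail from Equations 3 and 4. The function name, the 4-unit example and the choice of which connections to cut are all hypothetical.

```python
def effective_fields(H, J, m, cut=frozenset()):
    """Mean-field input h_i = H_i + sum_j J[i][j] * m[j] (assumed form).

    `cut` holds (i, j) pairs meaning "the connection from unit j into unit i
    is partitioned"; such terms are skipped, i.e. their weight is set to zero.
    """
    n = len(H)
    return [
        H[i] + sum(J[i][j] * m[j] for j in range(n) if (i, j) not in cut)
        for i in range(n)
    ]

# Hypothetical homogeneous example: N units, couplings J0 / N, zero local fields.
N, J0 = 4, 1.2
H = [0.0] * N
J = [[J0 / N] * N for _ in range(N)]
m = [0.5] * N  # current mean activations

intact = effective_fields(H, J, m)

# Illustrative partition {0, 1} | {2, 3}: cut every connection that crosses it.
part_a = {0, 1}
cut = {(i, j) for i in range(N) for j in range(N) if (i in part_a) != (j in part_a)}
partitioned = effective_fields(H, J, m, cut)

print("intact:     ", intact)
print("partitioned:", partitioned)
```

Here the cut removes half of the inputs to each unit, so each mean-field input is halved. Which connections are actually cut depends on the partition scheme defined in the main text, which this sketch does not attempt to reproduce.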

B5. Integrated conceptual information
Finally, once $\phi$ is computed, IIT proposes a second level of calculations for computing integrated conceptual information $\Phi$, where new bidirectional partitions are applied to the system. In our case, given the homogeneity of the system, we do not compute conceptual information, since all the mechanisms composing each set have similar behaviour. Thus, for simplicity, we do not apply a second level of partitions.

Fig 1.B depicts an example of a partition.

FIG. 1. Kinetic Ising model. A: Description of the infinite size kinetic Ising model. B: Description of the partition schema used to define perturbations. Partitioned connections (black arrows) are injected with random noise.

FIG. 2. Homogeneous kinetic Ising model. A: Magnetization of the infinite size kinetic Ising model. B: Value of ϕM N (τ) for different temporal spans. C: Value of ϕM N (τ → ∞) for an infinite temporal span. D: Value of ϕM M (τ → ∞) for different mechanisms of size M and an infinite temporal span.

FIG. 3. Asymmetric interaction in a kinetic Ising model. A: Basic agent connected to an environment. B, C, D, E: Values of the mean fields (only positive values are shown) of the stable solution (top) and ϕ(τ → ∞) (bottom) for the agent and environment nodes of the model at stability for Jc = 0.8 (left) and Jc = 1.2 (right) and different values of Jr. F: Location of the critical point in the parameter space for different combinations of Jr, Jc. G: Constants multiplying ϕA(τ → ∞) and ϕAE(τ → ∞) near the critical point, showing which is the most irreducible unit of the system.
For the values of $J_c$ used during training, we find that the mean values of regions A and B, $m_A$ and $m_B$, display a transition similar to that of the previous examples (Fig 4.B shows the case of $J_c = 1$; other cases are similar). Moreover, we observe a divergence of the values of $\phi_{AB}$ for a range of values of $J_c$ (Fig 4.C). For larger values of $J_c$, the transition disappears and the values of $\phi_{AB}$ do not diverge. The example presented here displays an important qualitative change in comparison with the previous one.

FIG. 4. Adaptive integration in a kinetic Ising model. A: Adaptive sensorimotor system connected to an environment. B: Values of the mean fields of the stable solution for Jc = 1. C: Values of ϕAB(τ) for different values of Jc. F: The blue area represents the surface in Jc and JEE where ϕ(τ → ∞) diverges.

FIG. B1. Temporal ranges of integration. (A) Values of ϕ(τ) using continuous injection of noise for different values of J. (B) Values of ϕ(τ) using an initial injection of noise for different values of J. (C) Values of ϕcum =

FIG. B2. Effects of the environment on integrated information. Values of ϕM(τ → ∞) of a mechanism M of size M for different values of J, assuming that elements outside of the mechanism operate (A) normally, (B) as independent sources of noise, and (C) as static input fields.