Optimal Control Costs of Brain State Transitions in Linear Stochastic Systems

The brain is a system that performs numerous functions by controlling its states. Quantifying the cost of this control is essential as it reveals how the brain can be controlled based on the minimization of the control cost, and which brain regions are most important to the optimal control of transitions. Despite its great potential, the current control paradigm in neuroscience uses a deterministic framework and is therefore unable to consider stochasticity, severely limiting its application to neural data. Here, to resolve this limitation, we propose a novel framework for the evaluation of control costs based on a linear stochastic model. Following our previous work, we quantified the optimal control cost as the minimal Kullback-Leibler divergence between the uncontrolled and controlled processes. In the linear model, we established an analytical expression for minimal cost and showed that we can decompose it into the cost for controlling the mean and covariance of brain activity. To evaluate the utility of our novel framework, we examined the significant brain regions in the optimal control of transitions from the resting state to seven cognitive task states in human whole-brain imaging data of either sex. We found that, in realizing the different transitions, the lower visual areas commonly played a significant role in controlling the means, while the posterior cingulate cortex commonly played a significant role in controlling the covariances. SIGNIFICANCE STATEMENT The brain performs many cognitive functions by controlling its states. Quantifying the cost of this control is essential as it reveals how the brain can be optimally controlled in terms of the cost, and which brain regions are most important to the optimal control of transitions. Here, we built a novel framework to quantify control cost that takes account of stochasticity of neural activity, which is ignored in previous studies. We established the analytical expression of the stochastic control cost, which enables us to compute the cost in high-dimensional neural data. We identified the significant brain regions for the optimal control in cognitive tasks in human whole-brain imaging data.


Introduction
The brain is a highly complex dynamical network that flexibly transitions to various states to execute a myriad of functions (McKenna et al., 1994;Breakspear, 2017;Kringelbach and Deco, 2020). In this regard, the brain can be considered a system that modulates its internal states to desired states, in accordance with the function the individual needs to perform (Botvinick and Cohen, 2014;Gu et al., 2015). Among the many transitions that bring the system into the various states it requires, some state transitions are more difficult to control than others, depending on the dynamical properties of the neuronal systems. In other words, controlling transitions to some states incurs greater "costs" than controlling transitions to others. Providing a theoretical framework for quantifying transition costs, or control costs, is important for evaluating the difficulty of the shifts between various brain states, and possibly in explaining cognitive loads (Braun et al., 2021;Kawakita et al., 2022), sleep-awake differences (Deco et al., 2019), habituation of cognitive tasks (Szymula et al., 2020), and psychiatric disorders (Braun et al., 2021) with a quantifiable measure. Therefore, the development of such a framework for quantifying control cost in the brain is a vital topic in neuroscience.
A rigorous and promising framework to assess control cost was provided by an approach using control theory, which was first introduced in neuroscience in a pioneering work by Gu et al. (2015). Control theory provides theoretical tools for investigating the dynamical properties of complex systems (Liu et al., 2011;Liu and Barabási, 2016), and its application to neuroscience is opening new doors to mechanistically explaining neural behaviors from brain structures (Gu et al., 2017;Kim et al., 2018;Stiso et al., 2019;Szymula et al., 2020;Cornblath et al., 2020;Braun et al., 2021). However, despite being a strong approach, this framework does not take account of an important property of neural activity: it neglects noise or stochasticity in neural systems. Since neural noises are known to be ubiquitous in the brain and to play critical roles in information processing (Rieke, 1999;Faisal et al., 2008), overlooking the stochasticity of neural systems may result in an inaccurate estimation of the control costs.
In this study, we propose a novel framework to quantify control costs in linear stochastic neural systems. This framework takes advantage of both the linear control theoretic framework (Gu et al., 2015;Kim et al., 2018;Szymula et al., 2020) and the control cost proposed in our previous work . That is, we modeled brain dynamics using linear stochastic differential equations, and defined the control cost as the Kullback-Leibler (KL) divergence between the uncontrolled and controlled processes ( Fig. 1) as we did in Kawakita et al. (2022). Thanks to the linearity, we can obtain an analytical expression of the control cost in the stochastic system. Furthermore, as we include a control input term in our model, we can identify those brain regions playing a significant role in the control of state transitions.
In addition to determining the analytical expression, we also showed that we can decompose this expression into the cost for controlling the mean (referred to as mean control cost) and that for controlling the covariance (covariance control cost). We proved that the mean control cost corresponds to the control cost in the previous deterministic setting. The covariance control cost, on the other hand, is the cost of controlling the covariation among the system, which has not been quantified in previous studies in neuroscience.
After formulating the theoretical framework, we then posed the following two questions. First, how important is it to take account of covariance in estimating control cost? Second, what brain areas are significant in controlling brain state transitions? To address these questions, we applied our new method to real neural data. We used whole-brain fMRI BOLD data of 352 healthy adults of either sex, recorded as part of the Human Connectome Project (HCP) (Van Essen et al., 2013). As for the first question, we found that the influence of the covariance control cost was indeed not negligible. And as for the second, we discovered that the lower visual areas and the posterior cingulate cortex (PCC) play important roles in controlling state transitions, but in different ways: the PCC acts in controlling the covariance and the lower visual areas act in controlling the mean in addition to the covariance.

Materials and methods
Theoretical background Control cost in deterministic systems Before formulating the control costs in stochastic systems, we start with a conventional deterministic framework and a method for quantifying control cost under this deterministic setting. The dynamics of the brain is modeled with an n-dimensional state space model, where a brain state x [ R n at time t [ [0,T] (T . 0) consists of n scalar values that represent the magnitudes of brain activity. Then for each t [ [0,T], x(t) is assumed as a point in an n-dimensional Euclidean space. A state transition of the brain is described as a trajectory on which a point travels from one state to another. An uncontrolled transition is described using linear dynamics, such as the following: _ xðtÞ ¼ AxðtÞ: (1) Here, x: [0,T] ! R n is a vector that represents the magnitudes of activity of all nodes. A [ R n Â n is a connectivity matrix whose elements represent connectivity weights for each pair of nodes.
We consider situations where a brain state switches from an initial state xð0Þ ¼ x 0 2 R n to a target state xðTÞ ¼ x T 2 R n that is different from the state when following its uncontrolled dynamics (Eq. 1). To realize such a transition, we assume that a control input u(t) is given. We can incorporate the control input into the dynamics as follows: where u : ½0; T ! R p is a control input and B 2 R nÂm is an input matrix that determines the nodes assigned with control inputs (n; m 2 N). We assume that this system is controllable (i.e., 8x 0 ; x T there exists u(t) that enables the transition from x 0 to x T ). Here, we limit ourselves to cases where B = I n (the n Â n identity matrix), which implies that the input is given independently to all nodes. In other words, we consider the system with an input v : ½0; T ! R n , _ xðtÞ ¼ AxðtÞ 1 vðtÞ: The control cost (also called control energy) J cont is defined as the total amount of control input required to steer the system from x 0 to the target x T and is expressed as follows: under The minimum of this integral represents the input minimally needed to realize the state transition that satisfies the marginal conditions (Eq. 5). It is written as follows: Figure 1. Schematic of our framework for quantification of control costs in the brain in linear stochastic systems. We model a brain state to follow a certain probability distribution p 0 at time t = 0 (left, blue ellipse). In uncontrolled dynamics, the brain state stays in the same distribution (right, blue trajectory and blue ellipse). However, in a state transition, the brain dynamics changes so that it reaches a target distribution p T at time t = T (gold ellipse). We call this altered trajectory the controlled process (gold trajectory). To evaluate how close the controlled process is to the uncontrolled one, we use the KL divergence between the two processes as the cost function, marginalized with the initial and the target distributions (blue square). The distributions of the processes are defined on a path space, the space composed of R n valued continuous functions defined on [0, T]. This type of KL optimization problem on a path space is referred to as the Schrödinger's bridge problem.
and referred to as the minimal control cost. The problem of minimizing the control cost is called the optimal control problem in control theory. Hereafter, we call the optimal control cost simply the "control cost" because we are only concerned with the minimal value in this study. We also refer to this cost as the deterministic cost, since this cost is defined on the deterministic model (Eq. 3). This minimum is known to exist, and the optimal value J Ã cont is readily solved (see Kamiya et al., 2022). This metric has been used as the control cost in previous studies in neuroscience (Gu et al., 2017;Kim et al., 2018;Stiso et al., 2019;Cornblath et al., 2020;Deng and Gu, 2020;Braun et al., 2021).

Formulation of a stochastic linear model
To take account of noise and fluctuation in brain activity (Rieke, 1999), we make an extension for three characteristics in the control model of the brain: dynamics, transitions, and the control cost (Fig. 2). In this section, we explain these characteristics in detail. First, we model the brain dynamics through a stochastic process instead of conventional deterministic processes as described in Equations 1 and 3. We then model the dynamics without control, or the uncontrolled process described as an R n valued Ornstein Uhlenbeck process where each element of x(t) corresponds to the magnitudes of activities in a brain region. Here, w(t) is a standard n-dimensional normal Brownian motion, and A 2 R nÂn ; C 2 R nÂn (n 2 N). We assume C t C to be nonsingular ( t C denotes the matrix transpose of C), which is equivalent to rank(C) = n. Second, to model the control of transitions, we consider these as from a probability distribution to another probability distribution, instead of as the point-to-point transitions that have been considered in previous studies (Gu et al., 2015;Szymula et al., 2020). In the present study, we consider the control of the system between Gaussian distributions from time t = 0 to time t = T. We assume that the distribution of x at t = 0 follows a Gaussian distribution with mean m 0 2 R n and covariance R 0 [ PSD (n) (set of n-by-n positive semi-definite matrices), denoted by xð0Þ ; N ðm 0 ; R 0 Þ (we call this the initial distribution). When the system follows Equation 7, the distribution of x(T) is uniquely determined as follows: the process (Eq. 7) stays in the same probability distribution in Vt ! 0 (steady-state distribution); thus, m 0 ¼ m T ¼ 0 and R 0 ¼ S T . The existence of the steady-state distribution is guaranteed if the real parts of all the eigenvalues of A are negative. To describe the control of state transitions, we consider altering the dynamics and the probability distribution at time t = T by giving a certain input. We consider steering the system from the initial distribution N ðm 0 ; R 0 Þ to a given distribution N ðm T ; R T Þ ðm T 2 R n ; R T 2 PSDðnÞÞ at t = T, which is different from the final distribution N ðm T ; S T Þ in the uncontrolled process. We call N ðm T ; R T Þ a target distribution, and a process that reaches N ðm T ; R T Þ at t = T a controlled process. Third, for the control cost, we adopt the KL divergence between the uncontrolled and controlled processes (  Comparison of deterministic and stochastic state transitions in a brain from the perspective of control theory. The Previous framework column describes the deterministic model of state transitions. The uncontrolled process (blue arrow) is expressed as a linear differential equation _ xðtÞ ¼ AxðtÞ (A is a connectivity matrix of the brain network). A brain state transition is modeled as a transition from a point to another in an n-dimensional state space. Along with a control input v, the brain shifts its state to x T , another point in the state space, at time T (dynamics drawn in yellow). The cost function is often set as the time integral of the squared input. The Proposed framework column explains the stochastic model of brain state transitions. Here, the uncontrolled process (blue arrow) and a controlled process (gold arrow) are given as stochastic processes. A brain state transition is viewed as a shift from an n-dimensional probability distribution to another. As a control cost, the KL divergence between the two processes is examined.
is a metric that measures the closeness between two probability distributions. Thus, the control cost, defined by the KL divergence, measures the closeness between the uncontrolled and controlled processes. This KL divergence is not the one between the two probability distributions of brain states at a particular time point, such as the KL divergence between the initial distribution N ðm 0 ; R 0 Þ and the target distribution N ðm T ; R T Þ, but rather the divergence between the probability distributions of the entire paths in the uncontrolled and controlled process from time 0 to T.
In this study, we consider the optimal control problem based on the KL divergence. There are many possible controlled processes that take the system to a given target distribution. Among them, we consider the optimal controlled process, whose control cost is minimized. This process is the closest controlled process to the uncontrolled process in the KL sense. To express this mathematically, let us denote the probability distribution induced by the uncontrolled process by P and that by the controlled process (P ), which are defined on an abstract space composed of a set of R n valued continuous functions on [0, T] (denoted by S ¼ Cð½0; T; R n Þ). The KL divergence between these paths is written as follows: where dP dP is a Radon-Nikodym derivative (we consider when P andP are absolutely continuous to each other). Then the probability distribution of the optimal controlled process we seek is the following Q: with the boundary conditions xð0Þ;N ðm 0 ; R 0 Þ; xðTÞ;N ðm T ; R T Þ; if the minimum exists. We call the minimum the stochastic control cost, Again, the proposed framework is compared with the conventional deterministic framework in Figure 2.
The optimization problem of minimizing the KL divergence between two stochastic processes (Eq. 11) has been referred to as Schrödinger's bridge problem (Léonard and Modal, 2014;Chen et al., 2016a). This problem was originally proposed by Erwin Schrödinger for finding the most probable path that moving particles take (Schrodinger, 1931). The Schrödinger bridge problem was subsequently found to be equivalent to an optimal control problem, and has been studied in the field of control theory (Dai Pra, 1991;Léonard and Modal, 2014;Chen et al., 2016b). To our knowledge, our recent study  is the first work in neuroscience to use the KL minimization problem in examining control cost in brain dynamics.
Equivalence between KL cost and quadratic cost Next, we observe that KL minimization boils down to an optimal control problem where the cost function is the expectation of the quadratic input with respect to the controlled process (Dai Pra, 1991;Chen et al., 2016a).
It is known from previous studies (Dai Pra, 1991;Chen et al., 2016b) that, to consider the minimal KL divergence D KLP ; P À Á under the boundary conditions (Eq. 12), one has only to search for dynamics that can be described as the next form: where v : R n Â ½0; T ! R n is a control input. Moreover, the KL divergence and the next quadratic cost become equal: where EP ½Á represents the expectation on a probability lawP and Therefore, to obtain the minimal KL divergence, we need to search for the input v that minimizes the expectation of the quadratic form given in the right-hand side of Equation 15. In other words, the KL minimization problem boils down to an optimal control problem whose cost function is the right-hand side of Equation 15. As defined in the previous section, we define the minimum of Equation 15 as the stochastic control cost.
The discussion in this section is stated in more rigorous form in Kamiya et al. (2022).

Experimental design and statistical analysis
For our data processing techniques, see Figure 3 for schematic explanation.

On fMRI data
The 3T fMRI data of 990 subjects were obtained from the Washington University-Minnesota Consortium HCP (Van Essen et al., 2013). To remove the effects of outliers, we picked data of 352 subjects of either sex according to criteria suggested by Ito et al. (2020), (Fig. 3a).

Data preprocessing
Minimally preprocessed fMRI data were used for the resting state and seven cognitive task states (emotion, gambling, language, motor, relational, social, and working memory). Denoising was performed by estimating nuisance regressors and subtracting them from the signal at every vertex (Satterthwaite et al., 2013). For this, 36 nuisance regressors and spike regressors were used following a previous study (Satterthwaite et al., 2013), consisting of (1-6) six motion parameters, (7) a white matter time series, (8) a CSF time series, (9) a global signal time series, (10-18) temporal derivatives of (1-9), and (19-36) quadratic terms for (1-18). The spike regressors were computed with 1.5 mm movement as a spike identification threshold. After regressing these nuisance time courses, a bandpass filter (0.01-0.69 Hz) was applied, whose upper filter bound corresponds to the Nyquist frequency of the time series. A parcellation process proposed by Schaefer et al. (2018) was used to divide the cortex into 100 brain regions, which reduced the complexity of the following analysis ( Fig. 3b).

Estimation of parameters characterizing brain states
We next estimated parameters that characterize resting and task brain states. To apply the framework explained in Formulation of a stochastic linear model, we estimated the next parameters: the mean and the covariance matrix of the initial distribution (m 0 and R 0 ), those of the target distribution (m T and R T ), and the matrices in the uncontrolled dynamics (the drift matrix A 2 R nÂn and the product of the diffusion matrix S C :¼ C t C). As for the diffusion matrix, we estimated S C ¼ C t C 2 R nÂn rather than C itself, as S C is sufficient for cost computation. The detailed methods to estimate the parameters of the initial and the target distributions are covered in Distribution of the resting state and task states, and those to estimate A and S C in Estimation of parameters of the resting state dynamics. For statistical robustness, a bootstrapping method was adopted, where the estimation described below was done using data of 100 randomly chosen subjects of 352 overall subjects, and repeated 100 times independently. The data of the 100 subjects were concatenated to obtain a single time series that is long enough for statistically reliable estimation.
Before estimating the parameters using the concatenated data across different subjects, we need to normalize the time-series data. This normalization is necessary for the following reason. Basically, the absolute values of fMRI BOLD signals are meaningless by themselves because of the bias caused by the water content or the basic blood flow in the brain tissues (Poldrack et al., 2011). Because of this, one cannot directly compare values of BOLD signals in different subjects. Thus, we cannot concatenate data of different subjects using the preprocessed fMRI BOLD signals as they are. To concatenate BOLD signals across subjects, we need some kind of scaling, or normalization, of the data.
We performed the scaling based on the assumption that the sum of the time-series variances in the whole ROIs are constant when a subject is not engaged in a cognitive task. First, for each subject, we computed the trace values of the empirical covariance matrices of the resting state and the task-free moments in each task. Then, we divided the whole time-series data of the rest and each task by the square root of the trace value of the rest and by that of each task's task-free moment, respectively. This division makes the magnitude of the sample covariance matrix (i.e., the trace of the sample covariance matrix) of the rest and the task-free moments to be one in all subjects and tasks. After this normalization, we concatenated the time-series data of the whole resting state and the task-free moments of each task state of all 100 subjects. Using the concatenated time-series data, we estimated the parameters as follows.
Distribution of the resting state and task states. To estimate the resting state distribution, the empirical mean and covariance of the whole time series were used to represent the probability distribution. The empirical mean of the resting state was almost a 100-dimensional zero vector in each subject because of a zero-mean adjustment in preprocessing. To extract the mean of the activity during a task, we did not use the empirical mean and covariance as we did in estimating the resting state distribution. This is because, unlike the resting state data, task time-series data are composed of taskperforming and task-free moments. And one needs to know to what extent the activity in the task-performing moments is different from that in the task-free moments. We assumed that, in the task-free moments, the average magnitude of activity is the same as the average of the resting state (i.e., ;0). Accordingly, to obtain the time-series mean of the activity during the task state, the time-series mean during the task-free moments was subtracted from that in the task-performing moments. The covariance matrix was calculated as the empirical covariance matrix during the task-performing moments. For schematic explanation, see Figure 3c.
Estimation of parameters of the resting state dynamics. In this section, we explain how we estimated the matrices A and S C ¼ C t C that determine the uncontrolled dynamics (Eq. 7). As we mentioned in Results, on application to fMRI data, we regarded the uncontrolled dynamics as the resting state dynamics. Thus, we need to fit the resting state time series to the dynamics equation as follows: When sampled at some interval of Dt . 0, this process is equivalent to the next VAR(1) process as follows: where a = e ADt (matrix exponential) and « t1Dt is a zero-mean Gaussian random variable that represents noise. We first estimated the drift term (denoted byâ) using the least absolute shrinkage and selection operator regression. A transformationÂ ¼ logâ=Dt (18) followed to obtain the estimated drift coefficient A in the dynamics. As for S C ¼ C t C, one can directly infer this from the covariance of « t1Dt (E½« t1Dt t « t1Dt ¼: S «t1Dt ) in Equation 17 using the following relationship: where vec is the vectorization operator and vec -1 its inverse, the Kronecker product, and log the matrix logarithm. To obtain the estimation of S C (denoted byŜ C ), we substituted above the estimatedâ and the empirical time series covarianceŜ «t1Dt . For schematic explanation, see Figure 3d.

Calculation of entropy of an input map
The entropy S of an input map I ¼ t ðI 1 ; Á Á Á ; I 100 Þ 2 R 1 100 quantifies how dispersed a vector is and takes the maximum if each element I l ðl ¼ 1; Á Á Á ; 100Þ takes the same value. S is calculated by the following: where I sum ¼ X 100 l¼1 I l . Note that each element I l ðl ¼ 1; Á Á Á ; 100Þ is positive.

Code accessibility
The codes for computing the stochastic control cost and reproducing the figures on this paper are available at https://github.com/oizumi-lab/SB_ toolbox.

Theoretical results
In this section, we show theoretical results obtained from our framework proposed in the previous section. This framework based on the KL minimization was first introduced in neuroscience in our previous study . While the previous study considered a finite-state space, discrete-time stochastic process, we considered a linear continuous-time system in this study. As we have explained in the previous section, there have been many theoretical studies of the linear stochastic system. In addition to what has previously been acknowledged, we newly obtained the following analytical results: 1. We derived the analytical solution of the control cost. We found that the analytical solution of the stochastic cost can be disintegrated into two portions: the cost of driving the mean (mean control cost) and that for the covariance (covariance control cost). 2. The mean control cost turns out to correspond to the conventional deterministic control cost in specific occasions. This clarifies the correspondence between the deterministic and the stochastic costs. The covariance control cost has not been quantified in previous applications in neuroscience. 3. By investigating the control input assigned to each node, we can compute the amount of total input given at one node. This node-level input can also be decomposed into two parts: the input necessary for controlling the mean and that necessary for controlling the covariance.

Analytical solution of the stochastic control cost
We start by showing the analytical solution of the optimal value of Equation 15, which is described as follows: In the equation, G : ½0; T ! R nÂn ; P : ½0; T ! R nÂn ; W : ½0; T Â ½0; T ! R nÂn , and m d :¼ m T À WðT; 0Þm 0 . For the definitions and detailed derivations, see Kamiya et al. (2022). P(t), G(t), and W(t,s) depend on A, C, R 0 , and R T . This gives the analytical expression of the stochastic control cost.

Decomposition of the stochastic cost
We next demonstrate that Equation 21 can be decomposed into two parts: the cost needed to steer the mean (called mean control cost) and the cost needed to steer the covariance (called covariance control cost). More specifically, when we decompose this as follows: where we show that we can consider J Ã m as the cost needed to steer the mean (mean control cost), and J Ã R as the cost needed to steer the covariance (covariance control cost). This decomposition is shown schematically in Figure 4.
We found that the mean control cost J Ã m depends only on the marginal means (m 0 and m T ), while the covariance control cost J Ã R depends only on the marginal covariances (R 0 and R T ). To explain these dependencies, we start by examining J Ã R . We can see that Equation 24 depends only on the marginal covariances and is independent of the marginal means, since P(t) relies only on R 0 and R T (in addition to A and C). We can then interpret these terms to represent the cost of navigating the covariance from R 0 to R T . On the other hand, J Ã m (Eq. 23) seems to depend on both the marginal means and marginal covariances, since G and P are dependent on R 0 and R T . Surprisingly, however, we found that Equation 23 actually does not depend on the marginal covariances, but rather only on the means. Indeed, J Ã m is further rewritten as follows: where U is a state transition matrix defined as Uðt; sÞ ¼ e AðtÀsÞ , and M(T) is a Gramian matrix in the linear stochastic system, For proof of this transformation (Eq. 25), see Kamiya et al. (2022). Thus, Equation 23 can be seen as the cost for controlling the mean from m 0 to m T .
Furthermore, the formulation given in Equation 25 indicates a correspondence between the stochastic control cost and the deterministic control cost . We can easily see that, if the input matrix B is identical to the diffusion matrix C, the mean control cost (Eq. 25) is equal to the deterministic control cost. This correspondence between the mean control cost and the deterministic control cost indicates that the stochastic control cost, defined as the KL divergence, can be considered an extension of the conventional deterministic control cost.
Another merit of the expression of the stochastic cost (Eq. 25) using the Gramian matrix (Eq. 25) is that one can obtain the directions in which controlling the mean is easy or difficult. In the deterministic control system, it is classically known that the eigenvectors of the control Gramian matrix (see Kamiya et al., 2022) can be interpreted as the direction in which the control is easy/difficult (Yan et al., 2015;Brunton and Kutz, 2022). The same logic can be applied to the mean control cost that shares the similar formulation to the Gramian in the deterministic system: the eigenvectors of Equation 26 specify the directions in which the system is good at in controlling its mean.
The covariance control cost, on the other hand, is the cost of shifting the covariance of the probability distribution. This cost has been ignored in previous neuroscience studies (Gu et al., 2015;Kim et al., 2018;Stiso et al., 2019;Szymula et al., 2020;Braun et al., 2021;Singleton et al., 2022). We will cover the neuroscientific implication of the covariance control cost, which can be rephrased as the cost for controlling functional connectivity, in the Discussion. As the formulation Equation 24 is somewhat complicated, an intuitive interpretation of the covariance control cost is not available at this moment.
The significance of this decomposition is that it enables us to separately quantify the influence of taking into account the covariance of brain activities in the control cost (Fig. 4). As we saw, Figure 3. Data acquisition techniques. a, The fMRI BOLD data in HCP underwent minimal preprocessing and were then divided into 100 ROIs. b, This gives us 100-dimensional time-series data in our hands. c, The data at rest and under each task state were approximated as a Gaussian distribution in the 100-dimensional space whose individual coordinates represent the magnitude of activity of an ROI. d, To estimate the matrices A and C that determine the resting state dynamics, we first employed a regression using the sparse VAR(1) process (left panel). The coefficient b^and E½« t t « were regressed so that the dynamics has m rest ≒ 0 and P rest as the mean and covariance. This VAR(1) process was then transformed into the corresponding continuous time dynamics using Eq. (18)  the mean control cost depends only on the marginal means and is shown to correspond to the deterministic control cost under a certain setting. On the other hand, the covariance control cost depends only on the marginal covariances. Thus, the effect of taking account of the covariance of brain activities is reflected only in the covariance control cost.
Input given at each node Last, we evaluate the contribution of each node to control transitions by the total amount of inputs given at each node in the brain network.
The amount of inputs provided at each node can be calculated as follows. Again, the optimal dynamics is described as follows: 100Þ. The expectation of the total input given to the k'th node (denoted by IðkÞ; k ¼ 1; Á Á Á ; 100) is then given as follows: which is readily transformed into the following: IðkÞ ¼ S C ð T 0 P RðtÞ 1 mðtÞ t mðtÞ À Á P À 2mðtÞ t mðtÞP À 1 mðtÞ t mðtÞÞdtS C kk : The subscript (·) ij denotes the (i, j)-th entry of a matrix, and m : ½0; T ! R n is a function for arranging the transient mean. For the derivation, see Kamiya et al. (2022), and m : ½0; T ! R is a certain function defined by Kamiya et al. (2022). This value can be decomposed in a similar way to the stochastic control cost (Eq. 22) as follows: where I m ðkÞ ¼ S C ð T 0 ðPðtÞmðtÞ À mðtÞÞ t ðPðtÞmðtÞ À mðtÞÞdtS C ! kk ; (31) As m(t) is the function for shifting the mean of the brain dynamics, the first term I m ðkÞ can be interpreted as the cost needed to control the mean, which we call the mean input at the k'th ROI. The latter I R ðkÞ is the cost of controlling the covariance, which we refer to the covariance input at the k'th ROI.
Results on application to fMRI data So far, we have formulated a method to quantify the stochastic control cost and contribution of each brain region on state transitions. With this method, we then address the following two problems: • How important is it to take account of the covariance of brain state probability distributions? • What brain areas are significant in optimally controlling the brain state transitions?
To address the first question (i.e., to assess the influence of covariance), one can compare the covariance control cost J Ã R to the mean control cost J Ã m . We have seen that the mean control cost in the stochastic model corresponds to the optimal control cost in the deterministic model (Decomposition of the stochastic cost), where we consider the control from one point to another point. So imagine, for instance, a case where the mean control cost is far greater than the covariance control cost and hence accounts for the most part of the overall stochastic cost. We might then need to consider only the point-to-point control and might not have to investigate distribution-to-distribution control as we did in the stochastic model. To evaluate the necessity of incorporating probability distributions in our model, we first need to examine to what extent the covariance control cost accounts for the stochastic cost.
To address the second question regarding important brain regions in controlling brain state transitions, we computed control input v for each region as defined in Input given at each node. By decomposing the control input of each region into the mean input (I m ðkÞ) and covariance input (I R ðkÞ), we can identify regions that are important for shifting the mean and covariance of brain activity. This decomposition enables us to identify not only the brain regions that need to be activated but also those regions that are crucial to the reconfiguration of covariation of whole-brain activity.
To address these questions, we used whole-brain fMRI data recorded as part of the HCP (Van Essen et al., 2013). These data consist of recordings of 990 healthy adults. We used a subset of data composed of 352 subjects based on a criteria proposed by Ito et al. (2020). For each subject, the dataset contains BOLD signals, including a pair of scans at rest and scans under seven different cognitive tasks: emotion, gambling, motor, language, relational, social, and working memory. After a standard preprocessing procedure including denoising with nuisance regressors, Figure 4. Schematic illustration of the decomposition of stochastic control cost into the mean and covariance control. The stochastic control cost from one state to another is described as a sum of the cost to control the mean and covariance from the initial state to the target state. The radius of the circle represents the magnitude of the mean, while the thickness of an edge represents the covariance value between two nodes. the whole voxels were partitioned into 100 ROIs (Fig. 3a,b) following the method proposed by Schaefer et al. (2018).
After preprocessing and parcellation, we applied our framework to the fMRI data. Here, we assumed that the uncontrolled process is the resting state dynamics of the brain and the controlled dynamics is the transition from the probability distribution of the resting state to that of a task state. Under this assumption, we computed the control cost from the resting state probability distribution to a task distribution. To perform this computation, we need to infer two classes of probability distributions: (1) the probability distributions that characterize the resting state dynamics, and (2) the task state probability distribution. For (1), we used a regression using the sparse vector autoregressive (VAR) model (Fig. 3d) on the resting state BOLD data. We assumed the resting state dynamics to be the steady-state dynamics and computed the steady-state probability distribution (N ðm 0 ; R 0 Þ) and the transition probability distribution (characterized by the matrices A and C). For (2), the mean (m T ) and covariance (R T ) are inferred using the sample mean and covariance in the BOLD signals after normalization (Fig. 3c).
As for the target time constant, we used various T values ranging from 0.1 to 6.0 s, and observed that results did not qualitatively change . Thus, below, we show the results with T = 1.0 s as representative.
To estimate the probability distributions and the stochastic costs, a bootstrapping method was adopted for statistical robustness. The estimation was conducted with 100 randomly chosen subjects of 352 subjects and was repeated 100 times independently (for further details, see Materials and Methods).
Mean and covariance control cost in the stochastic model Using the estimated probability distributions and the matrices A and C, we computed the stochastic control cost from rest to the seven tasks. For each task, we separately computed the mean control cost (J Ã m ), the covariance control cost (J Ã R ), and their ratio (J Ã R =J Ã m ). First, to assess the contribution of the covariance control cost compared with the mean control cost, we show the ratio in Table 1. We found that covariance control costs are as large as, or slightly larger than, the mean control costs. The ratios range from 0.4 to 2.8 depending on the task, with an average value of ;1.5. This result indicates that the contribution of covariance control cost to the overall control cost is as important as that of mean control cost. We also found that these ratios of covariance control cost to mean control cost take various values depending on the specific task. For example, the language task needs a larger covariance control costs than the mean, while the opposite is the case for working memory (WM).
To identify the influence of the covariance control cost, we then examined whether incorporating the covariance control cost changes the ordering of the seven tasks by the control costs. We found that incorporating covariance control cost changes the ordering of the seven tasks by the control costs. We show the mean, the covariance, and the stochastic control cost (= mean control cost 1 covariance control cost) of the seven tasks in ascending order (Fig. 5). The left panel (blue bars) shows the mean control cost values, the middle panel (orange bars) the covariance control cost, and the right panel (yellow bars) the total stochastic control costs. We can see that the order changes when we consider covariance control cost in addition to mean control cost. For example, the mean control cost for the gambling task is larger than that of the motor task; however, when we take account of covariance control cost, this relationship reverses. Thus, we see that the covariance control cost changes the result when estimating the magnitude relationships of control costs.

Control inputs at brain regions
To identify which areas are important in optimally controlling brain state transitions, we next computed the amount of inputs given to each brain region (see Input given at each node). We computed the mean input I m ðkÞ (Eq. 31) and covariance input I R ðkÞ (Eq. 32) at each ROI, obtaining a pair of 100 dimensional vectors (ðI m ð1Þ; Á Á Á ; I m ð100ÞÞ; ðI R ð1Þ; Á Á Á I R ð100ÞÞ) and their sum (ðI ð1Þ; Á Á Á ; Ið100ÞÞ). We can think of these vectors as brain maps of the amount of input required to alter the mean and covariance, and the total inputs. We call these maps the mean input map, covariance input map, and total input map. We computed the inputs and obtained these three maps for each of the seven tasks.
We found that the mean and covariance input maps showed quantitatively different patterns in each task. The mean inputs are typically located in a small number of regions, while the covariance input maps are more widely distributed throughout the brain . We quantitatively evaluated the difference in distributions of the input maps by computing the entropy of the input maps over 100 ROIs for each task. The entropy S of a vector quantifies how dispersed the elements of the vector are; if the inputs are assigned to widely distributed regions, the entropy takes a large value. For a detailed description of how the entropy was calculated, see Materials and Methods. As shown in Figure 6, the entropy of the covariance input maps is larger than that of the mean input maps in all seven tasks. This result suggests that the covariance input maps are more dispersed over the whole brain regions.
We then examined the relationships between the two types of control inputs and the changes in BOLD signal magnitudes when transitioning to a task. Intuitively, we may expect that larger inputs are required for brain regions that are activated or deactivated in the task. To study this relationship, we computed the correlation coefficient of the brain activity change and the mean/covariance input maps. We subtracted BOLD signals of preparation periods from those of task periods, and then computed the t values of the differences. We defined the absolute value of the t values as the brain activity change. We found that the mean input maps have high correlation coefficients in all the tasks, whereas the covariance input maps have lower values (Fig.  7). This result indicates that the mean input maps are highly correlated to the changes in BOLD activation level, while the covariance input maps are less related. In each of the 100 bootstrap trials, the covariance control cost is divided by the mean control cost, and the average and the SDs of the 100 trials are computed.
To identify the brain regions that are commonly important for control of the seven cognitive tasks, we then computed the average of the mean, the covariance, and the total input maps for all seven tasks, as shown in Figure 8a. To calculate the contribution of each task in a fair manner, for each task, we normalized the mean (I m ðkÞ), the covariance (I R ðkÞ), and the total input map IðkÞ; k ¼ 1; Á Á Á ; 100 by dividing by the sum of the total input to the whole ROIs We can see that a large amount of the mean inputs is in the visual areas (Fig. 8a, "mean"). In contrast, the covariance inputs are slightly more distributed over the whole brain. Specifically, a large portion of inputs is concentrated in the visual area, orbitofrontal area, and the PCC (Fig. 8a, "covariance").
To validate these results on the brain regions that are important in regulating brain state transitions, we counted how many times each ROI appears in the top 10 ROIs with the largest control inputs. Although we identified commonly important regions in controlling transitions by computing the average control inputs of all tasks, the possibility exists that the control input to a certain area is extremely large for one task only and not for the others. To rule out such a possibility and to find those regions that are commonly significant for the seven tasks, we selected the top 10 significant brain regions for each task and counted how many times each region is ranked in the top 10. We performed this analysis for the mean, the covariance, and the total inputs. This additional analysis further supported our findings that the PCC and lower visual areas are generally important in control transition to task states. The results of the additional analysis are shown in Figure 9. The color of an ROI signifies the number of tasks for which the ROI is ranked in the top 10 (thus; the numbers take integer values from 0 to 7). To facilitate visualization, we only colored ROIs that appeared in the top 10 more than 3 times. For mean input, the lower visual areas are ranked in the top 10 in the majority of the tasks. For the covariance input, the PCC is ranked in the top 10 most frequently, followed by the lower visual areas. For the total input, the lower visual areas and the PCC were ranked most important. Taking these findings together, we conclude that the PCC and the lower visual areas are the most significant regions for    Figure 8. Mean, covariance, and total input maps averaged over the seven tasks. a, Average of the mean, covariance, and total inputs for each of 100 ROIs over seven tasks in the HCP dataset. The numerical values indicate percentages compared with the sum of the total input to the whole ROIs. b, The ROIs that account for 30% of the whole mean, covariance, and total inputs shown in a. The numerical values indicate the percentages compared with the sum of the mean, covariance, and total input to the whole ROIs, respectively.
transitioning to task states. While the lower visual areas contribute to both the mean and covariance shift, the PCC contributes to shifting covariance only.

Discussion
In this study, we propose a novel framework based on a linear stochastic dynamics to quantify control costs between brain states. We define the minimal KL divergence between the uncontrolled and the optimally controlled paths (Eq. 10) as the stochastic control cost. There are three theoretical contributions in this study. (1) We give a solution for the stochastic control cost (Eq. 21).
(2) We show that the cost can be decomposed into two parts: the mean control and covariance control costs (Eq. 22; Fig. 4). (3) The proposed framework allows us to measure the inputs given in each ROI, thanks to the explicit modeling of the input term (Eqs. 28 and 29). The input in each ROI can be further decomposed into inputs necessary for regulating the mean brain activity and the whole-brain functional connectivity (Eq. 30). We applied our new method to fMRI BOLD data in the HCP dataset. We first found that the influence of incorporating covariance in control costs is not negligible. We further discovered that for regulating the mean, the lower visual areas turned out to be the most significant, whereas for the covariance, the PCC and the lower visual were the most significant.

Stochasticity of control cost
One contribution of the present framework is incorporating stochasticity in quantifying the control cost, which has not been adequately addressed in previous studies in neuroscience (Gu et al., 2015;Kim et al., 2018;Szymula et al., 2020;Braun et al., 2021;Singleton et al., 2022). Our framework incorporates stochasticity in two ways, namely, (1) whether the dynamics of the model is stochastic and (2) whether the cost takes stochasticity into consideration.
Previous studies have not dealt with either of these or considered only the first point. Previous studies that used control theory (Gu et al., 2015(Gu et al., , 2017Kim et al., 2018;Stiso et al., 2019;Cornblath et al., 2020;Szymula et al., 2020;Braun et al., 2021) applied the linear state-space model with deterministic dynamics and deterministic cost (Eq. 6), while another study applied a stochastic model for brain dynamics (Deng and Gu, 2020) but calculated costs based on the deterministic framework.
To our knowledge, our present and recent studies  are the first to take account of stochasticity, not only with regard to brain dynamics but to cost. As we saw in Figure 5, incorporating stochasticity in quantifying the control cost significantly changes the result. Thus, it will be desirable to assess the influence of stochasticity in the brain on control costs as we did in the present work.
Advantages of the decomposition into mean and covariance control cost One remarkable characteristic of the newly proposed cost is that it allows decomposition into mean and covariance control cost (Eq. 22). A major theoretical advantage of this decomposition is that, as seen in Decomposition of the stochastic cost, it enables us to separately examine the influence when taking covariance into account. Here, we discuss two significant aspects of this decomposition in neuroscience.
First, thanks to the decomposition, the stochastic cost can quantitatively compare the significance of contributions of two separate phenomena in a unified manner, namely, the change in magnitude and covariation of brain activities. Conventionally, changes in the magnitude of brain activities have been assessed through the estimated coefficients (often denoted as b ) in the GLM (Friston et al., 1995;Ashby, 2019). The change in covariation has been evaluated through the subtraction of correlation matrices (Cole et al., 2014(Cole et al., , 2016 or through network theoretic measures (Braun et al., 2015;Finc et al., 2020;Hahn et al., 2020). The dynamical change in covariation (often referred to as functional connectivity in the field of neuroimaging) is called functional reconfiguration (Kitzbichler et al., 2011;Krienen et al., 2014;Davison et al., 2015;Stitt et al., 2017). In this way, classically, the change in magnitudes and the covariation of activities have been examined separately in different contexts. In contrast, our use of decomposition has allowed us to quantitatively compare control costs for the mean and covariance from a unified perspective.
Second, the proposed framework enables us to investigate regions playing a significant role in the control of brain state transitions, as we have seen in Input given at each node. This topic is covered in the next section.

Significant brain regions in the control of state transitions
The present framework allows us to identify brain areas that contribute to controlling the mean and covariance separately. In Control inputs at brain regions, we saw the general tendencies of the mean and covariance input maps. Compared with the mean inputs, the covariance inputs are (1) more widespread (Fig. 6) and (2) less related to the regions of altered activities (Fig. 7). The first observation might reflect the fact that the alteration of covariance (termed functional reconfiguration) (Kitzbichler et al., 2011;Krienen et al., 2014;Davison et al., 2015;Stitt et al., 2017) occurs all over the brain; thus, broad input to control covariance may be necessary. The second observation means that we cannot estimate important regions for controlling covariance using only the magnitudes of activity. Examining the magnitudes of activity only may lead to the failure to identify significant regions controlling brain state transitions.
On examination of significant areas for control, an intriguing observation is that the PCC commonly contributes to controlling the covariance in many of the tasks examined (Figs. 8a,b, 9, middle panels). Among previous findings related to this result, the PCC is reported to be connected with many brain areas structurally (Hagmann et al., 2008) and functionally (Leech et al., 2012). Although the PCC's specific functions are not yet fully understood, previous studies have revealed that it is associated with many cognitive processes, including cognitive control (Leech and Sharp, 2014), as shown using fMRI (Lin et al., 2016), and through recordings of single-cell firing rates (Hayden et al., Figure 9. Number of times an ROI was ranked in the top 10 brain regions that contributed most from rest to the seven tasks. Left to right, each panel shows the areas that contribute to the mean, covariance, and total control, respectively. 2010). From these studies, it is now considered that the PCC exerts influence on various regions and might thereby play a role in altering the functional connectivity of the brain (Leech et al., 2011;Leech and Sharp, 2014;Lin et al., 2016). Our finding that the PCC is significant in controlling covariance supports this view. Although this speculation needs further evidence, our study showed the importance of the PCC in changing functional connectivity from a control theoretic perspective.
Compared with the PCC and its contribution to controlling covariance, we found that the lower visual areas are included in the significant regions controlling both the mean and covariance (Figs. 8a,9,left panels). This might be because of the nature of the tasks recorded in the HCP dataset, where subjects were presented visual stimuli as part of the tasks.
Validity and merits of the linear continuous model for fMRI data analysis In this study, we adopted the linear continuous-state model to describe brain dynamics in the fMRI data. It has been pointed out that a linear model may oversimplify the brain dynamics, which is complex and known to behave in nonlinear manners (Freeman, 1979;Stephan et al., 2008;Rabinovich and Muezzinoglu, 2010). Although we acknowledge this limitation, a recent study has shown that the neural data, including the same fMRI data as we used in this study (HCP), can be better fitted with linear models than nonlinear ones, possibly because of temporal and spatial averaging effects (Nozari et al., 2020). Accordingly, this previous study (Nozari et al., 2020) provides validity for our use of a linear model for the dataset.
To account for the nonlinearity of brain dynamics, another approach to modeling brain dynamics in the fMRI data is to use the discrete-space probabilistic model Lynn et al., 2021;Kawakita et al., 2022). In our recent study , we used the same KL divergence as control cost in brain state transition as in the present study. The difference from our study lies in that in Kawakita et al. (2022) we used a different probabilistic model, where brain states are discrete.
Although the discrete model can incorporate nonlinearity, the linear model is more feasible than the discrete model in computing control costs in high-dimensional brain dynamics. In the discrete model, the computational cost of the control cost can easily explode, as the control cost is analytically intractable. We have to use iterative algorithms, such as Sinkhorn's algorithms, to compute control costs (Cuturi, 2013;Kawakita et al., 2022); and to lessen the computational burden of the algorithms, we have to coarse-grain and limit the number of brain states. In contrast, control cost in the linear continuous model is analytically tractable, enabling us to compute control costs in high-dimensional dynamics.

Future directions
There are two possible extensions of our work: one is incorporating more general control inputs, and the other is the estimation of actual control costs from neural data.
For the first point, in this study, we considered limited situations of control where we implemented a model in which independent inputs are assigned to all nodes, as discussed in Control cost in deterministic systems. In previous studies that have used the deterministic control theoretical framework (Gu et al., 2015;Kim et al., 2018;Cornblath et al., 2020;Deng and Gu, 2020), the model was grounded on more general cases wherein the system is described with an input term Bu, as in Equation 2. We may be able to consider these general cases of the input matrix B in our stochastic framework with some additional techniques for computation (for details, see Kamiya et al., 2022).
For the second point, we only considered the optimal control cost where brain state transitions are controlled in an optimal manner, with minimization of stochastic control cost. However, in real neural systems, state transitions are not controlled in an optimal manner. An intriguing future direction will be to compare the optimal and actual dynamics using neural data during tasks. Estimating the control cost from real time-series data requires the estimation of time-varying control inputs. This is generally a difficult estimation problem that requires more sophisticated techniques (Milde et al., 2011). Quantifying the actual control cost in real neural data will provide another new insight into cognitive processes, which we plan to pursue in future research.