Dynamic causal modeling of hippocampal activity measured via mesoscopic voltage-sensitive dye imaging

The aim of this paper is to present a dynamic causal modeling (DCM) framework for hippocampal activity measured via voltage-sensitive dye imaging (VSDI). We propose a DCM model of the hippocampus that summarizes interactions between the hilus, CA3 and CA1 regions. The activity of each region is governed via a neuronal mass model with two inhibitory and one/two excitatory neuronal populations, which can be linked to measurement VSDI by scaling neuronal activity. To optimize the model structure for the hippocampus, we propose two Bayesian schemes: Bayesian hyperparameter optimization to estimate the unknown electrophysiological properties necessary for constructing a mesoscopic hippocampus model; and Bayesian model reduction to determine the parameterization of neural properties, and to test and include potential connections (morphologically inferred without direct evidence yet) in the model by evaluating group-level model evidence. The proposed method was applied to model spatiotemporal patterns of accumulative responses to consecutive stimuli in separate groups of wild-type mice and epileptic aristaless-related homeobox gene (Arx) conditional knock-out mutant mice (Arx-/+;Dlx5/6CRE-IRES-GFP) in order to identify group differences in the effective connectivity within the hippocampus. The causal role of each group-differing connectivity in generating mutant-like responses was further tested. The group-level analysis identified altered intra- and inter-regional effective connectivity, some of which are crucial for explaining mutant-like responses. The modelling results for the hippocampal activity suggest the plausibility of the proposed mesoscopic hippocampus model and the usefulness of utilizing the Bayesian framework for model construction in the mesoscale modeling of neural interactions using DCM.


Introduction
Dynamic causal modeling (DCM) (Friston et al., 2003(Friston et al., , 2007 has been used to model effective connectivity (directional, asymmetrical influences of one node over another) among multiple neural populations using various levels of neuroimaging data, such as electroencephalograms, functional MRIs, and local field potentials (Friston et al., 2019;Kiebel et al., 2008;Moran et al., 2013). Most of these studies model macroscopic inter-regional neural circuitry, which are directly or indirectly associated with local field potentials of neural populations. However, studies that model mesoscopic neural interactions reflected in the membrane potentials by using DCM are rarely found despite the wide usage of voltage-sensitive dye imaging (VSDI) or calcium imaging (indirectly) in measuring membrane potentials. Only a few DCM studies for calcium imaging have recently been introduced (e.g., Jung et al., 2019;Rosch et al., 2018) but not for VSDI. In terms of modelling a neural circuitry using DCM in the mesoscopic level, VSDI is highly advantageous since it measures membrane potentials of distributed neural cells or neural populations with high temporal (1-2 ms) and spatial (20-50 μm) resolution (Chemla and Chavane, 2010b).
The essence of DCM, particularly regarding mesoscopic modelling of a neural circuitry, lies in the ability to construct a biologically plausible but simple neural state model. To make a biologically plausible model, researchers often refer to previous literature for model architecture and/ or neural properties. For example, for the hippocampus, one may refer to public databases of the hippocampus to employ electrophysiological properties such as resting membrane potential, action potential threshold and time decay constant of neural populations. Such electrophysiological properties work as reference values (constants) within a DCM model, based on which model parameters (e.g., connectivity parameters) are estimated during the model inversion process. However, some reference values of neural properties that are necessary in the model equation are not yet available, and some depend on the experimental condition. Thus, reference values, particularly for neural properties without previous references, should be determined before the application of DCM to simplify model inversion. So far, adequate methods to determine not-yetavailable reference values for a model for DCM have not been researched sufficiently. In this study, we propose a method to determine unknown (unavailable) reference values using a Bayesian optimization technique, which is widely used in the fields of machine learning (for review, see Shahriari et al. (2016)). Since unknown reference values should be optimized before parameter estimation in DCM, we will refer to the unknown reference values as hyperparameters in this study.
Another requirement of computational modeling using DCM is to make a model structure as simple as possible, but not too simple (optimizing a model structure within the constraint of biological plausibility). This is especially critical for mesoscopic modeling with multiple parameters for neural properties, which inevitably induce high complexity. This optimization of a model structure should be conducted in the model construction stage, by removing redundant or question-irrelevant parameters from the model. For example, we need to determine whether or not to assign modulation parameters for reference values (either directly obtained from previous literature or derived from hyperparameter optimization) to account for inter-individual (or inter-group) variability at the cost of increased model complexity in DCM. We also need to simplify the model architecture based on experimental data. For example, when modeling the hippocampus, we may determine whether to include or ignore some connections among neural populations which are presumed to exist but without direct evidence (Wheeler et al., 2015). All these issues are not trivial, but they need to be resolved in order to successfully apply mesoscopic DCM to real experimental data analysis.
The first goal of the current study is to propose practical schemes for implementing mesoscopic modeling of neural interactions using a DCM framework. First, we introduce Bayesian optimization of hyperparameters in high-dimensional space to select unknown reference values for DCM. Second, as a step for simplifying model structure, we adopt a model selection scheme to determine (i) whether to use certain reference values as constants or as variables to be estimated in DCM and (ii) whether or not to include specific connections (edges) in the model, by comparing group-level model evidence. Third, we propose a method to evaluate the causal role of statistically significant parameters in generating target responses. The computational modeling using DCM often ends with the group-level test for model parameters. However, the statistical difference of a parameter between groups does not necessarily indicate the causal role of the parameter (e.g., effective connectivity) in inducing signal differences between groups. For this, we propose a method to evaluate each parameter with respect to causality.
Upon establishing technical schemes for optimizing model structure, the second goal of the current paper is to propose a DCM model for hippocampal responses to stimuli measured using VSDI. The hippocampus is a complex center for learning, emotion, memory (Roy et al., 2017), and spatial navigation (Eichenbaum and Cohen, 2014). The hippocampus is also a pathologically important organ in various brain diseases such as Alzheimer's disease (Braak and Braak, 1991;Masurkar, 2018) and temporal lobe epilepsy (Ang et al., 2006;Krook-Magnuson et al., 2013;Laurent et al., 2015;Sanjay et al., 2015). Because of the critical role of the hippocampus, the functional circuitry of the normal and abnormal hippocampus has been a compelling target for computational modeling for diverse purposes particularly at micro-scales (e.g., Cutsuridis et al., 2018). Most computational modeling studies of the hippocampus have focused on intra-regional connectivity at a specific region of the hippocampus (e.g., Bezaire et al., 2016;Sanjay et al., 2015), but not on inter-regional connectivity among the hippocampus regions. In the current study, we demonstrate a computational modelling of both intraand inter-regional interactions within the hippocampus that is responding to an external stimulus, using a DCM framework.
So far, many DCM studies have been conducted to explain interactions among cortical regions or between cortical and subcortical regions, based on a convolution-based framework (Bastos et al., 2012;Jansen and Rit, 1995;Moran et al., 2013). However, for the hippocampus model, since the architecture of neural compositions differ from that of cortical columns, we did not reuse conventional neural state models designed for cortical columns. Instead, we propose a DCM model for the hippocampus, with neurobiological structure and properties that are specific to subregions of the hippocampus as found in previous experimental studies. In particular, the proposed model for the hippocampus consists of one (or two) excitatory and two inhibitory neural populations for a column since the hippocampus consists of relatively homogeneous excitatory but diverse inhibitory neurons (Bezaire et al., 2016;Kullmann, 2011;Sanjay et al., 2015;Wendling et al., 2002). This composition differs from that of the Jansen-Rit model (two excitatory and "one inhibitory") (Jansen and Rit, 1995;Moran et al., 2013) and canonical microcircuit models (three excitatory and "one inhibitory") (Bastos et al., 2012;Moran et al., 2013) designed for the cortical column.
Using the proposed hippocampus model, we analyzed the effective connectivity in two different groups of mice, available from an open VSDI dataset . The dataset contains stimulation-locked VSDI of hippocampal brain slices of wild-type and epileptic aristaless-related homeobox gene (Arx) conditional knock-out mutant mice (Arx À/þ ;Dlx5/6 CREÀIRES-GFP ) (Marsh et al., 2009). According to Bourgeois et al. (2014), not only the signal amplitude but also the spatiotemporal activity evoked by consecutive 100-ms interval stimuli differs between that of the wild-type mice and the that of the epileptic mutant mice. Specifically, for the first stimulus, hyperpolarization of the cornu ammonis 1 (CA1) region was observed in both wild-type and mutant mice. However, for subsequent stimuli, hyperpolarization of the CA1 region was reduced in wild-type mice while it was maintained in mutant mice (Fig. 1). Furthermore, accumulative signals (stronger responses for consecutive stimuli) particularly in the hilus region were only observed in wild-type mice (Fig. 1).
To explain the differing pattern of accumulative responses for consecutive stimuli across regions and groups (regardless of scale factors), we focused on effective connectivity among neural populations both within and between regions of the hippocampus, rather than local neural properties of the hippocampus. However, not much is known about the inter-regional effective connectivity in response to an external stimulus, for both healthy and altered hippocampus. This may partly be due to the fact that the effective connectivity among neural populations cannot be directly assessed by simply measuring the activity of each neural population, but calls for computational modeling, like DCM. Regarding the altered responses to external stimuli in the conditional knock-out mutant mice (Arx À/þ ;Dlx5/6 CREÀIRES-GFP ), a previous study reported regional alterations in gene expression, e.g., reduced Arx expression in the DG and increased Arx expression in the CA1 of the hippocampus in young mutant mice and only slightly reduced Arx expression in the CA3 in adult mutant mice (Marsh et al., 2016). However, how these regional alterations of Arx expression (and possibly neural composition) by mutation in the childhood affect the intra-and inter-regional connectivity of the adult mice has not been explored. By using computational modelling with appropriate effective connectivity, we expected to find a potential connectivity-based mechanism that produces different patterns of accumulative waveforms in response to consecutive stimuli, and to identify the inter-regional effective connectivity responsible for altered responses in the hippocampus of the mutant mice using DCM for VSDI.
The current paper is composed of four parts: 1) introduction of data sets and construction of neural state and observation models using VSDI-DCM for the hippocampus; 2) Bayesian hyperparameter optimization methods for DCM; 3) group-level comparison of wild-type and mutant mice; and 4) perturbation analysis to evaluate the causal role of each connection in eliciting the abnormal behaviors in the mutant mice.

VSDI dataset
The present study utilized an open VSDI dataset of the hippocampal brain slices of wild-type mice and epileptic Arx conditional knock-out mice (Arx -/þ ;Dlx5/6 CIG mouse) after electrical stimulation at the temporoammonic (TA) pathway .
The open dataset includes VSDI signals of 44 segments (segmented by 0.1 mm) along the stratum radiatum ( Fig. 1A and B) of the hippocampal brain slices. Bourgeois et al. (2014) reported that knock-out mice exhibited frequent spontaneous seizures and behavioral deficits, showing symptoms similar to humans with mutations in the ARX gene. Details of experiment are available at Bourgeois et al. (2014).
The VSDI signals in the dataset were recorded under four repetitive instantaneous (0.1 ms) electric stimuli inserted near the TA pathway with 100 ms intervals. For the first stimulus, hyperpolarization after the stimulation was observed in the CA1 region, but this inhibition was reduced for the following consecutive stimuli. This experiment is different from other experiments (e.g., Ang et al., 2006), in which the CA3-CA1 connection was cut to block direct effects from CA3 to CA1 and thus to eliminate effect of stimuli from the perforant path. In contrast, the unblocked stimuli in the current dataset may affect all the DG, CA3 and CA1 regions. Therefore, the early coactivation at multiple regions in response to the stimuli could have originated not only from the connection between the CA3 and CA1 regions but also from direct electrical conductivity.
To simplify the computational modeling of hippocampal responses to external stimuli, we extracted VSDI signals from three regions of interest (ROIs) in the label map of Bourgeois et al. (2014), which correspond to the centers of the hilus, CA3 and CA1 regions. Among ten and eight slices (each slice refers to an individual in this study) provided for wild-type and mutant mice, respectively, we only utilized the five best slices for each group (chosen by visual screening) for the subsequent analysis, and excluded slices with very weak and noisy responses. We normalized the extracted signals for each group using the maximal sum of entire VSDI signals of the slices, and applied a median filter with a 10 ms window, as used in Bourgeois et al. (2014).

Basics of the convolution-based neural state dynamic model
The VSDI-DCM framework for the hippocampus consists of two parts of modeling; a neural state dynamic model for the hippocampus and an observation model for VSDI. For the neural state dynamic model in the hippocampus, we used a convolution-based model (Jansen and Rit, 1995;Moran et al., 2013) composed of two consecutive transform functions S (V) and h(t).
The sigmoidal (activation) function S (V j ) of a neural population j describes a transformation of the average membrane potential V j to the average firing rate of action potentials, denoted by, with parameters of a maximal firing rate f j , a post-synaptic potential (PSP) reference V 0;j and a slope R j of the sigmoid function. V 0;j is the PSP that achieves a 50% firing rate of a neural population j (Jansen and Rit, 1995). The synaptic kernel h i (t) is a transformation function from an average firing rate coming to a neuronal population i to an average PSP at the neural population i (which can either be inhibitory or excitatory) with a time constant of synaptic kernel τ i and a maximal PSP H i .
Thus, the average membrane potential (V i ) of a neural population i, evoked by an average membrane potential (V j ) of a neural population j, can be described as follows.
If we extend Eq.
(3) to multiple neural populations coming to a neural population i with different connectivity A ij , the convolution-based neural state model can be written by using the following ordinary differential equations: (Eq. 4)  .
Here, V i and I i indicate the membrane potential and cross membrane current of a neural population, respectively. The effective connectivity from a neural population j to a neural population i is denoted by A ij . The external input for the neural populations u i (t) is multiplied by input modulation parameter C i . Of note, since the current experiment does not have experimental conditions as within-subject factors, no modulation input (corresponding to B matrix in the conventional DCM for event related potential (ERP)) was assigned to approximate the VSDI time series in the current study.

Neural state dynamic model for the hippocampal activity
The hippocampus has principal cell bodies located in a layer with dendritic trees and axonal projections in the lamina and is composed of populations of excitatory (principal cells) and inhibitory neurons (interneurons) (Booker and Vida, 2018). The principal cells of the hippocampus, among diverse neural cell types, are pyramidal cells of the CA1 and CA3 regions and granule cells of the dentate gyrus (DG) (Hamilton et al., 2017), and are mostly excitatory. Although the numbers of excitatory neurons are larger than those of inhibitory neurons, excitatory neurons are relatively homogenous (Booker and Vida, 2018). Meanwhile. Inhibitory neurons exhibit more diverse characteristics (e.g., different firing patterns, calcium-binding proteins, and axonal targets) (Freund and Buzs aki, 1996;McBain and Fisahn, 2001) and play critical roles in controlling the hippocampus during learning and memory or in inducing pathological states (e.g., epilepsy, Alzheimer's disease) (Allen and Monyer, 2015;Krook-Magnuson et al., 2013). Therefore, we constructed a neural state model for the hippocampus that consists of one (or two) excitatory and two inhibitory neural populations for a column, based on a convolution-based framework (Bastos et al., 2012;Jansen and Rit, 1995;Moran et al., 2013).
To construct a model for VSDI of the hippocampal slices of the wildtype and mutant mice, we first chose three representative regions of the hippocampus (the hilus, the CA3 and CA1 regions), and modelled the interactions in those regions with ten neural populations. The ten populations are 1) the granule, mossy, DG Basket and HIPROM in the hilus; 2) CA3c Pyramidal, CA3 Basket and CA3 O-LM in the CA3 region; and 3) CA1 Pyramidal, CA1 Basket, and CA1 recurrent O-LM cells (Fig. 2). They were chosen according to the locations of their dendrites (obtained from a morphology matrix in the Hippocampome open database (Wheeler et al., 2015) For the hilus region, two excitatory neural populations (granule and mossy cells) and two inhibitory neural populations (DG Basket and HIPROM cells) were assigned. Both the CA3 and CA1 regions were again composed of one excitatory neural population (pyramidal cell) and two inhibitory neural populations (Basket and O-LM cells).
The electrophysiological properties such as maximal firing rate (f i ), decay time constant (τ i ), thresholds for action potential (V th,i ) and resting membrane potentials (V rest,i ) were employed from the open database (Table 1). However, reference values for the PSP V 0;j and slope R j of the sigmoid function in Eq. (1) are not available in the database. Thus, reference values for V 0;j and R j for each neural population need to be estimated.
Previous studies (Jansen and Rit, 1995;Moran et al., 2013) have assigned the PSP V 0 (initiating at a 50% firing-rate) and slope R j as fixed External electric shocks were presumed to affect all regions of the stratum radiatum and dentate gyrus. Thus, the external electric stimulus u is configured to affect E11, E12, I12, E21, I21, E31, and I31, directly. Considering the distance from the stimulus, we did not set direct input to two neural populations in the stratum oriens (I22 and I32).
constants regardless of neural populations. However, we assigned the reference PSP V 0;j variable for each neural population j in order to allow for diverse neural behaviors. To reduce the number of model parameters, we assumed that the PSP reference (V 0,j ) at a neural population j can roughly be approximated from a voltage shift from its action potential threshold (V th,j ), as shown below.
We also assumed a homogeneous slope across all neural populations (i.e., all R j are assumed to have the same value R). Previously, a steep sloped sigmoid function was used, such as R ¼ 0.67 mV À1 in Moran et al. (2013). However, by parameterizing R, we expected to find a lower R value (a slower slope) to extend the range of monotonic increases in the sigmoid function for each individual. Therefore, the parameters R and V offset are to be estimated to explain a wider response range and diversity of each neural population in the sigmoid activation function of each individual.
Although neural populations in the stratum oriens and dentate are presumed to be involved in the processing of the external stimulus, the VSDI signals for these regions are unavailable in the public dataset. Therefore, we employed such neural populations (i.e., O-LM neural populations) as latent (hidden) in the model. Those latent neural populations constitute the dynamics of neural states according to Eq. (5) but their direct contributions to the observed VSDI signals were set to zero in the observation model.
The Hippocampome open database provides not only known (identified in previous literatures) but also potential connections (morphologically inferred without direct evidences yet) which were presumed to exist based on Peter's rule . According to Peter's rule, axon-dendrite colocations are considered as proxies of potential connections . Thus, in the current hippocampus model, we constructed a full model composed of both types of connections, comprising a total of 41 connections (17 known and 24 potential connections) (Fig. 2). Then, we evaluated and compared reduced models nested in the full model using Bayesian model reduction to test the need for potential connections in the proposed model. The details will be explained in a later section.
To account for potential spatiotemporal delays to diverse neurons in a population during the propagation of electric shock as an external input, we used a Gaussian-convolved waveform u(t) at four onset times, with a narrow Gaussian kernel size of τ ¼ 0.2. The input was assigned to all neural populations within the radiatum and hilus, which are close to the location of the electric stimuli (see Fig. 2).

Observation model for VSDI signal
A linear relation between a VSDI signal and membrane potential has been reported in previous studies (Berger et al., 2007;Chemla and Chavane, 2010a). Accordingly, we employed a linear observation model for the VSDI data. The VSDI signal v r (t) at a region r composed of n neural populations is described as follows.
(Eq. 7) X n i¼1 ρ ri ¼ 1; for all populations in the region r; (Eq. 8) in which α r represents a scaling factor of a signal at a region r to fit the VSDI signal amplitude. V ri and ρ ri indicate a membrane potential at a population i and its contribution ratio to the VSDI signal in region r. In the observation model for VSDI, α r and ρ ri were set as variables to be estimated. The scaling factor α r reflects combinations of the regional neural sensitivity, optical factors and other unspecified factors common across neural populations in region r. Among cell-type dependent factors, the contribution (ρ ri ) of each neural population to a VSDI signal in region r is assumed to be primarily determined by the neural cell density of each population. In the hippocampus, the density of excitatory neurons is known to be much higher than those of inhibitory neurons (Keller et al., 2018). Following this as a principle, we assigned high and low contributions to excitatory and inhibitory neural populations, making the search space for excitatory neural population between 0.5 and 1.0 (i.e., 50%~100%) and inhibitory population between 0.1 and 0.5 in the hyperparameter optimization. These values were sequentially optimized in the hyperparameter optimization, and final results are listed in Table 1. In summary, model parameters in both neural and observation a Abbreviations: resting membrane potential V rest , time constant of the synaptic kernel τ i , maximal firing rate f i , threshold for action potential V th,i , reference values of the maximal PSP for inhibitory and excitatory neural populations H inh/exc , and contribution ratio of the membrane potential at a neural population i to the VSDI signal at region r, ρ ri.
b The first latter of ID represents excitatory I and inhibitory (I). c To obtain optimal neural population dynamics, the threshold for the action potential V th,i of each neural population was used to determine the PSP at the halfmaximal firing rate (V 0,i ) of the activation function. d H inh/exc are the reference values of H i . H i indicates the maximum PSP for the i-th (excitatory or inhibitory) neural population, e Latent neural populations that indirectly affect the observed signals (ρ ri ¼ 0). models include parameters for neural properties, interactions (or synaptic connectivity) among neural populations, and contribution ratios for membrane potentials in the observation model. Reference values for the neural properties of all types of neural cells used in this study are listed in Table 1. Note that polarity signs for all neural properties were determined beforehand, i.e., positive and negative signs were used to represent the effects from excitatory and inhibitory populations, respectively. Thus, parameters for the neural properties are confined to positive values. The positivity of parameters can be achieved by utilizing either exponential transformation or absolute transformation. The effective connectivity A ij was formulated by absolute transformation of a parameter θ, i.e., A ij ¼ abs (θ), which follows the normal distribution N (A intra/inter , 1/32), with A intra/inter as a reference value for intra-or inter-regional connectivity. Using absolute transformation is advantageous in making the effect size zero (or no connection) in the effective connectivity, which is not trivial in the exponential transformation. Other parameters were given by the multiplication of reference values C 0 and exponentiated parameters θ i , e.g., Table 2).
Parameters to be estimated are listed in Table 2. Despite simplification of the neural state model for the hippocampus, there are still too many unknown reference values and parameters to be estimated. Therefore, the next step is to confine reference values within biologically plausible ranges and to reduce model parameters in constructing a simplified realistic model. Fig. 3 summarizes the model construction procedure for VSDI-DCM for the hippocampus conducted in the current study.

Model inversion using DCM
The parameters in both the neural state model and the observation model were inferred from VSDI data using a standard variational Bayesian scheme under Laplace approximation (Friston et al., 2003(Friston et al., , 2007, implemented in the SPM12 toolbox (https://www.fil.ion.ucl .ac.uk/spm/, spm_nlsi_GN.m). The details for DCM inversion steps can be found elsewhere (Friston et al., 2003(Friston et al., , 2007 and briefly in Appendix 1. After estimating DCM for each individual, the model evidence was evaluated at the group-level using parametric empirical Bayesian modeling (PEB) for DCM (Friston et al., 2015(Friston et al., , 2016. The PEB for DCM is designed to account for random (between session and subject) effects on parameters that are estimated during the first (within session or within subject) level of DCM. More specifically, the observation of the i-th subject, y i , in the first level (within-subject), is described by a function Γ(θ i ) of its parameters θ i and estimation error ε i (1) . At the second level (between-subject), individual DCM parameters are modelled with a group design matrix X, between-subject effects β and random effect ε (2) , (Eq. 10) The group-level approximation for parameters or connectivity β can be obtained by using a design matrix of the second level X ¼ [1 1 1 1 … 1] T (N subjects Â 1). We then have a group-level (approximated) model evidence (or free energy) for the observed data in a group, which was used as a model fit index for model comparison and model selection.
We used a free energy (an approximate log model evidence, See Appendix 1 for its definition) metric to evaluate model adequacy. We also used the percent explained variance (EV) as an index for the model fit of the experimental data.
EV ¼ 100 Á P r;t ðy i ðtÞÞ 2 P r;t ðy i ðtÞÞðy i ðtÞÞ 2 þ P r;t ðy r ðtÞ À y model;r ðtÞÞ 2 ; (Eq. 11) in which y r (t) and y model,r (t) respectively represent observation and estimated signal of a region r at time t.

Model construction using a sequential Bayesian hyperparameter optimization
Experimental reference values for neural properties in the model would confine search space within a biologically plausible range. However, some neural properties do not have reference values nor have a priori knowledge for their ranges. For this case, we introduced a method to approximate reference values by using a Bayesian optimization technique, which searches appropriate sampling points of hyperparameters based on a model of Gaussian process.
The Bayesian optimization of hyperparameters, which has been highlighted in the field of machine learning, tunes hyperparameters that determine a model structure (or configuration of model structure). In this scheme, we search a global maximizer of an objective function F(β) (here, free energy function) with a hyperparameter vector β (β 2 M), within a model design space of interest M. Note the hyperparameter vector β, based on which parameters are estimated in DCM, are used as a parameter of the objective function F. To find optimal hyperparameters efficiently with the minimum number of steps (i.e., evaluations of the objective function, a computationally high-cost function), Bayesian optimization is composed of two steps -approximating the objective function (using a surrogate model) and an acquisition function to decide where to sample (evaluate). The surrogate model incorporates prior belief about the objective function and updates the prior with samples the intra-and inter-regional effective connectivity. Since the self-recurrent connectivity of the CA1 region is known to be weaker than that of the CA3 region, we assigned A inter instead of A intra for the self-recurrent effective connectivity of a pyramidal population in the CA1 region.
evaluated from the function to derive a posterior and thus leads to a better approximation for the function. Based on the posterior of the objective function, the acquisition function determines the next sample β for evaluation that is expected to improve the approximation best over the currently accumulated evaluations. In the present study, we adopted a Gaussian process model for the surrogate model (Snoek et al., 2012) and an expected-improvement-per-second-plus function for the acquisition function (Bull, 2011;Gelbart et al., 2014). This approach is highly advantageous for cases with a high computational cost for model evaluation, relatively small dimension of hyperparameters, and unavailable hyperparameter gradients. The optimal (unknown) reference values obtained after Bayesian hyperparameter optimization (and known reference values) are used as constants in the DCM model. As explained in Supporting Information S1, DCM model inversion is also a Bayesian parameter optimization process, using the variational expectation-maximization (EM) (Friston et al., 2007). Variational EM-based DCM estimates not only connectivity model parameters but also internal hyperparameters for controlling error and parameter precisions (denoted as λ, in the right upper panel of Fig. 3). The hyperparameters λ inside DCM differ from the hyperparameters designated to indicate unknown reference values, which are to be estimated before DCM model inversion. To differentiate it from the Bayesian parameter optimization technique used in DCM, we named the Bayesian optimization for exploring unknown reference values as "Bayesian hyperparameter optimization" and utilized the MATLAB function 'bayesopt' (Mathworks, co. USA) with a default acquisition function ('expected-improvement-per-second-plus'). As a cost function, we used negative free energy (approximate model evidence) of VSDI-DCM to be minimized by parameter updating. Considering the high computational load in each DCM inversion using variational EM, this approach is highly Fig. 3. The procedure and variables for VSDI-DCM for the hippocampus. To construct a biologically plausible model, we referred literatures and public database (Hippocampome) to determine reference values used in the model as constants (A). Parameters which are not available from the database were optimized by performing Bayesian hyperparameter optimization (B and C). The fixed (same across individuals) and unfixed (allowing individual variances) usage of the maximal postsynaptic potential H i for each neural population i were determined by using the Bayesian model reduction scheme (D). Group-level Bayesian model comparisons were performed to find the best model of connection combinations (E). For the best model, we performed group comparison analysis to extract group differences in effective connectivity (F). Finally, using perturbation analysis, the causal effect of each effective connectivity in causing abnormal behavior in the mutant mice was evaluated (G). The right upper panel summarizes the reference values, hyperparameters and parameters. In the DCM (using the variational expectation-maximization, EM), the initially chosen reference values and the optimized hyperparameters (unknown reference values searched using Bayesian hyperparameter optimization) were used as constants in the DCM model. The effective connectivity A ij , input modulation scale C i , and maximal post-synaptic potential H i across neural populations, were parameters estimated in the DCM. It should be noted that variational EM-DCM also estimates hyperparameter λ for error and parameter covariances, which differ from hyperparameters of unknown reference values. For details see the main text. efficient in reducing the number of DCM inversion steps by utilizing appropriate sampling of hyperparameters in a model driven approach (i.e., Gaussian process) during the optimization.
In the current model for the hippocampus, 23 reference values are to be determined using Bayesian hyperparameter optimization (see Tables 1 and 2). Since it is impractical to search for an optimal value in a high dimensional space, spanned by all reference values at a time, we propose a sequential optimization method of searching (optimizing) for a subset of the reference values, followed by searching the other subset with the previously estimated sets fixed (Fig. 4). Subsets for the reference values were maximal PSPs of the excitatory and inhibitory neural populations (H exc and H inh ), modulation effects for external input signal (C 0 ), VSDI amplitude scaling factors for a region r (α r , r ¼ 1,2,3), contribution ratios (ρ ri ) for each neural population i (i ¼ 1,2,3 for CA1 and CA3 or 4 for the hilus) at region r and prior means of intra and inter-regional connectivity (A intra and A inter ) (Fig. 4A).
V offset for the reference PSP (in Eq. (6)) and the slope R of the sigmoid function were also estimated using the Bayesian hyperparameter optimization approach. The order of the subsets (as well as categorization for subsets) was determined empirically; we first chose parameters H 0 and C 0 during the first stage since those values changed (generated) signals dramatically. After the sequential optimization, we finally performed Bayesian optimization for all reference values (H exc , H inh , C 0 , α r , ρ ri , A intra , A inter , V offset , R) within the hyper-space search range around the pseudooptimal set of reference values in the sequential optimization.
We applied Bayesian hyperparameter optimization to the group averaged VSDI signal after normalizing the mean amplitude of each slice (in the wild-type mice) to be in the level of the maximal mean amplitude across slices. Scales of VSDI signals differ not only across slices but also across groups. The scales of VSDI signals of the wild-type mice were approximately two or three times larger than those of the mutant mice. Based on the reference values found in the wild-type data, we sequentially estimated regional scaling factors for the mutant group by estimating 1) the global scaling factor a 0 and 2) regional scaling factors specific to each region r (α r , r ¼ 1,2,3) (step 6 in Fig. 4A).
In summary, by introducing a sequential Bayesian hyperparameter optimization scheme to DCM, we found optimal reference values in high dimensional space, which reduces trial and errors in finding those values to construct a VSDI-DCM for the hippocampus.

Model reduction: decision for fixed and unfixed usage of reference values in the model
In the VSDI-DCM, the reference values (adopted from previous literatures or estimated from Bayesian hyperparameter optimization) can be used as either fixed constants or unfixed variables to be estimated in the DCM. For the unfixed usage, the reference value is combined with a modulation parameter θ that controls an effect that has deviated from the reference value, by using either the exponential transformation C i ¼ C 0 exp (θ i ) with θ i~N (0, 1/32) or the absolute transformation A ij ¼ abs (θ ij ) with θ ij~N (A intra/inter , 1/32), as listed in Table 2.
We now propose a method to determine whether to fix a reference value or to relax the effect as a variable using a modulation parameter. For example, we determined neural populations that do not necessarily Fig. 4. A procedure for sequential Bayesian hyperparameter optimization. (A) In the Bayesian hyperparameter optimization, a total of 23 unknown reference values (unavailable from previous literatures) were sequentially optimized. (B) In each step, we searched 120 data points in the hyperparameter space, and compared the highest free energy to that of the previous step. Estimated parameters of the final step were used in the first-level (individual slice) model inversion using DCM, which optimizes the effective connectivity (A ij ) from neural population j to i, the strength of input external stimuli (C i ) and the modulation parameter for maximal PSP (H i ) of the neural population i. (C) The cost function (the negative free energy) in step 1b is shown as an example of Bayesian hyperparameter optimization.
need "free" maximal PSP H i to account for individual variability. For each slice of the wild-type, we estimated a full DCM model that comprises free (with modulation parameters θ) H i at all 10 neural populations, followed by Bayesian model reduction (BMR) (Friston and Penny, 2011). We generated 2 10 combinations of reduced models nested in the full model by shrinking H i (setting both expectation and covariance to zero, e.g., H i ¼ H exc exp (0) for an excitatory population or H i ¼ H inh exp (0) for an inhibitory population). We then applied PEB for all reduced models to estimate group-level (approximated) model evidences (free energy) (Friston et al., 2016). If a population has a free H i in the optimal model with the highest group-level free energy, it indicates that H i may well be estimated to explain individual variance in the signal of neural population i at the cost of increased complexity (Figs. 3D and 4A).

Model reduction: decision for inclusion of potential connections in the model
In the hippocampus, some connections are not observed yet but may potentially exist (i.e., morphologically inferred connections) (Wheeler et al., 2015). To determine which combination of potential connections could be included in the model to explain the hippocampal responses better, we applied group-level BMR.
First, for each slice of wild-type mice, we estimated a full DCM model that comprises all known and potential connections. Using BMR, we generated reduced models nested in the full model by shrinking (thus removing) some of potential connections (setting both expectation and covariance to 0) (Friston and Penny, 2011). To find the best combination of connections to remove, we conducted a greedy search method at the group-level, by extending model reduction using greedy search at the individual-level DCM (Friston and Penny, 2011) to the group-level.
Since the full model of the present study includes 24 potential connections, it is not trivial to evaluate all 2 24 reduced models. Therefore, by shrinking one-by-one the potential connections from the full model, we generated 24 reduced models. We then applied group-average PEB (with a vector of all ones for individuals in the design matrix X in Eq. (10)) for the reduced models to estimate group-level (approximated) model evidences (free energy) (Friston et al., 2016). The group-level free energies of the reduced models are compared with that of the full model and are used to determine whether to include (higher group-level free energy in the reduced model) or exclude (lower free energy in the reduced model) the connection in the subsequent reduction of potential connections.
By doing this, we first selected the best eight (out of 24) connections that increased group-level free energy maximally after removal of those connections. Such connections indicate redundant contributions to the observed signal. Then, we constructed 2 8 reduced models (binary combinations of 8 connections) and evaluated the group-level free energies for the reduced models using PEB. The reduced model with the highest free-energy was selected and the corresponding shrunken connections were removed. These steps were performed iteratively until the reduced models no longer improve free energy. This is an extension of the greedy search in conventional BMR implemented in spm_dcm_bmr_all.m of SPM12, which evaluates free energy at the individual level.
So far, we have described a group-level model selection (reduction) procedure using PEB by reducing (or shrinking) modulation parameters θ about neural properties (e.g., H i ) and by greedy searching of potential connections. We re-estimated DCMs with the reduced model configuration for each slice.

Bayesian group comparison of effective connectivity between the wildtype and mutant mice
For the mutant mice, after optimizing scale factors α for the average signal of the mutant mice, DCM inversion for individual slice was applied to a mutant-specific DCM model. We then evaluated group-level differences in the parameters (e.g., effective connectivity) between wild-type and mutant mice using PEB. For this, we assigned a design matrix X ¼ [X1 X2] in which X1 ¼ [1 1 … 1] T (N subjects of wild-type and mutant Â 1) and X2 ¼ [1 1 … 1 -1 -1 … À1] T (subjects of wild-type Â 1 and mutant Â À1) in Eq. (10). The parameter for the first column indicates the group common effect while the parameter for the second column indicates the group difference effect.

Perturbation analysis to evaluate causal effects of connections to mutant responses
Mutant mice had low VSDI signals and did not show the cumulative response patterns for consecutive stimuli that were observed in wild-type mice. Among many potential parameters that may explain in altered signals in mutant mice, we attributed altered response patterns (nonaccumulated waveforms) of mutant mice to their altered effective connectivity. To focus on between-group differences in the accumulative waveforms (which we attribute to the effective connectivity differences), rather than signal amplitudes, we separated the effects of scale factors α from the connectivity factors by estimating scale factors for each group.
As mentioned before, not all the connections that show group differences may be critical to induce altered responses in mutant mice. To test causality of the effective connectivity for altered responses in the mutant mice, we adopted a perturbation method proposed in Kang et al. (2017). We artificially perturbed the wild-type system by replacing some or all of the wild-type connectivity (showing significant group differences) with those of mutant mice without changing other parameters of the wild-type mice (except for scales), in order to generate mutant like VSDI signals.
First, we constructed a virtual model with parameters from the wildtype mice. Scaling factors of the observation model are derived from the mutant mice (α r , r ¼ 1,2,3). Then, all possible combinations of connections (showing group difference) of wild-type mice were replaced with those of mutant mice. If the replacement generates similar signals to those of the mutant mice, we could argue that the replaced combination of connections is critical in explaining the underlying mechanism of the mutant mice. From ten group-differing connections detected using PEB analysis (the details will be explained in the Result section), we constructed 1024 (2 10 ) combinatory replacement sets of connections between the wild-type and mutant mice. We simulated VSDI signals for each replacement set, and compared the generated VSDI signals with those of the target mutant signals. We then evaluated at each region the ratio of EV for each replacement combination to EV for the wild-type without replacement, as follows. in which y target,r (t), y wt,r (t), and y model,r (t) designate signals of a region r at time t generated with model parameters of the target mutant, wild-type, and replaced model parameters. To focus on the immediate responses for stimuli, we restricted the time range of evaluation from 0 to 500 ms. Table 3 summarizes all the procedures for VSDI-DCM for the hippocampus.

Improved model estimation by Bayesian hyperparameter optimization
Sequential Bayesian hyperparameter optimization increased the model evidence for the average VSDI signal of the wild-type mice (Fig. 5). The EV increased from 91.4% to 98.6% by using Bayesian hyperparameter optimization. A simple simulation is also presented in Supporting information S1 to show the performance improvement by using Bayesian hyperparameter optimization, compared to a one-step DCM that estimates all reference values (hyperparameters) and parameters simultaneously within the conventional DCM. Bayesian hyperparameter optimization for reference values of intra-and inter-regional effective connectivity (A intra/inter ) found optimal values at 0.79 and 0.39, which were used as prior expectations (with prior covariance of 1/32) for the intra-and inter-regional effective connectivity A ij in the neural state model in Eq. (5). (see Table 2).
The average VSDI signal of the mutant group was roughly four times weaker than that of the wild-type group. When we estimated scaling factors for all regions of the mutant mice as well as those of the wild-type mice, the scaling factors of the mutant mice were 0.3-0.4 times lower than those of the wild-type mice; [α 1 , α 2 , α 3 ] wt ¼ [1.30, 0.44, 0.18] and [α 1 , α 2 , α 3 ] mut ¼ [0.50, 0.15, 0.06] (Table 1).
Based on the parameters estimated from group average signals, we performed DCM inversion for individual slices. EVs of individual slices were 84.5-95.1% (mutant), and 93.3-98.3% (wild-type).

Parameterization of the membrane potential-to-firing rate transforming sigmoid function
Using Bayesian hyperparameter optimization, we obtained R ¼ 0.21 mV À1 and V offset ¼ 7.1 mV for the sigmoid function (membrane-potential to firing-rate). To gain some insight about these values, we evaluated the effects of a sigmoid function with regard to two parameters R and V 0 by simulating neural responses with the previous fixed parameter set (R ¼ 0.67 mV À1 ) (Moran et al., 2013) and the parameter set obtained from the Bayesian hyperparameter optimization (R ¼ 0.21 mV À1 , V offset ¼ 7.1 mV). We constructed a simple computational model which consists of pyramidal (E31) and basket (I31) neuronal populations at CA3. We set one connectivity from E31 to I31, A I31,E31 ¼ 0.3. Four electric pulse stimuli with 100 ms interval were applied on the pyramidal neural population with the parameters of the sigmoid function: R ¼ 0.67 mV À1 and V 0 ¼ V th (À50.0 mV), and with the parameters from the Bayesian hyperparameter optimization: R ¼ 0.21 mV À1 and V 0 ¼ V th þ V offset ¼ À42.9 mV. To evaluate the effect of each parameter for different input strengths (controlled by changing modulation parameter θ i with À1.6, À0.8, 0.0, 0.4, and 0.8), we performed simulations of four models: 1) R ¼ 0.21 mV À1 , V 0 ¼ À42.9 mV; 2) R ¼ 0.67 mV À1 , V 0 ¼ À42.9 mV; 3) R ¼ 0.21 mV À1 , V 0 ¼ V th ¼ À50.0 mV; and 4) R ¼ 0.67 mV À1 , V 0 ¼ À50.0 mV.
The sigmoid functions with the estimated slope (R ¼ 0.21 mV À1 ) shows a wider range of monotonic increase in the range of À60 mV to À20 mV compared to those with R ¼ 0.67 mV À1 (Fig. 6B). Increased V 0 parameter (V 0 ¼ À42.9 mV) shifted the monotonic increase range to a higher voltage range. The model with R ¼ 0.21 mV À1 , V 0 ¼ À42.9 mV showed a wider and increased dynamic range of responses at I31 for different strengths of inputs (Fig. 6C).
A further simulation result presented in Supporting information S2 suggests the need for hyperparameter optimization of R and V offset when applied to real data set; this is explained in the following section.  Fig. 7B present examples of predicted signals at a wildtype slice and a mutant slice, respectively. Here, the effective connectivity (A ij ), input strength (C i ), and maximal PSP (H i ) were allowed to be different across individual slices. Free parameterization of H i was implemented by utilizing modulation parameter θ i that controls an effect deviating from the reference value, i.e., H i ¼ H exc exp (θ i ) for excitatory neural populations or H i ¼ H inh exp (θ i ) for inhibitory neural populations, with θ ι~N (0, 1/32) (see Table 2). Both H exc and H inh were obtained via Bayesian hyperparameter optimization. The modulation parameter θ i of H i for each neural population did not significantly differ from zero (95% credibility interval belonging to zero), i.e., H i Â expð0Þ, in these examples of both the wild-type and the mutant mice.

Bayesian model reduction
Next, we conducted group-level model reduction to determine Table 3 Summary of all procedures for VSDI-DCM.

Procedures a Relevant information and results
Construction of a neural state model: Assign (known) reference values available from literatures and the Hippocampome Table 1 Bayesian hyperparameter optimization: Decision of optimal reference values for unknown neural properties in wild-type mice H exc , H inh , ρ ri , C 0 , V offset , R, prior mean of A intra and A inter , and α i using averaged VSDI data of the scaled signals of wild-type mice Two H i were chosen to be free.

Figs. 3D and 8
Selection of the best connection model using a greedy search algorithm for the best group model evidence. Selection of the best model after greedy search for all reduced models, utilizing PEB.
A full model with all the potential connectivity was chosen.

Figs. 3E and 8
Group comparison of effective connectivity using PEB

Figs. 3F and 9
Perturbation analysis for causality of the effective connectivity Figs. 3G and 10 a Index of i and r represent id of neural population and region, respectively.
whether to assign H i for each population as a fixed constant or as a free parameter (variable across individuals). The concept and results of the group-level model reduction with multiple combinations of on-off H i across neural populations are presented in Fig. 8A and B. Group-level model reduction found that the best model among 1024 (2 10 ) models of all possible on-off H i combinations was the reduced model with all fixed H i but one free H i (H E12 ) ( Fig. 8A and Fig. 8B). Therefore, a model with one free parameterized H i (H E12 ) was used in the subsequent analysis of individuals.
To evaluate potential connections (connections presumed without solid references but morphologically inferred), we further conducted model optimization by evaluating the gain (in model evidence) of including potential connections using a greedy search algorithm. Fig. 8C and D display the concept and the result of the greedy search algorithm that explored the best combination of potential connections for the hippocampus model, respectively. Although some reduced models among the 256 final models after the greedy search showed relatively higher group-level free energies than others, the full model showed the highest free energy (Fig. 8D). Therefore, we selected the full model, which includes all potential connections in the model for the subsequent analysis. Although the using greedy search algorithm did not result in reducing model parameters in this particular example, it did provide a rationale for the usage of all possible connections in the hippocampus model. This result also supports the prediction raised by Peter's rule (morphologically predicted connectivity) . Fig. 9 displays PEB results for group common and differences in the model parameters. In the group common connectivity (Fig. 9A and C), the strength of the intra-regional connectivity was generally higher than that of the inter-regional connectivity.

PEB analysis between wild-type and mutant mice
PEB identified ten significantly different effective connectivity between groups in a Bayesian criterion (with a 95% credible interval) ( Fig. 9B and D). The three group-differing inter-regional connections were from HIPROM of the hilus to the pyramidal neural population of the Fig. 6. Effects of four firing-rate transfer sigmoid functions. A simple system, which simulates a pyramidal (E31) and a basket (I31) neuronal population of the CA1 region, is shown in (A). Sigmoid functions of E31 evaluated are shown in (B). The membrane potentials (V m ) of E31 and I31 are shown in (C). The inputs to E31 were weighted by modulation values, C 0 exp(θ i ). For simulation, we used five different modulation parameters θ I of À1.6, À0.8, 0.0, 0.4, and 0.8 to induce different amplitudes of input effects. Four models with different sigmoid functions are R ¼ 0.21 mV À1 and V 0 ¼ À42.9 mV (solid red); R ¼ 0.67 mV À1 and V 0 ¼ À42.9 mV (solid blue); R ¼ 0.21 mV À1 and V 0 ¼ V th ¼ À50.0 mV (dotted red); R ¼ 0.67 mV À1 and V 0 ¼ V th ¼ À50.0 mV (dotted blue). The purple rectangle corresponds to the range between an action potential threshold (À62.8 mV) and a maximal membrane potential (À27.3 mV) at θ i ¼ 0 in (C). Fig. 7. Representative examples of individual DCM analysis of a wild-type (A) and a mutant mouse (B). The predicted VSDI with a model of free parameters [A ij , C i , H i ] in the DCM estimation is presented for three hippocampus regions. Blue, red, yellow lines in the left column represent VSDI signal at the hilus, CA3 and CA1 regions, respectively. Dotted lines and solid lines show observed and predicted VSDI signals. Expectation of parameters are presented in gray bars with a credible interval of 95% in red bars.  CA3 region (I12 → E21); from the pyramidal neural population of the CA3 region to the pyramidal neural population of the CA1 region (E21 → E31); and from the pyramidal neural population of the CA3 region to the recurrent O-LM population (E21 → I32). All inter-regional connections along the pathway from the hilus to the CA1 region were mediated by the pyramidal neural population of the CA3 region (E21). Three groupdiffering intra-regional connections in the CA1 region were associated with the O-LM population (I32): self-recurrent inhibition at the O-LM (I32), reciprocal connections between the O-LM and the pyramidal neural populations (I32 → E31 and E31 → I32). The self-recurrent inhibition at the O-LM (I32) in the CA1 region, self-recurrent excitation at the pyramidal population (E21) in the CA3 region, and inter-regional connection from the pyramidal population (E21) in the CA3 region to the pyramidal population (E31) in the CA1 region showed decreased effective connectivity in the mutant mice as compared to the wild-type mice. Except for these three connections, the other seven connections showed increased connectivity in the mutant mice compared to the wildtype mice.
Except for two excitatory-excitatory connections from E21 (E21 → E21 and E21 → E31), the connectivity showing group difference is associated with inhibitory effects, either with modulation from inhibitory neural populations or projection to inhibitory neural populations. Based on this result, the mutant mice seem to have hyper-inhibition among neural populations in processing the consecutive stimuli, which may result in reduced signals and no accumulative VSDI signals.

Causal effects of effective connectivity: a perturbation analysis
For the causality test, we performed a perturbation analysis for effective connectivity in generating mutant-like signals (Fig. 10). Causal effects of all possible combinations (Fig. 10B) for the ten groupdiffering connections (Fig. 10A) are presented in Fig. 10D. As shown in Fig. 10C, replacement of all ten connections with those of the mutant mice (and any combinations of the ten connections) was not sufficient to reproduce mutant-like responses at the hilus (Fig. 10C, D, and 10E). This implies the necessity for additional connections other than just the group-differing connections to induce abnormal response at the hilus of the mutant mice. On the contrary, mutant-like abnormal responses in the CA3 region were achieved even by replacing a single connectivity from I12 to E21 (combinations of replacement 1 to 256 in Fig. 10D); any combinations of replacement including the I12 to E21 connection increased the ratio of EV/EV 0 (to explain mutant-like signals) in the CA3 region ( Fig. 10D and F). In the CA1 region, however, the effects of replacement combinations were complex (Fig. 10D). The ratio of EV/ EV 0 in the CA1 region increased with the number of replaced connections (Fig. 10G). Except for the signal at the hilus, mutant-like responses could be obtained by replacing only four connections (I12 → E21, I32 → E31, E21 → I32, I32 → I32), not necessarily requiring replacement of Fig. 10. Evaluation of the causal role of each effective connectivity by replacement of wild-type connections with mutant connections. (A-B) Ten effective connectivity that showed group differences in PEB analysis (A) and all possible combinations of those connections to be replaced (B) are presented. (C) Simulated VSDI signals at the hilus, CA3 and CA1 regions are presented in blue, red, and yellow. The dotted lines and black solid lines represent target mutant signals and wild-type signals, respectively. Colored solid lines in the middle and bottom panels in (C) show signals that are generated with a model with four replaced connections (I12→E21, I32→E31, E21→I32, I32→I32) and a model with ten replaced connections. (D-G) The ratio of EV/EV 0 at the three regions for all possible combinations of replacements are displayed in (D). The ratio of EV/EV 0 according to numbers of replaced connections at the hilus, CA3 and CA1 regions are shown in (E-G). The effective connectivity I12→E21 (asterisked in A) significantly increased EV/EV 0 of CA3. ten connections (Fig. 10C).

Bayesian hyperparameter optimization
The success of computational modeling lies in striking a right balance between the simplicity and biological plausibility of the model. In construction of a biologically plausible model, the reference values for neurobiological properties play critical roles not only in confining model solutions within the biological ranges but also in simplifying the model to be estimated. However, not all neurobiological reference values are available. Furthermore, even the existing reference neurobiological properties may not necessarily be true for different animals in different context without adjustment. Thus, we proposed a two-step Bayesian optimization method by performing Bayesian hyperparameter optimization for model reference values, followed by optimization of model parameters (e.g., effective connectivity) in DCM. The Bayesian hyperparameter optimization we propose searches optimal reference values to maximize group-level model evidence as an efficient way of reducing the number of DCM inversions in the optimization. Although the computational cost for EM optimization during DCM inversion may not be high for a small network, the optimization of an unknown reference for DCM requires iterative evaluations with different samples of the reference. When multiple reference values have to be estimated, it is not trivial to conduct optimization in all multi-dimensional spaces (reference values) at a time. Thus, we proposed a sequential optimization approach of searching (optimizing) a subset of reference values, followed by searching the other subset, in order. Then, we confine the search space around optimal values for each parameter over which Bayesian optimization for all the reference values had been conducted. This sequential Bayesian hyperparameter optimization approach makes it practical to utilize Bayesian optimization to a problem in very high-dimensional space. We found improved model fit results by using Bayesian hyperparameter optimization (in both the simulation and experimental results, see also Supporting information S1).
The order of reference values to be estimated in the sequential optimization inevitably bias the optimization unless the reference values are independent. In this study, we performed the sequential Bayesian hyperparameter optimization by using an empirically determined order, which was determined by ordering the amount of free energy changes after changing each hyperparameter by trial and errors. Although we conducted Bayesian hyperparameter optimization with all reference values within a restricted range of previously determined optimal values to reduce such order-dependent effects at the final stage, we cannot rule out the possibility of a better solution that may be found by considering all reference values together from the beginning at high computational cost. Determining the order of reference values for optimization will be a challenging problem and should be addressed in future studies.

Adjusted sigmoid firing-rate transfer function in the neural population dynamics
For the neural state model of ensemble activity, we used a convolution-based model that comprises inter-modal connectivity with a sigmoid transform function that converts the pre-synaptic membrane potential to the firing rate for each population and a synaptic kernel mapping from the firing rate to the post-synaptic membrane potential (Jansen and Rit, 1995;Marreiros et al., 2008;Moran et al., 2013). Since the convolution-based neural state model focuses on the probabilistic information transfer between neural populations, the ensemble activity of neurons is generally described by the mean property of neurons in the population. Thus, the firing-rate transfer sigmoid function (σðvÞ in Eq. (1)) converts the presynaptic mean membrane potential to the mean firing rate of a local population (Liley et al., 2009;Marreiros et al., 2008Marreiros et al., , 2010Spiegler et al., 2010). The mean postsynaptic potential is calculated in terms of convolution of a synaptic kernel with the mean firing rate of a neural population (called "neural mass") (Freeman, 1975;Jack et al., 1975;Jansen and Rit, 1995;Wilson and Cowan, 1972). In practice, it is not easy to find references for the parameters of the firing-rate transfer function for each neural population, such as the PSP at the half-maximal firing rate V 0 , maximal firing rate (f max ), and slope (R).
In the initial Jansen-Rit model (Jansen and Rit, 1995), the authors used 6 mV for V 0 (adopted from Freeman, 1987) and R ¼ 0.56 mV -1 for EEG modeling (based on the local field potential with a distant reference), which may not be applicable to a membrane-potential-based model of the hippocampus (e.g., the membrane potentials of general neural cells around À60~À80 mV may not achieve the level of 6 mV).
The firing-rate transfer function in a neural population can be described as a combination of population dynamics (proportion of firing neurons at the membrane potential) and a firing rate function of the membrane potential of each neuron. Due to the lack of sufficient references for the latter (indeed, derivation of the PSP at the half-maximal firing rate V 0 in a neuron is not trivial in practical experiments, as noted in Jansen and Rit (1995)), we estimated the parameters for the firing-rate transfer function by introducing Bayesian hyperparameter optimization. Since a PSP increases the firing rate of a neuron, the populational transfer of the average membrane potential to the mean firing rate is considered to operate approximately within a range of the average action potential threshold and a certain voltage above that threshold. Thus, we optimized the population reference PSP V 0 (inducing the 50% maximal firing rate) at a higher voltage than the known action potential threshold. We also expected to find a wider range of monotonic increase from the action potential threshold to the upper bound of the membrane potential. As shown in Fig. 6, the estimated parameter R ¼ 0.21 mV À1 of the sigmoid function induces wider ranges of monotonic increase than R ¼ 0.67 mV À1 used in Moran et al. (2013) and allows firing rates to depend more on the strength of input signals rather than binarized responses. The model with the optimized parameter set (R ¼ 0.21 mV À1 and V 0 ¼ À42.9 mV) has a wider dynamic range of the sigmoid function that responds to a monotonic increase for different strengths of inputs as shown in Fig. 6C. To reduce model parameters, we simplified the model by using the same sigmoid function with an offset from the action potential threshold (differing across neural populations) and a slope independent of brain regions. Although this approach works well as shown in Supporting information S2, these simplifications could be improved in further studies.

Model simplification by a group-level model reduction
After implementing Bayesian hyperparameter optimization for reference values, the decision for fixed and unfixed usage of each reference value in DCM depends on the hypothesis or question of interest. This is the question of whether or not to allow individual differences in the maximal PSP H (as an example of a reference value) in explaining individual VSDI data. In this study, we proposed a method to determine whether to assign a modulation parameter for the reference value H to the effect of H as a variable to be estimated (unfixed) or simply use H as a fixed constant in DCM by comparing second-level model evidence (approximated free energy) obtained from PEB analysis (Table 1). This approach can be a useful tool when no clear hypothesis for the effects of inter-individual variability of each parameter is available.
For the neural circuit of the hippocampus, we constructed a full model with all known and potential connections based on a hippocampus database. In the hippocampus, the potential connections are assumed to exist based on Peter's rule (morphologically predicted connectivity) , but the functionality of these connections has not been fully evaluated. To test the need for a potential connection in the processing of stimuli, we performed a Bayesian model reduction of combinations of connections (Friston et al., 2016), in a greedy search manner. The greedy search-based model reduction suggests that the full model is the best model for explaining the observed VSDI signals. This is a good example that Bayesian model comparison may contribute to testing plausible circuitry for which no anatomical evidence is yet available.
In a technical perspective, the novelty of the current procedure (testing the need for potential connections) is the introduction of a greedy search method for group-level model evidence, which extends the greedy search-based model reduction at an individual-level DCM (Friston and Penny, 2011). In particular, evaluation of model evidence for each connection on a group-level using PEB is advantageous when modeling noisy signals as well as noisy individuals (variation from group mean). Since the second-level model regularizes estimation of parameters in the context of both the first and second-level variations, we speculate that group-level evaluation is more robust to estimation errors than evaluation at an individual level. Note that the current method is similar to the hierarchical model comparison method implemented in spm_dcm_peb_bmc.m, except for the order of implementation. The proposed method conducts model reduction (during the greedy search) in the first (individual) level and test the free energy for the group-average of the reduced model using PEB at the second level. On the other hand, spm_dcm_peb_bmc.m constructs a group-averaged full model using PEB and conducts greedy search for reduced models nested in the full model in the second level. The cons and pros of each method remain to be evaluated in future studies.

Computational models for the hippocampus and DCM
As described in the introduction, the hippocampus is a compelling target for the computational modelling, particularly on the microscopic circuit level. Bezaire et al. (2016) performed perturbation analysis using a realistic full-scale model of the rodent CA1, which includes one excitatory neuron type (pyramidal cells) and eight inhibitory (interneuron) types, and suggested that parvalbumin-expressing interneurons, neurogliaform cells, and inter-neuronal diversity are important for generating intrinsic theta rhythm (Bezaire et al., 2016). Sanjay et al. (2015) conducted a computational model study of the temporal lobe epilepsy. By modeling the CA3 region using pyramidal cells, basket cells, and O-LM cells, they investigated how loss of connectivity between the O-LM and pyramidal cells influences the behavior of other types of neurons, how changes in external inputs received by pyramidal neurons affect their activity, and how changes at synapses between the neurons affect the overall network activity in the CA3 subfield (Sanjay et al., 2015). The authors concluded that impaired dendritic inhibition and changed synaptic plasticity can induce epileptic activity. Most of these computational modelling studies have focused on modelling intra-regional connectivity at a specific region of the hippocampus. Furthermore, these complex computational models were not designed to fit individual data and allow for inter-individual variations. By introducing a computational model for the hippocampus on a mesoscopic level using the DCM framework, we were able to explore stimulus-dependent intra-and inter-regional effective connectivity in the hippocampus of a group of wild-type mice and another group of mutant mice. Although the current computational model of the hippocampus contains relatively small numbers of neural populations and model parameters (compared to microscopic models), this approach is advantageous in exploring group-specific neural mechanisms using observed data. The neural state model for the hippocampus we introduce is based on a well-known convolution-based model, which has widely been used for modeling effective connectivity among cortical columns (Bastos et al., 2012;Jansen and Rit, 1995;Moran et al., 2013). In contrast to computational modelling of cortical columns, most previous computational models of the hippocampus have employed a single principal neuron (excitatory neuron) and multiple inhibitory neurons for each column and focused on the functional role of inhibitory neurons (Bezaire et al., 2016;Kullmann, 2011;Sanjay et al., 2015;Wendling et al., 2002). In line with those studies, we include one excitatory (or two for the hilus) and two inhibitory neural populations to construct a region in the hippocampus.
For the hippocampal inter-regional circuit, we included three neural populations of those dendrite that are located in the stratum oriens (and not in the stratum radiatum) as latent neural populations (I22 and I32 in Fig. 2), although the dendrite locations of these hidden neural populations are not in the region of interests for VSDI measurement. This is because the functional role of various interneurons, including basket and O-LM neurons, has been widely investigated (Bezaire et al., 2016;Sanjay et al., 2015). To include these unobserved neural populations in the model, we take advantage of the Bayesian scheme of DCM to approximate latent variables under constraints of the model topology.
It should be noted that the time constant of the synaptic kernel, particularly for inhibitory neural populations, is one of the key parameters in the epileptic neural circuit. In the current model, the two inhibitory neural populations at the hilus and CA3 have different levels of time constants, i.e., faster (~10 ms, I11 and I21) and slower ones (~30 ms, I12 and I22), except at CA1 that has two different levels of membrane potentials and action potential thresholds (Table 1). In the model, the slower inhibitory neurons (I12 and I22) were configured to project to faster inhibitory neurons (I11 and I21) (Fig. 2). To focus on connectivity alterations reflected in the accumulative waveforms and to reduce the number of parameters, we did not estimate the time constants that are relatively well documented in the public database (Table 1). However, the systematic evaluation of interplay between fast and slow inhibitory neural populations should be further researched particularly in the mutant for epileptic mice, exampled in Wendling et al. (2002).
4.5. Altered connectivity in the knock-out mouse (Arx À/þ ;Dlx5/6 CIG ) ARX, which is located on the X-chromosome, is an important transcription factor for the development of the forebrain, pancrease, and testes development (Gecz et al., 2006;Olivetti and Noebels, 2012). In the brain, it is expressed in the subpallial proliferative zones and in developing and adult interneurons. Many ARX mutations have been reported in brain diseases such as X-linked infantile spasms syndrome (ISSX) (Gecz et al., 2006;Olivetti and Noebels, 2012). Marsh et al. (2009) developed a conditional Arx mutant mouse, where Arx is knocked-out in cells expressing Dlx5/6: Arx þ/-;Dlx5/6 CIG (female) and Arx -/y ; Dlx5/6 CIG (male). In both mouse models, developmental epilepsy was observed, resembling that of patients with Arx mutations. An immunohistochemistry study (Marsh et al., 2016) showed that young (postnatal day 14) Arx þ/-;Dlx5/6 CIG mice have different Arx expression in the DG (reduced Arx expression) and the CA1 region (increased Arx expression) of the hippocampus. Additionally, adult mutant mice showed reduced Arx expression in the CA3 region (not statistically signified). The study also revealed significantly increased neuropeptide Y expression and significantly decreased parvalbumin expression in adult mutant mice (Marsh et al., 2016).
In this study, we explored how regional alterations in the gene expression of the mutant mice affect altered interactions among neural populations, reflected in the altered response waveforms. Group comparison showed significant alterations in the ten connections including inter-regional interactions between the hilus and CA3, and between the CA3 and CA1 regions. Intra-regional connectivity alteration mainly occurred within the CA1 and hilus where significantly reduced Arx expression was found (Marsh et al., 2016). Although the Arx gene mutation is expected to change local neural compositions in the early stage of development, the current result implies that it may eventually affect inter-regional connectivity during the brain development.
In previous studies with rodents, abnormal inhibitory connections in the CA1 region, which is the initial region of the TA pathway, have been suggested to be capable of inducing seizures. For example, Ang et al. (2006) suggested that less inhibitions of the CA1 region in epileptic rats could induce epileptic signals in temporal lobe epilepsy. In their experiments, responses of the CA1 region evoked by stimulating the TA pathway were reduced in epileptic rats that were treated with scopolamine methyl nitrate, a temporal lobe epilepsy model. Johnson et al. (2014) also reported that mice with lateral fluid percussion injury, which also increases the risk of seizures, show hyperpolarization in the stratum oriens of the CA1 region when evoked by stimulation. By changing the positions of stimuli, they investigated the effects of neurons and suggested that induced traumatic brain injury reduced CA1 output through abnormal synaptic connections that projected to the cannabinoid-sensitive interneurons. Although the origins of abnormal systems in previous studies may differ from those in the current study, these results are consistent with our results with regard to the alteration with inhibitory connectivity in the CA1 region. The inhibitory neural population I32 in the CA1 region of the mutant mice in the present study receives increased effective connectivity from the intra-regional excitatory neural population E31 and inter-regional effective connectivity neural population E21.
The alteration in the effective connectivity of the mutant mice was expressed in a cascaded manner; e.g., the bidirectional interaction of E31 and I32 in the CA1 was modulated by E21 in the CA3, which was again affected by the inhibitory neural population I12 in the hilus. The details for these cascaded alterations and the causes for such alterations (either from primary alterations due to gene modification or secondary compensatory alterations) in the inter-regional connectivity, in association with intra-regional connectivity, should be further explored.

Statistical differences and their causality
PEB could identify group differences in effective connectivity. However, we showed that the statistical differences do not necessarily indicate a causal role of the significant connection in inducing abnormal responses in the mutant. To evaluate the causality of each connection, we took advantage of computational modeling in generating signals by artificially perturbing specific connections. From the perturbation analysis, we found three characteristics in the connectivity for mutant mice. First, the wild-type signal at the hilus was not altered by changing any possible combinations of the ten connections. This indicates that the signals of the hilus in the mutant mice may possibly be induced by a more complex circuit including additional connections that were not detected at the group comparison analysis. Second, perturbation analysis showed that only a single connection (from I12 to E21) was sufficient to induce mutant-like behavior in the CA3 region. Third, in contrast to the CA3 region, the altered response in the CA1 region may be attributable to the multiple combinations of the connections as the mutant-like behaviors was found according to the increased number of altered connectivity.
These results suggest the importance of perturbation analysis in identifying causal relations of each connectivity or combinations of multiple connections in inducing mutant-like behaviors.
In summary, we proposed a Bayesian hyperparameter optimization scheme for determining hyperparameters used in DCM, and found increased performance in determining a mesoscopic computational model for the hippocampus by using our optimization scheme. We also proposed a VSDI-DCM scheme to investigate hippocampal activities at the neural population and analyzed the VSDI data of rodent hippocampal brain slices of wild-type and epileptic mutant mice. To achieve construction of a biologically feasible model, we extended the conventional DCM model by introducing data-based neural populations and hidden brain region in the circuits. As a result, VSDI-DCM successfully fits VSDI signals in both wild-type mice and epileptic Arx conditional knock-out mutant mice. Group comparison suggested altered effective connectivity mainly with inhibitory neural populations in the mutant mice. The perturbation confirmed the causal relations of the connectivity in inducing mutant-like neural responses. Although the current method was proposed to explain a special case of Arx -/þ ;Dlx5/6 CIG knock-out mice, this model construction and optimization approach could be extended to diverse mesoscopic applications that require biologically sophisticated computational modeling using DCM.