Fully Bayesian estimation of virtual brain parameters with self-tuning Hamiltonian Monte Carlo

Virtual brain models are data-driven patient-specific brain models integrating individual brain imaging data with neural mass modeling in a single computational framework, capable of autonomously generating brain activity and its associated brain imaging signals. Along the example of epilepsy, we develop an efficient and accurate Bayesian methodology estimating the parameters linked to the extent of the epileptogenic zone. State-of-the-art advances in Bayesian inference using Hamiltonian Monte Carlo (HMC) algorithms have remained elusive for large-scale differential-equations based models due to their slow convergence. We propose appropriate priors and a novel reparameterization to facilitate efficient exploration of the posterior distribution in terms of computational time and convergence diagnostics. The methodology is illustrated for in-silico dataset and then, applied to infer the personalized model parameters based on the empirical stereotactic electroencephalography recordings of retrospective patients. This improved methodology may pave the way to render HMC methods sufficiently easy and efficient to use, thus applicable in personalized medicine.


Introduction
The activity in the brain for cognition, perception and consciousness are governed by the fundamental mechanism of neural network oscillations. Consequently, perturbations of network activity play an important role in the pathophysiology of brain disorders such as epilepsy [1][2][3]. When individual structural information from non-invasive brain imaging is merged with mathematical modelling, then generative brain network models constitute personalized in-silico platforms for the exploration of causal mechanisms of brain dysfunction and clinical hypothesis testing [4]. Such data-driven computational whole-brain models are referred to as virtual brain models in which a patient's structural brain imaging data derived from non-invasive magnetic resonance imaging (MRI) constrain the latent space trajectories of the brain network model. Simple activation paradigms lack the functional complexity to explain the richness of observed spatiotemporal behaviours linked to these brain dynamics [5], leaving it essentially to non-linear network processes to explain the origin of the emergent functional and pathological spatiotemporal patterns. The virtual brain models emphasize the network character of the brain and bring dynamical properties to the structural data of individual brains [6]. In the virtual brain models, the dynamics of a network node is given by a neural mass model of lumped neuronal activity and is connected to other nodes via a connectivity matrix representing white matter fibre tracts derived from diffusion tractography of the brain [1,7,8]. This form of connectome-based virtual brain modelling [9,10] exploits the explanatory power of measured network connectivity imposed as a constraint upon network dynamics and has provided important insights into the mechanisms underlying the emergence of the resting-state networks dynamics [7,11,12] of healthy subjects, stroke [13], schizophrenic patients [14] and epilepsy [15]. So far, neural mass models have proven successful in explaining the biophysical and dynamical nature of seizure onsets and offsets [16]. far superior to gradient-free sampling algorithms in terms of the number of independent samples produced per unit computational time [42,43]. This class of sampling algorithms provides efficient convergence and exploration of parameter space even in very high-dimensional spaces that may exhibit strong correlations between samples [36,37,39]. Nevertheless, the efficiency of gradient-based sampling methods such as HMC is highly sensitive to the user-specified algorithm parameters [36,37]. More advanced MCMC sampling algorithms such as No-U-Turn Sampler (NUTS; [37]), a self-tuning variant of HMC [44] solve these issues by adaptively tuning the algorithm parameters. It has been shown that these algorithms efficiently sample from high-dimensional target distributions that allow us to solve complex inverse problems conditioned on massive data sets as the observation [45][46][47]. MCMC has the advantage of being non-parametric and asymptotically exact in the limit of long/infinite runs [24]. Of the other alternatives, Variational Inference (VI; [48,49]) turns the Bayesian inference into an optimization problem, which typically results in much faster computation than MCMC methods [24,47]. However, the classical derivation of VI requires a major model-specific work on defining a variational family appropriate to the probabilistic model, computing the corresponding objective function, computing gradients, and running a gradient-based optimization algorithm [50]. Probabilistic programming languages (PPLs; [44,51]) provide efficient implementation for automatic Bayesian inference on user-defined probabilistic models by featuring the next generation of MCMC sampling such as NUTS [44,52]. In particular, Stan [53] is a high-level statistical modeling tool for Bayesian inference and probabilistic machine learning, which provide the advanced inference algorithms, enriched with extensive and reliable diagnostics. Although PPLs allow for automatic inference, the performance of these algorithms can be sensitive to the form of parameterization [34,39,54]. An appropriate form of reparameterization in the probabilistic models to improve the inference efficiency of system dynamics (governed by a set of nonlinear differential equations) remains a challenging problem. We propose an approach to reparameterize the model configuration space based on the correlations between the parameters induced by the measurement function maps (gain matrix) so that the posterior distributions of the model parameters are explored efficiently. The efficiency is shown in terms of inference diagnostics and computational time. Moreover, we propose sparse priors on the model parameters to consider the small number of EZ's in the model and we also consider priors based on the behaviour of the model equations. We focussed on the self-tuning variant of HMC called NUTS algorithm in this paper and the convergence diagnostics are monitored carefully to ensure the reliability and accuracy of the estimations.
In section 2, we illustrate the probabilistic model and discuss the proposed prior as well as the reparameterization based approach. We show the results obtained from the NUTS on synthetic datasets in section 3. The analyses for 19 patients and the comparison of the results with the clinical diagnostics are shown in the same section. We discuss the utility and extension of the work in section 4.

Materials and methods
The body of the work is based on the virtual brain models and Bayesian inference as schematically illustrated in figure 1. The workflow to build the Bayesian VEP (BVEP) consists of two main steps: constructing the VEP, a personalized virtual brain model of epilepsy spread [1], and then embedding the VEP model in a Bayesian framework to infer the model parameters with NUTS algorithm. We show that the proposed probabilistic framework in BVEP is able to efficiently invert the nonlinear state-space equations to infer the hidden system dynamics. This approach allows us to accurately estimate the spatial map of epileptogenicity in a personalized virtual brain model of epilepsy spread by taking advantage of flexible probabilistic inference in PPLs such as Stan [39]. The brief description of the empirical data collection process and the pre-processing steps are provided in supplementary sections 1.1 and 1.2, respectively. The detailed description of the whole-brain model is provided in supplementary section 1.3 and the description of the gain matrix is provided in supplementary section 1.4.

Whole-brain model of epilepsy spread
Typically, to build a virtual brain model, the brain regions are defined using a parcellation scheme and then placing a set of mathematical equations (here Epileptor model, [16]) to capture the dynamics of the regional brain activity [1,6]. Taking such a data-driven approach to incorporate the subject-specific brain's anatomical information, the network edges are then represented by structural connectivity of the brain, which are obtained from non-invasive imaging data of individual patients [1,8]. The brain regions are defined according to the VEP atlas [55], a cortical and sub-cortical parcellation of the brain developed specifically for the use in domains of epileptology and functional neurosurgery. The atlas considers reasonable region sizes for improved performance of model inversion techniques [55]. The VEP atlas indicating the indices and the corresponding brain regions is shown in tables 1 and 2 of the supplementary. The structural connectivity matrix based on VEP atlas for a randomly selected patient is shown in Figure 1. Schematic illustration of workflow in the BVEP. The approach to build the BVEP comprises two main steps: constructing the VEP model, and then embedding VEP in a PPL tool to infer and validate the model parameters. To build the VEP model, we take the following steps: First, the patient undergoes non-invasive brain imaging (MRI, diffusion tractography, computed tomography (CT)). Based on these images, the brain network anatomy including brain parcellation and the patient's connectome are provided from the reconstruction pipeline. Then, a neural population model is selected for each brain region to define the network model. In VEP, the Epileptor model is defined on each network node that are connected through structural connectivity derived from diffusion tractography. The mapping from source to sensor space is encoded in the gain matrix, where the distance between the sensors and the brain regions are obtained using MRI and tomography. The SEEG recordings obtained from the patient are pre-processed to obtain the data feature, which is the envelope time-series, as the target of fitting. Then, model fitting is performed using NUTS algorithm within a PPL tool. Finally, the posterior distribution of excitabilities are obtained for all the brain regions which help in the identification of EZ/PZ which could be the target of resection.

Figure 2. (A)
Structural connectivity matrix of a patient, whose entries represent the connection strength between the brain regions, is derived from diffusion MRI tractography. Using VEP atlas, the brain of the patient is parcellated into 162 different regions and (B) gain matrix of a patient. Each element represents the inverse-squared distance between the region and the sensor. figure 2(A). In VEP model, the dynamics of brain network nodes are governed by Epileptor equations [16] that are coupled through the structural connectivity matrix. The Epileptor is a taxonomy of dynamical models of seizure evolution and is able to realistically reproduce the dynamics of onset, progression and offset of seizure-like events [16,18]. The 2D reduction of VEP model is an adiabatic expression of the time-scale separation which is obtained by averaging some of the variables in the detailed VEP model. The model equation for the 2D VEP are given as: Depending on the value of excitability parameter η, the 2D Epileptor exhibits different stability regimes. The details regarding linear stability analysis and parameter space exploration of 2D Epileptor are provided in [1,18]. For η < η C , a trajectory in the phase plane is attracted to the single stable fixed-point of the system on the left branch of the cubic x-nullcline. In this regime, the Epileptor is said to be healthy, meaning not triggering epileptic seizure without external input. As the value of η increases, the z-nullcline moves down and a bifurcation occurs at η = η C corresponding to a seizure onset. For η > η C , the system exhibits an unstable fixed-point allowing a seizure to happen (the Epileptor is said to be epileptogenic and corresponding brain region is referred to as EZ). Isolated nodes display a bifurcation at the critical value η C = −2.05 [1,18]. In this study, we use the 2D VEP model for Bayesian inference of spatial map of epileptogenicity to reduce the computational cost associated with the model parameter estimation. The 2D reduction allows for faster inversion while enabling us to predict the envelope of fast discharges during the ictal seizure states [1,18].
Based on the above dynamical properties, the spatial map of epileptogenicity across different brain regions comprises the excitability values of EZ (high value of excitability), PZ (smaller excitability value) and all other regions categorized as healthy zone (HZ). Note however, that an intermediate excitability value does not guarantee that the seizure recruits this area as part of the PZ, because the propagation is also determined by various other factors including connectivity and non-linear brain state dependence.

Probabilistic whole-brain model
The key component in constructing a probabilistic virtual brain model within a Bayesian framework is the generative model. Given a set of observations, the generative model is a probabilistic description of the mechanisms by which observed data are generated through some hidden states and unknown parameters [56,57]. In this study, the generative model therefore has a mathematical formulation guided by the dynamical model that describes the evolution of model's state variables, given parameters, over time. This specification is necessary to construct the likelihood function [58,59]. The full generative model is then completed by specifying prior beliefs about the possible values of the unknown parameters which are the spatial map of epileptogenicity [29] and the other auxiliary parameters of the generative model. The BVEP brain model presented in this study is built upon two main steps. First, the VEP model equation that provides the basic form of the data generative process describing how the epileptic seizures are generated. Second, the linear combination of the source signals which are observed with independent measurement errors at the sensor level. The generative model in the BVEP is formulated based on a system of nonlinear ordinary differential equations of the form (so-called state-space representation): where, N denotes the number of nodes, is a 2N-dimensional vector of system's states evolving overtime, x(t = 0) = x t0 is the initial state vector at time t = 0. Here, the set of unknown parameters θ ∈ R p , where p = 3N + 2 (N parameters corresponding to excitability, 2N parameters corresponding to x t0 , the coupling parameter K and the slow timescale parameter τ 0 ), contains all the unknown parameters of the VEP model to be estimated. In addition, y(t) ∈ R M denotes the measured data subject to the measurement error v(t), where M denotes the number of electrodes. The measurement noise denoted by v(t) ∼ N(0, σ 2 I) is assumed to follow a Gaussian distribution with mean zero and unknown variance σ 2 . Moreover, f(.) is a vector function that describes the dynamical properties of the system and h(.) represents a measurement function which is represented as h(x) = a Gx + b, where G represents the gain matrix of dimension M × N, a ∈ R + and b ∈ R M represent scale and offset parameters, respectively. In this paper, we shall be considering the inference of the model given by the equations combining equations (1) and (2) and given as the following: where x 1,t ∈ R N represents the vector denoting the activity of the fast-variable x 1 at time t. Therefore, the total number of parameters to be estimated is 3N + M + 4. In Bayesian inference, we seek the posterior density P(θ|Y), which is the conditional distribution of model parameters given the observations [27,30]. Bayes's Theorem expresses this posterior density in terms of likelihood and prior as follows: where the denominator P(Y) represents the probability of the data and it is known as evidence or marginal likelihood (in practice amounts to simply a normalization term [24]). We employ the NUTS, a self-tuning variant of HMC algorithm in this paper to sample from posterior density P(θ|Y) of the model parameters.
The performance of HMC is highly sensitive to the step size and the number of steps in leapfrog integrator for updating the position and momentum variables in Hamiltonian dynamic simulation [37]. If the number of steps in the leapfrog integrator is chosen too small, then HMC exhibits an undesirable random walk behaviour similar to Metropolis-Hastings algorithm, and thus algorithm poorly explores the parameter space. If the number of leapfrog steps is chosen too large, the associated Hamiltonian trajectories may loop back to a neighbourhood of the initial state, and the algorithm wastes computation efforts [37,39]. NUTS extends HMC with adaptive tuning of both the step size and the number of steps in leapfrog integration to sample efficiently from posterior distributions [37,39,54].

Mathematical formulation of probabilistic model
We propose a Bayesian estimation of the model parameters involved in the model given by the VEP equations in its 2D formulation using the NUTS algorithm. Let us denote the observed data by . . , T}, and the parameter vector by Θ. We denote the matrix of eigenvectors of G ⊤ G as V. Then, we consider the following independent priors on the regional parameters in the reparameterized space: and the following independent prior densities for the other auxiliary parameters in the model: We consider the non-centered parameterization for all the parameters for efficient sampling from the respective posterior distributions [60].
To incorporate the information that only few regions are epileptogenic, the mean and standard deviation of the prior distribution on η is considered in such a way that apriori all the brain regions are healthy. This choice of prior for the case of partial epilepsy is practically meaningful because with such a prior knowledge, a region is considered to be healthy unless there is a sufficient evidence based on the data contradicting this assumption and hence, it reduces the false positives. Also, the following relation in the prior means of the regional variables should be considered so that the HMC exploration happens near the stable fixed-points of the 2D Epileptor equations and the HMC sampler start sampling in a physically meaningful space of the high-dimensional parameter space. We consider (Vµ z * , Vµ x * ) to be the fixed-points for a healthy region with parameter Vµ η * . In the analyses shown in the paper, we start the likelihood calculation after ten time points which reduces the dependence between the regional variables x init , z init and η and shows better mixing of the HMC chain. The fixing of x init has also been considered in some analyses to reduce computational time as we need to estimate N fewer parameters in that case. This is accurate as there is only one stable fixed-point in the 2D epileptor phase-space for the healthy regions and hence, the stable fixed-point is reached quickly if the initialization is close to the fixed-point in the phase-plane. In the simulation studies, we found that the fixing of x init does not introduce large bias and EZ, PZ and HZ are identified well. Also, in the data analysis, we take the number of timepoints in ictal seizure state in such a way that the prior density placed on τ 0 , the time-scale of the system, is appropriate. For example, we consider the mean of the prior close to 10 with low standard deviation when approximately 150 timepoints are considered and mean of the prior close to 75 with low standard deviation when approximately 500 timepoints are considered. We perform prior predictive checks to fix the mean and standard deviation of τ 0 . The posterior distribution of Θ can be written as: where π(Θ) = π(x init )π(z init )π(a)π(b)π(σ)π(K)π(τ 0 ) denotes the prior density of the model parameters Θ.

Validation on synthetic SEEG data
In order to study the performance of the proposed reparameterization approach using The Virtual Brain (TVB; [61]), we generate a dataset at the source level from the VEP model (see equation (1)) by considering the structural connectivity and the gain matrix of a real patient. While generating the data, we consider the Euler-Maryama integration with dt = 0.1 and τ 0 = 10.0, a = 1, K = 1. The offset parameters are generated from multivariate normal distribution with mean 10 and covariance matrix as I M . The structural connectivity considered during the data generation corresponds to the anatomical information of a real patient with partial epilepsy and is shown in figure 2(A). Then, we consider the gain matrix of the same patient having dimension 161 × 162, and multiply it with this generated source level data (see equation (3) and supplementary equation (1.3)). The sensor level data is generated by adding an independent Gaussian noise with zero mean and standard deviation σ = 3.0 at every time point and for every sensor. We consider two of the regions to be EZ's and four regions to be PZ's while rest of the regions are considered to be HZ's. The data considered for the estimation contains a seizure envelope and the number of timepoints considered are 130 for each sensor. The excitability parameters for the EZ, PZ and HZ are considered to be −1.6, −2.4 and −3.5 respectively. The prior mean and standard deviation of the excitability parameters for all the regions are considered to be −3.5 and 0.1, respectively. The parameters of the prior distribution considered for this analysis and as described by equations (4) and (5)  There is no misclassification of the HZ as EZ or PZ and vice versa. We also show the violin plots for η * in figure 4(B). It can be seen that the reparameterization makes the HMC to explore the first few linear combinations while the rest of the weakly estimable combinations are very close to their prior distributions. This is due to the low-rank of the gain matrix and practically, this is because many of the regions are far from the sensors and are not anatomically connected to any region which is close to the sensors. We also show the posterior samples obtained from an MCMC chain for two regions: an EZ and a PZ in figures 5(A) and (B). In terms of diagnostics, there are no divergences in the generated samples. TheR values (the well-established diagnostics for convergence on Monte Carlo sampling) for the posterior samples corresponding to all the parameters are less than 1.1 indicating reliability of the estimates based on the convergence for the posterior distributions of the parameters and hidden-state variables. The run-time of the algorithm with 1000 warmup and 250 sampling iterations using a computer equipped with 8 GB RAM and 3.60 GHz Intel Core(TM) i7-7700 processor is approximately 12 h. Therefore, by placing appropriate priors combined with the gain matrix based reparameterization, the HMC algorithm can efficiently and accurately discriminate between the epileptogenic and healthy regions. In the right column of figure 5, the posterior samples obtained without using the reparameterization are shown. It can be seen from the posterior samples, that even with large warmup iterations, the mixing between the generated samples is very low as can be seen for an EZ, and for a PZ, many generated samples get stuck at the same value. This results in large autocorrelation between the samples rendering the effective sample size per iteration very low (e.g. it is less than 25 for more than 80% of the excitability parameters). The plots for log-probabilities are also shown in the figures 5(C) and (F), and again it can be seen that the reparameterization based approach improves the mixing in the log-probabilities and shows faster convergence to the typical set.
It can be concluded that the reparameterization based approach solves the issues considered without reparameterization in terms of computational time as well as the diagnostics. Therefore, the proposed novel approach based on reparameterization should be considered for efficient estimation of the parameters of the  VEP model. The consideration of priors also play a critical role in removing some inherent degeneracy in the model and makes the parameters identifiable. Some examples considering these issues are shown in the next subsections.

Robustness with respect to Laplace prior
In figure 6, we show the violin plots of the posterior distribution of η when we consider the prior distribution of η as Laplace with mean −3.5 and scale parameter 0.1. The data considered for this analysis is the same as in section 3.1 and the prior distribution for other parameters is the same as in section 3.1. It can be seen that

Model identifiability using prior
We consider the effect on the posterior distribution of η due to the change in its prior mean. The model is non-identifiable for η for the regions which are HZ and the offset parameter b. This is due to the constant

Bayesian inference of empirical SEEG data
While modelling the empirical SEEG data using the VEP model (equations (1), (3) and supplementary equation (1.3)), it should be noted that the dynamics explained by the model captures the envelope of the hidden-state variables and hence, the envelope of SEEG observations. Therefore, in the pre-processing step, we consider filtering of the data before analysing to remove very high frequency and very low frequency signals. Then, we extract the envelop in the SEEG time-series and consider the logarithm of the time-series of the envelope as the target of estimation. We consider the connectivity and gain matrices for the patients as discussed in supplementary sections 1.1 and 1.4. We illustrate the reparameterization method as applied to a real SEEG dataset obtained from a partial epilepsy patient. We run four parallel HMC chains to obtain the posterior samples in this case. For this particular analysis, we consider the prior distribution of η as Gaussian with mean −3.0 and sd 0.3. The mean estimated source activities along with the violin plots of the posterior distribution of η for all the regions are shown in figure 8. We next select the probable EZ's in this case by choosing only those regions for which P(η ⩾ −2.05) > 0.25 and show the posterior distributions of excitability parameters using violin plots for these selected regions. For this specific dataset, based on our methodology and the obtained HMC samples, there are six brain nodes which have at least 0.25 probability to be epileptogenic. Two of the identified regions (Right Amygdala and Right Thalamus) have more than 0.95 probability to be epileptogenic. There are three EZ's and eight PZ's identified by the clinicians for this patient.  Gyrus and Right Hippocampus anterior have substantial posterior probabilities of η < −2.05, and hence, not to be EZ. The mean estimated phase-planes corresponding to an identified EZ and an identified HZ are shown in figure 9. The practical application of the methodology creates additional issues. We discuss the issue of multi-modality in the posterior distribution of the parameters in supplementary section 1.9. The consideration of sparsity in the number of EZ's using priors is discussed in supplementary section 1.8.

Group analysis
We consider the SEEG data obtained from nineteen epileptic patients who underwent standard clinical evaluation at La Timone hospital in Marseille for the study. The details of the patients considered for the analysis are provided in supplementary table 3. We compare the results obtained from the proposed methodology with the clinical diagnostics. We consider the same prior density for the excitability parameters for all the regions meaning that there is no assumed difference between the excitabilities corresponding to different regions a priori. Also, the prior distribution for excitabilities in all the analyses are considered such that a priori all the regions are considered to be healthy with at least 0.95 probability. We run four parallel HMC chains for each dataset. The convergence diagnostics of the chains are assessed and no divergences are obtained in the considered chains. The within-chain split-R values [62] are also less than 1.5 for all η in most of the chains which is acceptable for such complicated multimodal posterior distributions [63]. To compare the results, we combine the obtained posterior distribution of the excitabilities and consider a region i to be  epileptogenic for a patient if the probability of its excitability is given by P(η i > −2.05) > 0.25. The cutoff −2.05 corresponds to the critical value of an isolated node given 2D Epileptor dynamics. We show the scatter-plot of the number of identified EZ's by the proposed methodology against the number of identified EZ's based on the clinical diagnostics for each of the analysed patient in figure 10(A). It can be seen that generally the numbers of identified EZ's from our methodology are more than those based on the clinicians' diagnostics. This may be due to the consideration of all the regions in our estimation unlike the selection of the regions among only those few regions which are close to the electrodes by the clinicians. These estimates can help the clinicians in identifying some unexplored regions which could be epileptogenic and where the electrodes could not be implanted. In figure 10(B), we plot the proportion of the number of regions identified as EZ by our inference methodology among different groups of regions classified as EZ, PZ and HZ according to the clinical diagnostics for patients belonging to Engel I (n = 10), Engel II (n = 2), Engel III (n = 4) and Engel IV (n = 3). The Engel scores are based on the outcome of the surgery: (a) Engel I: patients who are seizure free, (b) Engel II: patients who show rare disabling seizures, (c) Engel III: patients with minimal improvement and (d) Engel IV: patients with no improvement. Every index in the plot represents a patient and the plotted points represent the proportions. The round dots, the triangular dots and the '+' dots show the proportion for the clinically classified EZ's, PZ's and HZ's, respectively. It can be seen from the figure that in general, the proportion of regions identified as EZ based on the methodology which are clinically identified as HZ is very low showing that the consideration of high probability of the prior density helps in lowering the false positive rates. It can be observed that the proportion of EZ's identified from our methodology is slightly higher among the clinically identified EZ's for Engel-I patients with respect to other groups of patients but general statistical conclusions should not be drawn because of the low sample size of the number of patients in different groups. From figure 10(C), it can be observed that some of the clinically identified PZ's are identified as EZ's using the methodology. The proposed BVEP workflow with the reparameterization method can help the clinicians to examine these regions more closely before surgery.
It can be seen that obtained estimates match some of the clinically identified EZ's in most of the patients which shows that the personalized BVEP model can be considered to model the epilepsy propagation in the brain. Though, as a precaution, we recommend running multiple parallel chains and including some clinical information in the prior to make better use of the proposed methodology in real life therapeutic applications. The methodology may therefore help in identifying the epileptogenic regions for the patients based on the SEEG data which in turn can be considered to be resected during surgery to stop the recurrence of seizures in the patients.
The results obtained from the group analysis show that the proposed methodology can be a highly valuable alternative to the model-free approach employed by the clinicians to identify the EZ's. In the model-free approach, the EZ's are identified as the brain regions which are close to the electrodes showing large activities during the seizure. The model-based approach used in BVEP is different from the model-free approach considered by the clinicians due to the following reasons: (a) the VEP models the activity of all the regions in the brain and hence, even the subcortical EZ's can be identified in contrast to the model-free approach where only the brain regions which are close to the electrodes can be identified as EZ's. (b) The Bayesian estimation technique is able to quantify the probability of being epileptogenic for each of the brain regions. This probabilistic quantification, which enables one to identify different modes of getting the same fit, is missing from the model free approach. Therefore, the regions identified by the clinicians and VEP estimates could be different and moreover, each of the brain regions is assigned a probability of being epileptogenic in the BVEP model. The BVEP estimation is also highly valuable when the surgery fails and VEP discovers more regions with associated probabilities of being EZ, rather than clinicians making binary estimation.

Discussion
This work is an attempt to merge the theoretical understanding of the whole-brain dynamics with the physically measured data using state-of-the-art advances in PPLs and Bayesian inference algorithms. We have provided an efficient probabilistic methodology to infer the posterior distribution of the model parameters involved in the biophysically realistic VEP model of epilepsy spread in the brain. The estimated time-series provides good fit on the extracted envelope from the raw SEEG data which makes the methodology more valuable. The prior distribution constraints the model parameters to be in a biophysically relevant space and the obtained posterior distribution quantifies the uncertainty in the regional excitability parameters. The proposed reparameterization method enables the accurate estimation of the uncertainty by efficiently generating samples from the complicated posterior distribution of the model parameters. The estimated joint distribution of the excitability parameters provide the probabilities of different plausible sets of regions to be identified as epileptogenic together. The obtained estimates also inform about the nodes whose activities could not be observed due to their low connectivity with the regions close to the electrodes. The posterior distribution of the excitability parameters of these regions are similar to their corresponding prior distributions.
We provide a link between the virtual brain models, personalized treatment and systematic Bayesian inversion in PPLs. The accuracy and the reliability of the estimation is carefully investigated by the HMC diagnostics. The virtual brain models combine the anatomical connectivity with mathematical formulation of brain activity. However, the main challenge lies in inferring the system dynamics explained by the model during the activity. In this study, dynamical systems provided useful tools for the inversion of slow-fast system dynamics having incomplete observations in the phase-plane. For instance, η is the bifurcation parameter in the Epileptor model, i.e. small changes made to its value causes a sudden change in the system behaviour. This allows us to classify all the brain regions in three categories based on the values of η and the structural connectivity. The knowledge that the considered system exhibits only one fixed-point in the phase-plane allows us to avoid the mismatch between the observed time-series and the estimated time-series without losing the biophysical relevance of the model. We incorporate this knowledge systematically in the model inversion by constraining the activity using the prior density over the regional parameters. The priors for initial values (x init , z init ) and spatial parameter mask η are considered in such a way that the activity of the fast variable remains close to its stable fixed-point during the seizure for the healthy regions. The initialisation of the time-series of hidden state-space variables close to the steady-state avoids the initial transition in the time-series. This illustrates the important role which the system dynamics can play in guiding the estimation methodology for the physically relevant inversion of high-dimensional complex models.
The multimodality in the posterior distribution necessitates the application of MCMC methods compared to maximum a-posteriori estimation (MAP) and VI as shown in figure 6(C) of the supplementary. The application of other methodologies for identification of EZ/PZ such as Approximate Bayesian Computation (ABC)-related methods are not feasible for application in virtual brain models with large number of parcellations as these methods suffer from the curse of dimensionality which means that the number of simulated samples needed to provide a good estimate of the posterior can be prohibitively expensive. The reduction of the data to low-dimensional summary statistics and the distance tolerance discard some of the information about the parameters in the data which reduces the quality and accuracy of inference. Moreover, for such non-linear models, it is challenging to extract low-dimensional summary statistics which is sufficient statistic for the unknown parameters. Additionally, the ABC methods are sensitive to a threshold value to accept or reject samples. This makes the application of ABC related methods prohibitive in high-dimensional non-linear models. The NUTS algorithm considers the gradient information which avoids random-walk in high-dimensional space and allows it to converge to high-dimensional target distributions much more quickly. Therefore, the estimation using self-tuning and gradient-based NUTS algorithm should be considered when such non-linear large network models are applied to real datasets. However, as the proposed methodology is personalized and each seizure of every patient is analysed separately, group-level information regarding epilepsy is not incorporated in the method. Therefore, hierarchical Bayesian methods for group analysis should be considered in future studies to combine information obtained from several patients across multiple seizures.

Data availability statement
The data that support the findings of this study are available upon reasonable request from the authors.