A Bayesian framework for the analysis of systems biology models of the brain

Systems biology models are used to understand complex biological and physiological systems. Interpretation of these models is an important part of developing this understanding. These models are often fit to experimental data in order to understand how the system has produced various phenomena or behaviour that are seen in the data. In this paper, we have outlined a framework that can be used to perform Bayesian analysis of complex systems biology models. In particular, we have focussed on analysing a systems biology of the brain using both simulated and measured data. By using a combination of sensitivity analysis and approximate Bayesian computation, we have shown that it is possible to obtain distributions of parameters that can better guard against misinterpretation of results, as compared to a maximum likelihood estimate based approach. This is done through analysis of simulated and experimental data. NIRS measurements were simulated using the same simulated systemic input data for the model in a ‘healthy’ and ‘impaired’ state. By analysing both of these datasets, we show that different parameter spaces can be distinguished and compared between different physiological states or conditions. Finally, we analyse experimental data using the new Bayesian framework and the previous maximum likelihood estimate approach, showing that the Bayesian approach provides a more complete understanding of the parameter space.

blood pressure, arterial oxygen saturation and/or partial pressure of CO 2 . 45 The models are driven with input signals, such as the blood pressure and/or oxygen 46 saturation, and simulate brain tissue measurments of oxygenation, blood volume and  Simplified structure of a typical BrainSignals model A typical BrainSignals model can be split into four compartments or submodels. The blood flow submodel represents blood flow from arteries to veins via the capillary bed and the oxygen transport submodel estimates diffusion of dissolved O 2 from the capillary blood to the brain tissue. Delivered oxygen is then utilised by the metabolism submodel. Finally, the measurement submodel translates the internal states of the blood flow and metabolism submodels into observable outputs. Model inputs are shown in red and consist of arterial blood pressure (ABP), arterial oxygen saturation (SaO2), partial pressure of CO 2 (PaCO2) and a parameter specifying relative demand, whilst measurable outputs are shown in blue, including NIRS signals as well as middle cerebral artery velocity (Vmca) and cerebral metabolic rate of oxygen (CMRO 2 ). 51 the models is to fit the model simulations to clinical and experimental data and 52 investigate how model parameters are affected. In the case where data is collected from 53 an injured or sick patient, these changes may illuminate what the underlying 54 causes/mechanisms are behind the illness or injury is. For example, Caldwell et al. used 55 BrainPiglet v2.0 to investigate why two piglets showed different recoveries following 56 hypoxia-ischaemia, finding that the differences could be explained by including cell 57 death within the model [6]. 58 The models are currently fit using a maximum likelihood based method, with a 59 single value obtained for each parameter. Sensitivity analysis performed on the models 60 to determine which parameters are most important in influencing each model output for 61 any particular dataset. These parameters are then optimised using the PSwarm 62 method [9] to minimise a given error metric, such as the Euclidean distance, between 63 the modelled and measured signals. Through this each output has a set of optimised 64 parameter values. Parameter values were limited to the same ranges used in the 65 sensitivity analysis [6]. 66 This approach has a number of drawbacks. The models are mechanistic and, if fitted 67 to single value parameter estimates, will produce the same output for the same input. 68 Physiology and biology, however, is unlikely to operate in such a constrained manner. In 69 effect, the maximum likelihood approach, which produce single value estimates of 70 parameters, can lead to overfitting of the model to the data. Additionally, this set of 71 best-fit parameters for the model may not be representative of the full parameter 72 space [10]. In an attempt to try and compensate for this potential drawback, Caldwell 73 et al. [6] fit the BrainPiglet model multiple times for two different piglets and found 74 that, whilst parameter values can vary within the same data, separate parameter spaces 75 for each piglet did seem to exist based on the brain physiological status of the piglet 76 following a hypoxic-ischaemic insult. 77 One of the key ways in which these models are used to extract information from 78 data is through the use of parameter estimation and fitting. However, this step remains 79 a difficult mathematical and computational problem, potentially originating in the lack 80 of identifiability [11]. In addition, there has been discussion of 'universal sloppiness' 81 within dynamic systems biology models. Gutenkunst et al. [12] proposed that 82 sloppiness, where the parameters of a dynamic model can vary by orders of magnitude 83 without affecting model output, is a universal property of systems biology models. Due 84 to this sloppiness, it may not be possible to make parameter estimations that can be 85 used to make inferences about the system [12,13]. Chis et al. have stated however that 86 sloppiness is not equivalent to a lack of identifiability and that a sloppy model can still 87 be identifiable [14]. Apgar et al. note that experimental design can be used to constrain 88 a sloppy parameter space by choosing a set of complementary experiments [15].

89
The use of a Bayesian methodology, by avoiding point estimates, can allow the full 90 uncertainty of the problem to be captured [10]. In fact, the use of an Approximate 91 Bayesian Computation (ABC) approach, discussed below, is particularly well suited to 92 these kinds of problems [16]. There are many examples of Bayesian methods being used 93 to analyse bioinformatics data and systems biology models [17], including in sequence 94 analysis [18], gene microarray data [19] and in models of genetic oscillators [20] and 95 DNA network dynamics [21]. There are a number of models that take a systems biology 96 approach towards understanding physiology, particularly oxygen transport and blood 97 flow, including the previously mentioned BrainSignals [2,4,6] and BrainPiglet [5,6] 98 models, the Aubert-Costalat model [22], and work by Fantini [23][24][25][26] and Orlowski and 99 Payne [27,28] where Bayesian parameter estimation could also be applied but has yet to 100 be.

101
Bayesian inference utilises Bayes' rule, where p(y) = θ p(θ)p(y|θ) dθ is the marginal probability of y and p(y|θ) is the likelihood. 102 Typically, p(y) is not known and the likelihood will not be known explicitly or may require marginalising over some values of θ. This often leaves the solution analytically 104 intractable. Instead we can try solve for p(θ|y) using a Monte Carlo or Markov Chain

106
Where a likelihood function can be defined there are a number of these methods that 107 can be used to infer a posterior distribution, p(θ|y). The simplest is the Gibbs 108 Sampler [29], which in its most basic form is a special case of the Metropolis-Hastings 109 algorithm [30]. If a likelihood expression is unobtainable, as is the case with the 110 BrainSignals models, a likelihood-free approach using ABC is required instead. There 111 are a number of different methods available with the simplest being the ABC rejection 112 algorithm (ABC REJ) approach.

113
The aim of this paper is to introduce the new BayesCMD modelling platform that 114 can be used in systems biology models of physiology such as the BrainSignals models, 115 but that can be replicated beyond these. For this work, we have chosen to use ABC 116 REJ as whilst it is less efficient than the other methods mentioned here, the simplicity 117 with which it can be implemented is a significant factor. The models and modelling 118 environment used are already complex and so this initial work focuses on the use of the 119 simplest method as proof of utility. We will demonstrate the effectiveness of this 120 approach by using it to analyse two simulated datasets chosen to represent healthy and 121 impaired brain states, before then using it on experimental data from a healthy subject 122 undergoing a hypoxia challenge. We will show that the Bayesian approach allows us to 123 extract more information from our data than the previous maximum likelihood 124 approach, with a more complete picture of the parameter space being obtained.

131
In this work we have chosen to use the refactored BrainSignals model [7], with a minor 132 modification to include the haemoglobin difference (∆HbO 2 − ∆HHb = ∆HbD) as a 133 model output alongside the normal outputs of oxyhaemoglobin (∆HbO 2 ), 134 deoxyhaemoglobin (∆HHb), total haemoglobin (∆HbO 2 + ∆HHb = ∆HbT), tissue 135 oxygenation index (TOI), and cytochrome-c-oxidase (∆CCO). Both ∆HbD and ∆HbT 136 are included in the experimental dataset due to them being good indicators of brain 137 oxygenation changes and brain blood volume changes respectively, with both being 138 easily measured using broadband NIRS. All NIRS outputs, except TOI, are measured as 139 changes relative to an initial value and therefore both data and model outputs are Three datasets were used to test the new Bayesian model analysis process. Firstly, 143 'healthy' data was simulated using the BrainSignals model with the default parameter 144 settings, as per [2,4]. Next, the same inputs were used but with the model modified to 145 represent an 'impaired' brain. To do this, a single parameter was changed to reflect a 146 potential pathology or injury, to generate an 'impaired' simulated dataset. Finally, we 147 used experimental data from a healthy adult undergoing a hypoxia challenge.

149
Partial pressure of CO 2 (PaCO 2 ) and arterial blood pressure (ABP) were kept at their 150 baseline values of 40 mmHg and 100 mmHg respectively, whilst arterial oxygen 151 saturation (SaO 2 ) was varied to simulate hypoxia through a decrease in arterial oxygen 152 saturation from 97% to 65%. Initially, all model parameters were kept at their default 153 values in order to simulate a healthy brain's response to this challenge. Figure 3 shows 154 the arterial saturation data and the model response across all considered model outputs. 155 After simulating the healthy brain response and determining its posterior parameter 156 distribution, the model was altered to include a pathological or impaired brain state.

157
This was done by changing a single parameter to be outside of the healthy parameter 158 space. r t, which affects the shape of the muscular tension relationship, was found to 159 be sensitive in both the sensitivity analysis process (see Section and the Bayesian 160 analysis. This is clearly seen in its comparatively narrow marginal posterior for the 161 healthy data. Stiffening of blood vessels in the brain has also been noted as a 162 potentially important factor in a number of different pathologies, including 163 Alzheimers [31], and in autoregulation, as seen in 4 .

164
The muscular tension relationship is defined as where T m is the muscular tension within the vessel wall and has a bell-shaped 166 dependence on the vessel radius, taking value T max at some optimum radius r m . r t and 167 n m are parameters determining the shape of the curve. Figure 4a illustrates the effect 168 of changing r t on the shape of the curve and shows that decreasing r t leads to increased 169 muscular tension for the same vessel radius due to a widening of the bell-shaped curve. 170 This can be seen to represent a stiffening of vessels within the brain.

171
Changing r t has a significant effect on the brain's ability to autoregulate within the 172 model as seen in figures 4b, 4c and 4d. Figure 4b shows that higher blood pressure 173 causes a decrease in cerebral blood flow (CBF) for lower r t values, as opposed to an 174 increase at the normal value of r t = 0.018 cm. Figure 4c shows that CBF is lower and The input variable of arterial oxygen saturation is shown in blue and is the same for both simulations, whilst the outputs of TOI, ∆HbO 2 , ∆HHb and ∆CCO clearly differ between the two brain states. decreases quicker for lower r t values as PaCO 2 is decreased and figure 4d shows that 176 across all considered oxygen saturations, lower r t gives a lower CBF.

177
Figures 3f)-j) shows the model response across all considered model outputs for this 178 impaired brain state. The response of the model outputs to the same change in arterial 179 saturation is much smaller than in the healthy simulation, with the TOI having a lower 180 baseline value of around 45% as compared to around 75%.

181
Experimental Data 182 Experimental data will inherently contain more uncertainty for parameter fitting than 183 data generated by the model itself. This makes it important to test the Bayesian 184 analysis process on experimental data as well as that simulated from the model. The 185 data used was originally collected by Tisdall et al. [32] and is shown in figure 5. Healthy 186 adult humans had their arterial oxygen saturation reduced from baseline to 80%, whilst 187 minimising changes in end tidal carbon dioxide tension (EtCO 2 ).

188
The dataset contains three model inputs: arterial oxygen saturation, end tidal CO 2 189 and arterial blood pressure, with EtCO 2 converted to partial pressure of CO 2 . Blood 190 pressure data was filtered using a low pass 5th order Butterworth filter, with a cut off of 191 0.05 Hz, to remove noise.

192
In terms of model outputs, only NIRS signals were used: ∆HbD, ∆HbT, ∆CCO and 193 TOI. All data was resampled to 1 Hz.

Sensitivity Analysis
195 When fitting a model as complex as BrainSignals, it is important to reduce the number 196 of parameters that are required to be fit. We expect that not all parameters will have a 197 significant impact on the model output for given set of input data. Instead, we can 198 attempt to reduce the number of considered parameters through sensitivity analysis. 199 We used the Morris method [33,34], which is known to work well with a large number of 200 parameters. The method requires the time series to be reduced to a single number and 201 identifies the parameters that have produce the most variance in this summary value.

202
Previously, we have used the Euclidean distance over the whole time series as our 203 summary value but this has a number of significant drawbacks.

204
If the summary measure is the distance across the whole time series, we're failing to 205 capture specific changes that we know to be physiologically important. In the case of 206 our hypoxia simulation, for example, we want to select parameters that are important in 207 controlling the overall change from baseline. Taking the Euclidean distance over the 208 time series as a whole however does not prioritise this behaviour. Figure 6a shows data 209 generated from the same toy model function where a, b are both model parameters and is random Gaussian noise.

211
Assume that we are trying to identify the parameter that is most important for 212 determining how sinusoidal our model output is because we want to fit our model to 213 this data. We decide to undertake sensitivity analysis, using a distance measure of some 214 kind as our summary statistic that we can then pass to the sensitivity analysis method 215 and the parameter found to be most sensitive will be the one we fit.

216
To generate our data x was varied from 0 to 2π, producing data y 0 , y 1 and y 2 for the 217 parameter sets Θ 0 : a = 0, b = 0, Θ 1 : a = 1, b = 0 and Θ 2 : a = 0, b = 0.707 respectively, 218 with output y 0 and parameters x 0 being our baseline. This is seen in figure 6a. It is 219 clear from the figure that the two outputs y 1 and y 2 show very different behaviour, with 220 y 1 being the behaviour we want to optimise for.  Figure 4a shows the effect of different r t values on the shape of the muscular tension curve for a range of vessel radii. It can be seen that reducing r t widens the curve, leading to increased muscular tension for the same vessel radius. Figures 4b, 4c and 4d show the effect of both increasing and decreasing model inputs on cerebral blood flow for different values of r t . Cerebral blood flow (CBF) is given as a proportion of the normal CBF (40 ml 100g −1 min −1 ). Changing r t has a significant effect on the brain's ability to autoregulate within the model. Figure 4b shows that higher blood pressures causes a decrease in cerebral blood flow for lower r t , as opposed to an increase at the normal value of r t = 0.018 cm. Figure 4c shows that for lower r t values, CBF decreases quicker as PaCO 2 is decreased. Figure 4d shows that across all considered oxygen saturations, lower r t gives a lower CBF.   Figure 6a shows data generated from the same test function y i = a sin(x) + b + , where a, b are both model parameters and is random Gaussian noise. x was varied from 0 to 2π, producing data y 0 , y 1 and y 2 for the parameter sets Θ 0 : a = 0, b = 0, Θ 1 : a = 1, b = 0 and Θ 2 : a = 0, b = 0.707 respectively. Despite both y 1 and y 2 being qualitatively very different they are very similar when summarised using only the Euclidean distance, with y 1 having a Euclidean distance ε euc,1 = 10.01 and y 2 having a Euclidean distance ε euc,2 = 10.03. If we instead look at the scaled baseline-to-peak (SBTP) distance we find that y 1 has a SBTP distance ε SBTP,1 = 49.98 and y 2 has a SBTP distance ε SBTP,2 = 0.09. Figure 6b illustrates how the scaled baseline-to-peak distance is defined. The baseline-to-peak distance is the absolute distance from the baseline to max ({|y max |, |y min |}), This is then divided by the range of the data to get the distance as a proportion of the total change seen within the data. In this example, baseline-to-peak distance is 4.81 and the range is 6.63, giving a SBTP distance of 0.726.
Despite both y 1 and y 2 being qualitatively very different they are very similar when 222 summarised using only the Euclidean distance, with y 1 having a Euclidean distance 223 ε euc,1 = 10.01 and y 2 having a Euclidean distance ε euc,2 = 10.03.

224
Instead we can define a new summary measure, which we will call the "scaled 225 baseline-to-peak" (SBTP) distance. We know that we want to find the parameter that 226 determines how sinusoidal our model is. One way to emphasise this behaviour is to find 227 the distance from our baseline to the maximum or minimum (whichever has the largest 228 absolute value) of our data, as illustrated in Figure 6b. We then scale this by the range 229 of our data to normalise it and avoid issues comparing data of different magnitudes.

230
This gives us We then find the Euclidean distance between SBT P (y 1 ) and SBT P (y 2 ) If we use this Euclidean distance between scaled baseline-to-peak (SBTP) distances, 233 ε SBT P , we find that y 1 has a distance ε SBTP,1 = 49.98 and y 2 has a distance 234 ε SBTP,2 = 0.09. Here we see that the difference in behaviour is clearly highlighted. 235 We scale our baseline-to-peak distance because a number of model outputs 236 significantly vary over different scales. For example, cerebral oxygenation can be 237 measured through TOI which is a percentage and, as seen in figure 3 can vary over 238 10-20%. Cytochrome-c-oxidase however, varies over a much smaller range, with a change 239 of less than 1 µm being typical. Failing to account for these different scales will lead to 240 parameters that affect larger magnitude outputs being identified as more sensitive than 241 those that affect smaller magnitude outputs, even if the relative change is significant.

242
For example, if changing a parameter θ 1 causes the CCO change seen in figure 3e) to 243 double to a minimum of −2 µm, whilst a change in a parameter θ 2 causes TOI to 244 decrease to 55%, without scaling the model seems more sensitive to θ 2 because the 245 magnitude of the change is much more, even though the relative change is smaller. If we 246 consider this change proportional to the range of our data however, we account for its 247 relative size.

248
It should also be noted that this choice of metric is specific to the behaviour being 249 optimised for. For example, in the case of a signal that is non-oscillatory, a different 250 summary method would be required based around the behaviour to be replicated within 251 that particular signal. 252 We used the Morris elementary effect method [33] variant devised by Saltelli et 253 al. [35]. This provides us with two notable statistics: the mean of the absolute values of 254 the changes, µ * , and their standard deviation, σ. The larger the value of µ * , the more 255 influential parameter is on the output, whilst the larger the standard deviation, the 256 more non-linear the influence of the parameter is. The top ten most sensitive 257 parameters were chosen to fit the model. The parameter range considered for sensitivity 258 is the default value ±50%. Sensitivities are calculated for each output as well as across 259 all outputs jointly. This joint sensitivity is calculated by summing the SBTP value for 260 each output and then determining variability in this total. After selecting the most important parameters, the model was fit using the rejection 263 algorithm [36]. This is defined, as per [37], as: 264 1. Sample a candidate parameter vector θ * from the proposal distribution p(θ). 265 2. Simulate a dataset y rep from the model described by a conditional probability 266 distribution p(y|θ * ). 267 3. Compare the simulated dataset, y rep , to the experimental dataset, y, using a 268 distance function, d, and tolerance, . If d(y, y rep ) ≤ , accept θ * . The tolerance 269 ≥ 0 is the desired level of agreement between y and y rep .

270
The output of the ABC algorithm used will be a sample from the distribution 271 p(θ|d(y, y rep ) ≤ ). If is sufficiently small, then p(θ|d(y, y rep ) ≤ ) will be a good 272 approximation for the posterior p(θ|y).

273
The choice of d(·, ·) is important, just as with the sensitivity analysis. Previously the Euclidean distance has been used to fit the model but, as in the case of the sensitivity analysis, this fails to account for outputs that vary over different magnitudes. Instead, we have chosen to include a number of other distance metrics including the root-mean-square error (RMSE) and the normalised root-mean-square error (NRMSE). These are defined as where x 1 and x 2 are the two time series being compared, running over t = 1 to t = T , 274 with T being the total number of time points.

275
By dividing the RMSE by the range of the data, the errors for time series that vary 276 over different magnitudes are comparable. Without doing this, parameters that mainly 277 affect outputs that vary over larger magnitudes are preferentially optimised.

278
Normalisation prevents overfitting of one output at the expense of others, providing a 279 more reliable joint posterior distribution after fitting.

280
After an initial exploratory fitting of the different datasets, it was found that setting 281 an absolute tolerance value was not a suitable selection criteria. This was due to 282 massively differing distance values between datasets, with all parameter combinations in 283 the simulated healthy dataset producing NRMSE values smaller than almost all 284 parameter combinations on the impaired dataset.

285
In general, the number of accepted particles that gives an adequate approximation of 286 the posterior distribution is problem dependent; dispersed posterior distributions will 287 ultimately require more particles. Poor estimation of the posterior can in most cases 288 result in a wide posterior predictive distribution which appears to give a poor quality fit 289 because outlier posterior samples cause biases. To address this issue in a pragmatic way, 290 a fixed acceptance rate of 0.01% was set. This meant the 0.01% parameter combinations 291 with the lowest d(y, y rep ) were used as the posterior. The posterior was visualised 292 through kernel density estimation on a pairplot using the Seaborn plotting package [38]. 293 The posterior predictive density is then generated by sampling directly from the 294 posterior 25 times and the model simulated for each sample. The results are aggregated 295 and plotted, with the median and 95% credible interval marked on the plot.

296
The model was run in batches of 10,000,000 and the parameter combinations within 297 the acceptance rate were used as a posterior. This batch size was chosen as a 298 compromise between sufficient sampling of the parameter space and the computational 299 time required to run the batch. The quality of the fit obtained from this posterior 300 determined if the model had been run a sufficient number of times to sample the 301 posterior adequately. If the posterior predictive distribution failed to capture the 302 behaviour seen in the "true" data, then the process was repeated until a more adequate 303 fit was obtained. HbO 2 , HHb and TOI outputs. Figure 7 shows the sensitivity analysis results across all 309 four outputs individually and for the outputs considered jointly. The results are plotted 310 as bar charts, with sensitivity, as per the µ * value, on the x-axis. Table 2  Sensitivity analysis was undertaken on the experimental dataset to determine the 323 parameters to be fit. Table 3 shows the selected parameters, their respective µ * values 324 and their definitions and default values. Figure 8 shows the results across all outputs.

325
When considering all outputs jointly, the effect of n m and r m is significantly larger 326 than all other parameters, but when looking at the individual outputs it's clear that the 327 other parameters are still important, but the magnitude of the impact n m and r m have 328 on the overall variability is drastically larger.   values between approximately 10 and 1000 which is significantly smaller than the range 331 of the µ * values for TOI in the simulated data.

332
Similarly, the most sensitive parameters for HbD fall within a very small range with 333 no one parameter obviously determining the majority of the output's behaviour. In Whilst a full exploration of the parameters within the BrainSignals model is outside the 340 scope of this paper, we advise the reader to look at the original publications [4,7], and 341 provide a brief overview of some of those identified as important here.

342
A number of the parameters identified as being important for the above datasets, 343 such as R auto and mu max, are dimensionless parameters. They are often model 344 specific parameters that cannot be directly measured and instead need to be considered 345 in the context of their meaning within the model. For example, an increase in R auto 346 would mean that the autoregulatory response would become more sensitive to changes 347 in oxygen concentration. In contrast, other parameters such as Xtot, which is four times the concentration of haemoglobin, are more easily measured in an experimental or 349 clinical setting.

350
Some of the parameters identified as important are linked closely to the shape of the 351 autoregulatory response of the model and its sensitivity to changes in model inputs.

352
These are R auto and mu max in the simulated dataset, and v on, v un, R autc and 353 v cn in the experimental dataset. As we are driving the model with a changing input, 354 the identification of these parameters as important seems physiologically sensible. It  More detailed information on the exact nature of these parameters and how they 361 function within the BrainSignals model can be found in [7].  This healthy posterior was then used to define an impaired brain, as mentioned 375 above. r t was set to 0.013 mmHg and the model driven with the same inputs as the 376 healthy simulation. This "impaired" dataset was then fit using the same approach as 377 above, using the sensitivity analysis results. The model was run 30,000,000 times, with 378 the increased run number required in order to sufficiently estimate the posterior. With 379 an acceptance rate of 0.01%, a posterior was produced based on 3000 particles having  When approaching the experimental data, the criteria for a good fit were different to 392 those in the simulated dataset. With the simulated dataset, any parameters not chosen 393 for fitting would have the same value during the fitting process as during the generation 394 of the simulated dataset. In the experimental data however, it is almost certain that the 395 default values of any parameters not chosen for fitting would not have the exact same  Figure 9a shows the posteriors for healthy and impaired data based on an acceptance rate of 0.01%. Posterior are shown over the full prior range as defined in table 4 in S1 Table and table 5 in S2 Table. Figures 9b and 9c show the predicted time series data from the healthy and impaired posteriors respectively. Each posterior was sampled 25 times and the resulting runs aggregated, with the median and 95% credible intervals plotted in dark blue and light blue respectively. Figures 9d and  9e show a zoomed in view of each output in order to show the credible interval of the posterior predictive distribution. value as their biological, real-world analogue. As a result, instead of looking for a 397 perfect fit, we instead look for qualitative behaviours to be reproduced, such as the 398 periodic increase and decrease in output values due to the repeated hypoxia challenges. 399 The fitting process required 20,000,000 runs before a satisfactory fit was obtained,   Figure 10a shows the posterior distribution for the experimental data set, based on an acceptance rate of 0.01%. The posterior median is shown in black and the OpenOpt predicted value is shown in red. Posterior are shown over the full prior range as defined in table 6 in S3 Table Figure 10b shows the predicted time series for all output based on the posterior shown in figure 10a. The posterior was sampled 25 times with the resulting time series aggregated, with the median and 95% credible intervals plotted in dark and light blue respectively. Overall behaviour is reflected in the predicted trace, with 3 distinct periods of hypoxia visible as periodic behaviour within all signals. The fit obtained using OpenOpt is shown in red.
these parameters, with a median value that would be considered 'healthy'.

447
There are a number of other methods that can also be used to explore and define the 448 parameter space. The previously used maximum likelihood based method, for example, 449 can provide estimates and confidence limits of parameter values, but under the 450 assumption that the maximum likelihood estimator is normally distributed around the 451 maximum. It may also be possible to use a profile likelihood [39], but whilst this will 452 provide information about the distribution of the parameter space without assuming 453 normality, it is computationally expensive and does not take into account prior 454 information about the parameters. 455 It is acknowledged that the Bayesian approach is not without its own limitations. 456 Historically, non-trivial problems were not solvable analytically due to the high 457 dimensional integrals required. However, with the relatively recent availability of more 458 computational power, a number of algorithms and approaches are now available that 459 allow these problems to be approximated. This has seen increased uptake of Bayesian 460 approaches within the fields of systems biology and genetics, where the inherently 461 complex models and noisy data that these fields involve are particularly well suited to 462 being analysed through the Bayesian approach. As long as a statistical model can be 463 used to relate the relevant quantities, Bayesian inference can be used to give full 464 probabilistic information on all unobserved model variables.

465
One of the main drawbacks to this method is that the number of model runs 466 required to have sufficient samples in the posterior may be prohibitively high, especially 467 where the tolerance is low or the prior distribution is very different from the posterior 468 distribution.

469
This requirement for a large number of simulations for a reliable posterior is seen in 470 all of the datasets used here. For the simulated 'healthy' data, the model was run 471 10,000,000 times in order to achieve the obtained fit. In contrast, to fit the 'impaired' 472 simulated data the model was run 30,000,000 times and for the same acceptance rate 473 the accepted particles had generally higher ε NRMSE values. Finally, the experimental 474 data was only able to obtain a good posterior after being run 20,000,000 times and all 475 ε NRMSE values were significantly above those seen in the simulated datasets. This is  It should be noted that all of the obtained posterior distributions produce what are 481 considered good fits, with those obtained for the simulated datasets far more accurate 482 than we would ever expect to achieve when fitting experimental data. When looking at 483 the experimental data in particular, despite the ε NRMSE values being much higher than 484 in the simulated data, the obtained fit captures all important behaviour and 485 phenomena, with three clear hypoxia events visible in the inferred data trace.

486
More efficient methods of ABC, may alleviate the problem of requiring so many 487 model runs to obtain good posteriors. An approach based on MCMC is more efficient 488 than ABC REJ but the chain may become stuck in regions of low probability for long 489 periods of time [40]. In order to deal with this problem and also the disadvantages of 490 the rejection algorithm, an approach based on sequential Monte Carlo (ABC SMC) [37] 491 was first proposed by Sisson et al. [41], as well as Beaumont et al. [42] and Cappé et 492 al. [43]. In this approach, a number of sampled parameter values, known as particles, Developing the BayesCMD framework to use an ABC SMC approach is a key focus for 499 future work.

501
We have outlined how this new Bayesian framework for model analysis can be used with 502 models of brain haemodynamics to extract information from physiological data. A more 503 comprehensive picture of the parameter space is obtained, allowing physiological 504 conclusions to be based on a broader picture. This is most clearly seen in the 505 experimental data, where point estimates suggested that the values for a number of 506 parameters had changed significantly during fitting, whilst the Bayesian method showed 507 that the parameters were defined by a broad, roughly uniform distribution. We have 508 also shown, through the use of data simulated from the BrainSignals model in healthy 509 and impaired states, how the Bayesian approach allows us to better distinguish different 510 parameter spaces. Finally, whilst we have focussed on using the BrainSignals model 511 here, any model that can be written in a format compatible with BCMD can use this 512 method to estimate model parameters.

513
A major interest within our research group is to use these models and approaches to 514 understand and investigate further our novel measures of brain tissue physiology and 515 metabolism and how they are linked to brain injury [44,45]. In particular, we are 516 interested in neonatal hypoxic ischaemic injury. The Bayesian approach provides a 517 better representation of the parameter space and can inform a better distinction 518 between different brain states, such as between a mild and severe injury. The method 519 will also be adapted to use more efficient methods of parameter estimation, such as 520 ABC SMC, reducing the number of model runs required to obtain a given tolerance.