Computer Methods and Programs in Biomedicine Bayesian parameter estimation in the oral minimal model of glucose dynamics from non-fasting conditions using a new function of glucose appearance

Background and objective: The oral minimal model (OMM) of glucose dynamics is a prominent method for assessing postprandial glucose metabolism. The model yields estimates of insulin sensitivity and the meal-related appearance of glucose from insulin and glucose data after an oral glucose challenge. Despite its success, the OMM approach has several weaknesses that this paper addresses. Methods: A novel procedure introducing three methodological adaptations to the OMM approach is pro- posed. These are: (1) the use of a fully Bayesian and eﬃcient method for parameter estimation, (2) the model identiﬁcation from non-fasting conditions using a generalised model formulation and (3) the intro- duction of a novel function to represent the meal-related glucose appearance based on two superimposed components utilising a modiﬁed structure of the log-normal distribution. The proposed modelling proce- dure is applied to glucose and insulin data from subjects with normal glucose tolerance consuming three consecutive meals in intervals of four hours. Results: It is shown that the glucose effectiveness parameter of the OMM is, contrary to previous results, structurally globally identiﬁable. In comparison to results from existing studies that use the conventional identiﬁcation procedure, the proposed approach yields an equivalent level of model ﬁt and a similar pre- cision of insulin sensitivity estimates. Furthermore, the new procedure shows no deterioration of model ﬁt when data from non-fasting conditions are used. In comparison to the conventional, piecewise linear function of glucose appearance, the novel log-normally based function provides an improved model ﬁt in the ﬁrst 30 min of the response and thus a more realistic estimation of glucose appearance during this period. The identiﬁcation procedure is implemented in freely accesible MATLAB and Python software packages. Conclusions: We propose an improved and freely available method for the identiﬁcation of the OMM which could become the future standardard for the oral minimal modelling method of glucose dynamics.


Introduction
The quantitative assessment of the body's response to food intake in people with normal and impaired glucose tolerance is crucial in the development of therapeutic strategies for dia-(GA) in the peripheral bloodstream [9] . The OMM has been extensively validated against data from experiments using traced glucose [10] as well as traced glucose and clamp measurements [11] . In both studies, it was concluded that the OMM is a reliable tool to estimate both insulin sensitivity and glucose appearance during OGTTs and MTTs.
The OMM was proposed in 2002 by Dalla Man et al. [9] and is based on an earlier model often referred to as minimal model, which was developed for the description of glucose excursions during an intravenous glucose tolerance test (IVGTT) [12] . Both models use measured insulin concentrations as a known input and describe the simultaneously measured glucose levels as output.
Despite the OMM's success in assessing postprandial glucose metabolism, the conventional modelling approach using the OMM has several weaknesses that this paper will address. The aim of this paper is, therefore, to critically discuss the oral minimal modelling approach and propose several improvements to the conventional modelling procedure.
The main adaptation is the use of a fully Bayesian and thus probabilistic method for parameter estimation, called variational Bayesian (VB) analysis [13] , instead of the frequentist approaches used conventionally [ 2-11 , 14-16 ]. This change is motivated by the fact that fully Bayesian methods have been shown to produce superior results in the similar IVGTT minimal model [17][18][19] . Furthermore, our research group has shown the VB method's aptitude in several applications to low dimensional models [20][21][22][23][24] including models of glucose dynamics [20][21][22][23] . Additionally to the use of the VB method, this paper will demonstrate that the model parameter governing insulin-independent glucose utilisation is, contrary to previous results [9] , structurally globally identifiable. This demands appropriate adaptations to the parameter estimation procedure that are greatly facilitated by the Bayesian parameter estimation approach.
An additional weakness of the conventional oral minimal modelling approach is that it has so far only been applied to the description of a single, isolated meal response from a fasted state. This paper develops a framework for identifying the OMM from consecutive meal responses, allowing the examination of diurnal changes in insulin sensitivity and meal-related GA with the OMM. Furthermore, the paper proposes an alternative functional description of GA that has several advantages over the conventional piecewise linear formulation. To establish the validity of the proposed adaptations to the modelling procedure, a comparison to published studies using the conventional approach for identifying the OMM has been carried out.
Further drawbacks of the prevalent oral minimal modelling approach are that the frequently used parameter estimation software package SAAM II [25] is only commercially available and that some parts of the identification procedure are not described in sufficient detail or inconsistent between uses. These issues will be resolved in this paper by proposing a consistent and well-described procedure, allowing the validation of this method by independent research groups. This is greatly facilitated by the fact that the VB method used in this paper is implemented in MATLAB and Python and specific, easy to use and free software packages for the identification of the OMM are published together with this article ( https: //github.com/manueich/VBA-OMM ).

Data description
The dataset used in this work was collected by Ahmed et al. [26] in 1976 and Nuttall et al. [27] in 1985. It contains plasma glucose and insulin profiles from 26 young subjects with normal glucose tolerance (NGT) collected over 12 hours in a single day. A Table 1 Details on the subject populations consuming different meal types with standard (STAND), high carbohydrate (HCHO) and high protein (HPROT) mixtures of macronutrient content. The meal composition is given in percentage of calories contained in the respective macronutrient content. Data are given as mean ± standard error and were taken from [27] . total of three identical meals each providing 33 % of the total estimated daily calorie requirement were consumed four hours apart. Blood samples were collected at the same time in each subject after meal consumption at 0, 2, 5, 10, 20, 30, 40, 50, 60 min, then every 15 min up to 120 min and then every 30 min up to 240 min. One additional fasting sample was collected before the consumption of breakfast, i.e. at -15 min. The coefficient of variation (CV) of the plasma glucose and insulin assays are given as 1.5 and 13.4 %, respectively [26] .
Data from three different meal types of standard (STAND), high carbohydrate (HCHO) and high protein (HPROT) composition are used. The meals were consumed by three different subject cohorts assembled from the overall population (see Table 1 ), leading to 33 glucose and insulin profiles and 99 responses in total. The absolute amount of macronutrients provided to each individual subject was scaled according to body weight. Additionally, female subjects received 12.5 % fewer calories per body weight to account for the sex differences in average body composition and therefore lean mass.

Model formulation
The original formulation of the OMM by Dalla Man et al. [9] is as follows The glucose concentration and its basal (pre-test) level are represented by G and G b in mmol/L, respectively. The glucose effectiveness parameter p 1 in min −1 controls the insulin-independent glucose utilisation and V in L/kg is the distribution volume of glucose relative to body weight. The input function Ra represents the meal-related, posthepatic appearance of new glucose in mmol/kg/min. The state X in min −1 represents the insulin action in a remote (from plasma) compartment with p 2 in min −1 governing its decay dynamics and p 3 in min −2 per mU/L governing the amplitude of insulin action. The insulin sensitivity is given as the ratio p 3 / p 2 and has the unit min −1 per mU/L. The insulin concentration I and its basal (pre-test) level I b in mU/L are considered to be known, error-free inputs.
The first two terms in Eq. (1) describe the insulin-dependent and insulin-independent glucose utilisation, respectively and represent the actions of the liver and peripheral tissues on glucose metabolism. The state X , described by Eq. (2) , does not represent the concentration of a certain substance in a compartment, but rather the action or effect of insulin, highly influenced by the concentration of insulin in the plasma I . In the postprandial state, when glucose and insulin concentrations are typically above their respective baseline levels, X is positive which leads to glucose leaving the blood via the first two terms in Eq. (1) . The only source of glucose inflow to the blood is then the meal-related glucose absorption Ra from the gut. In the fasted state, at the beginning and the final part of the modelled response, glucose and insulin levels can lie below their baseline values. This can induce a blood glucose increase via the first two terms in Eq. (1) , as G ( t ) < G b and the state X becoming negative due to I ( t ) < I b , describing the mechanism of glucose output by the liver and peripheral tissues during fasting. For the case of G ( t ) = G b and I ( t ) = I b , the model is at its steady-state, meaning that the state X is zero or decays to zeros via the parameter p 2 .
This paper proposes the following generalised formulation of the OMM: In comparison to the original OMM formulation in Eqs. (1)-(2) , the insulin sensitivity S I in min −1 per mU/L is directly included, as it facilitates the definition of its parameter probability density function (PDF) in a Bayesian context and the physiological interpretability of the model in general. Further adaptations to the model are made, namely: the basal level of glucose G b is different from its initial condition G 0 , the initial condition X 0 is allowed to be non-zero, and the term Rap representing a persisting GA is included. This allows the incorporation of effects from previous meals to identify the model from non-fasting conditions.
Identical to the conventional OMM formulation [9] , the measurement error is considered to be additive and normally distributed with zero mean and a known standard deviation, calculated from the glucose assay CV. For comparability with the literature results, a CV of 2 %, instead of the 1.5 % given for the dataset utilised in this paper, is used. The details of handling the measurement uncertainty within the VB approach are provided in Section 2 of the supplementary material.
To define the time profile of the rate of glucose appearance Ra following the consumption of a single meal, work [9] proposed a parametric, piecewise linear description: The adjustable heights k 0 -k 7 at fixed breakpoints times t 0t 7 are inferred from experimental glucose and insulin data in the context of the parameter estimation problem in the OMM and define the overall shape of the GA time profile. An example profile of GA using the function Ra PL can be seen in Fig. 1 . There is no consensus about the optimal breakpoint intervals in the literature, however, it is most common to choose the first seven breakpoints at 0, 10, 30, 60, 90, 120 and 180 min and have the last breakpoint coincide with the last measurement point of the response. Furthermore, the height at t = 0, i.e. k 0 is fixed to zero [ 7 , 9 , 14 ], as no glucose is appearing at the time of meal consumption. In contrast, works [ 10 , 11 ] also fix the height between 0 and 5 min to zero and set the breakpoints at slightly different time points.
In this work, we choose the most common approach, i.e. fixing k 0 to zero and setting the breakpoints at 0, 10, 30, 60, 90, 120, 180 and 240 min. Furthermore, the effects of digestion and absorption cannot be assumed complete after 240 min. The GA function is thus described by a monoexponential decay with rate α, fixed to 0.017 min −1 after 240 min as suggested in work [9] for response durations shorter than 6 h.
Besides the inconsistency in the choice of breakpoints, another complication arises from the fact that the original OMM formulation [9] stipulates to constrain the area under the curve (AUC) of Ra based on the amount of glucose contained in the meal. The fixed AUC of Ra is calculated as the product of the amount of glucose in the meal per kg of body weight and the fraction f of ingested glucose that enters the peripheral circulation fixed to 0.9 [10] . This additional information means that one of the height parameters k 1 -k 7 needs to be replaced, which leads to two issues. Firstly, the details of this replacement procedure, particularly which of the height parameters k 1 -k 7 to replace is not mentioned in any publication. Secondly, it can lead to the height of the selected breakpoint to become negative and thus non-physiological, even if all other heights are constrained to be positive. The details of fixing the AUC of Ra through the replacement of a height parameter proposed in this paper are provided in Section 1 of the supplementary material.
Given the difficulties and disadvantages with a piecewise linear formulation of the GA function in Eq. (5) , we propose an alternative, parametric description of Ra following a single meal. It is composed of two superimposed components: where the individual components are defined by a function that utilises a modified structure of the log-normal distribution with an AUC of one: Parameters T 1 , T 2 , W 1 and W 2 govern the peak times and general widths of the respective components and therefore the overall shape of the GA time profile ( Fig. 1 ). The function Ra LN has a fixed total AUC of A and the contributions of the AUCs of individual components to the overall AUC are governed by the parameter R H which is restricted to the range (0,1). This ensures that the function Ra LN remains non-negative at all times. The fixed total AUC is calculated identically to the case of Ra PL , by multiplying the amount of glucose contained in the meal with the fractional absorption f . The use of two superimposed, log-normally shaped components in (6) with differing peak times to describe the overall shape of GA is justified by the fact that experimentally measured GA profiles often display two distinct phases characterised by a secondary hump after the initial peak [28][29][30] .
In comparison to the conventional piecewise linear function Ra PL defined by Eq. (5) , the new log-normally-based function Ra LN defined by Eqs. (6)-(7) has only five parameters ( T 1 , T 2 , W 1 , W 2 , R H ) that have to be inferred during parameter estimation, instead of six ( k 1 , k 2 , k 3 , k 4 , k 5 , k 7 ). Furthermore, the proposed input function Ra LN has the following advantages over the conventional function Ra PL : • Removed necessity to choose fixed values for the time of the breakpoints t 0 -t 7 and which height parameter to replace to fix the overall AUC of Ra PL . • Guaranteed positive and therefore physiologically justified GA rates. As explained before, it is possible for the function Ra PL in Eq. (5) to become negative as a result of fixing the overall AUC. • Increased flexibility in the first 30 min of the response because the parameter T 1 governing the peak time of the first component can be adapted.
• Removed necessity to assume a fixed decay rate α after the end of the response duration because the GA behaviour beyond the modelled duration, i.e. Rap , is completely determined by the adaptable parameters ( T 1 , T 2 , W 1 , W 2 , R H ) of Ra LN and therefore informed by the data. • Straightforward adaptation to a dataset with different meal response durations or irregular meal consumption times. As the last breakpoint of the function Ra PL in Eq. (5) is dependent on the last measurement point of the described meal response, a dataset with varying durations between meals would require the adaption of the formulation of Ra PL for every response, which is highly impractical.

Structural identifiability analysis
The previously mentioned publication [9] presented a structural identifiability analysis on the conventional OMM formulation in Eqs. (1)-(2) and input function Ra PL in Eq. (5) using the Taylor series approach [31] . Here, the conclusion was made that parameters p 2 and p 3 as well as the height parameters k 1 -k 7 are structurally globally identifiable if parameter p 1 and the distribution volume V are known. This result has been used by all studies utilising the OMM, e.g. [2][3][4][5][6][7] .
In this paper, these results are partially disproved by combining the Taylor series approach with symbolic computation. In particular, it is demonstrated that in addition to parameters p 2 , p 3 and k 1 -k 7 , parameter p 1 is also structurally globally identifiable if the distribution volume V is known. This can be shown by using one additional Taylor series coefficient, demonstrating that work [9] truncated the Taylor series expansion too early, as already hypothesised by Saccomani et al. [32] . These novel structural identifiability results extend to the generalised formulation of the OMM in Eqs. (3)-(4) with input function Ra PL in Eq. (5) , i.e. that p 1 , p 2 , S I and k 1 -k 7 are structurally globally identifiable. The details of the structural identifiability analysis are provided in Section 3.1 of the supplementary material.
To examine the identifiability of the OMM in Eqs. (3)-(4) in combination with the log-normally based GA function Ra LN in Eqs. (6)-(7) the Taylor series approach is no longer tractable with symbolic computation. We thus resort to a different approach that can establish the local structural identifiability or non-identifiability of parameters, called the observability rank criterion (ORC) method [33] . The method is implemented as a freely available MATLAB toolbox, known as STRIKE-GOLDDv2.2 [ 33 , 34 ]. Applying the ORC method to the OMM in Eqs. (3)-(4) in combination with the function Ra LN in Eqs. (6)- (7) , it can be shown that all parameters are structurally locally identifiable. The details of the computation are provided in Section 3.2 of the supplementary material. Given the symmetric nature of Ra LN in Eq. (6) , it can be deduced that there are at least two structurally locally identifiable parameter combinations, given the fact that the two input components of Ra LN can be switched for the same overall shape, which would be indicated by T 1 > T 2 .

Definition of prior distributions
The unknown parameters to be estimated in the new formulation of the OMM in Eqs. (3)-(4) are the system parameters p 1 , p 2 , and S I as well as the parameters related to the piecewise linear GA function Ra PL , i.e. k 1 , k 2 , k 3 , k 4 , k 5 , k 7 , and the parameters related to the log-normal GA function Ra LN , i.e. T 1 , T 2 , W 1 , W 2 and R H . Prior distributions for all parameters are selected as log-normal PDFs since the parameters are only physiologically plausible when positive. The exception to that is the parameter R H which is restricted to the range (0,1) by passing a normally distributed auxiliary parameter through a logistic mapping. All prior and posterior distributions are characterised by their median and coefficient of variation (CV). The prior distributions are set based on population densities determined by a study from 2004 [ 10 ]. Here, additional data from traced glucose was utilised to estimate all unknown parameters of the OMM in a NGT subject population. The details of all prior distributions are provided in Section 4 of the supplementary material. Two comments are in order. Firstly, the prior distribution of the previously fixed glucose effectiveness parameter p 1 is chosen to have a CV of 25 %. In a preliminary analysis, this level of prior uncertainty was found to provide a suitable trade-off between considering the observed population variability of p 1 in work [10] and an acceptable estimation precision of the remaining parameters, especially the insulin sensitivity S I . Secondly, the treatment of the unknown parameter p 2 governing the dynamics of state X , is inconsistent between studies. While some studies choose to fix it altogether [ 16 , 35 ], a selective "Gaussian Bayesian prior" with a CV of 20 % was considered in other studies [ 7 , 10 ]. We instead model parameter p 2 with a log-normal distribution and an increased prior CV of 40 % to better reflect its observed population variability in work [10] .

Parameter estimation procedure
The parameter estimation is carried out using a variational Bayesian (VB) approach [ 13,36 ]. It provides an approximation to the posterior distributions of unknown model parameters through the maximisation of a lower bound on the model evidence, i.e. the marginal likelihood of observing the data under the given model. This probabilistic treatment of unknown parameters allows the estimation of parameter uncertainty and incorporation of existing knowledge into the estimation procedure, as demonstrated in the previous section.
The procedure for identifying the OMM from consecutive meal responses is schematically depicted in Fig. 2  first meal to counteract measurement errors. A recalculation of basal levels before every meal is unfeasible because it cannot be assumed that basal levels are reached before the next meal is consumed. In contrast, the initial conditions G 0 and X 0 are reset for every meal, where G 0 is calculated as the average of the 0, 2 and 5 min samples. A similar approach to X 0 is not possible because this state is not directly observed but inferred by the model. Instead, X 0 is set to 0 before breakfast, assuming no active insulin due to the fasting state of the subjects [9] . For the subsequent meals (lunch and dinner) this assumption cannot be justified, so X 0 is set to the last inferred value from the previous meal, i.e. X (240). The persisting absoption Rap is calculated as Rap ( t ) = Ra ( t + 240), where the function Ra is either Ra PL or Ra LN inferred from the previous meal.
One set of unknown parameters is estimated from every single meal response in the dataset using the VB approach. To provide the insulin concentration profile as known input, it is linearly interpolated on the integration time step of 0.1 minutes. Software packages implementing the described identification procedure in MATLAB and Python are freely accessible online ( https: //github.com/manueich/VBA-OMM ). They have been designed for ease of use while requiring minimal knowledge of the underlying mathematical techniques. A detailed documentation and examples can be found online.

Validation of the parameter estimation procedure
The validity of the estimation results and thus the proposed parameter estimation procedure is examined by comparison to literature results. This comparison can identify possible differences to the results from the conventional identification procedure when applied to similar datasets. Included in this comparison are results from studies identifying the OMM under similar conditions, i.e. populations only consisting of NGT subjects and utilising responses from OGTTs or MTTs. Compared are the inference results of parameters p 2 and S I as well as the weighted residuals between model output and data calculated with where R i is the value of the weighted residual calculated from the glucose measurement point y i and the corresponding glucose level G i inferred by the model. The value 0.02 represents the glucose assay CV of 2 %. An additional model fit criteria is calculated as the root mean squared error (RMSE) between glucose data and model output.
To assess the difference in glucose appearance profiles between the two functions Ra PL and Ra LN , the following quantity is calculated:

Parameter estimation
The posterior estimates of the system parameters p 1 , p 2 , and S I for both types of GA functions are displayed in Fig. 3 . The population distribution of the glucose effectiveness parameter p 1 in (a) follows the prior distribution and the estimated medians display a low correlation (r = 0.17) between input functions. Together with the fact that the posterior CVs of p 1 in (b) only marginally decreased from the prior CV of 25 % implies practical identifiability issues in this parameter. For the case of the parameter p 2 , the population distribution in (a) has shifted with respect to the prior PDF, indicating that the individual posterior densities are informed by In contrast, the insulin sensitivity estimates are highly correlated (r = 0.93) and statistically equivalent (p = 0.66) between both types of GA functions. Furthermore, S I estimates show a high posterior precision which indicates excellent convergence from a prior CV of 100 %.
The posterior results of the input function parameters are provided in Section 5 of the supplementary material. The parameters related to the two input functions show excellent convergence to a more narrow posterior distribution, as indicated by an overall median of 11.3 % from a prior CV of 50 % for the parameters of Ra PL and an overall median of 10 % from a prior CV of 30 % for the parameters of Ra LN . Here, two observations can be made. Firstly, for the case of Ra PL , lower values of parameters generally lead to higher posterior CVs and secondly, for the case of Ra LN , the two width parameters W 1 and W 2 show higher posterior CVs, in comparison to the remaining parameters. The overall posterior CVs nevertheless confirm the structural identifiability results regarding the parameters of Ra LN . In this context, there are no cases where the respective components of the log-normal input are switched as indicated by T 1 < T 2 in all responses.

Comparison to literature
A literature search found six studies matching the inclusion criteria of having a NGT subject population and using data from either MTTs [ 7 , 9 , 10 , 14 ] or OGTTs [ 15 , 16 ]. A comparison between the literature results and the corresponding outcomes of the procedure Table 2 Comparison between the results of this work and the literature. If provided, values are given as population mean ± standard error. In the case of an empty cell, the respective results were not reported by the authors of the study.  [14] MTT 21 (420) 7.9 ± 0.6 5 --Saad et al. [7] MTT 13 (360) 6.3 ± 1.1 ---Geragotou et al. [15] OGTT 10 (210) 20.3 ± 2.8 ---Theodorakis et al. [16] OGTT 10 (120) 17.7 ± 1. presented in this paper using the piecewise linear function Ra PL is provided in Table 2 .
The number of sampling points and intervals between the dataset used in this paper and the literature studies are similar, making the overall estimation results comparable, as the estimation precision of the parameters can be affected by the number and timing of the sampling points. As with the dataset used in this work, it is common to sample at irregular intervals with a tighter sampling grid at the beginning of the test. Additionally, OGTTs allow for shorter sampling durations compared to MTTs.
The posterior CV of the insulin sensitivity parameter S I shows that the insulin sensitivity can be estimated with a precision comparable to the literature results. When comparing absolute insulin sensitivity values between different studies, it has to be noted that potential biases in the insulin measurement techniques, which are known to be up to twofold [37] , can affect the results. The estimated S I values are similar in studies [ 7 , 9 , 10 , 14 ], which could stem from the fact that these studies were conducted within a single institution, using consistent equipment and assays. When the OMM is utilised by independent groups as in this work and studies [ 15 , 16 ], differences in population means are observed, most likely due to differences in insulin measurement methods.
Only one study provides posterior results of the parameter p 2 . Here, the standard error of point estimates and posterior CVs are lower. This is most likely the consequence of the more narrow prior distribution of 20 % chosen in [9] , whereas our work utilised a prior CV of 40 %. The bias in mean population point estimates could be a result of potential biases introduced by the insulin measurement techniques.

Model fit
To assess the model fit, a plot of the time profile of averaged weighted residuals as calculated by Eq. (8) is provided in Fig. 4 . Similar to the literature, our results show very few outliers outside the -1/ + 1 range demonstrating a comparable ability to fit the data. The increased standard deviation, i.e. variability in the results of this work can be explained by the fact that, unlike the literature studies, this work used data from three different meal compositions leading to higher variability in the recorded responses.
An interesting trend can be observed in all results using the piecewise linear GA function, including the results of this work: there is a clear bias towards negative weighted residuals in the first 20 -30 min of the response. Considering Eq. (9) , defining the weighted residuals, this means that the model overestimates the glucose excursion in comparison to the data. This trend is not present in the results of the new log-normal GA function (red squares in Fig. 4 ), demonstrating the improved ability of the new input function to describe the data in this particular part of the response. In contrast, Ra LN yields slightly more biased residuals with larger variability towards the end of the response. This can be explained by the higher flexibility of Ra PL in the last 90 min of the response. For the remaining part of the response, both input functions produce a similar model fit. Overall, the RMSE values of the models using the two input functions are equivalent, i.e. Ra PL

Diurnal variability
To examine the ability of the model identification procedure to account for overlapping effects between meals, the diurnal changes in insulin sensitivity and RMSE values are analysed and displayed in Fig. 5 . The insulin sensitivity estimates decrease over the course of the day, whereas the RMSE values are unaffected by the time of meal consumption.

Estimation of glucose appearance
A comparison between the inferred time profiles of GA using the piecewise linear function Ra PL and log-normally based function Ra LN is provided in Fig. 6 , where it is demonstrated that the first peak in GA profiles occurs slightly earlier in all meal types when Ra LN is used. This can be considered more realistic because unlike Ra PL , the proposed function is more flexible in this part of the response and fits the data better, as described in the previous section. For the remaining response time, the inferred profiles in the meals of standard (STAND) and high carbohydrate (HCHO) composition display a similar pattern and low δ Ra values indicating the relative difference of GA functions. This is not the case in the meal with high protein content (HPROT). Here, the GA profile of the piecewise-linear function indicates three distinct absorption phases, whereas the new function is by definition only capable of producing two phases. Despite this lack of flexibility of Ra LN , a worse model fit in the HPROT meal cannot be detected, as demonstrated by the statistical equivalence of RMSE values, i.e. Ra PL 0.23 ± 0.1 vs. Ra LN 0.24 ± 0.1 mmol/L, p = 0.59 according to the Wilcoxon ranksum test.

Discussion
In this work, we critically discussed the oral minimal modelling approach and introduced three novel methodological adaptations to the conventional method: the use of the VB method for parameter estimation, the model identification from non-fasting conditions using a generalised model formulation and a novel function to represent meal-related glucose appearance. Additionally, we demonstrated that the glucose effectiveness parameter p 1 is structurally globally identifiable, disproving earlier results [9] . To incorporate this new structural identifiability result into the parameter estimation procedure, the proposed Bayesian approach is particularly useful as it allows the definition of a suitable prior distribution over p 1 , balancing the observed population variabil- ity from work [ 10 ] with the estimation precision of the remaining parameters. The posterior results of p 1 showed a high uncertainty compared to its prior and low correlation between medians when estimated with two different GA functions. This demonstrates that the estimates of p 1 are unreliable which prohibits the interpretation of the glucose effectiveness regarding the subject and/or meal characteristics. Furthermore, it exposes a general weakness of the OMM irrespective of the parameter estimation approach. We nevertheless argue that fixing p 1 to its population median, as was done in all previous studies using the OMM, is not suitable, because it would ignore the large population variability observed in work [10] and lead to an underestimation of the uncertainty in the remaining parameters especially the insulin sensitivity.
The remaining parameters, especially the insulin sensitivity parameter S I , could be estimated with acceptable precision comparable to that of literature results, as shown in Table 2 . This estimation precision could be achieved despite the introduction of additional uncertainties through the prior distributions over p 1 and p 2 , therefore demonstrating the aptitude of the VB model inversion approach. Similarly, the time course of averaged weighted residuals displays a similar pattern in comparison to the literature results ( Fig. 4 ), demonstrating the ability of the inversion approach to achieve equally good model fitting results.
A factor limiting the cogency of the current study is the use of a dataset that only contains subjects with normal glucose tolerance, as the OMM can be particularly useful for the study of metabolic defects in subjects with impaired glucose tolerance and type 2 diabetes. However, the easily accessible nature of the developed software packages allow other research groups, where data from different subject populations might be available, to apply the OMM and validate the proposed identification method further.
An important aspect of the oral minimal modelling approach is the fact that that the distribution volume V and the fractional glucose absorption f are fixed to population results, despite the fact that these parameters have a considerable population variability, as observed in the previously mentioned work by Dalla Man et al. (2004) [10] . Here, it was already noted that this inflexibility can impair the OMM's ability to assess insulin sensitivity on an individual level. In fact, it can be shown that the insulin sensitivity parameter S I is highly affected by the chosen values for V and f with a proportional relationship to an increase in f and an inverse proportional relationship to an increase in V (see section 5.2 in the supplementary material). This issue could be partially solved using the proposed log-normally based input function Ra PL , as the parmeter A governing its AUC in expression (6), fixed based on f and the amount of glucose in the meal, could be treated as an adaptable parameter, whose prior distribution is informed by the population distribution over f from work [10] . This would eliminate the need to assume a fixed value for f , while retaining the structural local identifiability of all parameters according to the ORC method and highlights an additional benefit of the new input function Ra PL . In this work we however refrained from this procedure as it would be a substantial deviation from the conventional identification approach and therefore disallowing a comparison with results from the literature.
In terms of the computational cost, the VB method is highly efficient. Using MATLAB 2019b running on a PC with Microsoft Windows 10 Enterprise, Intel Core i7 CPU with 3.4 GHz and 32 GB of RAM, the identification of the OMM was taking an average of 4.5 s per response. Similar results were obtained with the Python version of the software package. This is significantly less than the time of several hours reported when identifying the IVGTT minimal model using a Markov Chain Monte Carlo approach [19] and makes the proposed approach feasible in a practical setting like a physiological or clinical laboratory.
The analysis of diurnal changes in RMSE values shown in Fig. 5 (b) shows no decrease in the model's ability to fit the data from meals consumed under non-fasting conditions, which implies the validity of the associated generalisation of the OMM formulation in Eqs. (3)-(4) . Furthermore, the decrease in insulin sensitivity over the course of the day shown in Fig. 5 (a) confirms earlier results by Saad et al. [7] . Here, the conventional OMM in Eqs. (1)-(2) was applied to MTTs conducted at different times on three consecutive days, where the meals were consumed 6 h apart so that fasting conditions could be assumed independent of mealtime. In our work, we achieved very similar results regarding the diurnal changes in insulin sensitivity using data from a far less elaborate experimental procedure. This further demonstrates the utility of the generalised model in Eqs. (3)-(4) to account for overlapping effects between meals and encourages the application of our procedure to similar datasets e.g. [ 27 , 38-40 ].
The newly proposed GA function, consisting of two superimposed log-normally shaped components, has several advantages in comparison to the conventional piecewise linear representation of GA. These are mainly related to the fact that it is a fully differentiable function that does not require the specification of fixed values. The results using the new GA function show that it is possible to fit the data with the same accuracy and that the relevant insulin sensitivity parameter estimates are highly correlated and have similar, albeit slightly decreased precision. A particularly useful feature of the log-normal input function is its flexibility in the time of the initial peak. This gives the model the ability to better fit the data in the first 20-30 min of the response, as demonstrated by the time profile of averaged weighted residuals in Fig. 4 . This improved model fit makes the estimated GA rates more realistic during the initial part of the response and explains the high deviations to the piecewise linear function, demonstrated in Fig. 6 . Another indication of a more realistic estimation of GA using the log-normally based GA function is found in the previously mentioned study [10] . Using traced glucose to obtain a reference profile of GA, it was found that the piecewise-linear function overestimates GA in the first 30 min. Since Ra LN infers lower GA rates in comparison to Ra PL , the proposed log-normal input can thus be considered to provide more realistic results during the initial part of the response.
The potential use of the new GA function extends beyond the OMM to other models describing the postprandial glucose metabolism. In particular, the function could be used to describe ambulatory glucose profiles in models that only require glucose data for identification [ 23 , 41 ], where it is common that meals are consumed at irregular intervals.
The vast majority of studies using the OMM, e.g. [ 2-11 , 14 ], have been published by or with the contribution from the research group that introduced the OMM in 2002. In comparison, its application by independent research groups, e.g. [ 15 , 16 ], is rare. Possible causes for this sparsity could be the difficulties in replicating the modelling approach due to inconsistencies and a lack of details in the literature, and the fact that the prevalent software tool is only commercially available. This couldpotentially lead to a shortfall of objective and independent validation of the oral minimal modelling approach. To alleviate these issues, this paper described all relevant details of the model identification procedure which, together with the freely accessible software packages that we offer, makes it possible for other research groups to apply and validate the OMM independently. This work could thus set the future standard for the identification of the oral minimal model of glucose dynamics. In particular, the published work facilitates the comparison of the OMM results against other means of determining insulin sensitivity, e.g. [42] , and glucose absorption patterns, e.g. [43] , from meal test glucose and insulin data.