A hybrid algorithm for parameter estimation of the short-ghrelin dynamics

Food intake, bodyweight and appetite are controlled by a “web of hormones.” Recently, from this aforementioned web of hormones, several hormones have been revealed and investigated from a medical standpoint with different degrees of success: a key one is ghrelin. Accordingly, ghrelin is an orexigenic (i.e., appetite stimulant) hormone; in fact, the only one of its kind, a peripheral hormone that can influence, centrally, one‟s propensity to start a meal. On this work, we shall present a problem in parameter estimation using evolutionary algorithms in conjunction with local search, what we have called herein hybrid algorithms (global search + local search); additionally, we apply artificial neural networks (feedforward neural network) for supporting the numerical simulations (what we have named “fake data”). Moreover, we present a mathematical model for ghrelin partially published elsewhere by the same authors; in addition, we have confronted the model mathematically with in vivo data via parameter estimation (well-known as validation) and got promising results for the novel mathematical formulation for ghrelin dynamics. Thus, our aim is showing that our algorithms can be imperative for fitting the current and future versions of the model. Notwithstanding the parameter estimation was unable to model precisely the experimental data, most likely due to physiological details still unclear in the medical literature, it generated an optimized curve relatively close to the experimental data, leaving promising results for future investigations.


I INTRODUTION
Food intake, bodyweight and appetite are controlled by a "web of hormones" [1]. Those hormones may play similar/equal role, or quite diverse ones, but in the overall their actions emerge towards a common physiological role to make possible the precise control of bodyweight (energy homeostasis) that can be seen in most living systems (e.g., humans). Accordingly, the workings of this web of hormones culminate in food intake/appetite control (i.e., energy input). How the human body accomplishes bodyweight and food intake control precisely is still an ongoing research (the details of the physiological process), notwithstanding it has been done substantial progress in the last decades by unveiling key hormones such as ghrelin, leptin, and insulin; three key players on food intake and energy homeostasis control.
Ghrelin is an orexigenic (i.e., appetite stimulant) hormone [3][4][5]; in fact, the only one of its kind, a peripheral hormone that can influence centrally one"s propensity to start a meal [3]; its effect seems to be mediated by a group of neurons in the brain (arcuate nucleus) [6]. This group of neurons is the same aimed by leptin and insulin, mediated for releasing neuropeptide y. Ghrelin was found to be influenced by or/and influence key hormones, e.g. leptin and insulin, which may operate in different time-scales. Additionally, ghrelin is a short-time scale hormoneit operates in hours, it is said to increase about 1-2 hours before each meal, and to fall off about 1 hours after a meal has been terminated [3] -. The fall-off of ghrelin concentrations in blood-stream as a consequence of meals is a function of some factors such as macronutrients present in the foodstuff (e.g., carbohydrates) [7]. Our main aim is showing that our algorithms, based on hybrid algorithms instead of plain search, can be imperative for fitting the current and future versions of the model (future versions of the model might get much more complicate due to hormone/factor interconnectivity [1]) to experimental data, and we get it by the numerical results ( Fig. 2) 1 . As we shall see, our optimization problem, eq. 8, is a residual error function that should be minimized, employed for measuring how well or badly is our model performing when compared to experimental data, called parameter estimation/system identification [2]. Our hypothesis is that our model, see for more [1,4] discussions, can be used to explain experimental data, collected by independent group, in vivo data, from human. And we succeed at a first stage, as can be seen in the simulation section.
On this work, we shall present a problem in parameter estimation using evolutionary algorithms alongside local search, what we have called hybrid algorithms (global search + local search, Fig. 3); furthermore, we apply artificial neural networks for supporting the numerical simulations. The work is based on Pires (2017) [1] and Pires at al (2017) [4].

I.1 ORGANIZATION OF THE WORK
In the next section, Mathematical Formulation, we present the model per se, the mathematical formulation in detail. In the following section, Parameter Estimation, we take the model presented previously and confront it with experimental data (in vivo vs. in silico). In the section following, Discussions, we add something more to the model, such as future works. Finally, we close the paper with a brief section, Final Remarks. We provide the reader with a concise set of references following the Final Remarks.

II MATHEMATICAL FORMULATION
On this section, we shall present and discuss on a ghrelin model/dynamics based on insights already discussed somewhere else by the same authors [1][2][3][4], with gastric emptying rate feedback (the gastric emptying rate was added to the model in an at-tempt to respond to some literatures (e.g., [16]) pinning down this imperative physio-logical detail); then, we shift to parameter estimation (which is an independent section).

I.1 EQUATIONS
The first equation is for the stomach, eq.1. The equation is a classical "input-output" (deterministic) differential equation, which can be encountered in most books in ordi-nary differential equations. Accordingly, eq.1 aims at modeling the dynamics of in-put-output that occurs in the stomach as a consequence of foodstuff intake (based on mass conservation). The equation is composed of an input term, i.e. foodstuff from mouth to stomach, and a second term from stomach to small intestine (called chyme). Notwithstanding the small intestine is composed of threeanatomical segments (duodenum, jejunum, and ileum), we shall not model them in detail; details may be needed in future models omitted herein since several literatures found that the control of ghrelin production seems to be physically located between the duodenum and jejunum (the middle and last section of the small intestine). (1) We have, as a direct consequence of eq.1, the following equations: i) eq.2 is for food intake (F(t)); ii) eq.3 is for modeling gastric emptying feedback (f1(D(t)S(t))). Eq.2 is a simple degree/step function, it takes a constant value within an interval of time (i.e., lunch time, represented by τ in the equation), and zero otherwise (outside the interval of time t + τ). Of course, this function is not completely physiologically plausible because one eats in a more complex manner; this constant value can be seen as an average for the whole interval. Eq.3 is a function for controlling the output of the stomach (pylorus) based on excitation of the small intestine (represented by the letter D). (2) In the upcoming tables (Table 1 and Table 2), we have organized the states variables and parameters of the model.  Source: Authors, (2018).
One important (modeling) detail about eq.3 is that it does not model the true physiological phenomenon (see Fig. 1), the feedback from the small intestine into gastric emptying rate. Consequently, eq.3 it models a surrogate variable herein called small intestine (D). The rationale behind this modeling simplification is that the hormones that truly participate in the dynamics of controlling the gastric emptying feedback are all produced directly/indirectly proportionally to the amount of nutrient load in the small intestine, as so, being the amount of food in the small intestine as we are methodologically accepting a pretty good approximation.
Next equations are: i) small intestine, eq.4; and ii) ghrelin dynamics, eq. 5. As a direct consequence of eq.6, we have eq.7. Eq.7 was created to account for the shift in dynamics between day and night behavior of ghrelin (around t= 25h =2 am [3]), it was observed in the literature [3]; however, a plausible physiological explanation has not been yet published to the best of our knowledge, and eq.7 is the most naïve way to handle the issue mathematically. (4) As a direct consequence of eq.5, we have: In the upcoming figure, Fig. 1, we have a scheme of the gastric emptying feed-back: as the small intestine is loaded with (processed) foodstuff, the more hormones are produced; and the more hormones are produced, the higher is the suppression effect; one exception omitted is ghrelin, which induces gastric emptying, in the scheme we should have for ghrelin an arrow-headed vector instead of bar-headed vector (which means suppression).  Fitness/loss function. The fitness or loss function is a mathematical relation used to measure how well/badly an algorithm/model is performing on its task [8]; it is minimized for finding the best parameters of a model. Thus, for having a fitness function we need to have a way of measuring what we want; i.e. a measure of distance from "bad to good." Once we have this measure, we just need to minimize/maximize this relationship; in biology, that may become problematic since not always the optimal is what we can find in real physiological systems, and such an approach is merely a mathematical trick. The optimal parameters found by an algorithm may not be necessarily physiologically plausible, or the one actually tuned by evolution.
The commonest function is called Residual Sum of Squares (RSS), also known as the Sum of Squared Residuals/ Sum of Squares Residuals (SSR) [8]. In one of its variations, it is known as Mean Squared Error (MSE). Hence, the residual sum of squares reads on its simplest shape: (8) Where our experimental dataset is given by: [ ]; "N" is the number of experimental data pairs (input-output) we have. The data is a matrix of two columns, one for the input and the corresponding output, and N lines, in which lines correspond to one realization of the physical system under investigation. ( ) can be "anything," from a neural network with weights to be adjusted to a polynomial series with coefficients to be adjusted or a dynamical system with parameters to be adjusted in order to achieve minimal error, the latter called parameter estimation or system identification. Furthermore, are the inputs, the same set applied to generate ; are the model adjustable values (e.g., weights for neural networks, coefficient for regressions and parameters for dynamical system). Some [8] normalizes eq.8 with the standard deviation, however we do not have this measure directly, thus we have chosen to use the "raw" RSS.
Generating data. One issue that our data presents is that it is not abundant: the more data we have the better usually is the parameter estimation process. A second issue is that we just have data for ghrelin in bloodstream, no data for the gastrointestinal tract dynamics (the second issue was handled by fixing some parameters in order to avoid several possible solutions, see table 2, which would have been pointless letting it free). Thus, we need to generate/sample more data (unfortunately, we cannot redo the experiment, and we challenge medical doctors and biologist reading this paper to sample the data we need to enforce the model).
Two pathways can be followed for exploiting the data we have, both tested herein, and giving to a certain extent the same result for our problem: 1) simulate the dynamical system and compare the distance between the experimental sample time and the simulated time, take the closet one; 2) Use another methodology for creating "fake" data, so we have data for any time we need, for the numerical routine. We preferred the latter for allowing more flexibility regarding future changes in the model. We applied a feedforward neural networks to learn the data we have, and replacing them as input for the parameter estimation process.

III.2 METHODOLOGICAL CONSIDERATIONS
We provide an exhaustive parameter estimation, testing several possible combinations of what we have called herein "hybrid" algorithms (global search + local search, cf., Fig. 3); the motivation is gathering the well-known strengths of each technique in one practice, i.e. global search is well-known to explore the search space, whereas local search to converge fast to the closest optimum. After the exhaustive investigation on the model from a parameter estimation perspective, we can conclude that any improvement is no longer an optimization issue, but rather future versions of the model for accounting for missing details; the model presented herein can be called the "minimum/reduced model" keeping in mind a more wellelaborated version is discussed on [1].

III.3 SIMULATIONS
For solving the dynamical system presented previously numerically (eq.1-7), we preferred our own coding, a Runge-Kutta 4. As we shall see, we have tested several optimization routines; all the optimization methods were borrowed from Matlab, built-in functions, inserted within the scripts employed for accomplishing the simulations reported herein. For the local search, just one method was tested, we have done just one simulation, whereas for the hybrid methods, composed of one stochastic search (Fig. 3), we have done ten simulations and reported the average values for the parameters and the interval of confidence for the optimal values using the Gaussian distribution (α=5%). Some of the codes will be shared on: https://www.dropbox.com/sh/8v3knol6u1zwp69/AABW w6TeXW3VzEYLlAHYtrV-a?dl=0 It was tested several routines, see [1] for comparing, however, we just report the imperative simulations, the least promising results are left for curious readers to get in touch. One interesting case to report is regarding particle swarm. Unfortunately, due to unknown reasons it did not work well for the problem herein, even after parameter changes; therefore, we shall not report the hybrid algorithms designed. Because our problem is physiological, it is senseless allowing any solution: bounds were set at the parameters. Two ways can be done for bounds: i) using an algorithm based on bounds; ii) creating an external trick. We have tested both; details on the external bounds can be found on [2]. The advantage of external bounds is that we can use unconstrained optimization (generally faster than constrained optimization), whereas for internal bounds we can avoid problems that can happen when we arrive close to the bounds. Just one local search is reported, but we tested several of them: Nelder-Mead Method. The problem with local search is the need to set initial conditions: it was set by hand-picking, which can be cumbersome if we must change the model, or any detail, for future versions of the model. As so, the hybrid methods overcome this detail: but they are surely slower and give more variability on the final solutions.
In the upcoming table, Table 4, we have the algorithms and their outputs. The results are quite similar, with small differences between the methods. For the hybrid methods, due to the first step being stochastic search, we ran it out ten times, and took the average value; for the fitness function, we report an interval of confidence based on a Gaussian distribution (α=5%).

V DISCUSSIONS
The application of soft computing (e.g., genetic algorithm) is not uncommon in biological systems, as we have done herein. Therefore, we show once more numerically the possibility to apply soft computing successfully and hybrid algorithms on "parameter estimation', as an alternative route for classical methods/local search.

V.1 THE NUMERICAL SIMULATIONS
On Table 4, we have the numerical methods applied and their outputs. The methods were divided into: local search and global search. We reported just the interesting results, further results are left as curiosity for interested reader to contact us. The global search was done in the hybrid style: one local search + a global search, see Fig. 3. Table 4: List of optimization methods and their outputs (see Table 6). In the upcoming table, Table 5, we have initial conditions used on the local search. Source: Authors, (2018).

V.2 EFFECTIVENESS OF THE MODELING
Ghrelin has two key dynamics: i) day-night dynamics; ii) long-short term dynamics. The former has to do with the shift of physiological behavior observed around 2 a.m [3], whereas the latter has to do with the double-dynamics observed in ghrelin [12], one in meal-like timescale and the other in terms of bodyweight timescale. In our model, we just concerned about the day-night dynamics; insights for the long-short term dynamics is presented by [2,5]. As we can see from eq.7, we solved the problem by adding a two-state variable, one ghrelin production rate for day, when meals drive the ghrelin production, whereas at night it is internal physiological processes, modeled by a second value for ghrelin. As we can see from Table 4, the value estimated for night is slightly smaller. This can be explained based on the fact that at night our hormonal machinery is reduced (metabolism), which most likely affects also the production of the hunger hormone; see that ghrelin is a pleiotropic hormone, which means that its functions go beyond just food intake control. The half-life of ghrelin in humans is about 240 min [12], which gives us an elimination rate equals about "0,003 min-1"or "0,17 h-1"; we found 1.9958, which is relatively close, the value 0.007 was applied by [13] in mice studies.

V.3 IMPROVING THE MODEL
Remodeling the gastric feedback. The model for gastric feedback we have applied herein is a simplification for the physiology behind. It is true that as the small intestine starts to be excited by nutrient load present in foodstuff, the hormones that control gastric feedback shall be either suppressed or produced, thus we have a positive correlation. However, the true physiology behind, if we want to have a model closer to reality, must be taken into account. It means that we must model the hormones individually, or at least some of them. Ghrelin also was found to control gastric emptying, in an opposite direction, when ghrelin is high, it seems to induce gastric emptying, rather than suppressing, as most of them do.
The "night mode". The "night model", as we see it here and may be found in the literature with other names, is a sequence of hormone changes due night/sleeping period. One example is the dawn phenomenon (cf., [17]). In our model, we have modeled the effect of that on ghrelin production as a "two-state" function (day or night); in fact, we are assuming that the "night mode" is responsible for the decline in ghrelin concentration at night, where there is no meal to explain the fall-off. More interesting modeling would be to take into account "hormones", being it "fake" (a mathematical trick) or real (based on physiological observations); thus, the concentration of this hormone must grow throughout the day and become significant high at night. Some of this fall-off seems to be explained by leptin, that can join independently or as net force on this fall-off of ghrelin during the night.
The production rate. As we can see in Fig. 2, the model fits well on the two meals after breakfast, but fails between breakfast and lunch. It happens because the ghrelin production rate, a constant β, is the same for the whole day. We may need to consider a time-dependent "constant" for ghrelin production; or even consider having different suppression factors for ghrelin as a function of food content, which is not an unknown physiological fact that ghrelin responds differently to different macronutrients.
Optimization methods. We have used several optimization models. We honestly believe that from an optimization perspective, we did more than enough, accordingly any future problem is regarding model improvement. We did not test all the combinations on the "hybrid style", since we have a considerable amount of local and global search techniques to consider. Others were not considered, e.g. evolutionary strategies, except for scientific curiosity, we see no strong reason to elongate further this issue; it seems unlikely that other methods may improve the fitting, if we need it somehow.
The loss/fitness function. One problem with our parameter estimation is that we do not have data for the stomach and small intestine (gastrointestinal tract), from the same experiment with ghrelin. It means that we either must fix the parameters from the gastrointestinal tract, to avoid multiple possible solutions, or we need data for the gastrointestinal tract 2 . To the best of our knowledge, at the current time, there is no paper that would fulfil the last possibility, thus avoiding multiple solutions for the parameter estimation procedure. A possible fitness function would be, which is just an extension of eq.8: (9) Where: ( ) is the stomach dynamics, and experimental data, like the one for ghrelin, measured in time; ( ) is the dynamics for the small intestine (or any other chosen compartment, e.g. jejunum), and the time dependent experimental data.

V.3 FINAL REMARKS
On this short paper, we have presented a mathematical model for ghrelin dynamics, a hormone now well-known to play a central role on appetite/food intake control. We have confronted the model mathematically with in vivo data via parameter estimation. Notwithstanding the parameter estimation was unable to model precisely the experimental data, most likely due to physiological details still unclear in the literature, it generated an optimized curve relatively close to the experimental data, leaving promising results for future investigations. As a key future work is obtaining in vivo data for the gastrointestinal tract, by this manner estimating also the parameters relative to the gastrointestinal tract and comparing them with values already found using other methods.

IV ACKNOWLEDGMENTS
This work is a republication of "A hybrid algorithm for parameter estimation of the ghrelin dynamics based on in vivo data" (Congresso Brasileiro de Inteligência Computacional 2017): http://cbic2017.org/papers/cbic-paper-9.pdf