A PARTICLE SWARM OPTIMIZATION BASED NONLINEAR AUTOREGRESSIVE MODEL FOR PREDICTING PLASMA GLUCOSE LEVEL FOLLOWING IVGTT

An important aspect of real time glucose control in diabetic subjects is forecasting the future glucose profile. From a modeling perspective, the complexity of interactions between plasma glucose and its controlling hormone insulin, demands a nonlinear modeling approach for representing the dynamics of glucose metabolism. In this paper, a technique based on the hybrid approach of nonlinear autoregressive modeling and particle swarm optimization (PSO) has been applied for modeling glucose metabolism and predicting the glucose profile from past glucose-insulin time series data. The identification of nonlinear autoregressive model involves selection of proper terms from a family of regressors, so as to minimize the deviation between the model and actual output. However, since the existing selection approaches do not involve the reduction of overall error, but it's minimization, the final model obtained is not always globally optimal. In this regard, the identification process has been framed as an optimization problem and solved using PSO. The proposed model has been analyzed on the clinical data of sixteen subjects. It has been demonstrated how the combination of an evolutionary computing technique and nonlinear autoregressive modeling is more efficient in predicting the evolution of glucose profile as compared to the widely used minimal model.


Introduction
Diabetes mellitus (TIDM) is a metabolic disorder characterized by glucose levels beyond the normal range of 80-120 mg/dL.The enhanced level, commonly referred to as hyperglycemia results from the lack or complete absence of the internally secreted controlling hormone; insulin.With diabetes reaching epidemic proportions, recent times have witnessed an increased attention in the development of methodologies for diabetes management by external infusion of insulin.The conventional approach for blood glucose control, involve insulin infusion at specific intervals in a day, generally followed by meal.However, the time varying and nonlinear nature of the physiological parameters makes this open loop approach highly unreliable.Recent advances in the field of sensor technology and continuous glucose monitoring (CGM) has opened new horizons for real time glucose control.State of the art CGM systems allow continuous glucose measurement for several days in a minimally invasive environment [1].A significant application of CGM system is the generation of alerts for hyper/hypoglycemic conditions ahead of time based on the current and past glucose profile integrated with a time series model.Based on the alert, a suitable metabolic control is achieved as compared to applying insulin therapy following hyperglycemia.This motivates the development of models that would be able to online predict the future trends in glucose concentration.The likelihood of predicting future glucose concentration using past values was first postulated by Bremer and Gough in [2].Using a linear autoregressive model fitted on a published data, it is concluded that the glucose dynamics are not random and the concentration can be forecasted at least for the near future (10 min ahead of time).Since [2], a large number of techniques based on autoregressive (AR) modeling have been proposed for predicting future glucose trends.In [3], a first order linear extrapolation of previous two/three glucose samples is adopted for generating alerts.Other notable works in this field includes development of an adaptive AR model of first order [4], robust AR model with inter-patient variability of order 30 [5] and a subject independent universal model [6].Some of the notable works that has used both glucose and insulin data includes a second order autoregressive with exogenous input (ARX) model [7] for a dataset obtained from 41 patients in ICU and a recursive ARX model identified for diabetic human and canine [8].The ambulatory data of two diabetic subjects is utilized in [9] for the development of AR, ARX and autoregressive with moving average and exogenous input (ARMAX) models.However, all the existing autoregressive modeling approaches on glucose concentration have assumed linear interactions between glucose and insulin.The presence of nonlinearity in the dynamics of glucose-insulin interactions have been well documented in the various nonlinear physiological models of glucose metabolism [10].In view of this, in the Abstract present work, a nonlinear autoregressive with exogenous (NARX) input model has been adopted to predict the future glucose levels from the past glucose and insulin levels.Because of its ability in describing both the stochastic and deterministic components of a nonlinear system, the NARX modeling approach has been successfully applied for representing the dynamics of several nonlinear systems [11].Obtaining an optimal structure from a family of models is essential in NARX modeling.Many algorithms proposed in literature involves structure selection using different performance metrics [12][13][14][15].The forward regression orthogonal algorithm (FROE) algorithm [12], which involves a parallel approach for structure selection with parameter estimation though quite suitable for short term prediction often results in redundant and spurious models.The resulting models are characterized by poor long term prediction and simulation performanace.A structure selection algorithm referred to as simulation error minimization with pruning (SEMP), which employs simulation error minimization and automatic model reduction has been proposed in [13].However, instead of minimizing the overall error, both FROE and SEMP seeks to minimize the iterative reduction in the error beteween the model and actual output, hence the final model obtained is not always globally optimal.Genetic algorithm has been successfully applied in NARX modeling for minimal structure detection [14] and simultaneous model order and parameter estimation [15].Since both the algorithms are based on the minimization of one step ahead prediction, the final model obtained yields unsatisfactory results for long term prediction.In comparison to the modeling based on one step ahead prediction, the model structure selection based on simulation error besides performing better for long term prediction and smaller sampling time is also found to be more robust to the excitation characteristics of the identification dataset.In view of this, in [16], the identification of NARX model is carried out by direct minimization of simulation error with the selection of terms based on genetic algorithm.The technique is applied to model the dynamics of glucose interactions following a physiological test.The genetic algorithm based techniques though capable of generating models with superior simulation performance, quite often require large execution time.Considering the need of larger prediction horizon in the present problem of forecasting glucose levels and reduced computational effort in identifying the model, a particle swarm optimization (PSO) based approach for model identification is proposed by direct minimization of simulation error.The model identification process has been formulated as an optimization problem, based on the selection of terms from a large possible set with the aim of approximating the given experimental data.Results demonstrate that the proposed approach is able to predict the dynamics of glucose utilization, about three hours ahead in time following a clinical test i.e.Intravenous Glucose Tolerance Test (IVGTT).In a recent work on the application of PSO for hypoglycemia detection, a hybrid approach based on PSO and fuzzy reasoning model has been used for hypoglycemia detection using physiological parameters derived from the ECG [17].However, since the glucose metabolism dynamics were not represented by an explicit model, the prediction of the glucose profile over a time horizon is not possible.
The organization of the paper is as follows: section 2 describes the NARX modeling technique, section 3 deals with the formulation of NARX model identification as an optimization problem and its solution using PSO, the identification results of the model for glucose metabolism are dealt in section 4 and finally section 5 provides conclusions.

NARX Modeling
The IVGTT is a simple experimental procedure to determine insulin sensitivity i.e. the ability of insulin in accelerating the process utilization by muscles and adipose tissues.It is carried out by injecting a bolus of glucose intravenously, followed by repeated measurement of the plasma glucose and insulin concentrations over a period of time (generally three hours).Several works has been reported on the modeling of physiological interactions following an IVGTT by a set of identifiable parameters, the most notable among them being the Bergman minimal model [18].
In this work, the dynamics of glucose metabolism which are known to be highly nonlinear, stochastical and subject dependent has been modeled in a recursive way by representing the present glucose level as a function of past glucose and insulin levels.The output function, i.e the plasma glucose level at any sampling instant k has been represented in its general form by the following NARX function [19] ( ) ( ( -1),..., ( -), ( -1),..., ( -)) ( ) where the output y(.) and input u(.) refers to the plasma glucose and insulin levels, f(.) is a nonlinear function of certain degree l, n and n are maximum lags in the input y u and output respectively and is a white noise sequence.Equation ( 1) can be reformulated in a linear regression function form as where and are monomials(regressors) made of linear and nonlinear combination of delayed input and output terms.
The most challenging problem in NARX modeling is the optimum structure selection.For a given structure, the number of monomials that can be included in the model is quite large.For example with n =n =4 and y u l=2, the number of possible terms are 45.In reality the actual system in Eq. ( 2) may involve only 10-15 terms.Direct estimation involving all the terms will make the parameter estimation task ill conditioned, while an over simplified model with very few significant terms will not be able to capture the actual dynamics of the system.A proper structure selection process involves a trade-off between achieving accuracy and identifying a simplified model.Most of the techniques developed for the estimation of NARX model have concentrated on the minimization of one step ahead prediction error i.e. deviation between the model and actual output one sample instant ahead of time.However, a model based on one step ahead prediction gives erroneous output for cases where simulation and long term prediction is involved.A technique based on the minimization of simulation error is developed in [12].However, since the technique does not incorporate minimization of the overall error but only it's relative reduction, the final model obtained is not optimal.In this regard, a PSO based approach has been developed for the identification of a NARX model from glucose insulin time series data (IVGTT).The model aims at long term prediction of the glucose levels for a given initial condition and insulin secretion profile.The PSO based model structure selection is described in the next section.

PSO Based NARX Modeling
Particle swarm optimization is a population based evolutionary technique inspired by the social behavior of bird flocking and fish schooling [20].It utilizes a population of particles that move within the feasible parameter space with given velocities.In recent years, PSO has emerged as a popular technique for solving complex optimization problems because of it's ability in not getting effected by the size and nonlinearity of the problem.It can converge to global optimal solution for cases where the conventional gradient based techniques fail to converge.The application of PSO for the present problem, involves determining an optimal model structure for representing the dynamics of glucoseinsulin interactions based on the minimization of deviation between the model and experimental output.The different steps in the application of PSO for NARX modeling for a predefined input-output lags and nonlinearity degree are summarized as follows.
Encoding: The initial population is generated by encoding a set of possible models into binary strings.Each bit of a binary string for a given population represents the presence or absence of a particular term of the regressor set.For example with n =n =3 and l=2, the y u regressor set size is 28 and the possible terms are: Then a candidate model is represented by the string 110100001…0 .The number of 1's in the string may be constrained to the maximum number of terms desired in the model.Utilizing the inputoutput time series data, the parameters of the model (2) corresponding to the randomly generated binary strings are estimated from least squares technique with orthogonalization by Gram Schmidt process [21].
Fitness Evaluation: The effectiveness of the possible models in the population set in representing the glucose-insulin dynamics is determined in terms of mean square simulation error (MSSE).The MSSE, which reflects the error between the simulation output of the model and the experimental data is represented as where is the actual output of the system at N measurements and is the candidate model's simulated output.As mentioned earlier, for the present problem, the plasma glucose level following the intravenous infusion of insulin represents the output and the plasma insulin level at different instants of time is the model input.Each particle representing a possible model is represented by replacing the 1's in the encoded binary string by the corresponding estimated parameter.
In PSO, each particle can be considered as a candidate solution to an optimization problem, which is represented by a vector in a n dimensional space.Where, n is the number of parameters, which for the present case is the number of regressors.Iteratively, the particles moves toward new points in the search space with a certain velocity with the aim of reducing the objective function.The initial velocities are randomly selected from a normal population In the present case, the search involves finding the appropriate parameter vector and structure for the NARX model that minimizes the objective function (4).In standard PSO, after every iteration, the position and velocity of each particle is updated as

r p t x t c r g t x t x t v t x t
(5) where i represents the particle, c and c are positive The first component in (5) refers to the tendency of the particle to move along the same direction in the search space in the next iteration.The second component represents the attraction of the particle towards the best position ever recorded by the particular particle: The third component in the velocity updation equation is the attraction towards the best position recorded by all the particles: The positions and are commonly referred to as local and global best respectively.Unlike the standard PSO, where r and r are considered as one- restricts sampling within a parallelepiped region between the two best positions p and p [22].Also, bounds are i g imposed on the particle position to restrict the model parameter space within the feasible region.A significant drawback of the standard PSO given by ( 5) is the absence of any mechanism to check the uncontrolled increase in the magnitude of velocity.This leads to divergence of the particles, commonly referred as 'explosion' of the swarm [23].A simple approach adopted in the present work addresses the above problem by imposing bounds for velocity at desired levels.This restricts the parameters from making large alterations from their current position.
Although, the concept of maximum velocity improved the performance of the standard PSO (5), it is not able to concentrate the particles in the vicinity of the better solutions.In the later iterations, the particles tend to oscillate around their best positions.This is attributed to the inability in controlling velocities.This led to the introduction of a new parameter w, called inertia weight, resulting in the following PSO variant [24] 1 1 2 2 ( 1)

t wv t c r p t x t c r g t x t x t v t x t
The remaining parameters remain same as that of the standard PSO (5).The inertia weight is selected so as to reduce the effect of with increase in the iterations.The common choice is to reduce w in a linear fashion, so as to eliminate oscillatory behavior in later stages.However, considering the nonlinearity involved in the present problem, w is decreased in a manner so as to complement the nonlinearity of the system to be optimized.The updating schema of the weight (w) is given by: In population-based search optimization methods, high diversity in the particles during the initial iterations results in larger spanning of the search space.Whereas, in the later iterations it is preferable to allow fine-tuning of the possible solutions to efficiently locate the global optima.In this context, time varying acceleration coefficients are to properly control the global and local exploration of the swarm.This has been achieved by suitably adapting the acceleration coefficients c and c .Similar to (7), the following updating scheme has been adopted for the acceleration coefficients: In each iteration, the model with the global best solution is compared with that of the previous best.If a very insignificant improvement is seen for some successive iterations then the algorithm is stopped, otherwise all the operations described above are carried out till a model is obtained with a pre specified performance criterion.

NARX Modeling from IVGTT Data
The PSO based NARX modeling approach discussed in the previous section is tested in it's effectiveness in predicting the plasma glucose levels over a time span (generally three hours).The prediction of the future glucose levels is achieved by deriving a model from the plasma glucose -insulin time series data following an intravenous glucose tolerance test (IVGTT).For arriving at the NARX model, the plasma glucose level is considered as the output, i.e. the present plasma glucose level is considered as a nonlinear function of past plasma glucose and insulin levels.The solution of the optimization problem consists of finding the model parameters that minimize the performance index (4).In accordance with the minimal model representation, where the plasma glucose and insulin concentration are represented in terms of their deviation from the basal value, the IVGTT time series data has been processed correspondingly.The mean of the last three measurements at strady state has been considered as the basal value.The peak value of the glucose profile has been considered as the initial condition to account the delay between the intravenous delivery of glucose and it's distribution in the plasma space.During the IVGTT, to capture the high frequency dynamics in the initial phase of glucose bolus distribution, the measurements are taken at a sampling time interval of one minute.As steady state is reached during the later stage with the glucose concentration approaching the steady state, samples are drawn at larger time intervals.
The structure identification process is initialized by considering a model with possible terms obtained by a randomly generated binary string.Based on the effectiveness of the model in exhibiting the dynamics of glucose as determined by the MSSE (4), the model parameters are iteratively updated as per (6).The parameter updation process is carried out till a predefined MSSE is achieved or if the updation process for some successive iterations did not lead to a significant change in the MSSE.A reduction in the deviation among the local best of different particles is observed with increase in the number of iterations for the dataset of all the subjects.Considering all the terms irrespective of the contribution of the individual regressors to the system dynamics resulted in a complex with high MSSE.The iterative variation of the MSSE for the global best particle for two subjects is depicted in Figure 1.The maximum number of iterations and particles are set to 500 and 100 respectively.However, for many subjects, the convergence criterion is found to be satisfied at lesser of generations with reduced population.After the convergence of the algorithm, simulation performance of the identified model is tested on the identification dataset.The deviation between the simulated glucose profile obtained from the model and clinical test output, commonly referred to as residuals, is shown in figure 4 for different subjects.The maximum deviation is observed in subject # 6.The near zero mean for the residuals is

Regressors (b)
reflected in almost all the subjects.For all the subjects, because of the transient dynamics involved initially for the distribution of injected glucose, a relative high deviation is observed in the initial phase.

Coefficient Value
Besides providing a lower MSSE on the validation dataset, a proper model should also satisfy the correlation test on the residuals.The correlation analysis is required to validate the performance of the model against variation of the dataset.For proper capture of the system dynamics by the identified model, the residuals should have low autocorrelation for different lags [26].A high correlation suggests the incorporation of dataset dependent noise components in the model.Such a model is suppose to perform poorly on a dataset with different noise property.Figure 5 reports the correlation as a function of different lags for the residuals of the models identified for subject # 5 and subject # 6. Subjects with near exponential decay were identified by models with no/negligible contribution by the nonlinear terms as compared to subjects with nonexponential glucose decay arising because of the nonuniform distribution of injected glucose and/or glucose release by the liver under hypoglycemic conditions.This reveals the variability in the extent of the presence of nonlinear interactions in glucose metabolism among different subjects.For some subjects (like subject # 6), the inclusion of further terms of higher order (in excess of two) lead to a substantial reduction in MSSE (results not reported here), but for generalizing the results among different subjects the set is kept fixed for all subjects.Subjects with high insulin sensitivity are characterized by the relative high magnitude of the coefficients corresponding to the insulin terms.

Lags (a)
The simulated model output for the subjects with lowest (subject # 11) and highest insulin sensitivity(subject # 13) is shown in figure 6.The enhanced ability of insulin in utilizing glucose is apparent in subject # 13 from the high undershoot (hypoglycemic condition) prevailing at around 50 min.For the identified models, the coefficients of the regressors are depicted in figure 7.As mentioned earlier because of high insulin sensitivity, the coefficients of the insulin regressors are relatively quite higher in subject # 13.The performance of the models identified using the PSO based proposed approach for different subjects is compared in terms of MSSE with the most widely used physiological model i.e. the Bergman minimal model along with the NARX model identified by SEMP [27] and another evolutionary optimization technique i.e. genetic algorithm (GA) [16] in Table 1.The superior performance of the NARX modeling approach over minimal model is due to the inclusion of quadratic nonlinear terms, which are not accounted in the minimal model.The inability of the SEMP based approach in identifying an optimal model structure that captures the dynamics of glucoseinsulin interactions is also exhibited from the table.

Conclusions
In the present work, a nonlinear autoregressive (NARX) modeling approach has been adopted to predict the glucose concentration following a clinical test (IVGTT).The generation of future glucose concentration way ahead of time allows administration of proper t h e r a p e u t i c m e a s u r e s , b e f o r e t h e o n s e t o f hypo/hyperglycemic co ndition.Considering the limitations of existing approaches of not identifying an optimal structure, an evolutionary optimization approach based on PSO has been adopted for model identification.relating the model robustness to noisy data.The intrasubject variability in the insulin action on the utilization of glucose is reflected in terms of the model coefficients.The results also revealed the need of a subject dependent variable structure for modeling.To further validate that the proposed approach provides a better representation of the glucose-insulin interactions, future work in this direction would involve model identification and validation on data for a wider population of subjects with greater variability under different physiological conditions.
each iteration, the coefficient of the regressors changes with contribution from the local and global best particle.The glucose profile as obtained from the identified model is compared with the clinical test data in Fig.2for two prototypical subjects: one (subject # 5) having a uniform near exponential decay of glucose after the first peak and another (subject # 6) having multiple peaks of glucose.The corresponding parameters of the identified model are depicted in fig.3.

Fig. 4 :
Fig. 4: Deviation in the simulated plasma glucose profile from the clinical test result for the models identified for different subjects.
The identification process aims at minimizing the deviation between the simulated model response and the actual test output.Results obtained with the models identified form the dataset of sixteen subjects demonstrate the effectiveness of the proposed approach in replicating the dynamics of glucose metabolism.The PSO based NARX model was found to fit the actual clinical test data more accurately as compared to the widely accepted Bergman minimal model and SEMP based NARX model.The improved performance is attributed to the inclusion of extra nonlinear and autoregressive terms.Besides lesser deviation, the developed models are found to satisfy the statistical tests Coefficient Value Coefficient Value

Table 1 :
Comparision of MSSE on validation data