Short-term prediction for dynamic blood glucose trends based on ARIMA-LSSVM-GRU model

Continuous glucose monitoring (CGM) is an effective strategy for dynamically monitoring a patient's blood glucose (BG) level. Existing prediction models, such as ARIMA, LSSVM, GRU, and LSTM, are commonly used to forecast BG changes during health monitoring and evaluation. However, these single models cannot obtain optimal prediction results for BG series, which possess both linear and nonlinear characteristics. Given this limitation, a hybrid ARIMA-LSSVM-GRU model based on ARIMA, LSSVM, and GRU technologies is proposed to predict forthcoming BG trends. The model first uses an ARIMA model to capture the linear features of the BG series and produce an initial prediction. Then, a least-squares support vector machine (LSSVM) is used to predict the error series generated by the ARIMA model. Finally, a GRU model combines the prediction results of ARIMA and LSSVM to obtain more accurate upcoming BG trends. To test the accuracy of this hybrid model, the BG series is prepared, trained, and tested on a real clinical BG monitoring data set. The ARIMA-LSSVM-GRU model is compared with other prediction models (ARIMA, LSSVM, GRU, LSTM) using the evaluation criteria RMSE, MAPE, and TIC. The experimental results show that the proposed model effectively improves the accuracy of short-term BG prediction.


Introduction
Continuous glucose monitoring (CGM) provides an effective strategy for continuously monitoring the blood glucose levels of diabetics, which supplies prediction models with more information for forecasting future BG trends and helps effectively prevent abnormal events [1][2][3]. Predicting the dynamic BG series facilitates timely preventive measures against the health risks induced by abnormal trends. However, because BG series usually have both linear and nonlinear characteristics, single prediction models cannot effectively and precisely predict future BG values [4]. Constructing a BG series forecasting model that accurately predicts future time-varying trends has therefore become one of the most challenging applications. Jun Yang et al. established an ARIMA model with adaptive orders to continuously predict dynamic BG concentrations, and it outperforms the adaptive univariate model and the standard ARIMA model [5]. Jaouher Ben et al. adopted several artificial neural network approaches for the prediction of Type 1 diabetes [6]. Some adaptive artificial neural networks have been proposed and shown to be accurate, adaptive, and very encouraging for clinical implementation. Deep-learning-based BG prediction, such as the work of Kezhi Li et al., has become a research hotspot in recent years, with applications to early hypoglycaemic alarms based on CGM data trends [8]. Ta et al. proposed accurate prediction of continuous blood glucose based on SVR and differential evolution methods to achieve high prediction accuracy [9][10]. However, the above models all assume that the time series has either linear or nonlinear characteristics, whereas in practical applications both may be present, which degrades the performance of BG prediction models. To solve this problem, this research proposes a new strategy for combining linear and nonlinear model prediction results.
Firstly, the ARIMA model is used to extract the linear features of the time series. Then, the LSSVM model is used to predict the error series between the predicted and actual values of the ARIMA model. Finally, the prediction results of the former two models serve as the input of the GRU model to predict the time series, and dropout technology is applied to optimize the GRU model for further accuracy in BG prediction.

ARIMA model
The autoregressive integrated moving average (ARIMA) model is a linear model proposed by Box and Jenkins for time series analysis and prediction. Three parameters need to be set: the autoregressive order (p), the difference order (d), and the moving average order (q). The general form of ARIMA(p, d, q) is given by equations (1) to (3):

$$\phi(B)(1-B)^d x_t = \theta(B)\varepsilon_t \quad (1)$$
$$\phi(B) = 1 - \phi_1 B - \phi_2 B^2 - \cdots - \phi_p B^p \quad (2)$$
$$\theta(B) = 1 - \theta_1 B - \theta_2 B^2 - \cdots - \theta_q B^q \quad (3)$$

Where B is the backward shift operator, (2) is the polynomial of autoregressive coefficients, and (3) is the polynomial of moving average coefficients. This model is utilized to extract the linear features of the BG series and establish the foundation for BG series prediction.
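As a rough illustration of the linear stage, the sketch below differences the series d times and fits the AR part by ordinary least squares. It omits the MA(q) term, which a full ARIMA implementation (e.g. a statistics package) would also estimate; the function and parameter names are ours, not the paper's.

```python
import numpy as np

def arima_one_step(series, p=2, d=1):
    """Simplified ARIMA(p, d, 0) one-step forecast: difference the
    series d times, fit AR(p) by least squares, invert the differencing.
    Sketch only -- the MA(q) part of equation (1) is not estimated."""
    x = np.asarray(series, dtype=float)
    diff = x.copy()
    for _ in range(d):
        diff = np.diff(diff)                       # apply (1 - B)^d
    n = len(diff)
    # regression rows: diff[t] ~ c + phi_1*diff[t-1] + ... + phi_p*diff[t-p]
    A = np.array([np.r_[1.0, diff[t - p:t][::-1]] for t in range(p, n)])
    coef, *_ = np.linalg.lstsq(A, diff[p:], rcond=None)
    next_diff = coef @ np.r_[1.0, diff[-p:][::-1]]
    # undo one level of differencing to return to the original scale
    return x[-1] + next_diff if d == 1 else next_diff
```

On a linear trend the differenced series is constant, so the forecast simply extends the trend by one step.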

LSSVM regression model
The LSSVM has played an important role in many fields including data regression, pattern recognition, time series prediction, etc. Figure 1 illustrates the structure diagram of LSSVM.
Given a training set {(x_i, y_i), i = 1, 2, ..., N}, where x_i ∈ R^d is the input vector and y_i ∈ R is the corresponding output value, the input data can be mapped from the original feature space R^d to a higher-dimensional space by a nonlinear mapping φ(·). K(x_i, x_j) = φ(x_i)^T φ(x_j) is the kernel function, satisfying Mercer's condition. The kernel function plays an important role in constructing a high-performance LSSVM, since it reduces the computational complexity of working in the high-dimensional space. The resulting LSSVM regression model takes the form of equation (4):

$$y(x) = \sum_{i=1}^{N} \alpha_i K(x, x_i) + b \quad (4)$$

where the support values α_i and the bias b are obtained by solving the LSSVM linear system in the dual space.

Gated Recurrent Unit (GRU) model
GRU is a kind of Recurrent Neural Network (RNN). Compared with the Long Short-Term Memory (LSTM) model, it has the same data mining ability with an improved gate structure: it merges the input gate and forget gate into the gate z_t, and replaces the output gate of the LSTM with the gate r_t. Here, z_t determines how new input information is integrated with historical information, and r_t determines the proportion of the previous state information that enters the candidate state. Because the number of gates is reduced, the GRU has fewer training parameters, so training speed is greatly improved. The mathematical expression of the GRU is given by equations (5) to (8):

$$z_t = \sigma_{sig}(W_z x_t + U_z s_{t-1} + b_z) \quad (5)$$
$$r_t = \sigma_{sig}(W_r x_t + U_r s_{t-1} + b_r) \quad (6)$$
$$\tilde{s}_t = \sigma_{tanh}(W_s x_t + U_s (r_t \odot s_{t-1}) + b_s) \quad (7)$$
$$s_t = (1 - z_t) \odot s_{t-1} + z_t \odot \tilde{s}_t \quad (8)$$

Where ⊙ stands for the element-wise product. W_z and W_r are the weight matrices of the gates z_t and r_t, and W_s is the weight matrix of the output state; U_z, U_r, and U_s are the corresponding recurrent weight matrices. x_t is the input data at time t; $\tilde{s}_t$ and s_t are the candidate state and output state at time t, respectively; b_s, b_z, and b_r are bias terms. σ_sig and σ_tanh are the sigmoid and tanh activation functions used to activate the gates, given by formulas (9) and (10):

$$\sigma_{sig}(x) = \frac{1}{1 + e^{-x}} \quad (9)$$
$$\sigma_{tanh}(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}} \quad (10)$$
Since the BG series are already prepared, they are used to train the GRU model. During training, the parameter matrices W_r, W_z, W_s, U_r, U_z, and U_s are updated in each iteration, following the backpropagation-through-time procedure of a typical RNN. After the learning process, the trained GRU is competent for time series forecasting tasks.
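The gate equations can be checked with a single-step NumPy implementation. This is a sketch of one GRU cell update under the notation above, not the paper's TensorFlow model; the weight shapes and dictionary layout are our own convention.

```python
import numpy as np

def sigmoid(x):
    """Sigmoid activation, formula (9)."""
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x_t, s_prev, W, U, b):
    """One GRU update; W, U, b are dicts keyed by 'z', 'r', 's'."""
    z = sigmoid(W['z'] @ x_t + U['z'] @ s_prev + b['z'])               # update gate
    r = sigmoid(W['r'] @ x_t + U['r'] @ s_prev + b['r'])               # reset gate
    s_tilde = np.tanh(W['s'] @ x_t + U['s'] @ (r * s_prev) + b['s'])   # candidate state
    return (1.0 - z) * s_prev + z * s_tilde                            # new state
```

Because the new state is a convex combination of the previous state and a tanh-bounded candidate, a state initialized at zero always stays inside (-1, 1).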

Dropout technology
Increasing the number of layers, neurons, and parameters in a deep neural network can improve its performance, but too many parameters or too little training data easily leads to overfitting. A common method to prevent overfitting is to add a random-inactivation regularization constraint during training. In the training process, dropout randomly discards hidden neurons in the network with a certain probability, so that these neurons do not participate in forward and backward propagation in that iteration, while their weights are still retained, as shown in figure 2. The figure shows a standard neural network processed by dropout, in which the solid circles represent the neurons randomly discarded. The optimal dropout probability in this research is set to 0.3.
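The mechanism can be sketched as "inverted" dropout, the variant used by modern frameworks: dropped units are zeroed during training and the survivors rescaled by 1/(1-p) so the expected activation is unchanged, while at inference the layer is the identity. The probability 0.3 matches the paper's setting; the function itself is illustrative.

```python
import numpy as np

def dropout(activations, p=0.3, training=True, rng=None):
    """Inverted dropout: zero each unit with probability p during
    training and rescale survivors by 1/(1-p); identity at inference."""
    if not training or p == 0.0:
        return activations
    if rng is None:
        rng = np.random.default_rng()
    mask = rng.random(activations.shape) >= p   # True = neuron kept
    return activations * mask / (1.0 - p)
```

The rescaling is what lets the same network be used unchanged at test time, since the expected pre-activation seen by the next layer is identical in both modes.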

ARIMA-LSSVM-GRU model
This research presents a hybrid ARIMA-LSSVM-GRU prediction model that is based on ARIMA, LSSVM, and GRU technologies. The hybrid model consists of three parts. Figure 3 shows the prediction process of the combined prediction algorithm and the specific components of this model are described as follows.
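The three-stage data flow can be sketched end to end with simplified stand-ins: a least-squares AR fit in place of ARIMA, the same machinery applied to the residual series in place of LSSVM, and a plain additive combination in place of the trained GRU combiner. This illustrates only the pipeline structure, not the paper's actual models; all names here are ours.

```python
import numpy as np

def ar_fit_predict(series, p):
    """Fit AR(p) by least squares; return in-sample one-step fits
    (aligned to series[p:]) and the next-step forecast."""
    x = np.asarray(series, dtype=float)
    A = np.array([np.r_[1.0, x[t - p:t][::-1]] for t in range(p, len(x))])
    coef, *_ = np.linalg.lstsq(A, x[p:], rcond=None)
    return A @ coef, coef @ np.r_[1.0, x[-p:][::-1]]

def hybrid_forecast(series, p=3):
    """Three-stage sketch of the hybrid pipeline."""
    # Stage 1: linear model on the raw series (ARIMA stand-in).
    fitted, linear_next = ar_fit_predict(series, p)
    # Stage 2: model the error (residual) series (LSSVM stand-in).
    resid = np.asarray(series, dtype=float)[p:] - fitted
    resid_next = ar_fit_predict(resid, p)[1] if len(resid) > 2 * p else 0.0
    # Stage 3: combine both predictions; the paper trains a GRU on them,
    # here we simply add the predicted error back to the linear forecast.
    return linear_next + resid_next
```

On a clean sinusoid the linear stage is already near-exact and the residual correction is negligible, which is exactly the regime where the error-series stage matters least; on real BG data the residual model carries the nonlinear part.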

Results and discussion
The configuration of our experimental environment is as follows: Ubuntu 16.04 is the operating system for the BG forecasting experiments, Python 3.6 is the development language, and the framework is TensorFlow 2.2. The data in this study are acquired with a CGM system (RGMS-III, MeiQi) in a municipal hospital (Jinan, Shandong Province, China). The CGM device measures blood glucose about every 3 minutes, yielding three days of continuous BG data for training and testing of the proposed prediction model.

Data preprocessing
In neural network training, dimensional differences between data strongly affect whether training converges and how accurate the prediction is. Therefore, the input time series must be pre-processed before prediction modeling. In this paper, formula (11) is used to normalize the sample data to the interval [0, 1] for modeling and prediction:

$$x_{normal}(t) = \frac{x(t) - x_{min}}{x_{max} - x_{min}} \quad (11)$$

Where t is the time index, x(t) is the original BG series, x_normal(t) is the normalized BG series, and x_min and x_max are the minimum and maximum values of the training data set, respectively. The output BG series obtained after training and modeling is denormalized to recover the predicted values.
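A minimal sketch of formula (11) and its inverse, with the extrema computed from the training set only so that the test set cannot leak into the scaling (function names are ours):

```python
import numpy as np

def minmax_fit(train):
    """Compute x_min and x_max from the TRAINING data only."""
    return float(np.min(train)), float(np.max(train))

def normalize(x, x_min, x_max):
    """Formula (11): map the series into [0, 1]."""
    return (np.asarray(x, dtype=float) - x_min) / (x_max - x_min)

def denormalize(x_norm, x_min, x_max):
    """Inverse of formula (11): recover the original scale."""
    return np.asarray(x_norm, dtype=float) * (x_max - x_min) + x_min
```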

Evaluation criteria
To evaluate the prediction performance of the ARIMA-LSSVM-GRU model, this research uses the root mean square error (RMSE), mean absolute percentage error (MAPE), and Theil inequality coefficient (TIC) as evaluation indicators to measure the prediction accuracy. The expressions of the evaluation criteria are as follows:

$$RMSE = \sqrt{\frac{1}{n}\sum_{i=1}^{n}(x_i - \hat{x}_i)^2} \quad (12)$$
$$MAPE = \frac{100\%}{n}\sum_{i=1}^{n}\left|\frac{x_i - \hat{x}_i}{x_i}\right| \quad (13)$$
$$TIC = \frac{\sqrt{\frac{1}{n}\sum_{i=1}^{n}(x_i - \hat{x}_i)^2}}{\sqrt{\frac{1}{n}\sum_{i=1}^{n}x_i^2} + \sqrt{\frac{1}{n}\sum_{i=1}^{n}\hat{x}_i^2}} \quad (14)$$

Where x_i is the actual value of the i-th sample, $\hat{x}_i$ is the forecasted value, and n is the number of samples in the BG data series. The Theil inequality coefficient lies between 0 and 1; the closer the TIC value is to 0, the better the fit between the predicted and actual values.
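The three criteria translate directly into code. The sketch below follows the standard definitions of RMSE, MAPE, and TIC described above; the function names are ours.

```python
import numpy as np

def rmse(actual, pred):
    """Root mean square error."""
    a, p = np.asarray(actual, dtype=float), np.asarray(pred, dtype=float)
    return float(np.sqrt(np.mean((a - p) ** 2)))

def mape(actual, pred):
    """Mean absolute percentage error, in percent (actual must be nonzero)."""
    a, p = np.asarray(actual, dtype=float), np.asarray(pred, dtype=float)
    return float(np.mean(np.abs((a - p) / a)) * 100.0)

def tic(actual, pred):
    """Theil inequality coefficient: RMSE scaled into [0, 1]."""
    a, p = np.asarray(actual, dtype=float), np.asarray(pred, dtype=float)
    num = np.sqrt(np.mean((a - p) ** 2))
    den = np.sqrt(np.mean(a ** 2)) + np.sqrt(np.mean(p ** 2))
    return float(num / den)
```

A perfect forecast gives 0 on all three criteria; TIC's denominator makes it scale-free, which is why it lands in [0, 1] regardless of the BG units used.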

Experiment and result analysis
This paper constructs and compares a batch of short-term BG prediction models. Figure 4 illustrates the prediction results on one patient's blood glucose data set, with the prediction step length set to 30 min and 60 min, respectively. From the comparison of the forecasts at the different step lengths, it is easily seen that the ARIMA-LSSVM-GRU model is better than the other traditional models (GRU, LSTM, ARIMA, LSSVM). The dynamic BG prediction results of the different models at the different prediction steps were evaluated, and they show that the ARIMA-LSSVM-GRU model performs better than the other models. The prediction evaluation is detailed below.

Conclusions
This paper proposes a hybrid ARIMA-LSSVM-GRU model for BG series prediction. Firstly, the ARIMA and LSSVM models are jointly used to extract the linear features of the BG series and the nonlinear features of the error series, respectively. Then, the GRU model with dropout optimization combines the linear and nonlinear prediction results for accurate prediction of future BG trends. This study performs an empirical analysis on the BG series of different experimental samples. The experimental results show that the proposed model improves RMSE and MAPE by about 24.0% and 21.2%, respectively, compared with the other models (GRU, LSTM, ARIMA, LSSVM) for 60-min prediction. TIC evaluation likewise shows an improvement of 21.1% over the other methods. The prediction accuracy at 30 min is also much higher than at 60 min. In conclusion, the ARIMA-LSSVM-GRU prediction model can effectively improve prediction accuracy and is therefore a better short-term BG prediction method for practical application. In the future, the proposed hybrid model should be further optimized to make it more suitable for time-varying BG prediction.