Implementation of autoregressive integrated moving average model to predict total electron content from GPS satellite receiver in Bandung

Total electron content (TEC), the free-electron content of the ionosphere, can disrupt telecommunication and navigation systems. This study forecasts the daily average TEC value for March 2019. The forecasting method is the autoregressive integrated moving average (ARIMA) model with the Box-Jenkins approach. The results show that the best model is ARIMA (1,1,1), with an MSE of 2.453. The predicted daily average TEC value increases every day, and the maximum occurs in the 86th period (86th day), at 18.4991 TECU (1 TECU = 10^16 electrons/m^2).


Introduction
The ionosphere is the part of the upper atmosphere where free electrons are abundant enough to affect the propagation of radio waves [1]. GPS satellites, which serve navigation, can also be used by ionosphere researchers as tools for studying the ionosphere. GPS satellite signals are refracted and change in intensity as they pass through the ionosphere before being received by GPS receivers on Earth. These changes carry information on the state of the ionosphere, namely the number of electrons it contains. This electron content is called the total electron content (TEC) [2].
Radio waves propagating through the ionosphere are delayed by interaction with its free electrons. Because of this delay, the satellite-to-receiver distance derived from the measured propagation time is in error, so GPS positioning by the distance resection method is in error as well. For high-precision positioning, the ionospheric TEC needs to be estimated so that it can be used to correct GPS distance measurement errors [3].
The TEC value can increase for several reasons, one of which is the seasonal position of the sun: in the equinox months (March-April, September-October) the sun is close to the equator over Indonesia. A higher TEC value affects the propagation of radio waves, so the telecommunication and navigation systems used in everyday life can be disrupted [2]. This study uses daily average TEC data from January to February 2019. Because the TEC values form a daily time series with an up-and-down trend pattern, the data are analyzed using time series analysis [4]. Time series data can be used as the basis for predicting future events. The autoregressive integrated moving average (ARIMA) model is very popular and is often used in time series modeling. This study therefore uses the ARIMA method with the Box-Jenkins approach to predict the daily average TEC value in March 2019.

Methodology
Modeling and forecasting using the ARIMA method consists of several stages, including model identification, parameter estimation, diagnostic testing, best model selection, and forecasting [5].

Model Identification
Model identification is carried out in the following steps:
- Plot the data to examine whether they have a seasonal pattern.
- Check the stationarity of the data (stationary in the mean and stationary in the variance). If the data are nonstationary in the mean, apply differencing; if they are nonstationary in the variance, apply a transformation.
- Plot the ACF and PACF of the data to see whether the data have become stationary after transformation and differencing. The plots can also be used to decide the possible orders of the ARIMA model.
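The differencing and ACF steps can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the function names `difference` and `sample_acf` and the input values are hypothetical, and the study itself produced these plots in Minitab rather than with code like this.

```python
def difference(series, lag=1):
    """Return the differences z[t] - z[t - lag]."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


def sample_acf(series, max_lag):
    """Return the sample autocorrelations r_0 .. r_max_lag."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    c0 = sum(d * d for d in dev)  # assumes the series is not constant
    return [
        sum(dev[t] * dev[t + k] for t in range(n - k)) / c0
        for k in range(max_lag + 1)
    ]


# Hypothetical values with an upward trend, not the TEC data of the study.
tec = [10.0, 10.5, 11.2, 11.6, 12.4, 12.9, 13.8, 14.1, 15.0, 15.4]
first_diff = difference(tec)           # removes the linear trend
acf = sample_acf(first_diff, max_lag=3)
```

Autocorrelations of the differenced series that die out quickly are consistent with stationarity in the mean; a slow decay would suggest that further differencing is needed.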

Parameter Estimation
The next step is to estimate the parameters of the possible models and test their significance. The test checks whether each model is suitable for use. Both the estimation and the significance testing can be performed using the Minitab 16 software.

Best Model Selection
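Several residual- or error-based measures can be used to select the best model, one of which is the mean square error (MSE) [6]. MSE is formulated as

$$\mathrm{MSE} = \frac{1}{n}\sum_{t=1}^{n}\left(Z_t - \hat{Z}_t\right)^2$$

where $n$ is the number of observations, $Z_t$ is the observational data at time $t$, and $\hat{Z}_t$ is the forecast value at time $t$.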
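As a small worked illustration, the mean square error averages the squared differences between the observations and the forecasts. The function name `mean_square_error` and the numbers below are hypothetical and are not taken from the study.

```python
def mean_square_error(observed, forecast):
    """Average of squared differences between observations and forecasts."""
    if not observed or len(observed) != len(forecast):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((y - f) ** 2 for y, f in zip(observed, forecast)) / len(observed)


# Hypothetical values: the errors are -1, 1, -1, -2.
observed = [10.0, 12.0, 11.0, 13.0]
forecast = [11.0, 11.0, 12.0, 15.0]
mse = mean_square_error(observed, forecast)  # (1 + 1 + 1 + 4) / 4 = 1.75
```

When several candidate models pass the significance and diagnostic checks, the one with the smallest MSE is preferred.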

Model Identification.
In this stage, two tests were carried out on the past TEC data: the test for stationarity in the variance and the test for stationarity in the mean. Based on these tests, an ARIMA model that suits the data obtained by the researchers is sought.

Variance Stationary Test.
The Box-Cox test is carried out to find out whether the fluctuations of the data tend to be constant. Data whose fluctuations are constant (stationary) have a rounded value (λ) = 1. The Box-Cox test results for the TEC data are shown in Figure 1. The data are not stationary in the variance because the rounded value (λ) ≠ 1, so they need to be transformed according to the lambda value (λ). After the Box-Cox transformation with the square-root (√) formula, the rounded value is (λ) = 1, as shown in Figure 2, so it can be concluded that the data are stationary in the variance.
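A minimal sketch of the transformation step follows. It assumes the simple power form of the Box-Cox transformation (the value raised to the power λ, or the natural logarithm when λ = 0), in which λ = 0.5 corresponds to the square root; the textbook form (y^λ − 1)/λ differs only by a shift and scale. The function name `power_transform` and the input values are hypothetical.

```python
import math


def power_transform(values, lam):
    """Simple power form: y**lam for lam != 0, ln(y) for lam == 0."""
    if any(v <= 0 for v in values):
        raise ValueError("the transformation requires positive values")
    if lam == 0:
        return [math.log(v) for v in values]
    return [v ** lam for v in values]


# Hypothetical values, not the TEC data of the study.
tec = [9.0, 16.0, 25.0]
transformed = power_transform(tec, 0.5)  # square-root transformation
```

Forecasts made on the square-root scale have to be squared to return to the original units.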

Means Stationary Test.
Identifying stationarity in the mean involves examining the actual data plot and the trend plot, followed by the autocorrelation function (ACF) and partial autocorrelation function (PACF) plots of the TEC data. The actual and trend data plots are shown in Figure 3. Figure 3 shows that the data are not stationary because they still exhibit an upward trend. Therefore, differencing is needed to make the data stationary. Differencing calculates the change in the data, that is, the nth value minus the previous value [7]. The trend plot of the first difference is shown in Figure 4; the first-differenced data are stationary. The next step is to redraw the ACF and PACF plots using the differenced data. Based on these ACF and PACF plots, the possible tentative models are ARIMA (2,1,1), (2,1,0), and (0,1,1). According to [8], other possible tentative models are ARIMA (1,1,1) and (1,1,0). The researchers therefore decided to use the ARIMA (2,1,1), (2,1,0), (1,1,1), (1,1,0), and (0,1,1) models.

Determining Models Parameter Estimate
The next step is to estimate the parameters of each model and test their significance. A summary of the test results is presented in Table 1. The decision is that H0 is rejected for ARIMA(0,1,1) and ARIMA(1,1,0) because the p-value < α = 0.05.

Diagnostic Test
In the model diagnosis, the residual normality test and the independence test are carried out. Normality of the residuals is tested with the Kolmogorov-Smirnov test, and independence (white noise) with the Ljung-Box test. A summary of the test results is presented in the following table.
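For reference, the Ljung-Box statistic for lags 1 to h is Q = n(n + 2) Σ r_k² / (n − k), where r_k is the lag-k sample autocorrelation of the residuals. The sketch below computes Q only; the function name `ljung_box_q` and the residual values are hypothetical, and the p-values reported in the study come from Minitab.

```python
def ljung_box_q(residuals, max_lag):
    """Ljung-Box Q statistic over lags 1..max_lag."""
    n = len(residuals)
    mean = sum(residuals) / n
    dev = [x - mean for x in residuals]
    c0 = sum(d * d for d in dev)  # assumes the residuals are not constant
    total = 0.0
    for k in range(1, max_lag + 1):
        r_k = sum(dev[t] * dev[t + k] for t in range(n - k)) / c0
        total += r_k ** 2 / (n - k)
    return n * (n + 2) * total


# Hypothetical, strongly alternating residuals (clearly not white noise).
residuals = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
q_stat = ljung_box_q(residuals, max_lag=1)
```

Q is compared with a chi-square quantile. For residuals of a fitted ARMA(p, q) model the degrees of freedom are usually taken as h − p − q, and a large Q indicates that the residuals are not white noise.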

Best Model Selection
Based on the results of the parameter assessment and diagnostics, the ARIMA (2,1,1) and ARIMA (1,1,1) models are suitable for forecasting the TEC value because their parameters are significant and the other assumptions are met. To find the best of the two, ARIMA (2,1,1) and ARIMA (1,1,1) were compared using the mean square error (MSE) value.

Forecasting
The best ARIMA model is ARIMA (1,1,1), so it is used to forecast the daily average value of the total electron content in March 2019. The forecast results, obtained using Minitab 16 software, are shown in Table 4.
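How an ARIMA (1,1,1) model turns past values into forecasts can be sketched as a short recursion. The sketch assumes a model without a constant term, written on the first differences w[t] = z[t] − z[t−1] as w[t] = φ·w[t−1] + a[t] − θ·a[t−1], with residuals started at zero. The function name `arima111_forecast`, the coefficients, and the data are hypothetical; they are not the estimates or the TEC values of the study, and Minitab's estimation procedure is not reproduced here.

```python
def arima111_forecast(series, phi, theta, steps):
    """Forecast `steps` values ahead with ARIMA(1,1,1), no constant term.

    Model on first differences w[t] = z[t] - z[t-1]:
        w[t] = phi * w[t-1] + a[t] - theta * a[t-1]
    """
    if len(series) < 2:
        raise ValueError("at least two observations are required")
    w = [series[t] - series[t - 1] for t in range(1, len(series))]

    # Recover residuals recursively, starting from zero initial values.
    w_prev, a_prev = 0.0, 0.0
    for w_t in w:
        a_t = w_t - phi * w_prev + theta * a_prev
        w_prev, a_prev = w_t, a_t

    # One-step forecast of the difference uses the last residual;
    # later steps only carry the autoregressive part forward.
    forecasts = []
    level = series[-1]
    w_hat = phi * w_prev - theta * a_prev
    for _ in range(steps):
        level += w_hat
        forecasts.append(level)
        w_hat = phi * w_hat
    return forecasts


# Hypothetical data and coefficients, chosen only to make the recursion easy to follow.
forecast = arima111_forecast([1.0, 2.0, 4.0], phi=0.5, theta=0.0, steps=3)
```

If the model is fitted to transformed data, as with the square-root transformation used for the variance, the forecasts are on the transformed scale and must be transformed back before they are reported in TECU.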