Modeling Autoregressive Integrated Moving Average (ARIMA) and Forecasting of PT Unilever Indonesia Tbk Share Prices During the COVID-19 Pandemic Period

The ARIMA method is one of the methods that can be used to predict the movement of a company's share price. This study aims to obtain a time series model using the ARIMA method and to forecast the share price of PT Unilever Indonesia Tbk (UNVR), using data from January 2020 to June 2020. Based on the MSE value, the model that best fits the data is ARIMA(1,1,1), which shows a close match between the actual data and the predicted values. This model is then used to forecast the next 14 days. UNVR share prices from January 2020 to June 2020 are below 8000, which appears to be related to current conditions, namely the Covid-19 pandemic. The forecasts for the next 14 days (two weeks), from July 1, 2020 to July 14, 2020, show a decreasing trend; the share price of PT Unilever Indonesia Tbk has been declining since January 2020. This appears to be a consequence of the Covid-19 pandemic from January 2020 to the present.


Introduction
Time series analysis has attracted the attention of many researchers in statistics, economics, finance, demography and other fields of science. Its main application is forecasting. Time series data are data recorded over a certain period, usually at daily, weekly, monthly, six-monthly or annual intervals. The data may repeat a past pattern or show no pattern at all. Time series data with a repeating pattern are called seasonal time series; an example is data on a company's stock movements. For non-seasonal time series, the Box-Jenkins method builds a model by examining several criteria, which leads to the models known as ARMA and ARIMA. These criteria include the autocorrelation function (ACF) and the partial autocorrelation function (PACF). For seasonal time series, the Box-Jenkins method builds the model using the same criteria. Forecasting is a conjecture or estimate about a future situation obtained by using certain methods. Forecasting is done using the best available information so that the desired goals can be achieved.
A commonly used forecasting method is the Autoregressive Integrated Moving Average (ARIMA) model. ARIMA uses only past values of the dependent variable and completely ignores independent variables. The method has several advantages: it does not require the original data to be stationary, because non-stationary data can be differenced, and it can be used on data with seasonal patterns [1]. ARIMA is therefore suitable for forecasting a variable quickly, simply and accurately, because it requires only the data of the variable to be predicted. In this study, the ARIMA model is used for time series analysis and forecasting of PT Unilever Indonesia Tbk (UNVR) stock price data from January 2, 2020 to June 30, 2020.

Time Series Data
A time series is a collection of observations $y_t$, each recorded at a time $t$. A time series model for the observed data is a specification of the joint distribution (or possibly only the means and covariances) of a sequence of random variables $\{Y_t\}$ of which the observed data are taken to be a realization. The most important part of time series analysis is the selection of candidate models that fit the data. Time series data are data collected over time for a single individual [2].

Stationarity
Stationary means that there is no drastic change in the data. The data fluctuate around a constant mean, and neither the mean nor the variance of the fluctuations depends on time. The assumption of stationarity is fundamental in time series analysis and must be checked before the data are analyzed. Stationarity can be checked from data plots or with the Augmented Dickey-Fuller (ADF) test. The ADF test proceeds as follows. Let $y_1, y_2, \dots, y_n$ be a time series, and assume that $\{y_t\}$ follows an AR(p) model with mean $\mu$:

$$y_t - \mu = \phi_1 (y_{t-1} - \mu) + \dots + \phi_p (y_{t-p} - \mu) + \varepsilon_t \qquad (1)$$

where $\varepsilon_t$ is white noise with mean 0 and variance $\sigma^2$, that is, $\varepsilon_t \sim N(0, \sigma^2)$. Non-stationarity is tested on the basis of equation (1).
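As an illustration of the idea behind the unit root test, the following minimal sketch computes the simple Dickey-Fuller statistic (without the lagged difference terms that the augmented version adds) for a simulated series. The function names, the simulated data, and the use of the asymptotic 5% critical value of -2.86 for a regression with a constant are assumptions made for this example; they are not the procedure or software used in the study.

```python
import math
import random


def dickey_fuller_stat(y):
    """t-statistic of gamma in the regression dy_t = alpha + gamma * y_{t-1} + e_t."""
    x = y[:-1]                                   # lagged levels y_{t-1}
    dy = [b - a for a, b in zip(y[:-1], y[1:])]  # first differences
    n = len(x)
    mean_x = sum(x) / n
    mean_dy = sum(dy) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (di - mean_dy) for xi, di in zip(x, dy))
    gamma = sxy / sxx
    alpha = mean_dy - gamma * mean_x
    resid = [di - alpha - gamma * xi for xi, di in zip(x, dy)]
    sigma2 = sum(r * r for r in resid) / (n - 2)  # residual variance
    return gamma / math.sqrt(sigma2 / sxx)


def simulate_ar1(phi, mu, n, seed):
    """Simulate y_t - mu = phi * (y_{t-1} - mu) + eps_t with eps_t ~ N(0, 1)."""
    rng = random.Random(seed)
    y = [mu]
    for _ in range(n - 1):
        y.append(mu + phi * (y[-1] - mu) + rng.gauss(0.0, 1.0))
    return y


# Assumed asymptotic 5% Dickey-Fuller critical value (constant, no trend).
CRITICAL_5PCT = -2.86

stationary = simulate_ar1(phi=0.5, mu=10.0, n=500, seed=1)   # hypothetical data
random_walk = simulate_ar1(phi=1.0, mu=10.0, n=500, seed=1)  # hypothetical data

stat_stationary = dickey_fuller_stat(stationary)
stat_random_walk = dickey_fuller_stat(random_walk)
print(stat_stationary, stat_stationary < CRITICAL_5PCT)
print(stat_random_walk, stat_random_walk < CRITICAL_5PCT)
```

A statistic below the critical value leads to rejecting the null hypothesis of a unit root. For a random walk the statistic usually stays above the critical value, so the null hypothesis is usually not rejected.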

Autoregressive (AR) Model
An autoregressive (AR) model describes the present value as being influenced by past values. An AR model of order p is denoted AR(p). The general form of the AR(p) model is:

$$y_t = c + \phi_1 y_{t-1} + \dots + \phi_p y_{t-p} + \varepsilon_t$$

where $c = \mu (1 - \phi_1 - \dots - \phi_p)$ is a constant and $\varepsilon_t$ is white noise. The orders most often used in time series analysis are p = 1 and p = 2 [3].
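The way an AR(p) model turns past values into a prediction can be sketched in a few lines. The function name `ar_predict`, the constant `c`, and the coefficient values below are hypothetical and chosen only for illustration.

```python
def ar_predict(history, phi, c=0.0):
    """One-step AR(p) prediction: c + phi_1*y_{t-1} + ... + phi_p*y_{t-p}."""
    p = len(phi)
    if len(history) < p:
        raise ValueError("need at least p past values")
    recent = history[::-1][:p]  # y_{t-1}, y_{t-2}, ..., y_{t-p}
    return c + sum(coef * value for coef, value in zip(phi, recent))


# Hypothetical history and coefficients.
pred_ar1 = ar_predict([1.0, 2.0, 3.0], phi=[0.5], c=1.0)        # 1 + 0.5*3
pred_ar2 = ar_predict([1.0, 2.0, 3.0], phi=[0.5, 0.25], c=0.0)  # 0.5*3 + 0.25*2
print(pred_ar1, pred_ar2)
```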

Moving Average (MA) Model
The moving average model has the following form:

$$y_t = \mu + \varepsilon_t - \theta_1 \varepsilon_{t-1} - \dots - \theta_q \varepsilon_{t-q} \qquad (4)$$

From equation (4), it can be seen that $y_t$ is a weighted average of the errors over the past q periods. The number of errors q used in this equation indicates the order of the moving average model.
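A small sketch of how an MA(q) value is built from current and past errors is given here. It assumes the Box-Jenkins sign convention, in which the moving average coefficients enter with a minus sign; some software uses plus signs instead. The function name `ma_value` and all numbers are hypothetical.

```python
def ma_value(errors, theta, mu=0.0):
    """MA(q) value at the latest time:
    mu + e_t - theta_1*e_{t-1} - ... - theta_q*e_{t-q} (minus-sign convention assumed)."""
    q = len(theta)
    if len(errors) < q + 1:
        raise ValueError("need the current error and q past errors")
    current = errors[-1]
    past = errors[::-1][1:q + 1]  # e_{t-1}, ..., e_{t-q}
    return mu + current - sum(t * e for t, e in zip(theta, past))


# Hypothetical errors, listed oldest to newest.
errors = [0.5, -1.0, 2.0]
ma1 = ma_value(errors, theta=[0.5], mu=10.0)        # 10 + 2 - 0.5*(-1)
ma2 = ma_value(errors, theta=[0.5, 0.25], mu=10.0)  # 10 + 2 - (0.5*(-1) + 0.25*0.5)
print(ma1, ma2)
```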

ARIMA Model
The Autoregressive Integrated Moving Average (ARIMA) model was developed by George E.P. Box and Gwilym M. Jenkins (1976), so ARIMA is also called the Box-Jenkins time series method. The ARIMA model consists of three elements: the autoregressive (AR), integrated (I) and moving average (MA) components. It completely ignores independent variables in forecasting and uses past and present values of the dependent variable to produce accurate short-term forecasts. ARIMA forecasts are accurate in the short term but less so in the long term, where they usually tend to flatten out (become horizontal or constant) over a fairly long period. The general form of the ARIMA model can be stated in the following equation [4]:

$$\phi_p(B)(1 - B)^d y_t = \theta_q(B)\,\varepsilon_t$$

where $B$ is the backshift operator ($B y_t = y_{t-1}$), $\phi_p(B) = 1 - \phi_1 B - \dots - \phi_p B^p$, and $\theta_q(B) = 1 - \theta_1 B - \dots - \theta_q B^q$.

The AR and MA components of the ARIMA model assume that the input data are stationary. Stationary means that there is no growth or decline in the data: the data must be roughly horizontal along the time axis. In other words, the data fluctuate around a constant mean with a variance that is constant over time. If the input data are not stationary, they must be adjusted to produce stationary data. One commonly used method is differencing, in which the value of the previous period is subtracted from the value of the current period.

One method that can be used to estimate the model parameters is conditional least squares [5], which minimizes the sum of squared errors. The steps in applying the ARIMA method are, in order, model identification, parameter estimation and model evaluation, followed by forecasting.
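The differencing step can be illustrated with a short sketch. The helper names `difference` and `undifference` and the price values are hypothetical; they are not the UNVR data.

```python
def difference(series, d=1):
    """Apply first differencing d times: each value minus the previous value."""
    out = list(series)
    for _ in range(d):
        out = [b - a for a, b in zip(out[:-1], out[1:])]
    return out


def undifference(first_value, diffs):
    """Rebuild the levels from a starting level and the first differences."""
    levels = [first_value]
    for step in diffs:
        levels.append(levels[-1] + step)
    return levels


prices = [100, 103, 101, 106, 110]  # hypothetical values
d1 = difference(prices)             # first differences
d2 = difference(prices, d=2)        # second differences
restored = undifference(prices[0], d1)
print(d1, d2, restored)
```

Each round of differencing shortens the series by one observation, and the original levels can be recovered from the first value and the differences, which is how forecasts of the differenced series are converted back to price levels.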

Test for Stability of the model
The model is stable when every eigenvalue $\lambda$ of the matrix F satisfies $|\lambda| < 1$.

The first thing to do at this stage is to determine whether the time series is stationary or non-stationary, because the AR and MA components of the ARIMA model apply only to stationary time series [6]. The stationarity of a time series can be seen from the ACF plot, where the autocorrelation coefficients decrease rapidly to zero, usually after the 2nd or 3rd lag. If the data are not stationary, differencing can be applied; the order of differencing needed for the series to become stationary determines the value of d in ARIMA(p, d, q).
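The behaviour of the ACF described above can be checked numerically. The following sketch computes the sample autocorrelation function using only the standard library; the function name `sample_acf` and both example series are assumptions made for illustration.

```python
def sample_acf(series, max_lag):
    """Sample autocorrelations r_0 ... r_max_lag of a series."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    c0 = sum(v * v for v in dev)  # lag-0 sum of squared deviations
    return [
        sum(dev[t] * dev[t + k] for t in range(n - k)) / c0
        for k in range(max_lag + 1)
    ]


trend = [float(t) for t in range(100)]  # hypothetical trending series
alternating = [1.0, -1.0] * 4           # hypothetical series with no trend

acf_trend = sample_acf(trend, max_lag=5)
acf_alt = sample_acf(alternating, max_lag=1)
print(acf_trend)
print(acf_alt)
```

For the trending series the autocorrelations stay close to 1 over the first lags, which is the slow decline associated with non-stationary data.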

Model Parameter Estimation
Parameters are estimated by fitting the candidate ARIMA models for each combination of orders that is likely to describe the data.
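As a minimal sketch of estimating one candidate model, conditional least squares for a zero-mean AR(1) has a closed form: the coefficient that minimizes the sum of squared one-step errors. The function names and the toy series are hypothetical; higher-order or MA models require numerical optimization, which is not shown here.

```python
def cls_ar1(series):
    """Conditional least squares estimate of phi in y_t = phi*y_{t-1} + e_t (zero-mean series)."""
    num = sum(cur * prev for prev, cur in zip(series[:-1], series[1:]))
    den = sum(prev * prev for prev in series[:-1])
    return num / den


def sum_squared_errors(series, phi):
    """Sum of squared one-step errors for a given phi."""
    return sum((cur - phi * prev) ** 2 for prev, cur in zip(series[:-1], series[1:]))


w = [2.0, 1.0, 1.0, 0.0, -1.0]  # hypothetical (already differenced) series
phi_hat = cls_ar1(w)
sse_hat = sum_squared_errors(w, phi_hat)
print(phi_hat, sse_hat)
print(sum_squared_errors(w, 0.4), sum_squared_errors(w, 0.6))
```

Any other value of the coefficient gives a larger sum of squared errors, which is what the least squares criterion requires.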

Model Evaluation
The second step is to estimate the autoregressive and moving average parameters based on the orders obtained at the identification stage. A good model is indicated by statistically significant parameter estimates and by the smallest mean square error (MSE) [7].
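Choosing the model with the smallest MSE can be sketched as follows. The model labels and fitted values are hypothetical, and the MSE here divides by the number of observations; some statistical packages divide by the residual degrees of freedom instead, so reported values may differ slightly.

```python
def mean_squared_error(actual, fitted):
    """Average of squared differences between actual and fitted values."""
    if len(actual) != len(fitted):
        raise ValueError("series must have the same length")
    return sum((a - f) ** 2 for a, f in zip(actual, fitted)) / len(actual)


actual = [10.0, 12.0, 11.0, 13.0]  # hypothetical observations
candidates = {                     # hypothetical fitted values per model
    "model_a": [10.0, 11.0, 12.0, 13.0],
    "model_b": [9.0, 12.0, 11.0, 15.0],
}

mse_by_model = {name: mean_squared_error(actual, fit) for name, fit in candidates.items()}
best_model = min(mse_by_model, key=mse_by_model.get)
print(mse_by_model, best_model)
```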

Forecasting
The last stage is forecasting, which is to make predictions or estimates of data based on the selected ARIMA model.
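For an ARIMA(1,1,1) model without a constant, $(1 - \phi B)(1 - B) y_t = (1 - \theta B)\varepsilon_t$ expands to $y_t = y_{t-1} + \phi (y_{t-1} - y_{t-2}) + \varepsilon_t - \theta \varepsilon_{t-1}$. The sketch below produces recursive forecasts from this expansion. The sign convention, the function name, and all parameter values are assumptions for illustration; they are not the estimates obtained in this study.

```python
def forecast_arima111(last_two, last_error, phi, theta, steps):
    """Recursive forecasts for (1 - phi*B)(1 - B) y_t = (1 - theta*B) e_t, no constant."""
    prev, cur = last_two
    forecasts = []
    for h in range(steps):
        # Future errors have expectation zero, so only the first step
        # uses the last observed residual.
        ma_term = -theta * last_error if h == 0 else 0.0
        nxt = cur + phi * (cur - prev) + ma_term
        forecasts.append(nxt)
        prev, cur = cur, nxt
    return forecasts


# Hypothetical last two observations, residual and parameters.
f_ar_only = forecast_arima111((100.0, 102.0), last_error=0.0, phi=0.5, theta=0.0, steps=3)
f_with_ma = forecast_arima111((100.0, 102.0), last_error=2.0, phi=0.5, theta=0.5, steps=3)
print(f_ar_only)
print(f_with_ma)
```

In the first case the change between successive forecasts shrinks by the factor $\phi$ at every step, so the forecasts level off, which matches the remark that long-horizon ARIMA forecasts tend to become flat.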

Results and Discussion
In time series modeling, the first step is to test the data for stationarity. Stationarity can be assessed in three ways: inspecting the time series plot, inspecting the autocorrelation function (ACF) graph, and applying a unit root test. Based on Figure 1, the four series are not stationary in mean or variance because they show a trend pattern. The ACF graphs of the four series also indicate non-stationarity, because the autocorrelations decline only gradually toward zero rather than dropping off quickly. Finally, stationarity is tested with the unit root test. Based on Table 1, the p-values of the Tau (τ) statistic for all test types and for each variable are greater than the significance level used, α = 0.05, so H0 is not rejected, which means the data are not stationary.
Because the data are not stationary, they must be differenced and then re-tested for stationarity using the time series plot, the ACF graph and the unit root test. Based on Figure 2, the differenced data are stationary in mean and variance because they show no trend pattern. Stationarity is then tested with the unit root test. Based on Table 2, the p-values of the Tau (τ) statistic for all test types and for each variable are smaller than the significance level used, α = 0.05, so H0 is rejected, which means the data are stationary. With the data stationary after first differencing (d = 1), the ARIMA model can be identified from the partial autocorrelation (PACF) and autocorrelation (ACF) values of the first-differenced data. Based on Table 3, the orders of AR (p) and MA (q) can be determined from the T values in each table, using a limit of ±1.96. The T values of the PACF indicate the order p. There are 3 values in the initial lags that exceed the ±1.96 limit, and the order p is taken to be 1. The T values of the ACF indicate the order q; there is 1 value in the initial lags that exceeds the ±1.96 limit, so the order q is 1.
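The ±1.96 limit corresponds to a two-sided test at the 5% level under a normal approximation. The sketch below flags the lags whose T value exceeds the limit and shows the link between 1.96 and α = 0.05. The function names and the T values are hypothetical; they are not the values in Table 3.

```python
import math


def significant_lags(t_values, limit=1.96):
    """Return the 1-based lags whose |T| exceeds the limit."""
    return [lag for lag, t in enumerate(t_values, start=1) if abs(t) > limit]


def two_sided_p(t):
    """Two-sided p-value of a T value under a standard normal approximation."""
    return math.erfc(abs(t) / math.sqrt(2.0))


t_values = [3.1, 0.8, -2.2, 1.0]  # hypothetical T values for lags 1 to 4
flagged = significant_lags(t_values)
p_at_limit = two_sided_p(1.96)
print(flagged, p_at_limit)
```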
After the orders of AR and MA are obtained, various ARIMA models with orders that are likely to describe the data can be tried. A significance test is then carried out to choose a suitable model. A model is said to be significant if the p-values of its parameters are smaller than α = 0.05. The candidate ARIMA models and the results of the significance tests for each of them are as follows. Based on Table 4, the significant ARIMA model (p-value < 0.05) is ARIMA(1,1,1). After the significant ARIMA models are obtained, the next step is to evaluate them to find the best model, which is the one with the smallest MSE value. Based on Table 4, the model with the smallest MSE among all candidate models is ARIMA(1,1,1), with MSE = 52694. Figure 3 shows the residual plots for UNVR share price data from January 2, 2020 to June 30, 2020. The points in the normal probability plot follow an approximately straight line, which indicates that the errors are close to normal, with some outliers. The normality assumption is therefore met. In addition, the histogram looks symmetrical, is concentrated around the center and shows little spread, which supports the normality assumption. Therefore ARIMA