Predicting Crude Oil Prices During a Pandemic: A Comparison of ARIMA and GARCH Models

Received July 17, 2020; Revised from August 25, 2020; Accepted September 22, 2020; Available online March 15, 2021

The unprecedented global turn of events caused by the spread of the highly contagious coronavirus pandemic has led to a substantial fall in crude oil prices. A forecast for crude oil prices is important because oil is required for all major economic activity, particularly production and transportation. This study applies two commonly used methods, Autoregressive Integrated Moving Average (ARIMA) and Generalized Autoregressive Conditional Heteroskedasticity (GARCH), to predict WTI crude oil prices for the period February 10, 2020, to April 27, 2020. Such a comparative analysis of these methods in unprecedented times is missing in the existing literature. ARIMA (4,1,4) and GARCH (1,1) are identified as the best models within their respective families. Between the two, the ARIMA model is recommended for forecasting, as it has a lower root mean squared error (RMSE) and mean absolute error (MAE). The study recommends using a mean-based ARIMA approach for predicting future values in extreme situations.

JEL classification: G10, M41
DOI: 10.14254/1800-5845/2021.17-1.15


INTRODUCTION
Oil, often called 'black gold' because it is a precious commodity that affects the entire world economy, is going through unprecedented times. The demand for crude oil has fallen more drastically than ever before because economic activity has been suspended worldwide to contain the contagious coronavirus. Around three billion people worldwide have been forced to remain at home to prevent the spread of this highly contagious virus. This has taken a toll on economic activity: it has lowered industrial production, minimized transportation, disrupted supply chains, and contracted international trade. The novel coronavirus was identified on December 31, 2019, and the first death was reported in China on January 11, 2020. By February 9, the death toll in China alone had surpassed that of the 2002-2003 SARS epidemic. On March 11, the WHO declared the outbreak a pandemic. By April 10, the global death toll had surpassed 100,000. Lockdown has been the dominant strategy to prevent the spread of the coronavirus, since at the end of July the world was still awaiting a vaccine for the disease. But this lockdown has seriously strained the world economy. Crude oil, which is sometimes considered the contemporary engine of growth, has been hit the hardest.
A few other factors have also worsened the situation for oil prices. First, two major oil producers, Saudi Arabia and Russia, could not reach an agreement on cutting oil production in response to the crisis, and Saudi Arabia reacted by flooding the international market with oil. Second, the oil futures market is now in 'super-contango', which encourages storing oil in anticipation of higher prices in the future. However, this has strained the storage infrastructure. Third, the nature of crude oil production is such that it cannot simply be stopped in response to reduced demand: restarting closed oil wells can lead to higher losses than selling the oil at very low prices.
Crude oil forecasts are important for basic macroeconomic decisions. They also have important financial implications for managing options, risks, and portfolios. There are a large number of models for predicting crude oil prices and their volatility. Autoregressive Integrated Moving Average (ARIMA) is a popular mean-based prediction technique, and Generalized Autoregressive Conditional Heteroskedasticity (GARCH) models are important volatility-based prediction techniques. Root Mean Squared Error (RMSE), Mean Absolute Error (MAE), and Mean Absolute Percentage Error (MAPE) are commonly used to compare competing models and rank them by out-of-sample forecast accuracy (Sadorsky, 2006; Marzo and Zagaglia, 2010; Mensah, 2015).
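The three accuracy measures named above can be written down in a few lines. The following is a minimal sketch in plain Python; the function names and the three-point price series are illustrative assumptions, not values from the study. Note that MAPE divides by the actual value, so it becomes unstable when prices are close to zero and is undefined at exactly zero, which matters for a sample that includes negative prices.

```python
import math

def rmse(actual, forecast):
    """Root mean squared error: penalises large misses more heavily."""
    n = len(actual)
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / n)

def mae(actual, forecast):
    """Mean absolute error: average size of the miss, in price units."""
    n = len(actual)
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / n

def mape(actual, forecast):
    """Mean absolute percentage error; observations with a zero actual are skipped."""
    terms = [abs((a - f) / a) for a, f in zip(actual, forecast) if a != 0]
    return 100.0 * sum(terms) / len(terms)

# Hypothetical prices (USD per barrel), for illustration only.
actual = [50.0, 40.0, 20.0]
forecast = [48.0, 43.0, 24.0]

print(rmse(actual, forecast), mae(actual, forecast), mape(actual, forecast))
```

A model is ranked above a competitor on a given measure when its value is lower for the same forecast sample.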
Although there have been past attempts to forecast crude oil prices, this study addresses the problem of predicting them during an extraordinary period. There is a sharp drop in the demand for oil due to the spread of the coronavirus. Oil prices have not only fallen to historically low levels; on April 20, 2020, the WTI price in the futures market was negative $37.63 per barrel, the first negative oil price in history. To this end, the current study compares the forecast results of ARIMA and GARCH in such an extreme period. The choice of these two methods is justified because ARIMA models the mean and GARCH models the variance. ARIMA has been used by several studies to predict future values. Abledu and Kobin used ARIMA (1,1,0) to predict oil prices in Ghana. Akpanta and Okorie (2014) used ARIMA modeling to predict the crude oil price of Nigeria. The ACF and PACF plots suggested ARIMA (6,1,7), but after removing some redundant parameters the authors proposed ARIMA (2,1,2) on the basis of parsimony. The two models were then compared using information criterion statistics and a t-test, and no significant difference was found. In a related study, Adebiyi et al. (2014) built a model for predicting stock prices. The study found that ARIMA provided robust short-term forecasts, with ARIMA (1,0,1) relatively the best model for forecasting the Zenith Bank stock index and ARIMA (2,1,0) the best model for the Nokia stock index. The study further argued that the ARIMA model can compete favorably with other forecasting models. In another study, Mensah (2015) divided the study period into sub-periods and found that an ARIMA (1,1,1) model was the best for forecasting Brent crude prices. However, the study noted that this model may not work properly under high volatility and hence recommended using a GARCH family model for estimation during periods of high volatility.
ARIMA has also been used for predicting and comparing the domestic credit growth of two countries, China and Vietnam (Dinh, 2020). Further, the existing literature has extensively used GARCH family models to model the volatility of stock returns, crude oil prices, gold prices, etc. Researchers forecasting volatility in crude oil prices found that GARCH models outperformed other forecasting models (Tripathy and Rahman, 2013; Shabani et al., 2016; Ayele et al., 2017). Moreover, asymmetric GARCH models appear to perform particularly well within the GARCH family in modeling the volatility of different commodities. Asymmetric GARCH models, such as EGARCH, were found to be superior to symmetric models, such as GARCH (1,1), in predicting the volatility of prices because the asymmetric models capture leverage effects (Agnolucci, 2009; Chin, 2009; Salisu and Fasanya, 2012; Wang and Wu, 2012; Lama et al., 2015; Charles and Darne, 2017). In addition to EGARCH, other asymmetric GARCH models, such as TGARCH, PARCH, and AGARCH, were found to be more suitable for modeling volatility (Gokbulut and Pekkaya, 2014).

LITERATURE REVIEW
A similar study by Arachchi (2018), which compared symmetric and asymmetric GARCH models applied to exchange rate volatility, found the latter to be effective in most circumstances. Further, Faruk and Muhammad (2019), who sought the most suitable model for forecasting price volatility, found the basic GARCH model with GED parameters to be a parsimonious model within the GARCH family because it captures the persistence of volatility. Moreover, the performance of GARCH family models was found to be better in the short run (Herrera et al., 2014). Similarly, asymmetric GARCH models, such as EGARCH and TGARCH, were found to be the best-suited models for forecasting the persistence of volatility in stock markets, because these models capture the asymmetric relationship between the risk and return of stocks (Tamilselvan and Vali, 2016; Abdalla and Winker, 2012; Rahman and Malik, 2019). The GJR-GARCH model, an asymmetric model, was found to be effective in capturing stock market volatility, since it captures leverage, volatility clustering, and leptokurtosis in the time series (Onwukwe et al., 2011; Khan et al., 2019). In addition to these studies, Ekong and Onye (2017) found both symmetric and asymmetric GARCH models to be effective in capturing stock market volatility.
Yazzi et al. (2011) compared ARIMA and GARCH in predicting WTI crude oil prices for the period January 2, 1986, to September 30, 2009. The study found GARCH (1,1) to be better than ARIMA (1,2,1), as it captures volatility better through its non-constant conditional variance. Ahmed and Shabri (2015) compared ARIMA and GARCH in predicting WTI using daily data for the period January 1, 1986, to September 30, 2006. The study found the ARIMA model to be better than GARCH, as its RMSE was lower. Aamir and Shabri (2015), comparing ARIMA and GARCH for predicting monthly crude oil prices of Pakistan for the period February 1986 to March 2015, found both the RMSE and MAE of the GARCH model to be lower than those of ARIMA, making it the better model.
Comparisons between ARIMA and GARCH have also been made for other variables. Chen et al. (2011) compared the two models for short-term traffic flow prediction and found the ARIMA model to be sufficient to forecast traffic flow. Bhardwaj et al. (2014) found that GARCH is a better model than ARIMA for estimating the daily price of gram. Miswan et al. (2014) found that the ARIMA model is better than the GARCH model in predicting Malaysian market properties and shares. Nyoni and Nathaniel (2018) compared the two models in predicting inflation in Nigeria and found ARIMA to be a better model than GARCH. A combination of these two models has been used for power demand forecasting (Neshat et al., 2018).

DATA AND METHODOLOGY
WTI data are used to forecast daily crude oil prices. The sample covers the period January 3, 1986, to April 27, 2020, and the out-of-sample forecasting period is February 10, 2020, to April 27, 2020. First, the Augmented Dickey-Fuller (ADF) test is applied to check the stationarity of the variables. Traditionally, the ACF and PACF plots are used to determine the order of the ARMA model, but EViews 10 automatically iterates over candidate specifications and provides the best model using the Akaike Information Criterion (AIC). The AIC indicates how well the estimated model fits the data when compared to other models. The AIC is estimated by:

$$\mathrm{AIC} = -2\ln(M) + 2(p+q)$$

where $M$ denotes the maximum likelihood function value and $(p+q)$ is the total number of parameters to be estimated. The best model is the one with the lowest AIC value together with the highest R-squared value. The study uses and compares two popular forecasting models, namely ARIMA and GARCH. The forecasted values from ARIMA and GARCH are compared with the actual values using Root Mean Square Error (RMSE) and Mean Absolute Error (MAE).

ARIMA is basically a combination of two processes, namely Autoregressive (AR) and Moving Average (MA). It is used when either the determinants of the variable to be forecast are unknown or the data on these causal variables are not readily available. In AR models the dependent variable depends on its own previous values. An AR model can be specified as:

$$Y_t = \mu + \phi_1 Y_{t-1} + \phi_2 Y_{t-2} + \cdots + \phi_p Y_{t-p} + \varepsilon_t \tag{1}$$

where $Y_t$ is the response variable at time t; $\mu$ is the constant of the process; $Y_{t-1}, Y_{t-2}, \ldots, Y_{t-p}$ are the response variables at lags t-1, t-2, ..., t-p respectively and serve as the independent variables; $\phi_1, \ldots, \phi_p$ are the coefficients to be estimated; and $\varepsilon_t$ is the error term at time t.
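The AIC-based selection step described above amounts to computing one score per candidate order and keeping the smallest. The sketch below assumes the common form AIC = -2 ln(M) + 2(p+q); the candidate orders and log-likelihood values are made up purely for illustration and are not estimates from the WTI data.

```python
def aic(log_likelihood, p, q):
    """AIC = -2 ln(M) + 2 (p + q), where ln(M) is the maximised log-likelihood."""
    return -2.0 * log_likelihood + 2.0 * (p + q)

# Hypothetical maximised log-likelihoods for three candidate ARMA(p, q) orders.
candidates = {
    (1, 0): -105.0,
    (1, 1): -101.5,
    (2, 1): -101.2,
}

scores = {order: aic(loglik, *order) for order, loglik in candidates.items()}

# The preferred specification is the one with the LOWEST AIC.
best_order = min(scores, key=scores.get)
print(scores, best_order)
```

In this toy comparison the extra parameter in the (2, 1) model raises the likelihood slightly but not enough to offset the penalty term, so the more parsimonious (1, 1) model is chosen.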
Theoretically, if the partial autocorrelation function (PACF) stops abruptly at some point, the model is of AR(p) type. The number of spikes before the abrupt stop is equal to the order of the AR model. In Moving Average (MA) models the dependent variable $Y_t$ depends on previous values of the errors rather than on the variable itself. An MA model can be specified as:

$$Y_t = \mu + \varepsilon_t + \theta_1 \varepsilon_{t-1} + \theta_2 \varepsilon_{t-2} + \cdots + \theta_q \varepsilon_{t-q} \tag{2}$$

where $Y_t$ is the response variable at time t; $\mu$ is the constant mean of the process; $\varepsilon_{t-1}, \varepsilon_{t-2}, \ldots, \varepsilon_{t-q}$ are the error terms at lags t-1, t-2, ..., t-q respectively and serve as the independent variables; $\theta_1, \ldots, \theta_q$ are the coefficients to be estimated; and $\varepsilon_t$ is the error term at time t.
If the autocorrelation function (ACF) stops abruptly at some point, say after q spikes, then the appropriate model is of MA(q) type. The number of spikes before the abrupt stop is referred to as q. If neither function falls off abruptly, but both decline toward zero in some fashion, the appropriate model is of ARMA (p,q) type. For a univariate series the model is specified as:

$$Y_t = \mu + \sum_{i=1}^{p} \phi_i Y_{t-i} + \varepsilon_t + \sum_{j=1}^{q} \theta_j \varepsilon_{t-j} \tag{3}$$

where $\phi_i$ and $\theta_j$ are the AR and MA coefficients and $\varepsilon_t$ is a white noise process, that is, a purely random series with zero mean that is normally and independently distributed, with no correlation over time.
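The ACF cut-off rule described above can be illustrated on simulated data. The following sketch generates an MA(1) series with a hypothetical coefficient of 0.6 and computes its sample autocorrelations in plain Python; the series length, seed, and function names are assumptions for illustration. For an MA(1) process the theoretical autocorrelation is θ/(1+θ²) at lag 1 and zero at all higher lags.

```python
import random

def simulate_ma1(n, theta, seed=0):
    """Simulate Y_t = eps_t + theta * eps_{t-1} with standard normal errors."""
    rng = random.Random(seed)
    eps = [rng.gauss(0.0, 1.0) for _ in range(n + 1)]
    return [eps[t] + theta * eps[t - 1] for t in range(1, n + 1)]

def sample_acf(series, max_lag):
    """Sample autocorrelations for lags 0..max_lag."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    denom = sum(d * d for d in dev)
    return [
        sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
        for k in range(max_lag + 1)
    ]

theta = 0.6
series = simulate_ma1(5000, theta)
acf = sample_acf(series, max_lag=4)

for lag, value in enumerate(acf):
    print(lag, round(value, 3))
```

With a few thousand observations, the lag-1 value should sit near 0.6 / 1.36 ≈ 0.44 while lags 2 to 4 stay close to zero, which is the "one spike, then an abrupt stop" pattern that points to q = 1.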
The ARMA model requires a stationary series with constant mean, constant variance, and constant autocovariance. The ARMA model is extended to the Autoregressive Integrated Moving Average (ARIMA) model when the series is differenced to remove non-stationarity. A series is stationary when, basically, the covariance between two values in the series depends only on the time interval between them and not on time itself. When differencing is used to make a time series stationary, it is common to refer to the resulting model as ARIMA (p,d,q). The 'I' refers to the integrated or differencing term and 'd' refers to the degree of differencing, that is, the number of differences taken from the original time series data. Further, the study also examines the volatility forecast of crude oil prices using symmetric and asymmetric models from the GARCH family. The GARCH (1,1) model was considered among symmetric models, while EGARCH (1,1), TGARCH (1,1), and PGARCH (1,1) were considered among asymmetric models.
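The differencing step behind the 'd' in ARIMA (p,d,q) is simple enough to sketch directly. The helper names and the four-point price series below are hypothetical; the second helper shows how forecasts made on the differenced scale are cumulated back to price levels.

```python
def difference(series, d=1):
    """Apply first differencing d times: each pass returns y[t] - y[t-1]."""
    out = list(series)
    for _ in range(d):
        out = [later - earlier for earlier, later in zip(out, out[1:])]
    return out

def undifference(last_level, diffs):
    """Cumulate differenced values back to levels, starting from a known level."""
    levels, level = [], last_level
    for step in diffs:
        level += step
        levels.append(level)
    return levels

# Hypothetical prices, for illustration only.
prices = [50.0, 52.0, 51.0, 47.0]

first_diff = difference(prices)          # d = 1
second_diff = difference(prices, d=2)    # d = 2
rebuilt = undifference(prices[0], first_diff)

print(first_diff, second_diff, rebuilt)
```

Each pass of differencing shortens the series by one observation, which is why a model with d = 1 is estimated on one fewer data point than the original price series contains.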

The GARCH (1,1) Model
The basic GARCH model was developed by Bollerslev (1986). GARCH models consist of a mean equation and a variance equation. The mean equation is

$$R_t = \mu + \varepsilon_t$$

where $R_t$ is the return at time t, $\mu$ is the mean return, and $\varepsilon_t$ is the return residual. The variance equation in the GARCH model is written as follows.

$$\sigma_t^2 = \alpha_0 + \omega_1 \varepsilon_{t-1}^2 + \lambda_1 \sigma_{t-1}^2$$

where $\sigma_t^2$ is the conditional variance, $\alpha_0$ is a constant, and the remaining terms weight the volatility of the past period ($\omega_1 \varepsilon_{t-1}^2$) and the past period's conditional variance ($\lambda_1 \sigma_{t-1}^2$). The process is covariance stationary when the sum of the coefficients on the past period's volatility and conditional variance is less than one; a sum close to one indicates highly persistent volatility.
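To make the variance equation concrete, the sketch below runs a GARCH (1,1) recursion in which the next conditional variance equals a constant plus a weight on the last squared residual plus a weight on the last conditional variance. The parameter names (`const`, `arch`, `garch`) and the numerical values are illustrative assumptions, not estimates from the study.

```python
def garch11_variance_path(residuals, const, arch, garch):
    """Conditional variances from a GARCH(1,1) recursion.

    next_variance = const + arch * residual**2 + garch * current_variance

    The path starts at the unconditional variance const / (1 - arch - garch),
    so the returned list has len(residuals) + 1 entries; the last entry is the
    one-step-ahead variance forecast.
    """
    if const <= 0 or arch < 0 or garch < 0 or arch + garch >= 1:
        raise ValueError("need const > 0, arch >= 0, garch >= 0, arch + garch < 1")
    variance = const / (1.0 - arch - garch)
    path = [variance]
    for shock in residuals:
        variance = const + arch * shock ** 2 + garch * variance
        path.append(variance)
    return path

# Hypothetical parameters: persistence arch + garch = 0.9.
path = garch11_variance_path([0.0, 2.0], const=0.1, arch=0.1, garch=0.8)
print(path)
```

A quiet day (residual of zero) lets the variance decay toward its long-run level, while a large residual pushes it up on the following day; the closer `arch + garch` is to one, the more slowly such a jump dies out.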
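The EGARCH model specifies the logarithm of the conditional variance:

$$\ln(\sigma_t^2) = \alpha_0 + \lambda_1 \ln(\sigma_{t-1}^2) + \omega_1 \left| \frac{\varepsilon_{t-1}}{\sigma_{t-1}} \right| + \delta_1 \frac{\varepsilon_{t-1}}{\sigma_{t-1}}$$

where $\ln(\sigma_t^2)$ is the natural logarithm of the conditional variance, $\alpha_0$ is a constant, $\omega_1$ and $\lambda_1$ are the coefficients on the size of the past standardized shock and on the past log variance, the leverage effect between return and volatility is captured by $\delta_1$, and the assumption is that $\delta_1 < 0$. A positive and significant coefficient of conditional volatility shows that there is a positive association between risk and return. A negative and significant asymmetric term indicates that bad news in the market increases volatility more than good news of the same size, whereas a positive asymmetric term indicates that the leverage effect is absent. In practice, investors tend to react more strongly to bad news than to good news. Rabemananjara and Zakoian (1993) suggested the TGARCH model, also called Threshold GARCH, which examines the asymmetric relationship between risk and return through a differential effect on the conditional variance.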

The TGARCH Model
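$$\sigma_t^2 = \alpha_0 + \omega_1 \varepsilon_{t-1}^2 + \delta_1 d_{t-1} \varepsilon_{t-1}^2 + \lambda_1 \sigma_{t-1}^2$$

where $d_{t-1}$ equals one when $\varepsilon_{t-1} < 0$ (bad news) and zero otherwise, $\alpha_0$ is a constant, $\omega_1$ and $\lambda_1$ are the coefficients on the past squared residual and the past conditional variance, and the effect of leverage, or the asymmetric relationship between risk and return, is given by the coefficient $\delta_1$. This model examines the differential effect of shocks on the conditional variance: when the leverage coefficient is greater than zero ($\delta_1 > 0$), negative shocks, which are linked to bad information, raise the conditional variance more than positive shocks. Ding et al. (1993) suggested the PARCH model, also known as Power Autoregressive Conditional Heteroscedasticity. This model replaces the second-order power of the error term with a flexible power and includes an asymmetric coefficient that captures the effect of leverage.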

The PARCH Model
$$\sigma_t^{\kappa} = \alpha_0 + \sum_{i=1}^{q} \omega_i \left( |\varepsilon_{t-i}| - \delta_i \varepsilon_{t-i} \right)^{\kappa} + \sum_{j=1}^{p} \lambda_j \sigma_{t-j}^{\kappa}$$

where p and q denote the numbers of lagged terms of $\sigma^2$ and $\varepsilon^2$, $\omega_i$ and $\lambda_j$ are the corresponding coefficients, $\delta_i$ captures the leverage effect, and $\kappa > 0$ is the power parameter. Further, the PARCH model nests other models, such as GARCH, TGARCH, and GJR-GARCH, as special cases.

Table 1 provides descriptive statistics of WTI crude oil prices. The oil prices had a maximum of 145.31 and a minimum of -36.98 US dollars per barrel. The mean price was 44.11, which is higher than the median, indicating that the distribution of prices is skewed to the right. The price distribution is platykurtic, as the value of kurtosis is less than 3. A graphical presentation of both the actual and log price is provided in Figure 1.

Table 2 provides the results of the stationarity test using the Augmented Dickey-Fuller test. The results indicate that the data have a unit root at level and become stationary upon first differencing. It is important to check the stationarity of the data because the ARMA component can be estimated only on stationary data; if the data are non-stationary, they are first made stationary by differencing.

The selected ARMA model is (4,1,4)(0,0). This has the lowest AIC of 3.479832. The results of the ARIMA model are as follows. All the AR and MA coefficients are significant. The F-ratio of the model is significant, as its p-value is less than 0.05. Also, the inverse AR and MA roots are lower than one in modulus, indicating that the estimated process is stationary and invertible and the model is adequate. The Durbin-Watson statistic is also very close to 2, indicating the absence of autocorrelation. The RMSE, MAE, and MAPE for the forecast sample are 10.18411, 3.9364787, and 1.037790 respectively.

Table 4 reports the estimates of the symmetric and asymmetric GARCH models for crude oil prices. The study selected the Normal (Gaussian) distribution to estimate the GARCH models. The results show that the coefficients of the GARCH (1, 1) model are significant at the 0.01 level. The sum of the coefficients ω and λ is approximately equal to one, which shows the persistence of volatility.
The estimates show a higher log likelihood and lower information criteria, which indicates a well-fitting model for forecasting volatility. Further, the residual diagnostics show no remaining serial correlation, as the p-value exceeds the 0.05 level of significance. Therefore, the results of the GARCH (1, 1) model show that symmetric GARCH models are suitable for predicting the volatility of crude oil prices. The parameters of the TGARCH model are significant at the 0.01 level. The leverage effect in the TGARCH model is captured by δ, and its expected sign is opposite to that in the EGARCH model. The leverage coefficient in the TGARCH model (δ) is positive, which implies that negative shocks increase volatility more than positive shocks. The parameters of the EGARCH model are significant at the 0.01 level. The leverage effect in the EGARCH model is captured by δ, and the significance of the coefficient confirms that volatility responds asymmetrically. However, the leverage coefficient in the EGARCH model (δ) is positive rather than negative, so the expected leverage effect, in which bad news raises volatility more than good news, is not found in this specification. The parameters of the PGARCH model are also significant at the 0.01 level. The leverage effect in the PGARCH model is captured by δ, and the sign of the coefficient confirms the asymmetric volatility. The leverage coefficient in the PGARCH model (δ) is positive, which implies that negative shocks, which carry bad news to the market, make volatility stronger. Therefore, the results show that asymmetric GARCH models are suitable for forecasting the volatility of crude oil prices.
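The sign conventions discussed above differ across the three asymmetric models, and a small numerical check can help keep them apart. The sketch below uses one common parameterisation of each model with one lag; the function names, argument names, and parameter values are hypothetical and are not the estimates reported in Table 4. It compares the next-period variance after a negative shock and after a positive shock of the same size.

```python
import math

def egarch_next_var(prev_var, shock, const, arch, garch, leverage):
    """EGARCH(1,1): the log variance responds to |z| and to z = shock / sd."""
    z = shock / math.sqrt(prev_var)
    log_var = const + garch * math.log(prev_var) + arch * abs(z) + leverage * z
    return math.exp(log_var)

def tgarch_next_var(prev_var, shock, const, arch, garch, leverage):
    """Threshold (GJR-type) GARCH: extra weight on squared negative shocks."""
    bad_news = 1.0 if shock < 0 else 0.0
    return const + (arch + leverage * bad_news) * shock ** 2 + garch * prev_var

def parch_next_var(prev_var, shock, const, arch, garch, leverage, power):
    """Power ARCH with one lag; requires |leverage| <= 1 and power > 0."""
    prev_sd = math.sqrt(prev_var)
    sd_pow = (const
              + arch * (abs(shock) - leverage * shock) ** power
              + garch * prev_sd ** power)
    return sd_pow ** (2.0 / power)

v = 1.0  # previous conditional variance (hypothetical)

# EGARCH: a NEGATIVE leverage coefficient makes bad news raise variance more.
eg_bad = egarch_next_var(v, -1.0, const=0.0, arch=0.1, garch=0.9, leverage=-0.05)
eg_good = egarch_next_var(v, 1.0, const=0.0, arch=0.1, garch=0.9, leverage=-0.05)

# TGARCH and PARCH: a POSITIVE leverage coefficient has that effect.
tg_bad = tgarch_next_var(v, -1.0, const=0.1, arch=0.05, garch=0.8, leverage=0.1)
tg_good = tgarch_next_var(v, 1.0, const=0.1, arch=0.05, garch=0.8, leverage=0.1)
pa_bad = parch_next_var(v, -1.0, const=0.1, arch=0.1, garch=0.8, leverage=0.3, power=1.5)
pa_good = parch_next_var(v, 1.0, const=0.1, arch=0.1, garch=0.8, leverage=0.3, power=1.5)

# With power = 2 and leverage = 0 the PARCH update reduces to plain GARCH(1,1).
pa_as_garch = parch_next_var(v, 1.0, const=0.1, arch=0.1, garch=0.8, leverage=0.0, power=2.0)

print(eg_bad > eg_good, tg_bad > tg_good, pa_bad > pa_good, pa_as_garch)
```

Under these parameterisations, flipping the sign of the leverage coefficient in any of the three functions reverses the ordering, so that good news would raise variance more than bad news.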

EMPIRICAL RESULT
The estimates of the asymmetric GARCH models show a higher log likelihood and lower information criteria, which indicates well-fitting models for forecasting volatility. Further, the residual diagnostics show no remaining serial correlation, as the p-value exceeds the 0.05 level of significance. The results show that the asymmetric GARCH models capture the leverage effect, except the EGARCH model, where the leverage coefficient is positive and significant, contrary to the expected negative sign under which bad news increases volatility more than good news.

Table 5 shows the forecasting estimates of the symmetric and asymmetric GARCH models, together with a comparison in the form of ranks. The results show that GARCH (1, 1) is the best-fitted model among the symmetric and asymmetric models, since its forecast error measures (RMSE, MAE, and MAPE) are the lowest. Further, the results show that PARCH (1, 1) is the best-fitted model among the asymmetric models, because its forecast error measures are lower than those of the other asymmetric models.

CONCLUSION
The results of forecasting crude oil prices using GARCH family models show that the GARCH (1, 1) model, which is a symmetric model, is a well-fitted model for forecasting crude oil prices. This result is in line with previous studies (Arachchi, 2018; Ekong and Onye, 2017; Muhammad and Faruk, 2018; Herrera et al., 2018; Ayele, 2017; Shabani et al., 2016; Rahman and Malik, 2019). The results also show that the asymmetric GARCH models are fit to forecast crude oil prices, except the EGARCH model. The TGARCH and PGARCH models capture the leverage effect, since their leverage coefficients are positive, which shows that negative shocks (bad news) increase the volatility in the market. These results are in accordance with the previous literature (Arachchi, 2018; Abdalla and Winker, 2012; Salisu and Fasanya, 2012; Almeida and Hotta, 2014). The coefficient of the EGARCH model that captures leverage is positive and significant, in contrast to the negative sign the model assumes. A positive coefficient therefore does not support the leverage effect, even though market investors tend to respond more strongly to negative information and bad news increases volatility in the market. The result of the EGARCH model is in contrast to previous studies (Khan et al., 2019; Ekong and Onye, 2017; Herrera et al., 2018; Lama et al., 2015).
On the other hand, the forecast evaluation shows that the symmetric GARCH (1,1) model forecasts well, since its RMSE, MAE, and MAPE are low. Among the asymmetric models, TGARCH and PGARCH are the best models for forecasting crude oil prices, and PGARCH is the better of the two, since its forecast errors are lower. Further, the comparison across the GARCH family shows that the symmetric GARCH (1, 1) model is the best-fitted model in the family, since its forecast errors are the lowest, as reported in the table. Therefore, the study concludes that, within the GARCH family, the GARCH (1, 1) model is the best-fitted model for forecasting crude oil prices.