Time-Series Modeling and Short Term Prediction of Annual Temperature Trend on Coast Libya Using the Box-Jenkins ARIMA Model

Aims: In this study, a time series model was developed to predict the annual warming trend on the coast of Libya in the second decade of the 21st century using an ARIMA model, and the significance of the results was evaluated.
Study Design: The Box-Jenkins method was applied through the stages of identification, parameter estimation and diagnosis; finally, a forecast of the annual surface temperature trend in Libya in the second decade of the 21st century was produced, together with an evaluation of the significance of the predicted warming trend.
Place and Duration of Study: Annual surface absolute temperature (ASAT) from 16 stations on the coast of Libya during the period 1892-2010 was used.
Results: The two best prediction models obtained for these data are the non-seasonal linear trend model ARIMA (3-1-2) and the quadratic trend model ARIMA (3-2-3). The forecasted values followed the upward trend present in the data, and the predicted pattern closely followed the observed pattern, with a correlation value of approximately 80% for both models. According to the linear trend model, an increase in temperature of 0.12°C/decade, and according to the quadratic model, an increase of 0.53°C/decade, was predicted until the year 2020. This increase in temperature is consistent with what was predicted by the United Nations (from 1.3°C to 5.8°C between the years 1990 and 2100).
Conclusion: The two models produced the best overall performance in making short-term (∼10-year) predictions of annual surface absolute temperature in Libya. They can be used as a supplemental tool for environmental planning and decision making, alongside other environmental models.
Original Research Article. El-Mallah and Elsharkawy; AIR, 6(5): 1-11, 2016; Article no.AIR.24175


INTRODUCTION
Global warming has already been observed over the last several decades, and temperatures are projected to keep changing in the future. Ground-level air temperatures are also predicted to continue to warm more rapidly over land than over the oceans. The global warming rate deduced from the adjusted temperatures since 1980 is about 0.14±0.02°C/decade. The warming rate reported in the IPCC Fourth Assessment Report, based on the observed global surface temperature set, is about 20% higher, owing to the warming by the Atlantic multi-decadal oscillation in addition to the anthropogenic warming. Moreover, the predicted temperature evolution based on long-term changes of CO2 and the Atlantic multi-decadal oscillation index shows that Northern Hemisphere temperatures are modulated by the Atlantic multi-decadal oscillation and will not change significantly until about 2040, after which they will increase rapidly, just as during the last decades of the past century [1]. Changes in surface air temperature on small spatial scales, such as the sub-continental or regional scale, show different features. Compared with the global mean climate, regional climate has more complex variability, since it is influenced by ocean-atmosphere circulation, land cover, and associated feedback processes. Regional climate is relevant to the environment and to economic production, and deserves more attention [2]. Some world regions are also expected to see larger atmospheric temperature increases in the future than the global average.
Time series analysis and forecasting has become a major tool in numerous applications in meteorology and other environmental areas for understanding phenomena such as rainfall, humidity, temperature and drought. Balyani et al. [3] applied spectral analysis and ARIMA modeling to a 50-year period for Shiraz. Their results show that cycles of 2.5 and 4 years dominate temperature in Shiraz; furthermore, ARIMA (1-1-3) was selected as the optimal model, and it predicted a 0.2°C increase in annual temperature in Shiraz. Babazadeh et al. [4] used an ARIMA model to forecast monthly precipitation and mean monthly temperature at Shiraz, in the south of Iran. They found that, despite a continuing drought, precipitation is likely to improve, and that the trend of increasing mean monthly temperature, especially in recent years, has continued; their forecasts show an increase in temperature along with a narrowing of the range of variations. Anitha et al. [5] used a seasonal autoregressive integrated moving average (SARIMA) model to forecast the monthly mean of maximum surface air temperature (SAT) of India; their results show an increasing trend in the monthly mean of maximum SAT in India. Suteanu [6] used daily minimum and maximum temperature records from Canadian stations in the Atlantic region and suggested a new approach to studying the variability of surface air temperature patterns. Muhammet [7] also used the ARIMA method to predict temperature and precipitation in Afyonkarahisar Province, Turkey, until the year 2025; he found an increase in temperature of 1.2°C according to the quadratic trend model and of 0.5°C according to the linear trend model.
Khedhiri [8] studied the statistical properties of historical temperature data of Canada for the period 1913-2013; he determined a seasonal ARIMA model for the series and predicted future temperature records. Muhammad et al. [9] used an ARIMA model to analyze and forecast the air pollution index (API) in Johor, Malaysia. The method has proven effective in most research areas.
Here, ARIMA (Auto Regressive Integrated Moving Average) models were set up and used to carry out short-term predictions of annual surface air temperatures in Libya.

DATA
Libya is a North African country bordered by the Mediterranean Sea to the north, Egypt to the east, Niger, Chad and Sudan to the south, and Algeria and Tunisia to the west. Both the Mediterranean Sea and the desert affect Libya's climate. In most of the coastal lowland, the climate is Mediterranean, with warm summers and mild winters. Rainfall is scanty, and the dry climate results in year-round 98 percent visibility. The weather is cooler in the highlands, and frosts occur at maximum elevations. Along the coast, the Mediterranean climate is characterized by a cool, rainy winter season and a hot, dry summer. The warmest months are July and August, when Benghazi and Tripoli, in the Mediterranean zone, experience average monthly temperatures of 22°C to 29°C and 17°C to 30°C, respectively. The coolest months are January and February; Benghazi has winter monthly temperatures of 10°C to 17°C, and Tripoli has 8°C to 16°C.
Surface air temperature data from nineteen locations covering the coastal Mediterranean regions of Libya are used throughout this work. The data series are provided by the European Climate Assessment & Dataset-KNMI Climate Explorer, through the link http://climexp.knmi.nl/start.cgi?id=someone@somewhere. Fig. 1 shows the station locations on the coast of Libya by latitude and longitude. The specifications of the monitoring stations are listed in Table 1, with each location's name, longitude, latitude, elevation and period of record. The annual surface absolute temperature (ASAT) was calculated by averaging over the sixteen stations.
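The averaging step can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name annual_station_mean, the station labels and the temperature values are hypothetical and are not taken from the data set, and years missing at a station are simply left out of that year's average.

```python
from statistics import mean


def annual_station_mean(records):
    """Average the available station values for each year.

    records maps a station name to a {year: temperature in deg C} dict.
    A year that is missing at a station is skipped for that station.
    """
    years = sorted({year for series in records.values() for year in series})
    return {
        year: mean(series[year] for series in records.values() if year in series)
        for year in years
    }


# Hypothetical values for illustration only, not taken from the data set.
records = {
    "station_a": {1990: 20.1, 1991: 20.3},
    "station_b": {1990: 19.7, 1991: 19.9, 1992: 20.2},
}
asat = annual_station_mean(records)
print(asat)
```

With uneven station coverage, a plain average like this can shift when stations enter or leave the record, so the choice of how to treat missing years matters.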

METHODOLOGY
The time series approach used in this study is based on the Box-Jenkins model, also referred to as the Autoregressive Integrated Moving Average (ARIMA) method. Many researchers still use this model in many areas of research because of its effectiveness in forecasting [10][11][12].
There are three basic components to an ARIMA model: auto-regression (AR), differencing or integration (I), and moving-average (MA). In its simplest form, an ARIMA model is typically expressed as ARIMA (p-d-q), where p is the order of auto-regression, d is the order of differencing (or integration), and q is the order of moving-average involved. The first of the three processes included in ARIMA models is autoregression. In an autoregressive (AR) process, each value in a series is a linear function of the preceding value or values. In a first-order autoregressive process, only the single preceding value is used; in a second-order process, the two preceding values are used, and so on. These processes are commonly indicated by the notation AR(p) or ARIMA (p-0-0), where the number in parentheses indicates the order. The differencing or integration component of an ARIMA model tries, through differencing, to make a series stationary. The moving-average (MA) component of an ARIMA model tries to predict future values of the series based on deviations from the series mean observed for previous values. In a moving-average process, each value is determined by the weighted average of the current disturbance and one or more previous disturbances. The order of the moving-average process specifies how many previous disturbances are averaged into the new value. In the standard notation, an MA(q) or ARIMA (0-0-q) process uses q previous disturbances along with the current one.
Combining autoregressive and moving-average terms yields the autoregressive moving average (ARMA) model. The notation is ARMA (p, q), where p is the order of the autoregressive part and q is the order of the moving-average part.
The general form of the ARMA (p, q) model is given by Eq. 3. A data set may form a non-stationary time series when the data do not fluctuate around a constant level or mean. One way to make the data stationary is to take differences. The series, generally denoted y t, is then said to follow, after differencing, an autoregressive integrated moving average model, ARIMA (p-d-q). An ARIMA model can be viewed as a "filter" that tries to separate the signal from the noise; the signal is then extrapolated into the future to obtain forecasts.
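For reference, an ARMA (p, q) process and the differencing step that turns it into an ARIMA (p-d-q) model are commonly written as below. The symbols are assumptions introduced here: c is a constant, \phi_i and \theta_j are the autoregressive and moving-average coefficients, \varepsilon_t is the disturbance, and B is the backshift operator. The plus-sign convention for the moving-average terms is also assumed; some texts and software use the opposite sign.

```latex
% ARMA(p, q) for a stationary series w_t
w_t = c + \sum_{i=1}^{p} \phi_i \, w_{t-i} + \varepsilon_t + \sum_{j=1}^{q} \theta_j \, \varepsilon_{t-j}

% ARIMA(p-d-q): w_t is the d-th difference of the observed series y_t
w_t = (1 - B)^d \, y_t, \qquad B \, y_t = y_{t-1}
```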
The analysis was carried out with the SPSS program.

RESULTS AND DISCUSSION
The time series of the annual surface absolute temperature (ASAT) for the coast of Libya for the period 1892-2010 was analyzed as follows. First, the temperature data were statistically analyzed to determine the presence, and identify the temporal patterns, of any trend and/or periodic oscillation. Second, the temperature signal, after the trend and periodicity had been accounted for, was modeled using a non-seasonal Auto Regressive Integrated Moving Average (ARIMA) process, following the Box-Jenkins approach [13]. Finally, a prediction of the surface temperature in Libya in the second decade of the 21st century was made, together with an evaluation of the significance of the predicted warming trend. To obtain the model by the Box-Jenkins methodology, four steps must be followed after data preparation: identification, parameter estimation, diagnostic checking, and finally use of the model for prediction.

Data Preparation
To prepare the data for statistical modeling, the series is transformed to a stationary one. A stationary series has the same mean and variance throughout. The most common transformation is differencing, which replaces each value in the series by the difference between that value and the preceding value. The ASAT time series of Libya and its autocorrelation function, Fig. 2(a,b), indicate a non-stationary series because of the upward trend present in the original series. First-order differencing (d=1) accounts for a linear trend, and second-order differencing (d=2) accounts for a quadratic trend. Fig. 3(a,b) shows the time series after first- and second-order differencing.
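The effect of differencing on a trend can be illustrated with a short Python sketch. The helper name difference and the two synthetic series are assumptions made for illustration; they are not the ASAT data.

```python
def difference(series, order=1):
    """Apply first-order differencing `order` times."""
    result = list(series)
    for _ in range(order):
        # Replace each value by its change from the preceding value.
        result = [curr - prev for prev, curr in zip(result, result[1:])]
    return result


# Synthetic series (not the ASAT data): a linear and a quadratic trend.
linear = [2 * t + 1 for t in range(6)]   # 1, 3, 5, 7, 9, 11
quadratic = [t * t for t in range(6)]    # 0, 1, 4, 9, 16, 25

print(difference(linear, order=1))      # constant: the linear trend is removed
print(difference(quadratic, order=1))   # still increasing
print(difference(quadratic, order=2))   # constant: the quadratic trend is removed
```

Each pass shortens the series by one value, which is why a series differenced twice has two fewer observations than the original.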

Identification
The first and most subjective step after data preparation is the identification of the processes underlying the series. Three integers, p, d, and q, must be determined; they represent respectively the number of autoregressive orders, the number of differencing orders, and the number of moving-average orders of the ARIMA model. The autocorrelation function (ACF) and partial autocorrelation function (PACF) are used as the basic instruments to identify the orders of the ARIMA model.
At the identification stage, Figs. 4 and 5 show the ACF and PACF of the differenced series for the linear and quadratic models. After first-order differencing (d=1) of the initial dataset, the ACF and PACF plots (Fig. 4) show exponential declines with three significant peaks on each, giving candidate p and q values of 1, 2 and 3 and indicating a mixed ARIMA model. For the quadratic case (d=2), both the ACF and PACF plots (Fig. 5) also show exponential declines, with three and four significant peaks on the ACF and PACF respectively, indicating a mixed ARIMA model with candidate p values of 1, 2, 3 and 4 and q values of 1, 2 and 3. We found that the linear model ARIMA (3-1-2) is the best for fitting and forecasting the ASAT dataset of Libya. An autoregressive order of 3 (p=3) specifies that the values of the series from the three preceding time periods are used to predict the current value, while a moving-average order of 2 (q=2) specifies that deviations from the mean value of the series in each of the last two time periods are considered when predicting the current value. For the quadratic model, we chose ARIMA (3-2-3) to fit and forecast the ASAT of Libya. This means that the predicted value for the next year depends on the data of the 3 preceding years and on the errors of the 3 preceding years.
Once the model has been tentatively identified, the AR and MA parameters are estimated.
Table 2. Parameter estimates of a) ARIMA (3-1-2), b) ARIMA (3-2-3)
Table 3.
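A minimal Python sketch of the two identification tools is given below. It assumes the usual sample estimators (the Durbin-Levinson recursion for the PACF) and the approximate 95% limits of ±1.96/sqrt(n); the function names and the short synthetic series are hypothetical, and SPSS may compute these quantities with different conventions.

```python
import math


def acf(series, max_lag):
    """Sample autocorrelations r_0 .. r_max_lag."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    denom = sum(d * d for d in dev)
    return [
        sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
        for k in range(max_lag + 1)
    ]


def pacf(series, max_lag):
    """Sample partial autocorrelations via the Durbin-Levinson recursion."""
    r = acf(series, max_lag)
    result = [1.0]
    prev = []  # prev[j] holds phi_{k-1, j+1} from the previous step
    for k in range(1, max_lag + 1):
        num = r[k] - sum(prev[j] * r[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(prev[j] * r[j + 1] for j in range(k - 1))
        phi_kk = num / den
        prev = [prev[j] - phi_kk * prev[k - 2 - j] for j in range(k - 1)] + [phi_kk]
        result.append(phi_kk)
    return result


# Synthetic series for illustration only (not the ASAT data).
series = [0.0, 1.0, 0.5, 1.5, 1.0, 2.0, 1.5, 2.5, 2.0, 3.0]
bound = 1.96 / math.sqrt(len(series))  # approximate 95% significance limits
r = acf(series, 3)
p = pacf(series, 3)
significant_acf_lags = [k for k in range(1, 4) if abs(r[k]) > bound]
print(r, p, significant_acf_lags)
```

Lags whose ACF or PACF value falls outside the limits are the "significant peaks" used to propose candidate values of q and p.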
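How a model with d=1 combines past values and past errors into a one-step-ahead forecast can be sketched as follows. The function forecast_next, the coefficient values and the input numbers are all hypothetical (they are not the estimates in Table 2), and the plus-sign convention for the moving-average terms is assumed.

```python
def forecast_next(y, residuals, phi, theta, const=0.0):
    """One-step-ahead forecast for an ARIMA (p-1-q) model.

    y         : observed series (levels), most recent value last
    residuals : past one-step errors of the differenced series, most recent last
    phi       : AR coefficients phi_1 .. phi_p
    theta     : MA coefficients theta_1 .. theta_q (plus-sign convention)
    """
    w = [curr - prev for prev, curr in zip(y, y[1:])]  # first differences
    ar_part = sum(phi[i] * w[-1 - i] for i in range(len(phi)))
    ma_part = sum(theta[j] * residuals[-1 - j] for j in range(len(theta)))
    # Forecast the next difference, then add it back to the last level.
    return y[-1] + const + ar_part + ma_part


# Hypothetical inputs for illustration only (p=3, d=1, q=2).
y = [19.8, 20.0, 20.1, 20.4, 20.3]
residuals = [0.05, -0.10]
phi = [-0.5, -0.3, -0.1]
theta = [0.2, 0.1]
print(forecast_next(y, residuals, phi, theta, const=0.01))
```

With p=3 the forecast uses the last three differences, which in turn involve the last four observed levels; with q=2 it uses the last two errors.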

Estimation of Parameters
There are several techniques for estimating model parameters, such as conditional least squares, but SPSS employs maximum likelihood (Melard's algorithm). The estimated values for ASAT are presented in Table 2 for the best-fitting ARIMA models. A t-test is performed to assess statistical significance; the estimated coefficients are significantly different from zero. The significance of a model can then be determined by using the Ljung-Box test [14] and checking the p-values of the coefficients.
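The Ljung-Box statistic itself is simple to compute. The sketch below assumes the standard definition Q = n(n+2) Σ r_k²/(n−k) over lags 1 to h; the function name and the example sequence are hypothetical, and the p-value step (comparison with a chi-square distribution, with h − p − q degrees of freedom for the residuals of a fitted ARMA (p, q) model) is left out.

```python
def ljung_box_q(residuals, max_lag):
    """Ljung-Box Q statistic over lags 1 .. max_lag."""
    n = len(residuals)
    mean = sum(residuals) / n
    dev = [e - mean for e in residuals]
    denom = sum(d * d for d in dev)
    total = 0.0
    for k in range(1, max_lag + 1):
        r_k = sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
        total += r_k * r_k / (n - k)
    return n * (n + 2) * total


# A strongly alternating sequence (illustration only) is far from random:
# its Q at lag 1 greatly exceeds 3.841, the 5% chi-square critical value
# for 1 degree of freedom.
alternating = [1.0, -1.0] * 10
print(ljung_box_q(alternating, 1))
```

A small Q, with a p-value above the chosen significance level, is what supports treating the residuals as random.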

Diagnosis
Diagnosing an ARIMA model is a crucial part of the model-building process and involves analyzing the model residuals. A residual is the difference, or error, between the observed value and the model-predicted value. In this step, the residuals are tested and goodness-of-fit statistics are provided. Fig. 6(a,b) shows the ACF and PACF of the residuals of the ARIMA (3-1-2) and ARIMA (3-2-3) models; no significant correlation appears at any lag. The results in Table 3 show non-significant correlation values, confirming that the residuals of both models are random, which means that the models are a good fit for the series and that no essential components have been omitted from them. Table 4(a,b) shows the goodness-of-fit statistics of the ARIMA (3-1-2) and (3-2-3) models for the ASAT data set. R-squared is an estimate of the proportion of the total variation in the series that is explained by the model. The values of 0.8 and 0.79 mean that both models do an excellent job of explaining the observed variation in the series. The mean absolute percentage error (MAPE) for the (3-1-2) and (3-2-3) models is 1.012% and 1.071% respectively; it measures how much a dependent series varies from its model-predicted level and provides an indication of the uncertainty in the predictions. The maximum absolute percentage error (MaxAPE) represents the largest forecast error, expressed as a percentage; the largest errors are 4.6% and 4.7% for the two models. The prediction errors of the two models were also the lowest, as expressed in terms of the root-mean-squared error (RMSE), the mean absolute error (MAE) and the mean absolute percentage error (MAPE).
The fitting performance of the models was measured using the adjusted coefficient of determination (R²), Akaike's Information Criterion (AIC) [15] and Schwarz's Bayesian Criterion (BIC) [16]. The results in Table 5(a,b) show that the two models fit the temperature data well.
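The goodness-of-fit statistics named above can be sketched in Python from their plain textbook definitions. The function name and the example values are hypothetical, and SPSS may apply degrees-of-freedom corrections that this sketch does not, so its output need not match Table 4 exactly.

```python
import math


def fit_statistics(observed, predicted):
    """Plain-definition fit statistics for a set of one-step predictions."""
    errors = [o - p for o, p in zip(observed, predicted)]
    n = len(errors)
    # Absolute percentage error of each prediction, relative to the observation.
    ape = [abs(e / o) * 100 for e, o in zip(errors, observed)]
    mean_obs = sum(observed) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return {
        "RMSE": math.sqrt(ss_res / n),
        "MAE": sum(abs(e) for e in errors) / n,
        "MAPE": sum(ape) / n,
        "MaxAPE": max(ape),
        "R2": 1 - ss_res / ss_tot,
    }


# Hypothetical values for illustration only.
observed = [20.0, 20.0, 25.0, 25.0]
predicted = [19.0, 21.0, 25.0, 25.0]
stats = fit_statistics(observed, predicted)
print(stats)
```

MAPE summarizes the typical relative error, while MaxAPE isolates the single worst case, which is why the two can differ considerably.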
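For illustration, the two information criteria can be computed from a model's maximized log-likelihood using their general definitions, AIC = 2k − 2 ln L and BIC = k ln n − 2 ln L, where lower values are preferred. The model labels, log-likelihoods, parameter counts and sample size below are hypothetical, and the normalized variants reported by some software are not reproduced here.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike's Information Criterion."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood, n_params, n_obs):
    """Schwarz's Bayesian Criterion."""
    return n_params * math.log(n_obs) - 2 * log_likelihood


# Hypothetical candidates for illustration only.
candidates = {
    "model_a": {"log_likelihood": -100.0, "n_params": 6},
    "model_b": {"log_likelihood": -104.0, "n_params": 3},
}
n_obs = 118

best_by_aic = min(
    candidates,
    key=lambda m: aic(candidates[m]["log_likelihood"], candidates[m]["n_params"]),
)
best_by_bic = min(
    candidates,
    key=lambda m: bic(candidates[m]["log_likelihood"], candidates[m]["n_params"], n_obs),
)
print(best_by_aic, best_by_bic)
```

In this made-up case the two criteria disagree, because BIC penalizes each extra parameter by ln n rather than 2 and therefore favors the smaller model.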

Forecast
The annual surface absolute temperature (ASAT) of Libya during the second decade of the 21st century was forecast using the linear ARIMA (3-1-2) model and the quadratic ARIMA (3-2-3) model. Fig. 7(a,b) shows good agreement between predicted and observed values for both models, indicating that the models have satisfactory predictive ability and capture the trend of the data well. The annual surface temperature of Libya was found to rise at a steady rate of 0.12°C per decade during the forecast period according to the linear model, Fig. 8(a), and at a rate of 0.53°C per decade according to the quadratic model, Fig. 8(b). These results are consistent with the United Nations prediction of temperature increases of 1.3°C to 5.8°C between the years 1990 and 2100. However, the warming predicted by our linear model is considerably lower not only than the IPCC's prediction of 0.2°C per decade [17] but also than the prediction made in [18], in which anthropogenic influences and natural variability were explicitly considered. Yet it is consistent with [19], whose model predicts with moderate confidence that the global temperature will likely continue to rise during the second decade of the 21st century at a rate of 0.12°C per decade.
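The comparison of rates can be made explicit with a short Python sketch. Two assumptions are made purely for illustration: the per-decade rate of a forecast is taken as ten times its least-squares slope against the year, and the quoted 1990-2100 range is spread uniformly over the 11 decades it covers. The function name and the synthetic forecast values are hypothetical.

```python
def trend_per_decade(years, values):
    """Least-squares slope of values against years, in units per decade."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(values) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, values))
    sxx = sum((x - mean_x) ** 2 for x in years)
    return 10 * sxy / sxx


# Synthetic forecast rising by 0.012 deg C per year (illustration only).
years = list(range(2011, 2021))
values = [20.0 + 0.012 * (year - 2011) for year in years]
print(round(trend_per_decade(years, values), 2))

# Quoted range of 1.3 to 5.8 deg C over 1990-2100, spread over 11 decades.
decades = (2100 - 1990) / 10
low_rate = 1.3 / decades
high_rate = 5.8 / decades
print(round(low_rate, 2), round(high_rate, 2))
```

Under the uniform-rate assumption, the two ends of the quoted range round to the rates of the linear and quadratic models respectively.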

CONCLUSION
This paper focuses on forecasting the warming trend of the annual surface temperature of Libya in the second decade of the 21st century, using the Box-Jenkins method to find the best-fitting ARIMA model. The two best prediction models obtained for the data are the non-seasonal linear trend model ARIMA (3-1-2) and the quadratic trend model ARIMA (3-2-3), with maximum absolute percentage errors (MaxAPE) of 4.6% and 4.7% respectively. The predicted value for the next year thus depends on the data of the 3 preceding years and the errors of the 2 preceding years for the linear model, and on the data and errors of the 3 preceding years for the quadratic model. According to the linear model, an increase in temperature of 0.12°C per decade, and according to the quadratic model, an increase of 0.53°C per decade, has been forecast until the year 2020. This increase in temperature is consistent with what was predicted by the United Nations (from 1.3°C to 5.8°C between the years 1990 and 2100). The modeling results show that the linear ARIMA (3-1-2) model and the quadratic ARIMA (3-2-3) model had the best overall performance in making short-term (∼10-year) predictions of annual absolute temperature in Libya. They can be used as a supplemental tool for environmental planning and decision making.

ETHICAL APPROVAL
It is not applicable.