Prediction of life expectancy in Saudi Arabia by 2030 using ARIMA models

Life expectancy at birth (LEB) is a major factor for decision-making bodies when developing new healthcare policies or improving existing ones. Using the R language, this paper processes and examines LEB data for Saudi Arabia from 1960 to 2012 by means of time-series analysis. LEB data from 2013 to 2018 are used to test the validity of the model. The performance of the selected autoregressive integrated moving average (ARIMA) model is evaluated in both the fitting and prediction phases using several criteria commonly applied in the statistical evaluation of forecasts. Finally, Saudi Arabian LEB from 2019 to 2030 is forecast and analyzed in relation to the Saudi Vision 2030 framework.


Introduction
Life expectancy at birth (LEB) is defined as "a standardized summary measure, sometimes used as an overall gauge of health, based on a population's age structure and mortality experience" [1]. It is therefore a population mortality index that theoretically describes the average number of years a newborn is expected to live, given the mortality rates within each age group. Life expectancy has increased significantly over the last few decades in different parts of the world. While average life expectancy worldwide was 52.5 years in 1960, it is currently around 72 years.
Recent research showed the importance of life expectancy predictions in long-term development plans to understand potential health trajectories in the future [2].
Several life expectancy prediction methods have been proposed in the literature [3], [4], [5]. Extrapolation is the most common approach in demographic forecasting. Extrapolative methods are essentially atheoretical; the only assumption is that the future will be a continuation of the past. The most commonly used extrapolation method is univariate ARIMA forecasting [6], [7], [8], [9].
In Saudi Arabia, life expectancy has increased from 51.69 years in 1969 to 75 years in 2018, an average annual growth rate of about 0.76%, which indicates improved prevention and medical care services in the kingdom. In 2016, the kingdom established Saudi Vision 2030, a strategic framework to reduce Saudi Arabia's dependence on oil, diversify its economy, and develop public service sectors such as health, education, infrastructure, recreation, and tourism. One of the Saudi Vision 2030 goals in health is to increase Saudi individuals' life expectancy from 74 (75 currently) to 80 years by 2030 [10].
Studies of this kind are scarce. One study, however, used the demographic literature to discuss the feasibility of the Vision's life expectancy target and the effect of cardiovascular diseases, smoking, obesity, lifestyle, and related policies on life expectancy in Saudi Arabia. The study also analyzed countries whose life expectancies are in the 70s and estimated annual gains of less than 0.31 [11].
This research study aims to predict life expectancy in Saudi Arabia by 2030 using ARIMA models. In addition, the Saudi Vision 2030 objective of increasing life expectancy needs an extensive examination using high-quality datasets. A reliable study will enable decision-makers to implement an effective development plan to sustain an increased life expectancy and overall health development and achieve this goal.

Data
Because of challenges in collecting sufficient data, this study used data extracted from the World Health Organization (WHO) and published in the World Development Indicators (WDI). The data obtained from the WDI cover 1960 to 2017, and the value for 2018 is from the Saudi Ministry of Health (MOH) statistical yearbook [12]. An ARIMA model requires a minimum of 50 observations [13]; the dataset used in this study has 59 observations and is therefore adequate for analysis. LEB data from 1960 to 2012 were selected for analysis, and data from 2013 to 2018 were retained for verification. Table 1 shows the LEB in Saudi Arabia from 1960 to 2018.

ARIMA models
ARIMA models fall under the Box-Jenkins approach, and the terms "ARIMA models" and "Box-Jenkins models" can be used interchangeably. A nonseasonal ARIMA model has three parameters (p, d, q), where p is the number of autoregressive terms, d is the number of differences needed for stationarity, and q is the number of lagged forecast errors in the prediction equation, that is, the order of the moving-average part. A nonseasonal ARIMA model is given in equation (1):

$$\hat{y}_t = c + \phi_1 y_{t-1} + \cdots + \phi_p y_{t-p} + \theta_1 e_{t-1} + \cdots + \theta_q e_{t-q} \qquad (1)$$

where $y_t$ is the series (after differencing d times); $y_{t-1}, \ldots, y_{t-p}$ denote the values of previous years; $e_{t-1}, \ldots, e_{t-q}$ refer to the previous prediction errors; $c$ is a constant; and $\phi_i$ and $\theta_j$ are the autoregressive and moving-average coefficients.

The construction of an ARIMA model consists of three main steps. Model identification establishes the degree of differencing required to make the series stationary and determines p and q. According to Wei [14], the autocorrelation function (ACF) and the partial autocorrelation function (PACF) are vital tools in identifying p and q. The ACF is the linear correlation between $y_t$ and $y_{t-k}$ and is used to identify the moving-average order q, while the PACF is the partial linear correlation between $y_t$ and $y_{t-k}$ after removing the effect of the intermediate values $y_{t-1}, \ldots, y_{t-k+1}$ and is used to identify the autoregressive order p. The next phase estimates the model parameters using the maximum likelihood estimation process outlined in the Box-Jenkins methodology [15]. Finally, diagnostic checks are performed to determine the adequacy of the model and to examine whether the residuals behave as a white-noise process. As seen in Figure 1, the Box-Jenkins methodology is an iterative process for obtaining the optimal model for forecasting future values.
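The identification step can be illustrated with a minimal Python sketch of the sample ACF and PACF. It is not the R code used in this study: the function names `sample_acf` and `sample_pacf` are hypothetical, the PACF is obtained with the Durbin-Levinson recursion (one common choice), and the input series is a placeholder rather than the LEB data.

```python
def sample_acf(series, max_lag):
    """Sample autocorrelations r_0, ..., r_max_lag of a numeric sequence."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    c0 = sum(d * d for d in dev)
    return [sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
            for k in range(max_lag + 1)]


def sample_pacf(series, max_lag):
    """Sample partial autocorrelations via the Durbin-Levinson recursion."""
    r = sample_acf(series, max_lag)
    pacf = [1.0]
    phi_prev = []  # coefficients of the AR model of order k - 1
    for k in range(1, max_lag + 1):
        num = r[k] - sum(phi_prev[j] * r[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(phi_prev[j] * r[j + 1] for j in range(k - 1))
        phi_kk = num / den
        # Update the order-k coefficients from the order-(k - 1) ones.
        phi_prev = [phi_prev[j] - phi_kk * phi_prev[k - 2 - j]
                    for j in range(k - 1)] + [phi_kk]
        pacf.append(phi_kk)
    return pacf


# Placeholder series for illustration only (not the LEB data).
example = [1, 2, 3, 4, 5]
print(sample_acf(example, 2))
print(sample_pacf(example, 2))
```

A PACF that cuts off after lag p, together with a slowly decaying ACF, points to an AR (p) structure; the mirror-image pattern points to an MA (q) structure.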

Stationarity
According to Milles [16], "the autoregressive-moving average (ARMA) class of models relies on the assumption that the underlying process is weakly stationary, which restricts the mean and variance to be constant and requires the autocovariances to depend only on the time lag." A nonstationary time series can achieve stationarity through mathematical transformations such as differencing. It can also be written as the sum of a deterministic trend, a random walk, and a stationary error, which is the decomposition underlying the KPSS test, as in equation (2):

$$y_t = \xi t + r_t + \varepsilon_t \qquad (2)$$

where $\xi t$ is the deterministic trend; $r_t$ is the random walk; $\varepsilon_t$ is the stationary error.

Differencing
The Box-Jenkins methodology assumes a stationary time series. A nonstationary time series can be transformed into a weakly stationary series via differencing, which refers to the change between consecutive observations in the original series. Equation (3) shows the first-order difference:

$$y'_t = y_t - y_{t-1} \qquad (3)$$

Usually, the data appear stationary after the first difference is taken; if not, the data may have to be differenced a second time. In practice, it is unnecessary to go beyond second-order differences, as given in equation (4):

$$y''_t = y'_t - y'_{t-1} = y_t - 2y_{t-1} + y_{t-2} \qquad (4)$$
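First- and second-order differencing can be sketched in a few lines of Python. The helper name `difference` is hypothetical, and the quadratic series is a toy example chosen because its second difference is constant; it is not the LEB series.

```python
def difference(series, order=1):
    """Apply first-order differencing `order` times."""
    out = list(series)
    for _ in range(order):
        out = [b - a for a, b in zip(out, out[1:])]
    return out


quadratic = [t * t for t in range(6)]  # 0, 1, 4, 9, 16, 25
print(difference(quadratic, 1))        # [1, 3, 5, 7, 9]: still trending
print(difference(quadratic, 2))        # [2, 2, 2, 2]: trend removed
```

Each round of differencing shortens the series by one observation, so a twice-differenced series of 53 annual values has 51 points.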

Goodness of fit
The goodness of fit for the ARIMA models is measured using the following criteria:

The Akaike information criterion (AIC).
The AIC compares the quality of a set of statistical models [17]. It is calculated as in equation (5):

$$\mathrm{AIC} = T \ln\left(\frac{\mathrm{SSE}}{T}\right) + 2(k + 2) \qquad (5)$$

where T is the number of observations used for estimation; SSE is the sum of squared residuals; k is the number of model predictors. However, the AIC can be unreliable when the sample size is small; in particular, when $T/k < 40$, the corrected Akaike information criterion (AICc) is required [18]. The formula for the AICc is given in equation (6):

$$\mathrm{AICc} = \mathrm{AIC} + \frac{2(k + 2)(k + 3)}{T - k - 3} \qquad (6)$$
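The two criteria can be computed as in the following Python sketch. It assumes the least-squares forms AIC = T ln(SSE/T) + 2(k + 2) and AICc = AIC + 2(k + 2)(k + 3)/(T - k - 3); R reports likelihood-based values, so the numbers are not comparable with Table 3. The function names and the SSE value are placeholders.

```python
import math


def aic(sse, n_obs, k):
    """AIC from the residual sum of squares (assumed least-squares form)."""
    return n_obs * math.log(sse / n_obs) + 2 * (k + 2)


def aicc(sse, n_obs, k):
    """Small-sample corrected AIC built on the same assumed form."""
    return aic(sse, n_obs, k) + 2 * (k + 2) * (k + 3) / (n_obs - k - 3)


# Hypothetical SSE for a model with k = 3 predictors and 53 observations.
print(aic(1.0, 53, 3), aicc(1.0, 53, 3))
```

The correction term grows as k approaches T, so the AICc penalizes additional parameters more heavily than the AIC in short series such as annual LEB data.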

Ljung-Box test.
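The Ljung-Box test checks for the absence of serial autocorrelation in the model residuals up to a specified lag. The test statistic is given in equation (7):

$$Q = T(T + 2) \sum_{k=1}^{h} \frac{\hat{\rho}_k^2}{T - k} \qquad (7)$$

where T is the number of residuals; $\hat{\rho}_k$ is the sample autocorrelation of the residuals at lag k; h is the number of lags tested.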
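A minimal Python sketch of the Ljung-Box statistic follows, assuming the standard form Q = T(T + 2) Σ ρ̂_k² / (T - k) summed over lags 1 to h. The function name `ljung_box_q` is hypothetical, and the alternating residuals are a deliberately autocorrelated toy example, not the residuals of the fitted model.

```python
def ljung_box_q(residuals, max_lag):
    """Ljung-Box Q statistic over lags 1..max_lag."""
    n = len(residuals)
    mean = sum(residuals) / n
    dev = [e - mean for e in residuals]
    c0 = sum(d * d for d in dev)
    total = 0.0
    for k in range(1, max_lag + 1):
        r_k = sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
        total += r_k * r_k / (n - k)
    return n * (n + 2) * total


# Toy residuals with strong negative lag-one autocorrelation.
toy = [1, -1, 1, -1, 1, -1, 1, -1]
q = ljung_box_q(toy, 1)
# 3.841 is the 5% critical value of the chi-square distribution with 1 df.
print(q, "reject white noise" if q > 3.841 else "no evidence against white noise")
```

Under the null hypothesis of no autocorrelation, Q is compared with a chi-square distribution; for residuals of a fitted ARIMA model, the degrees of freedom are usually reduced by the number of estimated AR and MA coefficients.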

Performance indicators.
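Forecast accuracy is measured using the following performance indicators: the root mean square error (RMSE), the mean absolute percentage error (MAPE), and the forecast accuracy (FA). The RMSE is calculated as in equation (8):

$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{t=1}^{n} (\hat{y}_t - y_t)^2} \qquad (8)$$

The MAPE is calculated as in equation (9):

$$\mathrm{MAPE} = \frac{100\%}{n} \sum_{t=1}^{n} \left| \frac{y_t - \hat{y}_t}{y_t} \right| \qquad (9)$$

where $\hat{y}_t$ represents the predicted value at time t; $y_t$ refers to the actual value at time t; n is the number of predicted time points. The closer the FA value is to 100%, the greater the accuracy of the forecast. The calculation formula is given in equation (10):

$$\mathrm{FA} = 100\% - \mathrm{MAPE} \qquad (10)$$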
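The performance indicators can be sketched in Python as follows. The RMSE and MAPE follow their standard definitions; taking FA as 100% minus the MAPE is an assumption made here for illustration. The function names and the input values are placeholders, not the values in Table 6.

```python
import math


def rmse(actual, predicted):
    """Root mean square error."""
    n = len(actual)
    return math.sqrt(sum((p - a) ** 2 for a, p in zip(actual, predicted)) / n)


def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    n = len(actual)
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / n


def forecast_accuracy(actual, predicted):
    """FA in percent, assumed here to be 100 minus the MAPE."""
    return 100.0 - mape(actual, predicted)


# Placeholder values for illustration only.
actual = [100.0, 200.0]
predicted = [110.0, 190.0]
print(rmse(actual, predicted))               # 10.0
print(mape(actual, predicted))               # about 7.5
print(forecast_accuracy(actual, predicted))  # about 92.5
```

The RMSE is expressed in the units of the series (years of life expectancy), whereas the MAPE and FA are scale-free percentages.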

Identification of the ARIMA model
The analysis was conducted using R (version 4.0.2, released on June 22, 2020). Saudi Arabia's LEB has increased over the years, an indication of the development of the country's healthcare system. In Figure 2, LEB from 1960 to 2012 shows a visible upward trend, and the series is nonstationary. Accordingly, first- and second-order differences are applied to detrend the series. To test stationarity formally, the KPSS test is also performed, as shown in Table 2.
The KPSS test results show that both the original series and the first-order differenced series are nonstationary at the 0.05 significance level. For the second-order differenced series, however, the p-value is 0.09. Because this p-value is greater than 0.05, we fail to reject the KPSS null hypothesis of stationarity and treat the second-order differenced series as stationary. Figure 3 shows the second-order differenced series, which has no apparent trend. Figure 4(a) shows that the ACF decays and dies down in a damped sine-wave fashion, while the PACF in Figure 4(b) cuts off after the second lag. The parameter selection rules imply that an AR (2) model might be appropriate [15]. An AR (2) model was fitted to the differenced series, and its parameters are significant, which suggests ARIMA (2,2,0) as the tentative model for Saudi Arabia's LEB data.
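The direction of the KPSS decision rule (the null hypothesis is stationarity, so a large statistic or small p-value indicates nonstationarity) can be illustrated with the following Python sketch. It assumes the level-stationarity form of the test with a Bartlett-weighted long-run variance and the published 5% critical value of 0.463; the settings used by the R routine in this study are not stated, so this is not a reproduction of Table 2. The function name and both series are hypothetical.

```python
def kpss_level_statistic(series, lags):
    """KPSS statistic for level stationarity with `lags` Bartlett lags."""
    n = len(series)
    mean = sum(series) / n
    resid = [x - mean for x in series]
    # Partial sums of the demeaned series.
    partial, running = [], 0.0
    for e in resid:
        running += e
        partial.append(running)
    # Long-run variance estimate with Bartlett weights.
    lrv = sum(e * e for e in resid) / n
    for j in range(1, lags + 1):
        gamma_j = sum(resid[t] * resid[t - j] for t in range(j, n)) / n
        lrv += 2.0 * (1.0 - j / (lags + 1.0)) * gamma_j
    return sum(s * s for s in partial) / (n * n * lrv)


CRITICAL_5_PERCENT = 0.463  # level-stationarity critical value at 5%

trending = list(range(50))     # toy series with a linear trend
stable = [1.0, -1.0] * 25      # toy series fluctuating around a fixed mean
for name, data, lags in (("trending", trending, 3), ("stable", stable, 0)):
    stat = kpss_level_statistic(data, lags)
    verdict = "reject stationarity" if stat > CRITICAL_5_PERCENT else "fail to reject stationarity"
    print(name, round(stat, 3), verdict)
```

A statistic below the critical value, or equivalently a p-value above 0.05, means that stationarity is not rejected, which is the outcome reported for the second-order differenced LEB series.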

Estimation
Based on the model identification in the previous section, ARIMA (2,2,0) was compared against alternative models. First, we overfitted the model by adding an AR parameter, that is, by fitting an ARIMA (3,2,0) model to the training data. We also underfitted the model by removing an AR parameter, which gives an ARIMA (1,2,0) model. Lastly, we introduced a moving-average parameter and fitted an ARIMA (2,2,1) model to evaluate whether the MA term produces better fitting results. To compare the quality of the models, the AIC and AICc were evaluated. Table 3 shows that the minimum values of the AIC and AICc are achieved when p = 3 and q = 0, which means that ARIMA (3,2,0) is the most suitable model.
Table 4 shows the results of fitting the ARIMA (3,2,0) model, namely the coefficients of the autoregressive terms. In R, no constant is permitted for d > 1, as a quadratic or higher-order trend is particularly dangerous when forecasting [19]. The AR (1) and AR (2) coefficients are statistically significant and have predictive power. However, the AR (3) term is nonsignificant and is thus omitted from the prediction equation (see equation (11)).
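To show how a prediction equation with two AR terms, two differences, and no constant turns into level forecasts, the following Python sketch applies the recursion w_t = φ1 w_{t-1} + φ2 w_{t-2} to the second differences w_t = y_t - 2y_{t-1} + y_{t-2} and then integrates back to the level. The function name is hypothetical, and the coefficients and input values are placeholders, not the estimates in Table 4.

```python
def forecast_two_ar_terms_d2(series, phi1, phi2, steps):
    """Point forecasts from w_t = phi1*w_{t-1} + phi2*w_{t-2}, where w is
    the second difference of the series; needs at least 4 observations."""
    history = list(series)
    for _ in range(steps):
        # Last two second differences of the (extended) series.
        w_last = history[-1] - 2 * history[-2] + history[-3]
        w_prev = history[-2] - 2 * history[-3] + history[-4]
        w_next = phi1 * w_last + phi2 * w_prev
        # Undo the double differencing: y_t = 2*y_{t-1} - y_{t-2} + w_t.
        history.append(2 * history[-1] - history[-2] + w_next)
    return history[len(series):]


# Placeholder inputs: a linear series has zero second differences,
# so the forecasts simply continue the straight line.
print(forecast_two_ar_terms_d2([1, 2, 3, 4], 0.5, 0.2, 3))
```

Because the model has no constant, the forecasts settle into a straight line whose slope is set by the most recent changes in the series, which is why the long-run LEB forecasts grow almost linearly.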

Diagnostic check
After estimating the model, we calculate the residuals of the fitted model to assess its performance and verify its reliability. According to Box and Jenkins [15], the diagnostic check involves testing the statistical properties of the error terms, namely the normality assumption and the weak white-noise assumption. In Figure 5(a) and (b), the histogram and density plot have the shape of a normal distribution with a mean of zero. The residuals line plot in (c) fluctuates around zero with a constant variance, suggesting that the residuals are white noise. In (d), the ACF lags are within the threshold limits, indicating no autocorrelation between the residual errors; overall, the residuals of the fitted model appear to be white noise. However, the plot shows a noticeable spike at lag one; thus, the Ljung-Box test is applied to check for serial autocorrelation and model fit. The results are shown in Table 5. The p-value of the Ljung-Box test is higher than the 0.05 level of significance, so the null hypothesis of no serial autocorrelation cannot be rejected. Therefore, the residual autocorrelations are not significantly different from zero, the first lag does not significantly affect model fitting, and the model does not show a significant lack of fit. Diagnostic checking has thus shown that the ARIMA (3,2,0) model meets the assumptions and is adequate for forecasting values for the following years.
The out-of-sample forecasts from 2013 to 2018 are shown in Table 6. According to the evaluation results, ARIMA (3,2,0) has relatively small RMSE and MAPE values, and the FA value is close to 99.9%, indicating accurate predictions. Therefore, the ARIMA (3,2,0) model is credible in predicting future LEB values. A possible reason for the ARIMA model's accuracy is that its parameterization is done by minimizing the AIC and AICc, which avoids overfitting by considering both goodness of fit and model complexity [20].
Figure 6 shows that the forecasts of the ARIMA (3,2,0) LEB model proposed in this paper (red line) maintain a small, stable error relative to the actual series in the first three years and accurately capture the direction of the future series. After the first three years, the forecasts drift steadily away from the actual values and appear to be biased downward. However, the differences between the forecast and actual values are small.

Forecasting
The ARIMA (3,2,0) model is finalized after testing the model's goodness of fit and verifying its accuracy.
The finalized model is used to predict LEB up to 2030 using all the data, with 95% prediction intervals (see Table 8). The LEB forecasts increase steadily, with an average gain of 0.26 per year. The results are similar to those of the study conducted in 2018, which reported an annual gain in life expectancy of 0.25 [11]. In 2030, LEB is predicted to reach 78 years, with a 95% prediction interval from 71.79 to 84.46, a rise of 3.13 years from 2018. In Figure 7, forecasts from 2019 to 2030 are plotted with 95% prediction intervals.
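The arithmetic behind the reported average gain can be checked with a short Python sketch that uses only figures quoted in this paper: the 2018 LEB of 75 years, the forecast rise of 3.13 years by 2030, and the Vision 2030 target of 80 years. The variable names are illustrative.

```python
leb_2018 = 75.0        # LEB in 2018, as reported in this paper
forecast_rise = 3.13   # forecast rise in LEB from 2018 to 2030
target_2030 = 80.0     # Saudi Vision 2030 target
years = 2030 - 2018    # forecast horizon in years

average_gain = forecast_rise / years
required_gain = (target_2030 - leb_2018) / years

print(round(average_gain, 2))   # 0.26
print(round(required_gain, 2))  # 0.42
```

Under these figures, the forecast average gain of about 0.26 years per year can be compared with the gain of about 0.42 years per year that a linear path from 75 to 80 years would require.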

Summary
Life expectancy has always been an important indicator of population health. Life expectancy forecasts for individuals in Saudi Arabia can support health program development in the kingdom and offer a scientific framework through which related policies can be assessed. Accurate predictions will enable decision-makers to implement effective development policies to sustain the overall rise in LEB and to direct those policies toward the Vision 2030 target of 80 years of life expectancy.
Using the R language, we analyzed the life expectancy time series from 1960 to 2012 following the Box-Jenkins methodology. As a result, the ARIMA (3,2,0) model was constructed, and data from 2013 to 2018 were used for validation. The ARIMA (3,2,0) model passes all diagnostic tests and has the lowest AIC and AICc values compared with the other ARIMA models, with an MSE of 0.1312, an RMSE of 0.1431, and an FA of 99.9%. Consequently, the prediction performance of the model fitted in this paper is found to be sufficient and the forecasting results reliable.
We then forecast LEB in Saudi Arabia from 2019 to 2030; it continues to increase over the years and is expected to reach 78 years by 2030. Our findings are consistent with a recent study, which reported an annual rise of 0.25 in life expectancy in Saudi Arabia, reaching 78 years by 2030 [11].
Finally, this study is intended to offer a theoretical reference for assessing and adjusting related policies in the kingdom.