FORECASTING CONFIRMED AND RECOVERED COVID-19 CASES AND DEATHS IN EGYPT AFTER THE GENETIC MUTATION OF THE VIRUS: ARIMA BOX-JENKINS APPROACH

MOHAMED R. ABONAZEL, NESMA M. DARWISH

Coronavirus disease 2019 (COVID-19) is spreading all over the world, and it is a real test for health authorities everywhere, including Egypt. After the outbreaks of severe acute respiratory syndrome (SARS, 2002/2003) and Middle East respiratory syndrome (MERS, 2012/2014), the world faces a new public health crisis, the new coronavirus disease (COVID-19). The epidemic has affected practically every country. As a result, it has become critical to understand disease trends in order to limit the consequences. The aim of this study is to use suitable statistical prediction models to help forecast and control this global pandemic threat, especially after the genetic mutation of the virus in 2021. For this purpose, the Autoregressive Integrated Moving Average (ARIMA) model based on the Box-Jenkins approach was used to predict the confirmed cases, recovered cases, and deaths of COVID-19 in Egypt. The most recent available data were used to determine the best prediction models for daily cases and deaths in Egypt and to forecast them up to April 2021. The daily confirmed, recovered, and death cases were collected from the official reports of the Ministry of Health. The four stages of the Box-Jenkins approach were conducted to obtain an appropriate ARIMA model for each of the three series. According to the goodness-of-fit measures, the best model is ARIMA (1, 1, 1) for confirmed cases, ARIMA (1, 0, 1) for recovered cases, and ARIMA (1, 0, 0) for deaths. The results indicate that ARIMA models with appropriately selected parameters are useful tools for monitoring and predicting trends of COVID-19 cases in Egypt, and that the estimated models have a high ability to predict the numbers of confirmed cases, recovered cases, and deaths. Moreover, we used these models to forecast the numbers of confirmed cases, recovered cases, and deaths for the next twenty days. The results will enable us to provide suitable advice to support decision-making in Egypt on how to avoid the negative effects of this epidemic.


INTRODUCTION
On 11 March 2020, the World Health Organization (WHO) declared COVID-19 a pandemic. Since it was first reported in Wuhan, China, on 31 December 2019, this pathogen has spread rapidly and had infected more than 4 million people globally, with over 300,000 deaths, as of mid-May 2020. The virus is highly contagious, as reflected in its basic reproductive number. The outbreak, which started on 12 December 2019, continues to spread worldwide and to cause fatalities. Since there is currently no approved treatment for COVID-19, prevention and preparation in healthcare services are crucial. Egypt is a country in Africa, located in the northeast corner of the continent, with a population of around 100 million people and a land area of 1,010,408 km². It is bordered on the north by the Mediterranean Sea, on the east by the Red Sea, on the west by Libya, and on the south by Sudan.
The Egyptian Constitution declares health to be a fundamental human right. On January 26, 2020, Egypt banned all flights from China to Egypt. The first confirmed case in Egypt, a Chinese individual, was officially declared on February 14, 2020. The patient was taken to a quarantine facility.

The Egyptian cabinet formally denied any confirmed Egyptian cases on February 28, 2020. The second COVID-19 case was formally announced on March 1, 2020. The first death of an Egyptian from COVID-19 was reported on March 20, 2020, twenty days after the first COVID-19 death in Egypt (of a German national). From 7 March through 21 March, all schools, universities, and mosques were closed. The Egyptian government instituted various kinds of lockdowns in order to mitigate the epidemic's impact. At the time, it was unclear whether the outbreak would decline in the spring or summer or become the world's greatest pandemic. Most Egyptian hospitals were prepared to receive COVID-19 cases and to overcome shortages of medical supplies and staff.
The importance of projecting the pandemic's likely trajectory is underscored by the fact that, as far as we know, studies predicting cases and deaths in Egypt are still conservative and have not been revised to reflect the most recent situation. Zaki et al. [1] reported that a coronavirus was isolated from the sputum of a 60-year-old man who presented with acute pneumonia and subsequent renal failure with a fatal outcome in the Kingdom of Saudi Arabia (KSA).
Using statistical and artificial intelligence methods such as Autoregressive Integrated Moving Average (ARIMA) and Nonlinear Autoregressive Artificial Neural Networks, Saba and Elsheikh [2] predicted the likely number of cases in Egypt. The study used reported cases from March 1, 2020, to May 9, 2020, to forecast one month ahead, until June 8, 2020. A 280 percent increase in cases was predicted for May 2020.
El-Ghitany [3] used data from February 14 to April 18, 2020 to make a short-term projection for the pandemic scenario in Egypt. From April 19 to June 6, the researcher used an exponential growth rate model to forecast the daily cases. According to the findings, infections are likely to reach more than 20,000 by late May, after which they will begin to fall.
Elmousalami and Hassanien [4] used different time series models, such as moving average (MA), weighted moving average, and single exponential smoothing, to offer daily forecasting models of COVID-19 cases. According to the findings, the number of confirmed cases in Egypt would increase fourfold in April; that is, they claim that the number of coronavirus cases in Egypt is increasing dramatically. Anwar and AbdelHafez [5] also predicted the expected timing of the coronavirus peak in Egypt. They used the Susceptible, Exposed, Infectious, and Removed model with the epidemic online calculator tool. The daily reports for the period 14 February to 11 May 2020 were used in their research. According to the findings, the number of hospitalized cases was expected to peak at 20,126 in mid-June, and a total of 12,303 deaths were predicted. The research also claimed that the quarantine restrictions should be kept until the end of June 2020.
Nosier and Salah [6] presented a three-part predictive study of COVID-19 infection and mortality in Egypt. First, they sought the best prediction models for daily cases and deaths in Egypt and forecast them up to November 7, 2020, using the most recent available data. Second, they used Google Community Mobility Reports (GCMR) to evaluate the results of easing lockdown restrictions and to investigate the impact of mobility on pandemic incidence. Finally, they provided some recommendations that may help lessen the spread of the virus and avoid new deaths as far as possible.
Recently, several papers have used ARIMA models to study the behavior of the COVID-19 pandemic in different countries, such as [10,11,12,13].

Data
Data for this study cover the period from 1 February 2021 to 31 March 2021. The Egyptian daily COVID-19 confirmed, recovered, and death cases were sourced from Egyptian Ministry of Health and Population reports. This study applied the ARIMA model to forecast the Egyptian daily COVID-19 confirmed, recovered, and death cases. Minitab statistical software version 16 was used to conduct the analyses. The appropriate ARIMA model for each series was identified and then used to forecast the confirmed cases, recovered cases, and deaths for the next twenty days.

Time Series Models
Time series analysis has been used in a variety of medical, engineering, and economic disciplines. Monitoring the responses of a phenomenon through time and anticipating future responses is a significant subject that can aid decision-makers in developing future policies and plans to address various issues that humans confront. In the literature, both statistical and artificial intelligence-based approaches for forecasting time series problems have been documented. Statistical methods like ARIMA are commonly used to assess this type of data and estimate future responses as a function of time. The ARIMA model combines three processes: the Autoregressive (AR) process, the Integration (I) process (by taking differences), and the Moving-Average (MA) process. These processes are regarded as the primary univariate time series models in the statistical literature, and they are widely employed in a variety of applications.

1-Autoregressive (AR) Model
Abraham and Ledolter [14] defined an AR model as one that forecasts future behavior based on past behavior. It is used for forecasting when there is some correlation between values in a time series and the values that precede and succeed them. Only past data are used to model the behavior, hence the name autoregressive (the Greek prefix auto- means "self"). The process is basically a linear regression of the current value of the series against one or more past values of the same series. The AR(p) model is defined as:

$$y_t = c + \phi_1 y_{t-1} + \phi_2 y_{t-2} + \dots + \phi_p y_{t-p} + \varepsilon_t \qquad (1)$$

where $c$ is a constant, $\phi_1, \dots, \phi_p$ are the autoregressive parameters, and $\varepsilon_t$ is white noise, a sequence of independently and identically distributed (iid) random variables with $E(\varepsilon_t) = 0$ and $Var(\varepsilon_t) = \sigma^2$; i.e., $\varepsilon_t \sim iid(0, \sigma^2)$.
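To make the "regression on its own past" idea concrete, the following minimal sketch simulates an AR(1) series, y_t = c + phi * y_{t-1} + eps_t, and recovers c and phi by ordinary least squares. The function names and parameter values are illustrative assumptions, and least squares is used here only for simplicity; it is not the MLE procedure applied in this study.

```python
import random

def simulate_ar1(n, c, phi, sigma, seed=0):
    """Simulate y_t = c + phi * y_{t-1} + eps_t with Gaussian white noise."""
    rng = random.Random(seed)
    series = [c / (1.0 - phi)]  # start at the stationary mean (needs |phi| < 1)
    for _ in range(n - 1):
        series.append(c + phi * series[-1] + rng.gauss(0.0, sigma))
    return series

def fit_ar1_ols(series):
    """Regress y_t on y_{t-1}; return the estimates (c_hat, phi_hat)."""
    lagged, current = series[:-1], series[1:]
    n = len(lagged)
    mean_x = sum(lagged) / n
    mean_y = sum(current) / n
    sxx = sum((x - mean_x) ** 2 for x in lagged)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(lagged, current))
    phi_hat = sxy / sxx
    c_hat = mean_y - phi_hat * mean_x
    return c_hat, phi_hat

# Hypothetical parameter values, chosen only for illustration.
series = simulate_ar1(n=2000, c=5.0, phi=0.6, sigma=1.0)
c_hat, phi_hat = fit_ar1_ols(series)
print(f"c_hat = {c_hat:.3f}, phi_hat = {phi_hat:.3f}")
```

With a long simulated series, the estimates should land close to the values used in the simulation, which is the sense in which the AR model is a linear regression on lagged values.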

2-Moving-Average (MA) model
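Abraham and Ledolter [14] described that, rather than using past values of the forecast variable in a regression, a moving average (MA) model uses past forecast errors in a regression-like model. The notation MA(q) refers to the moving average model of order q:

$$y_t = c + \varepsilon_t + \theta_1 \varepsilon_{t-1} + \theta_2 \varepsilon_{t-2} + \dots + \theta_q \varepsilon_{t-q} \qquad (2)$$

where $c$ is a constant, $\theta_1, \dots, \theta_q$ are the moving-average parameters, and $\varepsilon_t$ is white noise.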
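As an illustration of how an MA process behaves, the sketch below simulates an MA(1) series, y_t = c + eps_t + theta * eps_{t-1}, and computes its sample autocorrelations. For such a process the theoretical autocorrelation is theta / (1 + theta^2) at lag 1 and zero at higher lags. The function names and parameter values are assumptions made for this example only.

```python
import random

def simulate_ma1(n, c, theta, sigma, seed=1):
    """Simulate y_t = c + eps_t + theta * eps_{t-1} with Gaussian white noise."""
    rng = random.Random(seed)
    eps = [rng.gauss(0.0, sigma) for _ in range(n + 1)]
    return [c + eps[t] + theta * eps[t - 1] for t in range(1, n + 1)]

def sample_autocorrelation(series, lag):
    """Sample autocorrelation of the series at the given positive lag."""
    n = len(series)
    avg = sum(series) / n
    denom = sum((v - avg) ** 2 for v in series)
    num = sum((series[t] - avg) * (series[t - lag] - avg) for t in range(lag, n))
    return num / denom

# Hypothetical parameter values, chosen only for illustration.
theta = 0.5
ma_series = simulate_ma1(n=5000, c=10.0, theta=theta, sigma=1.0)
r1 = sample_autocorrelation(ma_series, 1)
r2 = sample_autocorrelation(ma_series, 2)
theoretical_r1 = theta / (1.0 + theta ** 2)
print(f"lag 1: sample {r1:.3f}, theoretical {theoretical_r1:.3f}")
print(f"lag 2: sample {r2:.3f}, theoretical 0.000")
```

The cut-off of the autocorrelation after lag q is the pattern that is typically looked for in the ACF plot when an MA(q) term is being considered.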

3-Autoregressive moving-average model
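An Autoregressive Moving-Average (ARMA) model provides a parsimonious description of a (weakly) stationary stochastic process in terms of two polynomials, one for the autoregression and the second for the moving average [14]. We can write an ARMA(p, q) model as a mixture of AR(p) and MA(q) models:

$$y_t = c + \phi_1 y_{t-1} + \dots + \phi_p y_{t-p} + \varepsilon_t + \theta_1 \varepsilon_{t-1} + \dots + \theta_q \varepsilon_{t-q} \qquad (3)$$

where $c$ is a constant, $\phi_1, \dots, \phi_p$ and $\theta_1, \dots, \theta_q$ are the autoregressive and moving-average parameters, and $\varepsilon_t$ is white noise.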

4-ARIMA Model
ARIMA, which stands for "Autoregressive Integrated Moving Average," is a class of models that "explains" a given time series based on its own previous values, i.e., its own lags and lagged prediction errors, so that the resulting equation can be used to forecast future values. ARIMA models can be used to model any non-seasonal time series that is not random white noise. An ARIMA model is one where the time series is differenced at least once to make it stationary and the AR and MA terms are combined [14]. For example, if $y_t$ is a non-stationary series, we take the first difference $\Delta y_t = y_t - y_{t-1}$ so that $\Delta y_t$ becomes stationary; then the ARIMA (p, 1, q) model is:

$$\Delta y_t = c + \phi_1 \Delta y_{t-1} + \dots + \phi_p \Delta y_{t-p} + \varepsilon_t + \theta_1 \varepsilon_{t-1} + \dots + \theta_q \varepsilon_{t-q} \qquad (4)$$

where $c$ is a constant, $\phi_i$ and $\theta_j$ are the autoregressive and moving-average parameters, and $\varepsilon_t$ is white noise. If p = q = 0 in (4), then the model becomes a random walk model, which is classified as ARIMA (0, 1, 0).
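The "integrated" part of ARIMA can be sketched in a few lines: the series is differenced before modeling, and any forecasts made on the differenced scale are cumulated back to the original scale. The helper names and the numbers below are hypothetical and serve only to illustrate the mechanics for d = 1.

```python
def difference(series):
    """First difference: returns [y_1 - y_0, y_2 - y_1, ...]."""
    return [series[t] - series[t - 1] for t in range(1, len(series))]

def undifference(first_value, diffs):
    """Rebuild the levels from a starting value and a list of differences."""
    levels = [first_value]
    for d in diffs:
        levels.append(levels[-1] + d)
    return levels

# Hypothetical daily counts, for illustration only.
daily_counts = [120, 135, 150, 140, 165]
diffs = difference(daily_counts)            # [15, 15, -10, 25]
rebuilt = undifference(daily_counts[0], diffs)

# Forecasts produced on the differenced scale (assumed values) are
# converted to levels by cumulating them from the last observed value.
forecast_diffs = [10, 12]
forecast_levels = undifference(daily_counts[-1], forecast_diffs)[1:]  # [175, 187]
print(diffs, rebuilt, forecast_levels)
```

This is why forecast limits for a differenced model widen with the horizon: each forecasted level accumulates the uncertainty of all the differences before it.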

Box-Jenkins Approach
The Box-Jenkins [15] model is a mathematical model designed to forecast data ranges based on inputs from a specified time series. It can analyze several different types of time series data for forecasting purposes [16]. Its methodology uses differences between data points to determine outcomes, and it identifies trends using autoregression, moving averages, and seasonal differencing to generate forecasts. Figure 1 shows the four iterative stages of modeling according to this approach [17]:
Step 1 (Model identification): determining the order of the model required (p, d, and q) in order to capture the salient dynamic features of the data. This usually involves graphical approaches (plotting the series, the autocorrelation function (ACF), the partial autocorrelation function (PACF), etc.).
Step 2 (Model estimation and selection): estimating the parameters of the candidate models identified in Step 1 and then making a first model selection (using information criteria). The most common estimation methods are Maximum Likelihood Estimation (MLE) and non-linear least-squares estimation.
Step 3 (Model checking): running diagnostic tests on the residuals of the estimated model (here, checking their stationarity and normality) to decide whether the model is adequate.
Step 4 (Forecasting): when the selected ARIMA model conforms to the specifications of a stationary univariate process, we can use this model for forecasting.
Figure 1: Stages in the Box-Jenkins iterative approach [17]

RESULTS

Estimated models
Time series analysis was carried out for the numbers of confirmed cases, recovered cases, and deaths in Egypt. When the time series graphs are examined, a trend is seen; Figure 2 shows the series after taking first differences. The ACF and PACF graphs were used to see this more clearly and to determine whether each series is stationary. In the ACF graph, it is seen that the series of confirmed cases is not stationary, since many lags exceed the confidence limits. In this case, the first-order difference was applied to the series of confirmed cases to make it stationary. Figure 2 and the ACF and PACF graphs (in Figures 3, 4, and 5) confirm that the confirmed cases series is stationary after first differencing. The ACF and PACF graphs were also used to determine the order of the ARIMA model. The best ARIMA models for the three series are shown in Table 1: the ARIMA (1, 1, 1) model for confirmed cases, the ARIMA (1, 0, 1) model for recovered cases, and the ARIMA (1, 0, 0) model for deaths. These models were compared with different ARIMA models to select the best model for the data using goodness-of-fit measures (MSE, MAD, MAPE), as in Table 1.
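The ACF-based stationarity check described above can be sketched as follows: compute the sample autocorrelations, compare them with approximate 95% limits of ±1.96/√n, and repeat after first differencing. The data here are simulated (a random walk with drift, 60 points), not the Egyptian series, and all names and parameter values are assumptions for illustration.

```python
import math
import random

def sample_acf(series, max_lag):
    """Sample autocorrelations for lags 1..max_lag."""
    n = len(series)
    avg = sum(series) / n
    denom = sum((v - avg) ** 2 for v in series)
    return [
        sum((series[t] - avg) * (series[t - k] - avg) for t in range(k, n)) / denom
        for k in range(1, max_lag + 1)
    ]

def count_significant_lags(series, max_lag=10):
    """Count lags whose autocorrelation falls outside +/- 1.96 / sqrt(n)."""
    bound = 1.96 / math.sqrt(len(series))
    return sum(1 for r in sample_acf(series, max_lag) if abs(r) > bound)

# Simulated non-stationary series: random walk with drift (hypothetical values).
rng = random.Random(42)
levels = [1000.0]
for _ in range(59):
    levels.append(levels[-1] + 5.0 + rng.gauss(0.0, 2.0))

first_diffs = [levels[t] - levels[t - 1] for t in range(1, len(levels))]

lags_before = count_significant_lags(levels)
lags_after = count_significant_lags(first_diffs)
print(f"significant lags before differencing: {lags_before} of 10")
print(f"significant lags after differencing:  {lags_after} of 10")
```

A slowly decaying ACF with many lags outside the limits points to non-stationarity, while an ACF that stays mostly inside the limits after differencing is consistent with d = 1.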

1-Confirmed Cases Model
The appropriate model for the confirmed cases is ARIMA (1, 1, 1). The ARIMA (1, 1, 1) model was estimated by MLE, and the results are presented in Table 2. The coefficient estimates of AR (1) and MA (1) are statistically significant at the 1% level of significance, and the model overall is statistically significant at the 5% level of significance. According to the Box-Jenkins approach, the diagnostic tests of the model check the normality and the stationarity of the residuals. Figure 6 shows that the residuals are stationary. Also, Figure 7 shows that the residuals are normally distributed, as is evident from the histogram and the Anderson-Darling normality test: the p-value for the confirmed cases series (0.074) is greater than 0.05, so normality is not rejected at the 5% significance level. According to the results in Table 1, the best model is ARIMA (1, 1, 1).
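The three goodness-of-fit measures behind the Table 1 comparison can be sketched as below, assuming the usual definitions: MSE as the mean squared error, MAD as the mean absolute deviation of the errors, and MAPE as the mean absolute percentage error. The function name and the numbers are hypothetical and are not taken from the study's data.

```python
def fit_measures(actual, fitted):
    """Return (MSE, MAD, MAPE in percent) for paired actual and fitted values.

    MAPE divides by the actual values, so it is undefined if any actual is zero.
    """
    errors = [a - f for a, f in zip(actual, fitted)]
    n = len(errors)
    mse = sum(e ** 2 for e in errors) / n
    mad = sum(abs(e) for e in errors) / n
    mape = 100.0 * sum(abs(e) / abs(a) for e, a in zip(errors, actual)) / n
    return mse, mad, mape

# Hypothetical observed and fitted values, for illustration only.
actual = [100, 200, 300, 400]
fitted = [110, 190, 330, 380]
mse, mad, mape = fit_measures(actual, fitted)
print(f"MSE = {mse:.1f}, MAD = {mad:.1f}, MAPE = {mape:.1f}%")
```

When candidate ARIMA orders are compared, the model with the smallest values of these measures is preferred.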

2-Recovered Cases Model
The appropriate model for the recovered cases is ARIMA (1, 0, 1); see Table 1. The ARIMA (1, 0, 1) model was estimated by MLE, and the results are presented in Table 3. The coefficient estimates of AR (1) and MA (1) are statistically significant at the 1% level of significance, and the model overall is statistically significant at the 1% level of significance. Figure 8 shows that the residuals are stationary. Also, Figure 9 shows that the residuals are normally distributed, as is evident from the histogram and the Anderson-Darling normality test: the p-value for the recovered cases series (0.084) is greater than 0.05, so normality is not rejected at the 5% significance level.

3-Death Cases Model
The appropriate model for the death cases is ARIMA (1, 0, 0); see Table 1. The ARIMA (1, 0, 0) model was estimated by MLE, and the results are presented in Table 4. The coefficient estimate of AR (1) is statistically significant at the 1% level of significance, and the model overall is statistically significant at the 1% level of significance. Figure 10 shows that the residuals are stationary. Also, Figure 11 shows that the residuals are normally distributed, as is evident from the histogram and the Anderson-Darling normality test: the p-value for the death cases series (0.657) is greater than 0.05, so normality is not rejected at the 5% significance level.
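The residual normality check used for the three models can be sketched as follows. This is a hand-written approximation of the Anderson-Darling test for normality with estimated mean and variance; the small-sample adjustment and the p-value formulas follow the commonly quoted D'Agostino and Stephens approximations, which is an assumption here, since the exact procedure in the software used for this study may differ. The two test samples are artificial.

```python
import math
from statistics import NormalDist, mean, stdev

def anderson_darling_normal(sample):
    """Return (adjusted A^2 statistic, approximate p-value) for normality."""
    n = len(sample)
    mu, sd = mean(sample), stdev(sample)
    z = sorted((x - mu) / sd for x in sample)
    cdf = [NormalDist().cdf(v) for v in z]
    total = sum(
        (2 * i + 1) * (math.log(cdf[i]) + math.log(1.0 - cdf[n - 1 - i]))
        for i in range(n)
    )
    a2 = -n - total / n
    a2_adj = a2 * (1.0 + 0.75 / n + 2.25 / n ** 2)  # small-sample adjustment
    # Piecewise p-value approximation (assumed formulas, see the lead-in).
    if a2_adj >= 0.6:
        p = math.exp(1.2937 - 5.709 * a2_adj + 0.0186 * a2_adj ** 2)
    elif a2_adj > 0.34:
        p = math.exp(0.9177 - 4.279 * a2_adj - 1.38 * a2_adj ** 2)
    elif a2_adj > 0.2:
        p = 1.0 - math.exp(-8.318 + 42.796 * a2_adj - 59.938 * a2_adj ** 2)
    else:
        p = 1.0 - math.exp(-13.436 + 101.14 * a2_adj - 223.73 * a2_adj ** 2)
    return a2_adj, p

# Artificial samples: evenly spaced normal quantiles and exponential quantiles.
n_points = 60
normal_like = [NormalDist(50, 10).inv_cdf((i + 0.5) / n_points) for i in range(n_points)]
skewed = [-math.log(1.0 - (i + 0.5) / n_points) for i in range(n_points)]

stat_normal, p_normal = anderson_darling_normal(normal_like)
stat_skewed, p_skewed = anderson_darling_normal(skewed)
print(f"normal-like sample: A2* = {stat_normal:.3f}, p = {p_normal:.3f}")
print(f"skewed sample:      A2* = {stat_skewed:.3f}, p = {p_skewed:.6f}")
```

A p-value above 0.05, as reported for the residuals of all three models, means that normality is not rejected at the 5% level; it does not prove that the residuals are exactly normal.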

Forecasting
The forecasting results for the numbers of confirmed cases, recovered cases, and deaths in Egypt are given in Tables 5, 6, and 7. Since the ARIMA (1, 1, 1) model fits the number of confirmed cases, we used it to forecast the confirmed cases with 95% confidence limits for the next twenty days; the forecasted values are given in Table 5. Similarly, the ARIMA (1, 0, 1) model fits the number of recovered cases, so we used it to forecast the recovered cases for the next twenty days; the forecasted values are given in Table 6. Finally, the ARIMA (1, 0, 0) model fits the death cases, so we used it to forecast the deaths for the next twenty days; the forecasted values are given in Table 7. Figures 12, 13, and 14 present the trend of the actual and the forecasted confirmed, recovered, and death values with their 95% confidence limits.
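To show how point forecasts and 95% limits arise from a fitted model, the sketch below works through the simplest case, an ARIMA (1, 0, 0) model y_t = c + phi * y_{t-1} + eps_t. The h-step forecast is obtained by iterating the equation, and the forecast error variance is sigma^2 * (1 + phi^2 + ... + phi^(2(h-1))). The parameter values are hypothetical and are not the estimates of this study; the limits also treat the parameters as known and the errors as Gaussian.

```python
import math

def ar1_forecast(last_value, c, phi, sigma, horizon):
    """Return a list of (h, point forecast, lower 95% limit, upper 95% limit)."""
    results = []
    point = last_value
    variance = 0.0
    for h in range(1, horizon + 1):
        point = c + phi * point                        # iterate the AR(1) equation
        variance += sigma ** 2 * phi ** (2 * (h - 1))  # accumulate error variance
        half_width = 1.96 * math.sqrt(variance)
        results.append((h, point, point - half_width, point + half_width))
    return results

# Hypothetical parameters and last observation, for illustration only.
forecasts = ar1_forecast(last_value=30.0, c=10.0, phi=0.5, sigma=2.0, horizon=20)
for h, point, lower, upper in forecasts[:3]:
    print(f"h={h}: forecast {point:.2f}, 95% limits ({lower:.2f}, {upper:.2f})")
```

As the horizon grows, the point forecast of a stationary AR(1) model approaches the long-run mean c / (1 - phi), and the width of the limits levels off rather than growing without bound.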

DISCUSSION
The aim of this study is to propose three ARIMA models that help in forecasting and controlling COVID-19 cases in Egypt, especially after the genetic mutation of the virus in 2021. Based on the recent data from the official Ministry of Health up to April 2021, we built three ARIMA models for COVID-19 confirmed, recovered, and death cases. The Box-Jenkins approach was used to obtain more efficient ARIMA models for the data. The resulting models, ARIMA (1, 1, 1) for confirmed cases, ARIMA (1, 0, 1) for recovered cases, and ARIMA (1, 0, 0) for deaths, were used to forecast the three series for the next twenty days. The forecasted values indicate that the confirmed cases and deaths would increase, while the recovered cases would decrease. We can therefore say that the ARIMA model is a good statistical model for predicting COVID-19 cases in Egypt, even after the virus has mutated. Our findings will enable us to provide appropriate advice to assist decision-making in Egypt on how to avoid the negative effects of this pandemic.

CONCLUSION
The aim of the study was to model and forecast the numbers of COVID-19 confirmed cases, recovered cases, and deaths using the Box-Jenkins approach, based on data from 1 February 2021 to 31 March 2021. The four stages of the Box-Jenkins approach were conducted to obtain an appropriate ARIMA model for each of the three series in Egypt, and we used these models to forecast the numbers of confirmed cases, recovered cases, and deaths for the next twenty days. Time series plots were used to examine the stationarity of the data, and MLE was used to estimate the models. Using the goodness-of-fit measures (MSE, MAD, and MAPE), ARIMA models with different orders of autoregressive and moving-average terms were compared. According to these measures, the best model is ARIMA (1, 1, 1) for confirmed cases, ARIMA (1, 0, 1) for recovered cases, and ARIMA (1, 0, 0) for deaths.
In future work, we can use one of the multivariate time series models to predict COVID-19 cases in Egypt, such as an autoregressive distributed lag model [18,19,20] or a vector autoregressive model [21].

CONFLICT OF INTERESTS
The author(s) declare that there is no conflict of interests.