Optimal models for estimating future infected cases of COVID-19 in Oman

The recent coronavirus disease 2019 (COVID-19) outbreak is an important research topic because of its rapid spread and high infection rate across the world. In this paper, we test certain optimal models for forecasting daily new cases of COVID-19 in Oman. The approach is based on solving a nonlinear least-squares optimization problem that determines the unknown parameters of the fitted mathematical models. We also consider extensions of these models to predict the future number of infected cases in Oman. The modification introduces a simple ratio for the rate of change in the daily infected cases. This average ratio is computed by employing the rule of Al-Baali [Numerical experience with a class of self-scaling quasi-Newton algorithms, JOTA, 96 (1998), pp. 533–553], in a sense to be defined, for measuring the infection changes.


Introduction
The coronavirus disease 2019 (COVID-19) is an infectious disease that began in Wuhan, in Hubei Province, China, on December 31, 2019, and rapidly spread throughout the city. After January 31, 2020, the World Health Organization (WHO) declared the outbreak of COVID-19 to be a pandemic once it had spread globally [3].
Many researchers have studied the outbreak of COVID-19 (see, e.g., Almeshal et al. [3], Ayoub et al. [4], Batista [5], and the references therein) to predict the number of infections that will occur over time globally and when the spread of the virus will end. Batista [5] investigated the logistic model and the classic susceptible-infected-recovered dynamic model to predict the daily cases and used the iterated Shanks method to obtain the final numbers of the coronavirus epidemic. Based on the confirmed data on COVID-19 in Kuwait, Almeshal et al. [3] estimated the size of the COVID-19 spread and determined its end phase. Xavier [15] fitted the daily total cases of COVID-19 to the logistic model in the least-squares sense using the steepest descent method. Garbey et al. [8] used a computational model to construct a hospital workflow during a pandemic and to assist the management of a health system facing a new crisis.
Oman, with a total population of less than five million [13], also faced the new epidemic of COVID-19. Its first two cases, citizens returning from Iran, tested positive for COVID-19 on February 24, 2020 [11] and were placed under domestic quarantine. By the end of March, Oman had entered the community transmission stage of COVID-19 and closed its border. The first death in Oman was recorded in April 2020, and lockdown measures and social distancing rules were initially imposed in the capital city of Muscat from April 10 to 22. By May 2, the COVID-19 outbreak had spread to all of Oman, and the lockdown measures were extended until the end of May. More than 1000 daily new cases were recorded throughout July, and a second lockdown and night-time curfew were then imposed until August 8. In Table 5 of the appendix, we list the numbers of daily reported cases, recovered cases, deaths, and accumulated cases of COVID-19 in Oman during the period from April 1 to July 29, 2020. We observe that on July 29, the total numbers of COVID-19 cases, recoveries, and deaths were 78569, 60240, and 412, respectively. These confirmed data, which were taken from the websites in [11] and [14], will be used to define some parameters in the optimal models to estimate the number of future cases.
In this paper, we use the infected data $(t_i, y_i)$, $i = 1, 2, \ldots, 120$, which are stated in the first and second columns of Table 5, to illustrate our analysis, which can be extended to other data. To provide an idea of the daily infected cases in Oman, we plot the given data (the daily reported infected cases) and join each two consecutive points by a straight line, as depicted in the left panel of Fig. 1. The histogram in the right panel of Fig. 1 shows the cumulative cases over the four-month period.
In our analysis, we predict the daily new cases of the COVID-19 epidemic in Oman based on mathematical models, nonlinear least-squares problems, and average ratio measures. In Section 2, we derive a nonlinear least-squares problem for a certain type of model that depends on the available data. Using the infected cases in Oman, given in Table 5, we define the parameters that appear in the proposed models by solving the corresponding least-squares problems. Section 3 proposes the average ratio measure and uses it to modify some models for forecasting the future infected cases. Section 4 concludes the paper, and the Appendix provides the data on COVID-19 that were recorded in Oman.

Data Fitting Models and Least-Squares Problem
In this section, we consider the following form of a mathematical model:
$$y = \varphi(x, t), \tag{2.1}$$
where $x = (x_1, \ldots, x_n)^T$ is a parameter vector of length $n$, defined by the user and based upon the model applications (here, $2 \le n \le 4$). It is assumed that the function $\varphi$ is continuous and differentiable with respect to both $x$ and $t$.
A value for $x$ is to be found, in a certain sense, using given available data $(t_i, y_i)$, $i = 1, 2, \ldots, l$ ($l \ge n$) (as in Table 5, for instance) such that the following residuals, or the Euclidean norm $\|r(x)\|$, where $r(x) = (r_1, r_2, \ldots, r_l)^T$, are sufficiently close to zero:
$$r_i(x) = \varphi(x, t_i) - y_i, \quad i = 1, 2, \ldots, l. \tag{2.2}$$
Thus, we consider finding the least value of the function $f(x) = \frac{1}{2}\|r(x)\|^2$ by solving the unconstrained optimization problem
$$\min_{x \in \mathbb{R}^n} f(x), \tag{2.3}$$
which is known as the Least-Squares Problem (LSP), because the objective function $f(x)$ is the sum of squared residuals. An iterative algorithm will be used to solve this problem provided a starting point $x^{(1)}$ is given.
To obtain a solution of problem (2.3) (say, $x^*$), we must solve the system of $n$ equations $g(x) = 0$, where
$$g(x) = A(x)^T r(x)$$
is the gradient of $f$ and
$$A(x) = \left[\frac{\partial r_i}{\partial x_j}\right], \quad i = 1, \ldots, l, \; j = 1, \ldots, n,$$
is the Jacobian matrix of the residual vector $r$. If $\varphi(x, t)$ is linear in $x$, and hence, from (2.2), so are all the residuals, then $A$ reduces to a constant matrix and $g(x) = 0$ to the so-called system of normal equations
$$A^T A x = A^T \hat{y},$$
where $\hat{y} = (y_1, y_2, \ldots, y_l)^T$, which needs to be solved to obtain the solution $x^*$ (see, e.g., Burden and Faires [6]). In general, the residual $r(x)$ is nonlinear, so solving the system $g(x) = 0$ analytically might be impossible. Thus, we consider solving problem (2.3) using a numerical optimization method that iteratively generates a sequence of points $\{x^{(k)}\}$, which converges to $x^*$ for a certain initial estimate $x^{(1)}$ (for details, see, e.g., Madsen, Nielsen and Tingleff [10]). Here, we consider applying the line search descent method (for a description, see, e.g., Fletcher [7], Hansen, Pereyra and Scherer [9], Madsen, Nielsen and Tingleff [10] and Nocedal and Wright [12]).
We now define some least-squares data fitting problems based on selecting certain mathematical models of form (2.1) from the literature. In particular, we consider five models, (2.5) to (2.9), with given starting points. Note that $\varphi_i$, for $i = 1, 2, 3$, represent linear least-squares problems (e.g., [6], [10]), while $\varphi_4$ and $\varphi_5$ represent nonlinear least-squares problems of form (2.3) (e.g., [10]). The corresponding five problems (referred to as LSP$_i$, $i = 1, 2, \ldots, 5$, respectively) are applied to predict the daily infected cases of COVID-19 in Oman, using the confirmed data given in Table 5, where $t_i$ denotes the number of day $i$, $y_i$ is the number of infected cases on day $i$, and $l = 120$.
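For a model that is linear in $x$, the normal equations can be solved directly. The following is a minimal Python sketch under stated assumptions: the quadratic model and the synthetic data are illustrative only (the models (2.5) to (2.9) and the Oman data are not reproduced here), and `fit_polynomial` is a hypothetical helper name.

```python
def fit_polynomial(t, y, degree):
    """Solve the normal equations A^T A x = A^T y for a polynomial model.

    Assumed model (illustrative): phi(x, t) = x[0] + x[1]*t + ... + x[degree]*t**degree.
    """
    n = degree + 1
    # Row i of A holds the basis values 1, t_i, t_i^2, ...
    A = [[ti ** j for j in range(n)] for ti in t]
    # Build the n-by-n system M x = b with M = A^T A and b = A^T y.
    M = [[sum(row[j] * row[k] for row in A) for k in range(n)] for j in range(n)]
    b = [sum(row[j] * yi for row, yi in zip(A, y)) for j in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for k in range(col, n):
                M[r][k] -= factor * M[col][k]
            b[r] -= factor * b[col]
    # Back substitution.
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (b[r] - s) / M[r][r]
    return x


# Synthetic data generated from y = 2 + 3 t - 0.5 t^2 (not the Oman data).
t_data = [float(i) for i in range(1, 11)]
y_data = [2.0 + 3.0 * ti - 0.5 * ti ** 2 for ti in t_data]
x_star = fit_polynomial(t_data, y_data, degree=2)
print([round(v, 6) for v in x_star])
```

Because the synthetic data lie exactly on the assumed quadratic, the recovered parameters should agree with the generating coefficients up to rounding. For ill-conditioned problems, a QR-based solver is usually preferable to forming $A^T A$ explicitly.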
To solve these problems, we used the well-known BFGS quasi-Newton optimization method (described in Fletcher [7]) and obtained, for each problem, a solution $x^*$, the optimal value $f(x^*)$, and the norm $\|g(x^*)\|$, as listed respectively in the second, third, and fourth columns of Table 1, rounded to four significant figures. We also list the number of iterations (Iter) and the number of function and gradient (f,g) calls required to solve the problems. Because all values of $f(x^*)$ are large, the problems are usually referred to as large-residual problems.
Substituting the five values of $x^*$, from the second column of Table 1 for LSP$_i$, $i = 1, 2, \ldots, 5$, into expressions (2.5), (2.6), ..., (2.9), respectively, we obtain the following five (new) mathematical models:
$$y = \varphi_i(x^*, t), \tag{2.10}$$
for $i = 1, 2, \ldots, 5$. In Fig. 2 we plot the results and compare them with the data given in Fig. 1(a). The plot indicates that our models fit the available 120 data points reasonably well for the period of April 1 to July 29, 2020. For the extended period (from July 29 to August 31, 2020) in Fig. 2, we note that the predicted daily number of infected cases decreases only for LSP$_3$, and we observe that the peak of the epidemic was probably on July 15, 2020. Furthermore, the end of the outbreak is expected to be at the end of August 2020. However, since the other models for LSP$_i$ do not predict a decrease in the number of infected cases, in the next section we modify (2.10) based on the new average ratio measure to improve the estimated future infected cases.
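To make the solution procedure concrete, the following is a minimal Python sketch of a BFGS iteration applied to a small nonlinear least-squares problem. It is an illustration under stated assumptions, not the implementation used for Table 1: the exponential model, the synthetic data, the starting point, and the simple backtracking (Armijo) line search are all hypothetical choices, and the function names are introduced here only for the example.

```python
import math


def objective(x, t_data, y_data):
    """Return f(x) = 0.5 * ||r(x)||^2 and its gradient for the
    hypothetical model phi(x, t) = x[0] * exp(x[1] * t)."""
    f, g = 0.0, [0.0, 0.0]
    for t, y in zip(t_data, y_data):
        try:
            e = math.exp(x[1] * t)
        except OverflowError:
            return math.inf, g        # reject wild trial points
        r = x[0] * e - y              # residual r_i = phi(x, t_i) - y_i
        f += 0.5 * r * r
        g[0] += r * e                 # r_i * d(r_i)/d(x[0])
        g[1] += r * x[0] * t * e      # r_i * d(r_i)/d(x[1])
    return f, g


def bfgs(x, t_data, y_data, tol=1e-8, max_iter=200):
    n = len(x)
    # H approximates the inverse Hessian; start from the identity.
    H = [[float(i == j) for j in range(n)] for i in range(n)]
    f, g = objective(x, t_data, y_data)
    for _ in range(max_iter):
        if math.sqrt(sum(v * v for v in g)) < tol:
            break
        d = [-sum(H[i][j] * g[j] for j in range(n)) for i in range(n)]
        slope = sum(gi * di for gi, di in zip(g, d))
        # Backtracking line search with the Armijo sufficient-decrease test.
        alpha = 1.0
        while alpha > 1e-12:
            x_new = [xi + alpha * di for xi, di in zip(x, d)]
            f_new, g_new = objective(x_new, t_data, y_data)
            if f_new <= f + 1e-4 * alpha * slope:
                break
            alpha *= 0.5
        else:
            break                     # no acceptable step was found
        s = [a - b for a, b in zip(x_new, x)]
        yv = [a - b for a, b in zip(g_new, g)]
        sy = sum(a * b for a, b in zip(s, yv))
        if sy > 1e-12:                # update only if curvature is positive
            Hy = [sum(H[i][j] * yv[j] for j in range(n)) for i in range(n)]
            yHy = sum(a * b for a, b in zip(yv, Hy))
            for i in range(n):
                for j in range(n):
                    H[i][j] += ((1.0 + yHy / sy) * s[i] * s[j]
                                - Hy[i] * s[j] - s[i] * Hy[j]) / sy
        x, f, g = x_new, f_new, g_new
    return x, f


# Synthetic data from y = 2 exp(-t) (not the Oman data).
t_data = [0.25 * k for k in range(9)]
y_data = [2.0 * math.exp(-t) for t in t_data]
x_start = [1.0, 0.0]
f_start, _ = objective(x_start, t_data, y_data)
x_star, f_star = bfgs(x_start, t_data, y_data)
print(x_star, f_star)
```

The update is skipped whenever $s^T y$ is not positive, which keeps the inverse Hessian approximation positive definite so that every search direction remains a descent direction. A production implementation would normally use a line search that enforces the Wolfe conditions instead.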

New Average Ratio Measure and Modified Models
There are several possible choices for defining the rate of change used in short-term forecasting. We consider the rule of Al-Baali (see, e.g., [2] and essentially [1]) for measuring the ratio of changes. This rule is defined using the average of the ratios $R_i \in [0, 2]$, $i = p, p + 1, \ldots, q - 1$, for the range $[p, q]$ as follows:
$$R = \frac{1}{q - p} \sum_{i = p}^{q - 1} R_i. \tag{3.1}$$
Here, $p$ and $q$ are the first and last day numbers in the given daily reported infected cases, and $R_i$ is defined in terms of $y_i$ and $y_{i+1}$, the numbers of infected cases on day $i$ and the next day, respectively (for COVID-19 in Oman as reported and listed in Table 5). The average ratio $R$ in (3.1) always falls in the interval $[0, 2]$. A value of $R < 1$ indicates that there is a $100(1 - R)\%$ chance of reducing the daily number of infected cases. Otherwise, when $R > 1$, there is a $100(R - 1)\%$ chance of increasing the number of infected cases (for more details on this measurement ratio, see Al-Baali [2], for instance).
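The averaging step can be sketched in a few lines of Python. The formula for the individual ratio $R_i$ is not reproduced in this text, so the sketch assumes one form that is consistent with the stated range $[0, 2]$: $R_i = y_{i+1}/y_i$ when the cases do not increase and $R_i = 2 - y_i/y_{i+1}$ otherwise. This form, the function names, and the sample counts are assumptions made for illustration and should be checked against Al-Baali [2].

```python
def daily_ratio(y_today, y_next):
    """Ratio of change between two consecutive days, kept within [0, 2].

    ASSUMED form (not taken from the text): y_next / y_today when cases
    do not increase, and 2 - y_today / y_next when they increase.
    """
    if y_next == y_today:
        return 1.0                    # no change (also covers 0 -> 0)
    if y_next < y_today:
        return y_next / y_today       # decrease: value in [0, 1)
    return 2.0 - y_today / y_next     # increase: value in (1, 2]


def average_ratio(y, p, q):
    """Average of R_i for i = p, ..., q - 1, using 1-based day numbers."""
    ratios = [daily_ratio(y[i - 1], y[i]) for i in range(p, q)]
    return sum(ratios) / len(ratios)


# Hypothetical daily counts for four days (not the Oman data).
sample = [100, 80, 80, 100]
R = average_ratio(sample, 1, 4)
print(R)                              # ratios 0.8, 1.0, 1.2 average to about 1.0
print(100 * abs(1 - R))               # percentage interpretation of R
```

Under this assumed form, a fall from 100 to 80 cases and a rise from 80 back to 100 are treated symmetrically about 1 (0.8 and 1.2), which is why the average over the sample is 1.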
The incubation period of COVID-19 can be as long as 14 days, and the WHO states that the clinical recovery time for mild cases is approximately 2 weeks. Thus, isolation and quarantine should be in place for 14 days for an infected person. To apply the models to the outbreak of COVID-19 in Oman, we choose two periods of 15 days and 30 days to obtain the values of $R$.
For $q = 120$ and $p = 1, 16, 31, 46, 61, 76, 91$, and $106$, we obtained the values of $R$ for periods of 15 days, as stated in the last column of Table 2. The first column of the table lists the total number of days for the 8 periods of time stated in the second column. We observe that when the value of the ratio $R$ is close to 1, a slight change in the future infected cases is expected, but for $R = 0.9402$, a reduction in the predicted future new infected cases is expected. Instead of (2.10), the modified models (3.3) for forecasting COVID-19 will be used to estimate the future number of infected cases, for $i = 1, 2, \ldots, 5$, where $t_a = 120$ is the number of days corresponding to the last data entry, which is on July 29, 2020.
To illustrate the behaviour of the five modified models (3.3), we consider the eight periods of 15 days listed in the first column of Table 2 and predict the future number of infected cases up to August 31, 2020, where $t = 121, 122, \ldots, 153$ (which corresponds to the period of July 30 to August 31, 2020). The expected results are listed in Table 4.
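The effect of the ratio on a forecast can be illustrated with a short Python sketch. The explicit form of the modified models (3.3) is not reproduced in this text, so the sketch assumes that the fitted model is scaled by $R^{\,t - t_a}$ for days after the last data entry $t_a$. This scaling, the constant placeholder model, and the function names are assumptions made for illustration, not the authors' stated formula.

```python
def modified_forecast(phi, ratio, t_last, horizon):
    """Scale a fitted model phi(t) by ratio ** (t - t_last) for future days.

    ASSUMPTION: geometric scaling by the average ratio; the form of the
    modified models in the text is not available here.
    """
    days = range(t_last + 1, t_last + horizon + 1)
    return [(t, ratio ** (t - t_last) * phi(t)) for t in days]


def flat_model(t):
    # Hypothetical fitted model: a constant 1000 cases per day.
    return 1000.0


# Days 121 to 153 correspond to July 30 to August 31, 2020.
forecast = modified_forecast(flat_model, ratio=0.9402, t_last=120, horizon=33)
print(forecast[0], forecast[-1])
```

With a ratio below 1, the scaling compounds from day to day, so a small difference in $R$ produces a large difference in the forecast after a month. This is the kind of sensitivity discussed for $R = 0.9924$ and $R = 0.9402$.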
We note that the modified functions $\varphi_3$ and $\varphi_5$ estimate the future infected cases better than the other modified functions in the sense that their predicted future cases are lower than those of the other predictions, regardless of whether $R > 1$ or $R < 1$. In addition, $\varphi_3$ seems to give a slightly better estimate than $\varphi_5$. For example, using $R = 0.9402$, the function $\varphi_3$ predicts that the number of infected cases over the next 32 days (from July 29, 2020) falls from 1111 to 154, while $\varphi_5$ predicts a fall from 1258 to 175. Furthermore, the predicted future numbers of infected cases of $\varphi_3$ for $R = 0.9924$ and $R = 0.9402$ decrease from 1173 and 1111 to 919 and 159, respectively. The difference between the two final predicted numbers of cases is large, although the corresponding difference in $R$ is very small. The predicted results are very encouraging, since following the second lockdown, which was eventually lifted by the end of the first week of August, the daily number of infected cases in Oman consistently dropped to below 200 until the end of August 2020.
To illustrate these results, we sketch the five models (3.3) for the period of April 1 to October 30, 2020, as shown in Fig. 3. We observe from Fig. 3(d), where the value $R = 0.9402$ is obtained for the period of July 15 to July 29, 2020, that the peak (daily) number of infected cases is nearly 1400, and after that, the outbreak of COVID-19 in Oman vanishes by mid-October 2020.
For comparison (in the worst-case scenario), we also consider a longer, monthly period of infected cases to obtain the values of $R$. The results are listed in Table 3 for the four months of April, May, June, and July. We calculate the largest value, $R = 1.075$, for May and the smallest value, $R = 0.9840$, for July. The predicted results are expected to change progressively every month starting from April.
The corresponding modified functions (3.3) with the new values of $R$ and $t_a = 30, 61, 91$, and $120$, which are the numbers of days corresponding to the last available data in April, May, June, and July, respectively, are plotted in Fig. 4. We note from Fig. 4(a-c) that the values of $\varphi_i$, $i = 1, 2, \ldots, 5$, for the three cases of $R > 1$, increase as $t$ increases. However, for $R < 1$, they decrease as $t$ increases (see Fig. 4(d)).
As illustrated in Fig. 4, the predicted numbers of infected cases in Oman change progressively according to the monthly values of $R$. Fig. 4(a) shows that, using $R = 1.032$ for the month of April, the predicted cases increase to 1400 by the end of June. For the month of May, the value of $R$ increases to $R = 1.075$, and accordingly, as shown in Fig. 4(b), the predicted number of cases increases sharply to 50000 by the end of July. For the month of June, the value of $R$ decreases to $R = 1.004$, and as shown in Fig. 4(c), after a steady increase up to the end of July, the predicted cases slow down to 1300 by the end of August. Lastly, for the month of July, the value of $R$ decreases again to $R = 0.9840$, and Fig. 4(d) shows that, after a steady increase up to the end of August, the predicted cases fall sharply to below 800 by the end of September and continue to drop to below 400 by the end of October 2020.
Table 4: Predicted COVID-19 cases in Oman from July 30 to August 31, 2020.

Conclusions
We presented a prediction of the spread of the COVID-19 epidemic in Oman, using least-squares data fitting for the period of April 1 to July 29, 2020, and forecast the future daily infected cases. We also introduced a new ratio measure to modify the models so that the estimates of the daily infected cases become reasonable.
Finally, it is worth noting that the proposed analysis for predicting the future infected cases can be extended in a similar manner to other quantities, such as the daily number of deaths (as shown in Fig. 5). Using the deaths from COVID-19 recorded in Oman for the period of April 1 to July 29, 2020, we obtain $R = 0.9920$ and, as shown in Fig. 5, it is predicted that the number of deaths increases steadily and reaches a peak of 11 deaths by the end of July.