Forecasting COVID-19 situation in Bangladesh

Forecasting COVID-19 confirmed cases, deaths, and recoveries is a pressing need for understanding the severity of the novel coronavirus. This research aims to predict all types of COVID-19 cases (confirmed cases, deaths, and recoveries) from data on the deadliest third wave of the COVID-19 pandemic in Bangladesh. We used the official website of the Directorate General of Health Services as our data source. To identify and predict the upcoming trends of the COVID-19 situation in Bangladesh, we fit the Auto-Regressive Integrated Moving Average (ARIMA) model to data from Mar. 01, 2021 to Jul. 31, 2021. The forecasts of the ARIMA model indicate that the numbers of infections, deaths, and recoveries will grow exponentially in Bangladesh through October 2021. Our model reports that confirmed cases and deaths will increase about fourfold, and recoveries about fivefold, by late October 2021 if the March-to-July trend of the three series continues. The predicted COVID-19 scenario for the next three months is alarming for Bangladesh, so strategic planners and field-level personnel need to identify and adopt suitable policies and strategies for controlling mass transmission of the virus.


Introduction
The recent worldwide outbreak of coronavirus disease 2019 (COVID-19) has become a severe hazard to human life and livelihood [1]. COVID-19 is a transmissible disease caused by a new virus called severe acute respiratory syndrome coronavirus 2. This virus had not been identified in humans before [2]. It can spread from an infected person to others through coughing, sneezing, talking, or even breathing [2]. The virus can cause respiratory symptoms and illnesses such as cough, sore throat, fever, and even pneumonia [1]. The World Health Organization (WHO) declared COVID-19 a pandemic on Mar. 11, 2020 [3]. The virus was first identified in Wuhan, China, in December 2019 [2]. Up to July 2021, about 195 million people had been infected, with approximately 4 million deaths due to COVID-19 worldwide [3].
The Institute of Epidemiology, Disease Control, and Research of Bangladesh (IEDCR) confirmed the first three cases of COVID-19 in Bangladesh on Mar. 08, 2020 [4]. After that, the number of COVID-19 patients in Bangladesh increased day by day. From Mar. 08, 2020 to Dec. 27, 2020, 7,452 people died from this virus, and the total number of infected patients was 509,148 [4,5]. In January 2021, Bangladesh was able to reduce the transmission rate of COVID-19 by following the guidelines of the WHO. The infection rate was below 5% for seven consecutive weeks [6]. A pandemic is considered under control when the infection rate is lower than 5% of tested cases for two consecutive weeks [7]. Confirmed cases and deaths declined continuously, and on Feb. 26, 2021, five deaths were recorded [5].
However, in mid-March 2021, Bangladesh experienced the second wave (Beta variant) of COVID-19 [8]. As a result, confirmed cases and deaths increased sharply in March 2021. In the first week of April 2021, the situation was out of control. In April, the country saw 6,000 to 7,000 confirmed cases daily, with 7,626 confirmed cases on Apr. 07 and 112 deaths on Apr. 19 [9]. These were the highest numbers of confirmed cases and deaths, respectively, recorded since the disease began to spread. As a control measure, the government of Bangladesh imposed a strict shutdown on Apr. 14, 2021 [10]. After the drastic rise in infections and deaths in April 2021, both declined in May, with 1,710 confirmed cases and 36 deaths recorded on May 31 [10].
While Bangladesh successfully handled the second wave, driven by the Beta variant of COVID-19, the more dominant and fast-spreading Delta variant of the COVID-19 virus was detected for the first time (two patients) in Bangladesh on May 08 [10]. The Delta variant was identified in India and has spread quickly around the country [6]. In terms of infection and death rates, the current wave of COVID-19 is considered the deadliest one for Bangladesh. The death and infection rates of COVID-19 are increasing dramatically, and Bangladesh is experiencing the third wave [10]. An IEDCR study revealed that in June 2021, around 78% of confirmed cases were of the Delta variant [10]. This variant has spread enormously in the Khulna division, since Khulna shares a border with West Bengal, India [6]. As a result, Bangladesh stood in second position among the worst-affected southeast Asian nations [10]. The country recorded 230 deaths [4] on Jul. 11, 2021, the highest number of fatalities since the outbreak of the COVID-19 pandemic in March 2020.
Since this virus has frequently changed its nature and emerged with new variants such as Beta and Delta, it is almost impossible to determine when it might disappear. In this regard, short-term forecasting is crucial for predicting upcoming scenarios and managing public health efficiently while facing social, emotional, and economic challenges [11][12][13]. Much research has used mathematical and statistical methods to identify the characteristics of COVID-19 [14] and to predict confirmed cases and the effects of the disease, based on data from 2020. Most of this research used the Susceptible, Infected and Recovered (SIR) or Susceptible, Exposed, Infected and Recovered (SEIR) model to investigate the spread of COVID-19 [13,15]. Ries et al. investigated short-term forecasting using the mathematical compartmental susceptible-infected-recovered-dead (SIRD) model. They found that short-term forecasting is challenging, while long-term forecasting is uncertain due to the lack of models [13]. Petropoulos and Makridakis used the exponential smoothing model to forecast the worldwide COVID-19 situation [16]. Generalized logistic growth models (GLM) were used to predict the total confirmed cases of Hubei province, China [11]. Many researchers used the Autoregressive Integrated Moving Average (ARIMA) model to forecast the scenario of COVID-19 for different countries [12,16], as this model is an investigative, data-oriented method that can adapt to dynamic processes that change over time, forecasting future conditions from the current situation [17,18].
One could argue that a specific modeling technique predicts the trend of COVID-19 more precisely than other techniques. However, no evidence supports this hypothesis [11]. In fact, the literature contains instances of failed forecasts using SEIR [19], SIRD [20], and extensions of the SEIR model with more compartments [21]. Moreover, a SIR model is more suitable for revealing the relationships among variables than for forecasting [22,23].
Forecasting helps countries' strategic planners make plans and policies to prevent pandemic casualties and to ease public health and daily life. Accurate forecasting requires knowledge of the spreading trend of confirmed cases, deaths, and recoveries [12]. Since this study focuses on data from the second wave (which started in mid-March 2021) and the third wave (which started in mid-May 2021) in Bangladesh, we did not include data from the previous months (January and February). This study aims to understand the future trend of confirmed cases, recoveries, and deaths due to the Beta and Delta variants over the forthcoming three months in Bangladesh. This forecasting work should help the concerned government agencies, strategy and policy planners, and the people of Bangladesh in adopting new courses of action and strengthening the ongoing preventive measures against COVID-19. It should also help people who are victims of the present psychosocial and socioeconomic conditions.

Data source
We gathered data on cumulative daily confirmed cases, deaths, and recoveries of COVID-19 from Mar. 01, 2021, to Jul. 31, 2021, from the Directorate General of Health Services (DGHS) [24]. DGHS is a wing of the Ministry of Health and Family Welfare and is responsible for providing health services in Bangladesh. Our analysis depends on the cumulative figures for confirmed cases, deaths, and recoveries of COVID-19 in Bangladesh published online by DGHS, so it does not require ethical approval.

Autoregressive Integrated Moving Average (ARIMA) model
Daily (short-term) time-series data have weaknesses, such as irregularities and gaps in reporting, whereas cumulative curves are more stable and produce steadier and more reliable estimates [25]. Thus, cumulative data are used in this research to identify the future path of the epidemic in Bangladesh. Because the numbers of infections, deaths, and recoveries of COVID-19 are expected to grow exponentially over time, we chose a simple time-series method, the ARIMA model, to forecast the numbers of confirmed cases, deaths, and recoveries for the upcoming three months.
The ARIMA model offers advantages in fitting and forecasting data compared with exponential smoothing [26]. It can accommodate both seasonal and non-seasonal forecasting [12]. The ARIMA model is an investigative, data-oriented technique that allows the researcher to fit an appropriate model adapted to the structure of the data [16]. This model assumes that time series values are linearly related and aims to extract local patterns by removing high-frequency noise from the data [16]. Furthermore, the ARIMA model can adapt to dynamic systems by updating with current events, forecasting the future state from recent information [12,16]. In this research, the fitted ARIMA model is used to forecast until the end of October 2021. We write the model as ARIMA (p, d, q), where p is the order of the autoregressive term, d is the differencing order, and q is the order of the moving-average term. Since the ARIMA model combines Autoregressive (AR) and Moving Average (MA) terms, we believe that it suits the nature of the data and delivers good short-term forecasts. The parameters (p, d, q) are identified using the Autocorrelation Function (ACF) and Partial Autocorrelation Function (PACF).
In addition, the ARIMA (p, d, q) order is chosen based on the Akaike information criterion (AIC), a measure of relative goodness of fit; the model with the minimum AIC is selected. We use the R packages tseries, zoo, and forecast [27] and fit the ARIMA model with the auto.arima function of the forecast package.

Results
[...] (Figs. 1 and 2), which are 2.3 and 2.5 times higher than on Mar. 01, 2021, respectively. However, the number of recoveries is reported as 1,087,212 (Fig. 3) on Jul. 31, 2021, which is 2.2 times higher than on Mar. 01, 2021.
Our ARIMA model is fitted to data from Mar. 01, 2021 to Jul. 31, 2021 (157 days), and we forecast up to Oct. 28, 2021 (90 days). ARIMA (2,2,2), ARIMA (0,2,0), and ARIMA (3,2,4) models are used for forecasting the numbers of confirmed cases (Fig. 4), recoveries (Fig. 6), and deaths (Fig. 5), respectively. All three models use second-order differencing (d = 2), after which the series are treated as stationary, with stable mean and variance. Fig. 4 shows a rapid exponential increase in confirmed cases, which are forecast to grow by 3.5 times from March 2021 to the end of October 2021. The 95% prediction interval for confirmed cases is 1,608,311 to 3,021,209 (forecast value 2,314,760). Fig. 5 presents the results for deaths, which are forecast to increase by 3.8 times by late October. The 95% prediction interval for deaths is 28,077 to 52,197, with a forecast value of 40,137. Likewise, the forecast for recoveries reveals rapid growth (Fig. 6): recoveries would probably increase more than fivefold by late October, with a 95% prediction interval of 1,841,210 to 2,838,274 (forecast value 2,339,742). Figs. 4 and 6 show that the lower bound for recoveries is higher than the lower bound for confirmed cases. Since cumulative recoveries cannot exceed cumulative confirmed cases, this indicates a misprediction of recoveries.
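The arithmetic behind these intervals can be checked with a short sketch. It uses only the values quoted in the paragraph above; the function name `interval_summary` and the dictionary layout are assumptions of this sketch, and the code is standard-library Python, not the R workflow used in the study.

```python
# (lower bound, forecast value, upper bound) as quoted in the Results text.
reported = {
    "confirmed": (1_608_311, 2_314_760, 3_021_209),
    "deaths": (28_077, 40_137, 52_197),
    "recoveries": (1_841_210, 2_339_742, 2_838_274),
}


def interval_summary(lower, point, upper):
    """Summarize a prediction interval around a point forecast."""
    midpoint = (lower + upper) / 2
    half_width = (upper - lower) / 2
    return {
        "symmetric": midpoint == point,  # is the forecast the interval midpoint?
        "half_width": half_width,
        "relative_half_width": half_width / point,
    }


summaries = {name: interval_summary(*values) for name, values in reported.items()}

# Cumulative recoveries can never exceed cumulative confirmed cases,
# so these comparisons flag an internally inconsistent forecast.
recoveries_exceed_confirmed = reported["recoveries"][1] > reported["confirmed"][1]
lower_bound_inconsistent = reported["recoveries"][0] > reported["confirmed"][0]

for name, summary in summaries.items():
    print(name, summary)
print("recoveries forecast exceeds confirmed forecast:", recoveries_exceed_confirmed)
```

Each reported forecast value sits exactly at the midpoint of its interval. The check also shows that both the lower bound and the point forecast for recoveries exceed those for confirmed cases, which is the inconsistency noted above.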

Discussion and conclusion
Since the COVID-19 situation in Bangladesh was alarming in July 2021 and the deadliest third wave was under way, short-term forecasting was urgently needed. This research used recent data on COVID-19 confirmed cases, deaths, and recoveries to forecast the upcoming three months. The forecasting model showed that confirmed cases and deaths might grow almost fourfold by the end of October if the March-to-July trend continues. Similarly, the model indicates that recoveries could increase fivefold by the end of October. Some research has been done using several forecasting models, including the ARIMA model [28][29][30]. Still, most of it is based on data from the first wave (2020), and no research has yet applied a forecasting model to recent third-wave data in Bangladesh. Moreover, because the COVID-19 virus is unpredictable and changes its variants frequently, an analysis of data that are one year old or older may not reflect the situation accurately. Forecasting the upcoming scenario therefore requires the latest data. Fitting the ARIMA model to recent data can give strategic planners a realistic picture of the pandemic for combating the upcoming, deadliest situation. Most research has emphasized strategies for slowing down COVID-19 [31,32]. However, without forecasts of the upcoming COVID-19 situation, policymakers could be misguided in determining their policies and strategies. Thus, the forecast made by this research may be helpful for policymakers.
Despite the advantages of forecasting in this study, there are also some limitations. This work depends on secondary data sources. In Bangladesh, the sources of COVID-19 data are the websites of the DGHS, IEDCR, and WHO [3,4,9], and all three sources provide the same data. If the data are not accurate, the results may be misleading. In addition, due to socioeconomic conditions, culture, religion, and level of education, a significant part of the population does not want to be tested for COVID-19. Moreover, some superstitious beliefs about the disease prevent people from disclosing that their relatives died of COVID-19. So the data from secondary sources may not represent the actual COVID-19 situation in Bangladesh. Using the ARIMA model, some researchers forecast only confirmed cases [30] or deaths, rather than recoveries [33]. In our study, we also forecast recoveries, but these forecasts are inconsistent in practice. Further research will need to investigate the reason behind this. This study focuses on predicting the confirmed cases, deaths, and recoveries of COVID-19 rather than on the impact of covariates on COVID-19 transmission.
This study does not consider socioeconomic factors such as health indices, sex, age, population density, economic growth, and development. It also excludes clinical factors such as coinfections, comorbidity, and family members' respiratory history, as well as lifestyle factors such as social distancing, quarantine, and smoking habits. In addition, the study does not address the country's health system, rules, regulations, travel restrictions, closure of educational institutions, and other measures that may have a crucial impact on the transmission of the COVID-19 virus [34]. Future research could investigate COVID-19 transmission scenarios while considering these covariates. In the future, robust models such as the SIRD [13] and SIR models could be used to examine the impact of uncertainties, mitigation strategies, and underreporting of cases in Bangladesh.
Because the ARIMA model is generally used for short-term forecasting, its predictions may not capture particular events that fall outside the short forecast window. In our study, we forecast the COVID-19 situation up to the end of October 2021. Since Omicron, the new variant of COVID-19, is emerging and spreading around the world, the concepts and models used in this study could help future researchers predict the future scenario of COVID-19 or of other events.
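The short-term caveat can be made concrete with the simplest twice-differenced model, ARIMA (0,2,0), assumed here to have no constant term. Under that assumption the point forecast extends the last observed slope linearly, and the forecast error variance at step h is σ² · h(h+1)(2h+1)/6, so the interval widens quickly with the horizon. The sketch below is standard-library Python and is not the fitted model from this study; the function names, the toy series, and the unit innovation standard deviation are hypothetical, and parameter-estimation uncertainty is ignored.

```python
import math


def arima_020_forecast(series, horizon):
    """Point forecasts of ARIMA(0,2,0) without constant: linear extrapolation."""
    last = series[-1]
    slope = series[-1] - series[-2]
    return [last + h * slope for h in range(1, horizon + 1)]


def arima_020_half_width(sigma, h, z=1.96):
    """Half-width of the approximate 95% prediction interval at step h.

    The psi-weights of (1 - B)^-2 are 1, 2, 3, ..., so the h-step error
    variance is sigma^2 * (1^2 + 2^2 + ... + h^2) = sigma^2 * h(h+1)(2h+1)/6.
    """
    return z * sigma * math.sqrt(h * (h + 1) * (2 * h + 1) / 6)


# Hypothetical cumulative series and innovation standard deviation.
toy_series = [10, 12, 15]
point_forecasts = arima_020_forecast(toy_series, horizon=3)

# How much wider is the interval at 90 days than at 1 day (same sigma)?
widening = arima_020_half_width(1.0, 90) / arima_020_half_width(1.0, 1)
print(point_forecasts, round(widening, 1))
```

For this model the interval at a 90-day horizon is several hundred times wider than at a one-day horizon, which illustrates why forecasts far beyond the observed window should be read with caution.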
Bangladesh's socioeconomic and cultural conditions are unfavorable to maintaining the WHO's health guidelines, such as social distancing, travel restriction, isolation, and strict lockdown. Besides this, the healthcare system of Bangladesh is not capable enough to handle the COVID-19 pandemic, so the Bangladesh government is looking for alternatives to face it. The country successfully signed contracts with seven vaccine manufacturers from different countries. Bangladesh started its mass vaccination program on Feb. 07, 2021 [35]. However, from the beginning to Aug. 31, 2021, only 4.8% of its total population was fully vaccinated, and 11.3% of its population had received the first dose of vaccine [36]. The vaccine could reduce casualties and severity, which helps reduce the number of patients in hospitals, but it may not reduce the transmission of the COVID-19 virus. If policy and strategy planners can obtain accurate forecasts of confirmed cases, recoveries, and deaths, these could help them take proper measures for controlling the COVID-19 pandemic.
The COVID-19 pandemic could be deadliest in densely populated countries like Bangladesh, since the disease spreads fast in crowded places. In addition, less populated and economically strong nations can easily follow the WHO's health guidelines. In developing nations like Bangladesh, however, where socioeconomic issues are of significant concern, it is difficult to follow and implement these guidelines. So, the whole country became a hotspot of the Delta variant of the COVID-19 virus from May [11]. The government of Bangladesh took several measures to combat this disease, including but not limited to mass awareness campaigns on social distancing and staying at home, diagnosis of suspected patients, dedicated hospitals for COVID-19 patients, institutional isolation and quarantine, partial, regional, and countrywide lockdowns, and closure of private and public offices, industries, and educational institutions. However, the outcomes of our study present a growing trend of COVID-19 confirmed cases, deaths, and recoveries for the next three months. Thus, the government of Bangladesh, public health providers, and stakeholders should take integrated initiatives to reduce further transmission of COVID-19, which would help prevent a fourth wave in this country.