Ensemble Forecast of COVID-19 for Vulnerability Assessment and Policy Interventions

The COVID-19 pandemic necessitates forecasts to frame science-informed policies. An accurate forecast of the size and timing of future waves could help public health officials and governments to plan appropriate responses. An ensemble forecast by aggregating different scenarios and models makes the prediction robust and reliable. We present an ensemble forecast for Wave-3 of COVID-19 in the state of Karnataka, India, using the IISc Population Balance Model for infectious disease spread. The reported data of confirmed, recovered, and deceased cases in Karnataka from 1 July 2020 to 4 July 2021 are utilized to tune the model’s parameters. An ensemble forecast is done from 5 July 2021 to 30 June 2022. The ensemble is built with 972 members by varying seven critical parameters that quantify the uncertainty in the spread dynamics (antibody waning, viral mutation) and interventions (pharmaceutical, non-pharmaceutical). The probability of Wave-3, the peak date distribution, and the peak caseload distribution are estimated from the ensemble forecast. Analysis of the ensemble forecast results shows that compliance to COVID-appropriate behaviour, daily vaccination rate, and emergence time of immune-escape new variants are the most significant causal factors that determine the timing and severity of COVID-19 Wave-3. We observe that when compliance to COVID-appropriate behaviour is similar to a lockdown-like situation, the emergence of new immune-escape variants beyond September 2021 is unlikely to induce a new wave. No or partial compliance to COVID-appropriate behaviour makes a new wave inevitable. However, increasing the vaccination rate reduces the active caseload at Wave-3’s peak. If Wave-3 emerges, on average, the daily confirmed caseload of children (Age 0–17 years) could be up to seven times more than the corresponding caseload (4390) at Wave-2’s peak. 
Therefore, large-scale surveillance, including genome sequencing for early detection of new variants and non-pharmaceutical interventions to improve COVID-Appropriate behaviour, is vital to prevent Wave-3 of COVID-19. Doubling the vaccination rate as of 4 July 2021 to 560K doses per day will reduce the daily confirmed cases even if Wave-3 arises. Consequently, hospitalizations, ICU, and Oxygen requirements can be decreased. Since vaccination is yet to start in children, it is essential to ramp up the public health facilities, including pediatric ICUs to treat MIS-C, by 5-9 times to handle the worst-case situation. From a modeling perspective, capturing the nonlinear dynamics induced by the uncertainties in the causal factors is the key to a successful forecast. Therefore, an effort should be made to build an ensemble forecast that contains multiple models and, more importantly, models that account for causal factor uncertainties.


Introduction
COVID-19 cases were first reported in India from February 2020, and the first nationwide lockdown was imposed on 25 March 2020 to contain the spread and improve preparedness. A graded nationwide reopening started in June 2020. After that, in the state of Karnataka, Wave-1 gradually built up, reaching a maximum daily confirmed caseload of 10K (7-day average) around 11 October 2020. Subsequently, the daily confirmed caseload reduced to less than 1K by December 2020 and stayed below 1K until mid-March 2021. From the 3rd week of March 2021, the caseload started increasing again, and Wave-2 set in, reaching a peak of 47.5K daily confirmed cases on 9 May 2021 [1]. Additionally, vaccination against COVID-19 started in India on 16 January 2021 with a rationing policy based on the recipient's age [2]. Genome sequencing studies have shown that Wave-1 was caused by the B.1.1.7 (Alpha) variant and Wave-2 was caused by the B.1.617.2 (Delta) variant [3][4][5][6][7]. Despite timely non-pharmaceutical interventions (NPIs) in the form of decentralized state-wise, city-wise, and community-wise lockdowns, there was widespread concern due to the high load on the medical infrastructure, with reports of ICU and oxygen shortages during Wave-2.
Unfortunately, none of the mathematical models in use before Wave-2 could forecast it [8]. Unaccounted uncertainties in human behaviour, mobility, implementation of policy at the local level, mutation, immune response, and antibody waning could be among the reasons for the forecast failure. For a successful forecast, an ensemble method that incorporates these uncertainties is needed. An ensemble forecast is built by aggregating predictions from different scenarios and models. Such an approach has been shown to perform consistently well in different fields [9][10][11][12]. Particularly for COVID-19, a multimodel [...] such as lockdowns, restrictions on mass gatherings, curfew, etc., will also affect the social distancing. To study the effect of different levels of compliance with COVID-appropriate behaviour (CAB) on new waves, three scenarios are considered: (i) Good compliance (Good CAB), similar to the behaviour between Mar-May'21; (ii) Bad compliance (Partial CAB); and (iii) Worse compliance (No CAB).
We vary these influencing parameters to build a total of 972 scenarios. For each scenario we estimate the date of peak (if any) and the caseload at peak. A scenario without a peak is declared as a No-Wave (NW) scenario. The conditional probability of Wave-3 given a particular value for a parameter is then estimated, from only those scenarios with that chosen parameter value, as the ratio of scenarios leading to a Wave-3 to the total number of scenarios. The conditional probability given two parameter values is estimated by considering only those scenarios in which those chosen parameter values are used. All statistical quantities (mean, median, confidence interval, inter-quartile range, probability distribution function and scatter) are estimated from this ensemble forecast.
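The conditional-probability estimate described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the scenario record layout (`params`, `peak_date`), the parameter names `cab` and `ienv`, and the four toy scenarios are all assumptions made for the example.

```python
def conditional_wave_probability(scenarios, given):
    """Estimate P(wave | given parameter values) from an ensemble.

    `scenarios` is a list of dicts with a hypothetical layout:
      {"params": {...}, "peak_date": str or None}
    where peak_date is None for a No-Wave (NW) scenario.
    `given` maps parameter names to the conditioning values.
    """
    # Keep only the scenarios that use every chosen parameter value.
    matching = [
        s for s in scenarios
        if all(s["params"].get(name) == value for name, value in given.items())
    ]
    if not matching:
        raise ValueError("no scenario matches the conditioning values")
    # Ratio of wave scenarios to all matching scenarios.
    with_wave = sum(1 for s in matching if s["peak_date"] is not None)
    return with_wave / len(matching)


# Toy ensemble (illustrative values only, not model output).
toy_ensemble = [
    {"params": {"cab": "good", "ienv": "Jul21"}, "peak_date": "2021-09-10"},
    {"params": {"cab": "good", "ienv": "Sep21"}, "peak_date": None},
    {"params": {"cab": "none", "ienv": "Jul21"}, "peak_date": "2021-08-20"},
    {"params": {"cab": "none", "ienv": "Sep21"}, "peak_date": "2021-10-30"},
]

p_good = conditional_wave_probability(toy_ensemble, {"cab": "good"})
p_none = conditional_wave_probability(toy_ensemble, {"cab": "none"})
p_pair = conditional_wave_probability(toy_ensemble, {"cab": "good", "ienv": "Sep21"})
```

Conditioning on two parameters only adds a second entry to `given`, which mirrors the pair-wise estimate described in the text.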

Causal factors for a new COVID wave in Karnataka, India
Among the 972 scenarios, Wave-3 is observed in 648, i.e., in two-thirds of the ensemble. The conditional probability (likelihood) of Wave-3 given a control variable is presented in Fig. 1 (a). Even though the Case-to-Infection ratio and the antibody waning could impact the emergence of new COVID waves, we see that these factors do not influence the probability of Wave-3 significantly. Variations in the emergence time of, and reinfection from, a new immune-escape variant, the vaccination rate, and compliance with NPI (CAB norms) significantly influence the emergence of a new COVID wave. In fact, the timing of the new immune-escape variant influences the probability of Wave-3 more than its reinfection percentage. Therefore, we characterize the timing of the new immune-escape variant, vaccination rate, and NPI (CAB norms) as the primary causal factors and further analyze their influence on Wave-3.
To investigate the pair-wise combined effect of the primary causal factors on Wave-3, the conditional probability of Wave-3 given pairs of these causal parameters is presented in Fig. 1 (b). All scenarios where Good CAB is followed have the lowest probability of Wave-3. On the other hand, Wave-3 will definitely emerge (probability=1) in all scenarios where No CAB is followed. Neither an increased vaccination rate nor a delayed mutation reduces this probability when No CAB is followed. This analysis reveals that NPIs in the form of CAB norms are significant to control Wave-3. Scenarios with an early emergence of the immune-escape new variant (IENV), i.e., IENV-Jul21, have high probabilities for Wave-3 since the fraction of vaccinated population is low. The only exception is the scenarios with Good CAB. Even though July 2021 has passed at the time of submission of this article, these results remain significant as a forewarning of how other infectious diseases may spread, or of what may happen if a vaccine-escape mutation of SARS-CoV-2 emerges.

Wave-3 peak date and active caseload
The quantities of interest for policymaking are the peak date and active caseload at peak. Fig. 2 shows a scatter plot between the date of Wave-3's peak and active caseload at peak for different scenarios corresponding to the primary causal factors being present. A special axis entry for No Wave (NW) is used to depict the scenarios that do not show a new wave. The arrangement and color of scatter markers in the figure help us visualize the joint-effect of adherence to CAB by the public, IENV emergence and reinfection fraction, vaccination rate, and CIR in the population on caseload and date of peak. Each panel (a-i) in Fig. 2 shows only those scenarios in which the variable mentioned in the column (CAB) and the row (IENV emergence time) are active. For example, panel (a) shows all the scenarios where the CAB is set to "Good" level and IENV emergence time is set to July 2021. Within the panel, each point represents a particular scenario. Each scenario is colored by the vaccination rate, marked by a marker that represents the IENV reinfection fraction, and the border is colored by the CIR. First, we see that for scenarios with Good CAB (panels a,d,g), the active case at the peak is lower. Significantly, scenarios with Good CAB and IENV emergence time beyond September 2021 (panels d,g) do not see a new wave. In panel (a), we see a peak only when IENV causes reinfection in 100% of the previously infected and recovered population. This result further emphasizes that Good CAB is the primary intervention strategy for preventing COVID waves. However, in reality, the financial and livelihood requirements to reopen economies will result in a relaxation of CAB norms, and only a partial CAB is likely to be followed. In those scenarios with partial CAB (panels b,e,h), if an immune escape new variant emerges in July 2021 or September 2021, Wave-3 is inevitable in Karnataka. 
Moreover, we see that the vaccination rate is the key governing factor that reduces the active caseload at peak. Crucially, reaching a vaccination level of 560K per day in Karnataka will ensure no Wave-3 if no immune-escape new variant starts spreading in the population before November 2021. A higher CIR (meaning lower reported cases during Wave-2) will result in a delayed peak but at the same active caseload, as seen by the clustering among the same color and marker shapes.
To complete the analysis, we look at the scenarios with no CAB. Under these scenarios, we expect the highest levels of the caseload at peak and an assured emergence of new COVID waves in Karnataka. Fortunately, we see that vaccination reduces the caseload. The reinfection fraction significantly affects the caseload in the No CAB, IENV-Jul'21 scenarios (panel c). Here, we see a wide distribution in the caseload at the peak but a narrow distribution in the peak date. Such a situation is characteristic of the underlying population's immunity, which vaccines improve. To further understand Wave-3 properties from the perspective of vaccination, Fig. 3 shows the probability distribution function of peak date and active cases at peak for different vaccination rates. Here, we only look at the scenarios resulting in a Wave-3 and ignore those that do not lead to a Wave-3. We see that almost all the distribution curves are multi-modal, indicating the joint effects of CAB and IENV reinfection rates. It is evident from Fig. 3 that the emergence of a new immune-escape variant is likely to be followed by a third wave within 45 to 60 days. In most cases, a higher vaccination rate helps to bring down the active caseload at its peak. Doubling the daily vaccination rate (VR-560K, Fig. 3(c),(f)) results in fewer scenarios with Wave-3 despite the emergence of an immune-escape new variant in November. The critical contribution of increased vaccination rate is that more of the susceptible population are protected up to the vaccine efficacy levels.
As a ready reference for policymaking, Table 1 shows a summary of the median peak date, inter-quartile range of peak date and the minimum and maximum active caseload during this period of inter-quartile range for scenarios with specific combinations of IENV emergence time and vaccination rates. This table summarizes the information presented in the graphs above.
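The kind of summary reported in Table 1 (median peak date, inter-quartile range of the peak date, and the caseload range inside that window) can be sketched as below. This is only an illustration under stated assumptions: the function name `summarize_peaks`, the input layout, the inclusive quantile method, and the sample values are hypothetical and are not taken from the paper.

```python
import math
from datetime import date, timedelta
from statistics import median, quantiles


def summarize_peaks(peaks):
    """Summarize (peak_date, active_caseload) pairs of wave scenarios.

    Dates are converted to ordinals so that quartiles can be computed;
    the quantile method ("inclusive") is an assumption of this sketch.
    Needs at least two scenarios.
    """
    ordinals = sorted(d.toordinal() for d, _ in peaks)
    q1, _, q3 = quantiles(ordinals, n=4, method="inclusive")
    lo, hi = math.floor(q1), math.ceil(q3)
    # Caseloads of the scenarios whose peak falls inside the IQR window.
    in_iqr = [cases for d, cases in peaks if lo <= d.toordinal() <= hi]
    return {
        "median_date": date.fromordinal(round(median(ordinals))),
        "q1_date": date.fromordinal(lo),
        "q3_date": date.fromordinal(hi),
        "min_cases_in_iqr": min(in_iqr, default=None),
        "max_cases_in_iqr": max(in_iqr, default=None),
    }


# Illustrative values only (not model output): five peaks, ten days apart.
base = date(2021, 10, 1)
caseloads = [50_000, 120_000, 90_000, 200_000, 30_000]
sample_peaks = [(base + timedelta(days=10 * i), c) for i, c in enumerate(caseloads)]
summary = summarize_peaks(sample_peaks)
```

No-Wave scenarios would be filtered out before calling the function, in line with restricting the statistics to scenarios that lead to a Wave-3.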

Effect of Wave-3 on different age-groups
The data until 10 June 2021 shows that 4.2%, 5%, 53.2%, 23% and 14.645% of the people in the age groups 0-11, 12-17, 18-44, 45-59, and above 60 years were infected with COVID-19. However, India started an age-dependent vaccination drive, with no widespread vaccine availability for children (0-17 years). As such, for Wave-3, we expect a higher contribution to the daily confirmed caseload from children. To estimate this impact on different age groups, we analyze the age-wise daily confirmed caseload at Wave-3 peak and compare it with the corresponding numbers at Wave-2. Table 2 shows the statistics of the ratio between the 7-day average of daily confirmed cases at the Wave-3 peak (predicted by our model) and at the Wave-2 peak. The first line in each cell shows the ensemble mean with the 95% confidence interval and the second line shows the ensemble median. We see that children (0-17 years) are more affected than other age groups, primarily due to the unavailability of vaccines. In the worst case, with VR-280K and IENV-Nov21, the daily confirmed cases of children (0-11 years) at the Wave-3 peak will be on average twenty times more than the daily confirmed cases at the Wave-2 peak. Overall, our ensemble forecast indicates that the average impact of Wave-3 on children (0-17 years) could be seven times more than that of Wave-2.
Table 1. Statistics for total active cases and the date of the peak at various vaccine rates, and COVID-appropriate behaviour. The first value in the cell is the median date of peak and the mean value of active cases on the median date of peak, followed by the first and third quartile values for the date of peak, and the minimum and the maximum number of active cases in the inter-quartile range.
Fig. 4 shows the ensemble forecast of active cases in Karnataka from May 2021 to June 2022. Each panel shows the scenarios given a pair of CAB and IENV emergence-time control variables.
The colored solid line is the mean value of the ensemble of active cases, and the shaded zone is the region of uncertainty (each color corresponds to a particular vaccination rate). The rows in Fig. 4 represent the good, partial, and no CAB, whereas the columns represent the emergence of IENV in Jul'21, Sep'21, and Nov'21, respectively. These plots confirm the observations from our analysis above, now shown as a time series of the active caseload distribution. As mentioned earlier, the scenarios with Partial CAB (middle row in Fig. 4) are likely to be followed due to the reopening of the economy including offices and educational institutions. The impact of the new wave will be more when it emerges early (Fig. 4 (d)). At the peak of the new wave, the active caseload will be very high with the present vaccination rate, [...] In this case, the impact will be minimal. The benefits of increasing the vaccination rate can clearly be seen in Fig. 4 (e). Though a new wave is predicted in some scenarios, the active caseload is less with the doubled vaccination rate. Crucially, if a new variant does not emerge until the end of October 2021, then a new wave can be suppressed by doubling the vaccination rate (Fig. 4 (f)). In this case, our forecast shows that the active caseload will be as high as 200K with the present vaccination rate. This finding emphasizes the necessity of vaccinating the entire population by December 2021.
Table 2. Predicted impact of Wave-3 on different age groups with different vaccination rates. The first value in the cell is the mean with the confidence interval of ratios of the 7-day average of confirmed cases at Wave-3 peak and Wave-2 peak, followed by the median.

Discussion
Among the three most important causal factors of Wave-3, viz., emergence of IENV, NPI (CAB norms) and vaccination rate, only the latter two are human controlled factors. NPIs such as mobility restriction, masking, physical distancing mandates, and crowd control measures must be enforced to ensure that the general public practice CAB. Public education programs and reminders are required for strict compliance. Ensuring a smooth supply chain for vaccine and enhanced production can increase the vaccination rates to the required level so as to vaccinate the entire population before a new immune escape variant emerges. Genome sequencing and large-scale surveillance measures are required to identify the emergence of an immune escape new variant and alert the public and government. If lockdown-like restrictions are imposed to ensure Good CAB, emergence of new variants beyond September 2021 is unlikely to induce new waves. We recognize that it is practically impossible to have continued imposition of lockdown-like restrictions. Nevertheless, our modeling shows that extreme caution while rolling back restrictions is required. Our results indicate that the key weapon to open up economies is increasing the vaccination rates. Even if a third wave is imminent due to the inability to enforce NPIs for Good CAB, increasing the vaccination rate will reduce the peak active caseload. This reduction is significant as vaccination will reduce severe cases that require hospitalization, ICU, and Oxygen. Specifically, the scenario with a vaccination rate of 560K per day (twice the average vaccination rate as of 4 July 2021) would have a lower probability of Wave-3 than other scenarios when Good CAB is not enforceable. The state can prepare the vaccination logistics based on this rate. 
Apart from arranging the vaccine doses, the state has to strengthen the micro planning and ensure better social [...]
A longstanding worry among pediatricians and policymakers is the impact of COVID-19 on non-vaccinated children [27]. The ensemble forecast estimates that the daily confirmed cases of children (Age 0-11 and 12-17 years) at peak could be on average seven times more than the corresponding daily confirmed cases at Wave-2 peak. Based on the predicted increase in daily confirmed caseload of children in a potential Wave-3 compared to data of Wave-2, the policymakers can plan pediatric facilities for children with COVID-19. Further, it is crucial to set up a registry of infected children to monitor for short- and long-term consequences, including Multisystem Inflammatory Syndrome in Children (MIS-C). Evidence from a study in the USA [28] suggests that MIS-C incidence is 5.1 (95% CI, 4.5-5.8) persons per 1 million person-months. As such, we can expect 129 cases of MIS-C in one month during the peak. However, the same study also reports 316 MIS-C cases per 1 million COVID-19 infections in children. Thus, based on our results, we expect an estimated 514 children (on average) in Karnataka to suffer from MIS-C if a Wave-3 occurs. The best policy intervention would be to strengthen the healthcare services for children, including opening new pediatric ICUs. We recommend such an approach even if the reality is less severe than estimated (due to differences in India and the USA), as the facility can strengthen the country's pediatric healthcare.
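The two MIS-C estimates quoted above follow from multiplying a rate by a denominator. The denominators themselves are not stated in the text, so the short sketch below only inverts the quoted rates to show what child population and what number of child infections the figures of 129 and 514 imply. The variable names are hypothetical, and the implied values are a back-of-the-envelope check, not results from the paper.

```python
# Rates quoted from the USA study cited in the text.
MISC_PER_MILLION_PERSON_MONTHS = 5.1   # incidence per 1 million person-months
MISC_PER_MILLION_INFECTIONS = 316      # cases per 1 million COVID infections in children

# Expected MIS-C counts stated in the text.
EXPECTED_CASES_PER_MONTH = 129
EXPECTED_CASES_FROM_INFECTIONS = 514

# Inverting "cases = rate * denominator / 1e6" gives the implied denominators.
implied_child_population = EXPECTED_CASES_PER_MONTH / MISC_PER_MILLION_PERSON_MONTHS * 1e6
implied_child_infections = EXPECTED_CASES_FROM_INFECTIONS / MISC_PER_MILLION_INFECTIONS * 1e6

print(f"implied children at risk for one month: {implied_child_population:,.0f}")
print(f"implied COVID-19 infections in children: {implied_child_infections:,.0f}")
```

Under these assumptions the figures correspond to roughly 25.3 million children observed for one month and about 1.63 million infections in children.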
In short, policies to ensure at least partial compliance of CAB from the public, increased vaccination rates, surveillance for mutations and enhanced pediatric facilities are crucial to fight a new COVID-19 wave in Karnataka. Though the estimated Wave-3 numbers are for the state of Karnataka, the dynamics of the ensemble forecast will apply to all other states of India. Therefore, our recommendations can be contextualized for the policy interventions in the rest of the country.
Our projections were computed on 4 July 2021 [29]. At the time of submission of this article, July 2021 has passed and there is no evidence for the emergence of an immune-escape new variant so far. As predicted by scenarios without IENV-Jul21, no Wave-3 emerged in the month of September. The IENV-Jul21 scenarios are still significant for future planning when vaccine-escape variants emerge.
One of the limitations of our modeling is that we have not considered the possibility of new variants infecting the vaccinated population. However, we have considered a fixed efficacy of 70% for the vaccinated population that can partially offset the limitation. Further, the recovery rate is fitted to the data from July 2020 to May 2021; however, it could be more due to vaccination or less due to new variants.
From a modeling perspective, capturing the nonlinear dynamics induced by the uncertainties in the causal factors ensures a successful forecast. A national effort to build an ensemble forecast that contains multiple models, including models such as ours that account for causal factor uncertainties can strengthen the response to future waves.

IISc-COVID Model
The IISc Spatio-temporal Infectious Disease Spread COVID model proposed in [21] is employed to compute all estimates using our in-house finite element package [30,31]. The model consists of an unknown scalar function describing the dynamics of the infected population in a six-dimensional space. In particular, the active infected population is distributed in space, infection severity, duration of the infection, and age of the infected people. Let $T_\infty$ be a given final time, and let $\Omega := \Omega_x \otimes \Omega_\ell$ be the domain composed of the spatial domain $\Omega_x$ and the internal domain $\Omega_\ell := L_v \times L_d \times L_a$. Here, $L_v$, $L_d$ and $L_a$ denote the infection severity, duration of infection, and age of the people, respectively. Then, the dynamics of the infected population $I(t, x, \ell)$, with $(t, x, \ell) \in (0, T_\infty] \times \Omega_x \otimes \Omega_\ell$, is described by the population balance equation of [21]. The movement of the population within the spatial neighbourhood can be incorporated through $u$. The internal growth $G = (G_{\ell_v}, G_{\ell_d}, G_{\ell_a})^T$, the recovery rate $C_R$, and the infectious death rate $C_{ID}$ are described in our previous paper [21].
Adding to the base model, our current implementation includes antibody waning, vaccination, breakthrough infections, new immune escape variants, case-to-infection ratio (CIR) (unreported cases), social distancing, and lockdown/unlock policies. In the model, all these factors affect the new infection, i.e., the nucleation of COVID cases $B_{nuc}$.
Let us first consider the modeling of antibody waning. Let $N$ be the total number of people and $N_{ab}$ be the number of people with antibodies. Without antibody waning, the susceptible ratio is defined from $N$ and $N_{ab}$ alone. To introduce antibody waning, we use the cumulative distribution function of the Weibull distribution, $W(t) = 1 - e^{-(t/\lambda)^k}$ for $t \ge 0$, and define the antibody-waning susceptible ratio from the waning-adjusted antibody population $N^w_{ab}$.

Here, $k = 5.67$ defines the shape of the curve, $\lambda$ is the duration of antibody retention, and $t$ is the time. Next, the vaccinated population must be removed from the susceptible population. Moreover, the efficacy of the vaccine should also be considered. Suppose $N_v$ is the total number of vaccinated people with 70% efficacy; then the vaccinated population is added to $N^w_{ab}$.
Serological surveys indicate that CIR varies between 20 and 60 or even more [32]. For instance, the first serological survey report of Karnataka shows CIR = 40 [33]; that is, there are 39 unreported cases for every reported case in Karnataka during the first wave. CIR has to be incorporated into the model since it significantly influences the nucleation; accordingly, our model includes the CIR in the nucleation term.
Vaccine-dependent fatality is another critical feature in our model. The straightforward approach is to define the infectious rate as a function of the vaccinated population ratio. Alternatively, vaccine-dependent fatality can also be incorporated in the nucleation when the model accounts for the severity of the infection. In particular, the newly infected but vaccinated population must be added into the model with less severity. Since one of the internal variables in our model is the infection severity $\ell_v$ and the infectious death rate is zero for $\ell_v < 0.64$, the newly infected but vaccinated population is distributed within $\ell_v < 0.64$.
Next, we discuss the modeling of new immune escape variants, which are the primary source for new waves. Let $N_{ab}(t)$ be the total antibody population on day $t$ after the introduction of a new immune escape variant. Let us assume a fraction of the antibody population, $F_{nv} N_{ab}$, where $0 \le F_{nv} < 1$, is still immune to the new variant of SARS-CoV-2. Then, the antibody population with a new immune-escape variant is redistributed as $N^{nv}_{ab}(t) = \left((1 - F_{nv}) + F_{nv}(1 - W(t))\right) N_{ab}(t)$ with $k = 2.67$ and $\lambda = 15$.
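The waning curve and the redistribution of the antibody population under a new immune-escape variant can be sketched numerically. The sketch assumes the standard two-parameter Weibull cumulative distribution function, W(t) = 1 - exp(-(t/λ)^k), and implements the redistribution formula exactly as written above. The function names are hypothetical, and t and λ are assumed to share the same (unstated) time unit.

```python
import math


def weibull_cdf(t, k, lam):
    """Standard two-parameter Weibull CDF with shape k and scale lam."""
    if t <= 0:
        return 0.0
    return 1.0 - math.exp(-((t / lam) ** k))


def antibody_population_with_new_variant(n_ab, f_nv, t, k=2.67, lam=15.0):
    """N_ab^nv(t) = ((1 - F_nv) + F_nv * (1 - W(t))) * N_ab(t).

    n_ab : antibody population N_ab(t)
    f_nv : fraction F_nv, with 0 <= f_nv < 1
    t    : time since the new variant was introduced (same unit as lam)
    """
    if not 0.0 <= f_nv < 1.0:
        raise ValueError("f_nv must satisfy 0 <= f_nv < 1")
    w = weibull_cdf(t, k, lam)
    return ((1.0 - f_nv) + f_nv * (1.0 - w)) * n_ab


# Illustrative values only: 1000 people with antibodies, F_nv = 0.5.
at_start = antibody_population_with_new_variant(1000.0, 0.5, t=0.0)
at_scale = antibody_population_with_new_variant(1000.0, 0.5, t=15.0)
long_run = antibody_population_with_new_variant(1000.0, 0.5, t=1000.0)
```

At t = 0 the whole antibody population is retained, and for large t the expression tends to (1 - F_nv) N_ab, which is the behaviour the closed-form expression implies.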
Incorporating all these models gives the nucleation $B_{nuc}$. In the nucleation term, $\gamma_Q \in [0, 1]$ is the fraction of the infected population in quarantine, and it depends on testing, isolation, and comorbidity of the susceptible population. The $S_D \in [0, 1]$ in the function $f_1(t, S_D)$ is a social distancing parameter, where $S_D = 1$ implies a perfect social distancing and $R \to 0$. Hence, $S_D$ is a key parameter to forecast lockdown and unlock phases. Next, the function $f_5(\ell_v)$ is used to distribute the newly infected population as a function of vaccine-dependent severity. Here, the vaccinated ratio is defined in terms of $N_{Vac}$, the total number of vaccinated people.

Data of Infected, Recovered and Deceased
Crowdsourced data from www.covid19india.org [34] is utilized for fitting the model, as well as for comparing the model forecast with data from the day the fit is completed. Only three time series, viz., infected, recovered, and deceased, are utilized. The active caseload is calculated as Infected − Recovered − Deceased. The ranges of values for the other parameters in the model are fixed from the literature, as described in the modeling section.
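The active-caseload computation can be sketched as follows, assuming the three series are cumulative counts aligned by day. The function name and the sample numbers are hypothetical and only illustrate the subtraction.

```python
def active_cases(infected, recovered, deceased):
    """Active = Infected - Recovered - Deceased, element-wise per day.

    All three inputs are assumed to be cumulative counts for the same days.
    """
    if not (len(infected) == len(recovered) == len(deceased)):
        raise ValueError("the three series must have the same length")
    return [i - r - d for i, r, d in zip(infected, recovered, deceased)]


# Illustrative cumulative counts for three consecutive days (not real data).
infected = [1000, 1500, 2100]
recovered = [200, 500, 900]
deceased = [10, 15, 25]
active = active_cases(infected, recovered, deceased)
```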