Introduction

The COVID-19 pandemic has placed India’s relatively limited healthcare resources under tremendous pressure. Owing to the nationwide lockdown imposed by the Indian government from 24 March 2020, the number of COVID-19 cases in the first wave was limited. However, with the reopening of cities and their transport systems and the allowance of various festivities, India faced a very severe second wave, peaking at almost 0.4 million new cases a day. Although the peak seems to be over, the actual number of people who will be affected in the coming days is very difficult to determine. There was an exponential rise1 in new cases in the second and third waves, and as of 24 July 2022 there are close to 0.15 million active COVID-19 cases in India.

COVID-19 has disrupted demand projections, which help merchants and providers of consumer products and services determine how much to purchase or produce, where to purchase goods, and how much to promote or sell. During the early stages of the pandemic, abrupt curfews and a shift to working from home prompted panic purchases of various food products and household supplies. Some items sold out, while others remained on the shelf. Insecurity abounds today on several levels. Certain items, such as toilet paper and frozen foods, are still scarce. Food retailers are stocking seasons’ worth of basic necessities rather than days’ worth in order to prepare for the winter season, when illnesses could return and people are expected to stay at home. Amidst all this, there is speculation about what the next pandemic will be. The COVID-19 pandemic has shown each country its limitations.

The worldwide trend shows that existing medical capacity cannot meet the high health demands caused by the coronavirus pandemic2. Infectious diseases often rise to pandemic level when several risk factors occur simultaneously. Such circumstances can affect the availability of hospital beds, ICU beds, fans, PPEs and qualified medical staff across the country. It is therefore challenging for the authorities to supply all sectors of society with the required healthcare services. The Indian medical system similarly collapsed during the second wave of COVID-19 infection due to the high hospital admission rate.

If a prediction system had been available that could give a reliable estimate of the number of affected people much earlier, the authorities could have maintained stocks accordingly. Research has been done on various aspects of COVID-19, such as COVID-19 detection3,4,5 and prediction of the number of cases. Researchers have proposed various models for predicting COVID-19 cases. Various mathematical models have been used to predict and to understand the spread of the disease6. Auto-regressive Integrated Moving Average (ARIMA) has been used7,8,9 as the standard model to predict the behaviour of the infection curves in different countries. The model was able to capture the total case statistics because the number of total cases is seen to follow a standard exponential curve. In contrast, the number of new cases each day is highly uncertain and depends on many variables, and is therefore much more difficult to handle using ARIMA2,10. Researchers have explored the use of support vector machines11, adaptive neuro-fuzzy inference systems12, and exponential and non-linear growth models13 to predict the daily and active COVID-19 cases. Recurrent neural networks (RNNs) have also been tested and have become state-of-the-art models for predicting the daily number of cases. Researchers14 have compared various RNN models such as Long Short Term Memory (LSTM)15,16,17,18, Gated Recurrent Unit (GRU) and Bi-LSTMs. They have observed that these models are more robust than ARIMA or Support Vector Regression (SVR)19,20. Transfer learning and ensemble modelling have also been applied to study the statistics of daily COVID-19 cases21 and have been shown to perform even better than the standard LSTM-RNNs. Ensemble learning in conjunction with transfer learning has, however, not been tested before. It might capture trends of multiple countries and take advantage of knowledge of infection spread trends that the test country has not yet experienced.

As witnessed during the peaks of the infection waves, medical supplies were generally distributed to an area based on the real-time daily new cases in that particular area or hospital. It takes some time for these supplies to arrive, and meanwhile the patients might become critical22. Therefore, a predictive method that estimates COVID-19 statistics multiple days in advance can provide a better framework for medical logistics. A predictive model in this direction is proposed in this article. Some researchers23,24,25 have attempted multi-step prediction of cumulative COVID-19 cases. However, simultaneous multi-step prediction of daily new cases, daily fatalities and total active cases is difficult due to their chaotic nature. The proposed model alleviates these problems and gives a multi-variable (3 COVID-19 parameters: New Cases, New Deaths and Active Cases) multi-day prediction of the daily statistics.

The proposed method draws motivation from the concept of recursive learning. The aim of recursive learning is to establish a model that can learn to fill in the missing parts of its input. In prediction tasks, the recursive model trains on its own predictions26, which provides the input with feedback from the output. The proposed method has been tested on predictions seven days ahead. To achieve this, it integrates several learning methodologies into one. It is shown that such a combined model is more efficient at providing multi-day forecasts than the existing standard models.

The contribution of this work can be highlighted along the following lines:

  • Introduction of a transfer learning scenario for incorporating COVID-19 spread behavior in different countries.

  • Recursive learning for 7-day prediction i.e., using the predictions recursively for new predictions.

  • Combination of the predictions from different models using a weighted ensemble.

The rest of the article is organised as follows: The preliminaries for the proposed method are outlined in detail in section “Preliminaries”, with special emphasis on the transfer learning and recursive learning employed. Section “Proposed method” describes the proposed method. Section “Experimental results” contains the experimental results, along with a comparison with existing standard methods. Section “Discussion” contains a discussion of the results, along with statistical significance testing of the models. Section “Conclusion and future work” concludes this article with the scope for future research.

Preliminaries

This section describes the dataset and the basic building blocks which have been used in the proposed model.

Dataset

The data for this study has been taken from the Worldometers website27. This website provides COVID-19 related data, including new cases, new deaths, active cases and total tests, for 222 different countries throughout the period of the pandemic. To predict the multi-day ahead COVID-19 cases in India, we have considered data on daily new cases, daily new deaths and active cases for six different countries (the USA, Brazil, Spain, Bangladesh, Australia and India) for the period from 15 February 2020 to 16 June 2022.

Recurrent neural networks

The current solution relies on gated recurrent units (GRUs) as the basic building blocks for multi-day prediction of COVID-19 parameters. Gated recurrent units and long short-term memory (LSTMs), as prevalent members of recurrent neural networks (RNNs), have the innate ability to capture trends and seasonality in time-series data28. They are, therefore, the go-to methods for time series predictions.

GRUs are able to mitigate the exploding and vanishing gradient problem that is common for vanilla RNNs29. With its reset gate and update gate, a GRU is able to decide what information to keep from the previous state and what information to pass on to the next state. This gives GRUs the ability to keep relevant information from much earlier in the sequence while removing information that is no longer relevant for the task at hand. For a more detailed description of GRUs, see ref. 29.

GRUs are one of the simplest recurrent neural network models and have been used for time series prediction tasks in multiple domains, e.g., for traffic flow prediction30, energy load forecasting31, stock market forecasting32 and air pollution forecasting33. They have also been used for COVID-19 prediction with the help of deep learning based models14.

The present work on multi-day COVID-19 prediction uses GRUs as the basic building blocks in order to harness their effective sequence modelling ability and also to prevent over-fitting on our relatively small dataset.

LSTMs are also used for time-series prediction in multiple domains34,35,36,37. In the present work, we have compared the performances of GRUs with that of LSTMs, considering both as the basic building blocks.

Transfer learning

Transfer learning is the scenario where a pre-trained model for one particular problem is applied to a second, different but related problem38. Transfer learning tries to take advantage of what has already been learned in a problem and applies it to improve the generalization in another related problem.

The domain in which the model is trained is called the source domain and the domain in which the model is applied is called the target domain39. The source and target domains may differ, but they need to be related in some way. The predictive model in the source domain needs to be similar to that of the target domain for transfer learning to work. Transfer learning is mainly applied in target domains where sufficient labeled data is not available40.

Transfer learning has been applied in the COVID-19 scenario for different tasks. It has been used for classification of COVID-19 from non-COVID-19 patients by using chest CT images41, for face-mask detection in public areas42, COVID-19 cases and death forecasts using LSTMs43 etc.

In the present work, transfer learning has been chosen for the task of COVID-19 case prediction to learn from the experiences of countries affected by COVID-19. Countries with different circumstances, different climates, different measures for infection control are chosen as the source domain and COVID-19 cases prediction for India is done as the target domain. In one of our previous works21, it is seen that transfer learning has given better results for next day prediction for COVID-19 cases using LSTMs. In this present work, we are exploiting transfer learning with recursion for multi-day ahead prediction. The details are given below.

Recursive learning

The GRU model built in this work is able to predict the next day parameter after looking at the parameters over a period of past days (called the look-back period). As mentioned, in order to achieve a multi-day prediction of COVID-19 cases, a recursive learning methodology is adopted as shown in Figure 1.

Figure 1. Recursive learning used in COVID-19 prediction.

In a recursive way, the predicted output of the model is fed back to the input in the next step to obtain the subsequent prediction. As an example, in Fig. 1, data for day 1 to day 4 is used as input to the model to predict the data for day 5. In the next step, the predicted data for day 5 is added to the input, and the data for day 2 to day 5 is taken as input to the model to predict the output for day 6. This process is repeated recursively until the required number of days has been predicted.

This recursive learning methodology uses the sliding window approach with the previous predictions being used as a part of the input to do future predictions. It is intuitive that the performance will vary with the change in look-back period.

In this work, this recursive learning methodology works for 7 steps to predict the COVID-19 cases for the next 7-days. However, this process of recursion can be used for prediction of COVID-19 cases for any number of days in advance.
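The recursion described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: `recursive_forecast` and `naive_trend` are hypothetical names, and `naive_trend` is a trivial stand-in for the trained one-step GRU predictor.

```python
from typing import Callable, List, Sequence


def recursive_forecast(
    history: Sequence[float],
    predict_next: Callable[[Sequence[float]], float],
    look_back: int,
    horizon: int,
) -> List[float]:
    """Roll a one-step predictor forward `horizon` steps.

    Each prediction is appended to the window and the oldest value is
    dropped, so later steps rely increasingly on earlier predictions.
    """
    if len(history) < look_back:
        raise ValueError("history is shorter than the look-back period")
    window = list(history[-look_back:])
    forecasts = []
    for _ in range(horizon):
        next_value = predict_next(window)
        forecasts.append(next_value)
        window = window[1:] + [next_value]  # slide the window by one day
    return forecasts


def naive_trend(window: Sequence[float]) -> float:
    """Hypothetical stand-in for the trained model: repeat the last difference."""
    return window[-1] + (window[-1] - window[-2])


# Days 1-4 are observed; days 5-7 are predicted recursively, as in Fig. 1.
forecast = recursive_forecast([10.0, 12.0, 14.0, 16.0], naive_trend, look_back=4, horizon=3)
print(forecast)  # prints [18.0, 20.0, 22.0]
```

Setting `horizon=7` corresponds to the 7-day setting used in this work; any one-step predictor with the same call signature could be plugged in.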

Ensemble learning

Through ensemble learning one could exploit the unique abilities of multiple models in an integrated manner by combining the results obtained from various models. In this work, the results obtained through recursive learning approach initiated from various transfer-learnt models (trained on data from respective countries) are ensembled to obtain final predictions. Several ensemble techniques exist in literature44,45. In the present approach, we have proposed a weighted ensemble technique.

Performance metric

To assess the performance of the proposed approach, two metrics have been used for this study. The first is the relative mean squared error (R-MSE) and the second one is relative mean absolute error (R-MAE). Rather than considering actual value, calculating the error as a fraction of the actual value is seen to be effective. As the error value is compared relative to the actual value of the parameter, hence the term relative has been used for the standard error metrics of mean squared error and mean absolute error.

These metrics are defined as follows:

$$\begin{aligned}&\text{Relative-MSE (R-MSE)} = \frac{\sum _{i=1}^{d} \left( \frac{original(i)-predicted(i)}{original(i)} \right) ^ 2}{d} \end{aligned}$$
(1)
$$\begin{aligned}&\text{Relative-MAE (R-MAE)} = \frac{\sum _{i=1}^{d}\left| \dfrac{original(i)-predicted(i)}{original(i)}\right| }{d} \end{aligned}$$
(2)

where predicted(i) is the predicted value on the \(i\mathrm{th}\) day, original(i) is the actual value on the \(i\mathrm{th}\) day, and d is the total number of days involved. Note that R-MSE reflects large deviations more strongly than R-MAE, because squaring penalizes higher deviations with a greater error value.
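As a minimal sketch, Eqs. (1) and (2) translate directly into Python. The function names are hypothetical, and raising an error when an actual value is zero is an assumption made here, since the text does not say how such days (for example, days with no deaths) are handled.

```python
from typing import Sequence


def _check(original: Sequence[float], predicted: Sequence[float]) -> None:
    if len(original) != len(predicted) or len(original) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(o == 0 for o in original):
        # Assumption: the relative error is undefined when the actual value is zero.
        raise ValueError("original values must be non-zero")


def relative_mse(original: Sequence[float], predicted: Sequence[float]) -> float:
    """Eq. (1): mean of squared errors, each taken relative to the actual value."""
    _check(original, predicted)
    return sum(((o - p) / o) ** 2 for o, p in zip(original, predicted)) / len(original)


def relative_mae(original: Sequence[float], predicted: Sequence[float]) -> float:
    """Eq. (2): mean of absolute errors, each taken relative to the actual value."""
    _check(original, predicted)
    return sum(abs((o - p) / o) for o, p in zip(original, predicted)) / len(original)


# Illustrative values only: both predictions are off by 10% of the actual value.
actual = [100.0, 200.0]
forecast = [90.0, 220.0]
print(relative_mse(actual, forecast))  # approximately 0.01
print(relative_mae(actual, forecast))  # approximately 0.1
```

Because every error is divided by the actual value, a 10% miss contributes equally whether the actual count is 100 or 200.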

For each of the three COVID-19 parameters predicted (daily new cases, daily new deaths and total active cases), these two error values (averaged over all the prediction errors on the test set) are reported in the results section.

Proposed method

The proposed model is an ensemble of four different models pre-trained on data from four different countries (the United States of America (USA), Brazil, Spain and Bangladesh) in order to predict the COVID-19 daily new cases, daily new deaths and active cases for India for the next 7 days. The idea was to learn the infection spread in the worst affected country from each of the continents considered.

The USA and Brazil were obvious choices from the North American and South American continents due to their high numbers of COVID-19 infections. The USA has witnessed the highest number of cases. The first wave in the USA was prolonged and the second wave resulted in an increased number of deaths per day. The third and fourth waves follow a similar pattern. The USA has not witnessed a plateau in the total number of cases after the first wave. However, the daily new cases decreased considerably after the peak of the third wave. Brazil has the third highest number of cases and its death rate is currently 3015 per million population, one of the highest in the world. The total cases curve in Brazil shows a prolonged second wave and a severe third wave. Spain has been chosen from the European continent due to its well marked first, second and third waves of infection, as compared to the initial plateau of infections in Italy. Bangladesh has been chosen from the Asian continent because it has similar climatic conditions and is a neighbouring country of the test country, India. South Africa was first taken into consideration from the African continent; however, it was not incorporated into the model as the number of cases in its waves of infection has been quite low compared to the other countries selected. Australia was also taken into consideration, but was not included in the model because its infections started very late and it is still in the first wave of infections. To take population density into account, the data of each country is divided by the corresponding population density.

India witnessed a decline in the number of daily cases, which suggested the end of the third wave. However, cases have started increasing again in some parts of the country, signalling a possible fourth wave. This brings us to the crucial question of how subsequent waves of infection will develop in India. Since the four countries have shown different trends, it is not certain which path the Indian trend will follow. This is why all possible combinations of these four countries were taken into account. More countries could have been taken for pre-training, but that would have added more complexity to the model.

Each of these models has been trained using a sliding window technique with a look-back period of 14 days. The look-back period is the number of past days the model looks at while predicting the cases for the next day. In this way, the model can learn the relation between the numbers of cases on successive days and give better predictions. The look-back period has been varied from 7 to 19 in order to find the best value for the 7-day ahead prediction task.
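The sliding-window construction of training pairs can be illustrated as follows. This is a hedged sketch: `make_windows` is a hypothetical helper, and a single variable is used for brevity, whereas the work predicts three parameters.

```python
from typing import List, Sequence, Tuple


def make_windows(
    series: Sequence[float], look_back: int
) -> Tuple[List[List[float]], List[float]]:
    """Turn a daily series into (window, next-day target) training pairs."""
    inputs, targets = [], []
    for start in range(len(series) - look_back):
        inputs.append(list(series[start:start + look_back]))  # look_back past days
        targets.append(series[start + look_back])             # the following day
    return inputs, targets


# Toy series of six days with a look-back of four days.
inputs, targets = make_windows([1, 2, 3, 4, 5, 6], look_back=4)
print(inputs)   # prints [[1, 2, 3, 4], [2, 3, 4, 5]]
print(targets)  # prints [5, 6]
```

With a look-back of 14, a series of N days yields N - 14 such pairs, each pairing two weeks of history with the following day's value.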

The proposed model is shown in Fig. 2. It consists of five main steps, which are discussed in the subsequent sections.

Figure 2. Flowchart of the proposed method.

Step 1: Train-test splitting of input data

Data from the period 15 February 2020 to 31 December 2021 has been used for training the models for the four individual countries. This period has been chosen so that 80% of the total data is used for training. Indian data for the period 15 February 2020 to 31 December 2021 has been used for fine-tuning the transfer learning models before testing them on Indian data for the period 16 January 2022 to 16 June 2022. The remaining period of 01 January 2022 to 15 January 2022 for the Indian data has been used for cross-validation in the ensemble weighted averaging, as described in the subsequent paragraphs.
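The split can be checked with a short date calculation. The sketch below assumes one record per calendar day with no gaps, which the text does not state explicitly; the constant and function names are hypothetical.

```python
from datetime import date, timedelta
from typing import Dict, List

# Period boundaries as stated in the text.
TRAIN_START, TRAIN_END = date(2020, 2, 15), date(2021, 12, 31)
VAL_START, VAL_END = date(2022, 1, 1), date(2022, 1, 15)
TEST_START, TEST_END = date(2022, 1, 16), date(2022, 6, 16)


def date_range(first: date, last: date) -> List[date]:
    """All calendar days from `first` to `last`, inclusive."""
    return [first + timedelta(days=k) for k in range((last - first).days + 1)]


splits: Dict[str, List[date]] = {
    "train": date_range(TRAIN_START, TRAIN_END),
    "validation": date_range(VAL_START, VAL_END),
    "test": date_range(TEST_START, TEST_END),
}
sizes = {name: len(days) for name, days in splits.items()}
train_fraction = sizes["train"] / sum(sizes.values())
print(sizes)                     # prints {'train': 686, 'validation': 15, 'test': 152}
print(round(train_fraction, 2))  # prints 0.8
```

Under that assumption, the training period covers 686 of 853 days, which is consistent with the stated 80% share.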

Step 2: Forming the model

Two RNN models (LSTMs and GRUs) have been taken independently as the building blocks for the proposed method. The proposed RNN models with parameters and their values chosen are given in Table 1. As mentioned earlier, the GRU and the LSTM models are trained and tested in order to do performance comparison of the two RNN types.

Table 1 Values of the parameters of the RNN models.

Step 3: Transfer learning

As stated earlier, the proposed model consists of all 16 possible combinations of four RNN networks of two types (GRUs and LSTMs), each of which is pre-trained on data from four different countries. To pre-train each model, the data of each of the countries is taken from 15 February 2020 to 31 December 2021, as mentioned earlier. The models built on the individual countries need to be fine-tuned on Indian data in order to take into account the recent trend of COVID-19 infections in the target country, India. Therefore, the pre-trained models have been fine-tuned on Indian data for the same period 15 February 2020 to 31 December 2021.

This method of pre-training followed by fine-tuning introduces a transfer learning46 ability in the individual models. Li et al.47 have shown that transfer learning can improve forecasting models that are based on deep learning. They built a source domain of 12 countries by combining their data and tried to predict the confirmed cases per million for the target countries. However, the scope of that study is limited, as it predicts just one COVID-19 parameter and was tested on a shorter time period (31/12/2019–31/05/2020). In the proposed approach, by contrast, three COVID-19 parameters are predicted 7 days in advance using an ensemble of transfer-learnt recursive models over a longer time period.
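The pre-train then fine-tune pattern can be illustrated with a deliberately small stand-in model. The sketch below swaps the GRU for a linear autoregressive model trained by full-batch gradient descent, so it shows the training pattern rather than the authors' network; all function names, the two toy series and the hyperparameters are hypothetical.

```python
from typing import List, Sequence, Tuple

Weights = Tuple[List[float], float]  # (window coefficients, bias)


def make_windows(series: Sequence[float], look_back: int):
    xs = [list(series[i:i + look_back]) for i in range(len(series) - look_back)]
    ys = [series[i + look_back] for i in range(len(series) - look_back)]
    return xs, ys


def predict(weights: Weights, window: Sequence[float]) -> float:
    coeffs, bias = weights
    return sum(c * x for c, x in zip(coeffs, window)) + bias


def mse(weights: Weights, xs, ys) -> float:
    return sum((predict(weights, x) - y) ** 2 for x, y in zip(xs, ys)) / len(ys)


def train(weights: Weights, xs, ys, epochs: int, lr: float = 0.05) -> Weights:
    """Full-batch gradient descent on the MSE, starting from `weights`."""
    coeffs, bias = list(weights[0]), weights[1]
    n = len(ys)
    for _ in range(epochs):
        errors = [predict((coeffs, bias), x) - y for x, y in zip(xs, ys)]
        grads = [2 * sum(e * x[j] for e, x in zip(errors, xs)) / n
                 for j in range(len(coeffs))]
        grad_bias = 2 * sum(errors) / n
        coeffs = [c - lr * g for c, g in zip(coeffs, grads)]
        bias -= lr * grad_bias
    return coeffs, bias


LOOK_BACK = 2
source_series = [0.1, 0.2, 0.4, 0.7, 0.9, 0.6, 0.3, 0.2]  # toy "source country"
target_series = [0.2, 0.3, 0.5, 0.8, 0.6, 0.4]            # toy "target country"
src_x, src_y = make_windows(source_series, LOOK_BACK)
tgt_x, tgt_y = make_windows(target_series, LOOK_BACK)

# Pre-train on the source domain, then continue training on the target domain.
pretrained = train(([0.0] * LOOK_BACK, 0.0), src_x, src_y, epochs=200)
fine_tuned = train(pretrained, tgt_x, tgt_y, epochs=50)

print("target MSE before fine-tuning:", mse(pretrained, tgt_x, tgt_y))
print("target MSE after fine-tuning: ", mse(fine_tuned, tgt_x, tgt_y))
```

The key design point is that fine-tuning starts from the pre-trained weights instead of a fresh initialisation, so knowledge from the source series is carried into the target model.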

Step 4: Recursive learning

To obtain 7-day ahead predictions, we have incorporated recursive learning in the proposed model. If the COVID-19 parameters for the next 7 days are to be predicted on the \({n}\mathrm{th}\) day, then in the first step the model predicts the COVID-19 parameters for the next day (the \({(n+1)}\mathrm{th}\) day). This prediction for the \({(n+1)}\mathrm{th}\) day is then fed back to the input to make the new input frame for predicting the COVID-19 parameters for the \({(n+2)}\mathrm{th}\) day. This process is repeated 7 times to get the predicted values for the 7 days (\({(n+1)}\mathrm{th}\) day to \({(n+7)}\mathrm{th}\) day). An example of the process involved is shown in Fig. 1. This method can be applied to predict the COVID-19 parameters for a different time period by changing the period setting. However, for a stable comparison, the prediction period has been set at 7 days for this study. Experiments have also been done to study the effect of varying the look-back period used for prediction.

Step 5: Ensemble

As the situations of the 4 countries involved in the model differ (e.g., in lockdown measures, vaccination strategies and climatic conditions), different weights need to be given to them for further prediction. The weight values are determined from the errors obtained on validation data.

To do so, once the predictions are obtained for subsequent 7-days, the predictions from the combination of models are aggregated using weighted averaging. The weights are calculated based on the relative mean squared error (R-MSE) obtained from model cross-validation on validation data. 15 days of data for India (01 January 2022–15 January 2022) is kept aside for this validation task.

The relative mean squared error (R-MSE) for the validation data is calculated using Eq. (1). The weight (\(w_i\)) for Model i is given by Eq. (3), and the final prediction on any date D is given by Eq. (4). We have also compared our proposed ensemble method with an equally weighted ensemble technique.

$$\begin{aligned} w_i=\dfrac{1/\text{R-MSE}(\text{Model}~i)}{\sum _{j=1}^{n}1/\text{R-MSE}(\text{Model}~j)}, \end{aligned}$$
(3)

where \(\text{R-MSE}(\text{Model}~i)\) is the relative MSE obtained for the \(i\mathrm{th}\) model on the validation data and n is the number of models involved in the ensemble. A transfer-learnt model with a lower error is given a higher weight as per Eq. (3).

$$\begin{aligned} prediction(D)=\sum _{i=1}^{n} w_i \cdot prediction_i(D), \end{aligned}$$
(4)

where \(prediction_i(D)\) is the prediction by the \(i\mathrm{th}\) model for date D. This is a weighted summation of the predictions from the transfer-learnt models.
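Eqs. (3) and (4) can be sketched in Python as below. The function names are hypothetical, and the validation errors and per-model predictions are made-up numbers chosen for easy arithmetic, not results from this study.

```python
from typing import Dict


def inverse_error_weights(validation_rmse: Dict[str, float]) -> Dict[str, float]:
    """Eq. (3): weights proportional to 1 / R-MSE, normalised to sum to one."""
    inverse = {name: 1.0 / err for name, err in validation_rmse.items()}
    total = sum(inverse.values())
    return {name: value / total for name, value in inverse.items()}


def ensemble_prediction(weights: Dict[str, float], predictions: Dict[str, float]) -> float:
    """Eq. (4): weighted sum of the individual model predictions for one date."""
    return sum(weights[name] * predictions[name] for name in weights)


# Illustrative validation errors: the second model has half the error of the first.
weights = inverse_error_weights({"model_a": 0.002, "model_b": 0.001})
combined = ensemble_prediction(weights, {"model_a": 900.0, "model_b": 1200.0})
print(weights)   # approximately {'model_a': 0.333, 'model_b': 0.667}
print(combined)  # approximately 1100.0
```

An equally weighted ensemble corresponds to replacing every weight by 1/n, which gives 1050.0 in this toy example.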

Experimental results

In order to predict the cases on any date, we need to use a look-back period. This period is the number of past days our model looks at when making a prediction. Finding the optimum value of the look-back period is crucial for the proposed method, because the performance of the models varies considerably with the look-back period.

Note that, since we rely on recursive learning based multi-day prediction, where the predicted values are used as inputs for the subsequent predictions, we cannot afford a look-back period smaller than the number of days in the multi-day prediction task. Otherwise, the last few predictions (of the recursive learning methodology) would be made only on earlier predictions and not on any actual data. We have experimented with a wide range of look-back periods, and a value of 14 gives the best results for all the three variables.
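The constraint on the look-back period can be made concrete by counting how many actual observations remain in the input window at each recursive step. The helper below is a hypothetical illustration of that bookkeeping, not part of the proposed model.

```python
from typing import List


def observed_inputs_per_step(look_back: int, horizon: int) -> List[int]:
    """Number of actual (not predicted) values in the window at each step.

    At step 0 the whole window is observed; every later step replaces one
    more observed value with an earlier prediction.
    """
    return [max(look_back - step, 0) for step in range(horizon)]


print(observed_inputs_per_step(14, 7))  # prints [14, 13, 12, 11, 10, 9, 8]
print(observed_inputs_per_step(4, 7))   # prints [4, 3, 2, 1, 0, 0, 0]
```

With a look-back of 14, even the seventh prediction still sees eight actual observations, whereas with a look-back of 4 the last three predictions are based purely on earlier predictions.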

The values of R-MSE and R-MAE (averaged over 20 runs) obtained for all 16 combinations using the proposed model (with both GRUs and LSTMs), together with those for the support vector regression (SVR), auto-regressive integrated moving average (ARIMA) and Facebook Prophet models, are shown in Tables 2 and 3. The results are shown separately for the 3 COVID-19 parameters predicted in this study (new cases, new deaths and active cases). To establish the efficacy of fine-tuning on Indian data, the results obtained for the above-mentioned methods using GRUs without fine-tuning are given in Table 4; these are also averages over 20 independent runs. It is seen that the combination model of Spain and Bangladesh gives the best results for multi-day forecasting of all the three predicted variables.

Table 2 Comparison of results i.e. Mean and Standard Deviation (STD) of 20 runs, from all possible combinations of four countries considered in this study.
Table 3 Comparison of results i.e. Mean and Standard Deviation (STD) of 20 runs, from all possible combinations of four countries considered in this study.
Table 4 Comparison of results i.e. Mean and Standard Deviation (STD) of 20 runs, from all possible combinations of four countries considered in this study.

Discussion

It is clearly visible from Tables 2 and 3 that the results are similar for both GRUs and LSTMs with fine-tuning on Indian data. Also, all the models improve with fine-tuning on Indian data, as seen by comparing Tables 2 and 3 with Table 4. This is expected in transfer learning: if a model pre-trained on data from the source domain is fine-tuned on data from the target domain, the results improve with respect to the model that is only pre-trained on data from the source domain46.

Since GRUs and LSTMs yield similar results, as seen in Tables 2 and 3, GRUs have been preferred over LSTMs for the present experimentation as they have fewer parameters than LSTMs48. Hence, for the rest of this discussion, only the results obtained with the GRU networks are analyzed further.

New cases

For the single-country models, the Bangladesh model gives the best results, followed by Brazil, Spain and the USA. The model built using the data from Bangladesh is able to track the trend in the Indian data very accurately. For the two-country combinations, the presence of Bangladesh (with its better trend-tracking behaviour for Indian data) influences the performance positively. The Spain-Bangladesh combination gives the best result among all the combinations, with an R-MSE of 0.0013 and an R-MAE of 0.0177. Combining more than two country models does not improve the performance beyond that of the Spain-Bangladesh model.

New deaths

For the single-country models, the Brazil model gives the best results, followed by Bangladesh, Spain and the USA. Bangladesh is again one of the better models, as it was for new cases. For the two-country combinations, the Spain-Bangladesh model gives the best results, with an R-MSE of 0.0021 and an R-MAE of 0.0286. Adding data from the other two countries does not decrease this error any further.

Active cases

Prediction of active cases gives the best results of the three parameters. For the single-country models, Brazil gives the best results, followed by the USA, Spain and Bangladesh. Among the combined models, the Spain-Bangladesh model improves on this only marginally, with an R-MSE of 0.0009 and an R-MAE of 0.017. The rest of the results are all similar, with no further improvement in the error metrics.

Analysis

For the two-country models, the Spain-Bangladesh combination (highlighted in bold in Table 2) gives a lower R-MSE than Bangladesh alone. The result improves when the pre-trained Spain model is combined with the pre-trained Bangladesh model.

Once the Spain-Bangladesh combination has been built, adding models built with data from the other two countries (Brazil and the USA) does not reduce the error any further. One possible reason is that, although Brazil has a high number of cases and a geography similar to that of India, the spread of infection there happened later than in India.

Bangladesh has climatic conditions and patterns of people’s behaviour similar to those of India. Also, the percentage of people vaccinated with respect to the total population is similar in India and Bangladesh. As a result, the infection trend in Bangladesh has a positive impact on predicting the infection trend in India. It may also be noted that both India and Spain were vigilant at the start of the pandemic and imposed strict infection control measures such as lockdowns and social distancing. One study49 corroborated this and compared the spread of COVID-19 infection in Spain and India by analysing the policy implications using epidemiological and social media data. Spain was one of the early COVID-19 infected countries and is already at the end of its fourth wave of infections, whereas India is at the end of its third wave. Also, the spread increases sharply and then falls rapidly in both India and Spain. Such similar characteristics might be responsible for the low prediction errors obtained for the models built with Spain data.

Note that the India model shown in Table 2 is one where the GRU model is both built and trained on India data only, i.e., it does not involve any transfer learning. It can be clearly seen from Tables 2 and 4 that the transfer-learnt models are better predictors of all the three COVID-19 parameters. Models built with support vector regression (SVR), with both polynomial and RBF kernels, are unable to predict the COVID-19 parameters with a good level of accuracy. The same is the case with ARIMA and Facebook Prophet50. The different nature of infection spread in different waves is difficult to account for in SVR and ARIMA based predictions, as seen from the high prediction errors obtained using these models. Predicting daily new deaths is especially uncertain, since factors such as comorbidity and age play a significant role in new deaths.

Hence, the proposed method, with the advantage of transfer learning from Bangladesh and Spain data combined, is able to predict the number of daily new cases, daily new deaths and active cases with the least error.

The standard deviation (STD) of the results has also been studied. The STD values over 20 runs for the prediction of new cases and active cases, for all the models fine-tuned on India data, are shown in light blue in Fig. 3a and b respectively, clearly showing the Spain-Bangladesh combination to be the best model.

Figure 3. R-MSE and STD over 20 runs for all the models.

The variation of the prediction error with the look-back period for new cases and active cases is shown in Fig. 4a and b. The prediction error is lowest for a look-back period of 14 in all three cases.

Figure 4. Prediction error (R-MSE) vs look-back period for the Spain-Bangladesh model.

As mentioned earlier, the present method uses a weighted ensemble to combine the models built with individual countries. The performance of the weighted ensemble is compared with that of an equally weighted ensemble (where a simple average is taken to combine the models), and the results for the best model (Spain-Bangladesh) are shown in Table 5. It is seen that the proposed weighted ensemble performs better than the equally weighted ensemble.

Table 5 Comparison of results for Spain-Bangladesh model (with fine-tuning), using the proposed weighted ensemble and equally weighted ensemble.

Statistical significance testing

The proposed method was tested on a total of 167 test sets, each with a duration of 7 days (i.e., a 7-day prediction). To assess the statistical significance of the best model obtained in our comparisons (the Spain-Bangladesh model), it has been tested against each of the other single-country and two-country models using the Wilcoxon signed rank test.

Models with more than two countries were not included in the significance testing, as they can be thought of as extensions of the two-country models. Only the prediction errors for new cases have been used in this testing.

The Wilcoxon signed rank test is a non-parametric test for hypothesis testing on paired dependent samples. More details on the Wilcoxon signed rank test can be found in ref. 51.

For the significance testing, the 167 R-MSE values (one per test set) of the two models being compared are treated as 167 paired observations. The Wilcoxon signed rank test has been used instead of the paired t-test because the differences between the observations are seen not to be normally distributed. Here,

Null hypothesis - H\(_0\): There is no difference between the model and Spain-Bangladesh model.

Alternate hypothesis - H\(_1\): There is a difference between the model and Spain-Bangladesh model.

Table 6 Wilcoxon signed rank test Z-score of the models with respect to Spain-Bangladesh Model.

A two-tailed hypothesis test with a significance level of 0.05 has been used. The Z-score for each of the one-country and two-country models when compared with the Spain-Bangladesh model is given in Table 6. All the Z-scores in Table 6 are less than − 1.96, the critical Z-score for a two-tailed test at a significance level of 0.05. Hence, the null hypothesis is rejected in every case, and it can be concluded that the performance of the Spain-Bangladesh model differs significantly from that of the other models considered.
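For readers who wish to reproduce this kind of comparison, a Z-score for the Wilcoxon signed rank test can be computed with the usual normal approximation, as sketched below. This is an assumption about the procedure, since the text does not state how the Z-scores were obtained; the sketch drops zero differences, averages ranks over ties, and applies neither a tie correction nor a continuity correction. The function name and the toy error values are hypothetical.

```python
import math
from typing import List, Sequence


def wilcoxon_z(errors_a: Sequence[float], errors_b: Sequence[float]) -> float:
    """Normal-approximation Z-score of the Wilcoxon signed rank statistic."""
    diffs = [a - b for a, b in zip(errors_a, errors_b) if a != b]  # drop zero differences
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")
    order = sorted(range(n), key=lambda idx: abs(diffs[idx]))
    ranks: List[float] = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j over a run of tied absolute differences.
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    w = min(w_plus, w_minus)  # smaller rank sum, so the Z-score is never positive
    mean = n * (n + 1) / 4
    std = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    return (w - mean) / std


# Toy paired errors: the first model is worse on every one of 20 test sets.
other_model = [float(k) for k in range(1, 21)]
best_model = [0.0] * 20
z = wilcoxon_z(other_model, best_model)
print(round(z, 2))  # prints -3.92
```

A Z-score below −1.96, as in this toy example, would lead to rejecting the null hypothesis at the 0.05 level in a two-tailed test.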

Conclusion and future work

Prediction of various COVID-19 parameters 7 days in advance can substantially improve the handling of pandemics. Accurate prediction can help in planning the allocation of scarce medical resources such as life-saving medicines, oxygen and hospital beds. Understanding the infection spread well in advance can also lead to better administrative decisions.

The proposed method for the prediction of new cases, new deaths and active cases for COVID-19 for India with the combination of Spain-Bangladesh model outperforms the other combinations as well as other traditional regression models considered in our experiment. This is because the proposed method leverages the capabilities of both transfer learning and ensemble learning while taking into account the excellent sequence modelling abilities of GRUs. The multi-day ahead prediction using recursive learning provides an added benefit of knowing the COVID-19 statistics multiple days ahead. The proposed method has currently been tested only on India data and can be extended for any country considering relevant source countries as the basis for transfer learning.

This study is data driven and does not explicitly take into account the steps adopted by the government to handle the pandemic. The work is experimental in nature by considering various combinations of source countries to evaluate the behaviour of the models. Better prediction could be made if such policy decisions are incorporated into the model.

The proposed method can be extended for individual states of India by incorporating information of other Indian states or other countries having similar features. Similarly, this could also be explored for predicting the cases for other countries. Individual waves of infection could also be analysed by using the transfer learning phenomenon from other waves of infection.