Previously submitted to: Journal of Medical Internet Research (no longer under consideration since Mar 08, 2021)

Date Submitted: Jul 19, 2020

Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.

Prediction of the Number of COVID-19 Confirmed Cases Based on K-Means-LSTM

  • Shashank Vadyala
  • Sai Nethra Betgeri

ABSTRACT

Background:

COVID-19 is a pandemic disease that began to spread rapidly in the US after the first case was detected on January 19, 2020, in Washington State. Case counts then increased rapidly from March 9, 2020, reaching a total of 25,739 cases as of April 20, 2020. The COVID-19 pandemic is unnerving in part because it is difficult to predict how any given person will be affected by the virus. Although most people with coronavirus (81%, according to the U.S. Centers for Disease Control and Prevention [CDC]) will have few to mild symptoms, others may rely on a ventilator to breathe, or may not be able to breathe at all. SEIR models have broad applicability in predicting population outcomes for a variety of diseases. However, many researchers use these models without validating the necessary hypotheses. Far too many researchers overfit the data by building models with too many predictor variables and small sample sizes. Models developed this way are unlikely to hold up when validated on a separate population or region. Without attempting such validation, the researcher remains unaware that overfitting has occurred.

Objective:

This study aimed to predict the incidence of COVID-19 in the state of Louisiana, USA, using XGBoost, K-Means, and long short-term memory (LSTM) neural networks to construct a prediction model (i.e., K-Means-LSTM) for short-term COVID-19 confirmed cases.

Methods:

In this paper, we present an algorithm that combines region-based similar-day feature selection using XGBoost, K-Means, and long short-term memory (LSTM) neural networks to construct a prediction model (i.e., K-Means-LSTM) for short-term forecasting of COVID-19 cases in the state of Louisiana, USA.
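
The similar-day selection step can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the per-day feature values and the feature weights are hypothetical, the weights merely stand in for normalized XGBoost feature importances, and the LSTM stage that would be trained on the selected days is omitted.

```python
import math

def weighted_distance(x, y, weights):
    """Euclidean distance with one non-negative weight per feature."""
    return math.sqrt(sum(w * (a - b) ** 2 for a, b, w in zip(x, y, weights)))

def weighted_kmeans(points, weights, k, max_iter=50):
    """Plain k-means using the weighted distance above.

    Centroids are initialized from the first k points so the
    sketch is deterministic.
    """
    centroids = [list(p) for p in points[:k]]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each day joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: weighted_distance(p, centroids[c], weights))
            for p in points
        ]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
    return labels, centroids

# Hypothetical, already-scaled features for five days (not study data).
days = [(0.1, 0.9), (0.9, 0.8), (0.2, 0.1), (0.8, 0.2), (0.15, 0.5)]
# Assumed feature weights, standing in for XGBoost importances.
weights = [0.8, 0.2]

labels, _ = weighted_kmeans(days, weights, k=2)
target = len(days) - 1  # treat the last day as the day to forecast
similar_days = [
    i for i, lab in enumerate(labels) if lab == labels[target] and i != target
]
print(similar_days)  # → [0, 2]
```

Because the first feature carries most of the weight, days 0 and 2 are grouped with the target day even though their second feature differs widely. In a pipeline of the kind described, only such similar days would be passed to the LSTM as training history.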

Results:

A weighted k-means algorithm based on extreme gradient boosting is used to evaluate the similarity between the forecast days and past days. The results show that the K-Means-LSTM method is more accurate, with a root mean squared error (RMSE) of 601.20, whereas the SEIR model has an RMSE of 3615.83.
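
For readers unfamiliar with the metric, RMSE is the square root of the mean squared difference between observed and predicted counts, so lower is better; the two values reported above differ by a factor of roughly 6. The sketch below shows the computation on hypothetical case counts that are not taken from the study.

```python
import math

def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(actual) != len(predicted):
        raise ValueError("sequences must have the same length")
    if not actual:
        raise ValueError("sequences must be non-empty")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Hypothetical daily confirmed-case counts (illustrative only).
actual = [100, 120, 150, 170]
forecast = [110, 115, 140, 180]

print(rmse(actual, forecast))
```

Because the errors are squared before averaging, a few large misses on individual days raise RMSE more than many small ones.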

Conclusions:

Accurate COVID-19 case forecasting is essential for public health authorities to coordinate patient care and other services efficiently and in a timely manner in response to the epidemic. In this research, we propose a K-Means-LSTM neural network to address the variance and precision problems of the traditional SEIR model in predicting the number of reported cases. The findings of the study can help policymakers and healthcare providers prepare and provide the resources needed to handle the situation in these states over the coming days and weeks, including nurses, beds, and intensive care facilities. The data should be updated in real time for more precise comparison and future perspectives.


 Citation

Please cite as:

Vadyala S, Betgeri SN

Prediction of the Number of COVID-19 Confirmed Cases Based on K-Means-LSTM

JMIR Preprints. 19/07/2020:22655

DOI: 10.2196/preprints.22655

URL: https://preprints.jmir.org/preprint/22655

© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.
