An Efficient COVID-19 Mortality Risk Prediction Model Using Deep Synthetic Minority Oversampling Technique and Convolution Neural Networks

Abstract: The COVID-19 virus has had a huge impact on people’s lives since the outbreak in December 2019. The virus has not yet vanished, and global concern continues to grow as new mutations and variants emerge. Early diagnosis is the best way to reduce the associated mortality risk. This calls for new computational approaches that can analyze large datasets and predict the disease in time. Automated virus diagnosis is currently a major area of research for accurate and timely predictions. Artificial intelligence (AI)-based techniques such as machine learning (ML) and deep learning (DL) can be deployed for this purpose. Compared with traditional machine learning techniques, deep learning approaches show prominent results, yet they still require optimization in terms of complex space problems. To address this issue, the proposed method combines deep learning predictive models, namely convolutional neural networks (CNN), long short-term memory (LSTM), and auto-encoders (AE), with cross-validation (CV) and the synthetic minority oversampling technique (SMOTE). Six combinations of deep learning forecasting models are proposed: CV-CNN, CV-LSTM+CNN, IMG-CNN, AE+CV-CNN, SMOTE-CV-LSTM, and SMOTE-CV-CNN. The performance of each model is evaluated using various metrics on a standard dataset approved by the Montefiore Medical Center/Albert Einstein College of Medicine Institutional Review Board. The experimental results show that the SMOTE-CV-CNN model outperforms the other models, achieving an accuracy of 98.29%. Moreover, the proposed SMOTE-CV-CNN model has been compared with existing mortality risk prediction methods based on both ML and DL and has demonstrated superior accuracy. Based on the experimental analysis, it can be inferred that the proposed SMOTE-CV-CNN model can effectively predict mortality related to COVID-19.


Introduction
The COVID-19 virus is a new type of virus that was first identified in Wuhan, China, in December 2019. It quickly spread around the world, causing a pandemic that has affected over 76 million people and resulted in the deaths of over 6 million people as of 20 April 2023. While most people experience mild to moderate respiratory illness and recover without special treatment, older individuals and those with certain medical conditions are at higher risk of severe illness, which increases the mortality rate of COVID-19 [1]. Symptoms of COVID-19 are similar to those of the flu and can be difficult to detect at an early stage, making efficient testing and treatment systems crucial [2]. The current method for detecting COVID-19, RT-PCR testing, is time-consuming and expensive, with a low detection rate. Hence, researchers and physicians are turning to artificial intelligence (AI)-based applications for various purposes, such as early disease prediction, disease diagnosis, professional decision assistance, patient support, and mortality risk assessment for serious illnesses.
Lately, AI is rapidly growing with machine learning (ML) and deep learning (DL) algorithms. DL algorithms have shown promising results in various medical applications such as classifying skin cancer, detecting breast cancer, classifying pneumonia, and segmenting lungs. DL algorithms such as convolutional neural networks (CNNs), long short-term memory (LSTM), and auto-encoder are the most widely used techniques these days [3,4]. This study investigates the effectiveness of both CNN and LSTM models in forecasting COVID-19 mortality. The incorporation of data-augmentation techniques has further improved the performance of these models, which is discussed in Section 4.3. To assess the mortality risk of COVID-19 patients, these models are compared and evaluated against current predictive models in both the deep learning and machine learning domains.

Contributions
The main contributions of this research are:
• Generating six deep learning predictive models that help identify people with COVID-19 who are at higher risk of mortality. The medical dataset is also transformed into images and applied to the proposed IMG-CNN algorithm.
• Assessing the suggested models and comparing them with earlier research work.
• Improving the model performance using data-augmentation techniques.

Organization of Paper
The rest of the paper is organized as follows: Section 2 describes the literature survey on predicting mortality risk and severity. Section 3 presents the application scenario of the proposed models. Section 4 presents the methodology in which Section 4.1 discusses the dataset, Section 4.2 explains the preprocessing process, Section 4.3 explores the data-augmentation techniques, and Section 4.4 elaborately presents the proposed models. Section 5 presents the experimental results and discussions. Finally, Section 6 concludes the paper.

Literature Survey
Using deep learning modules, Tekerek [5] suggested an architecture for anomaly-based web attack detection in a web application. The CSIC2010v2 dataset was considered. As one of the most well-known datasets, it is frequently used for online application security. The preprocessing can be broken down into six steps: cleaning, missing values, noisy data, integration, reduction, and transformation. The model considered for this application was a CNN model. CNN is usually used in image-processing studies. It is a special case of neural networks where at least one of the layers is a convolutional layer. The final evaluation was performed by considering the F1-score, precision, recall, accuracy, false negative rate (FNR), false positive rate (FPR), true negative rate (TNR), and true positive rate (TPR).
Xu et al. [6] actively collected a growing body of COVID-19 patient data and provide real-time case information for people to use. Both official government websites and peer-reviewed scientific papers that present primary data serve as the sources for the data. The official websites of the health departments and the social media profiles of government and public health organizations are examples of government sources. The data provided by them have been used by numerous authors to train their models. The database has 32 features. After some extraction, the number of features can go up to 83. The authors used one-hot encoding on the symptom data. The authors also provide a fixed dataset that does not have real-time case information. In the paper, they discuss their data collection and augmentation techniques.
Banoei et al. [7] considered a dataset containing 250 clinical features. In this dataset, only 108 clinical features, comorbidities, and blood markers recorded at the moment of admission from a hospitalized cohort of patients are subjected to multivariate predictive analysis. The mortality risk of COVID-19 individuals was forecasted using a partial least squares-based model. The model was able to forecast patient mortality with good accuracy (AUC > 0.85) and average predictive power (Q2 = 0.24).
The use of supervised learning techniques and their capacity to decipher intricate patterns in actual medical data were the main topics of Bikku's [8] research. The model is made up of multilayer perceptrons (MLPs). An MLP is a feed-forward neural network with an input layer, one or more hidden layers, and an output layer. Each layer serves a distinct purpose. To determine the most important health variables and forecast the mortality risk of COVID-19 patients, an integrated predictive model utilizing a decision tree (DT), support vector machine (SVM), logistic regression (LR), random forest (RF), and K-nearest neighbor (KNN) was put into place. The sample used was obtained from the UCI ML repository. It consisted of records of more than 2,670,000 patients affected by COVID-19, collected from 146 countries across the globe. The assessment was carried out using a confusion matrix, from which metrics such as accuracy, precision, recall, specificity, and F1-score were determined. The model's obtained accuracy is 89.98%.
Yan et al. [9] used blood sample data belonging to 485 patients infected by COVID-19 in Wuhan, China. The objective was to find crucial predictive biomarkers of disease mortality. The feature importance was identified by ranking the features using a multi-tree XGBoost. Lactic dehydrogenase, lymphocytes, and C-reactive protein are some crucial disease indicators. This information is used as input into a decision tree algorithm to identify patients who require urgent medical care more than 10 days in advance. The patient information included blood sample information of all patients collected between 10 January and 18 February 2020. Using categorization accuracy, precision, sensitivity, recall, and F1-scores, the model's performance was assessed. The model's precision was a little bit higher than 90%.
A novel technique, the CNN-AE, which is a CNN trained with clinical data from patients with COVID-19 damage, was suggested by Khozeimeh et al. [10]. The dataset used by the authors is highly unbalanced. It was a sample of 320 patients, of whom 300 recovered and 20 died. On its own, this imbalance would result in a very poor model; the authors therefore included an auto-encoder model that produces additional artificial samples of the minority class (deceased patients). The 20 instances of the minority class are given as input to an AE model that compresses and decompresses the data. The outcome of the model is 20 artificial samples. By repeating this process several times, the authors were able to balance the dataset. This dataset produced high accuracy in training. The precision of the suggested model was 96.05%.
Pourhomayoun and Shakibi [2] used several machine learning techniques for mortality risk prediction. Some of these are SVM, artificial neural networks, DT, LR, and KNN. In the process, they were also able to identify the most important symptoms that would be useful for prediction. A total of 2,670,000 samples from 146 different nations made up the information used for this procedure. Out of these, 307,000 samples were labeled samples. The performance of each technique was tabulated and presented. The evaluation criteria were accuracy, receiver operating characteristic (ROC), AUC, and the confusion matrix.
Khan et al. [11] implemented several machine learning algorithms as well as deep learning algorithms. Decision tree, logistic regression, random forest, extreme gradient boosting, and K-nearest neighbor are some of the machine learning methods that were used. A DL model contains six layers: an input layer, a CONV layer, a max pooling layer, a ReLU layer, a fully connected layer, and an output layer with softmax implemented. The dataset used for this purpose had 2,676,000 samples collected from 146 countries. The data were split using K-fold cross-validation that was then used for each model training.
Tezza et al. [12] used machine learning techniques to predict mortality risk. The algorithms used were recursive partition tree, gradient boosting machine, random forest, and SVM. The data came from patients who were admitted to the COVID-19 referral facility 'Ospedali Riuniti Padova Sud' in the Veneto area of Italy. The patients considered were diagnosed with COVID-19 after undergoing a PCR test. The number of patients was 374, with a median age of 74. The algorithms were evaluated using specificity, sensitivity, and ROC curve measures. The results showed that the random forest technique outperformed the other algorithms, with a ROC of 0.84. The study also revealed important biomarkers for predicting mortality risk: age, oxygen saturation levels, creatine, AST, and hemoglobin.
According to Wang et al. [13], the XGBoost algorithm is effective in producing mortality prediction models for COVID-19. The First People's Hospital in Wuhan's Jiangxia District provided the dataset for this research. The model was trained using a cohort of 296 individuals with information on their age, history of hypertension, and occurrence of coronary heart disease. The dataset had around 96% recovered patients and thus was highly unbalanced. The model achieved an AUC of 83% and provided some insight into the biomarkers that are key in predicting COVID-19 severity. Some of these were lymphocyte count, age, D-dimer, oxygen saturation, and glomerular filtration rate.
An XGBoost-based prognostic prediction model was trained using a sample of 375 affected patients, including 201 survivors [14]. The data were collected from the Tongji Hospital in Wuhan, China. The model achieved more than 90% accuracy on mortality prediction and revealed key biomarkers such as LDH, hs-CRP, and lymphocyte count.
The SVM method was used by Sun et al. [15] to create a predictive model for the severity prediction of COVID-19. The data were collected from the Shanghai Public Health Clinical Center, which comprised a cohort of more than 330 patients. The model could successfully distinguish between mild and severe cases with an AUC value of 97.57%. This study recognizes four features out of 220 features as the most important to predict COVID-19 severity. They were age, GSH, CD3 ratio, and total protein.
The XGBoost algorithm was used by Rechtman et al. [16] to create a predictive model that was trained on patient data from a New York City healthcare system. A dataset of 8770 patients with 7656 survivors was used in the study. The Mount Sinai health facility in New York City is where the information was gathered. The model produced an AUC of 0.86 and concluded that age, gender, and high body mass index (BMI) are key features in predicting disease severity.
Early mortality risk detection in COVID-19 patients using an ML method was covered by Hu et al. [17]. The 183 patient entries in the dataset used for this purpose are from the Sino-French New City branch of Wuhan's Tongji hospital. They used details of an additional 64 patients to externally validate their predictive model. The authors used ten different approaches, of which only five proved to be effective: flexible discriminant analysis, partial least squares regression, the elastic net model, LR, and RF. The performance of all the algorithms was similar, with an AUROC value of 88.1%. Three important biomarkers were discovered: hs-CRP level, cell count, and D-dimer level.
An algorithm created by Zhou et al. [18] can forecast the severity of a COVID-19 infection. The dataset used contains details of 377 patients (172 severe), and it is collected from Wuhan's Central Hospital. The model performed with an AUC of 87.9%. Key biomarkers identified include age, CRP, and D-dimer levels.
A prediction model developed by Li et al. [19] utilizes clinical and laboratory data to forecast the mortality risk of COVID-19 patients. The GitHub dataset and the Wolfram dataset were both used by the writers. The forecast model is constructed using an autoencoder. Other algorithms, including LR, RF, SVM, SVM one-class models, and isolation forest, are used to evaluate the model's performance. The study also revealed that persons with chronic conditions were at a higher risk of COVID-19-related death.
To approximate the infection severity in COVID-19 patients, Zhu et al. [20] suggested a model. The dataset was collected from Ningbo's Hwa Mei Hospital and has 126 records. The LR method was used to build the model, which has an AUC value of 90.0%. The model was effective in pinpointing a number of causes of severe instances of COVID-19, including the neutrophil-lymphocyte ratio, fibrinogen, sialic acid, CRP, and relative pressure of oxygen. Table 1 provides an overview of the current COVID-19 diagnostic techniques. Table 1 suggests that machine learning methods have been used more frequently than deep learning methods for COVID-19 mortality prediction using clinical datasets. To address this gap, six alternative deep learning models were implemented.
The later sections of this paper describe the application of the proposed models followed by the dataset used and the methodology. The pseudo-code and a thorough description of each model are given. After this, the results are discussed along with the inferences and conclusion.

Application
We discuss the application aspects of this specific research in this section. This study falls under the category of predictive analytics in the healthcare industry. The introduction of artificial intelligence in healthcare has drawn a lot of attention to predictive analytics as well. Using predictive analytics in healthcare has a number of benefits, some of which are discussed in this section. Figure 1 represents the application scenario of the proposed efficient COVID-19 mortality risk prediction system.

Medical Decision Making
Supporting decision making is the most significant contribution of predictive analytics to the healthcare sector. The amount of scientific information gathered about COVID-19 in just a small amount of time is extraordinary. However, uncertainty in medicine is unavoidable and one can never make a generalized assumption. The only method being used to diagnose COVID-19 is the viral PCR test, which has proven to have a lower rate of false negatives. However, information about the severity of the infection is still unavailable. Using the prediction models proposed in this study to predict the mortality risk of infected persons can be life-saving. It facilitates decision making and, to some degree, influences the choice of treatment. This would allow medical institutions to decide how to utilize their available resources.

Improving Patient Outcomes
All medical institutions store large amounts of patient information, which includes data about chronic health conditions, family illnesses, and so on. With the help of this information, patient results can be improved through mortality prediction. The models proposed in this study can identify warning signs before the infection becomes severe. The severity of the infection can vary from person to person. By identifying key biomarkers that contribute to COVID-19 severity, our models can identify patients who are at a greater risk of fatality.

Healthcare Workers
Medical professionals spend around 62% of their time per patient reviewing electronic health records. By incorporating the proposed models into the healthcare system, this process can be made more efficient. Studies [23] have indicated that 90% of medical workers prefer an AI-based approach to relying on their instinct. The models can also assist the workers in the diagnostic process.

Dataset
The dataset being used for this research is from a healthcare surveillance software package [24,25]. The data were collected from a single healthcare system over a certain period. The dataset has a total of 4711 records, out of which 75% are records of recovered patients, whereas 25% are of deceased patients. Each record has a total of 85 features.
The Montefiore Medical Center/Albert Einstein College of Medicine Institutional Review Board gave its approval for the data gathering. The board approved a waiver of informed patient consent, and the data are therefore anonymized.

Preprocessing
The proposed models require the dataset to be preprocessed into three different forms.
Dataset 1: This dataset is used as input for the CV-CNN and CV-LSTM+CNN models. All features with more than two unique values, with the exception of demographic features and score features, are identified and normalized. For this purpose, we use StandardScaler() from the scikit-learn library. We then introduce a new feature (severity_class) that reduces the severity feature from 12 different categories to 4. The top 35 variables that correlate with the 'Death' feature (headache, cough, sore throat, fever, chest pain, dizziness, shortness of breath, pneumonia, diarrhea, respiratory distress, fatigue, anorexia, cardiac disease, gasp, sputum, hypoxia, eye irritation, chills, somnolence, somnolence, septic, emesis, rhinorrhea, lesions on chest radiographs, hypertension, hypertension, heart attack, expectoration, conjunctivitis, shock, kidney failure, myalgia, old, obnubilation, myelofibrosis) are extracted to be used for our models.
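The two preprocessing steps described for dataset 1 can be illustrated with a minimal NumPy sketch. It is not the authors' code: the function names standardize and top_k_correlated are hypothetical, and ranking features by absolute Pearson correlation is an assumption, since the paper does not state which correlation measure was used. The z-score below uses the population standard deviation, which is what scikit-learn's StandardScaler computes.

```python
import numpy as np

def standardize(column):
    """Z-score a 1-D array: zero mean and unit (population) standard deviation."""
    column = np.asarray(column, dtype=float)
    centred = column - column.mean()
    std = column.std()  # ddof=0, matching StandardScaler
    # A constant column has std 0; return it centred instead of dividing by zero.
    return centred / std if std > 0 else centred

def top_k_correlated(X, y, k):
    """Indices of the k columns of X with the largest |Pearson r| to y (assumed criterion)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    denom = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum())
    # Columns with zero variance get correlation 0 rather than NaN.
    r = np.divide(Xc.T @ yc, denom, out=np.zeros(X.shape[1]), where=denom > 0)
    return np.argsort(-np.abs(r))[:k]
```

With the full feature matrix and the 'Death' column as y, calling top_k_correlated(X, y, 35) would return the 35 column indices to keep.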
Dataset 2: This dataset, a novel method of data encoding, serves as the input for the IMG-CNN model. Every row of the initial dataset is transformed into an image. This is achieved by treating the numerical value of each feature as a pixel intensity value. The top 81 features out of 85 are considered so that we can form a 9 × 9 image. Each pixel in this image corresponds to a particular feature, and the numerical value of that feature determines the pixel intensity. The conversion is from CSV to grayscale. A sample of these images is given in Figure 2. The images are then stored in the appropriate directories to be used by the model.
Dataset 3: This dataset is a balanced version of dataset 1. A balanced dataset can yield better results. The dataset is balanced using two different methods. The first method uses auto-encoders. Samples from the 'Dead' class are taken and passed through an auto-encoder model. The 84-dimensional data are compressed and reconstructed several times to produce augmented samples. This is used as input in the AE+CV-CNN model. The second method involves oversampling using SMOTE. Samples from the minority class are increased in a balanced way, resulting in a perfectly balanced dataset.
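The row-to-image encoding described for dataset 2 can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: the function name rows_to_images is hypothetical, the columns of X are assumed to be already ordered so that the first 81 are the selected features, and per-feature min-max scaling to the range 0 to 255 is an assumed way of turning feature values into grayscale intensities.

```python
import numpy as np

def rows_to_images(X, side=9):
    """Turn each row of X into a side x side grayscale image (uint8)."""
    X = np.asarray(X, dtype=float)
    n_pixels = side * side
    if X.shape[1] < n_pixels:
        raise ValueError(f"need at least {n_pixels} features, got {X.shape[1]}")
    X = X[:, :n_pixels]  # assumption: the first 81 columns are the selected features
    lo = X.min(axis=0)
    hi = X.max(axis=0)
    # Avoid division by zero for constant features.
    span = np.where(hi > lo, hi - lo, 1.0)
    scaled = (X - lo) / span  # each feature now lies in [0, 1]
    # One pixel per feature, row-major layout.
    return np.round(scaled * 255).astype(np.uint8).reshape(-1, side, side)
```

Each returned array could then be written to disk as a grayscale image file in a per-class directory, which is the layout the text describes for the IMG-CNN input.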

Data Augmentation
Three different techniques were considered for data augmentation: simple auto-encoders, variational auto-encoders, and oversampling. When simple auto-encoders were used to partially balance the dataset, many of the newly generated samples were duplicates. Since the validity of a model trained on such a dataset is questionable, variational auto-encoders were tried next. With variational auto-encoders, no duplicates were generated. However, the decompression process produced outputs in which most of the binary variables had decimal values, and the outcome variable was not in sync with the actual dataset. As a result, variational auto-encoders were rejected for this dataset.
SMOTE was then used for data augmentation. SMOTE is a resampling technique that balances highly imbalanced datasets by creating synthetic samples in the minority class. It generates new minority-class samples by interpolating between samples of this class that lie close to each other. SMOTE increases the number of minority-class examples within an imbalanced dataset and enables the classifier to achieve better generalizability. To apply SMOTE, the desired amount of oversampling, N, is set to an integer, and three main steps are carried out iteratively: randomly selecting a sample that belongs to the minority class, selecting the K (default 5) nearest neighbors of this sample, and randomly selecting N of these K neighbors for interpolation to generate new samples. Because SMOTE synthesizes new samples rather than duplicating existing ones, it reduces the risk of overfitting. Figure 3 provides an intuition of how SMOTE works [26]. SMOTE was chosen because it produced good samples, which were validated against the original dataset. The performance of the models also improved with SMOTE.
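The interpolation step at the core of SMOTE can be sketched in a few lines of NumPy. This is a simplified illustration, not the imblearn implementation used in the study: the function name smote_sketch and its parameters are hypothetical, Euclidean distance is assumed, and one neighbor is drawn per synthetic sample instead of N neighbors per selected sample.

```python
import numpy as np

def smote_sketch(X_min, n_new, k=5, seed=0):
    """Create n_new synthetic minority samples by neighbor interpolation."""
    rng = np.random.default_rng(seed)
    X_min = np.asarray(X_min, dtype=float)
    n = len(X_min)
    if n < 2:
        raise ValueError("need at least two minority samples")
    k = min(k, n - 1)
    # Pairwise squared Euclidean distances via |a|^2 + |b|^2 - 2ab.
    sq = (X_min ** 2).sum(axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X_min @ X_min.T)
    np.fill_diagonal(d2, np.inf)  # a sample is not its own neighbor
    neighbours = np.argsort(d2, axis=1)[:, :k]  # k nearest neighbors per sample
    base = rng.integers(0, n, size=n_new)       # step 1: random minority sample
    picked = neighbours[base, rng.integers(0, k, size=n_new)]  # steps 2-3: random neighbor
    gap = rng.random((n_new, 1))                # interpolation factor in [0, 1)
    # Each new point lies on the segment between a sample and one of its neighbors.
    return X_min[base] + gap * (X_min[picked] - X_min[base])
```

Because every synthetic point is a convex combination of two real minority samples, it stays inside the region spanned by the minority class, which is why the generated records tend to resemble the original data more closely than auto-encoder reconstructions.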

Proposed Models
Six deep learning forecasting models are taken into consideration in this study. Later, efficiency improvements are made using data augmentation. They all belong to various domains of deep learning. One of these models combines LSTM and CNN while utilizing recurrent neural networks (RNN). Another model uses unsupervised learning concepts for data augmentation and makes use of auto-encoders. The other models are regular CNN models with different input types and SMOTE.

CV-CNN Model
The CV-CNN model is the first model proposed. It is a CNN model developed using K-fold cross-validation and dataset 1. In our case, we conduct 10-fold cross-validation with 10 epochs per fold. The 10-fold cross-validation works as follows. The dataset is split into 10 folds with an approximately equal number of records in each fold. In each round, nine folds are used for training and the remaining fold for validation, giving a 9:1 split. On each fold, the model is trained for a predetermined number of epochs. Figure 4 displays the conceptual framework of the CV-CNN model. Algorithm 1 provides the pseudo-code of the CV-CNN model. The CNN model itself is made up of three convolutional layers and four max pooling layers. One-dimensional convolution and pooling layers are used because the data are tabular (CSV format). The input size for the model is (35,1). After feature extraction, a flatten layer is used, followed by dense layers. Since this is a binary classification problem, the final output layer consists of a single neuron with a sigmoid activation function. A summary of the CV-CNN model is given in Figure 5. The total number of trainable parameters is 806,529.
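The 10-fold splitting procedure can be sketched as a small index generator. This is a minimal illustration: the function name k_fold_indices is hypothetical, shuffling before splitting is an assumption, and the model training that would happen inside each fold is omitted.

```python
import numpy as np

def k_fold_indices(n_samples, n_folds=10, seed=0):
    """Yield (train_idx, val_idx) pairs for K-fold cross-validation."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)       # shuffle once (assumed)
    folds = np.array_split(order, n_folds)   # fold sizes differ by at most one
    for i in range(n_folds):
        val_idx = folds[i]                   # one fold held out for validation
        train_idx = np.concatenate(
            [folds[j] for j in range(n_folds) if j != i]
        )                                    # the other nine folds for training
        yield train_idx, val_idx
```

For the 4711 records described in the Dataset section, each validation fold holds 471 or 472 records, and every record is used for validation exactly once across the ten rounds.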

CV-LSTM+CNN Model
The dataset used for this model is dataset 1. LSTM is a form of recurrent neural network with the ability to retain key information from training in its long-term memory. The model takes input with a minimum of three dimensions. For this purpose, the input data are reshaped into (7,5,1), which is why the top 35 features from dataset 1 are considered. Training is once again performed using 10-fold cross-validation, as explained above. The conceptual framework of the CV-LSTM+CNN model is given in Figure 6. The pseudo-code for the CV-LSTM+CNN model execution is given in Algorithm 2. A summary of the CV-LSTM+CNN model is given in Figure 7. To implement the LSTM aspect of the model, a TimeDistributed layer is used. The activation function used in this layer is 'tanh'. Since the input has been reshaped, the dimensions of the data have changed; as a result, Conv2D and AveragePooling2D are used. The input is then passed to a flatten layer, followed by the output layer with a sigmoid activation function. The total number of trainable parameters is 830,209.
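The reshaping of the 35 selected features into the (7,5,1) input can be illustrated with NumPy. The function name to_lstm_cnn_input is hypothetical, and the row-major ordering of features within the 7 × 5 grid is an assumption, since the paper does not specify how features are arranged.

```python
import numpy as np

def to_lstm_cnn_input(X):
    """Reshape (n_samples, 35) tabular data to (n_samples, 7, 5, 1)."""
    X = np.asarray(X)
    if X.ndim != 2 or X.shape[1] != 35:
        raise ValueError("expected an array of shape (n_samples, 35)")
    # Feature j lands at row j // 5, column j % 5; the last axis is a single channel.
    return X.reshape(-1, 7, 5, 1)
```

The trailing channel axis of size 1 is what allows two-dimensional layers such as Conv2D and AveragePooling2D to be applied to each sample.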

IMG-CNN Model
The image CNN model requires dataset 2. The preprocessed dataset is organized in such a way that training and testing data are available with their corresponding labels. The ImageDataGenerator tool is used to take input from the directories where the images are stored. A total of 4710 records belonging to two different classes are used for training and testing. A CNN model is built and trained for 50 epochs. The conceptual framework of the IMG-CNN model is given in Figure 8. The pseudo-code of the IMG-CNN model is given in Algorithm 3. The IMG-CNN model summary is given in Figure 9.

AE+CV-CNN Model
This model utilizes dataset 3. The minority-class data, in our case 'Death' = 1, are passed through an auto-encoder model trained on these data. The auto-encoder compresses the 84-dimensional data into 50 dimensions and then reconstructs them back to 84 dimensions; the reconstructed data differ from the original data. This yields synthetic samples of the original data, which can be used to balance the dataset. A balanced dataset is expected to produce better results than an unbalanced one. These data are then used as input to the existing CV-CNN model, thus creating the AE+CV-CNN model. The conceptual framework of the AE+CV-CNN model is shown in Figure 10. The pseudo-code of the AE+CV-CNN model is shown in Algorithm 4. The AE+CV-CNN model summary is specified in Figure 11. The CNN model is the same as the one shown in Figure 5.

SMOTE-CV-LSTM Model
This model makes use of dataset 3. In this case, data augmentation is performed using SMOTE. The number of samples belonging to both classes is equal. Thus, there are 50% dead patient records and 50% recovered patient records. Once the data have undergone further preprocessing, they are used as input for the LSTM model. An improvement in performance was observed and is summarized in the Results section. The conceptual framework of the SMOTE-CV-LSTM model is shown in Figure 12.

SMOTE-CV-CNN Model
This model is similar to the SMOTE-CV-LSTM model. The preprocessing and the dataset used are the same, but the difference is that the data are used as input in the CV-CNN model. In both the SMOTE models, the actual model is the same as the one shown in Figures 4 and 10, except that the input data are first balanced using oversampling and then are used as input for the model. The CV-CNN model already showed good performance, and when combined with SMOTE, it produced a very reliable model. The conceptual framework of the SMOTE-CV-CNN model is shown in Figure 13.
In the case of these two models, the input used was a balanced dataset generated by oversampling. SMOTE was imported from the imblearn library. The data were split into X and y (independent variables and target variable). Using SMOTE with random state 45, we generated 2415 synthetic samples of the minority class. The generated samples proved to be more valid than the ones generated using other data-augmentation techniques. This dataset was then used as input in the CV-CNN and CV-LSTM models. The pseudo-code of the complete implementation is given in Algorithms 5 and 6. The model summary is the same as the ones given in Figures 5 and 7.
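As a consistency check, the class sizes implied by the reported figures can be worked out with simple arithmetic. The sketch below is an inference, not a result stated in the paper: it assumes SMOTE was applied to all 4711 records and that the 2415 synthetic samples exactly closed the gap between the two classes. The function name class_counts_from_smote is hypothetical.

```python
def class_counts_from_smote(total_records, synthetic_added):
    """Infer (majority, minority) sizes, assuming SMOTE fully balanced the classes.

    Solves: majority + minority = total_records
            majority - minority = synthetic_added
    """
    if (total_records + synthetic_added) % 2 != 0:
        raise ValueError("inputs are inconsistent with a perfectly balanced result")
    majority = (total_records + synthetic_added) // 2
    minority = total_records - majority
    return majority, minority

majority, minority = class_counts_from_smote(4711, 2415)
print(majority, minority)  # 3563 1148
```

Under these assumptions the minority class makes up about 24.4% of the original records, which agrees with the roughly 75%/25% split given in the Dataset section, and the balanced dataset would contain 7126 records.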

Experimental Results and Discussions
The experimental results obtained on the standard dataset and the performance analysis are presented in this section.

Performance Evaluation Metrics
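Accuracy, precision, recall, and F1-score are used for performance evaluation. These metrics are defined as follows:
Accuracy: The fraction of all predictions that are correct, defined in Equation (1).
$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \quad (1)$$
Precision: The fraction of predicted positives that are truly positive, defined in Equation (2).
$$\text{Precision} = \frac{TP}{TP + FP} \quad (2)$$
Recall: The fraction of actual positives that are correctly identified, defined in Equation (3).
$$\text{Recall} = \frac{TP}{TP + FN} \quad (3)$$
F1-score: The harmonic mean of precision and recall, defined in Equation (4).
$$\text{F1-score} = \frac{2TP}{2TP + FP + FN} \quad (4)$$
where TP, TN, FP, and FN indicate true positives, true negatives, false positives, and false negatives, respectively.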
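The four evaluation metrics can be computed directly from the confusion-matrix counts. The following is a minimal sketch using the standard definitions; the function name classification_metrics is hypothetical, and returning 0.0 when a denominator is zero is a convention chosen here for illustration.

```python
def classification_metrics(tp, tn, fp, fn):
    """Accuracy, precision, recall, and F1-score from confusion-matrix counts."""
    def safe_div(num, den):
        # Convention for this sketch: an undefined ratio is reported as 0.0.
        return num / den if den else 0.0

    return {
        "accuracy": safe_div(tp + tn, tp + tn + fp + fn),
        "precision": safe_div(tp, tp + fp),
        "recall": safe_div(tp, tp + fn),
        # Equivalent to the harmonic mean of precision and recall.
        "f1": safe_div(2 * tp, 2 * tp + fp + fn),
    }
```

On an imbalanced dataset such as this one, accuracy alone can look high even when few deaths are detected, which is why precision, recall, and F1-score for the minority class are reported alongside it.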

Key Information from Preprocessing
During the preprocessing stage, there were several observations from the dataset that were relevant. We were able to identify biological markers that are vital in predicting the survival chance of a patient. Some of these are given below.
From these data, one can infer that severity, age, and MAP are a few of the many important biomarkers that can help predict COVID mortality. In Figure 14, it is noted that the number of deaths is greater than the number of recoveries at higher severity scores. However, severity is an assigned feature and not a laboratory value. Figure 15 shows the age distribution of deaths and recoveries. Figure 16 shows the percentage of deaths by age. The number of deaths is greater as age increases. Thus, age is an important biomarker. Figure 17 shows the percentage of deaths by MAP. The number of deaths is greater when MAP < 70.

Experimental Result of Models
The results from the implementations of the proposed models are given in the following sections.

Results of CV-CNN Model
The CV-CNN model produced 96.7% accuracy on the test data. In Figure 18, the confusion matrix of the CV-CNN model is presented. The CV-CNN model classification report values of precision, recall, and F1-score are given in Table 2. The AUC score for the CV-CNN model was 93%. In Table 3, the accuracy of the CV-CNN model for 10-fold cross-validation is given. Figure 19 represents the precision-recall curve of the CV-CNN model.

Results of CV-LSTM Model
The CV-LSTM model produced 85.1% accuracy on the test data. The confusion matrix of the CV-LSTM model is given in Figure 20. The precision, recall and F1-score values are shown in the classification report in Table 4. The CV-LSTM model earned an AUC score of 70.0%. Table 5 shows the results of the CV-LSTM model for 10-fold cross-validation. Figure 21 shows the precision-recall curve of the CV-LSTM model.

Results of IMG-CNN Model
The IMG-CNN model produced a training accuracy of 95.7%, whereas its accuracy on the test data was around 75%. The confusion matrix of the IMG-CNN model is given in Figure 22. The evaluation metrics for the IMG-CNN model are shown in Table 6.

Results of AE+CV-CNN Model
The AE+CV-CNN model produced 97.9% accuracy on the test data. Table 7 shows the classification report of the AE+CV-CNN model. Figure 23 represents the confusion matrix of the AE+CV-CNN model. The AUC score of the AE+CV-CNN model is 97.2%. Table 8 shows the result of the AE+CV-CNN model for 10-fold cross-validation. Figure 24 shows the precision-recall curve for the AE+CV-CNN model.

Results of SMOTE-CV-LSTM and SMOTE-CV-CNN Models
SMOTE was used to generate artificial samples, and the results of the SMOTE-based models are shown in Figures 25-28. The generated samples proved to be more valid than those generated by the AE model. The analysis performed on the original dataset was repeated on the SMOTE-generated dataset, and the results remain true to the nature of the data.
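The core idea of SMOTE can be sketched as follows: each synthetic sample is placed at a random point on the line segment between a minority-class sample and one of its k nearest minority-class neighbors. This is a minimal illustration of the classic SMOTE interpolation step, not the exact pipeline used in the experiments; the function name `smote_sketch`, the default `k=5`, and the seed are assumptions.

```python
import math
import random


def smote_sketch(minority, n_new, k=5, seed=0):
    """Generate n_new synthetic samples from minority-class vectors.

    minority: list of equal-length numeric feature vectors.
    Hypothetical helper showing the classic SMOTE interpolation only.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of the k nearest minority neighbors, excluding base.
        neighbors = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: math.dist(base, minority[j]),
        )[:k]
        neighbor = minority[rng.choice(neighbors)]
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(
            [b + gap * (n - b) for b, n in zip(base, neighbor)]
        )
    return synthetic
```

Because every synthetic point lies between two real minority samples, the generated records stay within the range of the observed feature values.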

SMOTE-CV-LSTM:
The SMOTE-CV-LSTM model produced 86.74% accuracy on the test data. Table 9 shows the classification report of the SMOTE-CV-LSTM model for the metrics precision, recall, and F1-score. Figure 29 represents the confusion matrix of the SMOTE-CV-LSTM model. The AUC score for the SMOTE-CV-LSTM model is 86.61%. Figure 30 shows the precision-recall curve for the SMOTE-CV-LSTM model. Table 10 shows the results of the SMOTE-CV-LSTM model for 10-fold cross-validation.

SMOTE-CV-CNN:
The SMOTE-CV-CNN model produced 98.10% accuracy on the test data. The classification report of the SMOTE-CV-CNN model is presented in Table 11. Figure 31 represents the confusion matrix of the SMOTE-CV-CNN model. The SMOTE-CV-CNN model obtained an AUC score of 98.11%. Figure 32 shows the precision-recall curve for the SMOTE-CV-CNN model. Table 12 shows the results of the SMOTE-CV-CNN model for 10-fold cross-validation.

Inference
The Results section presents the outcome of the execution of the proposed models. A summary of all six model results is given in Table 13. The SMOTE-CV-CNN model produced the best performance after considering all the metrics. Therefore, it would produce reliable predictions in real-life scenarios. Table 14 compares the proposed SMOTE-CV-CNN model with the machine learning techniques proposed by other authors. Table 15 compares the proposed SMOTE-CV-CNN model with other deep learning techniques. In addition, Table 16 compares the proposed SMOTE-CV-CNN model with existing methods in terms of accuracy. Further, to make the resulting values easier to interpret, a graphical representation is given in Figure 33.

The results of our research revealed several key inferences. The performance of the LSTM model was not as good as that of the CNNs. This shows that LSTMs are not suitable for the kind of predictive analytics that the problem scenario requires. Similarly, the IMG-CNN model was unable to produce promising results. The CV-CNN model produced a considerably good performance. The AE+CV-CNN model is a modification of the CV-CNN model with the addition of a balanced dataset.

From the classification report of the LSTM model, it is noted that one reason for its low performance is that the data are not balanced: the metrics for the death records are not as good as those for the recovered records. When SMOTE data are used in the SMOTE-CV-LSTM model, the results are more balanced, and its performance is better than that of the CV-LSTM model. Like the AE+CV-CNN model, the SMOTE-CV-CNN model is also a modification of the CV-CNN model. The SMOTE-CV-CNN model showed better performance than the AE+CV-CNN model, and its results are also more reliable, as the data used are of better quality. Therefore, the SMOTE-CV-CNN model would produce good results for industry applications. It also has better performance than other machine learning techniques and most deep learning techniques. Therefore, it is a very efficient tool for predictive analytics.

Conclusions
In this paper, an efficient COVID-19 mortality risk prediction system using DSMOTE and CNN was developed. Six different models were implemented for the automatic diagnosis of COVID-19, namely CV-CNN, CV-LSTM+CNN, IMG-CNN, AE+CV-CNN, SMOTE-CV-LSTM, and SMOTE-CV-CNN. Each model was trained and tested using the same clinical dataset to identify the recovered and death cases. The performances of these deep learning models were enhanced using various data augmentation, auto-encoder, and oversampling techniques. In the experiment, the CV-CNN model obtained an accuracy of 96.70%. Similarly, CV-LSTM+CNN, IMG-CNN, AE+CV-CNN, SMOTE-CV-LSTM, and SMOTE-CV-CNN attained accuracies of 85.10%, 75.22%, 97.90%, 86.74% and 98.10%, respectively. From the results, it was observed that the SMOTE-CV-CNN model outperformed the other proposed models. In addition, the proposed SMOTE-CV-CNN model was compared with the existing ML and DL models in terms of accuracy, AUC, precision, recall, and F1 score. The comparative analysis showed that the proposed SMOTE-CV-CNN model gives better accuracy than the existing models. Furthermore, the proposed COVID-19 mortality risk prediction system could be applied to the detection of other pneumonia diseases. Further, this work identified three main challenges in the current situation:

• The available small-scale datasets restrain detailed study by researchers.
• No dataset is available that provides the critical level of patients.
• Even though the proposed model achieved high accuracy on the small dataset, it is not clear how it will perform on a large dataset.
Therefore, in order to effectively contribute to the healthcare industry in the future, mutations of COVID-19 should be observed, and the proposed framework could be updated accordingly.

Institutional Review Board Statement: The data collection was authorized by the Montefiore Medical Center/Albert Einstein College of Medicine Institutional Review Board, which approved the waiver of patient-informed consent due to the retrospective design of the study.