Deep Learning Approaches for Detecting Pneumonia in COVID-19 Patients by Analyzing Chest X-Ray Images

Department of Electrical and Computer Engineering, North South University, Bashundhara, Dhaka-1229, Bangladesh
Department of Computer Science and Engineering, Lovely Professional University, Punjab-144411, Phagwara, India
Department of Information Technology, College of Computers and Information Technology, Taif University, P. O. Box 11099, Taif 21944, Saudi Arabia
Department of Computer Science, College of Computers and Information Technology, Taif University, P. O. Box 11099, Taif 21944, Saudi Arabia


Introduction
COVID-19, acknowledged by the World Health Organization (WHO) as a pandemic, has dramatically altered people's daily lives, their immediate health, and economies throughout the world [1]. A rapidly spreading viral disease, COVID-19 affects humans and animals. According to Worldometers, approximately 2,869,886 people have died of coronavirus complications so far [2]. In most cases, it is pneumonia that makes this disease highly dangerous and potentially fatal. Therefore, detecting and diagnosing pneumonia in COVID-19 patients is critical. Deep learning techniques can help achieve this goal: deep learning algorithms can detect and diagnose pneumonia using only X-ray images, saving both money and time. Doctors can also benefit from this approach, as it can help them efficiently identify highly critical patients and isolate them from patients with milder symptoms. In this way, appropriate medical treatment can be administered to COVID-19 patients, potentially saving many lives.
Many organizations and research institutions are trying to develop vaccines, and several are now approved and in use. Furthermore, studies on COVID-19 patients have shown that people with a history of lung infections are most severely affected by the virus. Chest X-ray and CT imaging are well-known techniques for diagnosing lung-related problems. Because throat infection and difficulty breathing are usually the first major symptoms of COVID-19 [2], these imaging techniques can also be used to detect lung complications induced by COVID-19. Moreover, diagnosing COVID-19 is expensive in some countries [3], and thus a low-cost method for detecting COVID-19 complications is needed.
In this research, we considered X-ray images for detecting COVID-19 patients [4]. The image dataset contained both healthy and COVID-19 patients.
This study primarily focuses on the pretrained VGG16 model for predicting pneumonia using chest X-ray images of coronavirus-infected patients.
Deep learning techniques have been applied in medical diagnosis over the last few years. They have enormous potential for extracting minute features using sampling kernels. To date, various deep learning models have been used in different fields [5][6][7][8][9][10][11]. Bar et al. [8] used deep learning models to detect chest pathologies. Razzak et al. [9] discussed various problems in the application of deep learning models in medical image processing. The present work considered posteroanterior (PA) view chest X-ray scans of COVID-19 and pneumonia patients. After preprocessing the chest X-ray images and applying data augmentation techniques, we considered a pretrained model, VGG16. We collected 6432 chest X-ray images from Kaggle, using 5144 images for training and 1288 images for validation. The VGG16 model shows an average accuracy of 91.69% for detecting pneumonia.
Pneumonia induced by COVID-19 can be diagnosed through genetic and imaging tests, and a rapid detection mechanism can help control the spread of COVID-19. Various studies have been conducted to identify different types of diseases by analyzing chest X-ray images; some of the major works are highlighted below. The authors in [12] investigated the performance of a deep learning formula to diagnose heart failure using 952 labeled X-ray images. Two cardiologists at the National Institutes of Health examined the images and identified 260 as "normal" and 378 as "heart failure". The remaining images were discarded. The authors achieved 82% accuracy in diagnosing heart failure using data augmentation and transfer learning. The authors in [12] used data segmentation and transfer learning to predict heart disease, achieving an accuracy of 82%. In the present study, we detect pneumonia using deep learning techniques on chest X-ray images, achieving an accuracy of 91.69%. The authors in [13] mainly focused on applying a deep learning technique to classify tuberculosis (TB) using chest X-ray images. The authors used different preprocessing techniques, such as augmentation and segmentation, before applying the proposed deep model to the images.
The dataset consisted of 700 TB infected and 3500 traditional chest X-ray images. Owing to the advancement of transfer learning approaches, many models such as VGG-19, ResNet-50, InceptionV3, ChestNet, and DenseNet201 have been proposed and applied to detect TB cases [13]. The accuracy, precision, sensitivity, F1-score, and specificity in detecting this infectious disease were 97.07%, 97.34%, 97.07%, 97.14%, and 97.36%, respectively. In [13], the authors used different transfer learning-based convolutional neural networks (CNNs) to predict TB. The present research used a fine-tuned CNN-based VGG16 model to classify chest X-ray images of COVID-19 patients and pneumonia patients and to detect COVID-19 cases with pneumonia. The authors in [14] used a system to observe carcinoma using chest X-ray images. The authors considered the 121-layer CNN model called DenseNet-121. The model used a respiratory organ nodule dataset, achieving 74.43 ± 6.01% mean accuracy, 74.96 ± 9.85% mean specificity, and 74.68 ± 15.33% mean sensitivity. Grewal [15] used a deep learning technique for brain hemorrhage detection. In the present study, the VGG16 model was used, which loads a set of weights pretrained on ImageNet [16], leaving off the FC layer head. The model also used AveragePooling2D with a pool_size of (4, 4), flatten, a dense layer, and dropout of 0.5 for predicting pneumonia. The accuracy of the model was 91.69%, sensitivity was 95.92%, and specificity was 100%, which distinguishes this model from those discussed in [12][13][14]. In addition to the research discussed, numerous studies have proposed models applying machine learning and deep learning to detect COVID-19 and non-COVID-19 patients by analyzing chest X-rays [17][18][19][20]. However, no model can be claimed as a standard because the models use different datasets.
This study proposed a deep learning-based model to predict pneumonia in COVID-19 patients using chest X-ray images. Pneumonia is a virus-related disease and in many cases results in the patient's death. COVID-19 patients with pneumonia also face various health-related complications. This research helps to detect pneumonia in COVID-19 patients so that they can be separated from less severe patients and given appropriate life-saving treatment. Here, deep learning components such as AveragePooling2D, flatten, dense, ImageDataGenerator, and dropout are used to preprocess the data and build the model, and a CNN-based VGG16 model is used to classify chest X-ray images of COVID-19 patients and pneumonia patients and to predict COVID-19 patients with pneumonia. Section 2 describes the methods and materials used to build the CNN-based model. Section 3 presents the results and their discussion. Finally, in Section 4, we provide our conclusions.

Method and Methodology
In the first part of this section, we present a basic block diagram of the system and an overview of this research work through a flowchart. The latter part highlights the deep learning features and models used to accurately detect pneumonia in COVID-19 patients. Figure 1 shows the full system overview of the model design. The model consists of the VGG16 pretrained model, an image data generator to generate images, AveragePooling2D, flatten, dense layers, and dropout layers. We also preprocessed the data and performed data augmentation before detecting pneumonia. A classification matrix was also generated to evaluate F1-score, precision, and recall and then to calculate accuracy, sensitivity, and specificity.

Working Steps of Pneumonia Prediction.
A deep learning algorithm is used to learn features from patients' X-ray images so that the model can detect pneumonia more accurately. The following working steps, together with the flowchart, summarize this research work.
Dataset collection: Kaggle's dataset [4] was used for this research project.
Data processing and augmentation: after collecting the images from the dataset, noise was removed from the X-ray images, and the images were then resized.
Feature extraction: the VGG16 model was used to build the pneumonia prediction model.
Data split into training and test sets: the data were split into 80% training and 20% testing data. The training data were then fed into the VGG16 model.
Data test: after the model was trained, the test data were used for prediction.
Results and conclusions: the final output was used to calculate accuracy, create a classification matrix, and determine sensitivity and specificity.
A flowchart of this process is shown in Figure 2.
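The 80%/20% split in the working steps can be illustrated with a minimal sketch. The paper does not state whether the split was stratified by class, so the per-class (stratified) shuffling below is an assumption, and the function name `stratified_split`, the seed, and the toy label counts are hypothetical.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.2, seed=42):
    """Return (train_idx, test_idx) so that each class keeps roughly
    the same train/test proportion (assumed; not stated in the paper)."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)  # shuffle within the class only
        n_test = round(len(indices) * test_fraction)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


# Hypothetical toy labels; the real dataset has 6432 images.
labels = ["COVID19"] * 10 + ["NORMAL"] * 20 + ["PNEUMONIA"] * 70
train_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(test_idx))
```

Splitting within each class keeps rare classes represented in the test set, which matters when the class sizes are as unequal as in chest X-ray collections.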

The Proposed CNN Model.
The CNN model used in the present study has two major sections: feature extractors and classifiers. A CNN is a hierarchical model that builds a network of layers and ends in a fully connected layer whose units are connected to each other like neurons; as a result, it classifies images efficiently and with few errors. Figure 3 shows the general CNN architecture used in the present study.
VGG16 is a CNN architecture that is applied to image classification problems on large image datasets. The model loads weights pretrained on ImageNet, leaving the fully connected (FC) layer head off. In the original architecture, the FC head consists of three fully connected layers that follow a series of convolutional layers. The first two FC layers have 4096 channels each. The third layer has 1000 channels and hence performs the 1000-way ILSVRC classification. The last layer is the softmax layer, which has the same number of nodes as the output layer.
This layer is generally used for multiclass classification problems, where class membership is required for more than two labels.
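As a small illustration of how a softmax layer turns raw scores into class probabilities, the sketch below implements the standard softmax function in plain Python. The three class names follow the classes discussed in this paper; the logit values are hypothetical and chosen only for the example.

```python
import math


def softmax(logits):
    """Map raw scores to probabilities that sum to 1."""
    shift = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


class_names = ["COVID-19", "normal", "pneumonia"]
logits = [0.5, 1.0, 3.0]  # hypothetical scores from the last dense layer
probs = softmax(logits)
predicted = class_names[probs.index(max(probs))]
print(predicted, [round(p, 3) for p in probs])
```

The predicted class is simply the one with the highest probability, which is why the softmax layer needs one node per class.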
An epoch in deep learning is one full pass over the training samples. The number of epochs is a hyperparameter that states how many times the learning algorithm is applied to the entire training dataset. In each epoch, every sample in the dataset has an opportunity to update the internal model parameters. An epoch may consist of one or more batches. When an epoch consists of a single batch, the learning algorithm is called batch gradient descent [22]. In the coding part of this research work, 25 epochs were used. The batch size hyperparameter defines the number of samples to pass through before the model parameters are updated, so that the model can be improved after each batch. A batch can be considered an iteration over one or more samples to make predictions. The predictions are then compared with the expected output variables at the end of the batch, and the error is calculated. The model improves itself from this error, for example, by moving along the error gradient [22]. In the coding part of this study, the batch size was set to 16 and the initial learning rate was 1e-3. Flattening transforms multidimensional data into one-dimensional data so that it can be fed into the next layer. In this study, flattening was used to convert the output of the convolutional layers into a one-dimensional feature vector. The output is then forwarded to the classification layer [23]. We also used an average pooling layer [24] with a pool size of (4, 4).
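To make the relationship between epochs, batches, and parameter updates concrete, the following sketch works through the arithmetic for the values reported in this paper (5144 training images, batch size 16, 25 epochs). It assumes that the final, smaller batch of each epoch is kept rather than dropped, which the paper does not specify.

```python
import math

train_images = 5144  # training images reported in this paper
batch_size = 16      # batch size reported in this paper
epochs = 25          # number of epochs reported in this paper

# Assumption: the last partial batch is kept, so round up.
steps_per_epoch = math.ceil(train_images / batch_size)
last_batch_size = train_images - (steps_per_epoch - 1) * batch_size
total_updates = steps_per_epoch * epochs

print(steps_per_epoch)  # batches (parameter updates) per epoch
print(last_batch_size)  # samples in the final batch of each epoch
print(total_updates)    # parameter updates over the whole run
```

Under these assumptions, each epoch performs 322 parameter updates, the last of which uses only 8 images, for 8050 updates over 25 epochs.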

Performance Analysis
We evaluated the performance of the proposed model based on different metrics: accuracy, recall, sensitivity, specificity, and precision. The metrics are computed from the entries of the confusion matrix: true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN).

Sensitivity is the percentage of actual positive cases that are accurately predicted. This metric evaluates the prediction capability of the model. The equation for calculating the sensitivity is as follows:

$$\text{Sensitivity} = \frac{TP}{TP + FN}$$

Specificity is the proportion of actual negative cases that are predicted correctly. It evaluates a model's ability to predict the true-negative cases of a given category. Therefore, these metrics were applied to every categorical model to interpret the result. The equation for calculating the specificity is given as follows:

$$\text{Specificity} = \frac{TN}{TN + FP}$$

Precision demonstrates the performance of the model on the test data. It is the fraction of positive predictions that are correct and should be as high as possible:

$$\text{Precision} = \frac{TP}{TP + FP}$$

Anaconda's Jupyter notebook was used to build the complete code for predicting pneumonia in COVID-19 patients. Jupyter notebook is an open-source platform from which all the libraries necessary to complete this research can be accessed.
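The metrics described in this section can be computed directly from the four confusion-matrix counts. The sketch below uses the standard definitions; the function name `binary_metrics` and the example counts are hypothetical and are not results from this study.

```python
def binary_metrics(tp, tn, fp, fn):
    """Standard confusion-matrix metrics for one positive class.

    Assumes every denominator is nonzero.
    """
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),  # also called recall
        "specificity": tn / (tn + fp),
        "precision": tp / (tp + fp),
    }


# Hypothetical counts chosen only to illustrate the calculation.
metrics = binary_metrics(tp=90, tn=80, fp=20, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

For a three-class problem such as COVID-19, normal, and pneumonia, the same function can be applied once per class by treating that class as positive and the other two as negative.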

Discussion
This section discusses the results achieved in predicting pneumonia in COVID-19 patients using the proposed approach. First, we applied data augmentation to the images using the Keras image data generator. For the image generator, the rotation range was set to 15 degrees and the fill mode was set to nearest. Then, LabelBinarizer() was applied to perform one-hot encoding on the labels. After that, the data were split into 80% training and 20% testing data, as shown in Table 1. A base model and a head model were built; the head model was constructed from AveragePooling2D [25], flatten, dense, and dropout layers, and finally the complete model was assembled. Subsequently, the complete model was compiled with the Adam optimizer, and predictions were made on the test set. The accuracy in predicting pneumonia in COVID-19 patients from chest X-ray images was 91.69%.

Figure 2: Flowchart of the working steps to predict pneumonia in COVID-19 patients.
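To show how the base model and head model fit together, the sketch below traces the tensor shapes through the VGG16 convolutional base and the head layers named in this section. It is only a shape calculation in plain Python, not the authors' code: the 224 x 224 input size and the 64-unit hidden dense layer are assumptions (the paper does not state them), while the (4, 4) average pooling, the 0.5 dropout, and the three output classes come from the text.

```python
def pool_out(size, pool, stride):
    """Output length of a pooling window without padding along one axis."""
    return (size - pool) // stride + 1


def trace_shapes(height=224, width=224, hidden_units=64, num_classes=3):
    """List (layer name, output shape) pairs for the base and head models."""
    shapes = [("input", (height, width, 3))]
    h, w = height, width
    # VGG16 base: five convolutional blocks, each ending in 2x2 max pooling.
    # The 3x3 convolutions use same padding, so only pooling changes h and w.
    for block, channels in enumerate([64, 128, 256, 512, 512], start=1):
        h, w = pool_out(h, 2, 2), pool_out(w, 2, 2)
        shapes.append((f"block{block}_pool", (h, w, channels)))
    # Head: AveragePooling2D with pool size (4, 4) and matching stride.
    h, w = pool_out(h, 4, 4), pool_out(w, 4, 4)
    shapes.append(("average_pooling", (h, w, 512)))
    shapes.append(("flatten", (h * w * 512,)))
    shapes.append(("dense_hidden", (hidden_units,)))  # assumed size
    shapes.append(("dropout_0.5", (hidden_units,)))   # shape unchanged
    shapes.append(("dense_softmax", (num_classes,)))
    return shapes


shapes = trace_shapes()
for name, shape in shapes:
    print(f"{name:16s} {shape}")
```

With these assumptions, the base model reduces the image to a 7 x 7 x 512 feature map, the average pooling layer collapses it to 1 x 1 x 512, and flattening yields a 512-element feature vector for the dense layers.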

[...] 28.64%, validation accuracy 91.14%, and validation loss 23.25%. After seven epochs, training accuracy was 91.1%, loss 23.69%, validation accuracy 91.69%, and validation loss 21.63%. Figure 4 shows that, after the proposed model is fitted for the testing phase, it predicts X-ray images as pneumonia-affected. To make it clear, the testing phase has been built differently. Figure 5 shows that the model predicts three different classes of chest X-ray images and that the predicted and actual classes are the same. Thus, it can be said that the proposed model could successfully identify each class.
Tables 4 and 5 present the classification report. As can be seen, for COVID-19 prediction the precision is 99%, recall is 81%, and F1-score is 89%; for normal case detection the precision is 83%, recall is 91%, and F1-score is 87%; and for pneumonia prediction the precision is 95%, recall is 93%, and F1-score is 94%. Therefore, it is concluded that the proposed approach is less likely to be wrong when predicting pneumonia than when predicting normal or COVID-19 cases. Figure 6 shows the TP, TN, FP, and FN pneumonia prediction cases to make this result clearer: a total of 798 people were correctly found to be affected with pneumonia, and 387 people truly did not have pneumonia. The model also makes false predictions: a total of 57 people were falsely diagnosed as having pneumonia, and 46 people were falsely diagnosed as not having pneumonia. To address this, the model's sensitivity and specificity values were calculated, which are 95.92% and 100%, respectively. The average accuracy of the proposed method after completing the epochs is 91.69%, which demonstrates that the model performs well in the classification of COVID-19-induced pneumonia. Figures 7 and 8 show the accuracy and loss values during the model's training and validation phases. The figures show how the training and validation losses decrease while the training and validation accuracies increase. The x-axis represents the number of epochs, and the y-axis represents loss/accuracy. At epoch 0, both the training and validation accuracies were low, and the loss was very high. As the number of epochs increased, both training accuracy and validation accuracy increased, and training and validation loss decreased. After seven epochs, the training and validation accuracies are closer to 92%, and the training loss and validation loss are very close to 2%.
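The F1-scores in the classification report follow from the reported precision and recall values, since the F1-score is their harmonic mean. The sketch below recomputes them as a consistency check; the dictionary layout and names are only illustrative.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# (precision, recall) per class, as reported in the classification report.
report = {
    "COVID-19": (0.99, 0.81),
    "normal": (0.83, 0.91),
    "pneumonia": (0.95, 0.93),
}

f1 = {name: round(f1_score(p, r), 2) for name, (p, r) in report.items()}
print(f1)
```

Rounded to two decimals, the recomputed values are 0.89, 0.87, and 0.94, which agree with the F1-scores of 89%, 87%, and 94% stated above.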
Therefore, using the proposed model, patients may not need to visit a physician and spend large sums on clinical tests and examinations; rather, they only need a chest X-ray and a supporting mobile application to detect pneumonia [25]. Thus, underprivileged, poorer, or rural sections of society will benefit greatly from the results of the present study. After pneumonia is diagnosed through the app, they can simply follow the doctor's guidelines and take appropriate medicines. The progress made in applying deep learning-based methods using chest X-rays for COVID-19 classification is tremendous. Researchers have mainly focused on the detection of three classes: COVID-19, normal, and pneumonia. The scarcity of large datasets is a significant problem in the evaluation of the proposed models. To solve the problem of small datasets, transfer learning methods have been applied [19][20][26][27][28][29][30]. The models are pretrained on the ImageNet dataset [16]. Ensemble learning techniques [18][31][32], which combine predictions from several models to produce accurate results, are also used in COVID-19 detection. This enhances the results of the model prediction by minimizing the generalization error and variance.

[...] game-theoretic model to maintain social distancing to prevent the COVID-19 outbreak in a noncooperative situation. The authors in [36] proposed a security protocol for remote patient care systems using physically unclonable functions to enable doctors to continuously monitor and diagnose COVID-19 patients.

Conclusion
This research suggests a two-stage deep residual learning technique using lung X-ray images to identify COVID-19-induced pneumonia. The model showed good performance in differentiating COVID-19 patients and patients with COVID-19-induced pneumonia using the VGG16 model. The model predicted pneumonia with an average accuracy of 91.69%, sensitivity of 95.92%, and specificity of 100%. It also reduces training loss and increases accuracy. Parallel testing can be used in the current scenario to prevent the spread of infection to frontline workers and to generate primary diagnoses that determine whether a patient is affected by COVID-19. Therefore, the proposed method can be used as an alternative diagnostic tool for detecting pneumonia cases. Future research can improve the performance of the CNN architecture by adjusting the hyperparameters and transfer learning combinations. Another feasible way to determine the best model for pneumonia and COVID-19 could be an improved, more complex network structure.

Data Availability
The data used to support the findings of this study are freely available at https://www.kaggle.com/prashant268/chestxray-covid19-pneumonia.

Conflicts of Interest
The authors declare that they have no conflicts of interest to report regarding this study.

Mathematical Problems in Engineering