Computer-Aided Detection (CAD) for COVID-19 based on Chest X-Ray Images using Convolutional Neural Network

Covid-19 has spread throughout the world and has been declared a pandemic by the World Health Organization (WHO). The disease was first identified at the end of 2019 in Wuhan, China. The number of deaths continues to rise sharply as the disease spreads to more countries. Covid-19 has sent billions of people into lockdown while health services struggle to cope. A swift and reliable Covid-19 diagnosis system is needed to direct patients to appropriate treatment and to prevent further spread of the disease. At present, rapid tests and Real-Time Polymerase Chain Reaction (RT-PCR) are the familiar procedures for Covid-19 detection. Both procedures tend to be impractical and require specially equipped laboratories. It can also take several hours for the amplification process to finish before the results are known. In this study, we introduce a Covid-19 detection system based on chest X-ray images using a Convolutional Neural Network (CNN). The dataset consists of 1000 images, 500 each for Covid-19 positive and pneumonia cases. The designed CNN model consists of three hidden layers followed by a fully connected layer with sigmoid activation. The performance of the proposed model was evaluated using the metrics of precision, recall, F1-score, and accuracy. The experimental results show that the proposed method achieves a precision, recall, and F1-score of 1 and an accuracy of 100%. This research is expected to proceed to field validation so that it can support medical authorities in clinical diagnosis.


Introduction
Around the end of December 2019, a new and previously unknown coronavirus was discovered; the first cases appeared at the Huanan Seafood Wholesale Market in Wuhan, Hubei Province, China. Early indications of the resulting pneumonia are fever, dry cough, fatigue, and sometimes gastrointestinal symptoms [1]. The disease caused by this virus is known as Coronavirus Disease 2019, abbreviated Covid-19. Covid-19 has spread throughout the world and has been declared a pandemic by the World Health Organization (WHO). As shown in Figure 1, as of 22 June 2020 WHO reported 8,844,171 confirmed cases and 465,460 confirmed deaths across 216 affected countries [2]. The number of deaths continues to rise sharply as the disease spreads to more countries. Covid-19 has sent billions of people into lockdown while health services struggle to cope. The emergence of Covid-19 has affected every aspect of life: millions of employees have lost their jobs, the airline sector has lost revenue, foreign exchange earnings from tourism have declined, and inflation has risen, all of which risk worsening economic conditions. This has prompted scientists around the world and from many branches of science to pursue research related to the outbreak, including researchers in biomedical image processing. One of the biggest challenges in dealing with COVID-19 is the need for fast and reliable diagnostic identification that can serve as an alternative to reverse-transcription polymerase chain reaction (RT-PCR) testing, bearing in mind that the epidemic is spreading rapidly and globally.
Togacar and colleagues studied COVID-19 detection using MobileNetV2 and SqueezeNet models, each with a Support Vector Machine (SVM) as the classifier [3]. The resulting system achieved a 100% detection rate for COVID-19 images and 99.27% for normal and pneumonia images. Because the chest radiograph (CXR) plays an essential role in triage for COVID-19, Murphy and colleagues evaluated the performance of an artificial intelligence (AI) system for detecting COVID-19 pneumonia on CXR [4]. The study concluded that the AI system classified CXR images as COVID-19 pneumonia with an area under the ROC curve (AUC) of 0.81. The novel coronavirus also motivated Asif Khan and colleagues to propose CoroNet, a deep convolutional neural network model that detects COVID-19 infection from chest X-ray images [5]. That study concluded that CoroNet can be a very helpful tool for clinical practitioners and radiologists in diagnosing COVID-19. A system that detects whether a person is infected with Covid-19 based on chest CT images has been designed using multi-objective differential evolution (MODE) and convolutional neural networks (CNN) [6]. The results showed that this model improved accuracy, F-measure, sensitivity, specificity, and Kappa statistics by 1.9789%, 2.0928%, 1.8262%, 1.6827%, and 1.9276%, respectively, over competing models.
COVID-19 detection from chest X-ray images has also been studied using AlexNet, GoogLeNet, and ResNet18 [7]. GoogLeNet produced the best accuracy, 80.6%, when the system was designed for four-class classification (Covid-19, normal, bacterial pneumonia, and viral pneumonia). AlexNet was the best model, with 85.2% accuracy, for three-class classification (Covid-19, normal, and bacterial pneumonia). In the third scenario, two-class classification (COVID-19 and normal), GoogLeNet was the best architecture, reaching 100% testing accuracy and 99.9% validation accuracy.
In this study, we propose a small CNN architecture for early detection of whether a person has Covid-19, based on chest X-ray images. The model is designed to have low complexity while still achieving high accuracy and short computing time.

Convolutional Neural Network (CNN)
In recent years, CNNs have been widely used to recognize visual patterns in images, reportedly with comparatively little computation [8]. CNNs have been applied across many fields of medical image analysis, such as retinal, fundus, chest X-ray, chest CT, breast cancer, cardiac, abdominal, and musculoskeletal imaging. Convolutional neural networks therefore have a very broad range of applications. Also called ConvNets, they are built from three main types of layer: the convolutional layer, the pooling layer, and the fully connected layer. As shown in Figure 2, the convolutional layer carries out a mathematical operation that convolves the image with a filter to extract important features [9]. This is the first operation in a CNN: the input image is passed through a filter, which produces an activation map, or feature map. Figure 3 illustrates the convolution of a 5 × 5 input image with a 3 × 3 kernel (filter) and a stride of 1, producing a 3 × 3 feature map.
The Rectified Linear Unit (ReLU) is a very simple activation function that has become popular because it helps to overcome some optimization problems of the sigmoid activation function [10]. ReLU is defined as $f(x) = \max(0, x)$, and its derivative is $f'(x) = 1$ for $x > 0$ and $f'(x) = 0$ for $x < 0$. As the graph of the ReLU activation function in Figure 4 shows, the function outputs zero for negative inputs and grows linearly for positive inputs.
After convolution, the second fundamental operation in a CNN is pooling. This operation is much easier to understand than convolution [11]. Pooling is used for down-sampling, which reduces the complexity of processing in the next layer [12]. The reduction in complexity is achieved by reducing the size of the feature map matrix: neighboring pixels are compared and merged into a single representative value, which is set to either their average or their maximum.
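The three operations described above can be sketched in plain Python. This is a minimal illustration and not the implementation used in this study: the function names (`conv2d`, `relu`, `max_pool`), the input values, and the kernel values are all arbitrary assumptions chosen so the arithmetic is easy to follow. As in most CNN libraries, the "convolution" is computed as a sliding dot product without flipping the kernel.

```python
def conv2d(image, kernel, stride=1):
    """Unpadded 2-D sliding dot product of a square image and kernel."""
    k = len(kernel)
    out_size = (len(image) - k) // stride + 1
    out = []
    for i in range(out_size):
        row = []
        for j in range(out_size):
            acc = 0
            for di in range(k):
                for dj in range(k):
                    acc += image[i * stride + di][j * stride + dj] * kernel[di][dj]
            row.append(acc)
        out.append(row)
    return out


def relu(feature_map):
    """Element-wise f(x) = max(0, x)."""
    return [[max(0, value) for value in row] for row in feature_map]


def max_pool(feature_map, size=2, stride=2):
    """Replace each size x size window with its maximum value."""
    n = (len(feature_map) - size) // stride + 1
    return [
        [
            max(
                feature_map[i * stride + di][j * stride + dj]
                for di in range(size)
                for dj in range(size)
            )
            for j in range(n)
        ]
        for i in range(n)
    ]


# Hypothetical 5 x 5 input and 3 x 3 vertical-edge kernel (arbitrary values).
image = [
    [1, 2, 0, 1, 3],
    [0, 1, 2, 3, 1],
    [1, 0, 1, 2, 2],
    [2, 1, 0, 1, 0],
    [1, 3, 2, 0, 1],
]
kernel = [
    [1, 0, -1],
    [1, 0, -1],
    [1, 0, -1],
]

feature_map = conv2d(image, kernel, stride=1)  # 5 x 5 input -> 3 x 3 feature map
activated = relu(feature_map)                  # negative responses become zero

# Hypothetical 4 x 4 feature map reduced to 2 x 2 by 2 x 2 max pooling.
pool_input = [
    [1, 3, 2, 4],
    [5, 6, 1, 2],
    [7, 2, 9, 0],
    [3, 4, 1, 8],
]
pooled = max_pool(pool_input, size=2, stride=2)

print(feature_map)  # [[-1, -3, -3], [0, -4, 0], [1, 1, 0]]
print(activated)    # [[0, 0, 0], [0, 0, 0], [1, 1, 0]]
print(pooled)       # [[6, 4], [7, 9]]
```

The output size follows (input size - kernel size) / stride + 1, which gives (5 - 3) / 1 + 1 = 3 for the convolution and (4 - 2) / 2 + 1 = 2 for the pooling step.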
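To understand this, consider the concrete example of max-pooling in Figure 5.

Figure 6. Sigmoid curve

The Adaptive Moment Estimation (Adam) optimizer combines momentum and RMSprop [15]. Computationally, Adam tends to be efficient and requires little memory [15]. Adam updates the network weights during training so as to minimize the loss value [16]. It stores an exponentially decaying average of past gradients, $m_t$, and of past squared gradients, $v_t$ [17]:

$$m_t = \beta_1 m_{t-1} + (1 - \beta_1) g_t, \qquad v_t = \beta_2 v_{t-1} + (1 - \beta_2) g_t^2$$

Here $g_t$ is the gradient of the loss with respect to the weights at step $t$, and $m_t$ and $v_t$ are estimates of the mean (first moment) and the uncentered variance (second moment) of the gradient. The hyperparameters $\beta_1, \beta_2 \in [0, 1)$ control the decay rates of these exponential moving averages. Because $m_t$ and $v_t$ are initialized at zero, they are corrected for bias:

$$\hat{m}_t = \frac{m_t}{1 - \beta_1^t}, \qquad \hat{v}_t = \frac{v_t}{1 - \beta_2^t}$$

The final weight update is given as:

$$\theta_{t+1} = \theta_t - \frac{\alpha}{\sqrt{\hat{v}_t} + \epsilon} \hat{m}_t$$

where $\theta_t$ denotes the network weights, $\alpha$ the learning rate, and $\epsilon$ a small constant that prevents division by zero.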
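The Adam update described above can be sketched for a single scalar weight in plain Python. This is a minimal sketch of the standard Adam rule, not the training code of this study: the function name `adam_step` and the toy objective f(θ) = θ² are assumptions for illustration, and the values β1 = 0.9, β2 = 0.999, and ε = 1e-8 are the commonly used defaults rather than settings reported here. The learning rate of 0.001 matches the value used in this study.

```python
import math


def adam_step(theta, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """Apply one Adam update to a scalar weight; t is the 1-based step count."""
    # Exponential moving averages of the gradient and the squared gradient.
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad * grad
    # Bias correction, needed because m and v start at zero.
    m_hat = m / (1 - beta1 ** t)
    v_hat = v / (1 - beta2 ** t)
    # Weight update scaled by the corrected moment estimates.
    theta = theta - lr * m_hat / (math.sqrt(v_hat) + eps)
    return theta, m, v


# Toy objective (an assumption for illustration): f(theta) = theta**2,
# whose gradient is 2 * theta and whose minimum is at theta = 0.
theta, m, v = 1.0, 0.0, 0.0
history = []
for t in range(1, 101):
    grad = 2 * theta
    theta, m, v = adam_step(theta, grad, m, v, t)
    history.append(theta)

print(history[0])   # weight after the first update
print(history[-1])  # weight after 100 updates
```

On the first step the bias-corrected ratio of the two moments is close to 1 in magnitude, so the weight moves by approximately the learning rate regardless of the gradient's scale. This is the behavior that makes Adam relatively insensitive to gradient magnitude.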

System Design
In this study, the proposed model consists of three hidden convolutional layers with ReLU activation and max pooling, followed by a sigmoid classifier. The model is trained with the Adam optimizer at a learning rate of 0.001, using binary cross-entropy as the loss. The designed CNN architecture is shown in Figure 7. The dataset consists of 1000 chest X-ray images, 500 each for Covid-19 positive and pneumonia cases. The dataset is taken from Kaggle Chest X-ray Images and was collected from various publicly available sources [18] [19] [20]. The data are divided into 75% (750 images) for training and 25% (250 images) for validation and serve as input to the designed CNN model. As shown in Figure 7 and Table 1, the chest X-ray images are resized to 64 × 64 pixels. Hidden layer 1 uses 3 × 3 filters with 16 output channels, hidden layer 2 uses 3 × 3 filters with 32 output channels, and hidden layer 3 uses 3 × 3 filters with 64 output channels. The resulting feature maps are then flattened into a one-dimensional vector so that the image can be classified into one of two classes (Covid-19 or pneumonia). The activation used for classification is the sigmoid.
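To give a feel for how small this architecture is, the following sketch traces the feature-map shapes and trainable parameter counts through the three layers. Only the 64 × 64 input size, the 3 × 3 filters, and the 16/32/64 output channels come from the text. Everything else is an assumption made for illustration and may differ from Table 1: unpadded convolutions with stride 1, a 2 × 2 max-pooling step after each convolution, a three-channel input, and a single sigmoid output unit. The function names are hypothetical.

```python
def conv_output_size(size, kernel=3, stride=1):
    """Spatial size after an unpadded convolution (assumed padding mode)."""
    return (size - kernel) // stride + 1


def pool_output_size(size, window=2):
    """Spatial size after non-overlapping max pooling (assumed 2 x 2)."""
    return size // window


def summarize(input_size=64, in_channels=3, filters=(16, 32, 64), kernel=3):
    """Trace feature-map shapes and trainable parameters layer by layer."""
    size, channels = input_size, in_channels
    rows, total_params = [], 0
    for index, out_channels in enumerate(filters, start=1):
        # Each filter has kernel * kernel * channels weights plus one bias.
        params = (kernel * kernel * channels + 1) * out_channels
        size = conv_output_size(size, kernel)
        rows.append((f"conv{index}", (size, size, out_channels), params))
        size = pool_output_size(size)
        rows.append((f"pool{index}", (size, size, out_channels), 0))
        channels = out_channels
        total_params += params
    flat = size * size * channels
    rows.append(("flatten", (flat,), 0))
    dense_params = flat + 1  # one sigmoid output unit: weights plus bias
    rows.append(("dense_sigmoid", (1,), dense_params))
    total_params += dense_params
    return rows, total_params


rows, total_params = summarize()
for name, shape, params in rows:
    print(f"{name:14s} {str(shape):14s} {params:6d}")
print("total trainable parameters:", total_params)
```

Under these assumptions the flattened vector has 6 × 6 × 64 = 2304 elements and the whole network has 25,889 trainable parameters, which is orders of magnitude fewer than architectures such as AlexNet.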

Result and Analysis
Model evaluation checks whether the predicted values match the actual values during testing. In this study, the evaluation was carried out using a confusion matrix. The reference parameters for system performance are accuracy, precision, recall, and F1-score.
True Positive (TP) is the case where both the prediction and the actual value are TRUE. True Negative (TN) is the case where both the prediction and the actual value are FALSE. False Positive (FP) is the case where a sample is predicted as positive (TRUE) but is actually negative. False Negative (FN) is the case where a sample is predicted as FALSE but is actually TRUE [8]. F1-score is the harmonic mean of precision and recall.
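The four metrics can be computed directly from the confusion-matrix counts using their standard definitions. The sketch below is illustrative only: the function name `classification_metrics` is hypothetical, and the counts in the example are made up and are not results from this study.

```python
def classification_metrics(tp, tn, fp, fn):
    """Accuracy, precision, recall, and F1-score from confusion-matrix counts."""
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total
    # Guard against empty denominators when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1-score is the harmonic mean of precision and recall.
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical counts, not taken from the experiments in this study.
metrics = classification_metrics(tp=45, tn=40, fp=10, fn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")

# A classifier with no false positives and no false negatives scores 1.0 throughout.
perfect = classification_metrics(tp=50, tn=50, fp=0, fn=0)
```

For the hypothetical counts, accuracy is 85/100 = 0.85, precision is 45/55, recall is 45/50 = 0.9, and the F1-score is 90/105.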
The experimental results show that the proposed method achieves a precision, recall, and F1-score of 1 and an accuracy of 100%. Training ran for 100 iterations (100 epochs). Accuracy increases over the iterations, and the difference between training accuracy and validation accuracy is small, from which it can be concluded that the designed model does not overfit. The confusion matrix in Figure 9 shows that, of a total of 128 validation images, all were assigned to their correct class. The precision, recall, and F1-score after 100 iterations are given in Table 2.

Conclusion
To classify Covid-19 and pneumonia from chest X-ray images, a system has been built using a CNN with three hidden layers plus one fully connected layer with sigmoid activation to determine the class. After 100 iterations of training and validation, the system obtained an accuracy of 100% with a precision, recall, and F1-score of 1.00. It can therefore be concluded that the system does not overfit, meaning that it can distinguish Covid-19 from pneumonia in new chest X-ray images. This research is expected to proceed to field validation so that it can support medical authorities in clinical diagnosis.