Detection of COVID-19 using Hybrid ResNet and SVM

The whole world is facing a major crisis because of the coronavirus disease, also known as COVID-19, first identified in December 2019 in the city of Wuhan, China. Detecting infected persons is most important because the virus spreads easily from one person to another, and an infected person may not know of the infection until symptoms appear. In this paper, the virus is detected from chest X-ray images using deep learning and machine learning algorithms. A dataset is created with three classes: normal, coronavirus, and pneumonia images. The proposed method combines ResNet50 and SVM: deep features are extracted using ResNet50, and classification is done using an SVM classifier. The classification accuracy of the model is 100% when tested on the coronavirus and normal images, and 94% when tested on the dataset consisting of normal, coronavirus, and pneumonia images, where it performed well compared to VGG16.


Introduction
Coronavirus disease 2019 (COVID-19) is a highly infectious disease caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). The disease first appeared in December 2019 in Wuhan, China, and has since spread across the world, affecting more than 200 countries [1][2][3]. The impact is such that the World Health Organization (WHO) has declared the ongoing COVID-19 pandemic a Public Health Emergency of International Concern. Within 30 days the infection had spread from Wuhan to other parts of China [4]. The infection also spread to the United States of America [5], where only seven cases were found in January 2020 but more than 300,000 by mid-April. Severe acute respiratory syndrome coronavirus (SARS-CoV) and Middle East respiratory syndrome coronavirus (MERS-CoV) have caused severe respiratory illness and death in humans [6]. The common clinical features of COVID-19 include fever, cough, sore throat, headache, fatigue, muscle pain, and shortness of breath [7,41].
The infection can cause death in individuals with a weakened immune system [8,9]. The virus is transmitted from person to person through physical contact. In general, healthy individuals can be infected through several routes involving person-to-person contact [10,42]. Because the infection is new, a method known as real-time reverse transcription-polymerase chain reaction (RT-PCR) is widely used to detect COVID-19 [11,41]. Because RT-PCR has a low sensitivity of 60%-70%, signs of the disease can still be recognized by examining radiological images of patients even when the test result is negative [12,13]. Together with RT-PCR, CT images are used to distinguish COVID-19 pneumonia from other types of pneumonia [14]. CT can help detect the infection during the first 0-2 days after a patient begins to feel its effects [15]. In infected patients, the illness is found to become critical about 10 days after the symptoms first appear [16].
At the beginning of the pandemic there was a huge demand for testing, but test kits were scarce or unavailable, so many specialists were urged to obtain better results from CT. Chinese hospitals and testing centers also had too few test kits, and the tests gave a high rate of negative results, which made it difficult to control the initial spread [15,17]. Many nations that did not have testing kits relied on CT to detect the infection in patients and to distinguish them. Turkey, for example, also relied on these scans because testing kits were not accessible. Many research centers and researchers have focused on these types of images so that COVID-19 patients can be identified easily and quickly with low error rates [6, 14, 18-20, 51, 55]. Radiological images obtained from COVID-19 cases contain useful information for diagnosis. A few studies have observed changes in chest X-ray and CT images before the onset of COVID-19 symptoms [21,41].
Biomedical research currently uses artificial intelligence to speed up results. Many applications, such as data classification, image classification [43,45,47,49], and segmentation, are performed using artificial intelligence and deep learning models [22,23]. Lung infections are found in most people affected by the novel virus, as it attacks the lungs of the patients. A large number of deep learning models have been developed to analyze chest X-ray images [24,53]. In [25], pneumonia X-ray images are classified using deep learning models in three distinct steps. Using the ResNet model, the authors classified the dataset by several labels, such as age and gender. They also used a Multi-Layer Perceptron (MLP) as a classification method and achieved an average accuracy of 82.2%.
In this situation, one essential measure, already begun in most nations, is manual testing, so that the real situation can be understood and appropriate decisions can be taken. The drawbacks of manual testing include the unavailability of test kits and costly, inefficient blood tests; a blood test takes around 5-6 hours to produce a result. The idea is therefore to overcome these limitations using deep learning for better and more effective treatment. Since the disease is highly contagious, the earlier the results are produced, the fewer cases there will be in the community, which is why a Convolutional Neural Network is used for this task. Figure 2 shows chest X-ray images of a 50-year-old patient affected by the virus, taken on days 1, 4, 5 and 7 [26].

Fig.2. (a) Image on day-1 (b) Image on day-4 (c) Image on day-5 (d) Image on day-7
In Fig. 2(a), taken on day 1, there are no significant findings and the lungs are clear. Fig. 2(b), taken on day 4, shows patchy, ill-defined bilateral alveolar consolidations with peripheral distribution. Fig. 2(c) shows clear evidence of radiological worsening, with consolidation in the left upper lobe. Fig. 2(d) shows further radiological worsening, with the typical findings of ARDS.
The investigation of COVID-19 using deep learning works on lung X-rays of patients, and the basic idea is to classify each X-ray as COVID affected or normal. In short, the problem is a binary classification problem in which we classify normal versus COVID-19 cases. Using deep learning in this kind of situation has advantages and disadvantages. The advantages are that it is more time-efficient, less expensive, and simple to operate; the disadvantage is that an accuracy of nearly 100% is needed in practice, since wrongly classifying a patient would lead to further spread of the infection, which must be avoided. Machine learning and deep learning algorithms are becoming popular and are applied in many fields. In [37,48], the authors used machine learning and deep learning models to analyze educational techniques and showed that machine learning techniques can be used to improve student performance. In [38,46], the authors used machine learning techniques such as Naïve Bayes and K-Means clustering to detect heart disease.
In [39], the authors described in detail the challenges faced in big data machine learning, along with its perspectives, including its batch and stream aspects. In [40], the authors used machine learning techniques such as the C4.5 decision tree, Naïve Bayes, and the radial basis function (RBF) to diagnose malaria, and found that RBF performed better than the other techniques. In [44], the authors predicted rainfall using multiple linear regression, support vector regression, and Lasso regression, and found that Lasso regression performed best. Machine learning techniques have also been successfully applied to telecommunication failure data [50], intrusion detection [52], and the analysis of soil structure [54]. Machine learning thus has wide applicability and has found applications in other sectors as well.
In this paper, a deep transfer learning methodology is implemented to detect coronavirus infection from chest X-ray images of patients. Deep learning architectures such as ResNet50, Inception V3, and VGG16 are used for transfer learning. The base layers of these architectures are pre-trained on the ImageNet dataset, whereas the top layers are fine-tuned for medical image classification in our study. In addition, the incremental learning model ResNet50+SVM is implemented to further improve accuracy. Deep features of the chest X-ray images are extracted using the ResNet50 model and then classified using a linear SVM classifier.

Dataset
In our research, we used two different datasets. The first is used to find the best classifier, one that gives reliable predictions when the data is class-balanced with 2 classes; the second is used to train the best performing models to predict the 3-class data, which is imbalanced. The COVID-19 dataset is accessible on the Kaggle site. It is a binary dataset in which each image is labeled either COVID or normal. It is a medium-sized dataset of chest X-rays of people from several age groups who are either affected by COVID or suspected of being affected by the disease. The images were taken at different angles and are therefore preprocessed and resized to a fixed shape. The sample images are shown in Figure 1.
ICMECE 2020 IOP Conf. Series: Materials Science and Engineering 993 (2020) 012046 IOP Publishing doi:10.1088/1757-899X/993/1/012046
In some cases, patients affected by pneumonia have been misdiagnosed as COVID affected and kept in isolation in several parts of the world. In addition, because data on COVID is scarce and patient images are not widely available online, we created a custom dataset that combines the COVID-19 and pneumonia datasets from Kaggle. Our custom dataset consists of chest X-ray images of normal, pneumonia, and COVID cases, making it a 3-class dataset. However, compared to the other 2 classes, the data on COVID patients is comparatively limited. In this experiment the total dataset is divided into two parts, 70% for training and 30% for testing, and k-fold cross validation is used.
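The k-fold cross validation mentioned above can be sketched as follows. The paper does not state the value of k or which portion of the data the folds are drawn from, so `k=5`, the shuffling seed, and the function name `k_fold_indices` are assumptions made only for illustration.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_idx, val_idx) pairs for k-fold cross validation.

    k=5 and the seed are illustrative assumptions, not values from the paper.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Fold sizes differ by at most one sample.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, val_idx
        start += size
```

Each sample appears in exactly one validation fold, so averaging the k validation scores uses every image once for validation.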

Fig.3. Sample Images from our custom dataset. a) Normal b) Pneumonia c) COVID
In the experimental analysis, we use publicly available datasets covering three classes: normal, pneumonia, and COVID-19 chest images. The number of available COVID-19 images is limited because the infection is new. A set of 76 images labeled COVID-19 was selected for this study [27]. Another dataset, available on the Kaggle site with 219 images, is also used [28]. The pneumonia chest images include both viral and bacterial types and were taken from 53 patients.
All these images were prepared by specialists and made freely available for further research [29]. The combined dataset contains three classes: 295 images in the COVID-19 class, 98 in the pneumonia class, and 65 normal images with no infection, for a total of 458 images. 70% of these images are used for training and 30% for testing. In this experiment, Inception V3 [30,31], ResNet50, and VGG16 [34,35,36] are used, and a hybrid model is proposed.
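As a rough illustration of the 70/30 split on the 458-image dataset, the sketch below splits each class separately so that the class proportions are preserved. The paper does not say whether the split was stratified; stratification, the seed, and the names `CLASS_COUNTS` and `stratified_split` are assumptions.

```python
import random

# Class counts as reported in the text (295 + 98 + 65 = 458 images).
CLASS_COUNTS = {"covid": 295, "pneumonia": 98, "normal": 65}


def stratified_split(labels, train_percent=70, seed=0):
    """Split sample indices per class into train/test index lists.

    Stratifying by class is an assumption; the paper only states 70%/30%.
    """
    rng = random.Random(seed)
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    train_idx, test_idx = [], []
    for idxs in by_class.values():
        rng.shuffle(idxs)
        # Integer arithmetic avoids floating-point rounding surprises.
        cut = len(idxs) * train_percent // 100
        train_idx += idxs[:cut]
        test_idx += idxs[cut:]
    return sorted(train_idx), sorted(test_idx)


labels = [name for name, count in CLASS_COUNTS.items() for _ in range(count)]
train_idx, test_idx = stratified_split(labels)
```

With an imbalanced dataset such as this one, a per-class split keeps the smallest class (normal) represented in both parts.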

ResNet50+SVM
In medical and healthcare applications, accurate predictions are crucial, as any misprediction could result in a huge loss, sometimes even the loss of human life. To further improve prediction accuracy, we implemented incremental learning from pattern recognition. The base model is ResNet50, pre-trained on the ImageNet dataset. The pre-trained model is fine-tuned on our dataset and used for deep feature extraction from the X-ray images. The extracted features are then classified by a linear SVM classifier, which predicts whether an X-ray image belongs to a COVID affected patient or not.
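The feature-extraction-plus-SVM pipeline described above might look roughly like the following sketch. It is not the authors' code: it assumes TensorFlow/Keras and scikit-learn, uses the ImageNet weights without the fine-tuning step mentioned in the text, and assumes 224x224 RGB inputs and `C=1.0`. The helper names (`build_resnet50_extractor`, `fit_feature_svm`, `predict_with_svm`) are hypothetical.

```python
import numpy as np
from sklearn.svm import SVC


def build_resnet50_extractor():
    """Return a function mapping a batch of RGB images to 2048-d features.

    TensorFlow is imported lazily so the SVM helpers work without it.
    Assumption: ImageNet weights are used as-is, with no fine-tuning.
    """
    from tensorflow.keras.applications.resnet50 import ResNet50, preprocess_input

    # include_top=False drops the ImageNet classifier; pooling="avg" yields
    # one 2048-dimensional vector per image.
    backbone = ResNet50(weights="imagenet", include_top=False, pooling="avg")

    def extract(images):
        # images: array of shape (n, 224, 224, 3) with pixel values in 0-255.
        batch = preprocess_input(np.array(images, dtype="float32"))
        return backbone.predict(batch, verbose=0)

    return extract


def fit_feature_svm(extract, train_images, train_labels, C=1.0):
    """Extract deep features, then fit a linear-kernel SVM on them."""
    features = extract(train_images)
    classifier = SVC(kernel="linear", C=C)
    classifier.fit(features, train_labels)
    return classifier


def predict_with_svm(extract, classifier, images):
    """Classify images by running the SVM on their extracted features."""
    return classifier.predict(extract(images))
```

A typical call would be `extract = build_resnet50_extractor()` followed by `clf = fit_feature_svm(extract, x_train, y_train)`. Passing the extractor as an argument keeps the SVM stage independent of the backbone, so another feature extractor can be swapped in for comparison.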

RESULTS AND DISCUSSION
The first dataset has 25 X-ray images of COVID affected patients and 25 X-ray images of normal people. As the dataset is small, we performed data augmentation to obtain reliable results. The transfer learning models produced accuracies of 80% for Inception V3, 98% for VGG16, and 60% for ResNet50. The hybrid architecture ResNet50+SVM improved on this, giving 100% accuracy for the detection of COVID-19.
The ResNet50 model reached a good training accuracy of 94% in fewer than 15 epochs but did not perform well on the validation set, yielding an average accuracy of 60%. The validation loss gradually increased while the training loss decreased steadily, a pattern that indicates overfitting.
Compared to the ResNet50 model, the Inception V3 model gave better results, with a training accuracy of 98% and a validation accuracy of 80%. However, it reached its highest accuracy only after 15 epochs. The training loss decreased gradually from an initial value of 0.8 to its lowest value of 0.2, but the validation loss curve is highly unstable and fluctuates every 5 epochs.
Of all the models trained with the transfer learning methodology, VGG16 gave the best results on this dataset. It gave the highest training accuracy of 100% and a validation accuracy of 98%, which is significantly higher than that of the other architectures. The training and validation loss curves both show a gradual decrease.
With incremental learning, where the ResNet50 architecture is used for deep feature extraction and a linear SVM is used for classification, the results are the most accurate of all the methods in our study: a validation accuracy of 100% and a training accuracy of 100%. The ROC curve shows an area under the curve of 1.0, with sensitivity and specificity both equal to 1.0. Thus, our proposed ResNet50+SVM model gives highly accurate and reliable predictions for detecting COVID in patients. The model is then trained and tested on the custom dataset and compared with VGG16. This second dataset combines chest X-ray images of COVID patients with the chest X-ray pneumonia dataset from Kaggle and thus consists of 3 classes: normal, pneumonia, and COVID. The two most accurate models, VGG16 and ResNet50+SVM, are applied to it to further test their effectiveness.
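For reference, the sensitivity, specificity, and area under the ROC curve quoted above can be computed as in the minimal sketch below. The function names and the convention that label 1 marks the COVID class are assumptions; the AUC is computed from its rank interpretation (the probability that a positive sample scores higher than a negative one) rather than by tracing the curve.

```python
def sensitivity_specificity(y_true, y_pred, positive=1):
    """Return (sensitivity, specificity) for binary predictions."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    return tp / (tp + fn), tn / (tn + fp)


def roc_auc(y_true, scores, positive=1):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

An AUC of 1.0 means every COVID image received a higher score than every normal image; on a small test set this can happen with only a handful of samples per class, so it is worth reporting the test set size alongside it.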

CONCLUSION
In this paper, we proposed a hybrid classifier that combines deep learning for feature extraction with machine learning for disease detection from medical images. The proposed model is more accurate than other standard deep learning architectures in terms of accuracy and several other evaluation metrics. The hybrid ResNet50+SVM model provides accurate results whether the data is class-balanced or class-imbalanced. On the balanced dataset the accuracy obtained is nearly 100%, although the dataset is small; on the imbalanced dataset, which consists of three classes, the proposed method achieves 94%, which is better than VGG16 at 84%. Thus, this model could be applied to datasets of various kinds and sizes in domains that require highly accurate decisions. However, the accuracy of the model on the second dataset could be further improved by applying class balancing methods. Thus, our model gives promising results in detecting COVID with the application of artificial intelligence.
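One common class balancing method is to weight each class inversely to its frequency. The paper does not say which balancing method is intended, so the choice below is an assumption: it uses the formula total / (number of classes x class count), the same heuristic scikit-learn documents for `class_weight="balanced"`, applied to the class counts reported for the 3-class dataset. The function name is hypothetical.

```python
def balanced_class_weights(class_counts):
    """Return inverse-frequency weights: total / (n_classes * count)."""
    total = sum(class_counts.values())
    n_classes = len(class_counts)
    return {name: total / (n_classes * count) for name, count in class_counts.items()}


# Class counts as reported for the 3-class dataset in the text.
weights = balanced_class_weights({"covid": 295, "pneumonia": 98, "normal": 65})
```

The resulting dictionary gives the rarest class the largest weight, so misclassifying a normal image costs more during training than misclassifying a COVID image. Such weights can be passed to classifiers that accept per-class weights.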