A Survey of the Application of Artificial Intelligence on COVID-19 Diagnosis and Prediction

The importance of classification algorithms has increased in recent years. Classification is a branch of supervised learning whose goal is to predict the categorical class labels of new cases. Additionally, since the spread of Coronavirus (COVID-19) began in 2019, the world has faced a great challenge in defeating the disease, even with modern methods and technologies. This paper gives an overview of classification algorithms to provide readers with an understanding of state-of-the-art classification algorithms and their applications in COVID-19 diagnosis and detection. It also describes some of the research published on classification algorithms, the existing gaps in the research, and future research directions. This article encourages both academics and machine learning learners to further strengthen the basis of classification methods.
Keywords-artificial intelligence; machine learning; deep learning; classification algorithms; COVID-19; medical image


I. INTRODUCTION
Machine learning is one of the fastest growing fields in Artificial Intelligence (AI). Classification refers to the predictive modeling problem in which a class label is predicted for a given example of input data. Many machine learning algorithms are used for this purpose. This paper considers k-Nearest Neighbor (KNN), Support Vector Machine (SVM), Neural Networks (NNs), Decision Tree (DT), Logistic Regression (LR), and Random Forest (RF). Machine learning is divided into 3 main fields based on the learning style: supervised learning, unsupervised learning, and reinforcement learning. This research surveys the machine learning classification algorithms used to predict and diagnose COVID-19, focusing on the strengths and weaknesses of each algorithm. During the COVID-19 pandemic, in May 2020, authors in [4] provided a pre-tuned boosted RF classification algorithm to help the medical industry overcome the gaps of the traditional healthcare system by applying machine learning algorithms for real-time processing and prediction. Authors in [2] used a novel dataset with different algorithms. After comparing the results, the boosted RF classifier was identified as a good model for the problem. It achieved an accuracy of 94% and an F1 score of 0.86. Authors in [5] provided a comprehensive survey of classification by introducing basic classification algorithms and discussing some of the main classification methods, including Bayesian networks, KNN classification, RF extrapolation, and SVM. They also discussed the problems and potential solutions associated with each. Authors in [6] focused on implementing AI and machine learning in healthcare by providing better solutions with a high rate of accuracy to overcome the challenges presented by COVID-19. 
They presented a review of the previous studies that applied AI algorithms and techniques to tackle COVID-19, such as support vector regression algorithms, supervised multi-layered classifiers (XGBoost), deep learning LSTM networks, and regression trees. They presented the results along with the pros and cons of each suggested model. Early detection of COVID-19 is essential. Previous studies have shown that in initial COVID-19 screenings and in all patient-care environments, blood tests are relatively rapid, cheap, and simple to conduct. Authors in [7] analyzed the state-of-the-art COVID-19 detection techniques and provided clinical data in order to encourage researchers to develop better disease prediction models. In [8], sixty newspaper articles, reports, factsheets, and websites dealing with COVID-19 were studied and reviewed. Most mathematical models were based on Susceptible-Infected-Recovered (SIR) models and Susceptible-Exposed-Infected-Removed (SEIR) models, while Convolutional Neural Networks (CNNs) were the main AI implementation used with X-ray and CT images. Mathematical modeling and AI have proved to be reliable tools to mitigate this pandemic.
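As a hedged illustration of the SIR compartmental models mentioned above, the following minimal sketch integrates the standard SIR equations with a forward-Euler step. The function name `simulate_sir`, the step size, and all parameter values (`beta`, `gamma`, population size) are illustrative assumptions and are not taken from the surveyed studies.

```python
def simulate_sir(s0, i0, r0, beta, gamma, days, dt=0.1):
    """Integrate dS/dt = -beta*S*I/N, dI/dt = beta*S*I/N - gamma*I, dR/dt = gamma*I."""
    n = s0 + i0 + r0
    s, i, r = float(s0), float(i0), float(r0)
    history = [(0.0, s, i, r)]
    steps = int(round(days / dt))
    for k in range(1, steps + 1):
        new_infections = beta * s * i / n * dt   # flow from S to I during one step
        new_recoveries = gamma * i * dt          # flow from I to R during one step
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        history.append((k * dt, s, i, r))
    return history


# Hypothetical parameters chosen only for illustration.
trajectory = simulate_sir(s0=9990, i0=10, r0=0, beta=0.3, gamma=0.1, days=160)
peak_time, _, peak_infected, _ = max(trajectory, key=lambda row: row[2])
print(f"peak of about {peak_infected:.0f} infected near day {peak_time:.0f}")
```

An SEIR variant would add an exposed compartment between S and I. The Euler scheme is used here only because it is short; a production model would normally rely on a dedicated ODE solver.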
III. RESEARCH METHODOLOGY
For searching and selecting previous studies, the Saudi Digital Library (SDL), IEEE Xplore, ScienceDirect, and Google Scholar databases were consulted. These databases are considered appropriate for COVID-19 prediction and cover the newest literature. The research studies retrieved from these databases are relevant and help to understand the purpose of AI systems in COVID-19. The databases were searched over a 2-year period between 2019 and 2021. The following keywords were utilized to direct the search process: classification algorithms, classification applications, classification review, machine learning, coronavirus, and COVID-19. Only modern studies and articles containing relevant information about the application of classification techniques concerning COVID-19 were selected and extracted. All publication types that were published in conferences, journals, and review papers were selected. The primary result of the search process identified 47 articles that met the review's eligibility rules, which are the novelty of the paper and the use of AI. After examining the full texts, 40 were selected and 4 articles were excluded because other methods were used in diagnosing COVID-19 instead of AI. It is difficult to present a comprehensive review of all machine learning classification methods in one article; consequently, this survey only focused on commonly used classification techniques.

IV. CLASSIFICATION ALGORITHMS
Classification algorithms are the most widely used techniques in the machine learning field. In general, the models in machine learning are divided into 3 general approaches based on their learning style: supervised learning, unsupervised learning, and reinforcement learning. Classification is a form of supervised learning in which the machine learns how to assign a class label to each data instance. There are several models of classification algorithms, such as LR, DT classifier, SVM, NNs, KNN, and RF. Table I shows a quick comparison of all the mentioned classification algorithms in terms of their advantages and disadvantages.

A. Support Vector Machine
SVM classification has become popular due to its impressive performance, which is comparable to that of advanced NNs trained on complex tasks with computationally expensive algorithms. SVMs were originally designed as an effective approach for pattern recognition and classification [9]. Immediately after their introduction, these algorithms were used for classification problems and applications such as speech recognition [10], computer vision, and image processing [11]. The SVM algorithm is primarily used for classification. SVM generates a hyperplane that separates data into various classes, and it can solve either linear or non-linear problems. The main objective of the SVM is to discover a hyperplane in N-dimensional space (N matches the number of features) that clearly classifies the data points. The accuracy of the outcome is directly related to the selected hyperplane. SVM seeks the hyperplane with the maximum distance to the data points of the two classes. In two dimensions, the hyperplane is illustrated graphically as a line that divides one class from another. The data points lie on different sides of the hyperplane and are assigned to different classes. The hyperplane dimension depends on the number of features. If the input has two features, the hyperplane is a line; if it has three features, the hyperplane is a 2D plane. The SVM is also utilized in medical diagnosis to detect anomalies and in air quality management systems and financial analyses.
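The maximum-distance criterion described above can be made concrete with a small sketch. The code below does not train an SVM; under the assumption of a toy, linearly separable 2D dataset invented for illustration, it only computes the margin of two hand-picked candidate hyperplanes and keeps the one with the larger margin. All names (`signed_distance`, `margin`, `candidates`) are hypothetical.

```python
import math


def signed_distance(w, b, x):
    """Signed distance from point x to the hyperplane w.x + b = 0."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm


def classify(w, b, x):
    """Assign a class according to the side of the hyperplane on which x lies."""
    return 1 if signed_distance(w, b, x) >= 0 else -1


def margin(w, b, points, labels):
    """Smallest label-weighted distance; positive only if every point is on its correct side."""
    return min(y * signed_distance(w, b, x) for x, y in zip(points, labels))


# Toy data invented for illustration: two features, labels -1 and +1.
points = [(1, 1), (2, 1), (1, 2), (4, 4), (5, 4), (4, 5)]
labels = [-1, -1, -1, 1, 1, 1]

# Two hand-picked separating lines, each given as (w, b).
candidates = {
    "x1 + x2 = 5.5": ((1.0, 1.0), -5.5),
    "x1 = 3": ((1.0, 0.0), -3.0),
}
best = max(candidates, key=lambda name: margin(*candidates[name], points, labels))
print(best, round(margin(*candidates[best], points, labels), 3))
```

Both lines separate the classes, but the first leaves a wider gap to the nearest points, which is the property an SVM optimizes over all possible hyperplanes rather than over a fixed candidate list.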

B. K-Nearest Neighbor
The KNN algorithm assumes that similar objects are close to one another. This algorithm is one of the most powerful supervised classification algorithms. KNN classification models search for the classes of the nearest neighbors in order to predict the target label. KNN is a simple algorithm that stores all available cases and classifies new cases according to a measure of similarity, which is computed by a distance function (Manhattan, Euclidean, Minkowski, or Hamming). Nearest neighbor techniques are categorized as structureless KNN techniques and structure-based KNN techniques. Structure-based KNN techniques are based on data structures, such as the Orthogonal Structure Tree (OST), axis tree, k-d tree, and ball tree. Additionally, data and site patterns suggesting suspicious behavior are analyzed by KNN algorithms.
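A minimal structureless KNN classifier, together with the distance functions named above, can be sketched as follows. The training records are invented for illustration and are not clinical data; the function names are hypothetical, and the features are left unscaled for brevity even though scaling usually matters for distance-based methods.

```python
from collections import Counter


def minkowski(a, b, p=2):
    """Minkowski distance; p=1 is the Manhattan distance and p=2 the Euclidean distance."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1 / p)


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def knn_predict(train, query, k=3, distance=minkowski):
    """train is a list of (features, label) pairs; return the majority label of the k nearest."""
    nearest = sorted(train, key=lambda item: distance(item[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


# Hypothetical (body temperature in deg C, oxygen saturation in %) records.
train = [
    ((36.6, 98), "negative"), ((36.9, 97), "negative"), ((37.0, 99), "negative"),
    ((38.5, 91), "positive"), ((39.1, 89), "positive"), ((38.8, 92), "positive"),
]
print(knn_predict(train, (38.7, 90)))
```

Structure-based variants such as k-d trees or ball trees return the same neighbors but avoid scanning every stored case for each query.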

C. Neural Networks
NNs are more accurately called Artificial Neural Networks (ANNs). An ANN is a series of neurons connected by synapses that mimics the structure of the human brain and has several processing components. The human brain, however, is much more complicated. A neural network consists of neurons, or nodes. Each of these neurons accepts data, processes them, and transfers them to a different neuron. The various units communicate by transmitting signals to each other. An ANN is constructed as a directed graph with nodes and edges that link the nodes. The edges of each node are the interconnections. NNs can essentially be used for any task, from spam filtering to computer vision. They are typically utilized for machine translation, identification of anomalies and risk management, as well as language and face recognition.
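The signal flow described above, in which each neuron accepts data, processes them, and passes the result on, can be sketched as a forward pass through a small fully connected network. The network shape, the sigmoid activation, and all weights below are assumptions made for illustration; training (for example, by backpropagation) is deliberately omitted.

```python
import math


def sigmoid(z):
    """Squash a real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def layer_forward(inputs, weights, biases):
    """Each neuron sums its weighted inputs, adds a bias, and applies the activation."""
    return [
        sigmoid(sum(w * x for w, x in zip(neuron_weights, inputs)) + b)
        for neuron_weights, b in zip(weights, biases)
    ]


def network_forward(inputs, layers):
    """Pass the signal through the layers in order; layers is a list of (weights, biases)."""
    signal = inputs
    for weights, biases in layers:
        signal = layer_forward(signal, weights, biases)
    return signal


# Hypothetical weights for a network with 2 inputs, 2 hidden neurons, and 1 output.
layers = [
    ([[0.5, -0.5], [1.0, 1.0]], [0.0, -1.0]),  # hidden layer
    ([[1.0, -1.0]], [0.0]),                    # output layer
]
output = network_forward([1.0, 1.0], layers)
print(output)
```

Each inner weight list corresponds to the incoming edges of one node in the directed graph, so adding a layer simply appends another `(weights, biases)` pair.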

D. Logistic Regression
LR is a machine-learning algorithm that predicts the likelihood of the response variable given a set of explanatory independent variables. It is a supervised binary classification algorithm [12]. The response variable is coded as a binary value, 0 or 1, and the data are classified based on the value of this binary target variable. The LR types are binary or binomial, multinomial, and ordinal. The target variable in binary LR has two possible values, 0 and 1. Multinomial LR is used when the response variable has 3 or more unordered possible values, such as class types A, B, C, and D. Ordinal LR is the classification in which the response variable has 3 or more ordered values, such as students' scores (high, middle, and low). The LR algorithm has a solid statistical and mathematical background, but it is sensitive to outliers and prone to multicollinearity.
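How LR turns explanatory variables into a likelihood, and then into a class, can be sketched as follows. The coefficients are hypothetical and not fitted to any data; fitting (usually by maximum likelihood) is omitted. The `softmax` helper illustrates the multinomial case under the common assumption of one linear score per class.

```python
import math


def predict_probability(features, weights, bias):
    """Binary LR: P(y = 1 | x) = sigmoid(w.x + b)."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def predict_class(features, weights, bias, threshold=0.5):
    """Code the response as 1 when the predicted likelihood reaches the threshold, else 0."""
    return 1 if predict_probability(features, weights, bias) >= threshold else 0


def softmax(scores):
    """Multinomial LR: convert one linear score per class into class probabilities."""
    largest = max(scores)
    exps = [math.exp(s - largest) for s in scores]  # shift by the max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical coefficients, not fitted to any real data.
weights, bias = [1.2, -0.8], -0.5
print(predict_probability([2.0, 1.0], weights, bias))
print(predict_class([0.0, 1.0], weights, bias))
print(softmax([2.0, 1.0, 0.1]))
```

Because the score `w.x + b` is linear in the features, the boundary where the probability equals the threshold is a straight line or hyperplane, which is why LR cannot separate classes that are not linearly separable.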

E. Decision Tree
The DT classifier is a supervised machine learning algorithm that builds a tree structure to classify the labeled data based on a condition. The root node is the topmost node in the tree, and the leaf nodes represent the outcomes, where the edges connect the nodes to the leaf nodes.
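The tree structure described above can be sketched as nested nodes, with classification performed by following edges from the root to a leaf. The features, thresholds, and labels below are hypothetical and have no clinical meaning; the tree is written by hand, whereas a real DT learner would choose the conditions from labeled data.

```python
# A node is either a leaf {"label": ...} or an internal node that tests one feature.
tree = {
    "feature": "fever", "threshold": 38.0,   # root node
    "left": {"label": "negative"},           # taken when fever < 38.0
    "right": {                               # taken when fever >= 38.0
        "feature": "spo2", "threshold": 94.0,
        "left": {"label": "positive"},       # taken when spo2 < 94.0
        "right": {"label": "negative"},      # taken when spo2 >= 94.0
    },
}


def classify(node, sample):
    """Follow edges from the root until a leaf is reached, then return its label."""
    while "label" not in node:
        branch = "left" if sample[node["feature"]] < node["threshold"] else "right"
        node = node[branch]
    return node["label"]


print(classify(tree, {"fever": 39.0, "spo2": 90.0}))
```

Because every decision is an explicit comparison on a single feature, the path taken for a sample can be read back directly, and no feature scaling is required.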

F. Random Forest
RF is a classification and regression model; however, it is usually used in classification problems. It is a combination of random DTs whose predictions are aggregated to find the best prediction [13]. To determine the best separation of the data, RF uses the most-voted-class approach to select the best solution [14]. This algorithm is used in different fields in real-life applications, such as gene selection, finance analysis, stock market, e-commerce, and medical diagnoses.

Table I. Advantages and disadvantages of the classification algorithms

SVM
Disadvantages:
-Problems in selecting the right kernel function.
-Datasets have different results based on the kernel function.
-Not optimal for non-linear situations or problems; not the best option for a huge number of features.

KNN
Advantages:
-Simple to understand.
-Fast and efficient.
Disadvantages:
-Sensitive to noise.
-The number of neighbors k should be picked manually.

NN
Advantages:
-Identifies complex associations between independent and dependent variables efficiently.
-Capable of managing noisy data.
Disadvantages:
-Prone to local minima and overfitting.
-ANN processing is complicated to interpret and demands a long processing time.

LR
Advantages:
-Reasonable accuracy for datasets; works perfectly if the dataset is linearly separable.
Disadvantages:
-Cannot solve non-linear problems, since it has a linear decision surface.
-In real-world situations, linearly separable data are rarely found.

DT
Advantages:
-Easily processes high-dimensional data.
-Interpretability.
-Works for both linear and nonlinear tasks, with no need for scaling or normalizing features.
Disadvantages:
-Poor performance, and overfitting can easily occur with very small datasets.

RF
Advantages:
-Powerful and accurate.
-Good performance on a variety of problems.
Disadvantages:
-Need to choose the number of trees manually.
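The most-voted-class rule that RF relies on (Section IV-F) can be sketched as follows. This is only the aggregation step: a real RF also trains each tree on a bootstrap sample with random feature subsets, which is omitted here. The three one-split trees, their thresholds, and the sample are hypothetical.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the most frequent class among the individual tree predictions."""
    return Counter(predictions).most_common(1)[0][0]


def forest_predict(trees, sample):
    """Each tree is any callable that maps a sample to a class label."""
    return majority_vote([tree(sample) for tree in trees])


# Three hypothetical one-split trees (stumps), each looking at a different feature.
trees = [
    lambda s: "positive" if s["fever"] >= 38.0 else "negative",
    lambda s: "positive" if s["spo2"] < 94.0 else "negative",
    lambda s: "positive" if s["cough"] else "negative",
]
sample = {"fever": 38.6, "spo2": 96.0, "cough": True}
print(forest_predict(trees, sample))
```

The number of trees is a parameter of the forest rather than something learned, which corresponds to the RF disadvantage listed in Table I.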

V. COVID-19 DETECTION AND DIAGNOSIS
In coronavirus research, diagnosis is an essential preventive phase because COVID-19 has a similar appearance to other forms of pneumonia. Consequently, detecting COVID-19 in its initial stages is critically important. Different diagnostic methods for COVID-19 have been suggested, including a range of medical imaging techniques, blood tests (CBCs), and PCR. The World Health Organization states that diagnoses of COVID-19 have to be confirmed with reverse transcriptase-polymerase chain reaction (RT-PCR) tests [15]. According to the U.S. Food and Drug Administration [39], there are 2 main types of tests that have been used to diagnose the virus: RT-PCR and imaging methods such as X-ray and CT scanning [16,17]. RT-PCR testing, however, takes time, and this can be dangerous. For this reason, the initial detection of COVID-19 is often conducted with medical imaging first, and the RT-PCR test is then carried out. The latest results from radiological imaging indicate that these images provide valuable details on the COVID-19 virus. The use of advanced AI technology combined with radiological imaging can be useful to accurately diagnose this disease. X-ray imaging has the benefits of being cheap and of low risk for human health [25,26]. Detecting COVID-19 in an X-ray, however, is a relatively complex process. In these images, the radiologist should carefully note the very long and problematic spots carrying water and pus. Therefore, illnesses such as pulmonary tuberculosis may be erroneously identified by a radiologist or a doctor as COVID-19, and the X-ray method has a high error rate. Computed Tomography (CT) images are often utilized to detect the virus more precisely. However, CT images are much more costly for patients than X-rays. 
Detecting the effect of different respiratory system diseases such as ARDS, streptococcus pneumonia, chlamydia pneumonia, cavitating pneumonia, pneumococcal pneumonia, aspiration pneumonia, and COVID-19 on human lungs by using X-ray imaging techniques can be challenging as these diseases have the same impact on human lungs.
Authors in [18] implemented a personalized network (DarkCovidNet), based on deep NNs, for the automatic detection of COVID-19 in raw Chest X-Ray (CXR) samples. The proposed model provides accurate Binary (COVID vs. No-Findings) and Multi-class (COVID vs. No-Findings vs. Pneumonia) classification, with an accuracy of 98.08% for binary and 87.02% for multi-class classification. The results demonstrated that the expert system can be applied to assist radiologists in validating the examination process quickly and accurately.
Compared to the common cold and influenza, coronavirus causes different symptoms, such as cough, fever, dyspnea, musculoskeletal symptoms, and anosmia/dysgeusia [19]. Authors in [20] proposed a method for binary classification and multi-classification problems by using deep learning CNN networks for COVID-19 detection. The model attains 99.6% accuracy on X-ray images and 71.81% on CT-scan images. Authors in [21] proposed a method that used the Decompose, Transfer, and Compose (DeTraC) technique to predict and classify the chest images of COVID-19 patients. DeTraC deals with image recognition and classification successfully. The authors applied the CNN DeTraC to an X-ray image dataset gathered from many hospitals around the world. The model was trained in 3 steps. First, the deep features of the images were extracted. Second, optimization was performed using Stochastic Gradient Descent (SGD). Finally, the images were classified by using the last layer of the DeTraC network. The model achieved a good performance with 95.12% accuracy, 97.91% sensitivity, and 91.87% specificity. On the other hand, authors in [22] proposed a new deep CNN framework to detect and classify patients' X-ray images as positive or negative cases. The proposed framework, COVIDX-Net, includes 7 different deep CNN models, such as VGG16, InceptionV3, V3, DenseNet, and Google MobileNet, to analyze and detect viruses. COVIDX-Net achieved a good performance with VGG19 and DenseNet, with F1-scores of 0.89 and 0.9, respectively. Furthermore, authors in [23] introduced another CNN model to detect the confirmed cases of the coronavirus. Authors in [17] used machine learning for the classification of COVID-19 and non-COVID lung scan slices with 89.8% classification accuracy. Authors in [24] used chest X-ray images for COVID-19 diagnosis, and their classification accuracy was 99.56% for the disease class. Authors in [27] presented an early COVID-19 detection method using SVM. 
The algorithm was applied to abdominal CT images. From 150 CT images, 4 datasets of different dimensions were generated, and 4 different methods were employed for extracting chest CT image features. Then, SVM was utilized to distinguish COVID-19 patients. During the classification process, 10-fold cross-validation was used. The achieved classification accuracy was 99.68%.
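The cross-validation protocol mentioned above can be sketched as an index-splitting routine. This is a generic k-fold split, not the exact procedure of [27]: the function name is hypothetical, the sample count of 150 simply mirrors the number of CT images cited, and shuffling or stratification, which would normally precede the split, is omitted.

```python
def k_fold_indices(n_samples, k=10):
    """Yield (train_indices, test_indices) pairs; each sample is tested exactly once."""
    indices = list(range(n_samples))
    # Distribute any remainder over the first folds so that fold sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


folds = list(k_fold_indices(150, k=10))
for train, test in folds[:2]:
    print(len(train), len(test))
```

A classifier is trained on each `train` split and scored on the matching `test` split, and the k scores are then averaged to give the reported accuracy.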
In coronavirus detection and prediction, deep learning with a CNN is one of the most accurate solutions. Infected patients are currently diagnosed using either CT-scan or X-ray chest imaging. Authors in [28] used a dataset provided by Kaggle to develop a COVID-19 detector. The dataset contained Chest X-Ray (CXR) images of COVID-19 infected patients and of uninfected people. The proposed CNN-based COVID-19 detector was trained on augmented data and achieved high accuracy in the prediction of the patients' needs in the next 7 days. In addition, the prediction of the number of deaths, new confirmed cases, and recovered cases reached 94.18%, 99.94%, and 90.29%, respectively. Implementing the VGG16 CNN in the detector enhanced the model's accuracy. Authors in [29] introduced a new open-source network called COVID-Net. COVID-Net is a deep CNN that detects COVID-19 positive cases by analyzing CXR images. COVIDx, the largest open-source dataset of COVID-19 patients' CXR images, was used to train and assess the suggested COVID-Net. The model achieved a good test accuracy of 93.3%. Authors in [30] introduced a learning pipeline with a multi-view representation for COVID-19 classification, utilizing different types of features extracted from CT images. A total of 2522 CT images were used (1495 of COVID-19 patients and 1027 of Community-Acquired Pneumonia (CAP) patients). The classification was carried out with different machine learning models, i.e. LR, SVM, KNN, Gaussian-Naive-Bayes classifier, and NNs. The proposed method outperformed the examined machine learning models with an accuracy of 95.5%, sensitivity of 96.6%, and specificity of 93.2%.
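Since the surveyed studies report accuracy, sensitivity, specificity, and F1 scores, a short sketch of how these follow from the confusion-matrix counts may be useful. The labels below are invented for illustration, and the function assumes that both classes occur in the true labels and that at least one positive prediction is made, so that no denominator is zero.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute common diagnostic metrics from true and predicted binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)  # infected, flagged
    tn = sum(t != positive and p != positive for t, p in pairs)  # healthy, cleared
    fp = sum(t != positive and p == positive for t, p in pairs)  # healthy, flagged
    fn = sum(t == positive and p != positive for t, p in pairs)  # infected, missed
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),  # also called recall
        "specificity": tn / (tn + fp),
        "precision": tp / (tp + fp),
        "f1": 2 * tp / (2 * tp + fp + fn),
    }


# Hypothetical labels: 1 = COVID-19, 0 = non-COVID.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]
print(binary_metrics(y_true, y_pred))
```

On imbalanced datasets accuracy alone can be misleading, which is one reason sensitivity and specificity are usually reported alongside it.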
For the selection of features and classification of COVID-19, authors in [31] suggested two optimization algorithms. The proposed structure has 3 cascaded phases. Initially, features are extracted from the CT scans using a CNN called AlexNet. Then, the proposed feature selection algorithm, a guided Whale Optimization Algorithm (WOA) based on Stochastic Fractal Search (SFS), is applied. The selected features are then balanced. Finally, using voting classification and a guided WOA based on Particle Swarm Optimization (PSO), the predictions of different classifiers are aggregated to determine the most voted class. To evaluate the proposed model, two datasets of positive and negative COVID-19 clinical CT images were used. The proposed voting classifier (PSO-guided-WOA) achieved 99.5%, higher than the other voting classifiers. Authors in [32] proposed a new architecture for COVID-19 prediction consisting of 3 principal steps: random selection of the region of interest, training of the CNN to obtain features, and training of a fully connected network followed by prediction with various classifiers. This method achieved a classification accuracy of 82.9% utilizing the modified Inception (M-Inception) deep model developed using CT images. Authors in [33] created a large dataset of CT scan and X-ray images from multiple sources and presented two CNN approaches to classify them as having COVID-19 or not: transfer learning with AlexNet and a simple CNN architecture. They achieved 98% accuracy via the pre-trained model and 94.1% by using the modified CNN. Authors in [34] reviewed the possibility of using ML models for diagnosing coronavirus from X-ray images. They used LR and CNN, with a Generative Adversarial Network (GAN) for data augmentation in order to overcome the overfitting problem. The model gives 95.2-97.6% accuracy without PCA and 97.6-100% with PCA.
During the coronavirus period, the collected data were often heterogeneous. The features were either categorical, such as the occurrence of cough, or numerical (quantitative), such as body temperature. It is difficult to process these two forms together. Moreover, this information may be incomplete, as certain features have missing values. Incompleteness and heterogeneity are major challenges in identifying COVID-19 cases from data collected from many people, and they are not directly handled by the common classification algorithms. To deal with these two issues, authors in [34] proposed the KNN Variant (KNNV) classification algorithm for incomplete and heterogeneous datasets as a tool for detecting COVID-19 cases accurately and efficiently. The two key concepts of the proposed algorithm are that the parameter K is chosen adaptively and that it calculates the ... Authors in [36] introduced a study that analyzed X-ray images of patients' and healthy persons' lungs to compare and detect the level of infection of the lungs, and then classify it by level of effect. Because of the complexity of, and the limited access to, X-ray images, complexity-based theory was used to analyze the images.
During the COVID-19 epidemic, classifying patients as infected or healthy is critical to prevent the spread of the virus. Authors in [37] suggested a deep learning-based technique that utilizes an SVM classifier to detect and classify X-ray images of patients' lungs as infected or healthy. Due to the limited availability of diagnostic kits, the disease has been diagnosed with X-ray imaging machines, as they are widely available in clinics and hospitals. The proposed system, based on deep CNNs, was used to analyze the X-ray image datasets. The system was trained on two datasets, one of positive and one of negative cases. The image features were extracted from the datasets using deep architectures such as GoogleNet, ResNet50, and Inception V3. The combination of the SVM classifier, the open-source dataset, and deep feature extraction techniques achieved an accuracy of 95.38%. Authors in [38] proposed a COVID-19 classification approach that uses chest CT images to classify patients into two groups, +Ve (infected) and -Ve (not infected). The model, which combines Multi-Objective Differential Evolution (MODE) with a CNN, was applied to a chest CT image dataset, but the deep CNN showed some hyperparameter-tuning issues. The image features were extracted using several convolution layers, and a max-pooling layer was used to minimize the spatial size. The suggested model showed a good rate of accuracy, F-measure, and Kappa, with values of 97.89%, 2.0928%, and 1.9276%, respectively.
Authors in [40] introduced a deep learning-based transfer model to diagnose COVID-19 cases by using CT-scan and CXR images. The algorithm was trained on 3 datasets. The model produced detection results faster than the RT-PCR method. In addition, they discovered a pattern between patients infected with COVID-19 and pneumonia. When an X-ray or CT scan of an infected lung is performed, a shadowy area called Ground Glass Opacity (GGO) appears. Applying deep learning led to fast and efficient detection results. The aim of the model is the binary classification of the chest images in a fast and accurate way. Image classification was conducted, and 3 experiments were proposed and applied on the dataset. The model achieved an accuracy of 95.61%. Authors in [7] established an ensemble learning model known as ERLX for diagnosing COVID-19 from routine blood tests. The proposed model used structural diversity through 2 classification levels. To improve predictive capabilities, the predictions from the first-level classifiers (extra trees, RF, LR) were supplied to the second-level classifier (XGBoost). A series of data preparation steps was performed: the KNN algorithm was used to handle the null values in the data collection, Isolation Forest (iForest) to eliminate outlier data, and SMOTE to equalize the data distribution. By comparing the findings with the state-of-the-art studies available for a publicly available dataset from the Albert Einstein Hospital in Brazil, the efficacy and reliability of the model for diagnosing COVID-19 were demonstrated. Ensemble models are more efficient, robust, and flexible, since diversity is the fundamental guiding principle for capturing the underlying structure of the training data. The ensemble model achieved outstanding performance with an overall accuracy of 99.88%, sensitivity of 98.72%, and specificity of 99.99%.
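The use of KNN to handle null values in blood-test data can be illustrated with a generic KNN imputation sketch. The exact imputation procedure of [7] is not described here, so everything below is an assumption: the function name, the choice of k, the use of Euclidean distance over the observed features, and the invented blood-test rows.

```python
import math


def knn_impute(rows, k=2):
    """Replace each None with the mean of that feature over the k nearest complete rows."""
    complete = [r for r in rows if None not in r]
    imputed = []
    for row in rows:
        if None not in row:
            imputed.append(list(row))
            continue
        known = [j for j, v in enumerate(row) if v is not None]
        # Distance uses only the features that are present in the incomplete row.
        nearest = sorted(
            complete,
            key=lambda c: math.sqrt(sum((c[j] - row[j]) ** 2 for j in known)),
        )[:k]
        imputed.append([
            v if v is not None else sum(c[j] for c in nearest) / len(nearest)
            for j, v in enumerate(row)
        ])
    return imputed


# Hypothetical rows (values invented): [hemoglobin, leukocytes, platelets]
rows = [
    [13.5, 6.0, 250.0],
    [14.0, 6.5, 240.0],
    [10.0, 12.0, 150.0],
    [13.8, None, 245.0],
]
print(knn_impute(rows)[3])
```

In a full pipeline, outlier removal and class balancing would follow imputation before the first-level classifiers are trained.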
Due to the small number of X-ray and CT images with COVID-19, training machine learning and deep network models is extremely difficult. Therefore, transfer learning could be an applicable solution, and it has been extensively adopted in several recently proposed COVID-19 detection methods. A deep transfer learning-based approach has been suggested in [41] using COVID-19, normal, and viral pneumonia CXR images to detect COVID-19 pneumonia automatically with deep CNNs. The results show that transfer learning has proven to be successful, robust, and easily deployable for detecting COVID-19 with 98% accuracy. However, the conventional transfer-learning approach, which uses a deep network pre-trained on an image database to transfer the learned information, might not be an excellent option, since the features of COVID-19 X-ray and CT images are completely different from those of images from other applications. Nevertheless, this research gave hope that COVID-19 can be distinguished from other infections and normal lung conditions by utilizing CXR imaging. Authors in [42] reported a COVID-19 prediction analysis in CXR imagery by applying transfer learning. They compared 4 common deep CNN predictors (ResNet18, ResNet50, SqueezeNet, and DenseNet-161). After training, the models presented an average specificity rate of ~90% and 97.5% sensitivity. Authors in [43] suggested a weakly supervised deep-learning technique that uses 3D CT volumes for COVID-19 classification and lesion localization. The proposed method can reduce the manual labeling requirements of CT images, but it can still reliably detect infections and draw a distinction between COVID-19 and non-COVID-19 cases. Authors in [44,45] stated that CT scans are utilized in deep learning systems for computerized COVID-19 pneumonia detection. Although CT scans contribute more comprehensive information, X-rays are faster, easier to take, less harmful, and cheaper. 
Nevertheless, it is extremely hard to effectively train a very deep network because of the scarcity of COVID-19 X-rays at that time. The accuracy of their research in COVID-19 classification exhibited 96% AUC, 90% sensitivity, and 96% specificity and 99.6% AUC, 98.2% sensitivity, and 92.2% specificity respectively. In terms of analyzing and detecting the radiological images automatically, CNN deep learning techniques are widely used [46]. Usually in deep learning techniques, model training is conducted with large datasets, but due to the limitation of the infected X-ray or CT scan images, the network weights have been fine-tuned from pre-trained networks. Authors in [47] proposed a classification model to classify the chest CT scan of COVID-19 patients as positive (infected) or negative (uninfected). The model was based on Deep Transfer Learning (DTL) with DenseNet201 and CNN. It had 99.82% training accuracy and the achieved testing and validation accuracies were 96.25% and 97.4% respectively.
Authors in [48] proposed a deep DT classifier consisting of 3 stages. Each DT was trained using a deep learning model with an NN built on the PyTorch framework for the detection of COVID-19 in chest X-ray images. The average accuracy obtained was 95%. Authors in [49] utilized a pretrained DenseNet121 model for the classification of segmented 3D lung regions. The dataset consisted of 2724 CT scans from 2617 patients. Lung regions were segmented using the 3D Anisotropic Hybrid Network (AH-Net) architecture, achieving 90.8% accuracy, 93% specificity, and a 94.9% AUC score. The significance of using AI in the medical domain to diagnose and predict diseases lies in making decisions based on big data analysis. Authors in [50] suggested a deep learning-based model to predict the coronavirus severity level by using CT scan images of infected patients. The model was trained on 303 chest images and tested on 105 images, achieving 97.4% training and 81.9% testing accuracy.
Based on confirmed coronavirus cases, countries have been classified into categories according to the risk rate. Authors in [51] proposed an AI-guided method to predict the long-term country-specific risk of COVID-19 using a shallow LSTM guided by Bayesian optimization. On average, the model showed an accuracy of 77.4%. Applying the model during the pandemic posed many challenges, such as the limitation of the dataset, data uncertainty, and data confusion. Authors in [52] provided a deep learning-based system for COVID-19 detection using a CNN and a convolutional LSTM (ConvLSTM) with 100% accuracy and F1 score, while authors in [53] obtained 89.5% accuracy, 88% specificity, and 87% sensitivity on CT images. Authors in [54] obtained an AUC of 99% and a recall of 93% in COVID-19 diagnosis. Authors in [55] used the ResNet-50 deep learning model to classify COVID-19 in CT scans with 96% AUC. Authors in [56] identified coronavirus cases rapidly using the ResNet-100 CNN along with LR, achieving an accuracy of 99.15%. Authors in [57] proposed COVID-19 pneumonia detection using a small number of COVID-19 CXR images. They optimized the features using the CNN model CONVNet with 98.1% accuracy. Authors in [58] investigated the performance of KNN, DT, NB, SVM, and LR in discovering COVID-19 cases. The results were 93% accuracy for NB and DT, 93.60% for SVM, 93.50% for KNN, and 92.80% for LR, and they concluded that ML methods have an important use in identifying COVID-19. Authors in [59] proposed a detection-classification approach to diagnose COVID-19 by examining the patients' lungs using CT-scan and X-ray images. Their approach was compared with the SVM classifier, with classification accuracy as the performance indicator. The classification-detection model achieved 84% and 75% with CXRs and CT scans, respectively.

VI. COVID-19 IMAGE DATASETS
CT scanning and X-ray imaging are the most commonly used techniques in respiratory disease diagnosis. 
For decades, radiologists and other healthcare professionals have used them. Discovering COVID-19 early by using diagnostic imaging methods is crucial to prevent its spread. To show the damage of COVID-19 to the patient's lungs, many health and technology centers publish data about COVID-19, pneumonia, and acute respiratory distress syndrome. Authors in [61] produced the first public dataset of frontal-view X-ray images. The dataset was collected from different open sources by the University of Montreal. It consists of more than 400 CT and X-ray images of COVID-19, SARS, MERS-CoV, varicella, influenza, and herpes patients. The dataset is continuously updated with the newest COVID-19 cases through the GitHub link [62]. The dataset in [63] is a public dataset of 5,863 X-ray images categorized into normal/healthy lungs and pneumonia-infected lungs. Another dataset of chest X-ray images of COVID-19 cases was published in 2020 [64]. The dataset contains X-ray images of COVID-19-infected lungs along with normal lungs and viral pneumonia images. It consists of 3,487 images of normal, COVID-19, and viral pneumonia patients. There is also a public dataset available on the Kaggle website [65], which contains a mixture of X-ray and CT scan images of patients diagnosed with COVID-19. Patients' images were extracted from public articles in RSNA, Radiopaedia, and SIRM [66]. ChestX-ray8 is a large public database provided in [32]. It consists of frontal-view X-ray images collected from 1992 to 2015. The dataset contains 108,948 images of more than 32,717 patients. Images are labeled with 8 common thoracic pathologies. Authors in [67] produced a publicly available dataset of CT scan images of COVID-19 cases. The COVID-CT dataset contains 349 CT scans of confirmed cases, which were collected from relevant sources, such as medRxiv, NEJM, JAMA, Lancet, and bioRxiv. 
The COVID-19 CT lung and infection segmentation dataset [68] contains 100 CT images collected from more than 40 patients with COVID-19. The COVID-19 imaging database is another open-source dataset used by the British Society of Thoracic Imaging [69]. This dataset is intended for teaching and educational purposes, and it consists of 59 images of SARS-CoV-2 patients.

VII. DISCUSSION
This study reviewed recent work that applied AI to confront COVID-19. AI has been successfully applied to imaging-based COVID-19 classification. However, much work remains to be done. A growing number of classification and prediction models use AI, and more datasets are becoming publicly available. The studies reviewed in this paper involved deep learning and machine learning for COVID-19 classification. Deep learning approaches, such as CNNs, perform the feature extraction process automatically. Little research, however, has been done on traditional machine learning models, such as KNN, SVM, RF, and DT (see Table II). Most of this research used medical images to diagnose COVID-19 because they provide more detail and depict the diseased areas of the body. AI-based classification of CXRs has untapped potential to meet this diagnostic need.
Recent research has shown that, compared with CT and Magnetic Resonance Imaging (MRI), CXR-based diagnosis is the most widely used because of its low cost, short examination time, and low radiation exposure. However, the reviewed studies that used these images have some weaknesses. Some did not cover all age groups or adequately consider patients' gender. Addressing these gaps can improve the classification performance of machine learning models. Because few COVID-19 CXR and CT images were available at the time, the AI models were trained on small datasets, which may lead to overfitting and to low accuracy on unseen cases. Deep learning models need large datasets to work efficiently. For this reason, techniques such as data augmentation have been used to increase the size of the datasets and to help models generalize. Researchers constantly work to update classification models and to increase their reliability and usefulness in real-life circumstances using the available datasets. In addition, the availability of more, and more diverse, training images would make it easier to create robust and scalable classification models. AI-based disease classification can also be combined with confirmatory lab tests, so that fast AI-based tools help diagnose and evaluate infected and recovering patients.
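The augmentation idea mentioned above can be illustrated with a minimal sketch. It assumes each image is a 2D list of grayscale pixel values and applies only two label-preserving transformations, a horizontal flip and a small horizontal shift. The function names (`horizontal_flip`, `shift_right`, `augment`) and the `max_shift` parameter are hypothetical and are not taken from any of the reviewed studies. Whether a flip is clinically appropriate for chest images is itself an assumption that should be checked for the task at hand.

```python
import random


def horizontal_flip(image):
    """Mirror a 2D image (a list of pixel rows) left to right."""
    return [row[::-1] for row in image]


def shift_right(image, pixels, fill=0):
    """Shift a 2D image right by `pixels` columns, padding the left edge with `fill`."""
    if pixels <= 0:
        return [row[:] for row in image]
    width = len(image[0])
    pixels = min(pixels, width)
    return [[fill] * pixels + row[:width - pixels] for row in image]


def augment(images, labels, max_shift=2, seed=0):
    """Return the original images plus one flipped and one shifted copy of each.

    Every copy keeps the label of the image it was derived from, so the
    dataset grows to three times its original size.
    """
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    out_images, out_labels = [], []
    for image, label in zip(images, labels):
        variants = [
            image,
            horizontal_flip(image),
            shift_right(image, rng.randint(1, max_shift)),
        ]
        out_images.extend(variants)
        out_labels.extend([label] * len(variants))
    return out_images, out_labels


if __name__ == "__main__":
    # Toy 2x3 "image" standing in for a chest X-ray (illustrative data only).
    toy_images = [[[1, 2, 3], [4, 5, 6]]]
    toy_labels = [1]
    aug_images, aug_labels = augment(toy_images, toy_labels, max_shift=1)
    print(len(aug_images), aug_labels)
```

Augmentation of this kind should be applied only to the training split, so that transformed copies of a training image do not leak into the validation or test data.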
Deep learning has become the dominant approach for detecting and diagnosing COVID-19. The image data in COVID-19 applications, however, can be inconsistent and inaccurate, which makes it challenging to build an accurate segmentation and diagnostic network. In this situation, weakly supervised deep learning approaches may help. Some researchers combine the two families of methods: they use deep learning to extract features from the images and then diagnose COVID-19 cases with machine learning classifiers, obtaining high-quality results. Finally, the medical images used in the reviewed studies were not standardized: they came from different sources and were acquired with different medical devices. All these factors make a comprehensive comparison difficult.
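Because the reviewed studies report different subsets of accuracy, sensitivity, specificity, and F1 score, comparison is easier when all of these are derived from the same confusion matrix. The following sketch shows the standard definitions for a binary COVID-19 versus non-COVID-19 classifier. The function name `binary_metrics` and the toy labels are illustrative assumptions, not data from any cited paper.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute common binary classification metrics from true and predicted labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    def ratio(numerator, denominator):
        # Return NaN instead of raising when a metric is undefined.
        return numerator / denominator if denominator else float("nan")

    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "sensitivity": ratio(tp, tp + fn),   # also called recall
        "specificity": ratio(tn, tn + fp),
        "precision": ratio(tp, tp + fp),
        "f1": ratio(2 * tp, 2 * tp + fp + fn),
    }


if __name__ == "__main__":
    # Toy labels: 1 = COVID-19, 0 = non-COVID-19 (illustrative only).
    y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
    y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
    for name, value in binary_metrics(y_true, y_pred).items():
        print(f"{name}: {value:.3f}")
```

Reporting all five values together matters on unbalanced datasets, where a model can reach high accuracy while still missing many positive cases.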

VIII. CONCLUSION
The negative impact of COVID-19 has grown since December 2019. AI methods and applications in the medical field have shown success in fighting the virus. This paper presented an extensive review of the latest studies in COVID-19 classification and prediction, with a description of the methods and techniques frequently used in disease diagnosis, detection, and prediction. The number of contributions in this field is growing rapidly. This paper summarized the research on classification algorithms and their applications in COVID-19 prediction models, reviewing each method's aim and each model's performance. Among the published studies, the use of deep learning with radiological imaging improved the models' performance.

IX. FUTURE DIRECTIONS AND CHALLENGES
Researchers in machine learning-related COVID-19 detection may face several challenges, especially when dealing with data that come from different sources and in different formats.
• Large-scale training data are scarce and difficult to get.
Many deep learning techniques rely on large datasets, such as collections of medical images, to train models for automatic prediction [70]. Because COVID-19 spread so rapidly, and because interpreting and labeling training samples is time-consuming and requires expert physicians, sufficient data are often unavailable [71].
• Inappropriate data.
Diverse online publications have published incorrect reports concerning COVID-19. This problem has reduced the usefulness and reliability of AI-based methods.
• Data protection and privacy.
To prevent the spread of COVID-19 and conduct proper contact tracing, patients' personal information is often collected, such as their identity number, contact information, and medical data. Therefore, protecting patients' privacy during treatment is extremely important.
• Incorrect structured and unstructured data.
Incorrect information in text descriptions and medical images poses a challenge to researchers and machine learning models. A huge amount of data from various sources may be impossible to use. AI also faces problems such as dealing with unbalanced datasets, difficulty in screening and triaging patients due to restrictions [72] and social distancing, and dealing with poor-quality data.
Regarding future research directions, AI can be used in remote video investigation and consultations, biological research into important protein structures and virus sequences, impact assessment of COVID-19, drug development, patient contact tracing, and the diagnosis and treatment of COVID-19.

[59] A deep learning-based transfer model to diagnose COVID-19 cases by utilizing CT-scan and X-ray images of the human chest. The aim is the binary classification of the chest images in a fast and accurate way.