Transfer Learning with Deep Convolutional Neural Network (CNN) for Pneumonia Detection using Chest X-ray

Pneumonia is a life-threatening disease of the lungs caused by either bacterial or viral infection. It can be fatal if not treated in time, so early diagnosis is vital. The aim of this paper is to automatically detect bacterial and viral pneumonia using digital X-ray images. It provides a detailed report on advances made in the accurate detection of pneumonia and then presents the methodology adopted by the authors. Four different pre-trained deep Convolutional Neural Networks (CNNs), AlexNet, ResNet18, DenseNet201, and SqueezeNet, were used for transfer learning. A total of 5247 bacterial, viral and normal chest X-ray images underwent preprocessing, and the modified images were used to train the transfer learning based classifiers. In this work, the authors report three classification schemes: normal vs pneumonia, bacterial vs viral pneumonia, and normal vs bacterial vs viral pneumonia. The classification accuracies for normal and pneumonia images, bacterial and viral pneumonia images, and normal, bacterial and viral pneumonia images were 98%, 95%, and 93.3%, respectively. These accuracies are higher, in each scheme, than those reported in the literature. Therefore, the proposed study can help radiologists diagnose pneumonia faster and can help in the fast airport screening of pneumonia patients.


Introduction
Pneumonia is considered the leading cause of death among children worldwide. Approximately 1.4 million children die of pneumonia every year, which is 18% of all deaths of children under five years of age [1]. Globally, about two billion people suffer from pneumonia every year [1].
Pneumonia is a lung infection that can be caused by either bacteria or viruses. Fortunately, this bacterial or viral infectious disease can be treated well with antibiotics and antiviral drugs.
Nevertheless, faster diagnosis of viral or bacterial pneumonia and the consequent application of the correct medication can help significantly to prevent a deterioration of the patient's condition that eventually leads to death [2]. Chest X-rays are currently the best method for diagnosing pneumonia [3]. X-ray images of pneumonia are not very clear and are often misclassified as other diseases or benign abnormalities. Moreover, bacterial and viral pneumonia images are sometimes misclassified by experts, which leads to the wrong medication being given and thereby worsens the condition of the patients [4][5][6]. Considerable subjective inconsistencies in the decisions of radiologists have been reported in diagnosing pneumonia. There is also a lack of trained radiologists in low resource countries (LRC), especially in rural areas. Therefore, there is a pressing need for computer aided diagnosis (CAD) systems, which can help radiologists detect different types of pneumonia from chest X-ray images immediately after acquisition.
Currently, many biomedical problems (e.g., brain tumor detection, breast cancer detection, etc.) are being addressed using Artificial Intelligence (AI) based solutions [7-10]. Among the deep learning techniques, convolutional neural networks (CNNs) have shown great promise in image classification and have therefore been widely adopted by the research community [11]. Deep learning techniques on chest X-rays are gaining popularity because they can be used easily with low-cost imaging techniques and there is an abundance of data available for training different machine learning models. Several research groups [1, 12-25] have reported the use of deep learning algorithms in the detection of pneumonia; however, only one article [16] has reported the classification of bacterial and viral pneumonia.
There are many works in which the authors varied the parameters of deep layered CNNs for pneumonia detection. The pattern of diffuse opacification in the lung radiograph due to pneumonia can be alveolar or interstitial. Patients with alveolar infiltration on the chest radiograph, especially those with lobar infiltrates, have laboratory evidence of a bacterial infection [26]. Interstitial infiltrations on the radiograph, in turn, may be linked with viral pneumonia [27]. These might be the distinctive features that machine learning algorithms use to differentiate viral and bacterial pneumonia. Some researchers have reported promising results, such as Liang et al. [18], Vikash et al. [28], Krishnan et al. [29] and Xianghong et al. [30]. Some groups have combined CNNs with rib suppression and lung field segmentation for other classification tasks, while others highlighted a visualization technique in CNNs to find the region of interest (ROI) that can be used to identify pneumonia and distinguish between bacterial and viral types in pediatric chest X-rays. The concept of transfer learning in a deep learning framework was used by Vikash et al. [28] for the detection of pneumonia using pre-trained ImageNet models [31] and their ensembles. A customized VGG16 model was used by Xianghong et al. [30]; it consists of two parts, lung region identification with a fully convolutional network (FCN) model and pneumonia category classification using a deep convolutional neural network (DCNN).
A dataset containing the X-rays of 32,717 patients was used in a deep learning technique by Wang et al. The authors of [13] further used data augmentation along with a CNN to get better results when training on a small set of images. Rajpurkar et al. [32] reported a 121-layer CNN on chest X-rays to detect 14 different pathologies, including pneumonia, using an ensemble of different networks. Jung et al. [33] used a 3D deep CNN with shortcuts and dense connections, which help in solving the vanishing gradient problem. Jaiswal et al. [17] and Siraz et al. [34] used a deep neural network called Mask-RCNN for the segmentation of pulmonary images. This was used along with image augmentation for pneumonia identification and was ensembled with RetinaNet to localize pneumonia. A pre-trained DenseNet-121 and feature extraction techniques were used for the accurate identification of 14 thoracic diseases in [16]. Sundaram et al. [35] used AlexNet and GoogLeNet [36] with data augmentation to obtain an Area Under the Curve (AUC) of 0.94-0.95. The same authors obtained a better AUC of 0.99 using a modified two-network ensemble architecture.
The highest accuracies reported in the above-mentioned literature for classifying normal vs pneumonia patients and bacterial vs viral pneumonia from X-ray images using deep learning algorithms were 96.84% and 93.6%, respectively. Therefore, there is significant room for improving the results, either by using different deep learning algorithms, by modifying the existing best-performing algorithms, or by combining several of them into an ensemble model, to produce a better classification accuracy, particularly in classifying viral and bacterial pneumonia.
The proposed study reports a transfer learning approach using four different pre-trained network architectures (AlexNet, ResNet18, DenseNet201, and SqueezeNet) and analyzes their performance. The key contribution of this work is a CNN based transfer learning approach using different pre-trained models to detect pneumonia and to classify bacterial and viral pneumonia with higher accuracy than recent works. In addition, the paper provides the methodological details of the work, which any research group can use to build on it. Moreover, radiographic findings have so far been poor indicators of the cause of pneumonia. The motivation of the present study was therefore to utilize the power of machine learning, firstly to diagnose pneumonia by analyzing radiographs and secondly to differentiate viral and bacterial pneumonia with better accuracy.
The rest of the paper is organized as follows: Section 2 summarizes the different pre-trained networks used in this study, and Section 3 describes the methodology, including the details of the dataset and the pre-processing steps used to prepare the data for training and testing. Section 4 provides the results of the classification algorithms and compares them with some other recent studies, the results are discussed in Section 5, and finally the conclusion is presented in Section 6.

Convolutional Neural Networks (CNNs)
As discussed earlier, CNNs have become popular due to their improved performance in image classification. The convolutional layers in the network, along with their filters, help in extracting the spatial and temporal features of an image. The layers use weight sharing, which helps in reducing the computational effort [37] [38]. Architecturally, CNNs are simply feedforward artificial neural networks (ANNs) with two constraints: neurons in the same filter are only connected to local patches of the image to preserve spatial structure, and their weights are shared to reduce the total number of the model's parameters.
A CNN consists of three building blocks: i) a convolution layer to learn features, ii) a max-pooling (subsampling) layer to downsample the image and reduce its dimensionality, thereby reducing the computational effort, and iii) a fully connected layer to equip the network with classification capabilities [39]. The architectural overview of a CNN is illustrated in Figure 1.
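The three building blocks can be illustrated with a minimal pure-Python sketch. The 5×5 image, the 2×2 edge kernel and the dense-layer weights are toy values chosen for illustration only; they are assumptions and do not come from any of the networks used in this paper.

```python
# Minimal sketch of the three CNN building blocks on a toy image.
# All sizes and weights are illustrative assumptions, not values from the paper.

def conv2d(image, kernel):
    """Valid 2-D convolution with stride 1 (cross-correlation, as in most CNN libraries)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b] for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]

def max_pool(feature_map, size=2):
    """Non-overlapping max pooling with a size x size window."""
    out_h = len(feature_map) // size
    out_w = len(feature_map[0]) // size
    return [
        [
            max(feature_map[i * size + a][j * size + b] for a in range(size) for b in range(size))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]

def fully_connected(features, weights, biases):
    """Dense layer: one output score per row of weights."""
    return [sum(w * x for w, x in zip(row, features)) + b for row, b in zip(weights, biases)]

# Toy 5x5 image with a bright vertical stripe.
image = [[0, 0, 1, 1, 0] for _ in range(5)]
kernel = [[-1, 1], [-1, 1]]  # responds to dark-to-bright edges from left to right

feature_map = conv2d(image, kernel)      # 4 x 4 feature map
pooled = max_pool(feature_map, size=2)   # 2 x 2 after downsampling
flat = [v for row in pooled for v in row]
scores = fully_connected(flat, weights=[[1, 0, 0, 0], [0, 1, 0, 0]], biases=[0.0, 0.0])
```

The same kernel weights are reused at every image position, which is the weight sharing described above, and pooling quarters the number of values passed to the dense layer.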

Deep Transfer learning
CNNs typically perform better on a larger dataset than on a smaller one. Transfer learning can be useful in those applications of CNNs where the dataset is not large. The concept of transfer learning is shown in Figure 2, where a model trained on a large dataset such as ImageNet [40] is reused for an application with a comparatively smaller dataset. Recently, transfer learning has been used successfully in various fields such as manufacturing, medicine and baggage screening [42][43][44]. This removes the requirement of having a large dataset and also shortens the long training period that a deep learning algorithm requires when developed from scratch [45,46].
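One common variant of transfer learning keeps the pre-trained layers frozen and trains only a new classification head on the small target dataset. The sketch below illustrates that idea in pure Python under strong simplifying assumptions: `PRETRAINED_WEIGHTS` is a hypothetical stand-in for layers learned on a large source dataset, the head is a logistic regression, and the four samples are made-up toy data. The paper does not state which layers were frozen, so this should not be read as the authors' training procedure.

```python
import math

# Hypothetical stand-in for layers "pre-trained" on a large source dataset.
PRETRAINED_WEIGHTS = ((1.0, 1.0), (1.0, -1.0))

def frozen_features(x):
    """Feature extractor whose weights are never updated during training."""
    return [sum(w * v for w, v in zip(row, x)) for row in PRETRAINED_WEIGHTS]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_head(samples, labels, lr=0.1, epochs=200):
    """Fit only a new logistic-regression head on top of the frozen features."""
    feats = [frozen_features(x) for x in samples]
    w = [0.0] * len(feats[0])
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        for f, y in zip(feats, labels):
            err = sigmoid(sum(wi * fi for wi, fi in zip(w, f)) + b) - y
            grad_w = [g + err * fi for g, fi in zip(grad_w, f)]
            grad_b += err
        # Full-batch gradient descent step on the head parameters only.
        w = [wi - lr * g / len(feats) for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / len(feats)
    return w, b

def predict(x, w, b):
    f = frozen_features(x)
    return 1 if sigmoid(sum(wi * fi for wi, fi in zip(w, f)) + b) >= 0.5 else 0

# Toy target dataset (made up): class 1 when the first value exceeds the second.
samples = [(2.0, 1.0), (3.0, 2.0), (1.0, 2.0), (2.0, 3.0)]
labels = [1, 1, 0, 0]

w, b = train_head(samples, labels)
predictions = [predict(x, w, b) for x in samples]
```

Because only the two head weights and the bias are updated, a handful of samples is enough here, which mirrors the reduced data and training-time requirements described above.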

AlexNet
AlexNet can classify images into 1000 different classes using deep layers comprising 650k neurons and 60 million parameters. The network is made up of 5 convolutional layers (CLs) with three pooling layers, 2 fully connected layers (FCLs) and a Softmax layer [11]. The input image for AlexNet has to be 227×227×3, and the first CL filters the input image with 96 kernels of size 11×11×3 at a stride of 4 pixels; its output is the input to the second layer [49]. The remaining details are summarized in Figure 3.
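The figures quoted for the first convolutional layer can be checked with the standard output-size formula for a convolution. The sketch below assumes zero padding and one bias per kernel; neither is stated in the text, so both are assumptions.

```python
def conv_output_size(input_size, kernel_size, stride=1, padding=0):
    """Spatial output size of a convolution along one dimension."""
    return (input_size + 2 * padding - kernel_size) // stride + 1

def conv_param_count(kernel_h, kernel_w, in_channels, out_channels, bias=True):
    """Number of learnable parameters in a convolution layer."""
    weights = kernel_h * kernel_w * in_channels * out_channels
    return weights + (out_channels if bias else 0)

# First AlexNet layer as described above: 227x227x3 input, 96 kernels of 11x11x3, stride 4.
# Zero padding and one bias per kernel are assumptions.
side = conv_output_size(227, 11, stride=4)   # 55, so the output is 55 x 55 x 96
params = conv_param_count(11, 11, 3, 96)     # 34944 parameters
```

Under these assumptions the first layer holds only a small fraction of the 60 million parameters quoted above.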

ResNet18
ResNet, short for Residual Network (Figure 4), was originally developed to solve two problems: the vanishing gradient problem and the degradation problem [47]. Residual learning tries to solve both of these problems. ResNet has three different variants, ResNet18, ResNet50 and ResNet101, named after the number of layers in the residual network. ResNet has been used successfully for transfer learning in biomedical image classification [48]. In this paper, we have used ResNet18 for pneumonia detection. Typically, deep neural network layers learn low- or high-level features during training, whereas ResNet learns residuals instead of features.

DenseNet201
In DenseNet [50], each layer (as shown in Figure 5) has direct access to the original input image and to the gradients from the loss function. Therefore, the computational cost is significantly reduced, which makes DenseNet a better choice for image classification.

SqueezeNet
SqueezeNet [51] is another CNN that was trained using the ImageNet database. SqueezeNet was trained with more than 1 million images and has 50 times fewer parameters than AlexNet. The foundation of this network is the fire module, which consists of a Squeeze layer and an Expand layer. The Squeeze layer has only 1 × 1 filters and feeds into an Expand layer that has a mixture of 1 × 1 and 3 × 3 convolution filters [51] (Figure 6). We have used the pre-trained SqueezeNet model to detect pneumonia and to classify bacterial and viral pneumonia in this research.
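A short weight count shows why the fire module saves parameters. The channel sizes below (96 input channels, 16 squeeze filters, 64 + 64 expand filters) are illustrative assumptions and are not taken from this paper; biases are ignored for simplicity.

```python
def conv_weights(in_channels, out_channels, kernel_size):
    """Number of weights in a convolution layer (biases ignored for simplicity)."""
    return in_channels * out_channels * kernel_size * kernel_size

def fire_module_weights(in_channels, squeeze, expand1x1, expand3x3):
    """Weights in a fire module: a 1x1 squeeze layer feeding 1x1 and 3x3 expand filters."""
    squeeze_w = conv_weights(in_channels, squeeze, 1)
    expand_w = conv_weights(squeeze, expand1x1, 1) + conv_weights(squeeze, expand3x3, 3)
    return squeeze_w + expand_w

# Illustrative channel sizes (assumptions): both options produce 128 output channels.
fire = fire_module_weights(in_channels=96, squeeze=16, expand1x1=64, expand3x3=64)
plain = conv_weights(in_channels=96, out_channels=128, kernel_size=3)
```

With these assumed sizes the fire module needs 11,776 weights against 110,592 for a plain 3 × 3 convolution with the same number of output channels, because the squeeze layer reduces the number of channels seen by the 3 × 3 filters.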

Dataset
In this work, the Kaggle chest X-ray pneumonia database was used, which comprises 5247 chest X-ray images with resolutions varying from 400p to 2000p [53]. Out of the 5247 chest X-ray images, 3906 images are from different subjects affected by pneumonia (2561 images for bacterial pneumonia and 1345 images for viral pneumonia) and 1341 images are from normal subjects (Table 1). Mixed viral and bacterial infection occurs in some cases of pneumonia. However, the dataset used in this study does not include any case of viral and bacterial co-infection. This dataset was split into a training set and a test set. Figure 7 shows two sample chest X-ray images each for normal, bacterial pneumonia and viral pneumonia cases. In this study, MATLAB (2019a) was used to train, evaluate and test the different algorithms. Figure 8 illustrates the overview of the methodology of this study. The image sets underwent pre-processing and data augmentation, were then used to train the pre-trained networks (AlexNet, ResNet18, DenseNet201, and SqueezeNet), and all the algorithms were tested on the test dataset.
The training of the different models was carried out in a computer with Intel© i7-core @3.6GHz

Preprocessing
One of the important steps in the data preprocessing was to resize the X-ray images, because the required input image size differs between the algorithms. For AlexNet and SqueezeNet, the images were resized to 227×227 pixels, whereas for ResNet18 and DenseNet201, the images were resized to 224×224 pixels. All images were normalized according to the pre-trained model standards.
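The resizing step can be sketched as follows. The input sizes come from the text above, but the interpolation method is not stated in the paper, so nearest-neighbour sampling is used here purely as an assumption; the model-specific normalization is omitted, and the helper names are hypothetical.

```python
# Input side length per network, as stated in the text.
INPUT_SIZE = {"AlexNet": 227, "SqueezeNet": 227, "ResNet18": 224, "DenseNet201": 224}

def resize_nearest(image, out_h, out_w):
    """Nearest-neighbour resize of a 2-D image stored as a list of rows.

    The interpolation method is an assumption; the paper does not name one.
    """
    in_h, in_w = len(image), len(image[0])
    return [
        [image[(i * in_h) // out_h][(j * in_w) // out_w] for j in range(out_w)]
        for i in range(out_h)
    ]

def prepare(image, model_name):
    """Resize an image to the square input expected by the given network."""
    side = INPUT_SIZE[model_name]
    return resize_nearest(image, side, side)

# Toy 4 x 6 grayscale image standing in for a chest X-ray.
xray = [[(r * 6 + c) % 256 for c in range(6)] for r in range(4)]
for_resnet = prepare(xray, "ResNet18")   # 224 x 224
for_alexnet = prepare(xray, "AlexNet")   # 227 x 227
```

In practice a smoother interpolation such as bilinear or bicubic is often preferred for medical images; the lookup table is the part that reflects the text.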

Data augmentation
As discussed earlier, CNNs work better with large datasets. However, the working database is not very large. A common practice when training deep learning algorithms is to enlarge a comparatively small dataset using data augmentation techniques. The original image was horizontally translated by 10% and vertically translated by 10%.
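Translation by a fraction of the image size can be sketched in a few lines. The fill value for pixels shifted in from outside the image and the rounding of the shift to whole pixels are assumptions, and the function name is hypothetical.

```python
def translate(image, frac_x=0.10, frac_y=0.10, fill=0):
    """Shift a 2-D image right by frac_x of its width and down by frac_y of its height.

    Pixels that move in from outside the image are set to `fill` (an assumption).
    """
    height, width = len(image), len(image[0])
    dx = round(frac_x * width)
    dy = round(frac_y * height)
    out = [[fill] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            src_y, src_x = y - dy, x - dx
            if 0 <= src_y < height and 0 <= src_x < width:
                out[y][x] = image[src_y][src_x]
    return out

# Toy 10 x 10 image with distinct non-zero pixel values.
original = [[y * 10 + x + 1 for x in range(10)] for y in range(10)]
shifted = translate(original, frac_x=0.10, frac_y=0.10)
```

Each augmented copy keeps the label of its source image, so the training set grows without new acquisitions.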

Visualization of the Activation Layer
We investigated the features of the image by observing which areas in the convolutional layers were activated by an image and comparing them with the matching regions in the original image. Each layer of a CNN consists of many 2-D arrays called channels. The input image was applied to the different networks and the output activations of the first convolution layer were examined.
The activations for the different network models are shown in Figure 10, which presents the activation map in the early convolutional layers, in a deep convolutional layer, and in the strongest activation channel for each of the models.

Different Experiments
Three different performance evaluations and comparisons were carried out in this study: two-class (normal and pneumonia), three-class (normal, bacterial pneumonia and viral pneumonia), and two-class (bacterial pneumonia and viral pneumonia) classification, each using four different deep learning algorithms through transfer learning.
The experiment carried out in this study consists of three steps. In the first step, the dataset was divided into normal and pneumonia images. In the second step, the dataset was divided into normal, bacterial pneumonia and viral pneumonia images. In the last step, the dataset was divided into bacterial and viral pneumonia images. An end-to-end training approach was adopted to classify normal, bacterial and viral pneumonia images.

Performance Metrics for Classification
Four CNNs were trained and evaluated using five-fold cross-validation in this work. The performance of the different networks on the testing dataset was evaluated after completion of the training phase and compared using six performance metrics: accuracy, sensitivity (recall), specificity, precision (PPV), area under the curve (AUC), and F1 score. Table 3 shows the six performance metrics for the different deep CNNs.
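The standard definitions of the threshold-based metrics, computed from the counts of a two-class confusion matrix, can be sketched as follows. The counts in the example are made up for illustration and are not results from this study; the function does not guard against empty classes.

```python
def binary_metrics(tp, fp, tn, fn):
    """Standard two-class metrics from confusion-matrix counts.

    tp/fn: pneumonia cases predicted correctly/incorrectly,
    tn/fp: normal cases predicted correctly/incorrectly.
    """
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    sensitivity = tp / (tp + fn)          # recall, true positive rate
    specificity = tn / (tn + fp)          # true negative rate
    precision = tp / (tp + fp)            # positive predictive value (PPV)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
    }

# Hypothetical counts for illustration only (not taken from the paper).
metrics = binary_metrics(tp=90, fp=5, tn=95, fn=10)
```

For the three-class scheme these quantities are usually computed per class in a one-vs-rest fashion and then averaged; whether the paper does so is not stated in the text.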

Results and Discussions
The training and testing accuracies of the different CNNs for the three classification schemes are compared in Figure 11. For all three classification schemes, DenseNet201 produces the highest accuracy in both training and testing. For normal and pneumonia classification, the test accuracy was 98%; for normal, bacterial and viral pneumonia classification, it was 93.3%; and for bacterial and viral pneumonia classification, it was 95%. Figure 12 shows the receiver operating characteristic (ROC) curves for the different classification schemes; the area under the ROC curve (AUC, also known as AUROC) is one of the most important evaluation metrics for checking any classification model's performance. It is also evident from Figure 12 that DenseNet201 outperforms the other algorithms. Table 5 summarizes the comparison with other works on the detection of pneumonia and its types from chest X-ray images. To the best of our knowledge, DenseNet201 exhibits higher accuracy than all the recent works, which could be useful in developing a prototype that automatically classifies results into normal, bacterial and viral pneumonia. Training the network on a larger database and working on an ensemble of the pre-trained CNN algorithms might increase the detection accuracy, which can be done as future work.
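The AUC mentioned above can be read as the probability that a randomly chosen positive image receives a higher score than a randomly chosen negative one. A minimal sketch of that pairwise definition is given below; the labels and scores are made up for illustration and are not outputs of the networks in this study.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives.

    Counts the fraction of (positive, negative) pairs in which the positive
    sample has the higher score; ties count as half.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Hypothetical classifier scores for three pneumonia (1) and three normal (0) images.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
auc = roc_auc(labels, scores)   # 8 of the 9 pairs are ranked correctly
```

Unlike accuracy, this measure does not depend on a single decision threshold, which is why it complements the accuracy figures reported above.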

Conclusion
There is a large degree of variability in the input images from X-ray machines due to variations in the expertise of the radiologists. DenseNet201 exhibits excellent performance in classifying pneumonia by effectively training itself on a comparatively small collection of complex data such as images, with reduced bias and higher generalization. We believe that this computer aided diagnostic tool can significantly help radiologists take clinically more useful images and identify pneumonia, together with its type, immediately after acquisition. This fast classification will open up other avenues of application for this CAD tool, particularly in the airport screening of pneumonia patients.