High-precision multiclass classification of lung disease through customized MobileNetV2 from chest X-ray images

In this study, multiple lung diseases are diagnosed with the help of a neural network. Specifically, Emphysema, Infiltration, Mass, Pleural Thickening, Pneumonia, Pneumothorax, Atelectasis, Edema, Effusion, Hernia, Cardiomegaly, Pulmonary Fibrosis, Nodule, and Consolidation are studied from the ChestX-ray14 dataset. A proposed fine-tuned MobileLungNetV2 model is employed for analysis. Initially, the X-ray images from the dataset are pre-processed using CLAHE to increase image contrast. Additionally, a Gaussian Filter is used to denoise the images, and data augmentation methods are applied. The pre-processed images are fed into several transfer learning models: InceptionV3, AlexNet, DenseNet121, VGG19, and MobileNetV2. Among these models, MobileNetV2 achieved the highest overall accuracy of 91.6% in classifying lesions on chest X-ray images. This model is then fine-tuned to obtain the MobileLungNetV2 model. On the pre-processed data, the fine-tuned model, MobileLungNetV2, achieves a classification accuracy of 96.97%. Using a confusion matrix for all the classes, it is determined that the model has overall precision, recall, and specificity scores of 96.71%, 96.83% and 99.78%, respectively. The study employs Grad-CAM to produce heatmaps of the detected disease regions. The proposed model shows promising results in classifying multiple lesions on chest X-ray images.


Introduction
Lung disease affects many people in various ways and is one of the major causes of death around the world. It has been demonstrated that prior lung diseases, such as emphysema, chronic bronchitis, pulmonary fibrosis, and pneumonia, are associated with an increased risk of lung cancer, even in non-smokers [1,2]. The probability of contracting lung disease is quite high, particularly in developing and low-middle income regions, where millions of people endure poverty and air pollution. According to the WHO, approximately 4 million early deaths occur yearly due to domestic air pollution-related illnesses, such as asthma and pneumonia [3]. Accordingly, it is essential to employ an efficient diagnostic method to assist with the early detection of lung lesions [4,5].
As early lung disease identification and treatment are crucial to optimal outcomes, early examination and diagnosis may reduce the life-threatening aspect of lung diseases and improve the quality of life for individuals already afflicted. Chest X-rays (CXRs) are a common method used to identify lung diseases, evaluate their severity, and detect possible complications [10]. Several recent studies [11][12][13][14] demonstrate the effectiveness of lung segmentation approaches for automatic CXR image processing. Chest X-rays [15][16][17] may show several concurrent anomalies. Capturing anomalies from a complicated thoracic background with just the human eye is very time-consuming, and the procedure is sensitive, as well as susceptible to user bias. Manual labelling by radiologists requires major healthcare resources. Therefore, computer technologies may be employed to analyze chest radiographs as effectively as radiologists, in order to enhance workflow prioritization and clinical decision support in large-scale projects and worldwide population health initiatives. Machine learning and deep learning can play a significant role in accurate clinical diagnosis [34][35][36][37].
Deep learning-based algorithms have achieved satisfactory performance in a number of computer vision tasks [7][8][9], including image classification [18], medical diagnosis [19], scene identification [20], disease prediction [21], and healthcare analysis [22]. The rapid improvement of deep learning techniques has been facilitated by the creation of several annotated image datasets [23][24][25][26]. Characteristics indicated by these annotations have been crucial in overcoming obstacles in a variety of medical image analysis areas, including the identification of anatomical and pathological aspects in radiological scans. Deep learning approaches have been used for a variety of detection and classification tasks, including lymph nodes and interstitial lung disease [27,28], cerebral microbleed detection [29], colon cancer classification [30], spinal radiological score prediction [31], automated pancreas segmentation [32], and pulmonary nodule detection [33].
Developing countries experience a shortage of skilled radiologists, particularly in rural regions. Additionally, detecting and classifying lung disease using chest X-ray imaging is a difficult task for radiologists. Therefore, it is in the interest of researchers to create automated lung disease detection tools [78][79][80]. In these scenarios, a computer-aided diagnostic (CAD) system can be used for large-scale diagnosis of lung disease by examining CXR images. Significant improvements in computing power, along with the availability of extremely large labeled chest X-ray datasets, have contributed to the accuracy of image classification. Several recent strategies have been presented to automatically diagnose lung diseases in CXR images. In 2017, Wang et al. [81] presented the biggest publicly accessible chest X-ray dataset, titled "Chest X-ray 14," which included 14 of the most prevalent lung diseases.
Numerous studies [82][83][84][85] have been conducted on this massive dataset. Wang et al. [86] suggested a unified weakly-supervised multi-label classification framework by taking into consideration multiple multi-label DCNN loss functions and distinct pooling algorithms. Since one chest radiograph may have many abnormal patterns, Yao et al. [87] created a method that further utilizes the statistical label correlations, resulting in enhanced performance. Likewise, Kumar et al. [88] used multi-label learning approaches and studied possible label dependencies. Rajpurkar et al. [89] constructed a deep learning model called CheXNet and made its optimization tractable by using batch normalization [90] and dense connections [91].
This research paper may help medical practitioners, as well as researchers, identify lung lesions using deep learning methodology. In this study, we implemented several pre-trained CNN models to classify lung lesions in chest X-ray images. Of these models, the best-performing one is selected for modification to obtain the proposed fine-tuned MobileLungNetV2 model with superior classification accuracy. To improve the performance of computer-aided diagnostic systems (CADs), we classify lung lesions in chest X-ray images using the proposed model. The primary contributions of this study are as follows.
• The proposed fine-tuned MobileLungNetV2 model is based on MobileNetV2, but with certain modifications made to achieve higher disease classification accuracy compared to pre-trained models.
• The proposed model has been developed to better extract image features and identify lung abnormalities.
• The proposed model outperformed both prior research on the classification of lung disease using deep learning models and the pre-trained models implemented in the current study, as demonstrated in Table 6.
To achieve and highlight the obtained result, we followed the processes summarized as follows.
• The raw chest X-ray images utilized in the investigation are obtained from the ChestX-ray14 dataset. The images are pre-processed to improve the consistency of the data and reduce noise.
• A Gaussian Filter is employed to denoise the noisy images.
• CLAHE is applied to the dataset to achieve clear contrast images.
• Augmentation is performed to increase the number of images in the under-represented classes of the dataset.
The remainder of the paper is arranged as follows: Section 2 reviews the literature on using deep learning to diagnose lung diseases. Section 3 gives the dataset description of the study. The proposed methodology and result analysis are presented in Sections 4 and 5, respectively. Finally, Sections 6 and 7 provide the discussion and conclusion of the study, respectively. An overview of the study is shown in Fig. 1.

Literature review
Machine learning and deep learning techniques are widely used in CADs. The diagnosis of lung disease has been the focus of numerous studies in the past. It is clear from analyzing those research articles that the majority of researchers have utilized machine learning and deep learning algorithms on X-ray images to predict the disease with high accuracy and efficiency so that they can create an appropriate diagnostic system. For instance, a two-stream collaborative network (TSCN) with lung segmentation has been used by Chen et al. to categorize multilabel CXR images with a mean Area Under Curve (AUC) value of 0.823 [40]. The authors used U-Net to train an efficient lung segmentation tool. They then aggregated the contextual information using a feature fusion approach. Another study [41] proposed DualCheXNet, a unique twofold asymmetric feature extraction network for multi-class pulmonary disease classification in CXRs. The proposed technique supports two distinct feature fusion processes, namely feature-level fusion (FLF) and decision-level fusion (DLF), which correspond to the complementary feature learning of DualCheXNet. The study achieves an AUC value of 0.823 as well.
Pan et al. [42] aimed to analyze and evaluate the usefulness of optimal CNNs for abnormality diagnosis in chest radiographs. The DenseNet and MobileNetV2 CNN algorithms were applied for classifying chest X-rays as normal or abnormal, as well as for predicting the occurrence of 14 distinct pathological abnormalities. MobileNetV2 outperformed DenseNet in the study with AUC values of 0.900 and 0.893, respectively. Other authors [43] attempted to combine the effectiveness of CNNs for extracting visual features from the dataset with the effectiveness of task transformation approaches for multiple label classification, utilizing problem transformation approaches such as Binary Relevance, Label Powersets, and Classifier Chains, which achieve AUC values of 0.804, 0.811 and 0.794, respectively. Another approach [44] integrates multiple features. Two distinct techniques were employed: a localization approach that focuses on pathological areas utilizing pre-trained DenseNet-121, and a classification strategy that integrates four types of features generated with Generalized Search Tree (GIST), Scale-Invariant Feature Transform (SIFT), Histograms of Oriented Gradient (HOG) and Local Binary Pattern (LBP), as well as convolutional network features. The study classifies the multiple diseases with an average AUC value of 0.8097.
Gong et al. [45] used a deformable Gabor convolution (DGConv), which improves the interpretability of deep networks and allows complex spatial variations. To increase robustness for complex objects, the features are trained at deformable sampling points using adaptive Gabor convolutions. The DGConv layer replaces conventional convolutional layers and is readily trained end-to-end, achieving an AUC value of 0.8501. Wang et al. [46] present an adaptive sampling strategy that continuously analyzes the model's performance while training and automatically increases the weight of classes with low performance. Data augmentation is done by arbitrarily repeating data samples, and the resulting dataset is shuffled and divided into batches of equal size, which are input into the model. The model was tuned using a stochastic gradient descent (SGD) technique. The model's performance has shown an average AUC value of 0.082. ResNet34 and DenseNet121 were two of the network topologies evaluated in Ref. [47]. The study evaluated image dimensions varying from 32 × 32 to 600 × 600 pixels. 80% of the samples were utilized for training and 20% for validation. The study reports AUC values of 86.7% ± 1.2 and 80.7% ± 1.5 for thoracic mass and pulmonary nodule detection, respectively.
Baltruschat et al. [48] investigated the ResNet-50 architecture in order to get a better understanding of the various techniques and their applicability to chest X-ray categorization. Through a systematic evaluation, they obtained an AUC value of around 0.800 utilizing 5-fold resampling and a multi-label loss function. Ho and Gwak [49] used a multi-task deep learning model to support visualizations used in saliency maps of the disease areas as well as for multiclass classification. A framework for self-training knowledge distillation (KD) was demonstrated to outperform both the well-established baseline training technique and conventional KD.
Albahli et al. [50] proposed a strategy for supplementing three deep CNN models with synthetic data to identify fourteen lung-related pathologies. The algorithms utilized were DenseNet121, ResNet152V2, and InceptionResNetV2. The proposed models were trained and tested for multiple class classification in order to detect anomalies in chest X-ray images. The Rozenber et al. study [51] was based on a unique loss function that is a continuous relaxation of a discrete conception of weak supervised learning. Additionally, the paper proposes a neural network design that compensates for both the patch dependency and shift invariance by applying Conditional Random Field layers and anti-aliasing filters. Bharati et al. [52] present a novel hybrid deep learning framework, VDSNet, which stands for VGG Data STN associated with CNN. This system combines CNN with VGG, data augmentation, and a spatial transformer network (STN). In addition, Vanilla Gray, Hybrid CNN + VGG, Vanilla RGB and a reconfigured Capsule Network were used. The validation accuracy of the proposed VDSNet model was 73%.
The studies reviewed above successfully classify multiple lung lesions from the ChestX-ray14 dataset. However, the samples used to train the models are not numerous enough to create a strong model. Furthermore, the dataset is significantly unbalanced. As a result, a model is trained excessively for one class and insufficiently for another. Consequently, despite the fact that the models could detect multiple lung lesions successfully, their performance was ultimately inadequate when applied more broadly. In the proposed study, the models are trained and tested using 15 classes from the ChestX-ray14 dataset. Data augmentation techniques are employed on the dataset to increase the number of images in under-represented classes, and the number of images in over-represented classes is reduced. Thus, the implemented models were trained to achieve improved performance and increased robustness over existing models.

Dataset pre-processing
Because the CNN technique employed for classification requires clean, improved, and balanced image data [53], image-preparation and image-balancing procedures are used to provide the model with high-quality images. This section discusses the various image-processing techniques. The CLAHE approach separates the images into contextual portions called tiles, calculates a histogram for each, and then approximates the output to a specified histogram distribution parameter.

Denoising the image (Gaussian Filter)
Gaussian blur, often referred to as Gaussian smoothing, is the outcome of filtering an image using a Gaussian function [61]. It is a common function in graphics software, where it is often used to reduce visual noise. The visual result of this blurring approach is a smooth blur resembling that of viewing the image through a translucent screen, which is distinguishable from the bokeh effect generated by an out-of-focus lens or the shadow of an object under normal light [62]. In computer vision methods, Gaussian smoothing is frequently employed as a pre-processing step to enhance image structures at various scales [63]. Fig. 3 displays the results of applying the Gaussian Filter.
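The smoothing step described above can be sketched with plain NumPy. This is only an illustration, not the study's actual implementation: the function names, the default `sigma`, and the choice of a 3-sigma kernel radius are assumptions, since the text does not state the filter parameters.

```python
import numpy as np

def gaussian_kernel_1d(sigma: float, radius: int) -> np.ndarray:
    """Return a normalized 1-D Gaussian kernel with 2 * radius + 1 taps."""
    x = np.arange(-radius, radius + 1, dtype=np.float64)
    kernel = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    return kernel / kernel.sum()

def gaussian_filter(image: np.ndarray, sigma: float = 1.0) -> np.ndarray:
    """Smooth a 2-D grayscale image with a separable Gaussian filter.

    sigma and the 3-sigma radius are illustrative assumptions.
    """
    radius = max(1, int(round(3 * sigma)))
    kernel = gaussian_kernel_1d(sigma, radius)
    # Reflect-pad so that the output has the same size as the input.
    padded = np.pad(image.astype(np.float64), radius, mode="reflect")
    # A 2-D Gaussian is separable: filter every row, then every column.
    rows = np.apply_along_axis(
        lambda r: np.convolve(r, kernel, mode="valid"), 1, padded)
    return np.apply_along_axis(
        lambda c: np.convolve(c, kernel, mode="valid"), 0, rows)
```

A larger `sigma` removes more noise but also blurs fine structures, so in practice the value would be tuned on sample radiographs.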

Image enhancement (CLAHE)
Contrast Limited Adaptive Histogram Equalization (CLAHE) is used to increase the contrast of an image. CLAHE is a more advanced variant of Adaptive Histogram Equalization (AHE) [58]. CLAHE has been found to improve the contrast of low-contrast images [59]. CLAHE improves both the local contrast of medical images and their usability [60]. The CLAHE technique focuses on enhancing local contrast to overcome the constraints of global methods. The tile size and clip limit are critical hyper-parameters for this method. An incorrect choice of hyper-parameters could have a big influence on the image quality. As can be observed from Fig. 4, several combinations are investigated, namely tileGridSize (5, 5) with a clip limit of 0.5, tileGridSize (7, 7) with a clip limit of 1.5, and tileGridSize (10, 10) with a clip limit of 3, and the optimal one (tileGridSize (10, 10) with a clip limit of 3) is chosen. It is also observable that the features of the selected image are more prominent. The histogram indicates that the contrast of a CLAHE image is much greater than that of the initial source image. Fig. 4 depicts the output after executing the CLAHE algorithm.
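To make the role of the clip limit concrete, the sketch below applies the core CLAHE step to a single 8-bit tile: the histogram is clipped, the excess is redistributed uniformly, and the cumulative histogram becomes the intensity mapping. It is a simplified illustration under stated assumptions, not the study's code. Full CLAHE repeats this for every tile of the grid and interpolates between the mappings of neighboring tiles, which is omitted here. The function name and the convention of expressing the clip limit relative to the mean bin height are assumptions.

```python
import numpy as np

def clipped_equalize_tile(tile: np.ndarray, clip_limit: float = 3.0) -> np.ndarray:
    """Contrast-limited histogram equalization of one uint8 tile.

    Simplified: no interpolation between tiles, as full CLAHE would do.
    """
    hist = np.bincount(tile.ravel(), minlength=256).astype(np.float64)
    # Assumed convention: the limit is a multiple of the mean bin height.
    limit = max(1.0, clip_limit * tile.size / 256.0)
    # Clip tall bins and spread the removed counts evenly over all bins.
    excess = np.sum(np.maximum(hist - limit, 0.0))
    hist = np.minimum(hist, limit) + excess / 256.0
    # The normalized cumulative histogram is the gray-level lookup table.
    cdf = np.cumsum(hist)
    lut = np.round(255.0 * cdf / cdf[-1]).astype(np.uint8)
    return lut[tile]
```

A small clip limit keeps the mapping close to the identity and limits noise amplification, while a large one approaches ordinary histogram equalization of the tile.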

Image augmentation
A deep learning model needs a large number of inputs to function with optimal efficiency. In this work, several data augmentation approaches are used to expand the enhanced dataset. By adding more distinct samples to the training datasets, data augmentation may improve the performance and outcomes of machine learning algorithms. If the dataset employed to train the model is sufficiently broad and varied, the approach is more effective and precise. Through the use of image enhancement methods, the accuracy of the results is increased. Moreover, data augmentation is an effective way of diversifying datasets.

Proposed model
The principal purpose of this study is to employ transfer learning methods to obtain an appropriate classification accuracy on the NIH chest X-ray dataset. Five pre-trained models were analyzed to find the most effective deep learning approach for the lung lesion classification task. These models are InceptionV3 [75], AlexNet [6], DenseNet121 [76], VGG19 [77], and MobileNetV2 [55]. Among these pre-trained models, MobileNetV2 showed the highest accuracy. To further improve the classification results, MobileNetV2 was modified. An ablation study was performed to determine the different hyper-parameters. A custom fine-tuning transfer learning approach, named MobileLungNetV2, is designed and applied by adding multiple layers to the MobileNetV2 network to obtain the highest accuracy over the existing pre-trained models. This study aimed to develop a CNN-based method for lung lesion image classification. The presented system was developed in Python utilizing the Keras framework [54], which is a TensorFlow-based platform. All tests were conducted using an AMD Ryzen 7 (3900) CPU running at 3.90 GHz with 8 cores, 16 threads, and 64 GB of RAM.

MobileLungNetV2
The fine-tuned MobileNetV2 framework, known as MobileLungNetV2, outperforms the five pre-trained model architectures in classification accuracy, as shown in Section 5. Consequently, the MobileLungNetV2 architecture, built on the MobileNetV2 architecture, is introduced and tested using the National Institutes of Health's chest X-ray dataset. Additionally, hyper-parameter tuning was conducted to increase the robustness of the architecture in terms of lung lesion identification. Fig. 5 illustrates the model structure.
The pre-trained MobileNetV2 [55] introduces a module that includes an inverted residual structure. MobileNetV2 is designed from the bottom up using fully convolutional layers made of filters and residual bottleneck layers. MobileNetV2's structure begins with a fully convolutional layer with 32 filters, followed by 19 residual bottleneck layers. It is split into two blocks, each one with 3 layers. These blocks begin and end with a 1 × 1 convolution layer comprising 32 filters. However, the 2nd block is a fully connected layer with a depth of one. The ReLU is used at several levels of the architecture. The distinction between the two blocks is in stride length, with block one having a stride length of 1 and block two having a stride length of 2.
As Fig. 5 shows, MobileLungNetV2 is constructed by appending layers after the 16 blocks of the pre-trained MobileNetV2. The layers consist of a convolutional layer, an AveragePooling2D layer, a dropout layer, and a flatten layer that is linked to a dense layer. The network's default input layer requires a grayscale image with a size of 224 × 224. The 16 blocks consist of 2 types of blocks in the model. One is a residual block with a stride of 1. The second is a block with a stride of 2 for shrinking. Each block has three layers. The first layer consists of a 1 × 1 convolution with ReLU6. The second layer is the depth-wise convolution layer. The third layer is another 1 × 1 convolution without any non-linearity. Similarly, up to 16 blocks are constructed with the same alignment.
After block sixteen, a convolution layer, an AveragePooling2D layer, a dropout layer, a flatten layer and a dense layer are added. The convolutional layer has a kernel size of 3 × 3, 'same' padding, a ReLU activation function and 32 channels, and is followed by an AveragePooling2D layer with a pool size of 2 × 2. A dropout layer has been added with 32 channels to prevent overfitting. A flatten layer has also been added, which is connected to a dense layer that generates output for the 15 classes. The RMSprop optimizer is used for training, and SoftMax activation is employed in the final layer. Throughout the process, a learning rate of 0.001 is applied. Finally, we evaluated the F1-score, recall, precision, specificity and accuracy. In Fig. 6, the fine-tuning strategy is depicted.
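The layer stack described above could look roughly as follows in Keras. This is a hedged sketch, not the authors' code: the function name `build_mobile_lung_net_v2`, the dropout rate of 0.5, the categorical cross-entropy loss, the 3-channel input (a grayscale radiograph would be replicated across channels), and the default of randomly initialized weights are all assumptions that the text does not specify.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

def build_mobile_lung_net_v2(num_classes=15, weights=None, dropout_rate=0.5):
    """Sketch of the described head on top of a MobileNetV2 backbone.

    Pass weights="imagenet" to start from pre-trained weights.
    dropout_rate and the loss function are assumptions.
    """
    base = tf.keras.applications.MobileNetV2(
        input_shape=(224, 224, 3), include_top=False, weights=weights)
    # Added layers: 3x3 convolution with 32 channels, 'same' padding, ReLU.
    x = layers.Conv2D(32, (3, 3), padding="same", activation="relu")(base.output)
    x = layers.AveragePooling2D(pool_size=(2, 2))(x)
    x = layers.Dropout(dropout_rate)(x)
    x = layers.Flatten()(x)
    # One softmax output per class (14 lesions plus the normal class).
    outputs = layers.Dense(num_classes, activation="softmax")(x)
    model = models.Model(inputs=base.input, outputs=outputs)
    model.compile(
        optimizer=tf.keras.optimizers.RMSprop(learning_rate=0.001),
        loss="categorical_crossentropy",
        metrics=["accuracy"])
    return model
```

Calling `build_mobile_lung_net_v2().summary()` lists the backbone followed by the five added layers, which is a quick way to compare a reimplementation against Fig. 5.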

Ablation study of the MobileLungNetV2
As part of the ablation study, four experiments were performed by changing different sections of the suggested MobileLungNetV2 architecture based on the fine-tuned MobileNetV2 framework. It is feasible to create a more robust architecture with improved classification accuracy by altering multiple aspects of the architecture. Accordingly, the AveragePooling2D layer, Dropout layer, Flatten layer, Loss function, Learning rate and Optimizer were all subjected to an ablation analysis.
In this analysis, the validation loss is denoted by 'V_Loss,' the validation accuracy by 'V_Acc,' the test loss by 'Ts_Loss,' and the test accuracy by 'Ts_Acc.' Table 4 presents case studies 1, 2, and 3. Case study 1 shows that the AveragePooling2D layer achieves the highest accuracy for the MobileLungNetV2 model, with a V_Acc of 94.76% and a Ts_Acc of 96.97%. Furthermore, the accuracy of the GlobalAveragePooling2D, GlobalMaxPooling2D, and MaxPooling2D layers drops significantly, with respective V_Acc values of 90
In case study 4, Table 5 shows that the 'Adagrad' optimizer with a learning rate of 0.00000001 improved model performance, with a V_Acc of 94.76% and a Ts_Acc of 96.97%. With this learning rate of 0.00000001, the 'Adagrad' optimizer also had the lowest Ts_Loss of 0.10. Other optimizers, such as RMSprop, SGD, and Nadam, achieved Ts_Acc values greater than 90% with our model.

Results analysis
In the study, a total of 51987 chest X-ray images were used. The data used in each class is set out in Table 3. Each image is resized to 224 × 224 pixels. The transfer learning models are assessed against the proposed MobileLungNetV2 model.

Evaluation matrix
To evaluate the transfer learning models, performance parameters such as Accuracy, Specificity, Recall, Precision, False Positive Rate (FPR), False Negative Rate (FNR) and F1-score are determined. The parameters are calculated using a confusion matrix generated for each individual model. Accuracy is the percentage of correct predictions. Precision is the proportion of positive predictions that are correct. Specificity is the percentage of actual negatives that are correctly predicted as negative. In contrast to specificity, recall is the percentage of actual positives that are correctly predicted as positive. The F1-score is used to determine the balance between precision and recall. The performance parameters are expressed in equations (1)-(7).
Here, a True Positive (TP) is a prediction of lung disease where the patient actually has lung disease. A True Negative (TN) is a prediction of the absence of lung disease where no lung disease exists. A False Positive (FP) predicts the presence of a lung disease that is not present. Similarly, a False Negative (FN) predicts no lung disease where lung disease is present. These elements are then used to form the equations that generate the values of the performance parameters.
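As an illustration of how these quantities follow from a multiclass confusion matrix, the sketch below derives per-class TP, FP, FN and TN in a one-vs-rest fashion and applies the standard textbook definitions of the seven measures. It assumes that these standard definitions correspond to equations (1)-(7), and that rows hold the true classes and columns the predicted classes; the function name is hypothetical.

```python
import numpy as np

def per_class_metrics(confusion_matrix):
    """One-vs-rest metrics per class (rows = true, columns = predicted)."""
    cm = np.asarray(confusion_matrix, dtype=np.float64)
    tp = np.diag(cm)
    fn = cm.sum(axis=1) - tp   # true class missed by the model
    fp = cm.sum(axis=0) - tp   # other classes predicted as this class
    tn = cm.sum() - tp - fn - fp
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / cm.sum(),
        "precision": precision,
        "recall": recall,
        "specificity": tn / (tn + fp),
        "fpr": fp / (fp + tn),
        "fnr": fn / (fn + tp),
        "f1": 2 * precision * recall / (precision + recall),
    }

def overall_accuracy(confusion_matrix):
    """Fraction of all samples that lie on the diagonal."""
    cm = np.asarray(confusion_matrix, dtype=np.float64)
    return np.trace(cm) / cm.sum()
```

With these definitions, FPR equals 1 minus specificity and FNR equals 1 minus recall for every class. As a sanity check on the overall accuracy, 10082 correctly classified images out of 10397 test images gives 10082 / 10397 ≈ 96.97%.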

Performance of pre-trained transfer learning models
Initially, five transfer learning models were trained and tested. To determine the performance of the models, the performance parameters are computed for each model using equations (1)-(7). The performance measures are presented in Fig. 7. Overall, the MobileNetV2 model shows consistent performance with 91.6% accuracy. It achieved a precision of 91.42%, and its recall, specificity and F1-score are 91.06%, 91.23% and 91.23%, respectively. Although DenseNet121 has a high specificity score of 92.87%, its accuracy is lower at 88.9%. Similarly, the remaining models, VGG19, AlexNet, and InceptionV3, all have relatively low accuracy, with 88.99%, 82.95% and 79.11%, respectively.

Performance of proposed MobileLungNetV2
The transfer learning MobileNetV2 model shows the most promise compared to the other models used to predict lung lesions. However, its accuracy of 91.6% implies an incorrect prediction rate of 8.4%, which is unsatisfactory for diagnosing disease. We aim to build a model with the lowest possible chance of misdiagnosis. To achieve this, the accuracy should be as close to 100% as possible without overfitting. Consequently, the paper proposes the fine-tuned MobileLungNetV2. The improved model is trained for 300 epochs. The training and validation accuracy for each epoch is recorded along with the loss value (Fig. 8).
The performance measures of the proposed model are calculated by applying the values of the confusion matrix to equations (1)-(7). Each of the measures is color coded and shown in Fig. 10. The accuracy and specificity of the model across all the classes are very high at over 99%. The other measures (precision, recall and F1-score) vary more between classes. The highest accuracy is 99.77% from class Hernia. Class Emphysema obtains the highest specificity, precision and F1-score with 99.92%, 98.86%, and 98.23%, respectively. The highest recall, recorded at 98.06%, is from class Pneumothorax. The lowest accuracy, precision, recall and F1-score are obtained from class Nodule with 99.34%, 93.41%, 94.10% and 93.75%, respectively. The lowest specificity is obtained from class Edema with 99.62%.
To determine the error rates, the False Positive Rate (FPR) and False Negative Rate (FNR) of the proposed model are measured for each class. It is observed that the highest FPR is obtained from the Atelectasis class and the lowest from the Mass class, with 0.0038% and 0.0012%, respectively. Similarly, the highest FNR is 0.0659% for the Nodule class and the lowest is 0.0114% for the Emphysema class. In Fig. 11, the values of FPR and FNR are recorded for each class. It shows that the FPR remains nearly the same for all the classes, whereas the FNR varies considerably among the classes of the dataset.
The overall performance of the model is presented in Fig. 12. The overall accuracy of the model is determined as 96.97%. Similarly, the precision, recall and F1-score are 96.71%, 96.83% and 96.76%, respectively. It is notable that the model has a high specificity score of 99.78%.
While numerous efforts have been made to enhance the applicability and expandability of deep learning, it is crucial to develop the interpretability of deep convolutional neural models in medical imaging applications. Selvaraju et al. [74] illustrated the working of deep learning using a procedure dubbed Gradient-weighted Class Activation Mapping (Grad-CAM). Grad-CAM can be applied to a wide range of CNN-based architectures and highlights the image regions that most influence the network's output while it performs prediction or classification. The input is a conventional X-ray image, and the suggested framework is applied as a detection strategy. Grad-CAM is applied to the last convolution layer immediately after the proposed model's label prediction. Fig. 13 demonstrates the visualization of heat maps on X-ray images using the proposed approach.
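The combination step at the heart of Grad-CAM can be sketched in a few lines of NumPy. The sketch assumes that the activations of the last convolution layer and the gradients of the predicted class score with respect to those activations have already been extracted from the network for one image; how they are extracted depends on the framework and is not shown. The function name and the final normalization to the range [0, 1] are illustrative assumptions.

```python
import numpy as np

def grad_cam(activations: np.ndarray, gradients: np.ndarray) -> np.ndarray:
    """Combine feature maps into a class activation heatmap.

    activations, gradients: arrays of shape (H, W, K) for one image,
    taken at the last convolution layer.
    """
    # Channel importance: gradients averaged over the spatial dimensions.
    weights = gradients.mean(axis=(0, 1))                  # shape (K,)
    # Weighted sum of the feature maps, then ReLU to keep only
    # regions with a positive influence on the class score.
    cam = np.maximum((activations * weights).sum(axis=-1), 0.0)
    peak = cam.max()
    # Scale to [0, 1] for display (an illustrative choice).
    return cam / peak if peak > 0 else cam
```

The resulting map has the spatial size of the feature maps, so it is typically upsampled to the input resolution and overlaid on the X-ray as a heatmap.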

Discussion
The study aims to classify multiple lung lesions using convolutional neural networks. To achieve the best results, five transfer-learning models and a fine-tuned model, MobileLungNetV2, based on the MobileNetV2 model, are trained on the ChestX-ray14 dataset. Many previous research studies have made use of the ChestX-ray14 dataset as it is one of the biggest accessible datasets of this type. Therefore, previous studies using the ChestX-ray14 dataset have been analyzed in order to compare the accuracy of findings and techniques, as shown in Table 1. In the current study, MobileNetV2 was chosen as the model to be adapted because it had the highest accuracy among the five pre-trained models. To determine the performance of the models, several performance measures are computed, and the performance of the models is determined class-wise for a detailed examination. In Table 6, the performance of existing studies is compared with that of the proposed study on the same dataset. The AUC value is used as the main performance measure for the comparison. Here, it is observed that among the existing studies, the highest average AUC value is 0.850. The fine-tuned MobileLungNetV2 shows a higher AUC value of 0.923, indicating the better performance of the proposed model in classifying the lung lesions.

Conclusion
It is understood that lung disease is a major cause of mortality worldwide. This paper proposed a novel approach to detect lung lesions from X-ray imaging. An improved classification model, MobileLungNetV2, derived from the transfer learning model MobileNetV2, is suggested. In this study, X-ray images of 14 lung lesions and of normal lungs are examined. Following image pre-processing, five transfer learning models are applied to the dataset. The models, InceptionV3, AlexNet, DenseNet121, VGG19 and MobileNetV2, show overall classification accuracies of 79.11%, 82.95%, 88.9%, 88.99% and 91.6%, respectively, over the 15 classes of the dataset. Because MobileNetV2 outperforms the other models in terms of accuracy, it is selected to be modified further to boost performance. The improved fine-tuned MobileLungNetV2 model is constructed using 16 blocks with new neural network layers and is trained for 300 epochs with tuned hyper-parameters. The proposed model has an overall accuracy and F1-score of 96.97% and 96.76%, respectively, and the test loss of the model is as low as 0.15.
Fig. 13. Visualization of lung infections in X-ray images using Grad-CAM on MobileLungNetV2 model.

Table 6
Comparison of AUC value among the existing studies with the proposed study to determine performance using ChestX-Ray14 dataset.

Fig. 1. Overview of the approach for multiclass lung lesion classification.

Fig. 3. The result of applying the Gaussian Filter to the images.
Generally, to provide large-capacity learners with more relevant training material, data augmentation techniques have been used to increase the size of training sets. Nevertheless, a new trend is developing in the area of deep learning research in which samples are reinforced utilizing the test data augmentation technique [64-67]. The addition of test data may increase the stability of trained models [68-70]. Test data augmentation can therefore be used to improve the prediction performance of deep neural networks and open up fascinating new opportunities for medical image analysis [71-73]. Mirroring, rotating, zooming, flipping, and cropping are the most often used ways of augmenting data. In this study, the dataset is adjusted using oversampling and undersampling techniques. First, a random undersampling approach is used to reduce the classes with excess data (Infiltration and No Finding). This technique deletes data at random from the majority classes, reducing the quantity of data per class to 5000. Then, the oversampling (data augmentation) approaches are used to increase the classes with inadequate data (Emphysema, Infiltration, Mass, Pleural Thickening, Pneumonia, Pneumothorax, Atelectasis, Edema, Effusion, Hernia, Cardiomegaly, Pulmonary Fibrosis, Nodule, and Consolidation). Several augmentation procedures are applied in this study to the pre-processed image data: Rotate 90° right, Rotate 90° left, Rotate 45° Horizontal, Vertical flip, Rotate 45° Vertical, Translate (x, y (28.0, 13.0)) and Horizontal flip. Table 3 displays the result of the data augmentation.
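The balancing procedure can be sketched as follows. This is a minimal illustration under stated assumptions rather than the study's pipeline: the function names are hypothetical, only the 90° rotations and the two flips from the list above are implemented (the 45° rotations and the translation would need an image library), and topping up each small class to the same target of 5000 images is an assumption.

```python
import numpy as np

TARGET_PER_CLASS = 5000  # per-class size after undersampling, from the text

def augment(image: np.ndarray, rng: np.random.Generator) -> np.ndarray:
    """Apply one randomly chosen transform from a subset of the listed ones."""
    ops = [
        lambda im: np.rot90(im, k=1),    # rotate 90 degrees left
        lambda im: np.rot90(im, k=-1),   # rotate 90 degrees right
        np.fliplr,                       # horizontal flip
        np.flipud,                       # vertical flip
    ]
    return ops[rng.integers(len(ops))](image)

def balance_class(images, target, rng):
    """Return exactly `target` images for one class.

    Large classes are randomly undersampled without replacement;
    small classes are topped up with augmented copies.
    """
    if len(images) >= target:
        keep = rng.choice(len(images), size=target, replace=False)
        return [images[i] for i in keep]
    extra = [augment(images[rng.integers(len(images))], rng)
             for _ in range(target - len(images))]
    return list(images) + extra
```

For example, `balance_class(class_images, TARGET_PER_CLASS, np.random.default_rng(0))` would be applied to the image list of each class in turn, with a fixed seed for reproducibility.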

Fig. 6. Sketch of the fine-tuning strategy, transforming a pre-trained MobileNetV2 into the fine-tuned MobileLungNetV2.

Fig. 7. Performance measures of the transfer learning models.
(a) and (b), it is observed that the accuracy of the model gradually increases with increasing epochs, whereas the loss value gradually decreases. The final training accuracy of the model stands at 95.92% and the validation accuracy at 94.76%. The lowest training loss, reached at the 300th epoch, is 0.15, and the lowest validation loss is 0.192. The confusion matrix generated for the fine-tuned model is illustrated in Fig. 9. Here, the matrix is constructed for the 15 classes using the test dataset. The test set consists of 10397 inputs in total, of which 10082 were correctly classified by the model, leaving 315 misclassified. In the matrix, the vertically placed classes indicate the true class of a sample, and the horizontally placed classes indicate the class predicted by the model.
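As a quick consistency check, the overall test accuracy can be recomputed from the reported test-set size and the number of correctly classified inputs. The variable names below are illustrative only.

```python
# Counts reported for the test set of the fine-tuned model.
total_test_inputs = 10397
correctly_classified = 10082

# Overall accuracy is the fraction of inputs on the matrix diagonal.
accuracy = correctly_classified / total_test_inputs
print(f"{accuracy:.2%}")  # prints "96.97%"
```

The result agrees with the overall classification accuracy of 96.97% reported for MobileLungNetV2.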
• To evaluate performance, a confusion matrix is generated to compute accuracy, recall, precision, specificity, false positive rate, false negative rate, as well as F1-score.
• Grad-CAM is produced for visual depiction of the classification outputs.
• To further evaluate the results, a performance comparison with prior research on the same dataset is provided.
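The per-class measures named above can be derived from a multiclass confusion matrix in a one-vs-rest fashion. The sketch below is one common way to do this, not necessarily the exact procedure followed in the study. It assumes rows hold true classes and columns hold predicted classes, the function name is hypothetical, and the 3-class matrix is invented purely for illustration.

```python
def per_class_metrics(cm):
    """Compute one-vs-rest metrics; cm[i][j] = true class i predicted as j."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    results = []
    for k in range(n):
        tp = cm[k][k]
        fn = sum(cm[k]) - tp                       # true k, predicted other
        fp = sum(cm[i][k] for i in range(n)) - tp  # other, predicted k
        tn = total - tp - fn - fp
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        specificity = tn / (tn + fp) if tn + fp else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        results.append({
            "precision": precision,
            "recall": recall,
            "specificity": specificity,
            "fpr": 1.0 - specificity if tn + fp else 0.0,  # false positive rate
            "fnr": 1.0 - recall if tp + fn else 0.0,       # false negative rate
            "f1": f1,
        })
    return results


# Hypothetical 3-class confusion matrix, for illustration only.
toy_cm = [
    [8, 1, 1],
    [2, 7, 1],
    [0, 1, 9],
]
metrics = per_class_metrics(toy_cm)
```

Averaging these per-class values over all classes gives overall scores of the kind reported for the model; whether the study uses macro or weighted averaging is not stated here.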

Table 1
Overview of research employing deep learning techniques and their performance in classifying lung lesions.

Table 2
The properties of the raw dataset.

Table 3
Data distribution for each class after augmentation.

Table 4
Altering different layers of the model in the ablation experiment.

Table 5
Varying the optimizer and learning rate in the ablation study.
These results demonstrate that the model performs with much improved efficiency over other models. The proposed model achieves a higher classification accuracy than the other pre-trained models. However, the study has limitations, as it does not take into account the time complexity of the classification models. The time complexity of a program is the time required to execute it; as time complexity decreases, execution speed improves. A model is considered highly efficient when it provides high classification accuracy with low time complexity. Furthermore, training neural networks requires a substantial quantity of data. Though the dataset used in the study had a total of 51987 samples, an even larger dataset would increase the robustness of the model. A model's robustness depends on how its performance changes when new data, differing from the training data, are introduced. Additionally, the robustness of the model increases when the model is trained on multiple datasets. Some highly representative computational intelligence algorithms can also be used for the classification task. For future work, algorithms such as monarch butterfly optimization (MBO), earthworm optimization algorithm (EWA), elephant herding optimization (EHO), moth search (MS) algorithm, slime mould algorithm (SMA), hunger games search (HGS), Runge Kutta optimizer (RUN), colony predation algorithm (CPA), and Harris hawks optimization (HHO) can be employed for the classification of pulmonary diseases. Additionally, the fine-tuned model can be applied to classify other datasets containing different pulmonary diseases, as well as other types of diseases that may be detectable using medical imaging.
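The execution-time limitation noted above could be addressed in follow-up work by measuring per-sample inference latency alongside accuracy. The following is a minimal sketch, assuming a hypothetical `predict` callable that stands in for any trained classifier; it is not part of the reported experiments.

```python
import time


def mean_latency_seconds(predict, inputs, repeats=3):
    """Average wall-clock seconds per input over `repeats` full passes."""
    if not inputs:
        raise ValueError("inputs must not be empty")
    start = time.perf_counter()
    for _ in range(repeats):
        for x in inputs:
            predict(x)
    elapsed = time.perf_counter() - start
    return elapsed / (repeats * len(inputs))


# Placeholder model: a trivial function used only to exercise the timer.
def dummy_predict(x):
    return x * 2


latency = mean_latency_seconds(dummy_predict, [1, 2, 3, 4])
```

Reporting such a latency figure next to accuracy for each candidate model would allow the efficiency comparison that the text identifies as missing.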