COVID-19 detection and analysis from lung CT images using novel channel boosted CNNs

The COVID-19 pandemic, which emerged in Wuhan, China, in December 2019, has affected human life and the worldwide economy. Therefore, an efficient diagnostic system is required to control its spread. However, automatic diagnosis is challenging because of the limited amount of labeled data, minor contrast variation, and high structural similarity between infection and background. In this regard, a new two-phase deep convolutional neural network (CNN) based diagnostic system is proposed to detect minute irregularities and analyze COVID-19 infection. In the first phase, a novel SB-STM-BRNet CNN is developed, incorporating a new channel Squeezed and Boosted (SB) and dilated convolutional-based Split-Transform-Merge (STM) block to detect COVID-19 infected lung CT images. The new STM blocks perform multi-path region-smoothing and boundary operations, which help to learn minor contrast variation and global COVID-19 specific patterns. Furthermore, diverse boosted channels are achieved using the SB and Transfer Learning (TL) concepts in STM blocks to learn texture variation between COVID-19-specific and healthy images. In the second phase, COVID-19 infected images are provided to the novel COVID-CB-RESeg segmentation CNN to identify and analyze COVID-19 infectious regions. The proposed COVID-CB-RESeg methodically employs region-homogeneity and heterogeneity operations in each encoder-decoder block and a boosted decoder using auxiliary channels to simultaneously learn the low illumination and boundaries of the COVID-19 infected region. The proposed diagnostic system yields good performance in terms of accuracy: 98.21 %, F-score: 98.24 %, Dice Similarity: 96.40 %, and IoU: 98.85 % for the COVID-19 infected region. The proposed diagnostic system would reduce the burden on radiologists and strengthen their decisions for a fast and accurate COVID-19 diagnosis.


Introduction
Coronavirus disease 2019 (COVID-19), caused by the SARS-CoV-2 virus, is a pathogenic viral infection that originated in Wuhan (China) in December 2019 and spread worldwide [1]. COVID-19 causes a respiratory illness that can progress through fever, cough, myalgia, and pneumonia to organ failure, such as kidney failure, heart attack, etc. [2], [3]. The total number of COVID-19 cases is approximately 618 million, with 6.6 million deaths, while 597 million have recovered. COVID-19 is a global pandemic that has devastatingly affected world health and the economy [4]. Therefore, diagnostic tests are required to identify COVID-19 infected patients and reduce the spread in a timely manner.
The tests generally used for assessing COVID-19 individuals are genetic sequencing and radiological imaging techniques [5], [6]. Moreover, the real-time polymerase chain reaction (RT-PCR) is used to detect SARS-CoV-2 [7]. However, RT-PCR testing kits are expensive, and their limited availability and sampling capacity have led to the development of more efficient techniques [8], [9]. Therefore, analysis of radiological images (X-ray, Computed Tomography (CT)) has been used as a detection tool to tackle false-negative PCR results in suspected patients. Moreover, the CT scan is utilized for severity assessment, clinical evaluation, monitoring, and treatment of COVID-19 patients [10], [11].
In a public health emergency, the manual examination of many CT images is a great challenge and a severe concern for remote areas without experienced radiologists [12], [13]. There is an intense need for a computer-based tool that can help radiologists improve performance and deal with many patients [14], [15]. Therefore, deep learning (DL)-based diagnostic systems have been developed to assist radiologists in identifying COVID-19 infection. The convolutional neural network (CNN) is a branch of DL that is considered a powerful tool for medical diagnosis [16]. Moreover, deep CNN tools can detect minor irregularities that cannot be observed through manual examination. An effective diagnostic system can reduce the radiologist's burden of manually assessing COVID-19 images, ultimately improving the survival rate.
The correct analysis of COVID-19 infected images is challenging due to (i) minor contrast variation between the infected region and the background boundaries, (ii) high texture variation within homogeneous infected regions, and (iii) high variation in the size, shape, and position of the COVID-19 infected region. Moreover, these radiological images are normally complex in nature and highly distorted due to noise during CT image acquisition [17]. Furthermore, lung images manifest COVID-19 specific radiological patterns that are usually categorized by various types of opacities: ground-glass opacities (GGO), reticulation, and pleural [18], [19].
This work addresses the aforementioned challenges and presents a new deep CNN-based framework to detect COVID-19-specific features and analyze thoracic radiologic images. These deep CNN-based systems can capture useful dynamic features of the infected regions, discriminating the COVID-19 infected regions from the healthy ones. The significant contributions of the proposed diagnosis system are as follows:
1. A new two-phase deep CNN-based diagnostic system is developed for detecting infection and analyzing suspicious lesions in CT lung images to identify the severity and stage of the disease.
The rest of the manuscript discusses the related work and proposed diagnosis framework in sections 2 and 3, respectively. Section 4 gives dataset, implementation, and performance metrics details. Section 5 provides insight into the results, discussion, and comparative analysis. Finally, section 6 outlines the manuscript conclusion and future work.

Related Work
Recently, CT technology has been used to diagnose COVID-19 infection in developed countries such as America, China, etc. However, manual examination of CT scans places a significant burden on radiologists and affects performance. Several researchers have focused on developing an automatic system to assess suspected COVID-19 patients by analyzing CT scans [20]. Once a suspected individual is confirmed to be COVID-19 infected, close contacts can be informed and the infected person can be quarantined and given proper care and treatment. Several classical techniques have been used for diagnosis but failed to show efficient performance [21], [22]. Therefore, deep CNN-based automatic diagnostic systems have been developed for quick and accurate infection analysis to facilitate the radiologist [23], [24]. This way,
3. Patient data is usually imbalanced in nature and needs to be evaluated using standard imbalance-aware measures such as F-score. The assessment of a diagnostic technique using only a few metrics provides unreliable outcomes. These systems may use the ROC and PR curves at various threshold points.
These challenges necessitate customizing the CNN architecture to exploit the region homogeneity, region boundaries, and textural variations associated with COVID-19-specific infection. Additionally, segmentation techniques are implemented to analyze the severity of the disease using stringent datasets and report the necessary performance metrics.

COVID-19 Diagnosis Framework
A new two-phase deep CNN-based system is proposed for automatically analyzing COVID-19 irregularities and specific lung radiographic patterns. COVID-19 specific radiographic patterns (region-

Proposed COVID-19 Infection Detection
A new deep CNN-based detection model, SB-STM-BRNet, is developed to discriminate COVID infectious images from healthy ones. The proposed detection phase comprises two modules: the proposed SB-STM-BRNet and customized existing CNNs. The COVID-19 detection setup is illustrated in Figure 2. Moreover, we have employed data augmentation in the detection and segmentation phases to improve the efficacy and reliability of real-time diagnostics.

Data Augmentation
COVID-19 is a new infectious disease, and publicly available labeled CT datasets are limited. Therefore, the size of the dataset is increased by performing on-the-fly data augmentation at runtime during model training [34]. The augmentation applies rotation, shearing, reflection, etc., to the original images, as detailed in Table 1.
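The on-the-fly augmentation described above can be illustrated with a minimal Python sketch. It is not the authors' MATLAB pipeline: the function names are hypothetical, images are represented as plain nested lists, and only reflection and quarter-turn rotation are shown as simplified stand-ins for the transformations and parameter ranges listed in Table 1.

```python
import random


def reflect_horizontal(image):
    """Mirror a 2D image (list of rows) left to right."""
    return [row[::-1] for row in image]


def rotate_90(image):
    """Rotate a 2D image by 90 degrees clockwise."""
    return [list(col) for col in zip(*image[::-1])]


def augment(image, rng):
    """Return a randomly transformed copy; the source image is left untouched.

    Assumption: a coin flip decides on reflection and 0 to 3 quarter turns
    are applied. The paper's actual ranges (Table 1) are not reproduced here.
    """
    out = [row[:] for row in image]
    if rng.random() < 0.5:
        out = reflect_horizontal(out)
    for _ in range(rng.randrange(4)):
        out = rotate_90(out)
    return out


if __name__ == "__main__":
    rng = random.Random(0)
    ct_slice = [[1, 2], [3, 4]]
    # A fresh variant is drawn each time the image is requested during training.
    print(augment(ct_slice, rng))
```

Because each variant is generated when a training batch is drawn, no augmented copies need to be stored on disk.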

Figure 1 The detailed workflow of the proposed COVID-19 diagnostic system. The workflow contains CT Lung images, detection, segmentation models, and standard performance metrics. (Panel labels: COVID-19 CT Lung Images; Data Pre-Processing.)

The Proposed Channel SB-STM-BRNet
The new SB-STM-BRNet deep CNN comprises new dilated convolutional STM blocks and the channel SB idea to distinguish COVID-19 infected images from healthy slices. Each STM block systematically implements region and boundary operators within the dilated convolutional blocks.
The SB-STM-BRNet comprises three dilated convolutional STM blocks with an identical topology, where each block is composed of four region- and boundary-based feature-extraction dilated convolutional blocks. Each convolutional block methodically employs region and boundary operations using average-pooling and max-pooling as appropriate [14], as shown in equations 2 and 3. These  additional channels are generated using TL to achieve various channels, while blocks B and C are original channels. The STM boosted block dimensions are 128, 256, and 512 [35]. All four blocks are based on the STM technique to exploit region, boundary, and global receptive feature extraction; the detailed architecture is shown in Figure 3.
The channels and size are represented by x and k x l. The kernels and their size are denoted by f and i x
Figure 3 The proposed SB-STM-BRNet CNN architecture for COVID-19 detection.

Significance of New Channel SB-STM Blocks
The proposed SB-STM block is based on the systematic use of dilated convolutions paired with average-pooling and max-pooling operations, which enhance the feature set used to distinguish healthy regions from infected areas. Moreover, incorporating an edge operator helps the architecture learn highly discriminative patterns, while the region operator performs smoothing. In addition, the pooling operations perform down-sampling, which enhances the robustness of the model against variations in the input. Furthermore, the multi-path concept of STM blocks is used to obtain diversity in the feature set.
The systematic stacking of STM blocks and pooling operations helps to explore the COVID-19-specific patterns associated with region-homogeneity and boundaries for various segments. Moreover, the new SB concept is incorporated in the STM blocks to obtain prominent reduced channels, which are then combined to achieve diverse boosted channels. Furthermore, dilated convolutional layers are employed to preserve the global features. They can dynamically capture the representative information from the images; the CT lung images are then classified using fully connected and dropout layers, which preserve the prominent features and reduce overfitting.
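To make the role of dilation concrete, the following minimal Python sketch implements a single-channel dilated convolution (cross-correlation form, no padding, stride 1). The function name and the toy input are illustrative assumptions and do not reproduce the layer configuration of SB-STM-BRNet.

```python
def dilated_conv2d(image, kernel, dilation=1):
    """'Valid' 2D cross-correlation with a dilated kernel (no padding, stride 1)."""
    kh, kw = len(kernel), len(kernel[0])
    # Effective extent of the kernel once gaps are inserted between its taps.
    eff_h = (kh - 1) * dilation + 1
    eff_w = (kw - 1) * dilation + 1
    out_h = len(image) - eff_h + 1
    out_w = len(image[0]) - eff_w + 1
    if out_h <= 0 or out_w <= 0:
        raise ValueError("image is smaller than the dilated kernel")
    return [
        [
            sum(
                kernel[i][j] * image[r + i * dilation][c + j * dilation]
                for i in range(kh)
                for j in range(kw)
            )
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


if __name__ == "__main__":
    image = [[r * 5 + c for c in range(5)] for r in range(5)]
    ones = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
    # With dilation 2, the nine weights of a 3 x 3 kernel span a 5 x 5 window.
    print(dilated_conv2d(image, ones, dilation=2))
```

A kernel of size k with dilation d covers (k - 1) * d + 1 pixels per axis, so the receptive field grows without adding weights, which is consistent with the use of dilated layers to retain global context.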

Proposed COVID-19 Infection Analysis
Investigation of infected regions is essential to gain insights into the fundamental features of the infection pattern and its effects on the surrounding lung area. CT Lung images provide a higher level of detail in analyzing infected lung regions [25]. The CT lung images classified as COVID infected in the detection phase using the proposed SB-STM-BRNet are given to the segmentation CNN as input. Moreover, data augmentation is employed as preprocessing to improve model generalization.

The Proposed COVID-CB-RESeg Segmentation CNN
A novel COVID-CB-RESeg CNN is proposed to perform fine-grained pixel-wise segmentation to learn , is an input channel of size ( × ) and filter is represented by . The receptive field of the input channel is represented by ( , ) with respect to , and ( , ) shows the spatial dimension of the filter. In Equations 8 & 9, average and max denote the average-pooling and max-pooling operations, respectively, on the convolved output ( , ), where w is the window or stride size.

The Systematic Exploitation of Regional and Edge Features
Max-pooling preserves boundary details for dissimilarity-based pixel-wise object segmentation. In contrast, the systematic implementation of both average- and max-pooling helps preserve hybrid, information-rich feature sets that enhance the segmentation performance. These sets include region-homogeneity and boundary patterns of the COVID-19 infected region. Moreover, average-pooling suppresses noise acquired during CT image acquisition.
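The complementary behavior of the two pooling operators can be seen in a minimal Python sketch. It assumes non-overlapping square windows whose stride equals the window size w; the function name and the toy feature map are illustrative only.

```python
def pool2d(feature_map, window, mode):
    """Non-overlapping pooling with a square window (stride equals window size)."""
    reducers = {"max": max, "average": lambda values: sum(values) / len(values)}
    reduce_patch = reducers[mode]
    height, width = len(feature_map), len(feature_map[0])
    pooled = []
    for r in range(0, height - window + 1, window):
        row = []
        for c in range(0, width - window + 1, window):
            patch = [
                feature_map[r + i][c + j]
                for i in range(window)
                for j in range(window)
            ]
            row.append(reduce_patch(patch))
        pooled.append(row)
    return pooled


if __name__ == "__main__":
    feature_map = [
        [1, 1, 1, 9],
        [1, 1, 1, 1],
        [5, 5, 0, 0],
        [5, 5, 0, 8],
    ]
    print(pool2d(feature_map, 2, "max"))      # keeps the strongest response per window
    print(pool2d(feature_map, 2, "average"))  # smooths each window to its mean
```

In this toy example the isolated high responses (9 and 8) survive max-pooling but are diluted by average-pooling, whereas the homogeneous block of 5s is represented identically by both operators.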

Significance of Systematic Exploitation
The proposed strategy helps learn diverse feature sets and control the feature dimensions to improve generalization [14]. Normally, images are affected mainly by illumination, contrast, and texture variation. Therefore, implementing either pooling operation on its own may degrade the performance of CNNs: max-pooling discards region-homogeneity detail, which average-pooling preserves. Moreover, the boundary information lost by average-pooling is recovered through max-

Significance of Channel Boosting Utilization
In the proposed COVID-CB-RESeg segmentation CNN, the auxiliary decoder channels of TL-based customized CNNs are concatenated with the original channels of the proposed COVID-RESeg to attain diverse feature maps and improve the model's convergence. The COVID-RESeg is trained from scratch using the COVID-19 infected lung dataset and exploits CB using TL to preserve rich regional and edge-based information [36], [37]. The auxiliary channels help enhance the representational capacity of the proposed segmentation CNN.
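The concatenation step can be sketched in a few lines of Python. This is only an illustration of channel-wise concatenation under the assumption that a stack of feature maps is a list of equally sized 2D channels; the function name is hypothetical, and how the auxiliary channels are produced by the TL-based CNNs is not modeled.

```python
def boost_channels(original, auxiliary):
    """Concatenate two stacks of feature maps along the channel axis."""
    if not original or not auxiliary:
        raise ValueError("both channel stacks must be non-empty")
    shape = (len(original[0]), len(original[0][0]))
    for channel in original + auxiliary:
        if (len(channel), len(channel[0])) != shape:
            raise ValueError("all channels must share the same spatial size")
    # The boosted stack holds the original channels followed by the auxiliary ones.
    return original + auxiliary


if __name__ == "__main__":
    original = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]  # 2 channels of size 2 x 2
    auxiliary = [[[0, 1], [1, 0]]]                   # 1 auxiliary channel
    boosted = boost_channels(original, auxiliary)
    print(len(boosted))  # number of channels after boosting
```

The spatial check reflects the practical constraint that auxiliary feature maps must match the resolution of the decoder stage they are merged into.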

[Figure panel labels: CT Lung Image; Segmented Output; Auxiliary Channels; Original Channels]

Implementation of Existing Detection and Segmentation Models
Recently, CNNs have demonstrated effective performance in detecting and segmenting medical images [38]. Various deep CNNs have been employed to segment the COVID-19 infected region in CT images using diverse datasets [39], [40].  [47], [48]. These segmentation models differ in their encoders, decoders, upsampling rates, and skip connections.

Dataset
Lung CT scans have a high sensitivity for the analysis of COVID-19 infection. The major benefit of using lung CT scans is that they make the internal anatomy more apparent because overlapping structures are eliminated, thus leading to efficient analysis of the affected lung areas.
For this purpose, a dataset of 30 patients with 2684 CT lung images, provided by the Italian Society of Radiology (SIRM), is used [49]. The dataset consists of CT lung images and their corresponding labels for COVID infected and healthy patients in .nii.gz format. An experienced radiologist examined the provided dataset. The CT samples are paired with radiologist-provided binary labels that mark the infected lung regions. The proposed architecture is trained and tested on this dataset, as discussed in the following sections. Figure 6 illustrates the COVID-19 CT images, highlighting infected regions.

Data Pre-processing
The dataset is primarily based on 3D images in .nii.gz format; thus, it was converted into 2D images as a critical preprocessing step. Deep learning models tend to run faster on small images than on large ones. Thus, to improve our proposed architecture's performance, the dataset images were resized from 512 x 512 x 3 to 304 x 304 x 3, which stemmed from the loss of information in segmentation. Resizing images reduces the computational cost and enhances learning capacity [50].
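The two preprocessing steps can be sketched in Python as follows. The sketch assumes the volume has already been loaded into memory as nested lists indexed as [slice][row][column] (reading the .nii.gz file itself is outside its scope), and it uses nearest-neighbor interpolation purely for illustration, since the interpolation method is not stated here. Both function names are hypothetical.

```python
def volume_to_slices(volume):
    """Split a 3D volume indexed as [slice][row][col] into independent 2D images."""
    return [[row[:] for row in plane] for plane in volume]


def resize_nearest(image, out_h, out_w):
    """Resize a 2D image with nearest-neighbor sampling (illustrative choice)."""
    in_h, in_w = len(image), len(image[0])
    return [
        [image[(r * in_h) // out_h][(c * in_w) // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]


if __name__ == "__main__":
    # A toy volume with two 512 x 512 slices filled with zeros.
    volume = [[[0] * 512 for _ in range(512)] for _ in range(2)]
    slices = [resize_nearest(s, 304, 304) for s in volume_to_slices(volume)]
    print(len(slices), len(slices[0]), len(slices[0][0]))
```

For segmentation, the label masks would need the same resizing so that pixels and labels stay aligned; nearest-neighbor sampling has the side benefit of keeping binary masks binary.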

Dataset Distributions for Detection and Segmentation phase
In the proposed framework, detection and segmentation networks are trained separately. The dataset consists of CT lung images of COVID infected and healthy patients. The dataset is categorized into two classes, COVID infected and healthy, depending upon their labels. The CT lung images of COVID and healthy classes are used in the proposed architecture for classification purposes. For segmentation, only the COVID infected class images and their corresponding labels are used as it helps give a better insight into the infected area's analysis. The dataset distribution is available in Table 2.

Implementation Details
The COVID-19 CT image dataset is divided into training (80%) and testing (20%) sets for both the detection and segmentation phases. The training set is further divided into training and validation sets using cross-validation techniques, as shown in Table 2. Hold-out cross-validation is used for optimal hyper-parameter selection, after which the validation data is added back to the training set. The proposed detection and segmentation architecture is built in MATLAB 2021a. Since training CNNs is computationally expensive, MatConvNet (a MATLAB-based DL library) was used for the experiments [51].
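A minimal Python sketch of such a split is given below. The 80/20 train-test ratio follows the text, whereas the validation fraction, the random seed, the function name, and the image-level (rather than patient-level) shuffling are assumptions made for illustration; the actual counts are those reported in Table 2.

```python
import random


def split_dataset(items, test_fraction=0.2, val_fraction=0.1, seed=0):
    """Shuffle items and split them into train, validation, and test lists.

    Assumption: val_fraction is taken from the 80% training portion and is a
    hypothetical value, not one reported in the paper.
    """
    rng = random.Random(seed)
    shuffled = list(items)
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    test, remainder = shuffled[:n_test], shuffled[n_test:]
    n_val = round(len(remainder) * val_fraction)
    val, train = remainder[:n_val], remainder[n_val:]
    return train, val, test


if __name__ == "__main__":
    image_ids = list(range(2684))  # one id per CT image
    train, val, test = split_dataset(image_ids)
    print(len(train), len(val), len(test))
    # After hyper-parameter selection, the hold-out set is merged back:
    final_training_set = train + val
```

The test list is set aside before any tuning, so it remains unseen when the final model is evaluated.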

Performance Evaluation
The proposed two-phase COVID-19 diagnosis system has been assessed using standard measures. Moreover, the customized CNNs are also evaluated on the same metrics for comparative analysis. Performance measures, abbreviations, and mathematical representation are detailed (  [20], [52]. The confidence interval (CI) is used as a statistical test to evaluate the uncertainty of the segmentation CNNs.
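For reference, the commonly used definitions of the detection and segmentation measures named in this work (accuracy, F-score, Dice similarity, and IoU) can be sketched in Python as follows. These are the standard textbook formulas, assumed here rather than copied from the paper's own table, and the function names are illustrative.

```python
def detection_metrics(tp, fp, tn, fn):
    """Accuracy, precision, recall, and F-score from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f_score = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f_score": f_score}


def segmentation_metrics(pred, truth):
    """Dice similarity and IoU for two binary masks given as 2D lists of 0/1."""
    p = [v for row in pred for v in row]
    t = [v for row in truth for v in row]
    intersection = sum(1 for a, b in zip(p, t) if a and b)
    p_sum = sum(1 for a in p if a)
    t_sum = sum(1 for b in t if b)
    union = p_sum + t_sum - intersection
    # Two empty masks are treated as a perfect match.
    dice = 2 * intersection / (p_sum + t_sum) if p_sum + t_sum else 1.0
    iou = intersection / union if union else 1.0
    return {"dice": dice, "iou": iou}


if __name__ == "__main__":
    print(detection_metrics(tp=8, fp=2, tn=9, fn=1))
    print(segmentation_metrics([[1, 1], [0, 0]], [[1, 0], [0, 0]]))
```

The F-score depends only on true positives, false positives, and false negatives, which is why it is preferred over accuracy when the classes are imbalanced.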

Results and Analysis
This paper proposes a new two-phase diagnosis framework to detect and analyze the COVID-19 infectious region in the lungs. The two-phase framework has the advantages of improving diagnosis performance and reducing computational complexity. Moreover, this process mirrors the clinical workflow, where patients are referred for further diagnosis after initial detection, which helps quickly identify the severity of the disease. The performance of the proposed SB-STM-BRNet and COVID-CB-RESeg CNNs is evaluated based on standard performance metrics. The proposed models are tested on unseen data and show exceptional performance compared to existing architectures. Moreover, a data augmentation strategy applied to the training dataset showed beneficial results and enhanced the model generalization.

Performance of COVID-19 Infection Detection
The detection phase is optimized to recognize the COVID-19 infectious patterns and reduce false negatives. In this regard, a novel SB-STM-BRNet CNN is proposed to detect COVID-19 infectious images in the initial phase. Moreover, the learning capacity of the SB-STM-BRNet and customized CNNs for COVID-19 features is evaluated on test data.

Performance Analysis of the Proposed SB-STM-BRNet
The proposed SB-STM-BRNet achieves reasonable generalization compared to existing good- SB and TL that improved the sensitivity and maintained a high F-score in Table 5 and Figure 8.
Furthermore, SB yielded prominent and diverse feature maps that capture small illumination changes and texture variation between COVID-19 infected and healthy regions in CT images [53]. However, the few misclassified images are due to contaminants or noise in CT images, which create a resemblance between COVID-19 and healthy images [54].

Performance Comparison with the Existing Detection CNNs
The proposed SB-STM-BRNet performance is compared with the five customized classification CNNs COVID-19 specific patterns [55]. TL has demonstrated impressive performance in medical diagnosis challenges, especially COVID-19 and cancer diagnosis.  [57] 90.00 85.00

Figure 8 The performance gain of the proposed SB-STM-BRNet over the existing detection CNNs regarding standard performance metrics.

Feature Space Analysis
The feature space learned by the SB-STM-BRNet is visualized to better understand its decision-making behavior. Usually, distinguishable class features improve the learning capacity and reduce the model variance on a stringent dataset. The 2-D scatter plot of the principal components (PC) and their percentage variance on test data for the existing good-performing ResNet-50 is shown in Figure 9. Moreover, the 2-D scatter plots of PC1 vs. PC2 and PC1 vs. PC3 and their percentage variance for the proposed SB-STM-BRNet are shown in Figure 10. Finally, the feature space of the proposed SB-STM-BRNet shows good discrimination between COVID-19 and healthy features, as illustrated in Figure 10.

Detection Significance of the Proposed SB-STM-BRNet
Detection rate curves are used to quantitatively assess the discrimination ability of the developed SB-STM-BRNet. These are performance measurement curves that evaluate the generalization of the SB-STM-BRNet by analyzing the discrimination between the two classes, COVID-19 infected and healthy, at different threshold setups. Moreover, Figure 11 shows evidence from ROC and PR curves that SB-STM-BRNet has a good learning ability compared to existing detection CNNs on unseen data.
(Table 6). The results suggest that the proposed COVID-CB-RESeg has an excellent learning ability for COVID-19 infectious patterns, evident from DS (96.40 %) and IoU (98.85 %), respectively (Table 6). Consequently, the model precisely learned the discriminative boundaries and achieved a higher value of BFS (99.09 %). The proposed COVID-CB-RESeg appears globally suited for moderate to severely infected regions.

Comparative Analysis of Segmentation CNNs
The proposed COVID-CB-RESeg performance is compared with four popular segmentation CNNs (SegNet, U-Net, VGG-16, and FCN). The learning ability and the segmented infected regions by the proposed COVID-CB-RESeg and existing CNNs are shown in Table 6, Figure 12, and Figure 13. The proposed COVID-CB-RESeg achieved performance gain over the existing CNNs regarding DS (2.8-7.2 %), IoU (1.64-10.24 %), and BFS score (2.06-11.73 %), as depicted in Table 6. Furthermore, the pixel-wise segmentation of the infection region and the maps created using the proposed COVID-CB-RESeg illustrate high subjective quality compared to existing CNNs (Figure 13). The performance metrics and the visual quality of the segmented maps are evidence that the proposed COVID-CB-RESeg

TL based Segmentation Analysis
The TL-based performance gain over training from scratch is approximately 0.40 to 1%, as shown in  proposed COVID-CB-RESeg has low complexity and depth but shows more accurate performance than highly complex, deeper models.

Conclusion
COVID-19 is a transmissible disease that has affected people worldwide. Therefore, an early suggests that an integrated approach is appropriate for the early screening of COVID-19 infection and detailed analysis of the COVID-19 infectious region. Moreover, quick computer-aided diagnosis will help save valuable lives and have good socio-economic impacts. In the future, the proposed diagnosis framework will be extended to the multi-class challenge (COVID-19, viral and bacterial pneumonia, and Healthy) and other types of lung abnormalities. Generative adversarial networks (GANs) will be used to augment the dataset by generating synthetic examples to improve efficacy and reliability for real-time diagnostics. We will extend our work on designing a computer-aided diagnostic system and deploy the automatic tool in healthcare centers.