SENext: Squeeze-and-ExcitationNext for Single Image Super-Resolution

Recent research on image and video processing using convolutional neural networks has shown remarkable improvements, especially in the area of single image super-resolution (SISR). The primary target of SISR is to recover a visually appealing high-resolution (HR) output image from a degraded low-resolution (LR) input image. However, most recent convolutional neural network (CNN)-based image super-resolution frameworks use deeper and broader network architectures that require sizeable computational resources, risk overfitting, increase computational complexity and memory consumption, and take more processing time during evaluation. To address these issues, we present a Squeeze-and-ExcitationNext approach for single image super-resolution, known as SENext. In brief, squeeze-and-excitation blocks (SEBs) are used in our network architecture to reduce the computational cost and to adaptively recalibrate features through channel-wise feature mappings. Furthermore, local, sub-local, and global skip connections are employed between SEBs to enable feature reuse and stabilize training convergence. Instead of hand-designed bicubic upsampling as a pre-processing step, we perform post-upsampling at a later stage to reconstruct the high-resolution image. Extensive quantitative and qualitative experiments are performed on benchmark test datasets, including Set5, Set14, BSDS100, Urban100, and Manga109. These experimental evaluations validate the superiority of SENext over other deep CNN image SR methods in terms of PSNR/SSIM, FLOPs, number of parameters, processing speed, and visual quality.


I. INTRODUCTION
(The associate editor coordinating the review of this manuscript and approving it for publication was Yizhang Jiang.)

One of the most significant research areas in convolutional neural network-based image processing is single image super-resolution (SISR). The function of SISR is to reconstruct a visually appealing high-resolution (HR) output image from the input low-resolution (LR) image. However, SISR is still a difficult task and is considered an inverse, ill-posed problem, and numerous algorithms [1], [2], [3], [4], [5] have been discussed. Although these algorithms have improved the quality of LR images, their performance is not satisfactory and their computational complexity is high. Recently, deep convolutional neural networks (CNNs) captured the field of image SR, and the research community shifted from the old hand-designed approaches to new deep CNN-based approaches, beginning with the pioneering work of Dong et al. [6]. Inspired by Squeeze-and-Excitation Networks (SENet) [13] and single image super-resolution with recursive squeeze and excitation networks (SESR) [14], we propose a Squeeze-and-ExcitationNext for Single Image Super-Resolution, named SENext. In our SENext method, a squeeze-and-excitation block (SEB) is added to develop the interdependencies between the respective channels and reweight the new features.
Furthermore, single local skip connection-based image super-resolution approaches lose feature information at the subsequent layers, which then act as dead layers. This issue creates the vanishing gradient problem during the training phase [8], [15], [16]. Our proposed model handles this issue with the support of global and local skip connections. In addition, selecting the proper activation function is crucial for developing deep CNN methods. The rectified linear unit (ReLU) is currently the most popular activation function. As Krizhevsky et al. [8] suggested, the ReLU activation function speeds up training and reduces saturation problems. Still, several recent papers address the issue of exploding gradients (i.e., retaining too much information) during training [8], [15], [16]. It is desirable to adopt an activation function that addresses the abovementioned shortcomings. In contrast to the ReLU and PReLU activation functions, the non-linear activation function used in this work is the LeakyReLU (LReLU) [15]. The main contributions of our proposed method are as follows: i) To reduce the computational cost and obtain faster convergence during the training phase, we have replaced standard ResNet blocks with squeeze-and-excitation blocks (SEBs), inspired by Squeeze-and-Excitation Networks. Compared to other image SR methods, our model performs better at the ×2, ×3, ×4, and ×8 benchmark scales, not only in speed but also in computational cost.
ii) Deeper models face the problem of the Dying ReLU, the condition in which many ReLU neurons output zero, so the whole network gets stuck and never improves its performance. We have replaced ReLU with the LReLU activation function to reactivate the dead features introduced by zero gradients.
iii) A single local and global skip connection does not reconstruct a visually pleasing, high-quality HR image and introduces blurry artifacts into the HR output image. We have adopted a different approach and extract the feature information from multi-local, sub-local, and global skip connections to reconstruct a visually pleasing, high-quality HR image.
The remainder of our work is organized as follows. Section II discusses the related work on deep CNN image SR methods. Section III explains the proposed method for SISR in detail. In Section IV, we discuss the experimental results against other state-of-the-art methods. Finally, Section V presents the conclusions and future work.

II. RELATED WORK
The objective of SISR is to reconstruct a visually appealing HR output image that contains detailed information from an LR input image. Conventional methods [1], [2], [3], [4], [5] resolved the image SR problem in different ways, but deep learning-based CNN architectures offer a more effective and efficient route. In this section, we mainly discuss recent deep learning-based image SR approaches. The first deep learning-based solution to the SISR problem was proposed by Dong et al. [6] and named the super-resolution convolutional neural network (SRCNN). The SRCNN [6] model relies on three CNN layers to predict the output from an interpolated version of the upscaled image to reconstruct the HR image. However, this model has several weaknesses. First, it uses bicubic interpolation to upscale the original LR image, but bicubic interpolation introduces blurry results and was not designed for this purpose. Second, the image reconstruction quality is still not satisfactory. Third, the slow convergence rate leads to long training times. Wang et al. [17] introduced a sparse prior network for reconstructing the HR image, known as the Sparse Coding Network (SCN) [17]. The computational performance of SCN is also improved over the earlier SR method [6]. Wang et al. later modified the model and replaced the non-linear mapping with a set of sparse coding sub-networks [18]. The main disadvantage of the SCN [17] network architecture is its higher computational cost, which causes many problems in real-time applications. To speed up the reconstruction process of image super-resolution, Dong et al. introduced the fast super-resolution CNN (FSRCNN) [7] architecture. FSRCNN [7] is an upgraded and faster version of the SRCNN [6] design. The straightforward network design of FSRCNN uses one deconvolution layer and four CNN layers to upsample the original input LR images without using interpolation techniques.
Compared to SRCNN [6], FSRCNN [7] performs better and has lower computational complexity, but it still has a smaller network capacity. The efficient sub-pixel convolutional neural network (ESPCN) [10] is a simple, efficient, and fast image super-resolution method that can be applied to real-time image and video applications. A Very Deep Super-Resolution (VDSR) network with residual skip connections was introduced by Kim et al. [9], modeled after the visual geometry group network (VGGNet) used for the ImageNet classification [8] task. VDSR employs a global residual learning connection with a faster convergence rate to lower the training complexity. The VDSR [9] method uses a bicubic interpolation-based upscaled input image rather than the actual pixel values, resulting in increased memory usage and high computational costs. In addition, Kim et al. presented a Deeply Recursive Convolutional Network (DRCN) [11] for an image SR framework that employs several convolution layers. The key advantage of DRCN [11] is its constant number of training parameters. However, with more recursions, the main drawback of DRCN [11] is that it slows the training convergence. The authors also applied the skip connection recursively to enhance model performance. Mao et al. extended the residual encoder-decoder notion and proposed the RED [19] model. In this approach, the authors used residual learning with symmetric convolution operations to obtain better performance. These findings support the idea that ''the Deeper the Better.'' Contrarily, a shallower, fast learning-based approach was proposed by Romano et al., named Rapid and Accurate Image Super-Resolution (RAISR) [20]. In this approach, the authors classify input image patches according to the angle, coherence, and strength of the patches to learn the mappings from the original LR image to the reconstructed HR image. To rebuild the HR image, Lai et al.
developed a deep Laplacian Pyramid Super-Resolution Network (LapSRN) [21], a novel image SR design. The LapSRN [21] architecture is based on several pyramid levels, each of which has a deconvolution layer acting as an upsampler. Denoising convolutional neural networks (DnCNNs) were suggested by Zhang et al. [22] to speed up the development of extremely deep convolutional neural network designs. The DnCNN network stacks convolutional layers with a Batch Normalization (BN) layer before the ReLU activation function, similar in spirit to the SRCNN [6] network. Despite producing positive results, the model is computationally expensive because it uses BN layers. Excessive use of convolution operations will limit the advancement of image super-resolution technology, especially for low-power computing devices. To resolve this issue, Zheng et al. [23] proposed the concept of a lightweight information multi-distillation network (IMDN). To further improve the performance of SR methods, Tai et al. [24] developed a 52-layer Deep Recursive Residual Network called DRRN. Ledig et al. [25] use a deep CNN with residual skip connections containing 16 blocks to recover the upsampled version of the output image. Lim et al. [26] suggested an improved deep super-resolution network architecture to boost the model's training effectiveness; it won the NTIRE 2017 SR Challenge [27] and produced cutting-edge results, named the enhanced deep super-resolution network (EDSR). Tai et al. proposed the deepest model for image restoration, a very deep persistent memory network (MemNet), which uses several memory blocks to create persistent memory [28]. MemNet consists of cascaded memory blocks, which fuse the global features.
Yamanaka et al. [29] developed a deep convolutional neural network-based framework for image SR that combines parallel CNN layers with skip connections. The two networks used are an SR image reconstruction network and a feature extraction network for extracting features at various levels. Compared to VDSR [9], this model is shallower. Han et al. proposed a dual-state recurrent network (DSRN), which transmits information from the LR image state to the HR image state [30]. The authors update the signal information at each step before forwarding it to the HR state. A multi-scale residual network (MSRN) was created by Li et al. [31], which fuses features at various sizes by employing an adaptive feature detection strategy. This method utilizes the full hierarchical feature information to recreate the super-resolved HR image. Ahn et al. [32] suggested a method for handling multi-scale information and learning residuals in the LR feature space to select appropriate paths [32]. Furthermore, this method [32] provides modules for scale-specific upsampling with multiple shortcut connections. Choi et al. [33] used a recursive neural network and proposed a fast and efficient image SR method with a block state-based recursive network (BSRN). This type of network architecture tracks the current information status of the image features. Zhang et al. [34] proposed the super-resolution network for multiple degradations (SRMD) to reconstruct the HR image by concatenating an LR image with its degradation mapping function. Furthermore, the authors also designed another fine-tuning-based architecture named the noise-free degradation (SRMDNF) model [34]. A multi-scale inception-based super-resolution using deep learning (MSISRD) approach was proposed by Muhammad and Aramvith [35], which utilizes an inception block to reconstruct multi-scale feature information for image SR.
In the MSISRD approach, the authors employed asymmetric convolution operations to reduce the model's computational cost. Wang et al. [36] demonstrated a dilated convolutional neural network designed to expand the receptive field without expanding the kernel size. Furthermore, [36] uses a shallow network architecture that only increases the size of the receptive field. The end-to-end image super-resolution via deep and shallow convolutional networks architecture provides short- and long-range multi-scale information and replaces the bicubic interpolation operation with a transposed CNN layer to reconstruct the HR image [37]. Yang et al. [38] proposed a transposed layer-based network architecture with large-scale components known as a deep recurrent fusion network (DRFN). Su et al. [39] suggested a unique structure that entails several sub-networks to reconstruct the HR image gradually, with the LR feature map used as an input for each sub-network. In image super-resolution, arbitrary enlargement factors remain a challenge for real-time applications. Hui et al. [40] suggested an information distillation network (IDN) to reconstruct the HR output image. In the IDN [40] approach, the authors directly extract the feature information from the original input LR image. IDN [40] uses multiple cascaded information distillation blocks (DBlocks) to reconstruct a residual-based high-quality output image in the HR domain. Hung et al. [41] proposed a super-sampling network (SSNet) architecture that significantly reduces the number of parameters and multiplication operations through the use of depthwise separable convolution operations. Barzegar et al. [42] suggested a modest framework to avoid the training issue in deeper network architectures. Multi-scale Xception-Based Depthwise Separable Convolution for Single Image Super-Resolution (MXDSIR) was proposed by Muhammad et al. [43]. MXDSIR employs a depthwise separable convolution operation to reduce computational complexity. Hsu et al.
[44] were motivated by capsule neural networks to extract additional feature information for image SR and created two networks for image SR: the Capsule Image Restoration Neural Network (CIRNN) and the Capsule Attention and Reconstruction Neural Network (CARNN). To learn the feature information at various stages for SR, Liu and Ait-Boudaoud [45] presented a new hierarchical convolutional neural network (HCNN) architecture. The HCNN method involves a three-step hierarchical procedure based on an edge extraction branch, an edge reinforcement branch, and an SR image reconstruction branch. An SR algorithm that relies on prior knowledge and is very sensitive to noise is discussed in [46]. In this approach, the authors fuse multi-scale image information in a non-linear manner and use a cascading-based multi-scale global mechanism to capture non-local feature information. Xiao et al. [47] introduced the idea of a powerful lightweight multi-scale feature extraction super-resolution network (MFEN) architecture. In the design of MFEN, multi-scale feature extraction blocks (MFEBs) are stacked side-by-side to obtain multi-scale, hierarchical feature information. Xiao et al. [47] also proposed a simpler version of MFEN known as MFEN_S.
To resolve the issues of network depth as well as width, Qin and Zhang proposed an Attentive Residual Refinement Network (ARRFN) [48]. Generally, the ARRFN architecture consists of feature extraction, multi-scale separable upsampling blocks, and attentive residual refinement. Li et al. proposed an adjustable SR network (ASRN) [49], whose network depth can be easily adjusted.

III. PROPOSED METHOD
In this section, we give a detailed explanation of our proposed network architecture for SISR, known as Squeeze-and-ExcitationNext for Single Image Super-Resolution (SENext), as shown in Figure 2. The proposed framework mainly consists of two paths with four different types of blocks: the shallow feature extraction block (SFEB), the squeeze-and-excitation block (SEB), the split-concatenate block (SCB), and finally the capsule unit block (CUB), with the support of a special upper branch block (UBB). The information transfer pathway passes low-, mid-, and high-frequency information from the original low-resolution image. Our proposed method does not change the size of the input image. Initially, we extract the feature information from the original LR input image, add it, and pass it through the SCB, followed by the CUB. To reconstruct the visually pleasing HR output, we use all the feature information with the special upper branch, and the resultant output then passes through the learning-based transposed convolution layer.

A. SHALLOW FEATURE EXTRACTION BLOCK (SFEB)
According to the surveys [26] and [50], the shallow feature F_0 is extracted from the original LR input image using only one or two 3 × 3 convolutional layers followed by the ReLU activation function, as shown in Figures 3(a) and 3(b). The design of this block is straightforward, but it cannot extract the complete shallow feature information from the original LR input image. Furthermore, the whole network architecture depends on the initial shallow feature extraction, and essential feature information is sometimes lost when a network architecture is significantly deeper. To extract the complete low- and high-level feature information from the original LR input image, we adapt an improved version of the architecture in Figure 3(b) with local skip S_L and global skip S_G connections, as shown in Figure 3(c). Our proposed shallow feature extraction block is expressed as:

x_0 = H_SFEB(x_LR↓)

where H_SFEB(·) represents the convolution operation and x_LR↓ is the original LR input image. The obtained shallow feature x_0 is then used as the input of the SEB.
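As a shape-level illustration, the SFEB computation above can be sketched with a single-channel NumPy toy model (the kernel values, the single-channel simplification, and the exact skip wiring are illustrative assumptions, not the paper's configuration):

```python
import numpy as np

def conv3x3_same(x, k):
    """Naive single-channel 3 x 3 'same' convolution (zero padding, stride 1)."""
    h, w = x.shape
    xp = np.pad(x, 1)
    out = np.zeros_like(x)
    for i in range(h):
        for j in range(w):
            out[i, j] = np.sum(xp[i:i + 3, j:j + 3] * k)
    return out

def sfeb(x_lr, k1, k2, alpha=0.2):
    """Two stacked 3 x 3 convs with LReLU; the local skip S_L re-adds the
    first activation (the global skip S_G would join later in the network)."""
    f1 = conv3x3_same(x_lr, k1)
    f1 = np.where(f1 >= 0, f1, alpha * f1)   # LReLU
    f2 = conv3x3_same(f1, k2)
    return f2 + f1                            # local skip connection S_L

rng = np.random.default_rng(1)
x = rng.standard_normal((16, 16))            # toy LR input
f0 = sfeb(x, rng.standard_normal((3, 3)) * 0.1, rng.standard_normal((3, 3)) * 0.1)
print(f0.shape)  # (16, 16): spatial size is preserved, as stated above
```

Because the convolutions use 'same' padding, the input image size is unchanged, matching the statement in Section III.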

B. SQUEEZE-AND-EXCITATION BLOCK (SEB)
For image and computer vision applications, the SqueezeNet deep CNN architecture mainly focuses on computational cost and model efficiency [51]. The basic building block of SqueezeNet is commonly known as the fire module, as shown in Figure 4. The module consists of two stages: a squeeze stage that applies a series of 1 × 1 kernels and an expand stage that uses 3 × 3 kernels, both followed by a conventional ReLU activation function. The number of learnable squeeze filters is always less than the volume of the input. Consequently, the squeeze stage may be considered a dimensionality reduction process that also captures the pixel correlations between input channels. The output of the squeeze stage feeds the expand stage, which combines learned 1×1 and 3×3 convolutions.
To reduce the vanishing gradient issue during training and decrease the computational complexity, we propose an improved squeeze-and-excitation block (SEB) by stacking a series of 1×1 convolution layers in each phase and using the LReLU activation function in place of the ReLU activation function, as shown in Figure 5. Suppose the proposed model contains N SEBs; let x_{n−1} and x_n be the input and output of the n-th SEB. The resultant output x_n is fed to the SCB.
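The channel-wise recalibration that an SEB performs can be sketched in NumPy as follows; the layer widths, the reduction ratio, and the LReLU slope here are illustrative assumptions rather than the paper's exact configuration:

```python
import numpy as np

def se_recalibrate(x, w1, w2, alpha=0.2):
    """Squeeze-and-excitation on a (C, H, W) feature map:
    squeeze (global average pool), excite (two 1x1/FC maps), rescale channels."""
    z = x.mean(axis=(1, 2))                  # squeeze: one descriptor per channel
    a = w1 @ z                               # excitation, channel reduction
    a = np.where(a >= 0, a, alpha * a)       # LReLU (as used in our blocks)
    s = 1.0 / (1.0 + np.exp(-(w2 @ a)))      # excitation, sigmoid gate in (0, 1)
    return x * s[:, None, None]              # channel-wise recalibration

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 4, 4))           # toy 8-channel feature map
w1 = rng.standard_normal((2, 8)) * 0.1       # reduction ratio r = 4: 8 -> 2
w2 = rng.standard_normal((8, 2)) * 0.1       # back to 8 channels
y = se_recalibrate(x, w1, w2)
print(y.shape)  # (8, 4, 4): shape preserved, channels reweighted
```

Because the gate values lie in (0, 1), the block can only attenuate less-informative channels, which is the adaptive feature recalibration described above.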

C. SPLIT-CONCATENATION BLOCK (SCB)
Residual learning is one of the most crucial techniques for easing the training of large-scale networks [52]. A global skip connection was implemented by Kim et al. in [9], allowing the network to concentrate on predicting the residual. Furthermore, the residual skip connection technique moves the extracted feature information through every block using short-term skip connections [53]. Numerous efforts have altered the structure of the original ResNet [52], which was first developed for the image recognition task and obtained remarkable performance. Several versions of the residual learning-based building blocks are shown in Figures 6(a) and 6(b). The SRResNet building block [25] differs from the original ResNet block [52] due to the lack of an activation layer following the element-wise addition. The two BN layers were eliminated from SRResNet to create the EDSR building block [26]; the authors of EDSR [26] recommend that BN is not appropriate for the image super-resolution task. Thus, our proposed model adopts a split-concatenate block without BN, as shown in Figure 6(c). Initially, the features are split into two branches with kernel sizes of 3 × 3 and 5 × 5 to take advantage of both small and sizeable receptive fields, each followed by another 3 × 3 filter with the LReLU activation function, which prevents gradients from saturating and mitigates the risk of vanishing gradients.
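A minimal single-channel sketch of the split-concatenate idea, assuming one 3 × 3 and one 5 × 5 branch whose outputs are stacked along a channel axis (the kernel values, branch widths, and omission of the trailing 3 × 3 filters are simplifications):

```python
import numpy as np

def conv_same(x, k):
    """Naive single-channel 'same' convolution for an odd square kernel."""
    p = k.shape[0] // 2
    xp = np.pad(x, p)
    out = np.zeros_like(x)
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            out[i, j] = np.sum(xp[i:i + k.shape[0], j:j + k.shape[1]] * k)
    return out

def split_concat_block(x, k3, k5, alpha=0.2):
    """Split into a 3x3 branch (small receptive field) and a 5x5 branch
    (large receptive field), apply LReLU, then concatenate channel-wise."""
    lrelu = lambda a: np.where(a >= 0, a, alpha * a)
    b_small = lrelu(conv_same(x, k3))
    b_large = lrelu(conv_same(x, k5))
    return np.stack([b_small, b_large])  # concatenated branch outputs

rng = np.random.default_rng(2)
x = rng.standard_normal((12, 12))
y = split_concat_block(x, rng.standard_normal((3, 3)) * 0.1,
                       rng.standard_normal((5, 5)) * 0.1)
print(y.shape)  # (2, 12, 12): two branches stacked over the same spatial grid
```

Both branches see the same input, so concatenation preserves both the fine-detail (3 × 3) and wider-context (5 × 5) responses for the next block.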

D. CAPSULE UNIT BLOCK (CUB)
The Squeeze Unit [54] minimizes the feature map dimension and merges long-term features with skip connections to rebuild the high-quality HR image. Following the concept of [54], we propose a capsule unit block (CUB) with a global skip connection, as shown in Figure 7. The design of the proposed CUB consists of one bottleneck layer followed by LReLU and one 3 × 3 filter. The bottleneck layer recalibrates the information with a sub-local skip connection to overcome parameter growth and build an efficient architecture. The concatenated output is processed by one convolution layer with a filter size of 3 × 3.
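The parameter-saving role of the bottleneck (1 × 1) layer can be illustrated with a small NumPy sketch; the channel counts below are hypothetical:

```python
import numpy as np

def bottleneck_1x1(x, w):
    """1 x 1 convolution: a per-pixel linear mix across channels that maps
    C_in channels to C_out, curbing parameter growth before the 3 x 3 filter."""
    return np.tensordot(w, x, axes=([1], [0]))  # result: (C_out, H, W)

rng = np.random.default_rng(3)
x = rng.standard_normal((64, 6, 6))         # 64-channel feature map
w = rng.standard_normal((16, 64)) * 0.1     # 64 -> 16 channel reduction
y = bottleneck_1x1(x, w)
print(y.shape)  # (16, 6, 6)
```

A 1 × 1 layer mapping 64 channels to 16 needs 64 × 16 = 1024 weights, whereas a 3 × 3 layer with the same channel mapping needs 64 × 16 × 9 = 9216, which is why the bottleneck keeps the CUB compact.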

E. UPPER BRANCH BLOCK (UBB)
Placing an Inception [55]-style block before the transposed layer to extract multi-scale feature information increases the computational cost. Furthermore, an inception block placed before the transposed layer includes a max-pooling layer, which loses feature information and degrades the performance of the model [56], [57]. In addition, the 5 × 5 kernel size is more time-consuming and expensive.
To resolve these issues, we propose an alternate design with a simple upper branch block (UBB) and a small kernel size, as shown in Figure 8. We replaced the max-pooling operation with a residual skip connection. In the UBB, we utilize 10 CNN layers with a filter size of 3 × 3 and the LReLU function, except for the last layer. The resultant LR features are concatenated and fed through a learning-based deconvolution layer with a 3×3 filter to recover the visually pleasing HR reconstructed output image.
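The learned upsampling at the end of the network follows the standard transposed-convolution output-size relation; a small helper, with hypothetical sizes, shows the arithmetic:

```python
def deconv_out_size(n_in, k, stride, pad):
    """Output size along one axis for a transposed convolution
    (the learning-based deconvolution layer): stride*(n_in - 1) + k - 2*pad."""
    return stride * (n_in - 1) + k - 2 * pad

# Upsampling a 48-pixel-wide LR feature map by roughly x2 with a 3x3 kernel:
print(deconv_out_size(48, k=3, stride=2, pad=1))  # 95
```

With these settings the result is 95 pixels; frameworks typically add an output padding of 1 to reach the exact ×2 target of 96, so the stride (not interpolation) performs the enlargement.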

IV. EXPERIMENTAL RESULTS
In this section, we assess the effectiveness of our SENext model on different public datasets. Initially, we discuss the training and testing datasets; then, we explain the experimental evaluations against state-of-the-art methods. Training was performed on a combination of two datasets: 100 images of DIV2K [27] and 300 images of BSDS300 [58]. The same combination is used in [50] and [59]. Additionally, we apply the data augmentation technique to reduce the chance of overfitting.

C. COMPARISON ANALYSIS BASED ON THE IMAGE QUALITY METRICS
In this subsection, we compare the performance of existing image super-resolution methods using PSNR/SSIM in Figure 10. The quantitative results show that our SENext attains the best performance among existing deep CNN image SR methods. Using squeeze-and-excitation blocks with local and global skip connections, our proposed model obtains the peak values on both quality metrics (PSNR/SSIM).
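For reference, PSNR (one of the two metrics reported here) is computed from the mean squared error between the HR ground truth and the reconstruction; a minimal sketch with toy images:

```python
import numpy as np

def psnr(hr, sr, peak=255.0):
    """Peak signal-to-noise ratio in dB between the HR ground truth and the
    reconstructed SR image: 10 * log10(peak^2 / MSE). Higher is better."""
    mse = np.mean((hr.astype(np.float64) - sr.astype(np.float64)) ** 2)
    return 10.0 * np.log10(peak ** 2 / mse)

hr = np.full((8, 8), 128.0)
sr = hr + 4.0                   # uniform error of 4 gray levels -> MSE = 16
print(round(psnr(hr, sr), 2))   # 36.09
```

SSIM, the second metric, additionally compares local luminance, contrast, and structure, so the two metrics together capture both pixel fidelity and perceived structure.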

D. QUANTITATIVE ANALYSIS OF RUN TIME VERSUS PSNR
In this section, we assess our SENext model's performance in terms of runtime versus PSNR, as seen in Figure 11. The state-of-the-art approaches were assessed on an Intel Core i7-9750H CPU at 2.60 GHz supported by an NVIDIA GeForce RTX 2070 GPU (16 GB memory). For evaluation purposes, we used the GitHub code provided by the researchers. The trade-off between CPU execution time and PSNR on Set5 [60] at enlargement factor ×2 is presented in Figure 11. Our proposed method is faster than recent state-of-the-art methods except for the shallow models (SRCNN and FSRCNN). Furthermore, our proposed SENext attains a lower computational cost in terms of floating-point operations (FLOPs), as shown in Figure 12.
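FLOPs figures like those in Figure 12 are typically estimated per layer from the convolution dimensions; a sketch under the common 2-operations-per-multiply-accumulate convention (the layer sizes below are hypothetical, not SENext's actual configuration):

```python
def conv_flops(h, w, c_in, c_out, k):
    """FLOPs estimate for a k x k convolution on an h x w feature map with
    stride 1 and 'same' padding; counts 2 ops (multiply + add) per MAC."""
    macs = h * w * c_in * c_out * k * k
    return 2 * macs

# Example: a 3x3 conv with 64 -> 64 channels on a 48 x 48 LR feature map.
print(conv_flops(48, 48, 64, 64, 3))  # 169869312 (~0.17 GFLOPs for one layer)
```

Summing this quantity over all layers gives the network-level FLOPs count; post-upsampling designs keep h × w small for most layers, which is the main source of SENext's savings.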
E. PERCEPTUAL QUALITY COMPARISON
Figures 13, 14, 15, 16, and 17 show the perceptual quality at enlargement factors ×4 and ×8 on image SR test datasets including BSDS100 [58], Urban100 [62], and Manga109 [63]. On the challenging enlargement scale factor ×8, we observed that more blurry results were generated by Bicubic, RFL [5], SelfExSR [62], SRCNN [6], and FSRCNN [7]. In contrast, our approach recovers texture detail and effectively suppresses artifacts because it follows the concept of SqueezeNet [51], whose size is extremely compact for mobile applications: it has only 1.2 million parameters yet achieves accuracy similar to AlexNet [64]. The SqueezeNet architecture uses 26 convolutional layers without a fully-connected layer and achieves a top-1 accuracy of 57.4% and a top-5 accuracy of 80.5% on ImageNet [64].
The potential applications of SqueezeNet techniques are various in the field of image and computer vision tasks. The main versatile applications of SqueezeNet are in healthcare [64] and self-driving cars [66], where compact and efficient models are highly desirable. Self-driving cars rely heavily on real-time object detection to safely navigate through their environment. SqueezeNet has been used to improve the efficiency and accuracy of object detection in self-driving cars by quickly identifying objects such as pedestrians, cars, and traffic signs while consuming minimal computational resources [65]. Another major application of SqueezeNet is in the field of medical imaging, where real-time image processing and diagnosis are needed; it has been utilized to increase the effectiveness of medical imaging systems, including computed tomography (CT) and magnetic resonance imaging (MRI) scanners [66]. Quick CT image analysis and diagnosis can lead to better patient care and treatment results. Furthermore, face-mask detection [67] is also a central application of SqueezeNet to resolve critical problems in security and surveillance. Finally, an overall summary of state-of-the-art deep learning-based image SR methods is presented in Table 2.

1) MODEL ANALYSIS WITH SFEB
Results in Table 3 clearly show that a model with SFEB achieves better performance compared to one without SFEB.

2) MODEL ANALYSIS WITH DIFFERENT BLOCK ARRANGEMENTS
A more comprehensive ablation study of our proposed blocks can be found in Table 4. In this experiment, we investigated the effects of various combinations of blocks. Eight networks were trained for the super-resolution application with enlargement factor ×4 and had the same configuration of training as well as validation parameters. We used 100 images of the DIV2K [27] dataset for training and 91 images from Yang91 [1] for validation, with a batch size of 16 and 100 epochs. The PSNR values reported in Table 4 show that the baseline network (without any block) gives the lowest PSNR value (28.11 dB), while the best performance (28.48 dB) is observed when all blocks are used in the model.

3) MODEL ANALYSIS WITH ACTIVATION FUNCTION
Maas et al. [15] evaluated a variant of ReLU with a gradient more amenable to optimization, which leads to Leaky ReLU (LReLU). The most common problem facing the ReLU activation function is the Dying ReLU, which is resolved by LReLU. For this ablation study, we trained three different models: one using only the ReLU activation function and two using the LReLU activation with different values of α. The value of α is non-zero over the entire negative domain, which allows a small leakage to activate the dead neurons. The results in Figures 18 and 19 show that the network with LReLU converges more quickly, helping the network to train faster. Furthermore, the model with LReLU and α = 0.2 has a lower loss and a higher PSNR compared to LReLU with α = 0.5 and to the ReLU activation function.
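The difference between ReLU and LReLU on negative pre-activations, which drives the Dying ReLU behaviour discussed above, is a one-liner:

```python
import numpy as np

def leaky_relu(x, alpha=0.2):
    """LReLU: identity for x >= 0, small slope alpha for x < 0, so negative
    neurons still pass a scaled signal and receive a non-zero gradient."""
    return np.where(x >= 0, x, alpha * x)

x = np.array([-2.0, -0.5, 0.0, 1.0, 3.0])
print(leaky_relu(x, alpha=0.2))  # negatives scaled by alpha: -0.4 and -0.1
print(np.maximum(x, 0.0))        # plain ReLU zeroes the negatives entirely
```

Under ReLU, a neuron whose pre-activation stays negative has zero output and zero gradient, so it can never recover; the α-scaled branch of LReLU keeps a small gradient flowing, which matches the faster convergence seen in Figures 18 and 19.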

4) MODEL ANALYSIS WITH SELECTION OF OPTIMIZERS
The selection of an optimizer plays a crucial role during training to optimize model efficiency and reduce the chance of overfitting. Our proposed SENext model was trained with four optimizers: Adam [68], Adamax (an enhanced version of Adam), Root Mean Squared Propagation (RMSprop), and Stochastic Gradient Descent (SGD). The experimental results in terms of the loss function are shown in Figure 20. The Adam optimizer shows a more stable pattern compared to the other optimizers. RMSprop (green line) decreases slowly with more ripples after 400 iterations compared to Adam. All optimizers were trained for 1000 epochs with the base model. For training purposes we used 100 images from the DIV2K dataset [27] and for validation 91 images of the Yang91 dataset [1], with a batch size of 16.

5) MODEL ANALYSIS IN TERMS OF MEAN INFERENCE TIME
Inference time is an important factor for image super-resolution methods in addition to SR performance. In this part of the ablation, we show the inference test time on publicly available datasets (Set5, Set14, BSD100, Urban100, and Manga109) with enlargement factors of 2×, 3×, 4×, and 8×, as shown in Figure 21. From Figure 21, it is clearly seen that the smaller enlargement factors require more processing time than the larger ones, because the LR input image for the 2× enlargement factor is larger than that for the 8× factor. Therefore, the computational cost for a 2× input image is higher than for an 8× input image.
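This scale-versus-cost relation follows directly from the post-upsampling design: the LR input shrinks with the enlargement factor, so the per-layer work shrinks too. A tiny sketch (the HR target size below is hypothetical):

```python
def lr_input_size(hr_h, hr_w, scale):
    """In post-upsampling SR, the LR input is the HR target divided by the
    enlargement factor; per-layer conv cost is proportional to its pixel count."""
    return hr_h // scale, hr_w // scale

for s in (2, 3, 4, 8):
    h, w = lr_input_size(1024, 1024, s)
    print(s, h * w)  # pixel count shrinks quadratically with the scale factor
```

A ×2 input therefore carries (8/2)² = 16 times more pixels than a ×8 input for the same HR target, explaining the longer inference times at the smaller factors.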

V. CONCLUSION AND FUTURE WORK
This study presents SENext, a novel two-stage squeeze (compress) and expand method for single image super-resolution. The proposed SENext uses SFEB, SEB, SCB, CUB, and UBB blocks with the support of local and global residual skip connections. The SFEB blocks extract the low-frequency features from the original LR input image. The resulting feature information is passed to the remaining blocks through long and short skip paths. Implementing SEBs side-by-side reduces the computational cost of the network and computes the high-frequency feature information. The use of extensive sub-local skip connections helps reduce vanishing gradient problems during training. In addition, to activate the dead neurons in the model during training, we replaced the conventional ReLU activation function with LReLU. Furthermore, the comparative analysis and ablation study show the efficiency of the squeeze-and-excitation network in reducing parameters and computational cost. Extensive evaluations on five benchmark test datasets showed that our SENext model also improves the reconstruction results on both quantitative and qualitative criteria at the challenging upsampling factors of ×4 and ×8. In the future, we will further optimize our model to introduce multi-path learning with dense global and local skip connections under complex scenarios.