
Frequency disentangled residual network

  • Regular Paper
  • Published in: Multimedia Systems

Abstract

Residual networks (ResNets) have been utilized for various computer vision and image processing applications. The residual connection improves network training through better gradient flow. However, a residual block consists of several convolutional layers with trainable parameters, which makes it prone to overfitting. Moreover, existing residual networks do not exploit high- and low-frequency information suitably, which further limits their generalization capability. In this paper, a frequency-disentangled residual network (FDResNet) is proposed to tackle these issues. Specifically, FDResNet includes separate connections in the residual block for the low- and high-frequency components, respectively. By disentangling the low- and high-frequency components, the proposed model increases generalization ability. Moreover, computing the low- and high-frequency components with fixed filters further reduces overfitting. The proposed model is tested on the benchmark CIFAR-10/100, Caltech, and TinyImageNet datasets for image classification, and its performance is also evaluated in an image retrieval framework. The proposed model outperforms its counterpart residual model. The effects of kernel size and standard deviation are also evaluated, and the impact of frequency disentangling is analyzed using saliency maps.
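
Below is a minimal sketch of how a frequency-disentangled residual block could be implemented in PyTorch, based on the description above: a fixed (non-trainable) Gaussian low-pass filter extracts the low-frequency component, the high-frequency component is taken as its complement, and each component flows through its own trainable path before merging with the identity shortcut. The class and parameter names, the depthwise-Gaussian split, and the summation merge are illustrative assumptions, not the authors' implementation.

```python
# A minimal sketch of a frequency-disentangled residual block (assumed design,
# not the authors' code). Fixed Gaussian weights live in a buffer, so the
# disentangling step itself adds no trainable parameters.
import torch
import torch.nn as nn
import torch.nn.functional as F


def gaussian_kernel(size: int, sigma: float) -> torch.Tensor:
    """Build a fixed (non-trainable) 2D Gaussian kernel."""
    coords = torch.arange(size, dtype=torch.float32) - (size - 1) / 2
    g = torch.exp(-coords**2 / (2 * sigma**2))
    kernel = torch.outer(g, g)
    return kernel / kernel.sum()


class FreqDisentangledBlock(nn.Module):
    def __init__(self, channels: int, lp_size: int = 5, sigma: float = 1.0):
        super().__init__()
        # Fixed low-pass filter: a depthwise Gaussian registered as a buffer,
        # so it is never updated by the optimizer.
        k = gaussian_kernel(lp_size, sigma).expand(channels, 1, lp_size, lp_size)
        self.register_buffer("lp_weight", k.contiguous())
        self.channels = channels
        # Separate trainable paths for the two frequency components.
        self.low_conv = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1, bias=False),
            nn.BatchNorm2d(channels), nn.ReLU(inplace=True),
        )
        self.high_conv = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1, bias=False),
            nn.BatchNorm2d(channels), nn.ReLU(inplace=True),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Disentangle: low frequency via fixed depthwise Gaussian smoothing,
        # high frequency as the complement of the low-pass response.
        low = F.conv2d(x, self.lp_weight,
                       padding=self.lp_weight.shape[-1] // 2,
                       groups=self.channels)
        high = x - low
        # Process each component on its own connection, then merge with the
        # identity shortcut as in a standard residual block.
        return x + self.low_conv(low) + self.high_conv(high)
```

Because the Gaussian weights are a buffer rather than parameters, the split itself contributes no learnable capacity, which is consistent with the abstract's claim that fixed filters help avoid overfitting.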


Notes

  1. https://www.cs.toronto.edu/~kriz/cifar.html.

  2. http://www.vision.caltech.edu/Image_Datasets/Caltech256/.

  3. https://www.kaggle.com/c/tiny-imagenet.

  4. In this notation, used here and elsewhere, the first value in the brackets is the kernel size for high-pass filtering and the second value is the kernel size for low-pass filtering (see the illustrative snippet after this list).
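
The following is a hypothetical reading of this notation, e.g. a pair (3, 5) meaning a 3×3 high-pass kernel and a 5×5 low-pass kernel. The Gaussian low-pass and the identity-minus-Gaussian high-pass construction are assumptions for illustration; the exact filters used in FDResNet may differ.

```python
# Hypothetical illustration of the (high, low) kernel-size notation.
import torch

def gaussian_kernel(size: int, sigma: float) -> torch.Tensor:
    coords = torch.arange(size, dtype=torch.float32) - (size - 1) / 2
    g = torch.exp(-coords**2 / (2 * sigma**2))
    k = torch.outer(g, g)
    return k / k.sum()

k_high, k_low = 3, 5                         # the bracketed pair, e.g. "(3, 5)"
low_pass = gaussian_kernel(k_low, sigma=1.0) # 5x5 low-pass (Gaussian) kernel
delta = torch.zeros(k_high, k_high)
delta[k_high // 2, k_high // 2] = 1.0        # discrete impulse (identity filter)
high_pass = delta - gaussian_kernel(k_high, sigma=1.0)  # 3x3 high-pass kernel
```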


Acknowledgements

This research was funded by the Global Innovation and Technology Alliance (GITA) on behalf of the Department of Science and Technology (DST), Govt. of India, under the India–Taiwan joint project with Project Code GITA/DST/TWN/P-83/2019. We would like to thank NVIDIA for supporting this research with GPUs and Google Colab for providing the free computational resources used in the experiments.

Author information


Contributions

SRS and RRY contributed equally in conducting the experiments and computing the results. SRD conceptualized the idea, wrote the paper, and supervised the work. RKS and W-TC edited the paper and provided supervision.

Corresponding author

Correspondence to Shiv Ram Dubey.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Communicated by B. Bao.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Singh, S.R., Yedla, R.R., Dubey, S.R. et al. Frequency disentangled residual network. Multimedia Systems 30, 9 (2024). https://doi.org/10.1007/s00530-023-01232-5
