DRN-VideoSR: a deep recursive network for video super-resolution based on a deformable convolution shared-assignment network

Multimedia Tools and Applications

Abstract

Video super-resolution (video SR) usually involves several steps: motion estimation, motion compensation, fusion, and upsampling. Here, we propose a novel architecture for video SR. First, in place of motion estimation and compensation, the architecture uses a specially designed deformable convolution shared-assignment network. This model requires no warping operation and uses a three-layer pyramid deformable convolution network. Second, inspired by back-projection and the encoder-decoder structure, we propose a deep recursive fusion network that fuses multi-frame information for the target frame. The fusion network adopts a decoder-encoder structure with shared weights to construct the back-projection network, and concatenates the output of each back-projection layer. This design not only reduces the network requirements but also deepens the network structure, so that it can extract deeper image features and achieve fusion. Extensive evaluations and comparisons with previous methods validate the strengths of this approach and show that the proposed framework significantly outperforms the current state of the art.
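The recursive back-projection idea in the abstract can be illustrated with a toy sketch. This is not the authors' network: the learned decoder and encoder are replaced here by fixed nearest-neighbour upsampling and average pooling on 1-D signals, and every name (`up_project`, `down_project`, `recursive_back_projection`, `gain`) is hypothetical. The sketch only shows the control flow the abstract describes: one up/down pair is reused at every step (the analogue of shared weights), and the output of each step is kept and concatenated.

```python
from typing import List, Tuple


def up_project(lr: List[float], scale: int = 2) -> List[float]:
    """Decoder stand-in (assumption): nearest-neighbour upsampling."""
    return [v for v in lr for _ in range(scale)]


def down_project(hr: List[float], scale: int = 2) -> List[float]:
    """Encoder stand-in (assumption): average pooling over `scale` samples."""
    return [sum(hr[i:i + scale]) / scale for i in range(0, len(hr), scale)]


def recursive_back_projection(
    target_lr: List[float],
    neighbor_lr: List[float],
    steps: int = 3,
    scale: int = 2,
    gain: float = 0.5,
) -> Tuple[List[float], List[List[float]]]:
    """Refine an HR estimate of the target frame, starting from a neighbour frame.

    The same up/down operators are applied at every step, so no new
    parameters are introduced as the recursion gets deeper.
    """
    hr = up_project(neighbor_lr, scale)  # initial guess from the neighbour
    stage_outputs: List[List[float]] = []
    for _ in range(steps):
        # Project the estimate back to LR and measure the error to the target.
        residual = [t - d for t, d in zip(target_lr, down_project(hr, scale))]
        # Lift the error to HR and correct the estimate.
        correction = up_project(residual, scale)
        hr = [h + gain * c for h, c in zip(hr, correction)]
        stage_outputs.append(hr)
    # Concatenate the output of every back-projection step.
    fused = [v for stage in stage_outputs for v in stage]
    return fused, stage_outputs


if __name__ == "__main__":
    fused, stages = recursive_back_projection([1.0, 2.0], [0.0, 0.0])
    for i, stage in enumerate(stages, start=1):
        print(f"step {i}: {stage}")
```

With these fixed operators the low-resolution error shrinks by a factor of `1 - gain` at each step, which is why a deeper recursion gives a closer estimate without adding parameters. In a real model both operators would be learned convolutional layers and the concatenated outputs would feed a reconstruction layer.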

Data availability

The datasets analysed during the current study are available in the Vimeo repository: http://data.csail.mit.edu/tofu/dataset/vimeo_septuplet.

Funding

This study was funded by the National Natural Science Foundation Youth Fund (61601404), the General Scientific Research Projects of Zhejiang Education Department (Y201840087), and the Opening Foundation of State Key Laboratory of Cognitive Intelligence, iFLYTEK (CIOS-2022SC06).

Author information

Corresponding author

Correspondence to Yanbing Jiang.

Ethics declarations

Conflict of interest

The author declares he has no conflict of interest.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Mu, S., Zhang, Y. & Jiang, Y. DRN-VideoSR: a deep recursive network for video super-resolution based on a deformable convolution shared-assignment network. Multimed Tools Appl 82, 14019–14035 (2023). https://doi.org/10.1007/s11042-022-13818-8

  • DOI: https://doi.org/10.1007/s11042-022-13818-8
