
High-practicability image completion using attention mechanism and joint enhancive discriminator

Published in: Applied Intelligence

Abstract

At present, image completion models are typically trained and evaluated on public datasets and are not competent for practical scenarios such as unmanned surface vehicle (USV) scenes. On the one hand, in practice the missing regions are often located at image boundaries, which makes it challenging for a model to extract image features. On the other hand, real images are often blurred, and filling in content without optimizing image quality limits the usefulness of the completed images. To address these challenges, this paper proposes a novel image completion model that uses an attention mechanism and a joint enhancive discriminator to effectively fill missing regions and enhance image quality in various missing situations. First, an attention mechanism conditions the generator to extract information related to the missing regions according to different weights. To ensure that the output is semantically consistent with the original image and of higher quality than the original, a joint discriminator constrains the conditional generator from different aspects. In particular, the cloud model (a cognitive model) in the joint enhancive discriminator promotes the quality and practicability of the generated images. Experimental results show that our model outperforms state-of-the-art models in both qualitative and quantitative measurements on four public datasets and one real USV dataset. Compared with the baselines, our method achieves average improvements of \(4.90\%\) in image completion evaluation and \(1.00\%\) in image quality evaluation. We also verify the performance of our model in practical applications in real scenarios. The code is available at https://github.com/wrq-cqupt/IC.
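The cloud model mentioned above is a cognitive model commonly characterized by three numerical parameters: expectation (Ex), entropy (En), and hyper-entropy (He). As a rough, hypothetical illustration of the underlying idea (not the paper's actual discriminator code), a forward normal cloud generator that produces "cloud drops" and their certainty degrees can be sketched as follows; all function and parameter names here are our own:

```python
import numpy as np

def normal_cloud_drops(ex, en, he, n, rng=None):
    """Forward normal cloud generator: produce n cloud drops (x, membership).

    ex: expectation of the concept, en: entropy (fuzziness),
    he: hyper-entropy (uncertainty of the entropy itself).
    """
    rng = np.random.default_rng(rng)
    # Each drop draws its own entropy sample En' ~ N(en, he^2) ...
    en_prime = np.abs(rng.normal(en, he, size=n)) + 1e-12  # keep std-dev positive
    # ... then the drop position x ~ N(ex, En'^2)
    x = rng.normal(ex, en_prime)
    # Certainty degree of each drop with respect to the concept
    mu = np.exp(-(x - ex) ** 2 / (2 * en_prime ** 2))
    return x, mu

xs, mus = normal_cloud_drops(ex=0.0, en=1.0, he=0.1, n=1000, rng=42)
# Memberships lie in (0, 1] and peak near the expectation ex.
```

In this sketch, He controls how much the per-drop entropy wobbles around En, which is what gives the cloud model its ability to express randomness and fuzziness jointly.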



Acknowledgements

This work was supported by the National Natural Science Foundation of China (61936001 and 62221005), the Natural Science Foundation of Chongqing (cstc2019jcyj-cxttX0002, cstc2021ycjh-bgzxm0013), the Key Cooperation Project of Chongqing Municipal Education Commission (HZ2021008), the Doctoral Innovation Talent Program of Chongqing University of Posts and Telecommunications, China (BYJS201913, BYJS202108), and the Chongqing Postgraduate Research Innovation Project (CYB20174). We thank the editor and reviewers for their insightful comments and advice.

Author information


Corresponding author

Correspondence to Guoyin Wang.

Ethics declarations

Conflict of interest

All authors certify that they have no affiliations with or involvement in any organization or entity with any financial or non-financial interest in the subject matter or materials discussed in this manuscript.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Wang, R., Wang, G., Zou, G. et al. High-practicability image completion using attention mechanism and joint enhancive discriminator. Appl Intell 53, 24435–24457 (2023). https://doi.org/10.1007/s10489-023-04616-2

