Abstract
Image completion models are currently developed and evaluated mostly on public datasets and perform poorly in practical scenarios such as USV scenes. On the one hand, missing regions in practice are often located at image boundaries, which makes it difficult for the model to extract image features. On the other hand, real images are often blurred, and filling in content without improving image quality limits the usefulness of the completed images. To address these challenges, this paper proposes a novel image completion model that uses an attention mechanism and a joint enhancive discriminator to fill in missing regions and enhance image quality across a variety of missing-region situations. First, an attention mechanism serves as the condition of the generator, extracting information related to the missing region according to different weights. Second, to ensure that the output is semantically consistent with the original image and of higher quality than it, a joint discriminator constrains the conditional generator from different aspects. In particular, the cloud model (a cognitive model) in the joint enhancive discriminator improves the quality and practicability of the generated images. Experimental results show that our model outperforms state-of-the-art models in both qualitative and quantitative evaluations on four public datasets and one real USV dataset. Compared with the baselines, our method achieves average improvements of \(4.90\%\) and \(1.00\%\) in image completion evaluation and image quality evaluation, respectively. We also verify the performance of our model in practical applications in real scenarios. The code is available at https://github.com/wrq-cqupt/IC.
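The abstract does not say how the cloud model is wired into the joint enhancive discriminator. As background only, the sketch below shows the standard forward normal cloud generator, which turns three numerical characteristics (expectation Ex, entropy En, hyper-entropy He) into "cloud drops" with membership degrees. The function name `forward_normal_cloud` and its parameters are illustrative assumptions; this is not the authors' implementation and is not taken from their repository.

```python
import math
import random


def forward_normal_cloud(ex, en, he, n, seed=None):
    """Sketch of a forward normal cloud generator (illustrative, not the paper's code).

    ex: expectation (Ex), en: entropy (En), he: hyper-entropy (He).
    Returns a list of n (x, mu) cloud drops, where mu is the membership degree of x.
    """
    rng = random.Random(seed)
    drops = []
    while len(drops) < n:
        # Second-order randomness: perturb the entropy with the hyper-entropy.
        en_prime = rng.gauss(en, he)
        if en_prime == 0.0:
            continue  # avoid division by zero in the membership formula
        # Sample a drop around the expectation using the perturbed entropy.
        x = rng.gauss(ex, abs(en_prime))
        # Membership degree of the drop with respect to the concept (Ex, En, He).
        mu = math.exp(-((x - ex) ** 2) / (2.0 * en_prime ** 2))
        drops.append((x, mu))
    return drops


if __name__ == "__main__":
    for x, mu in forward_normal_cloud(ex=0.5, en=0.1, he=0.01, n=5, seed=0):
        print(f"x={x:.4f}  mu={mu:.4f}")
```

A larger He spreads the drops more thickly around the expected membership curve, which is how the cloud model expresses uncertainty about a concept. How such a model scores image quality inside a discriminator is specific to the paper and is not reproduced here.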
Acknowledgements
This work is supported by the National Natural Science Foundation of China (61936001 and 62221005), the Natural Science Foundation of Chongqing (No. cstc2019jcyj-cxttX0002, cstc2021ycjh-bgzxm0013), the Key Cooperation Project of Chongqing Municipal Education Commission (HZ2021008), the Doctoral Innovation Talent Program of Chongqing University of Posts and Telecommunications, China (BYJS201913, BYJS202108), and the Chongqing Postgraduate Research Innovation Project under Grant CYB20174. We would like to thank the editor and reviewers for their insightful comments and advice.
Ethics declarations
Conflict of interest
All authors certify that they have no affiliations with or involvement in any organization or entity with any financial interest or non-financial interest in the subject matter or materials discussed in this manuscript.
About this article
Cite this article
Wang, R., Wang, G., Zou, G. et al. High-practicability image completion using attention mechanism and joint enhancive discriminator. Appl Intell 53, 24435–24457 (2023). https://doi.org/10.1007/s10489-023-04616-2
DOI: https://doi.org/10.1007/s10489-023-04616-2