Computer Science ›› 2024, Vol. 51 ›› Issue (1): 233-242. doi: 10.11896/jsjkx.230500035

• Computer Graphics & Multimedia •

Weakly Supervised Salient Feature Enhancement Object Detection Method Based on Pseudo Labels

SHI Dianxi1,2, LIU Yangyang1,3, SONG Linna1,3, TAN Jiefu1, ZHOU Chenlei1, ZHANG Yi2

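  1. 1 Tianjin (Binhai) Artificial Intelligence Innovation Center, Tianjin 300457
    2 Intelligent Game and Decision Lab, Beijing 100091
    3 College of Computer, National University of Defense Technology, Changsha 410073
  • Received: 2023-05-08 Revised: 2023-10-10 Online: 2024-01-15 Published: 2024-01-12
  • Corresponding author: ZHANG Yi (jxnzdl@126.com)
  • About author: (dxshi@nudt.edu.cn)
  • Supported by:
    Science and technology project of the Tianjin Binhai New Area cooperative R&D platform (BHXQKJXMPT-RGZNJMZX-2019001); National Natural Science Foundation of China (91948303)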

FeaEM: Feature Enhancement-based Method for Weakly Supervised Salient Object Detection via Multiple Pseudo Labels

SHI Dianxi1,2, LIU Yangyang1,3, SONG Linna1,3, TAN Jiefu1, ZHOU Chenlei1, ZHANG Yi2   

  1. 1 Tianjin Artificial Intelligence Innovation Center, Tianjin 300450, China
    2 Intelligent Game and Decision Lab (IGDL), Beijing 100091, China
    3 College of Computer, National University of Defense Technology, Changsha 410073, China
  • Received: 2023-05-08 Revised: 2023-10-10 Online: 2024-01-15 Published: 2024-01-12
  • About author: SHI Dianxi, born in 1966, Ph.D., professor, Ph.D. supervisor. His main research interests include artificial intelligence, robot operating systems, distributed computing, and cloud computing.
    ZHANG Yi, born in 1987, Ph.D. His main research interests include AI security and information security.
  • Supported by:
    Science and Technology Commission of Tianjin Binhai New Area (BHXQKJXMPT-RGZNJMZX-2019001) and National Natural Science Foundation of China (91948303).

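Abstract: Salient object detection aims to detect the most visually prominent regions in an image. Traditional algorithms based on a single label are inevitably affected by the refinement algorithm they adopt and exhibit biased characteristics, which in turn degrades the detection performance of the saliency network. To address this problem, this paper proposes FeaEM, a pseudo-label-based weakly supervised salient feature enhancement object detection method built on a multi-instruction filter structure. By integrating more comprehensive and accurate saliency cues from multiple labels, it effectively improves detection performance. The core of FeaEM is a new multi-instruction filter structure that uses multiple pseudo labels to avoid the negative effects of a single label. A feature selection mechanism introduced into the instruction filter extracts and filters more accurate saliency cues from noisy pseudo labels, so that the network learns more effective and representative features. In addition, existing weakly supervised object detection methods are very sensitive to the scale of the input image, and their predictions for the same image at different input sizes deviate considerably. A scale feature fusion mechanism is therefore introduced to ensure that consistent saliency maps are output when the same image is input at different sizes, which effectively improves the scale generalization ability of the model. Extensive experiments on multiple datasets show that the proposed FeaEM method outperforms the most representative methods.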

Key words: Deep learning, Object detection, Saliency, Pseudo labels, Attention mechanism

Abstract: Salient object detection aims to detect the most conspicuous regions of an image. Traditional methods based on a single label are inevitably affected by the refinement algorithm they adopt and exhibit bias, which in turn degrades the detection performance of the saliency network. To solve this problem, this paper proposes a feature enhancement-based method for weakly supervised salient object detection via multiple pseudo labels (FeaEM), built on a multi-instruction filter structure. FeaEM integrates more comprehensive and accurate saliency cues from multiple labels to effectively improve detection performance. The core of the method is a new multi-instruction filter structure that uses multiple pseudo labels to avoid the negative effects of a single label. By introducing a feature selection mechanism into the instruction filter, more accurate saliency cues are extracted and filtered from the noisy pseudo labels, so that the network learns more effective and representative features. In addition, existing weakly supervised object detection methods are very sensitive to the scale of the input image, and their predictions for the same image at different input sizes deviate considerably. A scale feature fusion mechanism is therefore introduced to ensure that the saliency maps output for the same image at different sizes are consistent, which effectively improves the scale generalization ability of the model. Extensive experiments on multiple datasets show that the proposed FeaEM method outperforms the most representative methods.
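The abstract names two ideas without giving formulas: supervision from several pseudo labels rather than one, and consistency of the saliency map across input scales. The following is a minimal illustrative sketch of how such terms could be written, not the authors' implementation. All function names, the plain averaging of the per-label losses, the 2x2 average pooling, and the mean-absolute-difference consistency term are assumptions made for illustration only; the feature selection mechanism and the network itself are not modeled.

```python
import math


def bce(pred, label, eps=1e-7):
    """Mean binary cross-entropy between two equally sized 2-D maps in [0, 1]."""
    total, count = 0.0, 0
    for p_row, y_row in zip(pred, label):
        for p, y in zip(p_row, y_row):
            p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
            total += -(y * math.log(p) + (1.0 - y) * math.log(1.0 - p))
            count += 1
    return total / count


def multi_label_loss(pred, pseudo_labels):
    """Average the BCE over several pseudo labels (assumed aggregation rule)."""
    return sum(bce(pred, lab) for lab in pseudo_labels) / len(pseudo_labels)


def downsample2x(sal_map):
    """2x2 average pooling; assumes even height and width."""
    h, w = len(sal_map), len(sal_map[0])
    return [
        [
            (sal_map[i][j] + sal_map[i][j + 1]
             + sal_map[i + 1][j] + sal_map[i + 1][j + 1]) / 4.0
            for j in range(0, w, 2)
        ]
        for i in range(0, h, 2)
    ]


def scale_consistency_loss(pred_full, pred_small):
    """Mean absolute difference between the pooled full-scale prediction
    and the prediction made on the half-size input (assumed penalty)."""
    pooled = downsample2x(pred_full)
    diffs = [
        abs(a - b)
        for row_a, row_b in zip(pooled, pred_small)
        for a, b in zip(row_a, row_b)
    ]
    return sum(diffs) / len(diffs)


if __name__ == "__main__":
    # Toy 2x4 prediction and two hypothetical pseudo labels for the same image.
    pred = [[0.9, 0.8, 0.1, 0.2],
            [0.7, 0.9, 0.2, 0.1]]
    label_a = [[1, 1, 0, 0], [1, 1, 0, 0]]
    label_b = [[1, 1, 1, 0], [1, 1, 0, 0]]
    print("multi-label loss:", multi_label_loss(pred, [label_a, label_b]))
    print("scale consistency:", scale_consistency_loss(pred, [[0.8, 0.2]]))
```

In a training loop the two terms would typically be combined as a weighted sum, with the weight treated as a hyperparameter; the abstract does not state how FeaEM balances them.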

Key words: Deep learning, Object detection, Saliency, Pseudo labels, Attention mechanism

CLC number:

  • TP391.41