CFIDNet: cascaded feature interaction decoder for RGB-D salient object detection

  • Original Article
Neural Computing and Applications

Abstract

Compared with RGB salient object detection (SOD) methods, RGB-D SOD models perform better in many challenging scenarios by leveraging the spatial information embedded in depth maps. However, existing RGB-D SOD models are prone to ignoring modality-specific characteristics and fuse multi-modality features by simple element-wise addition or multiplication. As a result, they may produce noise-degraded saliency maps when given inaccurate or blurred depth images. Besides, many models adopt a U-shaped architecture to integrate multi-level features layer by layer. Although low-level features are gradually polished this way, little attention has been paid to enhancing high-level features, which may lead to suboptimal results. In this paper, we propose a novel network named CFIDNet to tackle these problems. Specifically, we design a feature-enhanced module to excavate informative depth cues from depth images and to enhance the RGB features by exploiting the complementary information between the RGB and depth modalities. We also propose a feature refinement module that exploits multi-scale complementary information between multi-level features and polishes these features through residual connections. A cascaded feature interaction decoder (CFID) is then proposed to refine multi-level features iteratively. Equipped with these modules, CFIDNet is capable of segmenting salient objects accurately. Experimental results on 7 widely used benchmark datasets show that CFIDNet achieves highly competitive performance compared with 15 state-of-the-art models in terms of 8 evaluation metrics. Our source code will be publicly available at https://github.com/clelouch/CFIDNet.
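
To make the fusion problem described in the abstract concrete, the toy sketch below shows how simple element-wise addition or multiplication lets corrupted depth values leak into the fused features. It is not the CFIDNet feature-enhanced module: the function names, the flat-list "features", and the `gate` confidence values are all assumptions made for illustration, and the gated residual variant is only one plausible way to damp unreliable depth cues.

```python
# Toy illustration only: features are flat lists of floats, not real CNN tensors.
# All names and values below are hypothetical and not taken from the paper.

def fuse_add(rgb, depth):
    """Element-wise addition of two equally sized feature vectors."""
    return [r + d for r, d in zip(rgb, depth)]

def fuse_mul(rgb, depth):
    """Element-wise multiplication of two equally sized feature vectors."""
    return [r * d for r, d in zip(rgb, depth)]

def fuse_gated_residual(rgb, depth, gate):
    """Assumed gated residual fusion: rgb + gate * depth.

    `gate` holds per-element confidences in [0, 1] for the depth cue.
    Here it is supplied by hand; a real model would have to learn it.
    """
    return [r + g * d for r, d, g in zip(rgb, depth, gate)]

# Positions 0-1 are foreground, positions 2-3 are background.
rgb = [0.9, 0.8, 0.1, 0.2]
noisy_depth = [1.0, 0.0, 1.0, 0.0]  # positions 1 and 2 are corrupted
gate = [1.0, 0.0, 0.0, 1.0]         # assumed: zero confidence where depth is corrupted

added = fuse_add(rgb, noisy_depth)        # background position 2 is boosted by noise
multiplied = fuse_mul(rgb, noisy_depth)   # foreground position 1 is wiped out
gated = fuse_gated_residual(rgb, noisy_depth, gate)  # corrupted positions keep the RGB value
```

In this sketch the residual form keeps the RGB features intact wherever the gate is zero, which is the general intuition behind enhancing RGB features with depth cues instead of merging the two modalities blindly.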

Acknowledgements

This work was supported by the National Natural Science Foundation of China (under Grant 51807003).

Author information

Corresponding author

Correspondence to Jin Xiao.

Ethics declarations

Conflict of interest

We declare that we have no financial and personal relationships with other people or organizations that can inappropriately influence our work; there is no professional or other personal interest of any nature or kind in any product, service and/or company that could be construed as influencing the position presented in, or the review of, the manuscript entitled.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Chen, T., Hu, X., Xiao, J. et al. CFIDNet: cascaded feature interaction decoder for RGB-D salient object detection. Neural Comput & Applic 34, 7547–7563 (2022). https://doi.org/10.1007/s00521-021-06845-3

  • DOI: https://doi.org/10.1007/s00521-021-06845-3
