Enhanced Spatial Awareness for Deep Interactive Image Segmentation

  • Conference paper
Pattern Recognition and Computer Vision (PRCV 2022)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13537)

Abstract

Existing deep interactive segmentation approaches can extract the object a user wants from simple click interactions. However, the first click, placed somewhere in the full image, is generally too local to capture the whole target object, so these approaches rely on a large number of subsequent corrective clicks to reach satisfactory results. This paper explores how to strengthen the spatial awareness of user interaction, especially after the first click, and how to increase stability during the iterative correction process. We first design an interactive cascaded localization strategy to determine the spatial range of the potential target, and then integrate this space-aware prior into a dual-stream network structure as a soft constraint for the segmentation. This increases the network's attention to the target of interest under very limited user interaction. A new training and inference strategy is also developed to take full advantage of the space-aware guidance. Furthermore, an object-shape-related loss is designed to better supervise the network based on the prior guidance provided by the user. Together, an explicit subject, controllable correction, and flexible interaction significantly boost interactive segmentation performance. The proposed method achieves state-of-the-art performance on several popular benchmarks.
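To make the idea of a space-aware prior used as a soft constraint more concrete, the following is a minimal illustrative sketch and not the authors' implementation. It assumes the potential target range is an axis-aligned box and that the soft prior is a map equal to 1 inside the box with a Gaussian decay outside it. The function names `box_from_click` and `soft_prior`, the box format, and the parameters `half_size` and `sigma` are all hypothetical; the paper's cascaded localization strategy and dual-stream network are not reproduced here.

```python
import math


def box_from_click(click, half_size, height, width):
    """Hypothetical stand-in for localization: a square box around the click.

    click is (x, y); the box is clipped to the image and returned as
    (x0, y0, x1, y1) in inclusive pixel coordinates.
    """
    cx, cy = click
    return (max(cx - half_size, 0), max(cy - half_size, 0),
            min(cx + half_size, width - 1), min(cy + half_size, height - 1))


def soft_prior(height, width, box, sigma=10.0):
    """Return a height x width map in (0, 1]: 1 inside box, Gaussian decay outside.

    This is an assumed form of a soft spatial constraint, chosen only for
    illustration; values never reach 0, so no pixel is ruled out entirely.
    """
    x0, y0, x1, y1 = box
    prior = []
    for y in range(height):
        row = []
        for x in range(width):
            # Distance from the pixel to the box (0 when the pixel is inside).
            dx = max(x0 - x, 0, x - x1)
            dy = max(y0 - y, 0, y - y1)
            dist_sq = dx * dx + dy * dy
            row.append(math.exp(-dist_sq / (2.0 * sigma * sigma)))
        prior.append(row)
    return prior


# Usage: a first click at (x=12, y=8) in a 20 x 30 image.
box = box_from_click((12, 8), half_size=4, height=20, width=30)
prior_map = soft_prior(20, 30, box, sigma=3.0)
```

In a sketch like this, the prior map could be supplied to a segmentation network as an extra input channel or used to reweight features. Because the map decays smoothly instead of cutting off at the box edge, later corrective clicks outside the box can still influence the result, which is one reading of why a soft constraint is preferred over a hard crop.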

Acknowledgments

This work was supported in part by the National Natural Science Foundation of China under Grant 62172221, and in part by the Fundamental Research Funds for the Central Universities under Grant No. JSGP202204.

Author information

Corresponding author

Correspondence to Tao Wang.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Li, H., Ni, J., Li, Z., Qian, Y., Wang, T. (2022). Enhanced Spatial Awareness for Deep Interactive Image Segmentation. In: Yu, S., et al. Pattern Recognition and Computer Vision. PRCV 2022. Lecture Notes in Computer Science, vol 13537. Springer, Cham. https://doi.org/10.1007/978-3-031-18916-6_40

  • DOI: https://doi.org/10.1007/978-3-031-18916-6_40

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-18915-9

  • Online ISBN: 978-3-031-18916-6

  • eBook Packages: Computer Science (R0)
