Pull and concentrate: improving unsupervised semantic segmentation adaptation with cross- and intra-domain consistencies

  • Regular Paper
Multimedia Systems

Abstract

Unsupervised domain adaptation (UDA) is an important solution to the cross-domain problem in semantic segmentation. Existing segmentation UDA methods mainly treat the domain shift as the major challenge. This paper takes a different viewpoint and disentangles the cross-domain problem into two negative factors. Specifically, we find that, apart from the domain shift, the dispersed within-class distribution on the target domain is another factor that compromises cross-domain segmentation, and that this neglected dispersion is a challenge as crucial as the domain shift itself. To address these two factors jointly, we propose a “Pull-and-Concentrate” (PuCo) method comprising two consistencies: (1) a cross-domain consistency that “pulls” the source and target domain distributions (of the same class) close to each other, based on a novel statistical style transfer; and (2) an intra-domain consistency that “concentrates” the within-class distribution on the target domain through a new unsupervised teacher-student method. Both consistencies are robust to (or insulated from) pseudo-label noise. This advantage allows PuCo to bring consistent improvement over a battery of pseudo-label-based UDA methods. For example, on GTA5 to Cityscapes and SYNTHIA to Cityscapes, PuCo achieves 60.3% and 57.2% mean IoU, respectively. Code is available at https://github.com/Jarvis73/PuCo.
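
The abstract does not spell out how the two consistencies are built, so the following is only a rough illustration of the two generic ingredients it names, not the PuCo implementation. It assumes that "statistical style transfer" can be approximated by AdaIN-style matching of per-channel feature statistics, and that the teacher in the teacher-student method is an exponential moving average (EMA) of the student. The function names, the `eps` and `momentum` values, and the `(C, H, W)` feature layout are all hypothetical choices made for this sketch.

```python
import numpy as np


def channel_stats(feat, eps=1e-5):
    """Per-channel mean and std of a (C, H, W) feature map."""
    mean = feat.mean(axis=(1, 2), keepdims=True)
    std = np.sqrt(feat.var(axis=(1, 2), keepdims=True) + eps)
    return mean, std


def adain_style_transfer(content, style, eps=1e-5):
    """Re-normalize `content` so its channel statistics match those of `style`.

    Assumption: an AdaIN-style stand-in for the paper's statistical
    style transfer, whose exact form is not given in the abstract.
    """
    c_mean, c_std = channel_stats(content, eps)
    s_mean, s_std = channel_stats(style, eps)
    return (content - c_mean) / c_std * s_std + s_mean


def ema_update(teacher, student, momentum=0.99):
    """Move teacher weights toward student weights by an EMA step.

    Assumption: a generic mean-teacher update; `teacher` and `student`
    are dicts mapping parameter names to arrays of equal shape.
    """
    return {
        name: momentum * teacher[name] + (1.0 - momentum) * student[name]
        for name in teacher
    }


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    source_feat = rng.normal(0.0, 1.0, size=(3, 8, 8))  # stand-in source features
    target_feat = rng.normal(2.0, 3.0, size=(3, 8, 8))  # stand-in target features

    # "Pull": give source features the target's channel statistics.
    stylized = adain_style_transfer(source_feat, target_feat)
    print(stylized.mean(axis=(1, 2)), target_feat.mean(axis=(1, 2)))

    # "Concentrate": the teacher slowly tracks the student.
    teacher = ema_update({"w": np.ones(2)}, {"w": np.zeros(2)}, momentum=0.9)
    print(teacher["w"])
```

In a pipeline of this kind, the stylized source features would be fed to the segmentation head alongside the original ones, and the teacher's predictions on target images would serve as consistency targets for the student. How PuCo actually wires these pieces together is described in the full paper and the linked repository.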

Data availability

The data used in the experiments are all open-source. For details, please refer to the source code https://github.com/Jarvis73/PuCo.

Acknowledgements

Jian-Wei Zhang’s work on this paper was done during an internship at Baidu Research. This work is supported by the National Natural Science Foundation of China (62132017) and the Fundamental Research Funds for the Central Universities (226-2022-00235).

Author information

Contributions

Jian-Wei Zhang and Yifan Sun wrote the main manuscript text and Jian-Wei Zhang prepared all the figures. All the authors reviewed the manuscript.

Corresponding author

Correspondence to Wei Chen.

Ethics declarations

Conflict of interest

The authors declare no competing interests.

Additional information

Communicated by A. Liu.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Zhang, JW., Sun, Y. & Chen, W. Pull and concentrate: improving unsupervised semantic segmentation adaptation with cross- and intra-domain consistencies. Multimedia Systems 29, 2633–2650 (2023). https://doi.org/10.1007/s00530-023-01131-9

  • DOI: https://doi.org/10.1007/s00530-023-01131-9
