Discovering Object Masks with Transformers for Unsupervised Semantic Segmentation

Van Gansbeke, Wouter; Vandenhende, Simon; Van Gool, Luc

arXiv:2206.06363 (cs)

[Submitted on 13 Jun 2022]

Abstract: The task of unsupervised semantic segmentation aims to cluster pixels into semantically meaningful groups. Specifically, pixels assigned to the same cluster should share high-level semantic properties like their object or part category. This paper presents MaskDistill: a novel framework for unsupervised semantic segmentation based on three key ideas. First, we advocate a data-driven strategy to generate object masks that serve as a pixel grouping prior for semantic segmentation. This approach omits handcrafted priors, which are often designed for specific scene compositions and limit the applicability of competing frameworks. Second, MaskDistill clusters the object masks to obtain pseudo-ground-truth for training an initial object segmentation model. Third, we leverage this model to filter out low-quality object masks. This strategy mitigates the noise in our pixel grouping prior and results in a clean collection of masks which we use to train a final segmentation model. By combining these components, we can considerably outperform previous works for unsupervised semantic segmentation on PASCAL (+11% mIoU) and COCO (+4% mask AP50). Interestingly, as opposed to existing approaches, our framework does not latch onto low-level image cues and is not limited to object-centric datasets. The code and models will be made available.
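The abstract reports its PASCAL gain in mIoU (mean intersection-over-union). As a reading aid, the following is a minimal sketch of how that metric is commonly computed. It is not taken from the MaskDistill code. The function name `mean_iou` and the toy labels are illustrative assumptions, and the sketch assumes that predicted cluster ids have already been mapped to ground-truth class ids (for unsupervised methods this is commonly done with Hungarian matching, which is not shown here).

```python
def mean_iou(pred, target, num_classes):
    """Mean intersection-over-union across classes.

    pred and target are flat sequences of integer class ids, one per pixel.
    Hypothetical helper for illustration; not part of any released code.
    """
    ious = []
    for c in range(num_classes):
        # Pixels where both prediction and ground truth are class c.
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        # Pixels where either prediction or ground truth is class c.
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:  # skip classes absent from both prediction and target
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example: six pixels, three classes (made-up labels).
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
print(mean_iou(pred, target, num_classes=3))
```

In the toy example the per-class IoUs are 1/3, 2/3, and 1/2, so the mean is about 0.5. Benchmark evaluations usually accumulate the intersection and union counts over the whole dataset before dividing, which corresponds to passing the concatenated pixels of all images to this function.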

Comments:	Code: this https URL
Subjects:	Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
Cite as:	arXiv:2206.06363 [cs.CV]
	(or arXiv:2206.06363v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2206.06363

Submission history

From: Wouter Van Gansbeke
[v1] Mon, 13 Jun 2022 17:59:43 UTC (5,406 KB)
