MaskDiff: Modeling Mask Distribution with Diffusion Probabilistic Model for Few-Shot Instance Segmentation

Authors

  • Minh-Quan Le (University of Science, VNU-HCM, Ho Chi Minh City, Vietnam; Vietnam National University, Ho Chi Minh City, Vietnam; Stony Brook University, United States)
  • Tam V. Nguyen (University of Dayton, United States)
  • Trung-Nghia Le (University of Science, VNU-HCM, Ho Chi Minh City, Vietnam; Vietnam National University, Ho Chi Minh City, Vietnam)
  • Thanh-Toan Do (Monash University, Australia)
  • Minh N. Do (University of Illinois at Urbana-Champaign, United States)
  • Minh-Triet Tran (University of Science, VNU-HCM, Ho Chi Minh City, Vietnam; Vietnam National University, Ho Chi Minh City, Vietnam)

DOI:

https://doi.org/10.1609/aaai.v38i3.28068

Keywords:

CV: Segmentation, ML: Deep Generative Models & Autoencoders

Abstract

Few-shot instance segmentation extends the few-shot learning paradigm to the instance segmentation task: the goal is to segment object instances in a query image given only a few annotated examples of novel categories. Conventional approaches address the task via prototype learning, a form of point estimation. However, this mechanism depends on prototypes (e.g., the mean of the K shots) for prediction, leading to performance instability. To overcome this disadvantage of point estimation, we propose a novel approach, dubbed MaskDiff, which models the underlying conditional distribution of a binary mask conditioned on an object region and K-shot information. Inspired by augmentation approaches that perturb data with Gaussian noise to populate low-density regions of the data space, we model the mask distribution with a diffusion probabilistic model. We also propose classifier-free guided mask sampling to integrate category information into the binary mask generation process. Without bells and whistles, our proposed method consistently outperforms state-of-the-art methods on both base and novel classes of the COCO dataset while also being more stable than existing methods. The source code is available at: https://github.com/minhquanlecs/MaskDiff.
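
The three ingredients named in the abstract can be illustrated with a minimal sketch. It is not taken from the MaskDiff source code. The function names (kshot_prototype, forward_diffuse, cfg_combine), the flattened 0/1 mask representation, and the guidance weight w are all hypothetical choices made for illustration. The guidance rule shown is the standard classifier-free guidance combination; whether MaskDiff uses exactly this parameterization is an assumption.

```python
import math
import random


def kshot_prototype(features):
    """Point estimate used by prototype learning: the element-wise mean
    of K support feature vectors (hypothetical helper)."""
    k = len(features)
    return [sum(column) / k for column in zip(*features)]


def forward_diffuse(mask, alpha_bar, rng):
    """Draw x_t ~ N(sqrt(alpha_bar) * x_0, (1 - alpha_bar) * I), the usual
    closed-form forward step of a diffusion probabilistic model.
    `mask` is a flattened binary mask (an assumed representation)."""
    scale = math.sqrt(alpha_bar)
    sigma = math.sqrt(1.0 - alpha_bar)
    return [scale * x + sigma * rng.gauss(0.0, 1.0) for x in mask]


def cfg_combine(eps_cond, eps_uncond, w):
    """Standard classifier-free guidance:
    eps = (1 + w) * eps_cond - w * eps_uncond.
    With w = 0 this reduces to the purely conditional prediction."""
    return [(1.0 + w) * c - w * u for c, u in zip(eps_cond, eps_uncond)]


if __name__ == "__main__":
    rng = random.Random(0)
    support = [[1.0, 2.0], [3.0, 4.0]]   # K = 2 toy feature vectors
    print(kshot_prototype(support))       # mean of the two vectors
    noisy = forward_diffuse([0.0, 1.0, 1.0, 0.0], alpha_bar=0.5, rng=rng)
    print(len(noisy))                     # same length as the input mask
    print(cfg_combine([2.0], [1.0], w=1.0))
```

The prototype collapses the K shots into a single vector, which is the point estimate the abstract criticizes, whereas the diffusion step treats the mask as a sample from a distribution that is gradually perturbed with Gaussian noise.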

Published

2024-03-24

How to Cite

Le, M.-Q., Nguyen, T. V., Le, T.-N., Do, T.-T., Do, M. N., & Tran, M.-T. (2024). MaskDiff: Modeling Mask Distribution with Diffusion Probabilistic Model for Few-Shot Instance Segmentation. Proceedings of the AAAI Conference on Artificial Intelligence, 38(3), 2874-2881. https://doi.org/10.1609/aaai.v38i3.28068

Section

AAAI Technical Track on Computer Vision II