SODA: Bottleneck Diffusion Models for Representation Learning

Hudson, Drew A.; Zoran, Daniel; Malinowski, Mateusz; Lampinen, Andrew K.; Jaegle, Andrew; McClelland, James L.; Matthey, Loic; Hill, Felix; Lerchner, Alexander

Computer Science > Computer Vision and Pattern Recognition

arXiv:2311.17901 (cs)

[Submitted on 29 Nov 2023]

Abstract: We introduce SODA, a self-supervised diffusion model designed for representation learning. The model incorporates an image encoder that distills a source view into a compact representation, which in turn guides the generation of related novel views. We show that by imposing a tight bottleneck between the encoder and a denoising decoder, and by leveraging novel view synthesis as a self-supervised objective, we can turn diffusion models into strong representation learners capable of capturing visual semantics in an unsupervised manner. To the best of our knowledge, SODA is the first diffusion model to succeed at ImageNet linear-probe classification, and, at the same time, it accomplishes reconstruction, editing and synthesis tasks across a wide range of datasets. Further investigation reveals the disentangled nature of its emergent latent space, which serves as an effective interface to control and manipulate the images the model produces. All in all, we aim to shed light on the exciting and promising potential of diffusion models, not only for image generation, but also for learning rich and robust representations.
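
The encoder-bottleneck-decoder idea described in the abstract can be illustrated with a toy sketch. This is not the paper's architecture: the linear layers, the noise-prediction (epsilon) loss, the single noise-level scalar, and all names and dimensions below (`encode`, `denoise`, `IMAGE_DIM`, `LATENT_DIM`, and so on) are assumptions chosen only to show how a low-dimensional latent computed from a source view might condition the denoising of a different target view.

```python
import math
import random


def linear(weights, x):
    """Apply a bias-free dense layer; `weights` is a list of rows."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def init_weights(rng, n_out, n_in, scale=0.1):
    """Small random weights (hypothetical initialization)."""
    return [[rng.gauss(0.0, scale) for _ in range(n_in)] for _ in range(n_out)]


def encode(enc_w, source_view):
    """Encoder: compress the source view into a low-dimensional latent z."""
    return linear(enc_w, source_view)


def add_noise(target_view, noise, alpha_bar):
    """Forward diffusion: x_t = sqrt(alpha_bar) * x_0 + sqrt(1 - alpha_bar) * eps."""
    a, b = math.sqrt(alpha_bar), math.sqrt(1.0 - alpha_bar)
    return [a * x + b * e for x, e in zip(target_view, noise)]


def denoise(dec_w, noisy_target, z, alpha_bar):
    """Decoder: predict the noise from the noisy target, the latent z,
    and the noise level (concatenated into one input vector)."""
    return linear(dec_w, noisy_target + z + [alpha_bar])


def training_loss(enc_w, dec_w, source_view, target_view, rng):
    """One loss evaluation: the decoder sees the source view only through z."""
    z = encode(enc_w, source_view)              # tight bottleneck
    alpha_bar = rng.uniform(0.01, 0.99)         # random noise level
    noise = [rng.gauss(0.0, 1.0) for _ in target_view]
    noisy_target = add_noise(target_view, noise, alpha_bar)
    predicted = denoise(dec_w, noisy_target, z, alpha_bar)
    # Mean squared error between predicted and true noise (assumed objective).
    return sum((p - e) ** 2 for p, e in zip(predicted, noise)) / len(noise)


rng = random.Random(0)
IMAGE_DIM, LATENT_DIM = 16, 2                   # latent is much smaller than the image
enc_w = init_weights(rng, LATENT_DIM, IMAGE_DIM)
dec_w = init_weights(rng, IMAGE_DIM, IMAGE_DIM + LATENT_DIM + 1)
source_view = [rng.random() for _ in range(IMAGE_DIM)]   # stand-in for one view
target_view = [rng.random() for _ in range(IMAGE_DIM)]   # stand-in for a related view
loss = training_loss(enc_w, dec_w, source_view, target_view, rng)
```

In this sketch the only path from the source view to the decoder is the `LATENT_DIM`-sized vector `z`, so any information the decoder uses about the source must pass through the bottleneck. In a linear-probe evaluation of the kind the abstract mentions, a vector like `z` would be the feature fed to a linear classifier.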

Subjects:	Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
Cite as:	arXiv:2311.17901 [cs.CV]
	(or arXiv:2311.17901v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2311.17901

Submission history

From: Drew A. Hudson
[v1] Wed, 29 Nov 2023 18:53:34 UTC (5,366 KB)
