Training-free Content Injection using h-space in Diffusion Models

Jeong, Jaeseok; Kwon, Mingi; Uh, Youngjung

Computer Science > Computer Vision and Pattern Recognition

arXiv:2303.15403 (cs)

[Submitted on 27 Mar 2023 (v1), last revised 4 Jan 2024 (this version, v2)]


Abstract: Diffusion models (DMs) synthesize high-quality images in various domains. However, how to control their generative process remains unclear because the intermediate variables in the process have not been rigorously studied. Recently, the bottleneck feature of the U-Net, namely $h$-space, was found to convey the semantics of the resulting image, which enables StyleCLIP-like latent editing within DMs. In this paper, we explore further uses of $h$-space beyond attribute editing and introduce a method to inject the content of one image into another image by combining their features in the generative processes. Briefly, given the original generative process of the other image, 1) we gradually blend the bottleneck feature of the content with proper normalization, and 2) we calibrate the skip connections to match the injected content. Unlike custom-diffusion approaches, our method does not require time-consuming optimization or fine-tuning. Instead, it manipulates intermediate features within a feed-forward generative process. Furthermore, it does not require supervision from external networks. The code is available at this https URL
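As an illustration of step 1, the sketch below shows one way two bottleneck features could be blended while keeping the feature norm under control. The abstract does not state the blending rule, so the use of spherical interpolation here is an assumption, and `slerp_blend`, `h_orig`, `h_content`, and `gamma` are hypothetical names. Features are treated as flattened vectors, and the result is rescaled to the norm of the original feature, which is only one possible reading of "proper normalization".

```python
import math


def slerp_blend(h_orig, h_content, gamma):
    """Blend two flattened feature vectors by spherical interpolation.

    Assumption: this is an illustrative blending rule, not a rule
    confirmed by the abstract. gamma=0 keeps the direction of h_orig,
    gamma=1 takes the direction of h_content. The output is rescaled
    to the norm of h_orig (a hypothetical normalization choice).
    """
    norm_o = math.sqrt(sum(x * x for x in h_orig))
    norm_c = math.sqrt(sum(x * x for x in h_content))
    unit_o = [x / norm_o for x in h_orig]
    unit_c = [x / norm_c for x in h_content]

    # Angle between the two unit vectors, clamped for numerical safety.
    cos_theta = sum(a * b for a, b in zip(unit_o, unit_c))
    theta = math.acos(max(-1.0, min(1.0, cos_theta)))
    sin_theta = math.sin(theta)

    if sin_theta < 1e-8:
        # Nearly parallel vectors: fall back to linear interpolation.
        # The exactly antiparallel case is degenerate and not handled.
        w_o, w_c = 1.0 - gamma, gamma
    else:
        w_o = math.sin((1.0 - gamma) * theta) / sin_theta
        w_c = math.sin(gamma * theta) / sin_theta

    return [norm_o * (w_o * a + w_c * b) for a, b in zip(unit_o, unit_c)]


# Usage: inject a little more content at each of three hypothetical steps.
h_orig = [3.0, 0.0]
h_content = [0.0, 5.0]
for gamma in (0.2, 0.4, 0.6):
    blended = slerp_blend(h_orig, h_content, gamma)
    print(gamma, [round(v, 4) for v in blended])
```

Spherical interpolation is used in this sketch because a plain weighted sum of two features generally shrinks the norm of the result, whereas interpolating along the arc between them keeps it fixed. Step 2 of the abstract, calibrating the skip connections, is not covered here.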

Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2303.15403 [cs.CV]
	(or arXiv:2303.15403v2 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2303.15403

Submission history

From: Jaeseok Jeong
[v1] Mon, 27 Mar 2023 17:19:50 UTC (46,035 KB)
[v2] Thu, 4 Jan 2024 09:23:07 UTC (26,556 KB)
