TexSliders: Diffusion-Based Texture Editing in CLIP Space
arXiv (2024)
Abstract
Generative models have enabled intuitive image creation and manipulation
using natural language. In particular, diffusion models have recently shown
remarkable results for natural image editing. In this work, we propose to apply
diffusion techniques to edit textures, a specific class of images that are an
essential part of 3D content creation pipelines. We analyze existing editing
methods and show that they are not directly applicable to textures, since their
common underlying approach, manipulating attention maps, is unsuitable for the
texture domain. To address this, we propose a novel approach that instead
manipulates CLIP image embeddings to condition the diffusion generation. We
define editing directions using simple text prompts (e.g., "aged wood" to "new
wood") and map these to CLIP image embedding space using a texture prior, with
a sampling-based approach that gives us identity-preserving directions in CLIP
space. To further improve identity preservation, we project these directions to
a CLIP subspace that minimizes identity variations resulting from entangled
texture attributes. Our editing pipeline facilitates the creation of arbitrary
sliders using natural language prompts only, with no ground-truth annotated
data necessary.
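For intuition, the sketch below illustrates the general idea of deriving a CLIP-space editing direction from a prompt pair such as "aged wood" to "new wood". It is a minimal, simplified illustration only: it takes the difference of CLIP text embeddings, whereas the paper maps prompts into CLIP image embedding space via a texture prior with a sampling-based, identity-preserving procedure and a subspace projection, none of which is reproduced here. Model names and the `alpha` slider parameter are assumptions for illustration.

```python
import torch
from transformers import CLIPModel, CLIPProcessor

# Load a standard CLIP model (assumed checkpoint; the paper's texture prior is not used here).
model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

# Prompt pair defining the edit, e.g. "aged wood" -> "new wood".
prompts = ["aged wood", "new wood"]
inputs = processor(text=prompts, return_tensors="pt", padding=True)
with torch.no_grad():
    text_emb = model.get_text_features(**inputs)
text_emb = text_emb / text_emb.norm(dim=-1, keepdim=True)

# Naive editing direction: normalized difference of the two text embeddings.
# (The paper instead derives an identity-preserving direction in CLIP *image*
# embedding space using a texture prior and sampling.)
direction = text_emb[1] - text_emb[0]
direction = direction / direction.norm()

# A "slider" would then move a source texture's CLIP image embedding along this
# direction before conditioning the diffusion model, e.g.:
#   edited_embedding = source_image_embedding + alpha * direction
# where alpha in [-1, 1] is a hypothetical slider strength.
print(direction.shape)  # e.g. torch.Size([512])
```

Interpolating `alpha` continuously is what turns a single prompt pair into a texture slider, without any annotated training data.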