Wavelet-Guided Acceleration of Text Inversion in Diffusion-Based Image Editing
IEEE International Conference on Acoustics, Speech, and Signal Processing (2024)
Abstract
In the field of image editing, Null-text Inversion (NTI) enables fine-grained editing while preserving the structure of the original image by optimizing null embeddings during the DDIM sampling process. However, the NTI process is time-consuming, taking more than two minutes per image. To address this, we introduce a method that maintains the principles of NTI while accelerating the image editing process. We propose the WaveOpt-Estimator, which determines the text optimization endpoint based on frequency characteristics. By using wavelet transform analysis to identify the image's frequency characteristics, we can limit text optimization to specific timesteps during the DDIM sampling process. By adopting the Negative-Prompt Inversion (NPI) concept, a target prompt representing the original image serves as the initial text value for optimization. This approach maintains performance comparable to NTI while reducing the average editing time by over 80% compared to the NTI method. Our method presents a promising approach for efficient, high-quality image editing based on diffusion models.
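To make the idea of a frequency-based optimization endpoint concrete, the following is a minimal sketch, not the authors' WaveOpt-Estimator. It assumes a single-level 2-D Haar wavelet transform on a grayscale image and a hypothetical linear rule that maps the share of high-frequency energy to the number of DDIM timesteps that receive text optimization. The function names, the defaults `num_steps=50` and `min_steps=5`, and the direction of the mapping are all illustrative assumptions; the abstract does not specify them.

```python
def haar_dwt2(image):
    """One level of an orthonormal 2-D Haar transform.

    `image` is a list of rows of floats with even height and width.
    Returns the four sub-bands (LL, LH, HL, HH) as lists of rows.
    """
    ll, lh, hl, hh = [], [], [], []
    for i in range(0, len(image), 2):
        r_ll, r_lh, r_hl, r_hh = [], [], [], []
        for j in range(0, len(image[0]), 2):
            # The 2x2 block: a b on the top row, c d on the bottom row.
            a, b = image[i][j], image[i][j + 1]
            c, d = image[i + 1][j], image[i + 1][j + 1]
            r_ll.append((a + b + c + d) / 2.0)  # approximation
            r_lh.append((a + b - c - d) / 2.0)  # difference between rows
            r_hl.append((a - b + c - d) / 2.0)  # difference between columns
            r_hh.append((a - b - c + d) / 2.0)  # diagonal difference
        ll.append(r_ll)
        lh.append(r_lh)
        hl.append(r_hl)
        hh.append(r_hh)
    return ll, lh, hl, hh


def band_energy(band):
    """Sum of squared coefficients in one sub-band."""
    return sum(v * v for row in band for v in row)


def high_frequency_ratio(image):
    """Fraction of total energy held by the detail bands (LH, HL, HH)."""
    ll, lh, hl, hh = haar_dwt2(image)
    detail = band_energy(lh) + band_energy(hl) + band_energy(hh)
    total = detail + band_energy(ll)
    return detail / total if total > 0 else 0.0


def optimization_endpoint(image, num_steps=50, min_steps=5):
    """Hypothetical rule (an assumption, not from the paper): images with
    more high-frequency energy get text optimization on more timesteps."""
    ratio = high_frequency_ratio(image)
    steps = int(round(min_steps + ratio * (num_steps - min_steps)))
    return max(min_steps, min(num_steps, steps))


if __name__ == "__main__":
    flat = [[1.0] * 8 for _ in range(8)]
    checker = [[1.0 if (i + j) % 2 == 0 else -1.0 for j in range(8)]
               for i in range(8)]
    print(optimization_endpoint(flat), optimization_endpoint(checker))
```

In a real pipeline the returned step count would bound the loop that optimizes the text embedding during DDIM sampling, with the remaining timesteps reusing the last optimized embedding; that integration is omitted here because the abstract does not describe it.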
Key words
Image editing, Null-Text Inversion, text optimization, diffusion model