HQ-Edit: A High-Quality Dataset for Instruction-based Image Editing
CoRR (2024)
Abstract
This study introduces HQ-Edit, a high-quality instruction-based image editing
dataset with around 200,000 edits. Unlike prior approaches that rely on attribute
guidance or human feedback to build datasets, we devise a scalable data
collection pipeline leveraging advanced foundation models, namely GPT-4V and
DALL-E 3. To ensure its high quality, diverse examples are first collected
online, expanded, and then used to create high-quality diptychs featuring input
and output images with detailed text prompts, followed by precise alignment
ensured through post-processing. In addition, we propose two evaluation
metrics, Alignment and Coherence, to quantitatively assess the quality of image
edit pairs using GPT-4V. HQ-Edit's high-resolution images, rich in detail and
accompanied by comprehensive editing prompts, substantially enhance the
capabilities of existing image editing models. For example, an InstructPix2Pix
model fine-tuned on HQ-Edit can attain state-of-the-art image editing
performance, even surpassing models fine-tuned with human-annotated data.
The project page is https://thefllood.github.io/HQEdit_web.