VirtualModel: Generating Object-ID-retentive Human-object Interaction Image by Diffusion Model for E-commerce Marketing
arXiv (2024)
Abstract
Due to the significant advances in large-scale text-to-image generation by
diffusion models (DMs), controllable human image generation has been attracting
much attention recently. Although existing works, such as ControlNet [36], T2I-Adapter
[20], and HumanSD [10], have demonstrated good abilities in generating human
images based on pose conditions, they still fail to meet the requirements of
real e-commerce scenarios. These requirements are: (1) the interaction between the displayed
product and the human should be considered, (2) human parts such as the face, hands, arms, and feet
and the interaction between the human model and the product should be hyper-realistic,
and (3) the identity of the product shown in advertising should be exactly
consistent with the product itself. To this end, in this paper, we first define
a new human image generation task for e-commerce marketing, i.e.,
Object-ID-retentive Human-object Interaction image Generation (OHG), and then
propose the VirtualModel framework to generate human images for product display,
which supports the display of any product category and any type of
human-object interaction. As shown in Figure 1, VirtualModel not only
outperforms other methods in terms of accurate pose control and image quality
but also allows for the display of user-specified product objects by
maintaining the product-ID consistency and enhancing the plausibility of
human-object interaction. Code and data will be released.