Generating Human Motion in 3D Scenes from Text Descriptions
CVPR 2024
Abstract
Generating human motions from textual descriptions has gained growing
research interest due to its wide range of applications. However, only a few
works consider human-scene interactions together with text conditions, which is
crucial for visual and physical realism. This paper focuses on the task of
generating human motions in 3D indoor scenes given text descriptions of the
human-scene interactions. This task is challenging because of the
multimodal nature of text, scene, and motion, as well as the need for
spatial reasoning. To address these challenges, we propose a new approach that
decomposes the complex problem into two more manageable sub-problems: (1)
language grounding of the target object and (2) object-centric motion
generation. For language grounding of the target object, we leverage the power
of large language models. For motion generation, we design an object-centric
scene representation for the generative model to focus on the target object,
thereby reducing the scene complexity and facilitating the modeling of the
relationship between human motions and the object. Experiments demonstrate that
our approach produces higher-quality motion than the baselines and validate our
design choices.