Beyond Talking – Generating Holistic 3D Human Dyadic Motion for Communication
CoRR (2024)
Abstract
In this paper, we introduce an innovative task focused on human
communication, aiming to generate 3D holistic human motions for both speakers
and listeners. Central to our approach is factorization to decouple audio
features, combined with textual semantic information, which facilitates the
creation of more realistic and coordinated movements.
We separately train VQ-VAEs with respect to the holistic motions of both
speaker and listener. To capture the real-time mutual influence between the
speaker and the listener, we propose a novel chain-like, transformer-based
auto-regressive model designed to characterize real-world communication
scenarios, generating the motions of both the speaker and the listener
simultaneously. These designs ensure that the generated results
we generate are both coordinated and diverse. Our approach demonstrates
state-of-the-art performance on two benchmark datasets. Furthermore, we
introduce the HoCo holistic communication dataset, which is a valuable resource
for future research. Our HoCo dataset and code will be released for research
purposes upon acceptance.
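The abstract describes a two-stage pipeline: motion sequences for each role are first discretized with separate VQ-VAEs, and a chain-like auto-regressive model then produces speaker and listener tokens in alternation so each stream conditions on the other. The following is a minimal, hypothetical sketch of that idea — not the authors' released code; the helper names, toy nearest-neighbor quantizer, and plug-in predictor functions are illustrative assumptions standing in for the trained VQ-VAE encoders and transformer.

```python
# Hypothetical sketch of the two-stage idea from the abstract (not the
# authors' implementation): (1) vector-quantize speaker/listener motion
# frames into discrete tokens with separate codebooks, (2) generate both
# token streams in a chain, each step conditioning on the other's history.

def nearest_code(vec, codebook):
    """Return the index of the closest codebook entry (the VQ step of a
    VQ-VAE encoder, here a plain nearest-neighbor search)."""
    best_i, best_d = 0, float("inf")
    for i, code in enumerate(codebook):
        d = sum((a - b) ** 2 for a, b in zip(vec, code))
        if d < best_d:
            best_i, best_d = i, d
    return best_i

def quantize_motion(frames, codebook):
    """Encode a motion sequence (list of feature vectors) as token indices,
    as a trained VQ-VAE encoder would."""
    return [nearest_code(f, codebook) for f in frames]

def chain_generate(steps, predict_speaker, predict_listener):
    """Chain-like auto-regression: speaker and listener tokens are emitted
    alternately, and each prediction sees the full interleaved history, so
    the two motion streams can influence each other in real time.
    `predict_*` stand in for the transformer's next-token heads."""
    speaker, listener = [], []
    for _ in range(steps):
        s = predict_speaker(speaker, listener)   # speaker token given history
        speaker.append(s)
        lis = predict_listener(speaker, listener)  # listener already sees s
        listener.append(lis)
    return speaker, listener
```

For example, with a two-entry codebook, `quantize_motion([(0.1, 0.2), (0.9, 1.1)], [(0.0, 0.0), (1.0, 1.0)])` yields `[0, 1]`; in the real system the predictors would be a single transformer conditioned on the decoupled audio features and textual semantics.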