Twostreamvan: Improving Motion Modeling In Video Generation

2020 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV)(2020)

引用 5|浏览133
暂无评分
摘要
Video generation is an inherently challenging task, as it requires modeling realistic temporal dynamics as well as spatial content. Existing methods entangle the two intrinsically different tasks of motion and content creation in a single generator network, but this approach struggles to simultaneously generate plausible motion and content. To improve motion modeling in video generation task, we propose a two-stream model that disentangles motion generation from content generation, called a Two-Stream Variational Adversarial Network (TwoStreamVAN). Given an action label and a noise vector, our model is able to create clear and consistent motion, and thus yields photorealistic videos. The key idea is to progressively generate and fuse multi-scale motion with its corresponding spatial content. Our model significantly outperforms existing methods on the standard Weizmann Human Action, MUG Facial Expression and VoxCeleb datasets, as well as our new dataset of diverse human actions with challenging and complex motion.
更多
查看译文
关键词
TwoStreamVAN,motion modeling,realistic temporal dynamics,content creation,single generator network,video generation,two-stream model,motion generation,content generation,multiscale motion,complex motion,photorealistic videos,two-stream variational adversarial network,action label,noise vector
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要