Two-Channel VAE-GAN Based Image-To-Video Translation

Intelligent Computing Theories and Application (2022)

Abstract
We propose a VAE-GAN network with a two-channel decoder for multiple image-to-video translation tasks, i.e., generating videos of different categories with a single model. We treat image-to-video translation as a video generation task rather than a video prediction task, which requires multiple frames as input. After training, the model requires only the first frame of a video and its corresponding attribute to generate the desired video. Combining the Variational Autoencoder (VAE) and the Generative Adversarial Network (GAN) mitigates the shortcomings of each: a VAE alone tends to produce blurry outputs, and a GAN alone suffers from unstable gradients. Extensive qualitative and quantitative experiments are conducted on the MUG [1] dataset. These experiments show that, compared with state-of-the-art approaches, our approach (VAE-GAN) exhibits significant improvements in generative capability.
Keywords
Video generation, Variational autoencoder, Generative adversarial network