A Multi-In-Single-Out Network for Video Frame Interpolation without Optical Flow.
CoRR (2023)
Abstract
Deep learning-based video frame interpolation (VFI) methods have
predominantly focused on estimating motion vectors between two input frames and
warping them to the target time. While this approach has shown impressive
performance for linear motion between two input frames, it exhibits limitations
when dealing with occlusions and nonlinear movements. Recently, generative
models have been applied to VFI to address these issues. However, because VFI
aims to predict accurate intermediate frames between two given frames rather
than to generate merely plausible images, performance limitations
still persist. In this paper, we propose a multi-in-single-out (MISO) based VFI
method that does not rely on motion vector estimation, allowing it to
effectively model occlusions and nonlinear motion. Additionally, we introduce a
novel motion perceptual loss that enables MISO-VFI to better capture the
spatio-temporal correlations within the video frames. Our MISO-VFI method
achieves state-of-the-art results on the VFI benchmarks Vimeo90K, Middlebury,
and UCF101, outperforming existing approaches by a significant margin.