DS-NeRV: Implicit Neural Video Representation with Decomposed Static and Dynamic Codes
CVPR 2024
Abstract
Implicit neural representations for video (NeRV) have recently emerged as a
novel approach to high-quality video representation. However, existing works
employ a single network to represent the entire video, which implicitly
confuses static and dynamic information. This leads to an inability to
effectively compress the redundant static information and a lack of explicit
modeling of global, temporally coherent dynamic details. To solve these
problems, we propose DS-NeRV, which decomposes videos into sparse learnable
static codes and dynamic codes without the need for explicit optical flow or
residual supervision. By setting different sampling rates for the two codes and
applying weighted sum and interpolation sampling methods, DS-NeRV efficiently
utilizes redundant static information while maintaining high-frequency details.
Additionally, we design a cross-channel attention-based (CCA) fusion module to
efficiently fuse these two codes for frame decoding. Our approach achieves a
high-quality reconstruction of 31.2 PSNR with only 0.35M parameters thanks to
the separate static and dynamic code representation, and it outperforms
existing NeRV methods in many downstream tasks. Our project website is at
https://haoyan14.github.io/DS-NeRV.
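
The two sampling methods named in the abstract can be illustrated with a minimal sketch. It is not the DS-NeRV implementation: the function names, the inverse-distance weighting, the flat list-of-floats code layout, and the pairing of weighted sum with the sparse static codes and interpolation with the denser dynamic codes are all assumptions made for illustration.

```python
# Illustrative sketch only. The weighting scheme, code shapes, and function
# names are assumptions, not the DS-NeRV implementation.

def interpolate_code(codes, t):
    """Linearly interpolate a code at normalized time t in [0, 1].

    codes: list of K >= 2 equal-length vectors placed uniformly on [0, 1].
    """
    pos = t * (len(codes) - 1)
    lo = min(int(pos), len(codes) - 2)  # clamp so that lo + 1 stays in range
    frac = pos - lo
    return [(1.0 - frac) * a + frac * b for a, b in zip(codes[lo], codes[lo + 1])]


def weighted_sum_code(codes, t, eps=1e-6):
    """Blend all K >= 2 codes with normalized inverse-distance weights.

    Inverse-distance weighting is an assumed choice; any weights that favor
    codes near t would fit the same pattern.
    """
    k = len(codes)
    anchors = [i / (k - 1) for i in range(k)]
    raw = [1.0 / (abs(t - a) + eps) for a in anchors]
    total = sum(raw)
    weights = [r / total for r in raw]
    dim = len(codes[0])
    return [sum(w * c[d] for w, c in zip(weights, codes)) for d in range(dim)]


# Hypothetical toy codes: few static codes (low sampling rate) and more
# dynamic codes (higher sampling rate) covering the same video.
static_codes = [[0.0, 0.0], [1.0, 2.0]]
dynamic_codes = [[0.0], [1.0], [2.0], [3.0], [4.0]]

t = 0.5  # normalized index of the frame to decode
static_t = weighted_sum_code(static_codes, t)
dynamic_t = interpolate_code(dynamic_codes, t)
```

In a full model, the two per-frame codes would then be passed to a fusion module and decoder; that stage is left out here because the abstract does not describe its internals.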