Dipnet: Dynamic Identity Propagation Network For Video Object Segmentation

2020 IEEE Winter Conference on Applications of Computer Vision (WACV), 2020

Cited by 22 | Views 173
Abstract
Many recent methods for semi-supervised Video Object Segmentation (VOS) have achieved good performance by exploiting the annotated first frame via one-shot fine-tuning or mask propagation. However, relying heavily on the first frame can weaken robustness, since video objects may vary greatly over time. In this work, we propose a Dynamic Identity Propagation Network (DIPNet) that adaptively propagates and accurately segments video objects over time. To achieve this, DIPNet factors the VOS task at each time step into a dynamic propagation phase and a spatial segmentation phase. The former uses a novel identity representation to adaptively propagate objects' reference information over time, which improves robustness to temporal variation in videos. The segmentation phase uses the propagated information to treat object segmentation as an easier static-image problem that can be optimized via lightweight fine-tuning on the first frame, thus reducing the computational cost. As a result, by optimizing these two components to complement each other, we obtain a robust VOS system. Evaluations on four benchmark datasets show that DIPNet provides state-of-the-art performance with time efficiency.
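The per-frame factorization described in the abstract can be sketched as a simple loop: a propagation step updates a running identity representation from the annotated first frame, and a segmentation step uses that identity to produce a mask for the current frame. The functions below are hypothetical illustrations of this two-phase structure (the blending rule, similarity scoring, and all names are assumptions, not the paper's actual model):

```python
import numpy as np

def propagate_identity(identity, frame_feat, alpha=0.7):
    # Hypothetical dynamic propagation phase: blend the running identity
    # representation with the current frame's features so the reference
    # adapts to the object's temporal variation instead of staying fixed
    # to the first frame.
    return alpha * identity + (1.0 - alpha) * frame_feat

def segment_frame(identity, frame_feat):
    # Hypothetical spatial segmentation phase: score each pixel by its
    # similarity to the propagated identity and threshold to a binary
    # mask, treating the frame as a static-image problem.
    scores = -np.abs(frame_feat - identity)
    return scores > scores.mean()

def run_vos(frame_feats, first_frame_feat):
    # Initialize the identity from the annotated first frame, then
    # alternate propagation and segmentation over time.
    identity = first_frame_feat.copy()
    masks = []
    for feat in frame_feats:
        identity = propagate_identity(identity, feat)  # propagation phase
        masks.append(segment_frame(identity, feat))    # segmentation phase
    return masks

rng = np.random.default_rng(0)
frames = [rng.random((8, 8)) for _ in range(4)]
masks = run_vos(frames, frames[0])
```

In the actual DIPNet, both phases are learned network components optimized to complement each other; this sketch only mirrors the control flow.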
Keywords
deep CNN, dynamic identity propagation network, semi-supervised video object segmentation, DIPNet, identity representation, spatial segmentation, one-shot fine-tuning