Generative Adversarial Networks for Depth Map Estimation from RGB Video

2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), 2018

Abstract
Depth cues are essential to achieving high-level scene understanding, and in particular to determining geometric relations between objects. The ability to reason about depth information in scene analysis tasks can often result in improved decision-making capabilities. Unfortunately, depth-capable sensors are not as ubiquitous as traditional RGB cameras, which limits the availability of depth-related cues. In this work, we investigate data-driven approaches for depth estimation from images or videos captured with monocular cameras. We propose three different approaches and demonstrate their efficacy through extensive experimental validation. The proposed methods rely on processing of (i) a single 3-channel RGB image frame, (ii) a sequence of RGB frames, and (iii) a single RGB frame plus the optical flow field computed between the frame and a neighboring frame in the video stream, and map the respective inputs to an estimated depth map representation. In contrast to existing literature, the input-output mapping is not directly regressed; rather, it is learned through adversarial techniques that leverage conditional generative adversarial networks (cGANs).
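As a rough illustration of the adversarial formulation the abstract describes, the sketch below writes out per-sample conditional GAN losses in plain Python: a generator G maps an RGB input x (or a frame sequence, or a frame plus optical flow) to a depth map, and a discriminator D scores (input, depth) pairs. The function names, the non-saturating generator loss, and the L1 reconstruction term (common in cGAN image-to-image work) are assumptions for illustration, not the authors' exact architecture or objective.

```python
import math

def sigmoid(z):
    """Logistic function, mapping a discriminator logit to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

def discriminator_loss(d_real_logit, d_fake_logit):
    """D is trained to score real (input, depth) pairs as 1 and
    generated (input, G(input)) pairs as 0 (binary cross-entropy)."""
    return -(math.log(sigmoid(d_real_logit)) +
             math.log(1.0 - sigmoid(d_fake_logit)))

def generator_loss(d_fake_logit, l1_term, lam=10.0):
    """G is trained to fool D (non-saturating form); the weighted L1
    term on the predicted depth map is an assumed addition, a common
    pairing in conditional-GAN image-to-image pipelines."""
    return -math.log(sigmoid(d_fake_logit)) + lam * l1_term
```

With confident, correct discriminator logits (large positive for real pairs, large negative for fakes) the discriminator loss approaches zero; the generator loss drops as its outputs push the discriminator's fake-pair logit positive.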
Keywords
RGB frames sequence,single 3-channel RGB image frame,geometric relation determination,conditional generative adversarial networks,high-level scene understanding,adversarial techniques,estimated depth map representation,video stream,neighboring frame,optical flow field,single RGB frame,monocular cameras,data-driven approaches,depth-related cues,traditional RGB cameras,improved decision-making capabilities,scene analysis tasks,depth information,depth cues,RGB video,depth map estimation